"The purpose of the ears is to point the eyes." (Georg von Békésy, https://en.wikipedia.org/wiki/Georg_von_B%C3%A9k%C3%A9sy)
As with vision, hearing is three-dimensional.
Our system of hearing is binaural: it relies on two ears.
To specify the location of a sound source relative to the listener, we need a coordinate system. One natural choice is the head-centered rectangular coordinate system shown above. Here the x axis goes (approximately) through the right ear, the y axis points straight ahead, and the z axis is vertical. These axes define three standard planes: the xy or horizontal plane, the xz or frontal plane, and the yz or median plane (also called the mid-sagittal plane). The horizontal plane separates up from down, the frontal plane separates front from back, and the median plane separates right from left.
However, because the head is roughly spherical, a spherical coordinate system is usually used. Here the standard coordinates are azimuth, elevation and range. Unfortunately, there is more than one way to define these coordinates, and different people define them in different ways. The vertical-polar coordinate system (shown below on the left) is the most popular. Here one first measures the azimuth as the angle from the median plane to a vertical plane containing the source and the z axis, and then measures the elevation as the angle up from the horizontal plane. With this choice, surfaces of constant azimuth are planes through the z axis, and surfaces of constant elevation are cones concentric about the z axis.
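The vertical-polar description above can be turned into head-centered rectangular coordinates with a little trigonometry. The following is a minimal sketch, not a reference implementation: the function name `vertical_polar_to_cartesian` is hypothetical, and the sign convention (positive azimuth toward the right ear, positive elevation upward) is an assumption, since the text does not fix one.

```python
import math

def vertical_polar_to_cartesian(azimuth_deg, elevation_deg, distance=1.0):
    """Convert vertical-polar (azimuth, elevation, range) to head-centered (x, y, z).

    Assumed convention (not stated in the text): azimuth is measured from the
    median plane and is positive toward the right ear (+x); elevation is
    measured up from the horizontal plane and is positive toward +z.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.sin(az)  # toward the right ear
    y = distance * math.cos(el) * math.cos(az)  # straight ahead
    z = distance * math.sin(el)                 # up
    return x, y, z

if __name__ == "__main__":
    # A source straight ahead, one to the right, and one overhead.
    for az, el in [(0, 0), (90, 0), (0, 90)]:
        print(az, el, vertical_polar_to_cartesian(az, el))
```

Under this assumed convention, a source at azimuth 0 and elevation 0 lies on the y axis (straight ahead), and a source at elevation 90 degrees lies on the z axis regardless of azimuth.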
The spherical coordinate system is defined as:
$$\mathbf{r}\equiv(r,\theta,\phi)$$
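$r$ = radial distance (range) from the origin
$\theta $ = elevation angle; in the formulas that follow it is measured from the $z$ axis, so the elevation above the horizontal plane is $\pi/2-\theta$
$\phi $ = azimuth; in the formulas that follow it is measured in the horizontal plane from the $x$ axis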
Moreover, a position $\mathbf{r}=(r,\theta,\phi)$ represented in spherical coordinates can be related to the same position represented in Cartesian coordinates $\mathbf{x}=(x,y,z)$ using
$x=r\sin\theta\cos\phi$
$y=r\sin\theta\sin\phi$
$z=r\cos\theta$
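As a quick check of these three relations, here is a minimal Python sketch of the conversion and its inverse. The function names are hypothetical and angles are assumed to be in radians. Note that, as the formulas are written, $\theta = 0$ gives $z = r$, so $\theta$ is treated as the angle from the $z$ axis and $\phi$ as the angle from the $x$ axis in the horizontal plane.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """Apply x = r sin(theta) cos(phi), y = r sin(theta) sin(phi), z = r cos(theta)."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

def cartesian_to_spherical(x, y, z):
    """Invert the relations above; returns (r, theta, phi) with phi in (-pi, pi]."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        # Angles are undefined at the origin; returning zeros is an arbitrary choice.
        return 0.0, 0.0, 0.0
    # Clamp to guard against rounding pushing the ratio just outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, z / r)))
    phi = math.atan2(y, x)
    return r, theta, phi

if __name__ == "__main__":
    point = spherical_to_cartesian(1.5, 0.7, -2.0)
    print(point)
    print(cartesian_to_spherical(*point))  # recovers (1.5, 0.7, -2.0) up to rounding
```

Using `atan2` rather than `atan(y / x)` keeps the azimuth in the correct quadrant and avoids a division by zero when the source lies on the $y$ axis.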