HRTF Explained: How 3D Audio Systems Are Implemented

by Mayank

You have probably been to a movie theater, so you know the difference between watching a movie there and watching the same movie on your smartphone or television. Watching Netflix and chilling is fantastic, but a theater gives you a more immersive experience.

Here is Why

The key factor, I think, is the immersive sound you get inside the theater. When watching a movie in a cinema, we experience what is called “3D audio”. It is much more immersive than the 2D audio we get when watching a movie on a smartphone with earphones. Two things work together to produce it: reverberation or reverb (essentially the reflection of sound waves from ceilings, walls and floors) and the HRTF (explained below). Multiple speakers at the front, back, sides and ceiling of the cinema hall make direct and reflected sound reach our ears from many directions.

Now, before we dive into the “tech stuff”, here’s some biology that describes how we hear.

Anatomy of HRTF (Head-Related Transfer Function)

Consider an object that produces a sound in space. The sound waves travel in various directions. Along the way they reflect (or reverberate) off objects near the source and reach the listener from several directions. When the sound arrives, the size and shape of the listener’s head, outer ears, ear canals, and nasal and oral cavities all change it and influence the way it is heard. As a result, certain frequencies are boosted and others attenuated.

These frequency differences are the reason each person perceives a given sound slightly differently (no two people are physically identical). The HRTF describes how a sound from a given source is changed on its way to our ears.

Credits: Sony

Think of HRTF as a filter before our ears.
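
To make the filter idea concrete, here is a minimal sketch. The time-domain form of an HRTF is usually called a head-related impulse response (HRIR), and applying it to a sound is a convolution. The two short impulse responses below are made-up numbers chosen only for illustration, and the function names are hypothetical; measured HRIRs are far longer.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution: the time-domain way of applying a filter."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono source through a left/right impulse-response pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Made-up impulse responses for a source on the listener's right:
# the right ear hears it immediately and loudly, the left ear one
# sample later and quieter. These are NOT measured data.
HRIR_RIGHT = [1.0, 0.25]
HRIR_LEFT = [0.0, 0.5, 0.25]

click = [1.0, 0.0, 0.0]  # a single click as the mono source
left, right = render_binaural(click, HRIR_LEFT, HRIR_RIGHT)
print("left :", left)
print("right:", right)
```

The left channel comes out delayed and quieter than the right one, which is the kind of cue the brain reads as "this sound is on my right".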

The most important thing the HRTF gives us is the ability to locate an object by listening to the sound it makes. This is a product of evolution: early humans could use their ears to locate prey and sense threats in the dark. All of this is possible because of the HRTF response of our ears.

Humans have only two ears, yet they can locate sounds in three dimensions: in range (distance), above and below, in front and behind, and to either side.

How is this possible?

Each ear has its own HRTF, shaped mainly by the outer ear (or pinna), the ear canal, and the head itself. The brain compares the filtered sound arriving at each ear, in particular its intensity and its arrival time. By interpreting these differences, the brain works out where the sound came from.
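
The arrival-time part of that comparison can be estimated with a simple model. The sketch below uses Woodworth's classic spherical-head approximation, which is not something this article's sources describe; the head radius is an assumed average, and the function name is hypothetical. It covers only the time difference, not the intensity difference.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees Celsius
HEAD_RADIUS = 0.0875    # m, an assumed average; every head differs


def interaural_time_difference(azimuth_deg, head_radius=HEAD_RADIUS):
    """Woodworth's spherical-head estimate of the arrival-time gap, in seconds.

    azimuth_deg is 0 straight ahead and 90 directly to one side.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("this approximation covers 0 to 90 degrees")
    theta = math.radians(azimuth_deg)
    return head_radius / SPEED_OF_SOUND * (theta + math.sin(theta))


for azimuth in (0, 30, 60, 90):
    itd_us = interaural_time_difference(azimuth) * 1e6
    print(f"{azimuth:>2} deg -> {itd_us:6.1f} microseconds")
```

Under these assumptions a sound directly to one side reaches the far ear about 0.66 milliseconds late, and a sound straight ahead arrives at both ears at the same time.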

Application of HRTF in consumer electronics

HRTF is used in a wide range of audio devices, such as headphones, to produce a surround sound effect. Some home theater setups use it too, and surround formats such as Dolby Digital can be rendered over headphones with HRTF-based processing in one form or another.

Some HRTF processing is also done purely in software, with no extra hardware needed to produce surround sound.

Mimicking the HRTF with current hardware

The HRTF implementation most of us encounter is in theaters. The problem with any such implementation is the one explained above: every person hears differently. An HRTF tuned for your ears may sound bad to me.

Some companies have therefore tried to solve this problem at the hardware level, using sensors that calibrate your audio equipment specifically for you. You as a user provide some inputs (as you do when setting up Google Assistant), and the processor uses them to pick the HRTF in a database that most closely matches your ears.

Research is currently under way on approximating a person’s true HRTF using AI and neural networks.
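
A minimal sketch of that matching step, assuming the user inputs boil down to a few body measurements, could be a nearest-neighbor search. The database contents, the choice of measurements, and the function name are all hypothetical; no vendor's actual method is shown here.

```python
import math

# Hypothetical database: subject id -> (head width, ear height, ear width) in cm.
HRTF_DATABASE = {
    "subject_a": (14.0, 6.0, 3.0),
    "subject_b": (15.5, 6.5, 3.2),
    "subject_c": (16.5, 7.0, 3.6),
}


def closest_hrtf(user_measurements, database=HRTF_DATABASE):
    """Return the id of the stored subject nearest to the user's measurements."""
    return min(
        database,
        key=lambda subject_id: math.dist(database[subject_id], user_measurements),
    )


print(closest_hrtf((15.2, 6.4, 3.1)))
```

A real system would probably scale each measurement first so that one large dimension does not dominate the distance; the plain Euclidean distance is used here only to keep the idea visible.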

To some degree we’ve achieved true 3D audio, yet we are still a long way from imitating the natural HRTF of our ears.

How HRTF will be implemented in the upcoming PlayStation 5

In a presentation on March 18, 2020, PS5 lead system architect Mark Cerny gave a brief look at Sony’s upcoming console. You can learn more about the event here.

Although Sony could have used existing Dolby Atmos technology in the console, Cerny said in the presentation that this would require every PlayStation owner to set up Dolby-approved speakers or headphones. Instead, Sony wanted every PS5 owner to enjoy immersive 3D surround sound with a standard pair of headphones, using HRTF.

He then introduced Sony’s approach to implementing the HRTF:

“The study team took about a hundred people to measure the HRTF of each of them. Each time a microphone was positioned between the left and right ears of an individual. We had the individual sit within an array of 22 speakers and started to play an audio from the speakers, one speaker at a time. We did this for all people, and after 20 minutes we were able to sample the HRTF at over 1000 locations”

said Mark Cerny, Lead System Architect (Sony Interactive Entertainment)

After sampling the HRTFs, they analyzed the sound from each of those 1000-plus locations. Following some lengthy frequency-domain computations, Sony built a custom hardware unit, together with the algorithms that drive it, which they call “Tempest 3D AudioTech.”
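
Because the HRTF is sampled only at a finite set of locations, any renderer has to decide which measurement to use for a sound coming from somewhere in between. The sketch below shows the simplest possible choice, picking the nearest measured direction. It is an illustration under that assumption, not Sony’s algorithm, and the grid and function names are hypothetical.

```python
import math


def unit_vector(azimuth_deg, elevation_deg):
    """Convert a direction in degrees to a 3D unit vector."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def nearest_measured_direction(target, measured):
    """Pick the measured (azimuth, elevation) with the smallest angle to target."""
    t = unit_vector(*target)

    def alignment(direction):
        d = unit_vector(*direction)
        # Dot product of unit vectors = cosine of the angle between them.
        return sum(a * b for a, b in zip(t, d))

    return max(measured, key=alignment)


# A toy grid: every 30 degrees of azimuth at three elevations (36 points),
# far coarser than the 1000+ locations mentioned in the presentation.
GRID = [(az, el) for az in range(0, 360, 30) for el in (-30, 0, 30)]

print(nearest_measured_direction((100, 10), GRID))
```

A production system would more likely blend several neighboring measurements rather than snap to one, so that a moving sound does not jump audibly between filters.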

Setup for measuring HRTF
Credits: Sony

In designing the Tempest engine, they took advantage of the parallelism of GPUs. The PS5 uses an AMD RDNA 2 based GPU with 36 compute units (CUs), and, according to Cerny, the Tempest engine is itself based on a modified AMD compute unit dedicated to audio.

Perhaps in the future they will develop an algorithm that uses neural networks and AI to estimate the size of your head, torso, ear canal, etc., and better optimize the HRTF for you.

Got bored with all that tech jargon?

No worries! The key thing you need to consider is how the Tempest Engine affects your gameplay:

Say you are playing an FPS such as Battlefield V. If you hear an enemy approaching from behind, existing 3D audio lets you estimate that the enemy is behind you. What you often cannot tell, however, is whether the enemy is to your right or left.

This is where the Tempest Engine takes the lead. With the help of HRTF, we should be able to sense the enemy’s exact position and quickly take them out.

All that remains to be seen is how surround sound keeps developing, up to the point where what we hear through our headphones matches the true sound of the scene.

