AAC Audio LOD for Game Streaming

The Problem
You are streaming game audio to a player over mobile or web. The connection is bad. 3G or spotty wifi. The audio needs to keep playing without cuts or buffering.
Raw audio is huge. One second of 16-bit stereo audio at 44100 Hz (Hz means samples per second: how many times per second the sound wave is measured) is about 176 KB, which is roughly 1411 kbps (kilobits per second: how much data flows through per second). On a bad connection you might only get 50-100 kbps. So you need compression.
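The arithmetic behind those numbers, as a small sketch. The 16-bit sample size is an assumption (the usual CD-style format), and the two link speeds are just the ends of the 50-100 kbps range above.

```python
# Raw (uncompressed) PCM audio size versus what a weak connection can carry.
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
CHANNELS = 2              # stereo
BYTES_PER_SAMPLE = 2      # assumption: 16-bit samples


def raw_bytes_per_second(sample_rate_hz: int, channels: int, bytes_per_sample: int) -> int:
    """Bytes needed to store one second of uncompressed audio."""
    return sample_rate_hz * channels * bytes_per_sample


raw_bytes = raw_bytes_per_second(SAMPLE_RATE_HZ, CHANNELS, BYTES_PER_SAMPLE)
raw_kbps = raw_bytes * 8 / 1000   # bytes -> bits -> kilobits

print(raw_bytes)   # 176400
print(raw_kbps)    # 1411.2

# How much smaller the stream must get to fit each link speed.
for link_kbps in (100, 50):
    print(link_kbps, round(raw_kbps / link_kbps, 1))
# 100 14.1
# 50 28.2
```

So the audio has to shrink by a factor of roughly 14 to 28 before it fits through the connection.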
What is AAC
AAC stands for Advanced Audio Coding. It is an audio compression format. Same job as MP3 but better. At the same file size AAC generally sounds better. Virtually every modern browser and phone can decode it.
AAC works by removing sounds humans cannot hear. Example: a loud drum hits at the same time as a quiet flute. Your ear cannot hear the flute anyway. AAC throws away the flute data. This is called perceptual coding. It also uses a math transform (the modified discrete cosine transform, MDCT, which converts the sound wave into a frequency representation that is easier to compress) to store the remaining sound with less data.
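A toy illustration of the idea, not how AAC really does it. Real AAC uses the MDCT and a psychoacoustic model. This sketch uses a naive Fourier transform and a fixed threshold as a crude stand-in for masking. The bin numbers, amplitudes, and threshold are made-up values.

```python
import cmath
import math

N = 64                       # samples in the analysis window
LOUD_BIN, QUIET_BIN = 4, 5   # two neighbouring frequencies (hypothetical values)

# A loud tone (the "drum") plus a much quieter tone (the "flute") next to it.
signal = [
    1.0 * math.sin(2 * math.pi * LOUD_BIN * n / N)
    + 0.01 * math.sin(2 * math.pi * QUIET_BIN * n / N)
    for n in range(N)
]


def dft(samples):
    """Naive discrete Fourier transform: time samples -> frequency bins."""
    size = len(samples)
    return [
        sum(s * cmath.exp(-2j * math.pi * k * n / size) for n, s in enumerate(samples))
        for k in range(size)
    ]


spectrum = dft(signal)
peak = max(abs(c) for c in spectrum)

# Crude stand-in for masking: drop every bin weaker than 5% of the peak
# (about 26 dB below it). Only the lower half of the bins is inspected,
# because the upper half mirrors it for a real-valued signal.
THRESHOLD = 0.05
kept_bins = [k for k in range(N // 2) if abs(spectrum[k]) >= THRESHOLD * peak]

print(kept_bins)   # [4]
```

Only the loud tone survives. The quiet tone is dropped, so nothing has to be stored for it.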
What is LOD
LOD means Level of Detail. In 3D graphics: a tree far away uses fewer polygons. A tree close up uses more. You adapt quality to what the situation needs.
Audio LOD is the same idea for sound. You prepare multiple versions of the same audio stream at different quality levels:
High: AAC at 128 kbps. Full quality.
Medium: AAC at 64 kbps. Still good. Many listeners will not notice the difference for game audio.
Low: AAC at 32 kbps. Thin but usable. Roughly voice chat quality.
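The three levels above can be written down as a small table in code. This is a sketch: the class and function names are made up for illustration, and the sizes ignore container overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityLevel:
    name: str
    bitrate_kbps: int


# The three levels from the list above, highest first.
LADDER = [
    QualityLevel("high", 128),
    QualityLevel("medium", 64),
    QualityLevel("low", 32),
]


def bytes_per_minute(level: QualityLevel) -> int:
    """Approximate audio payload for one minute at this level."""
    return level.bitrate_kbps * 1000 * 60 // 8


for level in LADDER:
    print(level.name, bytes_per_minute(level))
# high 960000
# medium 480000
# low 240000
```

Each step down halves the data the player has to download.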
How It Works
Two sides.
Server side. You encode the audio into AAC at multiple bitrates at the same time (bitrate: how many bits per second the audio uses; a lower bitrate means less data but worse quality). You chop the stream into small chunks (also called segments), usually 2-4 seconds each. Each chunk exists in all quality levels, cut at the same points in time. This is called adaptive bitrate streaming.
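One possible way to do the server side is the ffmpeg command-line tool, which has a built-in AAC encoder and an HLS output format. The sketch below only builds the commands and prints them. The input name game_audio.wav, the output directory, and the file naming scheme are hypothetical.

```python
import shlex

BITRATES_KBPS = [128, 64, 32]   # the three quality levels
CHUNK_SECONDS = 4               # assumption: upper end of the 2-4 second range


def ffmpeg_hls_command(source: str, bitrate_kbps: int, out_dir: str) -> list[str]:
    """Build one ffmpeg invocation that encodes `source` to AAC and cuts it into HLS chunks."""
    variant = f"{out_dir}/audio_{bitrate_kbps}k"
    return [
        "ffmpeg", "-i", source,
        "-vn",                                  # ignore any video, audio only
        "-c:a", "aac",                          # ffmpeg's built-in AAC encoder
        "-b:a", f"{bitrate_kbps}k",             # target audio bitrate
        "-f", "hls",                            # write HLS chunks plus a playlist
        "-hls_time", str(CHUNK_SECONDS),        # target chunk length in seconds
        "-hls_segment_filename", f"{variant}_%05d.ts",
        f"{variant}.m3u8",
    ]


commands = [ffmpeg_hls_command("game_audio.wav", kbps, "out") for kbps in BITRATES_KBPS]
for cmd in commands:
    print(shlex.join(cmd))
```

To actually encode, each command list could be passed to subprocess.run. That needs ffmpeg installed and an existing output directory.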
Client side. The player's device monitors its connection speed. If bandwidth drops, it requests the next chunk at a lower quality level. If bandwidth recovers, it steps back up. The switch happens at chunk boundaries, so playback continues without an audible gap.
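A minimal sketch of the client-side decision. It uses a simple throughput rule: measure how fast the last chunk arrived, keep some headroom, and pick the highest level that fits. Real players use more elaborate logic. The function names and the 0.8 safety factor are assumptions.

```python
LADDER_KBPS = [128, 64, 32]   # available quality levels
SAFETY_FACTOR = 0.8           # assumption: only plan to use 80% of measured bandwidth


def measure_kbps(bytes_downloaded: int, seconds: float) -> float:
    """Throughput of the last download in kilobits per second."""
    return bytes_downloaded * 8 / 1000 / seconds


def pick_bitrate(measured_kbps: float, ladder=LADDER_KBPS, safety=SAFETY_FACTOR) -> int:
    """Highest bitrate that fits inside the bandwidth budget, else the lowest level."""
    budget = measured_kbps * safety
    for bitrate in sorted(ladder, reverse=True):
        if bitrate <= budget:
            return bitrate
    return min(ladder)   # nothing fits: fall back to the lowest quality


print(pick_bitrate(200))   # 128
print(pick_bitrate(90))    # 64
print(pick_bitrate(50))    # 32
print(pick_bitrate(20))    # 32

# Example: the last chunk was 32000 bytes and took 2 seconds to arrive.
print(pick_bitrate(measure_kbps(32000, 2.0)))   # 64
```

The player would call pick_bitrate before each chunk request, so the quality can change at every chunk boundary.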
The Protocol: HLS
HLS stands for HTTP Live Streaming. It is a widely used protocol for this, originally from Apple. It uses playlist files (extension .m3u8): a top-level playlist lists the available quality levels, and each quality level has its own playlist that lists its chunks. The player reads these playlists, checks its current bandwidth, and picks which quality to download next.
AAC is the most common audio codec in HLS streams (codec: coder-decoder, the algorithm that compresses and decompresses the audio). That makes AAC the natural choice here.
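What the top-level HLS playlist could look like for three audio levels, generated by a short sketch. The tags #EXTM3U and #EXT-X-STREAM-INF and the codec string mp4a.40.2 (AAC-LC) come from the HLS format. The function name and the playlist file names are hypothetical, and BANDWIDTH is approximated by the audio bitrate alone.

```python
def multivariant_playlist(variants: dict[int, str]) -> str:
    """Build a top-level HLS playlist from {bitrate_kbps: per-level playlist URI}."""
    lines = ["#EXTM3U"]
    for bitrate_kbps, uri in sorted(variants.items(), reverse=True):
        # BANDWIDTH is in bits per second. Container overhead is ignored here.
        lines.append(
            f'#EXT-X-STREAM-INF:BANDWIDTH={bitrate_kbps * 1000},CODECS="mp4a.40.2"'
        )
        lines.append(uri)
    return "\n".join(lines) + "\n"


playlist = multivariant_playlist({
    128: "audio_128k.m3u8",
    64: "audio_64k.m3u8",
    32: "audio_32k.m3u8",
})
print(playlist)
# #EXTM3U
# #EXT-X-STREAM-INF:BANDWIDTH=128000,CODECS="mp4a.40.2"
# audio_128k.m3u8
# #EXT-X-STREAM-INF:BANDWIDTH=64000,CODECS="mp4a.40.2"
# audio_64k.m3u8
# #EXT-X-STREAM-INF:BANDWIDTH=32000,CODECS="mp4a.40.2"
# audio_32k.m3u8
```

Each listed file is itself a playlist that lists the chunks for that one quality level.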
Why AAC Over MP3
As a rule of thumb, AAC at 96 kbps sounds about as good as MP3 at 128 kbps. That saves about 25% bandwidth for the same quality.
AAC handles low bitrates better. MP3 at 32 kbps sounds bad. AAC at 32 kbps sounds acceptable.
AAC is the usual audio codec in HLS, so players and tooling are built around it. MP3 may need extra work.
Modern browsers and phones ship with efficient AAC decoders, often hardware-assisted. Low battery usage.
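The 25% figure, worked out as a sketch, plus what it means per hour of audio. The function names are made up for illustration.

```python
def bandwidth_saving(reference_kbps: float, alternative_kbps: float) -> float:
    """Fraction of bandwidth saved by using the alternative bitrate."""
    return 1 - alternative_kbps / reference_kbps


def megabytes_per_hour(kbps: float) -> float:
    """Data volume for one hour of audio, in megabytes (1 MB = 1,000,000 bytes)."""
    return kbps * 1000 * 3600 / 8 / 1_000_000


saving = bandwidth_saving(128, 96)
print(saving)                    # 0.25
print(megabytes_per_hour(128))   # 57.6
print(megabytes_per_hour(96))    # 43.2
```

For a player on a metered mobile plan, that is about 14 MB less per hour of play.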
Summary
You encode your game audio into AAC at 3 quality levels. You chop it into small chunks. The player picks the best chunk it can download in time. Connection gets worse: quality drops. Connection recovers: quality goes back up. As long as the connection can carry the lowest level, there is no buffering and no stops.
That is audio LOD. Same idea as polygon LOD in graphics but for sound bitrate.





