2004-09-02

Bits

If you have enough bits to store lots of frequency data, then "compressed" music can sound just like the original.



What's interesting is what happens when you run out of bits to store all the frequency data.



That's when the "psychoacoustic" modeling kicks in. The encoder has to throw away data, so it uses the psychoacoustic model to pick the frequencies you are least likely to hear.



A big optimization is to simply throw away big hunks of information: switching from stereo to mono, or using a trick called joint stereo, where the encoder stores what the left and right channels have in common once instead of encoding each side separately. Throwing away the high-end frequencies also helps when you're encoding at low bit rates. You've heard a million times that people can hear up to 20 kHz, but that's a kid with perfect hearing. Dropping that limit significantly gets rid of a lot of frequencies.
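To make the joint-stereo idea a little more concrete, here is a minimal sketch of mid/side coding, one of the techniques that goes by that name. The function names are made up for this example, and it works on raw sample values for simplicity; a real encoder would apply the same idea to frequency data and use its own scaling.

```python
def to_mid_side(left, right):
    """Turn left/right samples into mid (average) and side (half the difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def to_left_right(mid, side):
    """Undo to_mid_side: recover the original left/right samples."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Two channels that are nearly identical, as they often are in real music.
left = [0.50, 0.25, -0.75, 0.125]
right = [0.50, 0.25, -0.50, 0.125]

mid, side = to_mid_side(left, right)
print(side)  # mostly zeros, so it needs very few bits
print(to_left_right(mid, side) == (left, right))
```

When the two channels are similar, the side signal is close to silence, which is where the savings come from.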



Even so, at really low bit rates, there still aren't enough bits for all of the frequency data you might want to store, so the frequencies your ear is least likely to notice get tossed.



I'm listening to this internet radio station right now on decent computer speakers and it sounds pretty good! It's a 20kbps .wma stream from WKSU (an NPR classical music station). The volume isn't too loud so I don't hear how bad it sounds and I'm busy typing this so I'm not being too critical.



Listening with headphones, as I am now, is more irritating. It just sounds hollow. That's probably because a boatload of frequencies are missing.



The most irritating artifact of low-bit-rate encoding, especially in mp3 files, is something I'll call flutter, since I don't know if there is an official term for what happens.



Interestingly enough, music is compressed into frames (about 38 per second for an mp3 at the CD sample rate), just like movies are made of frames. Each frame is encoded with its own rules. If you do Variable Bit Rate encoding, then the rules for each frame can be really different - including the bit rate.
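The "about 38 per second" figure falls out of simple arithmetic. This sketch assumes an MPEG-1 Layer III (mp3) file, where each frame holds 1152 samples, and a 44.1 kHz sample rate. The function names and the 128 kbps example are just for illustration.

```python
SAMPLES_PER_FRAME = 1152  # frame size for MPEG-1 Layer III (mp3)


def frames_per_second(sample_rate_hz):
    """How many frames are needed to cover one second of audio."""
    return sample_rate_hz / SAMPLES_PER_FRAME


def bytes_per_frame(bitrate_bps, sample_rate_hz):
    """Average frame size in bytes at a constant bit rate (ignoring padding)."""
    return bitrate_bps / 8 / frames_per_second(sample_rate_hz)


print(round(frames_per_second(44100), 2))        # 38.28
print(round(bytes_per_frame(128000, 44100), 1))  # 418.0
```

So at 128 kbps every frame gets roughly 418 bytes to describe its slice of the music, and at lower bit rates that budget shrinks proportionally.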



The flutter, I believe, comes from each frame getting encoded with slightly different rules. In one frame there is a certain frequency and in the next frame it is missing. This would occur for frequencies that were on the border for getting tossed. At a low enough bit rate, there will be lots of these kinds of frequencies. So they "flutter" in and out and it is REALLY irritating.



Someone should fix that in an encoder - put in some history or something so the frequencies on the edge don't flip back and forth. The mp3 format doesn't need to change - just the encoder.
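Here is a toy sketch of what "some history" could look like: hysteresis on the keep-or-drop decision for each frequency band. Everything in it is an assumption made for illustration. The per-band "audibility scores" between 0 and 1, the thresholds, and the function names are all hypothetical, and this is not how any particular encoder works.

```python
def naive_decisions(scores_per_frame, threshold=0.5):
    """Keep a band whenever its score clears one fixed threshold."""
    return [[s >= threshold for s in scores] for scores in scores_per_frame]


def hysteresis_decisions(scores_per_frame, keep_above=0.6, drop_below=0.4):
    """Keep/drop with memory: a band only changes state on a clear move.

    A kept band stays kept until its score falls below drop_below.
    A dropped band stays dropped until its score rises above keep_above.
    """
    midpoint = (keep_above + drop_below) / 2
    kept = None
    decisions = []
    for scores in scores_per_frame:
        if kept is None:
            # First frame has no history, so fall back to a single threshold.
            kept = [s >= midpoint for s in scores]
        else:
            kept = [
                (s >= drop_below) if was_kept else (s > keep_above)
                for s, was_kept in zip(scores, kept)
            ]
        decisions.append(kept)
    return decisions


def count_flips(decisions, band):
    """Number of times one band switches between kept and dropped."""
    states = [frame[band] for frame in decisions]
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


# Five frames, three bands: borderline, clearly audible, fading out.
frames = [
    [0.52, 0.9, 0.70],
    [0.48, 0.9, 0.55],
    [0.53, 0.9, 0.45],
    [0.47, 0.9, 0.30],
    [0.51, 0.9, 0.20],
]

print(count_flips(naive_decisions(frames), band=0))       # 4
print(count_flips(hysteresis_decisions(frames), band=0))  # 0
```

The borderline band flutters in and out four times with a single threshold and not at all with hysteresis, while the band that really does fade out is still dropped once. The catch is that the kept borderline band keeps costing bits, so a real encoder would have to pay for that somewhere else.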



Someone get to work on that!
