2004-09-01

Lossless

There are two parts to this whole whoop-tee-do about music compression - the part you hear about, called "psychoacoustic modeling", and the part you don't hear about, called nothing, because you never hear about it.



What the first part actually is can be somewhat obscure. I spent a day at an Audio Engineering Seminar at the University of Washington and learned more about analog-to-digital converters, compression, and human hearing than I thought possible. (We have world-famous sound guys here in the Pacific Northwest because of Microsoft, Real, and Mackie.)



When I first heard of psychoacoustic modeling, I thought it meant that they had figured out the psychology of listening, and knew, for instance, that if the drums were loud, you weren't concentrating on the trumpet, or some such magic.



Actually, it probably has very little to do with psychology. It's more about how your ear physically works. Of course, since we don't cut people open to see how their hearing works while they are alive, many of the details of how human hearing works are still not fully understood.



Anyway, what psychoacoustic modeling actually means is that when two frequencies are really close to each other, then depending on their relative volumes, you may hear only the louder one. The usual name for this effect is masking.
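Here is a toy sketch of that idea: a quieter tone gets dropped when a much louder tone sits close to it in frequency. The function name `audible_tones` and both thresholds (`max_gap_hz`, `min_drop_db`) are made up for illustration. Real encoders use far more elaborate models of the ear, so treat this as a cartoon of the principle, not how any actual codec decides.

```python
def audible_tones(tones, max_gap_hz=100.0, min_drop_db=20.0):
    """Return the tones that are not hidden by a louder neighbor.

    tones is a list of (frequency_hz, level_db) pairs.
    The two thresholds are illustrative guesses, not measured values.
    """
    kept = []
    for freq, level in tones:
        # A tone counts as masked if some other tone is close in
        # frequency and at least min_drop_db louder.
        masked = any(
            abs(freq - other_freq) <= max_gap_hz
            and other_level - level >= min_drop_db
            for other_freq, other_level in tones
        )
        if not masked:
            kept.append((freq, level))
    return kept


tones = [
    (1000.0, 70.0),  # loud tone
    (1050.0, 40.0),  # quiet tone right next to the loud one
    (4000.0, 40.0),  # equally quiet tone, but far away in frequency
]
result = audible_tones(tones)
print(result)  # [(1000.0, 70.0), (4000.0, 40.0)]
```

The quiet tone at 1050 Hz is thrown away because the loud 1000 Hz tone sits right beside it, while the equally quiet tone at 4000 Hz survives because nothing loud is nearby.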



So to compress drums and still have them sound good, you could probably throw away half or more of all the frequencies in the drums.



The more frequencies there are in the music, the more you can throw away without most people hearing it.



There you go. That's the first part - the "famous" part - the psychoacoustic modeling part.



The second part, the one you never hear about, is simply this: missing frequencies are expensive to encode the regular way (in .wav or PCM files, or as they say, in the time domain), but cheap to encode in the frequency domain (the way .wma, .mp3, or .rax files are stored). In the time domain, every sample costs the same number of bits whether the music contains one frequency or a thousand.



What this means is that if your music has few enough frequencies in it, then an mp3 file can encode it nearly losslessly. It's simply a more efficient encoding. Nothing needs to be thrown away except silence.
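To make that concrete, here is a minimal sketch using a plain discrete Fourier transform. That choice is an assumption made for simplicity: real codecs such as mp3 use a different transform and a lot of extra machinery. The helper names (`dft`, `inverse_dft`) and the test signal are invented for the example, and the two tones are deliberately chosen to line up exactly with the transform's frequency bins, which real music does not do.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (slow, but needs no libraries)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Naive inverse transform, returning real-valued samples."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


N = 256
# Time domain: 256 numbers describing just two pure tones.
samples = [
    math.sin(2 * math.pi * 5 * t / N) + 0.5 * math.sin(2 * math.pi * 12 * t / N)
    for t in range(N)
]

coeffs = dft(samples)

# Frequency domain: keep only the bins that are not silent.
kept = {k: c for k, c in enumerate(coeffs) if abs(c) > 1e-6}

# Rebuild the signal from the kept bins alone.
sparse = [kept.get(k, 0) for k in range(N)]
rebuilt = inverse_dft(sparse)
max_error = max(abs(a - b) for a, b in zip(samples, rebuilt))

print(len(samples), "time-domain values")
print(len(kept), "non-silent frequency bins")
print("largest reconstruction error:", max_error)
```

The 256 time-domain values collapse to 4 non-silent bins (each tone shows up twice, as a mirrored pair), and rebuilding from those 4 bins reproduces the original samples to within floating-point rounding. Nothing was thrown away except the silent frequencies.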



Well, I guess it's true that if psychoacoustic modeling throws away the frequencies you can't hear, then the frequencies that aren't there at all would count too, but that's not generally what people mean by it.



Silence is golden. In this case, silent frequencies are golden.



The upshot is that you can compress music with mp3 or wma to a high degree and literally not lose any of the real music. Some rounding errors creep in, but they are minor.



When Microsoft says wma at 192 kbps is "transparent CD quality", and Sony says its new Hi-MD MiniDisc player can "transparently encode" CDs at 256 kbps, they mean it!
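For a sense of scale, here is the arithmetic behind those bit rates, assuming standard CD audio (44,100 samples per second, 16 bits per sample, 2 channels). The variable names are just for illustration.

```python
# Standard CD audio parameters.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

cd_kbps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS / 1000

# How many times smaller each compressed stream is than the CD original.
ratio_192 = cd_kbps / 192
ratio_256 = cd_kbps / 256

print(f"CD audio: {cd_kbps:.1f} kbps")
print(f"192 kbps is about {ratio_192:.2f} times smaller")
print(f"256 kbps is about {ratio_256:.2f} times smaller")
```

So "transparent" at these rates means squeezing the music into roughly a seventh (192 kbps) or a bit under a fifth (256 kbps) of the space the CD uses.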



That compression is nearly lossless.



Nobody will ever tell you that, except me. I think they (whoever they are) don't want to draw attention to the fact that at high bit rates, "lossy" compression isn't.
