The Crevices of Audio Compression

Ron Pellegrino, July 2003

What falls through the cracks in audio compression? Much of what contributes to musical subtlety is the simple straightforward answer. And where does that musical subtlety reside? The answer is in the subtle components of the sound, the first components dumped by compression algorithms because those components have been determined to fall below the threshold levels of "normal hearing" and what's required to make saleable music comprehensible. Talk about destructive reference points for musical subtlety - "normal hearing" and saleable music.

Removing subtle components from sound reduces the range of the feelings, the musical heart and soul, carried by the sound and dampens what's not been removed entirely. What stops at serving the motor parts will fail at serving the heart and head parts. Nothing much wrong with toe tapping except that it remains one of the lower functions in music.

Compression removes information. What gets removed first is what's most subtle. And what gets removed is based on our (therefore the engineers making the compression decisions in the design of the algorithms) very limited knowledge of psychoacoustics as it relates to musical values. Who's to determine musical values? The musicians or the audio engineers or both together? In most cases both groups are without conceptual and experiential foundations for determining, much less articulating, musical values. So audio compression is based on a serious quandary. The subject of musical values is not a significant part of the technical training of music performers and audio technicians so what we're talking about here is trained technicians, not inspired artists. Trained technicians do what they've been trained to do. Their approach is mechanical, not creative. Given the results it seems obvious that neither group knows what they're doing to the music so who should be trusted with music compression decisions?

Ask around. How many musicians and how many audio engineers have seriously studied the field of musical aesthetics, the philosophical study of musical values? Not long ago in a conversation with an academic composer who in California established a thriving university program in music technology and engineering I was stunned again as he summarily dismissed the value for his program of any subject related to philosophy. Obviously he's not the only academic in the music engineering field taking that position. In fact it's endemic to the field.

If the music to be compressed doesn't have much in the way of subtlety there's not much to be lost. Music lovers (and that includes the best of the audio engineers) need to understand that compression is not intended to serve the music; rather it's designed to serve the efficient delivery of music, mostly for commercial purposes. Compression algorithms are designed to reduce file size by discarding audio information that a team of audio design engineers deemed unnecessary when they configured the algorithm.

You've probably heard many of the arguments for dropping sound information - it's above or below the frequency range of the ear, it's below the sensitivity threshold of the ear, it's too loud, it's too noisy, the normal person won't notice the difference anyway, etc. With even a little bit of digging those arguments are found to be superficial at best and usually completely specious. For example, the ear is just one part of the hearing system. You also hear with your bones, your body cavities, your organs, and your flesh. And you direct your hearing with your intelligence in the form of your attention which is rooted in your experience and your expectations. Another example is that resultant tones which are created by the interaction of fundamentals with each other and with their respective spectra may be very subtle phenomena but therein lies the life of the sound, the life that gets removed because it's carried by information too subtle to make it through the compression algorithm. If you remove the subtle phenomena you remove the life; it's that simple. And another example has to do with frequencies that are theoretically above the range of hearing but will still interact (heterodyne) with each other to create resultant tones (difference tones) that fall into the commonly accepted auditory frequency range and of course when that happens those resultant tones add life and color to the sound; remove them and out goes the life and the color. The list of examples could go on and on if we examined the musical nature of noise, transients, dynamic levels, phase, etc.
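For readers who like to see the arithmetic, here is a minimal sketch of the difference-tone example. The 20 Hz to 20,000 Hz range is the commonly quoted figure and is an assumption here, as are the two example frequencies and the function names; none of this models how the tones actually interact, it only shows where the difference frequency lands.

```python
# Commonly quoted limits of hearing, in Hz (an assumption for illustration).
AUDIBLE_RANGE_HZ = (20.0, 20_000.0)


def difference_tone(f1_hz, f2_hz):
    """Return the difference frequency of two tones, in Hz."""
    return abs(f1_hz - f2_hz)


def is_in_audible_range(f_hz, audible=AUDIBLE_RANGE_HZ):
    """Check whether a frequency falls inside the assumed audible range."""
    low, high = audible
    return low <= f_hz <= high


# Two hypothetical tones, both above the assumed upper limit of hearing.
f1, f2 = 24_000.0, 25_000.0
diff = difference_tone(f1, f2)

print(is_in_audible_range(f1), is_in_audible_range(f2))  # neither tone is in range
print(diff, is_in_audible_range(diff))                   # their difference is
```

Filter out everything above the upper limit before those two tones have a chance to interact and the 1,000 Hz resultant never appears.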

The discussion in the paragraph above is all pure mechanics but it inflects on metaphysics, an area of thought concerned with where the soul in the sound resides. We won't go into a discussion of soul and sound because it would make most audio mechanics squirm. But if you're really interested in musical values don't waste a step when going there and you would be well served by starting with a book by Hazrat Inayat Khan called THE MUSIC OF LIFE.

Any audio recording by its nature is a reduction in the quality and a distortion of the sound that serves the music. Each piece of music recording equipment has its own sound signature, a signature that's easy to hear if your hearing system is in good working order and that's implied if you can understand how to read its specification list - upper and lower frequency response limits, response curve (which frequencies it tends to emphasize and which it tends to de-emphasize), transient response, types and levels of distortions, and much more including loads of other loss and distortion details that the particular engineering group defining the specifications decided not to include. Add together the colorations (sound signatures) of all those recording devices - microphones, mixers, monitors, etc. - and the result is not what you'd hear if you were in the presence of the living unamplified music where the instruments project the musical thought unfettered. Instead you end up with canned audio colors more or less representative of the music. We are at the point in the history of music where the majority of music consumers act as though they believe those canned audio colors are the real music, the music just as it should be.

What else is there? those music consumers might ask. And they might ask that question because if they go to concerts what they'll most likely hear is music that sounds as though it's been recorded. And that's because it's being amplified and colored by all the usual audio suspects - microphones, mixers, amplifiers, monitors, and the audio engineer at the helm. Very few people today hear music that is not colored by an audio amplification system. They have no idea of what they're missing so of course you can't expect them to care about what they don't know and that also holds true for most audio engineers who will rarely if ever experience music in its natural state.

I started writing this essay after weeks of exploring and testing compression algorithms on my own music that I was posting on my site. After all that time and effort I'm still very unhappy with what happens to my music when it's treated to the compression algorithms targeted at Internet distribution. Yes I do realize the presentational and demonstrational value of posting compressed excerpts of my work on my site, but doing so remains a painful exercise in compromise. And that's because compression is actually reflecting my work in a distorted mirror marred by gaps, bumps, and assorted other expulsions.

What bothers me most about posting the excerpt Study 6 - Keb from Pythagoras & Pellegrino In Petaluma is that the simple purity of the sound is badly marred by a common digital compression technique. That compression technique involves, at some arbitrary period, doing comparisons of the information fields as they fly by moment to moment. The tack involves analyzing an audio reference field and then, in the new field, "pointing back" to the reference field to what hasn't changed in the information and just documenting what's new and carrying over what "hasn't changed." That technique is designed to save considerable file space (at the cost of musical subtlety). One problem with this tack involves the process of determining just how much change is required to be recognized by the compression algorithm as a change worthy of note. If the information is deemed too subtle, good-bye, it falls through the crevice. Another problem is that in the process of creating the fields the sound is sampled and compared at an arbitrary period and that results in interruptions that'll mar any music based on continuous sound with subtle variations; and that's an apt description of what happens to the sound for Study 6 - Keb. The compression algorithm is a machine for determining musical value but don't hold your breath while you're waiting for the audio engineers to own up to it; they're just trying to save you some file space and give you more for your money. And in a consumer culture, please tell me what's wrong with that? And we're back again to the issue of musical values.
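The field-comparison idea described above can be sketched in a few lines. This is a toy illustration only, not the code of any real compression format: the frame values, the threshold, and the names encode and decode are all hypothetical. Each new field records only the values that differ from the reference by more than the threshold; everything else is "pointed back" to the reference.

```python
def encode(frames, threshold):
    """Store the first frame in full, then only the changes that exceed threshold."""
    reference = list(frames[0])
    encoded = [list(reference)]
    for frame in frames[1:]:
        changes = {}
        for i, value in enumerate(frame):
            # A change smaller than the threshold is treated as "hasn't changed".
            if abs(value - reference[i]) > threshold:
                changes[i] = value
                reference[i] = value
        encoded.append(changes)
    return encoded


def decode(encoded):
    """Rebuild the frames by carrying the reference forward and applying changes."""
    current = list(encoded[0])
    frames = [list(current)]
    for changes in encoded[1:]:
        for i, value in changes.items():
            current[i] = value
        frames.append(list(current))
    return frames


# Hypothetical fields: the middle value drifts subtly, the last one jumps once.
frames = [
    [0.50, 0.20, 0.10],
    [0.50, 0.21, 0.40],
    [0.50, 0.22, 0.40],
]

decoded = decode(encode(frames, threshold=0.05))
for original, rebuilt in zip(frames, decoded):
    print(original, "->", rebuilt)
```

In this sketch the big jump survives and the slow drift in the middle value does not; it comes back frozen at 0.20 in every field. Who picks the threshold picks what counts as musically worth keeping.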

What appears in the essay above is neither an apology nor an excuse for what you'll hear when you download my music excerpts or my video excerpts. It's just some light and a little heat on the nature of delivering music on the Internet in July 2003. The function of the excerpts is to provide a more complete sense of what I'm discussing in sections of this site such as Compositional Thinking and Visual Music. And as the audio engineers might say "nobody will notice the difference anyway." But I do and that's the subject of all this thinking out loud. Hope it didn't hurt your ears ;-)

