INTRODUCTION TO INTERNET AUDIO

Basic Overview of Digital Audio

Your ears are analog devices. This means that as sound waves reach your ears they are converted into mechanical pulses that the brain can understand. Your computer is a binary device, which means that it can only understand messages that are described in ones and zeros.

In order to convert an analog signal to a digital signal, a converter executes several operations. The main operation is the sampling of the incoming signal and the conversion of each sample into a 16-bit description. The standard for digital audio on CD is a 16-bit word length at a 44.1 kHz sampling rate. This rate was standardized early on. A fellow named Nyquist figured out that the sample rate needs to be at least twice the highest frequency you want to hear. Most people can hear up to 20 kHz, so it was decided to go a little bit further than that, to about 22 kHz, which calls for a sample rate of 44.1 kHz. The range of human hearing is 20 Hz to 20 kHz. 20 Hz is the lowest frequency you can hear. Rap records try to utilize low frequencies. 20 kHz is the highest frequency you can hear. A dentist's drill produces a very high frequency.

Electrically, an analog audio signal looks like a bunch of wavy lines on an oscilloscope. When you use a hard disk audio recorder on your computer, the program will represent audio in this manner.
The converter looks at the amplitude of the incoming signal 44,100 times per second! The amplitude is then described using 16 binary digits (bits), each either a one or a zero. This 16-bit number is called a word. This stream of words is recorded onto your hard drive and is converted back into analog audio when the program you are using to edit your audio plays it back. In other words, every such audio file on the computer is just a series of ones and zeros grouped into 16-bit words. These audio files are referred to as uncompressed digital audio. The most popular file formats for uncompressed digital audio are .wav (Waveform Audio File Format, a Windows native file format), .aiff (Audio Interchange File Format, the Macintosh counterpart of a .wav) and .sdii (Sound Designer II, a proprietary file format used by Digidesign for their suite of programs such as ProTools, Sound Designer and AVID).
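
As a rough sketch of what the converter does, the Python below samples a pure tone 44,100 times per second and turns each amplitude into a 16-bit word. The 440 Hz tone, the function name sample_tone and the use of a computed sine wave in place of a real incoming signal are assumptions made for illustration; a real converter does this job in hardware.

```python
import math
import struct

SAMPLE_RATE = 44_100                       # samples per second (the CD rate)
BIT_DEPTH = 16                             # bits in each sample word
MAX_AMPLITUDE = 2 ** (BIT_DEPTH - 1) - 1   # 32767, the largest signed 16-bit value


def sample_tone(freq_hz, duration_s, level=0.5):
    """Sample a sine wave and round each sample to a signed 16-bit integer."""
    count = round(SAMPLE_RATE * duration_s)
    words = []
    for n in range(count):
        t = n / SAMPLE_RATE                                      # time of sample n
        amplitude = level * math.sin(2 * math.pi * freq_hz * t)  # between -1.0 and 1.0
        words.append(round(amplitude * MAX_AMPLITUDE))
    return words


words = sample_tone(440, 1.0)   # one second of a hypothetical 440 Hz test tone
print(len(words))               # 44100 words for one second of mono audio

# Show one word as its 16 ones and zeros (two's complement for negative values).
print(format(words[10] & 0xFFFF, "016b"))

# Pack the words as little-endian 16-bit values, two bytes per word.
raw = struct.pack("<%dh" % len(words), *words)
print(len(raw))                 # 88200 bytes
```

One second of mono audio comes to 88,200 bytes here; stereo doubles that, which is where the large file sizes of uncompressed audio come from.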

Once digitized, audio can then be manipulated and processed using simple mathematical calculations like addition, multiplication and time delay to create effects that we know as reverb, equalization, echo and phase shifting (a Jimi Hendrix favorite).
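
To make the idea concrete, here is a minimal sketch of three such calculations in Python. It assumes samples are held as plain numbers between -1.0 and 1.0, and the function names gain, mix and echo are made up for this example. Real reverb and equalization are typically built from many operations of this kind rather than a single one.

```python
def gain(samples, factor):
    """Volume change: multiply every sample by a constant."""
    return [s * factor for s in samples]


def mix(a, b):
    """Mixing two signals: add them together sample by sample."""
    return [x + y for x, y in zip(a, b)]


def echo(samples, delay, decay):
    """Echo: add a delayed, quieter copy of the signal to itself.

    delay is measured in samples; decay is the level of the repeat.
    """
    out = list(samples) + [0.0] * delay
    for i, s in enumerate(samples):
        out[i + delay] += s * decay
    return out


dry = [1.0, 0.0, 0.0, 0.0]            # a single click followed by silence
print(gain(dry, 0.5))                 # [0.5, 0.0, 0.0, 0.0]
print(mix([1.0, 2.0], [0.5, 0.5]))    # [1.5, 2.5]
print(echo(dry, 2, 0.5))              # [1.0, 0.0, 0.5, 0.0, 0.0, 0.0]
```

At 44.1 kHz, a delay of 22,050 samples would place the repeat half a second after the original sound.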

Recording audio onto your hard disk is easy with the right tools, but the file size is huge. One minute of stereo audio at 16-bit, 44.1 kHz has a file size of about 10.5 megs. This may not seem like a lot of space if you have a 27 gig hard drive, but let's look at it the way the Internet sees it. With a modem speed of 28.8 Kbps, each meg of information takes about four and a half minutes to download, even under ideal conditions. So a 60-second spot will take close to 50 minutes to download. Not a very efficient way of transmitting data! For this reason, several companies have developed methods of reducing the file size so that reasonable quality can be maintained while the file sizes go down dramatically.
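
The arithmetic can be checked with a few lines of Python. This is a back-of-the-envelope sketch that assumes one meg is 1,000,000 bytes and that the modem delivers its full 28,800 bits per second with no overhead, which real connections rarely manage. The function names are invented for the example.

```python
def pcm_bytes(seconds, sample_rate=44_100, bit_depth=16, channels=2):
    """Size in bytes of uncompressed audio of the given length."""
    return seconds * sample_rate * bit_depth * channels // 8


def download_seconds(num_bytes, link_bits_per_second):
    """Best-case transfer time: total bits divided by the link speed."""
    return num_bytes * 8 / link_bits_per_second


size = pcm_bytes(60)                          # a 60-second stereo spot
print(size)                                   # 10584000 bytes, about 10.6 megs
print(download_seconds(1_000_000, 28_800))    # seconds per meg, a bit over 4.5 minutes
print(download_seconds(size, 28_800) / 60)    # 49.0 minutes for the whole spot
```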


Internet Audio Technologies

As we learned, several companies have developed methods of reducing the file size of 16-bit, 44.1 kHz audio to a more manageable size for Internet distribution. The tool that reduces the file size is known as a codec, which is short for compression/decompression. Remember that .wav's and .aiff's are uncompressed audio. Applying a codec to an uncompressed audio file will yield a compressed file that is smaller in size yet (hopefully) maintains the sonic integrity of the original file. You might be familiar with WinZip or StuffIt. These programs compress computer data into smaller files that can be emailed or distributed in less time over the Internet.

A codec decides which information is unnecessary and throws it away, and as a result the file size is smaller. We learned that stereo audio is about 10.5 megs per minute. Mono audio will be half that size, since stereo is actually two mono files! A codec works in a similar way, but it does its magic by reducing bit resolution (16 to 8 to 4 bits) and reducing sample rates (44.1 kHz to 32 kHz to 22 kHz to 11 kHz).

Bit resolution is an important component of the fidelity of an audio file. Each reduction in bit resolution results in a less accurate description of the amplitude of each sample. For example, if I asked you to measure a wall using only full sheets of 8.5" x 11" paper, you would be able to give me a number (say 10 sheets) that represents the height of the wall. When you get to that last sheet of paper, you might find that the wall is actually 9.5 sheets high, but the criterion is to describe the height using whole sheets of paper. So you opt to say 10 sheets. This is equivalent to 8-bit resolution. Now remeasure that wall with index cards. You will find that you can get much closer to describing the actual height of the wall because your measuring unit is smaller. This is equivalent to 16-bit resolution.
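
The same effect can be seen with numbers. The sketch below rounds one amplitude to the nearest value that a word of a given size can hold, then prints how far off the result is. It assumes amplitudes run from -1.0 to 1.0 and rounds to the nearest step rather than always rounding up as in the paper example; the function name quantize and the test value 0.3 are arbitrary choices.

```python
def quantize(value, bits):
    """Round value (between -1.0 and 1.0) to the nearest level a signed word can hold."""
    steps = 2 ** (bits - 1) - 1      # 32767 steps for 16-bit, 127 for 8-bit, 7 for 4-bit
    return round(value * steps) / steps


value = 0.3                          # the "true" amplitude of one sample
for bits in (16, 8, 4):
    stored = quantize(value, bits)
    print(bits, "bits: stored", stored, "error", abs(stored - value))
```

Each drop in word size makes the measuring unit coarser, so the stored value lands further from the true amplitude.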

Sample rate reduction affects the frequency response of your audio file. Remember Nyquist? The sample rate needs to be at least twice the highest frequency you plan to encode. 44.1 kHz is the standard for CD-quality audio. This means that the upper limit on the high end is 22.05 kHz, which is beyond what most people can hear. A 32 kHz sample rate will give you a high-end limit of 16 kHz, which is just below what the average middle-aged person can hear (we lose high-end response as we age). This sort of reduction in high end is almost undetectable to the average listener. A 22 kHz sample rate will limit the high end to about 11 kHz. Cymbals on a drum set live in the 10 kHz range, so you can see that we are still at an acceptable frequency response (albeit slightly dull) which will be perceived as good quality by the majority of listeners. Also notice that the sampling frequency is now half of its original 44.1 kHz, and therefore the file size is also half as large. Each reduction of these parameters yields a smaller file size, but at the cost of fidelity. The race in this field is to provide a small file size and excellent quality. No small task.
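
The trade-off can be tabulated in a few lines of Python. This sketch assumes 16-bit stereo audio, one meg equal to 1,000,000 bytes, and the exact rates 22,050 Hz and 11,025 Hz for what are loosely called 22 kHz and 11 kHz. The function names are invented for the example.

```python
def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency a given sample rate can encode: half the sample rate."""
    return sample_rate_hz / 2


def megs_per_minute(sample_rate_hz, bit_depth=16, channels=2):
    """Uncompressed size of one minute of audio, in millions of bytes."""
    return 60 * sample_rate_hz * bit_depth * channels / 8 / 1_000_000


for rate in (44_100, 32_000, 22_050, 11_025):
    print(rate, "Hz: high end up to", nyquist_limit_hz(rate), "Hz,",
          round(megs_per_minute(rate), 2), "megs per minute")
```

Halving the sample rate halves both the highest frequency that survives and the size of the file.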

There are two types of delivery over the Internet: download and streaming. Every format can be downloaded. You can post a .wav or .aif, or send one to someone via email. They might not be too happy about it, but it can be done. The result of downloading a .wav or .aif is massive connect times on the Internet because the files are so big.

Someone, somewhere thought that if the file could be reduced in size while maintaining the quality, then they would be the kings of Internet delivery. The most common formats for downloadable audio delivery are mp3 and mp2. These codecs analyze the audio information and achieve compression of about 5:1 with almost no detectable loss of fidelity. This means that a 60-second spot that was originally 10.5 megs is now 2.1 megs or smaller. This is accomplished through the use of variable sample rates, variable bit rates and perceptual coding.

Look at a typical radio spot with intro music, voice-over, then outro music. When the voice comes in, the music drops down in level and is at times masked by the voice itself. Codecs analyze the waveforms and give the most bits to the voice (which is up front) and fewer bits to the music in the background. There is no need to encode the music in full fidelity since it is covered by the voice most of the time.

Streaming media is the ability to see or hear content on demand from a web site. This is a hot field in the Internet world. The main players in the field of streaming are Apple's QuickTime, RealNetworks' RealMedia and Microsoft's Windows Media Player. These three companies lead the march to provide high quality media streams at the lowest bit rate possible. At one time, each company's player would only play its own files, but these days most players are able to decode all the other formats. Isn't direct competition grand?


Creating Digital Sound Files

The studio that creates your spots will most likely create them in one of several popular programs available to the pro audio market. These programs go by names such as ProTools, Cool Edit Pro, Sound Forge, Sonic Solutions, SADiE, TripleDAT, SAW and SoundEdit 16. Your studio already produces high quality .wav or .aif files that can be converted into other formats using a number of software encoders available. The most popular of these are Real Producer, QuickTime Pro, Audio Catalyst and Windows Media Encoder.