Running Linux, 5th Edition - Matthias Kalle Dalheimer [157]
Other issues that arise when storing sound in digital format are the number of channels and the encoding format. To support stereo sound, two channels are required. Some audio systems use four or more channels.
Often sounds need to be combined or changed in volume. This process, known as mixing, can be done in analog form (e.g., with a volume control) or digitally by the computer. Conceptually, two digital samples can be mixed simply by adding them, and volume can be changed by multiplying each sample by a constant value.
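The adding and multiplying just described can be sketched in a few lines of Python. This is only an illustration, assuming 16-bit signed samples held in plain lists; the function names are our own:

```python
MIN16, MAX16 = -32768, 32767   # range of a 16-bit signed sample

def clip(x):
    """Keep a value within the 16-bit range to avoid overflow."""
    return max(MIN16, min(MAX16, x))

def mix(samples_a, samples_b):
    """Mix two sample streams by adding corresponding samples."""
    return [clip(a + b) for a, b in zip(samples_a, samples_b)]

def change_volume(samples, factor):
    """Change volume by multiplying each sample by a constant."""
    return [clip(int(s * factor)) for s in samples]
```

Note that adding two loud signals can exceed the sample range, which is why the sketch clips the result; real mixers clip or scale in the same spirit.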
Up to now we've discussed storing audio as digital samples. Other techniques are also commonly used. FM synthesis is an older technique that produces sound using hardware that manipulates different waveforms such as sine and triangle waves. The hardware to do this is quite simple and was popular with the first generation of computer sound cards for generating music. Many sound cards still support FM synthesis for backward compatibility. Some newer cards use a technique called wavetable synthesis that improves on FM synthesis by generating the sounds using digital samples stored in the sound card itself.
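The core idea of FM synthesis can also be sketched in software: one sine wave (the modulator) perturbs the phase of another (the carrier). This is a minimal illustration, not how any particular sound card works; the sampling rate, frequencies, and modulation index are arbitrary values chosen for the example:

```python
import math

RATE = 8000                      # assumed sampling rate for this sketch
CARRIER, MODULATOR, INDEX = 440.0, 220.0, 2.0

def fm_sample(n):
    """One sample of a frequency-modulated tone at sample index n."""
    t = n / RATE
    # The modulator sine wave varies the carrier's phase over time.
    return math.sin(2 * math.pi * CARRIER * t
                    + INDEX * math.sin(2 * math.pi * MODULATOR * t))

tone = [fm_sample(n) for n in range(RATE)]   # one second of samples
```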
MIDI stands for Musical Instrument Digital Interface. It is a standard protocol for allowing electronic musical instruments to communicate. Typical MIDI devices are music keyboards, synthesizers, and drum machines. MIDI works with events representing such things as a key on a music keyboard being pressed, rather than storing actual sound samples. MIDI events can be stored in a MIDI file, providing a way to represent a song in a very compact format. MIDI is most popular with professional musicians, although many consumer sound cards support the MIDI bus interface.
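To see how compact MIDI events are, here is a sketch of building a single "note on" event. The three-byte layout (a status byte of 0x90 plus the channel number, then a note number and a velocity, each 0-127) comes from the MIDI standard; the function name is just illustrative:

```python
def note_on(channel, note, velocity):
    """Build a MIDI 'note on' message: status byte (0x90 | channel),
    note number (60 = middle C), and key velocity."""
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

msg = note_on(0, 60, 100)   # press middle C on channel 1, moderately hard
```

Three bytes describe the entire event, versus the thousands of samples per second needed to store the resulting sound directly.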
File Formats
We've talked about sound samples, which typically come from a sound card and are stored in a computer's memory. To store them permanently, they need to be represented as files. There are various methods for doing this.
The most straightforward method is to store the samples directly as bytes in a file, often referred to as a raw sound file. The samples themselves can be encoded in different formats. We've already mentioned sample size, with 8-bit and 16-bit samples being the most common. For a given sample size, the samples might use a signed or an unsigned representation. When a sample takes more than 1 byte of storage, the byte ordering convention (little-endian or big-endian) must be specified as well. These issues are important when transferring digital audio between programs or computers, which must agree on a common format.
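The byte-ordering issue can be seen directly with Python's standard struct module; a small sketch, assuming 16-bit signed samples:

```python
import struct

samples = [0, 1000, -1000, 32767]

# Pack the same 16-bit signed samples with both byte orderings:
# "<h" is little-endian, ">h" is big-endian.
little = b"".join(struct.pack("<h", s) for s in samples)
big    = b"".join(struct.pack(">h", s) for s in samples)

# The raw bytes differ, so a raw file is ambiguous unless both
# ends of a transfer agree on the ordering in advance.
```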
A problem with raw sound files is that the file itself does not indicate the sample size, sampling rate, or data representation; this information must be known in advance to interpret the file correctly. Self-describing formats such as WAV add a header to the file that records this information, so that applications can determine how to interpret the data from the file itself. These formats standardize how to represent sound information in a way that can be transferred between different computers and operating systems.
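Python's standard wave module illustrates the idea: the WAV header records the channel count, sample width, and sampling rate, so a reader can recover them from the file alone. A sketch using an in-memory buffer in place of a real file:

```python
import io
import struct
import wave

buf = io.BytesIO()

# Write a tiny stereo WAV: 16-bit samples at 44,100 samples per second.
w = wave.open(buf, "wb")
w.setnchannels(2)
w.setsampwidth(2)          # 2 bytes = 16-bit samples
w.setframerate(44100)
w.writeframes(struct.pack("<4h", 0, 1000, -1000, 32767))  # 2 stereo frames
w.close()

# A reader learns the format from the header, not from outside knowledge.
buf.seek(0)
r = wave.open(buf, "rb")
info = (r.getnchannels(), r.getsampwidth(), r.getframerate(), r.getnframes())
```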
Storing the sound samples in the file has the advantage of making the sound data easy to work with, but has the disadvantage that files can quickly become quite large. We earlier mentioned CD audio, which uses a 16-bit sample size and a rate of 44,100 samples per second, with two channels (stereo). One hour of this Compact Disc Digital Audio (CDDA) data represents more than 600 megabytes of data. To make the storage of sound more manageable, various schemes for compressing audio have been devised. One approach is simply to compress the data using the same compression algorithms used for computer data. However, by taking into account the characteristics of human hearing, it is possible to compress audio more efficiently by removing components