Introduction to Digital Audio
Fall Semester 2002
Instructor: John Sappington, 415.577.0807
Class Schedule: Lecture – Thursday 7:00-9:50 PM
The instructor will be available to assist MULTM 130 students in the lab (A-303), from 5:30 to 6:30 pm on Mondays and Tuesdays. For additional assistance or consultation please arrange to meet with the instructor during office hours.
Office Hours: In the lab (BFL-205) at the following times:
Monday 5:30 – 6:30 pm
Thursday 5:30 – 6:30 pm
Appointments are preferred though not required.
Students will learn the basics of digital audio production and presentation, with a focus on presentation via network-based technologies.
Students are required to obtain a 250MB Zip disk formatted for either the Macintosh or PC to store small project files.
Students should also purchase several blank recordable CDs to store larger movie files. Ideally the student will have a minimum of 4 CDs to last the length of the semester. Some students may need significantly more, others significantly fewer. Recordable CDs are especially recommended for students who intend to work on assignments at home.
Students are also required to have an e-mail account. This can be accessed through the lab and may be a free account like those available from Hotmail, Yahoo, Freemail, Excite, etc. The instructor will assist students in obtaining an e-mail account, if necessary, on the first day of class only. Students will be expected to check this e-mail account at least once a week. Assignment grades, class progress, and general class announcements will be reported via e-mail.
50% of student grades will be based on class participation. Students must attend class in order to effectively participate.
30% of student grades will be based on the completion of periodic exercises. These exercises will not be graded for skill or content, but will be checked only for completion. Incomplete assignments will be considered undone. Grades will correspond to the percentage of weekly exercises completed as follows.
20% of student grades will be determined by averaging grades on three evaluative class assignments and one final exam.
Students are expected to attend every class meeting in full unless they have received prior permission from the instructor. Attendance will be taken at the beginning of each class meeting. Anyone absent when attendance is taken will be assumed absent from the class. If you are late to class, it is your responsibility to make sure your attendance is acknowledged by talking to the instructor.
If a student misses two weeks of class without being excused, it can be assumed that the student is intending to drop the class. Because this class meets only once a week that means two absences will be grounds for the instructor to drop you from the class.
If you are intending to drop the class, it is your responsibility to drop the class. You should not assume that the instructor will automatically drop you because of absences. If you stop attending classes, you do not drop the class, and the instructor has not dropped you from the class, the instructor may be required to give you a grade of F for the class.
Introduction to Digital Audio
(subject to change)
Fall Semester 2002
Class web site: http://www.basearts.com/audio.htm
Session II
#1 - Submit a Microsoft Word file containing a paragraph which briefly describes your audio project. Describe original audio sources, intended output format and/or use. What programs will be required to complete this project? What other media will be used in conjunction with the audio (web pages, graphics, video, animation, etc.)? This file should be named as follows: first initial, last name.1.doc – jsappington.1.doc
#2 – Copy to my drop box a first draft of your audio file(s) as 16 bit, 44.1k .wav or .aiff file(s). This file should be named as follows: first initial, last name.2.(wav or aiff) – jsappington.2.wav
#3 – Copy to my drop box your final draft of the audio recording as a 16 bit, 44.1k wav or aiff file. This file should be named as follows: first initial, last name.3.(wav or aiff) – jsappington.3.wav
#4 – Final – Copy to my drop box the final presentation of your project. It must be at least 1 minute in length and must contain a combination of 3 sources. Sources may include CD audio, mp3s, collected wav or aiff files, and field recordings (personal recordings made from real-life sources). The presentation may take any of the forms described in your original proposal (Flash, embedded in HTML, or as a 16 bit stereo wav or aiff). If you are submitting a wav or aiff file, please include playlist information within the file and complete the authorship summary. This file/folder should be named as follows: first initial, last name.4 (individual files must include .wav or .aiff).
– Peak/Sound Forge - Extracting existing
- Digital Signal Processing
/ Effects - Begin class
– File types / saving out / exporting
to various delivery
audio via the Web .
Assignment: Bring in source audio / recordings / music. Individual project descriptions are submitted. #1
Integrating audio into media applications.
Assignment: Present first draft audio files from within
Assignment: Present second draft audio file from within another media delivery application (Shockwave, Flash, Premiere). #3
- Submit Final Draft - #4
Guests: Matmos - http://www.brainwashed.com/matmos/
Final Presentations & Critique
Introduction to Digital Audio
from MSP: The Documentation
Cycling '74 and IRCAM, December 1997
Technical Links - Application Notes - Codecs - Artists - Preparing Files - Digital Audio Overview - Tim White's Online Resources
|Resources - Technical||Artists/ Experimental|
Adobe Technical Guide to Digital Audio
If you like to choose music to match your mood or perhaps pull you out of one, IBM has created an online tool that will be of interest. The tool called the Glass Engine allows users to surf a selection of Philip Glass works not only by standard categories like work title and year, but also by more subjective characteristics like joy, sorrow and density. If a selection is too soft, crank up the intensity. This is not an exercise in Artificial Intelligence: Glass' longtime producer and sound designer, Kurt Munkacsi, has assigned values to the tracks.
John Oswald : Plunderphonics
Leonardo / the International Society for the Arts, Sciences
Realab - Realtime Experimental Audio Laboratory
UCI Music Department
Center for New Music and Audio Technologies
CDeMUSIC / Electronic Music Foundation, Ltd.
The Music, Mind and Machine Group at the MIT Media Laboratory
PixelZoom [feat. spo0ky]
The PixelFocus and PixelZoom projects are based on the spatial
dimension of sound. Using multiple microphones connected to a stereo input
or a stereo music cd, we can manipulate images based on the position of
the sound source.
George Legrady's Information
Theory - noise
A Catalog of Kinetic Sculpture by Norman Tuck
Electronic Music - Historical Links
About Shockwave Audio
Shockwave Audio is a technology that makes sounds smaller and plays them faster from disk or over the Internet.
Shockwave Audio can compress the size of sounds by a ratio of up to 176:1 and is streamable, which means Director doesn't have to load the entire sound into RAM before it begins playing. Director starts to play the beginning of the sound while the rest of the sound is still streaming from its source, whether coming from disk or over the Internet. When used properly, the Shockwave Audio compression and streaming features provide fast playback of high-quality audio, even for users with relatively slow modem connections to the Internet.
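As a rough check on the 176:1 figure, the ratio can be worked out from bit rates. The sketch below assumes the uncompressed source is 16-bit, 44.1 kHz stereo and that the most compressed stream runs at 8 Kbps. Neither assumption is stated in the text above, and the function name is only illustrative.

```python
def compression_ratio(compressed_kbps: float,
                      sample_rate_hz: int = 44_100,
                      bit_depth: int = 16,
                      channels: int = 2) -> float:
    """Ratio of the uncompressed PCM bit rate to the compressed bit rate."""
    # Uncompressed bit rate in kilobits per second (1 Kbps = 1000 bits/s).
    uncompressed_kbps = sample_rate_hz * bit_depth * channels / 1000
    return uncompressed_kbps / compressed_kbps


print(round(compression_ratio(8), 1))   # 176.4
print(round(compression_ratio(16), 1))  # 88.2
```

Under these assumptions, 8 Kbps works out to roughly 176:1, which matches the upper figure quoted above.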
Compression quality in Shockwave Audio
Although Shockwave Audio uses advanced compression technology that alters original sounds as little as possible, the more a sound is compressed the more it is changed.
Set the amount of compression by choosing a bit rate setting in any of the Shockwave Audio Xtras. The bit rate is not related to sampling rates you may have used in other audio programs. Try compressing the same sound at several different bit rates to see how the sound changes.
Choose the bit rate appropriate for the intended delivery system (modem, ISDN, CD-ROM, hard disk, and so on), the type of movie, and the nature of the sound itself. Voice-over sound quality, for example, may not need to be as high as that of music. Test the sound on several systems to find the right balance between quality and performance.
The more compressed a sound is, the faster it streams. If you choose to use a high quality and low degree of compression, a slow delivery system may not be able to send the data fast enough, resulting in gaps during playback. Most developers choose 16 Kbps for the best results over the Internet.
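The trade-off described above comes down to comparing the sound's bit rate with what the connection can sustain. A minimal sketch, assuming nominal connection speeds in Kbps and an arbitrary 20% allowance for overhead and network traffic (the function name and the allowance are illustrative, not taken from the Shockwave documentation):

```python
def can_stream(audio_kbps: float, connection_kbps: float,
               headroom: float = 0.8) -> bool:
    """Return True if the connection can deliver the audio without gaps.

    headroom is the assumed fraction of the nominal connection speed
    that is actually usable for audio data.
    """
    return audio_kbps <= connection_kbps * headroom


# List which bit rates each (nominal) connection could carry.
for name, speed in [("28.8 modem", 28.8), ("56k modem", 56), ("ISDN", 128)]:
    usable = [rate for rate in (8, 16, 32, 64, 128) if can_stream(rate, speed)]
    print(name, usable)
```

With this allowance, a 28.8 modem can carry 16 Kbps but not 32 Kbps, which is consistent with the 16 Kbps recommendation for Internet delivery.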
The following table suggests some general guidelines for
setting the bit rate for different delivery systems. It also provides
a rough estimate of perceived quality for different rates of compression.
Note that real transmission times may be slower than the times shown in
this table, depending on network traffic and server load.
Note: Any sound compressed at less than 48 Kbps is converted
16-bit Big Endian and 16-bit Little Endian
24-bit Integer and 32-bit Integer
32-bit Floating Point and 64-bit Floating Point
MetaSound/MetaVoice Codecs (Mac OS only)
QDesign Music Codec
MACE 3:1 and MACE 6:1
MPEG Layer-3 Codec
Indeo® Audio Software
Microsoft GSM 6.10
Microsoft CCITT G.711
Microsoft IMA ADPCM
Keep everything DDD if you can. Audiophiles use the three Ds to refer to a track that's recorded digitally, produced (mixed and edited) digitally, and mastered digitally. Although this here Webmonkey is an analog junkie who smiles lovingly upon 2-inch tape, he knows that unless your product is going to vinyl, you should take advantage of the cleanliness of a fully digital process.
Make sure you normalize your files. Most audio editors have a "normalize" command, which raises or lowers the level of the whole file so that its loudest peak sits at a chosen level. This leaves you with a strong signal that uses the available range without clipping, which is very important for digitized audio.
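Peak normalization, the most common form of the "normalize" command, can be sketched in a few lines. This assumes samples are floating-point values between -1.0 and 1.0. Real editors work on their own sample formats and may offer RMS or loudness normalization instead, so treat the function below as an illustration only.

```python
def normalize_peak(samples, target_peak=1.0):
    """Scale all samples by one gain so the largest peak equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: there is nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


quiet = [0.0, 0.1, -0.25, 0.2]
loud = normalize_peak(quiet)
print(max(abs(s) for s in loud))  # 1.0
```

Because every sample is multiplied by the same gain, the shape of the waveform is unchanged. Only the overall level moves.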
Use a good pair of headphones or speakers as your primary monitor. This may seem obvious, but it's crucial to do this throughout the entire production and post-production process.
Equalize your files. As you probably already know, equalization
balances waveforms so that the output contains the proper ratio of treble
to bass. Optimally you should do this with outboard gear - like a rack-mounted
unit - but most audio editors have at least a five-band EQ. For advanced
audio engineers, the EQing process is part of a larger affair called mixing,
but everyone doing any audio should equalize the files before mastering
them. And be careful with the bass - built-in computer speakers can't
handle bass response very well.
Downloadable audio was born from the combination of two technologies: the internet and audio compression. With internet access spreading into homes over the past decade, only one element was missing to satisfy the obvious desire to get music from websites: audio files which could be downloaded in a reasonable amount of time. CD-quality audio (44.1 kHz, 16 bit, stereo) occupies about 10 MB per minute in storage. A typical song of 30-40 MB would take more than 2 hours to download using conventional modems, making internet distribution impractical. Enter MP3.
While many audio compression methods exist, MPEG-1 Audio Layer 3 (or mp3) was the breakthrough everyone was waiting for. Utilizing the concept of masking, this compression technology provided CD-like quality at a fraction of the file size - that 30 MB song, once compressed, took up only 3 MB. Suddenly, music was downloadable at 15-20 minutes per song.
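The storage and download figures quoted here can be checked with simple arithmetic. The sketch below assumes stereo audio, decimal megabytes, and a 28.8 kbps modem running at its nominal speed with no overhead, so real downloads would be somewhat slower. The function names are illustrative.

```python
def pcm_size_bytes(seconds, sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Size of uncompressed PCM audio in bytes."""
    return int(seconds * sample_rate_hz * (bit_depth // 8) * channels)


def download_seconds(size_bytes, modem_kbps):
    """Ideal transfer time: total bits divided by bits per second."""
    return size_bytes * 8 / (modem_kbps * 1000)


print(pcm_size_bytes(60) / 1_000_000)               # MB per minute of CD audio
print(download_seconds(30_000_000, 28.8) / 3600)    # hours for a 30 MB song
print(download_seconds(3_000_000, 28.8) / 60)       # minutes for a 3 MB MP3
```

One minute of CD audio comes to about 10.6 MB. A 30 MB song needs a little over two hours at 28.8 kbps, while the 3 MB MP3 needs roughly a quarter of an hour once overhead is allowed for.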
MP3 took the music industry by storm, and whatever your views of Napster or its clones, digital downloadable music is here to stay. Music distribution is in a state of transformation the likes of which we have not seen in generations. As broadband connections to the internet become commonplace, and as the internet reaches more homes, people will choose to get their music over the internet. Musicians should embrace this technology and this method of delivery.
The purpose of digital audio is to store sound in a format which is non-destructive, which means it doesn’t get degraded when copied. Analog sound gets worse with every copy made, as with a cassette for instance. Just as digital is converted to analog through a D/A converter, analog is converted to digital through an A/D converter. The fundamental element in this process is called sampling.
To produce CD-quality digital audio, the analog audio must be sampled at 44.1 kHz or higher. The human ear can detect frequencies up to about 20 kHz. According to the Nyquist principle, the sampling frequency must be at least two times the highest frequency that you want represented. Otherwise, aliasing will occur, which means that frequencies above half the sample rate will be misrepresented as lower frequencies, because they were undersampled.
This is the reason for the 44.1 kHz rate – it is roughly twice the 20 kHz that we can hear. Some commercial audio recording software boasts sampling rates as high as 96 kHz, but unless you want your dog to enjoy those higher frequencies which you can’t hear, there’s no physical reason to go above 44.1 kHz.
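To make aliasing concrete, the sketch below computes the frequency at which a pure tone would appear after sampling. It assumes an ideal sampler with no anti-aliasing filter in front of it, and the function name is illustrative.

```python
def alias_frequency(signal_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after ideal sampling."""
    # Frequencies repeat every sample_rate_hz and fold around half of it.
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(alias_frequency(10_000, 44_100))  # 10000: below half the rate, unchanged
print(alias_frequency(30_000, 44_100))  # 14100: folded down into the audible band
```

A 30 kHz tone, which nobody can hear, would show up as a clearly audible 14.1 kHz tone. This is why converters filter out content above half the sample rate before sampling.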
Good temporal resolution can be achieved with commercial analog-to-digital conversion technology. A useful analogy is to compare temporal resolution (sound) with spatial resolution (imaging) – audio frequency is expressed in cycles per second, while spatial frequency is expressed in line pairs per millimeter; the greater the frequency, the better the detail. When converting an analog image into a digital image (as with a digital camera), the spatial resolution or detail of the resulting image depends on how many samples you take of the image over its area, or how many pixels you have. Commercial digital cameras have improved in spatial resolution in recent years, but they remain a far cry from the resolution of conventional film, which is on the order of a few microns.
The goal is to sample the image, or the audio waveform, at a sample rate which exceeds two times the highest frequency that can be resolved by the viewer, or listener. In audio, that frequency is 20 kHz, and the sample rate must therefore be above 40 kHz (44.1 kHz is the current standard for CD-quality audio).
In practice, utilizing
a greater bit depth reduces noise in your audio, as well as quantization
distortion. Quantization refers to the practice of assigning whole numbers
to an input voltage level. Since rounding always occurs, the smaller the
step size the better – a greater number of bits achieves this. CD-quality
audio uses 16 bits. Sometimes bit depth is referred to as resolution,
not to be confused with temporal resolution, which refers to the sample
rate.
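The rounding step in quantization can be illustrated directly. This is a minimal sketch that assumes samples are floats between -1.0 and 1.0 and are stored as signed whole numbers. The function name and the test value are illustrative.

```python
def quantize(sample: float, bits: int) -> float:
    """Round a sample to the nearest step of a signed n-bit grid."""
    levels = 2 ** (bits - 1)                    # 32768 at 16 bits
    code = round(sample * levels)               # the whole number that is stored
    code = max(-levels, min(levels - 1, code))  # clamp to the representable range
    return code / levels                        # value heard on playback


x = 0.123456789
for bits in (8, 16, 24):
    error = abs(quantize(x, bits) - x)
    print(bits, error)
```

The rounding error can never exceed half a step, and each extra bit halves the step size, so the worst-case error shrinks quickly as bit depth grows.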
Greater bit depth increases the dynamic range. The dynamic range is the range between the lowest and highest level that can be reproduced by a system. A system with 16-bit resolution has a dynamic range of 96 dB, where dB refers to the decibel. The decibel is one tenth of a Bel (named after Alexander Graham Bell). This is a logarithmic scale and is relative. A 3 dB change is the minimum perceptible change, and a 10 dB change represents a sound that is twice as loud. Log scales are used frequently in both sound and imaging (as with optical density) to condense the dynamic range. Also, log scales often more closely approximate how our ears and eyes perceive sound and images, and thus their use is properly justified.
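The 96 dB figure follows from the number of levels that 16 bits can represent. A short sketch, assuming the usual definition of dynamic range as 20 times the base-10 logarithm of the ratio between the largest and smallest representable amplitudes (the function name is illustrative):

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of linear PCM with the given bit depth."""
    return 20 * math.log10(2 ** bits)


for bits in (8, 16, 24):
    print(bits, round(dynamic_range_db(bits), 1))
```

Each additional bit adds about 6 dB, so 16 bits gives roughly 96 dB.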
Signal-to-noise ratio is simply the level of the music divided by the level of the noise. All sound has a noise component. Noise refers to random fluctuations in sound which are not associated with the music. The goal is to reduce this noise to acceptable levels, or to separate the signal as far from the noise as possible. A large signal-to-noise ratio represents better reduction of noise. CDs typically achieve a signal-to-noise ratio of about 90 dB, which is enviable when compared with the SNR in medical imaging technologies, for instance.
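Because the ratio is usually quoted in decibels, a small sketch may help. It assumes "level" means RMS amplitude and that signal and noise are available as separate lists of samples, which is rarely the case in practice. The function names are illustrative.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels, from RMS amplitudes."""
    return 20 * math.log10(rms(signal) / rms(noise))


signal = [1.0, -1.0] * 4
noise = [0.001, -0.001] * 4
print(round(snr_db(signal, noise), 1))  # 60.0
```

On this scale, an SNR of 90 dB means the signal amplitude is more than 30,000 times that of the noise.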
File Size Reduction:
The size of an uncompressed
audio file is about 10 MB per minute of audio. This is not a concern from
a storage standpoint anymore, with multi-GB drives, but for web-based
audio this size is excessive. If you were to download a four minute song
using a 28.8 kbps modem, it would take more than 3 hours. Without compression/decompression
algorithms (codecs), the only way to reduce the size of an audio file
is through manually degrading its sound quality in one of three ways.
If all three of these methods are used, you can reduce your file size by a factor of eight. A 31 MB song now takes up only 3.9 MB, which is an acceptable size for downloading over the internet. Unfortunately, the audio quality has been degraded beyond what is acceptable to listeners. This is where compression technology becomes important.
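The factor of eight can be reproduced with a size calculation. The list of three methods is not reproduced above, so this sketch assumes they are halving the sample rate, halving the bit depth, and converting stereo to mono, each of which halves the file size. The three-minute duration and the function name are also illustrative.

```python
def pcm_megabytes(seconds, sample_rate_hz, bit_depth, channels):
    """Uncompressed PCM size in decimal megabytes."""
    return seconds * sample_rate_hz * (bit_depth / 8) * channels / 1_000_000


full = pcm_megabytes(180, 44_100, 16, 2)     # CD quality, stereo
reduced = pcm_megabytes(180, 22_050, 8, 1)   # half rate, half depth, mono

print(round(full, 1), round(reduced, 1), round(full / reduced))
```

Each of the three changes discards information, which is why the result sounds noticeably worse than the original.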
In reality, there are many audio compression formats, but MP3 is the format that has taken the music industry by storm in recent years, and is rapidly changing the landscape in the distribution of music over the internet. If you haven’t heard of MP3, then you probably just returned from your mission – or you’d better use that as your excuse, anyway.
The proliferation of digital music on the internet has since been very rapid. The most commonly downloaded material from the internet is now music. The most commonly requested word in search engines is "mp3". Digital music and the internet are a perfect fit. Why would anyone go to a store, purchase a CD, bring it home and put it in their player, when they can point and click instead? Why would you walk up and down the aisles in a music store and sift through the thousands of titles to find the CD you want (and maybe never find it), when you can type in a few keywords and find the title in a matter of seconds?
The future of MP3 and music distribution holds many unanswered questions regarding formats, copyright infringement, and distribution channels. But one thing is clear – digital music distribution over the internet is here to stay and will continue to expand rapidly as broadband access becomes commonplace. When users can download entire CDs in a matter of seconds, there will be little reason to acquire music in any other way.
Streaming Audio and Encryption:
Encryption is a method of preventing the user from making copies of the music they
download. The recording industry is currently developing a standard, termed
the Secure Digital Music Initiative (SDMI), which will likely be a part
of digital music in the coming years. Some encryption formats exist already.
Below is a description of some streaming formats and those which use encryption:
Since MIDI files are so small (a few kB), their use is well-suited for web-based delivery of audio. However, it’s important to recognize that the music may sound different to each user, since it is played on the computer’s synthesizer, which may vary among users. MIDI is best reserved for instrumental works, and only when the selection of the instrument which plays the music is not essential, since you cannot control this. If you want to deliver music which sounds exactly the way you hear it, MIDI should be avoided.
Conversion Between Formats:
While the RIAA and large record companies have unanimously opposed web-based distribution of MP3 music, many musicians have found MP3 to be an opportunity, especially independent musicians and unsigned artists. Before the internet and MP3, artists had a difficult time getting their music heard and distributed, unless they had a major record deal. Now, artists can place music on their own web pages and music distribution websites which promote their music. Opportunities exist for musicians on mp3.com, iuma.com, and for LDS musicians on ldsmusician.com. Artists are now finding an audience for their music without the help of big record companies. And even major artists are utilizing these new distribution methods.
While the future of
web-based audio holds many unanswered questions, the next three to five
years will be exciting to watch as developments take place. In the meantime,
musicians should take every opportunity that exists to get exposure for
their music, using the internet for such purposes.
Tim White's Online Resources
Sound Synthesis info
Sound Forge 4.5
List of Real Time Software Synthesizers
Sound synthesis - general
Sound synthesis by physical modeling - IRCAM
Sound synthesis by physical modeling - Italy
Chaotic sound synthesis
online voice/speech synthesis
Fourier synthesis (additive synthesis)
CSound for MAC