Introduction to Digital Audio

MULTM 130-8502

BFL-205

Course Syllabus

Fall Semester 2002

Instructor: John Sappington, 415.577.0807

Class Schedule: Lecture – Thursday 7:00-9:50 PM

The instructor will be available to assist MULTM 130 students in the lab (A-303), from 5:30 to 6:30 pm on Mondays and Tuesdays. For additional assistance or consultation please arrange to meet with the instructor during office hours.

Office Hours: In the lab (BFL-205) at the following times:

Monday 5:30 – 6:30 pm

Thursday 5:30 – 6:30 pm

Appointments are preferred though not required.

DESCRIPTION:

Students will learn the basics of digital audio production and presentation, with a focus on presentation via network-based technologies.

REQUIRED TEXTS:

None

REQUIRED SUPPLIES:

Students are required to obtain a 250MB Zip disk formatted for either the Macintosh or PC to store small project files.

Students should also purchase several blank recordable CDs to store larger movie files. Ideally the student will have a minimum of 4 CDs to last the length of the semester. Some students may need significantly more, others significantly less. Recordable CDs are especially recommended for students who intend to work on assignments at home.

Students are also required to have an e-mail account. This can be accessed through the lab and may be a free account like those available from Hotmail, Yahoo, Freemail, Excite, etc. The instructor will assist students in obtaining an e-mail account if necessary only on the first day of class. Students will be expected to check this e-mail account at least once a week. Assignment grades, class progress, and general class announcements will be reported via e-mail.

EVALUATION:

50% of student grades will be based on class participation. Students must attend class in order to effectively participate.

30% of student grades will be based on the completion of periodic exercises. These exercises will not be graded for skill or content, but will be checked only for completion. Incomplete assignments will be considered undone. Grades will correspond to the percentage of weekly exercises completed as follows.

20% of student grades will be determined by averaging grades on three evaluative class assignments and one final exam.

ATTENDANCE POLICY:

Students are expected to attend all of every class meeting unless they have received prior permission from the instructor. Attendance will be taken at the beginning of each class meeting. Anyone absent when attendance is taken will be assumed absent from the class. If you are late to class it is your responsibility to make sure your attendance is acknowledged by talking to the instructor.

If a student misses two weeks of class without being excused, it can be assumed that the student is intending to drop the class. Because this class meets only once a week that means two absences will be grounds for the instructor to drop you from the class.

If you intend to drop the class, it is your responsibility to do so. You should not assume that the instructor will automatically drop you because of absences. If you stop attending, you have not dropped the class, and the instructor has not dropped you, the instructor may be required to give you a grade of F for the class.

Course Schedule

(subject to change)

Class web site: http://www.basearts.com/audio.htm

Session II

Week 1 10.24.02	- Introductions, Digital Audio Theory
Week 2 10.31.02	- Peak/Sound Forge: extracting existing audio samples from the internet, CD and various media. Basic Capture
Week 3 11.7.02	- Digital Signal Processing / Effects. Begin class project outline. Editing
Week 4 11.14.02	- File types / saving out / exporting to various delivery applications (HTML, streaming, interactive). Exporting
Week 5 11.21.02	- Digital audio via the Web. Assignment: bring in source audio / recordings / music; individual project descriptions are submitted. #1
Week 6 11.28.02	- Integrating audio into media applications. Alternatives in Distribution. Assignment: Present first draft audio files from within HTML (embedded as background sound, simple link or streaming). #2
Week 7 12.05.02	- Assignment: Present second draft audio file from within another media delivery application (Shockwave, Flash, Premiere). #3
Week 8 12.12.02	- Submit Final Draft. #4
Week 9	- Guests: Matmos - http://www.brainwashed.com/matmos/ Final Presentations & Critique

Assignments:

#1 - Submit a Microsoft Word file containing a paragraph which briefly describes your audio project. Describe the original audio sources and the intended output format and/or use. What programs will be required to complete this project? What other media will be used in conjunction with the audio (web pages, graphics, video, animation, etc.)? This file should be named as follows: first initial, last name.1.doc - jsappington.1.doc

#2 - Copy to my drop box a first draft of your audio file(s) as 16 bit, 44.1k .wav or .aiff file(s). This file should be named as follows: first initial, last name.2.(wav or aiff) - jsappington.2.wav

#3 - Copy to my drop box your final draft of the audio recording as a 16 bit, 44.1k .wav or .aiff file. This file should be named as follows: first initial, last name.3.(wav or aiff) - jsappington.3.wav

#4 - Final: copy to my drop box the final presentation of your project. It must be at least 1 minute long and must combine 3 sources. Sources may include CD audio, mp3s, collected .wav or .aiff files, and field recordings (personal recordings made from real-life sources). The presentation may take any of the forms described in your original proposal (Flash, embedded in HTML, or a 16 bit stereo .wav or .aiff file). If you are submitting a .wav or .aiff file, please include playlist information within the file and complete the authorship summary. This file/folder should be named as follows: first initial, last name.4 (individual files must include .wav or .aiff).

DIGITAL AUDIO
by
Christopher Dobrian

from MSP: The Documentation
Cycling '74 and IRCAM, December 1997

http://music.arts.uci.edu/dobrian/digitalaudio.htm

Technical Links - Application Notes - Codecs - Artists - Preparing Files - Digital Audio Overview - Tim White's Online Resources

Resources - Technical	Artists/ Experimental
Adobe Technical Guide to Digital Audio http://www.adobe.com/support/techguides/webpublishing/audio/main.html Sound Forge Manual - Download http://www.sonicfoundry.com/download/step2.asp?DID=198 Peak Manual - Download http://www.bias-inc.com/downloads/documentation/peak.html Acid Xpress - Download http://www.sonicfoundry.com/download/Step2.asp?DID=155 Apple Music and Audio http://www.apple.com/creative/musicaudio/ http://developer.apple.com/techpubs/quicktime/qtdevdocs/RM/frameset.htm Pure Mac http://www.pure-mac.com/mp3.html The Recording Website http://www.recordingwebsite.com/ Pro Tools Welcome To Digidesign Digidesign mag Digidesign Training & Education Digidesign Training Reference Guides http://www.digidesign.com/download/links.html Links We Like — Inside & Outside Resources Digital Audio Theory and Reality Digital Audio Welcome to BIAS! http://www.unibrain.com/home/ http://filmsound.studienet.org/ http://www.sound-design.org.uk/ sonic arts network avant baschet MA sound design ,bournemouth ams neve prosound online magazine prs sonic foundry q sound cakewalk aes audio directory prosound news soundwave dobly technologies eighth nerve moviesound links cinema audio society sound theory page partnersinrhyme avant garde music soundscape imax sound system bectu mandy ircam	If you like to choose music to match your mood or perhaps pull you out of one, IBM has created an online tool that will be of interest. The tool called the Glass Engine allows users to surf a selection of Philip Glass works not only by standard categories like work title and year, but also by more subjective characteristics like joy, sorrow and density. If a selection is too soft, crank up the intensity. This is not an exercise in Artificial Intelligence: Glass' longtime producer and sound designer, Kurt Munkacsi, has assigned values to the tracks. 
http://www.philipglass.com/glassengine/index.htm John Oswald : Plunderphonics http://www.plunderphonics.com/ http://www.halcyon.com/robinja/mythos/Plunderphonics.html Asphodel http://www.asphodel.com/doc/news.whatsnew.html Leonardo / the International Society for the Arts, Sciences and Technology. http://mitpress2.mit.edu/e-journals/Leonardo/index.html Matmos http://www.brainwashed.com/matmos/ http://www.lsr1.com/disc/ Realab - Realtime Experimental Audio Laboratory UCI Music Department University of California, Irvine http://music.arts.uci.edu/dobrian/realab.htm Center for New Music and Audio Technologies http://cnmat.cnmat.berkeley.edu/ CDeMUSIC / Electronic Music Foundation, Ltd. http://www.cdemusic.org/ The Music, Mind and Machine Group at the MIT Media Laboratory http://sound.media.mit.edu/ PixelZoom [feat. spo0ky]_ The PixelFocus and PixelZoom projects are based on the spatial dimension of sound. Using multiple microphones connected to a stereo input or a stereo music cd, we can manipulate images based on the position of the sound source. http://www.no.org.za/2002/004.html John Cage http://sonicflux.walkerart.org/cage/index.html George Legrady's Information Theory - noise http://www.merz-akademie.de/projekte/george.legrady/theory/noise/intro.htm http://csunix1.lvc.edu/~snyder/em/mc.html Mobius operandi http://www.mobiusmusic.com/ A Catalog of Kinetic Sculpture by Norman Tuck http://www.normantuck.com/ Matt Heckert http://www.mattheckert.com/ Christopher Dobrian http://music.arts.uci.edu/dobrian/ Designers http://yugop.com/ver2/ Electronic Music - Historical Links Bernie Krause Pierre Schaeffer Luigi Russolo The Art of Noise http://cadre.sjsu.edu/switch/sound/articles/wendt/folder6/ng632.htm Edgard Varese
Application Notes:	Preparing Files
About Shockwave Audio

Shockwave Audio is a technology that makes sounds smaller and plays them faster from disk or over the Internet. Shockwave Audio can compress the size of sounds by a ratio of up to 176:1 and is streamable, which means Director doesn't have to load the entire sound into RAM before it begins playing. Director starts to play the beginning of the sound while the rest of the sound is still streaming from its source, whether coming from disk or over the Internet. When used properly, the Shockwave Audio compression and streaming features provide fast playback of high-quality audio, even for users with relatively slow modem connections to the Internet.

Compression quality in Shockwave Audio

Although Shockwave Audio uses advanced compression technology that alters original sounds as little as possible, the more a sound is compressed the more it is changed. Set the amount of compression by choosing a bit rate setting in any of the Shockwave Audio Xtras. The bit rate is not related to sampling rates you may have used in other audio programs. Try compressing the same sound at several different bit rates to see how the sound changes.

Choose the bit rate appropriate for the intended delivery system (modem, ISDN, CD-ROM, hard disk, and so on), the type of movie, and the nature of the sound itself. Voice-over sound quality, for example, may not need to be as high as that of music. Test the sound on several systems to find the right balance between quality and performance.

The more compressed a sound is, the faster it streams. If you choose to use a high quality and low degree of compression, a slow delivery system may not be able to send the data fast enough, resulting in gaps during playback. Most developers choose 16 Kbps for the best results over the Internet.

The following table suggests some general guidelines for setting the bit rate for different delivery systems. It also provides a rough estimate of perceived quality for different rates of compression.
Note that real transmission times may be slower than the times shown in this table, depending on network traffic and server load.

Delivery	Bit rate	Quality
T1	64 to 128 Kbps	Equal to source material
ISDN or CD-ROM	32 to 56 Kbps	FM stereo to CD
28.8 modem	16 Kbps	FM monaural or good-quality AM
14.4 modem	8 Kbps	Telephone

Note: Any sound compressed at less than 48 Kbps is converted to monaural.

Codecs:
-----------------------------

- µ-Law 2:1: Useful for exchanging audio with applications on platforms (such as many UNIX workstations) where µ-Law is a standard audio format. µ-Law is used for digital telephony in North America and Japan. (The first letter of the codec name is a Greek letter pronounced Mu.)

- 16-bit Big Endian and 16-bit Little Endian: Useful when audio must be stored using Big Endian or Little Endian (byte order) encoding, such as when preparing microprocessor-specific audio. These codecs are useful for hardware and software engineers but are generally not useful for video editing.

- 24-bit Integer and 32-bit Integer: Useful when the audio data must be stored using 24-bit or 32-bit Integer encoding, such as when preparing microprocessor-specific audio. These codecs are useful for hardware and software engineers but are generally not useful for video editing.

- IMA 4:1: Useful for cross-platform audio for multimedia. IMA 4:1 was developed by the IMA using ADPCM.

- 32-bit Floating Point and 64-bit Floating Point: Useful when audio must be stored using 32-bit or 64-bit floating point encoding, such as when preparing microprocessor-specific audio. These codecs are useful for hardware and software engineers but are generally not useful for video editing.

- ALaw 2:1: Similar to µ-Law, but used primarily for digital telephony in Europe.

- MetaSound/MetaVoice Codecs (Mac OS only): A wide series of codecs developed by Voxware. These codecs discard parts of the audio signal that are imperceptible to the human ear, so they provide high music quality at high compression.
The codecs cover a wide range of bit rates, from AC06 V2.0 at 6,000 bps to the ACS96 V2.0 at 96,000 bps, in order to accommodate varying bandwidths.

- Qualcomm PureVoice: Intended for speech; works best at 8 kHz. Based on the Code Division Multiple Access (CDMA) technology standard for cellular telephony.

- QDesign Music Codec: Useful when compressing high-quality music for Internet distribution. It is capable of delivering CD-quality (16-bit, 44.1 kHz) audio over a 28.8 Kbps line.

- MACE 3:1 and MACE 6:1: Useful as a general-purpose audio codec. The Macintosh Audio Compression and Expansion codec (MACE) has been built into the Mac OS Sound Manager for many years. MACE 3:1's lower compression ratio provides higher quality than MACE 6:1. Because it is provided with QuickTime 3.0 and later, it is also accessible in Windows when QuickTime is installed.

- MPEG Layer-3 Codec: Also known as MP3. This is the third coding scheme for MPEG audio compression. MPEG Layer-3 uses perceptual audio coding and psychoacoustic compression to remove parts of the audio signal that are imperceptible to the human ear. The result is a compression ratio of up to 12:1 without obvious loss of audio quality. MP3 is a common format for distributing music files over the Internet.

- ACELP.net: A net-based codec using frame-concatenation and interlacing for improved music quality. ACELP.net allows a dual bit rate of 8.5/6.5 kbps or a fixed bit rate of 5.0 kbps.

- WM-AUDIO: More fully known as Microsoft® Windows Media™ audio compression. This is the standard codec for Microsoft Active Streaming Format, which combines fast encoding with high music quality and is optimized for Pentium II (MMX) and Pentium III (SSE/SIMD) processors. WM-AUDIO has a wide bit-rate range from 5 kbps to 128 kbps and offers high quality sound over the Internet, even over 28.8 modems. WM-AUDIO is considered a future replacement for MP3.

- Indeo® Audio Software: Useful for music and speech distributed over the Internet.
Its maximum compression ratio is 8:1. This codec is designed to work together with the Indeo Video codec.

- Microsoft G.723.1: A codec intended for use in video conferencing. It offers acceptable voice quality, but is a poor choice for music or sound effects. The audio quality is lower than other codecs that use the same data rate.

- L&H Codecs: Speech and music compression algorithm developed by Lernout & Hauspie™.

- TrueSpeech™: Useful for speech over the Internet at low data rates.

- Microsoft GSM 6.10: Useful for speech, used in Europe for telephony.

- Microsoft CCITT G.711: This codec uses µ-Law encoding and is commonly used for digital telephony in North America and Japan.

- MS-ADPCM: A Microsoft implementation of Adaptive Differential Pulse Code Modulation (ADPCM), a common digital audio format capable of storing CD-quality audio.

- Microsoft IMA ADPCM: An implementation of ADPCM, useful for cross-platform audio for multimedia, developed by the Interactive Multimedia Association (IMA).

Preparing Files:

Keep everything DDD if you can. Audiophiles use the three Ds to refer to a track that's recorded digitally, produced (mixed and edited) digitally, and mastered digitally. Although this here Webmonkey is an analog junkie who smiles lovingly upon 2-inch tape, he knows that unless your product is going to vinyl, you should take advantage of the cleanliness of a fully digital process.

Make sure you normalize your files. Most audio editors have a "normalize" command, which applies one uniform gain to the whole file so that its loudest peak reaches a level you choose. This leaves you with a signal that makes full use of the available range without clipping, which is very important for digitized audio.

Use a good pair of headphones or speakers as your primary monitor. This may seem obvious, but it's crucial to do this throughout the entire production and post-production process.

Equalize your files. As you probably already know, equalization balances waveforms so that the output contains the proper ratio of treble to bass.
Optimally you should do this with outboard gear - like a rack-mounted unit - but most audio editors have at least a five-band EQ. For advanced audio engineers, the EQing process is part of a larger affair called mixing, but everyone doing any audio should equalize the files before mastering them. And be careful with the bass - built-in computer speakers can't handle bass response very well.

Digital Audio Overview - LSD Music World
--------------------------------------

Downloadable audio was born from the combination of two technologies - the internet and audio compression. With access to the internet infiltrating homes over the past decade, only one element was necessary to achieve the obvious desire to get music from websites - audio files which could be downloaded in a reasonable amount of time. CD-quality audio (44.1 kHz, 16 bit, stereo) occupies about 10 MB per minute in storage. A typical song of 30-40 MB would take more than 2 hours to download using conventional modems, making internet distribution impractical. Welcome, mp3.

While many audio compression methods exist, MPEG-1 Audio Layer 3 (or mp3) was the breakthrough everyone was waiting for. Utilizing the concept of masking, this compression technology provided CD-like quality at a fraction of the file size - that 30 MB song, once compressed, took up only 3 MB. Suddenly, music was downloadable at 15-20 minutes per song. MP3 took the music industry by storm, and whatever your views of Napster or its clones, digital downloadable music is here to stay.

Music distribution is in a state of transformation the likes of which we have not seen in generations. As broadband connections to the internet become commonplace, and as the internet reaches more homes, people will choose to get their music over the internet. Musicians should embrace this technology and this method of delivery.
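The storage and download figures quoted in the overview above follow from simple arithmetic. The sketch below is only an illustration: the function names are made up for this page, the 3-minute song length and the 10:1 compression ratio are assumed example values, and the transfer time assumes the modem runs at its full nominal rate (real connections are slower).

```python
def pcm_size_bytes(seconds, sample_rate=44_100, bit_depth=16, channels=2):
    """Size of uncompressed audio in bytes (file header overhead ignored)."""
    return seconds * sample_rate * bit_depth * channels // 8


def download_seconds(size_bytes, link_bits_per_second):
    """Ideal transfer time, assuming the link runs at its full nominal rate."""
    return size_bytes * 8 / link_bits_per_second


one_minute = pcm_size_bytes(60)      # one minute of CD-quality stereo
song = pcm_size_bytes(3 * 60)        # assumed 3-minute song
compressed = song // 10              # assumed 10:1 compression

print(one_minute)                                         # bytes per minute
print(download_seconds(song, 28_800) / 3600)              # hours, uncompressed
print(download_seconds(compressed, 28_800) / 60)          # minutes, compressed
```

On these assumptions the uncompressed song needs well over two hours on a 28.8 modem, while the compressed version needs roughly a quarter of an hour, which is in line with the figures given in the text.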
--------------------------------------------------------------------------------

Digital Audio Background:

Digital audio is often described as 16-bit, 44.1 kHz, otherwise known as CD-quality audio. Digital audio differs from analog audio in its waveform structure. While analog audio is represented by a continuous waveform, digital audio is a step-function. We hear music as an analog waveform. Our ears perceive sound as small changes in air pressure, and convert those changes to mechanical motions, and then to electrical stimuli which proceed to the brain. All digital audio must be converted to analog audio through a D/A converter before we hear it. This is done within the CD player or computer sound card before the signal is sent to the speakers.

The purpose of digital audio is to store sound in a format which is non-destructive, which means it doesn’t get degraded when copied. Analog sound gets worse with every copy made, as with a cassette for instance. Just as digital is converted to analog through a D/A converter, analog is converted to digital through an A/D converter. The fundamental element in this process is called sampling.

Sample Rate:

Analog-to-digital conversion takes place when the audio waveform is sampled at a fixed interval of time and represented as a series of data values. The sample rate is the number of times per second the waveform is sampled, and is expressed in units of kHz (kilohertz, or one thousand hertz); one Hz equals one cycle per second. Both sample rate and audio frequency are expressed in units of kHz. Simply stated, frequency corresponds with pitch. For example, the A below middle C has a frequency of 220 Hz, and the A one octave higher has a frequency of 440 Hz. To produce CD-quality digital audio, the analog audio must be sampled at 44.1 kHz or higher. The human ear can detect frequencies up to about 20 kHz.
According to the Nyquist principle, the sampling frequency must be at least two times the highest frequency that you want represented. Otherwise, aliasing will occur, which means that the frequencies above 1/2 the sample rate will be represented inappropriately as lower frequencies, because they were undersampled. This is the reason for the 44.1 kHz figure: it is roughly twice the 20 kHz that we can hear. Some commercial audio recording software boasts sampling rates as high as 96 kHz, but unless you want your dog to enjoy those higher frequencies which you can’t hear, there’s no physical reason to go above 44.1 kHz.

Good temporal resolution can be achieved with commercial analog-to-digital conversion technology. A useful analogy is to compare temporal resolution (sound) with spatial resolution (imaging): audio frequency is expressed in cycles per second, while spatial frequency is expressed in line pairs per millimeter; the greater the frequency, the better the detail. When converting an analog image into a digital image (as with a digital camera), the spatial resolution or detail of the resulting image depends on how many samples you take of the image over its area, or how many pixels you have. Commercial digital cameras have improved in spatial resolution in recent years, but they remain a far cry from the resolution of conventional film, which is on the order of a few microns. The goal is to sample the image, or the audio waveform, at a sample rate which exceeds two times the highest frequency that can be resolved by the viewer, or listener. In audio, that frequency is 20 kHz, and the sample rate must therefore be above 40 kHz (44.1 kHz is the current standard for CD-quality audio).

Bit Depth:

The bit depth for CD-quality audio is 16 bits. Think of bit depth as step size. The greater bit depth you use, the finer your steps will be, or the greater number of steps you will have throughout your range.
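Returning briefly to sampling: the aliasing described under Sample Rate can be made concrete with a few lines of Python. This is a minimal sketch for a single pure tone, assuming no anti-aliasing filter in front of the converter; the function name is invented for this example.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone shows up after sampling.

    Tones at or below half the sample rate are unchanged; higher tones
    fold back into the range 0 .. sample_rate_hz / 2.
    """
    nyquist = sample_rate_hz / 2
    folded = tone_hz % sample_rate_hz
    return folded if folded <= nyquist else sample_rate_hz - folded


print(alias_frequency(10_000, 44_100))  # within range, reproduced as-is
print(alias_frequency(30_000, 44_100))  # above 22.05 kHz, folds down
print(alias_frequency(15_000, 22_050))  # lower sample rate, folds down
```

A 30 kHz tone sampled at 44.1 kHz comes out at 14.1 kHz, a frequency that was never in the original sound. This is why converters filter out content above half the sample rate before sampling.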
Continuing the analogy with imaging, bit depth describes the number of gray levels in a black and white image. For example, if you have 8 bits, then you have 256 different shades of gray to represent changes in the darkness/lightness of your image (2^8 = 256). If you have 10 bits, then you have 1024 shades of gray. In 16-bit audio, you have 65,536 numbers (2^16) available to represent the incoming voltage level of the signal. In practice, utilizing a greater bit depth reduces noise in your audio, as well as quantization distortion. Quantization refers to the practice of assigning whole numbers to an input voltage level. Since rounding always occurs, the smaller the step size the better, and a greater number of bits achieves this. CD-quality audio uses 16 bits. Sometimes bit depth is referred to as resolution, not to be confused with temporal resolution, which refers to the sample rate.

Dynamic Range and Signal-to-Noise Ratio:

Greater bit depth increases the dynamic range. The dynamic range is the range between the lowest and highest level that can be reproduced by a system. A system with 16-bit resolution has a dynamic range of 96 dB, where dB refers to the decibel. The decibel is one tenth of a Bel (named after Alexander Graham Bell). This is a logarithmic scale and is relative. A change of about 3 dB is often cited as the smallest readily perceptible change, and a 10 dB change represents a sound that is roughly twice as loud. Log scales are used frequently in both sound and imaging (as with optical density) to condense the dynamic range. Also, log scales often more closely approximate how our ears and eyes perceive sound and images, and thus their use is properly justified.

Signal-to-noise ratio is simply the level of the music divided by the level of the noise. All sound has a noise component. Noise refers to random fluctuations in sound which are not associated with the music. The goal is to reduce this noise to acceptable levels, or to separate the signal as far from the noise as possible.
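The level counts and the 96 dB figure above can be checked with a short calculation. This sketch uses the usual textbook approximation that the theoretical dynamic range is 20 * log10 of the number of levels (about 6 dB per bit); the function names are made up for this example, and real converters fall somewhat short of the theoretical value.

```python
import math


def quantization_levels(bits):
    """Number of distinct values available at a given bit depth."""
    return 2 ** bits


def dynamic_range_db(bits):
    """Theoretical dynamic range in decibels: 20 * log10(2 ** bits)."""
    return 20 * math.log10(quantization_levels(bits))


for bits in (8, 10, 16):
    print(bits, quantization_levels(bits), round(dynamic_range_db(bits), 1))
```

Each added bit doubles the number of levels and adds about 6 dB, so 8-bit audio has roughly half the dynamic range, in decibels, of 16-bit audio.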
A large signal-to-noise ratio represents better reduction of noise. CDs typically achieve a signal-to-noise ratio of about 90 dB, which is enviable when compared with the SNR in medical imaging technologies, for instance.

File Size Reduction:

The size of a digital audio file can be computed as the product of sample rate, bit depth, number of channels, and seconds, divided by the number of bits per byte, or 8. As an example, consider a 60 second stereo sound clip. Because it’s stereo, it has two channels. If this is recorded with a 44.1 kHz sample rate at 16 bit resolution, then the file size is 44,100 * 16 * 2 * 60 / 8 = 10,584,000 bytes, or about 10.6 MB. The size of an uncompressed audio file is about 10 MB per minute of audio. This is not a concern from a storage standpoint anymore, with multi-GB drives, but for web-based audio this size is excessive. If you were to download a four minute song using a 28.8 kbps modem, it would take more than 3 hours. Without compression/decompression algorithms (codecs), the only way to reduce the size of an audio file is through manually degrading its sound quality in one of three ways.

1- Convert a stereo file to a mono file: Stereo has two tracks, while mono has only one. You can cut the file size in half by making this conversion. If you are planning to provide users with audio files that you do not expect them to listen to on good sound systems, this file reduction method should be utilized. This should be your first choice in reducing file size, since the dual-channel nature of the music is lost but no actual degradation of audio quality takes place.

2- Reduce the sample rate: When you reduce the sample rate from 44.1 kHz to 22.05 kHz, you will cut your file size in half as well, and you will lose frequencies above 11 kHz. This will be noticeable, but the extent of the degradation will depend on the nature of the music. If there are a great many high frequency components, the loss will be more meaningful.
For speech, of course, you should consider going as low as 8 kHz, since high frequencies can be more easily ignored, as they are across our telephone lines. This should be your second choice in reducing file size. Higher frequencies are sacrificed for smaller file size.

3- Reduce the bit depth: If you reduce your bit depth from 16 bits to 8 bits, this again will cut the file size in half. The practical result of reducing bit depth is to introduce noise into the recording. In most cases, the result will be unacceptable. This method should be your last choice in reducing file size.

If all three of these methods are used, you can reduce your file size by a factor of eight. A 31 MB song now takes up only 3.9 MB, which is an acceptable size for downloading over the internet. Unfortunately, the audio quality has been degraded beyond what is acceptable to listeners. This is where compression technology becomes important.

--------------------------------------------------------------------------------

Compression Technology and MP3

Introduction:

If we could all receive data at a few megabits per second, perhaps we wouldn’t concern ourselves so much with compression technology. But normally, when we want to send data to somebody else, we zip it or stuff it or do whatever we can to make it smaller than it really is. For pictures, we compress them to jpeg or gif format. For audio, we use mpeg, or more specifically, mpeg layer 3, otherwise known as MP3. In reality, there are many audio compression formats, but MP3 is the format that has taken the music industry by storm in recent years, and is rapidly changing the landscape in the distribution of music over the internet. If you haven’t heard of MP3, then you probably just returned from your mission, or you’d better use that as your excuse, anyway.

Background:

In 1992 the Moving Picture Experts Group (MPEG) approved a compression/decompression algorithm (or codec) which was called MPEG-1 Audio Layer 3, or MP3.
But it wasn’t until 56K modems became commonplace that MP3 reached a wide audience and began to transform the way music is heard and distributed. With this combination of compressed music and faster connection speeds, internet users could download an entire song in 15-20 minutes, certainly not instant gratification, but reasonable enough for people to get the music they wanted. The proliferation of digital music on the internet has since been very rapid. The most commonly downloaded material from the internet is now music. The most commonly requested word in search engines is "mp3". Digital music and the internet are a perfect fit. Why would anyone go to a store, purchase a CD, bring it home and put it in their player, when they can point and click instead? Why would you walk up and down the aisles in a music store and sift through the thousands of titles to find the CD you want (and maybe never find it), when you can type in a few keywords and find the title in a matter of seconds?

The future of MP3 and music distribution holds many unanswered questions regarding formats, copyright infringement, and distribution channels. But one thing is clear: digital music distribution over the internet is here to stay and will continue to expand rapidly as broadband access becomes commonplace. When users can download entire CDs in a matter of seconds, there will be little reason to acquire music in any other way.

Audio Compression:

MP3 achieves roughly a 10:1 compression ratio with only minimal loss of audio quality. Unlike the file size reduction methods mentioned, which degrade the audio quality by reducing sample rates or bit depths, the codec used for mp3 utilizes a masking method. The algorithm removes sounds in the audio which are masked by other sounds. The idea behind this is that because that sound is being masked, you aren’t going to hear it anyway. So if it’s removed, you won’t miss it.
The audio quality is in reality degraded, because the codec uses lossy compression, but most users find little noticeable degradation.

--------------------------------------------------------------------------------

Audio Formats

Downloadable Audio:

For downloadable audio, there is little reason to look any further than MP3. While many other formats exist, many of these are uncompressed or do not offer the level of compression that MP3 does. The MP3 format is now so widely used that nearly all web users have capabilities either within their browser or through external software for playback of MP3 files. Here’s a brief description of audio formats:

- AIFF (.aif, .aiff): Audio Interchange File Format. This is an uncompressed format, and is not used on the web, since file size is approximately 10 MB per minute of audio. This is the default audio format for computers running the Mac OS. It uses the PCM codec.

- AU (.au, .snd): Sun Audio format. This is a moderately compressed format. It was used frequently in the early days of the web, but is no longer practical due to large file size. This is the default audio format for computers running Unix. It uses the µ-law codec.

- Wave (.wav): Microsoft’s Wave format should also be avoided for web use, since it is an uncompressed format. This is the default audio format for computers running Windows. It uses the PCM codec.

- QuickTime (.mov): QuickTime is more than an audio format. It is an architecture to store, edit and play multimedia content, such as synchronized graphics, sound, video, and text. Apple’s QuickTime software actually supports all of the major audio formats for playback. It uses a proprietary format from Apple Computer.

- MP3 (.mp3, .mp2): MPEG-1 Audio Layer 3 format. MP3 has become the undisputed format of choice for downloadable audio. It provides good quality digital audio at a compression ratio of about 10:1. The significance of MP3 cannot be overstated.
Currently, there is no conceivable reason to use any format other than MP3 for the delivery of downloadable audio.

Streaming Audio and Encryption:

Streaming audio differs from downloadable audio in that it begins playback almost immediately after being requested. Instead of waiting until the entire song has been downloaded, the audio is "streamed" to the user’s computer, and it continues streaming during playback. Because the audio has to arrive as fast as it is played, streams are encoded at lower bit rates, so audio quality is normally poorer with streaming audio. The purpose of streaming audio is generally two-fold – to deliver the audio to the listener with minimal delay, and to prevent the user from obtaining an actual copy of the music.

Encryption technology is a method of preventing the user from making copies of the music they download. The recording industry is currently developing a standard, termed the Secure Digital Music Initiative (SDMI), which will likely be a part of digital music in the coming years. Some encryption formats exist already.

Below is a description of some streaming formats and those which use encryption:

- RealAudio (.ra, .ram, .rm): RealNetworks pioneered streaming audio with its introduction of RealAudio several years ago. RealPlayer now supports many streaming formats besides RealAudio. This is by far the most popular format for streaming audio, controlling roughly 80% of the market.
- Shockwave Audio (.swa): Shockwave is Macromedia's contribution to web-based audio. It is a streaming audio format which allows you to choose the level of quality for playback, depending on the modem speed of your audience. Shockwave streams a low bit-rate MP3 file with a different file header. Many players can handle Shockwave audio.
- Windows Media Audio (.wma): Windows Media Audio uses a proprietary compression format, and is a relatively late entry into this realm. It is a streaming format and is aimed squarely at RealNetworks’ RealPlayer.
- Liquid Audio: Liquid Audio is a streaming format which utilizes licensed technology from Dolby Labs. But it's more than a streaming format. The goal of Liquid Audio is to allow users to preview music and then purchase it one song at a time. Liquid Audio uses a tracking system to make sure the record company, the publisher and the artist get paid. It is meant to be a one-stop solution for digital downloads over the internet.

MIDI:

MIDI (.mid, .midi): Musical Instrument Digital Interface. MIDI is different from the other formats mentioned, because it really isn’t an audio format. MIDI is a language for computers and musical instruments to talk to each other. A MIDI file does not contain music. It contains instructions for a musical instrument to play a song. You must have a musical instrument to get music from a MIDI file. Fortunately, many people have a musical instrument built right into their computers – Apple’s QuickTime Musical Instruments is one such example. Other, more advanced software-based synthesizers also exist.

Since MIDI files are so small (a few kB), they are well suited to web-based delivery of audio. However, it’s important to recognize that the music may sound different to each user, since it is played on the computer’s synthesizer, which may vary among users. MIDI is best reserved for instrumental works, and only when the choice of instrument that plays the music is not essential, since you cannot control this. If you want to deliver music which sounds exactly the way you hear it, MIDI should be avoided.

Conversion Between Formats:

Many tools exist for converting between different audio formats. High-end audio editing programs such as Sound Forge and Peak offer the most extensive options. However, less expensive alternatives exist, such as QuickTime, SoundJam, Jukebox and WinAmp. A good resource for keeping up with mp3 music, players, news, and much more is the Lycos MP3 site.
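
The file sizes and download times quoted in this section can be checked with a little arithmetic. The sketch below is a rough estimate, not a measurement. It assumes CD-quality audio (44.1 kHz, 16-bit, stereo), a four-minute song, and a modem running at its full nominal 56,000 bits per second; those values are assumptions for illustration, as are the function names. Only the 10:1 ratio and the 56K modem rating come from the text.

```python
# Assumed CD-quality parameters (illustrative; not stated in the text).
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
BIT_DEPTH = 16            # bits per sample
CHANNELS = 2              # stereo


def pcm_bytes(seconds, sample_rate=SAMPLE_RATE_HZ,
              bit_depth=BIT_DEPTH, channels=CHANNELS):
    """Size in bytes of uncompressed PCM audio (as in AIFF or Wave)."""
    return seconds * sample_rate * (bit_depth // 8) * channels


def mp3_bytes(raw_bytes, ratio=10):
    """Approximate MP3 size, treating the 10:1 ratio as a flat divisor."""
    return raw_bytes / ratio


def download_seconds(size_bytes, link_bits_per_second):
    """Transfer time at a given link speed, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    one_minute_raw = pcm_bytes(60)
    song_raw = pcm_bytes(4 * 60)          # assumed four-minute song
    song_mp3 = mp3_bytes(song_raw)
    modem_minutes = download_seconds(song_mp3, 56_000) / 60

    print(f"Uncompressed, 1 minute: {one_minute_raw / 1e6:.1f} MB")
    print(f"Uncompressed, 4 minutes: {song_raw / 1e6:.1f} MB")
    print(f"MP3 at 10:1, 4 minutes: {song_mp3 / 1e6:.1f} MB")
    print(f"Download at nominal 56K: {modem_minutes:.1f} minutes")
```

Under these assumptions, one minute of uncompressed audio comes to about 10.6 MB, in line with the 10 MB per minute figure given for AIFF, and the MP3 version of the song takes about 10 minutes at the modem's nominal rate. Real dial-up connections ran below that rate, which is consistent with the 15-20 minutes mentioned earlier.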
--------------------------------------------------------------------------------

The Future of Web-Based Audio

The distribution of music over the internet will continue to proliferate as broadband access becomes more commonplace - this is a given. The unanswered questions have to do with which format will emerge as the standard and how the recording industry will modify its business model to generate profits from web-based distribution.

The widespread acceptance of MP3 can be credited to its openly published specification and grassroots support. Although MP3 is formally part of the MPEG-1 standard, it became the de facto format for music through popular adoption rather than through any endorsement by the recording industry. While consumers have embraced MP3, the recording industry has taken every opportunity to curtail its use, as demonstrated by its numerous lawsuits, first against Diamond Multimedia when the Rio (portable MP3 player) was introduced, and more recently against mp3.com and Napster. The Recording Industry Association of America (RIAA) lost its lawsuit against Diamond Multimedia, but has had more success in the Napster case, where the court ruled in favor of the RIAA. This decision was appealed, and Napster was allowed to continue operations pending an appeals court ruling.

While the RIAA and large record companies have unanimously opposed web-based distribution of MP3 music, many musicians have found MP3 to be an opportunity, especially independent musicians and unsigned artists. Before the internet and MP3, artists had a difficult time getting their music heard and distributed unless they had a major record deal. Now, artists can place music on their own web pages and on music distribution websites which promote their music. Opportunities exist for musicians on mp3.com, iuma.com, and for LDS musicians on ldsmusician.com. Artists are now finding an audience for their music without the help of big record companies. And even major artists are utilizing these new distribution methods.
While the future of web-based audio holds many unanswered questions, the next three to five years will be exciting to watch as developments take place. In the meantime, musicians should take every opportunity to use the internet to get exposure for their music.
Tim White's Online Resources
3D sound
http://www.hitl.washington.edu/scivw/EVE/I.B.1.3DSoundSynthesis.html

Sound Synthesis info

Sound Forge 4.5
http://www.sonicspot.com/soundforge/soundforge.html

List of Real Time Software Synthesizers for PC
http://www.sonicspot.com/softwaresynth.html

Sound synthesis - general
http://www.sfu.ca/sonic-studio/Sound_Synthesis.html

Sound synthesis by physical modeling - IRCAM
http://viswiz.gmd.de/~eckel/publications/eckel95b.html

Physical Modeling
http://www.audionica.com/Audionicaen/synthes.htm
http://viswiz.gmd.de/~eckel/publications/eckel95b.html (IRCAM)
http://www.dei.unipd.it/english/csc/documents/pm/pm.html
http://web.ukonline.co.uk/taosynth/

Sound synthesis by physical modeling - Italy
http://www.dei.unipd.it/english/csc/documents/pm/pm.html

Fourier Synthesis:
http://www.phy.ntnu.edu.tw/java/sound/sound.html

Chaotic/Fractal synthesis
http://www.far-field.com/~dan/Music/chaos/Chaosrel.htm
http://www.ecdl.hut.fi/~kimmo/emusic.html

Chaotic sound synthesis
http://www.far-field.com/~dan/Music/chaos/Chaosrel.htm
http://www.sonicspot.com/guide/synthesis.html

Genetic Algorithms
http://www.cs.ukc.ac.uk/pubs/1999/908/

?
http://datura.cerl.uiuc.edu/BillWalker/thesis/thesis.html

Online voice/speech synthesis
A research version of Next-Generation Text-To-Speech (TTS) from AT&T Labs
http://www.research.att.com/~mjm/cgi-bin/ttsdemo
http://www.helsinki.fi/~ssyreeni/dsound/index.html

Fourier synthesis (additive synthesis)
http://www.phy.ntnu.edu.tw/java/sound/sound.html
http://www-users.york.ac.uk/~mdjp101/bongsmack/

CSound
http://www.csounds.com/

CSound for Mac
http://music.columbia.edu/~matt/

Composers

Paul Lansky
http://www.essentialsofmusic.com/composer/lansky.html

Xenakis
http://elib.zib.de/ICM98/C/1/urania/abstracts/Hoffmann.html

Rock
- Punk
- Indie Rock
- Math Rock
Folk/Acoustic
Pop
Electronic
- House
- Techno
- Breakbeat
- Drum and Bass
- Abstract
- Ambient
- Downtempo
- Trance
Hip Hop
Jazz
Experimental