This document provides an overview of audio compression. It begins with a brief history of audio compression and discusses its increasing usage today. The document then covers digital audio basics like sampling, quantization, and the conversion between analog and digital signals. It describes the differences between lossless and lossy compression techniques. Common audio compression techniques are also explained, including how they analyze and store audio signals in a more efficient way. The document concludes by discussing how compression is used in various file formats and recording devices to reduce the large amount of storage space required for high quality digital audio.
Digital Audio Tape (DAT) is a recording and playback medium developed by Sony in the 1980s that uses magnetic tape similar to audio cassettes. DAT records digital audio at sampling rates up to 48 kHz with 16-bit resolution; in its computer data-storage role, DAT drives also offer lossless data compression. DAT tapes range in length from 15 to 180 minutes depending on the amount of data stored. DAT was used professionally for master recordings and in the computer industry for data backups but was never widely adopted for home use.
The document discusses the history and technology of sound and audio, including how sound is digitized, common sound file formats like MP3 and WAV, how MIDI works for synthesized sound, software for editing and sequencing sound, and examples of sound hardware like Creative sound cards. It provides an overview of key concepts in digitized sound and music production for multimedia.
Digital audio technologies allow for the reproduction and manipulation of sound in digital form. Sound is converted from analog to digital via sampling, where the amplitude of sound waves is measured at regular intervals. This results in digital audio files that can be edited, stored and transmitted more easily than analog audio. Popular digital audio file formats include WAV, MP3, MIDI and more. Devices like the iPod and services like iTunes revolutionized portable music and digital music distribution. Technologies like text-to-speech and DAISY have also improved audio accessibility.
This document provides an overview of MPEG Audio Compression Layer 3 (MP3). It discusses how MP3 was developed under EUREKA project EU147 for Digital Audio Broadcasting. MP3 achieves compression ratios of over 12:1 for CD-quality audio by using psychoacoustic models to remove inaudible components. The encoder uses filter banks and quantization with Huffman coding, while controlling distortion and rate through nested feedback loops.
Audio compression reduces the size of audio files through lossy or lossless techniques. Lossy compression uses psychoacoustic algorithms to filter out sounds imperceptible to humans, reducing file size but introducing data loss. Lossless compression compresses files without any loss, allowing perfect restoration. Common lossy codecs include MP3, while lossless options are FLAC, ALAC, and WMA Lossless. International standards bodies like MPEG and ITU-T develop and standardize audio compression formats.
The document discusses the theory of audio, including:
1. What is audio and how it involves the production, recording, manipulation and reproduction of sound waves.
2. The basics of analog and digital audio, including how analog audio represents sound waves and how digital audio converts sound to binary numbers through sampling.
3. Key concepts in audio like bandwidth, which refers to the range of frequencies a signal occupies, and how analog audio is converted to digital audio through sampling and quantization.
Audio compression can be either lossless, which reduces file size while retaining all audio information, or lossy, which greatly reduces file size but decreases sound quality by discarding some audio information. Common uncompressed or losslessly compressed formats are AIFF, WAV, and FLAC, while common lossy formats are MP3, AAC, and Vorbis. The quality and size of compressed audio files depend on factors like sample rate, bit depth, bit rate, and number of channels; higher values generally mean higher quality audio but larger files, as the sketch below illustrates.
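To make the size arithmetic concrete, here is a minimal Python sketch (assuming uncompressed PCM and the CD parameters named above; the 128 kbps figure is a typical lossy bitrate, not one taken from this document):

def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    # bytes = samples/second * bytes/sample * channels * seconds
    return int(sample_rate_hz * (bit_depth // 8) * channels * seconds)

uncompressed = pcm_size_bytes(44_100, 16, 2, 60)        # one minute of CD-quality stereo
print(round(uncompressed / 1e6, 1), "MB uncompressed")  # about 10.6 MB
compressed = 128_000 / 8 * 60                           # the same minute at 128 kbps
print(round(compressed / 1e6, 1), "MB at 128 kbps")     # about 1.0 MB, roughly 11:1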
The document provides an overview of analog and digital audio, including:
- Analog audio uses continuous waves while digital audio represents sound as discrete numeric samples.
- Key aspects of digital audio include sampling rate, bit depth, and channels. Higher sampling rates and bit depths provide more accurate representations.
- Converting analog to digital audio involves sampling the amplitude over time using an analog-to-digital converter.
This document introduces digital audio by explaining the difference between analog and digital signals. It describes key variables that affect audio sampling including sampling rate, bit depth, and number of channels. Higher sampling rates, bit depths, and more channels captured result in higher quality audio files but also larger file sizes. The optimal balance of these variables must be determined based on the intended use and quality needed for the audio.
This document provides an overview of audio compression technologies. It discusses what audio is, why compression is needed, and the main types of audio compression: lossy and lossless. It describes some standard codecs for each type including MP3, AAC, FLAC. It explains the MPEG audio encoding and decoding process, and notes that AAC is the successor to MP3. In summary, the document covers audio fundamentals and provides details on common audio compression standards and techniques.
This document discusses key concepts in digital audio including sampling, quantization, digital recording, and disk-based audio systems. It notes that in music production, "sampling" can also mean taking parts of an existing piece to create a new production, while in digital audio quantization refers to capturing discrete amplitude values during sampling. The document also discusses sample rates, bit depths, and lossy and lossless audio formats like MP3, WAV, and FLAC. Common industry sample rates such as 44.1 kHz and 48 kHz are also outlined.
Digital audio was created in the late 1960s when Dr. Thomas Stockham began experimenting with digital tape recording using analog to digital converters. The key aspects of digital audio are:
1) Analog audio is converted to digital form through analog to digital conversion which samples the analog signal at regular intervals determined by the sample rate.
2) Higher sample rates and bit depths produce more accurate digital representations of the original analog signal but result in larger file sizes.
3) Quantization error, in the form of quantization noise, occurs when sample values are rounded to binary numbers during digitization and can be reduced by dithering and increasing bit depth.
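A small numerical illustration of point 3 (a sketch, not any particular product's algorithm; note that dither does not lower the error power — it decorrelates the error from the signal, so a tone quieter than one quantization step survives on average instead of rounding to silence):

import numpy as np

rng = np.random.default_rng(0)
fs = 8_000
t = np.arange(fs) / fs
step = 2.0 / 2 ** 8                              # quantizer step for 8 bits over [-1, 1]
x = 0.4 * step * np.sin(2 * np.pi * 100 * t)     # a tone quieter than one step

def quantize(sig, dither):
    d = rng.uniform(-step / 2, step / 2, sig.shape) if dither else 0.0
    return np.round((sig + d) / step) * step

for dither in (False, True):
    y = quantize(x, dither)
    recovered = float(np.dot(y, x) / np.dot(x, x))   # how much of the tone survives
    print("dither =", dither, "-> recovered tone fraction ~", round(recovered, 2))
# Without dither the tone rounds to silence; with dither it survives as a noisy average.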
The document discusses digital audio and its uses in multimedia. It describes sound as pressure waves and how digital audio is captured through sampling. Sampling converts analog sound waves into discrete digital values through an analog-to-digital converter. The document also discusses common audio file formats like WAV and MIDI, which transmits instructions for musical notes rather than raw audio.
This document discusses audio compression techniques. It begins by defining audio and compression. There are two main types of audio compression: lossy and lossless. Lossy compression reduces file sizes but results in some quality loss, while lossless compression decompresses the file back to its original quality. Common lossy audio compression methods are discussed, including those based on psychoacoustics, i.e., how humans perceive sound. MPEG layers are then introduced as a standard for audio compression, with Layer I offering high quality at the highest bitrates, and Layer III providing greater compression while retaining high quality at bitrates as low as 64 kbps. Coding efficiency increases with each successive layer.
Audio compression reduces the bandwidth and file size of digital audio streams and files. Lossy compression provides greater compression rates than lossless compression and is used in consumer devices, but introduces irreversible changes. Lossless compression produces an exact digital duplicate upon decompression. Common lossless formats are FLAC and Apple Lossless. Lossy compression is used widely and achieves much greater compression ratios, discarding less important data, but recompressing causes quality loss making it unsuitable for professional audio editing.
This presentation discusses the production of digital audio. It also gives a brief introduction to digital audio broadcasting, recording techniques, and stereophony.
The sample rate is the number of samples of a sound taken per second to represent it digitally. The higher the sample rate, the more accurate the digital representation. Sample rates are measured in hertz (Hz) or kilohertz (kHz). Common sample rates for digital music and audio in videos range from 44.1 kHz to 192 kHz, with 44.1 kHz used for compact discs and 48 kHz for most digital video formats. Higher sample rates like 96 kHz and 192 kHz provide higher quality representation for professional music recording and mastering.
This document provides an overview of MPEG-1 audio compression. It describes the key components of the MPEG-1 audio encoder including the polyphase filter bank that transforms audio into frequency subbands, the psychoacoustic model that determines inaudible parts of the signal, and the coding and bit allocation process that assigns bits to subbands. The overview concludes by noting that MPEG-1 audio provides high compression while retaining quality and paved the way for future audio compression standards.
Sampling rate refers to the number of digital samples taken per second of an analog audio signal. A higher sampling rate allows for more accurate reproduction of the original sound by capturing more data. The standard CD sampling rate is 44.1 kHz.
Bit depth determines the number of possible amplitude levels that can be represented in each digital sample. A higher bit depth provides more precision in capturing the amplitude but requires more storage space. Standard CD audio has a bit depth of 16 bits, providing 65,536 possible amplitude levels per sample.
When an analog audio signal is converted to digital, the continuous waveform is converted into discrete samples. The difference between the original analog signal and the quantized digital representation is called quantization error.
Audio Compression Techniques
Audio data compression is a form of lossy or lossless compression in which the amount of data in a recorded waveform is reduced for storage or transmission, with (lossy) or without (lossless) some loss of quality; it is used in CD and MP3 encoding and Internet radio.
Dynamic range compression, also called audio level compression, reduces the dynamic range of an audio waveform: the difference between its loudest and quietest passages.
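A minimal Python sketch of the dynamic-range-compression idea (a static gain curve with hypothetical threshold and ratio parameters; real compressors add attack and release smoothing):

import numpy as np

def compress_drc(x, threshold_db=-20.0, ratio=4.0):
    # Static gain curve: above the threshold, output level rises by
    # only 1/ratio dB for every 1 dB of input level.
    level_db = 20 * np.log10(np.abs(x) + 1e-12)
    over = np.maximum(level_db - threshold_db, 0.0)
    gain_db = -over * (1.0 - 1.0 / ratio)
    return x * 10 ** (gain_db / 20)

x = np.concatenate([0.05 * np.ones(4), 0.9 * np.ones(4)])   # quiet then loud
print(np.round(compress_drc(x), 3))   # quiet part untouched; loud part pulled down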
The document discusses digital audio and the process of digitizing sound. It explains that sound is converted to a stream of numbers through sampling and quantization. Sampling measures the amplitude of sound waves at regular time intervals, while quantization represents the measured amplitude with a finite number of digital values. For high quality audio, sampling rates of 44.1 kHz or higher and bit depths of 16 bits are commonly used. The document also covers topics like the Nyquist theorem, audio formats, editing digital audio, and more.
This document summarizes audio and video compression techniques. It defines compression as reducing the number of bits needed to represent data. For audio, it describes lossless compression which removes redundant data without quality loss, and lossy compression which removes irrelevant data and degrades quality. It also describes audio level compression. For video, it defines lossy compression which greatly reduces file sizes but decreases quality, and lossless compression which preserves quality. The advantages of compression are also stated such as faster transmission and reduced storage needs, while disadvantages include possible quality loss and extra processing requirements.
This document summarizes a seminar presentation on audio compression techniques. It introduces common audio compression methods like PCM, DPCM, adaptive DPCM, linear predictive coding, perceptual coding, and MPEG audio coders. Specific techniques covered include third order predictive DPCM, backward and forward adaptive bit allocation used in Dolby AC-1. Applications of audio compression include conferencing, broadcasting radio programs by satellite, and saving memory space in sound cards.
Sound is created by vibrations that travel as waves through air or another medium. These sound waves can be captured and converted into digital audio files through the process of sampling and quantization. The quality of a digital audio file depends on factors like sampling rate, sample size, and avoiding clipping. Preparing high quality digital audio involves balancing file size needs with sound quality through proper recording levels and format settings.
This document provides an overview of digital audio compression techniques. It discusses how audio compression removes redundant or irrelevant information to reduce required storage space and transmission bandwidth. It describes how psychoacoustic modeling is used to eliminate inaudible components based on principles of masking. Spectral analysis is performed using transforms or filter banks to determine masking thresholds. Noise allocation quantizes frequency components to minimize noise while meeting thresholds. Additional techniques like predictive coding, coupling/delta encoding, and Huffman coding provide further compression. The encoding process involves analyzing, quantizing, and packing audio data into frames for storage or transmission.
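Of the techniques named above, Huffman coding is the most compact to illustrate. A minimal Python sketch (illustrative only; real audio coders apply it to quantized frequency-domain values, not raw lists):

import heapq
from collections import Counter

def huffman_code(symbols):
    # Build (count, tiebreak, {symbol: bitstring}) entries; merge the two
    # rarest until one tree remains. Frequent symbols get short codes.
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

data = [0, 0, 0, 0, 1, 1, -1, 2]      # quantized values: zero dominates
code = huffman_code(data)
bits = sum(len(code[s]) for s in data)
print(code)                            # e.g. {0: '0', 1: '10', -1: '110', 2: '111'}
print(bits, "bits vs", len(data) * 2, "bits fixed-length")   # 14 vs 16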
Jonathan introduces digital media, including digital audio players that store and play audio files. Audio editing developed in the 20th century. The growing popularity of high-quality audio compression will change how consumers use recorded music. Digital video uses digital representations recorded on tape or discs, while digital photography uses electronic devices to capture and store images digitally rather than on film. Digital media hardware includes cameras, recorders, and computers used for digital audio, video, and photography.
The document discusses digitized sound data used in multimedia presentations. It describes key concepts like sampling frequency, sampling depth, file formats for storing sound like WAV and MP3. Higher sampling frequencies and depths improve sound quality but increase file sizes. Compression techniques like MP3 reduce file sizes significantly compared to uncompressed formats like WAV, with some loss of quality.
Sampling rate refers to the number of times per second that the amplitude of a sound wave is recorded in a digital audio file. A higher sampling rate allows for higher frequencies to be captured, more closely representing the original sound. The standard CD sampling rate is 44.1 kHz. Bit depth refers to the number of possible values used to record the amplitude of each sample, with 16 bits being standard for CD quality audio. Higher bit depths can capture a wider dynamic range of sounds but take up more storage space. Lossy compression formats like MP3 and Vorbis reduce file sizes by removing some audio information considered imperceptible to human hearing.
1. The document discusses different compression techniques for text, audio, images, and video.
2. It provides examples of compression ratios achieved using lossy and lossless compression methods. For example, text compression can achieve 3:1 ratios using Lempel-Ziv coding, while audio compression can achieve ratios from 3:1 to 24:1 using MP3.
3. The techniques discussed include entropy encoding, run-length encoding, Huffman coding, discrete cosine transforms, and differential encoding which takes advantage of redundancies in the data. The best approach depends on the type of data and acceptable quality.
This document discusses various audio compression techniques including:
1. Differential Pulse Code Modulation (DPCM) which encodes differences between samples to reduce bitrate.
2. Third-order predictive DPCM, which predicts each sample from the previous three samples to improve accuracy over plain DPCM.
3. Adaptive Differential PCM (ADPCM) which varies the number of bits used based on signal amplitude.
It then covers more advanced techniques like Linear Predictive Coding (LPC), which analyzes perceptual features of audio to further reduce bitrates. A first-order DPCM sketch follows below.
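A first-order DPCM sketch in Python (illustrative parameters; real codecs such as ADPCM adapt the step size rather than fixing it):

import numpy as np

def dpcm_encode(x, nbits=4, step=0.05):
    # Quantize the difference between each input sample and the previous
    # *reconstructed* sample, so encoder and decoder stay in sync.
    qmax = 2 ** (nbits - 1) - 1
    codes, pred = [], 0.0
    for s in x:
        q = int(np.clip(np.round((s - pred) / step), -qmax - 1, qmax))
        codes.append(q)
        pred += q * step
    return codes

def dpcm_decode(codes, step=0.05):
    out, pred = [], 0.0
    for q in codes:
        pred += q * step
        out.append(pred)
    return np.array(out)

t = np.arange(64) / 8_000
x = 0.5 * np.sin(2 * np.pi * 200 * t)   # slowly varying, so differences are small
y = dpcm_decode(dpcm_encode(x))
print("max error ~", round(float(np.max(np.abs(x - y))), 3))  # about half a step, at 4 bits/sample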
This document discusses digital audio media, including the goals of understanding various audio file formats and audio compression. It explains key terms like audio and compression, and asks the reader to research and explain the characteristics of three audio file formats, including whether they are compressed. Keywords include audio, sound, compression and platforms that allow listening to digital media.
This document provides information on using feedback to improve work. It discusses choosing appropriate file formats for different scenarios and understanding how compression can impact file size and quality. The document emphasizes using feedback to strengthen areas of research, planning, creation and review in revision guides with the goal of improvement. Key words around feedback and improvement are also defined.
The document discusses audio compression techniques. It begins with an introduction to pulse code modulation (PCM) and then describes μ-law and A-law compression standards which compress audio using companding algorithms. It also covers differential PCM and adaptive differential PCM (ADPCM) techniques. The document then discusses the MPEG audio compression standard, including its encoder architecture, three layer standards (Layers I, II, III), and applications. It concludes with a comparison of various MPEG audio compression standards and references.
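A minimal sketch of μ-law companding as described above, using the continuous μ-law formula (the G.711 standard actually uses a segmented 8-bit approximation of this curve):

import numpy as np

MU = 255.0   # the standard North American / Japanese value

def mulaw_compress(x):
    # Map linear samples in [-1, 1] onto a logarithmic scale so that
    # quiet samples get a larger share of the quantizer's levels.
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)

def mulaw_expand(y):
    return np.sign(y) * ((1.0 + MU) ** np.abs(y) - 1.0) / MU

x = np.array([0.001, 0.01, 0.1, 1.0])
print(np.round(mulaw_compress(x), 3))                   # [0.041 0.228 0.591 1.   ]
print(np.allclose(mulaw_expand(mulaw_compress(x)), x))  # True: companding is invertible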
Multimedia Technologies Introduction Subject
Multimedia Technology introduction - I created these slides for my students to teach CMP 383 Multimedia Technology at Jazan Community College, Jazan University.
Audio disc: processing of the audio signal, read-out from the disc, and reconstruction of the audio signal. Video disc: video disc formats, recording systems, and playback systems, including CD players, DVD players, and Blu-ray discs.
Abstract: Many sound recorders on the market today are designed for recording sound in digital format, built on various platforms. The recorder described here is designed on an ARM9 base and records sound in .wav format. Recordings can be stored on multiple memory devices: here, on an SD card as well as on a connected USB device. With this device, multiple users can obtain the recorded audio file without needing individual sound recorders; by connecting their USB devices to the system, they receive the recording in .wav format. The recorded file can also be transferred to a remote place. Keywords: ARM9, SD card, USB device, .wav format, remote place.
This chapter describes the process of converting analog signals to digital form. It discusses sampling, where the analog signal is captured at discrete time intervals. It also discusses quantization, where the amplitude of the signal at each sample is assigned a digital value. The sampling rate and number of quantization levels affect the quality and size of the digital data. Digital signals have advantages over analog like improved quality, ability to edit and combine content, and efficient compression and storage. The chapter also covers filtering of digital signals.
This document discusses different multimedia elements including sound, animation, and video. It covers:
- Understanding how sound works through analog waves, digital sampling, and sound file formats. Sound can be added to multimedia through a sound card and editing software.
- 2D and 3D animation techniques and how they are used on the web. Animation can enhance multimedia titles.
- Video compression methods, editing, and embedding video on the web. The file size of video content needs to be decreased for web use.
- The document considers balancing multimedia elements with objectives, costs, and file sizes for different intended audiences and applications.
Digital Audio Watermarking Using Psychoacoustic Model and CDMA Modulation
Digital watermarking is used to insert information (a signature) into a computer document. The added signature must be imperceptible and undetectable by any system that does not know its mode of insertion; in particular, it must be completely invisible to the human eye. This differs from cryptography, which hides a message by making it unreadable. Digital watermarking now extends to embedding other data within music signals: audio watermarking consists of embedding inaudible information in an audio signal. The watermarking system must guarantee transmission of this information in a way that is inaudible, reliable, and robust against a set of disruptions. To this end, we propose a new insertion strategy adapted to a watermarking system, which produces a watermark that is inaudible and maximally robust to added noise.
This document provides an introduction to communication systems. It defines a communication system as a system that transfers information from one place to another. Communication systems have various components including a source that generates a message, a transmitter that converts the message to a signal, a channel that conveys the signal, a receiver that converts the signal back to a message, and a destination. Communication systems can transfer both analog and digital signals and messages. Key aspects of communication systems discussed include modulation, encoding, bandwidth, and the tradeoff between communication resources and system performance.
Digital audio can be in the form of sampled audio or MIDI data. Sampled audio involves capturing an analog sound wave by taking regular samples of its amplitude at a certain sampling rate. MIDI data instead conveys musical performance instructions rather than the actual sound. While sampled audio requires more storage space, MIDI files are smaller and can be embedded in web pages more easily. When choosing an audio format, considerations include file size, compatibility, and the sound playback capabilities of the end user's system.
Digital audio can be in the form of sampled audio or MIDI data. Sampled audio involves capturing an analog sound wave through sampling at regular intervals (sample rate) and storing the amplitude measurements digitally. MIDI data instead stores instructions on how to recreate a musical performance without the actual sound recording. While sampled audio provides higher quality sound, MIDI files are smaller in size and can be changed without affecting pitch or quality, making them suitable for early web embedding. Factors like file format, playback capabilities and sound type must be considered when adding audio to multimedia projects.
This document discusses a redundancy removal technique for real-time voice compression. It begins by introducing voice compression and its increasing popularity. It then describes implementing a redundancy removal technique using MATLAB to encode and compress speech in real-time. The technique accurately estimates speech parameters and is computationally efficient. Testing showed it provided high compression and high quality audio. The technique reduces bandwidth needs for voice traffic, providing better performance than other methods for real-time applications.
Multimedia and System Design: Sound and Images, by Zubair Yaseen & Yameen Shakir, University of Education
The document discusses various topics related to multimedia, including images, sounds, and digital audio. It provides details on bitmap and vector images, describes the components of sound waves, and explains how to record and edit digital audio files. Tips are provided on balancing sound quality and file size when preparing digital audio, and setting proper recording levels to avoid distortion. Different image and audio file formats used for multimedia are also cited.
This document provides an introduction to digital audio and sound representation on computers. It discusses how sound is a pressure wave that humans can hear within the range of 20Hz to 20kHz. For digital storage and processing, sound must be sampled and quantized by converting the analog sound signal into a digital format. Key concepts covered include the Nyquist theorem for sampling rate, quantization levels for quality/signal-to-noise ratio, common audio file formats, and an overview of MIDI for digital musical instrument communication.
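A small numerical sketch of the Nyquist theorem mentioned above: a tone above half the sampling rate aliases down to a lower frequency (here, 6 kHz sampled at 8 kHz shows up at 2 kHz).

import numpy as np

fs = 8_000                       # sampling rate; the Nyquist limit is fs/2 = 4 kHz
t = np.arange(fs) / fs           # one second of samples

for f in (1_000, 6_000):         # 6 kHz violates the limit
    x = np.sin(2 * np.pi * f * t)
    peak_hz = np.argmax(np.abs(np.fft.rfft(x))) * fs / len(x)
    print(f, "Hz tone -> spectral peak at", int(peak_hz), "Hz")
# 1000 Hz appears at 1000 Hz, but 6000 Hz aliases to 8000 - 6000 = 2000 Hz.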
The document describes a student project that used a Texas Instruments DSP starter kit to implement digital signal processing techniques for real-time audio effects and a haptic beat detector device. Key aspects included designing digital filters in MATLAB to create effects like echo, reverb and chorus. A haptic motor controller connected to the DSP board detected beats in music and vibrated in time. The project provided hands-on experience with DSP concepts and their applications in areas like assistive technology. Evaluation showed the audio effects and beat detector worked as intended.
This document discusses a proposed low bit rate audio codec algorithm using the discrete wavelet transform. The key aspects of the algorithm are listed below, followed by a small thresholding sketch:
1. Choosing an optimal wavelet basis for audio signals and determining the optimal decomposition level in the discrete wavelet transform.
2. Applying thresholding to wavelet coefficients to truncate insignificant coefficients, allowing data compression while maintaining suitable peak signal to noise ratio.
3. Comparing performance of the audio codec using discrete wavelet transform to one using discrete wavelet packet transform.
4. Applying a postfiltering technique to improve the quality of the reconstructed audio signal by estimating and subtracting the error in the coded signal.
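A minimal sketch of the coefficient-thresholding step (point 2 above), assuming the PyWavelets package is available; the wavelet choice and threshold rule here are illustrative, not those of the proposed codec:

import numpy as np
import pywt   # PyWavelets

fs = 8_000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 880 * t)

coeffs = pywt.wavedec(x, "db4", level=5)                       # multi-level DWT
thr = 0.05 * max(float(np.max(np.abs(c))) for c in coeffs)
kept = [pywt.threshold(c, thr, mode="hard") for c in coeffs]   # zero small coefficients

zeroed = sum(int(np.sum(c == 0)) for c in kept)
total = sum(c.size for c in kept)
y = pywt.waverec(kept, "db4")[: len(x)]
snr = 10 * np.log10(np.mean(x**2) / np.mean((x - y) ** 2))
print(zeroed, "of", total, "coefficients zeroed; reconstruction SNR ~", round(float(snr), 1), "dB")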
This document discusses multimedia information representation and digitization principles. It covers the different media types used in multimedia like text, images, audio, and video. It explains how each media type is represented digitally and the encoding and decoding processes used to convert analog signals to digital and vice versa. It also discusses topics like digital sampling, quantization, signal bandwidth, encoding design, and image and text representation formats.
The document discusses various approaches for streaming stored audio and video over the internet. It describes:
1. Using a web server, which allows simple downloading of compressed files but requires fully downloading before playback.
2. Using a web server with a metafile, which provides information to the media player to access the audio/video file, reducing download time.
3. Using a separate media server, as web servers are designed for TCP, while streaming requires UDP for improved performance without retransmissions. The media player accesses the audio/video file from the media server.
Recordings can be analog or digital. Analog recording captures sound waves on a medium like a phonograph. Digital recording converts sound to a series of numbers and stores it digitally. Recordings are used in education for speech practice, drama, and music. They allow replaying lessons and come in formats like tapes, records, and CDs. The recording process involves capturing sound, converting it to numeric format, storing it, and reconverting playback. Recordings provide a safe, easy way to store information but overuse can bore students.
This document discusses audio compression using multiple transformation techniques for audio applications. It compares the Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT) for compressing audio signals. The DCT and DWT are applied to audio signals to generate new data sets with smaller values, achieving compression. Performance is evaluated using metrics like compression ratio, peak signal-to-noise ratio, signal-to-noise ratio, and normalized root mean square error. The results show that DWT provides a lower compression ratio but higher performance metrics compared to DCT. Overall, the document examines using DCT and DWT transforms to compress audio signals and compares their performance.
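The metrics named above are straightforward to compute; a minimal sketch using their usual definitions (variable names are illustrative):

import numpy as np

def metrics(original, reconstructed, kept_coeffs, total_coeffs):
    err = original - reconstructed
    mse = np.mean(err ** 2)
    return {
        "compression_ratio": total_coeffs / kept_coeffs,        # e.g. 4.0 means 4:1
        "snr_db": 10 * np.log10(np.mean(original ** 2) / mse),
        "psnr_db": 10 * np.log10(np.max(np.abs(original)) ** 2 / mse),
        "nrmse": float(np.sqrt(mse) / (np.max(original) - np.min(original))),
    }

x = np.sin(np.linspace(0, 20 * np.pi, 1000))
y = x + 0.01 * np.random.default_rng(0).standard_normal(1000)   # stand-in reconstruction
print(metrics(x, y, kept_coeffs=250, total_coeffs=1000))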
Data Compression using Multiple Transformation Techniques for Audio Applications
Novel Approach of Implementing Psychoacoustic Model for MPEG-1 Audio
1. Analog-to-digital conversion (ADC) allows computers to interact with analog signals by sampling and quantizing analog signals from devices like CD players.
2. During recording, an ADC converts an analog audio signal into a digital format by repeatedly measuring and assigning a binary number to the signal's amplitude at set intervals defined by the sample rate.
3. During playback, a digital-to-analog converter (DAC) reconverts the digital numbers back into an analog signal by combining the amplitude information from each sample to rebuild the original wave. A round-trip sketch follows below.
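A round-trip sketch of points 2 and 3 in Python (a uniform quantizer on a sine test signal; the well-known rule of thumb is roughly 6 dB of SNR per bit):

import numpy as np

fs = 48_000
t = np.arange(fs) / fs
analog = 0.9 * np.sin(2 * np.pi * 1_000 * t)   # stand-in for the analog input

for bits in (8, 12, 16):
    levels = 2 ** bits
    # ADC: map [-1, 1] onto integer codes at each sample instant.
    codes = np.clip(np.round((analog + 1) / 2 * (levels - 1)), 0, levels - 1)
    # DAC: map the codes back to amplitudes to rebuild the wave.
    rebuilt = codes / (levels - 1) * 2 - 1
    noise = analog - rebuilt
    snr = 10 * np.log10(np.mean(analog ** 2) / np.mean(noise ** 2))
    print(bits, "bits -> SNR ~", round(float(snr), 1), "dB")   # ~6 dB gained per extra bit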
1. Audio Compression
by: Philipp Herget
Sufficiency Course Sequence:
Course Number Course Title Term
HI1341 Introduction to Global History A92
HI2328 History of Revolution in the 20th Century B92
MU1611 Fundamentals of Music I A93
MU2611 Fundamentals of Music II B93
MU3611 Computer Techniques in Music C94
Presented to: Professor Bianchi
Department of Humanities & Arts
Term B, 1996
FWB5102
Submitted in Partial Fulfillment
of the Requirements of
the Humanities & Arts Sufficiency Program
Worcester Polytechnic Institute
Worcester, Massachusetts
2. Abstract
This report examines the area of audio compression and its rapidly expanding use
in the world today. Covered topics include a primer on digital audio, discussion of
different compression techniques, a description of a variety of compressed formats, and
compression in computers and Hi-Fi stereo equipment. Information was gathered on a
multitude of di erent compression uses.
4. 1 Introduction
The first form of audio compression came out in 1939 when Dudley first introduced the
VOCODER (VOice CODER) to reduce the amount of bandwidth needed to transmit speech
over a telephone line (Lynch, 222). The VOCODER broke speech down into certain frequency
bands, transmitted information about the amount of energy in each band, and then
synthesized speech using the transmitted information on the receiving end of the device. Since
then, there has been a great deal of research conducted in the area of audio compression. In
the 1960s, compression was used in telephony, and extensive research was done to minimize
bandwidth needed to transmit audio data (Nelson, 313). Today, audio compression is a large
subarea of Audio Engineering.
The need for audio compression is brought about by the tremendous amount of space
required to store high quality digital audio data. One minute of CD quality audio data
takes up 4 Mbytes of storage space (Ratcliff, 32). The use of compression allows a significant
reduction in the amount of data needed to create audio sounds with usually only a minimal loss
in the quality of the audio signal. Compression comes at the expense of the extra hardware or
software needed to compress the signal. However, in today's technologically advanced times,
this cost is usually small compared to the cost of the space that is saved.
Compression is used in almost all new digital audio devices on the market, and in many of
the older ones. Some examples are the telephone system, digital message recorders, like those
in answering machines, and Sony's new MiniDisc player. With the use of compression, these
devices are able to store more information in less space. Compression is accompanied by a
loss in quality, but usually so minimal it cannot be heard by most people. A good example
of this is the anti-shock mechanism found in the newer CD players. This mechanism uses a
small portion of digital memory to buffer digital data from the CD. When a physical shock
disrupts the player and it can no longer read data from the CD, the data from the memory
buffer is used to generate the audio signal until the player re-tracks on the CD. To store a
maximum amount of data, the player uses compression to store the data in the memory. The
Panasonic SL-S600C has such an anti-shock mechanism with 10 seconds of storage buffer.
The Panasonic SL-S600C Operating Instructions state:
The extra anti-shock function incorporates digital signal compression technology.
When listening to sound with the unit connected to a system at home, it is
recommended that the extra anti-shock switch be set to the OFF position.
The recommendation is given because the compression algorithm used in the storage has a
slightly detrimental impact on the sound quality.
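To make the tradeoff concrete, the following back-of-the-envelope calculation in C shows how much memory a 10-second anti-shock buffer would need with and without compression; the 4:1 compression ratio is an assumed illustrative figure, since the ratio actually used by the player is not documented here:

    #include <stdio.h>

    /* Rough sizing of an anti-shock buffer. CD audio is 44100
     * samples/s, 16 bits (2 bytes) per sample, 2 channels. The
     * 4:1 compression ratio is an assumption for illustration. */
    int main(void)
    {
        const long bytes_per_second  = 44100L * 2 * 2; /* 176400 B/s of PCM */
        const int  buffer_seconds    = 10;             /* as on the SL-S600C */
        const int  compression_ratio = 4;              /* assumed */

        long raw        = bytes_per_second * buffer_seconds;
        long compressed = raw / compression_ratio;

        printf("uncompressed buffer: %ld bytes\n", raw);        /* ~1.7 MB */
        printf("compressed buffer:   %ld bytes\n", compressed); /* ~441 KB */
        return 0;
    }

With compression, the same 10 seconds of protection fits in roughly a quarter of the memory, which is what makes the feature practical in a portable player.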
The use of audio compression is a tradeoff among different factors. Knowledge of audio
compression is useful not only to the designer, but also to the consumer. The key questions that
arise in the evaluation of an audio compression system are how much the data is compressed,
what are the losses associated with the compression, and what is the cost of the compression.
This paper will answer some of these questions by providing a basic awareness of compression,
giving background on compression, explaining various popular compression techniques, and
discussing the compression formats used in various audio devices and audio computer files.
2 Digital Audio Basics
Compression can be accomplished using two different methods. The first method is to take
the data from a standard digital audio system and compress it using software. The second is
to encode the signal in a different yet similar manner to that done in a normal digital audio
system. Both of these methods are based on digital audio theory; therefore, the understanding
of their functionality and performance requires an understanding of digital audio basics.
The sounds we hear are caused by variations in air pressure which are picked up by our
ear. In an analog electronic audio system, these pressure signals are converted to an electric
voltage by a microphone. The changing voltage, which represents the sound pressure, is
stored on a medium (like tape), and later used to control a speaker to reproduce the original
sound. The largest source of error in such an audio system occurs in the storage and retrieval
process where noise is added to the sound.

[Figure 1: An Example of an Analog Waveform (voltage / air pressure vs. time)]
The idea behind a digital system is to represent an analog (continuous) waveform as a
finite number of discrete values. These values can be stored in any digital media, such as a
computer. Later, the values can be converted back to an analog audio signal. This method
is advantageous over the older analog techniques because no information (quality) is lost in
the storage and retrieval process. Also unlike analog, when a copy of a digital recording is
made, the values can be exactly duplicated, creating an exact replica of the original digital
work. However, the process does suffer other losses. These losses occur in the conversion
process from the analog to the digital format.
To explain the analog to digital conversion process, we will look at an analog audio
waveform and show each of the steps taken in digitizing it. The waveform in Figure 1
represents a brief moment of an audible sound. The amplitude of the waveform represents
the relative air pressure due to the sound.
In a digital system, the waveform is represented by a series of discrete values. To get
these values, two steps must be taken. First the signal is sampled. This means that discrete
values of the signal are selected in time. The second step is to quantize each of the values
attained in the sampling step. Quantization reduces the amount of storage space required for
each value in a digital system.
[Figure 2: An Example of a Sampled Analog Waveform (voltage vs. time; samples marked at interval T)]

In the first step, the samples are taken at constant intervals. The number of samples
taken every second is called the sampling rate. Figure 2 shows the result of sampling the
signal. The X's on the waveform represent the samples which were taken. Since the samples
were taken every T seconds, there are 1/T samples per second. The sampling rate shown
in Figure 2 is therefore 1/T samples/s. Typical sampling rates range from 8000 samples/s
(telephone quality) to 44100 samples/s (CD quality). The term samples/s is often replaced
by the term Hz, kHz, or MHz to represent units of samples/s, kilosamples/s, or megasamples/s
respectively (Audio FAQ).
The sample values, the values with the X's, now represent the original waveform. These
values could now be stored and used at a later time to recreate the original signal. How
well the original signal can be recreated is related to the number of samples taken in a given
time period. Therefore, the sampling rate is a critical factor in the quality of the digitized
signal. If too few samples are taken, then the original signal cannot be re-generated correctly.
In 1933, a publication by Harry Nyquist proved that if the sampling rate is greater
than twice the highest frequency of the original signal, the original signal can be exactly
reconstructed (Nelson, 321). This means that if we sample our original signal at a rate that
is twice as high as the highest frequency contained in the signal, there will be no theoretical
loss of quality. This sampling rate, necessary for perfect reconstruction, is commonly
referred to as the Nyquist rate.
[Figure 3: An Example of Quantization of a Sampled Analog Waveform (voltage vs. time; samples at interval T)]

Now that we have a set of consecutive samples of the original signal, the samples need
to be quantized in order to reduce the storage space required by each sample. The process
involves converting the sampled values into a certain number of discrete levels, which are
stored as binary numbers. A sample value is typically converted to one of 2^n levels, where n
is the number of bits used to represent each sample digitally. This process is carried out in
hardware by a device called an analog to digital converter (ADC).
The result of quantizing the values from Figure 2 is shown in Figure 3. The samples still
have approximately the same value as before, but have been "rounded off" to the nearest of
16 different levels. In a digital system, the amount of storage space required by a number
is governed by the number of possible values that number could have. By quantizing the
sample, the number of possible values is limited, significantly reducing the required storage
space. After quantizing the value of each sample in the figure to one of 2^4 = 16 levels, only 4 bits
of storage are needed for each sample. In most digital audio systems, either 8 or 16 bits are
used for storage, yielding 2^8 = 256 or 2^16 = 65536 different levels in the quantization process.
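The following C sketch illustrates uniform quantization as described above; the [-1, 1] signal range is an assumption made for illustration:

    #include <math.h>

    /* Quantize a sample in [-1.0, 1.0] to one of 2^n uniform levels,
     * returning the level index that would be stored. A real system
     * does this in hardware in the ADC. */
    int quantize(double sample, int n_bits)
    {
        int levels  = 1 << n_bits;   /* 2^n levels */
        double step = 2.0 / levels;  /* width of one quantization step */
        int index   = (int)floor((sample + 1.0) / step);
        if (index >= levels) index = levels - 1; /* clamp sample = +1.0 */
        if (index < 0)       index = 0;          /* clamp sample = -1.0 */
        return index;                /* needs only n_bits bits to store */
    }

    /* Convert a stored level index back to an approximate sample value
     * (the midpoint of its quantization step). */
    double dequantize(int index, int n_bits)
    {
        int levels  = 1 << n_bits;
        double step = 2.0 / levels;
        return -1.0 + (index + 0.5) * step;
    }

With n_bits = 4 this gives the 16 levels of Figure 3; with n_bits = 8 or 16 it gives the 256 or 65536 levels of common audio systems.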
The quantization process is the most significant source of error in a digital audio signal.
Each time a value is quantized, the original value is lost, and the value is replaced by an
approximation of the original. The peak value of the error is 1/2 the value of the quantization
step. Thus the smaller the quantization steps, the smaller the error is. This means the more
bits used to quantize the signal, the better the quality of the reconstructed sound signal, and the
more space required to store the signal values.

[Figure 4: An Example of a Signal Reconstructed from the Digital Data (voltage vs. time)]
To regain the original signal, the values stored as the digital audio signal are
converted back to an analog audio signal using a Digital to Analog Converter (DAC). An
example of the output of the DAC is shown in Figure 4. The DAC takes the sample points and
makes an analog waveform out of them. Due to the process used to convert the waveform,
the resulting signal consists of a series of steps. To remedy this, the signal is then put
through a low-pass filter which smooths out the waveform, removing all of the sharp edges
caused by the DAC. The resulting signal is very close to the original.
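The smoothing filter in a real player is an analog circuit after the DAC, but its effect can be sketched digitally; the smoothing constant below is an illustrative choice:

    /* A first-order low-pass filter that smooths the staircase output
     * of a DAC by moving each output a fraction of the way toward the
     * input. ALPHA is an illustrative smoothing constant. */
    #define ALPHA 0.2

    void lowpass_smooth(const double *in, double *out, int n)
    {
        double y = in[0];
        int i;
        for (i = 0; i < n; i++) {
            y += ALPHA * (in[i] - y);  /* follow the input gradually */
            out[i] = y;
        }
    }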
All the losses in the digital system occur in the conversion process to and from a digital
signal. Once the signal is digital, it can be duplicated, or replayed any number of times and
never lose any quality. This is the advantage of a digital system. The losses generated by
the conversion process can be measured as a Signal to Noise Ratio (SNR), the same measure
used for analog signals. The noise in the signal is considered to be the signal that would have
to be subtracted from the reconstructed signal to obtain the original. SNR is used to compare
the quality of different types of quantization, and is also used in the quality measurement of
compression techniques.
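A minimal C sketch of this measurement, treating the noise as the sample-wise difference between the original and reconstructed signals:

    #include <math.h>

    /* Signal-to-Noise Ratio in dB between an original signal and its
     * reconstruction. The noise is the signal that would have to be
     * subtracted from the reconstruction to obtain the original. */
    double snr_db(const double *original, const double *reconstructed, int n)
    {
        double signal_power = 0.0, noise_power = 0.0;
        int i;
        for (i = 0; i < n; i++) {
            double noise  = original[i] - reconstructed[i];
            signal_power += original[i] * original[i];
            noise_power  += noise * noise;
        }
        if (noise_power == 0.0)
            return HUGE_VAL;  /* perfect reconstruction */
        return 10.0 * log10(signal_power / noise_power);
    }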
3 Compression Basics
The underlying idea behind data compression is that a data file can be re-written in a different
format that takes up less space. A data format is called compressed when it saves either
more information in the same space, or saves information in less space than a standard
uncompressed format. A compression algorithm for an audio signal will analyze the signal
and store it in a different way, hopefully saving space. An analogy could be made between
compression and shorthand. In shorthand, words are represented by symbols, effectively
shortening the amount of space occupied. Data compression uses the same concept.
3.1 Lossless vs. Lossy Compression
The field of compression is divided into two categories, lossless and lossy compression. In
lossless compression, no data is lost in the compression process. An example of a lossless
compression program is pkzip for the IBM PC. This is a shareware utility which is widely
available. It can be used to compress and uncompress any type of computer file. When a file
is uncompressed, the exact original is retrieved. The amount of compression that is achieved
is highly dependent on the type of file, and varies greatly from file to file.
In lossy compression schemes, the goal is to encode an approximation of the original.
By using a close approximation of the signal, the coding can usually be accomplished using
much less space. Since an approximation is saved, instead of the original, lossy compression
schemes can only be used to compress information when the exact original is not needed.
This is the case for audio and video data. With these types of data, any digital format used
is an approximation of the original signal. Computer data and program
files must be compressed using lossless compression because all of the data is usually critical.
In general, lossy compression schemes yield much higher compression ratios than lossless
compression schemes. In many cases, the difference in quality between the compressed
version and the original is so minimal that it is not noticeable. Yet, in other compression
schemes there is a significant difference in quality. Deciding how much information is
to be lost is up to the discretion of the designer of the algorithm or technique. It is a tradeoff
between size and quality.
If the shorthand writer from the previous analogy were to write down only the main ideas of the
text, it would be analogous to lossy compression. Using only the main ideas would be
an extreme form of compression. If he or she were to leave out some adjectives and adverbs,
it would again be a form of lossy compression, this one being less lossy than the first. From
the analogy, it can be seen how the writer (programmer) can decide how important the details
are and how many details to include.
Almost all compression techniques used in digital systems are lossy. This is because
lossless compression algorithms are generally very unpredictable in the amount of compres-
sion they can achieve. In a typical application, there is a limited amount of "space" for the
digital audio data that is generated. If the audio data cannot be compressed to a guaranteed
size, it simply will not fit in the required space, which is unacceptable.
The reason for the unpredictability of a lossless technique lies in the technique itself. Data
which happens to be in a format which does not lend itself to the way the lossless technique
"re-writes" the data will not be compressed. In The Data Compression Book, Mark Nelson
compares raw speech files which were compressed with a shareware lossless data compression
program, ARJ, to demonstrate how well a typical lossless compression scheme will compress
an audio signal. He states:
ARJ results showed that voice files did in fact compress relatively well. The six
sample raw sound files gave the following results:
Filename Original Compressed Ratio
SAMPLE-1.RAW 50777 33036 35%
SAMPLE-2.RAW 12033 8796 27%
SAMPLE-3.RAW 73019 59527 19%
SAMPLE-4.RAW 23702 9418 60%
SAMPLE-5.RAW 27411 19037 30%
SAMPLE-6.RAW 15913 12771 20%
His data shows that the compression ratios fluctuate greatly depending on the particular
sample of speech that is used.
3.2 Audio Compression Techniques
For any type of compression, the compression ratio and the algorithm used are highly dependent
on the type of data that is being compressed. The data source used in this paper is audio
data, and we have already determined that lossy compression will be used in most cases.
Now we can further subdivide the source into music and voice data.
The more information that is known about the source, the better the compression
technique can be tailored toward that type of data. The differences between music and speech
allow audio compression techniques to be subdivided into two categories: waveform coding
and voice coding. Waveform coding can be used on all types of audio data, including voice.
The goal of waveform coding is to recreate the original waveform after decompression. The
closer the decompressed waveform is to the original, the better the quality of the coding
algorithm is. The second technique, voice coding, yields a much higher compression ratio,
but can only be used if the audio source is a voice. In voice coding, the goal is to recreate the
words that were spoken and not the actual voice. The algorithms "utilize a priori information
about the human voice, in particular the mechanism that produces it" (Lynch, 255).
Since the two techniques are fundamentally different, the performance of each technique
is measured differently. The performance of waveform coding techniques is measured by
determining how well the uncompressed signal matches the original speech waveform. This
is usually done by measuring the SNR. With the voice coding technique this is not possible
since the technique doesn't try to mimic the waveform. Therefore, in voice coding algorithms,
the quality of the algorithm is measured by listener preference.
These coding techniques can be further subdivided into two categories, time domain
coding and frequency domain coding. In a time domain coding technique, information on each
of the samples of the original signal is encoded. In a frequency domain coding technique,
the signal is transformed into its frequency representation. This frequency representation is
then encoded into a compressed format. Later the information is decoded, and transformed
back into the time representation of the signal to get back the original samples. Most simple
compression algorithms use a time domain coding technique.
The more recent waveform coding techniques provide a much higher compression ratio by
using psychoacoustics to aid in the compression. Psychoacoustics is "the study of how sounds
are heard subjectively and of the individual's response to sound stimuli" (Webster's New
World Dictionary, 1147). By basing the compression scheme on psychoacoustic phenomena,
data that can't be heard by humans can be discarded. For example, in psychoacoustics it has
been determined that certain levels of sounds cannot be heard while other louder sounds are
present (Beerends, 965). This effect is called masking. By eliminating the unheard sounds
from the audio signal, the signal is simplified, and can be more easily compressed. Techniques
like these are used in modern systems where high compression ratios are necessary, like Sony's
new MiniDisc player.
3.3 Common Audio Compression Techniques
The techniques that have been discussed thus far are general subcategories of the approaches
that can be taken when designing an audio compression algorithm. In this section, the details
of some popular compression techniques will be discussed. Since compression is such a large
area, a comprehensive guide to all the different compression methods is far beyond the scope
of this paper. However, this section covers some fundamental and some advanced techniques
to provide a general idea of how different compression techniques are implemented.
To give a general background, both waveform and voice coding techniques are discussed.
Since the waveform coding techniques are simpler, they will be discussed first. In these
techniques, the compressed digital data is often obtained from the original signal itself, rather
than creating standard digital audio data and compressing it with software.
3.3.1 Waveform Coding Techniques
PCM
Pulse Code Modulation (PCM) refers to the technique used to code the raw digital audio
data as described in Section 2. It is the fundamental digital audio technique that is used
most frequently in digital audio systems. Although PCM is not a compression technique,
when it is used along with non-uniform quantization such as µ-Law or A-Law, it can be
considered compression. PCM combined with non-uniform quantization is used as a reference
for comparing the performance of other compression schemes (Lynch, 225).
µ-Law and A-Law Companding
Since the dynamic range of an audio signal is very wide, an audio waveform having a maximum
possible amplitude of 1 volt may never reach over 0.1 volts if the audio signal is not very
loud. If the signal is quantized with a linear scale, the values attained by the signal will
cover only 1/10 of the quantization range. As a result, the softer audio signals have a very
granular waveform after being quantized, and the quality of the sound deteriorates rapidly
as the sound gets softer. To compensate for the wide dynamic range of audio signals, a non-
linear scale can be used to quantize the signal. Using this method, the digitized signal will
have an increased number of steps in the lower range, alleviating the problem (Couch, 152).
Using non-uniform quantization can raise the SNR for a softer sound, making the SNR for
a wide range of sound levels approximately uniform (Couch, 155). Typically, non-uniform
quantization is done on a logarithmic scale.
The two standard formats for the logarithmic quantization of a signal are µ-Law and
A-Law. A-Law is the standard format used in Europe (Couch, 153), and µ-Law is used in
the telephone systems of the United States, Canada, and Japan. The µ-Law quantization,
used in phone systems, uses eight bits of data to provide the dynamic range that normally
requires twelve bits of PCM data (Audio FAQ).
The process of converting a computer file to µ-Law is a form of compression, since the
amount of data that is needed per sample is reduced and the dynamic range of the sample
is increased. The result is much less data with more information. To create µ-Law or A-Law
data, the signal must originally be compressed and later expanded. This process is
commonly referred to as companding.
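The continuous µ-Law curve can be sketched in a few lines of C; real telephone codecs use a segmented approximation of this curve, but the idea is the same:

    #include <math.h>

    #define MU 255.0  /* the mu used in North American telephony */

    /* Compress a linear sample in [-1.0, 1.0] onto the logarithmic
     * mu-Law scale. The result, also in [-1.0, 1.0], is then quantized
     * uniformly with 8 bits; soft sounds get more of the levels. */
    double mulaw_compress(double x)
    {
        double sign = (x < 0.0) ? -1.0 : 1.0;
        return sign * log(1.0 + MU * fabs(x)) / log(1.0 + MU);
    }

    /* Expand a mu-Law value back to the linear scale. */
    double mulaw_expand(double y)
    {
        double sign = (y < 0.0) ? -1.0 : 1.0;
        return sign * (pow(1.0 + MU, fabs(y)) - 1.0) / MU;
    }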
Silence Compression
Silence compression is a form of lossy compression that is extremely easy to implement.
In silence compression, periods of relative silence in an audio signal are replaced by actual
silence. The samples of data that were used to represent the silent part are replaced by a
code and a number telling the device which reconstructs the analog signal how much silence
to insert. This reduces all of the data needed to represent the silent part of the signal down
to a few bytes.
To implement this, the compression algorithm first determines if the audio data is silent
by comparing the level of the digital audio data to a threshold. If the level is lower than the
threshold, that part of the audio signal is considered silent, and the samples are replaced by
zeros. The performance of the algorithm therefore hinges on the threshold level. The higher
the level, the more compression there is but the more lossy the technique is. The amount of
compression achieved also depends on the total length of all the silent periods in an audio
signal. The amount can be very signi cant in some types of audio data like voice data.
Silence encoding is extremely important for human speech. If you examine a
waveform of human speech, you will see long, relatively flat pauses between the
spoken words. (Ratcliff, 32)
In The Data Compression Book, Mark Nelson wrote silence compression code in C, and
used it to compress some PCM audio data files. The results he obtained were as follows:
Filename Original Compressed Ratio
SAMPLE-1.RAW 50777 37769 26%
SAMPLE-2.RAW 12033 11657 3%
SAMPLE-3.RAW 73019 73072 0%
SAMPLE-4.RAW 13852 10962 21%
SAMPLE-5.RAW 27411 22865 17%
[Figure 5: An Example of Signals in a DM Waveform: a) the original and reconstructed waveforms; b) the DM waveform]
The table indicates that silence compression can be very effective in some instances, but in
others it may have no effect at all, or even increase the file size slightly. Silence compression
is used mainly in file formats found in computers.
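A minimal C sketch of the scheme described above, for 8-bit unsigned PCM with midpoint 0x80; the threshold, escape code, and minimum run length are illustrative choices, and a real coder would also need to escape literal occurrences of the code byte:

    #include <stdlib.h>

    #define THRESHOLD     4    /* deviation from 0x80 treated as silence */
    #define SILENCE_CODE  0xFF /* hypothetical escape byte marking a run */
    #define MIN_RUN       8    /* shortest run worth encoding */

    /* Replace runs of near-silent samples by an escape byte and a run
     * length; copy everything else through. Returns compressed size. */
    long compress_silence(const unsigned char *in, long n, unsigned char *out)
    {
        long i = 0, o = 0;
        while (i < n) {
            long run = 0;
            while (i + run < n && run < 255 &&
                   abs((int)in[i + run] - 0x80) <= THRESHOLD)
                run++;
            if (run >= MIN_RUN) {          /* encode the silent run */
                out[o++] = SILENCE_CODE;
                out[o++] = (unsigned char)run;
                i += run;
            } else {                       /* copy one sample as-is */
                out[o++] = in[i++];
            }
        }
        return o;
    }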
DM
Delta Modulation (DM) is one of the most primitive forms of audio encoding. In DM, a
stream of 1 bit values is used to represent the analog signal. Each bit contains information
on whether the DM signal is greater or less than the actual audio signal. From this information,
the original signal can be reconstructed.
Figure 5 shows an example DM signal, the original signal it was generated from, and the
reconstructed signal before filtering. The actual DM signal, Figure 5b, contains information
on whether the output should rise or fall. The size of the step and the rate of the steps are
fixed. The reconstruction algorithm simply raises or lowers the output value according to the
DM waveform.
DM suffers from two major losses, granular noise and slope overload. Granular noise
occurs when the input signal is flat. The DM signal simulates flat regions by rising and
falling, leading to granular noise. Slope overload is caused when the input signal rises faster
than the DM signal can follow it. Granular noise can be eliminated by making the step size
small enough, and slope overload can be prevented by increasing the data rate. However,
decreasing the step size and increasing the data rate also increase the amount of data
needed to store the signal. DM is rarely used, but was explained here to provide a basis for
understanding ADM, which offers a significant advantage over PCM.
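A minimal C sketch of linear DM with a fixed step size (the step value is an illustrative choice); one bit is stored per sample, here kept in a byte for clarity:

    #define DM_STEP 0.01  /* fixed step size, illustrative */

    /* Encode: each bit says whether the staircase approximation should
     * step up (1) or down (0) to chase the input signal. */
    void dm_encode(const double *in, int n, unsigned char *bits)
    {
        double track = 0.0;
        int i;
        for (i = 0; i < n; i++) {
            if (in[i] >= track) { bits[i] = 1; track += DM_STEP; }
            else                { bits[i] = 0; track -= DM_STEP; }
        }
    }

    /* Decode: rebuild the staircase, which would then be low-pass
     * filtered to smooth out the steps. */
    void dm_decode(const unsigned char *bits, int n, double *out)
    {
        double track = 0.0;
        int i;
        for (i = 0; i < n; i++) {
            track += bits[i] ? DM_STEP : -DM_STEP;
            out[i] = track;
        }
    }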
ADM
Adaptive Delta Modulation (ADM) is the solution to the problems with DM. In ADM, the
step size is continuously adjusted, making the step size larger in the fast changing parts of
the signal and smaller in the slower changing parts of the signal. Using this technique, both
the granular noise and the slope overload problems are solved.
In order to adjust the step size, an estimation must be made to determine if the signal is
changing rapidly. The estimation in ADM is usually based on the last sample. If the signal
increased for two consecutive samples, the step size is increased. If the two previous steps
were opposite in direction, then the step size is decreased. This estimation method is simple
yet effective.
The performance of ADM using the above technique turns out to be better than Log PCM
when little data is used to represent a signal (performance here being measured by SNR).
When more data is used, however, Log PCM performs better (Lynch, 229).
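The adaptation rule can be sketched as follows; the growth and decay factors and the step limits are illustrative choices, not values from any cited system. The decoder applies the same rule to its received bits, so no step-size information has to be transmitted:

    #define STEP_MIN 0.001
    #define STEP_MAX 0.5

    /* ADM encoder: two steps in the same direction grow the step size
     * (the signal is changing fast), a reversal shrinks it. */
    void adm_encode(const double *in, int n, unsigned char *bits)
    {
        double track = 0.0, step = STEP_MIN;
        int i, prev = 1;
        for (i = 0; i < n; i++) {
            int bit = (in[i] >= track) ? 1 : 0;
            if (i > 0)
                step *= (bit == prev) ? 1.5 : 0.66;
            if (step < STEP_MIN) step = STEP_MIN;
            if (step > STEP_MAX) step = STEP_MAX;
            track += bit ? step : -step;
            bits[i] = (unsigned char)bit;
            prev = bit;
        }
    }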
DPCM
A Differential Pulse Code Modulation (DPCM) system consists of a predictor, a difference
calculator, and a quantizer. The predictor predicts the value of the next sample. The
difference calculator then determines the difference between the predicted value and the actual
value. Finally, this difference value is quantized by the quantizer. The quantized differences
are used to represent the original signal.
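A minimal C sketch of such a system, using the previously reconstructed sample as the predictor (the simplest choice) and an assumed [-1, 1] difference range:

    #include <math.h>

    /* DPCM encoder: quantize the difference between each sample and
     * the prediction. The predictor tracks the value the decoder will
     * reconstruct, so encoder and decoder stay in step. */
    void dpcm_encode(const double *in, int n, int n_bits, int *codes)
    {
        int levels       = 1 << n_bits;
        double step      = 2.0 / levels;
        double predicted = 0.0;  /* previously reconstructed sample */
        int i;
        for (i = 0; i < n; i++) {
            double diff = in[i] - predicted;
            int q = (int)floor((diff + 1.0) / step);
            if (q < 0)       q = 0;
            if (q >= levels) q = levels - 1;
            codes[i] = q;
            /* add back the quantized difference, as the decoder will */
            predicted += -1.0 + (q + 0.5) * step;
        }
    }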
Essentially, a DM signal is a DPCM signal with one bit being used in the quantization
process and a predictor based on the previous bit. In a DM system, the predicted value
is always the same as the previous value, and the difference between the predicted value
(previous value) and the actual signal is quantized using one bit (two levels).
The performance of a DPCM signal depends on the predictor. The better it can predict
where the signal is headed, the better it will perform. A DPCM system using one previous
value in the predictor can achieve the same SNR as a µ-Law PCM system using one less bit
to quantize each sample value. If three previous values are used for the predictor, the same
SNR can be achieved using two bits less to represent each sample (Lynch, 227). This is a
significant performance increase over PCM because it obtains the same SNR using less data.
This technique can be extended even further by making the prediction method adaptive to the
input data. The technique is called Adaptive Differential Pulse Code Modulation (ADPCM).
ADPCM
ADPCM is a modification of the DPCM technique making the algorithm adapt to the char-
acteristics of the signal. The relationship between DM and ADM is the same as that between
DPCM and ADPCM. In both of these, the algorithm is made adaptive to the changes in the
audio signal. The adaptive part of the system can be built into the predictor, the quantizer,
or both, but has been shown to be most effective in the quantizer (Lynch, 227).
Using this adaptive algorithm, the compression performance can be increased beyond that
of DPCM. "Cohen (1973) shows that by using the two most significant bits in the previous
three samples, a gain in SNR of 7 dB over non-adaptive DPCM can be obtained" (Lynch,
227). Different forms of ADPCM are used in many applications including inexpensive digital
recorders. Also, ADPCM is used in public compression standards which are slowly gaining
popularity, like CCITT G.721 and G.723, which use ADPCM at 32 kbits/s and at 24 or 40
kbits/s respectively (Audio FAQ).
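The adaptive-quantizer idea can be sketched in C as follows, with 2 bits per sample (one sign bit, one magnitude bit); the multipliers and limits are illustrative and are not the tables defined by G.721 or G.723:

    /* ADPCM sketch: a DPCM coder whose quantizer step size adapts,
     * growing after large coded differences and shrinking after small
     * ones. The decoder repeats the same adaptation. */
    void adpcm_encode_2bit(const double *in, int n, int *codes)
    {
        double predicted = 0.0, step = 0.02;
        int i;
        for (i = 0; i < n; i++) {
            double diff = in[i] - predicted;
            int sign    = (diff < 0.0) ? 1 : 0;
            double mag  = sign ? -diff : diff;
            int level   = (mag > step) ? 1 : 0;
            codes[i]    = (sign << 1) | level;  /* 2 bits per sample */

            /* reconstruct as the decoder would, then adapt the step */
            double delta = (level ? 1.5 : 0.5) * step;
            predicted   += sign ? -delta : delta;
            step        *= level ? 1.6 : 0.8;
            if (step < 0.001) step = 0.001;
            if (step > 0.5)   step = 0.5;
        }
    }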
PASC and ATRAC
All of the previously mentioned compression techniques are a relatively simple re-writing
of the audio data. Precision Adaptive Subband Coding (PASC) and Adaptive TRansform
Acoustic Coding (ATRAC) differ from these, because they are much more complex proprietary
schemes which were developed for a specific purpose. PASC and ATRAC were both
developed for use in the Hi-Fi audio market. PASC was developed by Philips for use with
the Digital Compact Cassette (DCC), and ATRAC was developed by Sony for use with their
MiniDisc player. Both of these techniques use psychoacoustic phenomena as a basis for the
compression algorithm in order to achieve the extreme compression ratios required for their
applications.
The details of the algorithms are complicated, and will not be discussed here. More
information is given in the discussion of compression used in Hi-Fi audio equipment in
Section 4.2. In addition to this, details on PASC can be found in Advanced Digital Audio
by Ken Pohlmann, and details on ATRAC can be found in the Proceedings of the IEEE in an
article titled "The Rewritable MiniDisc System" by Tadao Yoshida.
3.3.2 Voice Coding Techniques
LPC
Linear Predictive Coding (LPC) is one of the most popular voice coding techniques. In
an LPC system, the voice signal is represented by storing characteristics about the system
creating the voice. When the data is played back, the voice is synthesized from the stored data
by the playing device. The model used in an LPC system includes the source of the sound,
a variable filter resembling the human vocal tract, and a variable amplifier controlling the
amplitude of the sound.
The source of the sound is modeled in two di erent ways depending on how the voice is
being produced. This is done because humans can produce two types of sound, voiced and
unvoiced. Voiced sounds are those which are created by using the vocal cords and unvoiced
sounds are created by pushing air through the vocal tract. An LPC algorithm models these
sounds by using either periodic pulses (voiced) or a random noise generator (unvoiced)
as the source.
The human vocal tract is modeled in the system as a time-varying filter (Lynch, 240).
Parameters are calculated for the filter to mimic the changing characteristics of the vocal
tract when the sound was being produced. The data used to represent the voice in an LPC
algorithm consists of the information on the filter parameters, the source used (voiced or
unvoiced), the pitch of the voice, and the volume of the voice. The amount of data generated
by storing these parameters is significantly less than the amount of data used to represent
the waveform of the speech signal.
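The synthesis half of the model can be sketched in C; the filter order and the simple pulse-train excitation are illustrative simplifications, and the analysis step that computes the coefficients is omitted:

    #include <stdlib.h>

    #define ORDER 10  /* number of vocal-tract filter coefficients */

    /* LPC synthesis: an excitation source (periodic pulses for voiced
     * sound, random noise for unvoiced) drives an all-pole filter whose
     * coefficients model the vocal tract, scaled by a gain. */
    void lpc_synthesize(const double *coef, double gain, int voiced,
                        int pitch_period, double *out, int n)
    {
        double history[ORDER] = { 0.0 };  /* past output samples */
        int i, j;
        for (i = 0; i < n; i++) {
            double excitation = voiced
                ? ((i % pitch_period == 0) ? 1.0 : 0.0)
                : ((double)rand() / RAND_MAX - 0.5);

            /* all-pole filter: output depends on past outputs */
            double sample = gain * excitation;
            for (j = 0; j < ORDER; j++)
                sample += coef[j] * history[j];

            for (j = ORDER - 1; j > 0; j--)  /* shift history */
                history[j] = history[j - 1];
            history[0] = sample;

            out[i] = sample;
        }
    }

Only the coefficients, the voiced/unvoiced flag, the pitch, and the gain need to be stored for each short frame of speech, which is why the data rate is so low.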
GSM
The Global System for Mobile telecommunications (GSM) is a standard used for compression
of speech in the European digital cellular telephone system. GSM is an advanced compression
technique that can achieve a compression ratio of 8:1. To obtain this high compression ratio
and still produce high quality sound, GSM is based on the LPC voice coding technique and
also incorporates a form of waveform coding (Degener, 30).
4 Uses of Compression
Compression is used in almost all modern digital audio applications. These include
computer files, audio playback devices, telephony applications, and digital recording devices.
Many of the devices, like the telephone system, have been using compression for many years
now. Others have just recently started using it. The type of compression that is used depends
on cost, size, space, and many other factors.
After reviewing a basic background on compression, one question remains unanswered:
what type of compression is used for a particular application? In the following sections, the
uses of compression in two major areas will be discussed: computer files and digital hi-fi
stereo equipment. Knowledge about these areas is particularly useful, because it can help in
deciding which device to use.
4.1 Compression in File Formats
When digital audio technology was first appearing on the market, each computer manufac-
turer had their own file format, or formats, associated with their computer (Audio FAQ). As
software became more advanced, computers attained the ability to read more than one file
format. Today, most software can read and write a wide range of file formats, leaving the
choice to the user.
In general, there are two types of file formats, "raw" and self-describing. In a raw file
format, the data can be in any format; the encoding and parameters are fixed and known in
advance to be able to read the file. The self-describing format has a header in which different
information about the data type is stored, like sampling rate and compression. The main
concern here will be with self-describing file formats, since these are the most often used and
most versatile.
A disadvantage of using compression in computer files is that the file usually needs to be
converted to linear PCM data for playback on digital audio devices. This requires extra code
and processing time. It also may be one of the reasons why approximately half of the file
formats available for computers don't support compression. The following is a chart taken
from the "Audio Tutorial FAQ" of the Center for Innovative Computer Applications. It
describes most of the popular file formats on the market, and the compression that is used, if
any:
Extension, Name   Origin        Variable Parameters
.au or .snd       NeXT, Sun     rate, #channels, encoding, info string
.aif(f), AIFF     Apple, SGI    rate, #channels, sample width, lots of info
.aif(f), AIFC     Apple, SGI    same (extension of AIFF with compression)
.iff, IFF/8SVX    Amiga         rate, #channels, instrument info (8 bits)
.voc              Soundblaster  rate (8 bits/1 ch; can use silence deletion)
.wav, WAVE        Microsoft     rate, #channels, sample width, lots of info
                                [including compression scheme]
.sf               IRCAM         rate, #channels, encoding, info
none, HCOM        Mac           rate (8 bits/1 ch; uses Huffman compression)
none, MIME        Internet      usually 8-bit µ-Law compression [8000 samp/s]
.mod or .nst      Amiga         bank of digitized instrument samples
                                [with sequencing information]
Many of these file formats are just uncompressed PCM data with the sampling rate and
the number of channels used during recording specified in the header. For the formats that do
support compression, it is usually optional. For example, in the Soundblaster ".voc" format,
silence compression can be used, and in the Microsoft ".wav" format, a number of different
encoding schemes can be used including PCM, DM, DPCM, and ADPCM.
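As an example of a self-describing header, the canonical PCM layout of a ".wav" file can be declared in C as below. Real files may carry additional chunks before the data, so a robust reader walks the chunks rather than assuming this fixed layout:

    #include <stdint.h>

    /* The canonical header of a Microsoft .wav file (RIFF header,
     * "fmt " chunk, "data" chunk). Multi-byte fields are little-endian. */
    struct wav_header {
        char     riff[4];         /* "RIFF" */
        uint32_t file_size;       /* total file size minus 8 bytes */
        char     wave[4];         /* "WAVE" */
        char     fmt[4];          /* "fmt " */
        uint32_t fmt_size;        /* 16 for plain PCM */
        uint16_t audio_format;    /* 1 = PCM; other values = compressed */
        uint16_t num_channels;    /* 1 = mono, 2 = stereo */
        uint32_t sample_rate;     /* e.g. 8000, 22050, 44100 */
        uint32_t byte_rate;       /* sample_rate * block_align */
        uint16_t block_align;     /* num_channels * bits_per_sample / 8 */
        uint16_t bits_per_sample; /* 8 or 16 */
        char     data[4];         /* "data" */
        uint32_t data_size;       /* number of audio bytes that follow */
    };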
Conversion from one format to another can be accomplished via software. The "Audio
FAQ" also provides information on a number of different programs that will do the conversion.
When converting from uncompressed to compressed formats, the file is generally smaller
afterwards, but some quality is lost. If the file is later converted back, the size will increase,
but the quality can never be regained.
4.2 Compression in Recording Devices
There are currently four major digital stereo devices on the market. These are the Compact
Disc (CD), the Digital Audio Tape (DAT), the Digital Compact Cassette (DCC), and the
MiniDisc (MD). They are all very different from each other. The CD and MD use an optical
storage mechanism, and the DAT and DCC use a magnetic tape to store the data. There are
also a number of other apparent differences between the mediums. For example, a CD is not
re-writable while the others are.
A major difference that may not be apparent, however, is that the MD and DCC utilize
digital data compression while the DAT and CD do not. This allows the MD and DCC to be
physically smaller than their uncompressed counterparts. In both devices, the smaller data
size is necessary and advantageous.
In the MD, the design goal was to make the optical disc small so that it would be portable.
The MD contains the same density of data as the CD. Only by using compression can the disc
be made physically smaller than the CD. In addition to reducing the size, the compression
used gave the MD other advantages. It allowed the MD to be the first optical player with
the digital anti-shock mechanism described in the introduction. Since less data is required
to generate sound and the MD reads at the same speed as the CD, the MD can read more
data than it needs to generate sound. The extra data is stored in a buffer, which does not
need to be very big. CDs eventually came out with the same technology, but in order to
implement it, the reading speed of the CD needed to be increased, and the data needed to
be compressed after reading to fit it into a memory buffer.
The design goal of the DCC was to make the storage medium inexpensive and the same
size as an audio tape. By doing this, a DCC player could accept standard audio tapes as
well as the new DCC tapes, making it more marketable. To be able to fit the data onto a
relatively inexpensive tape medium which can be housed in an audio cassette case, digital
compression was required.
In both the MD and DCC, the space available for digital audio data was approximately 1/4
of the size required for PCM data. The compression ratio needed was therefore approximately
4:1. To obtain such high compression rates, the compression schemes utilize psychoacoustic
phenomena.
Precision Adaptive Subband Coding (PASC) is the compression algorithm that is used
for the DCC to provide a 4:1 compression of the digital PCM data. PASC is described in
the book Advanced Digital Audio, edited by Ken Pohlmann:
The PASC system is based on three principles. First, the ear only hears sounds
above the threshold of hearing. Second, louder sounds mask softer sounds of
similar frequency, thus dynamically changing the threshold of hearing. Similarly,
other masking properties such as high- and low-frequency masking may be util-
ized. Third, sufficient data must be allocated for precise encoding of sounds above
the dynamic threshold of hearing.
Using PASC, enough digital data can fit onto a medium the size of a cassette to make the
DCC player feasible.
The MD uses the ATRAC compression algorithm, which is based on the same psychoacoustic
phenomena. Compression in a MiniDisc is more advanced, however. The
MiniDisc achieves a compression ratio of 5:1 "in order to offer 74 min of playback time"
(Yoshida, 1498).
Although these algorithms offer such high compression, there are some losses that are
involved. Experts claim that they can hear a difference between a CD and a MD, but the
actual losses are so minimal that the average person will not hear them. The largest errors
occur with certain types of audio sounds that the compression algorithm has problems with.
In an article in Audio Magazine, Edward Foster writes:
Although the test was not double-blind, and thus is suspect, I convinced my-
self I could reliably tell the original from the copy, just barely, but different
nonetheless.

The differences occurred in three areas: a slight suppression of low-level high-
frequency content when the algorithm needed most of the available bitstream
to handle strong bass and midrange content, a slight dulling of the attack of
percussion instruments (piano, harpsichord, glockenspiel, etc.) probably caused
by imperfect masking of "pre-echo," and a slight "post-echo" (noise puff) at the
cessation of a sharp sound (such as claves struck in an acoustically dead envir-
onment). The second and third of these anomalies were most readily discernible
on single instruments played one note at a time in a quiet environment and were
taken from a recording specifically made to evaluate perceptual encoders.
Similar effects exist when listening to a DCC recording. Although the losses are minimal,
they are still present, being the tradeoff of having the small compact portable format.
5 Conclusion
In the last decade, the field of digital audio compression has grown tremendously. With the
expansion of the electronics industry and the decreasing prices of digital audio, many devices
which once used analog audio technology now use digital technology. Many of these digital
devices use compression to reduce storage space, and bring down cost.
Digital audio compression has become a sub-area of Audio Engineering, supporting many
professionals who specialize in this field. Millions of dollars are invested by companies,
such as Sony and Philips, to develop proprietary compression schemes for their digital audio
applications (Audio FAQ).
Because of the widespread use of compression, knowledge in this area can be useful.
As a musician working with modern digital recording and editing equipment, the study of
compression can provide an advantage. Knowledge in the eld of compression can help in
the evaluation and understanding of recording and playback equipment. It can also aid when
manipulating digital files with computers. As we move into the next century, and digital
audio technology continues to grow, the knowledge of audio compression will become an
increasingly valuable asset.
Bibliography
"Audio tutorial FAQ." [FTP://pub/usenet/news.answers/audio-fmts/part 12], Center for
Innovative Computer Applications, August 1994.
J. G. Beerends and J. A. Stermerdink, "A perceptual audio quality measure based on
a psychoacoustic sound representation," AES: Journal of the Audio Engineering Society,
vol. 40, p. 963, December 1992.
L. W. Couch, Digital and Analog Communication Systems. New York, NY: Macmillan
Publishing Company, fourth ed., 1993.
J. Degener, "Digital speech compression," Dr. Dobb's Journal, vol. 19, p. 30, December
1994.
M. Fleischmann, "Digital recording arrives," Popular Science, vol. 242, p. 84, April 1993.
E. J. Foster, "Sony MSD-501 minidisc deck," Audio, vol. 78, p. 56, November 1994.
D. B. Guralnik, ed., Webster's New World Dictionary. New York, NY: Prentice Hall Press,
second college ed., 1986.
P. Lutter, M. Muller-Wernhart, J. Ramharter, F. Rattay, and P. Slowik, "Speech research
with WAVE-GL," Dr. Dobb's Journal, vol. 21, p. 50, November 1996.
T. J. Lynch, Data Compression: Techniques and Applications. New York, NY: Van Nostrand
Reinhold, 1985.
M. Nelson, The Data Compression Book. San Mateo, CA: M&T Books, 1992.
Panasonic Portable CD Player SL-S600C Operating Instructions.
K. C. Pohlmann, ed., Advanced Digital Audio. Carmel, IN: SAMS, first ed., 1993.
J. W. Ratcliff, "Audio compression," Dr. Dobb's Journal, vol. 17, p. 32, July 1992.
J. W. Ratcliff, "Examining PC audio," Dr. Dobb's Journal, vol. 18, p. 78, March 1993.
J. Rothstein, MIDI: A Comprehensive Introduction. Madison, WI: A-R Editions, Inc., 1992.
A. Vollmer, "Minidisc, digital compact cassette vie for digital recording market," Electronics,
vol. 66, p. 11, September 13, 1993.
J. Watkinson, An Introduction to Digital Audio. Jordan Hill, Oxford (GB): Focal Press,
1994.
T. Yoshida, "The rewritable minidisc system," Proceedings of the IEEE, vol. 82, p. 1492,
October 1994.