The Secret Lives
     of MP3 Files

        Doug Kaye
The Conversations Network
    and GigaVox Media

Things you didn't know (or thought you did) about MP3 files.

Published in: Economy & Finance, Technology

The Secret Lives of MP3 Files

  1. The Secret Lives of MP3 Files Doug Kaye The Conversations Network and GigaVox Media
  2. Formats & Encoders • Lossless (WAV, AIFF) • Lossy - MPEG-1, Layer 3 (MP3) - AAC (AAC, M4A, M4B) - MPEG-1, Layer 2 (MP2)
  3. MPEG Confusion • Lossy Perceptual/Psychoacoustical Codecs • MP3 = MPEG-I Layer 3 • MP2 = MPEG-I Layer 2 (not MPEG-II)
  4. Moving Picture Experts Group • MPEG-1: Video CDs, MP3 Audio • MPEG-2: Digital TV, Set-Top Boxes • MPEG-4: Online Multimedia (Video) • MPEG-7: Audio and Video Search • MPEG-21: Multimedia Framework
  5. MPEG-1 for Geeks • Layer 1 • Simple 32-Band Algorithm • Philips DCC (Digital Compact Cassette) • Layer 2 (a.k.a. MUSICAM) • Also 32 Bands • International Standard for Broadcasting
  6. MPEG-1 Layer 3 (MP3) for Geeks • Psychoacoustic Masking • 32 Bands Divided into 576 Subbands • More Accurate Masking Thresholds • Redundancy Reduction • Lossless Huffman Encoding • Bit-Reservoir Buffering • Joint Stereo
  7. Sample Rate for Geeks • The Nyquist Theorem • Sample at 2x the Highest Frequency • 22.05kHz Sample Rate for 11kHz Audio • Sample Rate Is a Property of Uncompressed Source (WAV or AIFF)
  8. Sample Rate in Practice • Standardize on 44.1kHz Sample Rate • Flash & Other Players Require n*11.025kHz • Resample if Source is 48kHz from DVDs
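A minimal sketch of the two sample-rate rules from slides 7 and 8, for readers who want to check a source file's rate. The function names and the constant are illustrative assumptions, not part of the talk.

```python
BASE_RATE_HZ = 11_025  # the "n*11.025kHz" base rate from slide 8


def nyquist_limit_hz(sample_rate_hz):
    """Highest audio frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def is_multiple_of_base(sample_rate_hz):
    """True if the rate is a whole multiple of 11.025kHz."""
    return sample_rate_hz % BASE_RATE_HZ == 0


for rate in (22_050, 44_100, 48_000):
    print(rate, nyquist_limit_hz(rate), is_multiple_of_base(rate))
```

Under these assumptions, 48kHz material fails the multiple check, which matches the slide's advice to resample it.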
  9. Bit Rate for Geeks • Independent of Sample Rate • Specifies Encoder Output File Size (CBR) • @64kbps, 1 hour ≈ 27MB • Variable Bit Rate (VBR) • For Higher Bit Rates Only • Not Universally Supported (Avoid It)
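The file-size figure on slide 9 can be checked with a little arithmetic. This is a sketch that assumes 1 kbps = 1000 bits per second and ignores headers and tags; `cbr_size_bytes` is a hypothetical helper name.

```python
def cbr_size_bytes(bit_rate_kbps, duration_s):
    """Approximate size of a constant-bit-rate file, ignoring headers and tags."""
    return bit_rate_kbps * 1000 * duration_s / 8


one_hour = cbr_size_bytes(64, 3600)
print(one_hour / 1e6)                # 28.8 (decimal megabytes)
print(round(one_hour / 2**20, 2))    # 27.47 (binary megabytes)
```

The slide's "≈ 27MB" appears to correspond to binary megabytes. Note that the function takes no sample rate or channel count: with CBR, only bit rate and duration determine the size.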
  10. Bit Rate in Practice • “Use Higher Bit Rates for Music?” • It’s a Myth! • Human Voices Are Complex • Music Masks Its Own Artifacts • 64kbps is Most Common Today • 96kbps is Gaining
  11. Podcasting Bit-Rate History • June 2003: 32kbps. “Files too large” • April 2004: 48kbps. “No problem” • September 2004: 64kbps. “Quality is low” • Today: Still 64kbps. • Tomorrow??
  12. Stereo Encoding • “Stereo MP3s are twice as large as mono.” • It’s a Myth! • Only Bit Rate Specifies Output File Size • You May Want to Use Higher Bit Rates for Stereo
  13. Stereo Encoding for Geeks • Dual Channel or Independent Channel (IC) - Entirely Separate Left and Right • But Most L/R Information is Redundant • Intensity Stereo (IS) • Mid/Side Stereo (MS) • Joint Stereo (JS) Allows IS/MS Combination
  14. Stereo Encoding (Even Geekier) • JS Encodes L+R and L-R • If L=R then L-R=0 • Since Bit Rate is Constant, L=R Uses Fewer Bits for Stereo Information
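The L+R / L-R idea on slide 14 can be illustrated on plain integer samples. This is only a sketch of the sum/difference transform: real encoders also scale and quantize these values, which is omitted here, and the function names are assumptions.

```python
def ms_encode(left, right):
    """Turn left/right samples into sum (mid) and difference (side) channels."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left/right; m + s = 2L and m - s = 2R, so the division is exact."""
    left = [(m + s) // 2 for m, s in zip(mid, side)]
    right = [(m - s) // 2 for m, s in zip(mid, side)]
    return left, right


mono = [3, -2, 5, 0]
mid, side = ms_encode(mono, mono)
print(side)  # all zeros when L and R are identical
```

When the side channel is all zeros it carries no information, which is why digitally identical L and R channels leave more of a fixed bit budget for the audio itself.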
  15. Stereo Encoding in Practice • Stereo vs. Mono (not Music vs. Voice) is a Good Reason to Use Higher Bit Rates • Greater Separation Suggests Higher Rates • If Mostly Speech, Consider 100% Mono • If Mono, Make L&R Digitally Identical • Always Encode in Stereo for Compatibility
  16. Mastering for MP3 • Help the Encoder: Eliminate Unnecessary Data - High-Pass Filter at 80Hz - Low-Pass Filter at 11kHz (@64kbps encoding) - Normalize
  17. Which is Louder? • It’s Not the Height of the Peaks (voltage) • It’s the Area Under the Curve (power)
  18. Loudness • What’s the Standard? • We Asked: - Podcasters - Audio Engineers - Radio Engineers • Answer: There Isn’t One • It’s a Hard Problem to Solve
  19. Normalization • Peak Normalization (common) - Maximizes Voltage, not Power • RMS Normalization - Maximizes Power (=Loudness) • Determine a Standard Loudness Level
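The difference between peak and RMS normalization (slides 17 and 19) can be sketched on floating-point samples in the range -1.0 to 1.0. The helper names, the sample data, and the target levels are illustrative assumptions; the talk does not specify a target level.

```python
import math


def peak(samples):
    """Largest absolute sample value (the 'height of the peaks')."""
    return max(abs(s) for s in samples)


def rms(samples):
    """Root-mean-square level, a rough proxy for power."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_peak(samples, target_peak=1.0):
    gain = target_peak / peak(samples)
    return [s * gain for s in samples]


def normalize_rms(samples, target_rms):
    gain = target_rms / rms(samples)
    return [s * gain for s in samples]


spiky = [0.9, 0.0, 0.0, 0.0]    # one tall peak, little energy
dense = [0.6, -0.6, 0.6, -0.6]  # lower peak, more energy

# After peak normalization both reach the same peak but differ in RMS level.
print(rms(normalize_peak(spiky)), rms(normalize_peak(dense)))
# After RMS normalization both have the same RMS level.
print(rms(normalize_rms(spiky, 0.25)), rms(normalize_rms(dense, 0.25)))
```

One caveat with this sketch: RMS normalization to a high target can push peaks past full scale, so a practical workflow would likely need limiting or a conservative target.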
  20. Avoid Recording to MP3! • MP3 is a final/release format. • Not designed to be decoded and re-encoded. • Use MP2 Instead... • or the highest MP3 bit rate possible.
  21. AAC/M4B Files? • Yes, AAC is Better Than MP3 • We Added AAC to Support iPod Bookmarks • Painful: Only iTunes Could Encode M4B • Doubled Much of Our Workflow • Can’t Be Easily Assembled
  22. MP2: Why and When? • MPEG-1 Layer 2 • Designed as an Intermediate Format • The Standard in Broadcast Radio • 128kbps per Track • 44.1kHz Sample Rate Preferred
  23. Audio Lessons Learned • MP3 Options • Audio-File Myths • RMS Normalization (Loudness) • AAC/M4B Files (iTunes & iPods) • MP2 Files
  24. To Summarize • Record at 44.1kHz Sample Rate (not in MP3!) • Mastering - RMS Normalization (Pick a Standard Level) - 80Hz High-Pass, 11kHz Low-Pass (for voice) - If Mono, Make L&R Digitally Identical • Encoding - 64kbps when L=R - Consider ≥96kbps for L≠R - Always Use Joint Stereo
