The Secret Lives of MP3 Files

Doug Kaye
The Conversations Network and GigaVox Media
Things you didn't know (or thought you did) about MP3 files.

Published in: Economy & Finance, Technology
The Secret Lives of MP3 Files

  1. The Secret Lives of MP3 Files Doug Kaye The Conversations Network and GigaVox Media
  2. Formats & Encoders • Lossless (WAV, AIFF) • Lossy - MPEG 1, Layer 3 (MP3) - AAC (AAC, M4A, M4B) - MPEG I, Layer 2 (MP2)
  3. MPEG Confusion • Lossy Perceptual/Psychoacoustical Codecs • MP3 = MPEG-I Layer 3 • MP2 = MPEG-I Layer 2 (not MPEG-II)
  4. Moving Picture Experts Group • MPEG-1: Video CDs, MP3 Audio • MPEG-2: Digital TV, Set-Top Boxes • MPEG-4: Online Multimedia (Video) • MPEG-7: Audio and Video Search • MPEG-21: Multimedia Framework
  5. MPEG-1 for Geeks • Layer 1 • Simple 32-Band Algorithm • Philips DCC (Digital Compact Cassette) • Layer 2 (a.k.a. MUSICAM) • Also 32 Bands • International Standard for Broadcasting
  6. MPEG-1 Layer 3 (MP3) for Geeks • Psychoacoustic Masking • 32 Bands Divided into 576 Subbands • More Accurate Masking Thresholds • Redundancy Reduction • Lossless Huffman Encoding • Bit-Reservoir Buffering • Joint Stereo
  7. Sample Rate for Geeks • The Nyquist Theorem • Sample at 2x the Highest Frequency • 22.05kHz Sample Rate for 11kHz Audio • Sample Rate Is a Property of Uncompressed Source (WAV or AIFF)
  8. Sample Rate in Practice • Standardize on 44.1kHz Sample Rate • Flash & Other Players Require n*11.025kHz • Resample if Source is 48kHz from DVDs
  9. Bit Rate for Geeks • Independent of Sample Rate • Specifies Encoder Output File Size (CBR) • @64kbps, 1 hour ≈ 27MB • Variable Bit Rate (VBR) • For Higher Bit Rates Only • Not Universally Supported (Avoid It)
  10. Bit Rate in Practice • “Use Higher Bit Rates for Music?” • It’s a Myth! • Human Voices Are Complex • Music Masks Its Own Artifacts • 64kbps is Most Common Today • 96kbps is Gaining
  11. Podcasting Bit-Rate History • June 2003: 32kbps. “Files too large” • April 2004: 48kbps. “No problem” • September 2004: 64kbps. “Quality is low” • Today: Still 64kbps. • Tomorrow??
  12. Stereo Encoding • “Stereo MP3s are twice as large as mono.” • It’s a Myth! • Only Bit Rate Specifies Output File Size • You May Want to Use Higher Bit Rates for Stereo
  13. Stereo Encoding for Geeks • Dual Channel or Independent Channel (IC) - Entirely Separate Left and Right • But Most L/R Information is Redundant • Intensity Stereo (IS) • Mid/Side Stereo (MS) • Joint Stereo (JS) Allows IS/MS Combination
  14. Stereo Encoding (Even Geekier) • JS Encodes L+R and L-R • If L=R then L-R=0 • Since Bit Rate is Constant, L=R Uses Fewer Bits for Stereo Information
  15. Stereo Encoding in Practice • Stereo vs. Mono (not Music vs. Voice) is a Good Reason to Use Higher Bit Rates • Greater Separation Suggests Higher Rates • If Mostly Speech, Consider 100% Mono • If Mono, Make L&R Digitally Identical • Always Encode in Stereo for Compatibility
  16. Mastering for MP3 • Help the Encoder: Eliminate Unnecessary Data - High-Pass Filter at 80Hz - Low-Pass Filter at 11kHz (@64kbps encoding) - Normalize
  17. Which is Louder? • It’s Not the Height of the Peaks (voltage) • It’s the Area Under the Curve (power)
  18. Loudness • What’s the Standard? • We Asked: - Podcasters - Audio Engineers - Radio Engineers • Answer: There Isn’t One • It’s a Hard Problem to Solve
  19. Normalization • Peak Normalization (common) - Maximizes Voltage, not Power • RMS Normalization - Maximizes Power (=Loudness) • Determine a Standard Loudness Level
  20. Avoid Recording to MP3! • MP3 is a final/release format. • Not designed to be decoded and re-encoded. • Use MP2 Instead... • or the highest MP3 bit rate possible.
  21. AAC/M4B Files? • Yes, AAC is Better Than MP3 • We Added AAC to Support iPod Bookmarks • Painful: Only iTunes Could Encode M4B • Doubled Much of Our Workflow • Can’t Be Easily Assembled
  22. MP2: Why and When? • MPEG-1 Layer 2 • Designed as an Intermediate Format • The Standard in Broadcast Radio • 128kbps per Track • 44.1kHz Sample Rate Preferred
  23. Audio Lessons Learned • MP3 Options • Audio-File Myths • RMS Normalization (Loudness) • AAC/M4B Files (iTunes & iPods) • MP2 Files
  24. To Summarize • Record at 44.1kHz Sample Rate (not in MP3!) • Mastering - RMS Normalization (Pick a Standard Level) - 80Hz Hi-Pass, 11kHz Low Pass (for voice) - If Mono, Make L&R Digitally Identical • Encoding - 64kbps when L=R - Consider ≥96kbps for L≠R - Always Use Joint Stereo
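
The arithmetic behind three of the slides (CBR file size on slide 9, the L+R / L-R representation on slide 14, and peak versus RMS level on slides 17 to 19) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any MP3 encoder is implemented: all function names are hypothetical, samples are plain Python lists, the file-size figure ignores tags and container overhead and reports binary megabytes (which is what makes 64kbps for an hour come out near 27MB), and the sum/difference code works on raw samples, whereas a real encoder applies the idea inside its frequency-domain pipeline.

```python
import math


def cbr_size_mib(bit_rate_kbps, seconds):
    """Approximate size of a constant-bit-rate stream in MiB (2**20 bytes).

    Assumption: 1 kbps = 1000 bits per second; tags and headers are ignored.
    """
    total_bytes = bit_rate_kbps * 1000 * seconds / 8
    return total_bytes / 2**20


def to_sum_diff(left, right):
    """Represent integer stereo samples as (L+R, L-R), as on slide 14."""
    total = [l + r for l, r in zip(left, right)]
    diff = [l - r for l, r in zip(left, right)]
    return total, diff


def from_sum_diff(total, diff):
    """Recover left and right exactly from (L+R, L-R).

    (L+R) + (L-R) = 2L and (L+R) - (L-R) = 2R are always even,
    so integer division loses nothing.
    """
    left = [(t + d) // 2 for t, d in zip(total, diff)]
    right = [(t - d) // 2 for t, d in zip(total, diff)]
    return left, right


def peak(samples):
    """Largest absolute sample value (the 'height of the peaks')."""
    return max(abs(s) for s in samples)


def rms(samples):
    """Root-mean-square level, a rough proxy for perceived loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_normalize(samples, target_rms):
    """Scale samples so their RMS equals target_rms.

    No limiter is applied, so peaks may exceed full scale after scaling.
    """
    gain = target_rms / rms(samples)
    return [s * gain for s in samples]


if __name__ == "__main__":
    # One hour at 64kbps: about 27.5 MiB regardless of mono or stereo.
    print(round(cbr_size_mib(64, 3600), 1))

    # Digitally identical channels leave nothing in the difference signal.
    mono = [3, -5, 7]
    print(to_sum_diff(mono, mono)[1])

    # A single spike has the higher peak but the lower RMS level.
    spiky = [1.0] + [0.0] * 99
    dense = [0.5, -0.5] * 50
    print(peak(spiky) > peak(dense), rms(spiky) < rms(dense))
```

The spiky and dense signals show why peak normalization alone does not equalize loudness: the spike already sits at full scale yet carries far less power than the lower-peaked dense signal.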