Error-Tolerant Audio Coding Workshop
                          Networked Audio Track : Event N1

            David Trainor, Director of Advanced Audio Research, CSR
                                                     26th October 2012




                                                  133rd AES Convention
Workshop Overview


 ▪ Some networked audio trends
   – Hierarchical broadcast / multicast networks
      • Wide-, local- and personal-area
   – Real-time / interactive / low-delay audio services e.g. gaming
   – Networks with more complex QoS characteristics (e.g. wireless)
      • Convenient and inexpensive, but reliability is an issue


 ▪ Audio coding is vital (e.g. network bandwidth management)
    – Should be minimally affected by network reliability fluctuations


 ▪ This workshop discusses
    – Approaches to and capabilities of error-tolerant audio coding
    – Recent advances in the state of the art

Your Panellists


 ▪ Dr David Trainor, CSR (Workshop Chair)
 ▪ Dr Gary Spittle, Dolby Labs
 ▪ Dr Deepen Sinha, ATC Labs
 ▪ Dr Bernhard Grill, Fraunhofer IIS

 ▪ Workshop Format
    – 25 minute presentation by each panellist
       • (20 minutes + 5 minutes Q&A)
    – 15-20 minute general Q&A session




Error-Tolerant Audio Coding
      General Concepts and Techniques




Classification of Audio Error Control Strategies


 ▪ Error correction
    – FEC versus ARQ
    – Dependent source coding versus independent source coding
    – Typically exhibits two levels of success
       • Corrects each error event flawlessly or fails completely


 ▪ Error limiting
    – Limiting of catastrophic propagation of error events
    – Sender-based, Receiver-based or Sender-and-Receiver-based
    – Dependent source coding versus independent source coding
    – Several levels and measures of “success” in perceptual terms
       • Error propagation can continue to different degrees, but not
         beyond a prescribed limit


Classification of Audio Error Control Strategies (2)


 ▪ Error concealment
    – Reduce perceptual significance of error events
    – Sender-based, Receiver-based or Sender-and-Receiver-based
    – Dependent source coding versus independent source coding
    – Many levels and measures of “success” in perceptual terms
       • PEAQ/PESQ objective testing
       • Subjective quality measurements




Error Correction with Independent Source Coding


 ▪ Applied at networking baseband or media access layer
    – CRC-based detection
    – FEC codes
    – ARQ retransmissions


 ▪ Redundancy not applied in an audio-optimized way
    – Packet payload treated as arbitrary data (equal error protection)


 ▪ Bit-rate and delay compromises
    – Packet-based networks may require additional time-domain
      interleaving, etc
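
As a rough illustration of codec-independent protection, the sketch below treats each packet payload as arbitrary bytes: a CRC-32 detects corruption, and one XOR parity packet per group lets the receiver rebuild a single damaged payload. The group size, the helper names and the choice of XOR parity as the FEC code are assumptions made for this example, not something taken from the workshop.

```python
import zlib
from functools import reduce

CRC_BYTES = 4  # CRC-32 appended to every packet


def add_crc(payload: bytes) -> bytes:
    """Append a CRC-32 so the receiver can detect corruption."""
    return payload + zlib.crc32(payload).to_bytes(CRC_BYTES, "big")


def check_crc(packet: bytes):
    """Return the payload if the CRC matches, otherwise None."""
    payload, received = packet[:-CRC_BYTES], packet[-CRC_BYTES:]
    expected = zlib.crc32(payload).to_bytes(CRC_BYTES, "big")
    return payload if received == expected else None


def xor_parity(payloads):
    """XOR equal-length payloads byte by byte (a very simple erasure code)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*payloads))


def receive_group(packets):
    """Check a group (data packets followed by one parity packet).

    At most one failed payload can be rebuilt from the survivors.
    Payloads that cannot be recovered are returned as None.
    """
    payloads = [check_crc(p) for p in packets]
    failed = [i for i, p in enumerate(payloads) if p is None]
    if len(failed) == 1:
        survivors = [p for p in payloads if p is not None]
        payloads[failed[0]] = xor_parity(survivors)
    return payloads[:-1]  # drop the parity payload


# Hypothetical coded audio frames, all the same length.
frames = [b"frame-00", b"frame-01", b"frame-02"]
group = [add_crc(f) for f in frames] + [add_crc(xor_parity(frames))]

# Corrupt one byte of the second packet in transit.
damaged = bytearray(group[1])
damaged[0] ^= 0xFF
group[1] = bytes(damaged)

recovered = receive_group(group)
```

Every byte receives the same protection here regardless of its perceptual importance, which is the equal error protection case described on this slide. The parity packet adds bit-rate, and the receiver must wait for the whole group before it can repair a loss, which adds delay.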




Error Correction with Dependent Source Coding


 ▪ Coded syntax protection prioritization
    – Each field protected according to perceptual significance
    – Unequal error protection across coded frame/stream fields


 ▪ Scalable coding
    – Each coded layer protected according to perceptual significance
    – Unequal error protection across coded layers


 ▪ These techniques augmented by network QoS prioritization
    – Send critical coded frame values or coded audio layers over
      network channels with higher QoS parameters
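
A minimal sketch of unequal error protection across the fields of one coded frame is given below. A repetition code with majority voting stands in for a real FEC code, and the field names, bit patterns and repeat counts are hypothetical values chosen only to make the idea concrete.

```python
# Repeat count per protection level (assumed values for illustration).
REPEATS = {"high": 5, "medium": 3, "low": 1}


def protect(bits: str, level: str) -> str:
    """Repeat every bit according to the protection level of its field."""
    return "".join(bit * REPEATS[level] for bit in bits)


def recover(coded: str, level: str) -> str:
    """Majority-vote each group of repeated bits back to a single bit."""
    r = REPEATS[level]
    groups = [coded[i:i + r] for i in range(0, len(coded), r)]
    return "".join("1" if 2 * g.count("1") > r else "0" for g in groups)


def flip(coded: str, positions) -> str:
    """Simulate channel errors by inverting the bits at the given positions."""
    bits = list(coded)
    for p in positions:
        bits[p] = "1" if bits[p] == "0" else "0"
    return "".join(bits)


# Hypothetical frame: (field name, field bits, protection level).
frame = [
    ("scale_factors", "1011", "high"),  # perceptually critical
    ("spectral_lsbs", "0110", "low"),   # least significant detail
]
coded = {name: protect(bits, level) for name, bits, level in frame}

# Two bit errors in the highly protected field are corrected...
high_out = recover(flip(coded["scale_factors"], [0, 1]), "high")
# ...while a single bit error in the unprotected field gets through.
low_out = recover(flip(coded["spectral_lsbs"], [0]), "low")
```

The same mapping could be applied per layer of a scalable codec instead of per field, with the base layer taking the strongest code.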




Unequal Error Protection Examples

[Figure: fields of each coded frame or layer, colour-coded by protection level]

 ▪ Non-scalable codec
    – Coded Frame: Field 1, Field 2, Field 3, ..., Field N-1, Field N

 ▪ Scalable codec
    – Mid-Quality Stereo Base Layer: Field 1, Field 2, Field 3, ..., Field N-1, Field N
    – High-Quality Stereo Enhancement Layer: Field 1, Field 2, Field 3, ..., Field N-1, Field N
    – Parametric Upmixing Enhancement Layer: Field 1, Field 2, Field 3, Field 4, ..., Field N-1, Field N

 ▪ Colour key
    – Green = low protection
    – Orange = medium protection
    – Red = high protection
Error Correction with Dependent Source Coding (2)


 ▪ Joint quantization and error control code insertion
    – Trade small quantization noise increase for improved robustness


 ▪ Dynamic data segmentation
    – Encode time-varying amounts of audio data
    – Choose audio block size based on (for example)
       • Communications channel reliability (probability of data loss)
       • Network baseband packet size


 ▪ Possible compromises of error correction + audio coding
    – Bit-rate, delay, codec computational complexity




Error Limiting with Dependent Source Coding


 ▪ Key goal is to provide frequent points of resynchronization
    – Error event can’t propagate beyond the next synchronization point


 ▪ Special synchronization codes/values
    – Zero or low probability of occurrence in actual coded audio values


 ▪ Error Resilient Entropy Coding
    – Insert variable-length entropy codes into fixed-length “slots”
    – Partial codes take up spare space at the end of future slots
    – Each slot guaranteed to start with first bit of a valid entropy code
    – Codes are self-delimiting, hence frequent synchronization points
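
The slot-packing step of Error Resilient Entropy Coding can be sketched as follows. This is a simplified encoder only: bits are held as strings for readability, all slots have the same length, and leftover bits search the following slots in simple sequential order, which is an assumption made for illustration. The function name `erec_pack` is hypothetical, and the matching decoder is omitted.

```python
def erec_pack(blocks, slot_len):
    """Pack variable-length bit strings into equal-length slots.

    blocks   : list of non-empty bit strings, one per slot
    slot_len : number of bits in each slot
    Returns the list of slot contents (some slots may stay partly empty).
    """
    n = len(blocks)
    if sum(len(b) for b in blocks) > n * slot_len:
        raise ValueError("blocks do not fit in the available slots")

    slots = [""] * n
    leftover = list(blocks)

    # Stage 0 places the start of block i at the start of slot i.
    # Later stages move overflow bits into spare space at the end of
    # slots further along (wrapping around at the last slot).
    for offset in range(n):
        for i in range(n):
            if not leftover[i]:
                continue
            j = (i + offset) % n
            space = slot_len - len(slots[j])
            if space > 0:
                slots[j] += leftover[i][:space]
                leftover[i] = leftover[i][space:]
    return slots


# Six hypothetical entropy-coded blocks packed into 4-bit slots.
blocks = ["1101", "10", "111000", "0", "1010", "11"]
slots = erec_pack(blocks, slot_len=4)
```

In this example the third block is six bits long, so its last two bits end up after the single bit already held in the fourth slot. Every slot still begins with the first bits of its own block, so a decoder can restart parsing at each slot boundary.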




Error Resilient Entropy Coding Concept

[Figure: variable-length codes packed into eight fixed-length slots, Slot 1 to Slot 8]

    – Insert variable-length codes in fixed-length slots
    – Partial codes that don’t fit in their slot are put into the spare
      space in later slots
    – The start of each slot is the start of a code, giving a frequent
      point of synchronization
Error Limiting with Dependent Source Coding (2)


 ▪ Reversible Variable Length Codes
    – Code sequence readable both forward and backwards in time
    – Parsing in both directions can reveal inconsistencies and hence
      errors.


 ▪ Fixed-length coding techniques

 ▪ Limit differential coding techniques
    – Restart the differential or progressive coding technique
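
The reversible variable length code idea listed above can be illustrated with a toy code table. The four palindromic codewords below are an assumption chosen for this sketch: no codeword is a prefix or a suffix of another, so the same table parses the bitstream in either direction.

```python
# Toy reversible code: prefix-free, suffix-free, and every word is a palindrome.
CODE = {"a": "0", "b": "11", "c": "101", "d": "1001"}
DECODE = {word: symbol for symbol, word in CODE.items()}
MAX_LEN = max(len(word) for word in DECODE)


def encode(symbols):
    """Concatenate the codewords for a sequence of symbols."""
    return "".join(CODE[s] for s in symbols)


def decode(bits):
    """Parse left to right. Returns (symbols, ok).

    ok is False when parsing meets an invalid or incomplete codeword;
    the symbols decoded before that point are still returned.
    """
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in DECODE:
            symbols.append(DECODE[current])
            current = ""
        elif len(current) >= MAX_LEN:
            return symbols, False
    return symbols, current == ""


def decode_backward(bits):
    """Parse from the end of the bitstream towards the start."""
    symbols, ok = decode(bits[::-1])
    return symbols[::-1], ok


clean = encode("abcd")                                  # "0111011001"
corrupt = clean[:1] + ("1" if clean[1] == "0" else "0") + clean[2:]

forward_symbols, forward_ok = decode(corrupt)
backward_symbols, backward_ok = decode_backward(corrupt)
```

With the second bit inverted, both parses report a failure, which signals the error. The backward parse still recovers the trailing symbols `c` and `d` before it fails, whereas the forward parse produces wrong symbols from the point of the error onwards.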




Error Concealment with Independent Source Coding


 ▪ Padding silence or wideband noise

 ▪ Pitch estimation and synthesis
    – Estimate dominant pitches from previous good frames
    – Smooth discontinuities at boundaries


 ▪ Replacement with previous best-matching segments
    – Simpler forms may be correlation-based
    – Sophisticated forms
       • SoFi (University of Ulster). MPEG-7 + semantic song analysis.


 ▪ Filtered wideband noise generation
    – Noise shaped using spectrum of previous good frames
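
A simple correlation-based form of replacement with a previous best-matching segment might look like the sketch below. It takes the last few good samples before the gap as a template, finds the most similar earlier stretch by normalized correlation, and copies whatever followed that stretch into the gap. The function name, the template length and the exhaustive search are assumptions for illustration, and the boundary smoothing mentioned above is left out.

```python
import math


def conceal_gap(history, gap_len, template_len=32):
    """Fill a gap of gap_len samples using the best-matching earlier segment.

    history : list of good samples received before the gap
    Returns gap_len replacement samples copied from the history.
    """
    last_start = len(history) - template_len - gap_len
    if last_start < 0:
        raise ValueError("not enough history for this template and gap")

    template = history[-template_len:]
    template_energy = sum(x * x for x in template)

    best_start, best_score = 0, -math.inf
    for start in range(last_start + 1):
        candidate = history[start:start + template_len]
        energy = sum(x * x for x in candidate)
        norm = math.sqrt(energy * template_energy) or 1.0  # avoid divide by zero
        score = sum(a * b for a, b in zip(candidate, template)) / norm
        if score > best_score:
            best_start, best_score = start, score

    # Copy the samples that followed the best match.
    follow = best_start + template_len
    return history[follow:follow + gap_len]


# A periodic test tone (period 50 samples) with a 20-sample gap at the end.
signal = [math.sin(2 * math.pi * n / 50) for n in range(420)]
history, missing = signal[:400], signal[400:]
replacement = conceal_gap(history, gap_len=20)
worst_error = max(abs(a - b) for a, b in zip(replacement, missing))
```

For a strongly periodic signal like this test tone the replacement lines up with the missing samples almost exactly. Real programme material matches less well, which is where crossfading at the boundaries becomes important.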

Error Concealment with Dependent Source Coding


 ▪ Dynamic data segmentation (discussed previously)

 ▪ Coded domain interleaving
    – Spectral (packet loss only affects specific frequencies)
    – Quantized linear-prediction errors (residuals)


 ▪ Linear Predictive Coding specific methods
    – Coder parameters
        • Copy coder parameters from last good frame
        • Coder parameters from last good frame with scaled-down gains
    – Excite the predictor with statistically shaped synthesized data
      in place of the lost coded residuals
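
The Linear Predictive Coding methods above can be combined in a small sketch: the predictor coefficients of the last good frame are reused, the gain is scaled down for each consecutive lost frame, and the predictor is driven by synthesized Gaussian noise instead of the lost residual. The parameter layout, the decay factor of 0.7, the frame length and the sign convention `s[n] = e[n] + sum(a[k] * s[n-k])` are all assumptions made for this example.

```python
import random


def conceal_lpc_frame(last_good, lost_count, frame_len=160, decay=0.7, seed=0):
    """Synthesize a replacement frame from the last good LPC parameters.

    last_good  : dict with "lpc" (predictor coefficients) and "gain"
    lost_count : number of consecutive frames lost so far (1 for the first)
    Returns (parameters used, synthesized samples).
    """
    coeffs = list(last_good["lpc"])                   # copied unchanged
    gain = last_good["gain"] * decay ** lost_count    # scaled-down gain

    rng = random.Random(seed)
    # Simplification: the filter memory starts at zero rather than
    # continuing from the end of the previous frame.
    memory = [0.0] * len(coeffs)
    samples = []
    for _ in range(frame_len):
        excitation = gain * rng.gauss(0.0, 1.0)       # stand-in for the residual
        sample = excitation + sum(c * m for c, m in zip(coeffs, memory))
        samples.append(sample)
        memory = [sample] + memory[:-1]
    return {"lpc": coeffs, "gain": gain}, samples


# Hypothetical parameters of the last correctly received frame.
last_good = {"lpc": [0.5, -0.2], "gain": 1.0}
params_1, audio_1 = conceal_lpc_frame(last_good, lost_count=1)
params_2, audio_2 = conceal_lpc_frame(last_good, lost_count=2)
```

Because the gain keeps shrinking, a long burst of lost frames fades towards silence instead of sustaining a stale sound.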



