Spatial Sound

Music Department, Ionian University, Corfu, July 2008
Presented by Richard Elen
What is Spatial Sound?
• Sound recording/reproduction intended to
  create or recreate a sense of space, such as
  the environment of a performance
• Using knowledge of human hearing to
  localise sound sources in space
• Possibly in conjunction with a visual aspect
  such as film, live performance, etc
What is a Soundfield?

• The sound around us in an acoustic space.
• The sound around us in a virtual space.
• The pattern of sound waves reaching our
  ears that we perceive as an acoustic
  environment, real or imaginary.
Experiencing Space

• Hearing allows identification of source
  direction from all around, including above/below
• We can also perceive distance of sources
  from the listener
• How is this done with just two ears?
Directional Cues
• Intensity of the same sound in each ear
• Left/right arrival times of sound at the ears
• Spectral differences due to pinnae,
  shoulders, body masking – HRTF (Head
  Related Transfer Functions)
• Visual cues, especially for front/back
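The arrival-time cue can be put into rough numbers. The sketch below uses Woodworth's spherical-head approximation, which is an assumption brought in for illustration and is not named on the slide; the head radius and speed of sound are typical assumed values.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius
SPEED_OF_SOUND = 343.0   # m/s in air at roughly 20 degrees C (assumed)

def itd_seconds(azimuth_deg):
    """Approximate interaural time difference for a distant source.

    azimuth_deg: 0 = straight ahead, 90 = directly to one side.
    Woodworth's spherical-head formula, used here for 0..90 degrees only.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between 0 and 90 degrees")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))
```

Under these assumptions a source directly to one side arrives at the far ear roughly two thirds of a millisecond late, and a source straight ahead gives no time difference at all.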
Distance Cues

• The further away a source is:
 • the quieter it is
 • the fewer high frequencies it has
 • the more reverberation/reflections it has
• What about ‘depth’?
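The "quieter with distance" cue can be sketched with the inverse-square law. This assumes a point source in a free field (no reflections), which is a simplification not stated on the slide; the function name is illustrative.

```python
import math

def level_change_db(distance_m, reference_m=1.0):
    """Change in direct-sound level, in dB, relative to the reference distance.

    Assumes a point source in a free field, so sound pressure falls
    in proportion to 1/distance.
    """
    if distance_m <= 0.0 or reference_m <= 0.0:
        raise ValueError("distances must be positive")
    return -20.0 * math.log10(distance_m / reference_m)
```

Each doubling of distance costs about 6 dB of direct sound. In a real room the reverberant level stays roughly constant, which is why the balance of direct to reflected sound also changes with distance.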
Environment Cues

• Relationship between direct and reflected
  sound
• Character of early reflections
  (where are the walls?)
• Character of reverberation/reverb time
  (how big is the space?)
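The link between reverb time and room size can be sketched with Sabine's formula, RT60 = 0.161 V / A. The formula is an assumption brought in for illustration (the slide does not name it), and it only holds reasonably well for fairly live, diffuse rooms.

```python
def sabine_rt60(volume_m3, surfaces):
    """Estimate reverberation time (seconds) with Sabine's formula.

    volume_m3: room volume in cubic metres.
    surfaces: iterable of (area_m2, absorption_coefficient) pairs.
    """
    absorption = sum(area * coeff for area, coeff in surfaces)
    if absorption <= 0.0:
        raise ValueError("total absorption must be positive")
    return 0.161 * volume_m3 / absorption
```

For the same surface treatment, a larger volume gives a longer reverb time, which is one reason a long decay reads to the ear as a big space.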
Spatial Sound: Applications
• Capturing an acoustic event such as a live
  performance more fully/accurately
• Mixing a multitrack recording to place
  instruments around the listener
• Using computers to generate virtual
  soundfields via algorithms
• Recreating those experiences at home or
  in a performance environment
General vs Special Solutions
• Yes, you can just put speakers anywhere
  and play sounds through them.
• But you cannot reproduce/recreate a
  soundfield.
• An overall technology offers repeatability
  and a general solution rather than a special
  one.
Questions
• “We are there” versus “They are here”
• How easy is it for the listener to experience
  correctly?
• How do we integrate live performers?
• “Spatial accuracy” versus “immersion” →
• “Artistic Illusion” versus “Accurate 3D
  rendering” →
Accuracy or Immersion

• Which is more important?
• Why?
• How much does it depend on the application?
Artistic Illusions?

• Live natural/acoustic music and spaces
• Multitrack-derived music
• Performance art
• Radio/TV sound (drama etc)
• Cinema sound
Accurate Rendering?

• Virtual Reality/Gaming
• Simulators and control systems
• Conferencing
From Mono to…


• Early equipment: monophonic.
  Crude spatial cues via reverberation.
• Phonograph and Gramophone
Clement Ader


• Clement Ader made the first
  stereo transmission
Clement Ader
• Pairs of mics
  feeding pairs
  of earphones
• From the
  Paris Opera
  to the 1881
  Exhibition
Clement Ader


• Early binaural
  but with
  spaced mics
Bell Labs 1930s
• Attempted to recreate wavefronts at the
  listener, using multiple mics/channels/speakers
• Mainly aimed at cinema, not consumers
• Settled on three L-C-R mics/channels/speakers;
  also tried two spaced omnis
• Also worked with 2 channels for disc
Blumlein 1930s
• 1903–1942
• Employed by EMI
• A true genius: Recording,
  Television, Radar...
• Developed various stereo
  technologies
• Coincident mics instead of
  spaced ones
Blumlein Stereo I


• Crossed
  figure-eights
Blumlein Stereo II
• “Mid-Side” (M-S)
• Coincident Omni and
  figure-eight
• Delivers L+R (M)
  and L-R (S)
• Matrixed to L & R
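The M-S matrix is a sum and difference. A minimal sketch follows, assuming M = L+R and S = L-R as on the slide; the 1/2 scaling on decode is one common choice and the function names are illustrative.

```python
def ms_encode(left, right):
    """Return (mid, side) from left and right samples: M = L+R, S = L-R."""
    return left + right, left - right

def ms_decode(mid, side):
    """Recover (left, right) from mid and side: L = (M+S)/2, R = (M-S)/2."""
    return (mid + side) / 2.0, (mid - side) / 2.0
```

Scaling S before decoding changes the apparent stereo width; with S set to zero the result is mono.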
Other Stereo Mic Systems

• A-B Stereo
  Mics
• Spaced
  Omnis
Other Stereo Mic Systems

• XY Stereo:
  Coincident
  Cardioids
Other Stereo Mic Systems

• ORTF stereo – cardioids spaced 17 cm apart
  and angled outwards at 110°
Binaural (Dummy Head)
• Attempts to capture all
  the information the
  hearing system would
  capture
• Technique passes in and
  out of fashion
• Very effective at best
Binaural II
• Listen on headphones
  (speakers require a
  transfer function)
• Can be very good
  indeed, but there are
  problems
• DSP-synthesised
  binaural mixing
Stereo Mixing I

• Localise sources between two speakers
  using level alone
• Effect depends on listener position
• Standard technique for multitrack mixing
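Level-only panning can be sketched as a pan law that maps a position to a pair of gains. The sine/cosine (constant-power) law below is an assumption, one common choice among several; the slide does not specify a law.

```python
import math

def pan_gains(position):
    """Return (left_gain, right_gain) for a pan position.

    position: -1.0 = hard left, 0.0 = centre, +1.0 = hard right.
    Sine/cosine law: the two gains always have unit total power.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1 and +1")
    angle = (position + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)
```

A centre pan sends about 0.707 (-3 dB) to each speaker. Nothing here encodes time differences, so the image only sits where intended for a listener equidistant from both speakers.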
Stereo Mixing II


• Level-only panning
  makes listener
  position important
Consumer Stereo

• 1950s: Stereo Vinyl discs using Blumlein’s
  cutting technique
• Classical music with various mic techniques
• Popular music mixed from multitrack
Cinema Stereo
• Multichannel effects used for Fantasia
  (1940): 3 audio channels with a control
  track; several speakers including rear.
• Otherwise mono until 1950s
• Initially, stereo music and panned dialogue
• Later, mono dialogue (listener position and
  production problems)
Cinema Sound

• 1960s: 70mm presentations with multiple
  front channels, one surround channel and
  LFE/Sub
• Origin of current cinema surround systems
Quadraphony I
• 1970s: A first attempt at surround sound
  for the consumer market
• Need for compatibility with stereo discs
• ‘Discrete’ 4-channel plus several
  incompatible matrix and subcarrier disc
  systems: SQ, QS, RM, CD-4, UD-4...
• None worked very well
Quadraphony II
• ‘Discrete’ quad: four channels each feed a
  speaker – LF, RF, LS, RS
• ‘Quadrifontal’ (‘four source’) approach based
  on extending panpotted stereo into two dimensions
• Level-only panning between 4 corners of a
  square
• Speakers wider than normal stereo (90°
  instead of 60° – hole in the middle)
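Pairwise level panning around a square can be sketched as below. The speaker angles, the clockwise azimuth convention and the sine/cosine law between adjacent speakers are all assumptions for illustration.

```python
import math

# Assumed square layout: azimuth in degrees, clockwise from straight ahead.
SPEAKERS = [("RF", 45.0), ("RS", 135.0), ("LS", 225.0), ("LF", 315.0)]

def quad_pan(azimuth_deg):
    """Return a gain per speaker for a source at the given azimuth.

    Only the two speakers either side of the source receive signal,
    using a constant-power law across the 90-degree gap between them.
    """
    az = azimuth_deg % 360.0
    gains = {name: 0.0 for name, _ in SPEAKERS}
    for i, (name_a, az_a) in enumerate(SPEAKERS):
        name_b, _ = SPEAKERS[(i + 1) % len(SPEAKERS)]
        offset = (az - az_a) % 360.0  # degrees clockwise past speaker A
        if offset < 90.0:
            frac = offset / 90.0
            gains[name_a] = math.cos(frac * math.pi / 2.0)
            gains[name_b] = math.sin(frac * math.pi / 2.0)
            break
    return gains
```

A source due front (0°) is shared equally by LF and RF, 90° apart instead of the usual 60°, and a source due right (90°) is shared by RF and RS, where level-only phantom images are at their weakest.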
Quadraphony III
• Fundamental flaws in localisation
  approach
• Level-only panning works poorly
  at the back, almost non-existent
  at sides
• ‘Discrete’ quad was a failure:
  matrix quad even more so
Quadraphony IV
• Three types of
  Quad transmission
  system:
 • Discrete
 • Subcarrier disc
 • Matrix disc
Quadraphony V

• Systems did not
  accurately reproduce
  the localisation intent
• UD-4 nearly got it
  right
The roots of 5.1

• Matrix Quad formed the basis of Dolby
  Surround
• ‘Steering’/ ‘logic’ circuits to make it appear
  that separation was sufficient.
• Still a 2-channel matrix system
Cinema Surround
• Analogue cinema systems:
 • Left & Right Front
 • Added Centre Front for Dialogue
   (due to poor audibility)
 • One Surround channel for envelopment,
   spaciousness
 • Low Frequency Effects (LFE)
Movie Surround
• Advent of digital systems allowed more
  channels
• LF, CF, RF at the front;
• LS, RS at the rear
• LFE
• This is ‘5.1’
‘5.1’: L, C, R, Ls, Rs

[Diagram: plan view of the 5.1 speaker layout, labelled N, C, L, R, Ls and Rs, with angles marked 30º and 100–120º]

• Front L & R speakers are at normal stereo
  positions
• CF for dialogue
• Surround L & R are very far apart
‘5.1’

[Diagram: plan view of the 5.1 speaker layout, labelled N, C, L, R, Ls and Rs, with angles marked 30º and 100–120º]

• Surrounds may also be along the sides
  (eg cinema)
• What do the speaker positions tell us about
  the intent of 5.1 – and cinema surround?
‘5.1’

[Diagram: plan view of the 5.1 speaker layout, labelled N, C, L, R, Ls and Rs, with angles marked 30º and 100–120º]

• 5.1 is designed for envelopment and
  impressive effects, not for localisation
  accuracy
• Envelopment vs localisation
An important aside: Bass Management
• Misunderstanding the difference between
  the LFE and the Subwoofer
• Bass management ensures that bass is
  played by speakers that can handle it.
• There is no need for an LFE in music (or
  anything else)
• Errors here can seriously affect replay!
Bass Management

• Never connect the LFE to the subwoofer:
  in the studio or at home
• The LFE is not the ‘subwoofer channel’!
• Never send signals direct to the LFE
• Play safe and do not use the LFE at all!
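The idea of bass management, sending each channel's low end to a speaker that can handle it, can be sketched as below. The one-pole filter and the 80 Hz crossover are simplifying assumptions; real systems use steeper crossovers, and no LFE input is involved here.

```python
import math

def bass_manage(channels, sample_rate=48000, crossover_hz=80.0):
    """Split each main channel at the crossover and sum the lows.

    channels: dict of name -> list of samples (all the same length).
    Returns (highs, sub): highs is a dict of high-passed channels for the
    main speakers, sub is the summed low-passed feed for the subwoofer.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * crossover_hz / sample_rate)
    length = len(next(iter(channels.values())))
    sub = [0.0] * length
    highs = {}
    for name, samples in channels.items():
        low = 0.0
        high_out = []
        for i, x in enumerate(samples):
            low += alpha * (x - low)   # one-pole low-pass
            sub[i] += low              # lows go to the subwoofer feed
            high_out.append(x - low)   # complementary high-pass
        highs[name] = high_out
    return highs, sub
```

The subwoofer feed here is derived from the main channels themselves, which is why the subwoofer and the LFE channel are not the same thing.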
Beyond 5.1
• All these systems are descendants of
  ‘discrete quad’
• One channel to one speaker
• Using more channels/speakers to fill in the
  holes.
• 7.1... 10.2... 20.4... where will it end?
Audio Rendering: a different approach
• Capture or create the soundfield in the
  most effective way
• Transmit/store it in the most efficient way
• Replay it as well as the system can
• Analogous to a PostScript file/printer
• ‘Device-independent audio’
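One way to picture device-independent audio is to store direction-encoded signals and only work out speaker feeds at replay. The sketch below uses a first-order horizontal encode/decode of the Ambisonic kind; that choice, the unscaled W channel and the basic decode for a regular ring of speakers are all assumptions for illustration and are not named on this slide.

```python
import math

def encode(sample, azimuth_deg):
    """Encode one mono sample at a horizontal direction into (W, X, Y)."""
    a = math.radians(azimuth_deg)
    return sample, sample * math.cos(a), sample * math.sin(a)

def decode(wxy, speaker_azimuths_deg):
    """Render (W, X, Y) to a regular ring of three or more speakers.

    Source and speaker azimuths must use the same angle convention.
    """
    w, x, y = wxy
    n = len(speaker_azimuths_deg)
    feeds = []
    for az in speaker_azimuths_deg:
        p = math.radians(az)
        feeds.append((w + 2.0 * (x * math.cos(p) + y * math.sin(p))) / n)
    return feeds
```

The same stored triple can be rendered to a square or a hexagon:

```python
stored = encode(1.0, 45.0)
square = decode(stored, [45, 135, 225, 315])
hexagon = decode(stored, [0, 60, 120, 180, 240, 300])
```

The stored signal carries no knowledge of the speaker layout, much as a page description carries no knowledge of the printer.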
