SlideShare a Scribd company logo
1 of 47
Download to read offline
Soundly 제 4회 오프라인 모임, 2019.11.30
Audio Technologies for Virtual Reality
Ben Sangbae Chon, Ph.D.
bc@gaudiolab.com
Chief Science Officer
2
Gaudio Lab is a spatial audio technology
company developing encompassing,
three-dimensional sound solutions.
We simplify the process of creating and
delivering immersive sound, allowing
listeners to perceive depth, direction,
and interactivity of audio sources.
• Founded in 2015 by MPEG Audio experts with 10+ active years in the group
• Offices: Seoul and San Francisco
• Audio scientists in residence: 6 Ph.D.s, 2 Master in Audio Signal Processing

3
Immersive Audio Contents by Gaudio
01. Tower GAUDI
Unity 3D plugin performance demo (’15.7)
High-Definition 3D Positional Sound with
Interaction
03. CF Zombie (Mini VR Game, GDC 2016)
High-definition 3D positional sound (’16.3)
All Objects rendered with GAUDIO Unity 3D plugin
04. Project CoC (Stereo to VR)
G’AUDIO specialty for legacy 360 video (’16.1)
Convert legacy stereo to interactive
05. Fanta Promo - Drift
360 live recording for immersive (’16.2)
Compare Ambisonics v. omni-binaural
02. Horror Maze (Launching Work for GearVR)
Main 360 video demo for Gear VR 2 launching (’15.11)
Built with 1 Sound Object + Stereo ambient track
06. Fanta Promo - Rope-swing, Wing-suit
First-person view 360 live recording (’16.3)
Binaural recording for extreme sport-enthusiast
07. Downhell Studio (interactive 360 Music)
Immersive 360 Spatial Audio for 360 Video
HOA + Objects rendering | also for TV speaker demo
08. Blue Dress (PV demo & Sample PT session)
Immersive 360 Spatial Audio for 360 Video
HOA + 3 Moving Objects rendering for Immersive
09. Goodday (360 Commercial)
Immersive 360 Spatial Audio for 360 Video
Remaster with extracted objects after post-production
10. Jambinai (interactive 360 Music)
Interactive Audio for 360 Video
Loud speaker demo for concert & same via Gear VR
11. World 1st VR Audio Livestreaming
13. Bo Hwa Gwak (Virtual Museum)
Virtual experience of famous Museum
Featured in Busan Film Festival 2017
14. SBS -Astro1&2
Episodes for K-pop star 360 Drama
Ambisonics shoot, post-production with Works
15. SBS-TV Live Music Show
Take from On-air live TV Music Show
for 360 Video, Produced by Works
18. HKQ (String Quartet) (i360Music)
Live String Quartet Recording by Petr Soupa
Object with shotgun + FOA
19. The Committee (UK Comedy)
British Comedy by 360 mixed by Petr Soupa
Object with Lav + FOA
16. Bloodless (Award in Venice Film Fest.)
Based on real story of brutally murdered sex worker,
the Best VR Story award in Venice Film Festival 2017
17. G’Mic Take #1
Proprietary consumer FOA Mic Tech Demo
1st shooting on street, in room
21. Elliot Sloan (Gold medal Skateboarder)
Skateboarding by Olympic Gold Medalist
Reality with extreme elevation
Teaser for Hollywood horror movie
Positional sound extremely important for scare factor
20. Annabelle (Blockbuster Feature Film)
Collaboration with SK Telecom
Interactive with FOA, Sol Livestreaming
SK Telecom - Seol-hyun
Promo 360 Episodes for SKT
Live shooting 360
12.
22. SBS - 9 Muses, Sitcom series
SBS produced high quality 360 content
with K-pop stars (9 Muses) / Stereoscopic 360
23. SBS - TV Live Show Season 2
SBS produced high quality 360 content with 

SBS featured program, Inkikayo / Stereoscopic 360
24. Gaudio Live Take #2
Livestreaming showcase with FOA mic
25. C-Flat String Quartet 2 (i360Music)
String quartet performance in a church
4
What Is Immersive Audio for VR?
5
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
- Loudspeaker layout references a static video
- Sweet spot (and sweet orientation) is assumed
- Quality of Experience is optimized at the sweet spot
- It does NOT supports interactive viewing
6
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
7
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
360-Video : Multimedia Content Covering Whole Space
8
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
360-Video : Multimedia Content Covering Whole Space
Interactive Video : User chooses where to look at and Video is rendered accordingly
9
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
Interactive Video : User chooses where to look at and Video is rendered accordingly
Interactive Audio : Audio should also be rendered according to the User’s Head Orientation (and Position)
360-Video : Multimedia Content Covering Whole Space
10
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
Interactive Video : User chooses where to look at and Video is rendered accordingly
Interactive Audio : Audio is also rendered according to the User’s Head Orientation (and Position)
360-Video : Multimedia Content Covering Whole Space
Yaw
Roll
Pitch
11
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
Yaw
Roll
Pitch
Y
Z X
<3 Degree of Freedom>
- Interactive with Head Orientation
- 360 Video
<6 Degree of Freedom>
- Interactive with Head Orientation

and User’ Position
- VR Games, AR
12
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
Positional audio is an important storytelling cue in VR
Due to user’s freedom of viewpoint, sound is an important signal to direct
viewer’s attention to a scene and story as producer intended
13
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
Is Loudspeaker Rendering Suitable for Interactive Immersive Audio?
14
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
Binaural Technologies over Headphone is the Answer!
15
History of the Immersive Sound
Binaural Technologies
16
History of Immersive Sound - Binaural Recording Technologies
‣ First Telephony by Alexander Graham Bell in 1876, First Phonograph by Thomas Edison in 1877

‣ The Origins of Speech Communication and Audio Media Storage 

‣ In 1877, Yankee Doodle by Frederick Boscovitz from Philadelphia to Washington DC
17
History of Immersive Sound - Binaural Recording Technologies
Theatrophone - The First Live Binaural/Stereophonic Sound Demo

‣ Introduced by Clement Ader at the 1881 World Expo Paris

‣ A pair of microphones → A pair of receivers
‣ 80 telephone transmitters across the front of a stage connected from Opera Garnier to Paris Electrical Exhibition
‣ Scientific American reported, 

“singers place themselves, in the mind of the listener, at a fixed distance, some to the right and others to the left.”
‣ Technical Difficulties : Rudimentary amplification, difficult microphone placement
18
History of Immersive Sound - Binaural Recording Technologies
Theatrophone - The First, Subscription-based, Live Music Streaming
‣ Commercialize by the Compagnie du Theatrophone in 1890
‣ Subscription model : 180 francs per year, 15 francs per performance
‣ Terminal : A headset (and a transmitter to tell a Teheatrophone operator which venue they wished to listen in to)
‣ Coin-operated listening stations installed in hotel lobbies and cafes, 50 cent for 2:30 of listening time
19
History of Immersive Sound - Binaural Recording Technologies
‣ Intermediate-Binaural Broadcasting by Doolittle at a US Radio Station (WPAJ) in 1924
‣ Storing two channel sound and reproduction, without acoustic septum, was made in 1921

‣ Left signal to one AM transmitter and right signal to another AM transmitter

‣ The listener at home needed two receivers

left signalRadio Station A
Radio Station B right signal
* Radio broadcasting service started in United States, 1920.
20
History of Immersive Sound - Binaural Recording Technologies
‣ Binaural Telephone (Western Electric, 1927)

‣ Microphones are placed flush with the wall of a balloon made of leather or cloth and 

packed with sponge rubber, wool, or cotton

‣ Signal separation by shadowing and absorption
21
History of Immersive Sound - Binaural Recording Technologies
‣ “Artificial Head”, a binaural capturing device by W. Bartlett Jones in 1927
‣ Microphones placed on the sphere heading to forward

‣ First simulation of the orientation of human pinae
22
History of Immersive Sound - Binaural Recording Technologies
‣ Oscar by Fletcher, Bell Lab (Presented in 1933)
‣ Manikin bought from a wax figure dealer 

and Microphones mounted on the cheeks

-
‣ Oscar II (1940s)
‣ Microphones placed in the ears

‣ But the membrane at about 5mm distance
outside the cave conchae
Binaural Recording Now

‣ Record ear-input signal and play it over earphones

‣ Quite Natural and Ambient

‣ No Interaction in General

‣ Limited Interaction (3DIO Omni Binaural Microphone)
23
History of Immersive Sound - Present Binaural Recording Technologies
Yaw
Roll
Pitch
‣ Basic Signal and System in Electrical Engineering
‣ Audio signal is a group of impulses and system can be defined by an impulse response.
‣ Convolution : Each sample’s response is summed up, with the normalization gain corresponding to
the input sample.
‣ Sound propagation path from the sound source to each ear in a room can be modeled by a pair
Impulse Response
24
History of Immersive Sound - Binaural Rendering
System Impulse Response
‣ Binaural Room Impulse Response
25
History of Immersive Sound - Binaural Rendering
BRIR (Binaural Room Impulse Response) HRIR (Head Related Impulse Response)
HRIR has the key information for human auditory perception of the source position

26
History of Immersive Sound - Binaural Rendering
Binaural Rendering - Head Related Impulse Response or Binaural Room Impulse Response

‣ Head Related Impulse Response (HRIR) : Linear FIR Filter Model

‣ Convolution of the Object signal by the HRIR will provide Binaural Sound
27
History of Immersive Sound - Binaural Rendering
BRIR at Concert Hall
= HRIR + Reflections and Reverberation
- Good Externalization
- Natural Ambience
- Difficult to De-reverberate

(No Dry Sound)
HRIR at Anechoic Chamber
- Intermediate Externalization
- No Ambience
- Dry, straightforward sound

(It is sound engineer’s job!)
909 × 401
Binaural Rendering - Head Related Impulse Response or Binaural Room Impulse Response

‣ Head Related Impulse Response (HRIR) : Linear FIR Filter Model

‣ Convolution of the Object signal by the HRIR will provide Binaural Sound
28
History of Immersive Sound - Binaural Rendering
Impulse from one direction
Binaural Recording
Head Related Impulse Response
Impulses from all direction
HRIR Database from all direction
Binaural Rendering - Head Related Impulse Response or Binaural Room Impulse Response

‣ Head Related Impulse Response (HRIR) : Linear FIR Filter Model

‣ Convolution of the Object signal by the HRIR will provide Binaural Sound
29
History of Immersive Sound - Binaural Rendering
1) Input : Object Position, User Position, User Head Orientation
2) Output : Binaural Rendered Audio Signal
Choose a HRIR/BRIR Filter Pair based on the Position and Orientation Information
Sound source
1
2
2
Audio Samples
= Impulses
Convolution
= Filtering
Binaural Rendered Signal
Expensive Realtime Convolution

‣ A 48kHz Sampled BRIR Pair from a Concert Hall would be 96000 Samples x 2 Channel

‣ To Create One Pair of Binaural Samples, 96000 x 2 Multiplications and Additions are required

‣ For a second, 9.2 G (96k x 2 x 48k) Operations are required

‣ Even for a short HRIR Pair (256 Samples at 48kHz), 25M (256 x 2 x 48k) Operations are required
30
History of Immersive Sound - Binaural Rendering
Audio Samples
= Impulses
Convolution
= Filtering
Binaural Rendered Signal
‣ Signal Processing Techniques for Convolution
‣ Cooley and Turkey, “An Algorithm for the Machine Calculation of Complex Fourier Series” in 1965
‣ FFT based convolution : Thomas Stockham, Jr, “High Speed Convolution and Correlation” in 1966
‣ Block convolution : Charls S. Burrus, “Block Realization of Digital Filters” in 1972
31
History of Immersive Sound - Binaural Rendering
Time Domain Freq Domain
256 25M 5M
1sec 4.6G 76M
2sec 9.2G 148M
32
History of Immersive Sound - Binaural Rendering
‣ Convolvotron : Binaural Synthesis using HRTF Convolution
‣ E. Wenzel, et. al, “A Virtual Display System for Conveying Three-Dimensional Acoustic Information” in 1988
‣ Hardware Development : A PC extension card equipped with many parallel DSPs (up to 8 sound sources) in 1992

- Each DSP is capable of 320 MOPS
33
History of Immersive Sound - Binaural Rendering
‣ Acoustically Optimized Low Complexity Binaural Synthesis Model for BRIR
‣ Acoustic Optimization : Taegyu Lee, “Scalable Multiband Binaural Renderer for MPEG-H 3D Audio” in 2015

- Parameterizing BRIR (2 second) with Variable Order Filter in Frequency Domain (VOFF), 

Parameteric Late Reverberation Filtering (PLF), QMF domain tapped delay line (QTDL)
34
History of Immersive Sound - Ambisonics for Binaural Rendering
‣ Ambisonics, started by Gerzon in 1985
‣ Ambisonics Microphone (A-Format) : A set of directional microphone, highly calibrated
‣ A-to-B Conversion : Inner product. (Each microphone signal is decomposed in terms of spherical harmonics)
‣ Object-to-B Conversion : Inner product. (Object signal is decomposed in terms of spherical harmonics)
Spherical HaromonicsAmbisonics Microphone
35
History of Immersive Sound - Ambisonics for Binaural Rendering
‣ Ambisonics Multichannel Reproduction (Coincident Microphones)
‣ Rotation, if necessary : Simple matrix conversion
‣ Reproduction : Over regular layout, inner-product of Spherical Harmonics and Loudspeaker Position Vector 

36
History of Immersive Sound - Ambisonics for Binaural Rendering
‣ Ambisonics Multichannel Reproduction (Coincident Microphones)
‣ Not really popular as a multichannel reproduction method
‣ Production : Difficult to control source width control, distance control, sweet spot control, ambience control
‣ Distribution : More bandwidth, transmission of the channel rendered signal over the standard layout is preferred
‣ Suddenly, became very popular in the VR Audio
‣ Distribution : First order ambisonics is enough to enable the audio to interact with 3-DoF yaw, pitch and roll
‣ Other candidates : Object-based and channel based generally requires more audio channel signal
‣ Limitation : Capable of handling 6DoF by inter/extrapolation of multiple Ambisonics signals but very inefficient
Rotation Virtual Loudspeaker Rendering Binaural Rendering
37
Present Outlook and Future Forecast
38
Three Major Types of Binaural Renderings - Direct Object Rendering
‣ Very straightforward Binaural Rendering Mechanism
1) Input : Object Position, User Position, User Head Orientation
2) Output : Binaural Rendered Audio Signal
Choose a HRIR/BRIR Filter Pair based on the Geometrical Information
Sound source
1
2
2
Audio Samples
= Impulses
Convolution
= Filtering
Binaural Rendered Signal
39
Three Major Types of Binaural Renderings - Virtual Channel based Rendering
‣ Objects and their positional metadata are delivered to the renderer at the reproduction device

‣ Relative positions of objects are calculated using head orientation

‣ Each object is localized at the relative position over a pre-determined virtual channel loudspeaker layout

‣ Each virtual channel signal is binauralized using the HRIR of the virtual channel position.
L30
O
F0
O
R60
O
L60
O
R120
O
L120
O
B180
O
F0
O
R60
O
L60
O
R120
O
L120
O
B180
O
R30
O
40
Three Major Types of Binaural Renderings - Ambisonics based Rendering
‣ Objects are Localized in Ambisonics, 

‣ Interaction by the head orientation is applied using Ambisonics rotation,

‣ Ambisonics signal is converted to the virtual channel layout

‣ Each virtual channel signal is binauralized using the HRIR of the virtual channel position.
Rotation Virtual Loudspeaker Rendering Binaural Rendering
41
Three Major Types of Binaural Renderings - Comparison
Rendering Type Direct Object Rendering Virtual Channel Panning Ambisonics Represenation
Adapted by
Gaudio
Some Game Engines
Gaudio
MPEG-H 3D Audio
Dolby ATMOS VR
Gaudio
MPEG-H 3D Audio
Facebook 360 Video
Google 360 Video
Spatial Accuracy OOO OO O
Panning No Typically pair or triplet Typically all of the Virtual Speakers
Timbre Affected by HRTF
Affected by HRTF and Panning
(Some Phase Problem)
Affected by HRTF and Panning
(More Phase Problem)
Complexity
Depending on
the Number of Objects
(2xN Convolutions)
Depending on
the Number of Virtual Channels
(N Convolutions)
Depending on
the Ambisonics Order
(M Convolutions)
Running Memory 360x180 HRIRs N HRIRs M HRIRs
Delivery Large number of object needs to be delivered.
4 signals for FOA
9 signals for SOA
Interaction 3DOF (head orientation) and 6DOF (3DOF + positional interaction) only 3DOF
42
VR Market Situation
‣ Virtual Reality in General

‣ Uncomfortable form factor

‣ Not enough display resolution

‣ Dizziness from the rendering latency (both video or audio)

‣ No unified content creation-transmission-consumption ecosystem

‣ No Killer Content

‣ Expensive in Content Production : 180 degree?

‣ Couch Potato vs. 6 DOF Movement

‣ Audio

‣ Codec Issues : No server/device handles 5+ objects transmission 

‣ Neither a standard or de-facto renderer for content creation

‣ Personalization : How to measure personalized HRIR?

‣ Difference auditory system characteristics with headphone : e.g. occlusion
43
MPEG-I, the 6DOF Standard for the Future
1995 2000 2005 2010 2015 2022
MPEG-IMPEG-1 MPEG-2 MPEG-4 MPEG-D MPEG-H
3D Audio : Support Rendering for Combination of
Channels, Objects, and/or Ambisonics
Audio for 6DoF AR/VR Content
USAC : Hybrid Audio and Speech Codec
SAOC : Object Mix & Control
MPEG-Surround : Enhanced Channel Extension
HE-AAC v.2 : Channel Extension by Parametric Stereo
HE-AAC : Bandwidth Extension by Spectral Band Replication
AAC : Perceptual Noise Shaping
AAC : Enhanced Psychoacoustical Model
MP3 : First Psychoacoustical Compression Standard
44
MPEG-I, the 6DOF Standard for the Future
• 3 Degrees of Freedom (3DoF)
- Yaw-Pitch-Roll
- Head orientation only
- 360° Video and Cinematic VR
- Ambisonics
‣ MPEG-H 3D Audio
• 3DoF+
- 3DoF with limited Movements (typically, head movements)
• 6 Degrees of Freedom (6DoF)
- 3DoF + x-y-z
- Head orientation + Movements
- Full Interactive VR
- Object-based Audio
‣ MPEG-I Immersive Audio
“ User’s freedom of movements for interactivity in VR space ”
45
MPEG-I, the 6DOF Standard for the Future
• Localization of Virtual Sources
‣ Use Head-Related Transfer Function (HRTF)
- From virtual source to L/R ears
‣ Realistic rendering of spatial position due to perceptual cues
- ITD, ILD, IC
• Directivity of Virtual Sources
‣ Object perceived loudness changes as user moved around object
• Ambience and Reverberation
‣ Almost all spaces impose some reverberation on sound sources
- Direct Sound, Early Reflections, and Late Reverberation
- Need to estimate model for AR
• Occlusion, Doppler, etc…
46
MPEG-I Audio Reference Architecture
*N18158
47
Audio-Only Applications
*N18627

More Related Content

What's hot

My final digipak process
My final digipak processMy final digipak process
My final digipak processJayImogenJones
 
DOES IT HERTZ - De Roma
DOES IT HERTZ - De RomaDOES IT HERTZ - De Roma
DOES IT HERTZ - De RomaArchitectura
 
Intro to Video Production
Intro to Video ProductionIntro to Video Production
Intro to Video ProductionOyetayo Ojoade
 
SOBA July 2019 Audio Workshop
SOBA July 2019 Audio Workshop SOBA July 2019 Audio Workshop
SOBA July 2019 Audio Workshop Michel Henein
 
Lowprice pioneer avh p2300 dvd 5.8 in-dash double-din dvd av receiver with ip...
Lowprice pioneer avh p2300 dvd 5.8 in-dash double-din dvd av receiver with ip...Lowprice pioneer avh p2300 dvd 5.8 in-dash double-din dvd av receiver with ip...
Lowprice pioneer avh p2300 dvd 5.8 in-dash double-din dvd av receiver with ip...sohot866
 
Location recce london
Location recce londonLocation recce london
Location recce londonalanabusby
 

What's hot (9)

My final digipak process
My final digipak processMy final digipak process
My final digipak process
 
DOES IT HERTZ - De Roma
DOES IT HERTZ - De RomaDOES IT HERTZ - De Roma
DOES IT HERTZ - De Roma
 
Intro to Video Production
Intro to Video ProductionIntro to Video Production
Intro to Video Production
 
Imax
ImaxImax
Imax
 
Imax.docx
Imax.docxImax.docx
Imax.docx
 
SOBA July 2019 Audio Workshop
SOBA July 2019 Audio Workshop SOBA July 2019 Audio Workshop
SOBA July 2019 Audio Workshop
 
Lowprice pioneer avh p2300 dvd 5.8 in-dash double-din dvd av receiver with ip...
Lowprice pioneer avh p2300 dvd 5.8 in-dash double-din dvd av receiver with ip...Lowprice pioneer avh p2300 dvd 5.8 in-dash double-din dvd av receiver with ip...
Lowprice pioneer avh p2300 dvd 5.8 in-dash double-din dvd av receiver with ip...
 
Hiphop
HiphopHiphop
Hiphop
 
Location recce london
Location recce londonLocation recce london
Location recce london
 

Similar to 가상현실을 위한 오디오 기술

Live Video Streaming, a practical workshop by Inner Ear
Live Video Streaming, a practical workshop by Inner EarLive Video Streaming, a practical workshop by Inner Ear
Live Video Streaming, a practical workshop by Inner EarInner Ear
 
The Pixel Lab 2014_Loc Dao_Lessons Learned In Interactive Documentary
The Pixel Lab 2014_Loc Dao_Lessons Learned In Interactive DocumentaryThe Pixel Lab 2014_Loc Dao_Lessons Learned In Interactive Documentary
The Pixel Lab 2014_Loc Dao_Lessons Learned In Interactive Documentarypower to the pixel
 
Immersive audio rendering for interactive complex virtual architectural envir...
Immersive audio rendering for interactive complex virtual architectural envir...Immersive audio rendering for interactive complex virtual architectural envir...
Immersive audio rendering for interactive complex virtual architectural envir...Muhammad Imran
 
Media industries and structure
Media industries and structureMedia industries and structure
Media industries and structureStacey Johnson
 
10 Minute Research Presentation on Ambisonics and Impact
10 Minute Research Presentation on Ambisonics and Impact10 Minute Research Presentation on Ambisonics and Impact
10 Minute Research Presentation on Ambisonics and ImpactBruce Wiggins
 
COMP 4010 Lecture5 VR Audio and Tracking
COMP 4010 Lecture5 VR Audio and TrackingCOMP 4010 Lecture5 VR Audio and Tracking
COMP 4010 Lecture5 VR Audio and TrackingMark Billinghurst
 
Intellectual property right and copy right in indian
Intellectual property right and copy right in indian Intellectual property right and copy right in indian
Intellectual property right and copy right in indian SrikantaSahu10
 
History of the vocoder (final)
History of the vocoder (final) History of the vocoder (final)
History of the vocoder (final) connorfisher
 
A brief history of hearing aids
A brief history of hearing aidsA brief history of hearing aids
A brief history of hearing aidsMark Rauterkus
 
After the heroism, collaboration: Organizational learning &amp; the mobile space
After the heroism, collaboration: Organizational learning &amp; the mobile spaceAfter the heroism, collaboration: Organizational learning &amp; the mobile space
After the heroism, collaboration: Organizational learning &amp; the mobile spaceStephanie Pau
 
A history of reverb in music production
A history of reverb in music productionA history of reverb in music production
A history of reverb in music productionPaulo Abelho
 
The creative internet: 106 things
The creative internet: 106 things The creative internet: 106 things
The creative internet: 106 things Shane Smith
 
The eyes want to have it: Multimedia Handhelds in the Museum (an evolving story)
The eyes want to have it: Multimedia Handhelds in the Museum (an evolving story)The eyes want to have it: Multimedia Handhelds in the Museum (an evolving story)
The eyes want to have it: Multimedia Handhelds in the Museum (an evolving story)Peter Samis
 
Mobile 2.0 Europe - Atau Tanaka
Mobile 2.0 Europe - Atau TanakaMobile 2.0 Europe - Atau Tanaka
Mobile 2.0 Europe - Atau TanakaMobile 2.0 Europe
 
Evolution of Audio Media Wilfredo Satiada.pptx
Evolution of Audio Media Wilfredo Satiada.pptxEvolution of Audio Media Wilfredo Satiada.pptx
Evolution of Audio Media Wilfredo Satiada.pptxWilfredFender
 

Similar to 가상현실을 위한 오디오 기술 (20)

Live Video Streaming, a practical workshop by Inner Ear
Live Video Streaming, a practical workshop by Inner EarLive Video Streaming, a practical workshop by Inner Ear
Live Video Streaming, a practical workshop by Inner Ear
 
Spatial audio(19,24)
Spatial audio(19,24)Spatial audio(19,24)
Spatial audio(19,24)
 
Spatial Audio
Spatial AudioSpatial Audio
Spatial Audio
 
The Pixel Lab 2014_Loc Dao_Lessons Learned In Interactive Documentary
The Pixel Lab 2014_Loc Dao_Lessons Learned In Interactive DocumentaryThe Pixel Lab 2014_Loc Dao_Lessons Learned In Interactive Documentary
The Pixel Lab 2014_Loc Dao_Lessons Learned In Interactive Documentary
 
Immersive audio rendering for interactive complex virtual architectural envir...
Immersive audio rendering for interactive complex virtual architectural envir...Immersive audio rendering for interactive complex virtual architectural envir...
Immersive audio rendering for interactive complex virtual architectural envir...
 
Media industries and structure
Media industries and structureMedia industries and structure
Media industries and structure
 
10 Minute Research Presentation on Ambisonics and Impact
10 Minute Research Presentation on Ambisonics and Impact10 Minute Research Presentation on Ambisonics and Impact
10 Minute Research Presentation on Ambisonics and Impact
 
COMP 4010 Lecture5 VR Audio and Tracking
COMP 4010 Lecture5 VR Audio and TrackingCOMP 4010 Lecture5 VR Audio and Tracking
COMP 4010 Lecture5 VR Audio and Tracking
 
Techology is the new creative
Techology is the new creativeTechology is the new creative
Techology is the new creative
 
Intellectual property right and copy right in indian
Intellectual property right and copy right in indian Intellectual property right and copy right in indian
Intellectual property right and copy right in indian
 
The Creative Internet
The Creative InternetThe Creative Internet
The Creative Internet
 
History of the vocoder (final)
History of the vocoder (final) History of the vocoder (final)
History of the vocoder (final)
 
A brief history of hearing aids
A brief history of hearing aidsA brief history of hearing aids
A brief history of hearing aids
 
After the heroism, collaboration: Organizational learning &amp; the mobile space
After the heroism, collaboration: Organizational learning &amp; the mobile spaceAfter the heroism, collaboration: Organizational learning &amp; the mobile space
After the heroism, collaboration: Organizational learning &amp; the mobile space
 
A history of reverb in music production
A history of reverb in music productionA history of reverb in music production
A history of reverb in music production
 
The creative internet: 106 things
The creative internet: 106 things The creative internet: 106 things
The creative internet: 106 things
 
L10 The Broadcast Century
L10 The Broadcast CenturyL10 The Broadcast Century
L10 The Broadcast Century
 
The eyes want to have it: Multimedia Handhelds in the Museum (an evolving story)
The eyes want to have it: Multimedia Handhelds in the Museum (an evolving story)The eyes want to have it: Multimedia Handhelds in the Museum (an evolving story)
The eyes want to have it: Multimedia Handhelds in the Museum (an evolving story)
 
Mobile 2.0 Europe - Atau Tanaka
Mobile 2.0 Europe - Atau TanakaMobile 2.0 Europe - Atau Tanaka
Mobile 2.0 Europe - Atau Tanaka
 
Evolution of Audio Media Wilfredo Satiada.pptx
Evolution of Audio Media Wilfredo Satiada.pptxEvolution of Audio Media Wilfredo Satiada.pptx
Evolution of Audio Media Wilfredo Satiada.pptx
 

More from Keunwoo Choi

"All you need is AI and music" by Keunwoo Choi
"All you need is AI and music" by Keunwoo Choi"All you need is AI and music" by Keunwoo Choi
"All you need is AI and music" by Keunwoo ChoiKeunwoo Choi
 
인공지능의 음악 인지 모델 - 65차 한국음악지각인지학회 기조강연 (최근우 박사)
인공지능의 음악 인지 모델 - 65차 한국음악지각인지학회 기조강연 (최근우 박사)인공지능의 음악 인지 모델 - 65차 한국음악지각인지학회 기조강연 (최근우 박사)
인공지능의 음악 인지 모델 - 65차 한국음악지각인지학회 기조강연 (최근우 박사)Keunwoo Choi
 
Conditional generative model for audio
Conditional generative model for audioConditional generative model for audio
Conditional generative model for audioKeunwoo Choi
 
Deep Learning with Audio Signals: Prepare, Process, Design, Expect
Deep Learning with Audio Signals: Prepare, Process, Design, ExpectDeep Learning with Audio Signals: Prepare, Process, Design, Expect
Deep Learning with Audio Signals: Prepare, Process, Design, ExpectKeunwoo Choi
 
Convolutional recurrent neural networks for music classification
Convolutional recurrent neural networks for music classificationConvolutional recurrent neural networks for music classification
Convolutional recurrent neural networks for music classificationKeunwoo Choi
 
The effects of noisy labels on deep convolutional neural networks for music t...
The effects of noisy labels on deep convolutional neural networks for music t...The effects of noisy labels on deep convolutional neural networks for music t...
The effects of noisy labels on deep convolutional neural networks for music t...Keunwoo Choi
 
dl4mir tutorial at ETRI, Korea
dl4mir tutorial at ETRI, Koreadl4mir tutorial at ETRI, Korea
dl4mir tutorial at ETRI, KoreaKeunwoo Choi
 
Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016
Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016
Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016Keunwoo Choi
 
Deep Convolutional Neural Networks - Overview
Deep Convolutional Neural Networks - OverviewDeep Convolutional Neural Networks - Overview
Deep Convolutional Neural Networks - OverviewKeunwoo Choi
 
Deep learning for music classification, 2016-05-24
Deep learning for music classification, 2016-05-24Deep learning for music classification, 2016-05-24
Deep learning for music classification, 2016-05-24Keunwoo Choi
 
딥러닝 개요 (2015-05-09 KISTEP)
딥러닝 개요 (2015-05-09 KISTEP)딥러닝 개요 (2015-05-09 KISTEP)
딥러닝 개요 (2015-05-09 KISTEP)Keunwoo Choi
 
Understanding Music Playlists
Understanding Music PlaylistsUnderstanding Music Playlists
Understanding Music PlaylistsKeunwoo Choi
 

More from Keunwoo Choi (12)

"All you need is AI and music" by Keunwoo Choi
"All you need is AI and music" by Keunwoo Choi"All you need is AI and music" by Keunwoo Choi
"All you need is AI and music" by Keunwoo Choi
 
인공지능의 음악 인지 모델 - 65차 한국음악지각인지학회 기조강연 (최근우 박사)
인공지능의 음악 인지 모델 - 65차 한국음악지각인지학회 기조강연 (최근우 박사)인공지능의 음악 인지 모델 - 65차 한국음악지각인지학회 기조강연 (최근우 박사)
인공지능의 음악 인지 모델 - 65차 한국음악지각인지학회 기조강연 (최근우 박사)
 
Conditional generative model for audio
Conditional generative model for audioConditional generative model for audio
Conditional generative model for audio
 
Deep Learning with Audio Signals: Prepare, Process, Design, Expect
Deep Learning with Audio Signals: Prepare, Process, Design, ExpectDeep Learning with Audio Signals: Prepare, Process, Design, Expect
Deep Learning with Audio Signals: Prepare, Process, Design, Expect
 
Convolutional recurrent neural networks for music classification
Convolutional recurrent neural networks for music classificationConvolutional recurrent neural networks for music classification
Convolutional recurrent neural networks for music classification
 
The effects of noisy labels on deep convolutional neural networks for music t...
The effects of noisy labels on deep convolutional neural networks for music t...The effects of noisy labels on deep convolutional neural networks for music t...
The effects of noisy labels on deep convolutional neural networks for music t...
 
dl4mir tutorial at ETRI, Korea
dl4mir tutorial at ETRI, Koreadl4mir tutorial at ETRI, Korea
dl4mir tutorial at ETRI, Korea
 
Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016
Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016
Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016
 
Deep Convolutional Neural Networks - Overview
Deep Convolutional Neural Networks - OverviewDeep Convolutional Neural Networks - Overview
Deep Convolutional Neural Networks - Overview
 
Deep learning for music classification, 2016-05-24
Deep learning for music classification, 2016-05-24Deep learning for music classification, 2016-05-24
Deep learning for music classification, 2016-05-24
 
딥러닝 개요 (2015-05-09 KISTEP)
딥러닝 개요 (2015-05-09 KISTEP)딥러닝 개요 (2015-05-09 KISTEP)
딥러닝 개요 (2015-05-09 KISTEP)
 
Understanding Music Playlists
Understanding Music PlaylistsUnderstanding Music Playlists
Understanding Music Playlists
 

Recently uploaded

Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 

Recently uploaded (20)

Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 

가상현실을 위한 오디오 기술

  • 1. Soundly 제 4회 오프라인 모임, 2019.11.30 Audio Technologies for Virtual Reality Ben Sangbae Chon, Ph.D. bc@gaudiolab.com Chief Science Officer
  • 2. 2 Gaudio Lab is a spatial audio technology company developing encompassing, three-dimensional sound solutions. We simplify the process of creating and delivering immersive sound, allowing listeners to perceive depth, direction, and interactivity of audio sources. • Founded in 2015 by MPEG Audio experts with 10+ active years in the group • Offices: Seoul and San Francisco • Audio scientists in residence: 6 Ph.D.s, 2 Master in Audio Signal Processing

  • 3. 3 Immersive Audio Contents by Gaudio 01. Tower GAUDI Unity 3D plugin performance demo (’15.7) High-Definition 3D Positional Sound with Interaction 03. CF Zombie (Mini VR Game, GDC 2016) High-definition 3D positional sound (’16.3) All Objects rendered with GAUDIO Unity 3D plugin 04. Project CoC (Stereo to VR) G’AUDIO specialty for legacy 360 video (’16.1) Convert legacy stereo to interactive 05. Fanta Promo - Drift 360 live recording for immersive (’16.2) Compare Ambisonics v. omni-binaural 02. Horror Maze (Launching Work for GearVR) Main 360 video demo for Gear VR 2 launching (’15.11) Built with 1 Sound Object + Stereo ambient track 06. Fanta Promo - Rope-swing, Wing-suit First-person view 360 live recording (’16.3) Binaural recording for extreme sport-enthusiast 07. Downhell Studio (interactive 360 Music) Immersive 360 Spatial Audio for 360 Video HOA + Objects rendering | also for TV speaker demo 08. Blue Dress (PV demo & Sample PT session) Immersive 360 Spatial Audio for 360 Video HOA + 3 Moving Objects rendering for Immersive 09. Goodday (360 Commercial) Immersive 360 Spatial Audio for 360 Video Remaster with extracted objects after post-production 10. Jambinai (interactive 360 Music) Interactive Audio for 360 Video Loud speaker demo for concert & same via Gear VR 11. World 1st VR Audio Livestreaming 13. Bo Hwa Gwak (Virtual Museum) Virtual experience of famous Museum Featured in Busan Film Festival 2017 14. SBS -Astro1&2 Episodes for K-pop star 360 Drama Ambisonics shoot, post-production with Works 15. SBS-TV Live Music Show Take from On-air live TV Music Show for 360 Video, Produced by Works 18. HKQ (String Quartet) (i360Music) Live String Quartet Recording by Petr Soupa Object with shotgun + FOA 19. The Committee (UK Comedy) British Comedy by 360 mixed by Petr Soupa Object with Lav + FOA 16. Bloodless (Award in Venice Film Fest.) Based on real story of brutally murdered sex worker, the Best VR Story award in Venice Film Festival 2017 17. G’Mic Take #1 Proprietary consumer FOA Mic Tech Demo 1st shooting on street, in room 21. Elliot Sloan (Gold medal Skateboarder) Skateboarding by Olympic Gold Medalist Reality with extreme elevation Teaser for Hollywood horror movie Positional sound extremely important for scare factor 20. Annabelle (Blockbuster Feature Film) Collaboration with SK Telecom Interactive with FOA, Sol Livestreaming SK Telecom - Seol-hyun Promo 360 Episodes for SKT Live shooting 360 12. 22. SBS - 9 Muses, Sitcom series SBS produced high quality 360 content with K-pop stars (9 Muses) / Stereoscopic 360 23. SBS - TV Live Show Season 2 SBS produced high quality 360 content with 
 SBS featured program, Inkikayo / Stereoscopic 360 24. Gaudio Live Take #2 Livestreaming showcase with FOA mic 25. C-Flat String Quartet 2 (i360Music) String quartet performance in a church
  • 4. 4 What Is Immersive Audio for VR?
  • 5. 5 Comparison between Conventional Media and Immersive (VR/AR/XR) Media - Loudspeaker layout references a static video - Sweet spot (and sweet orientation) is assumed - Quality of Experience is optimized at the sweet spot - It does NOT supports interactive viewing
  • 6. 6 Comparison between Conventional Media and Immersive (VR/AR/XR) Media
  • 7. 7 Comparison between Conventional Media and Immersive (VR/AR/XR) Media 360-Video : Multimedia Content Covering Whole Space
  • 8. 8 Comparison between Conventional Media and Immersive (VR/AR/XR) Media 360-Video : Multimedia Content Covering Whole Space Interactive Video : User chooses where to look at and Video is rendered accordingly
  • 9. 9 Comparison between Conventional Media and Immersive (VR/AR/XR) Media Interactive Video : User chooses where to look at and Video is rendered accordingly Interactive Audio : Audio should also be rendered according to the User’s Head Orientation (and Position) 360-Video : Multimedia Content Covering Whole Space
  • 10. 10 Comparison between Conventional Media and Immersive (VR/AR/XR) Media Interactive Video : User chooses where to look at and Video is rendered accordingly Interactive Audio : Audio is also rendered according to the User’s Head Orientation (and Position) 360-Video : Multimedia Content Covering Whole Space
  • 11. Yaw Roll Pitch 11 Comparison between Conventional Media and Immersive (VR/AR/XR) Media Yaw Roll Pitch Y Z X <3 Degree of Freedom> - Interactive with Head Orientation - 360 Video <6 Degree of Freedom> - Interactive with Head Orientation
 and User’ Position - VR Games, AR
  • 12. 12 Comparison between Conventional Media and Immersive (VR/AR/XR) Media Positional audio is an important storytelling cue in VR Due to user’s freedom of viewpoint, sound is an important signal to direct viewer’s attention to a scene and story as producer intended
  • 13. 13 Comparison between Conventional Media and Immersive (VR/AR/XR) Media Is Loudspeaker Rendering Suitable for Interactive Immersive Audio?
  • 14. 14 Comparison between Conventional Media and Immersive (VR/AR/XR) Media Binaural Technologies over Headphone is the Answer!
  • 15. 15 History of the Immersive Sound Binaural Technologies
  • 16. 16 History of Immersive Sound - Binaural Recording Technologies ‣ First Telephony by Alexander Graham Bell in 1876, First Phonograph by Thomas Edison in 1877 ‣ The Origins of Speech Communication and Audio Media Storage ‣ In 1877, Yankee Doodle by Frederick Boscovitz from Philadelphia to Washington DC
  • 17. 17 History of Immersive Sound - Binaural Recording Technologies Theatrophone - The First Live Binaural/Stereophonic Sound Demo ‣ Introduced by Clement Ader at the 1881 World Expo Paris ‣ A pair of microphones → A pair of receivers ‣ 80 telephone transmitters across the front of a stage connected from Opera Garnier to Paris Electrical Exhibition ‣ Scientific American reported, 
 “singers place themselves, in the mind of the listener, at a fixed distance, some to the right and others to the left.” ‣ Technical Difficulties : Rudimentary amplification, difficult microphone placement
  • 18. 18 History of Immersive Sound - Binaural Recording Technologies Theatrophone - The First, Subscription-based, Live Music Streaming ‣ Commercialize by the Compagnie du Theatrophone in 1890 ‣ Subscription model : 180 francs per year, 15 francs per performance ‣ Terminal : A headset (and a transmitter to tell a Teheatrophone operator which venue they wished to listen in to) ‣ Coin-operated listening stations installed in hotel lobbies and cafes, 50 cent for 2:30 of listening time
  • 19. 19 History of Immersive Sound - Binaural Recording Technologies ‣ Intermediate-Binaural Broadcasting by Doolittle at a US Radio Station (WPAJ) in 1924 ‣ Storing two channel sound and reproduction, without acoustic septum, was made in 1921 ‣ Left signal to one AM transmitter and right signal to another AM transmitter ‣ The listener at home needed two receivers left signalRadio Station A Radio Station B right signal * Radio broadcasting service started in United States, 1920.
  • 20. 20 History of Immersive Sound - Binaural Recording Technologies ‣ Binaural Telephone (Western Electric, 1927) ‣ Microphones are placed flush with the wall of a balloon made of leather or cloth and 
 packed with sponge rubber, wool, or cotton ‣ Signal separation by shadowing and absorption
  • 21. 21 History of Immersive Sound - Binaural Recording Technologies ‣ “Artificial Head”, a binaural capturing device by W. Bartlett Jones in 1927 ‣ Microphones placed on the sphere heading to forward ‣ First simulation of the orientation of human pinae
  • 22. 22 History of Immersive Sound - Binaural Recording Technologies ‣ Oscar by Fletcher, Bell Lab (Presented in 1933) ‣ Manikin bought from a wax figure dealer 
 and Microphones mounted on the cheeks - ‣ Oscar II (1940s) ‣ Microphones placed in the ears ‣ But the membrane at about 5mm distance outside the cave conchae
  • 23. Binaural Recording Now ‣ Record ear-input signal and play it over earphones ‣ Quite Natural and Ambient ‣ No Interaction in General ‣ Limited Interaction (3DIO Omni Binaural Microphone) 23 History of Immersive Sound - Present Binaural Recording Technologies Yaw Roll Pitch
  • 24. ‣ Basic Signal and System in Electrical Engineering ‣ Audio signal is a group of impulses and system can be defined by an impulse response. ‣ Convolution : Each sample’s response is summed up, with the normalization gain corresponding to the input sample. ‣ Sound propagation path from the sound source to each ear in a room can be modeled by a pair Impulse Response 24 History of Immersive Sound - Binaural Rendering System Impulse Response
  • 25. ‣ Binaural Room Impulse Response 25 History of Immersive Sound - Binaural Rendering BRIR (Binaural Room Impulse Response) HRIR (Head Related Impulse Response)
  • 26. HRIR has the key information for human auditory perception of the source position 26 History of Immersive Sound - Binaural Rendering
  • 27. Binaural Rendering - Head Related Impulse Response or Binaural Room Impulse Response ‣ Head Related Impulse Response (HRIR) : Linear FIR Filter Model ‣ Convolution of the Object signal by the HRIR will provide Binaural Sound 27 History of Immersive Sound - Binaural Rendering BRIR at Concert Hall = HRIR + Reflections and Reverberation - Good Externalization - Natural Ambience - Difficult to De-reverberate
 (No Dry Sound) HRIR at Anechoic Chamber - Intermediate Externalization - No Ambience - Dry, straightforward sound
 (It is sound engineer’s job!) 909 × 401
  • 28. Binaural Rendering - Head Related Impulse Response or Binaural Room Impulse Response ‣ Head Related Impulse Response (HRIR) : Linear FIR Filter Model ‣ Convolution of the Object signal by the HRIR will provide Binaural Sound 28 History of Immersive Sound - Binaural Rendering Impulse from one direction Binaural Recording Head Related Impulse Response Impulses from all direction HRIR Database from all direction
  • 29. Binaural Rendering - Head Related Impulse Response or Binaural Room Impulse Response ‣ Head Related Impulse Response (HRIR) : Linear FIR Filter Model ‣ Convolution of the Object signal by the HRIR will provide Binaural Sound 29 History of Immersive Sound - Binaural Rendering 1) Input : Object Position, User Position, User Head Orientation 2) Output : Binaural Rendered Audio Signal Choose a HRIR/BRIR Filter Pair based on the Position and Orientation Information Sound source 1 2 2 Audio Samples = Impulses Convolution = Filtering Binaural Rendered Signal
  • 30. Expensive Realtime Convolution ‣ A 48kHz Sampled BRIR Pair from a Concert Hall would be 96000 Samples x 2 Channel ‣ To Create One Pair of Binaural Samples, 96000 x 2 Multiplications and Additions are required ‣ For a second, 9.2 G (96k x 2 x 48k) Operations are required ‣ Even for a short HRIR Pair (256 Samples at 48kHz), 25M (256 x 2 x 48k) Operations are required 30 History of Immersive Sound - Binaural Rendering Audio Samples = Impulses Convolution = Filtering Binaural Rendered Signal
  • 31. ‣ Signal Processing Techniques for Convolution ‣ Cooley and Turkey, “An Algorithm for the Machine Calculation of Complex Fourier Series” in 1965 ‣ FFT based convolution : Thomas Stockham, Jr, “High Speed Convolution and Correlation” in 1966 ‣ Block convolution : Charls S. Burrus, “Block Realization of Digital Filters” in 1972 31 History of Immersive Sound - Binaural Rendering Time Domain Freq Domain 256 25M 5M 1sec 4.6G 76M 2sec 9.2G 148M
  • 32. 32 History of Immersive Sound - Binaural Rendering ‣ Convolvotron : Binaural Synthesis using HRTF Convolution ‣ E. Wenzel, et. al, “A Virtual Display System for Conveying Three-Dimensional Acoustic Information” in 1988 ‣ Hardware Development : A PC extension card equipped with many parallel DSPs (up to 8 sound sources) in 1992
 - Each DSP is capable of 320 MOPS
  • 33. 33 History of Immersive Sound - Binaural Rendering ‣ Acoustically Optimized Low Complexity Binaural Synthesis Model for BRIR ‣ Acoustic Optimization : Taegyu Lee, “Scalable Multiband Binaural Renderer for MPEG-H 3D Audio” in 2015
 - Parameterizing BRIR (2 second) with Variable Order Filter in Frequency Domain (VOFF), 
 Parameteric Late Reverberation Filtering (PLF), QMF domain tapped delay line (QTDL)
  • 34. 34 History of Immersive Sound - Ambisonics for Binaural Rendering ‣ Ambisonics, started by Gerzon in 1985 ‣ Ambisonics Microphone (A-Format) : A set of directional microphone, highly calibrated ‣ A-to-B Conversion : Inner product. (Each microphone signal is decomposed in terms of spherical harmonics) ‣ Object-to-B Conversion : Inner product. (Object signal is decomposed in terms of spherical harmonics) Spherical HaromonicsAmbisonics Microphone
  • 35. 35 History of Immersive Sound - Ambisonics for Binaural Rendering ‣ Ambisonics Multichannel Reproduction (Coincident Microphones) ‣ Rotation, if necessary : Simple matrix conversion ‣ Reproduction : Over regular layout, inner-product of Spherical Harmonics and Loudspeaker Position Vector 

  • 36. 36 History of Immersive Sound - Ambisonics for Binaural Rendering ‣ Ambisonics Multichannel Reproduction (Coincident Microphones) ‣ Not really popular as a multichannel reproduction method ‣ Production : Difficult to control source width control, distance control, sweet spot control, ambience control ‣ Distribution : More bandwidth, transmission of the channel rendered signal over the standard layout is preferred ‣ Suddenly, became very popular in the VR Audio ‣ Distribution : First order ambisonics is enough to enable the audio to interact with 3-DoF yaw, pitch and roll ‣ Other candidates : Object-based and channel based generally requires more audio channel signal ‣ Limitation : Capable of handling 6DoF by inter/extrapolation of multiple Ambisonics signals but very inefficient Rotation Virtual Loudspeaker Rendering Binaural Rendering
  • 37. 37 Present Outlook and Future Forecast
  • 38. 38 Three Major Types of Binaural Renderings - Direct Object Rendering ‣ Very straightforward Binaural Rendering Mechanism 1) Input : Object Position, User Position, User Head Orientation 2) Output : Binaural Rendered Audio Signal Choose a HRIR/BRIR Filter Pair based on the Geometrical Information Sound source 1 2 2 Audio Samples = Impulses Convolution = Filtering Binaural Rendered Signal
  • 39. 39 Three Major Types of Binaural Renderings - Virtual Channel based Rendering ‣ Objects and their positional metadata are delivered to the renderer at the reproduction device ‣ Relative positions of objects are calculated using head orientation ‣ Each object is localized at the relative position over a pre-determined virtual channel loudspeaker layout ‣ Each virtual channel signal is binauralized using the HRIR of the virtual channel position. L30 O F0 O R60 O L60 O R120 O L120 O B180 O F0 O R60 O L60 O R120 O L120 O B180 O R30 O
  • 40. 40 Three Major Types of Binaural Renderings - Ambisonics based Rendering ‣ Objects are Localized in Ambisonics, ‣ Interaction by the head orientation is applied using Ambisonics rotation, ‣ Ambisonics signal is converted to the virtual channel layout ‣ Each virtual channel signal is binauralized using the HRIR of the virtual channel position. Rotation Virtual Loudspeaker Rendering Binaural Rendering
  • 41. 41 Three Major Types of Binaural Renderings - Comparison Rendering Type Direct Object Rendering Virtual Channel Panning Ambisonics Represenation Adapted by Gaudio Some Game Engines Gaudio MPEG-H 3D Audio Dolby ATMOS VR Gaudio MPEG-H 3D Audio Facebook 360 Video Google 360 Video Spatial Accuracy OOO OO O Panning No Typically pair or triplet Typically all of the Virtual Speakers Timbre Affected by HRTF Affected by HRTF and Panning (Some Phase Problem) Affected by HRTF and Panning (More Phase Problem) Complexity Depending on the Number of Objects (2xN Convolutions) Depending on the Number of Virtual Channels (N Convolutions) Depending on the Ambisonics Order (M Convolutions) Running Memory 360x180 HRIRs N HRIRs M HRIRs Delivery Large number of object needs to be delivered. 4 signals for FOA 9 signals for SOA Interaction 3DOF (head orientation) and 6DOF (3DOF + positional interaction) only 3DOF
  • 42. 42 VR Market Situation ‣ Virtual Reality in General ‣ Uncomfortable form factor ‣ Not enough display resolution ‣ Dizziness from the rendering latency (both video or audio) ‣ No unified content creation-transmission-consumption ecosystem ‣ No Killer Content ‣ Expensive in Content Production : 180 degree? ‣ Couch Potato vs. 6 DOF Movement ‣ Audio ‣ Codec Issues : No server/device handles 5+ objects transmission ‣ Neither a standard or de-facto renderer for content creation ‣ Personalization : How to measure personalized HRIR? ‣ Difference auditory system characteristics with headphone : e.g. occlusion
  • 43. 43 MPEG-I, the 6DOF Standard for the Future 1995 2000 2005 2010 2015 2022 MPEG-IMPEG-1 MPEG-2 MPEG-4 MPEG-D MPEG-H 3D Audio : Support Rendering for Combination of Channels, Objects, and/or Ambisonics Audio for 6DoF AR/VR Content USAC : Hybrid Audio and Speech Codec SAOC : Object Mix & Control MPEG-Surround : Enhanced Channel Extension HE-AAC v.2 : Channel Extension by Parametric Stereo HE-AAC : Bandwidth Extension by Spectral Band Replication AAC : Perceptual Noise Shaping AAC : Enhanced Psychoacoustical Model MP3 : First Psychoacoustical Compression Standard
  • 44. 44 MPEG-I, the 6DOF Standard for the Future • 3 Degrees of Freedom (3DoF) - Yaw-Pitch-Roll - Head orientation only - 360° Video and Cinematic VR - Ambisonics ‣ MPEG-H 3D Audio • 3DoF+ - 3DoF with limited Movements (typically, head movements) • 6 Degrees of Freedom (6DoF) - 3DoF + x-y-z - Head orientation + Movements - Full Interactive VR - Object-based Audio ‣ MPEG-I Immersive Audio “ User’s freedom of movements for interactivity in VR space ”
  • 45. 45 MPEG-I, the 6DOF Standard for the Future • Localization of Virtual Sources ‣ Use Head-Related Transfer Function (HRTF) - From virtual source to L/R ears ‣ Realistic rendering of spatial position due to perceptual cues - ITD, ILD, IC • Directivity of Virtual Sources ‣ Object perceived loudness changes as user moved around object • Ambience and Reverberation ‣ Almost all spaces impose some reverberation on sound sources - Direct Sound, Early Reflections, and Late Reverberation - Need to estimate model for AR • Occlusion, Doppler, etc…
  • 46. 46 MPEG-I Audio Reference Architecture *N18158