Soundly 4th Offline Meetup, 2019.11.30
Audio Technologies for Virtual Reality
Ben Sangbae Chon, Ph.D.
bc@gaudiolab.com
Chief Science Officer
2
Gaudio Lab is a spatial audio technology
company developing encompassing,
three-dimensional sound solutions.
We simplify the process of creating and
delivering immersive sound, allowing
listeners to perceive depth, direction,
and interactivity of audio sources.
• Founded in 2015 by MPEG Audio experts with 10+ active years in the group
• Offices: Seoul and San Francisco
• Audio scientists in residence: 6 Ph.D.s and 2 Master's degree holders in Audio Signal Processing

3
Immersive Audio Contents by Gaudio
01. Tower GAUDI
Unity 3D plugin performance demo (’15.7)
High-Definition 3D Positional Sound with Interaction
02. Horror Maze (Launching Work for GearVR)
Main 360 video demo for Gear VR 2 launching (’15.11)
Built with 1 Sound Object + Stereo ambient track
03. CF Zombie (Mini VR Game, GDC 2016)
High-definition 3D positional sound (’16.3)
All Objects rendered with GAUDIO Unity 3D plugin
04. Project CoC (Stereo to VR)
G’AUDIO specialty for legacy 360 video (’16.1)
Convert legacy stereo to interactive
05. Fanta Promo - Drift
360 live recording for immersive (’16.2)
Compare Ambisonics v. omni-binaural
06. Fanta Promo - Rope-swing, Wing-suit
First-person view 360 live recording (’16.3)
Binaural recording for extreme sport-enthusiast
07. Downhell Studio (interactive 360 Music)
Immersive 360 Spatial Audio for 360 Video
HOA + Objects rendering | also for TV speaker demo
08. Blue Dress (PV demo & Sample PT session)
Immersive 360 Spatial Audio for 360 Video
HOA + 3 Moving Objects rendering for Immersive
09. Goodday (360 Commercial)
Immersive 360 Spatial Audio for 360 Video
Remaster with extracted objects after post-production
10. Jambinai (interactive 360 Music)
Interactive Audio for 360 Video
Loud speaker demo for concert & same via Gear VR
11. World 1st VR Audio Livestreaming
Collaboration with SK Telecom
Interactive with FOA, Sol Livestreaming
12. SK Telecom - Seol-hyun
Promo 360 Episodes for SKT
Live shooting 360
13. Bo Hwa Gwak (Virtual Museum)
Virtual experience of famous Museum
Featured in Busan Film Festival 2017
14. SBS -Astro1&2
Episodes for K-pop star 360 Drama
Ambisonics shoot, post-production with Works
15. SBS-TV Live Music Show
Take from On-air live TV Music Show for 360 Video, Produced by Works
16. Bloodless (Award in Venice Film Fest.)
Based on real story of brutally murdered sex worker,
the Best VR Story award in Venice Film Festival 2017
17. G’Mic Take #1
Proprietary consumer FOA Mic Tech Demo
1st shooting on street, in room
18. HKQ (String Quartet) (i360Music)
Live String Quartet Recording by Petr Soupa
Object with shotgun + FOA
19. The Committee (UK Comedy)
British Comedy by 360 mixed by Petr Soupa
Object with Lav + FOA
20. Annabelle (Blockbuster Feature Film)
Teaser for Hollywood horror movie
Positional sound extremely important for scare factor
21. Elliot Sloan (Gold medal Skateboarder)
Skateboarding by Olympic Gold Medalist
Reality with extreme elevation
22. SBS - 9 Muses, Sitcom series
SBS produced high quality 360 content
with K-pop stars (9 Muses) / Stereoscopic 360
23. SBS - TV Live Show Season 2
SBS produced high quality 360 content with SBS featured program, Inkikayo / Stereoscopic 360
24. Gaudio Live Take #2
Livestreaming showcase with FOA mic
25. C-Flat String Quartet 2 (i360Music)
String quartet performance in a church
4
What Is Immersive Audio for VR?
5
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
- Loudspeaker layout references a static video
- Sweet spot (and sweet orientation) is assumed
- Quality of Experience is optimized at the sweet spot
- It does NOT support interactive viewing
6
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
7
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
360-Video : Multimedia Content Covering Whole Space
8
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
360-Video : Multimedia Content Covering Whole Space
Interactive Video : User chooses where to look, and Video is rendered accordingly
9
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
Interactive Video : User chooses where to look, and Video is rendered accordingly
Interactive Audio : Audio should also be rendered according to the User’s Head Orientation (and Position)
360-Video : Multimedia Content Covering Whole Space
10
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
Interactive Video : User chooses where to look, and Video is rendered accordingly
Interactive Audio : Audio is also rendered according to the User’s Head Orientation (and Position)
360-Video : Multimedia Content Covering Whole Space
[Figure: head rotation axes: yaw, roll, pitch]
11
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
[Figure: head rotation axes (yaw, roll, pitch) and translation axes (X, Y, Z)]
<3 Degrees of Freedom>
- Interactive with Head Orientation
- 360 Video
<6 Degrees of Freedom>
- Interactive with Head Orientation and User's Position
- VR Games, AR
12
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
Positional audio is an important storytelling cue in VR
Because the user is free to choose the viewpoint, sound is an important signal for directing
the viewer's attention to the scene and story as the producer intended
13
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
Is Loudspeaker Rendering Suitable for Interactive Immersive Audio?
14
Comparison between Conventional Media and Immersive (VR/AR/XR) Media
Binaural Technology over Headphones Is the Answer!
15
History of the Immersive Sound
Binaural Technologies
16
History of Immersive Sound - Binaural Recording Technologies
‣ First Telephony by Alexander Graham Bell in 1876, First Phonograph by Thomas Edison in 1877

‣ The Origins of Speech Communication and Audio Media Storage 

‣ In 1877, Yankee Doodle by Frederick Boscovitz from Philadelphia to Washington DC
17
History of Immersive Sound - Binaural Recording Technologies
Theatrophone - The First Live Binaural/Stereophonic Sound Demo

‣ Introduced by Clement Ader at the 1881 World Expo Paris

‣ A pair of microphones → A pair of receivers
‣ 80 telephone transmitters across the front of a stage connected from Opera Garnier to Paris Electrical Exhibition
‣ Scientific American reported, 

“singers place themselves, in the mind of the listener, at a fixed distance, some to the right and others to the left.”
‣ Technical Difficulties : Rudimentary amplification, difficult microphone placement
18
History of Immersive Sound - Binaural Recording Technologies
Theatrophone - The First, Subscription-based, Live Music Streaming
‣ Commercialized by the Compagnie du Theatrophone in 1890
‣ Subscription model : 180 francs per year, 15 francs per performance
‣ Terminal : A headset (and a transmitter to tell a Theatrophone operator which venue they wished to listen in to)
‣ Coin-operated listening stations installed in hotel lobbies and cafes, 50 cent for 2:30 of listening time
19
History of Immersive Sound - Binaural Recording Technologies
‣ Intermediate-Binaural Broadcasting by Doolittle at a US Radio Station (WPAJ) in 1924
‣ Two-channel sound storage and reproduction, without an acoustic septum, was achieved in 1921

‣ Left signal to one AM transmitter and right signal to another AM transmitter

‣ The listener at home needed two receivers

Radio Station A : left signal
Radio Station B : right signal
* Radio broadcasting service started in United States, 1920.
20
History of Immersive Sound - Binaural Recording Technologies
‣ Binaural Telephone (Western Electric, 1927)

‣ Microphones are placed flush with the wall of a balloon made of leather or cloth and packed with sponge rubber, wool, or cotton

‣ Signal separation by shadowing and absorption
21
History of Immersive Sound - Binaural Recording Technologies
‣ “Artificial Head”, a binaural capturing device by W. Bartlett Jones in 1927
‣ Microphones placed on the sphere, facing forward
‣ First simulation of the orientation of human pinnae
22
History of Immersive Sound - Binaural Recording Technologies
‣ Oscar by Fletcher, Bell Labs (Presented in 1933)
‣ Manikin bought from a wax figure dealer, with microphones mounted on the cheeks
‣ Oscar II (1940s)
‣ Microphones placed in the ears
‣ But the membrane sat about 5 mm outside the cavum conchae
Binaural Recording Now

‣ Record ear-input signal and play it over earphones

‣ Quite Natural and Ambient

‣ No Interaction in General

‣ Limited Interaction (3DIO Omni Binaural Microphone)
23
History of Immersive Sound - Present Binaural Recording Technologies
[Figure: head rotation axes: yaw, roll, pitch]
‣ Basic Signals and Systems in Electrical Engineering
‣ An audio signal is a sequence of impulses, and a system can be defined by its impulse response.
‣ Convolution : Each input sample produces a copy of the impulse response scaled by that sample's value, and the delayed copies are summed.
‣ The sound propagation path from the sound source to each ear in a room can be modeled by a pair of impulse responses.
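The convolution bullet above can be sketched in a few lines. This is only an illustration under assumed toy values: the function name `convolve_by_superposition`, the signal `x`, and the impulse response `h` are hypothetical and do not come from the slides.

```python
import numpy as np

def convolve_by_superposition(x, h):
    """Add one delayed copy of h per input sample, scaled by that sample."""
    y = np.zeros(len(x) + len(h) - 1)
    for n, sample in enumerate(x):
        y[n:n + len(h)] += sample * h  # response to the impulse at time n
    return y

x = np.array([1.0, 0.5, -0.25])     # toy input signal (assumed values)
h = np.array([1.0, 0.6, 0.3, 0.1])  # toy impulse response (assumed values)
y = convolve_by_superposition(x, h)
```

The result should match `np.convolve(x, h)`, which computes the same sum in one library call.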
24
History of Immersive Sound - Binaural Rendering
System Impulse Response
‣ Binaural Room Impulse Response
25
History of Immersive Sound - Binaural Rendering
BRIR (Binaural Room Impulse Response) HRIR (Head Related Impulse Response)
HRIR has the key information for human auditory perception of the source position

26
History of Immersive Sound - Binaural Rendering
Binaural Rendering - Head Related Impulse Response or Binaural Room Impulse Response

‣ Head Related Impulse Response (HRIR) : Linear FIR Filter Model

‣ Convolution of the Object signal by the HRIR will provide Binaural Sound
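As a rough sketch of the FIR model above, an object signal can be filtered with one HRIR per ear. The HRIR pair below is a made-up placeholder (a pure delay and attenuation at the far ear), not measured data, and `render_binaural` is a hypothetical helper name.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Filter one object signal with an HRIR pair; returns shape (2, N)."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])

fs = 48000
# Toy HRIR pair for a source on the listener's left (NOT measured data):
# the right ear receives the sound later and quieter than the left ear.
hrir_left = np.zeros(64)
hrir_left[0] = 1.0
hrir_right = np.zeros(64)
hrir_right[30] = 0.5

mono = np.random.default_rng(0).standard_normal(fs // 10)  # 100 ms noise burst
binaural = render_binaural(mono, hrir_left, hrir_right)
```

A real renderer would load a measured HRIR set instead of these placeholders; the filtering step itself stays the same.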
27
History of Immersive Sound - Binaural Rendering
BRIR at Concert Hall = HRIR + Reflections and Reverberation
- Good Externalization
- Natural Ambience
- Difficult to De-reverberate (No Dry Sound)
HRIR at Anechoic Chamber
- Intermediate Externalization
- No Ambience
- Dry, straightforward sound (It is sound engineer’s job!)
28
History of Immersive Sound - Binaural Rendering
Impulse from one direction
Binaural Recording
Head Related Impulse Response
Impulses from all directions
HRIR Database from all directions
29
History of Immersive Sound - Binaural Rendering
1) Input : Object Position, User Position, User Head Orientation
2) Output : Binaural Rendered Audio Signal
Choose a HRIR/BRIR Filter Pair based on the Position and Orientation Information
[Figure: sound source → audio samples (= impulses) → convolution (= filtering) → binaural rendered signal]
Expensive Realtime Convolution

‣ A 48 kHz sampled BRIR pair from a concert hall would be 96000 samples x 2 channels
‣ To create one pair of binaural samples, 96000 x 2 multiplications and additions are required
‣ For one second of audio, 9.2 G (96k x 2 x 48k) operations are required
‣ Even for a short HRIR pair (256 samples at 48 kHz), 25M (256 x 2 x 48k) operations are required
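The arithmetic behind these figures can be checked with a short calculation. The helper name `direct_convolution_ops_per_second` is hypothetical, and one "operation" is taken to mean one multiply-and-add, as the bullets suggest.

```python
def direct_convolution_ops_per_second(filter_len, sample_rate=48_000, ears=2):
    """Multiply-add operations per second for time-domain FIR filtering."""
    return filter_len * ears * sample_rate

brir_ops = direct_convolution_ops_per_second(96_000)  # 2 s BRIR at 48 kHz
hrir_ops = direct_convolution_ops_per_second(256)     # short HRIR

print(f"BRIR: {brir_ops / 1e9:.1f} G ops/s")  # BRIR: 9.2 G ops/s
print(f"HRIR: {hrir_ops / 1e6:.1f} M ops/s")  # HRIR: 24.6 M ops/s
```

The count is per sound object, so it scales linearly with the number of objects rendered at once.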
30
History of Immersive Sound - Binaural Rendering
‣ Signal Processing Techniques for Convolution
‣ Cooley and Tukey, “An Algorithm for the Machine Calculation of Complex Fourier Series” in 1965
‣ FFT based convolution : Thomas Stockham, Jr, “High Speed Convolution and Correlation” in 1966
‣ Block convolution : Charles S. Burrus, “Block Realization of Digital Filters” in 1972
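The two ideas listed above, FFT-based convolution and block convolution, can be sketched with NumPy. This is a minimal illustration, not the formulation from the cited papers: the block scheme shown is plain overlap-add, and the function names and block size are assumptions.

```python
import numpy as np

def fft_convolve(x, h):
    """Linear convolution via zero-padded FFTs (fast convolution)."""
    n = len(x) + len(h) - 1
    nfft = 1 << (n - 1).bit_length()  # next power of two >= n
    y = np.fft.irfft(np.fft.rfft(x, nfft) * np.fft.rfft(h, nfft), nfft)
    return y[:n]

def overlap_add(x, h, block=256):
    """Block convolution: filter fixed-size input blocks and add the tails."""
    y = np.zeros(len(x) + len(h) - 1)
    for start in range(0, len(x), block):
        seg = fft_convolve(x[start:start + block], h)
        y[start:start + len(seg)] += seg  # tails overlap the next block
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)  # toy input signal
h = rng.standard_normal(300)   # toy filter
y_fft = fft_convolve(x, h)
y_ola = overlap_add(x, h)
```

Both results should agree with direct time-domain convolution up to rounding error. The block version is what makes real-time use possible, because output can start before the whole input has arrived.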
31
History of Immersive Sound - Binaural Rendering
Operations per second by filter length:
Filter Length | Time Domain | Freq Domain
256 samples | 25M | 5M
1 sec | 4.6G | 76M
2 sec | 9.2G | 148M
32
History of Immersive Sound - Binaural Rendering
‣ Convolvotron : Binaural Synthesis using HRTF Convolution
‣ E. Wenzel et al., “A Virtual Display System for Conveying Three-Dimensional Acoustic Information” in 1988
‣ Hardware Development : A PC extension card equipped with many parallel DSPs (up to 8 sound sources) in 1992

- Each DSP is capable of 320 MOPS
33
History of Immersive Sound - Binaural Rendering
‣ Acoustically Optimized Low Complexity Binaural Synthesis Model for BRIR
‣ Acoustic Optimization : Taegyu Lee, “Scalable Multiband Binaural Renderer for MPEG-H 3D Audio” in 2015

- Parameterizing BRIR (2 seconds) with Variable Order Filter in Frequency Domain (VOFF), Parametric Late Reverberation Filtering (PLF), and QMF domain tapped delay line (QTDL)
34
History of Immersive Sound - Ambisonics for Binaural Rendering
‣ Ambisonics, started by Gerzon in 1985
‣ Ambisonics Microphone (A-Format) : A set of highly calibrated directional microphones
‣ A-to-B Conversion : Inner product. (Each microphone signal is decomposed in terms of spherical harmonics)
‣ Object-to-B Conversion : Inner product. (Object signal is decomposed in terms of spherical harmonics)
[Figure: spherical harmonics; Ambisonics microphone]
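The object-to-B conversion above amounts to weighting the object signal by the spherical harmonics evaluated at the source direction. The sketch below covers first order only and assumes ACN channel order with SN3D normalization and azimuth measured counterclockwise from straight ahead. The slides do not specify a convention, and `encode_foa` is a hypothetical name.

```python
import numpy as np

def encode_foa(signal, azimuth_deg, elevation_deg=0.0):
    """Encode a mono object into first-order B-format, channels [W, Y, Z, X].

    Assumes ACN ordering with SN3D normalization (an assumption, not from
    the slides); azimuth is counterclockwise from straight ahead.
    """
    az, el = np.radians(azimuth_deg), np.radians(elevation_deg)
    gains = np.array([
        1.0,                      # W: omnidirectional
        np.sin(az) * np.cos(el),  # Y: left-right
        np.sin(el),               # Z: up-down
        np.cos(az) * np.cos(el),  # X: front-back
    ])
    return gains[:, None] * np.asarray(signal)[None, :]

b_format = encode_foa(np.ones(4), azimuth_deg=90.0)  # source hard left
```

Several objects are mixed by summing their encoded signals channel by channel, so the channel count stays at four regardless of how many objects there are.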
35
History of Immersive Sound - Ambisonics for Binaural Rendering
‣ Ambisonics Multichannel Reproduction (Coincident Microphones)
‣ Rotation, if necessary : Simple matrix conversion
‣ Reproduction : Over regular layout, inner-product of Spherical Harmonics and Loudspeaker Position Vector 
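Both steps above can be sketched for the first-order, horizontal-only case. The channel order [W, Y, Z, X] is an assumed convention, and the decoder is a plain projection onto the loudspeaker directions rather than a tuned design. Function names are hypothetical.

```python
import numpy as np

def rotate_yaw(b_format, yaw_deg):
    """Rotate a [W, Y, Z, X] sound field counterclockwise about the vertical axis."""
    c, s = np.cos(np.radians(yaw_deg)), np.sin(np.radians(yaw_deg))
    rot = np.array([
        [1, 0, 0, 0],   # W is unchanged
        [0, c, 0, s],   # Y' = c*Y + s*X
        [0, 0, 1, 0],   # Z is unchanged by yaw
        [0, -s, 0, c],  # X' = c*X - s*Y
    ])
    return rot @ b_format

def decode_horizontal(b_format, speaker_az_deg):
    """Project the horizontal components onto each loudspeaker direction."""
    az = np.radians(np.asarray(speaker_az_deg, dtype=float))
    w, y, _, x = b_format
    feeds = w[None, :] + np.cos(az)[:, None] * x[None, :] + np.sin(az)[:, None] * y[None, :]
    return feeds / len(az)

src_az = np.radians(45.0)
# One-sample sound field with a unit source 45 degrees to the left.
b = np.array([[1.0], [np.sin(src_az)], [0.0], [np.cos(src_az)]])
b_comp = rotate_yaw(b, -45.0)  # listener turned 45 degrees left: source now in front
feeds = decode_horizontal(b, [45, 135, -135, -45])  # square layout
```

Compensating for a head turn means rotating the field by the opposite angle, which costs one small matrix multiply per sample block however many sources were mixed in.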

36
History of Immersive Sound - Ambisonics for Binaural Rendering
‣ Ambisonics Multichannel Reproduction (Coincident Microphones)
‣ Not really popular as a multichannel reproduction method
‣ Production : Difficult to control source width, distance, sweet spot, and ambience
‣ Distribution : More bandwidth; transmission of the channel-rendered signal over the standard layout is preferred
‣ Suddenly became very popular in VR audio
‣ Distribution : First-order Ambisonics is enough to let the audio interact with 3-DoF yaw, pitch, and roll
‣ Other candidates : Object-based and channel-based audio generally require more audio channel signals
‣ Limitation : Capable of handling 6DoF by interpolating/extrapolating multiple Ambisonics signals, but very inefficient
Rotation → Virtual Loudspeaker Rendering → Binaural Rendering
37
Present Outlook and Future Forecast
38
Three Major Types of Binaural Renderings - Direct Object Rendering
‣ Very straightforward Binaural Rendering Mechanism
1) Input : Object Position, User Position, User Head Orientation
2) Output : Binaural Rendered Audio Signal
Choose a HRIR/BRIR Filter Pair based on the Geometrical Information
[Figure: sound source → audio samples (= impulses) → convolution (= filtering) → binaural rendered signal]
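The filter selection step can be sketched for the horizontal plane. Everything here is an assumption made for illustration: the coordinate convention (x forward, y left, yaw counterclockwise), the 30-degree database grid, and the random placeholder filters standing in for measured HRIRs.

```python
import numpy as np

def relative_azimuth_deg(obj_xy, user_xy, head_yaw_deg):
    """Azimuth of the object relative to the listener's nose, in [-180, 180)."""
    dx, dy = np.subtract(obj_xy, user_xy)
    world_az = np.degrees(np.arctan2(dy, dx))
    return (world_az - head_yaw_deg + 180.0) % 360.0 - 180.0

def select_hrir(hrir_db, db_az_deg, az_deg):
    """Pick the HRIR pair stored closest to the requested azimuth."""
    diff = (np.asarray(db_az_deg) - az_deg + 180.0) % 360.0 - 180.0
    return hrir_db[np.argmin(np.abs(diff))]

def render_object(mono, hrir_pair):
    """Convolve the object signal with the left and right filters."""
    return np.stack([np.convolve(mono, hrir_pair[0]),
                     np.convolve(mono, hrir_pair[1])])

rng = np.random.default_rng(0)
db_az = np.arange(-180, 180, 30)                   # hypothetical 30-degree grid
hrir_db = rng.standard_normal((len(db_az), 2, 32)) # placeholder filters, not real HRIRs
mono = rng.standard_normal(480)

rel_az = relative_azimuth_deg((1.0, 1.0), (0.0, 0.0), head_yaw_deg=45.0)
pair = select_hrir(hrir_db, db_az, rel_az)
out = render_object(mono, pair)
```

Here the object sits 45 degrees to the left in world coordinates and the head is also turned 45 degrees left, so the frontal filter is chosen. A practical renderer would also interpolate between neighbouring filters and crossfade when the selection changes.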
39
Three Major Types of Binaural Renderings - Virtual Channel based Rendering
‣ Objects and their positional metadata are delivered to the renderer at the reproduction device

‣ Relative positions of objects are calculated using head orientation

‣ Each object is localized at the relative position over a pre-determined virtual channel loudspeaker layout

‣ Each virtual channel signal is binauralized using the HRIR of the virtual channel position.
[Figure: objects (O) panned over a virtual loudspeaker layout with channels F0, L30, R30, L60, R60, L120, R120, B180]
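The virtual channel steps can be sketched as pairwise panning followed by one fixed HRIR pair per virtual loudspeaker. The constant-power panning law, the sign convention (left is positive azimuth), and the random placeholder HRIRs are assumptions; the slides do not state which panning law is used.

```python
import numpy as np

def pan_pairwise(az_deg, speaker_az_deg):
    """Constant-power panning between the two virtual speakers adjacent to az_deg.

    speaker_az_deg must be sorted ascending and span the full circle.
    """
    spk = np.asarray(speaker_az_deg, dtype=float)
    gains = np.zeros(len(spk))
    az = (az_deg - spk[0]) % 360.0    # angle measured from the first speaker
    offsets = (spk - spk[0]) % 360.0  # ascending speaker offsets
    lo = np.searchsorted(offsets, az, side="right") - 1
    hi = (lo + 1) % len(spk)
    span = (offsets[hi] - offsets[lo]) % 360.0
    frac = (az - offsets[lo]) / span
    gains[lo] = np.cos(frac * np.pi / 2)
    gains[hi] = np.sin(frac * np.pi / 2)
    return gains

def binauralize(feeds, hrirs):
    """feeds: (n_spk, n_samples); hrirs: (n_spk, 2, taps) -> (2, n_samples + taps - 1)."""
    out = np.zeros((2, feeds.shape[1] + hrirs.shape[2] - 1))
    for feed, pair in zip(feeds, hrirs):
        out[0] += np.convolve(feed, pair[0])
        out[1] += np.convolve(feed, pair[1])
    return out

layout = [-120, -60, -30, 0, 30, 60, 120, 180]   # R120 ... F0 ... L120, B180
rng = np.random.default_rng(0)
hrirs = rng.standard_normal((len(layout), 2, 32))  # placeholder filters, not real HRIRs
mono = rng.standard_normal(480)

gains = pan_pairwise(15.0, layout)          # object 15 degrees to the left
feeds = gains[:, None] * mono[None, :]      # one feed per virtual channel
binaural = binauralize(feeds, hrirs)
```

The number of convolutions depends on the virtual layout, not on the object count: more objects only add cheap gain multiplications before the fixed set of filters.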
40
Three Major Types of Binaural Renderings - Ambisonics based Rendering
‣ Objects are Localized in Ambisonics, 

‣ Interaction by the head orientation is applied using Ambisonics rotation,

‣ Ambisonics signal is converted to the virtual channel layout

‣ Each virtual channel signal is binauralized using the HRIR of the virtual channel position.
Rotation → Virtual Loudspeaker Rendering → Binaural Rendering
41
Three Major Types of Binaural Renderings - Comparison
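Rendering Type : Direct Object Rendering / Virtual Channel Panning / Ambisonics Representation
Adopted by
- Direct Object Rendering : Gaudio, Some Game Engines
- Virtual Channel Panning : Gaudio, MPEG-H 3D Audio, Dolby ATMOS VR
- Ambisonics Representation : Gaudio, MPEG-H 3D Audio, Facebook 360 Video, Google 360 Video
Spatial Accuracy
- Direct Object Rendering : OOO
- Virtual Channel Panning : OO
- Ambisonics Representation : O
Panning
- Direct Object Rendering : No
- Virtual Channel Panning : Typically pair or triplet
- Ambisonics Representation : Typically all of the Virtual Speakers
Timbre
- Direct Object Rendering : Affected by HRTF
- Virtual Channel Panning : Affected by HRTF and Panning (Some Phase Problem)
- Ambisonics Representation : Affected by HRTF and Panning (More Phase Problem)
Complexity
- Direct Object Rendering : Depending on the Number of Objects (2xN Convolutions)
- Virtual Channel Panning : Depending on the Number of Virtual Channels (N Convolutions)
- Ambisonics Representation : Depending on the Ambisonics Order (M Convolutions)
Running Memory
- Direct Object Rendering : 360x180 HRIRs
- Virtual Channel Panning : N HRIRs
- Ambisonics Representation : M HRIRs
Delivery
- Direct Object Rendering and Virtual Channel Panning : A large number of objects needs to be delivered
- Ambisonics Representation : 4 signals for FOA, 9 signals for SOA
Interaction
- Direct Object Rendering and Virtual Channel Panning : 3DOF (head orientation) and 6DOF (3DOF + positional interaction)
- Ambisonics Representation : only 3DOF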
42
VR Market Situation
‣ Virtual Reality in General

‣ Uncomfortable form factor

‣ Not enough display resolution

‣ Dizziness from the rendering latency (both video and audio)

‣ No unified content creation-transmission-consumption ecosystem

‣ No Killer Content

‣ Expensive in Content Production : 180 degree?

‣ Couch Potato vs. 6 DOF Movement

‣ Audio

‣ Codec Issues : No server/device handles transmission of 5+ objects
‣ Neither a standard nor a de facto renderer for content creation
‣ Personalization : How to measure personalized HRIR?
‣ Different auditory system characteristics with headphones : e.g. occlusion
43
MPEG-I, the 6DOF Standard for the Future
1995 2000 2005 2010 2015 2022
MPEG-1 MPEG-2 MPEG-4 MPEG-D MPEG-H MPEG-I
3D Audio : Support Rendering for Combination of Channels, Objects, and/or Ambisonics
Audio for 6DoF AR/VR Content
USAC : Hybrid Audio and Speech Codec
SAOC : Object Mix & Control
MPEG-Surround : Enhanced Channel Extension
HE-AAC v.2 : Channel Extension by Parametric Stereo
HE-AAC : Bandwidth Extension by Spectral Band Replication
AAC : Perceptual Noise Shaping
AAC : Enhanced Psychoacoustical Model
MP3 : First Psychoacoustical Compression Standard
44
MPEG-I, the 6DOF Standard for the Future
• 3 Degrees of Freedom (3DoF)
- Yaw-Pitch-Roll
- Head orientation only
- 360° Video and Cinematic VR
- Ambisonics
‣ MPEG-H 3D Audio
• 3DoF+
- 3DoF with limited Movements (typically, head movements)
• 6 Degrees of Freedom (6DoF)
- 3DoF + x-y-z
- Head orientation + Movements
- Full Interactive VR
- Object-based Audio
‣ MPEG-I Immersive Audio
“ User’s freedom of movements for interactivity in VR space ”
45
MPEG-I, the 6DOF Standard for the Future
• Localization of Virtual Sources
‣ Use Head-Related Transfer Function (HRTF)
- From virtual source to L/R ears
‣ Realistic rendering of spatial position due to perceptual cues
- ITD, ILD, IC
• Directivity of Virtual Sources
‣ An object's perceived loudness changes as the user moves around it
• Ambience and Reverberation
‣ Almost all spaces impose some reverberation on sound sources
- Direct Sound, Early Reflections, and Late Reverberation
- Need to estimate the model for AR
• Occlusion, Doppler, etc…
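Of the localization cues listed above, the interaural time difference (ITD) is the easiest to put a number on. The sketch below uses Woodworth's spherical-head approximation, a textbook model brought in here for illustration and not taken from the slides. The head radius and speed of sound are assumed typical values.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """ITD in seconds for a far-field source, spherical-head (Woodworth) model.

    Intended for azimuths between -90 and 90 degrees from straight ahead.
    """
    theta = math.radians(azimuth_deg)
    return head_radius_m / speed_of_sound * (theta + math.sin(theta))

itd_front = woodworth_itd(0.0)   # source straight ahead: no time difference
itd_side = woodworth_itd(90.0)   # source directly to one side
```

With these assumed values the ITD for a source directly to the side comes out at roughly 0.66 ms, which is about 31 samples at 48 kHz.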
46
MPEG-I Audio Reference Architecture
*N18158
47
Audio-Only Applications
*N18627

Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
David Brossard
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 

Recently uploaded (20)

Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 

Audio Technologies for Virtual Reality

  • 1. Soundly 4th Offline Meetup, 2019.11.30 Audio Technologies for Virtual Reality Ben Sangbae Chon, Ph.D. bc@gaudiolab.com Chief Science Officer
  • 2. Gaudio Lab is a spatial audio technology company developing encompassing, three-dimensional sound solutions. We simplify the process of creating and delivering immersive sound, allowing listeners to perceive depth, direction, and interactivity of audio sources. • Founded in 2015 by MPEG Audio experts with 10+ active years in the group • Offices: Seoul and San Francisco • Audio scientists in residence: 6 Ph.D.s, 2 Masters in Audio Signal Processing

  • 3. Immersive Audio Contents by Gaudio
      01. Tower GAUDI: Unity 3D plugin performance demo (’15.7). High-Definition 3D Positional Sound with Interaction
      02. Horror Maze (Launching Work for GearVR): Main 360 video demo for Gear VR 2 launching (’15.11). Built with 1 Sound Object + Stereo ambient track
      03. CF Zombie (Mini VR Game, GDC 2016): High-definition 3D positional sound (’16.3). All Objects rendered with GAUDIO Unity 3D plugin
      04. Project CoC (Stereo to VR): G’AUDIO specialty for legacy 360 video (’16.1). Convert legacy stereo to interactive
      05. Fanta Promo - Drift: 360 live recording for immersive (’16.2). Compare Ambisonics v. omni-binaural
      06. Fanta Promo - Rope-swing, Wing-suit: First-person view 360 live recording (’16.3). Binaural recording for extreme sport-enthusiast
      07. Downhell Studio (interactive 360 Music): Immersive 360 Spatial Audio for 360 Video. HOA + Objects rendering | also for TV speaker demo
      08. Blue Dress (PV demo & Sample PT session): Immersive 360 Spatial Audio for 360 Video. HOA + 3 Moving Objects rendering for Immersive
      09. Goodday (360 Commercial): Immersive 360 Spatial Audio for 360 Video. Remaster with extracted objects after post-production
      10. Jambinai (interactive 360 Music): Interactive Audio for 360 Video. Loud speaker demo for concert & same via Gear VR
      11. World 1st VR Audio Livestreaming: Collaboration with SK Telecom. Interactive with FOA, Sol Livestreaming
      12. SK Telecom - Seol-hyun Promo: 360 Episodes for SKT. Live shooting 360
      13. Bo Hwa Gwak (Virtual Museum): Virtual experience of famous Museum. Featured in Busan Film Festival 2017
      14. SBS - Astro 1&2: Episodes for K-pop star 360 Drama. Ambisonics shoot, post-production with Works
      15. SBS-TV Live Music Show: Take from On-air live TV Music Show for 360 Video, Produced by Works
      16. Bloodless (Award in Venice Film Fest.): Based on real story of brutally murdered sex worker, the Best VR Story award in Venice Film Festival 2017
      17. G’Mic Take #1: Proprietary consumer FOA Mic Tech Demo. 1st shooting on street, in room
      18. HKQ (String Quartet) (i360Music): Live String Quartet Recording by Petr Soupa. Object with shotgun + FOA
      19. The Committee (UK Comedy): British Comedy by 360 mixed by Petr Soupa. Object with Lav + FOA
      20. Annabelle (Blockbuster Feature Film): Teaser for Hollywood horror movie. Positional sound extremely important for scare factor
      21. Elliot Sloan (Gold medal Skateboarder): Skateboarding by Olympic Gold Medalist. Reality with extreme elevation
      22. SBS - 9 Muses, Sitcom series: SBS produced high quality 360 content with K-pop stars (9 Muses) / Stereoscopic 360
      23. SBS - TV Live Show Season 2: SBS produced high quality 360 content with SBS featured program, Inkikayo / Stereoscopic 360
      24. Gaudio Live Take #2: Livestreaming showcase with FOA mic
      25. C-Flat String Quartet 2 (i360Music): String quartet performance in a church
  • 4. What Is Immersive Audio for VR?
  • 5. Comparison between Conventional Media and Immersive (VR/AR/XR) Media - Loudspeaker layout references a static video - Sweet spot (and sweet orientation) is assumed - Quality of Experience is optimized at the sweet spot - It does NOT support interactive viewing
  • 6. Comparison between Conventional Media and Immersive (VR/AR/XR) Media
  • 7. Comparison between Conventional Media and Immersive (VR/AR/XR) Media 360-Video : Multimedia Content Covering Whole Space
  • 8. Comparison between Conventional Media and Immersive (VR/AR/XR) Media 360-Video : Multimedia Content Covering Whole Space Interactive Video : User chooses where to look and Video is rendered accordingly
  • 9. Comparison between Conventional Media and Immersive (VR/AR/XR) Media 360-Video : Multimedia Content Covering Whole Space Interactive Video : User chooses where to look and Video is rendered accordingly Interactive Audio : Audio should also be rendered according to the User’s Head Orientation (and Position)
  • 10. Comparison between Conventional Media and Immersive (VR/AR/XR) Media 360-Video : Multimedia Content Covering Whole Space Interactive Video : User chooses where to look and Video is rendered accordingly Interactive Audio : Audio is also rendered according to the User’s Head Orientation (and Position)
  • 11. Comparison between Conventional Media and Immersive (VR/AR/XR) Media <3 Degrees of Freedom> (Yaw, Pitch, Roll) - Interactive with Head Orientation - 360 Video <6 Degrees of Freedom> (Yaw, Pitch, Roll + X, Y, Z) - Interactive with Head Orientation and User’s Position - VR Games, AR
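To make the 3DoF/6DoF distinction concrete, here is a minimal sketch of how a renderer might turn a source position, a listener position (the 6DoF part) and a head yaw (part of the 3DoF orientation) into a direction relative to the listener's nose. The function name and the axis convention (x forward, y left, z up, yaw counterclockwise about z) are assumptions made for this illustration and do not come from the slides; pitch and roll are ignored to keep it short.

```python
import math


def relative_direction(source_xyz, listener_xyz, head_yaw_deg):
    """Return (azimuth_deg, elevation_deg, distance) of a source as seen from the head.

    Assumed convention (not from the slides): x points forward, y to the left,
    z up; yaw is a counterclockwise rotation about z; pitch and roll are ignored.
    """
    # 6DoF part: translate the source into listener-centred coordinates.
    dx = source_xyz[0] - listener_xyz[0]
    dy = source_xyz[1] - listener_xyz[1]
    dz = source_xyz[2] - listener_xyz[2]
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)

    # 3DoF part: subtract the head yaw so the angle is measured from the nose,
    # then wrap the result into [-180, 180).
    world_azimuth = math.degrees(math.atan2(dy, dx))
    azimuth = (world_azimuth - head_yaw_deg + 180.0) % 360.0 - 180.0

    elevation = math.degrees(math.asin(dz / distance)) if distance > 0 else 0.0
    return azimuth, elevation, distance
```

With the head turned 90 degrees to the left, a source that is straight ahead in world coordinates ends up at an azimuth of -90 degrees, that is, at the listener's right ear.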
  • 12. Comparison between Conventional Media and Immersive (VR/AR/XR) Media Positional audio is an important storytelling cue in VR. Because the user is free to choose the viewpoint, sound is an important signal for directing the viewer’s attention to the scene and story as the producer intended.
  • 13. Comparison between Conventional Media and Immersive (VR/AR/XR) Media Is Loudspeaker Rendering Suitable for Interactive Immersive Audio?
  • 14. Comparison between Conventional Media and Immersive (VR/AR/XR) Media Binaural Technology over Headphones Is the Answer!
  • 15. History of Immersive Sound: Binaural Technologies
  • 16. History of Immersive Sound - Binaural Recording Technologies ‣ First Telephony by Alexander Graham Bell in 1876, First Phonograph by Thomas Edison in 1877 ‣ The Origins of Speech Communication and Audio Media Storage ‣ In 1877, Yankee Doodle played by Frederick Boscovitz was transmitted from Philadelphia to Washington DC
  • 17. History of Immersive Sound - Binaural Recording Technologies Theatrophone - The First Live Binaural/Stereophonic Sound Demo ‣ Introduced by Clement Ader at the 1881 World Expo Paris ‣ A pair of microphones → A pair of receivers ‣ 80 telephone transmitters across the front of a stage connected from Opera Garnier to Paris Electrical Exhibition ‣ Scientific American reported, “singers place themselves, in the mind of the listener, at a fixed distance, some to the right and others to the left.” ‣ Technical Difficulties : Rudimentary amplification, difficult microphone placement
  • 18. History of Immersive Sound - Binaural Recording Technologies Theatrophone - The First Subscription-based Live Music Streaming ‣ Commercialized by the Compagnie du Theatrophone in 1890 ‣ Subscription model : 180 francs per year, 15 francs per performance ‣ Terminal : A headset (and a transmitter to tell a Theatrophone operator which venue they wished to listen in to) ‣ Coin-operated listening stations installed in hotel lobbies and cafes, 50 cents for 2:30 of listening time
  • 19. History of Immersive Sound - Binaural Recording Technologies ‣ Intermediate-Binaural Broadcasting by Doolittle at a US Radio Station (WPAJ) in 1924 ‣ Storing and reproducing two-channel sound, without an acoustic septum, was achieved in 1921 ‣ Left signal to one AM transmitter (Radio Station A) and right signal to another AM transmitter (Radio Station B) ‣ The listener at home needed two receivers * Radio broadcasting service started in the United States in 1920.
  • 20. History of Immersive Sound - Binaural Recording Technologies ‣ Binaural Telephone (Western Electric, 1927) ‣ Microphones are placed flush with the wall of a balloon made of leather or cloth and packed with sponge rubber, wool, or cotton ‣ Signal separation by shadowing and absorption
  • 21. History of Immersive Sound - Binaural Recording Technologies ‣ “Artificial Head”, a binaural capturing device by W. Bartlett Jones in 1927 ‣ Microphones placed on the sphere, facing forward ‣ First simulation of the orientation of human pinnae
  • 22. History of Immersive Sound - Binaural Recording Technologies ‣ Oscar by Fletcher, Bell Labs (Presented in 1933) ‣ Manikin bought from a wax figure dealer, with Microphones mounted on the cheeks ‣ Oscar II (1940s) ‣ Microphones placed in the ears ‣ But the membrane sits about 5 mm outside the cavum conchae
  • 23. History of Immersive Sound - Present Binaural Recording Technologies Binaural Recording Now ‣ Record the ear-input signal and play it over earphones ‣ Quite Natural and Ambient ‣ No Interaction in General ‣ Limited Interaction (3Dio Omni Binaural Microphone)
  • 24. History of Immersive Sound - Binaural Rendering ‣ Basic Signals and Systems in Electrical Engineering ‣ An audio signal is a sequence of impulses, and a system can be defined by its impulse response. ‣ Convolution : The responses to all input samples are summed, each scaled by the value of its input sample. ‣ The sound propagation path from the sound source to each ear in a room can be modeled by a pair of impulse responses
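The description of convolution above can be written out directly. The following is a minimal sketch of direct-form (time-domain) convolution, where every input sample launches a scaled copy of the impulse response and the copies are summed; the function name is chosen for this illustration only.

```python
def convolve(signal, impulse_response):
    """Direct-form FIR convolution: y[n] = sum over k of h[k] * x[n - k]."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        # Each input sample contributes the impulse response scaled by x,
        # delayed to start at output index n.
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out
```

For example, `convolve([1, 2, 3], [1, 0.5])` gives `[1.0, 2.5, 4.0, 1.5]`. The nested loop makes the cost visible: one multiply-add per pair of input sample and filter tap.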
  • 25. History of Immersive Sound - Binaural Rendering ‣ BRIR (Binaural Room Impulse Response) ‣ HRIR (Head Related Impulse Response)
  • 26. History of Immersive Sound - Binaural Rendering HRIR has the key information for human auditory perception of the source position
  • 27. History of Immersive Sound - Binaural Rendering Binaural Rendering - Head Related Impulse Response or Binaural Room Impulse Response ‣ Head Related Impulse Response (HRIR) : Linear FIR Filter Model ‣ Convolution of the Object signal by the HRIR will provide Binaural Sound ‣ BRIR at Concert Hall = HRIR + Reflections and Reverberation - Good Externalization - Natural Ambience - Difficult to De-reverberate (No Dry Sound) ‣ HRIR at Anechoic Chamber - Intermediate Externalization - No Ambience - Dry, straightforward sound (It is sound engineer’s job!)
  • 28. History of Immersive Sound - Binaural Rendering Binaural Rendering - Head Related Impulse Response or Binaural Room Impulse Response ‣ Head Related Impulse Response (HRIR) : Linear FIR Filter Model ‣ Convolution of the Object signal by the HRIR will provide Binaural Sound ‣ Impulse from one direction → Binaural Recording → Head Related Impulse Response ‣ Impulses from all directions → HRIR Database for all directions
  • 29. History of Immersive Sound - Binaural Rendering Binaural Rendering - Head Related Impulse Response or Binaural Room Impulse Response ‣ Head Related Impulse Response (HRIR) : Linear FIR Filter Model ‣ Convolution of the Object signal by the HRIR will provide Binaural Sound 1) Input : Object Position, User Position, User Head Orientation 2) Output : Binaural Rendered Audio Signal ‣ Choose an HRIR/BRIR Filter Pair based on the Position and Orientation Information ‣ Audio Samples = Impulses, Convolution = Filtering → Binaural Rendered Signal
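The rendering steps on this slide (pick an HRIR pair for the source direction, then convolve) can be sketched as below. Everything named here is hypothetical: `TOY_HRIRS` is a hand-made three-entry table of three-tap placeholder filters that only mimics a level and delay difference between the ears, whereas real HRIRs are measured and much longer; nearest-neighbour selection is also just one simple way to choose a filter pair.

```python
def convolve(signal, impulse_response):
    """Direct-form FIR convolution."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


# Hypothetical placeholder table: azimuth in degrees (positive = left) mapped
# to a (left-ear, right-ear) filter pair. These are NOT measured HRIRs.
TOY_HRIRS = {
    90: ([1.0, 0.0, 0.0], [0.0, 0.0, 0.4]),   # left ear louder and earlier
    0: ([0.7, 0.0, 0.0], [0.7, 0.0, 0.0]),    # identical at both ears
    -90: ([0.0, 0.0, 0.4], [1.0, 0.0, 0.0]),  # right ear louder and earlier
}


def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def render_binaural(mono, relative_azimuth_deg, table=TOY_HRIRS):
    """Pick the nearest filter pair and convolve the object signal with it."""
    nearest = min(table, key=lambda az: angular_distance(az, relative_azimuth_deg))
    h_left, h_right = table[nearest]
    return convolve(mono, h_left), convolve(mono, h_right)
```

In an interactive renderer the relative azimuth would be recomputed whenever the head moves, so the selected filter pair changes over time; a practical implementation would also interpolate between neighbouring filters rather than switching abruptly.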
  • 30. History of Immersive Sound - Binaural Rendering Expensive Realtime Convolution ‣ A 48kHz Sampled BRIR Pair from a Concert Hall would be 96000 Samples (2 seconds) x 2 Channels ‣ To Create One Pair of Binaural Samples, 96000 x 2 Multiplications and Additions are required ‣ For one second of audio, 9.2 G (96k x 2 x 48k) Operations are required ‣ Even for a short HRIR Pair (256 Samples at 48kHz), 25M (256 x 2 x 48k) Operations are required
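The operation counts quoted on this slide follow from a single product: filter length times number of ears times output samples per second. A small sketch of that arithmetic, with a helper name invented for this illustration:

```python
def multiply_adds_per_second(ir_length, sample_rate=48_000, ears=2):
    """Multiply-add operations per second for direct time-domain convolution."""
    return ir_length * ears * sample_rate


concert_hall = multiply_adds_per_second(96_000)  # 2-second BRIR pair
short_hrir = multiply_adds_per_second(256)       # 256-tap HRIR pair
```

Here `concert_hall` evaluates to 9,216,000,000 (the slide's 9.2 G) and `short_hrir` to 24,576,000 (the slide's 25M).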
  • 31. History of Immersive Sound - Binaural Rendering ‣ Signal Processing Techniques for Convolution ‣ Cooley and Tukey, “An Algorithm for the Machine Calculation of Complex Fourier Series” in 1965 ‣ FFT based convolution : Thomas Stockham, Jr., “High Speed Convolution and Correlation” in 1966 ‣ Block convolution : Charles S. Burrus, “Block Realization of Digital Filters” in 1972 ‣ Operations by filter length (Time Domain / Freq Domain) : 256 samples : 25M / 5M; 1 sec : 4.6G / 76M; 2 sec : 9.2G / 148M
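The idea behind FFT-based convolution can be sketched in a few lines: zero-pad both sequences, transform, multiply bin by bin, and transform back. The code below is a textbook radix-2 Cooley-Tukey FFT written in plain Python for illustration, not an optimized implementation, and it processes the whole signal as one block; a real-time renderer would split the input into blocks (overlap-add or overlap-save), which is the block-convolution idea cited on the slide. Both inputs are assumed to be non-empty.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(a) must be a power of two.

    The inverse transform is left unnormalized (the caller divides by n).
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_convolve(x, h):
    """Linear convolution of two non-empty sequences via zero-padded FFTs."""
    out_len = len(x) + len(h) - 1
    size = 1
    while size < out_len:
        size *= 2  # pad to a power of two so circular wrap-around cannot occur
    fx = fft(list(x) + [0.0] * (size - len(x)))
    fh = fft(list(h) + [0.0] * (size - len(h)))
    product = [a * b for a, b in zip(fx, fh)]
    y = fft(product, invert=True)
    return [v.real / size for v in y[:out_len]]
```

The result matches direct convolution up to floating-point rounding, while the cost grows roughly as N log N instead of N squared, which is where the savings in the slide's comparison come from.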
  • 32. History of Immersive Sound - Binaural Rendering ‣ Convolvotron : Binaural Synthesis using HRTF Convolution ‣ E. Wenzel, et al., “A Virtual Display System for Conveying Three-Dimensional Acoustic Information” in 1988 ‣ Hardware Development : A PC extension card equipped with many parallel DSPs (up to 8 sound sources) in 1992 - Each DSP is capable of 320 MOPS
  • 33. History of Immersive Sound - Binaural Rendering ‣ Acoustically Optimized Low Complexity Binaural Synthesis Model for BRIR ‣ Acoustic Optimization : Taegyu Lee, “Scalable Multiband Binaural Renderer for MPEG-H 3D Audio” in 2015 - Parameterizing BRIR (2 seconds) with Variable Order Filter in Frequency Domain (VOFF), Parametric Late Reverberation Filtering (PLF), QMF domain tapped delay line (QTDL)
  • 34. History of Immersive Sound - Ambisonics for Binaural Rendering ‣ Ambisonics, started by Gerzon in 1985 ‣ Ambisonics Microphone (A-Format) : A set of highly calibrated directional microphones ‣ A-to-B Conversion : Inner product. (Each microphone signal is decomposed in terms of spherical harmonics) ‣ Object-to-B Conversion : Inner product. (Object signal is decomposed in terms of spherical harmonics)
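As an illustration of the Object-to-B conversion, the sketch below encodes one mono sample into first-order Ambisonics by weighting it with the first-order spherical harmonics evaluated at the source direction. The channel order (W, Y, Z, X) and SN3D normalization follow the AmbiX convention, which is an assumption made here; the slides do not say which convention they use, and other conventions scale W differently.

```python
import math


def encode_foa(sample, azimuth_deg, elevation_deg):
    """Encode one sample into first-order Ambisonics [W, Y, Z, X].

    Assumes AmbiX channel order with SN3D normalization; azimuth is
    counterclockwise from straight ahead, elevation is upward.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                  # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)    # left-right
    z = sample * math.sin(el)                   # up-down
    x = sample * math.cos(az) * math.cos(el)    # front-back
    return [w, y, z, x]
```

A source straight ahead lands in W and X only, while a source at 90 degrees to the left lands in W and Y; four channels describe the whole first-order sound field regardless of how many objects are mixed into it.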
  • 35. History of Immersive Sound - Ambisonics for Binaural Rendering ‣ Ambisonics Multichannel Reproduction (Coincident Microphones) ‣ Rotation, if necessary : Simple matrix conversion ‣ Reproduction : Over regular layout, inner-product of Spherical Harmonics and Loudspeaker Position Vector
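The "simple matrix conversion" for rotation can be shown for the yaw case. In first-order Ambisonics a rotation about the vertical axis only mixes the X and Y channels with a 2x2 rotation matrix, leaving W and Z untouched. The sketch below assumes the AmbiX channel order [W, Y, Z, X] and a counterclockwise-positive yaw, neither of which is specified on the slides; to compensate a head turn, the sound field is rotated by the opposite angle.

```python
import math


def rotate_foa_for_head_yaw(wyzx, head_yaw_deg):
    """Rotate a first-order Ambisonics frame [W, Y, Z, X] to follow the head.

    The field is rotated by minus the head yaw, so a source that is fixed in
    the world moves the opposite way relative to the listener.
    """
    w, y, z, x = wyzx
    a = math.radians(-head_yaw_deg)
    x_rot = x * math.cos(a) - y * math.sin(a)
    y_rot = x * math.sin(a) + y * math.cos(a)
    return [w, y_rot, z, x_rot]
```

For a source encoded straight ahead, turning the head 90 degrees to the left moves its energy from X into negative Y, meaning the source is now heard on the right. Pitch and roll work the same way with 3x3 rotations over X, Y and Z.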

  • 36. History of Immersive Sound - Ambisonics for Binaural Rendering ‣ Ambisonics Multichannel Reproduction (Coincident Microphones) ‣ Not really popular as a multichannel reproduction method ‣ Production : Difficult to control source width, distance, sweet spot, and ambience ‣ Distribution : Needs more bandwidth; transmitting the channel-rendered signal over the standard layout is preferred ‣ Suddenly became very popular in VR Audio ‣ Distribution : First-order Ambisonics is enough to let the audio interact with 3-DoF yaw, pitch and roll ‣ Other candidates : Object-based and channel-based audio generally require more audio channel signals ‣ Limitation : Can handle 6DoF by interpolating/extrapolating multiple Ambisonics signals, but very inefficiently ‣ Processing chain : Rotation → Virtual Loudspeaker Rendering → Binaural Rendering
  • 37. Present Outlook and Future Forecast
  • 38. Three Major Types of Binaural Renderings - Direct Object Rendering ‣ Very straightforward Binaural Rendering Mechanism 1) Input : Object Position, User Position, User Head Orientation 2) Output : Binaural Rendered Audio Signal ‣ Choose an HRIR/BRIR Filter Pair based on the Geometrical Information ‣ Audio Samples = Impulses, Convolution = Filtering → Binaural Rendered Signal
  • 39. Three Major Types of Binaural Renderings - Virtual Channel based Rendering ‣ Objects and their positional metadata are delivered to the renderer at the reproduction device ‣ Relative positions of objects are calculated using head orientation ‣ Each object is localized at the relative position over a pre-determined virtual channel loudspeaker layout ‣ Each virtual channel signal is binauralized using the HRIR of the virtual channel position. ‣ Virtual loudspeaker positions in the diagram : F0, L30, R30, L60, R60, L120, R120, B180
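The step "each object is localized at the relative position over a pre-determined virtual channel loudspeaker layout" is typically a panning operation. Below is a minimal sketch of pairwise constant-power panning over a horizontal ring of virtual loudspeakers. The ring mirrors the labels in the slide's diagram, but the sign convention (positive azimuth = left), the sine/cosine gain law and the function name are assumptions for this illustration; the slides do not specify which panning law is used.

```python
import math

# Assumed horizontal ring in degrees, positive = left:
# R120, R60, R30, F0, L30, L60, L120, B180.
LAYOUT_DEG = [-120, -60, -30, 0, 30, 60, 120, 180]


def pan_pairwise(azimuth_deg, layout=LAYOUT_DEG):
    """Return {speaker_azimuth: gain} for the two speakers bracketing the source.

    Gains follow a constant-power (sine/cosine) law, so their squares sum to 1.
    """
    az = (azimuth_deg + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
    ring = sorted(layout)
    for i, lo in enumerate(ring):
        hi = ring[(i + 1) % len(ring)]          # neighbour, wrapping at the back
        span = (hi - lo) % 360.0                # arc covered by this pair
        offset = (az - lo) % 360.0              # how far the source is into it
        if offset <= span:
            frac = offset / span
            return {
                lo: math.cos(frac * math.pi / 2.0),
                hi: math.sin(frac * math.pi / 2.0),
            }
    raise ValueError("layout must contain at least two distinct azimuths")
```

Each virtual channel signal is then the object signal times its gain, and the binaural output is obtained by convolving every virtual channel with the fixed HRIR pair of its loudspeaker position and summing the results. The number of convolutions therefore depends on the layout size rather than on the number of objects.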
  • 40. Three Major Types of Binaural Renderings - Ambisonics based Rendering ‣ Objects are localized in Ambisonics ‣ Interaction by the head orientation is applied using Ambisonics rotation ‣ Ambisonics signal is converted to the virtual channel layout ‣ Each virtual channel signal is binauralized using the HRIR of the virtual channel position. ‣ Processing chain : Rotation → Virtual Loudspeaker Rendering → Binaural Rendering
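The conversion from an Ambisonics signal to a virtual channel layout can be sketched with a very basic decoder. The version below treats each virtual loudspeaker feed as a virtual cardioid-like microphone pointed at that loudspeaker, using only the horizontal components of a first-order signal. This is one of several possible decoder designs and normalizations, chosen here for simplicity; the channel order [W, Y, Z, X] and the function name are assumptions, and the slides do not state which decoder Gaudio or the other listed systems use.

```python
import math


def decode_foa_horizontal(wyzx, speaker_azimuths_deg):
    """Project a first-order frame [W, Y, Z, X] onto a horizontal speaker ring.

    Each feed is (W + X*cos(az) + Y*sin(az)) / N, i.e. a virtual microphone
    aimed at the speaker; the Z (height) channel is ignored in this sketch.
    """
    w, y, _z, x = wyzx
    n = len(speaker_azimuths_deg)
    feeds = []
    for az_deg in speaker_azimuths_deg:
        az = math.radians(az_deg)
        feeds.append((w + x * math.cos(az) + y * math.sin(az)) / n)
    return feeds
```

For a frontal source (W = 1, X = 1) decoded to a square layout at 0, 90, 180 and -90 degrees, the front speaker gets the largest feed, the side speakers get half of that, and the rear speaker gets nothing. Every speaker on the side of the source receives some signal, which is consistent with the comparison slide's note that Ambisonics panning typically involves all of the virtual speakers.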
  • 41. Three Major Types of Binaural Renderings - Comparison
      Rendering Type | Direct Object Rendering | Virtual Channel Panning | Ambisonics Representation
      Adopted by | Gaudio, Some Game Engines | Gaudio, MPEG-H 3D Audio, Dolby ATMOS VR | Gaudio, MPEG-H 3D Audio, Facebook 360 Video, Google 360 Video
      Spatial Accuracy | OOO | OO | O
      Panning | No | Typically pair or triplet | Typically all of the Virtual Speakers
      Timbre | Affected by HRTF | Affected by HRTF and Panning (Some Phase Problem) | Affected by HRTF and Panning (More Phase Problem)
      Complexity | Depending on the Number of Objects (2xN Convolutions) | Depending on the Number of Virtual Channels (N Convolutions) | Depending on the Ambisonics Order (M Convolutions)
      Running Memory | 360x180 HRIRs | N HRIRs | M HRIRs
      Delivery | Direct Object and Virtual Channel : a large number of objects needs to be delivered | Ambisonics : 4 signals for FOA, 9 signals for SOA
      Interaction | Direct Object and Virtual Channel : 3DOF (head orientation) and 6DOF (3DOF + positional interaction) | Ambisonics : only 3DOF
  • 42. VR Market Situation ‣ Virtual Reality in General ‣ Uncomfortable form factor ‣ Not enough display resolution ‣ Dizziness from the rendering latency (both video and audio) ‣ No unified content creation-transmission-consumption ecosystem ‣ No Killer Content ‣ Expensive Content Production : 180 degree? ‣ Couch Potato vs. 6 DOF Movement ‣ Audio ‣ Codec Issues : No server/device handles transmission of 5+ objects ‣ Neither a standard nor a de facto renderer for content creation ‣ Personalization : How to measure personalized HRIR? ‣ Different auditory system characteristics with headphones : e.g. occlusion
  • 43. MPEG-I, the 6DOF Standard for the Future Timeline (1995, 2000, 2005, 2010, 2015, 2022) : MPEG-1, MPEG-2, MPEG-4, MPEG-D, MPEG-H, MPEG-I ‣ Audio for 6DoF AR/VR Content ‣ 3D Audio : Support Rendering for Combination of Channels, Objects, and/or Ambisonics ‣ USAC : Hybrid Audio and Speech Codec ‣ SAOC : Object Mix & Control ‣ MPEG-Surround : Enhanced Channel Extension ‣ HE-AAC v.2 : Channel Extension by Parametric Stereo ‣ HE-AAC : Bandwidth Extension by Spectral Band Replication ‣ AAC : Perceptual Noise Shaping ‣ AAC : Enhanced Psychoacoustical Model ‣ MP3 : First Psychoacoustical Compression Standard
  • 44. MPEG-I, the 6DOF Standard for the Future “User’s freedom of movements for interactivity in VR space” • 3 Degrees of Freedom (3DoF) - Yaw-Pitch-Roll - Head orientation only - 360° Video and Cinematic VR - Ambisonics ‣ MPEG-H 3D Audio • 3DoF+ - 3DoF with limited Movements (typically, head movements) • 6 Degrees of Freedom (6DoF) - 3DoF + x-y-z - Head orientation + Movements - Full Interactive VR - Object-based Audio ‣ MPEG-I Immersive Audio
  • 45. MPEG-I, the 6DOF Standard for the Future • Localization of Virtual Sources ‣ Use Head-Related Transfer Function (HRTF) - From virtual source to L/R ears ‣ Realistic rendering of spatial position due to perceptual cues - ITD, ILD, IC • Directivity of Virtual Sources ‣ The perceived loudness of an object changes as the user moves around it • Ambience and Reverberation ‣ Almost all spaces impose some reverberation on sound sources - Direct Sound, Early Reflections, and Late Reverberation - Need to estimate model for AR • Occlusion, Doppler, etc…
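Of the perceptual cues listed, the interaural time difference (ITD) is the easiest to illustrate numerically. The sketch below uses Woodworth's spherical-head approximation, ITD = (a / c) (θ + sin θ), which is a classic textbook model and not something taken from the slides; the default head radius of 8.75 cm and speed of sound of 343 m/s are commonly used round values, assumed here for illustration. An HRTF measured on a real head captures this delay together with the level difference (ILD) and spectral cues.

```python
import math


def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate interaural time difference in seconds for a distant source.

    Valid for azimuths in [-90, 90] degrees in the horizontal plane; a positive
    azimuth (source on the left) gives a positive value, meaning the sound
    reaches the left ear first.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))
```

A source directly to one side gives roughly 0.66 ms under these assumptions, and a source straight ahead gives zero.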
  • 46. MPEG-I Audio Reference Architecture *N18158