MPEG for Augmented Reality 
ISMAR, September 9, 2014, Munich 
AR Standards Community Meeting September 12, 2014 
Marius Preda, MPEG 3DG Chair 
Institut Mines TELECOM 
http://www.slideshare.net/MariusPreda/mpeg-augmented-reality-tutorial
What you will learn today 
• Who MPEG is and why MPEG is doing AR
• MPEG ARAF design principles and the main features 
• Create ARAF experiences: two exercises
Tidy City
Portal Hunt
Elements
ARQuiz
Augmented Books
Event LOOV 
• Collecting virtual money in real world for buying real 
services and products 
Available on the App Store, Android stores and MyMultimediaWorld.com
Summer School (1 week) Games
What is common in these "games" ? 
Based on MPEG ARAF 
Augmented Reality Application 
Format
MPEG Augmented Reality 
Why MPEG AR?
Answers to some of Christine’s (non-technical)
questions
• Who is MPEG? 
• What does MPEG do successfully?
• Who are the members? 
• IPR policy
What is MPEG? 
A suite of ~130 ISO/IEC standards for: 
•Coding/compression of elementary media: 
• Audio (MPEG-1, 2 and 4), Video (MPEG-1, 2 and 4), 2D/3D graphics (MPEG-4) 
• Transport 
• MPEG-2 Transport, File Format, Dynamic Adaptive Streaming over HTTP (DASH) 
• Hybrid (natural & synthetic) scene description, user interaction (MPEG-4) 
• Metadata (MPEG-7) 
• Media management and protection (MPEG-21) 
• Sensors and actuators, Virtual Worlds (MPEG-V) 
• Advanced User interaction (MPEG-U) 
• Media-oriented middleware (MPEG-M) 
More ISO/IEC standards under development for 
• Coding and Delivery in Heterogeneous Environments (incl.) 
• 3DVideo 
•…
What is MPEG? 
• A standardization activity continuing for 25 years, 
– Supported by several hundred companies/organisations from ~25 countries
– ~500 experts participating in quarterly meetings 
– More than 2300 active contributors 
– Many thousands of experts working in companies
• A proven manner to organize the work to deliver useful and used standards 
– Developing standards by integrating individual technologies 
– Well defined procedures 
– Subgroups with clear objectives 
– Ad hoc groups continuing coordinated work between meetings 
• MPEG standards are widely referenced by industry 
– 3GPP, ARIB, ATSC, DVB, DVD-Forum, BDA, ETSI, SCTE, TIA, DLNA, DECE, OIPF…
• Billions of software and hardware devices built on MPEG technologies 
– MP3 players, cameras, mobile handsets, PCs, DVD/Blu-ray players, STBs, TVs, …
• Business friendly IPR policy established at ISO level
MPEG technologies related to AR: 1st pillar 
1992/4 
1997 
MPEG-1/2 
(AV content) 
1998 
VRML 
• Part 11 - BIFS: 
-Binarisation of VRML 
-Extensions for streaming 
-Extensions for server command 
-Extensions for 2D graphics 
- Real time augmentation with 
audio & video 
• Part 2 - Visual: 
- 3D Mesh compression 
- Face animation 
• Part 2 – Visual 
- Body animation 
1999 
MPEG-4 v.1 
MPEG-4 v.2 
First form of broadcast signal augmentation
MPEG technologies related to AR: 1st pillar 
2003 
MPEG-4 
•AFX 2nd Edition: 
- Animation by 
morphing 
- Multi-texturing 
2005 
• AFX 3rd Edition 
- WSS for terrain 
and cities 
- Frame based 
animation 
2007 
MPEG-4 
MPEG-4 
• Part 16 - AFX: 
- A rich set of 3D 
graphics tools 
- Compression of 
geometry, 
appearance, 
animation 
• AFX 4th Edition 
- Scalable complexity 
mesh coding 
2011 
MPEG-4 
A rich set of Scene 
and Graphics 
representation and 
compression tools
MPEG technologies related to AR: 2nd pillar 
2011 
2012 
MPEG-V - Media 
Context and Control 
2013 
2014 
• 2nd Edition: 
- GPS 
- Biosensors 
- 3D Camera 
MPEG-H 
• Compression 
of video + 
depth 
MPEG-V 
- 3D Video 
• 1st Edition 
- Sensors and 
actuators 
- Interoperability 
between Virtual 
Worlds 
• Feature-point based 
descriptors for image 
recognition 
201x 
CDVS 
MPEG-U – 
Advanced 
User Interface 
A rich set of Sensors 
and Actuators 
- 3D Audio
MPEG technologies related to AR: 2nd pillar 
MPEG-V – Media Context and Control
MPEG technologies related to AR: 2nd pillar 
Actuators 
Light 
Flash 
Heating 
Cooling 
Wind 
Vibration 
Sprayer 
Scent 
Fog 
Color correction 
Initialize color correction parameter 
Rigid body motion 
Tactile 
Kinesthetic 
Global position command 
MPEG-V – Media Context and Control 
Sensors 
Light 
Ambient noise 
Temperature 
Humidity 
Distance 
Atmospheric pressure 
Position 
Velocity 
Acceleration 
Orientation 
Angular velocity 
Angular acceleration 
Force 
Torque 
Pressure 
Motion 
Intelligent camera type 
Multi Interaction point 
Gaze tracking 
Wind 
Global position 
Altitude 
Bend 
Gas 
Dust 
Body height 
Body weight 
Body temperature 
Body fat 
Blood type 
Blood pressure 
Blood sugar 
Blood oxygen 
Heart rate 
Electrograph 
EEG , ECG, EMG, EOG , GSR 
Weather 
Facial expression 
Facial morphology 
Facial expression characteristics 
Geomagnetic
Main features of MPEG AR technologies 
• All AR-related data is available from MPEG standards 
• Real time composition of synthetic and natural objects 
• Access to 
– Remotely/locally stored scene/compressed 2D/3D mesh 
objects 
– Streamed real-time scene/compressed 2D/3D mesh objects 
• Inherent object scalability (e.g. for streaming) 
• User interaction & server generated scene changes 
• Physical context 
– Captured by a broad range of standard sensors 
– Affected by a broad range of standard actuators
MPEG vision on AR 
MPEG-4/MPEG-7/MPEG-21/ 
MPEG-U/MPEG-V 
MPEG Player 
Compression 
Authoring Tool 
Produce 
Download 
ARAF
MPEG vision on AR 
MPEG-4/MPEG-7/MPEG-21/ 
MPEG-U/MPEG-V 
ARAF Browser 
Compression 
Authoring Tool 
Produce 
Download 
ARAF
End to end chain 
ARAF 
Browser 
Media 
Servers 
Service 
Servers 
User 
Local 
Sensors & 
Actuators 
Remote 
Sensors & 
Actuators 
MPEG 
ARAF 
Local 
Real World 
Environment 
Remote 
Real World 
Environment 
Authoring 
Tools
MPEG-A Part 13 ARAF 
Three main components: scene, sensors/actuators, media 
• A set of scene graph nodes/protos as defined in MPEG-4 Part 11 
– Existing nodes : Audio, image, video, graphics, programming, communication, user 
interactivity, animation 
– New standard PROTOs : Map, MapMarker, Overlay, Local & Remote Recognition, 
Local & Remote Registration, CameraCalibration, AugmentedRegion, Point of 
Interest 
• Connection to sensors and actuators as defined in MPEG-V 
– Orientation, Position, Angular Velocity, Acceleration, GPS, Geomagnetic, Altitude 
– Local or/and remote camera sensor 
– Flash, Heating, Cooling, Wind, Sprayer, Scent, Fog, RigidBodyMotion, Kinesthetic
• Compressed media
MPEG-A Part 13 ARAF 
Scene: 73 XML Elements 
Documentation available online: 
http://wg11.sc29.org/augmentedReality/
Event LOOV: what does it look like?
MPEG-A Part 13 ARAF 
Exercises 
AR Quiz Augmented Book 
http://youtu.be/la-Oez0aaHE http://youtu.be/LXZUbAFPP-Y
MPEG-A Part 13 ARAF 
AR Quiz setting, preparing the media
images, videos, audio, 2D/3D assets
GPS location
MPEG-A Part 13 ARAF 
AR Quiz XML inspection 
http://tiny.cc/MPEGARQuiz
MPEG-A Part 13 ARAF 
AR Quiz Authoring Tool 
www.MyMultimediaWorld.com go to Create / Augmented Reality
MPEG-A Part 13 ARAF 
Augmented Book setting 
images, audio
MPEG-A Part 13 ARAF 
Augmented Book XML inspection 
http://tiny.cc/MPEGAugBook
MPEG-A Part 13 ARAF 
Augmented Book Authoring Tool 
www.MyMultimediaWorld.com go to Create / Augmented Books
Conclusions 
• ARAF Browser is Open Source 
– iOS, Android, WS, Linux 
– distributed at www.MyMultimediaWorld.com 
• ARAF V1 published early 2014 
• ARAF V2 in progress 
– Visual Search (client side and server side) 
– 3D Video, 3D Audio 
– Connection to Social Networks 
– Connection to POI servers
• Other slides that may help
MPEG 3DG Report 
ARAF 2nd Edition
MPEG 3DG Report 
ARAF 2nd Edition, items under discussion 
1. Local vs Remote recognition and tracking 
2. Social Networks 
3. 3D video 
4. 3D audio
MPEG 3DG Report 
Server side object recognition: a real system* 
Client
• Query image
• [Detection] Key points
• [Extraction] Descriptors
• HTTP POST (binary descriptor + key points)
• Parse and display the answer
Server
• Decode (binary data)
• Matching: query descriptors against DB descriptors
• ID, corresponding information
• HTTP Response: data as string, or error/no message
[DB]
• Descriptors, images and information
(Steps (1) to (10) number the arrows in the original diagram.)
* Wine recognizer : GooT and IMT
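The server-side matching step of this exchange can be sketched as follows. This is a minimal illustration and not the Wine recognizer implementation: it assumes equal-length binary descriptors (as produced by detectors such as ORB) compared by Hamming distance, and the names `hamming`, `match_query`, `max_distance` and `min_matches` are hypothetical.

```python
from typing import Dict, List, Optional


def hamming(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equal-length binary descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match_query(
    query: List[bytes],
    database: Dict[str, List[bytes]],
    max_distance: int = 64,   # assumed threshold, tune per descriptor type
    min_matches: int = 2,     # assumed minimum number of good matches
) -> Optional[str]:
    """Return the ID of the best-matching DB object, or None (the 'no message' case)."""
    best_id, best_votes = None, 0
    for object_id, db_descriptors in database.items():
        if not db_descriptors:
            continue
        votes = 0
        for q in query:
            # Brute-force nearest neighbour among this object's descriptors.
            nearest = min(hamming(q, d) for d in db_descriptors)
            if nearest <= max_distance:
                votes += 1
        if votes > best_votes:
            best_id, best_votes = object_id, votes
    return best_id if best_votes >= min_matches else None
```

A real system would replace the brute-force loop with an indexed search; the sketch only shows where the ID returned in the HTTP response comes from.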
MPEG 3DG Report 
Server side object recognition: ARAF version 
End-user Device 
MAR 
Scene 
ARAF Browser 
Video 
stream Video 
source 
Processing Server URLs 
Source 
(video URL) 
optional: 
recognition region 
Video 
stream 
Processing 
Servers 
Media data
Binary (base64) 
key points + 
descriptors 
Corresponding 
media 
DB 
Image 
Detection 
Library 
Detection 
Library 
Recognition 
Libraries 
MAR 
Experience 
Creator + 
Content 
Creator 
Large 
Image DB 
ORB
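The "Binary (base64) key points + descriptors" message sent to the processing server could be packed as in the sketch below. The byte layout is an assumption made for illustration only (a little-endian count, then per feature two float32 coordinates followed by a 32-byte descriptor); the slides do not specify the wire format, and `encode_features` / `decode_features` are hypothetical names.

```python
import base64
import struct
from typing import List, Tuple

DESCRIPTOR_SIZE = 32  # assumed: 256-bit binary descriptor, as ORB produces


def encode_features(keypoints: List[Tuple[float, float]], descriptors: List[bytes]) -> str:
    """Pack key points and descriptors into one base64 string (assumed layout)."""
    if len(keypoints) != len(descriptors):
        raise ValueError("one descriptor is expected per key point")
    parts = [struct.pack("<I", len(keypoints))]
    for (x, y), descriptor in zip(keypoints, descriptors):
        if len(descriptor) != DESCRIPTOR_SIZE:
            raise ValueError("unexpected descriptor size")
        parts.append(struct.pack("<ff", x, y))
        parts.append(descriptor)
    return base64.b64encode(b"".join(parts)).decode("ascii")


def decode_features(payload: str) -> Tuple[List[Tuple[float, float]], List[bytes]]:
    """Inverse of encode_features, as the processing server would run it."""
    raw = base64.b64decode(payload)
    (count,) = struct.unpack_from("<I", raw, 0)
    offset = 4
    keypoints, descriptors = [], []
    for _ in range(count):
        x, y = struct.unpack_from("<ff", raw, offset)
        offset += 8
        descriptors.append(raw[offset:offset + DESCRIPTOR_SIZE])
        offset += DESCRIPTOR_SIZE
        keypoints.append((x, y))
    return keypoints, descriptors
```

Sending descriptors instead of the full image keeps the request small, which is one side of the "full image or descriptors" trade-off.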
MPEG 3DG Report 
Server side object recognition: ARAF version 
Discussions on: 
- Does the content creator specify the form of the request
(full image or descriptors), or will the browser make the
best decision?
- Is the server’s answer formalized in ARAF?
MPEG 3DG Report 
ARAF – Social Network Data in ARAF scene 
Scenario: display posts from SN in a geo-localized 
manner 
ARAF can do this directly by programming the access 
to the SN service at the scene level
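A geo-localized display needs a way to select the posts close to the user. The sketch below is one possible approach, assuming each post carries a reported latitude and longitude in degrees; the great-circle distance uses the haversine formula with a mean Earth radius, and `haversine_m`, `posts_near` and the post dictionary keys are hypothetical.

```python
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def posts_near(user: Tuple[float, float], posts: List[Dict], radius_m: float) -> List[Dict]:
    """Keep only the posts whose reported location lies within radius_m of the user."""
    lat, lon = user
    return [p for p in posts if haversine_m(lat, lon, p["lat"], p["lon"]) <= radius_m]
```

The selected posts could then be placed in the scene at their reported locations.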
MPEG 3DG Report 
ARAF – Social Network Data in ARAF scene 
At minimum: user login to the SN; at maximum: the MPEG UD
MPEG 3DG Report 
ARAF – Social Network Data in ARAF scene 
Connect to a UD server to get all the necessary data
MPEG 3DG Report 
ARAF – Social Network scenario 
Two categories of “SNS Data” 
– Static data 
• Name, photo, email, phone number, address, 
sex, interest, … 
– Social Network related activity 
• Reported location, SNS post title, SNS text, SNS
media
Obtained from the UD server
MPEG 3DG Report 
ARAF 2nd Edition – introducing 3D Video 
Modeling of 3 AR classes for 3D video: 
1. Pre-created 3D model of the environment, using visual search
and other sensors to obtain camera position and orientation; 3D
video used to handle occlusions
2. No a priori 3D model of the scene, depth captured in real-time
and used to handle occlusions at the rendering step
3. No a priori model of the scene but created during AR
experience (SLAM – Simultaneous Localization and Mapping)
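For class 2, occlusion handling at the rendering step amounts to a per-pixel depth test between the captured depth and the depth of the virtual object. The sketch below assumes images given as nested lists with depths in the same unit and `None` where no virtual object is drawn; `composite_with_depth` is a hypothetical name, not an ARAF function.

```python
from typing import Any, List, Optional


def composite_with_depth(
    real_rgb: List[List[Any]],
    real_depth: List[List[float]],
    virtual_rgb: List[List[Optional[Any]]],
    virtual_depth: List[List[float]],
) -> List[List[Any]]:
    """Show a virtual pixel only where it is closer to the camera than the real surface."""
    out = []
    for r_row, rd_row, v_row, vd_row in zip(real_rgb, real_depth, virtual_rgb, virtual_depth):
        row = []
        for real, rd, virtual, vd in zip(r_row, rd_row, v_row, vd_row):
            if virtual is not None and vd < rd:
                row.append(virtual)   # virtual object is in front
            else:
                row.append(real)      # real scene occludes, or nothing to draw
        out.append(row)
    return out
```

A renderer would normally run this test on the GPU depth buffer; the loop form only makes the rule explicit.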
MPEG 3DG Report 
ARAF – introducing 3D Audio 
Spatialisation
Recognition: use sounds from the real world to trigger
events in an AR scene
MPEG 3DG Report 
ARAF – 3DAudio : local spatialisation 
MAR 
Experience 
Creator + 
Content Creator 
User location & direction + sound location 
Scene 
Mobile device 
ARAF Browser 
Video/audio 
stream 
Camera 
Coordination 
mapping 
Sensed 
data 
Position & 
orientation 
sensor 
3D Audio 
Engine 
Relative sound location + 
(Acoustic scene) + audio 
source 
Spatialized 
audio source 
Video/audio 
stream 
ARAF file 
Microphone 
Mixer 
Synthesized audio stream
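The coordinate mapping step turns "user location & direction + sound location" into a relative sound location for the 3D audio engine. A minimal 2D sketch follows, assuming positions in a local metric frame (x east, y north) and a heading in degrees clockwise from north; `relative_sound_location` is a hypothetical helper, not part of ARAF.

```python
import math
from typing import Tuple


def relative_sound_location(
    user_xy: Tuple[float, float],
    heading_deg: float,
    sound_xy: Tuple[float, float],
) -> Tuple[float, float]:
    """Return (distance in metres, azimuth in degrees) of the sound relative to the user.

    Azimuth is 0 straight ahead, positive to the right, in the range [-180, 180).
    """
    dx = sound_xy[0] - user_xy[0]
    dy = sound_xy[1] - user_xy[1]
    distance = math.hypot(dx, dy)
    # Bearing of the sound source from the user: 0 = north, 90 = east.
    bearing = math.degrees(math.atan2(dx, dy))
    # Subtract the user's heading and wrap into [-180, 180).
    azimuth = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return distance, azimuth
```

When the user turns, only `heading_deg` changes, so the azimuth can be refreshed from the orientation sensor without touching the sound positions.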
MPEG 3DG Report 
ARAF – 3DAudio : remote spatialisation 
User location & direction + sound location 
Scene 
Mobile device 
ARAF Browser 
Video/audio 
stream 
Camera 
Coordination 
mapping 
ARAF file 
Sensed 
data 
Position & 
orientation 
sensor 
video/audio 
stream 
Proxy 
Server 
3D Audio 
Engine 
Detection 
Library 
Detection 
Library 
Detection 
Library 
Relative sound location + Audio source + (Acoustic scene) 
Spatialized audio source 
MAR 
Experience 
Creator + 
Content 
Creator 
Processing Server URL 
Microphone 
Mixer 
Synthesized audio stream
MPEG 3DG Report 
ARAF – Audio recognition: local 
MAR 
Experience 
Creator + 
Content Creator 
Target Resources or descriptors 
Audio 
Detection 
Library 
Detection 
Library 
Detection 
Library 
Source (microphone/audio URL) Detection 
Scene 
Mobile device 
ARAF Browser 
Target Resources 
ID Mask 
Microphone/audio stream 
Audio 
source 
Library 
optional: detection window, 
sampling rate, detection delay
MPEG 3DG Report 
ARAF – Audio recognition: local 
MAR 
Experience 
Creator + 
Content 
Creator 
Target Resources or descriptors 
Scene 
Mobile device 
ARAF Browser 
Microphone/audio stream 
Audio 
source 
Source (microphone/audio URL) 
optional: detection window, 
sampling rate, detection delay 
Proxy 
Server 
Audio 
Detection 
Library 
Detection 
Library 
Detection 
Library 
Detection 
Library 
ID Mask 
URL of Processing Server 
Target Resources or descriptors + IDs 
+ optional detection window, sampling rate, detection delay
MPEG 3DG Report 
ARAF – Audio recognition: local 
MAR 
Experience 
Creator + 
Content 
Creator 
Target Resources or descriptors 
Target Resources or descriptors + IDs 
+ optional detection window, sampling rate, detection delay 
Scene 
Mobile device 
ARAF Browser 
Audio 
source 
Source (microphone/audio URL) 
optional: detection window, 
sampling rate, detection delay 
Processing 
Server 
Audio 
Detection 
Library 
Detection 
Library 
Detection 
Library 
Detection 
Library 
ID Mask 
URL of Processing Server 
Descriptor 
Extraction 
Microphone/audio stream Descriptors
MPEG 3DG Report 
ARAF – joint meeting with 3DAudio 
Spatialisation
• The 3D audio renderer needs an API to get the user position and
orientation
• It may be more complex to update in real time the position and
orientation of all the acoustic objects
Recognition
• MPEG-7 has several tools for audio fingerprint
• Investigate the ongoing work on “Audio synchronisation” and
check if it is suitable for AR

 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
 
DMARDs Pharmacolgy Pharm D 5th Semester.pdf
DMARDs Pharmacolgy Pharm D 5th Semester.pdfDMARDs Pharmacolgy Pharm D 5th Semester.pdf
DMARDs Pharmacolgy Pharm D 5th Semester.pdf
 
Mudde & Rovira Kaltwasser. - Populism - a very short introduction [2017].pdf
Mudde & Rovira Kaltwasser. - Populism - a very short introduction [2017].pdfMudde & Rovira Kaltwasser. - Populism - a very short introduction [2017].pdf
Mudde & Rovira Kaltwasser. - Populism - a very short introduction [2017].pdf
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
 
Introduction to Mean Field Theory(MFT).pptx
Introduction to Mean Field Theory(MFT).pptxIntroduction to Mean Field Theory(MFT).pptx
Introduction to Mean Field Theory(MFT).pptx
 
Phenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvementPhenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvement
 
in vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptxin vitro propagation of plants lecture note.pptx
in vitro propagation of plants lecture note.pptx
 

Mpeg ARAF tutorial @ ISMAR 2014

  • 1. MPEG for Augmented Reality ISMAR, September 9, 2014, Munich AR Standards Community Meeting September 12, 2014 Marius Preda, MPEG 3DG Chair Institut Mines TELECOM http://www.slideshare.net/MariusPreda/mpeg-augmented-reality-tutorial
  • 2. What you will learn today • Who is MPEG and why MPEG is doing AR • MPEG ARAF design principles and the main features • Create ARAF experiences: two exercises
  • 8. Event LOOV • Collecting virtual money in real world for buying real services and products Available on AppStore, AndroidStores and MyMultimediaWorld.com
  • 9. Summer School (1 week) Games
  • 10. What is common in these "games" ? Based on MPEG ARAF Augmented Reality Application Format
  • 11. MPEG Augmented Reality Why MPEG AR?
  • 12. Answers to (some) of Christine’s (non-technical) questions • Who is MPEG? • What MPEG does successfully? • Who are the members? • IPR policy
  • 13. What is MPEG? A suite of ~130 ISO/IEC standards for: •Coding/compression of elementary media: • Audio (MPEG-1, 2 and 4), Video (MPEG-1, 2 and 4), 2D/3D graphics (MPEG-4) • Transport • MPEG-2 Transport, File Format, Dynamic Adaptive Streaming over HTTP (DASH) • Hybrid (natural & synthetic) scene description, user interaction (MPEG-4) • Metadata (MPEG-7) • Media management and protection (MPEG-21) • Sensors and actuators, Virtual Worlds (MPEG-V) • Advanced User interaction (MPEG-U) • Media-oriented middleware (MPEG-M) More ISO/IEC standards under development for • Coding and Delivery in Heterogeneous Environments (incl.) • 3DVideo •…
  • 14. What is MPEG? • A standardization activity continuing for 25 years, – Supported by several hundred companies/organisations from ~25 countries – ~500 experts participating in quarterly meetings – More than 2300 active contributors – Many thousands of experts working in companies • A proven manner to organize the work to deliver useful and used standards – Developing standards by integrating individual technologies – Well defined procedures – Subgroups with clear objectives – Ad hoc groups continuing coordinated work between meetings • MPEG standards are widely referenced by industry – 3GPP, ARIB, ATSC, DVB, DVD-Forum, BDA, ETSI, SCTE, TIA, DLNA, DECE, OIPF… • Billions of software and hardware devices built on MPEG technologies – MP3 players, cameras, mobile handsets, PCs, DVD/Blu-ray players, STBs, TVs, … • Business friendly IPR policy established at ISO level
  • 15. MPEG technologies related to AR: 1st pillar • 1992/4: MPEG-1/2 (AV content) • 1997: VRML • 1998: MPEG-4 v.1 – Part 11 - BIFS: binarisation of VRML, extensions for streaming, extensions for server command, extensions for 2D graphics, real time augmentation with audio & video – Part 2 - Visual: 3D mesh compression, face animation • 1999: MPEG-4 v.2 – Part 2 - Visual: body animation • First form of broadcast signal augmentation
  • 16. MPEG technologies related to AR: 1st pillar 2003 MPEG-4 •AFX 2nd Edition: - Animation by morphing - Multi-texturing 2005 • AFX 3rd Edition - WSS for terrain and cities - Frame based animation 2007 MPEG-4 MPEG-4 • Part 16 - AFX: - A rich set of 3D graphics tools - Compression of geometry, appearance, animation • AFX 4th Edition - Scalable complexity mesh coding 2011 MPEG-4 A rich set of Scene and Graphics representation and compression tools
  • 17. MPEG technologies related to AR: 2nd pillar 2011 2012 MPEG-V - Media Context and Control 2013 2014 • 2nd Edition: - GPS - Biosensors - 3D Camera MPEG-H • Compression of video + depth MPEG-V - 3D Video • 1st Edition - Sensors and actuators - Interoperability between Virtual Worlds • Feature-point based descriptors for image recognition 201x CDVS MPEG-U – Advanced User Interface A rich set of Sensors and Actuators - 3D Audio
  • 18. MPEG technologies related to AR: 2nd pillar MPEG-V – Media Context and Control
  • 19. MPEG technologies related to AR: 2nd pillar Actuators Light Flash Heating Cooling Wind Vibration Sprayer Scent Fog Color correction Initialize color correction parameter Rigid body motion Tactile Kinesthetic Global position command MPEG-V – Media Context and Control Sensors Light Ambient noise Temperature Humidity Distance Atmospheric pressure Position Velocity Acceleration Orientation Angular velocity Angular acceleration Force Torque Pressure Motion Intelligent camera type Multi Interaction point Gaze tracking Wind Global position Altitude Bend Gas Dust Body height Body weight Body temperature Body fat Blood type Blood pressure Blood sugar Blood oxygen Heart rate Electrograph EEG , ECG, EMG, EOG , GSR Weather Facial expression Facial morphology Facial expression characteristics Geomagnetic
  • 20. Main features of MPEG AR technologies • All AR-related data is available from MPEG standards • Real time composition of synthetic and natural objects • Access to – Remotely/locally stored scene/compressed 2D/3D mesh objects – Streamed real-time scene/compressed 2D/3D mesh objects • Inherent object scalability (e.g. for streaming) • User interaction & server generated scene changes • Physical context – Captured by a broad range of standard sensors – Affected by a broad range of standard actuators
  • 21. MPEG vision on AR MPEG-4/MPEG-7/MPEG-21/ MPEG-U/MPEG-V MPEG Player Compression Authoring Tool Produce Download ARAF
  • 22. MPEG vision on AR MPEG-4/MPEG-7/MPEG-21/ MPEG-U/MPEG-V ARAF Browser Compression Authoring Tool Produce Download ARAF
  • 23. End to end chain ARAF Browser Media Servers Service Servers User Local Sensors & Actuators Remote Sensors & Actuators MPEG ARAF Local Real World Environment Remote Real World Environment Authoring Tools
  • 24. MPEG-A Part 13 ARAF Three main components: scene, sensors/actuators, media • A set of scene graph nodes/protos as defined in MPEG-4 Part 11 – Existing nodes: Audio, image, video, graphics, programming, communication, user interactivity, animation – New standard PROTOs: Map, MapMarker, Overlay, Local & Remote Recognition, Local & Remote Registration, CameraCalibration, AugmentedRegion, Point of Interest • Connection to sensors and actuators as defined in MPEG-V – Orientation, Position, Angular Velocity, Acceleration, GPS, Geomagnetic, Altitude – Local and/or remote camera sensor – Flash, Heating, Cooling, Wind, Sprayer, Scent, Fog, RigidBodyMotion, Kinesthetic • Compressed media
  • 25. MPEG-A Part 13 ARAF Scene: 73 XML Elements Documentation available online: http://wg11.sc29.org/augmentedReality/
  • 26. Event LOOV: what does it look like?
  • 27. MPEG-A Part 13 ARAF Exercises AR Quiz Augmented Book
  • 28. MPEG-A Part 13 ARAF Exercises AR Quiz Augmented Book http://youtu.be/la-Oez0aaHE http://youtu.be/LXZUbAFPP-Y
  • 29. MPEG-A Part 13 ARAF AR Quiz setting: preparing the media (images, videos, audio, 2D/3D assets) and the GPS location
  • 30. MPEG-A Part 13 ARAF AR Quiz XML inspection http://tiny.cc/MPEGARQuiz
  • 31. MPEG-A Part 13 ARAF AR Quiz Authoring Tool www.MyMultimediaWorld.com go to Create / Augmented Reality
  • 32. MPEG-A Part 13 ARAF Augmented Book setting: images, audio
  • 33. MPEG-A Part 13 ARAF Augmented Book XML inspection http://tiny.cc/MPEGAugBook
  • 34. MPEG-A Part 13 ARAF Augmented Book Authoring Tool www.MyMultimediaWorld.com go to Create / Augmented Books
  • 35. Conclusions • ARAF Browser is Open Source – iOS, Android, WS, Linux – distributed at www.MyMultimediaWorld.com • ARAF V1 published early 2014 • ARAF V2 in progress – Visual Search (client side and server side) – 3D Video, 3D Audio – Connection to Social Networks – Connection to POI servers
  • 36. • Other slides that may help
  • 37. MPEG 3DG Report ARAF 2nd Edition
  • 38. MPEG 3DG Report ARAF 2nd Edition, items under discussion 1. Local vs Remote recognition and tracking 2. Social Networks 3. 3D video 4. 3D audio
  • 39. MPEG 3DG Report Server side object recognition: a real system* [Diagram] Client: query image, [Detection] key points, [Extraction] descriptors, HTTP POST (binary descriptor + key points, binary data). Server: decode, match query descriptors against DB descriptors (descriptors, images and information [DB]), return the matching ID and the corresponding information, or an error/no message, in the HTTP response (data as string). Client: decode, parse and display the answer. * Wine recognizer: GooT and IMT
  • 40. MPEG 3DG Report Server side object recognition: ARAF version End-user Device MAR Scene ARAF Browser Video stream Video source Processing Server URLs Source (video URL) optional: recognition region Video stream Processing Servers Media data Binary (base64) key points + descriptors Corresponding media DB Image Detection Library Detection Library Recognition Libraries MAR Experience Creator + Content Creator Large Image DB ORB
  • 41. MPEG 3DG Report Server side object recognition: ARAF version Discussions on: - Does the content creator specify the form of the request (full image or descriptors), or will the browser make the best decision? - Is the server's answer formalized in ARAF?
  • 42. MPEG 3DG Report ARAF – Social Network Data in ARAF scene Scenario: display posts from SN in a geo-localized manner ARAF can do this directly by programming the access to the SN service at the scene level
  • 43. MPEG 3DG Report ARAF – Social Network Data in ARAF scene At minimum: user login to the SN; at maximum: the MPEG UD
  • 44. MPEG 3DG Report ARAF – Social Network Data in ARAF scene Connect to a UD server to get all the necessary data
  • 45. MPEG 3DG Report ARAF – Social Network scenario Two categories of “SNS Data” – Static data • Name, photo, email, phone number, address, sex, interest, … – Social Network related activity • Reported location, SNS post title, SNS text, SNS media Obtained from the UD server
  • 46. MPEG 3DG Report ARAF 2nd Edition – introducing 3D Video Modeling of 3 AR classes for 3D video: 1. Pre-created 3D model of the environment, using visual search and other sensors to obtain camera position and orientation; 3D video used to handle occlusions 2. No a priori 3D model of the scene; depth captured in real time and used to handle occlusions at the rendering step 3. No a priori model of the scene, but one is created during the AR experience (SLAM – Simultaneous Localization and Mapping)
  • 47. MPEG 3DG Report ARAF – introducing 3D Audio Spatialisation Recognition Use sounds from the real world to trigger events in an AR scene
  • 48. MPEG 3DG Report ARAF – 3DAudio : local spatialisation MAR Experience Creator + Content Creator User location & direction + sound location Scene Mobile device ARAF Browser Video/audio stream Camera Coordination mapping Sensed data Position & orientation sensor 3D Audio Engine Relative sound location + (Acoustic scene) + audio source Spatialized audio source Video/audio stream ARAF file Microphone Mixer Synthesized audio stream
  • 49. MPEG 3DG Report ARAF – 3DAudio : remote spatialisation User location & direction + sound location Scene Mobile device ARAF Browser Video/audio stream Camera Coordination mapping ARAF file Sensed data Position & orientation sensor video/audio stream Proxy Server 3D Audio Engine Detection Library Detection Library Detection Library Relative sound location + Audio source + (Acoustic scene) Spatialized audio source MAR Experience Creator + Content Creator Processing Server URL Microphone Mixer Synthesized audio stream
  • 50. MPEG 3DG Report ARAF – Audio recognition: local MAR Experience Creator + Content Creator Target Resources or descriptors Audio Detection Library Detection Library Detection Library Source (microphone/audio URL) Detection Scene Mobile device ARAF Browser Target Resources ID Mask Microphone/audio stream Audio source Library optional: detection window, sampling rate, detection delay
  • 51. MPEG 3DG Report ARAF – Audio recognition: remote MAR Experience Creator + Content Creator Target Resources or descriptors Scene Mobile device ARAF Browser Microphone/audio stream Audio source Source (microphone/audio URL) optional: detection window, sampling rate, detection delay Proxy Server Audio Detection Library Detection Library Detection Library Detection Library ID Mask URL of Processing Server Target Resources or descriptors + IDs + optional detection window, sampling rate, detection delay
  • 52. MPEG 3DG Report ARAF – Audio recognition: remote MAR Experience Creator + Content Creator Target Resources or descriptors Target Resources or descriptors + IDs + optional detection window, sampling rate, detection delay Scene Mobile device ARAF Browser Audio source Source (microphone/audio URL) optional: detection window, sampling rate, detection delay Processing Server Audio Detection Library Detection Library Detection Library Detection Library ID Mask URL of Processing Server Descriptor Extraction Microphone/audio stream Descriptors
  • 53. MPEG 3DG Report ARAF – joint meeting with 3DAudio Spatialisation Recognition • The 3D audio renderer needs an API to get the user position and orientation • It may be more complex to update the position and orientation of all the acoustic objects in real time • MPEG-7 has several tools for audio fingerprinting • Investigate the ongoing work on “Audio synchronisation” and check if it is suitable for AR
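
The spatialisation slides describe the scene passing the 3D audio engine the sound location relative to the user, derived from the user location and direction. A minimal sketch of that mapping follows. It is not part of ARAF or of the 3D Audio reference software: the function names, the axis convention (world x/y horizontal, z up) and the use of a single yaw angle for the user orientation are all assumptions made for illustration.

```python
import math


def relative_source(user_pos, user_yaw, source_pos):
    """Express a sound source in the listener's frame.

    Assumed convention (hypothetical, not from ARAF): world axes are
    x, y horizontal and z up; user_yaw is the heading in radians,
    measured counterclockwise from the world +x axis. The result is
    (forward, left, up) relative to the listener.
    """
    dx = source_pos[0] - user_pos[0]
    dy = source_pos[1] - user_pos[1]
    dz = source_pos[2] - user_pos[2]
    c, s = math.cos(user_yaw), math.sin(user_yaw)
    # Rotate the world-frame offset by -yaw to undo the user's heading.
    forward = c * dx + s * dy
    left = -s * dx + c * dy
    return (forward, left, dz)


def azimuth_and_distance(rel):
    """Azimuth (radians, positive to the left) and distance of a relative position."""
    forward, left, up = rel
    azimuth = math.atan2(left, forward)
    distance = math.sqrt(forward ** 2 + left ** 2 + up ** 2)
    return azimuth, distance


if __name__ == "__main__":
    # User at the origin facing world +y; source two units along +y.
    rel = relative_source((0.0, 0.0, 0.0), math.pi / 2, (0.0, 2.0, 0.0))
    print(rel, azimuth_and_distance(rel))
```

A scene would recompute these relative positions whenever the position and orientation sensors report a change, and hand them to the audio engine together with the audio sources.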

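
The server side recognition slides (39 to 41) mention a query made of key points and binary descriptors, sent base64-encoded and matched on the server against database descriptors to return an ID. The sketch below illustrates that idea only. The byte layout, the function names, the 32-byte descriptor size (typical for ORB) and the voting threshold are hypothetical choices, not the format defined by ARAF or used by the wine recognizer.

```python
import base64
import struct

DESCRIPTOR_BYTES = 32  # assumed: 256-bit binary descriptors, as ORB produces


def pack_query(keypoints, descriptors):
    """Serialize key points and descriptors, then base64-encode them.

    Hypothetical layout: uint32 feature count, then for each feature
    two float32 coordinates followed by DESCRIPTOR_BYTES bytes.
    """
    if len(keypoints) != len(descriptors):
        raise ValueError("one descriptor per key point expected")
    parts = [struct.pack("<I", len(keypoints))]
    for (x, y), desc in zip(keypoints, descriptors):
        if len(desc) != DESCRIPTOR_BYTES:
            raise ValueError("unexpected descriptor length")
        parts.append(struct.pack("<ff", x, y) + bytes(desc))
    return base64.b64encode(b"".join(parts)).decode("ascii")


def unpack_query(payload):
    """Inverse of pack_query, as a server would run it."""
    raw = base64.b64decode(payload)
    (count,) = struct.unpack_from("<I", raw, 0)
    offset = 4
    keypoints, descriptors = [], []
    for _ in range(count):
        keypoints.append(struct.unpack_from("<ff", raw, offset))
        offset += 8
        descriptors.append(raw[offset:offset + DESCRIPTOR_BYTES])
        offset += DESCRIPTOR_BYTES
    return keypoints, descriptors


def hamming(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")


def best_match(query_descriptors, database, max_distance=40):
    """Return the ID whose descriptors collect the most close matches, or None."""
    too_far = DESCRIPTOR_BYTES * 8 + 1
    best_id, best_votes = None, 0
    for target_id, target_descriptors in database.items():
        votes = sum(
            1
            for q in query_descriptors
            if min((hamming(q, t) for t in target_descriptors), default=too_far)
            <= max_distance
        )
        if votes > best_votes:
            best_id, best_votes = target_id, votes
    return best_id
```

Brute-force Hamming matching keeps the example short; a server with a large image database would normally use an index instead.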
Editor's Notes

  1. Passing On, Treasure Hunt, Castle Quest, Arduinnae, Castle Crisis
  2. Head tracking is needed to render the audio. 3DAudio can be used to modulate the audio perception with respect to the user position and orientation. Currently a similar approach is used on the production side, but it can also be used on the user side (in real time). The 3D position and orientation of the graphical objects (enriched with audio) are known and should be forwarded to the 3D audio engine. The relative positions between the sources and the user are preferred. Draw a diagram showing that the scene sends the 3D audio engine the relative position of all the sources and gets back the sound for the headphones. A reference software implementation exists but works on files. The chain is the following: (1) 3D decoder (multi-channel); some of the outputs are objects and higher-order ambisonics. (2) Object renderer. The 3D coordinates are included as metadata in the bitstream, but an entry point can be added in the Object Renderer that takes its input from the scene.