ARMIN MUSTAFA
ROYAL ACADEMY OF ENGINEERING RESEARCH FELLOW
4D Vision for Dynamic Scene Understanding
4D VISION FOR DYNAMIC SCENE UNDERSTANDING
ARMIN MUSTAFA
What is 4D Vision?
Multi-view video → 3D → 4D
Spatio-temporally coherent models
Why 4D Vision?
Robotics
Computer Animation
Computer Graphics
Medical Imaging
Virtual Reality
Digital Media
Why 4D Vision?
4D Vision enables machine perception
o Autonomous machine perception for:
   o Online video-rate content capture
   o Tools for content production in films (e.g. automatic rotoscoping)
   o Intelligent, sophisticated next-generation gaming
   o VR/AR/MR (e.g. holoportation for general scenes, virtual tourism, immersive storytelling)
4D Vision - Challenges
Input:
o Uncalibrated wide-baseline multi-views from static/moving cameras
o Challenging outdoor scenes:
o Large capture volume
o Natural scene backgrounds
o Uncontrolled illumination and repetitive texture
o Fast dynamic scene motion
4D Vision - Challenges
o Temporally coherent reconstruction of complex dynamic scenes.
o Unknown background, structure and segmentation.
No prior information
Moving cameras
Multi-view scene
Framework
Object identification
Temporal coherence
4D scene reconstruction and segmentation
4D Vision – Overview
Contributions to 4D Vision
General Scene Reconstruction
Temporally Coherent Reconstruction
4D Light-field Video
Non-sequential Alignment
Semantic Reconstruction
General scene reconstruction (3D)
General Dynamic Scene Reconstruction from Multiple View Video.
A. Mustafa, H. Kim, J-Y. Guillemaut and A. Hilton
International Conference on Computer Vision (ICCV) 2015
Multi-scale Segmentation based Features for Wide-baseline Scene Reconstruction.
A. Mustafa, H. Kim and A. Hilton
IEEE Transactions on Image Processing (TIP) 2018
Existing methods - Problems
Segmentation Depth map
o Requires accurate segmentation of dynamic foreground objects
o Known background and structure
J-Y. Guillemaut and A. Hilton. Joint Multi-Layer Segmentation and Reconstruction for Free-Viewpoint Video Applications. IJCV 2010
Contributions
o Unsupervised dense reconstruction of general scenes without priors.
o Robust joint refinement of reconstruction and segmentation.
Scene level: Multi-view data → Feature Detection → Feature Matching → Sparse Reconstruction
Framework – General scene reconstruction
Scene level: Multi-view data → Feature Detection → Feature Matching → Sparse Reconstruction → Object Clustering
Figure panels: Sparse point cloud, Clustering
Framework – General scene reconstruction
Scene level: Multi-view data → Feature Detection → Feature Matching → Sparse Reconstruction → Object Clustering
Object level: Initial coarse reconstruction
Framework – General scene reconstruction
Scene level: Multi-view data → Feature Detection → Feature Matching → Sparse Reconstruction → Object Clustering
Object level: Initial coarse reconstruction → Refinement
o Joint segmentation and reconstruction optimization
o Photo-consistency, smoothness and contrast constraints
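The slide names the constraints but does not spell out the refinement objective. As a rough sketch only, a joint labelling-and-depth energy with these three constraints is often written as a weighted sum. The weights \lambda and the exact form of each term below are assumptions for illustration and are not taken from the talk:

```latex
% l: per-pixel segmentation label, d: per-pixel depth (symbols as used in E(l,d) later in the talk)
% \lambda_{photo}, \lambda_{smooth}, \lambda_{contrast}: hypothetical weights
E(l, d) = \lambda_{photo}\, E_{photo}(d)
        + \lambda_{smooth}\, E_{smooth}(l, d)
        + \lambda_{contrast}\, E_{contrast}(l)
```

Here E_{photo} would penalise depths whose projections disagree in colour across views, E_{smooth} would penalise depth or label jumps between neighbouring pixels, and E_{contrast} would encourage label boundaries to follow image edges.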
Framework – General scene reconstruction
General dynamic reconstruction
o A method to segment and reconstruct dynamic objects with improved quality
o No prior information on background appearance or structure
Limitations
o Per-frame inconsistent reconstruction and segmentation
o The quality of results is far from perfect
Temporally coherent general scene reconstruction
Temporally coherent 4D reconstruction of complex dynamic scenes
A. Mustafa, H. Kim, J-Y. Guillemaut and A. Hilton
Conference on Computer Vision and Pattern Recognition (CVPR) 2016
Contributions
o Temporally coherent general scene reconstruction and segmentation.
o Improved joint refinement by introducing geodesic star-convexity.
Previous frame mesh
Optical flow
Dense temporal correspondence
Initial coarse reconstruction
Final mesh
Temporal coherence
Input
4D Reconstruction
Limitations
o The segmentation and reconstruction quality is not perfect
What we want! What we get!
Semantically Coherent Co-segmentation and Reconstruction
Semantically coherent co-segmentation and reconstruction of dynamic scenes
A. Mustafa and A. Hilton
Conference on Computer Vision and Pattern Recognition (CVPR) 2017
Contributions
o Semantic co-segmentation and reconstruction of complex scenes
o Temporal semantic coherence across sequence
Framework – Semantically coherent reconstruction
Multi-view data
Framework – Semantically coherent reconstruction
Scene level: Multi-view data → Initial Semantic Segmentation
FCNs produce segmentations with poorly localized object boundaries
Framework – Semantically coherent reconstruction
Scene level: Multi-view data → Initial Semantic Segmentation → Sparse Reconstruction → Object Clustering
Object level: Initial Semantic 3D Reconstruction
Framework – Semantically coherent reconstruction
Scene level: Multi-view data → Initial Semantic Segmentation → Sparse Reconstruction → Object Clustering
Framework – Semantically coherent reconstruction
Scene level: Multi-view data → Initial Semantic Segmentation → Sparse Reconstruction → Object Clustering
Object level: Initial Semantic 3D Reconstruction → Semantic Tracklets
Introduce temporal and semantic coherence
Semantic tracklets
S(i,j) = \frac{1}{3N} \sum_{c=1}^{N} (A_m + S_m + L_m)
where N is the number of views,
A_m is the measure of appearance similarity,
L_m is the measure of class labels in the semantic segmentation region.
All frames with similarity > 0.75 are selected to form a semantic tracklet.
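The similarity score and the 0.75 threshold above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the per-view terms are assumed to be already computed and normalised to [0, 1], and the middle term (S_m) is passed through as given because the slide does not define it.

```python
from typing import Dict, List, Sequence


def frame_similarity(a_m: Sequence[float],
                     s_m: Sequence[float],
                     l_m: Sequence[float]) -> float:
    """Average the three per-view terms over N views: 1/(3N) * sum(A_m + S_m + L_m).

    a_m: per-view appearance similarity.
    s_m: per-view second term (not defined on the slide; assumed precomputed).
    l_m: per-view class-label agreement in the semantic segmentation region.
    """
    n_views = len(a_m)
    if n_views == 0 or len(s_m) != n_views or len(l_m) != n_views:
        raise ValueError("all three inputs need one value per view")
    total = sum(a + s + l for a, s, l in zip(a_m, s_m, l_m))
    return total / (3 * n_views)


def select_tracklet(scores: Dict[int, float], threshold: float = 0.75) -> List[int]:
    """Return the frame indices whose similarity strictly exceeds the threshold."""
    return [frame for frame, score in sorted(scores.items()) if score > threshold]


if __name__ == "__main__":
    # Hypothetical similarity scores of four frames against a reference frame.
    scores = {11: 0.80, 26: 0.75, 42: 0.90, 56: 0.30}
    print(select_tracklet(scores))
```

Note the strict comparison: a frame scoring exactly 0.75 is left out, matching the "> 0.75" wording on the slide.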
Framework – Semantically coherent reconstruction
Scene level: Multi-view data → Initial Semantic Segmentation → Sparse Reconstruction → Object Clustering
Object level: Initial Semantic 3D Reconstruction → Semantic Tracklets → Refinement
E(l,d) = δ Esemantic(l,d) + + + +
Framework – Semantically coherent reconstruction
Frame 11
Frame 42
Frame 26
Frame 56
Framework – Semantically coherent reconstruction
Multi-view data
Initial Semantic Segmentation
Semantically Coherent Segmentation
Results and Evaluation
CVPR16 Proposed Input
Results and Evaluation
Original videos Semantic reconstruction
Semantic co-segmentation Segmentation comparison
Results and Evaluation
Input videos Semantic reconstruction
Semantic co-segmentation
Results and Evaluation
Input videos Semantically coherent reconstruction
Semantic segmentation comparison
o Semantic co-segmentation and reconstruction of dynamic scenes
o Temporal semantic coherence enforced by semantic tracklets
o Improved segmentation and reconstruction of dynamic scenes
Semantically coherent reconstruction - Conclusions
Original Image
Frame 195
Results – 4D Reconstruction
Dance dataset Juggler dataset
Limitations
o Sequential alignment is prone to errors due to drift and large complex motions
Frame 26 Frame 86 Frame 86
What we want!
4D match trees for non-sequential surface alignment
4D Match Trees for Non-rigid Surface Alignment
A. Mustafa, H. Kim and A. Hilton
European Conference on Computer Vision (ECCV) 2016
Contributions
o Robust global 4D alignment of partial reconstructions of non-rigid shape
o Sparse matching between wide-timeframe image pairs using SFD
o 4D Match Trees to represent the optimal non-sequential alignment path
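To make the idea of a non-sequential alignment path concrete, the sketch below builds a tree over frames from a pairwise dissimilarity matrix and reads off the path from a root frame to any other frame. Using a minimum spanning tree (Prim's algorithm) is an assumption made for illustration, as are the function names and the cost values; the slides do not state how the 4D Match Tree is constructed or how frame dissimilarity is measured.

```python
import heapq
from typing import List, Sequence


def build_match_tree(cost: Sequence[Sequence[float]], root: int = 0) -> List[int]:
    """Prim's minimum spanning tree over a symmetric frame-to-frame cost matrix.

    Returns parent[i] for every frame i; the root has parent -1.
    """
    n = len(cost)
    parent = [-1] * n
    in_tree = [False] * n
    heap = [(0.0, root, -1)]  # (edge cost, frame, parent frame)
    while heap:
        _, node, par = heapq.heappop(heap)
        if in_tree[node]:
            continue  # already connected through a cheaper edge
        in_tree[node] = True
        parent[node] = par
        for nxt in range(n):
            if not in_tree[nxt]:
                heapq.heappush(heap, (cost[node][nxt], nxt, node))
    return parent


def alignment_path(parent: Sequence[int], frame: int) -> List[int]:
    """Frames visited when aligning from the root to `frame` along the tree."""
    path = [frame]
    while parent[path[-1]] != -1:
        path.append(parent[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    # Hypothetical dissimilarities: frame 3 resembles frame 0 (e.g. a repeated pose).
    cost = [
        [0, 2, 5, 1],
        [2, 0, 2, 5],
        [5, 2, 0, 2],
        [1, 5, 2, 0],
    ]
    tree = build_match_tree(cost, root=0)
    print(alignment_path(tree, 3))
    print(alignment_path(tree, 2))
```

With these made-up costs, frame 3 is aligned directly from frame 0 rather than through frames 1 and 2, which is the kind of shortcut that avoids accumulating drift along a purely sequential chain.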
Multi-view video + Surface → Wide-timeframe sparse matches → 4D Match Tree → Dense correspondence → 4D scene reconstruction
Frame 1
Results – Non-sequential alignment
Results – Non-sequential alignment
Frame 1 Frame 1
4D Light-field Video
4D Temporally Coherent Light-field Video
A. Mustafa, M. Volino, J-Y. Guillemaut and A. Hilton
International Conference on 3D Vision (3DV) 2017
ALIVE – 4D Light-field Video
• Addresses limitations of 360° video
• Introduces light-fields for immersive virtual experiences
“Kinch and the double world” – Figment Cinematic VR Experience
SIGGRAPH 2018 VR Festival
Film festivals (Raindance, Strasbourg)
4D Light-field Video
• 4D Temporally coherent light-field video for dynamic scenes
• Light-field scene flow using Epipolar Plane Images (EPIs)
• Efficient light-field representations for live action VR
4D Light-field Video
Light-field video Camera 2 video
Light-field scene flow 4D light-field video
4D Vision - Summary
I. Outdoor 3D and Scene Understanding
   I. SFD features for wide-baseline reconstruction [3DV 2015, TIP 2018]
   II. Unsupervised general scene reconstruction [ICCV 2015]
   III. Semantic reconstruction and segmentation [CVPR 2017]
II. 3D video to 4D models
   I. Temporally coherent general scene reconstruction [CVPR 2016]
   II. Non-sequential alignment [ECCV 2016]
   III. 4D light-field video for virtual reality [3DV 2017]
4D Vision - Spatio-temporally coherent models from video
Future work: 4D Vision for perceptive machines
o Robust machine perception of general dynamic scenes from video
4D Vision for Perceptive Machines: Reconstruction, Registration, Machine Learning, Artificial Intelligence, Recognition
THANK YOU!

More Related Content

Similar to Armin mustafa talk_08.11.18_a_imeetup

Video Stitching using Improved RANSAC and SIFT
Video Stitching using Improved RANSAC and SIFTVideo Stitching using Improved RANSAC and SIFT
Video Stitching using Improved RANSAC and SIFTIRJET Journal
 
Deep Learning for Structure-from-Motion (SfM)
Deep Learning for Structure-from-Motion (SfM)Deep Learning for Structure-from-Motion (SfM)
Deep Learning for Structure-from-Motion (SfM)PetteriTeikariPhD
 
Fisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVFisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVYu Huang
 
Fisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingFisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingYu Huang
 
Deep VO and SLAM
Deep VO and SLAMDeep VO and SLAM
Deep VO and SLAMYu Huang
 
SkyStitch: a Cooperative Multi-UAV-based Real-time Video Surveillance System ...
SkyStitch: a Cooperative Multi-UAV-based Real-time Video Surveillance System ...SkyStitch: a Cooperative Multi-UAV-based Real-time Video Surveillance System ...
SkyStitch: a Cooperative Multi-UAV-based Real-time Video Surveillance System ...Kitsukawa Yuki
 
Machine learning for newbies
Machine learning for newbiesMachine learning for newbies
Machine learning for newbiesAndrew Nikishaev
 
“Tools for Creating Next-Gen Computer Vision Apps on Snapdragon,” a Presentat...
“Tools for Creating Next-Gen Computer Vision Apps on Snapdragon,” a Presentat...“Tools for Creating Next-Gen Computer Vision Apps on Snapdragon,” a Presentat...
“Tools for Creating Next-Gen Computer Vision Apps on Snapdragon,” a Presentat...Edge AI and Vision Alliance
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
CATalkOnline.ppt
CATalkOnline.pptCATalkOnline.ppt
CATalkOnline.pptSamar954063
 
Optical Computing for Fast Light Transport Analysis
Optical Computing for Fast Light Transport AnalysisOptical Computing for Fast Light Transport Analysis
Optical Computing for Fast Light Transport AnalysisMatthew O'Toole
 
Fisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving VFisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving VYu Huang
 
Project Hammerhead - A 360° VR Visual Field Enhancement System
Project Hammerhead - A 360° VR Visual Field Enhancement SystemProject Hammerhead - A 360° VR Visual Field Enhancement System
Project Hammerhead - A 360° VR Visual Field Enhancement SystemYuval Shubert
 
Real-time animated digital doubles at Eisko
Real-time animated digital doubles at EiskoReal-time animated digital doubles at Eisko
Real-time animated digital doubles at EiskoEiskoDigitalDoubles
 

Similar to Armin mustafa talk_08.11.18_a_imeetup (17)

Video Stitching using Improved RANSAC and SIFT
Video Stitching using Improved RANSAC and SIFTVideo Stitching using Improved RANSAC and SIFT
Video Stitching using Improved RANSAC and SIFT
 
Raskar Computational Camera Fall 2009 Lecture 01
Raskar Computational Camera Fall 2009 Lecture 01Raskar Computational Camera Fall 2009 Lecture 01
Raskar Computational Camera Fall 2009 Lecture 01
 
Deep Learning for Structure-from-Motion (SfM)
Deep Learning for Structure-from-Motion (SfM)Deep Learning for Structure-from-Motion (SfM)
Deep Learning for Structure-from-Motion (SfM)
 
Physics 4 d
Physics 4 dPhysics 4 d
Physics 4 d
 
AR/SLAM for end-users
AR/SLAM for end-usersAR/SLAM for end-users
AR/SLAM for end-users
 
Fisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVFisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IV
 
Fisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingFisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous Driving
 
Deep VO and SLAM
Deep VO and SLAMDeep VO and SLAM
Deep VO and SLAM
 
SkyStitch: a Cooperative Multi-UAV-based Real-time Video Surveillance System ...
SkyStitch: a Cooperative Multi-UAV-based Real-time Video Surveillance System ...SkyStitch: a Cooperative Multi-UAV-based Real-time Video Surveillance System ...
SkyStitch: a Cooperative Multi-UAV-based Real-time Video Surveillance System ...
 
Machine learning for newbies
Machine learning for newbiesMachine learning for newbies
Machine learning for newbies
 
“Tools for Creating Next-Gen Computer Vision Apps on Snapdragon,” a Presentat...
“Tools for Creating Next-Gen Computer Vision Apps on Snapdragon,” a Presentat...“Tools for Creating Next-Gen Computer Vision Apps on Snapdragon,” a Presentat...
“Tools for Creating Next-Gen Computer Vision Apps on Snapdragon,” a Presentat...
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 
CATalkOnline.ppt
CATalkOnline.pptCATalkOnline.ppt
CATalkOnline.ppt
 
Optical Computing for Fast Light Transport Analysis
Optical Computing for Fast Light Transport AnalysisOptical Computing for Fast Light Transport Analysis
Optical Computing for Fast Light Transport Analysis
 
Fisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving VFisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving V
 
Project Hammerhead - A 360° VR Visual Field Enhancement System
Project Hammerhead - A 360° VR Visual Field Enhancement SystemProject Hammerhead - A 360° VR Visual Field Enhancement System
Project Hammerhead - A 360° VR Visual Field Enhancement System
 
Real-time animated digital doubles at Eisko
Real-time animated digital doubles at EiskoReal-time animated digital doubles at Eisko
Real-time animated digital doubles at Eisko
 

More from Peter Bloomfield

Ai for urban traffic control neil walton_2020
Ai for urban traffic control neil walton_2020Ai for urban traffic control neil walton_2020
Ai for urban traffic control neil walton_2020Peter Bloomfield
 
Geospatial intelligence satellite applications catapult pdf - july 23 2019
Geospatial intelligence   satellite applications catapult pdf - july 23 2019Geospatial intelligence   satellite applications catapult pdf - july 23 2019
Geospatial intelligence satellite applications catapult pdf - july 23 2019Peter Bloomfield
 
Prem Gill Seals From Space
Prem Gill Seals From SpacePrem Gill Seals From Space
Prem Gill Seals From SpacePeter Bloomfield
 
David Petit Deimos presentation EO
David Petit Deimos presentation EODavid Petit Deimos presentation EO
David Petit Deimos presentation EOPeter Bloomfield
 
Cray mi garage av event march 28 2019 pdf
Cray mi garage av event march 28 2019 pdfCray mi garage av event march 28 2019 pdf
Cray mi garage av event march 28 2019 pdfPeter Bloomfield
 
5 g vehicular_comms_katsaros
5 g vehicular_comms_katsaros5 g vehicular_comms_katsaros
5 g vehicular_comms_katsarosPeter Bloomfield
 
Tsc cav@digital catapult_march2019
Tsc cav@digital catapult_march2019Tsc cav@digital catapult_march2019
Tsc cav@digital catapult_march2019Peter Bloomfield
 
Cyanapse talk photorealisticf_ilters_migaragemeetup_7nov2018
Cyanapse talk photorealisticf_ilters_migaragemeetup_7nov2018Cyanapse talk photorealisticf_ilters_migaragemeetup_7nov2018
Cyanapse talk photorealisticf_ilters_migaragemeetup_7nov2018Peter Bloomfield
 
Caspian machine learning garage
Caspian machine learning garageCaspian machine learning garage
Caspian machine learning garagePeter Bloomfield
 

More from Peter Bloomfield (11)

Ai for urban traffic control neil walton_2020
Ai for urban traffic control neil walton_2020Ai for urban traffic control neil walton_2020
Ai for urban traffic control neil walton_2020
 
Geospatial intelligence satellite applications catapult pdf - july 23 2019
Geospatial intelligence   satellite applications catapult pdf - july 23 2019Geospatial intelligence   satellite applications catapult pdf - july 23 2019
Geospatial intelligence satellite applications catapult pdf - july 23 2019
 
Prem Gill Seals From Space
Prem Gill Seals From SpacePrem Gill Seals From Space
Prem Gill Seals From Space
 
David Petit Deimos presentation EO
David Petit Deimos presentation EODavid Petit Deimos presentation EO
David Petit Deimos presentation EO
 
Cray mi garage av event march 28 2019 pdf
Cray mi garage av event march 28 2019 pdfCray mi garage av event march 28 2019 pdf
Cray mi garage av event march 28 2019 pdf
 
5 g vehicular_comms_katsaros
5 g vehicular_comms_katsaros5 g vehicular_comms_katsaros
5 g vehicular_comms_katsaros
 
Tsc cav@digital catapult_march2019
Tsc cav@digital catapult_march2019Tsc cav@digital catapult_march2019
Tsc cav@digital catapult_march2019
 
Cyanapse talk photorealisticf_ilters_migaragemeetup_7nov2018
Cyanapse talk photorealisticf_ilters_migaragemeetup_7nov2018Cyanapse talk photorealisticf_ilters_migaragemeetup_7nov2018
Cyanapse talk photorealisticf_ilters_migaragemeetup_7nov2018
 
Yossarian 2018 intro
Yossarian 2018 introYossarian 2018 intro
Yossarian 2018 intro
 
Caspian machine learning garage
Caspian machine learning garageCaspian machine learning garage
Caspian machine learning garage
 
Pablo Suau - DWP Digital
Pablo Suau - DWP DigitalPablo Suau - DWP Digital
Pablo Suau - DWP Digital
 

Recently uploaded

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 

Recently uploaded (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Armin mustafa talk_08.11.18_a_imeetup

  • 1. ARMIN MUSTAFA ROYAL ACADEMY OF ENGINEERING RESEARCH FELLOW 4D Vision for Dynamic Scene Understanding
  • 2. 4D VISION FOR DYNAMIC SCENE UNDERSTANDING ARMIN MUSTAFA What is 4D Vision? Multi-view video 3D 4D Spatio-temporally coherent models
  • 3. Why 4D Vision? 4D VISION FOR DYNAMIC SCENE UNDERSTANDING ARMIN MUSTAFA Robotics Computer AnimationComputer Graphics Medical Imaging Virtual Reality Digital Media
  • 4. Why 4D Vision? 4D VISION FOR DYNAMIC SCENE UNDERSTANDING ARMIN MUSTAFA 4D Vision enables machine perception o Autonomous machine perception for: o Online video-rate content capture o Tools for content production in films (e.g.: automatic rotoscoping) o Intelligent next-generation of sophisticated gaming o VR/AR/MR (e.g.: holoportation for general scenes, virtual tourism, immersive story telling)
  • 5. 4D Vision - Challenges Input: o Uncalibrated wide-baseline multi-views from static/moving cameras o Challenging outdoor scenes: o Large capture volume o Natural scene backgrounds o Uncontrolled illumination and Repetitive texture o Dynamic fast scene motion 4D VISION FOR DYNAMIC SCENE UNDERSTANDING ARMIN MUSTAFA
  • 6. 4D Vision - Challenges o Temporally coherent reconstruction of complex dynamic scenes. o Unknown background, structure and segmentation.
  • 7. No prior information Moving cameras Multi-view scene Framework Object identification Temporal coherence 4D scene reconstruction and segmentation 4D Vision – Overview
  • 8. Contributions to 4D Vision General Scene Reconstruction Temporally Coherent Reconstruction 4D Light-field Video Non-sequential Alignment Semantic Reconstruction
  • 9. General scene reconstruction (3D) General Dynamic Scene Reconstruction from Multiple View Video. A. Mustafa, H. Kim, J-Y. Guillemaut and A. Hilton International Conference on Computer Vision (ICCV) 2015 Multi-scale Segmentation based Features for Wide-baseline Scene Reconstruction. A. Mustafa, H. Kim and A. Hilton IEEE Transactions on Image Processing (TIP) 2018
  • 10. Existing methods - Problems Segmentation Depth map o Requires accurate segmentation of dynamic foreground objects o Known background and structure J-Y. Guillemaut and A. Hilton. Joint Multi-Layer Segmentation and Reconstruction for Free-Viewpoint Video Applications. IJCV 2010
  • 11. Contributions o Unsupervised dense reconstruction of general scenes without priors. o Robust joint refinement of reconstruction and segmentation.
  • 12. Scene level Multi-view data Feature Detection Sparse Reconstruction Feature Matching Framework – General scene reconstruction
  • 13. Scene level Multi-view data Object Clustering Feature Detection Sparse Reconstruction Feature Matching Clustering Sparse point cloud Framework – General scene reconstruction
  • 14. Scene level Multi-view data Object Clustering Feature Detection Sparse Reconstruction Feature Matching Object level Initial coarse reconstruction Framework – General scene reconstruction
  • 15. Multi-view data Object Clustering Feature Detection Sparse Reconstruction Feature Matching Initial coarse reconstruction Refinement o Joint segmentation and reconstruction optimization o Photo-consistency, Smoothness and Contrast constraints Framework – General scene reconstruction
  • 16. General dynamic reconstruction o A method to segment and reconstruct dynamic objects with improved quality o No prior information on background appearance or structure
  • 17. Limitations o Per-frame inconsistent reconstruction and segmentation o The quality of results is far from perfect
  • 18. Temporally coherent general scene reconstruction Temporally coherent 4D reconstruction of complex dynamic scenes A. Mustafa, H. Kim, J-Y. Guillemaut and A. Hilton Computer Vision and Pattern Recognition (CVPR) 2016
  • 19. Contributions o Temporally coherent general scene reconstruction and segmentation. o Improved joint refinement by introducing geodesic star-convexity.
  • 20. Previous frame mesh Optical flow Dense temporal correspondence Initial coarse reconstruction Final mesh Temporal coherence
  • 21. Input 4D Reconstruction
  • 22. Limitations o The segmentation and reconstruction quality is not perfect What we want! What we get!
  • 23. Semantically Coherent Co-segmentation and Reconstruction Semantically coherent co-segmentation and reconstruction of dynamic scenes A. Mustafa and A. Hilton Computer Vision and Pattern Recognition (CVPR) 2017
  • 24. Contributions o Semantic co-segmentation and reconstruction of complex scenes o Temporal semantic coherence across sequence
  • 25. Framework – Semantically coherent reconstruction Multi-view data
  • 26. Framework – Semantically coherent reconstruction Scene level Multi-view data Initial Semantic Segmentation FCNs produce segmentations with poorly localized object boundaries
  • 27. Framework – Semantically coherent reconstruction Scene level Multi-view data Object Clustering Initial Semantic Segmentation Sparse Reconstruction
  • 28. Object level Initial Semantic 3D Reconstruction Framework – Semantically coherent reconstruction Scene level Multi-view data Object Clustering Initial Semantic Segmentation Sparse Reconstruction
  • 29. Framework – Semantically coherent reconstruction Scene level Multi-view data Object Clustering Initial Semantic Segmentation Sparse Reconstruction Object level Semantic Tracklets Initial Semantic 3D Reconstruction Introduce temporal and semantic coherence
  • 30. Semantic tracklets $S(i,j) = \frac{1}{3N} \sum_{c=1}^{N} \left( A_m + S_m + L_m \right)$, where $N$ is the number of views, $A_m$ is the measure of appearance similarity, and $L_m$ is the measure of class labels in the semantic segmentation region. All frames with similarity > 0.75 are selected to form a semantic tracklet
  • 31. Framework – Semantically coherent reconstruction Scene level Multi-view data Object Clustering Initial Semantic Segmentation Sparse Reconstruction Object level Initial Semantic 3D Reconstruction Refinement Semantic Tracklets E(l,d) = δ Esemantic(l,d) + + + +
  • 32. Framework – Semantically coherent reconstruction Frame 11 Frame 42 Frame 26 Frame 56
  • 33. Framework – Semantically coherent reconstruction Multi-view data Initial Semantic Segmentation Semantically Coherent Segmentation
  • 34. Results and Evaluation CVPR16 Proposed Input
  • 35. Results and Evaluation Original videos Semantic reconstruction Semantic co-segmentation Segmentation comparison
  • 36. Results and Evaluation Input videos Semantic reconstruction Semantic co-segmentation
  • 37. Results and Evaluation Input videos Semantically coherent reconstruction Semantic segmentation comparison
  • 38. o Semantic co-segmentation and reconstruction of dynamic scenes o Temporal semantic coherence enforced by semantic tracklets o Improved segmentation and reconstruction of dynamic scenes Semantically coherent reconstruction - Conclusions Original Image Frame 195
  • 39. Results – 4D Reconstruction Dance dataset Juggler dataset
  • 40. Limitations o Sequential alignment is prone to errors due to drift and large complex motions Frame 26 Frame 86 Frame 86 What we want!
  • 41. 4D match trees for non-sequential surface alignment 4D Match Trees for Non-rigid Surface Alignment A. Mustafa, H. Kim and A. Hilton European Conference on Computer Vision (ECCV) 2016
  • 42. Contributions o Robust global 4D alignment of partial reconstructions of non-rigid shape o Sparse matching between wide-timeframe image pairs using SFD o 4D Match Trees to represent the optimal non-sequential alignment path
  • 43. Multi-view video + Surface Wide-timeframe sparse matches 4D Match Tree Dense correspondence 4D scene reconstruction Frame 1 Results – Non-sequential alignment
  • 44. Results – Non-sequential alignment Frame 1 Frame 1
  • 45. 4D Light-field Video 4D Temporally Coherent Light-field Video A. Mustafa, M. Volino, J-Y. Guillemaut and A. Hilton 3D Vision (3DV) 2017
  • 46. ALIVE – 4D Light-field Video • Address limitation of 360 video • Introduce light-fields for immersive virtual experiences “Kinch and the double world” – Figment Cinematic VR Experience SIGGRAPH 2018 VR Festival Film festivals (Raindance, Strasbourg)
  • 47. 4D Light-field Video • 4D Temporally coherent light-field video for dynamic scenes • Light-field scene flow using Epipolar Plane Image • Efficient light-field representations for live action VR
  • 48. 4D Light-field Video Light-field video Camera 2 video Light-field scene flow 4D light-field video
  • 49. 4D Vision - Summary I. Outdoor 3D and Scene Understanding: (i) SFD features for wide-baseline reconstruction [3DV 2015, TIP 2018]; (ii) Unsupervised general scene reconstruction [ICCV 2015]; (iii) Semantic reconstruction and segmentation [CVPR 2017]. II. 3D video to 4D models: (i) Temporally coherent general scene reconstruction [CVPR 2016]; (ii) Non-sequential alignment [ECCV 2016]; (iii) 4D light-field video for virtual reality [3DV 2017]. 4D Vision - Spatio-temporally coherent models from video
  • 50. Future work: 4D Vision for perceptive machines o Robust machine perception of general dynamic scenes from video 4D Vision for Perceptive Machines Reconstruction Registration Machine Learning Artificial Intelligence Recognition
  • 51. THANK YOU!
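
Slide 30 averages three per-view measures into a frame-to-frame similarity and keeps frames whose similarity exceeds 0.75. A minimal sketch of that selection rule follows. It assumes each measure is already computed per view and lies in [0, 1]; the function names and the `s_m` argument (the slide does not define S_m) are hypothetical, and this is not the authors' implementation.

```python
from typing import Dict, List, Sequence


def frame_similarity(a_m: Sequence[float],
                     s_m: Sequence[float],
                     l_m: Sequence[float]) -> float:
    """S(i,j) = 1/(3N) * sum over the N views of (A_m + S_m + L_m).

    a_m, s_m, l_m hold one value per view for a single frame pair (i, j).
    Assumption: every value is in [0, 1], so the result is too.
    """
    n = len(a_m)
    if n == 0 or len(s_m) != n or len(l_m) != n:
        raise ValueError("need the same non-zero number of views per measure")
    return sum(a + s + l for a, s, l in zip(a_m, s_m, l_m)) / (3 * n)


def select_tracklet(similarity_to_frame: Dict[int, float],
                    threshold: float = 0.75) -> List[int]:
    """Return the frame indices whose similarity is strictly above threshold."""
    return [frame for frame, sim in sorted(similarity_to_frame.items())
            if sim > threshold]


if __name__ == "__main__":
    # Two views, made-up measure values for one frame pair.
    print(frame_similarity([0.9, 0.6], [0.8, 0.7], [1.0, 0.5]))
    # Made-up similarities of frames 1..3 to a reference frame.
    print(select_tracklet({1: 0.8, 2: 0.75, 3: 0.9}))
```

The comparison is strict, following the slide's "similarity > 0.75", so a frame scoring exactly 0.75 is left out of the tracklet.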
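
Slides 42 and 43 describe a 4D Match Tree that encodes a non-sequential alignment path between frames. One way to illustrate the idea is a minimum spanning tree over a pairwise frame dissimilarity matrix, built here with Prim's algorithm. This is a sketch under that assumption only: the cost matrix is made up, the function names are hypothetical, and the slides do not specify how the tree is actually constructed.

```python
import heapq
from typing import List, Optional


def match_tree(cost: List[List[float]], root: int = 0) -> List[Optional[int]]:
    """Build a minimum spanning tree with Prim's algorithm.

    cost[i][j] is an assumed symmetric dissimilarity between frames i and j.
    Returns parent[k], the frame that frame k is aligned from (None for root).
    """
    n = len(cost)
    parent: List[Optional[int]] = [None] * n
    in_tree = [False] * n
    heap = [(0.0, root, -1)]  # (edge cost, frame, parent frame; -1 = no parent)
    while heap:
        _, frame, par = heapq.heappop(heap)
        if in_tree[frame]:
            continue
        in_tree[frame] = True
        parent[frame] = None if par < 0 else par
        for nxt in range(n):
            if not in_tree[nxt]:
                heapq.heappush(heap, (cost[frame][nxt], nxt, frame))
    return parent


def alignment_path(parent: List[Optional[int]], frame: int) -> List[int]:
    """Frames visited when aligning from the root to the given frame."""
    path = [frame]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    # Made-up dissimilarities between four frames.
    cost = [
        [0, 1, 4, 2],
        [1, 0, 5, 3],
        [4, 5, 0, 1],
        [2, 3, 1, 0],
    ]
    tree = match_tree(cost)
    print(tree)                     # [None, 0, 3, 0]
    print(alignment_path(tree, 2))  # [0, 3, 2]
```

In this toy example frame 2 is reached through frame 3 rather than through its sequential predecessor frame 1, because that route has the lower total dissimilarity. This is the kind of non-sequential path that slide 40 motivates as a way to limit drift.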