This document summarizes Jeff Powers' presentation on combining geometric 3D vision and deep perception. It outlines Occipital's history and products, including Structure Sensor and Structure Core. It discusses the strengths and weaknesses of 3D vision and deep learning individually. The key points are that 3D vision provides depth estimation and mapping but lacks semantic understanding, while deep learning has conquered semantics but lacks spatial understanding. The document argues that on-device compute is increasing, 3D datasets are improving, and specialized hardware is emerging, making now the ideal time to combine 3D vision and deep learning to advance applications like augmented reality, spatial computing, and 3D scanning.
1. Jeff Powers, Co-Founder & CEO
Denver Dash, Director of Machine Learning
Augmented World Expo
June 1, 2018
When Geometry Met Learning
2. 50+ strong
9 former founders
3 original Kinect hackers
TEAM PAST WORK
KEY BACKERS
THE TEAM
San Francisco, CA Boulder, CO Gainesville, FL
Lynx
ACQUISITIONS
3. Product History
Anticipate the future of computer vision and build it first.
• RedLaser (2009-2010): Acquired by
• 360 Panorama (2010-present): 8M downloads
• Structure Sensor (2013-present): #6 tech Kickstarter
• Bridge (2016-present): Mixed reality on mobile devices
• Canvas (2016-present): Reality capture for homes
• Structure Core (2017-present): Embeddable depth sensor
Timeline: 2009 to 2018
6. Deep Perception (DP)
ILSVRC Image Classification Error
Deep Q Learning, 2015
AlphaGo, 2016
Speech Recognition, 2016
Figure 2. Outline of the DeepFace architecture. A front-end of a single convolution-pooling-convolution filtering on the rectified input, followed by three locally-connected layers and two fully-connected layers. Colors illustrate feature maps produced at each layer. The net includes more than 120 million parameters, where more than 95% come from the local and fully connected layers.
DeepFace, 2014
OpenPose, 2017
Mask R-CNN, 2017
7. Both methods have weaknesses
Deep perception:
• Has not overtaken classical methods in 3D vision as quickly as it has in semantic understanding
• End-to-end 3D mapping may require huge models and huge amounts of data
3D vision:
• Sensitive to feature extraction, feature matching, texture and lighting
• Few ways to infer semantics
8. What deep perception brings to 3D Vision
Computer vision room of nightmares
Plane-Net, Liu, et al., 2018
RoomNet, Lee et al., 2017
9. What 3D vision brings to deep perception
Stanford University Technical University of Munich
3DMV: Dai & Niessner, 2018
10. What 3D vision brings to deep perception
Depth Estimation
305 cm?
130 cm?
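One way to read the two question marks above, offered as an assumption rather than something the slides state: from a single image, a distance of 130 cm and a distance of 305 cm can both be plausible, because a small object nearby and a large object far away project to the same size in pixels. A minimal sketch under a pinhole camera model follows. The focal length and object heights are hypothetical values chosen for illustration, not figures from the talk.

```python
# Sketch of monocular scale ambiguity under a pinhole camera model.
# FOCAL_PX and the object heights are hypothetical illustration values.

def pixel_height(focal_px, object_height_cm, distance_cm):
    """Projected height in pixels of an object at a given distance."""
    return focal_px * object_height_cm / distance_cm


def metric_height(focal_px, height_px, distance_cm):
    """Recover real-world height once a metric distance is known."""
    return height_px * distance_cm / focal_px


FOCAL_PX = 500.0  # assumed focal length, in pixels

# A 78 cm object at 130 cm and a 183 cm object at 305 cm
# produce the same projected height in the image.
near = pixel_height(FOCAL_PX, 78.0, 130.0)
far = pixel_height(FOCAL_PX, 183.0, 305.0)
print(near, far)

# With a measured depth (for example from a depth sensor),
# the ambiguity disappears and the metric height is recovered.
print(metric_height(FOCAL_PX, near, 130.0))
```

A depth measurement supplies the missing distance term, which is the kind of information a geometric 3D pipeline can hand to a learned model.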
12. Deep Learning is conquering semantics, but “spatial semantics” is needed for complete understanding
13. 3D-ML Kryptonite: Data
3D capture is fundamentally more difficult than 2D
3D data is messy
Extreme compute requirements
Paracosm PX-80 walkthrough, downtown Gainesville
15. Now is the time to join forces
On-device compute increases
3D datasets are improving
Specialized ASICs arrive
ScanNet, Dai et al., 2017 - 1513 scans
SceneNN, Hua et al., 2016 - 100 scans
Building Parser, Armeni et al., 2016 - 256 rooms
17. Now is the time to join forces
Specialized ASICs arrive:
NVIDIA Deep Learning Accelerator
Google Tensor Processing Unit 3.0
18. Now is the time to join forces
Frameworks and code
Available deep models: DenseNet, VGG-19, GoogLeNet, ResNet
20. What good might come to the world when these two superheroes combine forces?
25. • 3D scanning
• Room-scale 3D mapping
• Augmented reality
• Low-level access to depth data
• Nearly 100 apps already built
• Room-scale mixed reality (MR)
• Real-world physics and occlusion
• Obstacle avoidance
• Character pathfinding
• Low-latency 6-DoF tracking
• Our lowest latency 6-DoF tracking
• Room-scale 3D mapping
• Obstacle avoidance
• Input agnostic
Visit developer.structure.io to get started
Get started developing today
ML+CV features will roll out over time
Tracking
26. Join the Dawn of Deep 3D Vision
denver@occipital.com