Computer Vision - now working in over 2 Billion Web Browsers!
Rob Manson, CEO & co-founder
Sebastian Montabone, Computer Vision Engineer
Mixed Reality. In the web. On any device.
https://try.awe.media
So what is Mixed Reality?
Here’s a short demo of Milgram’s Mixed Reality Continuum - all running in a browser.
awe.media
A brief/biased history of Computer Vision
1957 - Russell A. Kirsch scans first photo with a computer
1960 - Larry Roberts publishes thesis at MIT
1964 - First facial recognition system (unnamed intelligence agency)
1976 - UK Police create first license plate recognition system
1978 - David Marr proposes edge detection framework at MIT
1985 - Lockheed Martin/Carnegie Mellon create first self-driving land vehicle
1992 - Tom Caudell at Boeing coins the term Augmented Reality
1999 - Billinghurst & Kato publish/demo ARToolkit at IWAR/SIGGRAPH
2000 - Windows-only alpha version of OpenCV launched at CVPR
2007 - OpenCV 1.0 released
2008 - ARToolkit ported to Flash by @saqoosha
2011 - ARToolkit ported to JavaScript by Ilmari Heikkinen
2011 - FastCV/Vuforia 1.0 released
2017 - Facebook adds Computer Vision to their camera app
2017 - OpenCV in the browser demonstrated here
How does Computer Vision work in the browser?
camera -> gUM -> video -> canvas -> pixels -> vision algorithms
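The pipeline above can be sketched roughly as follows. This is a minimal sketch, not the awe.media implementation: `toGrayscale`, `startPipeline` and the `onFrame` callback are hypothetical names, and the grayscale conversion only stands in for a real vision algorithm. It assumes a browser that supports `navigator.mediaDevices.getUserMedia`.

```typescript
// pixels -> vision algorithm: reduce RGBA bytes to one luminance byte per pixel.
// A stand-in for a real algorithm; the weights are the common Rec. 601 luma weights.
function toGrayscale(rgba: Uint8ClampedArray, width: number, height: number): Uint8ClampedArray {
  const gray = new Uint8ClampedArray(width * height);
  for (let i = 0; i < width * height; i++) {
    const r = rgba[i * 4];
    const g = rgba[i * 4 + 1];
    const b = rgba[i * 4 + 2];
    gray[i] = 0.299 * r + 0.587 * g + 0.114 * b; // rounded and clamped on assignment
  }
  return gray;
}

// Hypothetical driver that wires the stages together (browser only).
async function startPipeline(
  onFrame: (gray: Uint8ClampedArray, width: number, height: number) => void
): Promise<void> {
  // camera -> gUM
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });

  // gUM -> video
  const video = document.createElement("video");
  video.srcObject = stream;
  video.muted = true;
  await video.play();

  // video -> canvas
  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext("2d");
  if (!ctx) throw new Error("2D canvas context unavailable");

  const tick = () => {
    // canvas -> pixels
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    const frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
    // pixels -> vision algorithms
    onFrame(toGrayscale(frame.data, frame.width, frame.height), frame.width, frame.height);
    requestAnimationFrame(tick);
  };
  requestAnimationFrame(tick);
}
```

In a page, `startPipeline((gray, w, h) => { /* run detection on gray */ })` would hand each frame to the callback once the user grants camera access.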
HTMLVideoElement
This is a container for decoding and presenting video streams.
This brought plugin-free video to the web.
Canvas, WebGL & the ArrayBuffer
The 2D Canvas gave us the ability to convert a video stream into pixel data.
WebGL brought 3D Canvases with access to the GPU.
But most importantly, WebGL gave us ArrayBuffers, which allowed us to access the pixel data for the first time.
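As a small illustration of why ArrayBuffers matter for pixel work, the sketch below stores a tiny RGBA image in one buffer and reads it through two views without copying. The names `invertInPlace`, `buffer`, `bytes` and `view` are hypothetical and chosen only for this example; the layout of four bytes per pixel matches what a 2D canvas returns in `ImageData.data`.

```typescript
// A 2x1 RGBA image stored in one raw ArrayBuffer (4 bytes per pixel).
const buffer = new ArrayBuffer(2 * 1 * 4);

// Byte view: the same element type and layout as ImageData.data.
const bytes = new Uint8ClampedArray(buffer);
bytes.set([255, 0, 0, 255, /* red */ 0, 0, 255, 255 /* blue */]);

// A second view over the same memory; no pixel data is copied.
const view = new DataView(buffer);
// Big-endian read, so the value is laid out as 0xRRGGBBAA.
const firstPixel = view.getUint32(0, false);

// Invert the colour channels in place, leaving alpha untouched.
function invertInPlace(rgba: Uint8ClampedArray): void {
  for (let i = 0; i < rgba.length; i += 4) {
    rgba[i] = 255 - rgba[i];
    rgba[i + 1] = 255 - rgba[i + 1];
    rgba[i + 2] = 255 - rgba[i + 2];
  }
}

invertInPlace(bytes);
// Because both views share the buffer, the DataView sees the change too.
const firstPixelInverted = view.getUint32(0, false);
```

Working on the buffer in place like this avoids allocating a new array for every video frame.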
JSARToolkit
In 2011 Billinghurst & Kato's ARToolkit was ported to JavaScript.
Enter WebRTC's getUserMedia()
Some claim this has a latency that makes the web unusable for AR.

But here are the numbers measured on a Pixel; the maximum difference is ~200ms:
200-250ms - Camera stream in a native AR app
350-400ms - gUM stream in a web app
WebRTC's getUserMedia()
FAST feature detection & Tigerstail in 2012
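For readers unfamiliar with FAST, the core idea is a segment test on a 16-pixel circle around each candidate pixel. The sketch below is a simplified version written for this document, not the code used in the 2012 demo: it omits the usual high-speed pretest and non-maximum suppression, and the names `isFastCorner` and `detectFastCorners` as well as the default `threshold` and `arc` values are assumptions. It expects a grayscale image with one byte per pixel.

```typescript
// Offsets (dx, dy) of the 16-pixel Bresenham circle of radius 3, in order around the circle.
const CIRCLE: ReadonlyArray<readonly [number, number]> = [
  [0, -3], [1, -3], [2, -2], [3, -1], [3, 0], [3, 1], [2, 2], [1, 3],
  [0, 3], [-1, 3], [-2, 2], [-3, 1], [-3, 0], [-3, -1], [-2, -2], [-1, -3],
];

// Segment test: true if `arc` contiguous circle pixels are all brighter than
// p + threshold or all darker than p - threshold.
function isFastCorner(
  gray: Uint8ClampedArray, width: number, height: number,
  x: number, y: number, threshold = 20, arc = 9
): boolean {
  // The circle must fit inside the image.
  if (x < 3 || y < 3 || x >= width - 3 || y >= height - 3) return false;
  const p = gray[y * width + x];

  // Classify each circle pixel: 1 = brighter, -1 = darker, 0 = similar.
  const states = CIRCLE.map(([dx, dy]) => {
    const v = gray[(y + dy) * width + (x + dx)];
    return v > p + threshold ? 1 : v < p - threshold ? -1 : 0;
  });

  // Find a run of `arc` equal non-zero states, allowing wrap-around.
  const n = CIRCLE.length;
  let run = 0;
  for (let i = 0; i < n + arc - 1; i++) {
    const s = states[i % n];
    const prev = states[(i - 1 + n) % n];
    if (s === 0) run = 0;
    else if (i > 0 && s === prev) run += 1;
    else run = 1;
    if (run >= arc) return true;
  }
  return false;
}

// Scan every pixel and collect the coordinates that pass the segment test.
function detectFastCorners(
  gray: Uint8ClampedArray, width: number, height: number, threshold = 20
): Array<{ x: number; y: number }> {
  const corners: Array<{ x: number; y: number }> = [];
  for (let y = 3; y < height - 3; y++) {
    for (let x = 3; x < width - 3; x++) {
      if (isFastCorner(gray, width, height, x, y, threshold)) corners.push({ x, y });
    }
  }
  return corners;
}
```

The test touches only 16 neighbours per pixel and uses integer comparisons, which is what makes this family of detectors practical on live video frames.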
WebRTC's getUserMedia()
Tracking.js released in 2012
WebRTC's getUserMedia()
AR.js released in 2017
Transpiling OpenCV
This brings a more general computer vision toolkit to the web!
Demo Time!
But there's no gUM on iOS?
For vision-based functionality we fall back to Visual Search
For location-based apps we fall back to 360°/VR (like Pokemon Go with the camera off)
And remember “video see-through” is not the only form of AR
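The fallback strategy above could be expressed as a simple feature check. This is a sketch under assumptions, not the awe.media code: the `Experience` labels, the `NavigatorLike` type and the functions `hasGetUserMedia` and `chooseExperience` are hypothetical names introduced here.

```typescript
// Hypothetical labels for the three experiences described on this slide.
type Experience = "camera-ar" | "visual-search" | "360-vr";

// Minimal structural type so the check can also run outside a browser.
interface NavigatorLike {
  mediaDevices?: { getUserMedia?: unknown };
}

// True when the environment exposes a callable getUserMedia.
function hasGetUserMedia(nav: NavigatorLike | undefined): boolean {
  return typeof nav?.mediaDevices?.getUserMedia === "function";
}

// Pick the live camera when gUM exists, otherwise the fallback for the app type.
function chooseExperience(
  nav: NavigatorLike | undefined,
  kind: "vision" | "location"
): Experience {
  if (hasGetUserMedia(nav)) return "camera-ar";
  return kind === "vision" ? "visual-search" : "360-vr";
}
```

In a page this would be called as `chooseExperience(navigator, "vision")`; checking for the capability rather than sniffing the platform means the same code picks up camera support if a browser adds it later.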
