Lessons learned building an AI-powered live streaming camera at ClueCon 2019, Chicago
Real-time communications have been evolving and going mainstream thanks to the improvement and appearance of new open source technologies that make development more affordable.
In this talk I will go through the design and development process of a live streaming application running on a Raspberry Pi and powered by image detection. I will talk about some open source media servers and frameworks for achieving that, the pros and cons of some of these potential solutions, what I learned building it, and some potential use cases of AI in WebRTC applications.
4. RTMP
WebRTC.ventures August 2019
• TCP based
• Adaptive bitrate streaming
• Low latency (< 1 sec)
• RTMP does not work natively in HTML5, iOS or Android
[Diagram: Video Stream → RTMP Server → Client with Flash]
5. HLS
• TCP based
• High latency (30s-60s)
• HLS works natively in all major OSes and browsers
[Diagram: Video Stream → HLS Server → Web client]
6. WebRTC
• UDP based
• Adaptive bitrate streaming
• Low latency (< 1 sec)
• Works natively in all major OSes and browsers
[Diagram: Video Stream → WebRTC gateway → Web client]
8. WebRTC native peer to peer live streaming
• It is cheap!
• But it doesn't sound like a good idea…
• The broadcaster will need to upload its stream as many times as there are viewers
• And the processing will be done on the broadcaster's device
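The upload-N-times problem is easy to quantify. A back-of-the-envelope sketch, using the 150 kbps per-viewer bitrate that appears in the Raspberry Pi comparison later in this deck; the viewer count is a made-up example:

```javascript
// Naive P2P broadcast: the broadcaster uploads one copy of the stream
// per viewer. 150 kbps matches the streams measured elsewhere in this
// deck; 100 viewers is a hypothetical audience size.
const bitrateKbps = 150;
const viewers = 100;
const upstreamMbps = (bitrateKbps * viewers) / 1000;
console.log(upstreamMbps); // 15 (Mbps of sustained upload, before any retransmission overhead)
```

Even a modest audience quickly exceeds a typical residential uplink, which is why a media server or SFU takes over the fan-out in the next slides.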
9. WebRTC with media server for live streaming
AI video processing on the edge
• Easier to develop and test
• Cheaper for the provider
AI video processing on the server
• Low battery consumption for clients
• “No” CPU limitations
“The future of AI is on the edge” (Samsung)
“ML algorithms that continuously learn require the computational horsepower and storage that only a server can provide” (Security Magazine)
10. Just use a CPaaS for live streaming
• Easier to implement
• More expensive to use
• No infrastructure maintenance
• The processing is not easy to do on a server that you don’t manage
[Diagram: CPaaS infrastructure]
11. AI, AI everywhere…
Two thirds of our 2019 WebRTC survey respondents are working on a WebRTC application with AI
12. AI image detection options
OpenCV: faster at manipulating data, easy to use
TensorFlow: more options available, I can train my own algorithm
But there are many other alternatives… Someone said PyTorch?
Combine both?
14. How to stream video from a Raspberry Pi
There are many options and frameworks…
Comparison on a Raspberry Pi 3:

Framework                            Latency (ms)   CPU   Framerate   Bitrate
Raspivid + VLC server                3000-4000      2%    30 fps      150 kbps
UV4L + VLC server                    2000-3000      3%    30 fps      150 kbps
Raspivid + GStreamer* RTP to Janus   1000-2000      2%    30 fps      150 kbps
UV4L WebRTC to Kurento               100-200        90%   30 fps      150 kbps
UV4L WebRTC to Janus                 100-200        90%   30 fps      150 kbps

*Using the default x264 encoding without tuning the parameters
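For reference, the "Raspivid + GStreamer RTP to Janus" row corresponds roughly to a pipeline like the one below. Treat it as a sketch, not the exact command used: the host, port and bitrate are placeholders.

```shell
# On the Raspberry Pi: take hardware-encoded H.264 from the camera and
# push it as RTP toward the Janus streaming plugin.
# JANUS_HOST and port 5000 are placeholders for your setup.
raspivid -t 0 -w 640 -h 480 -fps 30 -b 150000 -o - | \
  gst-launch-1.0 fdsrc ! h264parse ! \
  rtph264pay config-interval=1 pt=96 ! \
  udpsink host=JANUS_HOST port=5000
```

Because the Pi's camera stack already produces H.264, GStreamer only packetizes here, which is why the CPU column stays around 2%.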
16. Live Streaming with image detection on the edge
OK/slow when doing basic operations: 640×480 at < 15 fps
Bad if we start doing more CPU-intensive work: 640×480 at < 1 fps!
[Diagram: Peer 1 ↔ Peer 2]
Haar-cascade object detection with OpenCV: https://github.com/agonza1/native-webrtc-peer-to-peer
DeepLab MobileNetV2 image segmentation with TF
18. Live Streaming with image detection on the media server
Kurento already has some modules…
• Some examples exist
• We can use WebRTC on both legs easily
Janus + OpenCV
• Well maintained (the RTP plugin works great)
• We will need to create a new plugin… or not?
There are other options too, but we can’t do everything
20. Live Streaming with image detection on Janus
Goal vs. first try
[Diagram: RTP → Media Server → RTP]
Video parsing and encoding using GStreamer magic is not that easy
21. Live Streaming with image detection on Janus
The OpenCV video service captures and processes the RTP video stream:

// cv is assumed to be opencv4nodejs (implied by the cv.* calls),
// with OpenCV built with GStreamer support
const cv = require('opencv4nodejs');

// Read the RTP/H.264 stream through a GStreamer pipeline ending in appsink
const vCap = new cv.VideoCapture('udpsrc port=5000 ! application/x-rtp,payload=96 ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! appsink');

// Re-encode processed frames and send them back over RTP
const w = new cv.VideoWriter('appsrc ! videoconvert ! video/x-raw,format=I420,width=640,height=480,framerate=25/1 ! x264enc ! rtph264pay ! udpsink host=127.0.0.1 port=8004', 0, 25, new cv.Size(640, 480));

while (!done) {
  const frame = vCap.read();
  // ...process frame into pFrame (e.g. Haar-cascade face detection)...
  w.write(pFrame);
}
22. Live Streaming with image detection on Janus
[Demo screenshot: “Thief Detected!”, Viewers: 2]

Framework                                         Latency (ms)   Max CPU   Framerate   Bitrate
Raspivid + GStreamer + OpenCV + Janus Streaming   300-2000*      1%        30 fps      150 kbps

*Depending on the GStreamer configuration we can optimize for latency
**We used Haar-cascade face detection
24. Some Conclusions
Camera WebRTC live streaming with video ML operations at under half a second of latency is possible
ML/AI on the edge is easier to scale but is limited today
By optimizing the algorithm and the transcoding it is possible to reduce latency by 80%
ML/AI on the server provides higher quality without affecting the client's battery, but has scalability and cost challenges
25. Projects Links
Native WebRTC with OpenCV
https://github.com/agonza1/native-webrtc-peer-to-peer/tree/opencv-facedetection
Native WebRTC with TF.js
https://github.com/agonza1/native-webrtc-peer-to-peer/tree/tensorflowjs
WebRTC Live Streaming using Janus and OpenCV
https://github.com/agonza1/WebRTC-Live-Streaming-with-AI
WebRTC Live Streaming using Kurento and OpenCV face detection
https://github.com/agonza1/kurento-rpi-live-streaming
For those of you I haven’t met yet…
I came to ClueCon from Chicago uptown to talk about some of the things I learned building live streaming video applications for several projects, using a Raspberry Pi demo project to show what options are out there…
Use cases in many verticals:
Content creation/social networks
Ads
Broadcasting/news
Livestreaming video games or playing live (this has lately become very popular on sites such as Twitch. By 2014, Twitch streams had more traffic than HBO's online service! And what about HQ Trivia?!)
WebRTC, HLS and RTMP protocols (search popularity)
Real-Time Messaging Protocol (RTMP) was initially a proprietary protocol developed for streaming audio, video and data over the Internet, between a Flash player and a server.
HLS streams video by breaking the overall stream into a sequence of small HTTP-based file downloads, each download loading one short chunk of the stream.
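Concretely, the player fetches a playlist that lists those chunks. A hypothetical excerpt of an HLS media playlist (segment names and durations are made up; a live playlist keeps growing and omits the final end tag):

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:6.0,
segment0.ts
#EXTINF:6.0,
segment1.ts
#EXTINF:6.0,
segment2.ts
```

Buffering a few of these 6-second chunks before playback is what produces the 30-60 s latency quoted earlier.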
WebRTC native peer to peer
WebRTC MCU or SFU
WebRTC using CPaaS
There is another option: relaying the stream to another peer, which relays it to another one, and so on… This helps solve the bandwidth/CPU issue for the broadcaster but ends up adding a lot of latency and quality degradation!
This hybrid approach is very common.
Some camera manufacturers have reserved space on their cameras to allow third-party plugin analytics to be installed which pass data directly to the server. The video doesn’t need to be decoded which saves precious CPU/GPU cycles
AI in and for RTC
Speech Analytics
Voicebots / AI assistants
Computer Vision
RTC optimization (e.g. FaceTime making you appear to look at the other person when you are actually looking at the screen)
Forecasting events
It is not an apples-to-apples comparison, but they are definitely 2 well-known ML frameworks capable of image detection.
OpenCV: easy to use, its CPU performance is better and it has been tested more. More robust!
TensorFlow: more complex, but I can train my own algorithm. Wider set of tools around TensorFlow.
We could also combine both!
The chart is not mine (it's from Wikipedia) but it is a great example of k-means clustering, which is used for image segmentation. There are many methods, using models or motion of the image too.
Clustering is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups.
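As a toy illustration of that grouping idea (not tied to any of the projects in this talk), a minimal 1-D k-means in JavaScript; for image segmentation the "points" would be pixel colors rather than numbers:

```javascript
// One k-means step: assign each point to the nearest centroid,
// then move each centroid to the mean of its assigned points.
function kmeansStep(points, centroids) {
  const clusters = centroids.map(() => []);
  for (const p of points) {
    let best = 0;
    for (let i = 1; i < centroids.length; i++) {
      if (Math.abs(p - centroids[i]) < Math.abs(p - centroids[best])) best = i;
    }
    clusters[best].push(p);
  }
  // Empty clusters keep their old centroid
  return clusters.map((c, i) =>
    c.length ? c.reduce((a, b) => a + b, 0) / c.length : centroids[i]
  );
}

// Two obvious groups around 1-2 and 9-11:
let centroids = [0, 5];
for (let i = 0; i < 10; i++) centroids = kmeansStep([1, 2, 9, 10, 11], centroids);
console.log(centroids); // [ 1.5, 10 ]
```

With pixel colors as points, each cluster becomes one segment of the image, which is the basis of the segmentation shown in the chart.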
H264 ALL 3 ABOVE
VP8 when using WebRTC (so no transcoding needed on the server)
Good: cheaper, which means it will be easier to scale.
Bad: limited by CPU; the device might heat up; battery drain.
Haar feature-based cascade classification is an effective object detection method. With higher execution speed, Haar-based classifiers typically involve fewer computations.
(The algorithm needs a lot of positive images (images of faces) and negative images (images without faces) to train the classifier. Then we need to extract features from them.)
TF: trained with VOC 2012 (Visual Object Classes Challenge 2012) http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#introduction
Summary: while well-trained CNNs can learn more parameters (and thus detect a larger variety of faces), Haar-based classifiers run faster. If we need a very high quality algorithm with a CNN (a high success rate), running things on the edge becomes a problem today.
If you are going to do image detection on the edge, do AI on the client/viewer side or don’t do WebRTC on the RPI
Send RTP with effects
Hardware accelerated H264?
Processing time above is between 100 ms and 700 ms per frame!
CPU usage without WebRTC goes up to 90%.
WebRTC + OpenCV on the RPi starts dropping frames…
The easiest way to stream to the browser is just to stream images, although this isn't a performant solution.
FreeSWITCH, Wowza, RED5…
Kurento already has modules ready to go
RPI can handle WebRTC at < 60% of CPU
200-500 ms at 500 kbps
Can someone guess from which show is that helmet?
I wanted to keep the video and tried with GStreamer; I was able to send media, modify it and stream it to Janus. But somewhere in the OpenCV service I was generating a malformed video.
I finally got it right by setting format=I420 explicitly in OpenCV's VideoWriter.
Then the bottleneck was in GStreamer: sending the RTP stream, transcoding for processing, etc.
We can improve it by changing the configuration; for example, adding tune=zerolatency makes x264 drop lookahead and B-frames so frames are passed through immediately, trading some compression efficiency for latency.
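In GStreamer terms the change sits on the x264enc element; a sketch of the tuned encoder settings (property names as in stock x264enc, values illustrative):

```
x264enc tune=zerolatency speed-preset=ultrafast key-int-max=30 bitrate=150
```

speed-preset trades compression for encode time, and a small key-int-max keeps keyframes frequent so late-joining viewers sync quickly.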
Then the problem was the processing in OpenCV: without optimizing the algorithm we had a lot of missing frames and latencies of about 1 second.
A bitrate increase didn't affect latency much; for example, 1 Mbps only increased the CPU usage a bit, from 1% to 3% in the case of the RTP stream.
To optimize the OpenCV face detection we did just 2 things:
1) Increased the minimum possible object size (minSize): objects smaller than that are ignored, and processing improves a lot (a 50-80% reduction).
2) Played with scaleFactor, the parameter specifying how much the image size is reduced at each image scale.
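Why those two knobs help can be seen from how a cascade detector scans an image pyramid: it runs once per scale, and the number of scales depends on scaleFactor and the minimum object size. A back-of-the-envelope model (illustrative only, not the OpenCV internals, which do more work per scale):

```javascript
// Number of pyramid scales a cascade detector visits before the search
// window outgrows the image: minSize * scaleFactor^k <= imageSize.
function pyramidScales(imageSize, minSize, scaleFactor) {
  return Math.floor(Math.log(imageSize / minSize) / Math.log(scaleFactor)) + 1;
}

// For a 480-pixel-tall frame:
console.log(pyramidScales(480, 30, 1.1));  // 30 scales -> slow
console.log(pyramidScales(480, 120, 1.3)); // 6 scales -> fast
```

Raising minSize and scaleFactor together cuts the scale count by roughly 5x in this toy model, which is consistent with the 50-80% processing reduction we measured.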
And that’s it! Thank you and feel free to ask me any questions!
In the future I hope to build something more complex with OpenCV…I have a couple of ideas already