Lessons learned building an
AI powered live streaming
camera
@lbertogon
Alberto González Trastoy
WebRTC.Ventures
WebRTC.ventures August 2019
One to Many
Videogame players
Broadcasters
Brands & celebrities
Content creators
Live Streaming
Technologies
[Chart: Live Streaming Protocol Trends, Interest Over Time, 2004-01 to 2019-05 (worldwide): WebRTC, HTTP Live Streaming, Real-Time Messaging Protocol. Source: Google Trends]
RTMP
TCP based
Adaptive bitrate streaming
Low latency (< 1 sec)
RTMP does not work in HTML5, iOS or
Android natively
Video Stream → RTMP Server → Client with Flash
HLS
TCP based
High latency (30s-60s)
HLS works in all major OS and
browsers natively
Video Stream → HLS Server → Web client
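
HLS delivers the stream as a playlist of short media segments fetched over HTTP, and a player usually buffers some segments before it starts playing, which accounts for much of the latency quoted above. The following is a minimal sketch of that arithmetic, not a real player: the playlist contents (segment names, 10-second durations) and the idea of buffering exactly three segments are assumptions made for illustration.

```javascript
// Hypothetical HLS media playlist: names and durations are illustrative only.
const playlist = [
  '#EXTM3U',
  '#EXT-X-TARGETDURATION:10',
  '#EXTINF:10.0,',
  'segment0.ts',
  '#EXTINF:10.0,',
  'segment1.ts',
  '#EXTINF:10.0,',
  'segment2.ts',
].join('\n');

// Collect the duration (in seconds) declared by each #EXTINF tag.
function segmentDurations(m3u8) {
  return m3u8
    .split('\n')
    .filter((line) => line.startsWith('#EXTINF:'))
    .map((line) => parseFloat(line.slice('#EXTINF:'.length)));
}

// Seconds of media held back if the player buffers `buffered` segments
// before playing (an assumption; real players differ).
function bufferedSeconds(m3u8, buffered) {
  return segmentDurations(m3u8)
    .slice(0, buffered)
    .reduce((sum, d) => sum + d, 0);
}

console.log(bufferedSeconds(playlist, 3)); // 30
```

Shorter segments reduce this figure, at the cost of more HTTP requests per minute of video.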
WebRTC
UDP based
Adaptive bitrate streaming
Low latency (<1s)
Works in all major OS and browsers natively
Video Stream → WebRTC gateway → Web client
WebRTC live streaming
options
WebRTC native peer to peer
WebRTC Media Server
CPaaS
WebRTC native peer to peer live
streaming
• It is cheap!
• But doesn't sound like a good
idea…
• Broadcaster will need to
upload its stream as many
times as there are viewers
• And the processing will be
done on the broadcaster's device
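
To make the upload cost concrete, here is a small sketch of the arithmetic. The 150 kbps bitrate matches the figure used in the Raspberry Pi comparison later in this deck; the viewer count of 100 and the function names are hypothetical.

```javascript
// Upload bandwidth the broadcaster needs when it sends one copy per viewer.
function p2pUploadKbps(streamKbps, viewers) {
  return streamKbps * viewers;
}

// With a media server in the middle, the broadcaster uploads a single copy
// and the server fans it out to the viewers.
function viaServerUploadKbps(streamKbps) {
  return streamKbps;
}

console.log(p2pUploadKbps(150, 100)); // 15000
console.log(viaServerUploadKbps(150)); // 150
```

The peer-to-peer cost grows linearly with the audience, which is why it stops being practical beyond a handful of viewers.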
WebRTC with media server
for live streaming
AI Video Processing on the edge
• Easier to develop and test
• Cheaper for the provider
AI Video Processing on the server
• Low battery consumption for
clients
• “No” CPU limitations
“The future of AI is on the edge”
Samsung
“ML algorithms that continuously learn require the computational
horsepower and storage that only a server can provide”
Security Magazine
Just use a CPaaS for live
streaming
• Easier to implement
• It is more expensive to use
• No infra maintenance
• The processing is not easy to do
on a server that you don’t
manage
CPaaS
Infrastructure
AI, AI everywhere…
2/3 of our 2019 WebRTC survey respondents are working on
WebRTC applications with AI
AI image detection options
OpenCV
• Faster manipulating data
• Easy to use
TensorFlow
• More options available
• I can train my algorithm
But there are many other
alternatives…
Someone said PyTorch?
Combine both?
Goal
LIVE
How to stream video from a
Raspberry Pi
There are many options and frameworks…
Comparison in a Raspberry Pi 3
Framework | Latency (ms) | CPU | Framerate | Bitrate
Raspivid + VLC server | 3000-4000 | 2% | 30 fps | 150 kbps
UV4L + VLC server | 2000-3000 | 3% | 30 fps | 150 kbps
Raspivid + GStreamer* RTP to Janus | 1000-2000 | 2% | 30 fps | 150 kbps
UV4L WebRTC to Kurento | 100-200 | 90% | 30 fps | 150 kbps
UV4L WebRTC to Janus | 100-200 | 90% | 30 fps | 150 kbps
*Using the default x264 encoding without tuning parameters
Live Streaming with image
detection on the edge
Live Streaming with image
detection on the edge
OK/Slow when doing basic operations
640×480 at < 15fps
Peer 1 Peer 2
Bad if we start doing more CPU intensive stuff
640×480 at < 1fps!
Peer 1 Peer 2
Haar-cascade Object Detection with OpenCV
https://github.com/agonza1/native-webrtc-peer-to-peer
DeepLab MobileNetv2 image segmentation with TF
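
Part of why Haar-cascade detection stays usable on a small CPU is that each Haar-like feature is just a difference of rectangle sums, and an integral image (summed-area table) reduces any rectangle sum to four lookups. The sketch below shows that idea in plain JavaScript; it is not OpenCV's implementation, and the 4x4 patch values and function names are made up for illustration.

```javascript
// Build a summed-area table with one extra row/column of zeros.
// ii[y][x] holds the sum of all pixels above and to the left of (x, y).
function integralImage(img) {
  const h = img.length;
  const w = img[0].length;
  const ii = Array.from({ length: h + 1 }, () => new Array(w + 1).fill(0));
  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w; x++) {
      ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x];
    }
  }
  return ii;
}

// Sum of the rectangle with top-left (x, y), width w and height h: four lookups.
function rectSum(ii, x, y, w, h) {
  return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x];
}

// Two-rectangle (edge) feature: left half minus right half. w must be even.
function edgeFeature(ii, x, y, w, h) {
  const half = w / 2;
  return rectSum(ii, x, y, half, h) - rectSum(ii, x + half, y, half, h);
}

// Made-up 4x4 grayscale patch: bright on the left, dark on the right.
const patch = [
  [200, 200, 10, 10],
  [200, 200, 10, 10],
  [200, 200, 10, 10],
  [200, 200, 10, 10],
];
const ii = integralImage(patch);
console.log(rectSum(ii, 0, 0, 4, 4)); // 1680
console.log(edgeFeature(ii, 0, 0, 4, 4)); // 1520
```

A large feature value signals a strong vertical edge in the patch; a cascade evaluates many such features and rejects most windows after only a few of them.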
Example: Video Face Detection on a Raspberry Pi
Live Streaming with image
detection on the media server
Kurento already has some
modules…
• Some examples exist
• We can use WebRTC in both
legs easily
Janus + OpenCV
• Well maintained (RTP plugin
works great)
• We will need to create a new
plugin… or not?
There are other options too
But we can’t do everything
Live Streaming with image
detection on Kurento
Live Streaming with image
detection on Janus
Goal vs. first try
[Diagram: RTP → Media Server → RTP]
Video parsing and encoding using GStreamer magic is not that easy
Live Streaming with image
detection on Janus
The OpenCV video service captures and processes the RTP video
stream
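// Assumption: cv is the opencv4nodejs binding, built with GStreamer support
const cv = require('opencv4nodejs');
// Receive RTP/H.264 on UDP port 5000, decode it and hand raw frames to OpenCV
const vCap = new cv.VideoCapture('udpsrc port=5000 ! application/x-rtp,payload=96 ! rtph264depay ! ' +
  'h264parse ! avdec_h264 ! decodebin ! videoconvert ! appsink location=/dev/stdout');
// Re-encode processed frames as H.264 and send them as RTP to 127.0.0.1:8004
const w = new cv.VideoWriter('appsrc ! videoconvert ! video/x-raw,format=I420,width=640,height=480,' +
  'framerate=25/1 ! x264enc ! rtph264pay ! udpsink host=127.0.0.1 port=8004', 0, 25, new cv.Size(640, 480));
let done = false;
while (!done) {
  const frame = vCap.read();
  if (frame.empty) { done = true; continue; } // stop when no frame is returned
  const pFrame = frame; // process frame here (pass-through placeholder)
  w.write(pFrame);
}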
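
Since most of the tuning happens in the two pipeline strings, it can help to build them from a handful of settings instead of editing long literals. The helper below is a sketch under that assumption: `buildPipelines` and its option names are hypothetical, and the element chain is copied from the snippet above rather than validated against GStreamer.

```javascript
// Build the capture and writer pipeline descriptions from a few settings.
// Element names mirror the snippet above; the option names are hypothetical.
function buildPipelines({ inPort, outHost, outPort, width, height, fps }) {
  const capture = [
    `udpsrc port=${inPort}`,
    'application/x-rtp,payload=96',
    'rtph264depay',
    'h264parse',
    'avdec_h264',
    'decodebin',
    'videoconvert',
    'appsink location=/dev/stdout',
  ].join(' ! ');
  const writer = [
    'appsrc',
    'videoconvert',
    `video/x-raw,format=I420,width=${width},height=${height},framerate=${fps}/1`,
    'x264enc',
    'rtph264pay',
    `udpsink host=${outHost} port=${outPort}`,
  ].join(' ! ');
  return { capture, writer };
}

const p = buildPipelines({
  inPort: 5000, outHost: '127.0.0.1', outPort: 8004, width: 640, height: 480, fps: 25,
});
console.log(p.capture);
console.log(p.writer);
```

With the values shown, the two strings are the same ones passed to `cv.VideoCapture` and `cv.VideoWriter` above, so changing resolution or ports becomes a one-line edit.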
Live Streaming with image
detection on Janus
Thief Detected!
Viewers: 2
Frameworks | Latency (ms) | Max CPU | Framerate | Bitrate
Raspivid + GStreamer + OpenCV, Janus Streaming | 300-2000* | 1% | 30 fps | 150
*Depending on the GStreamer configuration we can optimize for latency
**We used Haar-cascade face detection
DEMO?
Some Conclusions
• Camera WebRTC live streaming with video ML operations at under half a second of latency is possible
• ML/AI on the edge is easier to scale but is limited today
• By optimizing the algorithm and the transcoding, it is possible to reduce the latency by 80%
• ML/AI on the server provides higher quality without affecting the client battery, but has scalability and cost challenges
Project Links
Native WebRTC with OpenCV
https://github.com/agonza1/native-webrtc-peer-to-peer/tree/opencv-facedetection
Native WebRTC with TF.js
https://github.com/agonza1/native-webrtc-peer-to-peer/tree/tensorflowjs
WebRTC Live Streaming using Janus and OpenCV
https://github.com/agonza1/WebRTC-Live-Streaming-with-AI
WebRTC Live Streaming using Kurento and OpenCV face detection
https://github.com/agonza1/kurento-rpi-live-streaming
@lbertogon
Alberto González Trastoy
WebRTC.ventures
Thanks!! Questions?

Editor's Notes

  1. For those of you I haven’t met yet… I came to ClueCon all the way from Chicago uptown to talk about some of the things I learned building live streaming video applications for several projects. Through a demo project using a Raspberry Pi, we will look at what options there are…
  2. Use cases in many verticals: content creation/social networks; ads; broadcasting/news; livestreaming video games or playing live (which has lately become very popular on sites such as Twitch. By 2014, Twitch streams had more traffic than HBO's online service! And what about HQ Trivia?!)
  3. WebRTC, HLS and RTMP protocols (search popularity)
  4. Real-Time Messaging Protocol (RTMP) was initially a proprietary protocol developed for streaming audio, video and data over the Internet, between a Flash player and a server.
  5. HLS streams video by breaking the overall stream into a sequence of small HTTP-based file downloads, each download loading one short chunk of the stream
  6. WebRTC native peer to peer WebRTC MCU or SFU WebRTC using CPaaS
  7. There is another option of relaying the stream to another peer and this other peer relays it to another one, and so on…This is great to solve the bandwidth/CPU issue with the broadcaster but will end up adding a lot of latency and quality degradation!
  8. This hybrid approach is a very common approach Some camera manufacturers have reserved space on their cameras to allow third-party plugin analytics to be installed which pass data directly to the server. The video doesn’t need to be decoded which saves precious CPU/GPU cycles
  9. AI in and for RTC Speech Analytics Voicebots / AI assistants Computer Vision RTC Optimization (safari facetime making you appear to see at the other person when looking at the screen) Forecasting events
  10. It is not an apples-to-apples comparison, but these are definitely 2 well-known ML frameworks capable of image detection. OpenCV: easy to use, its CPU performance is better and it has been tested more. More robust! TensorFlow: more complex, but I can train my own algorithm, and there is a wider set of tools around TensorFlow. We could also combine both!
  11. The chart is not mine (it is from the wiki) but it is a great example of k-means clustering, which is used for image segmentation. There are many methods, some using models or the motion in the image too. Clustering is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups.
  12. H264: supported by all 3 above. VP8: when using WebRTC (so no transcoding is needed on the server).
  13. Good: cheaper, which means that it will be easier to scale. Bad: limited by CPU, the device might heat up, battery drain.
  14. Haar feature-based cascade classifiers are an effective object detection method. Haar-based classifiers typically involve fewer computations and therefore execute faster. (The algorithm needs a lot of positive images (images of faces) and negative images (images without faces) to train the classifier; features are then extracted from them.) TF: trained with VOC 2012 (Visual Object Classes Challenge 2012) http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#introduction Summary: while well-trained CNNs can learn more parameters (and thus detect a larger variety of faces), Haar-based classifiers run faster. If we need a very high quality CNN-based algorithm (with a high success rate), running things on the edge becomes a problem today.
  15. If you are going to do image detection on the edge, do AI on the client/viewer side or don’t do WebRTC on the RPI. Send RTP with effects. Hardware-accelerated H264?
  16. If you are going to do image detection on the edge, do AI on the client/viewer side or don’t do WebRTC on the RPI. Send RTP with effects. Processing time is between 100 ms and 700 ms per frame! CPU usage without WebRTC goes up to 90%. WebRTC + OpenCV on the RPI starts dropping frames…
  17. The easiest way to stream to browser is just to stream images, although this isn't a performant solution. FreeSWITCH, Wowza, RED5…
  18. Kurento already has modules ready to go. The RPI can handle WebRTC at < 60% of CPU. 200-500ms at 500kpps. Can someone guess which show that helmet is from?
  19. The easiest way to stream to the browser is just to stream images, although this isn't a performant solution… I wanted to keep the video, so I tried GStreamer: I was able to send media, modify it and stream it to Janus. But somewhere in the OpenCV service I was generating a malformed video.
  20. But I finally got it right by setting format=I420 explicitly in the VideoWriter of OpenCV.
  21. If you are going to do image detection on the edge, do AI on the client/viewer side or don’t do WebRTC on the RPI. Send RTP with effects. Hardware-accelerated H264?
  22. Then the bottleneck was in GStreamer: sending the RTP stream, transcoding for processing, etc. We can improve it by changing the configuration, for example by adding tune=zerolatency (we are being lossless, passing frames even if they aren’t in order). Then the problem was the processing in OpenCV: without optimizing the algorithm we had a lot of missing frames and relatively high latencies of about 1 second, so we had to optimize the OpenCV face detection. Increasing the bitrate didn’t affect latency much; for example, 1Mbps only increased the CPU usage a bit, from 1 to 3% in the case of the RTP stream.
  23. Then the problem was the processing in OpenCV: without optimizing the algorithm we had a lot of missing frames and relatively high latencies of about 1 second. To optimize the OpenCV face detection we just did 2 things: 1) Changing the minimum possible object size. Objects smaller than that are ignored. Processing will improve a lot! 50-80% reduction. 2) Playing with scaleFactor, the parameter specifying how much the image size is reduced at each image scale.
  24. And that’s it! Thank you and feel free to ask me any questions! In the future I hope to build something more complex with OpenCV…I have a couple of ideas already 
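
Note 11 mentions k-means clustering for image segmentation. As a rough illustration only (not code from the talk or its repositories), the sketch below clusters grayscale pixel intensities into two groups with a plain k-means loop and labels each pixel by its nearest centroid. The tiny 4x4 image and the names kmeans_1d and segment are made up for this example.

```python
import random

def kmeans_1d(values, k, iterations=20, seed=0):
    """Cluster scalar values (here: pixel intensities) into k groups."""
    rng = random.Random(seed)
    # Start from k distinct values picked from the data.
    centroids = rng.sample(sorted(set(values)), k)
    for _ in range(iterations):
        # Assignment step: attach every value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # converged
        centroids = updated
    return sorted(centroids)

def segment(image, centroids):
    """Replace each pixel with the index of its nearest centroid."""
    return [[min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
             for p in row] for row in image]

# Hypothetical image: dark background on the left, bright object on the right.
image = [
    [12, 15, 200, 210],
    [10, 18, 205, 220],
    [14, 11, 198, 215],
    [13, 16, 202, 208],
]
pixels = [p for row in image for p in row]
centroids = kmeans_1d(pixels, k=2)
labels = segment(image, centroids)
for row in labels:
    print(row)
```

Real segmentation would usually cluster on color (and sometimes position) rather than a single intensity value, but the assignment and update steps stay the same.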
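
Note 14 says Haar-based classifiers involve fewer computations. One well-known reason is that each rectangular Haar-like feature can be read from an integral image (summed-area table) with a handful of lookups, regardless of the rectangle size. The sketch below shows that idea in plain Python; it is an illustration under that assumption, not OpenCV's internal implementation, and the function names and the 4x4 test image are hypothetical.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle whose top-left pixel is (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: top half minus bottom half."""
    half = h // 2
    return rect_sum(ii, x, y, w, half) - rect_sum(ii, x, y + half, w, half)

# Hypothetical patch: bright upper half, dark lower half (a horizontal edge).
patch = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [50, 50, 50, 50],
    [50, 50, 50, 50],
]
ii = integral_image(patch)
total = rect_sum(ii, 0, 0, 4, 4)
edge_response = two_rect_feature(ii, 0, 0, 4, 4)
print(total, edge_response)
```

A strong positive response means the top of the window is brighter than the bottom; a cascade combines many such cheap features to reject non-face windows early.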
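
Note 23 describes two tuning knobs for the OpenCV face detector: the minimum object size and scaleFactor. To give an intuition for why they matter, the sketch below counts sliding-window positions in a deliberately simplified model of a multi-scale detector. The 24-pixel base window, the 2-pixel stride and the function name approx_windows are assumptions for illustration; OpenCV's real detectMultiScale works differently in its details, so the printed numbers are not a prediction of the 50-80% figure from the talk.

```python
def approx_windows(image_w, image_h, min_size, scale_factor,
                   base_window=24, step=2):
    """Rough count of windows evaluated across all detection scales.

    Simplified model (assumption): the window grows by scale_factor at each
    scale, starting at max(min_size, base_window), and the stride grows
    proportionally with the window.
    """
    if scale_factor <= 1:
        raise ValueError("scale_factor must be greater than 1")
    total = 0
    window = float(max(min_size, base_window))
    while window <= min(image_w, image_h):
        stride = step * window / base_window
        nx = int((image_w - window) / stride) + 1
        ny = int((image_h - window) / stride) + 1
        total += nx * ny
        window *= scale_factor
    return total

# Hypothetical 640x480 frame with three parameter choices.
baseline = approx_windows(640, 480, min_size=24, scale_factor=1.1)
bigger_min = approx_windows(640, 480, min_size=80, scale_factor=1.1)
coarser_scale = approx_windows(640, 480, min_size=24, scale_factor=1.3)

for name, count in [("baseline", baseline),
                    ("min size 80", bigger_min),
                    ("scaleFactor 1.3", coarser_scale)]:
    print(f"{name}: {count} windows ({100 * count / baseline:.0f}% of baseline)")
```

In this model, raising the minimum size skips the small scales, which hold most of the windows, and a larger scaleFactor visits fewer scales. Both trade detection of small or in-between face sizes for speed.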