
WebRTC Infrastructure scalability notes - Geek'n Kranky - June 2014 @ Google SF


Slides of the presentation on WebRTC infrastructure scalability given at Google's SF HQ in June 2014 (post Google I/O).

Published in: Software


  1. webRTC infrastructure: the secret sauce. Dr. Alex Gouaillard, CTO, Temasys Communications
  2. Plan of the presentation
     - The reference code
     - A little bit better, but still the usual suspects
     - The first real pain: scalability
     - Other things to consider
  4. The reference code: appRTC
     • Scalable
     • Maintained
     • Reliable ??
     [Diagram labels: Alice, Web server, Signaling server, STUN, TURN, TURN MNGT, HTTP / AJAX / REST, ICE UA, GAE Channel]
  5. The reference code: appRTC signaling design
     [Diagram labels: Alice, Signaling server, HTTP / AJAX / REST, ICE UA, GAE Channel]
  6. The reference code: appRTC signaling design
     [Diagram labels: Alice, Signaling server, HTTP / AJAX / REST, GAE API Calls, GAE Channel, DB, Signaling server]
  7. The reference code: appRTC signaling design
     [Diagram labels: Alice, Signaling server, HTTP / AJAX / REST, GAE API Calls, GAE Channel, DB, Signaling server]
     [Annotations on the diagram: "Induce delays", "Just unreliable"]
  8. [Sequence diagram: handshake between CALLER, SIG-SERVER and CALLEE. © Temasys Communications, pvt, ltd, 2014. Document provided under CC BY-NC 4.0]
     • Phase labels: DONE, FROZEN, IN CALL, CONNECTION, HANDSHAKE (SDP O/A)
     • Signaling messages: CONNECT, ENTER, WELCOME, OFFER, ANSWER, CANDIDATES, BYE
     • Caller steps: Create PC, Add local stream(s), Create offer, <modify sdp>, SetLocal(offer), Sending offer, SetRemote(answer), addRemoteStream
     • Callee steps: Create PC, SetRemote(offer), addRemoteStream(s), Add local stream(s), Create answer, <modify sdp>, SetLocal(answer), Send answer
     • Candidate exchange (both directions): onIceCandidate, <Filter candidates>, Send candidate, then on the other side <Filter candidates>, addIceCandidate
     • Peer Connection states: stable, have-local-offer, stable, Close (caller); stable, have-remote-offer, stable, Close (callee)
     • ICE Connection states: new, checking, connected, disconnected, failed, Completed, close
     • ICE Gathering states: new, gathering, complete
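The Peer Connection states named on the handshake diagram (stable, have-local-offer, have-remote-offer) can be read as a small state table. The following is a minimal sketch under stated assumptions: it is a plain lookup table, not the browser's `RTCPeerConnection`; the names `transitions`, `step` and `run` are hypothetical; provisional answers and rollback are left out; and the final state is written `closed` where the slide writes "Close".

```typescript
// Simplified offer/answer signaling states, as labeled on the slide.
// Assumption: this table is illustrative only and omits pranswer/rollback.
type SignalingState = "stable" | "have-local-offer" | "have-remote-offer" | "closed";
type Action =
  | "setLocal(offer)"
  | "setRemote(offer)"
  | "setLocal(answer)"
  | "setRemote(answer)"
  | "close";

const transitions: Record<SignalingState, Partial<Record<Action, SignalingState>>> = {
  "stable": {
    "setLocal(offer)": "have-local-offer",   // caller path
    "setRemote(offer)": "have-remote-offer", // callee path
    "close": "closed",
  },
  "have-local-offer": { "setRemote(answer)": "stable", "close": "closed" },
  "have-remote-offer": { "setLocal(answer)": "stable", "close": "closed" },
  "closed": {},
};

// Apply one action, rejecting anything the table does not allow.
function step(state: SignalingState, action: Action): SignalingState {
  const next = transitions[state][action];
  if (next === undefined) {
    throw new Error(`invalid transition: ${action} in state ${state}`);
  }
  return next;
}

// Replay a list of actions from "stable" and return every state visited.
function run(actions: Action[]): SignalingState[] {
  let current: SignalingState = "stable";
  const trace: SignalingState[] = [current];
  for (const action of actions) {
    current = step(current, action);
    trace.push(current);
  }
  return trace;
}

const callerTrace = run(["setLocal(offer)", "setRemote(answer)", "close"]);
const calleeTrace = run(["setRemote(offer)", "setLocal(answer)", "close"]);
console.log("caller:", callerTrace.join(" -> "));
console.log("callee:", calleeTrace.join(" -> "));
```

Rejecting out-of-order actions in `step` is one way to surface a signaling server that delivers an ANSWER before the matching OFFER has been applied.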
  10. The reference code: you can make it reliable
      - But then, the handshake is stretched to up to 10 s …
  12. Better design: usual architecture
      [Diagram labels: Alice, Node.js server, Socket.io]
      • Much more reliable
      • Much faster (< 1 s)
      • Huge ecosystem
      • Not scalable
      • Not maintained => many API/SDK
  13. Better design: scalability
      • Bandwidth intensive instances in AWS?
      • Socket intensive instances in AWS?
      • Socket.io 0.9 not CPU scalable
      • Horizontal scalability
      • Node fails often
      • B-E update deployment = downtime
  15. Big pain: scalability. CPU scalability
      • Bandwidth intensive instances in AWS?
      • BIG instance with HVM (c3.8xl)
      • Socket.io 0.9 is not CPU scalable
      • Socket.io 0.9 scales by using a store to replicate the data object to each instance of the signaling server.
      • That just replicates the load everywhere: all instances are essentially doing the same work.
      • Socket.io 1.x changes the way the store works. The store becomes an event bus, and each instance runs independently, handling only the clients connected to it.
      • However, all instances listen to all events, so two clients connected to different instances can still talk!
      [Diagram labels: core, Redis store]
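The event-bus model described for Socket.io 1.x can be illustrated without any Socket.io code. The following is a minimal sketch under stated assumptions: `EventBus` is an in-memory stand-in for the shared Redis store, and `SignalingInstance`, `connect`, `send` and `inbox` are hypothetical names, not Socket.io APIs. Each instance keeps state only for its own clients, yet every instance sees every event.

```typescript
// In-memory stand-in for the shared store (Redis on the slide) used as an event bus.
type BusEvent = { from: string; to: string; payload: string };

class EventBus {
  private listeners: Array<(e: BusEvent) => void> = [];

  subscribe(listener: (e: BusEvent) => void): void {
    this.listeners.push(listener);
  }

  // Every subscribed instance receives every event.
  publish(e: BusEvent): void {
    for (const listener of this.listeners) listener(e);
  }
}

class SignalingInstance {
  // Only the clients connected to this instance; nothing is replicated.
  private inboxes = new Map<string, string[]>();
  public eventsSeen = 0;

  constructor(public readonly label: string, private bus: EventBus) {
    bus.subscribe((e) => this.onEvent(e));
  }

  connect(clientId: string): void {
    this.inboxes.set(clientId, []);
  }

  send(from: string, to: string, payload: string): void {
    if (!this.inboxes.has(from)) {
      throw new Error(`${from} is not connected to instance ${this.label}`);
    }
    this.bus.publish({ from, to, payload });
  }

  inbox(clientId: string): string[] {
    return this.inboxes.get(clientId) ?? [];
  }

  // Listen to all events, but deliver only to locally connected clients.
  private onEvent(e: BusEvent): void {
    this.eventsSeen++;
    const local = this.inboxes.get(e.to);
    if (local !== undefined) local.push(`${e.from}: ${e.payload}`);
  }
}

const bus = new EventBus();
const instanceA = new SignalingInstance("A", bus);
const instanceB = new SignalingInstance("B", bus);
instanceA.connect("alice");
instanceB.connect("bob");
instanceA.send("alice", "bob", "OFFER");
instanceB.send("bob", "alice", "ANSWER");
console.log(instanceB.inbox("bob"), instanceA.inbox("alice"));
```

In this sketch the per-client state lives on exactly one instance, while the cost that still grows with the number of instances is the event fan-out counted by `eventsSeen`.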
  16. Big pain: scalability. Horizontal scalability
      • Business as usual: reverse proxy load balancing.
      • Usual suspect: NGINX
      [Diagram labels: NGINX, Redis store]
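What a reverse proxy does when it balances load can be sketched as a pure function from a client key to a backend. This is a minimal sketch under stated assumptions: the hash-based choice is a swapped-in illustration in the spirit of NGINX's `ip_hash` directive, not NGINX's own algorithm; `fnv1a`, `pickBackend` and the backend names are hypothetical. Hashing on the client address keeps a given client on the same signaling instance across reconnects.

```typescript
// 32-bit FNV-1a hash of a string (any stable hash would do for this sketch).
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

// Map a client key (for example its IP address) to one backend, deterministically.
function pickBackend(clientKey: string, backends: string[]): string {
  const chosen = backends[fnv1a(clientKey) % backends.length];
  if (chosen === undefined) throw new Error("no backends configured");
  return chosen;
}

// Hypothetical signaling instances sitting behind the proxy.
const backends = ["signaling-1:3000", "signaling-2:3000", "signaling-3:3000"];
const firstChoice = pickBackend("203.0.113.7", backends);
const secondChoice = pickBackend("203.0.113.7", backends);
console.log(firstChoice, secondChoice);
```

The trade-off of a modulo-based choice is that adding or removing a backend remaps most clients, which is acceptable for a sketch but worth knowing before relying on it.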
  17. Big pain: scalability. Node zombification
      • Node fails often
      • B-E update deployment = downtime?
      Cluster API is nice, Forever is nice, pm2 is better:
      • Keep alive
      • Clustering / load balancing
      • Log aggregation
      • Terminal monitoring
      • …
      [Diagram label: core]
  19. Advanced infrastructure: others
      - Session management
      - Auth
      - Usage
      - Payment
      Keys to state-of-the-art webRTC infrastructure
      - MCUs
      - TURN/STUN
