This document provides an overview of HTML5 Real-Time Communications (RTC) including:
- The RTC API allows for browser-based real-time communication like voice and video calls without plugins.
- Signaling coordinates session control and exchange of network and media capabilities between peers.
- The RTCPeerConnection API manages audio/video calls and data channels between browsers.
- Signaling uses Session Description Protocol to coordinate network addresses and media exchange.
2. Agenda
HTML5 RTC API Review
Signaling: session control, network and media information
RTCPeerConnection API
RTC Workshop / Demo
3. HTML5 RTC API Review
• Web Real-Time Communications
• Defines standards to enable browser-based sessions
(voice, video, collaboration, …) without the need for
custom clients or plugins
• Builds on HTML5 capabilities with JavaScript
• Intended for all browsers to support
– Chrome, Firefox, Safari, Opera, IE …
• Apple (Safari) not at the table
4. HTML5 RTC API Review
Application Requirements
• Get streaming audio, video or other data.
• Get network information such as IP address and port, and exchange this with other WebRTC
clients (known as peers) to enable connection, even through NATs and firewalls
• Coordinate 'signaling' communication to report errors and initiate or close sessions.
• Exchange information about media and client capability, such as resolution and codecs.
• Communicate streaming audio, video or data.
5. HTML5 RTC API Review
Available APIs
MediaStream: get access to data streams, such as from the user's camera and
microphone.
RTCPeerConnection: audio or video calling, with facilities for encryption and
bandwidth management.
RTCDataChannel: peer-to-peer communication of generic data.
6. Signaling
Signaling is a mechanism to coordinate communication and to send control messages.
Signaling methods and protocols are not specified by WebRTC: signaling is not
part of the RTCPeerConnection API.
Developers can choose whatever messaging protocol they prefer, such as SIP or
XMPP, and any appropriate duplex (two-way) communication channel.
WebSocket is a protocol providing full-duplex communication channels over a
single TCP connection. Both SIP and XMPP can be carried over a WebSocket
connection.
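Because WebRTC leaves the signaling format to the application, one simple option is a small JSON envelope sent over whichever duplex channel is chosen. The sketch below is only an illustration under assumptions: the message types, the field names (`type`, `from`, `to`, `payload`) and the helper names `encodeSignal` and `decodeSignal` are hypothetical and are not defined by WebRTC, SIP or XMPP.

```javascript
// Hypothetical message types for an application-defined signaling protocol.
const SIGNAL_TYPES = ['offer', 'answer', 'candidate', 'bye'];

// Wrap a payload in a JSON envelope ready to send over any duplex channel.
function encodeSignal(type, from, to, payload) {
  if (!SIGNAL_TYPES.includes(type)) {
    throw new Error('Unknown signal type: ' + type);
  }
  return JSON.stringify({ type: type, from: from, to: to, payload: payload });
}

// Parse and validate an incoming envelope; throws on malformed input.
function decodeSignal(text) {
  const msg = JSON.parse(text);
  if (!SIGNAL_TYPES.includes(msg.type) || !msg.from || !msg.to) {
    throw new Error('Malformed signaling message');
  }
  return msg;
}
```

In a browser, the encoded string could be passed to the `send()` method of an open `WebSocket`, and `decodeSignal` could be applied to the data of each incoming message.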
8. Signaling
Signaling is used to exchange three types of information:
Session control messages: to initialize or close communication and report
errors.
Network configuration: to the outside world, what's my computer's IP address
and port?
Media capabilities: what codecs and resolutions can be handled by my
browser and the browser it wants to communicate with?
Signaling makes use of the Session Description Protocol (SDP) to describe the
network addresses and port numbers that can be used for the media exchange.
Once each browser has sent its own session description object and also
received the session description from the other peer’s browser, the media
exchange can begin between the clients.
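To make the "media capabilities" part of a session description more concrete, the sketch below reads the `m=` and `a=rtpmap:` lines of an SDP string and lists the media kinds and codec names it advertises. The SDP text is a hand-written, heavily trimmed example rather than real browser output, and `summarizeSdp` is a hypothetical helper, not part of any WebRTC API.

```javascript
// Summarize the media sections of an SDP string: kind, port, and codec names.
function summarizeSdp(sdp) {
  const sections = [];
  let current = null;
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith('m=')) {
      // m=<media> <port> <proto> <fmt> ...
      const parts = line.slice(2).split(' ');
      current = { kind: parts[0], port: Number(parts[1]), codecs: [] };
      sections.push(current);
    } else if (current && line.startsWith('a=rtpmap:')) {
      // a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
      const encoding = line.slice('a=rtpmap:'.length).split(' ')[1];
      if (encoding) {
        current.codecs.push(encoding.split('/')[0]);
      }
    }
  }
  return sections;
}

// Hand-written example, trimmed to the lines this sketch cares about.
const exampleSdp = [
  'v=0',
  'o=- 4611731400430051336 2 IN IP4 127.0.0.1',
  's=-',
  't=0 0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'c=IN IP4 0.0.0.0',
  'a=rtpmap:111 opus/48000/2',
  'm=video 9 UDP/TLS/RTP/SAVPF 96',
  'c=IN IP4 0.0.0.0',
  'a=rtpmap:96 VP8/90000'
].join('\r\n');

console.log(summarizeSdp(exampleSdp));
```

For this input the summary contains one audio section offering opus and one video section offering VP8.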
9. Internetwork connectivity
A Network Address Translator (NAT) is a device that maps the private
addresses of devices inside a local network to a public address.
STUN protocol: STUN stands for Session Traversal Utilities for NAT. A client
that wants to know its public IP address asks a STUN server.
TURN protocol: If STUN cannot provide the host with a usable public IP
address, TURN addresses this problem by relaying the traffic through a
server on the public internet.
ICE protocol: ICE finds communication paths between peers. ICE first
asks the host's operating system for the end user's IP address.
When ICE finds an address, it adds it to the RTCPeerConnection object.
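In the browser, STUN and TURN servers are handed to ICE through the configuration object given to the `RTCPeerConnection` constructor. The following is a minimal sketch of building that object. The `iceServers`, `urls`, `username` and `credential` fields are the standard configuration fields, while `buildIceConfig`, the server URLs and the credentials are placeholders made up for illustration.

```javascript
// Build the configuration object passed to `new RTCPeerConnection(config)`.
// All server URLs and credentials below are placeholders, not real services.
function buildIceConfig(stunUrl, turn) {
  const iceServers = [{ urls: stunUrl }];
  if (turn) {
    // TURN relays traffic, so servers normally require credentials.
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential
    });
  }
  return { iceServers: iceServers };
}

const config = buildIceConfig('stun:stun.example.org:3478', {
  url: 'turn:turn.example.org:3478',
  username: 'demo-user',
  credential: 'demo-secret'
});
// In a browser: var pc = new RTCPeerConnection(config);
```

Listing the TURN server is optional in this sketch; without it, ICE can only offer host and STUN-derived addresses.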
11. RTCPeerConnection API
Basic RTCPeerConnection usage involves negotiating a connection between your
local machine and a remote one by generating Session Description Protocol
(SDP) descriptions to exchange between the two.
var pc = new RTCPeerConnection();
The caller starts the process by sending an offer to the remote device, which
responds by either accepting or rejecting the connection request.
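pc.createOffer().then(function (offer) {
  // Promise-based form; the callback form of createOffer() is a legacy API.
  return pc.setLocalDescription(offer);
}).then(function () {
  // send pc.localDescription to a server to be forwarded to the friend you're calling.
}).catch(function (error) {
  console.error('Creating the offer failed:', error);
});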
13. RTCPeerConnection API
On the opposite end, the remote device receives the offer from the server
over whatever signaling protocol is in use.
An RTCSessionDescription object is created and set up as the remote
description by calling RTCPeerConnection.setRemoteDescription().
Then an answer is created using RTCPeerConnection.createAnswer() and sent
back to the server, which forwards it to the caller.
On the original machine, the response is received. Once that happens, call
RTCPeerConnection.setRemoteDescription() to set the response as the remote
end of the connection.
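The answer half of the exchange described above could be sketched as follows, using the promise-based forms of `setRemoteDescription()`, `createAnswer()` and `setLocalDescription()`. The function names `answerOffer` and `acceptAnswer` and the `sendToServer` callback are hypothetical; the signaling transport behind `sendToServer` is assumed to exist and is not shown.

```javascript
// Callee side: apply the received offer, create an answer, and send it back.
// `pc` is an RTCPeerConnection; `sendToServer` is a hypothetical signaling function.
async function answerOffer(pc, offer, sendToServer) {
  await pc.setRemoteDescription(offer);
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  sendToServer({ type: answer.type, sdp: answer.sdp });
}

// Caller side: apply the answer relayed by the server as the remote description.
async function acceptAnswer(pc, answer) {
  await pc.setRemoteDescription(answer);
}
```

Passing the connection in as a parameter keeps both functions independent of how the `RTCPeerConnection` was configured.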
The Session Initiation Protocol (SIP) is a communications protocol for signaling and controlling multimedia communication sessions, such as Internet telephony for voice and video calls and instant messaging, over Internet Protocol (IP) networks.
The protocol defines the messages that are sent between endpoints, which govern establishment, termination and other essential elements of a call.
SIP can be used for creating, modifying and terminating sessions consisting of one or several media streams. SIP is an application layer protocol designed to be independent of the underlying transport layer. It is a text-based protocol, incorporating many elements of the Hypertext Transfer Protocol (HTTP) and the Simple Mail Transfer Protocol (SMTP).[1]
Extensible Messaging and Presence Protocol (XMPP) is a communications protocol for message-oriented middleware based on XML (Extensible Markup Language).[1] It enables the near-real-time exchange of structured yet extensible data between any two or more network entities.
This makes it more complicated to route peer-to-peer communication between web browsers. The devices inside a private network hold private addresses, and it is not possible to make a connection with someone outside the private network without a public address.
The STUN server responds with the public IP address, so the WebRTC application now knows its public address. If the remote peer has also obtained its public IP address, the clients can send media to each other through their remote peer’s NAT.
This will always work, since the relay is out on the public internet and therefore anybody can contact it. It is not the first option because this method consumes a lot of bandwidth, and STUN is a cheaper option.
m, but if the host is behind a NAT then that method will fail. So the second strategy is to employ a STUN server and ask it for the address. If that also fails, the last remaining method ICE will use is a TURN server to relay the communication.
ICE also checks the connectivity between the peers.
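The order of preference in these notes (direct host address, then STUN, then TURN) is reflected in the candidate types that ICE reports: `host`, `srflx` (server reflexive, learned via STUN) and `relay` (via TURN). The sketch below parses candidate strings and sorts them by the type preference values recommended in RFC 8445. It is a simplification: real ICE prioritizes and tests candidate pairs, and the candidate lines, addresses and helper names here are made up for illustration.

```javascript
// Type preferences recommended by the ICE specification (RFC 8445):
// direct host addresses first, STUN-discovered (srflx) next, TURN relay last.
const TYPE_PREFERENCE = { host: 126, prflx: 110, srflx: 100, relay: 0 };

// Extract address, port and type from a candidate attribute string:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
function parseCandidate(text) {
  const f = text.replace(/^candidate:/, '').split(' ');
  return { address: f[4], port: Number(f[5]), type: f[f.indexOf('typ') + 1] };
}

// Order candidates as described in the notes: host, then STUN, then TURN.
function rankCandidates(lines) {
  return lines.map(parseCandidate)
    .sort((a, b) => TYPE_PREFERENCE[b.type] - TYPE_PREFERENCE[a.type]);
}

// Illustrative candidates using documentation-only IP address ranges.
const ranked = rankCandidates([
  'candidate:3 1 udp 41885439 198.51.100.7 50000 typ relay raddr 203.0.113.5 rport 46154',
  'candidate:1 1 udp 2122260223 192.168.1.2 46154 typ host',
  'candidate:2 1 udp 1686052607 203.0.113.5 46154 typ srflx raddr 192.168.1.2 rport 46154'
]);
console.log(ranked.map((c) => c.type)); // order: host, srflx, relay
```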
If you are the one initiating the call, you would use navigator.mediaDevices.getUserMedia() (the replacement for the deprecated navigator.getUserMedia()) to get a video stream, then add the stream's tracks to the RTCPeerConnection.
Once that's been done, call RTCPeerConnection.createOffer() to create an offer, configure the offer, then send it to the server through which the connection is being mediated.
Once the offer arrives, navigator.mediaDevices.getUserMedia() is once again used to create the stream, whose tracks are added to the RTCPeerConnection.
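The media step that both sides perform could look roughly like the sketch below. It uses the promise-based `getUserMedia()` found on `navigator.mediaDevices` and `addTrack()`, which are the current replacements for the older stream-based calls. The helper name `attachLocalMedia` is hypothetical, and the media-devices object is passed in as a parameter so that the function does not depend on a browser global.

```javascript
// Request camera and microphone, then add each track to the connection.
// `mediaDevices` is expected to behave like `navigator.mediaDevices`.
async function attachLocalMedia(pc, mediaDevices) {
  const stream = await mediaDevices.getUserMedia({ audio: true, video: true });
  for (const track of stream.getTracks()) {
    // Associating the track with its stream lets the remote side group tracks.
    pc.addTrack(track, stream);
  }
  return stream;
}
```

In a browser this would be called as `attachLocalMedia(pc, navigator.mediaDevices)` before creating the offer on the calling side, or before creating the answer on the receiving side.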