Scalable Infrastructure for Distributed Video
Scalable Infrastructure for Distributed Video Document Transcript

  • 1. White Paper Scalable Infrastructure for Distributed Video June 2008
  • 2. Table of Contents
    ABSTRACT
    INTRODUCTION
    Endpoint Intelligence
    Lightweight Protocols
    Separating Signaling and Media
    Single-protocol vs. Multi-protocol Servers
    Server Networking
    Server Pool
    Scalable conferencing server
    Scalable Hardware Architecture
    Cascading Conferencing Servers
    Centralized Conferencing Resource Management
    Scalable gateways
    Signaling Gateway
    Media Gateway
    Presence scalability
    Federation
    Scalable Directories
    Dedicated Video Directory
    Application scalability
        Calendaring Application
        Recording Application
        Streaming Application
    Management and Provisioning Applications
    Integration with 3rd party Applications
    Scalable Firewall Traversal
    CONCLUSION
    References
    ©2008 Polycom, Inc. All rights reserved. Polycom and the Polycom logo are registered trademarks of Polycom, Inc. All other trademarks are the property of Polycom, Inc. or their respective companies.
  • 3. ABSTRACT
    Video as a tool for enterprise communication is leaving the conference room and becoming a standard element of everyday interactions and workflows. The benefits to organizations are many (faster, better-informed decision-making; improved knowledge sharing; reduced operating costs; and more), but the growth of video is having a profound impact on the scalability requirements of communication infrastructures. Where they were once required to support perhaps dozens of room-based video conferencing systems, enterprise communication infrastructures must now sustain up to tens of thousands of users, not only in conference rooms but at their desktops, in immersive environments, at remote sites, on mobile devices, and beyond. Additionally, communication over video occurs both as real-time point-to-point and multi-point interaction and as “on-demand” recording and viewing, in which content is stored, searched for, and replayed over the communication infrastructure.
    Distributed video provides a solution to the challenge of scalability, and an architecture that combines the high quality of room-based telepresence experiences with the ease of use of personal video. This paper explores several key components that enable the creation of a scalable, distributed video architecture supporting the evolving requirements of the video-enabled enterprise, as well as of service providers seeking to offer video services to their enterprise and SMB customers.
    Fundamental elements of a distributed video architecture include the call processing server, the conferencing server (MCU) and the gateway to other networks, while a more complete solution also includes directories and the ability to integrate with third-party applications. Today, presence capabilities provide important functionality and are considered a core requirement for communication systems. Therefore, this paper addresses the scalability of the call processing server, conferencing server and gateways as well as the scalability aspects of directories, presence servers and the means for integration with third-party applications.
  • 4. INTRODUCTION
    The heart of any communication system is the communication server. Also termed a ‘call manager’, ‘communication manager’, ‘gatekeeper’, ‘SIP server’, or ‘IP-PBX’, a communication server by any name performs the same essential task: keeping track of all communication endpoints in the network and providing services according to each endpoint’s profile, e.g., an endpoint installed in a corporate lobby is restricted from making external calls. Modern communication servers can also apply policies to a given user instead of an endpoint through a logon/authentication procedure, e.g., Bill in Customer Service can only receive calls, while Joe in Accounting can both place and receive calls. Communication servers also process calls among endpoints, keep track of call states, interact with endpoints in the network to provide logical prompts and options to users, and keep records for each call (Call Detail Records) for inter-departmental accounting and for billing, which is particularly important if the communication system is shared among several companies or provided by a service provider.
    What makes a communication server scalable? Scalability is basically the ability to serve more users: if one server can support a maximum of 1,000 users and another can support 10,000, the second server is 10 times more scalable than the first. Keeping 1,000 or 10,000 users in the server’s user database is usually not an issue. What really limits scalability is the number of calls per second that the server can process. Statistically, the more users are registered with the server, the more calls are placed. If the number of calls per second exceeds the maximum supported by the server’s architecture, the server becomes slow and starts rejecting or dropping calls. The number of calls per second that a server can process depends mainly on the complexity of the networking protocols.
    Endpoint Intelligence
    Protocols between an endpoint and a server can be stimulus or functional. For example, the proprietary signaling protocols used in legacy PBXs are stimulus protocols, as are some standard protocols such as MGCP. Stimulus protocols are used to keep endpoints simple and inexpensive. The endpoint’s profile (number of keys, size of display, etc.) as well as the call state information (e.g., whether the endpoint is on-hook or off-hook, whether there is an active call, etc.) are all kept in the server. If the user lifts the handset or presses a key, the endpoint sends a code to the server. The server then interprets the user’s action and sends the endpoint instructions on how to respond, e.g., what string of symbols to show on the display. The system therefore fully controls the endpoint and can place calls, answer calls and perform all call features on the endpoint’s behalf. Since all information about the device is stored centrally in the communication server, the scalability of such systems is limited.
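The relationship between registered users and calls per second described above can be made concrete with a small sketch. The function names and the traffic figure (two call attempts per user per hour) are illustrative assumptions, not values from the paper, and the model only considers the average rate, not bursts.

```python
# Illustrative capacity arithmetic; all names and traffic figures are assumptions.

def offered_calls_per_second(users: int, calls_per_user_per_hour: float) -> float:
    """Average call attempts per second generated by a user population."""
    return users * calls_per_user_per_hour / 3600.0


def max_supported_users(max_calls_per_second: float,
                        calls_per_user_per_hour: float) -> int:
    """Largest user count whose average call rate stays within the server limit."""
    return int(max_calls_per_second * 3600.0 / calls_per_user_per_hour)


if __name__ == "__main__":
    # Print the average call rate for a few population sizes.
    for users in (1_000, 10_000, 40_000):
        rate = offered_calls_per_second(users, calls_per_user_per_hour=2.0)
        print(f"{users} users -> {rate:.2f} calls/s")
```

Under these assumptions the user limit grows linearly with the call rate the server can sustain, which is why the paper treats protocol complexity, rather than database size, as the limiting factor.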
  • 5. Standard protocols such as the Session Initiation Protocol (SIP) and H.323, on the other hand, are functional, i.e., the endpoint itself is intelligent and can place calls, initiate transfers, etc. based on user input. The server only receives signaling messages and either executes them or passes them to other network elements that can execute them. Putting intelligence in the endpoints allows communication servers to be simplified and made more scalable.
    Lightweight Protocols
    Lightweight protocols such as SIP require fewer messages to set up and tear down a call [1]. The amount of call state information that the server has to store is also smaller. Both effects increase the scalability of the communication server. Another such protocol is the Lightweight Directory Access Protocol (LDAP), which endpoints and communication servers use to retrieve information from directories. The directory is the list of all users with their contact information. If the directory is embedded in the communication server, the complexity of the directory access protocol directly affects server performance, so a lightweight directory access protocol is a requirement for a scalable communication system. For example, management applications running on the Polycom Proxias™ Application Server and Development platform can query directories via LDAP. To increase scalability and decrease the load on the LDAP server, some of the information is cached and updated periodically.
    Separating Signaling and Media
    Media comprises the audio and video streams among endpoints. Audio is usually compressed with one of the standard G.7xx codecs, while video is usually compressed with one of the standard H.26x codecs. Processing media, and especially video, is very resource-intensive. Therefore, the best way to keep the communication server scalable is to process the media separately (see Figure 1). This is possible with most modern protocols: both SIP and H.323 clearly separate signaling from media [1]. If the communication server processes only signaling messages and no media, its scalability is higher by several orders of magnitude.
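The directory caching mentioned under Lightweight Protocols can be sketched as follows. This is a minimal illustration, not the Proxias implementation: the class name, the `fetch` callback (which stands in for a real LDAP query) and the five-minute lifetime are all assumptions, and entries are refreshed lazily when they expire rather than by a background job.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CachedDirectory:
    """Hypothetical read-through cache in front of a directory lookup."""

    def __init__(self,
                 fetch: Callable[[str], Optional[dict]],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # stand-in for an LDAP query (assumption)
        self._ttl = ttl_seconds      # how long a cached entry stays valid
        self._clock = clock          # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[float, Optional[dict]]] = {}
        self.backend_queries = 0     # counts lookups that reached the directory

    def lookup(self, user_id: str) -> Optional[dict]:
        now = self._clock()
        cached = self._entries.get(user_id)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]         # still fresh: no load on the directory
        self.backend_queries += 1
        entry = self._fetch(user_id)
        self._entries[user_id] = (now, entry)
        return entry
```

Repeated lookups of the same user within the lifetime reach the directory only once, which is the load reduction the text describes; the price is that a directory change may take up to one lifetime to become visible.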
  • 6. Figure 1: Scalability through signaling and media separation
    For example, Proxias is a signaling server that performs no media stream processing. As in the configuration in Figure 1, Proxias supports both H.323 and SIP, and either signaling stack allows video endpoints to establish RTP/RTCP streams. This architecture works very well within an enterprise. If a firewall must be traversed, however, many solutions recombine signaling and media and feed both through a so-called Session Border Controller, which again limits scalability. This is explained in more detail below.
    Single-protocol vs. Multi-protocol Servers
    Single-protocol servers inherently scale better than multi-protocol servers. The reason is that multi-protocol servers must translate every call from one protocol to another, i.e., they have to understand both message formats and keep call state information for both call legs (see Figure 2).
  • 7. Figure 2: Multi-protocol server architecture
    An analogy would be one group of people speaking the same language and another group speaking different languages through a translator. The second group will be slower in its discussion.
    Server Networking
    Following the guidelines above makes it possible to create a communication server that scales to as many as 10,000-40,000 users. What if we need a solution for a company that has 100,000 employees worldwide? Additional scalability can be achieved by networking communication servers. Networking means connecting two or more communication systems through a protocol, i.e., a common language understood by all systems in the network.
    Protocols for networking systems differ somewhat from the protocols between an endpoint and a server. The main difference is that in a system-to-system protocol each side is a server responsible for many endpoints. Exchanging information about each endpoint separately would be inefficient, so the information is aggregated and communicated through call routing rules. In H.323 networks, this is done with prefixes, i.e., server A is configured to know that if a user dials ‘4’ + 5 digits, the destination is a user on server B. If the user dials only 5 digits, server A knows the call is local and will try to route it to the appropriate user on server A. SIP uses the domain name concept, similar to the email system. Server A has its own domain, e.g., serverA.enterpriseX.com, and server B has its own, e.g., serverB.enterpriseX.com. User A on server A is identified by the address userA@serverA.enterpriseX.com, while user B on server B is identified by the address userB@serverB.enterpriseX.com. If user A dials the address of user B, server A will
  • 8. recognize the domain of server B and send the call to server B. Addresses in the above format are called Uniform Resource Identifiers (URIs). How will users remember these numbers and prefixes (in H.323) and URIs (in SIP)? Fortunately, they do not need to, because this information is stored in the directory, where an entry can be searched for and selected (clicked on) to place a call. Figure 3 shows server networking and the use of directories.
    Figure 3: Scalability through server networking
    For example, several servers running video applications on top of Proxias can register with a gatekeeper or a SIP server to provide scalability beyond that of a single system.
    Server Pool
    In the discussion above, each communication server in the network serves different users. Another way of deploying multiple servers is as a redundant pool of resources that serves one larger group of users. This configuration provides redundancy with two servers and even higher redundancy with three or more. If one server fails or is taken down for maintenance (e.g., a software upgrade), incoming calls are simply routed to another operational server. Figure 4 shows the configuration with two servers.
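The two routing rules described under Server Networking (H.323 prefixes and SIP domains) can be sketched as below. The function names and the routing-table layout are hypothetical; the five-digit extension length and the ‘4’ prefix come from the example in the text, and real gatekeepers and SIP proxies apply considerably richer rules.

```python
EXTENSION_DIGITS = 5  # extension length taken from the example in the text


def route_by_prefix(dialed: str, prefix_routes: dict, local_server: str) -> str:
    """H.323-style routing: a bare extension is local, prefix + extension is remote."""
    if len(dialed) == EXTENSION_DIGITS:
        return local_server
    for prefix, server in prefix_routes.items():
        if dialed.startswith(prefix) and len(dialed) == len(prefix) + EXTENSION_DIGITS:
            return server
    raise ValueError(f"no route for dialed number {dialed!r}")


def route_by_domain(uri: str, local_domain: str, local_server: str) -> str:
    """SIP-style routing: the domain part of user@domain selects the next hop."""
    address = uri[4:] if uri.lower().startswith("sip:") else uri
    user, sep, domain = address.partition("@")
    if not user or not sep or not domain:
        raise ValueError(f"not a user@domain address: {uri!r}")
    if domain.lower() == local_domain.lower():
        return local_server
    return domain  # hand the call to the server responsible for that domain
```

In both cases server A needs one rule per remote server rather than one entry per remote endpoint, which is the aggregation the text refers to.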
  • 9. Figure 4: Server redundancy and load balancing
    This is, for example, the default configuration for video applications on Proxias. Two servers are deployed in a cluster configuration; additional servers can be added if needed. The database engine and storage are deployed on both servers, and the two databases are kept in sync in real time through a data replication mechanism. This approach provides a simple deployment scenario and removes the single point of failure.
    Scalable conferencing server
    The conferencing server, also called a Multipoint Control Unit (MCU) in the H.323 architecture, is the main component for multipoint calls. It receives audio and video streams from each endpoint participating in the conference, combines multiple images into one (a technology known as Continuous Presence) and sends the combined image to the participating endpoints.
    The conferencing server can translate audio and video from one format to another, e.g., receive video in H.263 and send it in H.264, or receive audio in G.711 and send it in Siren 22. This function is known as transcoding and requires significant computing resources, typically provided by Digital Signal Processors (DSPs). Unlike general-purpose media servers, conferencing servers are designed specifically for processing business-quality video. Transcoding is especially demanding for video because it involves decoding the digital video stream from one format into uncompressed video and then encoding it in another format. Note that scalability can be increased by using conference servers in video switched mode, which circumvents transcoding (so the server needs far fewer computing resources) but also limits flexibility because all parties have to use the best common codec, resolution, and bit rate.
  • 10. The external interfaces of the conference server require very high input/output speeds for the multiple audio-video streams. For example, if a server supports 80 participants at 4 megabits per second each (a normal speed for High Definition video with High Definition content sharing), the conferencing server must support 80 × 4 = 320 megabits per second of input (from endpoints to server) and another 320 megabits per second of output (from server to endpoints). Internally, the server works with uncompressed video, which takes many gigabits per second on the internal interfaces and requires fast internal communication links.
    Scalable Hardware Architecture
    One way to achieve scalability in the conferencing server is to deploy a scalable hardware architecture such as AdvancedTCA (ATCA), which is used in the Polycom RMX 2000. AdvancedTCA is a standard blade architecture, i.e., standard ATCA blades are plugged into a standard ATCA chassis. This architecture delivers high-speed external interfaces, even faster internal interfaces, and large blades with ample space, electrical power, and cooling capacity to accommodate the array of DSPs necessary to process video [2]. Figure 5 shows a multi-point configuration with a conferencing server.
    Figure 5: Conference server HW scalability
    ATCA allows larger and even more powerful servers to be built: the existing ATCA blades can be used in larger chassis hosting up to 14 blades. Note that an alternative path to scalability is to stack small conferencing servers. However, this approach typically introduces additional delay, since media travels from one server to another via external interfaces. The stacking approach is also inefficient from a power
  • 11. consumption and cooling perspective because each server has a separate power supply unit and separate fans for cooling. The ‘green’ aspects of Polycom’s ATCA technology are discussed in detail in the article ‘AdvancedTCA - Green Technology for Data Centers’ in CompactPCI and AdvancedTCA Systems magazine [3].
    Cascading Conferencing Servers
    When the scalability of a single conferencing server is exhausted, multiple servers can be connected through so-called ‘cascading’ to handle a larger number of conferences and participants. Cascading is a mechanism by which one conference server creates a link to another conference server. This is necessary, for example, when more participants want to join a conference than any single server has resources for. The conference server, or an application managing the server, recognizes that participants on two or more conference servers have joined a conference with the same conference identification and password. It then creates a link between the servers, thus connecting all participants in a single conference. Two things are critical: the speed of creating the cascading link (ideally, the process should be hidden from the user) and the ability to mask the additional delay introduced by the additional (cascaded) call leg. Another technical issue, long since resolved in Polycom equipment, is the picture-in-picture-in-picture effect caused by multiple Continuous Presence instances.
    Centralized Conferencing Resource Management
    To make a pool of conference servers behave as one huge conferencing server, we need a resource management application that tracks incoming calls, routes them to the appropriate resource (e.g., based on available server resources but also on the available bandwidth to the location of that server) and automatically creates cascading links if a conference overflows to another server. Note that if the conference is predefined, the application server can select a conferencing server that has sufficient resources to handle the number of participants at the required bandwidth. Overflow situations are likely with ad-hoc conferencing, where participants join spontaneously without any upfront reservation of resources. Figure 6 shows an example with a Polycom management application running on Proxias and managing the resources of three RMX 2000 conferencing servers.
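The placement logic described under Centralized Conferencing Resource Management can be sketched as follows. This is a simplified illustration under stated assumptions, not Polycom's algorithm: the `ConferenceServer` fields, the best-fit choice and the largest-first overflow order are hypothetical, and the ports and bandwidth that the cascading links themselves consume are ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ConferenceServer:
    name: str
    free_ports: int
    free_bandwidth_mbps: float  # bandwidth available towards this server's location


def place_conference(servers: List[ConferenceServer],
                     participants: int,
                     mbps_per_participant: float) -> List[Tuple[str, int]]:
    """Return (server, participants) allocations; more than one entry means
    the servers involved would have to be linked by cascading."""
    needed_mbps = participants * mbps_per_participant
    fits = [s for s in servers
            if s.free_ports >= participants and s.free_bandwidth_mbps >= needed_mbps]
    if fits:
        # Best fit: keep the largest servers free for large conferences.
        best = min(fits, key=lambda s: s.free_ports)
        return [(best.name, participants)]

    # Overflow: spread the conference over several servers, largest first.
    allocation: List[Tuple[str, int]] = []
    remaining = participants
    for server in sorted(servers, key=lambda s: s.free_ports, reverse=True):
        capacity = min(server.free_ports,
                       int(server.free_bandwidth_mbps // mbps_per_participant))
        take = min(capacity, remaining)
        if take > 0:
            allocation.append((server.name, take))
            remaining -= take
        if remaining == 0:
            return allocation
    raise ValueError("not enough conferencing resources in the pool")
```

A predefined conference corresponds to the first branch, where a single server with sufficient resources is selected up front; the second branch models the ad-hoc overflow case that requires cascading.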
  • 12. Figure 6: Managing distributed conference resources
    The management application is designed to provide uninterrupted service by routing calls around failed or busy media servers. It also makes it possible to “busy out” media servers for maintenance. From the user’s point of view, the service is always available. The system can grow gradually from small deployments of one or two media servers to large deployments with many geographically dispersed media servers. System administrators can monitor daily usage and plan the expansion as necessary. This approach also provides a centralized mechanism for deploying a front-end application to control and monitor conferencing activities across all media servers.
    The management application can also act as a load balancer in this scenario, i.e., it can distribute the load over a group of servers. The larger the resource pool, the more efficient the load balancing function. This is very important to large global enterprises that have offices and conferencing servers spread across the world. Service providers can use the same technology to offer conferencing services globally by deploying video conferencing servers, and audio conferencing servers such as ReadiVoice, at central points of the network.
    The scenario works well in architectures such as SIP, where the Registrar function is separate from the Proxy function, i.e., the endpoint is registered with a SIP Registrar in the network but sends its calls to a pool of SIP Proxies.
    Scalable gateways
    Gateways are the gates to other networks. If we assume a scalable SIP deployment, gateways will be necessary to connect to the installed base of H.323 or ISDN systems.
  • 13. For connectivity to mobile video deployments, a gateway to H.324M may be required. Gateways are especially important when a new technology is rolled out, e.g., when new SIP systems are installed, because most of the users you want to talk to will likely still be using legacy systems. Most calls in these early stages will therefore be gateway calls. Gateways are less important in green-field installations (without any legacy equipment, or where connection to outside legacy systems is not desired) or once the new network has reached critical mass and most calls stay within the same domain/protocol.
    Signaling Gateway
    If a gateway is required, its scalability is critical in the early days of deploying a new technology. As with communication servers, the best way to achieve scalability is to separate signaling from media and limit the gateway function to signaling only. In this case the gateway is no different from a multi-protocol communication server: it receives messages in one format (e.g., SIP) and translates them into another (e.g., H.323), and vice versa. This architecture does not allow the gateway to scale to the level of a single-protocol communication server, but it can handle a much higher load than a gateway that also processes media. As an illustration of the performance impact, if a single-protocol communication server scales to 30,000 users, adding support for a second protocol (in effect, creating a signaling gateway) may reduce the scalability to 3,000 simultaneous calls. If media processing is added to signaling processing, scalability may go down to 300 simultaneous audio-only calls or to 30 simultaneous audio-video calls.
    A signaling gateway is a feasible solution only if the media (audio and video) is in the same format on both sides. For example, both H.323 and SIP use the Real-time Transport Protocol (RTP) for media and are therefore candidates for signaling-only gateway interoperation. Figure 7 shows the configuration.
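The feasibility rule above (a signaling-only gateway works only when both sides use the same media format) can be sketched as a simple decision function. The function name, the dictionary layout and the mode labels are hypothetical; the capacity figures are the illustrative numbers quoted in the text, not measurements.

```python
# Illustrative figures quoted in the text for one hypothetical server.
ILLUSTRATIVE_MAX_CALLS = {
    "signaling-only": 3000,
    "media (audio)": 300,
    "media (audio+video)": 30,
}


def choose_gateway(side_a: dict, side_b: dict) -> str:
    """Pick the lightest gateway mode that can connect two call legs.

    Each side is described as {"audio": codec} or {"audio": codec, "video": codec}.
    """
    if side_a == side_b:
        # Same media format end to end: only the signaling needs translating.
        return "signaling-only"
    if "video" in side_a or "video" in side_b:
        return "media (audio+video)"
    return "media (audio)"


if __name__ == "__main__":
    mode = choose_gateway({"audio": "G.711", "video": "H.264"},
                          {"audio": "G.711", "video": "H.263"})
    print(mode, ILLUSTRATIVE_MAX_CALLS[mode])
```

The sketch makes the cost of a format mismatch visible: under the paper's illustrative figures, one differing video codec moves a call from the 3,000-call class to the 30-call class.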
  • 14. Figure 7: Signaling gateway between SIP and H.323
    There are several issues with the signaling gateway approach. Probably the most important is that media encryption breaks if each side follows its respective standard. SIP, for example, refers to the Secure Real-time Transport Protocol (SRTP) for media encryption. This mechanism is completely different from the Advanced Encryption Standard (AES) encryption specified by H.323. Therefore, if you follow the standards on both sides and use a signaling gateway, you have to disable encryption, i.e., send audio and video in the clear.
    Media Gateway
    Using a media gateway overcomes the security problem and gives network administrators more flexibility during the transition from one protocol to another. It does limit scalability, since the media gateway often needs to transcode video, which requires DSPs and fast external and internal interfaces. Like a conference server, a media gateway can scale by avoiding transcoding. The media gateway controls the communication with the endpoints, and transcoding is necessary only if the endpoints negotiate different audio/video algorithms, resolutions or bit rates. If the gateway enforces the same audio/video algorithm, resolution and bit rate between the endpoints, no transcoding is necessary.
    The media gateway is therefore very similar to a conferencing server, and if a call goes through both a conferencing server and a gateway (see Figure 8), it may be transcoded twice, which typically results in decreased picture quality. Why is that? Consider the analogy with language translation. If you translate from English to German, you lose some information but the quality is still acceptable. If you then give the German
  • 15. version to someone else to translate into Russian, the final version will be far from the original, and probably not acceptable.
    Figure 8: Media gateway configuration
    The logical question, therefore, is: why not use the conferencing server as a gateway? It must already be in the network, and it has the functionality required to support multiple protocols, e.g., the RMX 2000 supports H.323, SIP, ISDN and PSTN. Note that a media gateway (with transcoding) is a must when the connected networks have completely different physical layers, e.g., H.323 to ISDN or SIP to H.324M. The only disadvantage of using the conferencing server as a gateway is the relatively high price per port.
    Presence scalability
    Over the last few years, presence has become entrenched in business communications, mainly through the use of Microsoft Office Communications Server and IBM Sametime. In recent industry discussions (early 2008), presence was cited as a key component of unified communications. Presence is delivered through a client-server architecture; XMPP and SIMPLE are the two prevailing protocols for implementing it. The user interacts with the presence client, which communicates with the presence server. The server keeps track of the presence status of all users on the system and can obtain presence information for users on other presence systems through so-called federation (a form of networking). As with other servers, there are two ways to scale: create a scalable presence server that can
  • 16. handle tens of thousands of users (Figure 9) or interconnect multiple presence servers in a network. Figure 9: Scalable presence server architectures With the ever-growing number of tools that change the presence status automatically and the growing number of contacts that users add to their buddy lists, presence servers are required to handle many frequent updates and communicate them to a large group of users. The server then has to keep larger tables and communicate each change of presence status to a larger group of clients. When the scalability of a single server is exhausted, networking techniques such as federation are used to support larger deployments. Note that while Instant Messaging is usually mixed with presence, it is a completely separate functionality that does not necessarily belong to a presence server. Federation Federation is a trust relationship between presence servers that allows them to exchange information about the presence status of their users. Figure 10 shows a federation relationship between two presence servers.
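The fan-out load and the federation relationship described above can be illustrated with a small model. This is only a sketch under stated assumptions: the `PresenceServer` class, its method names and the `user@domain` addressing are hypothetical and do not correspond to XMPP, SIMPLE or any Polycom product API. It shows that every status change costs one notification per watcher, and that a watcher in a federated domain is reached through a server-to-server hop.

```python
from collections import defaultdict


class PresenceServer:
    """Toy presence server (hypothetical model, not a real protocol stack)."""

    def __init__(self, domain):
        self.domain = domain
        self.status = {}                  # user -> last published status
        self.watchers = defaultdict(set)  # user -> watchers of that user
        self.peers = {}                   # federated domain -> PresenceServer
        self.delivered = []               # (watcher, user, status) delivered to local clients

    def federate(self, other):
        """Establish a two-way trust relationship with another server."""
        self.peers[other.domain] = other
        other.peers[self.domain] = self

    def subscribe(self, watcher, user):
        """Register `watcher` (local or remote) for status changes of `user`."""
        self.watchers[user].add(watcher)

    def publish(self, user, status):
        """Store the new status and fan it out to every watcher."""
        self.status[user] = status
        for watcher in sorted(self.watchers[user]):
            domain = watcher.split("@", 1)[1]
            if domain == self.domain:
                self.delivered.append((watcher, user, status))
            elif domain in self.peers:
                # Server-to-server hop: the peer delivers to its own client.
                self.peers[domain].delivered.append((watcher, user, status))
            # Watchers in domains without a trust relationship get nothing.
```

In this model the cost of `publish` grows with the size of the watcher set, which is the reason long buddy lists and automatic status changes stress a single server.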
  • 17. Figure 10: Scalability through federation Note that the term ‘federation’ is sometimes used for more than just the exchange of presence information, e.g., the exchange of directory information, gatekeeper neighboring information and licensing information may also be called ‘federation’. Of the two standard protocols for presence (XMPP and SIMPLE), XMPP has found wider adoption on the Internet, which suggests that higher scalability can be expected from this protocol. XMPP server federation follows the proven and scalable model of Internet email, which meets the needs of the individual domain for flexibility and control. Each XMPP domain can define the level of security, quality of service and manageability that makes sense for the organization. Presence information within one XMPP domain is exchanged through the XMPP server in this domain. The server exchanges presence information with peer XMPP servers in other organizations. Scalable Directories As discussed in the section on the communication server, directories solve the problem with different dialing formats. Corporate IT organizations have been converging dozens of directories into one directory structure that allows changes (adds, moves, and deletes) to be automatically propagated to applications across the enterprise. The goal is to be able to add or remove an employee in one master database and have all tools that the employee uses (email, phone, presence/IM, and web) automatically learn about the change. The Lightweight Directory Access Protocol (LDAP) emerged as the standard for accessing directories, any directory. Polycom VC2 postulates that visual communication will be closely integrated in the corporate IT environment and this includes integrating
  • 18. the directory of visual communication users with the IT directory. The ITU-T H.350 standard describes an LDAP schema for visual communication users, i.e., H.350 describes how to store VC-specific parameters in an LDAP database. Figure 11 describes the configuration. Figure 11: Directory access mechanism LDAP and H.350 are supported in many popular directories such as Microsoft Active Directory and OpenLDAP. Dedicated Video Directory Polycom’s VC2 vision clearly sees video fully integrated with the IT infrastructure in the enterprise. This includes integrating video directories with the IT directories that keep user information for network access, VPN access, email, web access, etc. This integration, however, will be phased in over multiple stages. In the first stage, a dedicated video directory will communicate with the IT directory using the standard LDAP protocol. There are three reasons for using a dedicated video directory. First, endpoints today use pre-LDAP protocols such as the Polycom Global Address Book (GAB) protocol. Keeping the video directory separate allows support of GAB and, therefore, of legacy endpoints. Second, including a directory with the communication server makes a lot of sense since enterprises may want to pilot new technology and will want to get the system up and running quickly, i.e., without immediate integration with the corporate IT directory. Third, enterprise IT directories may not yet be configured for the H.350 schema (although they all have the capability to support it). Figure 12 describes the network configuration.
  • 19. Figure 12: Using internal directory connected to IT directory Once the system is approved and a decision is made to integrate it with the IT directory, the internal directory can be connected to the corporate IT directory. If there is no need to support legacy endpoints, the internal directory can be turned off and only the main IT directory can be used (see Figure 13). Figure 13: Direct integration with corporate IT directory
  • 20. The enterprise directory is scalable, as it already holds all employees with their profiles. It is important to understand the impact of the H.350 extensions on the scalability of the directory. Depending on the implementation, the directory may be queried more or less frequently. LDAP is not really designed for fast lookups, and the directory behind the LDAP server component may be an old X.500 directory running on a mainframe. It is therefore a best practice to cache the information in the communications server or in the endpoints for a predefined period of time, so that frequent repeated queries for the same information are avoided. Application scalability It is not sufficient that only the visual communication core system is scalable. Applications that are connected to it must scale as well. This paper focuses on the applications of scheduling, recording, streaming, management and on the scalable integration with third-party applications. Calendaring Application While there is a trend of moving from scheduled to ad-hoc video call initiation, scheduling and calendaring applications remain very important to the regular video user. Scheduling involves the use of a separate tool to set the time, participants and resources for a video call in the future; calendaring provides close integration with standard calendaring software such as Microsoft Outlook/Exchange and IBM Lotus Notes. An example of a calendaring application is the Polycom SE200, which integrates with both Outlook and Lotus Notes. Figure 14 explains the configuration for such integrations. Figure 14: Calendaring application
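The caching practice recommended above for directory lookups can be sketched as a small time-to-live cache placed in front of the slow query. Everything here is an illustrative assumption: the `CachedDirectory` name, the 300-second default and the `lookup` callable (which stands in for whatever LDAP query the communications server or endpoint actually performs) are not taken from any product.

```python
import time


class CachedDirectory:
    """Caches directory answers for a fixed period (hypothetical helper)."""

    def __init__(self, lookup, ttl_seconds=300.0, clock=time.monotonic):
        self._lookup = lookup    # slow call, e.g. a query against the LDAP server
        self._ttl = ttl_seconds  # the "predefined period of time"
        self._clock = clock      # injectable so the behavior can be tested
        self._cache = {}         # key -> (expiry time, cached value)

    def get(self, key):
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: no directory query
        value = self._lookup(key)
        self._cache[key] = (now + self._ttl, value)
        return value
```

The trade-off is staleness: a directory change becomes visible to the caller only after the cached entry expires, so the period should be chosen with the expected rate of adds, moves and deletes in mind.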
  • 21. Both Microsoft Exchange and IBM SameTime are scalable applications: they use the networking mechanisms described in Figures 3 and 4 to support tens and even hundreds of thousands of users in the network. Based on the discussion so far in this paper, we can assume that the visual communication network is also scalable to the same magnitude. The performance bottleneck and limiting factor for scalability is, therefore, the link between the two systems. This also applies to integration with any third-party application, as discussed later in this paper. The speed of the interaction depends on the simplicity of the API or protocol used for the integration and on the amount of information that the applications have to exchange. Recording Application Recording video communications has become very popular since the release of recording and playback solutions such as Polycom RSS 2000. It is widely used for recording lectures and training sessions that can later be viewed by users who were not able to join the live session. For more convenience, new Polycom HDX endpoints have additional record, playback, and other navigation keys on their remote controls. Any video endpoint can connect to the recording server and initiate recording of the video and content channels. Multipoint calls can be recorded when the conferencing server connects to the recording server. The conferencing server treats the recording server as an endpoint and can send it any format that the conferencing server supports. For example, Continuous Presence views can be used for recording multiple sites at the same time. Once the recording is complete, any video endpoint can connect to the recording server, authenticate, select the recording and play it back. Figure 15 depicts the configuration for recording from a conferencing server (Polycom RMX 2000) and playback from a video endpoint.
  • 22. Figure 15: Recording and playback Recording requires processing of the video media, and the recording server is, in this way, similar to a conferencing server. Therefore, the methods for increasing server scalability discussed above (Figures 5 and 6) apply here as well. RSS 2000 uses a load balancing algorithm that allows multiple servers to operate as a pool of resources. If the recording ports on one server are busy, the incoming recording request is sent to the next available server. Streaming Application Streaming is another application that extends the use of video beyond video endpoints and beyond real-time live calling. Most people think of streaming only in terms of previously recorded video content. In fact, streaming is also a very powerful method for distributing live events to very large audiences. For example, if the RSS 2000 is connected to a video call, it streams both the live video and the presentation (content) channel to hundreds of streaming clients. The streaming client receives the information with a minimum delay of about 10-15 seconds, which still allows interaction through instant messaging, e.g., in a Q&A session. Streaming is, by definition, a great way to increase scalability. Streaming is unidirectional, i.e., audio and video are only transported from the streaming server to the streaming client, so the client can be a very simple player that runs on general purpose computers. Today, most clients run on PCs. However, with the increase of bandwidth in wireless networks and with the support of media players in mobile devices, streaming to mobile devices will become more common.
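Returning to the recording server pool described on this page, the behavior "if the ports on one server are busy, send the request to the next available server" can be sketched as follows. This is not the RSS 2000 algorithm, whose details the paper does not give; the class, the function name and the port counts are assumptions chosen for illustration.

```python
class RecordingServer:
    """One member of a recording pool (hypothetical model)."""

    def __init__(self, name, ports):
        self.name = name
        self.ports = ports  # total recording ports on this server
        self.in_use = 0     # ports currently recording

    def has_free_port(self):
        return self.in_use < self.ports


def assign_recording(pool):
    """Give the request to the first server in the pool with a free port.

    Returns the chosen server's name, or None when every port is busy.
    """
    for server in pool:
        if server.has_free_port():
            server.in_use += 1
            return server.name
    return None
```

With this first-fit policy the pool's capacity is simply the sum of the ports of its members, so adding a server adds capacity without any change on the requesting side.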
  • 23. Streaming servers have many options to increase scalability. For example, they can limit the bit rate and thus use fewer resources for processing a particular stream. They can also limit the number of different bit rates to be supported, and only stream, for example, in one or two formats at the same time. This also makes the streaming server more scalable. Figure 16 describes the streaming application. Unicast means ‘using point-to-point connections’: the streaming server creates a separate connection (stream) for each of the clients. Polycom offers streaming from the RSS 2000 and from the video content management server, the VMC 1000. Figure 16: Streaming application (unicast) The ultimate way of creating a scalable streaming platform is to use IP Multicast. This technology has been available for some time: RFC 1112 [4] was created in 1989. However, it took years for the switching and routing manufacturers to support the protocol. IP Multicast was not a requirement in the Internet, and corporate network administrators did not push for implementing it due to concerns that multicast traffic would clog the IP network. Only in recent years have serious considerations around distributing audio and video streams in a scalable manner throughout the enterprise led to enabling IP multicast in corporate networks. Figure 17 describes how IP multicast delivers media streams to streaming clients.
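The difference between the two delivery models can be made concrete with a little arithmetic on the load leaving the streaming server. The function names and the example figures (500 clients, a 384 kbps stream) are assumptions for illustration only and are not taken from the paper.

```python
def unicast_egress_kbps(clients, stream_kbps):
    """Unicast: the server sends one separate copy of the stream per client."""
    return clients * stream_kbps


def multicast_egress_kbps(stream_kbps):
    """Multicast: the server sends the stream once, whatever the audience size."""
    return stream_kbps


# Illustrative numbers only: 500 viewers of a 384 kbps stream.
unicast_load = unicast_egress_kbps(500, 384)  # 192000 kbps leave the server
multicast_load = multicast_egress_kbps(384)   # 384 kbps leave the server
```

Under unicast the server load grows linearly with the audience, while under multicast it stays constant and the replication work moves into the routers.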
  • 24. Figure 17: Streaming via IP Multicast Once IP multicast is enabled in the IP network, the streaming server (in this case VMC 1000) only needs to send the stream once to the defined multicast IP address. IP addresses in the range from 224.0.0.0 through 239.255.255.255 are defined as multicast addresses and are recognized as such by the IP networking equipment. When the first video endpoint “joins” the multicast, it signals the router, through IGMP [5], to forward the multicast packets to its IP subnet. If a second endpoint on the same IP subnet signals the router to join the same multicast, the router does not send a second stream but simply continues to send the single multicast stream to the IP subnet. Since multicast packets are delivered to the subnet much like broadcast packets, all endpoints on the same subnet can receive them. Once all of the endpoints have signaled to “leave” the multicast (again with IGMP) or have timed out, the router stops forwarding the multicast stream to this IP subnet. Management and Provisioning Applications Management and provisioning are usually lumped together because they are both related to configuring and monitoring the video endpoints, conferencing servers, end user personal video accounts, applications and all other components of the ‘video network’. Based on the VC2 vision, management and provisioning will seamlessly integrate with standard IT management and provisioning tools, eliminating the need for separate, proprietary video-only management and provisioning applications. In the area of provisioning, it can be expected that some functions in video endpoints and conferencing servers are so specific that it will not be possible to provision them through general purpose IT tools. More details on management and provisioning applications are offered in a separate white paper [6].
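The multicast address range and the join/leave behavior described for IP multicast streaming can be modeled in a few lines. This is a sketch of the bookkeeping only, under stated assumptions: `SubnetForwarder` and its methods are hypothetical names, no IGMP messages are actually sent, and membership timeouts are left out. The address check uses the standard-library `ipaddress` module.

```python
import ipaddress

# 224.0.0.0 through 239.255.255.255, i.e. the 224.0.0.0/4 block.
MULTICAST_RANGE = ipaddress.ip_network("224.0.0.0/4")


def is_multicast(address):
    """True if the IPv4 address lies in the multicast range."""
    return ipaddress.ip_address(address) in MULTICAST_RANGE


class SubnetForwarder:
    """Tracks which multicast groups a router forwards to one IP subnet."""

    def __init__(self):
        self.members = {}  # group address -> endpoints that joined

    def join(self, group, endpoint):
        if not is_multicast(group):
            raise ValueError(f"{group} is not a multicast address")
        self.members.setdefault(group, set()).add(endpoint)

    def leave(self, group, endpoint):
        joined = self.members.get(group, set())
        joined.discard(endpoint)
        if not joined:
            # Last member gone: stop forwarding this group to the subnet.
            self.members.pop(group, None)

    def streams_forwarded(self, group):
        """One stream reaches the subnet regardless of how many endpoints joined."""
        return 1 if group in self.members else 0
```

The point of the model is that `streams_forwarded` never exceeds one per group and subnet, which is what keeps the network load independent of the number of viewers on that subnet.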
  • 25. Integration with 3rd party Applications We already discussed integration with calendaring applications such as MS Outlook/Exchange and IBM Lotus Notes. The integration with other applications (IP-PBX, Instant Messaging, Presence, soft switches, call center apps, messaging apps) follows a similar architecture. Figure 18: Integration with third-party applications The integration is either through a standard protocol such as H.323 or SIP, or through a proprietary Application Programming Interface (API). Using a standard protocol for the integration is the preferred method because standard protocols are well documented in specifications that either side of the integration can access. Since the integration options within a standard protocol are limited, the integration work can be reused for other integrations. Integrations through protocols tend to be more scalable since the IP network allows for networking and redundancy as discussed above. Most current Polycom integrations with third-party applications are based on the SIP protocol, which delivers the best scalability. APIs, on the other hand, are specific to a vendor and product, and an API integration is rarely portable to other products and vendors. API integrations also tend to be less scalable since the mechanisms for network scalability discussed earlier in this paper cannot be used. Scalable Firewall Traversal Firewalls constitute a major problem for IP communications, both VoIP and visual communications. Application-aware firewalls have been discussed for a long time, but in reality, most firewalls require a traversal mechanism such as H.460.17/18/19 for the H.323 family of protocols and ICE for SIP.
  • 26. Both architectures require a Session Border Controller (SBC) that is usually deployed in the so-called De-Militarized Zone (DMZ) of the corporate headquarters or the service provider (SP) data center. There are two connectivity requirements for an SBC. It must have a public IP address to connect to endpoints over the Internet. It also must connect to the internal servers (H.323 gatekeepers, SIP servers, MCUs, recording and streaming servers) in the corporate headquarters or SP data center. This example uses Polycom Video Border Proxies (VBPs). The largest VBP is the 6400 model, which currently has a throughput of 85 Mbps, followed by the VBP 5300 with a throughput of 25 Mbps. Both models can be installed in the DMZ of the corporate headquarters. The smaller VBP 4350 model supports up to 3 Mbps and is designed for branch offices with up to 15 video endpoints. The smallest model, the VBP 200, supports up to 1 Mbps and is designed for small office / home office (SOHO) installations with 1-2 video endpoints. In many cases, media (both video and audio) needs to go through the SBC in the DMZ, and the throughput of this box limits scalability. Architecturally, the best way to approach the problem is to implement proxy functionality in the branch and SOHO boxes (hence the name ‘video border proxy’) and send the media directly to the destination, thus avoiding the loop through the SBC. This mechanism is supported in all Polycom VBPs and allows scalable implementations with tens of thousands of endpoints behind firewalls. CONCLUSION Polycom’s VC2 vision stresses the importance of scalability for transforming today’s video conferencing into tomorrow’s visual communication, which will be scalable, reliable and deeply integrated with the core IT infrastructure. As we saw in this paper, scalability is a very complex topic and touches on all system components. 
Any of the components has the potential to create a bottleneck and limit the performance and the scale of the system. The technologies discussed in this paper overcome these bottlenecks and are the solid foundation of a scalable visual communication system. Polycom has developed deep expertise in the core areas of networking and created an architecture targeting the high-performance requirements of future visual communication networks. This architecture is designed to support the large number of video users who will embrace the new kinds of VC2 applications that will bring video into the mainstream and make visual communication essential in both our personal and professional lives.
  • 27. REFERENCES 1. Karapetkov, S.H., 2008. Migrating Visual Communications from H.323 to SIP. Polycom Whitepaper, April 2008. pp. 1. 2. Karapetkov, S.H., 2007. Polycom and ATCA. Polycom Whitepaper, June 2007. pp. 1. 3. Karapetkov, S., 2008. AdvancedTCA - Green Technology for Data Centers. Article in CompactPCI and AdvancedTCA Systems. http://www.compactpci-systems.com/articles/id/?3104. 4. Deering, S., 1989. RFC 1112 Host Extensions for IP Multicasting. IETF Document, August 1989. pp. 1. 5. Fenner, W., 1997. RFC 2236 Internet Group Management Protocol. IETF Document, November 1997. pp. 1. 6. Karapetkov, S., 2008. Management and Provisioning of Large-Scale Video Networks. Polycom Whitepaper, June 2008. pp. 1.