A Ready Market: Introducing H.264-SVC


  1. 1. A Ready Market: Introducing H.264-SVC Next-Generation Technology for Videoconferencing Over IP and 3G Networks Page 1 Copyright © 2006 Wainhouse Research, LLC
  2. 2. “A READY MARKET” Introducing H.264-SVC: next-generation technology for videoconferencing over IP and 3G networks March 2006
  3. 3. About the Author Andrew W. Davis, founder and Managing Partner of Wainhouse Research LLC, has more than fifteen years experience as a technology consultant and industry analyst. Prior to founding Wainhouse Research, Andrew held senior marketing positions with several large and small high-technology companies. He has published over 250 trade journal articles and opinion columns on multimedia communications, image and signal processing, videoconferencing, and corporate strategies as well as numerous market research reports and is the editor of the conferencing industry's leading newsletter, The Wainhouse Research Bulletin. Andrew specializes in videoconferencing, rich media communications, strategy consulting, and new business development. Mr. Davis holds B.S. and M.S. degrees in engineering from Cornell University and a Masters of Business Administration from Harvard University. About Wainhouse Research Wainhouse Research is an independent market research firm that focuses on critical issues in the rich media conferencing and unified collaboration. The company conducts multi-client and custom research studies for industry vendors, consults with end users on key implementation issues, publishes a newsletter, white papers, and market statistics, and delivers public and private seminars at industry group meetings. About the Sponsor Vidyo™ creates VidyoConferencing™ solutions that provide quality experiences to all environments from the home-office desktop to the dedicated corporate video-conferencing facility. Vidyo, the first company to apply the H.264 Scalable Video Coding (SVC) standard to video conferencing, delivers HD/Telepresence quality enhanced by industry-best resilience and low-latency — and it manages to achieve all this over general-purpose IP networks. 
Vidyo’s family of end-to-end solutions for OEMs and organizations can support multi-point connections that include a variety of different platforms ranging from Mac & Windows desktops to dedicated room solutions. No dedicated networks required. For more information, visit www.vidyo.com. © Vidyo, Inc 2006 - Confidential A Ready Market - 2
  4. 4. Contents
       Overview: Key Market Trends
         Enterprise Solutions
         Consumer Applications
         3G Wireless
       Technology Innovations
       Challenges Facing Visual Communications
         Resilience
         Communications Quality
         Scalability
         Ease of Use
         Cost
       Introducing H.264 – SVC
       How SVC addresses today’s challenges
       Sponsor Information
  5. 5. Overview: Key Market Trends The convergence of multimedia technology with the Internet and the rapid adoption of broadband IP services for both consumer and enterprise communications is creating strong demand for improved digital media delivery. Video is at the forefront for home users as well as enterprise knowledge workers. Both constituencies are now demanding higher quality and easier-to-use video telephony and videoconferencing solutions. As a result, vendor and service provider interest is exploding along several different deployment and service models. Enterprise Solutions Leading enterprise software vendors such as IBM Lotus, Microsoft, Oracle, and SAP are working to embed voice and video communications into their next generation workflow tools. Future users will be able to launch multimedia calls without leaving their high level business process applications. Video will soon become a “feature” of every day productivity solutions such as Microsoft Word and Microsoft Excel as well as CRM and customized solutions. For example, financial services companies and online retailers are looking to deploy next-generation call center solutions to tens of thousands of desktops where video will provide the next level of customer intimacy. Cisco, Avaya, Nortel, Alcatel and others are now introducing systems that deploy video as enhancements to their IP PBX telephony solutions, a development that will expand videoconferencing beyond the dedicated conference room. This telephony-based approach promises to make video as easy to use as voice calling, but will require high quality delivery over networks with variable performance. The IP PBX market is the fastest growing segment of the PBX industry, with approximately 5,000,000 handsets shipped in 2005 across a wide range of customers. Many of these customers are looking to deploy video as part of the IP promise of enhanced solutions. 
Cisco, Polycom, RADVISION, and a dozen other vendors are developing visual collaboration portals that work in conjunction with directory services and other infrastructure products and services from IBM and Microsoft. This approach promises to deliver the “point and click” interface that customers demand for easy-to-use scheduled and ad-hoc conferencing, but will demand solutions that operate over multiple networks with different bandwidth limitations. Several large pharmaceutical, manufacturing, and government enterprises are looking at this approach since it leverages their investments in Microsoft or IBM Lotus Notes-Sametime infrastructure while extending their communications capabilities to include video. IBM claims 20,000,000 Sametime licensees in the field, and Microsoft is working ardently to top this number with its Live Communications Server product line; bringing video to just a small fraction of this installed base represents a huge challenge and a huge business opportunity as well. All of these enterprise deployment models – embedded collaboration, PBX-based multimedia calling, and portal solutions – promise to move desktop video from its current state where it is in trials to scores of desktops to real deployment status where the solution is © Vidyo, Inc 2006 - Confidential A Ready Market - 4
  6. 6. available to millions of enterprise workers. IT managers will be focused on cost, scalability, and resiliency in order to make video cost-effective and efficient while still delivering the high quality that users expect. Consumer Applications Instant messaging is widely adopted in the consumer space, with over 800 million user accounts in 2005 generating over 10 billion messages per day. And the trend is clear: consumer IM is growing from text-only to include voice and video calling enhancements. Consumer webcam sales have surpassed the 20 million per year mark, and video instant messaging is becoming increasingly popular, with sessions currently running in the billions per year. Today, AOL, MSN, and Yahoo! all support video chat; and Skype, popular with both consumers and traveling enterprise workers, has recently introduced video calling and limited videoconferencing. All of these services will create strong demand for better video technology that works across a variety of consumer Internet access services. Many believe that the future of consumer television lies with “Internet TV” and the ability to deliver hundreds of on-demand channels with high quality within reasonable bandwidth constraints. Service providers in this market will need video technology that can deliver in environments with widely varying performance parameters. 3G Wireless In many countries, wireless networks are replacing wired ones as the primary communications infrastructure, and now 3G networks, for which carriers have committed hundreds of billions of dollars, are promising to provide mobile voice, data, and multimedia. In fact, over 40 wireless service providers around the world are currently running 3G multimedia trials, and video is considered a key element in driving demand for 3G services and 3G-capable endpoints. Success for these providers will require video technology that can give consumers reasonable quality in the highly unreliable wireless world. 
Technology Innovations Despite new video compression standards such as H.264 and ever-more powerful processors, major challenges remain with respect to the transmission of real-time voice and video over packet networks and the Internet in particular. Packet-switched network-supported multimedia applications require many different transmission capabilities (bandwidth, latency, jitter, packet loss, etc) while being delivered to a wide variety of endpoint devices operating in homogeneous and heterogeneous bandwidth environments. Conventional video coding systems encode video content using a fixed bit-rate tailored to a specific application. As a consequence, conventional video coding does not fulfill the requirements of flexible digital media applications. Hence, traditional technologies have impeded the wide-scale adoption of video-enabled communications over IP networks. A new approach is needed. © Vidyo, Inc 2006 - Confidential A Ready Market - 5
  7. 7. Scalable video coding (SVC) is emerging as that technology - an IP network-friendly coding approach that can satisfy underlying transmission requirements ranging all the way from HDTV over the enterprise LAN to two-way video chat sessions over an unreliable cell phone network. SVC theory is not new. In fact, SVC has been included in MPEG-2, MPEG-4, and other video standards. But only recently, with advances in algorithms and processors, has the technology been shown to be practical for real-time, two-way video communications. SVC has the potential to be a disruptive technology, one that promises to vastly improve the videoconferencing experience in systems where bandwidth constraints and poor error resilience have limited user acceptance in the past. The potential impact of a new technology that promises improved video quality over IP networks is huge. While today’s videoconferencing market is approximately 125,000 room systems and an equivalent number of enterprise desktop systems per year, the potential desktop market for IP-savvy videoconferencing applications in the enterprise is between 5,000,000 and 10,000,000 units per year for PC-based applications, an equivalent number on the PC consumer side, and perhaps five times that number for 3G mobile telephony users. Challenges Facing Visual Communications Major barriers still prevent enterprises from deploying videoconferencing and visual collaboration tools everywhere, while also slowing the adoption of Internet video chat, mobile-phone video telephony, and friends-and-family video calling over the Internet. These barriers fall into five categories: three technology categories (resilience, communications quality, and scalability), one human-factors category (ease of use), and cost as a category of its own. 
Resilience Videoconferencing and video streaming applications can suffer from high packet-loss when traveling over an IP (or 3G) network due to the underlying “best-effort” model of the Internet protocol. In addition, bit errors can have a devastating effect on video quality. Over the years, many packet-loss and bit error recovery mechanisms have been used in conjunction with unreliable transport protocols to improve real-time IP network applications. When unrecoverable packet losses occur, it is highly desirable to have a video coding scheme that is resilient to such losses. A resilient video scheme will present gradual degradation in video quality rather than frozen or tiled video when network-based packet loss occurs. Related to packet loss is bandwidth variability, a common occurrence on IP networks where loads can fluctuate widely from moment to moment. This issue is particularly serious on wireless networks where throughput can be hampered by multi-path fading, interference, or noise. When a 400 kbps video stream is suddenly faced with a network availability of 300 kbps, packet loss is inevitable. In this situation it is highly desirable to have a video compression scheme that is capable of adapting to unpredictable variations in bandwidth. © Vidyo, Inc 2006 - Confidential A Ready Market - 6
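The 400 kbps example above can be worked through with a little arithmetic. The sketch below is illustrative only: it assumes the sender does not adapt its rate, and the split of the stream into a base layer and two enhancement layers (150, 150, and 100 kbps) is a hypothetical choice, not a figure from the paper. The function names are likewise made up for this example.

```python
def shortfall_fraction(stream_kbps: float, available_kbps: float) -> float:
    """Fraction of the stream that cannot fit in the available bandwidth."""
    if stream_kbps <= 0:
        raise ValueError("stream_kbps must be positive")
    return max(0.0, (stream_kbps - available_kbps) / stream_kbps)


def layers_that_fit(layer_kbps: list[float], available_kbps: float) -> int:
    """Count how many layers, base layer first, fit in the available bandwidth.

    Assumption: layers are cumulative, so a layer is only sent if every
    layer before it has also been sent.
    """
    total = 0.0
    count = 0
    for rate in layer_kbps:
        if total + rate > available_kbps:
            break
        total += rate
        count += 1
    return count


# A single-layer 400 kbps stream on a 300 kbps path: a quarter of the bits cannot get through.
loss = shortfall_fraction(400, 300)

# Hypothetical layered split of the same 400 kbps: base 150, enhancements 150 and 100.
sent = layers_that_fit([150, 150, 100], 300)

print(f"non-scalable shortfall: {loss:.0%}, scalable layers sent: {sent} of 3")
```

Under these assumptions the non-scalable stream has to drop packets somewhere, while the layered stream can shed its top enhancement layer and keep the rest intact.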
  8. 8. Communications Quality While objective measurements of quality in a videoconferencing session have always been difficult to come by, users accustomed to “TV quality” video and “toll quality” voice over the PSTN quickly realize that today’s technology for delivering real-time communications over packet-switched networks often falls short. Communications quality in a videoconference is driven largely by three factors: delay, frame rate, and resolution. Delay (latency) causes a very un-natural communications environment; often with the effect of having people stepping on each other. Maintaining acceptable video quality requires keeping to acceptable total delay budgets. Total delay is the sum of the delay in the network, the time it takes a packet to traverse the network from point A to point B, as well as compression latency, the time it takes a video codec to perform all the algorithms required to compress the original digital signal. Experience has shown that a key challenge is establishing an overall delay of less than 200ms – feasible on a point-to-point call, but beyond the state of the art with today’s IP-based videoconferencing bridges. Many research projects have investigated alternative approaches to real-time video compression, with the goal of reducing delay while maximizing video quality and minimizing computational complexity. The North American television standard (NTSC) specifies a frame rate of approximately 30 frames per second (fps) while the European standard (PAL) is set to 25 fps. These two frame rates have become known as “real time” frame rates and are also associated with the term “TV quality video.” However, the definition of “real time” is not set in stone. For example, the movie industry settled long ago on a standard of 24 fps. High frame rates, generally above 20 fps, are required to preserve motion quality, maintain lip synch, and provide the benefits of low delay. 
In general, all other things being equal, higher frame rates require higher transmission speeds or network bandwidth. Because of the popularity of consumer still digital cameras, most people today are familiar with the concept of image resolution; they understand that a 5 megapixel (MP) camera will produce better images than a 1 MP camera and will enable larger print sizes while requiring more image storage and transport bandwidth. These same concepts hold true for video. In the case of visual communications however, independent bodies have specified certain fixed image sizes in order to enable interoperability between vendors and systems. Many of these standards are shown in the table below. Most videoconferencing systems today use SIF or CIF while most laptops support XGA screen resolutions. HD videoconferencing, with 9x the resolution of CIF, is a recent introduction. Increasing resolution delivers sharper images with more visible detail, enables images to be displayed in larger sizes, and creates a more pleasing visual communications experience overall, but requires more processing power and more network bandwidth. The © Vidyo, Inc 2006 - Confidential A Ready Market - 7
  9. 9. increased bandwidth needs associated with high definition further stress the requirements for network resiliency and efficient multipoint processing.

     Image Format    Pixel Resolution      Megapixels per Image
     SQCIF           128 x 96              0.012
     QCIF            176 x 144             0.025
     SIF             352 x 240 (NTSC)      0.084
     CIF             352 x 288             0.101
     4CIF            704 x 576             0.406
     4SIF            704 x 480 (NTSC)      0.338
     D1              720 x 480 (NTSC)      0.346
     D1              720 x 576 (PAL)       0.415
     HDTV            1280 x 720            0.922
     VGA             640 x 480             0.307
     XGA             1024 x 768            0.786

     Figure 1 Pixel resolution for different image standards

     Scalability When enterprise managers talk about scalability of videoconferencing, they are referring to the need to support large numbers of information worker desktops with a wide range of CPU and memory resources connected over IP networks with varying network loads. The heterogeneous nature of receivers makes it difficult to deliver a single video stream to all with acceptable quality. The concerns are even more poignant on the part of service providers attempting to support hundreds of thousands or even millions of consumers using broadband connections to the home. Scalability also involves support for multipoint conferences – conferences involving more than two endpoints. Multipoint chat sessions have introduced many consumers to the value of such calls, and videoconferencing users have long demanded such capabilities in their systems. Multipoint requirements however significantly raise the challenges for call reliability and fault tolerance because multipoint calls often connect endpoints with very different capabilities on networks with different performance levels into a single conference. Ease of Use Enterprise workers have long been disappointed with the ease of use of desktop videoconferencing. 
Products have not been able to support ad-hoc calling or multipoint conferences, and desktop solutions for video have not generally been able to support the call answering, call forwarding, hold and transfer, and voice mail functionality that most information workers expect. The result has been lack of acceptance and a stagnant market. This is all in the process of changing however, as major enterprise vendors like Avaya, IBM, © Vidyo, Inc 2006 - Confidential A Ready Market - 8
  10. 10. Cisco, Nortel, Microsoft and others are announcing unified communications products that integrate voice, video, data, and instant messaging into one easy-to-use application. On the consumer front, video chat has been integrated into free services from AOL, MSN, Yahoo, and Google while free videoconferencing services from Skype, SightSpeed, GlowPoint and others have made video calling easier than ever. Consumer videoconferencing is ready for the next leap in quality with systems optimized for video delivery over packet networks. Cost Cost is always a factor in deploying advanced communications solutions. With the proliferation of high performance personal computers, the use of dedicated silicon for audio- video processing on the desktop is becoming less and less price-competitive. Furthermore, as video communications proliferate through enterprise IP-PBX systems and web conferencing solutions as well as through consumer IM and chat services, the need to provide a cost-effective multipoint solution becomes an equally important challenge. Current multipoint options that force the customer to choose between 1) feature-rich, expensive solutions (about $4K/concurrent user) that introduce reduced quality and delay performance and 2) lower cost solutions that offer minimal features and video quality will give way to next-generation solutions that combine features, performance, and error resilience with a much lower implementation cost. Introducing H.264 – SVC Scalable video coding (SVC), a technique that enables a video stream to be broken into multiple resolutions, quality levels and frame rates, is appealing for applications where the bandwidth available cannot be guaranteed – for example Internet video, video telephony, and wireless communications. 
SVC designs were first offered for systems intended for one-way delivery of video over packet-switched networks; Vidyo is the first company to apply SVC technology to two-way video communications and specifically to the challenges of point-to-point and multipoint videoconferencing. We should note, however, that a standardization process is under way jointly between ITU-T VCEG and ISO MPEG to develop a recommendation for H.264-SVC; this process is expected to lead to a ratified standard by the end of 2006. The purpose of any video compression algorithm is to exploit both the spatial and temporal redundancy of video information so that acceptable video quality can be received at the far end while using as few bits as possible to transmit the signal. A non-scalable video encoder generates a single compressed bitstream. A scalable video encoder compresses a raw video sequence into multiple layers (see Figure 2). One of the compressed layers is the base layer. The base layer can be decoded independently and provides a relatively low level of video quality. The remaining compressed layers are enhancement layers that add quality to the received video stream. Enhancement layers can be decoded only in conjunction with the base layer. The complete bitstream consists of the base layer and all the enhancement layers and provides the very best video quality.
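The layering rule described above can be sketched in a few lines. This is a minimal illustration, not an H.264-SVC decoder: it assumes layers are numbered from 0 (the base layer) and that each enhancement layer depends on every layer below it, which is a simplifying assumption of this example. The function name is hypothetical.

```python
def usable_layers(received: set[int]) -> list[int]:
    """Return the layer indices a decoder can actually use.

    Index 0 is the base layer; 1..n are enhancement layers.
    Assumption: each enhancement layer depends on every layer below it,
    so decoding stops at the first missing index. Without the base layer
    nothing can be decoded.
    """
    usable = []
    index = 0
    while index in received:
        usable.append(index)
        index += 1
    return usable


print(usable_layers({0, 1, 2}))  # complete bitstream: every layer is usable
print(usable_layers({0, 2}))     # layer 1 missing: only the base layer is usable
print(usable_layers({1, 2}))     # base layer missing: nothing can be decoded
```

The third case shows why the base layer is the part of the stream most worth protecting.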
  11. 11. On a QoS-enabled IP network it would even be possible to send the base layer with a higher priority than the other layers. Figure 2 Scalable video coding principles Enhancement layers can be created in the temporal, spatial, and quality realms. Hence, compared to decoding the complete bitstream, decoding the base layer and only some of the enhancement layers produces video with degraded quality (known as signal-to-noise ratio or SNR), or smaller resolution images (spatial scalability), or a lower frame rate (temporal scalability). The SVC algorithm sits on top of the base video compression scheme enabling videoconferencing vendors to offer SVC as an enhancement to their existing products, and more importantly, to position H.264-SVC as backwards compatible with non-scalable H.264 products in the field. While SVC can theoretically be implemented as a direct client-to-client communication system, future implementations will benefit greatly from some intelligence in the network. Routers, for example, could understand which packets are base-layer packets and treat these with higher priority, or routers could signal encoders that the network is congested so that the coder could degrade gracefully to fit available bandwidth. But a more likely implementation is a client-server architecture based on industry standard hardware and operating systems. Such an H.264-SVC server could not only provide multipoint capabilities for videoconferencing, it could also adjust the bitstream to each endpoint © Vidyo, Inc 2006 - Confidential A Ready Market - 10
  12. 12. individually depending on that endpoint’s coding and network capabilities as well as provide improved resiliency for packet loss. Because an H.264-SVC server sees each participant in a multipoint call as an individual stream, each endpoint has the capability of signaling the server how it wants to receive each stream. For example, in a four-way call, endpoint A might be able to decode the full bitstreams from endpoints B, C, and D and display each of the four video windows in VGA resolution. The 2x2 array on the monitor would therefore be the equivalent of 1280 x 960, comparable to the 1280 x 720 HDTV format, but without the need for HD cameras. Meanwhile endpoint B, with less processing power and network bandwidth, might decode only the base layer and one enhancement layer and hence display all the videos in CIF format. How SVC addresses today’s challenges While H.264-SVC does not directly address issues surrounding ease-of-use, the technology folds seamlessly into today’s videoconferencing architectures and addresses the issues surrounding visual communications on unreliable networks. H.264-SVC will make video usable on a wide variety of consumer and enterprise endpoints and networks. Network Resilience Challenge Besides the ability to decode with different bandwidth and computation resources, scalable video coding technology offers graceful degradation and the ability to withstand error bursts. Because of its layered architecture, SVC is extremely resistant to packet loss. Rather than lose an entire frame or series of frames, performance degrades gradually because the user continues to receive the base layer and possibly one or more enhancement layers as well. When packet loss occurs, an H.264-SVC image will lose frame rate or quality or image resolution smoothly rather than freeze or drop out altogether. 
While video from non-scalable coders typically breaks down completely at less than 1% packet loss without some error recovery mechanisms and at approximately 5-8% with sophisticated error correction schemes (that require additional bandwidth), H.264-SVC has been shown to operate with better-than-usable video performance with up to 60% packet loss. Quality Challenge H.264-SVC provides extremely low delay conferencing. This is particularly important for multipoint calls where delay is a major contributor to the perception of poor quality. An H.264-SVC client/server architecture is resolution-independent and can support HD resolution images. The performance of the endpoint is dependent on only the total number of pixels displayed - a single 4CIF image or four CIF images will result in the same performance. The performance of the server is not dependent on the resolution being displayed. © Vidyo, Inc 2006 - Confidential A Ready Market - 11
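The claim that endpoint load depends only on the total number of pixels displayed can be checked against the frame sizes in Figure 1. The sketch below simply adds up pixels per layout; treating pixel count as a proxy for decode load is the paper's claim rather than something this code demonstrates, and the names `FORMATS` and `total_pixels` are illustrative.

```python
# Frame sizes from Figure 1 (width, height in pixels).
FORMATS = {
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "VGA": (640, 480),
    "HDTV": (1280, 720),
}


def total_pixels(windows: list[str]) -> int:
    """Sum the pixels across every video window an endpoint displays."""
    return sum(FORMATS[name][0] * FORMATS[name][1] for name in windows)


one_4cif = total_pixels(["4CIF"])      # one large window
four_cif = total_pixels(["CIF"] * 4)   # 2x2 layout of CIF windows
four_vga = total_pixels(["VGA"] * 4)   # 2x2 layout of VGA windows (1280 x 960)
hdtv = total_pixels(["HDTV"])

print(one_4cif == four_cif)            # same pixel count either way
print(four_vga, hdtv)                  # the 2x2 VGA layout carries more pixels than one HDTV frame
```

One 4CIF window and four CIF windows both come to 405,504 pixels, which is the equivalence the paper relies on.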
  13. 13. Scalability Challenge SVC has a built-in adaptive rate control and intelligent routing capabilities that can send each participant the video stream that it can handle best. This minimizes network bandwidth use while also maximizing video efficiency. SVC fits well with current network architectures, where quality of service can be assigned to different media types. Running the base video layer with high priority and the enhancement video layers with lower priority can save network bandwidth and deployment costs while making large-scale video deployments more feasible. H.264-SVC servers can run on today’s industry standard hardware and software platforms and fit into common management systems. The technology is low cost, easily distributed, and fits neatly into any enterprise management system. The software system is cost-effective for both point-to-point and multipoint IP video communications. Simplified media mixing eliminates the need for compression/decompression in the server when supporting multipoint calls while also substantially reducing latency. The SVC approach provides all or nearly all the features associated with traditional high-end bridging systems such as rate matching and personal layout control without introducing multipoint delay or degraded video quality and without requiring specialized, expensive hardware. Cost Challenge H.264-SVC addresses costs in several significant ways. Because the algorithm can be implemented as software on industry-standard processors, H.264-SVC systems do not require specialized, expensive hardware. H.264-SVC clients can run on ordinary personal computers as well as 3G handsets and PDAs. Moreover, important infrastructure components such as multipoint bridges can support H.264- SVC connectivity on industry-standard server platforms. 
H.264-SVC can provide superior video performance on ordinary network connections provided to consumers and SOHO customers as well as on typical enterprise LAN architectures, reducing the need for specialized, high-cost network designs or services. © Vidyo, Inc 2006 - Confidential A Ready Market - 12
  14. 14. Conclusion Video-telephony and videoconferencing have never reached their full potential in either the enterprise market or the consumer space because the technology has been hampered by network issues and video quality limitations. While the ultimate acceptance of video communications will rely on IP protocols and packet-switched networks, video compression technology has been based on designs and thinking associated with traditional circuit-switched networks. H.264-SVC is a new technology that combines the elegant performance of H.264 with the network resiliency and scalability that come with scalable video coding. H.264-SVC can be implemented with today’s industry-standard, low-cost hardware platforms and promises to take video communications to the enterprise desktop as well as the 3G mobile videophone.
  15. 15. Sponsor Information About the Sponsor Vidyo™ creates VidyoConferencing™ solutions that provide quality experiences to all environments from the home-office desktop to the dedicated corporate video-conferencing facility. Vidyo, the first company to apply the H.264 Scalable Video Coding (SVC) standard to video conferencing, delivers HD/Telepresence quality enhanced by industry-best resilience and low-latency — and it manages to achieve all this over general-purpose IP networks. Vidyo’s family of end-to-end solutions for OEMs and organizations can support multi-point connections that include a variety of different platforms ranging from Mac & Windows desktops to dedicated room solutions. No dedicated networks required. Contact Information 433 Hackensack Avenue Hackensack, NJ 07601 732 290 7468 info@vidyo.com www.vidyo.com © Vidyo, Inc 2006 - Confidential A Ready Market - 14
