Media processing in the cloud: what, where and how

The evolution to IP technology, VoLTE and new video services will have a profound
impact on the way person-to-person media processing will be performed in the
networks of the future. This evolution raises some questions: what processing will be
needed, where will it take place and how will it be implemented?
Read more from the Ericsson Review here:




The communications technology journal since 1924 • 2013 • 5
Media processing in the cloud: what, where and how
April 11, 2013
Voice and video in the cloud
Ericsson Review • April 11, 2013

Johan Lundström

There’s a strong argument for regarding telephony as one of the first cloud-based services. Since the invention of the telephone, the industry has evolved significantly and operators have developed a flexible range of services for subscribers provided on a pay-as-you-use basis.

Smartphones have brought an enriched experience to users and theoretically they, along with other advanced terminals, could perform much of the media processing traditionally taken care of by networks. However, the constraints posed by bandwidth and battery life, along with the desire to provide new services independent of terminal type, tend to indicate that most media-processing services will remain in the network.

Initially, the services provided by the telephone network were carried out by switchboard operators. Gradually, as computing resources were introduced, control logic processing and media handling became entirely automatic, leading to today’s models where cloud-based services are provisioned over a network using shared pools of computing resources, and where users pay for what they consume.

Phones were initially simple devices, consisting of a microphone and a loudspeaker. When routing of calls became automatic, a rotary dial was added. Today, more than one billion smartphones around the world provide a computing platform that is capable of running millions of applications and of providing extensive media processing.

Two of the questions addressed in this article are: what media processing will take place in the communication services of the future, and where will this media processing be provided: will it be handled in a cloud-like manner or will it be pushed out to terminals?

The deployment of generic industry hardware that is capable of running many kinds of applications in a flexible manner is a growing trend within the ICT industry. It follows then that generic computers offering cloud services will also be used to implement future telecommunication networks in operator cloud centers.

The third and final question addressed in this article is: how will media be processed in evolved telecommunications networks: how much generic hardware will be used, and will DSPs on dedicated platforms continue to be the preferred approach?

Bearing in mind that the cloud is not just about technology, this article also describes how cloud principles can be applied to the various business models for communication services.

BOX A  Terms and abbreviations
AMR: Adaptive Multi-Rate
AMR-WB: AMR-wideband
AS: application server
ATM: Asynchronous Transfer Mode
BGF: border gateway function
BSC: Base Station Controller
CAGR: Compound Annual Growth Rate
DSP: digital signal processor
EFR: Enhanced Full Rate
IETF: Internet Engineering Task Force
IMS: IP Multimedia Subsystem
MGC: Media Gateway Controller
MGW: Media Gateway
MSC: mobile switching center
MSC-S: MSC server
M-MGW: Mobile Media Gateway
MMTel AS: multimedia telephony application server
MRF: Media Resource Function
MRS: media resource system
MSS: mobile softswitch
O&M: operations and maintenance
OSS: operations support systems
PCM: pulse-code modulation
PLMN: public land mobile network
PSTN: public switched telephone network
RNC: radio network controller
SBG: Session Border Gateway
SGC: Session Gateway Controller
SGW: Signaling Gateway
SIP: Session Initiation Protocol
TDM: time division multiplexing
TrFO: transcoder free operation
VLR: visitor location register
VoLTE: voice over LTE
Processing and network evolution

The digitalization of voice was one of the first steps in network evolution and electronic media processing. The shift to digital led to lower distortion levels and reduced attenuation of the voice signal, improving its quality.

Digitalization led the way in the development of new approaches for improving voice quality, such as echo cancelling and noise reduction. Without the digitalization of voice, and the development of efficient voice codecs that save bandwidth, such as Enhanced Full Rate (EFR) and Adaptive Multi-Rate (AMR), mobile telephony would not be the reality it is today.

Pulse-code modulation (PCM) is still the most common method of digitally representing analog voice signals over the PSTN and among PLMNs. As networks and devices use and support different codecs and protocols, mobile telephony networks usually need to convert voice, by transcoding, from one format to another.

Further improvements to voice quality are taking place through the application of new codecs, such as AMR-WB, which supports HD voice, combined with mechanisms, such as transcoder free operation (TrFO), based on codec negotiation between the end points involved in a call [1, 2].

Tones, such as dial and busy tones, and announcements, such as faulty service indications, are examples of general network-generated services that users have grown accustomed to over the years. Other services such as conferences, where voice streams from multiple sources are combined, are also network-generated and exemplify the trend towards advanced voice services.

Circuit-switched networks still handle most of today’s voice traffic. The architecture of these networks tends to be based on softswitches consisting of Media Gateways (MGWs) and Media Gateway Controllers (MGCs). For mobile softswitches (MSSs), the MGC is integrated in the mobile switching center server (MSC-S). For the most part, echo cancelling, transcoding, and sending of tones and announcements is carried out by MGWs.
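
The link between codec negotiation and transcoding described above can be illustrated with a small sketch. It is a deliberate simplification and not the 3GPP TrFO procedure: the function names `negotiate_codec` and `needs_transcoding` are hypothetical, and each end point is assumed to advertise its codecs as a simple list in order of preference. If the two end points share a codec, the call can run without transcoding; if they do not, the network has to convert between formats.

```python
from typing import Optional, Sequence


def negotiate_codec(caller: Sequence[str], callee: Sequence[str]) -> Optional[str]:
    """Return the first codec in the caller's preference order that the
    callee also supports, or None if the two lists have nothing in common.
    (Hypothetical helper; real negotiation involves more parameters.)"""
    callee_codecs = set(callee)
    for codec in caller:
        if codec in callee_codecs:
            return codec
    return None


def needs_transcoding(caller: Sequence[str], callee: Sequence[str]) -> bool:
    """Transcoding in the network is only required when no common codec exists."""
    return negotiate_codec(caller, callee) is None


# Both end points support AMR, so the call can be transcoder free.
print(negotiate_codec(["AMR-WB", "AMR"], ["AMR", "PCM"]))
# No common codec: a network node has to transcode.
print(needs_transcoding(["AMR-WB"], ["PCM"]))
```

The caller's preference order decides the outcome here; a real network may apply operator policy instead, which this sketch does not attempt to model.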
The MGWs also interwork with the PSTN for circuit-switched data and fax, handle multi-party calls, and reframe media samples on the borders between 3GPP and IETF networks. In addition to performing media processing, the MGWs act as a bridge between different bearer technologies, such as between TDM and IP.

As networks evolve, and people’s use of them progresses, voice will be handled by the IMS, and communication with video will become a mainstream activity for enterprises and consumers. Media handling in this environment is performed primarily in a logical node called the Media Resource Function (MRF), which uses SIP to communicate with the rest of the network. The MRF provides services such as tones, announcements and conferences, and will support new services developed in response to subscriber demand.

In an all-IP environment, such as IMS, operators no longer have end-to-end control over networks, resulting in greater emphasis on security. For SIP signaling and related media, it is the responsibility of Session Border Gateways (SBGs) to handle security. These SBGs can be implemented as stand-alone boxes, or integrated into other network elements in a layered architecture, which reduces capex and opex. These gateways may also provide limited media-processing capabilities, such as transcoding.

Further development in media processing will be needed to meet the exponential growth in person-to-person video communication.

Consider the media processing requirements for videoconferencing. Most videoconference services show participants using two primary display modes: voice activated and continuous presence. In voice-activated mode, the stream from the active speaker dominates the available display area, while other participants are shown in smaller windows, or not at all. In continuous-presence mode, all participants are displayed simultaneously.
To deliver a videoconference, the network has two choices: it can either collect all video streams from participating users and send all streams to all users; or it can mix the video streams into one preferred format before sending the single, combined stream to participating users. In the all-streams-to-all-users approach, media processing is performed by the participating terminals, whereas the mixing approach relieves the terminal of the need to perform any media processing. The combined approach can save a significant amount of bandwidth in the access network. Yet another way to save bandwidth is to just send the video stream associated with the active speaker to the participants’ terminals.

Videoconferencing is just one example of a video-based application. Many new services that will be typically delivered by the cloud, such as recording, storage, announcements and mailboxes, will be implemented later on. Advanced voice and video services may include real-time speech recognition; speech-to-text conversion; automatic language translation; speech-controlled supplementary services; embedded banner advertising; speaker identification; and real-time generation and translation of subtitles in video calls.

[FIGURE 1: Ericsson media-resource-system architecture. Applications shown: MGW app, BGF app, MRF app and SGW app, alongside MSC-S, MGC, SGC, MMTel AS and OSS. Common resources (ATM ports, TDM ports, IP ports, DSP devices) are pooled and dynamically shared by different applications through common resource handling. A common O&M implementation and interface gives a one-node view.]

The cloud versus the terminal

To ensure good media quality and efficient use of the access network, terminals need to be able to encode and decode digital media. In theory, terminals could provide more or less all the media-processing power needed to deliver services offered by the network. To do this, terminals would, for example, need to:
  • support all codecs, so that all potential peers can use the codec best suited to their architecture;
  • generate tones and announcements based on error codes received from the network; and
  • act as a conference bridge, or support multiple ways of acting as a video client, to ensure interoperability with all potential peers.

But is this approach cost efficient? And is it good for users? The success of a new communication service lies in the rapid adoption by a critical mass of users. New services therefore need to be as terminal-independent as possible, reach as many users as possible and be interoperable from day one.

To maintain interoperability and avoid fragmentation of some types of services, such as video communication, performing media processing in the network is key. Using standardized interfaces between networks helps to ensure interoperability among operators and secures optimal performance and quality. In addition, codec negotiation (including interworking between control protocols), transcoding, reframing and video-mixing services can be used in networks to support interoperability.

As illustrated by the videoconference example, handling media processing in the network, rather than the terminal, can save bandwidth. This expensive resource can also be used more economically if the network is allowed to provide all transcoding processing, leaving terminals free to use the codec that is best suited to their specific architecture.

Terminals that use less bandwidth often require less power. And so, by handing over bandwidth-hungry services, such as voice and video mixing, to the network, power consumption in the terminal can be reduced, extending the recharging interval and improving battery life.

Algorithms for voice and video processing tend to be patented and terminal manufacturers have to pay royalties to use them. Performing transcoding in the network through pooled instances reduces the number of algorithms needed for terminal media processing, resulting in lower usage fees and reduced overall cost to subscribers.

When all the factors are brought together, it seems the current approach to media processing, performing it in the network, remains the most efficient. As it is likely that the network will continue to be the most practical alternative in the future, it stands to reason that media processing will also remain a cloud-based service.

Cost-driven platform evolution

Requirements for reliability, energy efficiency, redundancy and low carbon footprint have led to the use of dedicated hardware platforms to build telecommunication network elements, until now. In an operator cloud, a competitive hardware platform not only needs to meet all of these requirements but should be generic enough to support multiple applications and flexible enough to accommodate fluctuating traffic patterns and changing application capacity needs.

To efficiently provide communication services in a network, two different platform types are needed: one for control, such as the MSC-S, and one for media processing applications, such as the MGW or the MRF.

Today, control applications tend to be built on dedicated, carrier-grade platforms with generic processor architectures, such as x86. Some of these platforms can already run multiple telecom applications and provide many of the benefits offered by operator cloud centers. It is likely that these platforms will develop into telecom cloud centers supporting virtualized software and applications, allowing operators to further reduce their capex and opex investments.

The requirements placed on media-processing platforms are however significantly different from those for processing control applications. This is because the amount of processing needed for media is much greater and the requirements for real-time processing and latency are more stringent. In addition to supporting multiple services and adapting to changing traffic profiles automatically, media-resource platforms will need to support TDM interfaces for some time to maintain interaction with legacy systems.

General-purpose processors, such as the x86, have become more cost efficient for handling media, however their performance compared with DSPs varies significantly depending on the media being processed. A DSP, for example, offers superior performance for voice processing, such as transcoding. But when it comes to certain types of video processing the performance of a DSP is not significantly better.

It is hard to predict whether the cost-to-performance ratio for DSPs and general-purpose processors will change as new chips are introduced to the market and the types of media-processing services evolve. For the moment, DSPs provide the best performance in comparison to overall cost for services requiring both high channel capacity and density, such as voice in circuit-switched networks.

In the long term, as the need to interface with TDM systems disappears and the volume of voice transcoding consequently shrinks, using generic processors and operator cloud centers for media processing will become a more competitive option.
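
As a short aside on the videoconference example used earlier in this article, the bandwidth argument can be made concrete with a little arithmetic. The sketch below is illustrative only. It assumes that every video stream has the same bitrate and that a mixed stream costs the same as a single stream, neither of which is stated in the article, and the names `downlink_streams_per_terminal` and `downlink_kbps` are hypothetical.

```python
def downlink_streams_per_terminal(participants: int, mode: str) -> int:
    """Number of video streams each terminal receives in a conference.

    mode is one of:
      "all_to_all"     - the network forwards every other participant's stream
      "mixed"          - the network mixes all streams into one combined stream
      "active_speaker" - the network forwards only the active speaker's stream
    """
    if participants < 2:
        raise ValueError("a conference needs at least two participants")
    if mode == "all_to_all":
        return participants - 1
    if mode in ("mixed", "active_speaker"):
        return 1
    raise ValueError(f"unknown mode: {mode}")


def downlink_kbps(participants: int, mode: str, stream_kbps: int) -> int:
    """Downlink bandwidth per terminal, assuming equal-bitrate streams."""
    return downlink_streams_per_terminal(participants, mode) * stream_kbps


# Illustrative figures only: five participants, 500 kbps per stream (assumed).
for mode in ("all_to_all", "mixed", "active_speaker"):
    print(mode, downlink_kbps(5, mode, 500), "kbps")
```

Under these assumptions the downlink load of the all-streams-to-all-users approach grows linearly with the number of participants, while the mixed and active-speaker approaches stay constant, which is the saving in the access network that the article refers to.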
Sharing resources reduces cost

The concept underlying Ericsson’s media-processing platform is based on providing processing capabilities in the network. Such a platform, a media resource system (MRS), uses DSP resources in a dynamic way, is capable of allocating resources to the different media-processing functions automatically, and can pool user requests among the various DSPs.

The MRS concept provides both media-gateway and signaling-gateway functionality for MSS networks. It contains an MRF for media processing in IMS networks and provides session border functionality for MSS and IMS networks. The session border functionality uses a layered architecture, under which a border gateway function (BGF) in the MRS handles the media plane, while a Session Gateway Controller (SGC) handles the control plane. Figure 1 shows the high-level distributed and integrated architecture of this system.

Networks with Ericsson Mobile MGW (M-MGW) nodes installed can be upgraded to an MRS with support for future media-processing features, as the M-MGW/MRS can be part of both an MSS and an IMS environment. This type of upgrade simply involves a software update.

The MRS can be considered to be a media cloud platform, as it supports multiple media-processing applications and can share the available computing resources, as well as the external interfaces, dynamically among those applications. Plans to develop the system include the addition of open interfaces that allow specialized external products to provide functionality via the common MRF.

Network scenarios

As illustrated by the example in Figure 2, fixed and mobile network architectures have traditionally been distributed and hierarchical.
In such networks, the node closest to the subscriber takes care of voice coding or transcoding to PCM when a call enters the network.

[FIGURE 2: Traditional network architecture. Nodes shown: BSC, MSC/VLR, transit exchange, local exchange. Functions shown: coding and decoding, transcoding, coding and decoding.]

Today’s mobile switching solutions allow the control logic, the MSC server nodes, to be centralized to just a few sites, even in fairly large networks. Media, meanwhile, is handled locally to save bandwidth and minimize latency. To ensure hardware resources are used efficiently and a high level of resilience is maintained, MSC-S nodes are often pooled. IP-based bearers used on the interface to the radio network also allow pooling of MGWs, offering similar benefits in terms of efficient resource usage and resilience. Figure 3 shows a simple network where both the media gateways and servers are pooled.

[FIGURE 3: Structure of a modern mobile voice network. Nodes shown: BSC, RNC, MSC-S, MGW, IP, PLMN, PSTN, IMS. Labels: pooled media resources; pooled media-control and call-routing resources.]

The introduction of VoLTE and IMS has naturally led to a new network structure, especially in the media plane. The first task that the network needs to take care of is security, and so an SBG makes sure that it is safe to establish a session. Media processing may then be needed in the set-up phase to, for example, produce tones and announcements; services which can be provided by temporarily linking in an MRF. During the call-establishment phase, the control layer determines whether transcoding and reframing are needed. If so, an MRF is linked in, or alternatively a BGF may be able to handle transcoding. Certain services, such as conferencing, may also require additional media processing. As end-to-end codec negotiation will be more common in IMS networks than it is in circuit-switched networks, the need for media processing will diminish as networks evolve. However, new and advanced processing services will be introduced to handle special cases.

The best network architecture, illustrated in Figure 4, is based on distributed SBGs or BGFs optimizing latency and ensuring bandwidth efficiency; and advanced services that are not used so often can be centralized.

The flexible nature of the MRS supports all network architectures. It is a scalable solution that can be used at the edge of a network or in a centralized way. In cases where an operator wants to avoid over provisioning to cater for occasional traffic peaks, MRS nodes can be pooled to balance the load throughout the network. This can be achieved even if the nodes are in different geographic locations.

Changing business models

A significant aspect of cloud computing is the business model.
The cloud approach enables enterprises to buy IT services instead of investing in infrastructure. Telecommunication operators provide communication services, such as voice, to consumers and enterprises in much the same way. And it is likely that additional products will be cloud-based [3].

Vendors can provide wholesale cloud services to operators who, in turn, break them up into smaller, retail, offerings for enterprises and consumers. Ericsson’s Device Connection Platform, for example, supports machine-to-machine communication as a cloud service for operators that offer retail cloud services. Other services, such as low-volume media processing, may be provided to operators as cloud services in the future. The sharing of network elements among several operators enables vendors to obtain better economies of scale than individual operators can for certain services.

[FIGURE 4: Architecture of an all-IP and IMS network. Elements shown: external networks, Evolved Packet Core (two instances), IP transport network, IMS control plane, SGC, AS, BGF (two instances), MRF. Labels: security and transcoding on the network edges; centralized and pooled media resources in MRF.]

The what, the where and the how: the answers

Even though terminals are fast becoming advanced computers capable of performing sophisticated media processing, this function is likely to remain a network-based service for reasons of efficiency. Telecommunication platforms are developing into multi-application systems that support both local and geographic spreading of resource pools.

Cloud platforms based on generic processors are likely to be introduced in the control plane first. Whether these platforms will be used for media processing, and when, will depend on: the need for legacy interfaces; the evolution of the cost-to-performance ratio for DSPs; the type of media processing services that will be required in the future; and the volume of these services.

One of the important aspects of cloud computing is the business model. The market is already showing evidence of increased flexibility when it comes to who will provide communication services. In the future, enterprises will be able to rely on operators to provide communication services instead of buying their own equipment. Operators will, in turn, be able to rely on vendors to provide cloud services, creating an efficient value chain in which each player pays for services based on usage.
Johan Lundström is a strategy manager for mobile softswitch and media processing solutions within product area Core and IMS at Business Unit Networks. He joined Ericsson in 1991 and since then, he has worked primarily with mobile core networks. He has had various positions in both R&D and product management, including line management. He holds an M.Sc. in telecommunications and software science from the Helsinki University of Technology, Finland.

Acknowledgements

The author gratefully acknowledges the colleagues who have contributed to this article: Patrik Roséen, Mats Alendal, Joakim Haldin, Markku Korpi, Peter Jungner, András Vajda, Kari-Pekka Perttula and Jörg Ewert.

References

1. Ericsson, 2010, Ericsson Review, Evolution of the voice interconnect.
2. Ericsson, 2011, White Paper, HD voice – it speaks for itself.
3. Ericsson, 2011, White Paper, Visual communication – why operators should address the enterprise market.
Telefonaktiebolaget LM Ericsson
SE-164 83 Stockholm, Sweden
Phone: +46 10 719 0000
Fax: +46 8 522 915 99
284 23-3186 | Uen
ISSN 0014-0171
© Ericsson AB 2013