Date: February 21, 2006
3G Multimedia Streaming Services
3GPP2 and its Organizational Partners claim copyright in this document and
individual Organizational Partners may copyright and issue documents or standards
publications in individual Organizational Partner's name based on this document.
Requests for reproduction of this document should be directed to the 3GPP2
Secretariat at email@example.com. Requests to reproduce individual Organizational
Partner's documents should be directed to that Organizational Partner. See
www.3gpp2.org for more information.
This document defines content types, media formats, codecs, and delivery support for
Multimedia Streaming Service (MSS).
10 Pseudo Streaming
10.1 Session Description
10.2 Transport Options
10.2.1 Request
10.2.2 Response
10.3 FFMS Usage in MSS
11 Media Types (Codecs)
11.1 General Requirements
11.2 "Video"
11.3 "Speech"
11.3.1 "Narrow Band Speech"
11.3.2 "Wide Band Speech"
11.4 "Audio"
11.5 "Text in SMIL"
11.6 "Timed Text"
11.7 Other media
12 Rate Adaptation of Streaming Media
12.1 Introduction
12.2 Rate Adaptation
12.2.1 Link Characteristics
12.2.2 Adaptation of Transmission Rate
12.2.3 Receiver Buffer Level Feedback
Annex A Call Flow Example (Informative)
Annex B Sample Scenario of a Session (Informative)
Annex C Buffering of Video (Normative)
C.1 Introduction
C.2 MSS client buffering requirements
Annex D Pseudo Streaming Session Representation Example (Informative)
D.1 XHTML Presentation Description Format
D.2 <object> Element
Annex E Pseudo Streaming Example (Informative)
E.1 Live encoding
E.2 Random positioning
E.3 Choosing bitrate
2 List of Figures
Figure 7-1 Multimedia Streaming Service
Figure 7-2 Multimedia Streaming Services Terminal Function
Figure 8-1 Data Channel Set-up (SO 33/66, 60 and 61)
Figure 9-1 Protocol Stack for Multimedia Streaming Service
Figure 9-2 Generic Format of an RTCP APP packet
Figure 9-3 Data format block for NADU reporting
Figure 10-1 Basic Sequence of Pseudo-streaming
Figure E-1 Block diagram for live pseudo-streaming
Figure E-2 moof -> moov conversion in the server
Figure E-3 HTTP request for the random positioning
Figure E-4 Movie file reconstruction for random positioning service
Figure E-5 Protocol enhancement for choosing bitrate
Figure E-6 Generation of movie file for pseudo-streaming from multirate movie file
3 List of Tables
Table 8-1 Summary of non-UAProf defined Attributes for MSS CE
Table 8-2 Summary of UAProf (HardwarePlatform) defined Attributes for MSS CE
Table 8-3 Summary of UAProf (SoftwarePlatform) defined Attributes for MSS CE
Table 9-1 CMF parameters for HTTP GET
Table 10-1 Pseudo-streaming HTTP GET request parameters
Table 10-2 Pseudo-streaming HTTP response options
Table 10-3 Example of pseudo-streaming message header range parameters
4 Scope
The objective is to define and standardize the functionality of Multimedia Streaming
Services (MSS) that can be incorporated into the operations of wireless
telecommunications networks. Audio-only streaming and video-only streaming are
special cases of MSS. This document defines the functional characteristics and
requirements of the MSS. Features and system requirements are defined in order for
MSS to be provided in wireless telecommunications networks. This document addresses
unicast services.
5 References
5.1 Normative References
This section provides references to other specifications and standards that are
necessary to implement this document.
1. 3GPP2 S.R0021: "Multimedia Streaming Services – Stage 1".
2. 3GPP2 C.S0017-12: "Data Service Options for Spread Spectrum Systems:
cdma2000 High Speed Packet Data Service Option 33/66".
3. 3GPP2 C.S0017-10: "Data Service Options for Spread Spectrum Systems: Radio
Link Protocol Type 3".
4. 3GPP2 C.S0047: "Link-Layer Assisted Service Options for Voice-over-IP: Header
Removal (SO60) and Robust Header Compression (SO61)".
5. 3GPP2 C.S0014: "Enhanced Variable Rate Codec, Speech Service Option 3 for
Wideband Spread Spectrum Digital Systems".
6. 3GPP2 C.S0020: "High Rate Speech Service Option for Wide Band Spread
Spectrum Communication Systems".
7. 3GPP2 C.S0050: "File Format(s) for Multimedia Services".
8. 3GPP2 C.S0052: "Source-Controlled Variable-Rate Multimode Wideband Speech
Codec (VMR-WB) Service Option 62 for Spread Spectrum Systems".
9. ITU-T Recommendation H.263: "Video Coding for Low Bitrate Communication".
10. ISO/IEC 14496-2:2004: "Information technology - Coding of audio-visual
objects - Part 2: Visual".
11. IETF RFC 3550: "RTP: A Transport Protocol for Real-Time Applications".
12. IETF RFC 768: "User Datagram Protocol".
13. IETF RFC 791: "Internet Protocol: DARPA Internet Program Protocol
Specification".
14. IETF RFC 3551: "RTP Profile for Audio and Video Conferences with Minimal
Control".
15. IETF RFC 3016: "RTP Payload Format for MPEG-4 Audio/Visual Streams".
16. IETF RFC 2429: "RTP Payload Format for the 1998 Version of ITU-T Rec. H.263
Video (H.263+)".
17. IETF RFC 2658: "RTP Payload Format for PureVoice(tm) Audio".
18. IETF RFC 3558: "RTP Payload Format for Enhanced Variable Rate Codecs
(EVRC) and Selectable Mode Vocoders (SMV)".
19. IETF RFC 3984: "RTP Payload Format for H.264 Video".
20. IETF RFC 4348: "RTP Payload formats for Variable-Rate Multimode Wideband
(VMR-WB) Audio codec".
21. IETF RFC 2326: "Real-Time Streaming Protocol".
22. IETF RFC 2327: "Session Description Protocol".
23. IETF RFC 2616: "Hypertext Transfer Protocol - HTTP/1.1".
24. IETF STD 0007: "Transmission Control Protocol".
25. CompuServe Incorporated: "GIF Graphics Interchange Format: A Standard
defining a mechanism for the storage and transmission of raster-based graphics
information", Columbus, OH, USA, 1987.
26. CompuServe Incorporated: "Graphics Interchange Format: Version 89a",
Columbus, OH, USA, 1990.
27. ITU-T Recommendation T.81 (1991) | ISO/IEC 10918-1 (1992): "Information
technology - Digital compression and coding of continuous-tone still images -
Requirements and guidelines".
28. W3C Recommendation: "Synchronized Multimedia Integration Language
(SMIL 2.0) Specification", http://www.w3.org/TR/2005/REC-SMIL2-20050107/
29. ISO/IEC 10646-1 (2000): "Information technology - Universal Multiple-Octet
Coded Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane".
30. IETF RFC 2083: "PNG (Portable Network Graphics) Specification version 1.0".
31. The Unicode Consortium: "The Unicode Standard", Version 3.0, Reading, MA,
Addison-Wesley Developers Press, 2000, ISBN 0-201-61633-5.
32. ANSI X3.4, 1986: "Information Systems; Coded Character Set 7 Bit; American
National Standard Code for Information Interchange".
33. ISO/IEC 8859-1:1998: "Information technology; 8-bit single-byte coded graphic
character sets; Part 1: Latin alphabet No. 1".
34. IETF RFC 2279: "UTF-8, A Transformation format of ISO 10646".
35. 3GPP TS 23.038: "Alphabets and language-specific information".
36. W3C Working Draft Recommendation: "Scalable Vector Graphics (SVG) 1.1
Specification", http://www.w3.org/TR/SVG11, February 2002.
37. W3C Recommendation: "Mobile SVG Profiles: SVG Tiny and SVG Basic",
38. 3GPP TS 26.234: "Transparent End-to-End Packet Switched Streaming Service
(PSS) Protocols and Codecs".
39. Polyphony MIDI Specification Version 1.0, RP-34, MIDI Manufacturers
Association, Los Angeles, CA, February 2002.
40. Scalable Polyphony MIDI Device 5-to-24 Note Profile for 3GPP Version 1.0, RP-35,
MIDI Manufacturers Association, Los Angeles, CA, February 2002.
41. "Standard MIDI Files 1.0", RP-001, in "The Complete MIDI 1.0 Detailed
Specification, Document Version 96.1", The MIDI Manufacturers Association, Los
Angeles, CA, USA, February 1996.
42. 3GPP2 C.S0045: "Multimedia Messaging Service (MMS): Codecs and File
43. W3C Recommendation: "CC/PP structure and vocabularies",
44. W3C Recommendation: "Resource Description Framework (RDF) Vocabulary
Description Language 1.0: RDF Schema", http://www.w3.org/TR/rdf-schema.
45. ITU-T Recommendation H.264 (2003): "Advanced video coding for generic
audiovisual services" | ISO/IEC 14496-10:2003: "Information technology –
Coding of audio-visual objects – Part 10: Advanced Video Coding".
46. ITU-T Recommendation J.127: "Transmission protocol for
multimedia Webcasting over TCP/IP networks".
47. ISO/IEC 14496-3:2001, Information technology - Coding of audio-visual objects
- Part 3: Audio.
48. ISO/IEC 14496-3:2001/Amd.1:2003, Bandwidth Extension.
49. ISO/IEC 14496-3:2001/Amd.1:2003/COR1:2004.
50. 3GPP2 C.S0047: "Link Layer Assisted Service Option for VoIP: Header removal
(SO 60) and Robust Header Compression (SO 61)".
51. 3GPP2 C.S0024: "cdma2000 High Rate Packet Data Specification".
52. 3GPP TS 26.245: "3GPP Timed Text".
53. IETF RFC 2234: "Augmented BNF for Syntax Specifications: ABNF".
54. IETF RFC 2396: "Uniform Resource Identifiers (URI): Generic Syntax".
55. IETF RFC 2732: "Format for Literal IPv6 Addresses in URL's".
56. 3GPP2 C.R1001: "Administration of Parameter Value Assignments for cdma2000
Spread Spectrum Standards".
57. IETF RFC 3625: "The QCP File Format and Media type for speech data".
58. IETF RFC 3555: "MIME Type Registration of RTP Payload Formats".
59. W3C Recommendation: "XHTML 1.0: The extensible HyperText Markup
Language (Second Edition)", http://www.w3.org/TR/xhtml1.
5.2 Informative References
This section provides references to other documents that may be useful for the reader
of this document. The following exemplifies two ways to reference TIA, 3GPP2 [OSA
API] or any other SDO’s documents. Please be consistent in referencing the documents
by either using acronyms or numbers.
6 Definitions and Abbreviations
6.1 Definitions
For the purposes of the present document, the following terms and definitions apply.
codec: a system component that encodes and decodes data (usually audio, video,
etc.) from one representation to another, often with the goal of saving memory space
or transmission bandwidth (compression).
continuous media: media with an inherent notion of time. In the present document
speech, audio, synthetic audio, and video are examples of continuous media.
dynamic media: same as continuous media.
discrete media: media that itself does not contain an element of time. In the
present document text and still image are examples of discrete media. Note that
timed text and bit map graphics can also be continuous media, depending on the
presentation aspects.
file format: an unambiguous method for storing data on a memory device (usually
live content: content that is encoded in real-time and formatted for immediate
media synchronization and media presentation: description of the spatial layout
and temporal behavior of a presentation; it can also contain hyperlinks.
multimedia: a combination of multiple media elements used in a service to enrich
the user experience.
natural media: media that occur naturally. In the present document, speech,
audio, video and still image are examples of natural media.
pre-encoded content: content that is encoded and then stored in a static file.
pseudo streaming: a stream of content distributed by progressive download via a
reliable delivery protocol (e.g., HTTP) meant for real-time consumption.
synthetic media: media that are synthesized from algorithms and/or semantic
descriptions. In the present document, bit map graphics, vector graphics and
synthetic audio are examples of synthetic media.
real-time streaming: a stream of content meant for real-time consumption.
user agent: the module on the terminal that performs MSS-specific operations on a
user’s behalf.
6.2 Abbreviations
For the purpose of this document, the following abbreviations apply:
3G Third Generation system
3G2 3GPP2 File Format for Multimedia Services
3GPP2 Third Generation Partnership Project 2
3GPP Third Generation Partnership Project
AAC Advanced Audio Coding
ABNF Augmented Backus-Naur Form
AMR Adaptive Multi-Rate
AMR-WB Adaptive Multi-Rate – Wideband
AVC Advanced Video Coding
CC/PP Composite Capabilities/Preferences Profile
CE Capability Exchange
CGI Common Gateway Interface
CMF Compact Multimedia Format
DO Data Only
DV Data and Voice
EVRC Enhanced Variable Rate Codec
FFMS File Formats for Multimedia Services
GIF Graphics Interchange Format
HE AAC High Efficiency Advanced Audio Coding
HRPD High Rate Packet Data
HTTP Hypertext Transfer Protocol
IEC International Electrotechnical Commission
IETF Internet Engineering Task Force
IP Internet Protocol
ISO International Organization for Standardization
ITU-T International Telecommunication Union - Telecommunication Standardization Sector
LLAROHC Link Layer Assisted Robust Header Compression
JPEG Joint Photographic Experts Group
MIDI Musical Instrument Digital Interface
MIME Multipurpose Internet Mail Extensions
MMS Multimedia Messaging Service
MPEG Moving Picture Experts Group
MS Mobile Station (same as MSS terminal, MSS client)
MSS Multimedia Streaming Service
NADU Next Application Data Unit
PSS Packet Switched Streaming
QoS Quality of Service
RDF Resource Description Framework
RFC Request for Comments
RLP Radio Link Protocol
ROHC Robust Header Compression
RTCP Real-time Transport Control Protocol
RTP Real-time Transport Protocol
RTSP Real-Time Streaming Protocol
SBR Spectral Band Replication
SDP Session Description Protocol
SMIL Synchronized Multimedia Integration Language
SP-MIDI Scalable Polyphonic MIDI
SO Service Option
SVG Scalable Vector Graphics
TCP Transmission Control Protocol
TIA Telecommunications Industry Association
UDP User Datagram Protocol
URI Uniform Resource Identifier
URL Uniform Resource Locator
VOD Video On Demand
VMR-WB Variable Rate Multimode Wide Band
XML Extensible Markup Language
XHTML Extensible Hypertext Markup Language
7 Multimedia Streaming Services Structure
This service is defined as a point-to-point service. The streaming service is asymmetric
between the sender and the receiver, since the multimedia stream only goes in one
direction from a server (MSS server) to a client (MSS client, MS, MSS terminal). On the
sender side, the MSS includes content creation, storage, packetization, and
transmission. The streaming service supports both retrieval of pre-encoded content and
real-time encoded content. The service supports both real-time streaming (yellow) and
pseudo-streaming (blue). The requirements are different for real-time streaming and
pseudo-streaming.
[Figure 7-1: block diagram of the MSS terminal and the streaming server connected over the air interface, with paths for media (video, speech, audio, other) transport, media files, and control and feedback]
Figure 7-1 Multimedia Streaming Service
For real-time streaming the encoded information is sent through the packetizer at the
MSS server. The receiving MSS terminal (Figure 7-2) de-packetizes the data and then
decodes it with the corresponding multimedia decoders. The output is then sent to local
multimedia devices to be played back. The MSS terminal may also receive a
presentation component, which allows additional spatial and temporal attributes to be
assigned to the received multimedia components beyond those implicit in the transport
and device defaults.
The MSS also includes system control protocols for setting up connections between the
client(s) and server; negotiating various options, capabilities, and configurations; and
communicating with and controlling the various source codecs that the MSS uses. It
also includes advanced procedures for monitoring and maintaining QoS under dynamic
conditions. The MSS is intended to be interoperable (to the greatest extent
possible) with the 3GPP PSS service defined in [38].
[Figure 7-2: block diagram of the Multimedia Streaming Service terminal functions: visual output and video decoder, audio output and speech decoder, de-packetizer with transport and synchronization, MSS control, user interface, and the interface to the wireless communication network]
Figure 7-2 Multimedia Streaming Services Terminal Function
8 Call procedures for Multimedia Streaming
Streaming of continuous media using RTP/UDP/IP (see 9.1) requires a session control
protocol to set up and control the individual media streams. For the transport of
discrete media (images and text), vector graphics, and synthetic audio, this specification
adopts the use of HTTP/TCP/IP (see 9.2). In this case there is no need for a separate
session set-up and control protocol since this is built into HTTP. This section describes
session set-up and control of continuous media.
8.1 Control Protocol (RTSP)
The MSS terminal shall use the Real-Time Streaming Protocol [21] to set up and control
an MSS session. Additionally, MSS clients and servers:
• Shall follow the rules for minimal on-demand playback RTSP implementations in
appendix D of [21];
• Shall implement the DESCRIBE method (see clause 10.2 in [21]);
• Shall implement the Range header field (see clause 12.29 in [21]);
• Shall include the Range header field in all PLAY responses.
TCP should be used to transport RTSP messages reliably.
Additional requirements for RTSP usage during Capability Exchange (CE) are described
in section 8.2.
8.1.1 Mobile-Link-Char Header
To enable MSS clients to report the link characteristics of the radio interface to the MSS
server, the "Mobile-Link-Char" RTSP header is defined. The header takes one or more
arguments. The reported information should be taken from a QoS profile as defined in
. Note that this information is only valid for the wireless link and does not apply
end-to-end. However, the parameters do provide constraints that can be used.
Three parameters are defined that can be included in the header (future extensions are
possible). Any unknown parameter shall be ignored. The three parameters are:
- "MNT": the mobile network type.
- "GBW": the forward link user data rate in kilobits per second as defined by .
- "MTD": the forward link maximum delay, as defined by  in milliseconds.
The "Mobile-Link-Char" header syntax is defined below using ABNF [53]:
mobilelinkheader = "Mobile-Link-Char" ":" link-char-spec *("," 0*1SP link-char-spec)
link-char-spec = char-link-url *(";" 0*1SP link-parameters)
char-link-url = "url" "=" <">url<">
link-parameters = Mobile-Type / Guaranteed-BW / Max-Transfer-delay /
                  extension-type
Mobile-Type = "MNT" "=" 1*CHAR
Guaranteed-BW = "GBW" "=" 1*DIGIT ; kbps
Max-Transfer-delay = "MTD" "=" 1*DIGIT ; ms
extension-type = token "=" (token / quoted-string)
DIGIT = as defined in 
CHAR = as defined in 
token = as defined in 
quoted-string = as defined in 
url = as defined in 
The "Mobile-Link-Char" header can be included in a request using any of the following
RTSP methods: SETUP, PLAY, OPTIONS, and SET_PARAMETER. The header shall not
be included in any response. The header can contain one or more characteristic
specifications. Each specification contains a URI that can either be absolute or relative.
Any relative URI uses the RTSP request URI as a base. The URI points to the media
component that the given parameters apply to. This can either be an individual media
stream or a session aggregate.
The "Mobile-Link-Char" header should be included in a SETUP or PLAY request by the
client to give the initial values for the link characteristics. A SET_PARAMETER or
OPTIONS request can be used to update the Mobile-Link-Char values in a session
currently playing. It is strongly recommended that SET_PARAMETER be used, as this
has the correct semantics for the operation. Additionally, it requires less overhead both
in bandwidth and server processing. If the client has initially reported these parameters
and they are changed during the session, the client shall update these parameters by
including the "Mobile-Link-Char" header in a SET_PARAMETER or OPTIONS request.
When the parameters are updated, all previously signaled values become undefined
and only the values given in the update are defined. This means that even if a
parameter has not changed it must be included in the update.
The entries of the "MNT" field have the following syntax:
3GPP2-<Network Type ID>-<Release information>
Network Type ID:
1. SSS – Spread Spectrum Systems
2. HRPD – High Rate Packet Data
Example 1:
Mobile-Link-Char: url="rtsp://server.foo.com/presentation.3g2"; MNT=3GPP2-SSS-REL-C; GBW=32; MTD=2000
In the above example the client tells the server that its mobile network is cdma2000®
Spread Spectrum Systems Release C (1xEV-DV) in Assured Mode with a bit rate of
32 kbps and a maximum transfer delay of 2.0 seconds. If only the mobile network type
("MNT") parameter is present, the server shall assume Non-Assured Mode.
cdma2000® is the trademark for the technical nomenclature for certain specifications and
standards of the Organizational Partners (OPs) of 3GPP2. Geographically (and as of the date of
publication), cdma2000® is a registered trademark of the Telecommunications Industry Association (TIA-
USA) in the United States.
Example 2:
Mobile-Link-Char: url="rtsp://server.foo.com/presentation.3g2"; MNT=3GPP2-HRPD-
In this example the client tells the server that it is operating on a Non-Assured 1xEV-DO
Release 0 mobile network.
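The parsing rules of this section can be sketched in Python. This is a minimal, non-normative illustration: the function name `parse_mobile_link_char` and the `non_assured` key are assumptions made for this example, and the sketch assumes that neither `"` nor the sequence `,url=` occurs inside a quoted URL.

```python
import re

KNOWN_INT_PARAMS = {"GBW", "MTD"}  # kbps and ms, per the ABNF above


def parse_mobile_link_char(value):
    """Parse a Mobile-Link-Char header value into a list of dicts.

    Hypothetical helper for illustration; not defined by the specification.
    """
    specs = []
    # A new link-char-spec starts at each comma that is followed by url=.
    for raw in re.split(r',\s*(?=url=)', value.strip()):
        match = re.match(r'url="\s*([^"]*?)\s*"', raw)
        if match is None:
            raise ValueError('link-char-spec must start with url="..."')
        spec = {"url": match.group(1)}
        for param in raw[match.end():].split(";"):
            name, sep, val = param.strip().partition("=")
            if not sep:
                continue  # empty segment between separators
            if name == "MNT":
                spec[name] = val
            elif name in KNOWN_INT_PARAMS:
                spec[name] = int(val)
            # Any other (unknown) parameter is ignored, as the text requires.
        # Only MNT present: the server assumes Non-Assured Mode.
        spec["non_assured"] = (
            "MNT" in spec and "GBW" not in spec and "MTD" not in spec
        )
        specs.append(spec)
    return specs
```

Applied to the header value of Example 1, the sketch returns a single entry carrying the URL, the MNT string, GBW as 32, MTD as 2000, and `non_assured` set to false.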
8.1.2 The 3GPP-Adaptation Header
To enable MSS clients to request a desired buffer size, a new RTSP request and
response header is defined. The header can be used in the methods SETUP, PLAY,
OPTIONS, and SET_PARAMETER. The header, defined in ABNF [53], has the following
syntax:
3GPP-adaptation-def = "3GPP-Adaptation" ":" request-spec 0*("," request-spec)
request-spec = url-def *buffer-params
buffer-params = ";" buffer-size-def
              / ";" target-time-def
url-def = "url" "=" <"> url <">
buffer-size-def = "size" "=" 1*9DIGIT ; bytes
target-time-def = "target-time" "=" 1*9DIGIT ; ms
url = ( absoluteURI / relativeURI )
absoluteURI and relativeURI are defined in [54] and updated in [55]. The base URI for
any relative URI is the RTSP request URI.
The "3GPP-Adaptation" header shall be sent in responses to requests containing this
header. The MSS server shall not change the values in the response header. The
presence of the header in the response indicates to the client that the server
acknowledges the request.
The buffer size signaled in the "3GPP-Adaptation" header shall correspond to a
reception and de-jittering buffer that has this given amount of space for complete RTP
packets (including the RTP header). The specified buffer size shall also include any
Annex C pre-decoder buffer space used for this media, as the two buffers cannot be
separated.
The target time signaled in the value of the "target-time" parameter is the targeted
minimum buffer level or, in other words, the client's desired amount of playback time in
milliseconds to guarantee interrupt-free playback and allow the server to adjust the
transmission rate, if needed.
Example:
3GPP-Adaptation: url="rtsp://server.foo.com/presentation.3g2/streamID=1";
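As a rough illustration of the syntax and the echo rule above, the following Python sketch builds a "3GPP-Adaptation" header for a single request-spec. The function names and the numeric values used in the usage note are hypothetical; only the header name, the `url`, `size`, and `target-time` parameters, and the nine-digit limit come from the ABNF.

```python
MAX_9_DIGITS = 999_999_999  # 1*9DIGIT allows at most nine decimal digits


def build_adaptation_header(url, size=None, target_time=None):
    """Return a 3GPP-Adaptation header line for one request-spec.

    size is in bytes and target_time in milliseconds (hypothetical helper).
    """
    value = f'url="{url}"'
    for name, number in (("size", size), ("target-time", target_time)):
        if number is None:
            continue
        if not isinstance(number, int) or not 0 <= number <= MAX_9_DIGITS:
            raise ValueError(f"{name} must be an integer of 1 to 9 digits")
        value += f";{name}={number}"
    return f"3GPP-Adaptation: {value}"


def build_response_header(request_header):
    """The server acknowledges by echoing the header with unchanged values."""
    return request_header
```

For instance, calling `build_adaptation_header` with a stream URL, an assumed `size` of 20000 and an assumed `target_time` of 5000 produces a header that requests a 20000-byte buffer and 5 seconds of target playback time; the server response carries the same header unchanged.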
8.1.3 The "avc-rtpinterleaving" Header
The MSS client should implement the RTSP "avc-rtpinterleaving" header when
H.264/AVC RTP interleaved packetization mode is supported. The header can be used
in the methods SETUP, PLAY, OPTIONS, and SET_PARAMETER.
The entry of the "avc-rtpinterleaving" field defines how much of the client's reported
jitter buffer space can be used for interleaved packets. The value of the
"avc-rtpinterleaving" field is defined in bytes. If the value of the "avc-rtpinterleaving" field
equals zero, the interleaved packetization mode is prohibited. This header also indicates
that all the AVC RTP packetization types are supported.
Example 1:
avc-rtpinterleaving: url="rtsp://server.foo.com/presentation.3g2";
8.2 Session Set-up (SDP)
SDP shall be used for MSS set-up. MSS servers shall provide and MSS clients shall
interpret the SDP syntax according to the SDP specification [22] and appendix C of [21].
The SDP delivered to the MSS client shall declare the media types to be used in the
session using a codec specific MIME media type for media as described in section 
of this document.
The SDP [22] specification requires certain fields to always be included in an SDP file.
Apart from this an MSS server shall include the following fields in the SDP:
• "a=control:" according to clauses C.1.1, C.2 and C.3 in [21];
• "a=range:" according to clause C.1.5 in [21];
• "a=rtpmap:" according to clause 6 in [22];
• "a=fmtp:" according to clause 6 in [22].
The bandwidth field in SDP shall be used to indicate to the MSS terminal the amount of
bandwidth that is required for the session and the individual media in the presentation.
Therefore, an MSS server should include the "b=AS:" field in SDP (both at the session
and media level) and an MSS client shall be able to interpret this field. For RTP based
applications, AS gives the RTP "session bandwidth" (including UDP/IP overhead) as
defined in section 6.2 of [11].
NOTE: The SDP parsers and/or interpreters should be able to accept NULL values in
the 'c=' field (e.g. 0.0.0.0 in IPv4 case). This may happen when the media content does
not have a fixed destination address. For more details, see Section C.1.7 of [21] and
Section 6 of [22].
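The field requirements above lend themselves to a simple check on a received session description. The Python sketch below is a simplified illustration, not a full SDP parser: it only looks for the presence of the four required attributes anywhere in the description and collects the "b=AS:" values (kilobits per second in SDP), without distinguishing session level from media level. The function names are assumptions made for this example.

```python
REQUIRED_ATTRIBUTES = ("a=control:", "a=range:", "a=rtpmap:", "a=fmtp:")


def missing_required_fields(sdp_text):
    """Return the required attribute prefixes that do not occur in the SDP."""
    lines = [line.strip() for line in sdp_text.splitlines()]
    return [
        prefix
        for prefix in REQUIRED_ATTRIBUTES
        if not any(line.startswith(prefix) for line in lines)
    ]


def as_bandwidths_kbps(sdp_text):
    """Return every b=AS: value in order of appearance, as integers (kbps)."""
    values = []
    for line in sdp_text.splitlines():
        line = line.strip()
        if line.startswith("b=AS:"):
            values.append(int(line[len("b=AS:"):]))
    return values
```

A client could use such a check to log descriptions from servers that omit a required field; a stricter implementation would track which "m=" section each line belongs to.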
8.2.1 Session Capability Exchange (SDP)
An advanced method for Capability Exchange (CE) is defined in section 8.3 of this
document. When an MSS client or server supports capability exchange, it shall
support the transport of profile information over both HTTP and RTSP as defined in
section 5.2.5, “Signalling of profile information between client and server,” of [38].
[Note that the 3GPP acronym PSS is equivalent to the 3GPP2 acronym MSS].
38 8.2.2 Session Video Buffering (SDP)
39 The following Buffering of Video (Normative) related media level SDP fields are defined
3G Multimedia Streaming Services 20
1 for MSS:
2 • "a=X-predecbufsize:<size of the hypothetical pre-decoder buffer>"
3 This gives the suggested size of the Appendix C hypothetical pre-decoder buffer in
5 • "a=X-initpredecbufperiod:<initial pre-decoder buffering period>"
6 This gives the required initial pre-decoder buffering period specified according to
7 Appendix C. Values are interpreted as clock ticks of a 90-kHz clock. That is, the
8 value is incremented by one for each 1/90 000 seconds. For example, value 180
9 000 corresponds to a two second initial pre-decoder buffering.
10 • "a=X-initpostdecbufperiod:<initial post-decoder buffering period>"
11 This gives the required initial post-decoder buffering period specified according to
12 Appendix C. Values are interpreted as clock ticks of a 90-kHz clock.
13 • "a=X-decbyterate:<peak decoding byte rate>"
14 This gives the peak decoding byte rate that was used to verify the compatibility of
15 the stream with Appendix C. Values are given in bytes per second.
16 If none of the attributes "a=X-predecbufsize:", "a=X-initpredecbufperiod:", "a=X-
17 initpostdecbufperiod:", and "a=X-decbyterate:" are present, clients should not expect a
18 packet stream according to Appendix C. If at least one of the listed attributes is present,
19 the transmitted video packet stream shall conform to Appendix C. If at least one of the
20 listed attributes is present, but some of the listed attributes are missing in an SDP
21 description, clients should expect a default value for the missing attributes according to
22 Appendix C.
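As an illustration of how a client might read these attributes, the following sketch parses media-level SDP attribute lines and converts a 90-kHz tick value to seconds. The function names, the lower-cased dictionary keys, and the sample SDP lines are assumptions made for this example and are not defined by this specification.

```python
# Illustrative sketch only; names and return layout are assumptions.
CLOCK_HZ = 90_000  # buffering periods are expressed in ticks of a 90-kHz clock

# Names are compared in lower case so that "X-" and "x-" prefixes are both accepted.
BUFFER_ATTRS = (
    "x-predecbufsize",
    "x-initpredecbufperiod",
    "x-initpostdecbufperiod",
    "x-decbyterate",
)

def parse_buffer_attrs(sdp_lines):
    """Return {lower-cased attribute name: integer value} for the attributes found."""
    found = {}
    for line in sdp_lines:
        line = line.strip()
        if not line.startswith("a="):
            continue
        name, sep, value = line[2:].partition(":")
        if sep and name.lower() in BUFFER_ATTRS:
            found[name.lower()] = int(value)
    return found

def ticks_to_seconds(ticks):
    """Convert 90-kHz clock ticks to seconds."""
    return ticks / CLOCK_HZ

attrs = parse_buffer_attrs([
    "m=video 0 RTP/AVP 96",
    "a=X-initpredecbufperiod:180000",
    "a=X-decbyterate:16000",
])
initial_delay_s = ticks_to_seconds(attrs["x-initpredecbufperiod"])  # 2.0 seconds
```

Attributes that are absent are simply missing from the returned dictionary, so the caller can apply the Appendix C defaults itself.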
23 8.2.3 Additional Attributes (SDP)
24 The MSS client should be able to interpret the following attribute:
26 This indicates that random access is not allowed for this session. This may imply that
27 the stream is live and that any PLAY request will display the stream from its current
28 point. A PLAY request, if used for an on-demand session after the initial one, will be
29 treated by the server to have no Range header regardless of the actual presence of it.
30 That is, the PLAY request indicates a restart from the beginning or resuming from the
31 pause point at the server's discretion (for details, refer to Section 10.5 of ). To avoid
32 an unnecessary delay that can be caused in this process, the MSS client should
33 deactivate the random access feature for such a session. E.g., deactivation of rewind
34 and fast forward buttons in the user interface.
36 For live sessions indicated by an open ended a=range attribute, a PAUSE will mean that
37 the MSS client does not receive the stream for the paused interval and the MSS client
38 shall issue a PLAY request to resume with Range header with the indicator, 'now', e.g.,
39 "Range: npt=now-".
41 The MSS client and server should interpret the "alt", "alt-default-id", and "alt-group"
1 attributes as described in . The "alt" attribute is used to replace or add an SDP line
2 to the default configuration. The "alt-default-id" attribute is used to assign an
3 alternative identifier to the default alternative. The "alt-group" attribute is used to
4 define grouping alternatives from which the client can select the most appropriate.
5 These attributes are used together to create combinations consisting of, e.g., one audio
6 and one video alternative. It is the server’s responsibility to create meaningful grouping
8 8.2.4 Session 3GPP-Adaptation-Support (SDP)
9 The MSS server should implement a media level only SDP attribute when "Bit Rate
10 Adaptation" is supported.
11 "a=3GPP-Adaptation-Support:<report frequency>"
12 When an MSS client receives an SDP description in which the SDP attribute "3GPP-
13 Adaptation-Support" is present, it shall use in its subsequent RTSP signaling the
14 "3GPP-Adaptation" header as defined in section 8.1.2, as well as the RTCP NADU APP
15 packet defined in section 9.1.1.
16 The report frequency value defines the frequency of the buffer level feedback signaling.
17 For example, if the value is 2 then the client shall send the NADU APP packet every two
18 RTCP packets.
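The report frequency rule can be sketched as a simple counter check. The function name is hypothetical, and the choice of which RTCP packet in each group carries the NADU APP packet (here, the last one) is an assumption; the text only fixes the ratio.

```python
# Illustrative sketch only; the function name and the phase of the schedule are assumptions.
def should_send_nadu(rtcp_packet_index, report_frequency):
    """rtcp_packet_index counts the RTCP packets sent so far, starting at 1."""
    if report_frequency < 1:
        raise ValueError("report frequency must be a positive integer")
    return rtcp_packet_index % report_frequency == 0

# With a report frequency of 2, every second RTCP packet carries a NADU APP packet.
schedule = [should_send_nadu(i, 2) for i in range(1, 7)]
```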
19 When the MSS client receives "3GPP-Adaptation-Support" signaling from the server, the
20 Annex C related RTSP signaling "x-predecbufsize" and "x-initpostdecbufperiod" can be
21 ignored. The RTSP "x-initpredecbufperiod" and SDP signaling defined in clause 8.2.2
22 will remain effective.
23 8.2.5 Session HE AAC Support (SDP)
24 The terminal shall support the signaling types "implicit" and "hierarchical explicit"
25 signaling (as defined in ). If implicit signaling is used, the AAC object type is
26 signaled to maintain backwards compatibility. If, in such a case, the sampling rate as
27 indicated by the AAC object type descriptor (in the SDP) is 24 kHz or below and "SBR-
28 enabled" (see below) is not specified in the SDP, the output shall be configured to twice
29 the AAC sampling rate. In both types of signaling, the MIME type indicated to the
30 terminal for HE AAC shall be the same as for AAC LC.
31 Servers using implicit signaling shall include the "SBR-enabled" parameter in the SDP
32 "a=fmtp" line. "SBR-enabled" shall be set to "1" for streams containing SBR and shall be
33 set to "0" for streams not containing SBR. Terminals may rely on this parameter to set
34 the correct output sampling rate to either the indicated rate (where "SBR-enabled" is set
35 to "0") or twice the indicated rate (where "SBR-enabled" is set to "1").
36 The above signaling support is not required if only AAC is supported.
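The output sampling rate rules for implicit signaling can be summarized in a short sketch. The function name and the use of None for an absent "SBR-enabled" parameter are assumptions for illustration. The final branch (rate above 24 kHz with no "SBR-enabled" parameter) is not covered by the text above; returning the indicated rate there is an assumption.

```python
# Illustrative sketch only; names are assumptions, not part of this specification.
def output_sampling_rate(aac_rate_hz, sbr_enabled=None):
    """Pick the decoder output rate under implicit signaling.

    aac_rate_hz: sampling rate indicated for the AAC object type in the SDP.
    sbr_enabled: None if "SBR-enabled" is absent from a=fmtp, otherwise 0 or 1.
    """
    if sbr_enabled == 1:
        return 2 * aac_rate_hz      # stream contains SBR
    if sbr_enabled == 0:
        return aac_rate_hz          # stream does not contain SBR
    if aac_rate_hz <= 24_000:
        return 2 * aac_rate_hz      # parameter absent, rate of 24 kHz or below
    return aac_rate_hz              # assumption: case not described in the text
```

For example, an indicated rate of 22050 Hz without "SBR-enabled" yields an output rate of 44100 Hz, while the same rate with "SBR-enabled" set to 0 stays at 22050 Hz.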
37 8.3 Capability Exchange (CE)
39 8.3.1 Overview
40 Enhanced capability negotiation provides additional functionality for the MSS by
1 enabling the MSS servers to provide tailored content suitable for a particular MSS
2 terminal. Another very important task is to provide a smooth transition between
3 different releases of MSS. A Capability Exchange (CE) mechanism for streaming is
4 defined in  and should be supported by MSS clients and servers with the additional
5 requirements as described in this document.
6 8.3.2 Description
7 When CE is supported the corresponding device capability profile shall be an RDF
8 document that follows the structure of the CC/PP framework and the CC/PP
9 application UAProf as described in clause 5.2.2 of .
10 The CE vocabulary and its usage is also defined in  and summarized in Table 8-1,
11 Table 8-2, and Table 8-3. Some additional cdma2000 values are also defined. When CE
12 is supported, MSS CE vocabulary usage shall be as described in .
13 When support for "media/types" in  conflicts with this specification, support as
14 described in this document shall take precedence.
15 The MSS base vocabulary contains one component called "Streaming". A vocabulary
16 extension to UAProf shall be defined as an RDF schema. This schema can be found in
17 Annex F or . The schema together with the description of the attributes in this
18 section, defines the vocabulary. All MSS attributes shall be put in the "Streaming" component.
20 MSS servers should understand the attributes of the "Streaming" component of the
21 MSS base vocabulary.
MSS Capability Exchange Vocabulary ( component = "Streaming" )
Name 3GPP2 values Applicable 3GPP Values
AvcRtpInterleaving Integer value greater
than or equal to 0
Brands List of supported 3G2 List of supported 3GP profiles
profiles identified by identified by brand.
MaxPolyphony Integer value between 5 and
StreamingAccept List of content types (MIME
types) the application
StreamingAccept-Subset List of content types for which
the PSS application supports
PssVersion or "3GPP2-MSS-0" "3GPP-R4","3GPP-R5","3GPP-
RenderingScreenSize Two integer values equal or
greater than zero. A value
equal "0x0"means that there
exists no possibility to render
visual MSS presentations.
SmilBaseSet "SMIL-3GPP2-FFMS-0", "SMIL-3GPP-R4","SMIL-3GPP-
SmilModules This attribute defines a list of
SMIL 2.0 modules supported
by the client. If the
SmilBaseSet is used those
modules do not need to be
explicitly listed here. In that
case only additional module
support needs to be listed.
SmilAccept This attribute defines a list of
content types (MIME types)
that can be part of a SMIL
SmilAccept-Subset This attribute defines a list of
content types for which the
PSS application supports a
subset. MIME types can in
most cases effectively be used
to express variations in
support for different media
VideoDecodingByteRate Integer value greater than or
equal to 8000 (Bytes per
VideoInitialPostDecoderBuf Integer value equal to or
feringPeriod greater than zero
Version 1.0 and A for 3GPP2 SMIL profiles are defined in .
VideoPreDecoderBufferSize Integer value equal to or
greater than zero. Values
greater than one but less than
the default buffer size defined
in Annex C are not allowed
2 Table 8-1 Summary of non-UAProf defined Attributes for MSS CE
3 In the UAProf vocabulary  there are several attributes that are of interest for the
4 MSS. Table 8-2 and Table 8-3 summarize the UAProf attributes recommended for MSS
5 applications for "HardwarePlatform" and "SoftwarePlatform" respectively.
6 MSS servers should understand the recommended attributes from the UAProf
7 vocabulary . An MSS server may additionally support other UAProf attributes.
UAProf Capability Exchange Vocabulary (component = "HardwarePlatform" )
Name 3GPP2 values Applicable UAProf Values
BitsPerPixel The number of bits of color or
grayscale information per
ColorCapable Whether the device display
supports color or not: "Yes" or
PixelAspectRatio Ratio of pixel width to pixel
PointingResolution Type of resolution of the
pointing accessory supported
by the device: "Pixel"
Model Model number assigned to the
terminal device by the vendor
or manufacturer: "Viper"
Vendor Name of the vendor
manufacturing the terminal
device: "Dodge"
10 Table 8-2 Summary of UAProf (HardwarePlatform) defined Attributes for MSS CE
UAProf Capability Exchange Vocabulary ( component = "SoftwarePlatform" )
Name 3GPP2 values Applicable UAProf Values
CcppAccept-Charset List of character sets the
CcppAccept-Encoding List of transfer encodings the
CcppAccept-Language List of preferred document
2 Table 8-3 Summary of UAProf (SoftwarePlatform) defined Attributes for MSS CE
3 The use of RDF enables an extensibility mechanism for CC/PP-based schemas to
4 address the evolution of new devices and applications. The MSS profile schema
5 specification provides a base vocabulary, which may need to be updated to express new
6 attributes. If the base vocabulary is updated, a new unique namespace will be assigned
7 to the updated schema. The base vocabulary shall only be changed by the TSG
8 responsible for the present document. All extensions to the profile schema shall be
9 governed by the methods defined in section 220.127.116.11 “Vocabulary definitions” of 
10 clause 7.7
11 • When an MSS client or server supports CE, it shall support the profile information
12 transport over both HTTP and RTSP between client and server as defined in section
13 5.2.5 “Signalling of profile information between client and server” of . [Note that
14 the 3GPP acronym PSS is equivalent to the 3GPP2 acronym MSS].
15 When device CE profiles are merged, they shall be merged as described in the clause
16 titled "Merging device capability profiles" in .
17 When device CE profiles are exchanged between a CE profile server and an MSS server,
18 they shall be exchanged as described in clause titled "Profile transfer between PSS
19 server and the device profile server" in .
20 8.4 Presentation/Layout Description
21 The presentation/layout description describes the spatial and temporal relationship
22 between the components of the media stream. The spatial format describes the
23 rendering of the visual media components on the MSS terminal display. The temporal
24 format describes when and how long visual media should be displayed or audio media
26 The MSS terminal should support "presentation/layout description". If the MSS
27 terminal supports "presentation/layout description" it
28 • Shall support 3GPP2 SMIL language profile as described in .
29 8.4.1 SMIL Usage in MSS
30 If the terminal has some prior knowledge about the file type it is about to retrieve, e.g.
31 file extensions, then:
32 When retrieving a SMIL file with HTTP the client should include profile information in
33 the GET request. This way the HTTP server can deliver an optimized SMIL presentation
34 to the client. A SMIL presentation can include links to static media. The server should
35 optimize the SMIL file so that links to the referenced static media are adapted to the
36 requesting client. When the "x-wap-profile-warning" indicates that content selection has
37 been applied (201-203) the MSS client should assume that no more capability exchange
38 has to be performed for the static media components. In this case it should not send
39 any profile information when retrieving static media to be included in the SMIL
40 presentation. This will minimize the HTTP header overhead.
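The decision described above can be sketched as a small check on the warning codes. The function name is hypothetical, and the header value format (a comma-separated list in which each entry begins with a numeric code) is an assumption made for this example.

```python
# Illustrative sketch only; the header value format assumed here is a
# comma-separated list whose entries each start with a numeric code.
CONTENT_SELECTION_APPLIED = {201, 202, 203}

def needs_profile_for_static_media(warning_header_value):
    """Return False when the warning shows content selection was already applied."""
    if not warning_header_value:
        return True  # no warning received: keep sending profile information
    codes = set()
    for part in warning_header_value.split(","):
        token = part.strip().split(" ", 1)[0]
        if token.isdigit():
            codes.add(int(token))
    return not (codes & CONTENT_SELECTION_APPLIED)
```

A client using this check would omit the profile headers on the follow-up GET requests for static media only when the function returns False.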
1 8.5 Data Channel Setup
2 The IP bearer for carrying the multimedia streaming traffic shall be set-up according to
3 the procedures described in:
4 • Service option 33 or 66  for cdma2000 1x and cdma2000 1xEV-DV and
5 • Service Option 591  for High Rate Packet Data .
6 Service Option 33/66 uses the Radio Link Protocol Type 3 . The RLP retransmission
7 scheme is negotiated between the MS and base station using the RLP_BLOB procedures
9 As a general rule RLP allows the bearer to trade bandwidth and delay for reliability. As
10 a rule of thumb the bandwidth will be reduced by a factor of:
11 Effective Bandwidth = 1/(1+SUM(FERn)) n = 1…Num RLP Retransmissions,
12 and the delay is increased by n*RLP_RoundTripTime. Typically RLP_RoundTripTime
13 is 100 to 200 ms.
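A worked example of this rule of thumb follows. It assumes that "FERn" denotes the frame error rate raised to the power n (a flattened superscript), so that the sum is the expected number of extra transmissions per frame; the function names are hypothetical.

```python
# Illustrative sketch only; assumes "FERn" means FER raised to the power n.
def rlp_effective_bandwidth_factor(fer, num_retransmissions):
    """1 / (1 + sum of FER**n for n = 1..num_retransmissions)."""
    extra = sum(fer ** n for n in range(1, num_retransmissions + 1))
    return 1.0 / (1.0 + extra)

def rlp_added_delay_ms(num_retransmissions, round_trip_ms):
    """Worst-case added delay: one RLP round trip per retransmission."""
    return num_retransmissions * round_trip_ms

# A 10% frame error rate with 2 retransmission rounds and a 150 ms RLP round trip.
factor = rlp_effective_bandwidth_factor(0.10, 2)   # 1 / 1.11, roughly 0.90
delay = rlp_added_delay_ms(2, 150)                 # 300 ms
```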
14 8.5.1 Header Compression
15 Header compression is a method for greatly compressing redundant information in IP,
16 UDP, and RTP headers. Depending on the media type and the number of media frames
17 in a packet this header information can actually consume more bandwidth than the
18 raw media data.
19 There are several general-purpose and media specific header compression methods
20 which can be used with MSS, namely Robust Header Compression (ROHC) and the
21 cdma2000 specific header reduction methods  service option 60 (Header removal)
22 and service option 61 (LLAROHC). Service options 60 and 61 are only applicable to
23 cdma2000 vocoders (EVRC, 13K and VMR-WB), which conform to the RS1 (9.6kbps) or
24 RS2 (14.4kbps) channels. When service option 60 is in use the IP session for service
25 option 60 terminates at the PDSN and not at the MSS client.
[Figure 8-1 placeholder. Elements shown: MSS terminal (codec), air interface, and PDSN. Traffic shown: IP traffic (uncompressed, compressed) for control, video, etc., and voice traffic (uncompressed, compressed). Figure note: the service option assignment made for identification is not carried over the air interface.]
1 Figure 8-1 Data Channel Set-up (SO 33/66, 60 and 61)
2 8.6 Session Termination
3 RTSP shall be used to terminate the session when the session is completed. The packet
4 data service call is terminated using the procedures defined in Service Option 33/66
5 . If service option 60 or 61 is in use, it should be torn down before the service
6 option 33/66 teardown.
1 9 Transport Protocol
2 The MSS client shall support packetization and transport via both RTP  /UDP 
3 /IP  and HTTP /TCP  /IP to receive components of the MSS.
4 Support for specific media types is described in the remainder of this section.
5 9.1 RTP/UDP
6 The MSS client shall support transport of continuous media (video, speech, audio and
7 timed text) via RTP/UDP/IP. (Figure 9-1).
8 The MSS client shall provide an IP bearer as described in section 8.5.
[Figure 9-1 placeholder. Layers shown, top to bottom: Application (Other Media, Session Control, Video, Presentation Description, Capability Exchange, Audio); Transport (TCP, UDP); IP; RLP and R-P; Air Link and Physical. Nodes shown: MSS Terminal, BTS/BSC/PCF, PDSN/Server.]
11 Figure 9-1 Protocol Stack for Multimedia Streaming Service
12 The MSS terminal shall support RTP packetization and RTCP feedback as defined in
13 "RTP: A Transport Protocol for Real-Time Applications"  and in "RTP Profile for Audio
14 and Video Conferences with Minimal Control"  for the following media types when
16 For the MPEG-4 video coder, the payload format shall be as defined in "RTP Payload
17 Format for MPEG-4 Audio/Visual Streams" .
18 For the H.263 video coder, the payload format shall be as defined in "RTP Payload
19 Format for the 1998 Version of ITU-T Rec. H.263 (H.263+)" .
20 For the H.264 video coder, the payload format shall be as defined in "RTP Payload
21 Format for H.264 Video", .
22 For the 13K speech coder, the payload format shall be as defined in "RTP Payload
23 Format for PureVoice(tm) Audio" .
1 For the EVRC speech coder, the payload format shall be as defined in "RTP Payload
2 Format for Enhanced Variable Rate Codecs (EVRC) and Selectable Mode Vocoders
3 (SMV)" .
4 For the VMR-WB speech coder, the payload format shall be as defined in "RTP Payload
5 formats for the Variable-Rate Multimode Wideband (VMR-WB) Audio codec" .
6 For the MPEG4 AAC Audio Coder, the payload format shall be as defined in "RTP
7 Payload Format for MPEG-4 Audio Visual Streams" .
8 For the MPEG-4 HE AAC Audio Coder, the payload format shall be as defined in "RTP
9 Payload Format for MPEG-4 Audio Visual Streams" .
10 NOTE: The number of octets received by an MSS terminal can be closely estimated by
11 multiplying the "sender's octet count" field of the RTCP Sender report by the fraction of
12 packets not lost, i.e. one minus the "fraction lost" value of the corresponding report block in the RTCP Receiver report.
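The estimate in this note can be sketched as follows, assuming the intended quantity is the sender's octet count scaled by the share of packets that were not lost. In RTCP the "fraction lost" field is an 8-bit fixed-point number (lost packets divided by expected packets, multiplied by 256), so the sketch works in units of 1/256. The function name is hypothetical.

```python
# Illustrative sketch only; assumes the estimate scales the octet count by the
# share of packets that were NOT lost.
def estimate_received_octets(sender_octet_count, fraction_lost):
    """fraction_lost is the raw 8-bit RTCP field (0..255, in units of 1/256)."""
    if not 0 <= fraction_lost <= 255:
        raise ValueError("fraction lost is an 8-bit field")
    return sender_octet_count * (256 - fraction_lost) // 256

estimate = estimate_received_octets(1_000_000, 64)  # 64/256 = 25% lost
```

The result is only approximate, because "fraction lost" covers the most recent reporting interval while the sender's octet count is cumulative.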
13 9.1.1 RTCP extension (NADU APP packet)
14 An MSS client should implement the RTCP APP packet (Application-Defined RTCP
15 Packet) as defined in section 6.7 of  to support the client buffer level feedback
16 function for reporting the Next Application Data Unit (NADU).
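    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P| subtype |   PT=APP=204  |             length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SSRC/CSRC                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          name (ASCII)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   application-dependent data                ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+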
29 Figure 9-2: Generic Format of an RTCP APP packet.
31 · name: The NADU APP data format is detected from the name "PSS0" (value: 0x50535330).
33 · subtype: This field shall be set to 0 for NADU format.
34 · Length: The number of 32 bit words -1, as defined in RFC 3550 . This means
35 that the field will be 2+3*N, where N is the number of sources reported on. The
36 length field will typically be 5, i.e. 24-byte packets.
37 · Application-dependent data: One or more of the following data format blocks (as described in Figure 9-3) can be
38 included in the application-dependent data location of the APP packet. The APP
39 packets length field is used to detect how many blocks of data are present. The
40 block shall be sent for the SSRCs for which there are a report block, part of either a
41 Receiver Report or a Sender Report, included in the RTCP compound packet. An
1 NADU APP packet shall not contain any other data format than the one described in
2 Figure 9-3 below.
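    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                              SSRC                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Playout Delay            |            NSN                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Reserved           |   NUN   | Free Buffer Space (FBS)       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+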
13 Figure 9-3: Data format block for NADU reporting
14 SSRC: The SSRC of the media stream the buffered packets belong to.
15 Playout delay (16 bits): The difference between the scheduled playout time of the
16 next ADU to be decoded and the time of sending the NADU APP packet, as
17 measured by the media playout clock, expressed in milliseconds. The client may
18 choose not to indicate this value by using the reserved value (0xFFFF). In case of
19 an empty buffer, the playout delay is not defined and the client should also use the
20 reserved value 0xFFFF for this field.
21 The playout delay allows the server to have a more precise value of the amount of
22 time before the client will underflow. The playout delay shall be computed until the
23 actual media playout (i.e., audio playback or video display).
25 NSN (16 bits): The RTP sequence number of the next ADU to be decoded for the
26 SSRC reported on. In the case where the buffer does not contain any packets for
27 this SSRC, the next not yet received sequence number shall be reported, i.e. an
28 NSN value that is one larger than the least significant 16 bits of the RTCP SR or RR
29 report block's "extended highest sequence number received".
30 NUN (5 bits): The unit number (within the RTP packet) of the next ADU to be
31 decoded. The first unit in a packet has a unit number equal to zero. The unit
32 number is incremented by one for each ADU in an RTP packet. In the case of an
33 audio codec, an ADU is defined as an audio frame. In the case of H.264 (AVC), an
34 ADU is defined as a NAL unit. In the case of H.263 and MPEG4 Visual Simple
35 Profile, an ADU is defined as a whole or a part of an H.263 video picture or MPEG4
36 VOP that is included in a RTP packet. In the specific case of H.263, each packet
37 carries a single ADU and the NUN field shall be thus set to zero. Future additions
38 of media encoding or transports capable of having more than one ADU in each RTP
39 payload shall define what shall be counted as an ADU for this format.
40 FBS (16 bit): The amount of free buffer space available in the client at the time of
41 reporting. The reported free buffer space shall all be part of the buffer space that
42 has been reported as available for adaptation by the 3GPP-Adaptation RTSP
43 header, see section 8.1.2. The amount of free buffer space is reported in number
44 of complete 64 byte blocks, thus allowing for up to 4194304 bytes to be reported as
45 free. If more is available, it shall be reported as the maximal amount available, i.e.
1 4194304 with a field value 0xffff.
2 • Reserved (11 bits): These bits are not used and shall be set to 0 and shall be
3 ignored by the receiver.
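The packet layout described in this section can be sketched with Python's struct module. The function names are hypothetical, and clamping a real playout delay to 0xFFFE (so that it never collides with the reserved value 0xFFFF) is a design choice of this example rather than a requirement stated above.

```python
# Illustrative sketch only; function names and clamping policy are assumptions.
import struct

RESERVED_PLAYOUT_DELAY = 0xFFFF  # "not indicated" or empty buffer
RTCP_PT_APP = 204

def pack_nadu_block(ssrc, playout_delay_ms, nsn, nun, free_bytes):
    """Pack one 12-byte NADU data block (SSRC, playout delay, NSN, NUN, FBS)."""
    if playout_delay_ms is None:
        delay = RESERVED_PLAYOUT_DELAY
    else:
        delay = min(playout_delay_ms, 0xFFFE)
    fbs = min(free_bytes // 64, 0xFFFF)      # complete 64-byte blocks, saturated
    word3 = ((nun & 0x1F) << 16) | fbs       # 11 reserved bits stay zero
    return struct.pack("!IHHI", ssrc, delay, nsn & 0xFFFF, word3)

def pack_nadu_app(sender_ssrc, blocks):
    """Pack the APP header (V=2, P=0, subtype=0, name "PSS0") plus the blocks."""
    first_octet = 2 << 6                     # version 2, no padding, subtype 0
    length = 2 + 3 * len(blocks)             # 32-bit words minus one
    header = struct.pack("!BBHI4s", first_octet, RTCP_PT_APP, length,
                         sender_ssrc, b"PSS0")
    return header + b"".join(blocks)

block = pack_nadu_block(0xAABBCCDD, 150, 1000, 3, 100_000)
packet = pack_nadu_app(0x11223344, [block])
```

With a single reported source the packet is 24 bytes long and its length field is 5, which matches the typical case described above.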
4 9.2 HTTP/TCP and Container Formats
5 Some media types, which may be used in an MSS presentation do not currently have a
6 corresponding RTP definition. Therefore, it may be necessary to receive these media
7 types using HTTP/TCP/IP transport (Figure 9-1). Among these media types are:
8 • Audio (Synthetic): SP-MIDI , General MIDI  and Compact MIDI ,
9 • Text: Unicode  (e.g. US-ASCII , ISO-8859-1 , UTF-8 , GSM 7-bit
10 default alphabet , Shift-JIS, etc…),
11 • Timed Text: 3GPP Timed Text ,
12 • Bitmap Graphics: GIF87 , GIF89A , PNG ,
13 • Still Image: JPEG ,
14 • Vector Graphics: SVG-Tiny ,  and SVG-Basic , .
15 When delivering multimedia using HTTP/TCP/IP, media elements should be aggregated
16 into a container file.
17 When multiple media elements are aggregated for combined delivery over HTTP/TCP or
18 other non-streaming approaches they shall be combined using the ".3g2" format as
19 described in  when possible to do so.
20 When multiple media elements are aggregated for combined delivery over HTTP/TCP or
21 other non-streaming approaches, and it is not possible to do so using the ".3g2" format,
22 they shall be combined using the ".cmf" format as described in  when possible to do so.
24 Media that cannot be aggregated using ".3g2" or CMF may use alternative methods.
25 NOTE: When timed text is present in a “.3g2” file along with audio, speech and/or video
26 components, the audio, speech and/or video components may be parsed at the server
27 and streamed using RTSP/RTP as described in 8.1 / 9.1.
28 9.2.1 Transport of CMF Over HTTP
29 HTTP GET request is used to specify a CMF file by URI. When retrieving a CMF file with
30 HTTP GET request, the client should include optional enhancement parameters such
31 that the HTTP server can deliver an optimized CMF content to the client. Optimizations
32 may include, but are not limited to, CMF version, display size and other content
34 The MSS server shall ignore the URI parameters that it does not support. Multiple URI
35 parameters may be listed, and the delimiter "&" shall be used for such cases. Syntax of
36 GET request is shown below.
37 GET URI?uri_parameter_name1=value&uri_parameter_name2=value HTTP/1.1
Item             M/O  Description
URI              M    URI to the 3GPP2 CMF file.
URI parameters:
  vn             O    vn represents the CMF version number. The CMF version
                      number is the 4 bytes of data from the 'vers'
                      subchunk. The first two bytes specify the major version
                      and the second two bytes specify the minor version.
                      The major and minor version numbers are represented
  sh             O    sh represents the screen height, with 2 bytes.
  sw             O    sw represents the screen width, with 2 bytes.
1 Table 9-1 CMF parameters for HTTP GET
2 M/O stands for "Mandatory" or "Optional", respectively.
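The request syntax can be illustrated with a small sketch that builds the GET request line on the client and drops unsupported parameters on the server. The function names, the example path, and the parameter value encodings are placeholders chosen for this example; the exact value representation of vn, sh, and sw is not assumed here.

```python
# Illustrative sketch only; names, path, and value encodings are placeholders.
from urllib.parse import urlencode

def build_get_request_line(uri, params=None):
    """Join URI parameters with "&" and append them to the URI after "?"."""
    query = urlencode(params or {})
    target = f"{uri}?{query}" if query else uri
    return f"GET {target} HTTP/1.1"

def supported_only(params, supported):
    """Server side: ignore any URI parameter that is not supported."""
    return {name: value for name, value in params.items() if name in supported}

request_line = build_get_request_line(
    "/media/clip.cmf", {"vn": "00010000", "sw": "00F0", "sh": "0140"}
)
kept = supported_only({"vn": "00010000", "zz": "9"}, {"vn", "sh", "sw"})
```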
1 10 Pseudo Streaming
2 Transfer of dynamic media content (Video, Speech, Audio, Timed Text) to the MSS
3 terminal may also be accomplished via file download using the TCP protocol as
4 referenced previously in section 9. The ".3g2" file format  shall be used for pseudo
5 streaming of timed multimedia (such as video, associated audio and timed text).
6 The basic sequence of pseudo streaming is depicted in Figure 10-1. The MSS client sends
7 an HTTP GET request that specifies the file to be downloaded, and the MSS server transmits
8 the file in the response to the request. The MSS client may start multimedia decoding and
9 playback during the download.
[Figure 10-1 placeholder. Sequence shown: the MSS client sends "GET /foo.3g2 HTTP/1.1" followed by <CR><LF>; the MSS server answers "HTTP/1.1 200 OK"; file reception starts; playback starts; file reception ends.]
11 Figure 10-1 Basic Sequence of Pseudo-streaming
12 10.1 Session Description
13 When deploying pseudo streaming, MSS session set up procedure described in section
14 8.2 is not required.
15 The information the client needs is the URI of the multimedia file and the file size,
16 which are provided by some means such as HTTP or e-mail. The representation of
17 such information is not defined in this specification.
18 10.2 Transport Options
20 10.2.1 Request
21 HTTP GET request is used to specify a pseudo streaming file by URI.
22 The request is optionally enhanced using URI parameters and message-headers. The
23 MSS pseudo-streaming server that does not support these options shall ignore the URI
24 parameters that it does not support. Multiple URI parameters may be listed; the
25 delimiter "&" shall be used for such cases. Syntax of GET request is shown below.
26 GET URI?uri_parameter_name1=value&uri_parameter_name2=value HTTP/1.1
Item M/O Description
URI M URI to the 3GPP2 file.
URI ac O Access Ticket obtained from the presentation
br O Bit rate selection in accordance with the presentation
st O Specifies the start position of the transmission in
Effective only for the initial transmission request,
ts O 2: The beginning of the transmission
3: Continuous transmission
This parameter shall exist if ac is contained.
message- Range O bytes=0-XXXXX (The beginning of the transmission)
bytes=YYYYY-ZZZZZ (Subsequent transmission)
5 Table 10-1 Pseudo-streaming HTTP GET request parameters
6 M/O stands for "Mandatory" or "Optional", respectively.
7 10.2.2 Response
8 Requested data is sent back to the MSS client in the response. The response is
9 optionally enhanced using message-headers. The MSS pseudo-streaming client that
10 does not support these options shall ignore those that it does not support.
11 Response to the data transmission request is defined in the following.
12 HTTP/1.1 Status Code
16 requested data
Item M/O Value
Status Code M 200 OK (for full length data)
206 Partial Content (for ranged data)
message Content-Range O bytes=xxxx-xxxx/xxxxxx (The size
Required for the status "206 Partial Content"
1 Table 10-2 Pseudo-streaming HTTP response options
2 Regarding the Range parameter, the start bytes shall correspond to the actual bytes
3 received at the previous request. The following table shows an example. Note that the
4 Range parameter doesn’t specify an actual position of the content in the server but
5 specifies the number of bytes expected through the transmission.
Request number   Specified bytes in the request   Actual bytes in the response
1                0-96767                          48000
2                48000-144767                     52000
3                100000-196767                    50000
7 Table 10-3 Example of pseudo-streaming message header range parameters
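The progression in Table 10-3 can be reproduced with a short sketch: each request starts at the total number of bytes actually received so far and asks for a fixed-size window. The function name and the fixed window of 96768 bytes (inferred from the first row, 0-96767) are assumptions for this example.

```python
# Illustrative sketch only; the fixed request window is inferred from Table 10-3.
def next_range_header(total_bytes_received, window_bytes):
    """Build the Range value for the next request."""
    first = total_bytes_received
    last = first + window_bytes - 1
    return f"bytes={first}-{last}"

WINDOW = 96768
received = 0
headers = []
for actual_bytes in (48000, 52000, 50000):   # bytes delivered per response
    headers.append(next_range_header(received, WINDOW))
    received += actual_bytes
```

After the three responses the client has received 150000 bytes, so a fourth request would start at byte 150000.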
8 The uses of these options are described in the informative
2 Pseudo Streaming Session Representation Example (Informative).
3 10.3 FFMS Usage in MSS
4 The MSS terminal should support pseudo-streaming. If the MSS terminal does support
5 pseudo-streaming, it shall support the ".3g2" MSS file format as specified in  and
6 specifically the transmission format for pseudo-streaming as specified in section B.1.2
7 of Annex B of that document.
1 11 Media Types (Codecs)
3 11.1 General Requirements
4 Support for all media types is not required. When media support is not mandatory this
5 will be stated. When a media type is supported it shall include the mandatory codecs
6 specified for that media type.
7 11.2 "Video"
8 The MSS terminal should support the media type "Video". If the media type "video" is
9 supported then the MSS terminal:
10 • Shall support decoding of ITU-T Recommendation H.263 Profile 0 Level 45 
11 • Shall support decoding of at least one of the following:
12 • MPEG-4 Visual Simple Profile Level 0b ,
13 • H.264 Baseline Profile  Level 1b with constraint_set1_flag=1.
14 • Should support decoding of ITU-T Recommendation H.263 Profile 3 Level 45 
15 For continuous video media the following MIME media types shall be used:
16 • For H.263 the MIME media type as defined in clause 4.2.7 of ;
17 • For MPEG-4 video the MIME media type as defined in . When used in SDP
18 the configuration information shall be carried outband in the "config" SDP
19 parameter and inband (as stated in ). As described in , the
20 configuration information sent inband and the config information in the SDP
21 shall be the same, except for first_half_vbv_occupancy and
22 latter_half_vbv_occupancy, which, if present, may vary in the configuration
23 information sent inband;
24 • For H.264 (AVC) video the MIME media type as defined in .
25 Note that H.263 profile 0 level 10 is contained in MPEG-4 Visual as the short header
27 The video decoder should include basic error concealment technologies. These
28 techniques may include re-generating data that is lost in transmission by re-using or
29 interpolating from the temporal adjacent frames or from spatial adjacent regions of the
30 same frame.
31 The video encoder and decoder for the MSS terminal should support square pixel
32 format. The encoder should signal this to avoid un-necessary pixel shape conversion.
33 11.3 "Speech"
34 There are two types of speech defined for MSS – Narrow Band (NB) and Wide Band
1 11.3.1 "Narrow Band Speech"
2 The MSS terminal shall support at least one of the following media type "Narrow Band
4 • EVRC ,
5 • 13K Vocoder .
6 For continuous narrow band speech media the following MIME media types shall be
8 • For EVRC the MIME media type as defined in clause 12.1 of ;
9 • For 13K the MIME media type as defined in clause 4.1 of  for storage and
10 clause 4.1.20 of  for streaming.
11 11.3.2 "Wide Band Speech"
12 The MSS terminal should support the media type "Wide Band Speech". If the media
13 type "Wide Band Speech" is supported the MSS terminal:
14 • Shall support VMR Wide Band ;
15 For continuous wide band speech media the following MIME media types shall be used:
16 • For VMR-WB the MIME media type as defined in 
17 11.4 "Audio"
18 The MSS terminal should support media type "audio". If the media type "audio" is
19 supported the MSS terminal:
20 • Should support the MPEG-4 AAC Profile Level 2 , , ;
21 • Should support the MPEG-4 HE AAC Profile, Level 2 , , .
22 For continuous audio media the following MIME media types shall be used:
23 • For MPEG-4 AAC the MIME media type defined in clause 5.4 of ;
24 • For MPEG-4 HE AAC the MIME media type defined in clause 5.4 of .
25 11.5 "Text in SMIL"
26 The MSS terminal should support media type "Text in SMIL", which is intended to
27 enable formatted text in a SMIL presentation. If text in SMIL is supported, the MSS
29 • Shall support a SMIL plus XHTML profile contained in 3GPP2 SMIL language profile
30  presentation and may ignore any unsupported XHTML modules in a SMIL
32 • Shall support rendering of a SMIL presentation where text is referenced with the
33 SMIL 2.0 "text" element together with the SMIL 2.0 "src" attribute.
1 11.6 "Timed Text"
2 The MSS terminal should support the media type "Timed Text". If timed text is
3 supported then the MSS terminal:
4 • Shall support 3GPP Timed Text .
5 11.7 Other media
6 Other media types, as referenced in section 9.2, should be supported as described for
7 MMS .
1 12 Rate Adaptation of Streaming Media
3 12.1 Introduction
The goal of multimedia streaming rate adaptation is to achieve, with the available resources, the highest possible quality of experience for the end-user while maintaining interrupt-free playback of the media. This requires that the available network resources are known or estimated and that transmission rates are adapted to the available network link rates. With this information the server can better manage the network buffers and thereby avoid both packet losses caused by buffer overflow and underutilization of the network resources. The real-time properties of the transmitted media must be considered so that media does not arrive too late to be useful. This requires that the media content rate is adapted to the transmission rate.
13 To avoid client buffer violation (underflow or overflow) while still allowing the server to
14 deliver as much data as possible into the client buffer, a functionality for client buffer
15 feedback is defined. This allows the server to closely monitor the buffering status on the
16 client side and thus to optimize the quality of service. The client can specify how much
17 buffer space the server can utilize and the desired target level of protection. When the
18 desired level of protection is achieved, the server may utilize any resources beyond what
19 is needed to maintain that protection level to increase the quality of the media. The
20 server can also utilize the buffer feedback information to decide if the media quality
needs to be lowered in order to avoid a buffer underflow and the resulting playback interruption.
23 12.2 Rate Adaptation
The bit-rate adaptation for MSS is server-centric, meaning that the transmission rate and the content rate are controlled by the server. The server uses RTCP and RTSP as the basic sources of information about the state of the client and the network. This makes link-rate adaptation possible when communicating with MSS clients.
28 12.2.1 Link Characteristics
When connected on an assured network channel, a MSS client should inform the server of the quality of service parameters for the wireless link in use. The known parameters should be included in the RTSP "Mobile-Link-Char" header (section 8.1.1) in either the RTSP SETUP or PLAY request. This enables the server to make some basic assumptions about the possible bit-rates and link response. If the client has initially reported these parameters and they change during the session, the client shall update them by including the "Mobile-Link-Char" header in a SET_PARAMETER or OPTIONS request.
A MSS client should inform the server about the initial bit-rate available over the link, if known. This reporting shall be done using the RTSP "Bandwidth" header in either the RTSP SETUP or PLAY request. The QoS negotiated guaranteed bit-rate is the best estimate for the bandwidth value.
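As an illustration of the bandwidth reporting described above, the following Python sketch builds an RTSP SETUP request that carries a "Bandwidth" header expressed in bits per second. The helper name build_setup_request, the URL, the ports and the value 64000 are assumptions chosen for the example. The "Mobile-Link-Char" header is left out because its syntax is defined in section 8.1.1 and is not reproduced here.

```python
def build_setup_request(url, cseq, client_ports, bandwidth_bps=None):
    """Return an RTSP/1.0 SETUP request as a string with CRLF line endings."""
    lines = [
        f"SETUP {url} RTSP/1.0",
        f"CSeq: {cseq}",
        f"Transport: RTP/AVP;unicast;client_port={client_ports[0]}-{client_ports[1]}",
    ]
    # The header is only added when the client actually knows the link bit-rate.
    if bandwidth_bps is not None:
        lines.append(f"Bandwidth: {int(bandwidth_bps)}")
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical values: 64000 bit/s stands in for the QoS guaranteed bit-rate.
request = build_setup_request(
    "rtsp://server.foo.com/presentation.3g2/streamID=0",
    313,
    (4588, 4589),
    bandwidth_bps=64000,
)
print(request)
```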
1 12.2.2 Adaptation of Transmission Rate
2 The basic information source giving regular reports useful for bit-rate estimations is the
3 RTCP receiver reports as defined by . The RTCP reporting interval is dependent on
4 the RTP profile in use, the bit-rate assigned to RTCP, the average size of RTCP packets,
5 and the number of reporting entities. Most of these parameters can be set or affected
6 by the MSS server through signaling. This allows the server to configure the reporting
7 interval to a desirable working point.
In most MSS RTP sessions the server and the client only have one SSRC each, thus providing the highest possible reporting rate. However, some scenarios could result in a large number of SSRCs, thereby possibly lengthening the effective reporting interval for the client, the server, or both.
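The dependency of the reporting interval on these parameters can be sketched as follows. This is a deliberately simplified model, not the full RTCP timing algorithm: it ignores the randomization of the interval and the split of RTCP bandwidth between senders and receivers. The function name and the 5 second default minimum are assumptions of the sketch; the actual minimum depends on the RTP profile in use.

```python
def rtcp_interval_seconds(rtcp_bw_bps, avg_rtcp_packet_bytes, reporting_entities,
                          min_interval_s=5.0):
    """Approximate deterministic RTCP reporting interval in seconds."""
    if rtcp_bw_bps <= 0:
        raise ValueError("RTCP bandwidth must be positive")
    # Time needed for every reporting entity to send one average-size packet.
    computed = reporting_entities * avg_rtcp_packet_bytes * 8 / rtcp_bw_bps
    return max(min_interval_s, computed)


# Hypothetical figures: 2000 bit/s of RTCP bandwidth, 100-byte RTCP packets.
two_entities = rtcp_interval_seconds(2000, 100, 2, min_interval_s=0.0)
twenty_entities = rtcp_interval_seconds(2000, 100, 20, min_interval_s=0.0)
print(two_entities, twenty_entities)
```

With these figures, more reporting entities (SSRCs) give a longer interval, which is the effect described above.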
12 The transmission rate adaptation is implementation dependent. The server can use
13 alternate bitstreams for stream switching or use embedded scalable functionalities in
14 the codec. For example, the H.264 extended profile defines two types of switching
15 frames (SP/SI), which can be used for bitstream switching.
16 12.2.3 Receiver Buffer Level Feedback
17 The client buffer feedback signaling functionality should be supported by MSS clients
18 and MSS servers. For MSS clients and servers that support the client buffer feedback
19 signaling functionality, the following parts shall be implemented:
• SDP service support, as described in section 8.2.4;
• The buffer size (in bytes), which informs the MSS server of the available buffer space in the client. It is signaled to the server through RTSP, as described in section 8.1.2;
• The target buffer protection time (in milliseconds). It is signaled to the server through RTSP, as described in section 8.1.2;
• The client buffer status feedback information: free buffer space, next ADU to be decoded, and playout delay. It is signaled to the server via RTCP, as described in section 9.1.1.
27 If a MSS server supports client buffer feedback, it shall include the attribute "3GPP-
28 Adaptation-Support" in the SDP, as described in 8.2.4. Upon reception of such an SDP
29 attribute, if a MSS client supports client buffer feedback, it shall send NADU APP
30 packets according to section 9.1.1 after a successful SETUP response.
31 The "3GPP-Adaptation" header may be included in PLAY, OPTIONS and
32 SET_PARAMETER requests in order to update the target buffer protection time value
33 during a session. The buffer size value shall not be modified during a session.
34 With the total buffer size, and the reported amount of free buffer space, the server can
35 avoid overflowing the buffer. A server should assume that any sent RTP packet will
36 consume receiver buffer space equal to the complete RTP packet size. For interleaved or
37 aggregated media, the actual buffer space consumption may be slightly larger if
38 buffering is done in the ADU domain. This is because each ADU may save metadata
39 corresponding to the RTP header and payload fields, like timestamp and decoding
40 sequence numbers individually. This should only be a problem if a server tries to fill
41 exactly to the last free memory block.
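The overflow check described above can be sketched in Python as follows. The function name, the bookkeeping of bytes sent since the last report, and the optional safety margin (standing in for per-ADU metadata overhead) are assumptions of this sketch, not part of the specification.

```python
def can_send(packet_size_bytes, reported_free_bytes, bytes_sent_since_report,
             safety_margin_bytes=0):
    """Return True if sending the packet should not overflow the client buffer.

    Each RTP packet is assumed to consume buffer space equal to its complete
    size. Packets sent after the last free-space report are subtracted first.
    """
    projected_free = reported_free_bytes - bytes_sent_since_report
    return packet_size_bytes + safety_margin_bytes <= projected_free


# Hypothetical numbers: 4000 bytes reported free, 3000 bytes sent since then.
print(can_send(800, 4000, 3000))        # fits in the remaining 1000 bytes
print(can_send(800, 4000, 3000, 300))   # does not fit once a margin is kept
```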
42 The server can determine the time to underflow by calculating the amount of media
43 time present in the buffer. This is done using the next ADU sequence number and the
44 highest received sequence number combined with the server's view of the sent ADUs
1 and their decoding order and playout time. The playout delay signaling in the RTCP APP
2 packet improves the accuracy of the estimated time. For example, in the case of low
3 frame-rate video or missing frames, the playout delay may contribute significantly to
4 the total buffering time at the client.
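A minimal sketch of the time-to-underflow estimate is given below. It assumes that the server keeps a record mapping every sent ADU sequence number to its scheduled playout time, that the buffered media time is the playout-time difference between the highest received ADU and the next ADU to be decoded, and that the reported playout delay is simply added. Sequence number wrap-around is ignored. All names are hypothetical.

```python
def time_to_underflow_ms(playout_times_ms, next_adu_seq, highest_received_seq,
                         playout_delay_ms=0):
    """Estimate the media time (ms) the client can still play from its buffer.

    playout_times_ms maps each sent ADU sequence number to its scheduled
    playout time in milliseconds (the server's own view of what it sent).
    """
    if next_adu_seq > highest_received_seq:
        return 0  # simplification: nothing left to decode in the client buffer
    buffered_ms = (playout_times_ms[highest_received_seq]
                   - playout_times_ms[next_adu_seq])
    return buffered_ms + playout_delay_ms


# Hypothetical stream: ADUs 100..119, one ADU every 100 ms.
sent = {seq: (seq - 100) * 100 for seq in range(100, 120)}
estimate = time_to_underflow_ms(sent, next_adu_seq=105,
                                highest_received_seq=115,
                                playout_delay_ms=200)
print(estimate)  # 1200
```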
5 The level of protection needed against transmission rate variations over a wireless
6 network can be substantial (throughput variation because of network load, radio
7 conditions, several seconds of interruption because of handovers, possible extra
8 buffering to perform retransmission). In order to minimize the initial buffering delay,
9 the client may choose an initial buffering that is less than the required buffering it has
10 determined would be satisfactory. For this reason, the target buffer protection time
11 indicates the amount of playable media (in time), which the client would like to have in
12 its buffer. Therefore a server should not perform content adaptation towards higher
13 content rates until the given target time of media units is available in the buffer.
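The resulting server decision can be sketched as below. Only the rule that the content rate should not be raised before the target protection time is reached comes from the text above. The low watermark used to trigger a switch to a lower rate, and the action names, are assumptions of this sketch.

```python
def choose_action(buffered_media_ms, target_protection_ms, low_watermark_ms):
    """Pick a content-rate action from the estimated client buffer level."""
    if buffered_media_ms < low_watermark_ms:
        return "switch_down"      # underflow is likely; lower the media quality
    if buffered_media_ms >= target_protection_ms:
        return "may_switch_up"    # target reached; spare resources may raise quality
    return "hold"                 # keep filling the buffer at the current rate


# Hypothetical thresholds: 5 s target protection time, 1 s low watermark.
for level in (500, 3000, 6000):
    print(level, choose_action(level, 5000, 1000))
```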
1 Annex A Call Flow Example (Informative)
2 This is an informative annex that contains an example video streaming call flow.
3 The MS establishes an IP bearer, an RLP retransmission scheme, and a corresponding
4 Quality of Service context over Service Option 33/66. The MS and streaming server
5 then communicate over the IP bearer using RTSP as follows:
6 The MS issues a DESCRIBE request to the streaming server to get information about a
7 specific media source. The streaming server responds with an OK followed by an SDP
8 description of the desired sessions. The MS then issues a SETUP to initiate the
9 streaming session. If successful, the streaming server returns an OK.
10 On input from the user to start viewing the content stream, the MS issues a PLAY
11 request to the streaming server. On input from the user to quit the session, the
terminal issues a TEARDOWN request to the server, which responds with an OK. This scenario is illustrated in Figure A 1.
Figure A 1 Example Multimedia Streaming Call Flow
Entities shown: MS, Base Station, PDSN, MMS, Streaming Server.
Steps shown, in order:
• Establish SO 33 session
• Establish PPP and header compression protocols
• IP bearer established
• Retrieve presentation layout from MMS or other method
• Presentation Layout Description (SMIL with URLs)
• Modify SO 33 according to requirements, negotiate RLP retransmission scheme
• IP bearer modified
• OK, DESCRIBE response (using SDP)
• Audio/Video packets over RTP, graphics/images/timed text
• RTCP SR/RR packets for RTP stream
1 Annex B Sample Scenario of a Session (Informative)
This is an informative annex that describes a sample scenario of a video streaming session.
The content URL address is rtsp://server.foo.com:554/presentation.3g2. Here server.foo.com is the Internet address of the server and 554 is the port number (note: the default port number for RTSP is 554). The remainder, presentation.3g2, is the path and filename of the content to be streamed.
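The decomposition of the content URL can be sketched with the Python standard library as follows. The helper name parse_rtsp_url is an assumption of this sketch. The fallback to port 554 reflects the default RTSP port noted above.

```python
from urllib.parse import urlsplit

DEFAULT_RTSP_PORT = 554


def parse_rtsp_url(url):
    """Split an rtsp:// URL into (host, port, path)."""
    parts = urlsplit(url)
    if parts.scheme != "rtsp":
        raise ValueError(f"not an RTSP URL: {url}")
    # urlsplit reports port as None when the URL does not carry one.
    return parts.hostname, parts.port or DEFAULT_RTSP_PORT, parts.path


print(parse_rtsp_url("rtsp://server.foo.com:554/presentation.3g2"))
print(parse_rtsp_url("rtsp://server.foo.com/presentation.3g2"))
```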
The RTSP session can be invoked by the user clicking a link in the HTTP browser on the mobile terminal or by other means, such as entering the URL address with key combinations. When the mobile terminal receives the instruction, it first starts an IP bearer over Service Option 33/66 as described in this document.
The client requests a description of the content and the server responds with the description:
C->S: DESCRIBE rtsp://server.foo.com/presentation.3g2 RTSP/1.0
      CSeq: 312
      Accept: application/sdp
      User-Agent: The 3GPP2StreamClient/1.1b3

S->C: RTSP/1.0 200 OK
      CSeq: 312
      Date: 12 February 2001 15:35:06 GMT
      Content-Type: application/sdp
      Content-Length: 414

      o=somebody 2890844526 2890842807 IN IP4 18.104.22.168
      s=Sample Stream
      c=IN IP4 0.0.0.0
      t=0 0
      m=audio 0 RTP/AVP 97
      a=rtpmap:97 EVRC
      m=video 0 RTP/AVP 98
      a=rtpmap:98 H-263-2000/90000
      a=fmtp:98 profile=0;level=10
45 The above response is an RTSP message containing an SDP description of the video
46 (H.263) and audio streams (EVRC).
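How a client might pick the media components out of such a description can be sketched as follows. The parser below is a minimal illustration that only looks at "m=" and "a=rtpmap:" lines. The function name and the returned dictionary layout are assumptions of the sketch, and the sample text contains only the media-level lines shown in the response above.

```python
def parse_sdp_media(sdp_text):
    """Return one dict per m= line, with any rtpmap attributes attached."""
    media = []
    for line in sdp_text.splitlines():
        line = line.strip()
        if line.startswith("m="):
            kind, port, proto, *formats = line[2:].split()
            media.append({"type": kind, "port": int(port), "proto": proto,
                          "formats": formats, "rtpmap": {}})
        elif line.startswith("a=rtpmap:") and media:
            # Attribute belongs to the most recent m= line.
            payload_type, encoding = line[len("a=rtpmap:"):].split(None, 1)
            media[-1]["rtpmap"][payload_type] = encoding
    return media


SAMPLE_SDP = """m=audio 0 RTP/AVP 97
a=rtpmap:97 EVRC
m=video 0 RTP/AVP 98
a=rtpmap:98 H-263-2000/90000
a=fmtp:98 profile=0;level=10
"""

media = parse_sdp_media(SAMPLE_SDP)
for m in media:
    print(m["type"], m["formats"], m["rtpmap"])
```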
The client can then request setup of the audio and video media components of the presentation:
C->S: SETUP rtsp://server.foo.com/presentation.3g2/streamID=0 RTSP/1.0
      CSeq: 313
      Transport: RTP/AVP;unicast;client_port=4588-4589
      User-Agent: The3GPP2StreamClient/1.1b3

S->C: RTSP/1.0 200 OK
      CSeq: 313
      Date: 12 February 2001 15:35:07 GMT
      Session: 12345678
      Transport: RTP/AVP;unicast;client_port=4588-4589;server_port=6256-6257

C->S: SETUP rtsp://server.foo.com/presentation.3g2/streamID=1 RTSP/1.0
      CSeq: 314
      Session: 12345678
      Transport: RTP/AVP;unicast;client_port=4590-4591
      User-Agent: The3GPP2StreamClient/1.1b3

S->C: RTSP/1.0 200 OK
      CSeq: 314
      Date: 12 February 2001 15:35:07 GMT
      Session: 12345678
      Transport: RTP/AVP;unicast;client_port=4590-4591;server_port=6258-6259
14 The above messages set up the session between the server and the client for the
15 specific content. The client has chosen to stream both audio and video components in
16 the session.
The client can request play starting from any point relative to the beginning of the stream. The server returns the RTP related information to the client in the PLAY response:
C->S: PLAY rtsp://server.foo.com/presentation.3g2 RTSP/1.0
      CSeq: 315
      Session: 12345678
      Range: npt=0-
      User-Agent: The3GPP2StreamClient/1.1b

S->C: RTSP/1.0 200 OK
      CSeq: 315
      Date: 12 February 2001 15:35:07 GMT
      Session: 12345678
      Range: npt=0-59.3478
      RTP-Info: url=rtsp://server.foo.com/presentation.3g2/streamID=0;
        url=rtsp://server.foo.com/presentation.3g2/streamID=1;
36 NOTE: Headers can be folded onto multiple lines if the continuation line begins with a
37 space or horizontal tab. For more information, see .
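The folding rule in the NOTE above can be illustrated with a short Python sketch that joins continuation lines back onto the header they belong to. The function name and the sample header values are hypothetical.

```python
def unfold_headers(raw):
    """Join folded header lines (those starting with space or tab)."""
    unfolded = []
    for line in raw.splitlines():
        if line[:1] in (" ", "\t") and unfolded:
            # Continuation line: append to the previous header, separated
            # by a single space.
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line)
    return unfolded


raw = "RTP-Info: url=a;\r\n seq=1,\r\n\turl=b\r\nCSeq: 315"
headers = unfold_headers(raw)
print(headers)
```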
The client can also request a pause of the stream at any point in the presentation:
C->S: PAUSE rtsp://server.foo.com/presentation.3g2 RTSP/1.0
      CSeq: 318
      Session: 12345678
      User-Agent: The 3GPP2StreamClient/1.1b

S->C: RTSP/1.0 200 OK
      CSeq: 318
      Session: 12345678
      Date: 12 February 2001 15:35:45 GMT
50 Note that the client can start playing again from any point of the stream (including the
51 point where it paused).
52 Finally, the client can request the tear down of the RTSP session it created.
C->S: TEARDOWN rtsp://server.foo.com/presentation.3g2 RTSP/1.0
      CSeq: 350
      Session: 12345678
      User-Agent: The 3GPP2StreamClient/1.1b

S->C: RTSP/1.0 200 OK
      CSeq: 350
      Session: 12345678
After finishing the RTSP session, the mobile terminal can proceed to terminate the IP bearer over Service Option 33/66.
2 Annex C Buffering of Video (Normative)
3 C.1 Introduction
This annex describes video buffering requirements in the MSS. MSS buffering is optional and may be signaled in SDP (section 8.2.1) or in the MSS capability exchange (section 8.3.2). When this option is in use, the content of this annex is normative. In other words, MSS clients shall be capable of receiving an RTP packet stream that complies with the specified buffering model and MSS servers shall verify that the transmitted RTP packet stream complies with the specified buffering model.
10 MSS Buffering Parameters
11 The behavior of the MSS buffering model is controlled with the following parameters:
12 the initial pre-decoder buffering period, the initial post-decoder buffering period, the
13 size of the hypothetical pre-decoder buffer, the peak decoding byte rate, and the
14 decoding macroblock rate. The default values of the parameters are defined below.
15 • The default initial pre-decoder buffering period is 1 second.
16 • The default initial post-decoder buffering period is zero.
• The default size of the hypothetical pre-decoder buffer depends on the maximum video bit-rate, as given in the table below:

Maximum video bit-rate     Default size of the hypothetical pre-decoder buffer
65536 bits per second      20480 bytes
131072 bits per second     40960 bytes
Undefined                  51200 bytes

Table C 1 – Default size of the hypothetical pre-decoder buffer
• The maximum video bit-rate can be signaled in the media-level bandwidth attribute of SDP as defined in section 8 of this document. If the media-level bandwidth attribute is not present in the presentation description, the maximum video bit-rate is defined according to the video coding profile and level in use.
25 • The size of the hypothetical post-decoder buffer is an implementation-specific issue.
26 The buffer size can be estimated from the maximum output data rate of the
27 decoders in use and from the initial post-decoder buffering period.
• By default, the peak decoding byte rate is defined according to the video coding profile and level in use. For example, H.263 Level 10 requires support for bit-rates up to 64000 bits per second. Thus, the peak decoding byte rate equals 8000 bytes per second.
• The default decoding macroblock rate is defined according to the video coding profile and level in use. If MPEG-4 Visual is in use, the default macroblock rate equals the VCV decoder rate. If H.263 is in use, the default macroblock rate equals (1 / minimum picture interval) multiplied by the number of macroblocks in the maximum picture format. For example, H.263 Level 10 requires support for picture formats up to QCIF and a minimum picture interval down to 2002 / 30000 sec. Thus, the default macroblock rate would be 30000 x 99 / 2002 ≈ 1484 macroblocks per second.
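The default values listed above can be reproduced with the following Python sketch. The function names are hypothetical. Treating the rows of Table C 1 as upper bounds, and falling back to the "Undefined" row for bit-rates above the table, are assumptions of this sketch, since the table itself only lists the exact values.

```python
from fractions import Fraction

# Rows of Table C 1: (maximum video bit-rate in bits/s, buffer size in bytes).
TABLE_C1 = [(65536, 20480), (131072, 40960)]
UNDEFINED_BITRATE_BUFFER_BYTES = 51200


def default_predecoder_buffer_bytes(max_video_bitrate_bps=None):
    """Default hypothetical pre-decoder buffer size for a maximum bit-rate."""
    if max_video_bitrate_bps is None:
        return UNDEFINED_BITRATE_BUFFER_BYTES
    for limit_bps, size_bytes in TABLE_C1:
        if max_video_bitrate_bps <= limit_bps:  # assumption: rows are upper bounds
            return size_bytes
    return UNDEFINED_BITRATE_BUFFER_BYTES       # assumption: fallback above the table


def peak_decoding_byte_rate(max_bitrate_bps):
    """Peak decoding byte rate in bytes per second."""
    return max_bitrate_bps // 8


def h263_default_macroblock_rate(min_picture_interval_s, macroblocks_per_picture):
    """(1 / minimum picture interval) x macroblocks in the maximum picture format."""
    return macroblocks_per_picture / min_picture_interval_s


# H.263 Level 10: 64000 bit/s, QCIF (99 macroblocks), interval 2002/30000 s.
byte_rate = peak_decoding_byte_rate(64000)
mb_rate = h263_default_macroblock_rate(Fraction(2002, 30000), 99)
print(byte_rate, float(mb_rate))
```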
5 MSS clients may signal their capability of providing larger buffers and faster peak
6 decoding byte rates in the capability exchange process described in section 19 of this
7 document. The average coded video bit-rate should be smaller than or equal to the bit-
8 rate indicated by the video coding profile and level in use, even if a faster peak decoding
9 byte rate were signaled.
10 Initial parameter values for each stream can be signaled within the SDP description of
11 the stream. Signaled parameter values override the corresponding default parameter
12 values. The values signaled within the SDP description guarantee pauseless playback
13 from the beginning of the stream until the end of the stream (assuming a constant-
14 delay reliable transmission channel).
15 MSS servers may update parameter values in the response for an RTSP PLAY request. If
16 an updated parameter value is present, it shall replace the value signaled in the SDP
17 description or the default parameter value in the operation of the MSS buffering model.
18 An updated parameter value is valid only in the indicated playback range, and it has no
19 effect after that. Assuming a constant-delay reliable transmission channel, the updated
20 parameter values guarantee pauseless playback of the actual range indicated in the
21 response for the PLAY request. The indicated pre-decoder buffer size and initial post-
22 decoder buffering period shall be smaller than or equal to the corresponding values in
23 the SDP description or the corresponding default values, whichever ones are valid.
24 The following header fields are defined for RTSP:
25 • x-predecbufsize:<size of the hypothetical pre-decoder buffer>
This gives the suggested size of the Annex C hypothetical pre-decoder buffer in bytes.
28 • x-initpredecbufperiod:<initial pre-decoder buffering period>
29 This gives the required initial pre-decoder buffering period specified according to
30 Annex C. Values are interpreted as clock ticks of a 90-kHz clock. That is, the value
31 is incremented by one for each 1/90 000 seconds. For example, value 180 000
32 corresponds to a two second initial pre-decoder buffering.
33 • x-initpostdecbufperiod:<initial post-decoder buffering period>
34 This gives the required initial post-decoder buffering period specified according to
35 Annex C. Values are interpreted as clock ticks of a 90-kHz clock.
36 These header fields are defined for the response of an RTSP PLAY request only. Their
37 use is optional.
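The 90-kHz clock convention used by these header fields can be sketched in Python as follows. The helper names are hypothetical, and the parser simply picks the three header fields defined above out of a response text.

```python
CLOCK_HZ = 90_000  # the buffering periods are expressed in 90-kHz clock ticks

BUFFERING_HEADERS = ("x-predecbufsize", "x-initpredecbufperiod",
                     "x-initpostdecbufperiod")


def ticks_to_seconds(ticks):
    return ticks / CLOCK_HZ


def seconds_to_ticks(seconds):
    return round(seconds * CLOCK_HZ)


def parse_buffering_headers(response_text):
    """Return the buffering header fields found in an RTSP PLAY response."""
    found = {}
    for line in response_text.splitlines():
        name, sep, value = line.partition(":")
        key = name.strip().lower()
        if sep and key in BUFFERING_HEADERS:
            found[key] = int(value.strip())
    return found


response = "RTSP/1.0 200 OK\r\nCSeq: 833\r\nx-initpredecbufperiod: 180000\r\n"
fields = parse_buffering_headers(response)
print(fields, ticks_to_seconds(fields["x-initpredecbufperiod"]))
```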
The following example plays the whole presentation starting at SMPTE time code 0:10:20 until the end of the clip. The playback is to start at 15:36:00 on 12 Feb 2001. The suggested initial pre-decoder buffering period is two seconds.
C->S: PLAY rtsp://server.foo.com/presentation.3g2 RTSP/1.0
      CSeq: 833
      Session: 12345678
      Range: smpte=0:10:20-;time=20010212T153600Z

S->C: RTSP/1.0 200 OK
      CSeq: 833
      Date: 12 Feb 2001 15:35:06 GMT
      Range: smpte=0:10:22-;time=20010212T153600Z
      RTP-Info: url=rtsp://server.foo.com/presentation.3g2/streamID=0;
        seq=399000;rtptime=44470648,
        url=rtsp://server.foo.com/presentation.3g2/streamID=1;
        seq=31004;rtptime=41090349
      x-initpredecbufperiod: 180000
5 MSS server buffering verifier
6 The MSS server buffering verifier is specified according to the MSS buffering model. The
7 model is based on two buffers and two timers. The buffers are called the hypothetical
8 pre-decoder buffer and the hypothetical post-decoder buffer. The timers are named the
9 decoding timer and the playback timer.
10 The MSS buffering model is presented below.
11 1. The buffers are initially empty.
12 2. A MSS Server adds each transmitted RTP packet having video payload to the pre-
13 decoder buffer immediately when it is transmitted. All protocol headers at RTP or
14 any lower layer are removed.
15 3. Data is not removed from the pre-decoder buffer during a period called the initial
16 pre-decoder buffering period. The period starts when the first RTP packet is added
17 to the buffer.
18 4. When the initial pre-decoder buffering period has expired, the decoding timer is
19 started from a position indicated in the previous RTSP PLAY request.
20 5. Removal of a video frame is started when both of the following two conditions are
21 met: First, the decoding timer has reached the scheduled playback time of the
22 frame. Second, the previous video frame has been totally removed from the pre-
23 decoder buffer.
24 6. The duration of frame removal is the larger one of the two candidates: The first
25 candidate is equal to the number of macroblocks in the frame divided by the
26 decoding macroblock rate. The second candidate is equal to the number of bytes in
the frame divided by the peak decoding byte rate. When the coded video frame has been removed from the pre-decoder buffer entirely, the corresponding uncompressed video frame is placed into the post-decoder buffer.
30 7. Data is not removed from the post-decoder buffer during a period called the initial
31 post-decoder buffering period. The period starts when the first frame has been
32 placed into the post-decoder buffer.
33 8. When the initial post-decoder buffering period has expired, the playback timer is
34 started from the position indicated in the previous RTSP PLAY request.
35 9. A frame is removed from the post-decoder buffer immediately when the playback
36 timer reaches the scheduled playback time of the frame.
37 10. Each RTSP PLAY request resets the MSS buffering model to its initial state.
38 A MSS server shall verify that a transmitted RTP packet stream complies with the
39 following requirements:
40 • The MSS buffering model shall be used with the default or signaled buffering
41 parameter values. Signaled parameter values override the corresponding default
42 parameter values.
43 • The occupancy of the hypothetical pre-decoder buffer shall not exceed the default or
44 signaled buffer size.
45 • Each frame shall be inserted into the hypothetical post-decoder buffer before or on
46 its scheduled playback time.
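The buffering model and the two compliance checks can be sketched in Python as below. This is a simplified illustration under stated assumptions, not a reference verifier: each video frame is assumed to travel in a single RTP packet, a frame's bytes are assumed to leave the pre-decoder buffer only when its removal completes, playback times are relative to the position given in the PLAY request, and buffer occupancy is only sampled when a packet is added. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    send_time: float      # s, when the packet carrying the frame is transmitted
    playback_time: float  # s, scheduled playback time relative to the PLAY position
    size_bytes: int       # payload bytes with protocol headers removed
    macroblocks: int


def verify_stream(frames, buffer_size, peak_byte_rate, mb_rate,
                  init_predec=1.0, init_postdec=0.0):
    """Return a list of violation messages (an empty list means compliant)."""
    violations = []
    if not frames:
        return violations
    # Step 4: the decoding timer starts after the initial pre-decoder period.
    decode_start = frames[0].send_time + init_predec
    removal_end = []
    prev_end = decode_start
    for f in frames:
        # Step 5: wait for the scheduled time and for the previous removal.
        start = max(decode_start + f.playback_time, prev_end, f.send_time)
        # Step 6: removal takes the larger of the two candidate durations.
        duration = max(f.macroblocks / mb_rate, f.size_bytes / peak_byte_rate)
        prev_end = start + duration
        removal_end.append(prev_end)
    # Steps 7 and 8: the playback timer starts after the post-decoder period.
    playback_start = removal_end[0] + init_postdec
    for i, f in enumerate(frames):
        if removal_end[i] > playback_start + f.playback_time + 1e-9:
            violations.append(f"frame {i} decoded late")
        occupancy = sum(g.size_bytes for g, end in zip(frames, removal_end)
                        if g.send_time <= f.send_time < end)
        if occupancy > buffer_size:
            violations.append(f"pre-decoder buffer overflow at frame {i}")
    return violations


# Hypothetical H.263 Level 10 stream: QCIF, 10 frames/s, 800 bytes per frame.
stream = [Frame(i * 0.1, i * 0.1, 800, 99) for i in range(30)]
print(verify_stream(stream, buffer_size=20480, peak_byte_rate=8000, mb_rate=1484))
```

With the default parameter values of this annex, the hypothetical stream above passes both checks. Larger frames or a smaller buffer produce violations.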
1 C.2 MSS client buffering requirements
2 When annex C is in use, the MSS client shall be capable of receiving an RTP packet
3 stream that complies with the MSS server buffering verifier, when the RTP packet
4 stream is carried over a constant-delay reliable transmission channel. Furthermore, the
5 video decoder of the MSS client, which may include handling of post-decoder buffering,
6 shall output frames at the correct rate defined by the RTP time-stamps of the received
7 packet stream.
2 Annex D Pseudo Streaming Session Representation
3 Example (Informative)
This Annex shows an example of an XHTML session representation, which is defined in ITU-T Recommendation J.127.
6 D.1 XHTML Presentation Description Format
7 The presentation description may be obtained by the receiver using HTTP or other
8 means such as e-mail and may not necessarily be stored on the server.
The presentation description contains a description of the media streams making up the program, including their location, title, encoding types, data size, and other parameters that enable the receiver to start retrieving the most appropriate media.
The presentation description is written using the <object> element together with <param> elements of XHTML.
An example is shown below. Elements defined in this Annex are written in bold.
<?xml version="1.0" ?>
<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Webcasting Test Page</title>
</head>
<body>
<object data="http://www.webcasting.org/media.mp4" type="video/MP2T"
  copyright="no" standby="Click Here">
<param name="disposition" value="devmpzz" valuetype="data" />
<param name="duration" value="30000" valuetype="data" />
<param name="size" value="240000" valuetype="data" />
<param name="title" value="Preview of the movie" valuetype="data" />
<param name="ac" value="Jc5gUxzTqJ9ebM3U18GEWdKgtiTWR6Fe" valuetype="data" />
</object>
</body>
</html>
34 Elements used in the presentation description are summarized in Table D 1. In Table D
35 1, M/O stands for "Mandatory" or "Optional" respectively.
Element  Attribute           M/O  Value           Description
object   data                M    URI String      Actual location of the media file.
object   type                M    MIME Type       MIME type of the media.
object   copyright           O    "yes" | "no"    Copyright control.
object   standby             M    String          The displayed text of the link.
param    name="ac"           O    String          Access Ticket.
param    name="bitrate"      O    Numeric         Bit rate of the content in bps.
param    name="disposition"  M    String          Types of the content distribution, which stands for
                                                  downloading, VOD transmission, or live transmission.
param    name="duration"     O    Numeric         Duration of the content in milliseconds.
param    name="size"         O    Numeric String  File size of the content in bytes. This field is
                                                  effective for downloading and VOD transmission.
param    name="title"        M    String          Title text of the content.
Each param element also carries valuetype="data".
Table D 1 Elements defined in this Annex
2 D.2 <object> Element
3 The following attributes for the <object> element are defined.
4 D.2.1 data
This is a mandatory attribute that specifies the URI of the media to be transmitted. In pseudo streaming the media is transmitted by HTTP, so the scheme of the URI shall be http; that is, the URI shall start with "http://".
8 D.2.2 type
9 This is a mandatory attribute that specifies the MIME type of the media to be
10 transmitted .
11 D.2.3 copyright
12 This attribute takes "yes" or "no", and this is optional. The default value is "no". The
13 copyright attribute takes effect as follows.
1 yes: The content is protected from storing. The media data cannot be stored in the
2 device after playing.
3 no: The media data can be stored in the device after playing.
If this attribute is not specified, the terminal handles the file as storage allows.
5 D.2.4 standby
6 This is a mandatory attribute that specifies the displayed text of the link to the media.
7 It will typically be "Click Here" or the name of the content.
8 D.2.5 <param> Element
9 Parameters of the media are specified with the <param> element in the HTML
10 description. The following parameters are defined. Each parameter is identified by the
11 name attribute and the value is specified by the value attribute. For all parameters,
12 ‘valuetype="data"’ is included in each <param> element.
13 The terminal ignores unknown parameters.
14 D.2.6 ac
15 This is an optional parameter and the value attribute specifies the access ticket. The
16 maximum length of the value is 512 Bytes. The terminal that obtained the access ticket
17 from the ac parameter in the presentation description shall use this ticket when the
18 terminal carries out the session control as "ac=" parameter in the HTTP request.
19 This is used for identification of fee collection.
20 D.2.7 bitrate
This is an optional parameter. It specifies the total bit rate of the media in bits per second. If the media has both video and audio tracks, the bitrate value is the sum of the bit rates of the tracks.
24 If the media has multiple bit rates for adaptive bit rate changing, all the values are
25 specified with ‘:’ separator. For example,
26 <param name="bitrate" value="64000:128000:256000" valuetype="data" />
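Reading such a multi-valued parameter can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes the <param> element is available as a standalone, well-formed XML fragment.

```python
import xml.etree.ElementTree as ET


def parse_colon_list(param_xml):
    """Return (name, list of integer values) for a <param> element."""
    elem = ET.fromstring(param_xml)
    if elem.tag != "param":
        raise ValueError("expected a <param> element")
    # Multiple values are separated by ':'; a single value yields a 1-item list.
    values = [int(v) for v in elem.get("value").split(":")]
    return elem.get("name"), values


name, rates = parse_colon_list(
    '<param name="bitrate" value="64000:128000:256000" valuetype="data" />')
print(name, rates)
```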
27 D.2.8 disposition
28 The disposition parameter defines the content type, its application, distribution scheme,
29 and so on. The existence of the disposition parameter is mandatory. The disposition
30 parameter itself is not defined in this Annex, but what the parameter specifies is
31 defined as follows.
• Category of the content: Video (including Video and Audio), Audio, Voice, etc.
• Transmission scheme of the content: File downloading, VOD transmission, Live transmission
• Purpose of the content: Just viewing, Storing, Particular use (Wallpaper, Screensaver, Alarm, etc.)
37 D.2.9 duration
This is an optional parameter. It specifies the duration of the media in milliseconds. If the video and audio tracks of the media have different durations, the value is the longest track duration.
1 D.2.10 size
2 This is an optional parameter. It specifies the data size of the media in bytes, which
3 helps the terminal to obtain the content size in advance of transmitting. Regarding the
4 file downloading and the VOD transmission, the file is already created before
transmitting. Therefore, the value of the size parameter is the same as the size of the file.
7 In addition, if the media has multiple bit rates for adaptive bit rate changing, each size
8 corresponding to each bit rate is specified with ‘:’ separator. For example,
9 <param name="size" value="240000:480000:960000" valuetype="data" />
10 If this parameter is not specified in the presentation description, the terminal receives
11 the content size from the server at the beginning of the transmission. This is carried out
12 with the HEAD request of HTTP.
13 For the live transmission, the file size cannot be estimated before transmission. In this
14 case, the size value indicates the maximum size of the stream that is continuously
15 transmitted. For example, if the size is 1572864 for the live transmission, the
16 connection will be closed after receiving 1.5 MB of the content.
17 D.2.11 title
18 This is a mandatory parameter that describes the title of the content. The maximum
19 length of the value is 40 Bytes. The title may be shown on the terminal when the
20 content is playing.
1 Annex E Pseudo Streaming Example (Informative)
2 In this Annex, full features of Pseudo Streaming services are introduced. Some of them
3 require using optional parameters and message-header defined in Section 10.2.
4 E.1 Live encoding
Pseudo-streaming is a file-based service. Therefore, real-time streaming is not possible using pseudo-streaming. However, pseudo-streaming enables a "live" streaming service if some delay (e.g., 30 seconds) is acceptable.
The moov box of the 3GPP2 file format contains an index pointer to each chunk; therefore the moov cannot be generated before all the media data in the fragment becomes available.
[Figure E-1: block diagram in which the encoder side (buffer, generator) feeds the server side (converter).]
Figure E-1 Block diagram for live pseudo-streaming
The encoder buffer stores a fragment of media data (e.g., 30 seconds), then the file generator creates the moov and mdat boxes. The data from the encoder buffer is sent to the server. Then the next fragment is stored in the encoder buffer and the file generator creates the moof and mdat boxes. The data from the encoder is sent to the server, and the server appends it to the previously sent data. This is repeated for each fragment, and the file on the server grows longer.
25 The server provides the client the latest fragment at the requested time using the moof
26 to moov conversion as illustrated below.
[Figure E-2: the file on the server consists of moov mdat moof mdat moof mdat, with the latest fragment not yet arrived at the server; the data sent to the client is ftyp moov mdat.]
Figure E-2 moof -> moov conversion in the server
The client receives moov at the beginning of the file, so the decoder operation is the
same as with basic pseudo streaming. Consequently, the first video frame in the mdat
will always be an I-frame.
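A minimal sketch of the server-side selection step is given below, assuming the server tracks for each fragment whether it has fully arrived. The names `Fragment`, `latest_arrived`, and `response_layout` are hypothetical, and the actual rewriting of index information in the moof to moov conversion is not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Fragment:
    header: str    # "moov" for the first fragment, "moof" for later ones
    arrived: bool  # True once the whole fragment has reached the server


def latest_arrived(fragments: List[Fragment]) -> Optional[int]:
    """Return the index of the newest fully arrived fragment, if any."""
    for index in range(len(fragments) - 1, -1, -1):
        if fragments[index].arrived:
            return index
    return None


def response_layout(fragments: List[Fragment]) -> Tuple[int, List[str]]:
    """Pick the fragment to serve and describe the file sent to the client."""
    index = latest_arrived(fragments)
    if index is None:
        raise LookupError("no complete fragment is available yet")
    # Whether the fragment is stored with moov or moof, the client always
    # receives a file that starts with ftyp and moov. Rewriting the index
    # information during the moof -> moov conversion is not shown here.
    return index, ["ftyp", "moov", "mdat"]


if __name__ == "__main__":
    stored = [
        Fragment("moov", True),
        Fragment("moof", True),
        Fragment("moof", False),  # latest fragment, not yet arrived
    ]
    print(response_layout(stored))
```

With the third fragment still in transit, the sketch serves the second fragment, converted so that the client sees ftyp, moov, and mdat.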
E.2 Random positioning
The client can choose the position at which playback of a long movie file begins.
Request:
GET /foo.3g2 HTTP/1.1
Range: bytes=125000-134000
Response:
HTTP/1.1 206 Partial Content
Figure E-3 HTTP request for the random positioning
Here, the HTTP request includes a CGI parameter to specify the beginning time.
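As an illustration of the Range header value shown in Figure E-3, the following sketch parses a closed byte range and computes how many bytes it covers. The function names are hypothetical, and only the "bytes=first-last" form is handled; other forms permitted by HTTP are deliberately left out.

```python
import re
from typing import Tuple

# Matches only the closed form "bytes=<first>-<last>".
_RANGE_RE = re.compile(r"^bytes=(\d+)-(\d+)$")


def parse_byte_range(value: str) -> Tuple[int, int]:
    """Parse a Range header value into (first, last) byte positions."""
    match = _RANGE_RE.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported Range value: {value!r}")
    first, last = int(match.group(1)), int(match.group(2))
    if last < first:
        raise ValueError("last byte position precedes first byte position")
    return first, last


def range_length(first: int, last: int) -> int:
    """Number of bytes covered; both positions are inclusive in HTTP."""
    return last - first + 1


if __name__ == "__main__":
    first, last = parse_byte_range("bytes=125000-134000")
    print(first, last, range_length(first, last))
```

Because both byte positions are inclusive, the range 125000-134000 covers 9001 bytes.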
Case 1) Fragment positioning
The server searches for the nearest movie fragment, converts its moof to moov,
and then transmits the result to the client.
Case 2) I-frame positioning
The server may search for an I-frame within the video media and generate a new moov
box. In this way, the server can provide a temporal position that more closely matches
the requested position. Note that media re-encoding is not required even in this case.
[Figure: 45 sec is requested]
                           0sec-30sec   30sec-60sec   60sec-90sec
                           moov mdat    moof mdat     moof mdat
1) moov->moof              ftyp moov mdat moof mdat
2) generate new moof for   ftyp moov mdat moof mdat
Figure E-4 Movie file reconstruction for random positioning service
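The two positioning cases can be sketched as simple lookups over sorted time lists. This is a sketch under assumptions: the function names and the I-frame timestamps are hypothetical, "nearest fragment" is interpreted as the fragment whose time span contains the requested position (consistent with Figure E-4, where a request for 45 sec yields the file starting at 30 sec), and the I-frame case picks the last I-frame at or before the request.

```python
from bisect import bisect_right
from typing import List


def fragment_for_time(fragment_starts: List[float], requested: float) -> int:
    """Case 1: index of the fragment whose time span contains `requested`.

    `fragment_starts` must be sorted in ascending order.
    """
    if not fragment_starts or requested < fragment_starts[0]:
        raise ValueError("requested time precedes the first fragment")
    return bisect_right(fragment_starts, requested) - 1


def iframe_at_or_before(iframe_times: List[float], requested: float) -> float:
    """Case 2: time of the last I-frame at or before `requested`.

    `iframe_times` must be sorted in ascending order.
    """
    position = bisect_right(iframe_times, requested)
    if position == 0:
        raise ValueError("no I-frame at or before the requested time")
    return iframe_times[position - 1]


if __name__ == "__main__":
    starts = [0.0, 30.0, 60.0]                  # fragments as in Figure E-4
    iframes = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0]  # hypothetical
    print(fragment_for_time(starts, 45.0))      # fragment starting at 30 sec
    print(iframe_at_or_before(iframes, 45.0))   # closer to the request
```

With these hypothetical I-frame times, fragment positioning starts playback at 30 sec, while I-frame positioning starts at 40 sec, which is closer to the requested 45 sec.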
E.3 Choosing bitrate
Considering the difference in download speed between EV-DO and 1x, it is desirable for
the client to be able to choose the bitrate of the movie. This is accomplished by a
transport protocol option together with a server operation. The transport option is
depicted in Figure E-5. It requires a CGI parameter to specify the media bitrate.
Request:
GET /foo.3g2?br=128 HTTP/1.1
<CR><LF>
Response:
HTTP/1.1 200 OK
Figure E-5 Protocol enhancement for choosing bitrate
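A server could read the bitrate parameter from the request target of Figure E-5 along the following lines. This is a sketch only: `requested_bitrate` is a hypothetical helper, and the unit of the `br` value is not stated in this Annex, so the number is returned as-is.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def requested_bitrate(request_target: str) -> Optional[int]:
    """Return the value of the `br` query parameter, or None if absent.

    Raises ValueError if the parameter is present but not an integer.
    """
    query = urlsplit(request_target).query
    values = parse_qs(query).get("br")
    if not values:
        return None
    # If the parameter is repeated, this sketch uses the first occurrence.
    return int(values[0])


if __name__ == "__main__":
    print(requested_bitrate("/foo.3g2?br=128"))
    print(requested_bitrate("/foo.3g2"))
```

Returning None when the parameter is missing leaves the server free to fall back to a default bitrate of its own choosing.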
One possible server implementation is to make a .3g2 file with multiple tracks,
in which each track carries the same content encoded at a different bitrate. The
server then regenerates a new .3g2 file containing single-bitrate media from the
multirate file. Hence, in addition to the transport option, an enhancement to the
server operation is required.
[Figure: Multirate movie: each moof is followed by interleaved samples of tracks V1 V2 V3 A1 A2 A3; Extracted single movie: each moof is followed by samples of tracks V1 A1 only]
Figure E-6 Generation of movie file for pseudo-streaming from multirate movie
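The extraction shown in Figure E-6 amounts to keeping only the samples of the selected tracks within each fragment. The sketch below illustrates this with track labels only; `extract_tracks` is a hypothetical name, and regenerating the moof index for the reduced fragment is assumed to happen elsewhere.

```python
from typing import List, Set


def extract_tracks(fragments: List[List[str]], keep: Set[str]) -> List[List[str]]:
    """Keep only the samples whose track label is in `keep`.

    Each fragment is a list of track labels in stored (interleaved) order.
    The order of the remaining samples is preserved, and the media data
    itself is copied unchanged, so no re-encoding is involved.
    """
    return [[label for label in samples if label in keep] for samples in fragments]


if __name__ == "__main__":
    one_round = ["V1", "V2", "V3", "A1", "A2", "A3"]
    multirate = [one_round * 2, one_round * 2]  # two fragments
    print(extract_tracks(multirate, {"V1", "A1"}))
```

Selecting V1 and A1 from a multirate movie in which every fragment interleaves V1 V2 V3 A1 A2 A3 leaves fragments that contain only V1 A1 pairs.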