• Like
Upcoming SlideShare
Loading in...5
Uploaded on


  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
    Be the first to like this
No Downloads


Total Views
On Slideshare
From Embeds
Number of Embeds



Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

    No notes for slide
  • One can consider webpages as media as well!
  • One way delays
  • Define the 4 terms… TCP/UDP/IP suite (internet networking) provides best-effort delivery -- No guarantees about end-to-end delay -- No guarantees about variance of packet delay within stream Because of lack of special effort to deliver packets in timely manner, multimedia hard!
  • Phone with delay? On the Internet and in other networks, Quality of Service (QoS) is the idea that transmission rates, error rates, and other characteristics can be measured, improved, and, to some extent, guaranteed in advance. Schemes in which certain packets are given higher priority
  • -- Variable bitrate media (where number of packets for “unit” of media varies) makes this relationship hard to model -- possible question here.
  • Scalable encoding will be discussed a little later (built into encoding schemes)
  • Pulse Code Modulation: PCM: 8000 x 8 = 64 Kb/s PCM (CD): 44100 x 16 = 715.6 Kb/s (mono) x 2 (stereo)
  • Add animation/block diagrams t
  • MPEG-1: MPEG-1 was the original MPEG standard, designed exclusively for computer use. It allows a 320x240, 30 frame per second video. MPEG-2: MPEG-2 is a higher-resolution version of MPEG-1 designed for digital television broadcast. MPEG-3: MPEG-3 was designed for HDTV. However, HDTV is just normal TV with a higher resolution and frame rate, so this standard was folded into MPEG-2 and is no longer used. MPEG-4: MPEG-4 is a new standard for digital video scheduled for release in November 1998. It is designed for use over low-bit-rate wireless and mobile communication systems. DCT cannot provide the required compression to operate over this type of line, so MPEG-4 does not enforce an encoding method. Instead, it leaves the choice of encoding method up to the designer. MPEG-7: MPEG-7 is yet another standard for digital video that has barely begun. The focus of MPEG-7 is supposed to be designing a representation for digital video that allows it to be stored and queried by content in a video database.
  • Streaming often requested thru web client (browser) Audio/video not tightly integrated with broswer, so helper applications needed (media player)
  • Audio/video break up… Take simple case. A simple architecture is to have the Browser requests the object(s) and after their reception pass them to the player for display - No pipelining
  • Web browser requests and receives a Meta File ___more details___ HERE (a file describing the object) instead of receiving the file itself; Browser launches the appropriate Player and passes it the Meta File; Player sets up a TCP connection with Web Server and downloads the file -- TCP bad…. Delays… -- HTTP bad… No control (Not rich enough)
  • 2 servers: HTTP (meta data), streaming (media)
  • Use UDP, and Server sends at a rate (Compression and Transmission) appropriate for client; to reduce jitter, Player buffers initially for 2-5 seconds, then starts display Use TCP, and sender sends at maximum possible rate under TCP; retransmit when error is encountered; Player uses a much large buffer to smooth delivery rate of TCP -- Chance of starvation with small buffer
  • For user to control display: rewind, fast forward, pause, resume, etc… Out-of-band protocol (uses two connections, one for control messages (Port 554) and for media stream) RFC 2326 permits use of either TCP or UDP for the control messages connection, sometimes called the RTSP Channel As before, meta file is communicated to web browser which then launches the Player; Player sets up an RTSP connection for control messages in addition to the connection for the streaming media
  • Bit rate is 8 KBytes, and every 20 msec, the sender forms a packet of 160 Bytes + a header to be discussed below Ideal case, this just works, but…. (next slide)
  • Loss is in a broader sense: packet never arrives or arrives later than its scheduled playout time
  • With fixed playout delay, the delay should be as small as possible without missing too many packets
  • The playout delay is computed for each talk spurt based on observed average delay and observed deviation from this average delay The beginning of a talk spurt is identified from examining the timestamps in successive and/or sequence numbers of chunks U can ask a couple of questions here: what are disadvantages; where did we use avg. delay estimation etc. before?
  • Mixed quality streams are used to include redundant duplicates of chunks Upon loss playout available redundant chunk (a lower quality one) Can recover from single losses
  • Has no redundancy, but can cause delay in playout beyond Real Time requirements… why does it work?
  • Thin protocol As is, doesn’t specify everything you need Serves as a skeleton that can be filled out
  • Framing, that is identifying and handling a unit of transfer suitable to the application level, (ALF principle). Multiplexing, in case we want to put different media within the same stream. Real-time delivery, possibly sacrificing reliability for timeliness. (but does not in itself provide this) Synchronization, both within successive packets in a stream and between different media in different streams. Feedback, on who is receiving data with what quality
  • Multiple streams can be differentiated. Can tell if packet dropped.
  • Control session associated with each RTP session. Control session has same participants as RTP session. Provides feedback on quality of data stream. Carries information to associate multiple streams as coming from the same participant. Scalable way of estimating group size. Synchronization mechanism across streams from same participant.
  • If each receiver sends RTCP packets to all other receivers, the traffic load resulting can be large RTCP adjusts the interval between reports based on the number of participating receivers Typically, limit the RTCP bandwidth to 5% of the session bandwidth, divided between the sender reports (25%) and the receivers reports (75%)


  • 1. Multimedia Networking Ashwin Bharambe Carnegie Mellon University 15-441 Networking, Spring 2003 Tuesday, April 22, 2003
  • 2. What media are we talking about? Audio and/or Video This lecture concerned about:
  • 3. Lecture Overview
    • Multimedia Applications
    • Application Requirements / Challenges
    • Media Encodings
    • Case Studies
      • Streaming Stored Media
      • Real Time Interactive Media (Voice over IP)
    • Protocols
      • Real-Time Transport Protocol (RTP)
      • Real-Time Transport Control Protocol (RTCP)
  • 4. Application Classes
    • Streaming Stored Media
      • Pipeline reception of stored files over the network
      • No need to wait for entire file before viewing
      • Example: On-demand video
    • Unidirectional Real-Time Media
      • Similar, but data is generated on the fly
      • Non-interactive: just listen/view, no fast-forward!
      • Example: news broadcast
    • Interactive Real-Time Media
      • Example: video conference
  • 5. Streaming Media Applications
    • Characteristics
      • Interactive: Users can play, pause
      • Even rewind/fast-forward
    • Requirements
      • Delay: from client request until display start can be 1 to 10 seconds
      • Time is used to buffer data to reduce chance of stalling
      • Delay constraints on data delivery to maintain continuous playout.
    On demand Video
  • 6. Unidirectional Real-Time Apps
    • Characteristics
      • By storing media: can pause, rewind
      • Often used combined with multicast
    • Requirements
      • As with streaming media, delay constraints on data delivery to maintain continuous playout
    News Broadcast
  • 7. Real-Time Interactive Applications
    • Requirements
      • Very stringent delay requirements because of real-time interactive nature
      • Limit to how much people are willing to adapt to the delay
      • Video: < 150 msec acceptable (one-way)
      • Audio: < 150 msec not perceived, < 400 msec acceptable (one-way)
    Internet Telephony
  • 8. Roadmap
    • Multimedia Applications
    • Application Requirements / Challenges
    • Media Encodings
    • Case Studies
      • Streaming Stored Media
      • Real Time Interactive Media (Voice over IP)
    • Protocols
      • Real Time Protocol (RTP)
      • Real Time Control Protocol (RTCP)
  • 9. Review: Network Basics Source Dest Throughput = Amount of data arriving per unit time. Jitter = Variance of interpacket arrival times Interpacket Arrival Time Delay Loss Best-effort delivery!!
  • 10. End-to-End Delay
    • Interactive applications
      • Video conferencing
        • Human communication starts to break down at ~500 msec delays
      • Feedback loops
      • Delay affected by encoding/decoding
    • Today’s Internet
      • Best effort service and end-to-end mechanisms
      • Provide little opportunity for controlling delay
    • Longer term (only?) solution: Quality of Service
  • 11. Packet Jitter
    • Real-Time systems must deal with this
      • Jitter is hard to bound even with QoS
    • Partial solution: Buffer management
      • Buffers are used to smooth jitter, but how big should the buffer be?
      • Unexpected jitter can lead to starvation or buffer overflow if buffer is too small
      • But over-provisioning buffers increases delay
  • 12. Throughput
    • Estimating throughput is problematic
      • Media coding may be tunable, but changes in quality are generally perceptually annoying
      • Throughput will often vary faster than feedback delay which makes estimation hard
    • Multicast makes this worse
    • Possible Solutions:
      • QoS mechanisms
      • Scalable encoding combined with congestion control (i.e., dynamically adapt compression to bandwidth)
  • 13. Packet Loss
    • Effect of loss is media and encoding specific
      • Loss of one packet in a video stream may result in the loss of multiple frames
      • Other packets received can be rendered useless
      • Retransmission often not feasible (no time)
    • Possible Solutions:
      • Encoding that alleviates packet loss effects
      • Forward Error Correction (FEC)
  • 14. Roadmap
    • Multimedia Applications
    • Application Requirements / Challenges
    • Media Encodings
    • Case Studies
      • Streaming Stored Media
      • Real Time Interactive Media (Voice over IP)
    • Protocols
      • Real Time Protocol (RTP)
      • Real Time Control Protocol (RTCP)
  • 15. Media Encoding
    • Digitization essential, of course!
    • Sample signal at some frequency
      • 8KHz, for example
    • Quantize
      • represent each sample as some fixed #bits
    • Compress: make number of bits small
      • Audio: GSM, G.729, G.723.3, MP3, …
      • Video: MPEG 1/2/4, H.261, …
    • Then:
      • Send -> Receive, decompress/convert, play
  • 16. Audio Encoding
    • Traditional telephone quality encoding
      • 8KHz samples of 8 bits each  64kbps
    • CD quality encoding: 44.1 KHz of 16 bits
      • 1.41 Mbs uncompressed
    • MP3 compression
      • Frequency ranges that are divided in blocks, which are converted using DCT, quantized, and encoded
    12 128 kbps Layer 3 8 192 kbps Layer 2 4 384 kbps Layer 1 Ratio Range
  • 17. Image Encoding: JPEG
    • Divide digitized image in 8 x 8 pixel blocks
    • The DCT phase converts the pixel block into a block of frequency coefficients
      • Discrete Cosine Transform – similar to FFT
    • The quantization phase limits the precision of the frequency coefficient
      • This is based on a quantization table, which controls the degree of compression
    • The encoding phase packs this information in a dense fashion
  • 18. Video Encoding
    • Once frames are captured (&quot;raw&quot; video) resulting file is very large:
      • 320 x 240 x 24 -bit color = 230,400 bytes/frame
      • 15 frames/second = 3,456,000 bytes/second
      • 10 seconds takes around 30 Mbytes! (no audio)
    • Commonly-used encodings: AVI, MPEG
      • Exploit spatial locality inside a single image
      • Temporal locality across images
      • Layered encoding
  • 19. MPEG Encoding of Video
    • Three frame types:
      • I frame: independent encoding of the frame (JPEG)
      • P frame: encodes difference relative to I-frame (predicted)
      • B frame: encodes difference relative to interpolated frame
      • Note that frames will have different sizes
    • Complex encoding, e.g. motion of pixel blocks, scene changes, ..
      • Decoding is easier than encoding.
    • MPEG often uses fixed-rate encoding.
    I B B P B B P B B I B B P B B
  • 20. MPEG System Streams
    • Combine one more MPEG video and audio streams in a single synchronized stream.
    • Consists of a hierarchy with meta data at every level describing the data.
      • System level contains synchronization information
      • Video level is organized as a stream of group of pictures
      • Group of pictures consists of pictures
      • Pictures are organized in slices
  • 21. Roadmap
    • Multimedia Applications
    • Application Requirements / Challenges
    • Media Encodings
    • Case Studies
      • Streaming Stored Media
      • Real Time Interactive Media (Voice over IP)
    • Protocols
      • Real Time Protocol (RTP)
      • Real Time Control Protocol (RTCP)
  • 22. Streaming Media Applications
    • Important and growing
      • Reduction of storage costs
      • High-speed access networks (e.g., DSL)
      • Enhancements to network caching of video using content distribution networks
    • Operation
      • Segment audio/video file; send over UDP/TCP
      • Real-Time Transport Protocol (RTP)
    • User-Interactive Control
      • Real-Time Transport Control Protocol (RTCP)
  • 23. Streaming Helper Application
    • Displays content
      • Typically requested via a Web browser
      • Example: RealPlayer
    • Typical functions:
      • GUI for user control
      • Decompression
      • Jitter removal
      • Error correction
  • 24. Streaming From Web Servers (1)
    • Video sent as HTTP object(s)
    • Options:
      • Interleaved audio and video in one file
      • Two separate files with client synchronizing the display
  • 25. Streaming From Web Servers (2)
    • Alternative
      • Set up connection between server and player, then download
  • 26. Using a Streaming Server
    • Application layer protocol can be better tailored to streaming
      • Gets us around HTTP, allows a choice of UDP vs. TCP
  • 27. Streaming Media Application Client Side Issues
  • 28. Quick Question
    • What protocol does RealPlayer or QuickTime use?
  • 29. Tricky Question
    • What protocol does RealPlayer or QuickTime use?
  • 30. Real Time Streaming Protocol (RTSP)
    • For user to control transmission of stream
      • Display: rewind, fast forward, pause, resume, etc…
      • Think: remote control for your VCR
    • Out-of-band protocol (Port 554)
    • RFC 2326
  • 31. RTSP at Work
  • 32. Roadmap
    • Multimedia Applications
    • Application Requirements / Challenges
    • Media Encodings
    • Case Studies
      • Streaming Stored Media
      • Real Time Interactive Media (Voice over IP)
    • Protocols
      • Real Time Protocol (RTP)
      • Real Time Control Protocol (RTCP)
  • 33. Voice Over IP (VoIP)
    • Internet phone applications generate packets during talk spurts
      • No traffic during silence
    • Operation
      • Sample, quantize, encapsulate and send
    • Protocol choices
      • UDP: some packets may be lost (up to ~20% loss is tolerable)
      • TCP: eliminates loss but creates variance in delay
      • TCP slow start can also create periods of very small throughput
  • 34. Voice Over IP (VoIP): Issues
    • Delay
      • End-to-end delays > 400 msec cannot be tolerated
      • Packets that are that delayed are ignored (i.e., lost)
    • Jitter
      • Use timestamps, sequence numbers
      • Delay playout at receivers (fixed or adaptive)
    • Packet loss
      • Forward Error Correction (FEC)
      • Interleaving
      • Receiver-based repair
  • 35. Compensating for Jitter: VoIP with Fixed Playout Delay Playout delay
  • 36. Compensating for Jitter: Adaptive Playout Delay
    • Disadvantages of fixed playout delay
      • More packet loss (delayed packets), if too small
      • High latency, if too large
    • Solution: track network delay performance
      • Estimate average delay and delay variance
      • Similar to RTT estimation in TCP
    • Adjust playout delay during silent periods
      • Silent periods get shortened/expanded
      • Hardly noticeable!
  • 37. Recovery from Packet Loss: Forward Error Correction (FEC)
    • Send redundant encoded chunk every n chunks (XOR original n chunks)
      • If 1 packet in this group lost, can reconstruct
      • If > 1 packet lost, cannot recover
    • Disadvantages:
      • The smaller the group size, larger the overhead
      • Playout delay increased
  • 38. Recovering from Packet Loss: Piggybacking Lo-fidelity Stream
  • 39. Recovering from Packet Loss: Interleaving
    • Divide 20 msec of audio data into smaller units of 5 msec each and interleave
    • Upon loss, have a set of partially filled chunks
  • 40. Recovering from Packet Loss: Receiver-based Repair
    • Exploit temporal similarity
      • Replicate packets
    • Smart solutions
      • Interpolation
  • 41. Roadmap
    • Multimedia Applications
    • Application Requirements / Challenges
    • Media Encodings
    • Case Studies
      • Streaming Stored Media
      • Real Time Interactive Media (Voice over IP)
    • Protocols
      • Real Time Protocol (RTP)
      • Real Time Control Protocol (RTCP)
  • 42. Real Time Transport Protocol (RTP)
    • Recall:
      • Multimedia apps. append headers to audio/video chunks
      • Format, sequence numbers, timestamps, …
    • RTP: public domain standard for such headers
      • Chunks encapsulated in RTP packets before being sent to UDP
      • Real Time Control Protocol (RTCP) for control
    • RTP logically extends UDP.
      • Sits between UDP and application
      • Thin Protocol
  • 43. RTP: What does it do?
    • Framing
    • Multiplexing
    • “ Real-time delivery” (sort’a)
    • Synchronization
    • Feedback
  • 44. RTP Packet Format
    • Source/Payload type
      • Different formats assigned different codes
      • E.g., GSM -> 3 , MPEG Audio -> 14
    • Sequence numbers
    • Time stamps
    • Synchronization source ID
    • Miscellaneous fields
  • 45. Timestamp vs. Sequence Number
    • Timestamp relates packet to real time
      • Timestamp values sampled from a media specific clock
    • Sequence number relates packet to other packets
    • Allows many packets to have the same timestamp but different sequence numbers
  • 46. Audio silence example
    • Consider audio data: what do you want to do during silence?
      • Not send anything
    • Why might this cause problems?
      • Other side needs to distinguish between loss and silence
    • How does the timestamp/seq# mechanism help?
      • Big jump in timestamps, but correct next sequence number  Silence!
  • 47. SSRC: Synchronization Source
    • Identifies the stream
      • Not just the participant
      • All packets with the same SSRC go together
    • Picked at random
    • Why do we need this?
      • No assumption of stream association from underlying network mechanism
    • Possible problems?
      • Collision
  • 48. Real Time Control Protocol (RTCP)
    • RTCP packets transmitted by each participant in RTP session to all others using multicast
    • Distinct port number from RTP
    • Reports on:
      • Loss rate
      • Inter-arrival jitter
      • Identity of receivers
      • Delay, (indirectly).
  • 49. Control Bandwidth Scaling
    • Needed for scalability
      • Consider one speaker and 2 receivers as compared to one speaker with 20 or 200 receivers
    • Recommendation:
      • 5% of data bandwidth
      • Sending delay should be randomized
      • Should account for overhead of underlying network service (i.e., UDP and IP)
  • 50. Summary
    • Supporting multimedia applications in today’s Internet is difficult
      • Several tricks to alleviate problems posed by best-effort delivery
      • In future, perhaps bandwidth will be extremely plentiful everywhere
      • Or – we will need Quality of Service some day or the other!
    • Bunch of Real-Time protocols standardizing some aspects of real-time communication