  • This is the ideal picture of media streaming: the server captures frames and sends them to the network, and the client reads from the pipe and plays them. This picture assumes enough capacity, but that is not always true. In most cases the network has limited capacity, so the server needs to scale down its streaming bitrate; we call this media scaling. There are several ways to scale, and here is one example: the server discards some frames before sending them to the network, which reduces the bitrate to a level the network can carry. So that is scaling. Another issue in media streaming is repair. In the illustration so far, the network cloud is clear and the data passes through without any problem. That is not always true either, since packets can be lost in the network. When there are more packets than the network can handle, the network drops some packets randomly. Noise in the physical channel can also cause bit errors. For this reason, repair techniques are used to protect the data from loss, like armor. Besides the original video flow, there is now a repair flow, which is generated at the server and sent to the network. When packets are lost in the video flow, the client uses the repair data to recover the video. As you may have noticed, the repair flow adds more data to the network, and the network has only limited capacity, as shown here. So the server has to scale the video further to leave bandwidth for the repair flow. In other words, repair is good, but it is not free: the more repair data you add, the less video data you can send. There is a trade-off between repair and scaling when capacity is constrained.
  • There are three dimensions, but the capacity limit is used to reduce them to two.
  • The first part of the background is the video compression standard. Here I use MPEG as the example, but the research models and results should hold for other video standards such as H.26X. MPEG uses two types of compression: intra-frame and inter-frame. Intra-frame compression is very similar to the techniques used in JPEG and other picture formats; it exploits the similarities within a single frame. Inter-frame compression, by contrast, exploits the similarities among a sequence of pictures. With these two methods, MPEG has three types of frames: I, P and B. An I frame, where I stands for intra-coded, uses intra-frame compression only and is independent of other frames. When the client gets an I frame, it can decode and play it immediately. A P frame, where P stands for predictive-coded, uses both intra-frame and inter-frame encoding. It encodes only the difference from its preceding I or P frame, so a P frame depends on that preceding frame. A B frame, where B stands for bi-directionally predictive-coded, also uses both intra-frame and inter-frame encoding. Unlike a P frame, it encodes the difference from both its preceding and its succeeding I or P frame. MPEG video typically repeats a sequence of I, P and B frames for the duration of a video stream; this sequence is called a Group of Pictures (GOP). This figure shows a sample GOP, where the second I frame denotes the start of the next GOP. Because of these dependencies, the loss of a P frame can make other P and B frames useless, and the loss of an I frame can make the whole GOP useless. So the I frame is the most important, and P frames are more important than B frames. The subscript
  • Another part of the background is Forward Error Correction (FEC), which is the repair method I study in this research. The idea of FEC is to add redundancy to the original data before sending. When there is loss, the redundancy is used to recover the data. There are two kinds of FEC: media-independent FEC and media-dependent FEC, which I will talk about later. Media-independent FEC does not need to know the data content. It uses a mathematical algorithm to generate redundant parity packets for the original data. Reed-Solomon coding is a popular media-independent FEC method, and this picture shows how it works. In the first row, the original video frame is divided into K packets. In the second row, the FEC generator uses a mathematical function to create N-K redundant FEC packets, so there are now N packets in total. All N packets are sent over a lossy network, and some of them might get lost, as shown in the third row. Here I show only one lost packet, but there could be more. On the client side, in the fourth row, the FEC decoder checks how many of the N packets were received successfully. If K or more packets arrive correctly, the FEC decoder can reconstruct the original K data packets, and from them the original video frame.
  • Since there are capacity constraints, the video needs to adjust its streaming bitrate to fit the network. This is called media scaling. Temporal Scaling is one type of media scaling. We saw it in the introduction slide, where the server discards video frames to reduce the bitrate. There are two approaches to Temporal Scaling: pre-encoding temporal scaling and post-encoding temporal scaling. The former drops raw pictures before encoding, and the latter drops encoded frames. This picture shows how post-encoding temporal scaling (POTS) works. The first row shows the raw pictures captured from the camera. These pictures are then encoded into MPEG frames, shown in the second row. To reduce the amount of data sent to the network, some MPEG frames are dropped before transmission, as shown in the third row.
  • Decided by the encoder
  • Journal
  • DSIS method
  • R-squared measures how much of the total variation in the data is explained by the fitted line. It ranges from 0 to 1, where 1 means a perfect line fit and 0 means no correlation.
  • Preplayer records performance statistics

Presentation Transcript

  • ARMOR - A System for Adjusting Repair and Media Scaling for Video Streaming. Huahui Wu, Mark Claypool and Robert Kinicki. Elsevier Journal of Visual Communication and Image Representation (JVCIR), Volume 19, Number 8, Pages 489-499, December 2008
  • Introduction - Motivation
    [Figure: video frames and repair by Forward Error Correction (FEC)]
  • Operations Research Concept
    • Adjusting Repair and Media Scaling
      • Given Network and Application Environment
      • For each valid FEC and scaling combination, measure the video quality
      • Find the optimal point
    [Figure: video quality plotted against more repair and more scaling, with the optimal points marked]
  • Outline
    • Introduction
    • Background 
    • Models
    • Algorithms
    • User Study
    • Implementation
    • Conclusions
  • Video Compression Standard
    • MPEG
      • Popular compression standard
      • Intra-frame compression and inter-frame compression
      • Three types of frames: I, P and B
      • Group Of Pictures (GOP)
    • ARMOR models MPEG dependencies
  • Forward Error Correction (FEC)
    • Media-Independent FEC
      • Reed-Solomon codes [Reed+ 60]
    • ARMOR models benefits of FEC for frame transmission
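The benefit of media-independent FEC for one frame can be sketched numerically. The sketch below assumes that each packet is lost independently with probability p, which the slides do not state, and the function name is illustrative. It computes the probability that at least K of the N packets of a frame arrive, which is the condition under which a Reed-Solomon decoder can rebuild the frame.

```python
from math import comb

def frame_success_probability(k: int, n: int, p: float) -> float:
    """Probability that at least k of n packets arrive.

    Assumes every packet is lost independently with probability p
    (an assumption of this sketch, not a statement from the slides).
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    return sum(comb(n, i) * (1 - p) ** i * p ** (n - i) for i in range(k, n + 1))

# A frame split into 5 data packets at 4% loss:
no_fec = frame_success_probability(5, 5, 0.04)    # all 5 packets must arrive
with_fec = frame_success_probability(5, 7, 0.04)  # any 5 of 7 packets suffice
print(f"no FEC: {no_fec:.4f}, with 2 FEC packets: {with_fec:.4f}")
```

Adding FEC packets raises the success probability for the frame, at the cost of sending N instead of K packets.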
  • Media Scaling (1 of 2)
    • Sacrifice data to fit the capacity
    • Temporal Scaling (TS)
      • Pre-Encoding Temporal Scaling
      • Post-encoding Temporal Scaling (shown below)
  • Media Scaling (2 of 2)
    • Quality Scaling
      • MPEG uses quantization in coding to save bits
      • Quantization Value (1 to 31)
      • For example: original data = 23, 13, 7, 3
    • ARMOR models both Temporal Scaling and Quality Scaling
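    • Quantization example (original data = 23, 13, 7, 3)
      • Quantization Value 3: After Quantization 7, 4, 2, 1; After DeQuantization 21, 12, 6, 3
      • Quantization Value 6: After Quantization 3, 2, 1, 0; After DeQuantization 18, 12, 6, 0
      • Quantization Value 12: After Quantization 1, 1, 0, 0; After DeQuantization 12, 12, 0, 0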
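The slide's quantization example can be reproduced with integer division. This is a minimal sketch: the use of floor division is inferred from the slide's numbers (23 with a quantization value of 3 becomes 7), and real MPEG encoders apply quantization to transform coefficients with additional rules that are not modeled here.

```python
def quantize(values, q):
    """Divide each value by the quantization value, discarding the remainder."""
    return [v // q for v in values]

def dequantize(levels, q):
    """Multiply each quantized level back by the quantization value."""
    return [level * q for level in levels]

original = [23, 13, 7, 3]
for q in (3, 6, 12):
    levels = quantize(original, q)
    print(q, levels, dequantize(levels, q))
```

A larger quantization value leaves smaller numbers to encode, which saves bits, but the values recovered after dequantization drift further from the original data.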
  • Video Quality Measurements
    • Subjective Measurement
      • User study, expensive, not practical
    • Objective Measurements
      • Playable Frame Rate (R)
        • Good for Temporal Scaling, not for Quality Scaling
      • Peak Signal-to-Noise Ratio (PSNR)
        • Good for Quality Scaling, not for Temporal Scaling
      • Video Quality Metric (VQM)
        • By the Institute for Telecommunication Sciences
        • Extracts 7 perception-based features
          • Only one for frame losses
        • Reports a distortion value from 0 (no distortion) to 1 (heavy distortion)
    • ARMOR uses both R and VQM
      • Includes a user study
  • Outline
    • Introduction
    • Background
    • Models 
      • Streaming Bitrate Model (cost)
      • Video Quality Model (benefit)
    • Algorithms
    • User Study
    • Implementation
    • Conclusions
  • Parameters and Variables
    [Figure: video frames and repair by Forward Error Correction (FEC)]
  • Streaming Bitrate Model
    • Total streaming bitrate, including video packets and FEC packets:
    • where G is the constant GOP rate
      • N_PD and N_BD are the numbers of transmitted P and B frames, which depend on the Temporal Scaling level l_TS
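The bitrate equation itself did not survive in this transcript, so the following is only a sketch of how such a total could be computed, not the paper's formula. It assumes one I frame per GOP, frame sizes and FEC amounts counted in packets, and a fixed packet size; the function and parameter names are hypothetical. The example values use the average frame sizes, the GOP pattern and the 1 Kbyte setting from the Experiment Settings slide, treating 1 Kbyte as a 1000-byte packet size, which is an assumption.

```python
from math import ceil

def streaming_bitrate(gop_rate, packet_bytes, frame_packets, fec_packets,
                      n_p_sent, n_b_sent):
    """Bits per second for video plus FEC packets (assumed form, see lead-in).

    frame_packets / fec_packets: packets per frame for 'I', 'P' and 'B'.
    n_p_sent / n_b_sent: P and B frames actually transmitted per GOP.
    """
    per_gop_packets = (
        (frame_packets["I"] + fec_packets["I"])
        + n_p_sent * (frame_packets["P"] + fec_packets["P"])
        + n_b_sent * (frame_packets["B"] + fec_packets["B"])
    )
    return gop_rate * per_gop_packets * packet_bytes * 8

PACKET_BYTES = 1000                                 # assumed packet size
sizes_bytes = {"I": 24240, "P": 5200, "B": 1180}    # average frame sizes
packets = {k: ceil(v / PACKET_BYTES) for k, v in sizes_bytes.items()}
no_fec = {"I": 0, "P": 0, "B": 0}
one_each = {"I": 1, "P": 1, "B": 1}

# 30 fps with 12 frames per GOP (1 I + 3 P + 8 B) gives 2.5 GOPs per second.
plain = streaming_bitrate(2.5, PACKET_BYTES, packets, no_fec, 3, 8)
repaired = streaming_bitrate(2.5, PACKET_BYTES, packets, one_each, 3, 8)
print(plain, repaired)
```

The gap between the two results is the bandwidth that repair consumes, which under a fixed capacity has to be recovered through scaling.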
  • Video Quality Model - Overview [20]
    • Two distortion factors
      • Frame Loss
        • Caused by Temporal Scaling and network packet loss
        • Appears jerky in the video playout
        • Measured by Playable Frame Rate
      • Quantization Distortion
        • Caused by a high quantization value with Quality Scaling
        • Appears visually as coarse granularity in every frame
        • Measured by VQM
    • Overall Quality
      • Distorted Playable Frame Rate
  • Playable Frame Rate (R)
    • Frame Successful Transmission Probability
      • Where Frame Size
    • Frame Dependencies
    • Total Playable Frame Rate
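The equations for this slide are missing from the transcript. The sketch below shows one way to turn per-frame success probabilities and the I/P/B dependencies into an expected playable frame rate. It assumes frame losses are independent, that a GOP has one I frame, n_p P frames and n_bp B frames in each interval between reference frames, and that the B frames after the last P frame also need the next GOP's I frame. The function name and arguments are illustrative, and the paper's exact formula may differ.

```python
def playable_frame_rate(gop_rate, q_i, q_p, q_b, n_p, n_bp):
    """Expected playable frames per second under the stated assumptions.

    q_i, q_p, q_b: probability that an I, P or B frame arrives intact.
    n_p: P frames per GOP; n_bp: B frames per interval between references.
    """
    expected = q_i                              # the I frame stands alone
    for i in range(1, n_p + 1):                 # P_i needs I and P_1..P_i
        expected += q_i * q_p ** i
    for j in range(1, n_p + 1):                 # B frames just before P_j
        expected += n_bp * q_b * q_i * q_p ** j
    # B frames after the last P also depend on the next GOP's I frame.
    expected += n_bp * q_b * q_i * q_p ** n_p * q_i
    return gop_rate * expected

# GOP pattern I BB P BB P BB P BB at 2.5 GOPs per second (30 fps).
lossless = playable_frame_rate(2.5, 1.0, 1.0, 1.0, n_p=3, n_bp=2)
lossy = playable_frame_rate(2.5, 0.90, 0.97, 0.99, n_p=3, n_bp=2)
print(lossless, lossy)
```

Because every frame in the GOP depends on the I frame, a drop in q_i lowers the playable rate much more than the same drop in q_b.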
  • Distorted Playable Frame Rate (R_D)
    • Quality scaling distortion varies exponentially with the quantization level
    • Distorted Playable Frame Rate
    [4]
  • ARMOR Algorithm
    • For each Repair and Scaling combination
      • Estimate video frame sizes (S_I, S_P, S_B)
      • Compute streaming bitrate B and make sure it is under capacity constraint T
      • Use frame sizes and FEC amount to get the successful frame transmission rates (q_I, q_P, q_B)
      • Compute playable frame rate (R)
      • Estimate quality scaling distortion (D)
      • Compute distorted playable frame rate (R_D)
    • Exhaustively search all FEC and Scaling combinations and find the optimal quality
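The search loop can be sketched end to end, with the caveat that several pieces are placeholders. The frame size model (size proportional to 1 / quantization level), the exponential distortion curve and its coefficients, the relation R_D = (1 - D) * R, independent packet loss, and all names below are assumptions made for illustration. Temporal Scaling is left out to keep the sketch short, so only the quantization level and the FEC packets per frame type are searched.

```python
from itertools import product
from math import ceil, comb, exp

PACKET_BYTES = 1000                    # assumed packet size
GOP_RATE = 2.5                         # GOPs per second (30 fps, 12 frames per GOP)
N_P, N_BP = 3, 2                       # P frames per GOP, B frames per interval
FRAMES_PER_GOP = {"I": 1, "P": N_P, "B": N_BP * (N_P + 1)}
BASE_BYTES = {"I": 24240, "P": 5200, "B": 1180}  # assumed sizes at level 1

def frame_packets(kind, q_level):
    # Hypothetical size model: bytes shrink in proportion to 1 / q_level.
    return max(1, ceil(BASE_BYTES[kind] / q_level / PACKET_BYTES))

def distortion(q_level):
    # Hypothetical exponential curve with made-up coefficients, capped at 1.
    return min(1.0, 0.02 * (exp(0.12 * (q_level - 1)) - 1))

def arrival_prob(k, fec, p):
    # At least k of k + fec packets arrive, assuming independent loss p.
    n = k + fec
    return sum(comb(n, i) * (1 - p) ** i * p ** (n - i) for i in range(k, n + 1))

def playable_rate(q_i, q_p, q_b):
    frames = q_i + sum(q_i * q_p ** i for i in range(1, N_P + 1))
    frames += sum(N_BP * q_b * q_i * q_p ** j for j in range(1, N_P + 1))
    frames += N_BP * q_b * q_i * q_p ** N_P * q_i
    return GOP_RATE * frames

def best_setting(capacity_bps, loss, max_fec=4, q_levels=range(1, 32)):
    """Return (R_D, quantization level, FEC packets per frame type) or None."""
    best = None
    fec_choices = range(max_fec + 1)
    for q_level, f_i, f_p, f_b in product(q_levels, fec_choices,
                                          fec_choices, fec_choices):
        fec = {"I": f_i, "P": f_p, "B": f_b}
        size = {k: frame_packets(k, q_level) for k in "IPB"}
        per_gop = sum(FRAMES_PER_GOP[k] * (size[k] + fec[k]) for k in "IPB")
        if GOP_RATE * per_gop * PACKET_BYTES * 8 > capacity_bps:
            continue                   # over the capacity constraint T
        q = {k: arrival_prob(size[k], fec[k], loss) for k in "IPB"}
        r_d = (1 - distortion(q_level)) * playable_rate(q["I"], q["P"], q["B"])
        if best is None or r_d > best[0]:
            best = (r_d, q_level, fec)
    return best

print(best_setting(capacity_bps=800_000, loss=0.04))
```

With these placeholder models the search covers 31 x 5 x 5 x 5 = 3875 combinations per call, which is small enough to evaluate exhaustively.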
  • Outline
    • Introduction
    • Background
    • Models
    • Algorithms
    • User Study 
    • Implementation
    • Conclusions
  • User Study Goals
    • Accuracy of R_D
      • Correlation with user perceptual quality
      • Versus PSNR and VQM?
    • Temporal Scaling versus Quality Scaling
      • What are the differences?
    • Adjusted Repair (FEC) versus No Repair
      • Is Adjusted Repair an effective method for increasing perceptual quality?
  • Video Clips
    • Compare degraded clips to the original
    • Original: 30 fps, no quality scaling
    • Degraded: Combinations of 4 independent factors (2 options each)
      • Video and Network environment
        • Video content: low motion (News) or high motion (Coastguard)
        • Packet loss rate: low loss (1%) or high loss (4%)
      • ARMOR Layer
        • Repair: adjusted repair or no repair
        • Scaling: Quality Scaling or Temporal Scaling
    • 2^4 = 16 combinations for evaluation
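The 16 degraded clips are the full cross product of the four two-option factors. A small sketch of that enumeration follows; the dictionary keys and variable names are illustrative.

```python
from itertools import product

factors = {
    "content": ["News (low motion)", "Coastguard (high motion)"],
    "packet_loss": ["1%", "4%"],
    "repair": ["adjusted repair", "no repair"],
    "scaling": ["Quality Scaling", "Temporal Scaling"],
}

# One dict per degraded clip, covering every combination of options.
clips = [dict(zip(factors, combo)) for combo in product(*factors.values())]
print(len(clips))
for clip in clips[:2]:
    print(clip)
```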
  • User Study Application
    • Two-week volunteer study
    • 74 users, mostly CS undergraduate students
    [Figure: rating application with a score scale from 5 down to 1 [ITU-R BT.500-11]]
  • Results – Video Quality Metrics (1 of 3)
    • User Score vs. PSNR
    [Figure: plot of user score against PSNR; the score scale runs from "Same as original clip" to "Much worse than original clip"]
  • Results – Video Quality Metrics (2 of 3)
    • User Score vs. VQM Score (1 - VQM distortion)
  • Results – Video Quality Metrics (3 of 3)
    • User Score vs. Distorted Playable Frame Rate (R_D)
  • Results – Scaling Methods
    • Temporal Scaling versus Quality Scaling
    [Figure: user score and ARMOR prediction (Coastguard); R_D axis from 0.0 to 30.0]
  • Results – Repair Methods
    • Adjusted Repair versus No Repair
    [Figure: user score and ARMOR prediction (Coastguard); R_D axis from 0.0 to 30.0]
  • Outline
    • Introduction
    • Background
    • Models
    • Algorithms
    • User Study
    • Implementation 
    • Conclusions
  • Implementation Goals
    • Provide architecture for ARMOR system
    • Validate ARMOR model
    • Determine whether ARMOR can make improvements in real time
  • Architecture
    [Figure: ARMOR architecture diagram with numbered steps]
  • Experiment Settings
    • Video clip Paris
      • medium motion and details
      • two people sitting, talking, with high-motion gestures
      • 1200 CIF (352x288) images
      • average I / P / B frame sizes: 24.24 KB / 5.20 KB / 1.18 KB
    • MPEG Encoder Settings
      • R_F: 30 frames per sec
      • N_B: 8 frames per GOP
      • N_P: 3 frames per GOP
    • Network (NistNet) Settings
      • p: 0.01 to 0.04
      • S: 1 Kbyte
      • t_RTT: 50 ms
  • Results
    [Figures: ARMOR analytical results (R_D) and ARMOR measurement results (R_D)]
  • Conclusions
    • Distorted playable frame rate has a high correlation with user perceptual quality
      • Higher than PSNR or VQM
    • Adjusting repair improves video streaming quality significantly
      • Better than fixed repair and no repair
    • Quality Scaling is more effective than Temporal Scaling
      • But when bandwidth is low and network loss is high, Quality Scaling should be used with Temporal Scaling
    • Proof of concept: ARMOR can be implemented in real time to effectively improve streaming quality
  • Future Work?
  • Future Work
    • Implementation of quality scaling
    • Implementation of streaming media protocols
    • Bandwidth estimation techniques for initial streaming rate
    • Alternative repair techniques
    • Evaluate with time-varying bandwidth and packet loss
    • Classification of video motion and scene complexity to predict exponential coefficients
    • User studies to determine if R_D works for different scaling combinations