video_coding2.ppt

  1. Roadmap (EE569 Digital Video Processing)
     Introduction
     Intra-frame coding
     Inter-frame coding
     Object-based and scalable video coding*
     – Why object-based? Motion segmentation, shape coding, R-D optimization
     – Scalability issues: spatial/temporal/quality scalabilities
  2. Object-based Video Coding
     Waveform-based coding discussed so far (e.g., H.261/263/264, MPEG-1/-2) uses a simple source model
     – It does not consider the semantic content of the video (e.g., objects and their shapes)
     Object-based video coding identifies objects (or regions) in a video and encodes them. Potential benefits include
     – Improved coding efficiency
     – Improved visual quality (e.g., no blocking artifacts)
     – Content description
     – Content-based interactivity
     Also called “content-dependent video coding”
     – The buzzword for MPEG-4, but less successful than expected (so the important question is why it does not work so well)
  3. Essential Tasks in Object-based Video Coding
     Object/region segmentation
     – Separate pixels based on their color, texture, and motion characteristics
     – Closely related to motion detection and segmentation
     – Intrinsically ill-defined and in need of a breakthrough
     2-D shape modeling and coding
     – Not all shapes are equally probable
     – Subtle implications for video coding (hidden pitfalls)
     2-D texture modeling and coding
     – Extension of existing block-based MCP to region-based MCP
     – Deformable textures (tradeoff between spatial and temporal prediction)
  4. Object/Region Segmentation
     The major challenge in content/object-based coding
     Common approaches for segmentation in a still image: gray-level thresholding, clustering, edge detection, region growing, splitting and merging
     Object segmentation in video
     – Motion information can be utilized, but how?
     – Should we place more trust in motion cues or in spatial cues?
  5. Motion-based Segmentation
     Motion-based segmentation: segment an image using motion information
     – We can first estimate the motion field and then segment the motion field
     – However, estimation and segmentation are like two sides of the same coin: a good estimate needs a good segmentation, and vice versa
  6. A Mind-bothering Example
     [Figure: Frame 1 and Frame 2]
     It is easy to convince yourself that the tree branches are moving. But how do we know the sky is still? What if it were also moving at the same speed? (Because the sky is a smooth region, shouldn’t we observe the same intensity patterns either way?)
  7. Implications for Video Coding
     True motion representation might be useful for computer vision and motion perception, but it is not indispensable in video coding
     The fundamental reason lies in the relationship between motion representation and video coding: how to tolerate the uncertainty in motion?
     The same issue remains in object-based image coding: how to tolerate the uncertainty in shape? (We will discuss this in more detail later.)
  8. Simplified Segmentation: Change Detection
     To detect the changing parts in a video from time $t_i$ to time $t_j$, we compute a difference image and threshold the difference by $T$:
     $$d_{ij}(x,y) = \begin{cases} 1 & \text{if } |f(x,y,t_i) - f(x,y,t_j)| > T \\ 0 & \text{otherwise} \end{cases}$$
     [Figure: f(x, y, t_i) and f(x, y, t_j)]
     $d_{ij}(x,y)$ can be further processed, e.g., to remove isolated 1’s or to group 1’s that are close to each other
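The thresholded difference image above can be sketched in a few lines of NumPy. This is only an illustration: the function names `change_mask` and `remove_isolated`, the 8-neighbour rule for "isolated", and the toy frames are assumptions, not part of the slides.

```python
import numpy as np

def change_mask(frame_i, frame_j, T):
    """Return d_ij: 1 where |f(x,y,t_i) - f(x,y,t_j)| > T, else 0."""
    # Cast to a signed type so that uint8 subtraction cannot wrap around.
    diff = np.abs(frame_i.astype(np.int32) - frame_j.astype(np.int32))
    return (diff > T).astype(np.uint8)

def remove_isolated(mask):
    """Clear every 1 that has no 1 among its 8 neighbours (assumed rule)."""
    h, w = mask.shape
    padded = np.pad(mask, 1)  # zero border so edge pixels have 8 neighbours
    neighbours = np.zeros((h, w), dtype=np.int32)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbours += padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return (mask.astype(bool) & (neighbours > 0)).astype(np.uint8)

# Toy frames: a 2x2 object appears, plus one isolated noisy pixel.
f_i = np.zeros((5, 5), dtype=np.uint8)
f_j = f_i.copy()
f_j[1:3, 1:3] = 200
f_j[4, 4] = 200
d = change_mask(f_i, f_j, T=30)
cleaned = remove_isolated(d)
```

Here `d` marks all five changed pixels, and `cleaned` keeps only the four that belong to the 2x2 object.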
  9. Change Detection: Pros and Cons
     – Simple to implement; fast
     – Detects all changes
     – Detects even unwanted changes
     – Positive and negative changes detected (occlusion)
     – Difficult to quantify motion
     – Requires a static reference frame
  10. Change Detection: An Example
     Monitor the traffic
  11. If There Is No Static Reference Frame
     Background extraction methods
     – Ad hoc median detector (your CA#6)
     – To eliminate the impact of (small) moving objects, use the “robust estimator” approach to iteratively remove the outliers
     – More sophisticated approaches model the background with a mixture of Gaussian distributions and use graph-cut-based optimization
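A minimal sketch of the median idea, assuming the background is visible at each pixel in more than half of the frames. The function name `median_background` and the synthetic frames are illustrative and are not taken from the course assignment.

```python
import numpy as np

def median_background(frames):
    """Pixel-wise temporal median of equally sized grayscale frames."""
    stack = np.stack(frames, axis=0)  # shape: (num_frames, H, W)
    return np.median(stack, axis=0)

# Static background of value 50 with a bright one-pixel object moving left to right.
frames = []
for t in range(5):
    frame = np.full((3, 5), 50, dtype=np.uint8)
    frame[1, t] = 255
    frames.append(frame)

background = median_background(frames)
```

Each pixel is covered by the object in at most one of the five frames, so the median recovers the background value everywhere.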
  12. Simplified Segmentation: Global Motion Estimation
     Planar homography (feature-based)
     – Homogeneous coordinates
     – Conditions for planar homography
     – Homography estimation from feature correspondence
     Hierarchical model-based GME (feature-less)
     – Directly minimize an energy function (the MSE of MCP errors)
     – Solve the optimization problem in a coarse-to-fine fashion (more robust and efficient)
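For the feature-based branch, one common way to estimate a homography from point correspondences is the direct linear transform (DLT). The slides do not say which estimator is used, so treat the sketch below as an assumption; the names `estimate_homography` and `apply_homography` and the test matrix are made up for illustration, and no coordinate normalization or outlier rejection is attempted.

```python
import numpy as np

def estimate_homography(src, dst):
    """DLT: find the 3x3 H with dst ~ H @ src in homogeneous coordinates.

    Needs at least 4 correspondences with no three points collinear, and
    assumes H[2, 2] is nonzero so that H can be scaled to H[2, 2] = 1.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=float)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def apply_homography(H, points):
    """Map 2-D points through H using homogeneous coordinates."""
    pts = np.hstack([np.asarray(points, dtype=float), np.ones((len(points), 1))])
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Synthetic check: recover a known homography from four corner correspondences.
H_true = np.array([[1.2, 0.1, 5.0],
                   [-0.05, 0.9, -3.0],
                   [0.001, 0.002, 1.0]])
src = [(0, 0), (100, 0), (100, 100), (0, 100)]
dst = apply_homography(H_true, src)
H_est = estimate_homography(src, dst)
```

With noisy real features, more than four correspondences and a robust estimator would normally be used instead of this exact fit.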
  13. Plane Homography
  14. Model-based GME
     Target function for minimization: [equation]
     Solution: Gauss-Newton method: [equations]
     Bergen, J. R., Anandan, P., Hanna, K. J., and Hingorani, R., “Hierarchical Model-Based Motion Estimation,” Proc. of the Second European Conference on Computer Vision, pp. 237-252, 1992
  15. Multi-resolution GME
  16. Numerical Example
  17. Summary for Change Detection and Global Motion Estimation
     Motion segmentation becomes relatively easy to solve when either the camera is still or the background objects lie on a plane
     Latest advances include joint motion segmentation and estimation using level-set methods (PDE-based formulation)
     Mansouri, A.-R. and Konrad, J., “Multiple motion segmentation with level sets,” IEEE Transactions on Image Processing, vol. 12, no. 2, pp. 201-220, Feb. 2003
  18. 2-D Shape Modeling and Coding
     Bitmap coding: a binary map specifying whether or not a pixel belongs to an object
     – A special case of the general alpha-map
     Contour coding: code only the contour of the object or the region
     – Chain codes
     – Polygon approximation
     – Spline approximation
  19. Image Matting (Soft segmentation)
     $$X(i,j) = \alpha(i,j)\,F(i,j) + [1 - \alpha(i,j)]\,B(i,j), \qquad 0 \le \alpha(i,j) \le 1$$
     Not for coding but for interactive editing
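The compositing equation can be checked numerically with a short sketch. Reading F as the foreground, B as the background, and α as the per-pixel opacity is the usual interpretation of this equation; the function name `composite` and the sample values are illustrative.

```python
import numpy as np

def composite(alpha, foreground, background):
    """X = alpha * F + (1 - alpha) * B, with alpha in [0, 1] per pixel."""
    alpha = np.asarray(alpha, dtype=float)
    if np.any(alpha < 0.0) or np.any(alpha > 1.0):
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * foreground + (1.0 - alpha) * background

# alpha = 0 shows pure background, alpha = 1 pure foreground, 0.25 a soft mix.
alpha = np.array([0.0, 0.25, 1.0])
X = composite(alpha, foreground=200.0, background=100.0)
```

A binary alpha map (values 0 and 1 only) reduces this to the bitmap shape representation of the previous slide.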
  20. 2-D Texture Modeling and Coding*
     – Shape-adaptive DCT
     – Shape-adaptive wavelet transform
  21. Roadmap
     Introduction
     Intra-frame coding
     – Review of JPEG
     Inter-frame coding
     – Conditional Replenishment (CR)
     – Motion Compensated Prediction (MCP)
     Scalable video coding
     – 3D subband/wavelet coding and recent trends
  22. Scalable vs. Multicast
     What is scalable coding?
     [Figure: Multicast: foreman.yuv is encoded into separate streams foreman128k.cod, foreman256k.cod, foreman512k.cod, and foreman1024k.cod. Scalable coding: foreman.yuv is encoded into a single stream foreman.cod with operating points at 128, 256, 512, and 1024.]
  23. Spatial scalability
     [Figure: embedded bitstream 1 0 1 1 1 … 0 1 0 1 0 0 0 … 1 1 0 1 0 0]
  24. Temporal scalability
     [Figure: embedded bitstream 1 0 1 1 1 … 0 1 0 1 0 0 0 … 1 1 0 1 0 0]
     – Frames 0, 4, 8, 12, …: 7.5 Hz
     – Frames 0, 2, 4, 6, 8, …: 15 Hz
     – Frames 0, 1, 2, 3, 4, 5, …: 30 Hz
  25. SNR (Rate) scalability
     [Figure: embedded bitstream 1 0 1 1 1 … 0 1 0 1 0 0 0 … 1 1 0 1 0 0, decoded at PSNRavg = 30 dB, 35 dB, and 40 dB]
     $$\mathrm{PSNR}_{avg} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{PSNR}_i$$
     where $\mathrm{PSNR}_i$ is the PSNR of frame $i$
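As a small worked example of the average, the sketch below computes a per-frame PSNR and then the mean over frames. The peak value of 255 (8-bit samples), the flat-list frame representation, and the function names are assumptions made for illustration.

```python
import math

def psnr(original, decoded, peak=255.0):
    """PSNR in dB between two equally sized frames given as flat sample lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)

def average_psnr(originals, decodeds):
    """PSNRavg = (1/N) * sum of the per-frame PSNR values."""
    values = [psnr(o, d) for o, d in zip(originals, decodeds)]
    return sum(values) / len(values)

# Two 4-sample frames: the first decoded with error 10 per sample (MSE 100),
# the second with error 5 per sample (MSE 25).
originals = [[100, 100, 100, 100], [60, 60, 60, 60]]
decodeds = [[110, 110, 110, 110], [65, 65, 65, 65]]
psnr_avg = average_psnr(originals, decodeds)
```

Note that this averages the dB values frame by frame, as in the formula above, which is not the same as computing one PSNR from the MSE averaged over all frames.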
  26. Scalability via Bit-Plane Coding
     $$A = \pm\,(a_0 + a_1 2 + a_2 2^2 + \dots + a_7 2^7)$$
     with a sign bit, $a_0$ the least significant bit (LSB), and $a_7$ the most significant bit (MSB)
     Examples
     – A = 129: sign = +, a_0 a_1 a_2 … a_7 = 10000001
     – sign = −, a_0 a_1 a_2 … a_7 = 00110011: A = −(4 + 8 + 64 + 128) = −204
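The sign-magnitude decomposition can be reproduced with a few lines of Python. The sketch also reconstructs the value from only the most significant planes, which is how a truncated embedded bitstream would refine a coefficient; the function names and the `planes_received` parameter are illustrative assumptions.

```python
def to_bit_planes(A, num_bits=8):
    """Return (sign, [a0, a1, ..., a7]) with a0 the LSB and a7 the MSB."""
    sign = -1 if A < 0 else 1
    magnitude = abs(A)
    if magnitude >= 1 << num_bits:
        raise ValueError("magnitude does not fit in num_bits")
    bits = [(magnitude >> k) & 1 for k in range(num_bits)]
    return sign, bits

def from_bit_planes(sign, bits, planes_received=None):
    """Rebuild A from the MSB downward, using only the first planes received."""
    n = len(bits)
    if planes_received is None:
        planes_received = n
    value = 0
    for k in range(n - 1, n - 1 - planes_received, -1):
        value += bits[k] << k
    return sign * value

sign, bits = to_bit_planes(-204)
full = from_bit_planes(sign, bits)                        # all 8 planes
coarse = from_bit_planes(sign, bits, planes_received=2)   # MSB and next plane only
```

With only the two most significant planes, the decoder sees −(128 + 64) = −192; each further plane refines the value toward −204.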
  27. Why Is DPCM Bad for Scalability?
     Frame number           1       2   3   …
     Base layer             Ibase   P   P   P
     Enhancement layer 1    Ienh1   P   P   P
     Enhancement layer 2    Ienh2   P   P   P
     – Suffers from the drifting problem
     – Suffers from coding efficiency loss
  28. Fine Granular Scalability (FGS)
     [Figure: H.264 with/without the FGS option on the Foreman sequence (5 fps); base layer at 20 kbps, enhancement layer at variable bit rate; efficiency gap of about 2 dB]
  29. 3D Wavelet/Subband Coding
     [Figure: video volume with axes x, y, and t]
     2-D spatial WT + 1-D temporal WT
  30. Wavelet Video Coder
     [Figure: original video frames → temporal wavelet transform → spatial wavelet transform → embedded quantization & entropy coding; the temporal transform produces subband frames ranging from LLL to H]
     [Taubman & Zakhor, 1994] [Ohm, 1994] [Choi & Woods, 1999] [Hsiang & Woods, VCIP ’99] . . . and others
  31. Motion-Adaptive 3D Wavelet Transform
     Recall the Haar transform
     $$d(n) = x(2n+1) - x(2n), \qquad s(n) = \tfrac{1}{2}\,[x(2n) + x(2n+1)]$$
     Lifting-based implementation
     $$d(n) = x(2n+1) - x(2n), \qquad s(n) = x(2n) + \tfrac{1}{2}\,d(n)$$
     Motion-adaptive Haar transform
     $$d_n = f_{2n+1} - W[f_{2n}], \qquad s_n = f_{2n} + \tfrac{1}{2}\,W^{-1}[d_n]$$
     $W$, $W^{-1}$: warping by the forward and backward motion vectors
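A sketch of the lifting-based Haar transform without motion compensation, to show that the predict and update steps invert exactly. It assumes the sign convention d(n) = x(2n+1) − x(2n) and s(n) = x(2n) + d(n)/2; other references flip the sign of d(n). The function names are illustrative.

```python
def haar_lift_forward(x):
    """Split an even-length sequence into low band s and high band d."""
    even, odd = x[0::2], x[1::2]
    d = [o - e for e, o in zip(even, odd)]       # predict: d(n) = x(2n+1) - x(2n)
    s = [e + dn / 2 for e, dn in zip(even, d)]   # update:  s(n) = x(2n) + d(n)/2
    return s, d

def haar_lift_inverse(s, d):
    """Undo the lifting steps in reverse order."""
    even = [sn - dn / 2 for sn, dn in zip(s, d)]  # undo update
    odd = [e + dn for e, dn in zip(even, d)]      # undo predict
    x = [0] * (2 * len(s))
    x[0::2], x[1::2] = even, odd
    return x

x = [10, 12, 8, 8, 3, 7]
s, d = haar_lift_forward(x)
x_rec = haar_lift_inverse(s, d)
```

In the motion-adaptive version, the predict and update steps are applied along motion trajectories between frames rather than at fixed pixel positions; the invertibility argument is the same because each lifting step is undone by subtracting what was added.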
  32. Lifting
     [Figure: Analysis: even and odd frames pass through motion-compensated predict (P) and update (U) steps and gains G0 and G1 to produce the low band and the high band. Synthesis: the inverse gains G0^{-1} and G1^{-1} and the reversed U and P steps recover the even and odd frames.]
     [Secker & Taubman, 2001] [Popescu & Bottreau, 2001]
  33. MC Wavelet Coding vs. H.264/AVC
     [Figure: luminance PSNR (dB, 20 to 38) vs. bit rate (Mbps, 0.2 to 2.0) for the Mobile CIF sequence; curves: scalable MC 5/3 wavelet and non-scalable H.264/AVC]
     H.264/AVC settings
     • High-complexity RD control
     • CABAC
     • PBBPBBP . . .
     • 5 previous/3 future reference frames
     • Data courtesy of M. Flierl
     [Taubman & Secker, VCIP 2003], courtesy D. Taubman
  34. Wavelet Synthesis with Lossy Motion Vector
     [Figure: video in → MC wavelet transform → embedded encoding → decoder → inverse MC wavelet transform → video out; the motion estimator output d also goes through embedded encoding and a decoder before reaching the inverse transform; both encoders minimize J = D + λR]
     [Taubman & Secker, ICIP03]
  35. R-D Performance with Lossy Motion Vector
     [Figure: video PSNR (dB, 24 to 40) vs. bit rate (kbps, 0 to 1200) for CIF Foreman; curves: non-embedded single-rate; embedded wavelet coefficients with lossless motion; embedded wavelet coefficients with lossy motion]
     [Taubman & Secker, VCIP 2003], courtesy D. Taubman
  36. Surprising Success of ITU-T Rec. H.263
     – What H.263 was developed for: analog videophone
     – . . . and what it was used for: Internet video streaming
  37. What is Streaming Video?
     • Download mode: no delay bound
     • Streaming mode: delay bound
     [Figure: data path from the source (cnn.com) through access switches (SW) and Internet domains A, B, and C to Receiver 1 and Receiver 2 (RealPlayer)]
  38. Outline
     • Challenges for quality video transport
     • An architecture for video streaming
       – Video compression
       – Application-layer QoS control
       – Continuous media distribution services
       – Streaming server
       – Media synchronization mechanisms
       – Protocols for streaming media
     • Summary
  39. Time-varying Available Bandwidth
     • No bandwidth reservation
     [Figure: data path from the source (cnn.com) through access switches and domains A and B to a RealPlayer receiver; labels: 56 kb/s, R ≥ 56 kb/s, R < 56 kb/s]
  40. Time-varying Delay
     • Delayed packets are regarded as lost
     [Figure: data path from the source (cnn.com, 56 kb/s) through access switches and domains A and B to a RealPlayer receiver]
  41. Effect of Packet Loss
     • No retransmission
     [Figure: data path from the source through access switches and domains A and B to the receiver, comparing reception with no packet loss and with loss of packets]
  42. Unicast vs. Multicast
     [Figure: unicast and multicast delivery]
     Pros and cons?
  43. Heterogeneity for Multicast
     • Network heterogeneity
     • Receiver heterogeneity
     [Figure: one source multicasting across domains A, B, and C to Receiver 1 (Ethernet, 1 Mb/s), Receiver 2 (256 kb/s), and Receiver 3 (gateway to telephone networks, 64 kb/s). What quality should each receiver get?]
  44. Outline
     • Challenges for quality video transport
     • An architecture for video streaming
       – Video compression
       – Application-layer QoS control
       – Continuous media distribution services
       – Streaming server
       – Media synchronization mechanisms
       – Protocols for streaming media
     • Summary
  45. Architecture for Video Streaming
  46. Video Compression
     [Figure: layered video encoding/decoding; D denotes the decoder]
     – Layer 0 → D: 64 kb/s
     – Layer 0 + Layer 1 → D: 256 kb/s
     – Layer 0 + Layer 1 + Layer 2 → D: 1 Mb/s
  47. Application of Layered Video
     [Figure: IP multicast from one source across domains A, B, and C to Receiver 1 (Ethernet, 1 Mb/s), Receiver 2 (256 kb/s), and Receiver 3 (gateway to telephone networks, 64 kb/s)]
  48. Application-layer QoS Control
     Congestion control (using rate control):
     – Source-based; requires rate-adaptive compression or rate shaping
     – Receiver-based
     – Hybrid
     Error control:
     – Forward error correction (FEC)
     – Retransmission
     – Error-resilient compression
     – Error concealment
  49. Congestion Control
     • Window-based vs. rate control (pros and cons?)
     [Figure: window-based control and rate control]
  50. Source-based Rate Control
  51. Video Multicast
     • How to extend source-based rate control to multicast?
     • Limitations of source-based rate control in multicast
     • Trade-off between bandwidth efficiency and service flexibility
  52. Receiver-based Rate Control
     [Figure: IP multicast for layered video from one source across domains A, B, and C to Receiver 1 (Ethernet, 1 Mb/s), Receiver 2 (256 kb/s), and Receiver 3 (gateway to telephone networks, 64 kb/s)]
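One way to read the receiver-based scheme is that each receiver joins as many layers as its own bandwidth allows. The sketch below assumes the 64 kb/s, 256 kb/s, and 1 Mb/s figures are cumulative rates (base layer, base plus one enhancement layer, all three layers); the function name and the rule itself are illustrative, and a real receiver would also probe and back off as conditions change.

```python
# Assumed cumulative rates in kb/s after joining 1, 2, or 3 layers.
CUMULATIVE_RATES_KBPS = [64, 256, 1000]

def layers_to_subscribe(available_kbps, cumulative_rates=CUMULATIVE_RATES_KBPS):
    """Return how many layers (starting from the base layer) fit the bandwidth."""
    count = 0
    for rate in cumulative_rates:
        if rate <= available_kbps:
            count += 1
        else:
            break
    return count

# The three receivers in the figure, plus one that cannot even fit the base layer.
subscriptions = {kbps: layers_to_subscribe(kbps) for kbps in (1000, 256, 64, 50)}
```

Because the decision is made at each receiver, the source sends every layer once and does not need to know the receivers' individual rates.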
  53. Error Control
     • FEC
       – Channel coding
       – Source coding-based FEC
       – Joint source/channel coding
     • Delay-constrained retransmission
     • Error-resilient compression
     • Error concealment
  54. Channel Coding
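The slide's figure is not preserved, so as a stand-in here is the simplest packet-level channel code: one XOR parity packet per group, which lets the receiver rebuild any single lost packet without retransmission. This particular code, the function names, and the equal-length packet assumption are illustrative choices, not taken from the slides.

```python
def xor_parity(packets):
    """Build one parity packet as the byte-wise XOR of equal-length packets."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)

def recover_missing(received, parity):
    """Rebuild the single lost packet (marked None) from the others and the parity."""
    missing = received.index(None)
    rebuilt = bytearray(parity)
    for packet in received:
        if packet is not None:
            for i, byte in enumerate(packet):
                rebuilt[i] ^= byte
    return missing, bytes(rebuilt)

packets = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(packets)
index, payload = recover_missing([b"abcd", None, b"ijkl"], parity)
```

Stronger codes tolerate more than one loss per group at the cost of more redundancy.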
  55. Delay-constrained Retransmission
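The slide's figure is not preserved. A common receiver-side rule, sketched here as an assumption, is to request a retransmission only if the retransmitted packet can still arrive before its playout deadline. The function name, the `slack` parameter, and the timing values are hypothetical.

```python
def should_request_retransmission(now, rtt, playout_deadline, slack=0.0):
    """Request a resend only if now + RTT + slack is before the playout deadline.

    All times are in seconds on the receiver's clock; `slack` is an assumed
    safety margin for processing and jitter.
    """
    return now + rtt + slack < playout_deadline

# A packet due for playout at t = 10.0 s is detected as lost at t = 9.7 s.
worth_it = should_request_retransmission(now=9.7, rtt=0.2, playout_deadline=10.0)
too_late = should_request_retransmission(now=9.7, rtt=0.2, playout_deadline=10.0,
                                         slack=0.15)
```

When the test fails, the receiver skips the request and falls back on error concealment, since a packet that arrives after its deadline is as good as lost.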
  56. Outline
     • Challenges for quality video transport
     • An architecture for video streaming
       – Video compression
       – Application-layer QoS control
       – Continuous media distribution services
       – Streaming server
       – Media synchronization mechanisms
       – Protocols for streaming media
     • Summary
  58. Continuous Media Distribution Services
     • Content replication (caching & mirroring)
     • Network filtering/shaping/thinning
     • Application-level multicast (overlay networks)
  59. Caching
     • What is caching?
     • Why use caching? Does WWW mean World Wide Wait?
     • Pros and cons?
  60. Outline
     • Challenges for quality video transport
     • An architecture for video streaming
       – Video compression
       – Application-layer QoS control
       – Continuous media distribution services
       – Streaming server
       – Media synchronization mechanisms
       – Protocols for streaming media
     • Summary
  61. Streaming Server
     • Different from a web server
       – Timing constraints
       – Video-cassette-recorder (VCR) functions (e.g., fast forward/backward, random access, and pause/resume)
     • Design of streaming servers
       – Real-time operating system
       – Special disk scheduling schemes
  62. Media Synchronization
     • Why media synchronization?
     • Example: lip synchronization (video/audio)
  63. Protocols for Streaming Video
     • Network-layer protocol: Internet Protocol (IP)
     • Transport protocol:
       – Lower layer: UDP & TCP
       – Upper layer: Real-time Transport Protocol (RTP) & RTP Control Protocol (RTCP)
     • Session control protocol:
       – Real-Time Streaming Protocol (RTSP): RealPlayer
       – Session Initiation Protocol (SIP): Microsoft Windows MediaPlayer; Internet telephony
  64. Protocol Stacks
  65. Summary
     • Challenges for quality video transport
       – Time-varying available bandwidth
       – Time-varying delay
       – Packet loss
     • An architecture for video streaming
       – Video compression
       – Application-layer QoS control
       – Continuous media distribution services
       – Streaming server
       – Media synchronization mechanisms
       – Protocols for streaming media