video_coding2.ppt

Transcript

  • 1. Roadmap Introduction Intra-frame coding Inter-frame coding Object-based and scalable video coding* – Why object-based? motion segmentation, shape coding, R-D optimization – scalability issues Spatial/temporal/quality scalabilities EE569 Digital Video Processing 1
  • 2. Object-based Video Coding Waveform-based coding discussed so far uses a simple source model (e.g., H.261/263/264, MPEG-1/-2) – Does not consider the semantic content (e.g., objects and their shapes) of the video Object-based video coding identifies objects (or regions) in a video and encodes them. Potential benefits include – Improved coding efficiency – Improved visual quality (e.g., no blocking artifacts) – Content description – Content-based interactivity Also called “content-dependent video coding” – The buzzword for MPEG-4, but less successful than expected (so the important question is why it does not work so well)
  • 3. Essential Tasks in Object-based Video Coding Object/region segmentation – Separate pixels based on their color, texture, and motion characteristics – Closely related to motion detection and segmentation – Intrinsically ill-defined and still awaiting a breakthrough 2D shape modeling and coding – Not all shapes are equally probable – Subtle implications for video coding (hidden pitfalls) 2D texture modeling and coding – Extension of existing block-based MCP to region-based MCP – Deformable textures (tradeoff between spatial and temporal prediction)
  • 4. Object/Region Segmentation The major challenge in content/object-based coding Common approaches for segmentation in a still image: gray-level thresholding, clustering, edge detection, region growing, splitting and merging Object segmentation in video – Motion information can be utilized, but how? – Should we place more trust in motion cues or in spatial cues?
  • 5. Motion-based Segmentation Motion-based segmentation: segmenting an image using motion information – We can first estimate the motion field and then segment the motion field – However, estimation and segmentation are two sides of the same coin: each depends on the other
  • 6. A Mind-bothering Example (Frame 1, Frame 2) It is easy to convince yourself that the tree branches are moving, but how do we know the sky is still? What if it were also moving at the same speed? Because the sky is a smooth region, we would observe the same intensity patterns either way.
  • 7. Implications for Video Coding True motion representation might be useful for computer vision and motion perception, but it is not indispensable in video coding The fundamental reason lies in the relationship between motion representation and video coding: how to tolerate the uncertainty in motion? The same issue remains in object-based image coding: how to tolerate the uncertainty in shape? (we will discuss this in more detail later)
  • 8. Simplified Segmentation: Change Detection To detect the changing parts in a video from time $t_i$ to time $t_j$, we compute a difference image and threshold the difference by $T$: $d_{ij}(x,y) = 1$ if $|f(x,y,t_i) - f(x,y,t_j)| > T$, and $d_{ij}(x,y) = 0$ otherwise, where $f(x,y,t_i)$ and $f(x,y,t_j)$ are the two frames. $d_{ij}(x,y)$ can be further processed, e.g., to remove isolated 1’s or to group 1’s that are close to each other
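The thresholded frame difference on this slide can be sketched in a few lines. This is a minimal illustration in plain Python, assuming frames are lists of rows of gray levels; the function names `change_mask` and `remove_isolated`, the 8-neighbour clean-up rule, and the sample frames are assumptions made for the example, not part of the slides.

```python
def change_mask(frame_i, frame_j, threshold):
    """Return d_ij: 1 where |f(x,y,t_i) - f(x,y,t_j)| > threshold, else 0."""
    return [
        [1 if abs(a - b) > threshold else 0 for a, b in zip(row_i, row_j)]
        for row_i, row_j in zip(frame_i, frame_j)
    ]


def remove_isolated(mask):
    """Clear every 1 that has no other 1 among its 8 neighbours."""
    height, width = len(mask), len(mask[0])
    cleaned = [row[:] for row in mask]
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1:
                continue
            neighbours = sum(
                mask[yy][xx]
                for yy in range(max(0, y - 1), min(height, y + 2))
                for xx in range(max(0, x - 1), min(width, x + 2))
                if (yy, xx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0
    return cleaned


# Hypothetical 3x4 frames: a two-pixel object appears, plus one noisy pixel.
frame_1 = [[10, 10, 10, 10],
           [10, 10, 10, 10],
           [10, 10, 10, 10]]
frame_2 = [[10, 10, 10, 90],
           [80, 85, 10, 10],
           [10, 10, 10, 10]]

mask = change_mask(frame_1, frame_2, threshold=20)
cleaned = remove_isolated(mask)
```

The isolated detection in the top-right corner is dropped by the clean-up pass, while the two adjacent detections survive.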
  • 9. Change Detection: Pros and Cons Simple to implement; fast; detects all changes; detects even unwanted changes; positive and negative changes detected (occlusion); difficult to quantify motion; requires a static reference frame
  • 10. Change Detection: An Example Monitor the traffic
  • 11. Without a Static Reference Frame Background extraction methods – Ad-hoc median detector (your CA#6) – To eliminate the impact of (small) moving objects, use the “robust estimator” approach to iteratively remove the outliers – More sophisticated approaches model the background with a mixture of Gaussian distributions and use graph-cut based optimization
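One plausible reading of the median detector is a per-pixel temporal median over a stack of frames; that reading is an assumption, since the slide gives no details. The sketch below uses plain Python lists, and the name `median_background` and the sample frames are hypothetical.

```python
from statistics import median


def median_background(frames):
    """Estimate the background as the per-pixel median over time."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical 1x4 frames: a bright object (200) crosses a background of 50.
frames = [
    [[200, 50, 50, 50]],
    [[50, 200, 50, 50]],
    [[50, 50, 200, 50]],
    [[50, 50, 50, 200]],
    [[50, 50, 50, 50]],
]

background = median_background(frames)
```

Because the object covers each pixel in only one of the five frames, the median ignores it. This works only while an object occupies any given pixel for less than half of the frames.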
  • 12. Simplified Segmentation: Global Motion Estimation Planar homography (feature-based) – Homogeneous coordinates – Conditions for planar homography – Homography estimation from feature correspondence Hierarchical model-based GME (feature-less) – Directly minimize an energy function (the MSE of MCP errors) – Solve the optimization problem in a coarse-to-fine fashion (more robust and efficient)
  • 13. Plane Homography
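In homogeneous coordinates, a planar homography maps a point by multiplying with a 3×3 matrix and then dividing by the third coordinate. The sketch below shows only that mapping step, not the estimation from feature correspondences; the function name `apply_homography` and both example matrices are illustrative assumptions.

```python
def apply_homography(H, x, y):
    """Map the point (x, y) through the 3x3 homography H (list of rows)."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Perspective divide returns to inhomogeneous image coordinates.
    return xh / w, yh / w


# A pure translation by (2, 3) is a special case of a homography.
translation = [[1, 0, 2],
               [0, 1, 3],
               [0, 0, 1]]

# A non-zero bottom row makes the mapping genuinely projective.
projective = [[1, 0, 0],
              [0, 1, 0],
              [0.5, 0, 1]]

moved = apply_homography(translation, 1, 1)
warped = apply_homography(projective, 2, 4)
```

For the projective example the third coordinate is 0.5·2 + 1 = 2, so the point (2, 4) lands at (1, 2).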
  • 14. Model-based GME Target function for minimization; solution: Gauss-Newton method. Bergen, J. R., Anandan, P., Hanna, K. J., and Hingorani, R., “Hierarchical Model-Based Motion Estimation,” in Proc. of the Second European Conference on Computer Vision, pp. 237-252, 1992
  • 15. Multi-resolution GME
  • 16. Numerical Example
  • 17. Summary for Change Detection and Global Motion Estimation Motion segmentation becomes relatively easier to solve when either the camera is still or the background objects lie on a plane Recent advances include joint motion segmentation and estimation using level-set methods (PDE-based formulation) Mansouri, A.-R. and Konrad, J., “Multiple motion segmentation with level sets,” IEEE Transactions on Image Processing, vol. 12, no. 2, pp. 201-220, Feb. 2003
  • 18. 2-D Shape Modeling and Coding Bitmap coding: a binary map specifying whether or not a pixel belongs to an object – A special case of the general alpha-map Contour coding: code only the contour of the object or the region – Chain codes – Polygon approximation – Spline approximation
  • 19. Image Matting (Soft segmentation) $X(i,j) = \alpha(i,j)\,F(i,j) + [1 - \alpha(i,j)]\,B(i,j)$, $0 \le \alpha(i,j) \le 1$ (F: foreground, B: background, $\alpha$: alpha map) Not for coding but for interactive editing
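The matting equation blends a foreground and a background value per pixel, weighted by alpha. A minimal per-pixel sketch follows; the function name `composite` and the sample values are assumptions for illustration.

```python
def composite(alpha, foreground, background):
    """Blend one pixel: X = alpha * F + (1 - alpha) * B, with 0 <= alpha <= 1."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * foreground + (1.0 - alpha) * background


# Hypothetical gray levels: foreground 200, background 100.
soft_edge = composite(0.25, 200, 100)   # mostly background
pure_fg = composite(1.0, 200, 100)      # binary alpha-map case: object pixel
pure_bg = composite(0.0, 200, 100)      # binary alpha-map case: background pixel
```

With alpha restricted to 0 or 1 this reduces to the binary bitmap of the shape-coding slide; fractional values give the soft boundary used for editing.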
  • 20. 2-D Texture Modeling and Coding* Shape-adaptive DCT Shape-adaptive wavelet transform
  • 21. Roadmap Introduction Intra-frame coding – Review of JPEG Inter-frame coding – Conditional Replenishment (CR) – Motion Compensated Prediction (MCP) Scalable video coding – 3D subband/wavelet coding and recent trend
  • 22. Scalable vs. Multicast What is scalable coding? Multicast: foreman.yuv is encoded into separate streams foreman128k.cod, foreman256k.cod, foreman512k.cod and foreman1024k.cod. Scalable coding: foreman.yuv is encoded once into foreman.cod, from which the 128, 256, 512 and 1024 rates are all served.
  • 23. Spatial scalability [Figure: a single bitstream 1 0 1 1 1 … 0 1 0 1 0 0 0 … 1 1 0 1 0 0 decoded at three successive truncation points]
  • 24. Temporal scalability [Figure: a single bitstream 1 0 1 1 1 … 0 1 0 1 0 0 0 … 1 1 0 1 0 0 decoded at three successive truncation points: frames 0, 4, 8, 12, … (7.5 Hz); frames 0, 2, 4, 6, 8, … (15 Hz); frames 0, 1, 2, 3, 4, 5, … (30 Hz)]
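The frame lists on this slide follow a dyadic pattern: every fourth frame forms the 7.5 Hz layer, the remaining even frames raise the rate to 15 Hz, and the odd frames complete 30 Hz. A small sketch of that assignment, with the function names `temporal_layer` and `frames_at` chosen here for illustration:

```python
def temporal_layer(frame_index):
    """Return the temporal layer (0 = base) that first carries this frame."""
    if frame_index % 4 == 0:
        return 0          # frames 0, 4, 8, ...  -> 7.5 Hz
    if frame_index % 2 == 0:
        return 1          # frames 2, 6, 10, ... -> 15 Hz together with layer 0
    return 2              # odd frames           -> 30 Hz together with layers 0-1


def frames_at(n_frames, max_layer):
    """List the frame indices decodable when layers 0..max_layer are received."""
    return [i for i in range(n_frames) if temporal_layer(i) <= max_layer]


base = frames_at(9, 0)
middle = frames_at(9, 1)
full = frames_at(6, 2)
```

Dropping the highest layer halves the frame rate without touching the frames that remain.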
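  • 25. SNR (Rate) scalability [Figure: a single bitstream 1 0 1 1 1 … 0 1 0 1 0 0 0 … 1 1 0 1 0 0 decoded at three successive truncation points: PSNRavg = 30 dB, 35 dB, 40 dB] $\mathrm{PSNR}_{avg} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{PSNR}_i$, where $\mathrm{PSNR}_i$ is the PSNR of frame $i$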
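The average on this slide is taken over per-frame PSNR values, not over the pooled MSE. The sketch below assumes 8-bit frames (peak value 255) stored as lists of rows; the function names and sample values are illustrative.

```python
import math


def psnr(reference, decoded, peak=255):
    """PSNR in dB of one frame against its reference (lists of rows)."""
    n_pixels = sum(len(row) for row in reference)
    squared_error = sum(
        (a - b) ** 2
        for row_r, row_d in zip(reference, decoded)
        for a, b in zip(row_r, row_d)
    )
    mse = squared_error / n_pixels
    if mse == 0:
        return math.inf   # identical frames
    return 10 * math.log10(peak ** 2 / mse)


def average_psnr(per_frame_psnr):
    """PSNR_avg = (1/N) * sum of PSNR_i over the N frames."""
    return sum(per_frame_psnr) / len(per_frame_psnr)


worst_case = psnr([[255]], [[0]])          # error equals the peak value
sequence_avg = average_psnr([30, 35, 40])  # hypothetical per-frame values
```

A single losslessly decoded frame makes the average infinite, which is one reason some reports average the MSE first instead.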
  • 26. Scalability via Bit-Plane Coding $A = \pm(a_0 + a_1 2 + a_2 2^2 + \dots + a_7 2^7)$, with a sign bit, $a_0$ the Least Significant Bit (LSB) and $a_7$ the Most Significant Bit (MSB) Example: sign = +, $a_0 a_1 a_2 \dots a_7$ = 10000001 gives $A = 1 + 128 = 129$; sign = -, $a_0 a_1 a_2 \dots a_7$ = 00110011 gives $A = -(4+8+64+128) = -204$
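The two examples on this slide can be reproduced with a sign-magnitude bit-plane split. In the sketch below the bits are listed with a0 (LSB) first, as on the slide; the function names and the `keep_msb_planes` parameter, which imitates decoding only the most significant planes of a truncated stream, are assumptions for illustration.

```python
def to_bit_planes(value, n_bits=8):
    """Split an integer into a sign and its magnitude bits, a0 (LSB) first."""
    sign = "-" if value < 0 else "+"
    magnitude = abs(value)
    bits = [(magnitude >> k) & 1 for k in range(n_bits)]
    return sign, bits


def from_bit_planes(sign, bits, keep_msb_planes=None):
    """Rebuild the value, optionally using only the top `keep_msb_planes` bits."""
    n_bits = len(bits)
    keep = n_bits if keep_msb_planes is None else keep_msb_planes
    lowest_kept = n_bits - keep
    magnitude = sum(bit << k for k, bit in enumerate(bits) if k >= lowest_kept)
    return -magnitude if sign == "-" else magnitude


sign_a, bits_a = to_bit_planes(129)
sign_b, bits_b = to_bit_planes(-204)
exact = from_bit_planes(sign_b, bits_b)
coarse = from_bit_planes(sign_b, bits_b, keep_msb_planes=2)
```

Keeping only the two most significant planes of -204 yields -(64 + 128) = -192, a coarser approximation of the same coefficient. Sending planes from MSB to LSB is what makes the stream embedded.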
  • 27. Why Is DPCM Bad for Scalability? Frame number: 1, 2, 3, … Base layer: Ibase P P P Enhancement layer 1: Ienh1 P P P Enhancement layer 2: Ienh2 P P P Layered DPCM suffers from the drifting problem or from coding efficiency loss
  • 28. Fine Granular Scalability (FGS) [Figure: H.264 with/without the FGS option on the Foreman sequence (5 fps): base layer at 20 kbps, variable bit-rate enhancement layer, efficiency gap of ~2 dB]
  • 29. 3D Wavelet/Subband Coding [Figure: video volume with axes x, y, t] 2D spatial WT + 1D temporal WT
  • 30. Wavelet Video Coder [Block diagram: original video frames → temporal wavelet transform → spatial wavelet transform → embedded quantization & entropy coding; temporal subbands 0 LLL, 1 LLH, 2 LH, 3 LH, 4 to 7 H] [Taubman & Zakhor, 1994] [Ohm, 1994] [Choi & Woods, 1999] [Hsiang & Woods, VCIP ’99] . . . and others
  • 31. Motion-Adaptive 3D Wavelet Transform Recall the Haar transform: $d(n) = x(2n) - x(2n+1)$, $s(n) = \frac{1}{2}\,[x(2n) + x(2n+1)]$ Lifting-based implementation: $d(n) = x(2n) - x(2n+1)$, $s(n) = x(2n+1) + \frac{1}{2}\,d(n)$ Motion-adaptive Haar transform: $d_n = f_{2n} - W[f_{2n+1}]$, $s_n = f_{2n+1} + \frac{1}{2}\,W^{-1}[d_n]$ $W$, $W^{-1}$: forward and backward motion vector
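A lifting implementation of the Haar transform computes the difference first (predict step) and then derives the average from that difference (update step), which makes the inverse trivial. The transcript lost the operators in the slide's formulas, so the sign and index convention below, d(n) = x(2n) - x(2n+1) and s(n) = x(2n+1) + d(n)/2, is an assumption; the function names are also illustrative. Motion compensation is left out: this is the plain 1-D case.

```python
def haar_lifting_forward(x):
    """One Haar level on an even-length sequence via predict/update steps."""
    half = len(x) // 2
    # Predict: high band is the difference within each pair.
    d = [x[2 * n] - x[2 * n + 1] for n in range(half)]
    # Update: low band is the pair average, built from the difference.
    s = [x[2 * n + 1] + d[n] / 2 for n in range(half)]
    return s, d


def haar_lifting_inverse(s, d):
    """Undo the lifting steps in reverse order."""
    x = []
    for s_n, d_n in zip(s, d):
        odd = s_n - d_n / 2      # undo the update step
        even = d_n + odd         # undo the predict step
        x.extend([even, odd])
    return x


signal = [4, 2, 10, 6]
low, high = haar_lifting_forward(signal)
restored = haar_lifting_inverse(low, high)
```

Each inverse step subtracts exactly what the forward step added, so reconstruction stays exact even if the predict step is replaced by a motion-compensated (and possibly non-invertible) warp, which is the point of the motion-adaptive variant.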
  • 32. Lifting Analysis: even frames → low band (gain $G_0$), odd frames → high band (gain $G_1$), via predict (P) and update (U) steps with motion compensation Synthesis: low band (gain $G_0^{-1}$) → even frames, high band (gain $G_1^{-1}$) → odd frames, via the same P and U steps [Secker & Taubman, 2001] [Popescu & Bottreau, 2001]
  • 33. MC Wavelet Coding vs. H.264/AVC [Figure: luminance PSNR (dB) vs. bit-rate (Mbps), sequence Mobile CIF; curves: scalable MC 5/3 wavelet and non-scalable H.264/AVC. H.264/AVC settings: high-complexity RD control, CABAC, PBBPBBP . . ., 5 prev/3 future reference frames; data courtesy of M. Flierl] [Taubman & Secker, VCIP 2003], courtesy D. Taubman
  • 34. Wavelet Synthesis with Lossy Motion Vector [Block diagram: video in → MC wavelet transform → embedded encoding → decoder → inverse MC wavelet transform → video out; the motion estimator output has its own embedded encoding and decoder; each embedded encoder minimizes $J = D + \lambda R$] [Taubman & Secker, ICIP03]
  • 35. R-D Performance with Lossy Motion Vector [Figure: video PSNR (dB) vs. bit-rate (kbps) for CIF Foreman; curves: non-embedded single-rate; embedded wavelet coefficients with lossless motion; embedded wavelet coefficients with lossy motion] [Taubman & Secker, VCIP 2003], courtesy D. Taubman
  • 36. Surprising Success of ITU-T Rec. H.263 What H.263 was developed for: analog videophone . . . and what it was used for: Internet video streaming
  • 37. What is Streaming Video? • Download mode: no delay bound • Streaming mode: delay bound [Diagram labels: Source (cnn.com), data path over the Internet through Domains A, B and C and access switches (Access SW), Receiver 1, Receiver 2, RealPlayer]
  • 38. Outline • Challenges for quality video transport • An architecture for video streaming – Video compression – Application-layer QoS control – Continuous media distribution services – Streaming server – Media synchronization mechanisms – Protocols for streaming media • Summary
  • 39. Time-varying Available Bandwidth No bandwidth reservation [Diagram labels: Source (cnn.com) sending at 56 kb/s, data path through Domains A and B and access switches (Access SW), Receiver (RealPlayer); the available rate R is marked R>=56 kb/s on one segment and R<56 kb/s on another]
  • 40. Time-varying Delay Delayed packets regarded as lost [Diagram labels: Source (cnn.com), 56 kb/s, data path through Domains A and B and access switches (Access SW), Receiver (RealPlayer)]
  • 41. Effect of Packet Loss No retransmission [Diagram labels: Source, data path through Domains A and B and access switches (Access SW), Receiver; panels: no packet loss, loss of packets]
  • 42. Unicast vs. Multicast [Diagram panels: Unicast, Multicast] Pros and cons?
  • 43. Heterogeneity For Multicast • Network heterogeneity • Receiver heterogeneity [Diagram: the Source sends through Domains A, B and C (access switches, Internet, gateway) to Receiver 1 (Ethernet, 1 Mb/s), Receiver 2 (256 kb/s) and Receiver 3 (telephone networks, 64 kb/s). What quality should each receive?]
  • 44. Outline • Challenges for quality video transport • An architecture for video streaming – Video compression – Application-layer QoS control – Continuous media distribution services – Streaming server – Media synchronization mechanisms – Protocols for streaming media • Summary
  • 45. Architecture for Video Streaming
  • 46. Video Compression [Figure: Layer 0 alone is decoded at 64 kb/s; adding Layer 1 gives 256 kb/s; adding Layer 2 gives 1 Mb/s] Layered video encoding/decoding. D denotes the decoder.
  • 47. Application of Layered Video IP multicast [Diagram: same network as slide 43; Receiver 1 (Ethernet) receives 1 Mb/s, Receiver 2 receives 256 kb/s, Receiver 3 (telephone networks) receives 64 kb/s]
  • 48. Application-layer QoS Control Congestion control (using rate control): – Source-based, requires rate-adaptive compression or rate shaping – Receiver-based – Hybrid Error control: – Forward error correction (FEC) – Retransmission – Error resilient compression – Error concealment
  • 49. Congestion Control • Window-based vs. rate control (pros and cons?) [Figure panels: window-based control, rate control]
  • 50. Source-based Rate Control
  • 51. Video Multicast • How to extend source-based rate control to multicast? • Limitation of source-based rate control in multicast • Trade-off between bandwidth efficiency and service flexibility
  • 52. Receiver-based Rate Control IP multicast for layered video [Diagram: same network as slide 47; Receiver 1 (Ethernet) at 1 Mb/s, Receiver 2 at 256 kb/s, Receiver 3 (telephone networks) at 64 kb/s]
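With receiver-based rate control, each receiver decides how many layers to join based on its own available bandwidth. The sketch below assumes that the 64 kb/s, 256 kb/s and 1 Mb/s figures from the layered-video slides are cumulative rates (layer 0 alone, layers 0-1, layers 0-2); that interpretation, the constant `CUMULATIVE_RATE_KBPS` and the function name are assumptions made for illustration.

```python
# Assumed cumulative rates in kb/s when decoding layers 0, 0-1 and 0-2.
CUMULATIVE_RATE_KBPS = [64, 256, 1000]


def layers_to_subscribe(available_kbps, cumulative=CUMULATIVE_RATE_KBPS):
    """Return how many layers (starting from the base) fit the bandwidth."""
    count = 0
    for rate in cumulative:
        if rate > available_kbps:
            break             # this layer and all higher ones do not fit
        count += 1
    return count


receiver_1 = layers_to_subscribe(1000)  # Ethernet receiver
receiver_2 = layers_to_subscribe(256)
receiver_3 = layers_to_subscribe(64)    # telephone-network receiver
```

The source sends every layer once; heterogeneity is handled entirely by which layers each receiver joins, so no per-receiver encoding is needed.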
  • 53. Error Control • FEC – Channel coding – Source coding-based FEC – Joint source/channel coding • Delay-constrained retransmission • Error resilient compression • Error concealment
  • 54. Channel Coding
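The transcript does not say which channel code the slide depicts. As a stand-in, the sketch below uses the simplest erasure code, one XOR parity packet per group of equal-length packets, which can rebuild any single lost packet without retransmission. The code choice and all function names are assumptions for illustration.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Build one parity packet as the XOR of all packets in the group."""
    parity = bytes(len(packets[0]))
    for packet in packets:
        parity = xor_bytes(parity, packet)
    return parity


def recover_missing(received, parity):
    """Rebuild the single lost packet (marked None) from the rest plus parity."""
    missing = received.index(None)
    rebuilt = parity
    for index, packet in enumerate(received):
        if index != missing:
            rebuilt = xor_bytes(rebuilt, packet)
    return missing, rebuilt


group = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(group)

# Simulate losing the second packet in transit.
position, packet = recover_missing([b"abcd", None, b"ijkl"], parity)
```

The cost is one extra packet per group and a delay of one group length before recovery. Two losses in the same group cannot be repaired with this scheme.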
  • 55. Delay-constrained Retransmission
  • 56. Outline • Challenges for quality video transport • An architecture for video streaming – Video compression – Application-layer QoS control – Continuous media distribution services – Streaming server – Media synchronization mechanisms – Protocols for streaming media • Summary
  • 58. Continuous Media Distribution Services • Content replication (caching & mirroring) • Network filtering/shaping/thinning • Application-level multicast (overlay networks)
  • 59. Caching • What is caching? • Why use caching? Does WWW mean World Wide Wait? • Pros and cons?
  • 60. Outline • Challenges for quality video transport • An architecture for video streaming – Video compression – Application-layer QoS control – Continuous media distribution services – Streaming server – Media synchronization mechanisms – Protocols for streaming media • Summary
  • 61. Streaming Server • Different from a web server – Timing constraints – Video-cassette-recorder (VCR) functions (e.g., fast forward/backward, random access, and pause/resume) • Design of streaming servers – Real-time operating system – Special disk scheduling schemes
  • 62. Media Synchronization • Why media synchronization? • Example: lip-synchronization (video/audio)
  • 63. Protocols for Streaming Video • Network-layer protocol: Internet Protocol (IP) • Transport protocol: – Lower layer: UDP & TCP – Upper layer: Real-time Transport Protocol (RTP) & RTP Control Protocol (RTCP) • Session control protocol: – Real-Time Streaming Protocol (RTSP): RealPlayer – Session Initiation Protocol (SIP): Microsoft Windows MediaPlayer; Internet telephony
  • 64. Protocol Stacks
  • 65. Summary • Challenges for quality video transport – Time-varying available bandwidth – Time-varying delay – Packet loss • An architecture for video streaming – Video compression – Application-layer QoS control – Continuous media distribution services – Streaming server – Media synchronization mechanisms – Protocols for streaming media