Basics of MPEG
• Picture sizes: up to 4095 x 4095
• Most algorithms are for the CCIR 601 format for
video frames
• Y-Cb-Cr color space
• NTSC: 525 lines per frame at about 30 frames/s (60 interlaced fields/s), 720 x 480 pixel
luminance frame, 360 x 480 pixel chrominance frame
• PAL: 625 lines per frame at 25 frames/s (50 interlaced fields/s), 720 x 576 pixel
luminance frame, 360 x 576 pixel chrominance frame
• SIF (source input format) for digital TV
• Luminance resolution: 360 x 240 pixels at 30 fps or 360
x 288 pixels at 25 fps
• Chrominance resolution: half the luminance resolution
in both dimensions
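The frame sizes above imply a raw data rate that can be worked out directly. The sketch below assumes 8 bits per sample and two chrominance planes (Cb and Cr); the function name `raw_bitrate` is made up for illustration and is not part of any standard.

```python
def raw_bitrate(luma_w, luma_h, chroma_w, chroma_h, fps, bits_per_sample=8):
    """Uncompressed bit rate in bits/s for one luminance and two chrominance planes.

    bits_per_sample=8 is an assumption made for this example.
    """
    samples_per_frame = luma_w * luma_h + 2 * chroma_w * chroma_h
    return samples_per_frame * bits_per_sample * fps


# SIF at 30 fps: chrominance is half the luminance resolution in both dimensions.
sif = raw_bitrate(360, 240, 180, 120, fps=30)
print(f"SIF raw rate: {sif / 1e6:.1f} Mbits/s")
```

Under these assumptions the SIF source comes to roughly 31 Mbits/s before any compression.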
Detour: Motion Vectors with Subpixel Accuracy
• Find motion vector (u,v) with integer pixel accuracy
• Let the MAE (mean absolute error) at (u,v) be m0
• Compute the MAE at its 4-neighbor pixels (m1 .. m4)
• Horizontal pixels
• Model with the function p(i)=a|i-b|+c
• If 2(m3 – m0) < (m4 – m0), the i coordinate is to the left of
the center
• If (m3 – m0) > 2(m4 – m0), the i coordinate is to the right
of the center
• Otherwise it is along the center line
• Similarly for the vertical direction
[Figure: center pixel 0 and its 4-neighbor pixels 1 to 4]
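The half-pixel decision rule can be sketched in a few lines. This is a minimal sketch, assuming m3 is the MAE at the left neighbor and m4 the MAE at the right neighbor (the slide does not say which is which, but the rule is consistent with that reading). With the V-shaped model p(i)=a|i-b|+c, the two tests amount to checking whether the minimum b lies more than a quarter pixel away from the center. The names `half_pel_offset` and `refine` are hypothetical.

```python
def half_pel_offset(m0, m_low, m_high):
    """Return -0.5, 0.0 or +0.5 along one axis.

    m0     : MAE at the integer-pixel best match
    m_low  : MAE at the neighbor on the negative side (assumed to be m3, the left pixel)
    m_high : MAE at the neighbor on the positive side (assumed to be m4, the right pixel)
    """
    d_low = m_low - m0
    d_high = m_high - m0
    if 2 * d_low < d_high:
        return -0.5  # minimum lies toward the negative side
    if d_low > 2 * d_high:
        return 0.5   # minimum lies toward the positive side
    return 0.0       # stay on the center line


def refine(u, v, m0, m_left, m_right, m_up, m_down):
    """Apply the same rule horizontally and vertically to an integer vector (u, v)."""
    return (u + half_pel_offset(m0, m_left, m_right),
            v + half_pel_offset(m0, m_up, m_down))


print(refine(3, -2, 10, 12, 20, 15, 15))
```

In the usage line, the left neighbor is almost as good as the center while the right one is much worse, so the horizontal component moves half a pixel to the left and the vertical component stays put.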
Basics of MPEG
• Types of pictures
• I (intra) frame
• compressed using only intraframe coding
• Moderate compression but faster random access
• P (predicted) frame
• Coded with motion compensation using past I frames or P frames
• Can be used as reference pictures for additional motion
compensation
• B (bidirectional) frame
• Coded with motion compensation from past and/or future I or P frames
• D (DC) frame
• Limited use: encodes only DC components of intraframe coding
MPEG: Video Encoding
• The MPEG standards
• do not define an encoding process
• define syntax of the coded stream
• define a decoding process
MPEG: Video Encoding
[Figure: encoder block diagram. Blocks: preprocessing, frame memory, DCT, quantizer (Q), VLC encoder, buffer, regulator, inverse quantizer (Q-1), IDCT, frame memory, motion compensation, motion estimation. Signals: input, output, predictive frame, motion vectors]
MPEG: Video Encoding
• Some highlights
• Interframe predictive coding (P-pictures)
• For each macroblock the motion estimator produces the best
matching macroblock
• The two macroblocks are subtracted and the difference is DCT
coded
• Interframe interpolative coding (B-pictures)
• The motion vector estimation is performed twice, once against a past and once against a future reference picture
• The encoder forms a prediction error macroblock from either prediction or from their average
• The prediction error is encoded using a block-based DCT
• The encoder needs to reorder pictures because a B-frame can only be coded after the future reference picture it depends on
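The reordering step can be illustrated with a small sketch. It assumes a simple rule: every B picture is held back until the next I or P picture (its future reference) has been emitted. The function name `display_to_coding_order` is hypothetical, and real encoders may handle the end of a sequence differently.

```python
def display_to_coding_order(types):
    """Map a display-order sequence of picture types to coding-order indices.

    types: iterable of "I", "P" or "B" in display order.
    Returns the display indices in the order they would be coded.
    """
    order = []
    pending_b = []  # B pictures waiting for their future reference
    for idx, t in enumerate(types):
        if t == "B":
            pending_b.append(idx)
        else:
            order.append(idx)        # reference picture goes first
            order.extend(pending_b)  # then the B pictures that needed it
            pending_b = []
    order.extend(pending_b)  # trailing B pictures with no later reference
    return order


print(display_to_coding_order("IBBPBBP"))
```

For the display order I B B P B B P, the P picture at index 3 is coded before the two B pictures at indices 1 and 2, and likewise for the second group.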
MPEG: Structure of the Coded Bit-Stream
• Sequence layer: picture dimensions, pixel aspect ratio, picture rate, minimum buffer size, DCT quantization matrices
• GOP layer: contains at least one I picture, starts with an I or B picture, ends with an I or P picture, has closed GOP flag, timing info, user data
• Picture layer: temporal reference number, picture type, synchronization info, resolution, range of motion vectors
• Slices: position of slice in picture, quantization scale factor
• Macroblock: position, horizontal and vertical motion vectors, which blocks are coded and transmitted
[Figure: layer hierarchy. Sequence layer: GOP-1, GOP-2, ..., GOP-n. GOP layer: I B B B P B B ... Picture layer: Slice-1, Slice-2, ..., Slice-N. Slice layer: mb-1, mb-2, ..., mb-n. Macroblock layer: blocks 0 to 5. 8x8 block]
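The nesting of the layers can be mirrored as a small data model. This is only a sketch for orientation: the class and field names are invented here, only a few of the fields listed for each layer are included, and nothing below reflects the actual bit-level syntax.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Macroblock:
    position: int
    motion_vector: Tuple[int, int] = (0, 0)  # (horizontal, vertical)
    # Which of the six 8x8 blocks (0..5) are coded and transmitted.
    coded_blocks: List[bool] = field(default_factory=lambda: [True] * 6)


@dataclass
class Slice:
    position: int
    quant_scale: int
    macroblocks: List[Macroblock] = field(default_factory=list)


@dataclass
class Picture:
    temporal_ref: int
    picture_type: str  # "I", "P", "B" or "D"
    slices: List[Slice] = field(default_factory=list)


@dataclass
class GOP:
    closed: bool
    pictures: List[Picture] = field(default_factory=list)


@dataclass
class Sequence:
    width: int
    height: int
    picture_rate: float
    gops: List[GOP] = field(default_factory=list)


def count_macroblocks(seq):
    """Walk sequence -> GOP -> picture -> slice and count macroblocks."""
    return sum(len(s.macroblocks)
               for g in seq.gops for p in g.pictures for s in p.slices)


mbs = [Macroblock(position=0), Macroblock(position=1, motion_vector=(2, -1))]
seq = Sequence(360, 240, 30.0,
               [GOP(True, [Picture(0, "I", [Slice(0, 8, mbs)])])])
print(count_macroblocks(seq))
```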
MPEG: Macroblock Coding
[Figure: macroblock type decision tree, by picture type]
• I picture: change MQUANT, or no change to MQUANT
• P picture
• Intraframe: change MQUANT, or no change to MQUANT
• Motion compensation: A
• Motion vector set to 0: A
• B picture
• Forward motion compensation: A
• Backward motion compensation: A
• Interpolated compensation: A
• A (interframe): coded (change MQUANT, or no change to MQUANT) or not coded
• MQUANT = scale factor q
• Quantization: $QDCT[i,j] = \frac{8\,DCT[i,j]}{q\,Q[i,j]}$, where Q is the quantization matrix
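The quantization relation on this slide, read as QDCT[i,j] = 8 DCT[i,j] / (q Q[i,j]), can be tried out on a small block. The sketch below is an illustration only: rounding to the nearest integer is an assumption (the slide does not state how the quotient is rounded), and `quantize_block` is a made-up name.

```python
def quantize_block(dct, q_matrix, q):
    """Quantize a block of DCT coefficients.

    dct      : 2-D list of DCT coefficients
    q_matrix : 2-D list with the quantization matrix Q, same shape as dct
    q        : scale factor (MQUANT)
    Rounding with round() is an assumption made for this sketch.
    """
    return [[round(8 * dct[i][j] / (q * q_matrix[i][j]))
             for j in range(len(dct[i]))]
            for i in range(len(dct))]


# Tiny 2x2 example; a real block would be 8x8.
coeffs = [[100, -48], [0, 32]]
Q = [[16, 16], [16, 16]]
print(quantize_block(coeffs, Q, q=2))
```

Doubling q halves every quantized value, which is how the scale factor trades quality against bit rate.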
MPEG-2
• Why another standard?
• Support higher bit rates e.g., 80-100 Mbits/s for HDTV
instead of the 1.15 Mbits/s for SIF
• Support a larger number of applications
• The encoding standard should be a toolkit rather than a
flat procedure
• Interlaced and non-interlaced frame
• Different color subsampling modes e.g., 4:2:2, 4:2:0, 4:4:4
• Flexible quantization schemes – can be changed at picture level
• Scalable bit-streams
• Profiles and levels
MPEG-2: Effects of Interlacing
• Fields or frame pictures can be encoded
• Prediction Modes and Motion Compensation
• Frame prediction: current frame predicted from previous frame
• Field prediction:
• Top and bottom fields of the reference frame predict the first field
• Bottom field of previous frame and top field of current frame predict
the bottom field of current frame
• 16 X 8 motion compensation mode
• A macroblock may have two motion vectors
• A B picture macroblock may have four!
• Dual prime motion compensation
• Top field of current frame is predicted from two motion vectors
coming from the top and bottom field of reference frame
• Works for P pictures
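Field-based prediction operates on the two fields of an interlaced frame separately. As a minimal sketch, a frame stored as a list of rows can be split into its fields by taking alternate lines; it assumes the top field holds the even-numbered rows (starting at row 0) and that the frame has an even number of rows. The function names are hypothetical.

```python
def split_fields(frame):
    """Split a frame (list of rows) into (top_field, bottom_field)."""
    top = frame[0::2]     # rows 0, 2, 4, ...
    bottom = frame[1::2]  # rows 1, 3, 5, ...
    return top, bottom


def merge_fields(top, bottom):
    """Interleave two equally sized fields back into a frame."""
    frame = []
    for t, b in zip(top, bottom):
        frame.append(t)
        frame.append(b)
    return frame


frame = [[0, 0], [1, 1], [2, 2], [3, 3]]
top, bottom = split_fields(frame)
print(top, bottom)
```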
MPEG-2: Profiles and Levels
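Levels (rows) and Profiles (columns); resolutions are width x height/frame rate, bit rates in Mbits/s
Level | Layer | SNR (4:2:0) | Spatial (4:2:0) | High (4:2:0; 4:2:2) | Multiview (4:2:0)
High | Enhancement | - | - | 1920 x 1152/60 | 1920 x 1152/60
High | Lower | - | - | 960 x 576/30 | 1920 x 1152/60
High | Bitrate | - | - | 100, 80, 25 | 130, 50, 80
High-1440 | Enhancement | - | 1440 x 1152/60 | 1440 x 1152/60 | 1920 x 1152/60
High-1440 | Lower | - | 720 x 576/30 | 720 x 576/30 | 1920 x 1152/60
High-1440 | Bitrate | - | 60, 40, 15 | 80, 60, 20 | 100, 40, 60
Main | Enhancement | 720 x 576/30 | - | 720 x 576/30 | 720 x 576/30
Main | Lower | - | - | 352 x 288/30 | 720 x 576/30
Main | Bitrate | 15, 10 | - | 20, 15, 4 | 25, 10, 15
Low | Enhancement | 352 x 288/30 | - | - | 352 x 288/30
Low | Lower | - | - | - | 352 x 288/30
Low | Bitrate | 4, 3 | - | - | 8, 4, 4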
MPEG-2 Applications
• Digital Betacam: 90 Mbits/s video
• MPEG-2
• Main Profile, Main Level, 4:2:0: 15 Mbits/s
• High Profile, High Level, 4:2:0: adequate, expensive
• Image quality preserved across generations of
processing
• Multiview Profile
• Stereoscopic view – disparity prediction
• Virtual walk-throughs composed from multiple viewpoints
