3. Outline
An example: MPEG-2 to MPEG-4
- Drift error analysis for spatial resolution reduction
- Novel drift compensation architectures and techniques
- Comparisons of complexity and quality
Conclusion / Future Work
5. Purpose of Transcoder
- Syntax conversion
  MPEG-2 to MPEG-4 to support mobile devices
  MPEG-2 Transport Stream to MPEG-2 Program Stream to support DVD
- Other conversions
  Video summarization for a compact representation of content; satisfies time
  constraints
  Color depth reduction, e.g., support for 4-bit PDA display
  Text summarization, e.g., compact viewing, including HTML-to-WML
  Multi-modal, e.g., text-to-speech, audio-driven animation model
6. Transcoding Example
Transcoding research focuses on efficient techniques to perform
such conversions
14. Application Environments
Transcoding is needed to fill the gaps between content, network, terminal,
and user
15. Overview of Video Transcoding Techniques
Conventional Approaches
- Full decoding, post-processing, full re-encoding
- Highest quality, but an expensive solution
- In most cases, requires a hardware-based solution
Low-Cost Approaches
- Target similar quality to the conventional approach, but with much lower
complexity
- Architectures that utilize compressed-domain processing can provide savings
- Low-cost solutions may be more flexible, as they also enable a software
solution
16. Bit Rate Reduction
Purpose
- Bandwidth savings for efficient transmission
- Compatibility with a certain profile/level, e.g., MPEG-4 Simple
Profile @ Level 0
Main Issues
- Drift compensation architecture
- Rate control algorithm
- Trade-off between quality and complexity
17. Bit Rate Reduction
Technical challenges
- Picture quality degradation: re-quantization error, drift
- Complexity reduction with partial decoding
Approaches
- Cutting high frequencies, requantization
- Open-loop and closed-loop architectures
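The two rate-reduction tools named above can be sketched on a single block. This is a minimal illustration, not the codec's actual quantizer: it assumes a plain uniform quantizer with no weighting matrix or dead zone, and the names `requantize`, `cut_high_frequencies`, and the sample levels are hypothetical.

```python
def requantize(levels, q_in, q_out):
    """Map quantized DCT levels from step q_in to a coarser step q_out.

    Assumption: uniform quantizer, rounding to the nearest level.
    """
    out = []
    for level in levels:
        coeff = level * q_in                      # inverse quantization
        mag = (abs(coeff) + q_out // 2) // q_out  # re-quantize the magnitude
        out.append(mag if coeff >= 0 else -mag)
    return out


def cut_high_frequencies(levels, keep):
    """Zero every coefficient after the first `keep` in zig-zag order."""
    return levels[:keep] + [0] * (len(levels) - keep)


# Hypothetical zig-zag-ordered levels of one block (shortened to 8 entries).
levels = [12, -7, 5, 3, -2, 1, 1, 0]
coarse = requantize(levels, q_in=4, q_out=10)
trimmed = cut_high_frequencies(levels, keep=4)
print(coarse)   # [5, -3, 2, 1, -1, 0, 0, 0]
print(trimmed)  # [12, -7, 5, 3, 0, 0, 0, 0]
```

Both operations shrink the number of non-zero levels to be entropy coded; the difference between the original and requantized coefficients is the re-quantization error listed under the technical challenges.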
19. Experimental Results
Comparison of open-loop and closed-loop architectures
- Original sequence encoded at 2 Mbps, N=30, M=3
- Transcoded to fixed QP=15 with both architectures; plot shows I/P frames
only
- Severe drift with open-loop
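The difference between the two architectures can be reproduced with a toy model. The sketch below is an assumption-laden illustration, not the experiment from the slide: each "frame" is a single number, the I frame is assumed to pass through unchanged, prediction is a plain copy of the previous frame, and `quantize` and `transcode` are hypothetical names.

```python
def quantize(value, step):
    """Uniform quantizer: round to the nearest multiple of `step`."""
    mag = (abs(value) + step // 2) // step * step
    return mag if value >= 0 else -mag


def transcode(residuals, intra, step, closed_loop):
    """Return per-frame drift: source reconstruction minus transcoded one."""
    x = y = intra            # I frame assumed to pass through unchanged
    drift = []
    for e in residuals:      # one P-frame residual per iteration
        x += e               # reconstruction at the original bit rate
        # Closed loop re-predicts from the transcoder's own reconstruction y;
        # open loop requantizes the incoming residual as-is.
        target = (x - y) if closed_loop else e
        y += quantize(target, step)
        drift.append(x - y)
    return drift


residuals = [3] * 10         # hypothetical: ten P frames, residual 3 each
open_drift = transcode(residuals, intra=100, step=8, closed_loop=False)
closed_drift = transcode(residuals, intra=100, step=8, closed_loop=True)
print(open_drift)    # error accumulates frame after frame
print(closed_drift)  # error stays within half a quantizer step
```

In this toy setting the open-loop error grows with every predicted frame until the next I frame, while the closed-loop error never exceeds half the quantizer step.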
20. Joint Transcoding
In many communication systems, it is desirable to distribute an
aggregate rate over multiple programs.
In the spatial domain, this is known as statistical multiplexing
(StatMux)
- Encode pictures in proportion to their encoding complexity
- Complexity is determined from the pixel domain
- Distribute bits to achieve minimum distortion across all programs
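A proportional split of the aggregate rate, as described above, can be sketched in a few lines. The complexity values here are hypothetical placeholders for whatever pixel-domain measure the multiplexer uses, and a proportional split is only a heuristic; it does not by itself guarantee minimum distortion.

```python
def allocate_bits(total_rate, complexities):
    """Split an aggregate rate across programs in proportion to complexity."""
    total = sum(complexities)
    return [total_rate * c / total for c in complexities]


# Hypothetical: three programs sharing 6 Mbps, relative complexities 1:2:3.
rates = allocate_bits(6_000_000, [1.0, 2.0, 3.0])
print(rates)
```

The allocated rates always sum to the aggregate rate, so more complex programs gain bits only at the expense of simpler ones.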
21. Joint Transcoding
If the programs are already encoded, joint transcoding
techniques can be used to minimize the distortion
- Extract normalized activity measures from original quantizer
scales
- Reassign target distributions
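One way to picture the reassignment step is shown below. The slides do not define the activity measure, so this sketch swaps in a stand-in: a Test Model 5 style complexity (bits spent times mean quantizer scale) read from the already-encoded pictures. The function names and the sample numbers are hypothetical.

```python
def program_complexity(pictures):
    """Sum of bits * mean quantizer scale over (bits, quantizer_scales) pairs.

    Stand-in measure (assumption): the slides do not specify the formula.
    """
    return sum(bits * sum(q) / len(q) for bits, q in pictures)


def reassign_targets(total_rate, programs):
    """Redistribute the aggregate rate in proportion to each program's measure."""
    x = [program_complexity(p) for p in programs]
    return [total_rate * xi / sum(x) for xi in x]


# Hypothetical input: two programs, two pictures each.
program_a = [(40_000, [8, 8]), (20_000, [10, 10])]
program_b = [(60_000, [16, 16]), (20_000, [16, 16])]
targets = reassign_targets(3_600_000, [program_a, program_b])
print(targets)
```

Because the measure comes from syntax elements already in the bitstream, no pixel-domain analysis is needed to compute it.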
23. Temporal Resolution Reduction
Purpose
- Bandwidth savings for efficient transmission
- Reduce number of frames/sec to meet processing requirements at
terminal
Main Issues
- Estimate new motion vectors based on incoming motion vectors
- Estimate new residuals based on incoming residual values
24. Temporal Resolution Reduction
Technical Challenges
- Picture quality degradation
- Avoid MV re-estimation
- Minimize mismatch between predictive and residual components
Approaches
- MV interpolation via bilinear interpolation
- MV interpolation via majority voting
25. MV Interpolation
Problem: estimate the MV between the current frame and the new reference
frame
$MV_{skip} = mv + mv_{int}$
Solutions (how to determine $mv_{int}$)
- Bilinear interpolation:
- Majority voting:
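The two ways of choosing the interpolated vector can be sketched as follows. This is one interpretation, not necessarily the exact formulas from the slide: the incoming vector is assumed to point into a dropped frame where the displaced macroblock straddles four macroblocks, bilinear interpolation is taken as the overlap-weighted mean of their vectors, and majority voting is taken as picking the vector of the macroblock with the largest overlap. All function names are hypothetical.

```python
def overlap_weights(mv, mb_size=16):
    """Overlap areas of the displaced MB with the four MBs it straddles."""
    dx, dy = mv[0] % mb_size, mv[1] % mb_size
    return [
        (mb_size - dx) * (mb_size - dy),  # top-left
        dx * (mb_size - dy),              # top-right
        (mb_size - dx) * dy,              # bottom-left
        dx * dy,                          # bottom-right
    ]


def interpolate_bilinear(neighbor_mvs, weights):
    """Overlap-weighted average of the four neighbor MVs."""
    total = sum(weights)
    return (
        sum(w * mv[0] for w, mv in zip(weights, neighbor_mvs)) / total,
        sum(w * mv[1] for w, mv in zip(weights, neighbor_mvs)) / total,
    )


def interpolate_majority(neighbor_mvs, weights):
    """MV of the neighbor MB with the largest overlap."""
    best = max(range(len(weights)), key=weights.__getitem__)
    return neighbor_mvs[best]


def mv_skip(mv, mv_int):
    """MV_skip = mv + mv_int."""
    return (mv[0] + mv_int[0], mv[1] + mv_int[1])


mv = (4, 4)                                     # hypothetical incoming MV
neighbors = [(2, 0), (6, 0), (2, 4), (6, 4)]    # MVs in the dropped frame
w = overlap_weights(mv)
skip_bilinear = mv_skip(mv, interpolate_bilinear(neighbors, w))
skip_majority = mv_skip(mv, interpolate_majority(neighbors, w))
print(w, skip_bilinear, skip_majority)
```

The majority-vote result always coincides with an actual incoming vector, whereas the bilinear result may be a blend that no macroblock really used.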
26. Estimating New Residue
Residue Compensation
- Need to minimize the mismatch between the new MV and the residue
- New residue corresponding to MV interpolation by
majority voting:
$\mathit{residue}_{skip} = \mathit{residue}_i + \mathit{residue}$
where $w_i \ge w_j$
27. Spatial Resolution Reduction
Purpose
- Bitstream that can be decoded and displayed on a low-resolution
screen
- Bandwidth savings for efficient transmission
- Compatibility with a certain profile/level, e.g., MPEG-4 Simple
Profile
Main Issues
- Motion vectors corresponding to the reduced-resolution reference frame
- Obtaining texture information for lower-resolution MBs
- Drift compensation architecture
28. Spatial Resolution Reduction
Technical challenges
- Picture quality degradation
- Down-conversion filtering
- Motion vector mapping
Approaches
- Cascaded approach: full decoding, spatial down-sampling,
and full re-encoding
- Low-cost approaches that avoid spatial down-sampling and
full re-encoding
29. Case Study: MPEG-2 to MPEG-4
Motivation
- MPEG-2 in the DTV/DVD market has created a large amount of
digital infrastructure and broadcast-quality content
- MPEG-4 adopted for mobile multimedia communications
- Error-resilient transmission to low-resolution displays on mobile
devices
- There will be a large demand for this specific transcoding
technology
30. Case Study: MPEG-2 to MPEG-4
Topics to be Covered
- Syntax conversion: at higher and lower layers
- MB-level conversions, e.g., MV mapping, texture down-sampling
- Analysis of drift errors when transcoding to a lower spatial
resolution
- Presentation of various architectures to overcome sources of drift
- Rate control and bit allocation issues
- Evaluation of complexity and quality
32. Motion Vector Mapping
Frame-Based
4:1 mapping vs. 1:1 mapping
- Use adaptive mapping based on the variance of the 4 motion vectors
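The adaptive choice between 4:1 and 1:1 mapping can be sketched as below. This is a minimal sketch under stated assumptions: a 2:1 reduction in each dimension (so vectors are halved), a hypothetical variance threshold, and no rounding to the half-pel grid that a real bitstream would need. The function name `map_motion_vectors` is illustrative.

```python
def map_motion_vectors(mvs, threshold=4.0):
    """Map the four MVs of a 2x2 group of input MBs to one output MB.

    mvs: four (x, y) vectors at the original resolution.
    threshold: hypothetical variance limit for choosing 4:1 mapping.
    """
    mean_x = sum(x for x, _ in mvs) / 4
    mean_y = sum(y for _, y in mvs) / 4
    variance = sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in mvs) / 4
    if variance <= threshold:
        # 4:1 mapping: one averaged, scaled MV for the whole 16x16 output MB.
        return [(mean_x / 2, mean_y / 2)]
    # 1:1 mapping: one scaled MV per 8x8 block of the output MB.
    return [(x / 2, y / 2) for x, y in mvs]


uniform = map_motion_vectors([(4, 2), (4, 2), (6, 2), (6, 2)])
divergent = map_motion_vectors([(0, 0), (16, 0), (0, 16), (16, 16)])
print(uniform)    # a single MV
print(divergent)  # four MVs, one per 8x8 block
```

Similar vectors collapse into a single cheap-to-code vector, while divergent motion keeps one vector per sub-block at the cost of more motion bits.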
33. Motion Vector Mapping
Field-Based
May have up to eight 16x8 MVs (2 per MB)
For mapping
- Use the top-field MV as default
- If motion_vertical_field_select[0][0] == 1, i.e., the bottom
field is used to predict the top field, then the top-field and
bottom-field motion vectors are averaged
34. Texture Down-Sampling
Actual implementation
- Use separable 1D filters to compute down-converted blocks
- Mathematically equivalent filters can be derived in the spatial domain
- Filtering can be adapted to work on a field basis
- Corresponding up-conversion filters are also available
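Separable filtering means the same 1D filter is applied along rows and then along columns. The sketch below illustrates this in the spatial domain with a 2-tap averaging filter, which is an assumption chosen for brevity rather than the filter used in the actual implementation; the function names are hypothetical.

```python
def downsample_1d(samples):
    """2:1 down-conversion of one line with a 2-tap averaging filter."""
    return [(samples[i] + samples[i + 1]) / 2
            for i in range(0, len(samples) - 1, 2)]


def downsample_2d(block):
    """Apply the 1D filter horizontally, then vertically."""
    rows = [downsample_1d(row) for row in block]              # horizontal pass
    cols = [downsample_1d(list(col)) for col in zip(*rows)]   # vertical pass
    return [list(row) for row in zip(*cols)]                  # transpose back


block = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [9, 11, 13, 15],
    [9, 11, 13, 15],
]
small = downsample_2d(block)
print(small)  # [[2.0, 6.0], [10.0, 14.0]]
```

A 2D filter with N x N taps costs N*N multiplications per output sample, while the separable form costs about 2*N, which is why the separable structure matters for complexity.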
35. Mixed Block Processor
Purpose
- Pre-process selected MBs so that coding modes are not mixed within one MB
- Mixed coding modes within an MB are not supported by coding standards
Processing
- Map MB modes so that all sub-blocks have the same mode, either all intra or all inter
- Modify MVs and DCT coefficients to correspond with MB modes
[Figure: Example of a mixed block. Input MBs MB(0), MB(k), and MB(k+1) are inter-coded and MB(1) is intra-coded; after down-conversion they become sub-blocks b(0) to b(3) of one output MB(x), and all sub-blocks must have the same mode.]
36. Mixed Block Processor (Cont’d)
Three possible methods
- ZeroOut
  Convert mixed-block MB modes to Inter
  MVs and DCT coefficients set to zero
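ZeroOut is the only one of the three methods that needs no pixel reconstruction, so it is easy to sketch. The dictionary layout of a sub-block (`mode`, `mv`, `dct`) is a hypothetical data structure invented for this illustration, and the sketch reads the method as resetting only the intra sub-blocks of a mixed MB.

```python
def zero_out(sub_blocks):
    """Apply ZeroOut to the four sub-blocks of one down-converted MB."""
    modes = {b["mode"] for b in sub_blocks}
    if modes != {"intra", "inter"}:
        return list(sub_blocks)           # not a mixed block: nothing to do
    fixed = []
    for b in sub_blocks:
        if b["mode"] == "intra":
            # Intra sub-block becomes inter with zero MV and zero residual,
            # i.e., it is copied from the co-located reference area.
            b = {"mode": "inter", "mv": (0, 0), "dct": [0] * len(b["dct"])}
        fixed.append(b)
    return fixed


mixed = [
    {"mode": "inter", "mv": (3, 1), "dct": [5, 0, 1]},
    {"mode": "intra", "mv": None, "dct": [40, 2, 0]},
    {"mode": "inter", "mv": (2, 1), "dct": [4, 1, 0]},
    {"mode": "inter", "mv": (3, 0), "dct": [6, 0, 0]},
]
result = zero_out(mixed)
print([b["mode"] for b in result])  # ['inter', 'inter', 'inter', 'inter']
```

The price of this simplicity is that the intra texture is discarded, which is consistent with ZeroOut giving the lowest quality of the three methods on sequences with motion.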
37. Mixed Block Processor (Cont’d)
- IntraInter
  Convert mixed-block MB modes to Inter
  MVs for Intra blocks are predicted from neighbors
  Corresponding Inter DCT coefficients are computed
- InterIntra
  Convert mixed-block MB modes to Intra
  MVs for mixed blocks set to zero
  Corresponding Intra DCT coefficients are computed
A decoding loop is needed for these two options
40. Drift Error Analysis
Approach
- Compare closed-loop reference with simple open-loop
architecture
- We discuss P frames only, since B frames do not introduce drift
error propagation
Rationale
- Expose all the possible sources of drift errors
41. Reference Analysis
P-frame analysis
$g_n^2 = D(e_n^1) + D(M_f(x_{n-1}^1)) - M_r(y_{n-1}^2)$
45. Drift Low Architecture
Reduced resolution residual is approximated as
$g_n^2 = D(e_n^1) + M_r(y_{n-1}^1 - y_{n-1}^2)$
Assumes the following approximation
$D(M_f(x_{n-1}^1)) = M_r(D(x_{n-1}^1)) = M_r(y_{n-1}^1)$
Architecture attempts to eliminate $d_q$
47. Drift Full Architecture
Reduced resolution residual is approximated as
$g_n^2 = D(e_n^1) + D(M_f(x_{n-1}^1 - x_{n-1}^2))$
Assumes the following approximation
$M_r(y_{n-1}^2) = D(M_f(U(y_{n-1}^2))) = D(M_f(x_{n-1}^2))$
Architecture attempts to eliminate $d_q$ and $d_r$
49. MC Low Architecture
Reduced resolution residual is approximated as
$g_n^2 = D(e_n^1) + D(M_f(x_{n-1}^1)) - M_r(D(x_{n-1}^1))$
Assumes the following approximation
$y_{n-1}^2 = y_{n-1}^1 = D(x_{n-1}^1)$
Architecture attempts to eliminate $d_r$
51. Intra Refresh Architecture
InterIntra is used to convert inter-coded blocks to intra
Intra-coded blocks are not subject to drift, so the aim is to stop
drift propagation for both $d_q$ and $d_r$
Flexible and capable of correcting errors caused by MV
mapping as well
Two steps involved:
- Estimate the amount of drift
- Translate the drift estimate into an intra-refresh rate
Intra refresh must work jointly with rate control
52. Profile Definitions of Version 1
Simple Profile
- Basic tools: I/P-VOP, AC/DC Prediction, 4MV, and unrestricted MV
- Short header and Error Resilience tools
Core Profile
- Simple + Binary Shape, Quantization Method 1/2, and B-VOP
Main Profile
- Core + Grey Shape, Interlace, and Sprite
Simple Scalable Profile
- Simple + Spatial and Temporal Scalability and B-VOP
53. Profile Definitions of Version 1
N-Bit Profile
- Core + N-Bit
Animated 2D Mesh
- Core + Scalable Still Texture, 2D Dynamic Mesh
Basic Animated Texture
- Binary Shape, Scalable Still Texture, and 2D Dynamic Mesh
Still Scalable Texture
- Scalable Still Texture
Simple Face
- Face Animation Parameters
54. Profile Definitions of Version 2
Advanced Real Time Simple Profile
- Simple +
- Advanced error resilience with channel,
- Improved temporal scalability with low buffering delay
Core Scalable Profile
- Simple Scalable +
- Core +
- SNR, Spatial/Temporal Scalability for Region or Object of Interest
55. Profile Definitions of Version 2
Advanced Coding Efficiency Profile
- Tools for improving coding efficiency for both rectangular and arbitrarily
shaped objects
- For applications such as mobile broadcast reception
Advanced Scalable Texture Profile
- Tools for decoding arbitrarily shaped texture and still images, including
scalable shape coding
56. Profile Definitions of Version 2
Advanced Core Profile
- Core Profile +
- Tools for decoding arbitrarily shaped video objects and arbitrarily shaped
scalable still images
Simple Face and Body Animation Profile
- Simple face animation + body animation
58. Comparison of Transcoding Arch.
Reference Architecture
- 2-loop solution; corrects for all types of errors
  Residual value can change with modified motion vector
  Also compensates for re-quantization error in inter-coded blocks
Intra Refresh Architecture
- 1-loop solution; uses intra-block refresh to correct for errors
  Residual value cannot change with modified motion vector
  No compensation for re-quantization errors in inter-coded blocks
59. Comparison of Transcoding Arch.
MC Low Architecture
- 1.5-loop solution; uses a partial encoder to compensate for errors
  Residual value can change with modified motion vector
  No compensation for re-quantization errors in inter-coded blocks
- Quality and complexity should be between Intra Refresh and Reference
63. Complexity Reductions
Down-Conversion Optimizations
- For intra refresh architecture
  Float-to-integer, exploit filter symmetry and zero coefficients
  Approximately 70% improvement for down-conversion (5.4s to 1.6s)
- For reference and partial encoder architectures
  Replace frequency synthesis filter with averaging filter
64. Complexity Reductions
Speeding up FDCT, IDCT, and MC
- MMX implementation for FDCT; 26% overall reduction (20.0s to 14.9s)
- SSE2 implementation for IDCT; 9% overall reduction (16.3s to 14.9s)
- MMX implementation for common block-based processes
  Common processes include averaging, clipping, and block addition
  These optimized routines have a significant impact on MC
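The quoted percentages follow directly from the timings in parentheses. As a quick check, with `reduction_percent` as a hypothetical helper name:

```python
def reduction_percent(before, after):
    """Relative reduction in run time, in percent."""
    return 100.0 * (before - after) / before


down_conversion = reduction_percent(5.4, 1.6)   # down-conversion filters
fdct = reduction_percent(20.0, 14.9)            # MMX FDCT, overall time
idct = reduction_percent(16.3, 14.9)            # SSE2 IDCT, overall time
print(f"{down_conversion:.1f} {fdct:.1f} {idct:.1f}")
```

The results are about 70.4%, 25.5%, and 8.6%, which agree with the rounded figures of 70%, 26%, and 9% on the slides.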
65. Observations on Complexity
Overall improvement is quite high
- 61% for Intra Refresh
- 71% for Reference; 74% for Partial Encoder
Transcoding multiple streams in software is feasible
- 2 streams can be supported by Reference; 3 streams by the proposed methods
- All methods provide acceptable quality
Further complexity reduction
- Computation for RC_Quant can be reduced by avoiding division operations
- Majority of complexity is now in the DecTime and MB_Code portions
- Other marginal gains may be possible if data is restructured
66. Experimental Results: Akiyo
Akiyo
- Low motion and low level of detail
- CIF (352x288) -> QCIF (176x144), N=15, M=3, B frames dropped
- Source bit rate: 512 kbps
68. Experimental Results: Foreman
Foreman
- Medium motion and medium level of detail
- CIF (352x288) -> QCIF (176x144), N=15, M=3, B frames dropped
- Source bit rate: 2 Mbps
70. Experimental Results: Football
Football
- Fast motion and high level of detail
- CCIR601 (720x480) -> SIF (352x240), N=15, M=3, B frames dropped
- Source bit rate: 6 Mbps
72. Summary of MPEG-2 to MPEG-4
Key observations
- DriftFull with InterIntra is more complex than Reference
  Not recommended
- Simple sequences with low motion and low level of detail
  ZeroOut: reasonably good quality
  InterIntra, IntraInter, Intra_Refresh, MC_Low, DriftLow: high quality
- Sequences with medium to high motion
  Artifacts can be found in ZeroOut, InterIntra, IntraInter, DriftLow
  Intra_Refresh, MC_Low comparable to Reference
73. Summary of MPEG-2 to MPEG-4
Summary
- Intra Refresh
  Offers the best trade-off between quality and complexity
  Flexible and adaptable, i.e., easily scaled in terms of complexity-quality
- MC Low
  Provides a reasonable quality-complexity trade-off
  A good alternative to Reference, but less dynamic compared to
  Intra Refresh
75. Transcoding of FGS to Simple Profile (2)
Conceptual illustration
Technical issues
- How to combine the two bitstreams in the DCT domain or even at the bitstream
level by advanced processing
- How to minimize the effort in the combining process for converting the
two FGS bitstreams into an MPEG-4 Simple Profile bitstream
80. Future Transcoding Considerations
Industry Need
- Describing a dynamic usage environment
  Capabilities of the terminal and network
  User preference and natural environment conditions
  Types of services that are available
- Transcoding should be performed according to the usage environment
- This is one of the targets for the emerging MPEG-21 standard
81. Future Transcoding Considerations
Research Topic
- Transcoding strategy is needed for multiple transcoding possibilities
- For example:
  Send QCIF @ 30Hz or CIF @ 10Hz
  Key frame w/audio or QCIF @ 7.5Hz
- What is a suitable quality metric for optimal transcoding strategy?
- How to measure distortion across spatio-temporal scales?
82. Conclusion
Transcoding is a bridge between standards in many
applications
Transcoding is a very useful tool for video streaming systems
in which the content format at the server has been defined
Transcoding is a useful component for UMA which is
concerned with the access to any multimedia content from any
type of terminal or network. This is an important part of
MPEG-21