Transcoding of MPEG Compressed Bitstreams: Techniques and ...
Presentation Transcript

  • EE665000 Video Processing: Transcoding of MPEG Compressed Bitstreams: Techniques and Application
  • Outline
    Introduction
      - Purpose of Transcoder
      - Transcoding Application
    Overview of Transcoding Techniques
      - Bit-rate Reduction
      - Temporal Resolution Reduction
      - Spatial Resolution Reduction
  • Outline
    An example: MPEG-2 to MPEG-4
      - Drift error analysis for spatial resolution reduction
      - Novel drift compensation architectures and techniques
      - Comparisons of complexity and quality
    Conclusion / Future Work
  • Purpose of Transcoder (bitstream -> bitstream)
      - Bit rate reduction: SDTV: 6 Mbps -> 3 Mbps; HDTV: 19.2 Mbps -> 11 Mbps
      - Frame rate reduction: 30 frame/s -> 10 frame/s (surveillance application)
      - Resolution reduction: HDTV -> SDTV; 720*480i, 30 Hz -> 352*240p, 10 Hz
  • Purpose of Transcoder
      - Syntax conversion
          MPEG-2 -> MPEG-4 to support mobile devices
          MPEG-2 Transport Stream -> MPEG-2 Program Stream to support DVD
      - Other conversions
          Video summarization for a compact representation of content; satisfy time constraints
          Color depth reduction, e.g., support for 4-bit PDA display
          Text summarization, e.g., compact viewing, including HTML-to-WML
          Multi-modal, e.g., text-to-speech, audio-driven animation model
  • Transcoding Example: transcoding research focuses on efficient techniques to perform such conversions
  • Concept: Universal Multimedia Access (UMA)
  • Use Case: Video Server
    Deliver video from server to mobile device
      - For broadcast or surveillance content
  • Use Case: Surveillance
  • Use Case: Set-Top Box
  • Use Case: DVD Recorder System
  • Use Case: DTV Distribution to Remote Devices
  • Use Case: Enhanced Server Operation
  • Application Environments: transcoding is needed to fill the gaps between content, network, terminal, and user
  • Overview of Video Transcoding Techniques
    Conventional Approaches
      - Full decoding, post-processing, full re-encoding
      - Highest quality, but an expensive solution
      - In most cases, requires a hardware-based solution
    Low-Cost Approaches
      - Target similar quality as the conventional approach, but with much lower complexity
      - Architectures that utilize compressed-domain processing can provide savings
      - Low-cost solutions may be more flexible, as they also enable software solutions
  • Bit Rate Reduction
    Purpose
      - Bandwidth savings for efficient transmission
      - Compatibility with a certain profile/level, e.g., MPEG-4 Simple Profile @ Level 0
    Main Issues
      - Drift compensation architecture
      - Rate control algorithm
      - Trade-off between quality and complexity
  • Bit Rate Reduction
    Technical challenges
      - Picture quality degradation: re-quantization error, drift
      - Complexity reduction with partial decoding
    Approaches
      - Cutting high frequencies, requantization
      - Open-loop and closed-loop architectures
  • Bit-Rate Reduction Architectures
  • Experimental Results
    Comparison of open-loop and closed-loop architectures
      - Original sequence encoded at 2 Mbps, N=30, M=3
      - Transcoded to fixed QP=15 with both architectures; plot shows I/P frames only
      - Severe drift with open-loop
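As a rough illustration of why the open-loop architecture drifts, the following sketch models each frame as a single scalar value predicted from the previous frame. This is a toy model, not the architecture on the slides: the function names, the scalar "frames", and the uniform rounding quantizer are all assumptions made for illustration. The open-loop path requantizes each incoming residual independently, so its error accumulates along the P-frame chain; the closed-loop path tracks what the end decoder will reconstruct and folds the previous error into the next residual.

```python
def quantize(value, step):
    """Uniform quantizer (assumed): snap value to the nearest multiple of step."""
    return step * round(value / step)


def encode_residuals(frames, step):
    """Toy first-generation encoder: predict each frame from the previous
    reconstruction and quantize the prediction residual."""
    residuals, reference = [], 0
    for frame in frames:
        residual = quantize(frame - reference, step)
        residuals.append(residual)
        reference += residual
    return residuals


def decode(residuals):
    """Toy decoder: accumulate residuals onto the previous reconstruction."""
    reconstruction, reference = [], 0
    for residual in residuals:
        reference += residual
        reconstruction.append(reference)
    return reconstruction


def transcode_open_loop(residuals, step):
    """Requantize every residual on its own; no reference is maintained."""
    return [quantize(r, step) for r in residuals]


def transcode_closed_loop(residuals, step):
    """Keep both the incoming reconstruction and the reconstruction the end
    decoder will have, and code the difference between them."""
    output = []
    incoming_ref = 0   # reconstruction of the incoming stream
    outgoing_ref = 0   # reconstruction of the transcoded stream
    for residual in residuals:
        incoming_ref += residual
        new_residual = quantize(incoming_ref - outgoing_ref, step)
        output.append(new_residual)
        outgoing_ref += new_residual
    return output


def errors(reference, decoded):
    """Per-frame absolute reconstruction error."""
    return [abs(a - b) for a, b in zip(reference, decoded)]


if __name__ == "__main__":
    incoming = encode_residuals([0, 3, 6, 9, 12], step=1)
    source = decode(incoming)
    print("open-loop error:  ", errors(source, decode(transcode_open_loop(incoming, 4))))
    print("closed-loop error:", errors(source, decode(transcode_closed_loop(incoming, 4))))
```

For this input the open-loop error grows by one unit per frame, while the closed-loop error never exceeds half the new quantizer step.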
  • Joint Transcoding
    In many communication systems, it is desirable to distribute an aggregate rate over multiple programs. In the spatial domain, this is known as statistical multiplexing (StatMux)
      - Encode pictures proportional to encoding complexity
      - Complexity is determined from the pixel domain
      - Distribute bits to achieve minimum distortion across all programs
  • Joint Transcoding
    If the programs are already encoded, joint transcoding techniques will minimize the distortion
      - Extract normalized activity measures from original quantizer scales
      - Reassign target distributions
  • Block Diagram of Joint Transcoder
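The idea of sharing an aggregate rate in proportion to complexity can be sketched as follows. The slides do not give the activity measure, so this sketch assumes a complexity equal to the bits a picture used multiplied by its average quantizer scale (the style of measure used in MPEG-2 Test Model 5 rate control); `frame_complexity` and `allocate_rates` are hypothetical names, and a real joint transcoder would also respect buffer constraints.

```python
def frame_complexity(bits_used, avg_quantizer_scale):
    """Assumed complexity measure: bits spent times average quantizer scale,
    both of which can be read from the already-encoded bitstream."""
    return bits_used * avg_quantizer_scale


def allocate_rates(total_rate, complexities):
    """Split an aggregate rate across programs in proportion to complexity."""
    total = sum(complexities)
    if total == 0:
        # No activity information: fall back to an even split.
        return [total_rate / len(complexities)] * len(complexities)
    return [total_rate * c / total for c in complexities]


if __name__ == "__main__":
    # Three programs sharing 6000 kbps; the third is the most complex.
    print(allocate_rates(6000, [1, 2, 3]))
```

A program whose pictures needed more bits at a coarser quantizer receives a larger share of the channel, which is the behavior the StatMux description calls for.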
  • Temporal Resolution Reduction
    Purpose
      - Bandwidth savings for efficient transmission
      - Reduce number of frames/sec to meet processing requirements at the terminal
    Main Issues
      - Estimate new motion vector based on incoming motion vectors
      - Estimate new residual based on incoming residual values
  • Temporal Resolution Reduction
    Technical Challenges
      - Picture quality degradation
      - Avoid MV re-estimation
      - Minimize mismatch between predictive and residual components
    Approaches
      - MV interpolation via bilinear interpolation
      - MV interpolation via majority voting
  • MV Interpolation
    Problem: estimate the MV between the current frame and the new reference frame, $MV_{skip} = mv + mv_{int}$
    Solutions (how to determine $mv_{int}$)
      - Bilinear interpolation
      - Majority voting
  • Estimating New Residue
    Residue Compensation
      - Need to minimize mismatch between new MV and residue
      - New residue corresponding to MV interpolation by majority voting: $residue_{skip} = residue_i + residue$, where $w_i \ge w_j$
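The two ways of choosing `mv_int` can be sketched as below. The formulas on the slide were not preserved, so this is an assumed reading: the block referenced in the dropped frame overlaps up to four grid-aligned macroblocks, the weights are the overlap areas, bilinear interpolation is the weighted average of the four MVs, and "majority voting" is taken to mean picking the MV of the macroblock with the largest overlap. All function names are hypothetical.

```python
MB_SIZE = 16


def overlap_weights(ref_x, ref_y, mb_size=MB_SIZE):
    """Overlap areas of a block whose top-left corner is (ref_x, ref_y) with
    the four grid-aligned macroblocks it straddles, ordered
    [top-left, top-right, bottom-left, bottom-right]."""
    ox = ref_x % mb_size
    oy = ref_y % mb_size
    return [
        (mb_size - ox) * (mb_size - oy),
        ox * (mb_size - oy),
        (mb_size - ox) * oy,
        ox * oy,
    ]


def bilinear_mv(candidate_mvs, weights):
    """Area-weighted average of the candidate motion vectors."""
    total = sum(weights)
    x = sum(w * mv[0] for w, mv in zip(weights, candidate_mvs)) / total
    y = sum(w * mv[1] for w, mv in zip(weights, candidate_mvs)) / total
    return (x, y)


def dominant_mv(candidate_mvs, weights):
    """Motion vector of the macroblock with the largest overlap."""
    index = max(range(len(weights)), key=lambda i: weights[i])
    return candidate_mvs[index]


def skip_mv(mv, mv_int):
    """MV_skip = mv + mv_int, as on the slide."""
    return (mv[0] + mv_int[0], mv[1] + mv_int[1])


if __name__ == "__main__":
    weights = overlap_weights(4, 12)
    mvs = [(0, 0), (4, 0), (0, 8), (4, 8)]
    print(weights, bilinear_mv(mvs, weights), dominant_mv(mvs, weights))
```

The bilinear variant blends all four vectors, while the dominant variant keeps one real vector from the incoming stream; the condition $w_i \ge w_j$ on the residue slide is consistent with selecting by largest weight.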
  • Spatial Resolution Reduction
    Purpose
      - Bitstream that can be decoded and displayed on a low-resolution screen
      - Bandwidth savings for efficient transmission
      - Compatibility with a certain profile/level, e.g., MPEG-4 Simple Profile
    Main Issues
      - Motion vectors corresponding to the reduced-resolution reference frame
      - Obtaining texture information for lower-resolution MBs
      - Drift compensation architecture
  • Spatial Resolution Reduction
    Technical challenges
      - Picture quality degradation
      - Down-conversion filtering
      - Motion vector mapping
    Approaches
      - Cascaded approach: full decoding, spatial down-sampling, and full re-encoding
      - Low-cost approaches that avoid spatial down-sampling and full re-encoding
  • Case Study: MPEG-2 to MPEG-4
    Motivation
      - MPEG-2 in the DTV/DVD market has created a large amount of digital infrastructure and broadcast-quality content
      - MPEG-4 adopted for mobile multimedia communications
      - Error-resilient transmission to low-resolution displays on mobile devices
      - There will be a large demand for this specific transcoding technology
  • Case Study: MPEG-2 to MPEG-4
    Topics to be Covered
      - Syntax conversion: at higher and lower layers
      - MB-level conversions, e.g., MV mapping, texture down-sampling
      - Analysis of drift errors when transcoding to a lower spatial resolution
      - Presentation of various architectures to overcome sources of drift
      - Rate control and bit allocation issues
      - Evaluation of complexity and quality
  • Macroblock Conversions
    Spatial resolution reduced by half [4 MB to 1 MB]
      - Motion vector mapping
      - Texture down-sampling
      - Mixed block processing
  • Motion Vector Mapping
    Frame-Based
    4:1 mapping vs. 1:1 mapping
      - Use adaptive mapping based on the variance of the 4 motion vectors
  • Motion Vector Mapping
    Frame-Based
    May have up to eight 16*8 MVs (2 per MB)
    For mapping
      - Use the top-field MV as default
      - If motion_vertical_field_select[0][0] == 1, i.e., the bottom field is used to predict the top field, then the top-field and bottom-field motion vectors are averaged
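A minimal sketch of the adaptive frame-based mapping follows. It assumes that 4:1 mapping means averaging the four incoming MVs into one MV for the new macroblock, that 1:1 mapping means keeping one MV per 8x8 block, and that vectors are halved to match the halved resolution. The threshold value and the function names are hypothetical, and rounding to the target codec's MV precision is left out.

```python
def mv_variance(mvs):
    """Mean squared distance of the motion vectors from their mean."""
    n = len(mvs)
    mean_x = sum(x for x, _ in mvs) / n
    mean_y = sum(y for _, y in mvs) / n
    return sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in mvs) / n


def map_motion_vectors(mvs, variance_threshold=4.0):
    """Map the MVs of four full-resolution MBs to one half-resolution MB.

    Returns ("1MV", [mv]) when the four vectors agree (4:1 mapping) and
    ("4MV", [mv0, mv1, mv2, mv3]) when they diverge (1:1 mapping).
    """
    if len(mvs) != 4:
        raise ValueError("expected the motion vectors of exactly four MBs")
    if mv_variance(mvs) <= variance_threshold:
        mean_x = sum(x for x, _ in mvs) / 4
        mean_y = sum(y for _, y in mvs) / 4
        return "1MV", [(mean_x / 2, mean_y / 2)]
    return "4MV", [(x / 2, y / 2) for x, y in mvs]


if __name__ == "__main__":
    print(map_motion_vectors([(4, 4), (4, 4), (4, 4), (4, 4)]))
    print(map_motion_vectors([(0, 0), (8, 0), (0, 8), (8, 8)]))
```

Uniform motion collapses to a single vector, while divergent motion keeps four vectors so that each 8x8 block retains its own prediction.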
  • Texture Down-Sampling
    Actual implementation
      - Use separable 1D filters to compute down-converted blocks
      - Mathematically equivalent filters can be derived in the spatial domain
      - Filtering can be adapted to work on a field basis
      - Corresponding up-conversion filters are also available
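Separability means the 2D down-conversion can be done as a horizontal 1D pass followed by a vertical 1D pass. The sketch below shows this in the pixel domain with a two-tap averaging filter, which is an assumption chosen for brevity; the slides' DCT-domain filters are not reproduced here, and the function names are hypothetical.

```python
def downsample_rows(block):
    """Horizontal 2:1 down-sampling: average each pair of neighbouring samples."""
    return [
        [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]
        for row in block
    ]


def transpose(block):
    """Swap rows and columns so the same 1D filter can run vertically."""
    return [list(column) for column in zip(*block)]


def downsample_2d(block):
    """Separable 2:1 down-sampling: horizontal pass, then vertical pass."""
    horizontal = downsample_rows(block)
    return transpose(downsample_rows(transpose(horizontal)))


if __name__ == "__main__":
    block = [
        [1, 3, 5, 7],
        [1, 3, 5, 7],
        [2, 4, 6, 8],
        [2, 4, 6, 8],
    ]
    print(downsample_2d(block))
```

Each output sample equals the mean of the corresponding 2x2 input neighbourhood, which is what a non-separable 2D averaging filter would produce, at the cost of two cheap 1D passes.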
  • Mixed Block Processor
    Purpose
      - Pre-process selected MBs to ensure no mixed modes within one MB
      - Mixed coding modes within an MB are not supported by coding standards
    Processing
      - Map MB modes so that all sub-blocks have the same mode, either all intra or all inter
      - Modify MVs and DCT coefficients to correspond with MB modes
    Example of Mixed Block (figure): four incoming MBs, MB(0), MB(1), MB(k), and MB(k+1), become sub-blocks b(0) to b(3) of one MB(x) after down-conversion; MB(1) is intra (DCT, zero MV) and the others are inter (DCT, inter MV); all sub-blocks must have the same mode
  • Mixed Block Processor (Cont'd)
    Three possible methods
      - ZeroOut
          Convert mixed-block MB modes to Inter
          MVs and DCT coefficients set to zero
  • Mixed Block Processor (Cont'd)
      - IntraInter
          Convert mixed-block MB modes to Inter
          MVs for intra blocks are predicted from neighbors
          Corresponding inter DCT coefficients are computed
      - InterIntra
          Convert mixed-block MB modes to Intra
          MVs for mixed blocks set to zero
          Corresponding intra DCT coefficients are computed
    A decoding loop is needed for these options (IntraInter and InterIntra)
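Of the three methods, ZeroOut is the only one that needs no decoding loop, and it can be sketched in a few lines. The dictionary layout for a sub-block (`mode`, `mv`, `coeffs`) and the function names are assumptions made for this illustration; the sketch reads the slide as resetting the MV and DCT coefficients of the intra sub-blocks only.

```python
def is_mixed(sub_blocks):
    """True when the sub-blocks of one MB do not all share a coding mode."""
    return len({block["mode"] for block in sub_blocks}) > 1


def zero_out(sub_blocks):
    """ZeroOut: force every sub-block of a mixed MB to inter mode.

    Intra sub-blocks get a zero MV and all-zero DCT coefficients; inter
    sub-blocks are copied unchanged. Unmixed MBs pass through untouched.
    """
    if not is_mixed(sub_blocks):
        return [dict(block) for block in sub_blocks]
    converted = []
    for block in sub_blocks:
        if block["mode"] == "intra":
            converted.append({
                "mode": "inter",
                "mv": (0, 0),
                "coeffs": [0] * len(block["coeffs"]),
            })
        else:
            converted.append(dict(block))
    return converted


if __name__ == "__main__":
    macroblock = [
        {"mode": "inter", "mv": (2, 1), "coeffs": [5, 0, 1]},
        {"mode": "intra", "mv": None, "coeffs": [90, 7, 3]},
        {"mode": "inter", "mv": (2, 0), "coeffs": [4, 1, 0]},
        {"mode": "inter", "mv": (1, 1), "coeffs": [3, 0, 0]},
    ]
    for block in zero_out(macroblock):
        print(block)
```

The converted sub-block simply copies the co-located area of the reference frame, which is cheap but explains why ZeroOut shows artifacts on sequences with more motion.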
  • Reference Architecture
  • Open-Loop Architecture: open-loop analysis
  • Drift Error Analysis
    Approach
      - Compare closed-loop reference with simple open-loop architecture
      - We discuss P frames only, since B frames do not introduce drift error propagation
    Rationale
      - Expose all the possible sources of drift errors
  • Reference Analysis
    P-frame analysis: $g_n^2 = D(e_n^1) + D(M_f(x_{n-1}^1)) - M_r(y_{n-1}^2)$
  • Drift Error Analysis
      - Error due to quantization ($d_q$)
      - Error due to down-sampling ($d_r$)
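The two error terms can be made explicit by subtracting the open-loop residual from the reference residual. The notation is read as follows, which is an assumption because the figures defining it were not preserved: $D$ is down-sampling, $M_f$ and $M_r$ are motion compensation at full and reduced resolution, $x$ and $y$ are full- and reduced-resolution reconstructed frames, and superscripts 1 and 2 denote the input and output streams. The sketch also assumes that the open-loop architecture only down-samples the incoming residual, $g_n^2 = D(e_n^1)$, and that $M_r$ is linear. The split into $d_q$ and $d_r$ is a reconstruction, not a formula taken from the transcript.

```latex
\begin{align*}
g_n^2\big|_{\text{reference}} - g_n^2\big|_{\text{open-loop}}
  &= D(M_f(x_{n-1}^1)) - M_r(y_{n-1}^2) \\
  &= \underbrace{M_r(y_{n-1}^1 - y_{n-1}^2)}_{d_q}
   + \underbrace{D(M_f(x_{n-1}^1)) - M_r(D(x_{n-1}^1))}_{d_r},
  \qquad y_{n-1}^1 = D(x_{n-1}^1)
\end{align*}
```

Under this reading, $d_q$ vanishes when the output reference equals the down-sampled input reference (no requantization error), and $d_r$ vanishes when down-sampling commutes with motion compensation.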
  • Drift Compensation Architectures
      - "Drift Low": drift compensation in reduced resolution
      - "Drift Full": drift compensation in original resolution
      - "MC Low": drift compensation by partial re-encoding
      - "Intra Refresh": drift compensation by intra block refresh
  • Drift Low Architecture
  • Drift Low Architecture
      - Reduced-resolution residual is approximated as $g_n^2 = D(e_n^1) + M_r(y_{n-1}^1 - y_{n-1}^2)$
      - Assumes the following approximation: $D(M_f(x_{n-1}^1)) = M_r(D(x_{n-1}^1)) = M_r(y_{n-1}^1)$
      - Architecture attempts to eliminate $d_q$
  • Drift Full Architecture
  • Drift Full Architecture
      - Reduced-resolution residual is approximated as $g_n^2 = D(e_n^1) + D(M_f(x_{n-1}^1 - x_{n-1}^2))$
      - Assumes the following approximation: $M_r(y_{n-1}^2) = D(M_f(U(y_{n-1}^2))) = D(M_f(x_{n-1}^2))$
      - Architecture attempts to eliminate $d_q$ and $d_r$
  • MC Low Architecture
  • MC Low Architecture
      - Reduced-resolution residual is approximated as $g_n^2 = D(e_n^1) + D(M_f(x_{n-1}^1)) - M_r(D(x_{n-1}^1))$
      - Assumes the following approximation: $y_{n-1}^2 = y_{n-1}^1 = D(x_{n-1}^1)$
      - Architecture attempts to eliminate $d_r$
  • Intra Refresh Architecture
  • Intra Refresh Architecture
    InterIntra is used to convert inter-coded blocks to intra
    Intra-coded blocks are not subject to drift, so the aim is to stop drift propagation for both $d_q$ and $d_r$
    Flexible and capable of correcting errors caused by MV mapping as well
    Two steps involved:
      - Estimate the amount of drift
      - Translate the drift estimate into an intra-refresh rate
    Intra refresh must work jointly with rate control
  • Profile Definitions of Version 1
    Simple Profile
      - Basic tools of I/P VOP, AC/DC prediction, and 4MV unrestricted
      - Short header and error resilience tools
    Core Profile
      - Simple + binary shape, quantization method 1/2, and B-VOP
    Main Profile
      - Core + grey shape, interlace, and sprite
    Simple Scalable Profile
      - Simple + spatial and temporal scalability and B-VOP
  • Profile Definitions of Version 1
    N-Bit Profile
      - Core + N-bit
    Animated 2D Mesh
      - Core + scalable still texture, 2D dynamic mesh
    Basic Animated Texture
      - Binary shape, scalable still texture, and 2D dynamic mesh
    Still Scalable Texture
      - Scalable still texture
    Simple Face
      - Face animation parameters
  • Profile Definitions of Version 2
    Advanced Real Time Simple Profile
      - Simple +
      - Advanced error resilience with channel,
      - Improved temporal scalability with low buffering delay
    Core Scalable Profile
      - Simple Scalable +
      - Core +
      - SNR, spatial/temporal scalability for region or object of interest
  • Profile Definitions of Version 2
    Advanced Coding Efficiency Profile
      - Tools for improving coding efficiency for both rectangular and arbitrarily shaped objects
      - For applications such as mobile broadcast reception
    Advanced Scalable Texture Profile
      - Tools for decoding arbitrarily shaped texture and still images, including scalable shape coding
  • Profile Definitions of Version 2
    Advanced Core Profile
      - Core Profile +
      - Tools for decoding arbitrarily shaped video objects and arbitrarily shaped scalable still images
    Simple Face and Body Animation Profile
      - Simple face animation + body animation
  • Comparison of Transcoding Architectures
    Reference Architecture
      - 2-loop solution; corrects for all types of errors
          Residual value can change with modified motion vector
          Also compensates for re-quantization error in inter-coded blocks
    Intra Refresh Architecture
      - 1-loop solution; uses intra-block refresh to correct for errors
          Residual value cannot change with modified motion vector
          No compensation for re-quantization errors in inter-coded blocks
  • Comparison of Transcoding Architectures
    MC Low Architecture
      - 1.5-loop solution; uses a partial encoder to compensate for errors
          Residual value can change with modified motion vector
          No compensation for re-quantization errors in inter-coded blocks
      - Quality and complexity should be between Intra Refresh and Reference
  • Comparison of Transcoding Architectures
  • Complexity Analysis [Non-Optimized]
    Simulation
      - Machine: Pentium 4, 1.8 GHz, 512 MB
      - Content: Highway19 @ 384 Kbps, 30 sec duration
  • Complexity Analysis [Optimized]
    Simulation
      - Machine: Pentium 4, 1.8 GHz, 512 MB
      - Content: Highway19 @ 384 Kbps, 30 sec duration
  • Complexity Reductions
    Down-Conversion Optimizations
      - For intra refresh architecture
          Float-to-integer, exploit filter symmetry and zero coefficients
          Approximately 70% improvement for down-conversion (5.4 s to 1.6 s)
      - For reference and partial encoder architectures
          Replace frequency synthesis filter with averaging filter
  • Complexity Reductions
    Speeding up FDCT, IDCT, and MC
      - MMX implementation for FDCT; 26% overall reduction (20.0 s to 14.9 s)
      - SSE2 implementation for IDCT; 9% overall reduction (16.3 s to 14.9 s)
      - MMX implementation for common block-based processes
          Common processes include averaging, clipping, and block addition
          These optimized routines have a significant impact on MC
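The quoted percentages follow directly from the timings given on the slides. The short check below uses only those timings; `percent_reduction` is a hypothetical helper name.

```python
def percent_reduction(before_s, after_s):
    """Relative run-time saving, in percent, going from before_s to after_s."""
    return 100.0 * (before_s - after_s) / before_s


if __name__ == "__main__":
    timings = {
        "down-conversion": (5.4, 1.6),   # slide quotes approximately 70%
        "FDCT with MMX": (20.0, 14.9),   # slide quotes 26%
        "IDCT with SSE2": (16.3, 14.9),  # slide quotes 9%
    }
    for name, (before_s, after_s) in timings.items():
        print(f"{name}: {percent_reduction(before_s, after_s):.1f}%")
```

The computed values, roughly 70.4%, 25.5%, and 8.6%, agree with the rounded figures on the slides.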
  • Observations on Complexity
    Overall improvement is quite high
      - 61% for Intra Refresh
      - 71% for Reference; 74% for partial encoding
    Transcoding multiple streams in software is feasible
      - 2 streams can be supported by Reference; 3 streams by proposed methods
      - All methods provide acceptable quality
    Further complexity reduction
      - Computation for RC_Quant can be reduced by avoiding division operations
      - Majority of complexity is now in the DecTime and MB_Code portions
      - Other marginal gains may be possible if data is restructured
  • Experimental Results: Akiyo
      - Low motion and low level of detail
      - CIF (352*288) -> QCIF (176*144), N=15, M=3, drop B
      - Source bit rate: 512 Kbps
  • Akiyo (Cont'd)
  • Experimental Results: Foreman
      - Medium motion and medium level of detail
      - CIF (352*288) -> QCIF (176*144), N=15, M=3, drop B
      - Source bit rate: 2 Mbps
  • Foreman (Cont'd)
  • Experimental Results: Football
      - Fast motion and high level of detail
      - CCIR601 (720*480) -> SIF (352*240), N=15, M=3, drop B
      - Source bit rate: 6 Mbps
  • Football (Cont'd)
  • Summary of MPEG-2 to MPEG-4
    Key observations
      - DriftFull with InterIntra is more complex than Reference; not recommended
      - Simple sequences with low motion and low level of detail
          ZeroOut: reasonably good quality
          InterIntra, IntraInter, Intra_Refresh, MC_Low, DriftLow: high quality
      - Sequences with medium to high motion
          Artifacts can be found in ZeroOut, InterIntra, IntraInter, DriftLow
          Intra_Refresh, MC_Low comparable to Reference
  • Summary of MPEG-2 to MPEG-4
    Summary
      - Intra Refresh
          Offers the best trade-off between quality and complexity
          Flexible and adaptable, i.e., easily scaled in terms of complexity and quality
      - MC Low
          Provides a reasonable quality-complexity trade-off
          A good alternative to Reference, but less dynamic compared to Intra Refresh
  • Transcoding of FGS to Simple Profile (1): application scenario
  • Transcoding of FGS to Simple Profile (2)
    Conceptual illustration
    Technical issues
      - How to combine the two bitstreams in the DCT domain, or even at the bitstream level, by advanced processing
      - How to minimize the effort in the combining process for converting the two FGS bitstreams into an MPEG-4 Simple Profile bitstream
  • Transcoding of FGS to Simple Profile (3): reference architecture
  • Transcoding of FGS to Simple Profile (4)
    Analysis of Reference Architecture
      - P-frame analysis
  • Transcoding of FGS to Simple Profile (5): proposed architecture
  • Transcoding of FGS to Simple Profile (6): simulation results
  • Future Transcoding Considerations
    Industry Need
      - Describing a dynamic usage environment
          Capabilities of the terminal and network
          User preferences and natural environment conditions
          Types of services that are available
      - Transcoding should be performed according to the usage environment
      - This is one of the targets for the emerging MPEG-21 standard
  • Future Transcoding Considerations
    Research Topic
      - A transcoding strategy is needed when multiple transcoding possibilities exist
      - For example:
          Send QCIF @ 30 Hz or CIF @ 10 Hz?
          Key frame w/audio or QCIF @ 7.5 Hz?
      - What is a suitable quality metric for an optimal transcoding strategy?
      - How to measure distortion across spatio-temporal scales?
  • Conclusion
      - Transcoding is a bridge between standards in many applications
      - Transcoding is a very useful tool for video streaming systems in which the content format at the server has been defined
      - Transcoding is a useful component for UMA, which is concerned with access to any multimedia content from any type of terminal or network. This is an important part of MPEG-21