CONTENTS
• Overview
• A Little Theory
• Achieving Compression
• Various Algorithms
• Spatial Compression
• Spectral Compression
• Steps in Image Compression
• Temporal Compression
• Video Coding and Frames
• Motion Compensated Prediction
• Motion Estimation
• Block Matching Summary
• Basic VC Architecture
• Video Encoder
• Video Decoder
• Using Video Compression Standards
• Current Video Compression Standards
• Video Coding Standardization Organizations
• Dynamics of Video Standardization Process
• Video Compression Standards
• Scope of Development
• References and Further Reading

OVERVIEW
Goal: reduce the quantity of data used to represent digital video images.
• Saves space and bandwidth
• Saves energy
• Increases portability
• Reduces cost
Example: 720 x 1280 pixels/frame, progressive scanning at 60 frames/sec
• (720 x 1280 pixels/frame)(60 frames/sec)(3 colors/pixel)(8 bits/color) = 1.3 Gb/sec
• An HD channel offers about 20 Mbps of bandwidth
• Requires compression by a factor of about 70 (equivalent to roughly 0.35 bits/pixel)

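The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the calculation; the function name `raw_bitrate_bps` and the variable names are illustrative and not part of the original slides.

```python
def raw_bitrate_bps(width, height, fps, channels=3, bits_per_channel=8):
    """Uncompressed bit rate in bits per second."""
    return width * height * fps * channels * bits_per_channel

# Figures from the slide: 720 x 1280 pixels, 60 frames/sec, 3 colors, 8 bits each.
raw = raw_bitrate_bps(1280, 720, 60)
channel_bps = 20_000_000                      # 20 Mbps HD channel

ratio = raw / channel_bps                     # required compression factor
bits_per_pixel = channel_bps / (1280 * 720 * 60)

print(f"raw rate: {raw / 1e9:.2f} Gb/s")
print(f"compression factor: {ratio:.1f}")
print(f"budget: {bits_per_pixel:.2f} bits/pixel")
```

The exact factor comes out slightly below the rounded figure of 70 quoted on the slide, which is consistent with a budget of about 0.35 bits per pixel.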
A Little Theory
Video: a 3-D array of color pixels.
• Two of the dimensions serve as the spatial domain of the moving pictures
• One dimension serves as the time domain
Data frame: the set of all pixels that correspond to a single moment in time.
Human eye
• Not responsive to every detail, which is what makes quantization and smoothing acceptable
• More responsive to brightness than to chrominance

Achieving Compression
Reduce redundancy and irrelevancy.
Sources of redundancy (relatively straightforward to exploit)
• Temporal: adjacent frames are highly correlated
• Spatial: nearby pixels are often correlated
• Color space: RGB components are correlated among themselves
Sources of irrelevancy (difficult to model and exploit)
• Perceptually unimportant information

Various Algorithms
Four major ways to compress video:
DCT (Discrete Cosine Transform)
• Samples the image at regular intervals
• Analyzes the frequency components present in each sample
• Discards frequencies that are unimportant from the point of view of the human eye
Vector Quantization (VQ)
• Looks at an array of data rather than individual values, then averages what it finds
• Compresses the redundant data it finds while keeping the desired object
contd…

Fractal Compression (FC)
• A form of vector quantization
• Finds self-similar sections of an image, then uses a fractal algorithm to recreate those sections
Discrete Wavelet Transform (DWT)
• Mathematically transforms the image into frequency components
• Operates on the entire frame; the result is a very effective hierarchical representation of the image
• Every layer represents a frequency band

Spatial Compression
• Removes or reorders information about a field of color pixels to conserve space
• Neighboring pixels tend to have nearly the same brightness and color values
• Instead of sending the same number for every sample, one number can be sent to represent a block of sample points in an area where the information content remains the same

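The idea of sending one number for a whole block can be sketched as follows. This is a toy, lossless illustration under simple assumptions (a one-dimensional row of samples and a fixed block size); the names `encode_row` and `decode_row` are hypothetical, and real codecs use transform coding rather than this scheme.

```python
def encode_row(samples, block_size=4):
    """Replace each uniform block by a single value; keep other blocks verbatim."""
    encoded = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        if all(value == block[0] for value in block):
            # One number stands for the whole block.
            encoded.append(("flat", block[0], len(block)))
        else:
            encoded.append(("raw", block))
    return encoded

def decode_row(encoded):
    """Rebuild the original samples from the encoded blocks."""
    samples = []
    for item in encoded:
        if item[0] == "flat":
            samples.extend([item[1]] * item[2])
        else:
            samples.extend(item[1])
    return samples

row = [7, 7, 7, 7, 7, 7, 7, 7, 3, 4, 5, 6]
encoded = encode_row(row)
print(encoded)
print(decode_row(encoded) == row)
```

Here the two uniform blocks are each sent as one value, while the block with varying content is sent in full.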
Spectral Compression
• The human eye is much better at distinguishing luminance than chrominance
• This can be used to advantage when conveying color information: less precision is required, so the precision that would otherwise be spent on color is "saved"
• Fewer samples are needed to convey color information, and fewer samples mean less bandwidth

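One common way to send fewer color samples is chroma subsampling. The sketch below assumes a 4:2:0-style scheme in which every 2 x 2 neighbourhood of a chroma plane is averaged into one sample; the function names and the tiny test plane are illustrative only.

```python
def subsample_420(plane):
    """Average each 2x2 neighbourhood of a chroma plane (even dimensions assumed)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]

def sample_count(width, height, subsampled):
    """Total samples per frame: one luma plane plus two chroma planes."""
    luma = width * height
    if subsampled:
        chroma = 2 * (width // 2) * (height // 2)
    else:
        chroma = 2 * width * height
    return luma + chroma

cb = [[100, 102, 50, 52],
      [104, 106, 54, 56]]
small = subsample_420(cb)
print(small)

full = sample_count(1280, 720, subsampled=False)
reduced = sample_count(1280, 720, subsampled=True)
print(reduced / full)
```

Luminance is left at full resolution, so only the two chroma planes shrink; under this assumption the total sample count is halved.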
Steps in Image Compression
• If the color is represented in RGB mode, translate it to YCbCr mode
• Divide the image into 8 x 8 blocks (64 pixels = 1 block)
• Transform the pixel information from the spatial domain into the frequency domain with the Discrete Cosine Transform
• Quantize the resulting values by dividing each coefficient by an integer value and rounding to the nearest integer
• Read the resulting coefficients in zigzag order
• Run-length encode the coefficients ordered in this manner
• Follow with Huffman coding
Converting from RGB to YCbCr yields the 4:4:4 format. Subsampling the two chroma components by two in each direction (4:2:0) then halves the number of samples, so bandwidth is reduced by 50%.

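The transform, quantization, zigzag and run-length steps can be sketched in plain Python. This is a minimal illustration, not a JPEG implementation: it assumes a single uniform quantization step instead of a quantization table, uses a naive (slow) DCT, and omits the final Huffman coding stage. All function names are hypothetical.

```python
import math

N = 8  # block size used in the steps above

def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block with orthonormal scaling."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

def quantize(coeffs, step):
    """Divide every coefficient by `step` and round to the nearest integer."""
    return [[round(value / step) for value in row] for row in coeffs]

def zigzag(block):
    """Read the block along anti-diagonals, alternating direction each time."""
    order = sorted(
        ((x, y) for x in range(N) for y in range(N)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
    )
    return [block[x][y] for x, y in order]

def run_length(values):
    """Encode nonzero values as (zeros_before, value) pairs, then an end marker."""
    pairs, zeros = [], 0
    for value in values:
        if value == 0:
            zeros += 1
        else:
            pairs.append((zeros, value))
            zeros = 0
    pairs.append("EOB")  # trailing zeros are implied by the end-of-block marker
    return pairs

flat = [[100] * N for _ in range(N)]          # a block with no detail at all
symbols = run_length(zigzag(quantize(dct_2d(flat), step=16)))
print(symbols)

index_block = [[N * x + y for y in range(N)] for x in range(N)]
print(zigzag(index_block)[:6])
```

A flat block puts all of its energy into the single DC coefficient, so the 64 samples collapse to one symbol plus the end-of-block marker. Blocks with more detail produce more nonzero coefficients and therefore more symbols.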
Temporal Compression
• Redundancy between successive frames is known as temporal redundancy; removing it is temporal compression
• Only the changes from one frame to the next are encoded, because a large number of pixels are often the same across a series of frames
• This type of compression relies on keyframes
• Keyframes store still images that are used for frame differencing

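Keyframes and frame differencing can be sketched as below. The example assumes frames are flat lists of pixel values and stores exact pixel changes, so it is lossless; practical codecs code a quantized difference instead. The names `encode`, `decode` and `keyframe_interval` are illustrative.

```python
def encode(frames, keyframe_interval=3):
    """Store a full frame every `keyframe_interval` frames, else only changed pixels."""
    stream = []
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0:
            stream.append(("key", list(frame)))
        else:
            previous = frames[i - 1]
            changes = [
                (pos, new)
                for pos, (old, new) in enumerate(zip(previous, frame))
                if old != new
            ]
            stream.append(("delta", changes))
    return stream

def decode(stream):
    """Rebuild every frame from keyframes and per-frame changes."""
    frames = []
    for kind, payload in stream:
        if kind == "key":
            frame = list(payload)
        else:
            frame = list(frames[-1])      # start from the previous decoded frame
            for pos, value in payload:
                frame[pos] = value
        frames.append(frame)
    return frames

frames = [[1, 1, 1, 1], [1, 1, 2, 1], [1, 1, 2, 3], [5, 5, 5, 5]]
stream = encode(frames)
print(stream)
print(decode(stream) == frames)
```

Each delta entry here carries a single changed pixel rather than the whole frame, which is where the saving comes from when successive frames are similar.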
Temporal Processing
Frame rates are usually high, so there is significant temporal redundancy.
Possible representations along the temporal dimension:
Transform/subband methods
• Good for the textbook case of constant-velocity, uniform global motion
• Inefficient for non-uniform motion, i.e. real-world motion
• Require a large number of frame stores
• Lead to delay (memory cost may also be an issue)
Predictive methods
• Good performance using only 2 frame stores
• However, simple frame differencing is not enough…

Video Coding and Frames
Goal: exploit the temporal redundancy by predicting the current frame from previously coded frames.
Three types of coded frames:
• I-frame: intra-coded frame, coded independently of all other frames
• P-frame: predictively coded frame, coded based on a previously coded frame
• B-frame: bi-directionally predicted frame, coded based on both previous and future coded frames

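Because a B-frame depends on a future frame, that future frame has to be coded first, so coding order differs from display order. The sketch below assumes the common arrangement in which each B-frame references the nearest I- or P-frame before and after it in display order; the function name `coding_order` and the frame pattern are illustrative.

```python
def coding_order(display_types):
    """Reorder frames so each B-frame follows both of its reference (I/P) frames."""
    order, pending_b = [], []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(index)       # wait until the next reference is coded
        else:
            order.append(index)           # code the reference frame first
            order.extend(pending_b)       # then the B-frames that needed it
            pending_b = []
    order.extend(pending_b)               # trailing B-frames with no future reference
    return order

display = list("IBBPBBP")
order = coding_order(display)
print(order)
print("".join(display[i] for i in order))
```

The reordering is also one reason B-frames add delay: the encoder must wait for the later reference frame before it can code the B-frames that precede it in display order.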
Motion Compensated Prediction
• Simple frame differencing fails when there is motion
• Motion must be accounted for: motion-compensated (MC) prediction
• MC-prediction generally provides significant improvements
Questions:
• How can we estimate motion?
• How can we form the MC-prediction?

Motion Estimation
Ideal situation:
• Partition the video into moving objects
• Describe the motion of each object
• Generally very difficult
Practical approach: Block-Matching Motion Estimation
• Partition each frame into blocks, e.g. 16x16 pixels
• Describe the motion of each block
• No object identification required
• Good, robust performance

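Block matching can be sketched as an exhaustive search that compares a block of the current frame with displaced blocks of a reference frame. The example below assumes the sum of absolute differences (SAD) as the matching cost, a small 4 x 4 block and a search range of 2 pixels to keep it short; the function names and the synthetic test frames are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def best_motion_vector(cur, ref, bx, by, size=4, search=2):
    """Full search over displacements within +/- `search` pixels.

    Returns (cost, dx, dy), where (dx, dy) points from the block position
    in the current frame to the best-matching block in the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= by + dy and by + dy + size <= height
                    and 0 <= bx + dx and bx + dx + size <= width):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# Synthetic 8x8 reference frame, and a current frame whose content
# has moved one pixel to the right.
ref = [[x * x + 10 * y for x in range(8)] for y in range(8)]
cur = [[ref[y][x - 1] if x >= 1 else 0 for x in range(8)] for y in range(8)]

result = best_motion_vector(cur, ref, bx=2, by=2)
print(result)
```

The search finds a perfect match one pixel to the left in the reference frame, which corresponds to content that moved one pixel to the right. Only that one vector and the (here zero) prediction error would need to be coded for the block.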
Block Matching ME Summary
Issues:
• Block size?
• Search range?
• Motion vector accuracy?
• Motion is typically estimated only from luminance
Advantages:
• Good, robust performance for compression
• Resulting motion vector field is easy to represent (one MV per block) and useful for compression
• Simple, periodic structure; easy VLSI implementations
Disadvantages:
• Assumes a translational motion model, which breaks down for more complex motion
• Often produces blocking artifacts (OK for coding with block DCT)

Basic VC Architecture
Exploiting the redundancies:
• Temporal: MC-prediction (P and B frames)
• Spatial: block DCT
• Spectral: color space conversion
Scalar quantization of DCT coefficients
Zigzag scanning, run-length and Huffman coding of the nonzero quantized DCT coefficients

Using Standards in Video Compression
Motivation for standards
• Ensuring interoperability: enabling communication between devices made by different manufacturers
• Promoting a technology or industry
• Reducing costs
What do the standards specify?
• Only the bitstream syntax and the decoding process (e.g. that the IDCT is used, but not how to implement the IDCT)
• This allows improved encoding and decoding strategies to be employed in a standard-compatible manner

Current Video Compression Standards
STANDARD | APPLICATION | BIT RATE
JPEG | Continuous-tone still-image compression | Variable
H.261 | Video telephony and teleconferencing over ISDN | p x 64 kb/s
MPEG-1 | Video on digital storage media (CD-ROM) | 1.5 Mb/s
MPEG-2 | Digital television | > 2 Mb/s
H.263 | Video telephony over PSTN | < 33.6 kb/s
MPEG-4 | Object-based coding, synthetic content, interactivity | Variable
H.264 | From low-bitrate coding to HD encoding, HD-DVD, surveillance, video conferencing | Variable

Video Coding Standardization Organizations
Two key organizations:
ITU-T Video Coding Experts Group (VCEG)
• International Telecommunication Union, Telecommunication Standardization Sector (ITU-T, a United Nations organization, formerly CCITT)
ISO/IEC Moving Picture Experts Group (MPEG)
• International Organization for Standardization and International Electrotechnical Commission

Dynamics of Video Compression Standardization
• VCEG is older and more focused on conventional (esp. low-delay) video coding goals (e.g. good compression and packet-loss/error resilience)
• MPEG is larger and takes on more ambitious goals (e.g. "object oriented video", "synthetic-natural hybrid coding", and digital cinema)
• Sometimes the major organizations team up (e.g. ISO, IEC and ITU teamed up for both MPEG-2 and JPEG)
• Relatively little industry consortium activity (DV, and organizations that tweak the video coding standards in minor ways, such as DVD, 3GPP, 3GPP2, SMPTE, IETF, etc.)
• Growing activity for internet streaming media outside of formal standardization (e.g. Microsoft, Real Networks, QuickTime)

MPEG-1
• The first lossy compression scheme developed by the MPEG committee
• Used for CD-ROM video compression and as part of early Windows Media players
• Uses the DCT algorithm
• Slow-moving content, e.g. talk shows, achieves greater compression
• Fast-moving content, e.g. sports channels, achieves less compression
• The widely popular MP3 audio format (MPEG-1 Audio Layer III, defined in Part 3 of the standard) is the audio compression portion of MPEG-1 and provides about 10:1 compression of audio files at reasonable quality

MPEG-2
• Evolved to meet the needs of compressing higher-quality video
• Used in today's video DVDs and digital broadcasts
• Bit rates typically range from 5 to 8 Mbits/s
• Uses DCT transforms, and also supports interlaced video
• Also the current standard for HDTV transmission
• Includes additional color subsampling, improved compression, error correction and multi-channel extensions for surround sound

MPEG-3
• The compression standard that never was
• Originally intended to support HD content
• It turned out that this could be done with minor changes to MPEG-2, so MPEG-3 never happened
• There are now profiles of MPEG-2 that support HDTV as well as Standard Definition Television (SDTV)

MPEG-4
Goal: solve two video transport problems
• Sending video over low-bandwidth channels
• Achieving better compression than MPEG-2 for broadcast signals
The MPEG committee designed MPEG-4 to be a single standard covering the entire digital media workflow, from capture, authoring and editing to encoding, distribution, playback and archiving.
Its file format is based on Apple Computer's QuickTime file format.
Used across a wide range of bit rates, from 64 Kbits/s to 1,800 Mbits/s.

H.264/AVC
• H.264/MPEG-4 AVC was developed jointly by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG), and is standardized by the ITU under the name H.264
• Uses techniques fairly different from those of MPEG-2 and can match the best MPEG-2 quality at up to half the data rate
• Delivers excellent video quality across the entire bandwidth spectrum, from 3G to HDTV and everything in between (from 40 Kbits/s to upwards of 10 Mbits/s)
• The design incorporates a Video Coding Layer (VCL) and a Network Abstraction Layer (NAL)

JPEG
• JPEG stands for Joint Photographic Experts Group
• Exploits the fact that the human eye will not notice small color changes in an image
• Works well on full-color or grayscale continuous-tone images, but is not a very good compression technique for line drawings or text

AVI
• Stands for Audio Video Interleave
• A sound and motion picture file format that conforms to the Microsoft Windows Resource Interchange File Format (RIFF) specification
• Video quality is good at smaller resolutions
• The only major drawback is that files tend to be large
• An .avi file can be played with Windows Media Player, RealPlayer, or the DivX player

.MOV
• .mov is the Apple QuickTime motion video file format, developed by Apple Computer for viewing moving images
• The file extension identifies an Apple QuickTime movie
• It is a way of storing sound, graphics and movie files

MJPEG
• Short for Motion JPEG
• Best suited for broadcast-resolution interlaced video, such as NTSC or PAL
• Each video field is separately compressed into a JPEG image
• Also used for short files, such as the short movies that can be made by a digital camera
• Not good for movies that are smaller than TV resolutions, and ill suited to progressive-scan computer monitors

DivX
• DivX is a software application that uses the MPEG-4 standard to compress digital video
• DivX Networks and the open source community are developing DivX jointly

Scope of Development
• The MPEG committee continues to add video and graphics standards, such as MPEG-7 and MPEG-21, to its standards efforts
• Digital video technology has become a necessity because of the increasing demand for video data in personal use as well as in the entertainment industry, the corporate world, government and defense
• Compression rates and the quality of data will continue to improve, providing more efficient use of bandwidth, storage and computing resources

References and Further Reading
• www.wikipedia.com
• http://www.videomaker.com/article/10842/
• http://www.h264encoder.com/
• www.scribd.com
• http://ivythesis.typepad.com/term_paper_topics/2009/08/video-compression-techniques.html
• http://drogo.cselt.stet.it/mpeg
• J.G. Apostolopoulos and S.J. Wee, "Video Compression Standards", Wiley Encyclopedia of Electrical and Electronics Engineering, John Wiley & Sons, Inc., New York, 1999.
• V. Bhaskaran and K. Konstantinides, Image and Video Compression Standards: Algorithms and Architectures, Boston, Massachusetts: Kluwer Academic Publishers, 1997.
• J.L. Mitchell, W.B. Pennebaker, C.E. Fogg, and D.J. LeGall, MPEG Video Compression Standard, New York: Chapman & Hall, 1997.