Dr. Mohieddin Moradi
mohieddinmoradi@gmail.com
1
Dream
Idea
Plan
Implementation
Section I
– ISO/IEC JTC 1/SC 29 Structure and MPEG
– ITU-T structure and VCEG (Video Coding Experts Group or Visual Coding Experts Group)
– A Generic Interframe Video Encoder
– H.261 Video Coding Standard
– MPEG-1 Video Coding Standard
– MPEG-2 Video Coding Standard
Section II
– MPEG-2 Transport and Program Streams
– H.263 Video Coding Standard
– H.263+ Video Coding Standard
– H.263++ Video Coding Standard
– Bit-rate (R) and Distortion (D) in Video Coding
2
Outline
3
[Diagram: a video source is compressed (encoded) into coded video, which is decompressed (decoded) for the video display. ENCODER + DECODER = CODEC]
− Created in 1992
• 300 members, >35 countries - www.dvb.org
• Promotion of open standards for Digital TV broadcasting
− Principal Recommendations
• Physical layer
− Satellite: DVB-S, DVB-S2
− Cable: DVB-C
− Terrestrial: DVB-T, DVB-T2
− Mobile: DVB-H, DVB-SH
• Signaling
− Service information: DVB-SI
− Service synchronization: DVB-SAD
• Protection
− DVB-CAS, DVB-CSA
− Smartcard interface: DVB-CI, DVB-CI+
4
DVB
[Diagram: DVB system chain; video, audio and data are MPEG-2 coded, scrambled (with a scrambling key) and multiplexed, then modulated for satellite (DVB-S, QPSK), cable (DVB-C, QAM) or terrestrial (DVB-T, COFDM) delivery; the receiver demodulates, demultiplexes and descrambles (with the descrambling key), then MPEG-2 decodes back to video, audio and data.]
DVB Systems
5
[Diagram: stream layering; a video or audio Elementary Stream (ES) is a continuous sequence of bits; it is packetized into PES packets (header carrying time stamps, plus payload) to form the Packetized Elementary Stream (PES); PES packets are in turn segmented into the 188-byte packets of the MPEG-2 Transport Stream (TS), whose headers carry the PID and clock.]
Rule: every elementary stream gets its own Packet ID (PID).
The MPEG Transport Stream
6
Processing of The Streams in The STB
[Diagram: set-top box processing; the tuner/demodulator (QPSK, QAM or OFDM, after A/D conversion) recovers an MPEG-2 TS of e.g. 40 Mbit/s carrying 6 TV services, 20 radio services and service information; the MPEG-2 demux filters the 188-byte packets by PID into queues and sections, feeding the video and audio decompression units, with a processor and system memory alongside.]
7
8
Digital Terrestrial TV - Layers
The layers provide clean interface points.
Picture layer: multiple picture formats and frame rates (1920 x 1080, 1280 x 720; 50, 25, 24 Hz).
Video compression layer: MPEG-2 compression syntax at MP@ML or MP@HL; data headers, motion vectors, chroma and luma DCT coefficients, variable-length codes.
Transport layer: MPEG-2 packets (video, audio and auxiliary data packets) with packet headers, for flexible delivery of data.
Transmission layer: 7 MHz VHF/UHF TV channel, COFDM / 8-VSB.
9
Digital Television Encode Layers
[Diagram: digital television encode layers; picture coding (MPEG-2), audio coding (MPEG-2 or AC-3) and data coding produce PES streams, which each program's multiplexer combines with control data and a Program Map Table (PMT); the service mux of the bouquet multiplexer combines Programs 1, 2 and 3 with other data, control data and the Program Association Table (PAT) into an MPEG transport stream of 188-byte packets; error protection is added and the result feeds the modulator & transmitter of the delivery system.]
10
Digital Television Decode Layers
[Diagram: digital television decode layers; the demodulator & receiver with error control recovers the MPEG transport stream from the delivery system; the transport stream de-multiplexer (MPEG demux) separates the streams, which feed the picture decoder (MPEG-2), audio decoder (MPEG or AC-3) and data decoder, driving the monitor and speakers.]
11
− MPEG-2 Container formats (a file format that can contain data compressed by standard codecs)
• TS: Transport Stream (Multiplexed A/V PES and User Data)
• PS: Program Stream
− PES: Packetized Elementary Stream, Audio or Video
− ES: Elementary Streams-Compressed Data
[Diagram: the video and audio encoders produce elementary streams (ES); each ES is packetized into a PES; the video and audio PES are combined either by the Program Stream MUX into a PS (for an error-free environment such as Digital Storage Media, DSM) or by the Transport Stream MUX into a TS (for noisier environments such as terrestrial broadcast channels).]
MPEG-2 Video System Standard
12
MPEG-2 Packetized Elementary Stream (PES)
Output from the MPEG-2 System encoder:
[Diagram: video frames are coded into a video ES of I/P/B pictures (display order I0 B1 B2 P3 B4 B5 P6 B7 B8, reordered into transmission order I0 P3 B1 B2 P6 ...), which the MPEG-2 System layer packetizes into a video PES; audio tracks are coded into audio frames (subband samples, side information, sync/system info and CRC, ancillary data field) forming an audio ES, likewise packetized into an audio PES.]
13
MPEG-2 Packetized Elementary Stream (PES)
MPEG-2 System Processor
Elementary Stream (ES):
- Digital control stream
- Digital audio (compressed)
- Digital video (compressed)
- Digital data
A PES packet has a 6-byte protocol header:
• 3-byte start code prefix
• 1-byte stream ID
– 110x xxxx: audio stream number x xxxx
– 1110 yyyy: video stream number yyyy
– 1111 0010: DSM-CC (Digital Storage Media Command and Control) control packet
• 2-byte length field
[Diagram: PES packet layout; packet start code prefix (24 bits), stream ID (8 bits), PES packet length (16 bits), optional PES header, payload. A PES packet is up to 65536 bytes including the 6-byte protocol header.]
14
PES Packet Syntax Diagram
PES packet: packet start code prefix (24 bits), stream ID (8), PES packet length (16), optional PES header, PES packet data bytes.
Optional PES header: '10' (2 bits), PES scrambling control (2), PES priority (1), data alignment indicator (1), copyright (1), original or copy (1), 7 flags (8), PES header length (8), optional fields, stuffing bytes (0xFF).
Optional fields: PTS and DTS (33 bits each), ESCR (42), ES rate (22), DSM trick mode (8), additional copy info (7), previous PES CRC (16), PES extension.
PES extension: 5 flags, then optional fields; PES private data (128 bits), pack header field (8), program packet sequence counter (16), P-STD buffer (16), PES extension length (7), PES extension data (m x 8).
Packetized Elementary Stream
• The basic stream format for video, audio, data, ...
• PES offers a mechanism to carry conditional access information.
• PES can be scrambled and also assigned a priority.
• PES can carry time references: PTS and DTS.
• The largest data size within a PES packet is 64 kbytes.
PES Indicators
• PES_Priority - indicates the priority of the current PES packet.
• PES_Scrambling_Control - defines whether scrambling is used, and the chosen scrambling method.
• Data_alignment_indicator - indicates if the payload starts with a video or audio start code.
• Copyright information - indicates if the payload is copyright protected.
• Original_or_copy - indicates if this is the original ES.
15
MPEG-2 Packetized Elementary Stream (PES)
PES Optional Field
− Presentation Time Stamp (PTS) and possibly a Decode Time Stamp (DTS)
• For audio/video streams, these time stamps may be used to synchronize a set of elementary
streams and control the rate at which they are replayed by the receiver.
− Elementary Stream Clock Reference (ESCR)
− Elementary Stream Rate - the rate at which the ES was encoded.
− Trick Mode - indicates the video/audio is not the normal ES, e.g. after DSM-CC has signaled a replay.
− Copyright Information - set to 1 to indicate a copyrighted ES.
− CRC - may be used to monitor errors in the previous PES packet.
− PES Extension Information - may be used to support MPEG-1 streams.
16
MPEG-2 Packetized Elementary Stream (PES)
The PES packet is the central structure used in both PS and TS streams; it results from packetizing continuous streams of compressed audio or video.
− PES packets may contain two timestamps:
1. Decoding Time Stamp (DTS) – tells the decoder when the packet should be decoded.
2. Presentation Time Stamp (PTS) – tells the decoder when the decoded data should be displayed.
− The systems part specifies that the decoder must contain a Systems Time Clock (STC).
• When the decoder's STC equals a packet's DTS, the data in the packet is decoded.
• When the STC equals a packet's PTS, the decoded data is sent to the display device (e.g. graphics card or sound card).
• The state of the encoder's clock is placed in the stream at regular intervals. This synchronises the decoder with the encoder.
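The STC/DTS/PTS relationship above can be sketched as follows; timestamps count ticks of the 90 kHz system clock, and the function names are illustrative, not from any standard API.

```python
# Sketch: decoder actions driven by the Systems Time Clock (STC).
# Timestamps (DTS, PTS) are 90 kHz tick counts, per MPEG-2 Systems.

STC_HZ = 90_000  # system clock frequency used for timestamps

def to_seconds(ticks: int) -> float:
    """Convert a 90 kHz timestamp to seconds."""
    return ticks / STC_HZ

def action(stc: int, dts: int, pts: int) -> str:
    # Hypothetical dispatcher: decode when STC reaches DTS,
    # present when STC reaches PTS.
    if stc >= pts:
        return "present"
    if stc >= dts:
        return "decode"
    return "wait"
```

For example, with DTS = 10 and PTS = 20, the packet is decoded when the STC reaches 10 and displayed when it reaches 20.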
17
MPEG-2 Packetized Elementary Stream (PES)
− Packetising the continuous streams of compressed video and audio bitstreams (elementary streams, or ES) generates PES packets.
− A typical method of transmitting elementary stream data from a video or audio encoder is first to create PES packets from the elementary stream data and then to encapsulate these PES packets inside Transport Stream (TS) packets or Program Stream (PS) packets.
− The TS packets can then be multiplexed and transmitted using broadcasting techniques, such as those used in ATSC and DVB.
− Simply stringing together PES packets from the various encoders, with other packets carrying the necessary data, into a single bitstream generates a programme stream.
− A transport stream consists of fixed-length packets containing 4 bytes of header followed by 184 bytes of data, where the data are obtained by segmenting the PES packets.
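The 4 + 184 byte segmentation above can be sketched as a toy muxer. Real multiplexers pad short final payloads via the adaptation field; this sketch simplifies that to 0xFF stuffing, and the function name is mine.

```python
# Toy sketch: segment one PES packet into 188-byte TS packets
# (4-byte header + 184-byte payload each).

TS_SIZE, HDR_SIZE = 188, 4
PAYLOAD_SIZE = TS_SIZE - HDR_SIZE  # 184 data bytes per packet

def segment_pes(pes: bytes, pid: int):
    packets = []
    for cc, off in enumerate(range(0, len(pes), PAYLOAD_SIZE)):
        chunk = pes[off:off + PAYLOAD_SIZE]
        pusi = 0x40 if off == 0 else 0x00  # payload unit start indicator
        header = bytes([0x47,                       # sync byte
                        pusi | (pid >> 8),          # flags + PID high bits
                        pid & 0xFF,                 # PID low byte
                        0x10 | (cc % 16)])          # payload only + CC
        # Simplification: pad the last chunk with stuffing bytes.
        packets.append(header + chunk.ljust(PAYLOAD_SIZE, b"\xff"))
    return packets

pkts = segment_pes(bytes(400), pid=0x101)  # 400-byte PES -> 3 TS packets
```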
18
MPEG-2 Packetized Elementary Stream (PES)
19
MPEG-2 Transport Stream (TS)
[Diagram: transport stream formation; the video subsystem (video compression) and audio subsystem (audio compression) produce ES, as do the ancillary and control data paths; each ES is packetized into a PES; the multiplexing subsystem's transport multiplexer combines the PES streams into a TS, which passes to the transmission subsystem (error-correction encoder and digital modulator).]
20
[Diagram: PES packets from Program 1 (Video 1, Audio 1) and Program 2 (Video 2) are segmented and interleaved into a single transport stream of 188-byte packets.]
MPEG-2 Transport Stream (TS) Formation
21
MPEG-2 Transport Stream
[Diagram: three programs, each with its own video encoder, audio encoder and data source feeding packetizers (Video_1/Audio_1/Data_1, Video_2/Audio_2/Data_2, Video_3/Audio_3/Data_3); per-program transport muxes feed the overall TRANSPORT MUX, which interleaves packets from all programs (e.g. TP1_1 TP2_1 TP1_2 TP2_2 TP3_1 TP1_3 TP2_3 TP3_2 TP3_3) into the transport stream.]
MPEG-2 Transport Stream (TS) Formation
22
23
MPEG-2 Transport Stream (TS) Packet
TS packets carry video, audio, teletext, (DVB) SI, conditional access data, IP packets, private data, applications and application info, combined by Time Division Multiplexing (TDM).
MPEG-2 packets can contain
− Video, audio, teletext, data streaming (13818-1)
− DSM-CC (Digital Storage Media Command and Control): data carousel, object carousel, SI tables, etc. (13818-6)
− DVB data piping
[Diagram: one 188-byte TS packet; sync (1 byte), header with PID (3 bytes), adaptation field (n bytes), payload of PES / section / piped data (184 − n bytes).]
The transport stream differs significantly from the MPEG-1 systems stream.
• It offers robustness for noisy channels.
• It offers the ability to assemble multiple programmes into a single stream.
• It uses fixed-length packets of 188 bytes with a new header syntax.
• A packet can be segmented into four 47-byte units to be accommodated in the payload of four ATM cells, with the AAL1 adaptation scheme.
• It is therefore more suitable for hardware processing and for error correction schemes, such as those required in television broadcasting, satellite/cable TV and ATM networks.
24
The MPEG Transport Stream
The transport stream uses a fixed packet length (188 bytes)
• This allows easy decoder/encoder synchronisation.
• It also allows error correction codes to be inserted.
Transport Streams can contain packets from a number of Programs
• These can be different TV channels or maybe an EPG.
• Each program has a unique Packet ID placed in the packet header.
• Decoder can discard packets of other programs by checking the PID.
25
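The fixed 188-byte packet and PID-based filtering described above can be sketched directly; field and function names here are illustrative.

```python
# Sketch: parse the 4-byte TS packet header and filter by PID.

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47

def parse_ts_header(pkt: bytes) -> dict:
    assert len(pkt) == TS_PACKET_SIZE and pkt[0] == SYNC_BYTE
    return {
        "tei": bool(pkt[1] & 0x80),           # transport error indicator
        "pusi": bool(pkt[1] & 0x40),          # payload unit start indicator
        "pid": ((pkt[1] & 0x1F) << 8) | pkt[2],  # 13-bit packet ID
        "cc": pkt[3] & 0x0F,                  # 4-bit continuity counter
    }

def filter_pid(packets, wanted_pid):
    # A decoder discards packets of other programs by checking the PID.
    return [p for p in packets if parse_ts_header(p)["pid"] == wanted_pid]

pkt_pat = bytes([0x47, 0x40, 0x00, 0x13]) + bytes(184)  # PID 0, PUSI set
pkt_vid = bytes([0x47, 0x01, 0xF5, 0x0B]) + bytes(184)  # PID 0x1F5 = 501
```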
The MPEG Transport Stream
− The multiple programmes with independent time bases can be multiplexed in one transport stream.
− The transport stream also allows
• Synchronous multiplexing of programmes
• Fast access to the desired programme for channel hopping
• Multiplexing of programmes with clocks unrelated to transport clock
• Correct synchronization of elementary streams for playback.
• Control of the decoder buffers during start-up and playback for both constant and variable bit rate
(VBR) programmes.
26
The MPEG Transport Stream
[Diagram: TS packet (188 bytes); sync (1 byte), header (3 bytes), payload (184 bytes) carrying PES 1, PES 2, ... PES N.]
TS header: sync byte (8 bits), transport error indicator (1), payload unit start indicator (1), transport priority (1), PID (13), scrambling control (2), adaptation field control (2), continuity counter (4), followed by the optional adaptation field.
Adaptation field: adaptation field length (8 bits); discontinuity indicator, random access indicator and ES priority indicator (1 bit each); 5 flags; then optional fields: PCR (42 bits), original PCR / OPCR (42), splice countdown (8), private data length (8) plus private data, adaptation field extension length (8) plus 3 flags and further optional fields.
MPEG-2 Transport Stream (TS) Packet
27
PID numbers for Program Specific Information (PSI) used for Service Information (SI):
0x0000: PAT, Program Association Table
0x0001: CAT, Conditional Access Table
0x0002: TSDT, Transport Stream Description Table
0x0003-0x000F: reserved
0x0010: NIT, ST; Network Information Table, Stuffing Table
0x0011: SDT, BAT, ST; Service Description Table, Bouquet Association Table, Stuffing Table
0x0012: EIT, ST; Event Information Table, Stuffing Table
0x0013: RST, ST; Running Status Table, Stuffing Table
0x0014: TDT, TOT, ST; Time and Date Table, Time Offset Table, Stuffing Table
0x0015: network synchronization
0x0016-0x001D: reserved for future use
0x001E: DIT, Discontinuity Information Table
0x001F: SIT, Selection Information Table
[Diagram: 188-byte TS packet; sync (1 byte), header (3 bytes) carrying the 13-bit PID (Packet Identifier), optional adaptation field, payload (184 bytes).]
MPEG-2 Transport Stream (TS) Packet
28
PID
− Indicates where the data goes
• Allows filtering of packets for programs not being viewed
− Does not indicate PES/section or coding type
− Reserved PIDs
• Some PSI data
• Program Association Table (PAT)
• Conditional Access Table (CAT)
• Transport Stream Description Table (TSDT)
• User-reserved: other standards bodies (DVB, ATSC, ...)
PSI
− Multiplex description
− Program description
− Stream description
Program ID (PID) and Program Service Information (PSI)
30
Program Association Table (PAT)
Program # 100 – PMT PID 1025
Program # 200 – PMT PID 1026
Program Map Table (PMT)
Program # 100
Video PID – 501 – MPEG-2 Video
Audio PID (English) – 502 – MPEG-2 Audio
Audio PID (Spanish) – 503 – MPEG-2 Audio
Program Map Table (PMT)
Program # 200
Video PID – 601 – AVC Video
Audio PID (English) – 602 – AAC Audio
MPEG-2 Signaling Tables
31
MPEG-2 Signaling Tables
Network Information
Bouquet Association
Service Description
Event Information
Running Status
Time & Date
Stuffing
32
MPEG-2 Signaling Tables
Program Association Table (PAT)
• Identifies a multiplex (ID 16 bits) (The PAT is sent with the well-known PID value of 0x0000)
• Lists all programs (Lists the PIDs of tables describing each program)
─ Program Number (16 bit)
─ PID carrying PMT
• A program number of 0 means the PID points to the NIT
Program Map Table (PMT)
• Defines the set of PIDs associated with a program, e.g. audio, video, ...
• PID carrying the PCR
─ Not always a media stream !
• Program Descriptors
─ Protection systems, interactive apps …
• Lists all streams
─ PID: where stream data is carried in the multiplex
─ streamType: type of media compression
─ Stream descriptors
• Language, coding parameters, demux parameters, ...
33
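The PAT's program_number to PMT-PID mapping above can be sketched with a simplified parser. This walks only the 4-byte-per-program loop of the PAT section body; the surrounding section header and CRC are deliberately ignored, and the function name is mine.

```python
# Simplified sketch: walk PAT entries (program_number -> PMT PID).
# Each entry: program_number (16 bits), 3 reserved bits, PID (13 bits).

def parse_pat_entries(body: bytes) -> dict:
    programs = {}
    for i in range(0, len(body), 4):
        program_number = (body[i] << 8) | body[i + 1]
        pid = ((body[i + 2] & 0x1F) << 8) | body[i + 3]
        if program_number == 0:
            programs["NIT"] = pid  # program 0 points to the NIT
        else:
            programs[program_number] = pid
    return programs

# Program 100 -> PMT PID 1025, program 200 -> PMT PID 1026,
# as in the PAT example on the earlier slide.
body = bytes([0x00, 0x64, 0xE4, 0x01, 0x00, 0xC8, 0xE4, 0x02])
```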
MPEG-2 Signaling Tables
[Diagram: a TS packet sequence with PIDs 51, 51, 51, 66, 64, 0, 150, 101; video packets (51), audio packets (64, 66), the Program Association Table (PAT, PID 0), a Program Map Table (PMT, PID 150) and other packets (101).]
CAT - Conditional Access Table
− Defines the type of scrambling used and the PID
values of transport stream packets which contain
the conditional access management and
entitlement management messages (EMMs).
TSDT - Transport Stream Description Table
− Contains descriptors relating to the overall
transport stream.
34
MPEG-2 Signaling Tables
NIT - Network Information Table
− It contains details of the bearer network (network
topology) used to transmit the MPEG multiplex,
including the carrier frequency
Service Description Table (SDT)
− Multiplex Description (channel names, …)
− Editorial description of the services in a TS
− Service names and ancillary services
Event Information Table (EIT)
− Electronic Program Guide for present and
following shows
Time and Date Table (TDT)
− Current date and time, UTC (used to synchronize
STB system time)
35
MPEG-2 Signaling Tables (DVB, Mandatory)
Bouquet Association Table (BAT)
− Commercial operator description and services
− Several commercial operators may sell the same
services
Running Status Table (RST)
Stuffing Table (ST)
Time Offset Table (TOT)
− Local offset by region (used to synchronize STB
system time)
Application Information Table (AIT)
− Interactive application signaling (MHP, HbbTV, ...)
− Application type
IP/MAC Notification Table (INT)
− IP transport
36
MPEG-2 Signaling Tables (DVB, Optional)
− Scrambling may happen:
• At the PES payload level
• At the payload level of some sections
• At the TS packet level
− Most common use case
− PES payloads are scrambled
− Exceptions (left in the clear)
• PAT: required to get the list of programs
• PMT: required to get the protection system used
• NIT/TSDT (Transport Stream Description Table): infrastructure management
37
Scrambling in MPEG-2 TS
AV Synchronization
− Want audio and video streams to be played back in sync with each other
− Video stream contains “Presentation Time Stamps (PTS)”
− MPEG-2 clock runs at 90 kHz
• Good for both 25 and 30 fps
− Each program carries a clock
• Program Clock Reference (PCR)
– PCR timestamps are sent with data by sender
• PES Timestamps relate to this clock
− Receiver uses PLL to synchronize clocks
38
MPEG-2 TS Timing
PCR fields (carried in the adaptation field, 42 + 42 bits):
− PCR base (33 bits): the intended time, in 90 kHz clock units, of the arrival of the fourth byte of this structure at the input of the decoder.
− PCR extension (9 bits): additional resolution in 27 MHz clock units; PCR = 300 * base + extension.
− Original PCR (OPCR) base and extension: should not be modified by any multiplexer or decoder; used for recovery of a single-program PCR from a multi-program transport stream.
PCR (Program Clock Reference)
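The PCR arithmetic above is a two-line computation; the helper names here are mine.

```python
# Sketch: PCR = 300 * base + extension, counted on a 27 MHz clock.
# The 33-bit base alone is the 90 kHz part (27 MHz / 300 = 90 kHz).

def pcr_value(base: int, ext: int) -> int:
    return 300 * base + ext  # total 27 MHz ticks

def pcr_seconds(base: int, ext: int) -> float:
    return pcr_value(base, ext) / 27_000_000
```

One second of wall-clock time corresponds to 90 000 base ticks, i.e. 27 000 000 combined 27 MHz ticks.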
39
The packet header includes a unique Packet ID (PID) for each stream.
The PAT (PID 0) lists the PIDs of the program map tables:
− Network info = 10
− Prog 1 = 150
− Prog 2 = 301
− Prog 3 = 511
− etc.
A PMT lists the PIDs associated with a particular program:
− Video = 51
− Audio (English) = 64
− Audio (French) = 66
− Subtitle = 101
− etc.
Other packets can carry program guides, subtitles, multimedia data, Internet packets, etc.
[Diagram: a TS packet sequence with PIDs 51, 51, 51, 66, 64, 0, 150, 101; video and audio packets interleaved with the PAT, a PMT and other packets.]
MPEG-2 Signaling Tables
40
MPEG-2 Example Transport Stream Packet
41
Example Transport Stream Packet
188 Bytes
Header
Flags
• Transport Error Indicator
• Payload Unit Start Indicator
• Transport Priority
• Transport Scrambling Control
Important PIDs
• 0x0000 – PAT PID
• 0x1FFF – “Null PID” gives space
for VBR
Continuity Counter (CC)
• 4-bit per-PID sequence #
• Helps detect packet loss
Adaptation Field (optional)
• Can carry range of other info
• PCR, splice point flags
• Transport of private data
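The continuity counter check above can be sketched as follows. This is a simplification: it ignores duplicate packets and packets without payload, which the real rules allow, and the function name is mine.

```python
# Sketch: the CC is a 4-bit per-PID sequence number, so a gap in the
# sequence (modulo 16) signals lost packets.

def count_lost(cc_sequence):
    lost = 0
    for prev, cur in zip(cc_sequence, cc_sequence[1:]):
        lost += (cur - prev - 1) % 16  # expected next value is prev + 1
    return lost
```

For example, the sequence 14, 15, 0, 1 wraps cleanly (no loss), while jumping from 15 straight to 2 implies two missing packets.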
Example Transport Stream
[Diagram: example transport stream; each packet is 0x47 sync, flags, PID, more flags, continuity counter (CC), adaptation field, data payload. Example packet sequence: PID 0 (PAT data, CC 3), PID 601 (CC 11), PID 602 (CC 7), PID 0x1FFF (null), PID 601 (CC 12), PID 602 (CC 8).]
MPEG-2/DVB PID Allocation
− Program Association Table (PAT)
• always has PID = 0 (zero)
− Conditional Access Table (CAT)
• always has PID = 1
− Event Information Table (EIT)
• always has PID = 18 (0x0012)
− Program Map Tables (PMTs)
• have the PIDs specified in the PAT
− The audio, video, PCR, subtitle, teletext etc PIDs for all
programs are specified in their respective PMTs
MPEG-2/DVB PID Allocation
Table: PID value
PAT: 0x0000
CAT: 0x0001
TSDT: 0x0002
Reserved: 0x0003 - 0x000F
NIT, ST: 0x0010
SDT, BAT, ST: 0x0011
EIT, ST: 0x0012
RST, ST: 0x0013
TDT, TOT, ST: 0x0014
Network synchronization: 0x0015
Reserved: 0x0016 - 0x001B
Inband signaling: 0x001C
Measurement: 0x001D
DIT: 0x001E
SIT: 0x001F
42
Increase resilience to transmission errors
− Redundancy
− Reed-Solomon 255/191, 25% redundancy
− Each RS column is sent in a section
− FEC aggregation is in another table
• Can be ignored
• Does not interfere with MPE
Without modifying existing implementations
− No modification to MPE (Multi-Protocol Encapsulation) sections
• Each MPE+IP datagram in a section
• Aggregation of IP datagrams in memory
43
DVB MPE-FEC
44
Data over DVB
− Data piping
• Raw transport on a PID
− Data streaming
• Sent in PES packets
− DSM-CC data carousel
• Transport in sections
− Object carousel
• Data carousel + file system
− Multi-Protocol Encapsulation (MPE)
• IP datagrams over TS
[Diagram: PSI structure of a multiple-program MPEG-2 transport stream]
PAT (Program Association Table, PID = 0, table section ID always 0x00): Program 0 at PID 16, Program 1 at PID 22, Program 2 at PID 33, ..., Program M at PID 55. Program 0 always points to the NIT.
PMT (Program Map Table, table section ID always 0x02); for Program 1: Stream 1 PCR on PID 31, Stream 2 Video 1 on PID 54, Stream 3 Audio 1 on PID 48 and Audio 2 on PID 49, ..., Stream k Data K on PID 66; for Program 2: PCR 41, Video 1 on 19, Audio 1 on 81, Audio 2 on 82, ..., Data K on 88.
CAT (Conditional Access Table, PID = 1, table section ID always 0x01): CA sections giving the EMM PID per program (Program 1: 99, Program 2: 109, Program 3: 119, ..., Program k: x).
NIT (Network Information Table, always Program 0, PID = 16; table section ID assigned by the system): private sections of NIT information; the NIT is considered private data by ISO.
Example packet order in the multiplex: PID 0 (PAT), 22 (Prog 1 PMT), 33 (Prog 2 PMT), 99 (Prog 1 EMM), 31 (Prog 1 PCR), 48 (Prog 1 Audio 1), 54 (Prog 1 Video 1), 109 (Prog 2 EMM).
MPEG-2 / DVB PSI (Program Specific Information) Structure
46
Transport Multiplexing & Decoding
[Diagram: a channel-specific decoder feeds the transport stream demultiplexer and decoder, which, with clock control, drives the video and audio decoders to produce decoded video and audio; the transport stream may carry one or multiple programs. Program Stream ≠ Transport Stream.]
47
Transport Stream Decoder
[Diagram: transport stream decoder buffer model; each elementary stream passes through a transport buffer, a multiplex buffer and an ES stream buffer before its decoder: the video decoder (with a re-order buffer) produces decoded video, the audio decoder produces decoded audio, and a system information decoder feeds the system control.]
Transport Stream Decoder
− At the receiver, the transport streams are decoded by a transport demultiplexer (which includes a clock
extraction mechanism), unpacketised by a depacketiser and sent to audio and video decoders for
decoding.
− The decoded signals are sent to the receiver buffer and presentation unit, which outputs them to a display
device and a speaker at the appropriate time.
− Similarly, if the programme streams are used, they are decoded by the programme stream demultiplexer
and depacketiser and sent to the audio and video decoders.
− The decoded signals are sent to the respective buffer to await presentation.
− Also similar to MPEG-1 systems, the information about systems timing is carried by the clock reference field
in the bitstream that is used to synchronise the decoder Systems Time Clock (STC).
− Presentation Time Stamps (PTS), which are also carried by the bitstream, control the presentation of the
decoded output.
48
Transport Stream Decoder
− For a payload of around 19 Mbit/s:
• 1 HDTV service (sport & high action)
• 2 HDTV services (both film material)
• 1 HDTV + 1 or 2 SDTV (non action/sport)
• 3 SDTV for high action & sport video
• 6 SDTV for film, news & soap operas
• However, you do not get more for nothing:
− More services means less quality
49
Examples of DVB Data Containers
[Diagram: example allocations; a single HDTV program; multiple SDTV programs (SDTV 1 to SDTV 5); or simulcast HDTV & SDTV (HDTV 1 + SDTV 1).]
Channel bandwidth can be used in different ways
50
− MPEG-2 container formats (a file format that can contain data compressed by standard codecs)
• TS: Transport Stream (multiplexed A/V PES and user data), for noisier environments such as terrestrial broadcast channels
• PS: Program Stream, for an error-free environment such as Digital Storage Media (DSM)
− PES: Packetized Elementary Stream, audio or video
− ES: Elementary Stream, compressed data
MPEG-2 Video System Standard
51
Program Stream Structure (Simplified)
Program Stream (PS)
− It is similar to the MPEG-1 systems stream but uses a modified syntax and new functions to support advanced functionalities (e.g. scalability).
− It provides compatibility with MPEG-1 systems (an MPEG-2 decoder should be capable of decoding an MPEG-1 bitstream).
− Like the MPEG-1 decoder, programme stream decoders typically employ long and variable-length packets. Such packets are well suited to software-based processing and error-free transmission environments (such as storage on disk).
− The packet sizes are usually 1-2 kbytes long, chosen to match the disc sector sizes (typically 2 kbytes).
− However, packet sizes as long as 64 kbytes are also supported.
52
MPEG-2 Systems
53
MPEG-2 Systems
Program Stream (PS)
− It includes features not supported by MPEG-1 systems.
• Scrambling of data
• Assignment of different priorities to packets
• Information to assist alignment of elementary stream packets
• Indication of copyright
• Indication of fast forward, fast reverse and other trick modes for storage devices.
• An optional field in the packets is provided for testing the network performance
• Optional numbering of a sequence of packets is used to detect lost packets.
54
− H.263 standardization effort started Nov 1993 (finalization:1995)
− The primary goal in the H.263 standard codec was coding of video at low or very low bit rates (less than 64
kbps) for applications such as mobile networks, public switched telephone network (PSTN) and the
narrowband Integrated Services Digital Network (ISDN).
− Later on, the codec was found so attractive that higher resolution pictures could also be coded at
relatively low bit rates.
− The standard recommends operation on five standard picture formats of the CIF family, known as sub-QCIF,
QCIF, CIF, 4CIF and 16CIF.
− The H.263+ (H.263 Ver. 2) was the first set of extensions to this family, which was intended for near-term
standardisation of enhancements of H.263 video coding algorithms for real-time telecommunications.
− Work on improving the encoding performance was an ongoing process under H.263++ (H.263 Ver. 3), and
every now and then a new extension called annex was added to the family.
55
H.263, H.263+ and H.263++ Standard
− The codec for long-term standardisation was called H.26L.
− The H.26L project had the mandate from ITU-T to develop a very low bit rate (less than 64 kbit/s with emphasis on
less than 24 kbit/s) video coding recommendation achieving
• Better Video Quality
• Lower Delay
• Lower Complexity
• Better Error Resilience
− In 2001, MPEG-4 committee joined the project in investigating new video coding techniques and technologies as
candidates for recommendation.
− The joint team eventually recommended the Joint Video Team (JVT) Codec which is informally known as
Advanced Video Coding (AVC).
− The standard is formally known as H.264 by the ITU-T and MPEG-4 part 10 by ISO/IEC.
56
H.26L Standard
− H.263 is a combination of H.261 and MPEG.
− H.261 only accepts the QCIF and CIF formats → various picture formats such as sub-QCIF, 4CIF, etc.
− No 1/2-pel motion estimation in H.261, which instead uses a spatial loop filter → half-pel motion compensation.
− H.261 does not use median predictors for motion vectors but simply uses the motion vector of the MB to
the left as the predictor.
− In H.263 there are four negotiable options.
− H.261 does not use a 3-D VLC for transform coefficient coding → 3-D VLC for transform coefficients.
− GOB headers are mandatory in H.261; they are optional in H.263.
− Quantizer changes at MB granularity require 5 bits in H.261 and only 2 bits in H.263.
− No loop filter in H.263.
− No macroblock addressing in H.263 (included in the MB header).
57
H.263 Improvements over H.261
Unrestricted Motion Vector Mode (Annex D)
–MVs are allowed to point outside (outside pixels obtained from boundary repetition extension)
–Larger ranges: [-31.5, 31.5] instead of [-16, 15.5]
Syntax-Based Arithmetic Coding Mode (Annex E)
–Provide about 5% bit rate reduction and rarely used
Advanced Prediction Mode (Annex F)
–Allow 4 motion vectors per MB, one for each 8x8 block
–Overlapped block motion compensation for luminance
–Allow MVs point outside of picture (Motion vectors can now point to outside of picture).
–Reduce blocking artifacts and increase subjective picture quality.
PB-Frames Mode (Annex G)
–Double the frame rate without significant increase in bit rate
Usage:
– The decoder signals the encoder which of the options it has the capability to decode.
– If the encoder supports some of these options, it may enable them.
58
Negotiable Options in H.263
[Demo: H.261 vs. H.263, QCIF, 8 fps @ 28 kbit/s]
59
Composed of a baseline plus four negotiable options
60
ITU-T Recommendation H.263
Baseline Codec
Unrestricted/Extended Motion Vector Mode
Advanced Prediction Mode
PB Frames Mode
Syntax-based Arithmetic
Coding Mode
Always 12:11 pixel aspect ratio.
61
Frame Formats
Format Y U,V
SQCIF 128x96 64x48
QCIF 176x144 88x72
CIF 352x288 176x144
4CIF 704x576 352x288
16CIF 1408x1152 704x576
[Diagram: a 352 x 288 CIF picture with a 12:11 pixel aspect ratio yields a 4:3 picture aspect ratio.]
Picture & Macroblock Types
− Two picture types:
• Intra (I-frame) implies no temporal prediction is performed.
• Inter (P-frame) may employ temporal prediction.
− Macroblock (MB) types:
• Intra & Inter MB types (even in P-frames)
– Inter MBs have shorter symbols in P-frames
– Intra MBs have shorter symbols in I-frames
• Not-coded MB type: the MB data is copied from the previously decoded frame.
62
H.263 Baseline
Motion Vectors
− Motion vectors have 1/2 pixel granularity.
− Reference frames must be interpolated by two.
− MV’s are not coded directly, but rather a median predictor is used.
− The predictor residual is then coded using a VLC table.
63
H.263 Baseline
MV_X = median(MV_A, MV_B, MV_C), computed per component, where X is the current block, A is the block to its left, B the block above and C the block above-right.
Motion Vector Delta (MVD) Symbol Lengths
64
H.263 Baseline
[Chart: MVD code length in bits vs. |MVD|; the code length grows from about 1 bit at MVD = 0 to about 13 bits for |MVD| in the range 12.5 - 15.5.]
Transform Coefficient Coding
− Assign a variable length code according to three parameters (3-D VLC):
1) Length of the run of zeros preceding the current nonzero coefficient.
2) Amplitude of the current coefficient.
3) Indication of whether current coefficient is the last one in the block.
− The most common are variable length coded (3-13 bits), the rest are coded with escape sequences (22
bits)
65
H.263 Baseline
Quantization
− H.263 uses a scalar quantizer with center clipping.
− Quantizer varies from 2 to 62, by 2’s.
− Can be varied ±1, ±2 at macroblock boundaries (2 bits)
− Can be varied 2-62 at row and picture boundaries (5 bits).
66
H.263 Baseline
[Diagram: transfer characteristic (IN vs. OUT) of the center-clipping quantizer; a dead zone around zero, with output levels at Q, 2Q, ... and -Q, -2Q, ... beyond it.]
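A toy center-clipping (dead-zone) quantizer in the spirit of the characteristic above; the exact H.263 quantization and reconstruction formulas differ in detail, so treat this as an illustration rather than the standard's definition.

```python
# Sketch: scalar quantizer with a dead zone (center clipping).

def center_clip_quantize(x: int, Q: int) -> int:
    if abs(x) < Q:       # dead zone: small inputs map to level 0
        return 0
    return int(x / Q)    # truncation toward zero

def reconstruct(level: int, Q: int) -> int:
    return level * Q     # simple uniform reconstruction
```

The dead zone suppresses small (mostly noise) coefficients, which is where much of the bit-rate saving comes from.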
Bit Stream Syntax
67
H.263 Baseline
Hierarchy of three layers.
Picture Layer
GOB* Layer
MB Layer
*A GOB is usually a row of macroblocks, except
for frame sizes greater than CIF.
Picture Hdr GOB Hdr MB MB ... GOB Hdr ...
Picture Layer Concepts
− PSC - sequence of bits that can not be emulated anywhere else in the bit stream.
− TR - 29.97 Hz counter indicating time reference for a picture.
− PType - Denotes Intra, Inter-coded, etc.
− P-Quant - Indicates which quantizer (2…62) is used initially for the picture.
68
H.263 Baseline
Picture Start
Code
Temporal
Reference
Picture
Type
Picture
Quant
GOB Layer Concepts, GOB Headers are Optional
− GSC - Another unique start code (17 bits).
− GOB Number - Indicates which GOB, counting vertically from the top (5 bits).
− GOB Quant - Indicates which quantizer (2…62) is used for this GOB (5 bits).
GOB can be decoded independently from the rest of the frame
69
H.263 Baseline
GOB Start
Code
GOB
Number
GOB
Quant
Macroblock Layer Concepts
− COD - if set, indicates empty Inter MB.
− MB Type - indicates Inter, Intra, whether MV is present, etc.
− CBP - indicates which blocks, if any, are empty.
− DQuant - indicates a quantizer change by +/- 2, 4.
− MV Deltas - are the MV prediction residuals.
− Transform coefficients - are the 3-D VLC’s for the coefficients.
70
H.263 Baseline
Coded
Flag
MB
Type
Code Block
Pattern
MV
Deltas
Transform
Coefficients
DQuant
8x8 pixel blocks
macroblock
Y
Cb Cr
Deblocking Filter
71
H.263 Options
No Filter Deblocking Loop Filter
Unrestricted/Extended Motion Vector Mode (UMV Mode)
1. Motion Vectors Over Picture Boundaries
− UMV dramatically improves motion estimation when moving objects are entering/exiting the frame or moving around the frame border.
− Motion vectors are permitted to point outside the picture boundaries
– Non-existent pixels are created by replicating the edge pixels (when a pixel referred to by a motion vector lies outside the coded area, the last full pixel inside the coded picture area is used).
– Motion vector restricted such that no pixel of 16x16 (or 8x8) block shall
have horizontal or vertical distance more than 15 pixels outside of
picture.
– Improves compression when there is movement across the edge of a
picture boundary or when there is camera panning.
72
H.263 Options
Reference Frame N−1    Target Frame N
Edge pixels are repeated.
Unrestricted/Extended Motion Vector Mode
2. Extended MV Range
− To extend the range of the motion
vectors from [-16,15.5] to [-31.5,31.5] with
some restrictions.
− This better addresses high motion scenes.
73
H.263 Options
Base motion vector range: [−16, 15.5] in each dimension.
Extended motion vector range: up to (±31.5, ±31.5), restricted to [−16, 15.5] around the MV predictor.
Advanced Prediction Mode
− The motion compensation in the core H.263 is based on one motion vector per macroblock of 16×16 pixels,
with half-pixel precision.
− The macroblock motion vector is then differentially coded with predictions taken from three surrounding
macroblocks, as indicated in Figure.
74
H.263 Options
MV: Current Motion Vector
MV1: Previous Motion Vector
MV2: Above Motion Vector
MV3: Above Right Motion Vector
MV2 MV3
MV1 MV
Advanced Prediction Mode
− The predictors are calculated separately for the horizontal and vertical components of the motion vectors,
MV1, MV2 and MV3.
− For each component, the predictor is the median value of the three candidate predictors for this component: MV = median(MV1, MV2, MV3).
− The difference between the components of the current motion vector and their predictions is variable
length coded. The vector differences are defined by
75
H.263 Options
Advanced Prediction Mode
− In the special cases, at the borders of the current group of blocks (GOB) or picture, the following decision
rules are applied in order:
• The candidate predictor MV1 is set to zero if the corresponding macroblock is outside the picture at the left side .
• The candidate predictors MV2 and MV3 are set to MV1 if the corresponding macroblocks are outside the picture at
the top, or if the GOB header of the current GOB is nonempty.
• The candidate predictor MV3 is set to zero if the corresponding macroblock is outside the picture at the right side.
• When the corresponding macroblock is intra coded or was not coded, the candidate predictor is set to zero.
− Like unrestricted motion vector mode, motion vectors can refer to the area outside the picture
76
H.263 Options
MV: Current Motion Vector
MV1: Previous Motion Vector
MV2: Above Motion Vector
MV3: Above Right Motion Vector
[Figure: motion vector prediction at a picture or GOB border — a candidate predictor that falls outside is replaced by (0, 0) or by MV1 according to the rules above.]
Advanced Prediction Mode
− Includes motion vectors across picture boundaries from the previous mode.
− Option of using four motion vectors for 8x8 blocks instead of one motion vector for 16x16 blocks as in
baseline.
• In H.263, one motion vector per macroblock is used except in the advanced prediction mode, where
either one (four vectors with the same value) or four motion vectors per macroblock are employed.
• When there are four motion vectors, the information for the first motion vector is transmitted as the
code word motion vector data (MVD), and the information for the three additional vectors in the
macroblock is transmitted as the code word MVD2–4.
− Overlapped motion compensation to reduce blocking artifacts.
77
H.263 Options
Four motion vectors for 8x8 blocks instead of one motion vector for 16x16 blocks.
− The vectors are obtained by adding predictors to the vector differences indicated by MVD and MVD2–4,
as was the case when only one motion vector per macroblock was present.
− The predictors are calculated separately for the horizontal and vertical components.
− However, the candidate predictors MV1, MV2 and MV3 are redefined as indicated in Figure.
− The neighbouring 8×8 blocks that form the candidates for the prediction of the motion vector MV take
different forms depending on the position of the block in the macroblock.
78
H.263 Options
• Redefinition of the candidate predictors MV1, MV2 and MV3
for each luminance block in a macroblock.
• Motion vector prediction for 8×8 blocks uses three surrounding block motion vectors.
[Figure: the positions of MV1, MV2 and MV3 relative to the vector MV being predicted, for each of the four luminance blocks in the macroblock.]
Overlapped Motion Compensation (OBMC)
− In normal motion compensation, the current block is composed of
• The predicted block from the previous frame (referenced
by the motion vectors)
• The residual data transmitted in the bit stream for the
current block.
− Overlapped motion compensation is only used for the 8×8
luminance blocks.
− Each pixel in an 8×8 luminance prediction block is the weighted
sum of three prediction values, divided by 8 (with rounding).
79
H.263 Options
Reference frame
Current MB
Overlapped Motion Compensation (OBMC)
− To obtain the prediction values, three motion vectors are
used. They are the motion vector of the current luminance
block and two out of four remote vectors, as follows:
• the motion vector of the block at the left or right
side of the current luminance block;
• the motion vector of the block above or below the
current luminance block.
80
H.263 Options
Overlapped Motion Compensation (OBMC)
− Let (m, n) be the column and row indices of an 8×8 pixel block in a frame.
− Let (i, j) be the column and row indices of a pixel within an 8×8 block.
− Let (x, y) be the column and row indices of a pixel within the entire frame:

(x, y) = (8m + i, 8n + j)
81
H.263 Options
[Figure: an 8×8 pixel block B within the frame, indexed by block coordinates (m, n), within-block pixel coordinates (i, j) and frame pixel coordinates (x, y).]
Overlapped Motion Compensation (OBMC)
• Let (MV0x, MV0y) denote the motion vector for the current block.
• Let (MV1x, MV1y) denote the motion vector for the block above (below) if the current pixel is in the top (bottom) half of the current block.
• Let (MV2x, MV2y) denote the motion vector for the block to the left (right) if the current pixel is in the left (right) half of the current block.
82
H.263 Options
[Figure: the current block with vector MV0; MV1 is taken from the block above or below, and MV2 from the block to the left or right, depending on which half of the block the pixel lies in.]
Overlapped Motion Compensation (OBMC)
• The creation of each interpolated (overlapped) pixel, p(i, j), in an 8×8 luminance prediction block is governed by

p(i, j) = (q(x, y)·H0(i, j) + r(x, y)·H1(i, j) + s(x, y)·H2(i, j) + 4) / 8

− where, with p̃ the previous decoded picture,

q(x, y) = p̃(x + MV0x, y + MV0y)
r(x, y) = p̃(x + MV1x, y + MV1y)
s(x, y) = p̃(x + MV2x, y + MV2y)
83
H.263 Options
H0(i, j) — weighting values for prediction with the motion vector of the current block:

4 5 5 5 5 5 5 4
5 5 5 5 5 5 5 5
5 5 6 6 6 6 5 5
5 5 6 6 6 6 5 5
5 5 6 6 6 6 5 5
5 5 6 6 6 6 5 5
5 5 5 5 5 5 5 5
4 5 5 5 5 5 5 4

H1(i, j) — weighting values for prediction with the motion vectors of the luminance blocks on top of (top half) or below (bottom half) the current luminance block:

2 2 2 2 2 2 2 2
1 1 2 2 2 2 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 2 2 2 2 1 1
2 2 2 2 2 2 2 2

H2(i, j) = (H1(i, j))ᵀ — weighting values for prediction with the motion vectors of the luminance blocks to the left (left half) or right (right half) of the current luminance block. At every pixel position the three weights sum to 8.

84
Overlapped Motion Compensation (OBMC)
The neighbouring pixels closer to the pixels in the current block take greater weights.
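The weighted blend for a single pixel can be sketched as follows (illustrative; q, r and s stand for the three motion-compensated prediction values, and h0, h1, h2 for the corresponding weights, which sum to 8):

```python
def obmc_pixel(q, r, s, h0, h1, h2):
    """OBMC blend of three predictions for one luminance pixel:
    q uses the current block's MV, r the above/below block's MV,
    s the left/right block's MV; division by 8 with rounding."""
    return (q * h0 + r * h1 + s * h2 + 4) // 8

# At the centre of the block the weights are 6, 1, 1:
p = obmc_pixel(100, 120, 80, 6, 1, 1)
# (600 + 120 + 80 + 4) // 8 == 100
```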
H.263 Options
PB Frames Mode
− A PB frame consists of two P- and B-pictures coded as one unit (coded together) (a P frame as in
baseline, and a B frame)
− The P-picture is predicted from the last decoded P-picture, and the B-picture is predicted from both
the last decoded P-picture and the P-picture currently being decoded (The prediction process is
illustrated in Figure).
− Can increase frame rate 2X with only about 30% increase in bit rate (because of B-frame).
− Since in the PB frames mode a unit of coding is a combined macroblock from P- and B-pictures, the
composite macroblock comprises 12 blocks.
− First the data for the six P-blocks are transmitted as the default H.263 mode, and then the data for the
six B-blocks.
− The composite macroblock may have various combinations of coding status for the P- and B-blocks,
which are dictated by the MCBPC.
85
H.263 Options
Best match
Forward Motion Vector
Macroblock to be coded
Previous reference picture
Current B-picture
Future reference picture
Best match
Backward Motion Vector
86
Forward Motion Vector and Backward Motion Vector, Recall
Forward Prediction
Backward Prediction
PB Frames Mode
87
H.263 Options
Restriction: the backward predictor cannot extend outside the current MB position of the future frame.
Picture 1
P Frame
(decoded P-picture) Picture 2
B Frame
Picture 3
P Frame
(current P-picture)
V 1/2 -V 1/2
PB
Forward Motion Vector Backward Motion Vector
Forward Prediction Backward Prediction
Forward Prediction
P B P
PB frame
TRB
TRD
MVF = (TRB × MV) / TRD + MVD

MVB = ((TRB − TRD) × MV) / TRD    if MVD is equal to 0
MVB = MVF − MV                    if MVD is not equal to 0
H.263 Options
PB Frames Mode
P-picture is predicted from the previous decoded P-picture
B-picture is predicted both from the previous decoded P-picture and the P picture currently being decoded.
(MVF: forward motion vector; MVB: backward motion vector; MV: vector of the P-macroblock)
Assume MVD is the delta vector component given by the motion vector data of the B-picture (MVDB) and corresponds to the vector component MV.
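The forward/backward vector derivation can be sketched in integer arithmetic (illustrative only — the standard's exact rounding of half-pel units differs slightly; vectors here are plain integers):

```python
def pb_vectors(mv, mvd, trb, trd):
    """Derive forward (MVF) and backward (MVB) vectors for the
    B part of a PB frame from the P-macroblock vector MV
    (one component). trb: time from the previous P-picture to
    the B-picture; trd: time between the two P-pictures."""
    mvf = (trb * mv) // trd + mvd
    if mvd == 0:
        mvb = ((trb - trd) * mv) // trd   # pure scaling
    else:
        mvb = mvf - mv                    # delta-adjusted case
    return mvf, mvb

# B-picture midway between P-pictures (trb=1, trd=2), mv=6:
# mvd=0 gives (3, -3); mvd=1 gives (4, -2)
```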
Forward and Bi-directional Prediction in B-block
Part of the block that is predicted bidirectionally
part that uses only forward prediction
FWD: Forward prediction
BID: Bidirectional prediction
P-Macroblock
BID
FWD
B-block
88
Improved PB frames (BPB)
− This mode is an improved version of the optional PB frames mode of H.263 [22-M].
− Most parts of this mode are similar to the PB frames mode, the main difference being that in the
improved PB frames mode, the B part of the composite PB-macroblock, known as BPB-macroblock,
may have a separate motion vector for forward and backward prediction.
− This is in addition to the bidirectional prediction mode that is also used in the normal PB frames mode.
− Hence, there are three different ways of coding a BPB-macroblock, and the coding type is signalled
by the MVDB parameter.
• Bidirectional prediction
• Forward prediction
• Backward prediction
89
H.263 Options
Picture 1
P Frame
(decoded P-picture) Picture 2
B Frame
Picture 3
P Frame
(current P-picture)
V 1/2 -V 1/2
PB
Forward Motion Vector Backward Motion Vector
Forward
Prediction
Backward
Prediction
Forward Prediction
Syntax-based Arithmetic Coding Mode
− In encoding, a symbol is encoded by a specific array of integers (model) selected according to the syntax, by calling encode_a_symbol(index, cumul_freq).
− A FIFO buffers the bits from arithmetic encoder.
− In decoding, a symbol is decoded by a specific model based on syntax and by calling decode_a_symbol
(cumul_freq).
− Syntax of top 3 layers: Picture, Group-of-Blocks and Macroblock remains the same, but that of block is
modified.
90
H.263 Options
Syntax-based Arithmetic Coding Mode
− In this mode, all the variable length coding and decoding of baseline H.263 is replaced with arithmetic
coding/decoding.
− This removes the restriction that each symbol must be represented by an integer number of bits, thus improving compression efficiency.
− Experiments indicate that compression can be improved by up to 10% over variable length
coding/decoding.
− Complexity of arithmetic coding is higher than variable length coding, however.
91
H.263 Options
92
Video Source
Decompress
(Decode)
Compress
(Encode)
Video Display
Coded
video
ENCODER + DECODER = CODEC
Enhance H.263 with additional options (Draft 20, Sept. ’97)
Coding efficiency
• Advanced intra coding mode
• Deblocking filter mode
• Improved PB-frames mode
• Reference picture resampling mode
• Alternative inter VLC mode
• Modified quantization mode
Error robustness
• Slice structured mode
• Referenced picture selection mode
• Independently segmented decoding mode
Enhanced Communication
• Temporal, SNR, and spatial scalability mode
• Reduced-resolution update mode
93
H.263 Ver. 2 (H.263+)
− H.263+ was standardized in January, 1998.
− The expected enhancements of H.263+ over H.263 fall into two basic categories:
• enhancing quality within existing applications;
• broadening the current range of applications.
− Adds negotiable options and features while still retaining a backwards compatibility mode.
− A few examples of the enhancements are as follows:
• improving perceptual compression efficiency;
• reducing video coding delay;
• providing greater resilience to bit errors and data losses.
94
H.263 Ver. 2 (H.263+)
95
H.263 Ver. 2 (H.263+)
Annex I: Advanced Intra Coding mode
Annex J: Deblocking Filter mode
Annex K: Slice Structured mode
Annex L: Supplemental Enhancement Information Specification
Annex M: improved PB Frame mode
Annex N: Reference Picture Selection mode
Annex O: Temporal, SNR, and Spatial Scalability mode
Annex P: Reference Picture Resampling
Annex Q: Reduced-Resolution Update mode
Annex R: Independent Segment Decoding mode
Annex S: Alternative Inter VLC mode
Annex T: Modified Quantization mode
96
H.263+ (v2) Optional Tools
− In addition to the multiples of CIF, H.263+ permits
• any frame size from 4x4 to 2048x1152 pixels in increments of 4.
− Besides the 12:11 pixel aspect ratio (PAR), H.263+ supports
• Square (1:1)
• 525-line 4:3 picture (10:11)
• CIF for 16:9 picture (16:11)
• 525-line for 16:9 picture (40:33)
• and other arbitrary ratios
− In addition to picture clock frequencies of 29.97 Hz (NTSC), H.263+ supports
• 25 Hz (PAL)
• 30 Hz
• and other arbitrary frequencies
97
Arbitrary Frame Size, Pixel Aspect Ratio, Clock Frequency
98
Level 1 Level 2 Level 3
Advanced INTRA Coding Yes Yes Yes
Deblocking Filter Yes Yes Yes
Supplemental Enhancement Information (Full-Frame Freeze Only) Yes Yes Yes
Modified Quantization Yes Yes Yes
Unrestricted Motion Vectors No Yes Yes
Slice Structured Mode No Yes Yes
Reference Picture Resampling (Implicit Factor-of-4 Mode Only) No Yes Yes
Advanced Prediction No No Yes
Improved PB-frames No No Yes
Independent Segment Decoding No No Yes
Alternate INTER VLC No No Yes
H.263v2 specified a set of recommended modes in an informative appendix (Appendix II, since deprecated)
The prior informative Appendix II (recommended optional enhancement) was obsoleted by the creation of the normative Annex X.
H.263 Ver. 2 (H.263+)
− In this mode, either the DC coefficient only, the first column, or the first row of coefficients is predicted from neighbouring blocks (DC only; Vertical DC & AC; Horizontal DC & AC).
− The prediction mode is selected on an MB-by-MB basis.
− Essentially DPCM of intra DCT coefficients.
− Can save up to 40% of the bits on Intra frames.
− A separate VLC table for intra DCT coefficients
− Modified quantization for intra coefficients
− Spatial prediction of DCT coefficients
99
Advanced Intra Coding Mode
Three neighbouring blocks in the DCT domain
[Figure: the current block and three previously decoded neighbouring blocks A, B and C in the DCT domain, with reconstructed coefficients Rec A(u, v), Rec B(u, v) and Rec C(u, v); u, v = 0…7.]

Index   Prediction mode          Code
0       0 (DC only)              0
1       1 (Vertical DC & AC)     10
2       2 (Horizontal DC & AC)   11
− At very low bit rates, the block of pixels is mainly made of low-frequency DCT coefficients.
− In these areas, when there is a significant difference between the DC levels of the adjacent blocks, they
appear as block borders.
− The overlapped block matching motion compensation to some extent reduces these blocking artefacts.
− For further reduction in the blockiness, the H.263 specification recommends deblocking of the picture
through the block edge filter.
− The Deblocking Filter mode improves subjective quality by removing blocking and mosquito artifacts
common to block-based video coding at low bit rates.
100
Deblocking Filter Mode
− Deblocking Filter Mode introduces a deblocking filter inside the coding loop.
− Unlike in post-filtering, predicted pictures are computed based on filtered versions of the previous ones.
− Like the Advanced Prediction mode of H.263, the Deblocking Filter mode involves using four motion
vectors per macroblock.
− The filtering is performed on 8×8 block edges and assumes that 8×8 DCT is used and the motion vectors
may have either 8×8 or 16×16 resolution.
− Filtering is equally applied to both luminance and chrominance data.
− No filtering is permitted on the frame and slice edges.
101
Deblocking Filter Mode
− Consider four pixels A, B, C and D on a line (horizontal or
vertical) of the reconstructed picture, where A and B
belong to block 1 and C and D belong to a
neighbouring block 2, which is either to the right of or
below block 1.
− It Filters pixels along block boundaries while preserving
edges in the image content.
− Filter is in the coding loop which means, it filters the
decoded reference frame used for motion
compensation.
− It can be used in conjunction with a post-filter to further
reduce coding artifacts.
102
Deblocking Filter Mode
[Figure: four pixels A, B, C, D across a block boundary — A and B in block 1, C and D in block 2 — shown for both a horizontal and a vertical block edge.]
Deblocking Filter
− To turn the filter on for a particular edge, either block 1 or block 2 should be an intra or a coded
macroblock with the code COD =0.
− A, B, C and D are replaced by new values, A1, B1, C1, and D1 based on a set of non-linear equations.
− The strength of the filter is proportional to the quantization strength.
− The sign of d1 is the same as the sign of d.
103
H.263 Options
[Figure: pixels A, B, C, D across the block boundary (A, B in block 1; C, D in block 2).]
Deblocking Filter
− Figure shows how the value of d1 changes with d and the quantiser parameter QP, to make sure that only
block edges which may suffer from blocking artefacts are filtered and not the natural edges.
− As a result of this modification, only the pixels on the edge are filtered so that their luminance changes are
less than the quantisation parameter, QP.
104
H.263 Options
d1 as a function of d
− To turn the filter on for a particular edge, either block 1 or block 2 should be an intra or a coded
macroblock with the code COD =0.
− A, B, C and D are replaced by new values, A1, B1, C1, and D1 based on a set of non-linear equations.
− The strength of the filter is proportional to the quantization strength.
B1 = clip(B + d1)        C1 = clip(C − d1)
A1 = A − d2              D1 = D + d2

d1 = Filter((A − 4B + 4C − D) / 8, Strength(QUANT))
d2 = clipd1((A − D) / 4, d1/2)

Filter(x, Strength) = SIGN(x) × MAX(0, abs(x) − MAX(0, 2 × (abs(x) − Strength)))
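The edge filter for one line of four pixels can be sketched as follows (illustrative only — integer rounding, the [0, 255] clip range and the d1/2 limit on d2 are assumptions based on the equations above, not the standard's exact arithmetic):

```python
def deblock_edge(A, B, C, D, strength):
    """One application of the deblocking edge filter to pixels
    A, B (block 1) and C, D (block 2) across a block boundary."""
    clip255 = lambda v: max(0, min(255, v))

    def ramp(x, s):
        """Filter(x, Strength): rises to +/-Strength, then falls
        back to 0, so large steps (real edges) pass unfiltered."""
        sign = -1 if x < 0 else 1
        return sign * max(0, abs(x) - max(0, 2 * (abs(x) - s)))

    d1 = ramp((A - 4 * B + 4 * C - D) // 8, strength)
    lim = abs(d1) // 2                       # assumed d1/2 limit
    d2 = max(-lim, min(lim, (A - D) // 4))
    return A - d2, clip255(B + d1), clip255(C - d1), D + d2

# Small step across the boundary: smoothed.
# Large step (real image edge): d1 == 0, pixels unchanged.
```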
105
Deblocking Filter Mode
− The Deblocking Filter mode improves subjective quality by removing blocking and mosquito artifacts common to
block-based video coding at low bit rates.
− Many applications make use of a post filter to reduce these artifacts.
− The post-filtering is useful in error-free and error-prone environments.
− This post filter is usually present at the decoder and is outside the coding loop. Therefore, prediction is not based
on the post filtered version of the picture.
− The one-dimensional version of the filter will be described.
− To obtain a two-dimensional effect, the filter is first used in the horizontal direction and then in the vertical
direction.
− The post filter is applied to all pixels within the picture.
− Edge pixels should be repeated when the filter is applied at picture boundaries.
106
Post-Filter
− The pixels A, B, C, D, E, F, G, (H) are aligned horizontally or
vertically.
− The post-filter strength is proportional to the quantization:
Strength(QUANT)
− The Strength1 and Strength2 may be different to better adapt
the total filter strength to QUANT.
− The Strength1, 2 may be related to QUANT for the macroblock
where D belongs or to some average value of QUANT over
parts of the picture or over the whole picture.
107
Post-Filter
D1 = D + Filter((A + B + C + E + F + G − 6D) / 8, Strength1)   when filtering in the first direction
D1 = D + Filter((A + B + C + E + F + G − 6D) / 8, Strength2)   when filtering in the second direction
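One filtered pixel can be sketched as follows (illustrative; filter_ramp re-implements the Filter(x, Strength) function defined for the deblocking filter, and integer rounding is an assumption):

```python
def filter_ramp(x, strength):
    """Filter(x, Strength): tent function that preserves
    large (real-edge) differences by returning 0 for them."""
    sign = -1 if x < 0 else 1
    return sign * max(0, abs(x) - max(0, 2 * (abs(x) - strength)))

def postfilter_pixel(A, B, C, D, E, F, G, strength):
    """Post filter (outside the coding loop): D is the pixel
    being filtered; A, B, C and E, F, G are its three
    neighbours on each side along one direction."""
    return D + filter_ramp((A + B + C + E + F + G - 6 * D) // 8,
                           strength)

# D=92 surrounded by 100s, strength 6: pulled up to 98
```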
The relation between Strength1, 2 and QUANT
108
Deblocking Loop Filter Demo
No Filter Deblocking Loop Filter
109
Deblocking Loop Filter and Post Filter Demo
Deblocking Loop Filter and Post FilterNo Filter
110
Deblocking Loop Filter and Post Filter Demo
No Filter
Loop Filter Only
Deblocking Loop Filter and Post Filter
111
No Filter
Deblocking Loop Filter and TMN-8 Post FilterDeblocking Loop Filter Only
TMN-8 Post Filter Only
Deblocking Loop Filter and Post Filter Demo
Sequence: Foreman, 24 kbps, 10 fps
TMN-8: Video Codec Test Model, Near-Term, Version 8 (TMN8)
− The deblocking filter alone reduces
blocking artifacts significantly, mainly
due to the use of four motion vectors
per macroblock.
− The filtering process provides
smoothing, further improving
subjective quality.
− The effects of the post filter are less
noticeable, and adding the post filter
may actually result in blurriness.
− Therefore, the use of the deblocking
filter alone is usually sufficient.
− Allows insertion of resynchronization markers at macroblock boundaries to improve network packetization
and reduce overhead. More on this later
• Allows more flexible tiling of video frames into independently decodable areas to support “view
ports”, a.k.a. “local decode.”
• Improves error resiliency by reducing intra-frame dependence.
• Permits out-of-order transmission to reduce latency.
112
Slice Structured Mode
113
Slice Structured Mode
Slice
Boundaries
No INTRA or MV Prediction Across Slice Boundaries.
Slices Start And End on Macroblock Boundaries.
Slice
Boundaries
No INTRA or MV Prediction Across Slice Boundaries.
Slice Sizes Remain Fixed Between INTRA Frames.
Backwards compatible with H.263 but permits indication of supplemental information for features such as:
• Partial and full picture freeze requests
• Partial and full picture snapshot tags
• Video segment start and end tags for off-line storage
• Progressive refinement segment start and end tags
• Chroma keying info for transparency
• The Chroma Keying Information Function (CKIF) indicates that the "chroma keying" technique is used to represent
"transparent" and "semi-transparent" pixels in the decoded video pictures.
• When being presented on the display, "transparent" pixels are not displayed.
• Instead, a background picture which is either a prior reference picture or is an externally controlled picture is revealed.
• Semitransparent pixels are displayed by blending the pixel value in the current picture with the corresponding value in
the background picture.
114
Supplemental Enhancement Information
− Resampling of a temporally previous reference picture prior to its use as a reference for encoding,
enabling global motion compensation, predictive dynamic resolution conversion, predictive picture area
alteration and registration, and special-effect warping;
− Allows frame size changes of a compressed video sequence without inserting an Intra frame (No Intra
frame required when changing video frame sizes).
− Permits the warping of the reference frame via affine transformations to address special effects such as
zoom, rotation, translation.
− Can be used for emergency rate control by dropping frame sizes adaptively when the bit rate gets too high.
115
Reference Picture Resampling
− Specifies generalized method applied to previous reference picture to generate warped picture for use in
predicting current picture
− Special case of factor of 4 resampling, which converts horizontal and vertical size by factor of 2
(upsampling) or ½(downsampling) in each direction.
116
Reference Picture Resampling
Pixel positions of the reference picture
Pixel positions of the downsampled predicted picture
a = (A + B + C + D + 1 + RCRPR) / 4
Downsampling
[Figure: output pixel a taken from the 2×2 input neighbourhood A B / C D.]
Pixel positions of the reference picture
Pixel positions of the upsampled predicted picture
a = (9A + 3B + 3C + D + 7 + RCRPR) / 16
b = (3A + 9B + C + 3D + 7 + RCRPR) / 16
c = (3A + B + 9C + 3D + 7 + RCRPR) / 16
d = (A + 3B + 3C + 9D + 7 + RCRPR) / 16
[Figure: output pixels a, b, c, d interpolated within the 2×2 input neighbourhood A B / C D.]
Upsampling
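Both resampling rules can be sketched directly from the formulas (illustrative; rc stands for the rounding-control bit RCRPR, and the function names are assumptions):

```python
def downsample_2x2(A, B, C, D, rc=0):
    """Factor-of-4 downsampling: one output pixel per 2x2
    input neighbourhood A B / C D."""
    return (A + B + C + D + 1 + rc) // 4

def upsample_2x2(A, B, C, D, rc=0):
    """Factor-of-4 upsampling: four output pixels a, b, c, d
    interpolated within the 2x2 input neighbourhood A B / C D."""
    a = (9 * A + 3 * B + 3 * C + D + 7 + rc) // 16
    b = (3 * A + 9 * B + C + 3 * D + 7 + rc) // 16
    c = (3 * A + B + 9 * C + 3 * D + 7 + rc) // 16
    d = (A + 3 * B + 3 * C + 9 * D + 7 + rc) // 16
    return a, b, c, d

# Each output pixel weights its nearest input pixel 9/16.
```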
− Specify arbitrary warping parameters via displacement vectors from corners.
− For source format changes
− Global motion compensation
− Special-effect warping
117
Reference Picture Resampling with Warping
MV00
MV10 MV11
MV01
No Intra frame required when changing video frame sizes
118
Reference Picture Resampling Factor of 4 Size Change
P P P P P
− Allows more flexibility in adapting quantizers on a macroblock by macroblock basis, by
enabling large quantizer changes through the use of escape codes.
− A mode which improves the control of the bit rate by changing the method for controlling the
quantizer step size on a macroblock basis.
− Reduces the quantizer step size for chrominance blocks, compared to luminance blocks, to reduce the prevalence of chrominance artifacts.
− Modifies the allowable DCT coefficient range to avoid clipping, yet disallows illegal
coefficient/quantizer combinations.
− Increases the range of representable DCT coefficient values for use with small quantizer step
sizes, and increases error detection performance and reduces decoding complexity by
prohibiting certain unreasonable coefficient representations.
119
Modified Quantization (MQ)
− Allows modification of the quantizer at the macroblock layer to any value, not limited to +1, −1, +2 and −2.
• DQUANT uses 2 bits (starting with “1”) to specify small changes.
− It uses 6 bits (starting with “0”) to specify other changes.
− Codeword: 0xxxxx, where the last 5 bits specify the new QUANT value.
120
Modified Quantization (MQ)
Change of QUANT
Prior QUANT   DQUANT = 10   DQUANT = 11
1             +2            +1
2–10          −1            +1
11–20         −2            +2
21–28         −3            +3
29            −3            +2
30            −3            +1
31            −3            −5
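The table can be applied directly as a lookup (illustrative sketch; the function name and range encoding are assumptions):

```python
def apply_dquant(prior_quant, dquant_bits):
    """QUANT update for the two-bit DQUANT codes "10" and "11"
    in Modified Quantization mode, per the table above."""
    table = {  # (lo, hi): (delta for "10", delta for "11")
        (1, 1):   (+2, +1),
        (2, 10):  (-1, +1),
        (11, 20): (-2, +2),
        (21, 28): (-3, +3),
        (29, 29): (-3, +2),
        (30, 30): (-3, +1),
        (31, 31): (-3, -5),
    }
    for (lo, hi), (d10, d11) in table.items():
        if lo <= prior_quant <= hi:
            return prior_quant + (d10 if dquant_bits == "10" else d11)
    raise ValueError("QUANT out of range 1..31")
```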
− Enhance chrominance quality by a finer quantizer.
− Improve picture quality by extending the range of representable quantized DCT coefficients, not limited
by [-127, +127].
121
Modified Quantization (MQ)
Range of QUANT   Value of QUANT_C
1–6              QUANT_C = QUANT
7–9              QUANT_C = QUANT − 1
10–11            9
12–13            10
14–15            11
16–18            12
19–21            13
22–26            14
27–31            15
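The chrominance quantizer mapping can be sketched as a small function (illustrative; the function name is an assumption):

```python
def quant_c(quant):
    """Chrominance quantizer derived from the luminance QUANT
    in Modified Quantization mode: identical for small QUANT,
    progressively finer for large QUANT, per the table above."""
    if quant <= 6:
        return quant
    if quant <= 9:
        return quant - 1
    table = {10: 9, 11: 9, 12: 10, 13: 10, 14: 11, 15: 11,
             16: 12, 17: 12, 18: 12, 19: 13, 20: 13, 21: 13,
             22: 14, 23: 14, 24: 14, 25: 14, 26: 14}
    return table.get(quant, 15)          # 27..31 map to 15
```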
− Used for bit rate control by reducing the size of the residual frame adaptively when bit rate gets too high.
− A mode which allows an encoder to maintain a high frame rate during heavy motion by encoding a low-
resolution update to a higher-resolution picture while maintaining high resolution in stationary areas
122
Reduced-Resolution Update (RRU)
[Figure: RRU decoding — the bitstream passes through macroblock- and block-layer decoding; decoded 8×8 coefficient blocks are inverse transformed and upsampled into 16×16 reconstructed prediction-error blocks, while decoded pseudo-vectors are scaled up into reconstructed vectors for motion compensation, yielding 16×16 prediction blocks that combine into the 16×16 reconstructed blocks.]
− A scalable bit stream consists of layers representing different levels of video quality.
− Everything can be discarded except for the base layer and still have reasonable video.
− If bandwidth permits, one or more enhancement layers can also be decoded which refines the base layer
in one of three ways: temporal, SNR, or spatial
123
Scalability Mode
[Figure: an H.263+ encoder producing a layered bitstream — a base layer plus enhancement layers 1–4, at rates of 20, 40, 90, 200 and 320 kb/s.]
Layered Video Bitstreams
− Scalability is typically used when one bit stream must support several different transmission bandwidths
simultaneously, or some process downstream needs to change the data rate unbeknownst to the
encoder.
124
Scalability Mode
Example: Conferencing Multipoint Control Unit
125
384 kb/s
384 kb/s
128 kb/s
28.8 kb/s
Scalability Mode
Layered Video Bit Streams in Multipoint Conferencing
126
Scalability Mode
Base Layer + B Frames → higher frame rate (temporal enhancement)
Base Layer + SNR Layer → better spatial quality (SNR enhancement)
Base Layer + Spatial Layer → more spatial resolution (spatial enhancement)
SNR Scalability — Base Layer: I P P; Enhancement Layer: EI EP EP
Spatial Scalability — Base Layer: I P P; Enhancement Layer: EI EP EP
Temporal Scalability — I1 P3 P5 at low temporal resolution; B2 B4 inserted between them for high temporal resolution
Scalability Mode
127
Two or more frame rates can be supported by the same bit stream.
− It is achieved using bidirectionally predicted pictures or B-pictures.
− The B-frames can be discarded (to lower the frame rate) and the bit
stream remains usable.
− These B-pictures differ from the B-picture part of PB-frames in that they are
separate entities in the bitstream.
− These B-pictures are not syntactically intermixed with a subsequent P or its
enhancement part EP.
− B-pictures and the B part of PB-frames are not used as reference pictures
for the prediction of any other pictures. This property allows for B-pictures to
be discarded if necessary without adversely affecting any subsequent
pictures, thus providing temporal scalability.
− Since H.263 is normally used for low frame rate applications (low bit rates,
e.g. mobile), due to larger separation between the base layer I- and P-
pictures, there is normally one B-picture between them.
128
Temporal Scalability
I
or
P
B B P ......
• I and P frames form the base layer
• B-frames from the temporal enhancement layer
• B-frames can be discarded
Temporal Scalability Demonstration
• layer 0, 3.25 fps, P-frames
• layer 1, 15 fps, B-frames
The difference between the input picture and lower quality base layer
picture is coded.
− The picture in the base layer which is used for the prediction of the
enhancement layer pictures may be an I-picture, a P-picture, or the P
part of PB frames, but should not be a B-picture or the B part of a PB
frame.
− In the enhancement layer two types of picture are identified, EI
(enhancement I-picture) and EP (enhancement P-picture).
− If prediction is only formed from the base layer, then the
enhancement layer picture is referred to as EI-picture.
− In this case, the base layer picture can be an I- or a P-picture (or the P
part of PB frames).
− For both EI- and EP-pictures, prediction from the reference layer uses
no motion vectors → no inter prediction from base layer.
− however, EP may be predictively coded with respect to its previous
reconstructed picture at the same layer, called forward prediction.
129
SNR Scalability
Base Layer (15 kbit/s)
Enhancement Layer (40 kbit/s)
EI EP EP
PPI
EI EP
P P
I - Intracoded or Key Frame
P - Predicted Frame
EI - Enhancement layer key frame (enhancement I-picture)
EP - Enhancement layer predicted frame (enhancement P-picture)
SNR Scalability Demonstration
• layer 0, 10 fps, 40 kbps
• layer 1, 10 fps, 400 kbps
− The arrangement of the enhancement layer pictures in
spatial scalability is similar to that of SNR scalability.
− The only difference is that before the picture in the reference
layer is used to predict the picture in the spatial enhancement
layer, it is interpolated (upsampled) by a factor of 2 either horizontally or
vertically (one-dimensional spatial scalability), or both
horizontally and vertically (two-dimensional spatial scalability).
− If the enhancement layer is 2× the size of the base layer in each
dimension, the base layer is interpolated (by 2×) before
predicting the spatial enhancement layer.
130
Spatial Scalability
Base Layer
Enhancement Layer
[Figure: I, P, P pictures in the base layer; EI, EP, EP pictures above them in the spatial enhancement layer]
I - Intracoded or Key Frame
P - Predicted Frame
EI - Enhancement layer key frame (enhancement I-picture)
EP - Enhancement layer predicted frame (enhancement P-picture)
Spatial Scalability Demonstration
• layer 0, QCIF, 10 fps, 60 kbps
• layer 1, CIF, 10 fps, 300 kbps
− It will increase the robustness of H.263 against the channel errors.
− It is possible for B-pictures to be temporally inserted not only between the
base layer pictures of type I, P, and PB, but also between the
enhancement picture types of EI and EP, whether these consist of SNR or
spatial enhancement pictures.
− It is also possible to have more than one SNR or spatial enhancement
layer in conjunction with the base layer. Thus, a multilayer scalable
bitstream can be a combination of SNR layers, spatial layers and B-
pictures.
− As with the two-layer case, B-pictures may occur in any layer.
− However, any picture in an enhancement layer which is temporally
simultaneous with a B-picture in its reference layer must be a B-picture or
the B-picture part of PB frames. This is to preserve the disposable nature
of B-pictures.
− Note, however, that B-pictures may occur in any layers that have no
corresponding picture in the lower layers. This allows an encoder to send
enhancement video with a higher picture rate than the lower layers.
131
Hybrid or Multilayer Scalability
[Figure: three-layer example with I, P and B pictures in the base layer and EI, EP and B pictures in enhancement layers 1 and 2]
I - Intracoded or Key Frame
P - Predicted Frame
EI - Enhancement layer key frame (enhancement I-picture)
EP - Enhancement layer predicted frame (enhancement P-picture)
Scalability Demonstration
• SNR/Spatial Scalability, 10 fps
• layer 0, 88x72, ~5 kbit/s; layer 1, 176x144, ~15 kbit/s
• layer 2, 176x144, ~40 kbit/s; layer 3, 352x288, ~80 kbit/s
• layer 4, 352x288, ~200 kbit/s
Pictures, which are dependent on other pictures, are located in the bitstream
after the pictures on which they depend.
− The bitstream syntax order is specified such that for reference pictures (i.e.
pictures having types I, P, EI, EP or the P part of PB) the following two rules
shall be obeyed:
1. All reference pictures with the same temporal reference appear in
the bitstream in increasing enhancement layer order. This is because
each lower layer reference picture is needed to decode the next
higher layer reference picture.
2. All temporally simultaneous reference pictures as discussed in item 1
appear in the bitstream prior to any B-pictures for which any of these
reference pictures is the first temporally subsequent reference picture
in the reference layer of the B-picture. This is done to reduce the
delay of decoding all reference pictures, which may be needed as
references for B-pictures.
132
Transmission Order of Pictures
[Figure: base layer (I, P), enhancement layer 1 (EI, EP), enhancement layer 2 (B-pictures), with pictures numbered 1–8 in transmission order]
Two Allowable Picture Transmission Orders
− Then, the B-pictures with earlier temporal references follow (temporally
ordered within each enhancement layer).
− The bitstream location of each B-picture complies with the following rules:
• Be after that of its first temporally subsequent reference pictures in the reference layer.
This is because the decoding of the B-pictures generally depends on the prior decoding
of that reference picture.
• Be after that of all reference pictures that are temporally simultaneous with the first
temporally subsequent reference picture in the reference layer. This is to reduce the
delay of decoding all reference pictures, which may be needed as references for B-
pictures.
• Precede the location of any additional temporally subsequent pictures other than B-
pictures in its reference layer. Otherwise, it would increase picture storage memory
requirement for the reference layer pictures.
• Be after that of all EI- and EP-pictures that are temporally simultaneous with the first
temporally subsequent reference picture.
• Precede the location of all temporally subsequent pictures within the same
enhancement layer. Otherwise, it would introduce needless delay and increase picture
storage memory requirements for the enhancement layer.
133
[Figure: base layer (I, B, P); enhancement layer 1 (EI, EP, EP) via SNR scalability; enhancement layer 2 (EI, EP, EI) via spatial scalability; enhancement layer 3 (B, B) via temporal scalability]
Hybrid or Multilayer Scalability Example
134
[Figure: multilayer example — base layer (I, P, B), SNR enhancement layer 1 (EI, EP), spatial enhancement layer 2 (EI, EI, EP), temporal enhancement layer 3 (B, B)]
Multilayer Transmission Order Example
[Figure: the same multilayer structure annotated with its picture transmission order]
135
Method for interpolating pixels for 2-D scalability
136
Interpolation for Spatial Scalability
[Figure: a 2×2 arrangement of original pixels A, B, C, D with interpolated pixels a, b, c, d between them]
a = (9A + 3B + 3C + D + 8) / 16
b = (3A + 9B + C + 3D + 8) / 16
c = (3A + B + 9C + 3D + 8) / 16
d = (A + 3B + 3C + 9D + 8) / 16
Interpolation Formulation (Filtering)
Method for 2-D interpolation at boundaries
137
Interpolation for Spatial Scalability
Original pixel positions: A, B, C, D; interpolated pixel positions: a–e
a = A
b = (3A + B + 2) / 4
c = (A + 3B + 2) / 4
d = (3A + C + 2) / 4
e = (A + 3C + 2) / 4
[Figure: interpolation at the picture boundary — a coincides with A; b, c lie between A and B; d, e lie between A and C]
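The interior and boundary interpolation formulas can be checked with a small sketch; the division is integer division, with the +8 and +2 offsets implementing rounding (function names are ours):

```python
def interp_interior(A, B, C, D):
    """Interpolation between four original pixels: a, b, c, d lie between
    A (top-left), B (top-right), C (bottom-left), D (bottom-right)."""
    a = (9*A + 3*B + 3*C + D + 8) // 16
    b = (3*A + 9*B + C + 3*D + 8) // 16
    c = (3*A + B + 9*C + 3*D + 8) // 16
    d = (A + 3*B + 3*C + 9*D + 8) // 16
    return a, b, c, d

def interp_boundary(A, B, C):
    """At the picture boundary: a coincides with A; b, c lie between A and
    its horizontal neighbour B; d, e between A and its vertical neighbour C."""
    a = A
    b = (3*A + B + 2) // 4
    c = (A + 3*B + 2) // 4
    d = (3*A + C + 2) // 4
    e = (A + 3*C + 2) // 4
    return a, b, c, d, e

# A flat area stays flat: the +8 / +2 terms implement rounding.
assert interp_interior(100, 100, 100, 100) == (100, 100, 100, 100)
# a sits closest to A, so A gets the dominant weight 9/16.
assert interp_interior(160, 0, 0, 0) == (90, 30, 30, 10)
```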
− Improved PB-frames
• Improves upon the previous PB-frame mode by permitting forward prediction of “B” frame with a new
vector.
− Reference Picture Selection (RPS)
• A lower latency method for dealing with error prone environments by using some type of back-
channel to indicate to an encoder when a frame has been received and can be used for motion
estimation.
• In RPS Mode, a frame is not used for prediction in the encoder until it’s been acknowledged to be
error free.
138
Other Miscellaneous Features
− Independently Decodable Segments
• When signaled, it restricts the use of data outside of a current Group-of-Block segment or slice
segment. Useful for error resiliency.
− Alternative INTER VLC (AIV):
• Permits use of an alternative VLC table that is better suited for Intra coded blocks, or blocks with low
quantization.
• A mode which reduces the number of bits needed for encoding predictively-coded blocks when
there are many large coefficients in the block.
139
Other Miscellaneous Features
140
Video Source
Decompress
(Decode)
Compress
(Encode)
Video Display
Coded
video
ENCODER + DECODER = CODEC
− Phone lines are “circuit-switched”.
− A (virtual) circuit is established at call initiation and remains for the duration of the call.
141
Internet Basics
[Figure: a dedicated circuit from source to destination through telephone switches]
− Computer networks are “packet-switched”.
− Data is fragmented into packets, and each packet finds its way to the destination using different routes.
− Lots of implications...
142
Internet Basics
[Figure: packets travel from source to destination over different routes; a failed switch (X) is bypassed]
143
The Internet Is Heterogeneous
[Figure: the global public Internet interconnecting a corporate LAN, AOL, HyperStream (FR: Frame Relay, SMDS: Switched Multimegabit Data Service, ATM: Asynchronous Transfer Mode), TYMNET (X.25), MCI Mail and LAN mail gateways (SMTP: Simple Mail Transfer Protocol, e-mail), and dial-up IP hosts (SLIP: Serial Line Internet Protocol, PPP: Point-to-Point Protocol) via IP routers]
− MCI Mail was one of the first ever commercial email services in the United States and one of the largest telecommunication services in the
world.
− AOL Mail is a free web-based email service provided by AOL, a division of Verizon Communications.
− X.25 is an ITU-T standard protocol suite for packet-switched data communication in wide area networks (WAN).
− Frame Relay (FR) is a standardized wide area network technology that specifies the physical and data link layers of digital
telecommunications channels using a packet switching methodology.
− Asynchronous Transfer Mode (ATM) is a telecommunications standard defined by ANSI and ITU standards for carriage of user traffic, including
telephony, data, and video signals.
− Switched Multimegabit Data Service (SMDS) is a wide area networking (WAN) connection service designed for LAN interconnection through
the public telephone network. SMDS is designed for moderate bandwidth connections, between 1 and 34 Mbps, although it has been and is
being extended to support both lower and higher bandwidth connections.
− Tymnet was an international data communications network headquartered in Cupertino, California that used virtual call packet switched
technology and X.25, SNA/SDLC, ASCII and BSC interfaces to connect host computers at thousands of large companies, educational
institutions, and government agencies.
144
The Internet Is Heterogeneous
OSI (Open System Interconnection) Model
145
Comparison Between OSI and TCP/IP Model
146
147
Layers in the Internet Protocol Architecture
Network Access Layer
consists of routines for accessing
physical networks
1
Internet Layer
defines the datagram and handles the
routing of data.
2
Host-to-Host Transport Layer
provides end-to-end data delivery
services.
3
Application Layer
consists of applications and processes
that use the network.
4
[Figure: encapsulation — each layer prepends its own header to the data it receives from the layer above]
148
Internet Protocol Architecture
[Figure: the Internet protocol stack — network access layer (FDDI, Ethernet, Token Ring, HDLC, SMDS, X.25, ATM, FR); internet layer (IP); host-to-host transport layer (TCP, UDP, RTP); utility/application layer (SNMP, DNS, TELNET, FTP, SMTP, MIME, and MBone tools such as VIC/VAT)]
149
Specific Protocols for Multimedia
[Figure: RTP runs on UDP over IP; an RTP packet (payload header + payload) is carried inside a UDP datagram, which is carried inside an IP packet]
− IP implements two basic functions
• Addressing
• Fragmentation
− IP treats each packet as an independent entity.
− Internet routers choose the best path to send each packet based on its
address. Each packet may take a different route.
− Routers may fragment and reassemble packets when necessary for
transmission on smaller packet networks.
− No guarantee a packet will reach its destination, and no guarantee of when
it will get there.
• IP packets have a Time-to-Live, after which they are deleted by a router.
• IP does not ensure secure transmission.
• IP only error-checks headers, not payload.
150
The Internet Protocol (IP)
− TCP is connection-oriented, end-to-end reliable, in-order protocol.
− TCP does not make any reliability assumptions of the underlying networks.
− Acknowledgment is sent for each packet.
− A transmitter places a copy of each packet sent in a timed buffer.
− If no “ack” is received before the time is out, the packet is re-transmitted.
− TCP has inherently large latency → not well suited for streaming multimedia.
151
Transmission Control Protocol (TCP)
− UDP is a simple protocol for transmitting packets over IP.
− Smaller header than TCP, hence lower overhead.
− Does not re-transmit packets.
− This is OK for multimedia since a late packet usually must be discarded anyway.
− Performs check-sum of data.
152
User Datagram Protocol (UDP)
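As a minimal illustration of UDP's connectionless, no-retransmission behaviour, the standard sockets API can send and receive a datagram in a few lines (the loopback address and port choice are just for the demo):

```python
import socket

# No connection setup, no acknowledgments, no retransmission: a datagram
# is handed to the network once and either arrives or is lost.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))            # let the OS pick a free port
port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"media packet 1", ("127.0.0.1", port))

data, addr = receiver.recvfrom(2048)       # blocks until a datagram arrives
assert data == b"media packet 1"

sender.close()
receiver.close()
```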
153
Transmission Control Protocol (TCP) and User Datagram Protocol (UDP)
− RTP carries data that has real time properties
− Typically runs on UDP/IP
− Does not ensure timely delivery or QoS.
− Does not prevent out-of-order delivery.
− Profiles and payload formats must be defined.
• Profiles define extensions to the RTP header for a particular class of applications
such as audio/video conferencing (IETF RFC 1890).
• Payload formats define how a particular kind of payload, such as H.261 video,
should be carried in RTP.
− Used by Netscape LiveMedia, Microsoft NetMeeting®, Intel VideoPhone,
ProShare® Video Conferencing applications and public domain
conferencing tools such as VIC and VAT.
154
Real time Transport Protocol (RTP)
− RTCP is a companion protocol to RTP which
• monitors the quality of service
• conveys information about the participants in an on-going session
− It allows participants to send transmission and reception statistics to other participants.
− It also sends information that allows participants to associate media types such as audio/video for lip-sync.
− Sender reports allow senders to derive round trip propagation times.
− Receiver reports include count of lost packets and inter-arrival jitter.
− Scales to a large number of users, since each participant reduces its rate of reports as the number of participants
increases.
155
Real-time Transport Control Protocol (RTCP)
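The inter-arrival jitter carried in receiver reports is, per RFC 3550, a running estimate that moves 1/16 of the way toward the magnitude of each new transit-time variation. A sketch (times in milliseconds; the arrival times are illustrative):

```python
def update_jitter(jitter, prev_transit, transit):
    # RFC 3550 estimator: J += (|D| - J) / 16, where D is the difference
    # in transit time (arrival time minus RTP timestamp) between packets.
    d = abs(transit - prev_transit)
    return jitter + (d - jitter) / 16.0

# Packets stamped every 20 ms but arriving with variable network delay.
send_times    = [0, 20, 40, 60, 80]
arrival_times = [5, 26, 44, 69, 85]

jitter = 0.0
prev_transit = arrival_times[0] - send_times[0]
for s, r in zip(send_times[1:], arrival_times[1:]):
    transit = r - s
    jitter = update_jitter(jitter, prev_transit, transit)
    prev_transit = transit

print(round(jitter, 4))   # 0.7043
```

The 1/16 gain makes the estimate a noise-tolerant smoothed average, so a single late packet does not swamp the report.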
− Most IP-based communication is unicast. A packet is intended for a single destination.
− In unicasting, the router forwards the received packet through only one of its interfaces.
− The relationship between the source and the destination is one-to-one.
156
Unicasting
− For multi-participant applications, streaming multimedia to each destination individually can waste network resources.
− A multicast address is designed to enable the delivery of packets to a set of hosts that have been configured as
members of a multicast group across various subnetworks.
− In multicasting, the router may forward the received packet through several of its interfaces.
− The source address is a unicast address, but the destination address is a group address.
157
Multicast
One source and a group of destinations: packets are duplicated in routers.
158
Unicast Example, Streaming Media to Multi-participants
[Figure: S1 sends duplicate packets through the routers because there are two participants, D1 and D2; D2's subnet sees excess traffic]
159
Multicast Example, Streaming Media to Multi-participants
[Figure: S1 sends a single set of packets to a multicast group; both D1 receivers subscribe to the same group, routers duplicate packets only where paths diverge, and D2's subnet sees no excess traffic]
− A multicast router may not find another multicast router in the neighborhood to forward the multicast packet.
− We make a multicast backbone (Mbone) out of these isolated routers using the concept of tunneling.
− The multicast backbone (Mbone) was an experimental backbone and virtual network built on top of the Internet for
carrying IP multicast traffic. It required specialized hardware and software (early 1990s).
160
Multicast Backbone (Mbone)
[Figure: isolated islands of multicast routers connected by virtual point-to-point links (tunnels) through non-multicast routers]
− Easy to deploy (no explicit router support).
− Manual tunnel creation/maintenance.
− No routing policy – single tree.
161
Multicast Backbone (Mbone)
MBONE
162
Multicast Backbone (Mbone)
[Figure: Mbone IP-in-IP tunneling — a multicast packet (IP header with group address G = 224.x.x.x) is encapsulated by the router at the tunnel entry point (encapsulator), carried through non-multicast routers, and restored by the router at the tunnel exit point (decapsulator)]
Mbone IP in IP Tunneling
Real-time applications
• Interactive applications are sensitive to packet delays (telephone)
• Non-interactive applications can adapt to a wider range of packet delays (audio, video broadcasts)
• Guarantee of maximum delay is useful
163
Quality of Service Requirements (1)
[Figure: arrival-offset graph for sampled audio — the playout point determines the buffering delay; the playout buffer must be small for interactive applications]
Elastic applications
− Interactive data transfer (e.g. HTTP, FTP)
• Sensitive to the average delay, not to the distribution tail
− Bulk data transfer (e.g. mail and news delivery)
• Delay insensitive
− Best effort works well
164
Quality of Service Requirements (2)
[Figure: document transfer — a document is only useful when it is completely received, so the average packet delay matters, not the maximum packet delay]
Used by hosts to obtain a certain QoS from underlying networks for a multimedia stream (it operates over IPv4 or IPv6).
− It provides receiver-initiated setup of resource reservations for multicast or unicast data flows.
− At each node, RSVP daemon attempts to make a resource reservation for the stream.
− It communicates with two local modules:
• Admission Control: It determines whether the node has sufficient resources available. “The Internet Busy Signal”
• Policy Control: It determines whether the user has administrative permission to make the reservation.
165
ReSerVation Protocol (RSVP)
[Figure: on both host and router, an RSVP daemon (RSVPD) consults policy control and admission control; data passes through the packet classifier and packet scheduler, and the router's daemon also talks to the routing process]
RSVP Functional Diagram
166
ReSerVation Protocol (RSVP)
[Figure: PATH messages travel from Host A (24.1.70.210) through routers R1, R5 and R4 to Host B (128.32.32.69)]
1. An application on Host A creates a session, 128.32.32.69/4078, by communicating with the RSVP daemon on Host A.
2. The Host A RSVP daemon generates a PATH message that is sent to the next hop RSVP router, R1, in the direction of the session address, 128.32.32.69.
3. The PATH message follows the next hop path through R5 and R4 until it gets to Host B. Each router on the path creates soft session state with the reservation parameters.
167
ReSerVation Protocol (RSVP)
[Figure: RESV messages travel from Host B (128.32.32.69) back through routers R4, R5 and R1 to Host A (24.1.70.210)]
4. An application on Host B communicates with the local RSVP daemon and asks for a reservation in session 128.32.32.69/4078. The daemon checks for and finds existing session state.
5. The Host B RSVP daemon generates a RESV message that is sent to the next hop RSVP router, R4, in the direction of the source address, 24.1.70.210.
6. The RESV message continues to follow the next hop path through R5 and R1 until it gets to Host A. Each router on the path makes a resource reservation.
− HTTP generally runs on TCP/IP and is the protocol upon which World-Wide-Web data is transmitted.
− Defines a “stateless” connection between receiver and sender.
− Sends and receives MIME-like messages and handles caching, etc.
− No provisions for latency or QoS guarantees.
168
Hyper-Text Transport Protocol (HTTP)
169
Real-time Streaming Protocol (RTSP)
[Figure: RTSP session — meta files point the client at the media server, which handles the media file download]
A “network remote control” for multimedia servers.
− Establishes and controls either a single or several time-synchronized streams of continuous media such as
audio and video.
− Supports the following operations:
• Request a presentation from a media server.
• Invite a media server to join a conference and play back or record.
• Notify clients that additional media is available for an existing presentation.
170
RTSP
171
Real-time Streaming Protocol (RTSP)
RTSP - Example
− How do we handle the special cases of
• unicasting?
• Multicasting?
− What about
• packet-loss?
• Quality of service?
• Congestion?
We’ll look at some solutions...
172
How Do We Stream Video Over the Internet?
− HTTP was not designed for streaming multimedia, nevertheless because of its widespread deployment via
Web browsers, many applications stream via HTTP.
− It uses a custom browser plug-in which can start decoding video as it arrives, rather than waiting for the
whole file to download.
− Operates on TCP so it doesn’t have to deal with errors, but the side effect is high latency and large inter-
arrival jitter.
− Usually a receive buffer is employed which can buffer enough data (usually several seconds) to
compensate for latency and jitter.
− Not applicable to two-way communication!
− Firewalls are not a problem with HTTP.
173
HTTP Streaming
− RTP was designed for streaming multimedia.
− Does not resend lost packets since this would add latency and a late packet might as well be lost in
streaming video.
− Used by Intel Videophone, Microsoft NetMeeting, Netscape LiveMedia, RealNetworks, etc.
− Forms the basis for network video conferencing systems (ITU-T H.323)
− Subject to packet loss, and has no quality of service guarantees.
− Can deal with network congestion via RTCP reports under some conditions:
• Should be encoding real time so video rate can be changed dynamically.
• Needs a payload defined for each media it carries.
174
RTP Streaming
− Payloads must be defined in the IETF (Internet Engineering Task Force) for all media carried by RTP.
− A payload has been defined for H.263 and H.263+.
− An RTP packet typically consists of...
− The H.263 payload header contains redundant information about the H.263 bit stream which can assist a payload handler
and decoder in the event that related packets are lost.
− Slice mode of H.263+ aids RTP packetization by allowing fragmentation on MB boundaries (instead of MB rows) and
restricting data dependencies between slices.
− But what do we do when packets are lost or arrive too late to use?
175
H.263 Payload for RTP
RTP packet layout: RTP header | H.263 payload header | H.263 payload (bit stream)
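A sketch of how such a packet might be assembled: the fixed 12-byte RTP header of RFC 3550 followed by the payload. The payload bytes below are placeholders, not a real H.263 payload header; the static payload type 34 for H.263 comes from RFC 3551.

```python
import struct

def rtp_packet(payload, seq, timestamp, ssrc, payload_type, marker=0):
    """Fixed 12-byte RTP header (RFC 3550) followed by the payload:
    version=2, no padding, no extension, zero CSRC entries."""
    byte0 = 2 << 6                                    # V=2, P=0, X=0, CC=0
    byte1 = ((marker & 1) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload

# Placeholder payload: a real packet would carry the H.263 payload header
# followed by the H.263 bit stream here.
pkt = rtp_packet(b"\x00\x04\x80\x02", seq=1, timestamp=90000,
                 ssrc=0x1234, payload_type=34)    # static PT 34 = H.263
assert pkt[0] == 0x80 and pkt[1] == 34 and len(pkt) == 16
```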
176
Video Source
Decompress
(Decode)
Compress
(Encode)
Video Display
Coded
video
ENCODER + DECODER = CODEC
− Depends on network topology.
− On the Mbone
• 2-5% packet loss
• single packet loss most common
− For end-to-end transmission, loss rates of 10% not uncommon.
− For ISPs, loss rates may be even higher during high periods of congestion.
177
Internet Packet Loss
178
[Figure: distribution of the length of loss bursts observed at a receiver — probability of a burst of length b (log scale) versus b]
[Figure: conditional loss probability — probability of losing packet n+1 versus the number of consecutive packets already lost, n]
Internet Packet Loss
Packet Loss Burst Lengths
Error resiliency and compression have conflicting requirements.
− Video compression attempts to remove as much redundancy out of a video sequence as possible.
− Error resiliency techniques at some point must reconstruct data that has been lost and must rely on
extrapolations from redundant data.
179
Error Resiliency
[Figure: compression removes redundancy (−) while resiliency adds it back (+)]
− Errors tend to propagate in video compression
because of its predictive nature.
180
Error Resiliency
[Figure: one block lost in an I or P frame propagates to two blocks in the next P frame]
There are essentially two approaches to dealing with errors from packet loss:
• Error Redundancy Methods
• They are preventative measures that add extra information at the encoder to make it easier to
recover when data is lost.
• The extra overhead decreases compression efficiency but should improve overall quality in the
presence of packet loss.
• Error Concealment Techniques
• Methods used to hide errors that occur once packets are lost.
− Usually both methods are employed.
181
Error Resiliency
182
Intra Coding Resiliency
[Figure: average PSNR versus data rate (kbps) for intra-refresh settings of 0, 5 and 10, each with 0% and 10–20% packet loss]
− Increasing the number of Intra coded blocks that the encoder produces will reduce error propagation
since Intra blocks are not predicted.
− Blocks that are lost at the decoder are simply treated as empty Inter coded blocks (Skipped Blocks).
− The block is simply copied from the previous frame.
− Very simple to implement.
183
Simple Intra Coding & Skipped Blocks
184
Reference Picture Selection (RPS) Mode of H.263+
[Figure: the encoder predicts each new P frame from the last acknowledged error-free frame; frames for which no acknowledgment has been received yet are not used for prediction]
In RPS mode, a frame is not used for prediction in the encoder until it has been acknowledged to be error free.
− Select one of several picture memories/prediction structures to reduce error propagation.
Bad picture
• Back channel message types
– Neither: no back channel is returned from decoder to encoder
– ACK: decoder returns only acknowledgement messages
– NACK: decoder returns only non-acknowledgement messages
– ACK+NACK: decoder returns both types of messages
• Channel for Back channel messages
– Separate Logical Channel: uses separate logical channel in the multiplex layer of system
– VideoMux: sends back-channel data within forward video data of a video stream coded data
ACK-based: a picture is assumed to contain errors, and thus is not used for prediction unless an ACK is
received.
NACK-based: a picture will be used for prediction unless a NACK is received, in which case the previous
picture that didn’t receive a NACK will be used.
185
Reference Picture Selection (RPS) Mode of H.263+
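The ACK-based policy can be sketched as a small encoder-side reference manager: a frame becomes a legal prediction reference only once it has been acknowledged (class and method names are ours, not from the standard):

```python
class AckBasedEncoder:
    """Sketch of ACK-based Reference Picture Selection: a frame is used
    for prediction only after the decoder acknowledges it error-free."""

    def __init__(self):
        self.acked = []       # frame numbers confirmed error-free
        self.pending = []     # sent but not yet acknowledged

    def choose_reference(self):
        # Predict from the most recent acknowledged frame, or code Intra
        # (return None) if nothing has been acknowledged yet.
        return self.acked[-1] if self.acked else None

    def send_frame(self, frame_no):
        ref = self.choose_reference()
        self.pending.append(frame_no)
        return ref

    def on_ack(self, frame_no):
        if frame_no in self.pending:
            self.pending.remove(frame_no)
            self.acked.append(frame_no)

enc = AckBasedEncoder()
assert enc.send_frame(1) is None   # frame 1 must be coded Intra
enc.on_ack(1)
assert enc.send_frame(2) == 1      # frame 2 predicted from acked frame 1
assert enc.send_frame(3) == 1      # frame 2 not acked yet, still use frame 1
enc.on_ack(2)
assert enc.send_frame(4) == 2      # frame 2 now acknowledged
```

A NACK-based variant would instead keep predicting from the most recent frame and fall back only when a NACK arrives.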
186
[Figure: RPS encoder block diagram — coding control (CC), transform (T) and quantizer (Q), inverse quantizer and inverse transform, and a bank of additional picture memories (AP1 … APn) that can be selected as the prediction (P) reference]
Reference Picture Selection (RPS) Mode of H.263+
Reference pictures are interleaved to create two or more independently decodable threads.
− If a frame is lost, the frame rate drops to 1/2 rate until a sync frame is reached.
− Same syntax as Reference Picture Selection, but without ACK/NACK.
− Adds some overhead since prediction is not based on most recent frame.
187
Multi-threaded Video
[Figure: frames 1–10 interleaved into two independently decodable I/P prediction threads (odd frames in one thread, even frames in the other)]
− A video encoder contains a decoder (called the loop decoder) to create decoded previous frames
which are then used for motion estimation and compensation.
− The loop decoder must stay in sync with the real decoder, otherwise errors propagate.
188
Conditional Replenishment
[Figure: the encoder (ME/MC, DCT, etc.) contains a loop decoder that mirrors the real decoder]
− One solution is to discard the loop decoder.
− Can do this if we restrict ourselves to just two macroblock types:
• Intra coded
• Empty (just copy the same block from the previous frame)
− The technique is to check if the current block has changed substantially since the previous frame and
then code it as Intra if it has changed. Otherwise mark it as empty.
− A periodic refresh of Intra coded blocks ensures all errors eventually disappear.
189
Conditional Replenishment
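The decision rule above (Intra if the block changed substantially, empty otherwise, plus a periodic Intra refresh) can be sketched as follows; the SAD threshold and refresh period are illustrative assumptions:

```python
import numpy as np

def conditional_replenishment(cur, prev, frame_no,
                              threshold=500.0, refresh_period=30):
    """Per-block decision with only two macroblock types: INTRA or EMPTY.
    A block is Intra coded if it changed substantially since the previous
    frame, or when its periodic refresh slot comes up; otherwise it is
    marked empty (copied from the previous frame). No loop decoder needed."""
    h, w = cur.shape
    decisions = {}
    blocks = [(y, x) for y in range(0, h, 16) for x in range(0, w, 16)]
    for i, (by, bx) in enumerate(blocks):
        sad = np.abs(cur[by:by+16, bx:bx+16].astype(float)
                     - prev[by:by+16, bx:bx+16].astype(float)).sum()
        # Periodic refresh: each block is forced Intra once per refresh_period.
        forced = (frame_no % refresh_period) == (i % refresh_period)
        decisions[(by, bx)] = "INTRA" if (sad > threshold or forced) else "EMPTY"
    return decisions

prev = np.zeros((32, 32), dtype=np.uint8)
cur = prev.copy()
cur[0:16, 0:16] = 80                     # only the top-left block changed
d = conditional_replenishment(cur, prev, frame_no=5)
assert d[(0, 0)] == "INTRA"
assert d[(16, 16)] == "EMPTY"
```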
− Lost macroblocks are reported back to the encoder using a reliable back-channel.
− The encoder catalogs spatial propagation of each macroblock over the last M frames.
− When a macroblock is reported missing, the encoder calculates the accumulated error in each MB of the
current frame.
− If an error threshold is exceeded, the block is coded as Intra.
− Additionally, the erroneous macroblocks are not used as prediction for future frames in order to contain
the error.
190
Error Tracking
Appendix II, H.263
− Some parts of a bit stream contribute more to image artifacts than others if lost.
− The bit stream can be prioritized and more protection can be added for higher priority portions.
191
Prioritized Encoding
[Figure: bit stream priorities with increasing error protection — AC coefficients (lowest), DC coefficients, MB information, motion vectors, picture header (highest)]
Prioritized Encoding Demo: unprotected encoding versus prioritized encoding (23% overhead)
Videos used with permission of ICSI, UC Berkeley
− To hide the image degradation from the viewer.
− The main idea behind error concealment is to replace the damaged pixels with pixels from some parts of
the video that have maximum resemblance.
− In general, pixel substitution may come from the same frame or from the previous frame.
− These are called intraframe and interframe error concealment, respectively
192
Error Concealment by Interpolation
[Figure: each pixel of a lost block is replaced by the weighted average of the 4 nearest neighboring pixels, weighted by the distances d1, d2 to the block edges]
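A sketch of this interpolation-based concealment, assuming the four neighboring blocks were received correctly; weights inversely proportional to the distance from each block edge are one common choice (the function name and block size are ours):

```python
import numpy as np

def conceal_block(frame, top, left, size=16):
    """Each pixel of the lost block is the distance-weighted average of the
    4 nearest correctly received pixels: one just above, below, left and
    right of the block. Weights are inversely proportional to distance."""
    out = frame.astype(float).copy()
    for y in range(size):
        for x in range(size):
            neighbours = np.array([frame[top - 1, left + x],      # above
                                   frame[top + size, left + x],   # below
                                   frame[top + y, left - 1],      # left
                                   frame[top + y, left + size]],  # right
                                  dtype=float)
            dists = np.array([y + 1, size - y, x + 1, size - x], dtype=float)
            w = 1.0 / dists
            out[top + y, left + x] = (w * neighbours).sum() / w.sum()
    return out

# A flat 48x48 frame whose centre block is "lost": concealment restores it.
frame = np.full((48, 48), 120.0)
frame[16:32, 16:32] = 0                    # damaged (lost) block
rec = conceal_block(frame, top=16, left=16)
assert np.allclose(rec[16:32, 16:32], 120)
```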
Error Concealment with
• Least Square Constraints
• Bayesian Estimators
• Polynomial Interpolation
• Edge-Based Interpolation
• Multi-directional Recursive Nonlinear Filter (MRNF)
193
Other Error Concealment Techniques
Example: MRNF filtering — MPQT @ 0.5 bpp with 10% block loss; MRNF-GMLOS reconstruction, PSNR = 34.94 dB
− Most multimedia applications place the burden of rate adaptivity on the source.
− For multicasting over heterogeneous networks and receivers, it’s impossible to meet the conflicting
requirements which forces the source to encode at a least-common denominator level.
− The smallest network pipe dictates the quality for all the other participants of the multicast session.
− If congestion occurs, the quality of service degrades as more packets are lost.
194
Network Congestion
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs
Video Compression, Part 3-Section 2, Some Standard Video Codecs

  • 2. Outline
    Section I
    – ISO/IEC JTC 1/SC 29 Structure and MPEG
    – ITU-T Structure and VCEG (Video Coding Experts Group or Visual Coding Experts Group)
    – A Generic Interframe Video Encoder
    – H.261 Video Coding Standard
    – MPEG-1 Video Coding Standard
    – MPEG-2 Video Coding Standard
    Section II
    – MPEG-2 Transport and Program Streams
    – H.263 Video Coding Standard
    – H.263+ Video Coding Standard
    – H.263++ Video Coding Standard
    – Bit-rate (R) and Distortion (D) in Video Coding
  • 4. DVB
    − Created in 1992
      • 300 members, >35 countries – www.dvb.org
      • Promotion of open standards for digital TV broadcasting
    − Principal recommendations
      • Physical layer
        − Satellite: DVB-S, DVB-S2
        − Cable: DVB-C
        − Terrestrial: DVB-T, DVB-T2
        − Mobile: DVB-H, DVB-SH
      • Signalling
        − Service information: DVB-SI
        − Service synchronization: DVB-SAD
      • Protection
        − DVB-CAS, DVB-CSA
        − Smartcard interface: DVB-CI, DVB-CI+
  • 6. The MPEG Transport Stream
    Video or audio elementary stream (ES): a continuous sequence of bits
    Packetized Elementary Stream (PES): PES packet = PES header (carrying time stamps) + payload
    MPEG-2 Transport Stream (TS): TS packet (188 bytes) = TS header (contains PID and clock) + payload
    Rule: every elementary stream gets its own Packet ID (PID)
  • 7. Processing of the Streams in the STB
    Tuner/demodulator (QPSK, QAM, OFDM; A/D conversion) → MPEG-2 demux → video and audio decompression, supported by system memory and a processor
    Example MPEG2-TS at 40 Mbit/s: 6 TV services, 20 radio services, plus Service Information
    The demux sorts the 188-byte TS packets (header with PID + payload) into per-PID queues
  • 8. Digital Terrestrial TV – Layers
    The layers provide clean interface points:
    – Picture layer: multiple picture formats and frame rates (1920 x 1080, 1280 x 720; 50, 25, 24 Hz)
    – Video compression layer: MPEG-2 compression syntax, MP@ML or MP@HL (data headers, motion vectors, chroma and luma DCT coefficients, variable-length codes)
    – Transport layer: MPEG-2 packets (video packets, audio packets, auxiliary data) with packet headers; flexible delivery of data
    – Transmission layer: 7 MHz COFDM / 8-VSB, VHF/UHF TV channel
  • 9. Digital Television Encode Layers
    Picture coding (MPEG-2), audio coding (MPEG-2 or AC-3) and data coding produce PES streams
    Per-program multiplexers (Program 1, 2, 3) combine video, sound and data with control data and the Program Map Table (PMT)
    The service mux adds other data, control data and the Program Association Table (PAT), forming the MPEG transport stream of 188-byte packets
    Error protection is applied before the modulator and transmitter feed the delivery system
  • 10. Digital Television Decode Layers
    The receiver and demodulator, followed by error control, recover the MPEG transport stream from the delivery system
    The MPEG de-multiplexer (demux) splits the transport stream into its component streams
    The picture decoder (MPEG-2), audio decoder (MPEG or AC-3) and data decoder feed the monitor, speakers and data outputs
  • 11. MPEG-2 Video System Standard
    − MPEG-2 container formats (a file format that can contain data compressed by standard codecs)
      • TS: Transport Stream (multiplexed A/V PES and user data) – for noisier environments such as terrestrial broadcast channels
      • PS: Program Stream – for an error-free environment such as Digital Storage Media (DSM)
    − PES: Packetized Elementary Stream, audio or video
    − ES: Elementary Stream – compressed data
    Video and audio encoders produce elementary streams; packetizers turn these into video and audio PES, which feed the Program Stream mux and the Transport Stream mux
  • 12. MPEG-2 Packetized Elementary Stream (PES)
    Video path: video frames → MPEG-2 video encoder → video ES (frames I0 B1 B2 P3 B4 B5 P6 B7 B8 reordered for transmission as I0 P3 B1 B2 P6 B3 B4 I9 B7 B8 P12 …) → MPEG-2 system → video PES
    Audio path: audio tracks (subband samples, side information, sync/system info and CRC, ancillary data field) → MPEG-2 audio encoder → audio ES frames → MPEG-2 system → audio PES
  • 13. MPEG-2 Packetized Elementary Stream (PES)
    Output from the MPEG-2 system encoder. Input elementary streams (ES): digital control stream, digital audio (compressed), digital video (compressed), digital data.
    A PES packet has a 6-byte protocol header:
    • 3-byte start code prefix
    • 1-byte stream ID
      – 110x xxxx: audio stream number x xxxx
      – 1110 yyyy: video stream number yyyy
      – 1111 0010: DSM-CC (Digital Storage Media) control packet
    • 2-byte length field (PES packet up to 65536 bytes, including the 6-byte protocol header)
    PES packet layout: packet start code prefix (24 bits), stream ID (8), PES packet length (16), optional PES header, payload
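The 6-byte protocol header above can be parsed mechanically. A minimal Python sketch; the function name and return shape are illustrative, not from the standard:

```python
import struct

def parse_pes_header(data: bytes):
    """Parse the fixed 6-byte PES protocol header:
    3-byte start code prefix 0x000001, 1-byte stream ID,
    2-byte packet length."""
    if len(data) < 6 or data[:3] != b"\x00\x00\x01":
        raise ValueError("not a PES packet start")
    stream_id = data[3]
    pes_length = struct.unpack(">H", data[4:6])[0]
    if stream_id & 0b1110_0000 == 0b1100_0000:    # 110x xxxx -> audio
        kind = ("audio", stream_id & 0b0001_1111)
    elif stream_id & 0b1111_0000 == 0b1110_0000:  # 1110 yyyy -> video
        kind = ("video", stream_id & 0b0000_1111)
    elif stream_id == 0b1111_0010:                # DSM-CC control packet
        kind = ("dsm-cc", 0)
    else:
        kind = ("other", 0)
    return kind, pes_length
```

For example, `b"\x00\x00\x01\xe0\x00\x10"` is recognised as video stream 0 with a 16-byte packet length.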
  • 14. PES Packet Syntax Diagram
    PES packet: packet start code prefix (24 bits), stream ID (8), PES packet length (16), optional PES header, PES packet data bytes
    Optional PES header: '10', PES scrambling control (2), PES priority (1), data alignment indicator (1), copyright (1), original or copy (1), 7 flags, PES header length (8), optional fields, stuffing bytes (0xFF)
    Optional fields: PTS/DTS (33), ESCR (42), ES rate (22), DSM trick mode (8), additional copy info (7), previous PES CRC (16), PES extension (5 flags + optional fields)
    PES extension optional fields: PES private data (128), pack header field (8), program packet sequence counter (16), P-STD buffer (16), PES extension length (7), PES extension data (m x 8)
  • 15. MPEG-2 Packetized Elementary Stream (PES)
    Packetized Elementary Stream
    • The basic stream format for video, audio, data, …
    • PES offers a mechanism to carry conditional access information
    • PES can be scrambled and also assigned priority
    • PES can carry time references: PTS and DTS
    • The largest data size within a PES packet is 64 KB
    PES indicators
    • PES_priority – indicates the priority of the current PES packet
    • PES_scrambling_control – defines whether scrambling is used, and the chosen scrambling method
    • Data_alignment_indicator – indicates whether the payload starts with a video or audio start code
    • Copyright information – indicates whether the payload is copyright protected
    • Original_or_copy – indicates whether this is the original ES
  • 16. MPEG-2 Packetized Elementary Stream (PES)
    PES optional fields
    − Presentation Time Stamp (PTS) and possibly a Decode Time Stamp (DTS): for audio/video streams, these time stamps may be used to synchronize a set of elementary streams and control the rate at which the receiver replays them
    − Elementary Stream Clock Reference (ESCR)
    − Elementary stream rate: the rate at which the ES was encoded
    − Trick mode: indicates the video/audio is not the normal ES, e.g. after DSM-CC has signalled a replay
    − Copyright information: set to 1 to indicate a copyrighted ES
    − CRC: may be used to monitor errors in the previous PES packet
    − PES extension information: may be used to support MPEG-1 streams
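PTS and DTS are carried as 33-bit counts of a 90 kHz clock, so they wrap roughly every 26.5 hours. A small sketch of the two conversions a decoder needs (the function names are illustrative):

```python
PES_CLOCK_HZ = 90_000   # PTS/DTS tick rate in MPEG-2 systems
TS_WRAP = 1 << 33       # timestamps are 33-bit counters

def pts_to_seconds(pts: int) -> float:
    """Convert a raw 33-bit timestamp to seconds."""
    return (pts % TS_WRAP) / PES_CLOCK_HZ

def pts_delta(later: int, earlier: int) -> int:
    """Timestamp difference that survives 33-bit wrap-around."""
    return (later - earlier) % TS_WRAP
```

So `pts_to_seconds(90_000)` is exactly one second, and a delta computed across the wrap point still comes out positive.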
  • 17. MPEG-2 Packetized Elementary Stream (PES)
    The PES packet is the central structure used in both PS and TS streams; it results from packetizing continuous streams of compressed audio or video.
    − PES packets contain two timestamps:
      1. Decoding Time Stamp (DTS) – tells the decoder when the packet should be decoded; the data is then decoded into the bit stream.
      2. Presentation Time Stamp (PTS) – tells the decoder when the decoded data should be displayed.
    − The systems part specifies that the decoder must contain a System Time Clock (STC).
      • When the decoder's STC equals a packet's DTS, the data in the packet is decoded.
      • When the STC equals a packet's PTS, the decoded data is sent to the display device (e.g. graphics card or sound card).
      • The state of the encoder's clock is placed in the stream at regular intervals; this synchronises the decoder with the encoder.
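The DTS/PTS rule above is what lets a decoder turn transmission order back into display order. A toy Python sketch, assuming each packet is modelled as a hypothetical (name, dts, pts) tuple:

```python
def reorder_for_display(packets):
    """Toy model of the DTS/PTS rule: a packet is decoded when the
    STC reaches its DTS and presented when the STC reaches its PTS,
    so decode order follows DTS and display order follows PTS."""
    decoded = sorted(packets, key=lambda p: p[1])
    displayed = sorted(packets, key=lambda p: p[2])
    return [p[0] for p in decoded], [p[0] for p in displayed]

# B-frames make the two orders differ: P3 must be decoded before
# B1 and B2, but is displayed after them.
gop = [("I0", 0, 1), ("P3", 1, 4), ("B1", 2, 2), ("B2", 3, 3)]
```

Here `reorder_for_display(gop)` yields decode order I0, P3, B1, B2 but display order I0, B1, B2, P3.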
  • 18. MPEG-2 Packetized Elementary Stream (PES)
    − Packetizing the continuous streams of compressed video and audio bitstreams (elementary streams, or ES) generates PES packets.
    − A typical method of transmitting elementary stream data from a video or audio encoder is to first create PES packets from the elementary stream data and then to encapsulate these PES packets inside Transport Stream (TS) packets or Program Stream (PS) packets.
    − The TS packets can then be multiplexed and transmitted using broadcasting techniques, such as those used in ATSC and DVB systems.
    − Simply stringing together PES packets from the various encoders, with other packets containing necessary data, generates a single-bitstream programme stream.
    − A transport stream consists of fixed-length packets containing 4 bytes of header followed by 184 bytes of data, where the data are obtained by segmenting the PES packets.
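The segmentation step above can be sketched in a few lines of Python. This is a deliberate simplification: a real multiplexer fills the final short packet via the adaptation field, whereas this sketch just pads with 0xFF for illustration:

```python
TS_PACKET = 188
TS_HEADER = 4
TS_PAYLOAD = TS_PACKET - TS_HEADER   # 184 payload bytes per packet

def segment_pes(pes: bytes, pid: int):
    """Split one PES packet into fixed-size 188-byte TS packets."""
    packets = []
    cc = 0                                 # 4-bit continuity counter
    for off in range(0, len(pes), TS_PAYLOAD):
        chunk = pes[off:off + TS_PAYLOAD]
        pusi = 0x40 if off == 0 else 0x00  # payload_unit_start_indicator
        header = bytes([
            0x47,                          # sync byte
            pusi | ((pid >> 8) & 0x1F),    # PUSI + PID high 5 bits
            pid & 0xFF,                    # PID low 8 bits
            0x10 | cc,                     # payload only + continuity counter
        ])
        packets.append(header + chunk.ljust(TS_PAYLOAD, b"\xff"))
        cc = (cc + 1) & 0x0F
    return packets
```

A 200-byte PES packet therefore becomes two 188-byte TS packets, with the payload-unit-start flag set only on the first.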
  • 19. MPEG-2 Transport Stream (TS)
    Video, audio and ancillary-data compression in the video and audio subsystems produce elementary streams (ES); ES packetizers turn these into PES
    The transport multiplexer of the multiplexing subsystem combines the PES streams with control data into the transport stream (TS)
    The transmission subsystem applies error-correction encoding and digital modulation
  • 20. The MPEG Transport Stream
    Video or audio elementary stream (ES): a continuous sequence of bits
    Packetized Elementary Stream (PES): PES packet = PES header (carrying time stamps) + payload
    MPEG-2 Transport Stream (TS): TS packet (188 bytes) = TS header (contains PID and clock) + payload
    Rule: every elementary stream gets its own Packet ID (PID)
  • 21. MPEG-2 Transport Stream (TS) Formation
    PES packets from Program 1 (Video 1 PES, Audio 1 PES) and Program 2 (Video 2 PES) are interleaved into a transport stream of 188-byte packets
  • 22. MPEG-2 Transport Stream (TS) Formation
    Each program (1-3) has its own video, audio and data encoders and packetizers (Video_1/Audio_1/Data_1, Video_2/Audio_2/Data_2, Video_3/Audio_3/Data_3), feeding a per-program transport mux
    The transport mux interleaves the resulting transport packets (TP1_1, TP2_1, TP1_2, TP2_2, TP3_1, …) into a single transport stream
  • 23. MPEG-2 Transport Stream (TS) Packet
    One TS packet (188 bytes) = sync (1 byte) + header with PID (3 bytes) + adaptation field (n bytes) + payload of PES / section / piped data (184 − n bytes)
    Time-division multiplexing (TDM) carries video, audio, teletext (DVB), SI, conditional access, IP packets, private data, applications and application info
    MPEG-2 packets can contain:
    − Video, audio, teletext, data streaming (13818-1)
    − DSM-CC (Digital Storage Media Command and Control): data carousel, object carousel, SI tables, etc. (13818-6)
    − DVB data piping
  • 24. The MPEG Transport Stream
    It differs significantly from MPEG-1:
    • It offers robustness for noisy channels.
    • It offers the ability to assemble multiple programmes into a single stream.
    • It uses fixed-length packets of 188 bytes with a new header syntax.
    • Such a packet can be segmented into four 47-byte units to be accommodated in the payloads of four ATM cells, with the AAL1 adaptation scheme.
    • It is therefore more suitable for hardware processing and for error-correction schemes, such as those required in television broadcasting, satellite/cable TV and ATM networks.
  • 25. The MPEG Transport Stream
    The transport stream uses a fixed packet length (188 bytes):
    • This allows easy decoder/encoder synchronisation.
    • It also allows error-correction codes to be inserted.
    Transport streams can contain packets from a number of programs:
    • These can be different TV channels, or maybe an EPG.
    • Each program has a unique packet ID (PID) placed in the packet header.
    • The decoder can discard packets of other programs by checking the PID.
  • 26. − Multiple programmes with independent time bases can be multiplexed into one transport stream. − The transport stream also allows • Synchronous multiplexing of programmes • Fast access to the desired programme for channel hopping • Multiplexing of programmes with clocks unrelated to the transport clock • Correct synchronization of elementary streams for playback • Control of the decoder buffers during start-up and playback for both constant and variable bit rate (VBR) programmes. 26 The MPEG Transport Stream
  • 27. MPEG-2 Transport Stream (TS) Packet (188 bytes): Sync (1 byte) + TS Header (3 bytes) + Payload (184 bytes, carrying PES 1, PES 2, ..., PES N). Header fields: Sync Byte (8 bits), Transport Error Indicator (1), Payload Unit Start Indicator (1), Transport Priority (1), PID (13), Scrambling Control (2), Adaptation Field Control (2), Continuity Counter (4). Optional Adaptation Field: Adaptation Field Length (8 bits), flags (Discontinuity Indicator, Random Access Indicator, ES Priority Indicator, ...), then optional fields: PCR fields, Original PCR (OPCR), Splice Countdown, Private Data Length (8 bits) plus private data, Adaptation Field Extension Length plus extension. 27
  • 28. PID numbers for Program Specific Information (PSI) used for Service Information (SI) 0x0000 PAT Program Association Table 0x0001 CAT Conditional Access Table 0x0002 TSDT Transport Stream Description Table, EI DVB 0x0003-0x000F reserved 0x0010 NIT, ST Network Information Table, Stuffing Table 0x0011 SDT, BAT, ST Service Description Table, Bouquet Association Table, Stuffing Table 0x0012 EIT, ST Event Information Table, Stuffing Table 0x0013 RST, ST Running Status Table, Stuffing Table 0x0014 TDT, TOT, ST Time and Date Table, Time Offset Table, Stuffing Table 0x0015 Network synchronization 0x0016-0x001D reserved for future use 0x001E DIT Discontinuity Information Table 0x001F SIT Selection Information Table 188 bytes Sync 1 byte Header 3 bytes Optional Adaptation Field X bits Payload 184 bytes PID Packet Identifier 13 bits MPEG-2 Transport Stream (TS) Packet 28
  • 29. PID − Indicates where the data goes • Allows filtering of packets for non-viewed programs − Does not indicate PES/section or coding type − Reserved PIDs • Some PSI data • Program Association Table (PAT) • Conditional Access Table (CAT) • Transport Stream Description Table (TSDT) • User-reserved: Other standards bodies (DVB, ATSC, …) PSI − Multiplex description − Program description − Stream description 29 Program ID (PID) and Program Specific Information (PSI)
  • 30. 30 Program Association Table (PAT) Program # 100 – PMT PID 1025 Program # 200 – PMT PID 1026 Program Map Table (PMT) Program # 100 Video PID – 501 – MPEG-2 Video Audio PID (English) – 502 – MPEG-2 Audio Audio PID (Spanish) – 503 – MPEG-2 Audio Program Map Table (PMT) Program # 200 Video PID – 601 – AVC Video Audio PID (English) – 602 – AAC Audio MPEG-2 Signaling Tables
  • 31. 31 MPEG-2 Signaling Tables Network Information Bouquet Association Service Description Event Information Running Status Time & Date Stuffing
  • 33. Program Association Table (PAT) • Identifies a multiplex (ID 16 bits) (The PAT is sent with the well-known PID value of 0x0000) • Lists all programs (Lists the PIDs of tables describing each program) ─ Program Number (16 bits) ─ PID carrying the PMT • If Program Number = 0, the listed PID carries the NIT Program Map Table (PMT) • Defines the set of PIDs associated with a program, e.g. audio, video, ... • PID carrying the PCR ─ Not always a media stream! • Program Descriptors ─ Protection systems, interactive apps, … • Lists all streams ─ PID: where stream data is carried in the multiplex ─ streamType: type of media compression ─ Stream descriptors • Language, coding parameters, demux parameters, … 33 MPEG-2 Signaling Tables Program Association Table (PAT) Program Map Table (PMT) Other Packets, Audio Packet, Video Packet (figure PIDs: 0, 51, 64, 66, 101, 150)
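The PAT → PMT → elementary-PID chain described above can be sketched with a toy lookup. The program numbers and PIDs below are the illustrative values used in these slides, not data parsed from a real bitstream:

```python
# Hypothetical, pre-parsed signaling tables (real code would parse the
# PAT sections on PID 0x0000 and each PMT section from the bitstream).
pat = {100: 1025, 200: 1026}   # program_number -> PMT PID

pmts = {                        # PMT PID -> program definition
    1025: {"pcr_pid": 501, "streams": {501: "MPEG-2 Video", 502: "MPEG-2 Audio"}},
    1026: {"pcr_pid": 601, "streams": {601: "AVC Video", 602: "AAC Audio"}},
}

def pids_for_program(program_number):
    """Return the set of PIDs a decoder must filter to present one program."""
    pmt_pid = pat[program_number]
    pmt = pmts[pmt_pid]
    # The PMT PID itself, the PCR PID, and every elementary-stream PID.
    return {pmt_pid, pmt["pcr_pid"], *pmt["streams"]}

print(sorted(pids_for_program(200)))   # [601, 602, 1026]
```

Every other PID in the multiplex can be discarded, which is exactly the filtering a set-top box demultiplexer performs.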
  • 34. CAT - Conditional Access Table − Defines type of scrambling used and PID values of transport streams which contain the conditional access management and entitlement information (EMM). TSDT- Transport Stream Description Table − Contains descriptors relating to the overall transport stream 34 MPEG-2 Signaling Tables
  • 35. NIT - Network Information Table − It contains details of the bearer network (network topology) used to transmit the MPEG multiplex, including the carrier frequency Service Description Table (SDT) − Multiplex Description (channel names, …) − Editorial description of the services in a TS − Service names and ancillary services Event Information Table (EIT) − Electronic Program Guide for present and following shows Time and Date Table (TDT) − Current date and time, UTC (used to synchronize STB system time) 35 MPEG-2 Signaling Table (DVB, Mandatory)
  • 36. Bouquet Association Table (BAT) − Commercial operator description and services − Several commercial operators may sell the same services Running Status Table (RST) Stuffing Table (ST) Time Offset Table (TOT) − Local offset by region (used to synchronize STB system time) Application Information Table (AIT) − Interactive application signaling (MHP, HbbTV, …) − Application type IP/MAC Notification Table (INT) − IP transport 36 MPEG-2 Signaling Tables (DVB, Optional)
  • 37. − Scrambling may happen: • At PES payload level • At some sections' payload level • At TS packet level − Most common use case − PES headers are scrambled − Exceptions • PAT: required to get the list of programs • PMT: required to get the protection system used • NIT/TSDT (Transport Stream Description Table): infrastructure management 37 Scrambling in MPEG-2 TS
  • 38. AV Synchronization − Want audio and video streams to be played back in sync with each other − Video stream contains “Presentation Time Stamps (PTS)” − MPEG-2 clock runs at 90 kHz • Good for both 25 and 30 fps − Each program carries a clock • Program Clock Reference (PCR) – PCR timestamps are sent with data by sender • PES Timestamps relate to this clock − Receiver uses PLL to synchronize clocks 38 MPEG-2 TS Timing
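The claim that the 90 kHz clock is "good for both 25 and 30 fps" is easy to verify: the PTS increment per frame comes out as an exact integer for the common frame rates (29.97 fps is exactly 30000/1001):

```python
# MPEG-2 PTS/DTS resolution in Hz, per ISO/IEC 13818-1.
CLOCK = 90_000

print(CLOCK // 25)              # ticks per frame at 25 fps
print(CLOCK // 30)              # ticks per frame at 30 fps
print(CLOCK * 1001 // 30000)    # ticks per frame at 30000/1001 (29.97) fps
```

The increments are 3600, 3000 and 3003 ticks respectively, all exact, so no rounding error accumulates in the timestamps.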
  • 39. PCR (Program Clock Reference) fields in the TS adaptation field: − PCR base (33 bits): the intended time, in 90 kHz clock ticks, of the arrival of the fourth byte of this structure at the input of the decoder. − PCR extension (9 bits): additional resolution in 27 MHz clock ticks; PCR = 300 × base + ext. − Original PCR (OPCR): same base/extension format; it should not be modified by any multiplexer or decoder, and is used for recovery of a single-program PCR from a multi-program Transport Stream. MPEG-2 Transport Stream (TS) Packet 39
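The PCR reconstruction rule above (PCR = 300 × base + ext, where the base counts 90 kHz ticks and the extension 27 MHz ticks) can be checked with a couple of lines; the function names are just for illustration:

```python
# PCR = 300 * base + ext yields a value in 27 MHz ticks,
# since 300 * 90 kHz = 27 MHz.

def pcr_value(base, ext):
    """Full PCR in 27 MHz ticks from the 33-bit base and 9-bit extension."""
    return 300 * base + ext

def pcr_seconds(base, ext):
    """Same PCR expressed in seconds of wall-clock time."""
    return pcr_value(base, ext) / 27_000_000

# One second of programme time corresponds to 90_000 base ticks:
print(pcr_seconds(90_000, 0))   # 1.0
```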
  • 40. Program Association Table (PAT) Program Map Table (PMT) Other Packets, Audio Packet, Video Packet Packet header includes a unique Packet ID (PID) for each stream PAT lists PIDs for program map tables: Network Info = 10, Prog 1 = 150, Prog 2 = 301, Prog 3 = 511, etc. Program guides, Subtitles, Multimedia data, Internet packets, etc. PMT lists the PIDs associated with a particular program: Video = 51, Audio (English) = 64, Audio (French) = 66, Subtitle = 101, etc. MPEG-2 Signaling Tables 40
  • 41. MPEG-2 Example Transport Stream Packet 41 Example Transport Stream Packet 188 Bytes Header Flags • Transport Error Indicator • Payload Unit Start Indicator • Transport Priority • Transport Scrambling Control Important PIDs • 0x0000 – PAT PID • 0x1FFF – “Null PID” gives space for VBR Continuity Counter (CC) • 4-bit per-PID sequence # • Helps detect packet loss Adaptation Field (optional) • Can carry a range of other info • PCR, splice point flags • Transport of private data Example Transport Stream 0x47 (sync) Flags PID (Packet ID) More Flags CC Adaptation Field Data Payload PID 0 CC 3 PAT Data PID 601 CC 11 PID 602 CC 7 PID 0x1FFF NULL PID 601 CC 12 PID 602 CC 8
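A minimal sketch of pulling the PID, the payload-unit-start flag and the continuity counter out of the fixed 4-byte TS header (bit positions per ISO/IEC 13818-1; the packet below is hand-built for illustration, not captured from a broadcast):

```python
def parse_ts_header(packet: bytes):
    """Extract (PID, payload_unit_start, continuity_counter) from a 188-byte TS packet."""
    assert packet[0] == 0x47, "lost sync"          # sync byte
    pid = ((packet[1] & 0x1F) << 8) | packet[2]    # 13-bit PID
    payload_unit_start = bool(packet[1] & 0x40)    # PUSI flag
    continuity_counter = packet[3] & 0x0F          # 4-bit per-PID counter
    return pid, payload_unit_start, continuity_counter

# Hand-made header: PID 601, PUSI set, CC = 12, payload-only packet.
pkt = bytes([0x47, 0x40 | (601 >> 8), 601 & 0xFF, 0x10 | 12]) + bytes(184)
print(parse_ts_header(pkt))   # (601, True, 12)
```

A receiver checks that the continuity counter increments by one (mod 16) per PID; a jump indicates lost packets, as noted above.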
  • 42. MPEG-2/DVB PID Allocation − Program Association Table (PAT) • always has PID = 0 (zero) − Conditional Access Table (CAT) • always has PID = 1 − Event Information Table (EIT) • always has PID = 18 (0x0012) − Program Map Tables (PMTs) • have the PIDs specified in the PAT − The audio, video, PCR, subtitle, teletext etc PIDs for all programs are specified in their respective PMTs MPEG-2/DVB PID Allocation Table PID value PAT 0x0000 CAT 0x0001 TSDT 0x0002 Reserved 0x0003 – 0x000F NIT, ST 0x0010 SDT, BAT, ST 0x0011 EIT, ST 0x0012 RST, ST 0x0013 TDT, TOT, ST 0x0014 Network Synchronization 0x0015 Reserved 0x0016 – 0x001B Inband signaling 0x001C measurement 0x001D DIT 0x001E SIT 0x001F 42
  • 43. Increase resilience to transmission errors − Redundancy − Reed-Solomon 255/191, 25% redundant − Each RS column is sent in a section − FEC aggregation is in another table • Can be ignored • Does not interfere with MPE Without modifying existing implementations − No modification to MPE (Multi-Protocol Encapsulation) sections • Each MPE+IP in a section • Aggregation of IP datagrams in memory 43 DVB MPE-FEC
  • 44. 44 Data over DVB − Data piping • raw transport on a PID − Data streaming • sent in PES packets − DSM-CC Data carousel • Transport in sections − Object Carousel • Data Carousel + file system − Multi-Protocol Encapsulation (MPE) • IP datagrams over TS Application
  • 45. Program 0 PID=16 Program 1 PID=22 Program 2 PID=33 … … Program M PID=55 PMT (Program Map Table) for Program 1 CAT (Conditional Access Table) (PID=1) NIT (Network Information Table) (always Program 0, PID=16) NIT is considered Private data by ISO Table section ID assigned by system Table section ID always set to 0x01 Table section ID always set to 0x02 Table section ID always set to 0x00 Stream 1 PCR 31 Stream 2 Video 1 54 Stream 3 Audio 1 48 Stream 3 Audio 2 49 … … … Stream k Data K 66 PAT (Program Association Table) (PID=0) CA Section 1 (Program 1) EMM PID(99) CA Section 2 (Program 2) EMM PID(109) CA Section 3 (Program 3) EMM PID(119) … … CA Section k (Program k) EMM PID(x) Private Section 1 NIT Info. Private Section 2 NIT Info. Private Section 3 NIT Info. … … Private Section k NIT Info. 0 PAT 22 Prog 1. PMT 33 Prog 2. PMT 99 Prog 1 EMM 31 Prog 1 PCR 48 Prog 1 Audio 1 54 Prog 1 Video 1 109 Prog 2 EMM Multiple-Program MPEG-2 Transport Stream: PMT (Program Map Table) for Program 2 Stream 1 PCR 41 Stream 2 Video 1 19 Stream 3 Audio 1 81 Stream 3 Audio 2 82 … … … Stream k Data K 88 MPEG-2 / DVB PSI (Program Specific Information) Structure
  • 46. 46 Transport Multiplexing & Decoding Transport Stream Demultiplex and Decoder Clock Control Video Decoder Channel Specific Decoder Audio Decoder Decoded Video Decoded Audio Transport stream containing one or multiple programs Transport Stream Demultiplex and Decoder Channel Specific Decoder Transport Stream with single program Program Stream ≠ Transport Stream Channel Channel
  • 47. 47 Transport Stream Decoder Multiplex Buffer Video Decoder Transport Buffer Re-order Buffer Decoded Video Decoded Audio ES Stream Buffer Multiplex Buffer Transport Buffer ES Stream Buffer Multiplex Buffer Transport Buffer ES Stream Buffer Audio Decoder System Info. Decoder System Control Transport Stream Decoder
  • 48. − At the receiver, the transport streams are decoded by a transport demultiplexer (which includes a clock extraction mechanism), unpacketised by a depacketiser and sent to audio and video decoders for decoding. − The decoded signals are sent to the receiver buffer and presentation unit, which outputs them to a display device and a speaker at the appropriate time. − Similarly, if the programme streams are used, they are decoded by the programme stream demultiplexer and depacketiser and sent to the audio and video decoders. − The decoded signals are sent to the respective buffer to await presentation. − Also similar to MPEG-1 systems, the information about systems timing is carried by the clock reference field in the bitstream that is used to synchronise the decoder Systems Time Clock (STC). − Presentation Time Stamps (PTS), which are also carried by the bitstream, control the presentation of the decoded output. 48 Transport Stream Decoder
  • 49. − For a payload of around 19 Mb/s • 1 HDTV service - sport & high action • 2 HDTV services - both film material • 1 HDTV + 1 or 2 SDTV non-action/sport • 3 SDTV for high action & sport video • 6 SDTV for film, news & soap operas • However, you do not get something for nothing. − More services means less quality 49 Examples of DVB Data Containers Single HDTV program HDTV 1 SDTV 1 SDTV 2 SDTV 3 SDTV 4 SDTV 5 Multiple SDTV programs SDTV 1 HDTV 1 Simulcast HDTV & SDTV Channel bandwidth can be used in different ways
  • 50. 50 − MPEG-2 Container formats (a file format that can contain data compressed by standard codecs) • TS: Transport Stream (Multiplexed A/V PES and User Data) • PS: Program Stream − PES: Packetized Elementary Stream, Audio or Video − ES: Elementary Streams-Compressed Data Video Data Audio Data Elementary Streams Video Encoder Audio Encoder Packetizer Packetizer ES ES Video PES Program Stream MUX Transport Stream MUX Audio PES PS: Program Stream TS: Transport Stream MPEG-2 Video System Standard For noisier environments such as terrestrial broadcast channels For an error-free environment such as Digital Storage Media (DSM)
  • 52. Program Stream (PS) − It is similar to the MPEG-1 systems stream but uses a modified syntax and new functions to support advanced functionalities (e.g. scalability). − It provides compatibility with MPEG-1 systems (an MPEG-2 decoder should be capable of decoding an MPEG-1 bitstream). − Like the MPEG-1 decoder, programme stream decoders typically employ long, variable-length packets. Such packets are well suited for software-based processing and error-free transmission environments (such as storage on disk). − The packet sizes are usually 1–2 kbytes long, chosen to match the disc sector sizes (typically 2 kbytes). − However, packet sizes as long as 64 kbytes are also supported. 52 MPEG-2 Systems
  • 53. 53 MPEG-2 Systems Program Stream (PS) − It includes features not supported by MPEG-1 systems. • Scrambling of data • Assignment of different priorities to packets • Information to assist alignment of elementary stream packets • Indication of copyright • Indication of fast forward, fast reverse and other trick modes for storage devices. • An optional field in the packets is provided for testing the network performance • Optional numbering of a sequence of packets is used to detect lost packets.
  • 55. − H.263 standardization effort started Nov 1993 (finalization:1995) − The primary goal in the H.263 standard codec was coding of video at low or very low bit rates (less than 64 kbps) for applications such as mobile networks, public switched telephone network (PSTN) and the narrowband Integrated Services Digital Network (ISDN). − Later on, the codec was found so attractive that higher resolution pictures could also be coded at relatively low bit rates. − The standard recommends operation on five standard pictures of the CIF family, known as sub- QCIF, QCIF, CIF, 4CIF and 16CIF. − The H.263+ (H.263 Ver. 2) was the first set of extensions to this family, which was intended for near-term standardisation of enhancements of H.263 video coding algorithms for real-time telecommunications. − Work on improving the encoding performance was an ongoing process under H.263++ (H.263 Ver. 3), and every now and then a new extension called annex was added to the family. 55 H.263, H.263+ and H.263++ Standard
  • 56. − The codec for long-term standardisation was called H.26L. − The H.26L project had the mandate from ITU-T to develop a very low bit rate (less than 64 kbit/s with emphasis on less than 24 kbit/s) video coding recommendation achieving • Better Video Quality • Lower Delay • Lower Complexity • Better Error Resilience − In 2001, MPEG-4 committee joined the project in investigating new video coding techniques and technologies as candidates for recommendation. − The joint team eventually recommended the Joint Video Team (JVT) Codec which is informally known as Advanced Video Coding (AVC). − The standard is formally known as H.264 by the ITU-T and MPEG-4 part 10 by ISO/IEC. 56 H.26L Standard
  • 57. − H.263 is a combination of H.261 and MPEG − H.261 only accepts QCIF and CIF formats → Various picture formats such as sub-QCIF, 4CIF, etc. − No 1/2-pel motion estimation in H.261, which instead uses a spatial loop filter → Half-pel motion compensation − H.261 does not use median predictors for motion vectors but simply uses the motion vector of the MB to the left as the predictor. − In H.263 there are four negotiable options − H.261 does not use a 3-D VLC for transform coefficient coding → 3-D VLC for transform coefficients − GOB headers are mandatory in H.261; they are optional in H.263. − Quantizer changes at MB granularity require 5 bits in H.261 and only 2 bits in H.263. − No loop filter in H.263 − No macroblock addressing in H.263 (included in the MB header) 57 H.263 Improvements over H.261
  • 58. Unrestricted Motion Vector Mode (Annex D) –MVs are allowed to point outside the picture (outside pixels obtained by boundary-repetition extension) –Larger ranges: [-31.5, 31.5] instead of [-16, 15.5] Syntax-Based Arithmetic Coding Mode (Annex E) –Provides about a 5% bit-rate reduction and is rarely used Advanced Prediction Mode (Annex F) –Allows 4 motion vectors per MB, one for each 8x8 block –Overlapped block motion compensation for luminance –Allows MVs to point outside of the picture –Reduces blocking artifacts and increases subjective picture quality. PB-Frames Mode (Annex G) –Doubles the frame rate without a significant increase in bit rate Usage: – The decoder signals the encoder which of the options it has the capability to decode. – If the encoder supports some of these options, it may enable them. 58 Negotiable Options in H.263
  • 59. H.261 H.263 Demo: QCIF, 8 fps @ 28 Kb/s 59
  • 60. Composed of a baseline plus four negotiable options 60 ITU-T Recommendation H.263 Baseline Codec Unrestricted/Extended Motion Vector Mode Advanced Prediction Mode PB Frames Mode Syntax-based Arithmetic Coding Mode
  • 61. Always 12:11 pixel aspect ratio. 61 Frame Formats − SQCIF: Y 128x96, U/V 64x48 − QCIF: Y 176x144, U/V 88x72 − CIF: Y 352x288, U/V 176x144 − 4CIF: Y 704x576, U/V 352x288 − 16CIF: Y 1408x1152, U/V 704x576 (Figure: a 352x288 CIF picture with 12:11 pixels gives a 4:3 picture aspect ratio.)
  • 62. Picture & Macroblock Types − Two picture types: • Intra (I-frame) implies no temporal prediction is performed. • Inter (P-frame) may employ temporal prediction. − Macroblock (MB) types:  Intra & Inter MB types (even in P-frames). • Inter MBs have shorter symbols in P frames • Intra MBs have shorter symbols in I frames  Not coded MB types- MB data is copied from previous decoded frame. 62 H.263 Baseline
  • 63. Motion Vectors − Motion vectors have 1/2-pixel granularity. − Reference frames must be interpolated by two. − MVs are not coded directly; rather, a median predictor is used: MV_pred = median(MV_A, MV_B, MV_C), where A, B and C are neighbouring blocks of the current block X. − The predictor residual MV_X − MV_pred is then coded using a VLC table. 63 H.263 Baseline
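A tiny sketch of the median prediction: each component of the predictor is taken independently as the median of the three neighbouring vectors, and only the residual against this predictor is entropy-coded:

```python
def median3(a, b, c):
    """Median of three values: the middle element after sorting."""
    return sorted((a, b, c))[1]

def mv_predictor(mv_a, mv_b, mv_c):
    """Component-wise median of three candidate motion vectors (x, y)."""
    return (median3(mv_a[0], mv_b[0], mv_c[0]),
            median3(mv_a[1], mv_b[1], mv_c[1]))

# Neighbouring vectors in half-pel units (illustrative values):
pred = mv_predictor((1.5, 0.0), (2.0, -0.5), (0.5, 0.5))
print(pred)   # (1.5, 0.0)
```

Because the median tracks whichever two neighbours agree, the residuals cluster near zero and code cheaply with the short VLC symbols shown on the next slide.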
  • 64. Motion Vector Delta (MVD) Symbol Lengths 64 H.263 Baseline (Chart: code length in bits, from 0 to about 14, plotted against MVD absolute values 0, 0.5, 1, 1.5, 2, 2.5–3.5, 4.0–5.0, 5.5–12.0 and 12.5–15.5.)
  • 65. Transform Coefficient Coding − Assign a variable length code according to three parameters (3-D VLC): 1) Length of the run of zeros preceding the current nonzero coefficient. 2) Amplitude of the current coefficient. 3) Indication of whether current coefficient is the last one in the block. − The most common are variable length coded (3-13 bits), the rest are coded with escape sequences (22 bits) 65 H.263 Baseline
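The three parameters of the 3-D VLC can be generated from a scanned coefficient block with a short loop. This is only a sketch of the event formation, not of the actual VLC table lookup; the coefficient values are made up:

```python
def run_level_last(coeffs):
    """Turn a zigzag-scanned coefficient list into (run, level, last) events."""
    nonzero_positions = [i for i, c in enumerate(coeffs) if c != 0]
    events, prev = [], -1
    for k, i in enumerate(nonzero_positions):
        last = (k == len(nonzero_positions) - 1)   # final nonzero in the block?
        events.append((i - prev - 1, coeffs[i], last))  # run of zeros, amplitude
        prev = i
    return events

# Hypothetical zigzag-scanned block:
print(run_level_last([7, 0, 0, -2, 1, 0, 0, 0, 3]))
# [(0, 7, False), (2, -2, False), (0, 1, False), (3, 3, True)]
```

Each (run, level, last) triple is then looked up in the VLC table: common triples get 3 to 13 bits, everything else takes a 22-bit escape.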
  • 66. Quantization − H.263 uses a scalar quantizer with center clipping. − Quantizer varies from 2 to 62, by 2’s. − Can be varied ±1, ±2 at macroblock boundaries (2 bits) − Can be varied 2-62 at row and picture boundaries (5 bits). 66 H.263 Baseline Q -Q 2Q -2Q IN OUT
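The center-clipping behaviour in the figure can be sketched as a dead-zone quantizer with step 2Q. This illustrates the shape only; the exact H.263 quantization and reconstruction formulas differ slightly between intra and inter blocks, so treat the functions below as an assumption-laden sketch:

```python
def quantize(coef, q):
    """Dead-zone quantizer: truncation toward zero creates the flat region around 0."""
    return int(coef / (2 * q))

def dequantize(level, q):
    """Mid-step reconstruction of a nonzero level (sketch, not the exact H.263 rule)."""
    if level == 0:
        return 0
    sign = 1 if level > 0 else -1
    return sign * q * (2 * abs(level) + 1)

q = 8
print(quantize(10, q), quantize(-50, q))   # 0 -3  (small inputs clipped to 0)
print(dequantize(quantize(-50, q), q))     # -56
```

The dead zone suppresses small (mostly noise) coefficients entirely, which is where much of the bit saving comes from.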
  • 67. Bit Stream Syntax 67 H.263 Baseline Hierarchy of three layers. Picture Layer GOB* Layer MB Layer *A GOB is usually a row of macroblocks, except for frame sizes greater than CIF. Picture Hdr GOB Hdr MB MB ... GOB Hdr ...
  • 68. Picture Layer Concepts − PSC - sequence of bits that can not be emulated anywhere else in the bit stream. − TR - 29.97 Hz counter indicating time reference for a picture. − PType - Denotes Intra, Inter-coded, etc. − P-Quant - Indicates which quantizer (2…62) is used initially for the picture. 68 H.263 Baseline Picture Start Code Temporal Reference Picture Type Picture Quant
  • 69. GOB Layer Concepts, GOB Headers are Optional − GSC - Another unique start code (17 bits). − GOB Number - Indicates which GOB, counting vertically from the top (5 bits). − GOB Quant - Indicates which quantizer (2…62) is used for this GOB (5 bits). GOB can be decoded independently from the rest of the frame 69 H.263 Baseline GOB Start Code GOB Number GOB Quant
  • 70. Macroblock Layer Concepts − COD - if set, indicates empty Inter MB. − MB Type - indicates Inter, Intra, whether MV is present, etc. − CBP - indicates which blocks, if any, are empty. − DQuant - indicates a quantizer change by +/- 2, 4. − MV Deltas - are the MV prediction residuals. − Transform coefficients - are the 3-D VLC’s for the coefficients. 70 H.263 Baseline Coded Flag MB Type Code Block Pattern MV Deltas Transform Coefficients DQuant 8x8 pixel blocks macroblock Y Cb Cr
  • 71. Deblocking Filter 71 H.263 Options No Filter Deblocking Loop Filter
  • 72. Unrestricted/Extended Motion Vector Mode (UMV Mode) 1. Motion Vectors Over Picture Boundaries − UMV dramatically improves motion estimation when moving objects are entering/exiting the frame or moving around the frame border) − Motion vectors are permitted to point outside the picture boundaries – Non-existent pixels are created by replicating the edge pixels (When a pixel referred to by motion vector points to outside of coded area, last full pixel inside the coded picture area is used). – Motion vector restricted such that no pixel of 16x16 (or 8x8) block shall have horizontal or vertical distance more than 15 pixels outside of picture. – Improves compression when there is movement across the edge of a picture boundary or when there is camera panning. 72 H.263 Options Target Frame NReference Frame N-1 Edge pixels are repeated.
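The "edge pixels are repeated" rule above amounts to clamping the reference coordinates to the picture rectangle before fetching a pixel. A minimal sketch on a toy 2x2 frame:

```python
def ref_pixel(frame, x, y):
    """Fetch a reference pixel; out-of-picture coordinates clamp to the edge,
    which replicates the border pixels (boundary repetition)."""
    h, w = len(frame), len(frame[0])
    xc = min(max(x, 0), w - 1)
    yc = min(max(y, 0), h - 1)
    return frame[yc][xc]

frame = [[10, 20],
         [30, 40]]
print(ref_pixel(frame, -3, 0))   # 10  (clamped to the left edge)
print(ref_pixel(frame, 5, 5))    # 40  (clamped to the bottom-right corner)
```

With this rule a motion vector may legitimately point outside the coded area, which is what makes prediction work for objects entering or leaving the frame.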
  • 73. Unrestricted/Extended Motion Vector Mode 2. Extended MV Range − To extend the range of the motion vectors from [-16,15.5] to [-31.5,31.5] with some restrictions. − This better addresses high motion scenes. 73 H.263 Options 15.5 15.5 -16 -16 -16 -16 15.5 15.5 (31.5,31.5) Base motion vector range. Extended motion vector range, [-16,15.5] around MV predictor.
  • 74. Advanced Prediction Mode − The motion compensation in the core H.263 is based on one motion vector per macroblock of 16×16 pixels, with half-pixel precision. − The macroblock motion vector is then differentially coded with predictions taken from three surrounding macroblocks, as indicated in Figure. 74 H.263 Options MV: Current Motion Vector MV1: Previous Motion Vector MV2: Above Motion Vector MV3: Above Right Motion Vector MV2 MV3 MV1 MV
  • 75. Advanced Prediction Mode − The predictors are calculated separately for the horizontal and vertical components of the motion vectors, MV1, MV2 and MV3. − For each component, the predictor is the median value of the three candidate predictors for this component: Px = median(MV1x, MV2x, MV3x), Py = median(MV1y, MV2y, MV3y) − The difference between the components of the current motion vector and their predictions is variable length coded. The vector differences are defined by MVDx = MVx − Px, MVDy = MVy − Py 75 H.263 Options
  • 76. Advanced Prediction Mode − In the special cases, at the borders of the current group of blocks (GOB) or picture, the following decision rules are applied in order: • The candidate predictor MV1 is set to zero if the corresponding macroblock is outside the picture at the left side . • The candidate predictors MV2 and MV3 are set to MV1 if the corresponding macroblocks are outside the picture at the top, or if the GOB header of the current GOB is nonempty. • The candidate predictor MV3 is set to zero if the corresponding macroblock is outside the picture at the right side. • When the corresponding macroblock is intra coded or was not coded, the candidate predictor is set to zero. − Like unrestricted motion vector mode, motion vectors can refer to the area outside the picture 76 H.263 Options MV: Current Motion Vector MV1: Previous Motion Vector MV2: Above Motion Vector MV3: Above Right Motion Vector MV2 MV3 MV1 MV Picture or GOB border MV2 MV3 (0,0) MV MV1 MV1 MV1 MV MV2 (0,0) MV1 MV
  • 77. Advanced Prediction Mode − Includes motion vectors across picture boundaries from the previous mode. − Option of using four motion vectors for 8x8 blocks instead of one motion vector for 16x16 blocks as in baseline. • In H.263, one motion vector per macroblock is used except in the advanced prediction mode, where either one (four vectors with the same value) or four motion vectors per macroblock are employed. • When there are four motion vectors, the information for the first motion vector is transmitted as the code word motion vector data (MVD), and the information for the three additional vectors in the macroblock is transmitted as the code word MVD2–4. − Overlapped motion compensation to reduce blocking artifacts. 77 H.263 Options
  • 78. Four motion vectors for 8x8 blocks instead of one motion vector for 16x16 blocks. − The vectors are obtained by adding predictors to the vector differences indicated by MVD and MVD2–4, as was the case when only one motion vector per macroblock was present. − The predictors are calculated separately for the horizontal and vertical components. − However, the candidate predictors MV1, MV2 and MV3 are redefined as indicated in Figure. − The neighbouring 8×8 blocks that form the candidates for the prediction of the motion vector MV take different forms depending on the position of the block in the macroblock. 78 H.263 Options • Redefinition of the candidate predictors MV1, MV2 and MV3 for each luminance block in a macroblock. • Motion vector prediction for 8x8 blocks uses the three surrounding block motion vectors MV2 MV1 MV3 MV MV2 MV1 MV3 MV MV2 MV1 MV3 MV MV2 MV1 MV3 MV
  • 79. Overlapped Motion Compensation (OBMC) − In normal motion compensation, the current block is composed of • The predicted block from the previous frame (referenced by the motion vectors) • The residual data transmitted in the bit stream for the current block. − Overlapped motion compensation is only used for the 8×8 luminance blocks. − Each pixel in an 8×8 luminance prediction block is the weighted sum of three prediction values, divided by 8 (with rounding). 79 H.263 Options Reference frame Current MB
  • 80. Overlapped Motion Compensation (OBMC) − To obtain the prediction values, three motion vectors are used. They are the motion vector of the current luminance block and two out of four remote vectors, as follows: • the motion vector of the block at the left or right side of the current luminance block; • the motion vector of the block above or below the current luminance block. 80 H.263 Options
  • 81. Overlapped Motion Compensation (OBMC) − Let (m, n) be the column & row indices of an 8×8 pixel block in a frame. − Let (i, j) be the column & row indices of a pixel within an 8×8 block. − Let (x, y) be the column & row indices of a pixel within the entire frame: (x, y) = (8m + i, 8n + j) 81 H.263 Options B 8×8 pixel block n, block column number m, block row number y, pixel column number x, pixel row number j, pixel column number i, pixel row number
  • 82. Overlapped Motion Compensation (OBMC) • Let (MV0 x,MV0 y) denote the motion vectors for the current block. • Let (MV1 x,MV1 y) denote the motion vectors for the block above (below) if the current pixel is in the top (bottom) half of the current block. • Let (MV2 x,MV2 y) denote the motion vectors for the block to the left (right) if the current pixel is in the left (right) half of the current block. 82 H.263 Options MV0 MV1 MV1 MV2 MV2Current Block Right Block Below Block
  • 83. Overlapped Motion Compensation (OBMC) • The creation of each interpolated (overlapped) pixel, p(i, j), in an 8×8 reference luminance block is governed by P(x, y) = (q(x, y)·H0(i, j) + r(x, y)·H1(i, j) + s(x, y)·H2(i, j) + 4) / 8 − where the three predictions are pixels of the previous decoded picture p displaced by the three motion vectors: q(x, y) = p(x + MV0x, y + MV0y), r(x, y) = p(x + MV1x, y + MV1y), s(x, y) = p(x + MV2x, y + MV2y) 83 H.263 Options The 8×8 weighting matrices are H0 = [4 5 5 5 5 5 5 4 / 5 5 5 5 5 5 5 5 / 5 5 6 6 6 6 5 5 / 5 5 6 6 6 6 5 5 / 5 5 6 6 6 6 5 5 / 5 5 6 6 6 6 5 5 / 5 5 5 5 5 5 5 5 / 4 5 5 5 5 5 5 4], H1 = [1 2 2 2 2 2 2 1 / 1 1 2 2 2 2 1 1 / 1 1 1 1 1 1 1 1 / 1 1 1 1 1 1 1 1 / 1 1 1 1 1 1 1 1 / 1 1 1 1 1 1 1 1 / 1 1 2 2 2 2 1 1 / 1 2 2 2 2 2 2 1] and H2(i, j) = (H1(i, j))ᵀ
  • 84. 84 H0(i, j) Weighting values for prediction with motion vector of current block H2(i, j) Weighting values for prediction with motion vectors of luminance blocks to the left or right of current luminance block Right of the current block Left of the current block H1(i, j) Weighting values for prediction with motion vectors of the luminance blocks on top or bottom of the current luminance block Bottom of the current block Top of the current block Overlapped Motion Compensation (OBMC) The neighbouring pixels closer to the pixels in the current block take greater weights. H.263 Options
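The per-pixel OBMC blend can be written directly from the formula: three motion-compensated predictions are combined with the weights for that pixel position and divided by 8 with rounding. The prediction and weight values below are illustrative, not taken from the standard's tables:

```python
def obmc_pixel(q, r, s, h0, h1, h2):
    """Blend three predictions q, r, s with weights h0, h1, h2;
    (+4) // 8 implements division by 8 with rounding."""
    return (q * h0 + r * h1 + s * h2 + 4) // 8

# Near the block centre the current vector dominates (e.g. weight 6),
# while the two remote vectors contribute lightly (e.g. weights 1 and 1):
print(obmc_pixel(100, 120, 80, 6, 1, 1))   # 100
```

Because pixels near a block edge give more weight to the neighbouring block's vector, the prediction varies smoothly across block boundaries, which is exactly how OBMC reduces blocking artifacts.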
  • 85. PB Frames Mode − A PB frame consists of two P- and B-pictures coded as one unit (coded together) (a P frame as in baseline, and a B frame) − The P-picture is predicted from the last decoded P-picture, and the B-picture is predicted from both the last decoded P-picture and the P-picture currently being decoded (The prediction process is illustrated in Figure). − Can increase frame rate 2X with only about 30% increase in bit rate (because of B-frame). − Since in the PB frames mode a unit of coding is a combined macroblock from P- and B-pictures, the composite macroblock comprises 12 blocks. − First the data for the six P-blocks are transmitted as the default H.263 mode, and then the data for the six B-blocks. − The composite macroblock may have various combinations of coding status for the P- and B-blocks, which are dictated by the MCBPC. 85 H.263 Options
  • 86. Best match Forward Motion Vector Macroblock to be coded Previous reference picture Current B-picture Future reference picture Best match Backward Motion Vector 86 Forward Motion Vector and Backward Motion Vector, Recall Forward Prediction Backward Prediction
  • 87. PB Frames Mode 87 H.263 Options Restriction: the backward predictor cannot extend outside the current MB position of the future frame. Picture 1 P Frame (decoded P-picture) Picture 2 B Frame Picture 3 P Frame (current P-picture) V 1/2 -V 1/2 PB Forward Motion Vector Backward Motion Vector Forward Prediction Backward Prediction Forward Prediction
  • 88. P B P PB frame TRB TRD MV_F = (TRB × MV) / TRD + MVD MV_B = ((TRB − TRD) × MV) / TRD if MVD is equal to 0 MV_B = MV_F − MV if MVD is not equal to 0 H.263 Options PB Frames Mode P-picture is predicted from the previous decoded P-picture B-picture is predicted both from the previous decoded P-picture and the P-picture currently being decoded. MV_F MV_B MV Assume: MVD is the delta vector component given by the motion vector data of a B-picture (MVDB) and corresponds to the vector component MV. Forward and Bi-directional Prediction in a B-block Part of the block that is predicted bidirectionally part that uses only forward prediction FWD: Forward prediction BID: Bidirectional prediction P-Macroblock BID FWD B-block 88
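The derivation of the B-part vectors can be followed numerically. This sketch uses floating point and ignores the integer rounding rules of the real codec; the temporal references and vector values are made up for illustration:

```python
def pb_vectors(mv, trb, trd, mvd):
    """Derive forward/backward B vectors from the P vector MV,
    temporal references TRB/TRD and the delta MVD (per the PB-frame formulas)."""
    mv_f = trb * mv / trd + mvd
    if mvd == 0:
        mv_b = (trb - trd) * mv / trd
    else:
        mv_b = mv_f - mv
    return mv_f, mv_b

# B-picture halfway between two P-pictures (TRB=1, TRD=2), no delta:
print(pb_vectors(4.0, 1, 2, 0))   # (2.0, -2.0)
```

With no delta, the vectors are simply the P vector scaled by the B-picture's temporal position, forward and backward, which is why PB frames add so little bit-rate overhead.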
  • 89. Improved PB frames (BPB) − This mode is an improved version of the optional PB frames mode of H.263 [22-M]. − Most parts of this mode are similar to the PB frames mode, the main difference being that in the improved PB frames mode, the B part of the composite PB-macroblock, known as BPB-macroblock, may have a separate motion vector for forward and backward prediction. − This is in addition to the bidirectional prediction mode that is also used in the normal PB frames mode. − Hence, there are three different ways of coding a BPB-macroblock, and the coding type is signalled by the MVDB parameter. • Bidirectional prediction • Forward prediction • Backward prediction 89 H.263 Options Picture 1 P Frame (decoded P-picture) Picture 2 B Frame Picture 3 P Frame (current P-picture) V 1/2 -V 1/2 PB Forward Motion Vector Backward Motion Vector Forward Prediction Backward Prediction Forward Prediction
  • 90. Syntax-based Arithmetic Coding Mode − In encoding, a symbol is encoded by a specific array of integers (model), selected based on syntax, by calling encode_a_symbol (index, cumul_freq). − A FIFO buffers the bits from the arithmetic encoder. − In decoding, a symbol is decoded by a specific model based on syntax by calling decode_a_symbol (cumul_freq). − The syntax of the top three layers (Picture, Group-of-Blocks and Macroblock) remains the same, but that of the block layer is modified. 90 H.263 Options
  • 91. Syntax-based Arithmetic Coding Mode − In this mode, all the variable length coding and decoding of baseline H.263 is replaced with arithmetic coding/decoding. − This removes the restriction that each symbol must be represented by an integer number of bits, thus improving compression efficiency. − Experiments indicate that compression can be improved by up to 10% over variable length coding/decoding. − However, the complexity of arithmetic coding is higher than that of variable length coding. 91 H.263 Options
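A small, generic illustration (not H.263 syntax) of why removing the integer-bit restriction helps: for a skewed symbol distribution, the entropy an arithmetic coder can approach is well below the rate of the best integer-length (Huffman-style) code:

```python
from math import log2

# Hypothetical distribution for one syntax element: one dominant symbol.
probs = [0.9, 0.05, 0.05]

# Entropy: the rate (bits/symbol) an ideal arithmetic coder can approach.
entropy = sum(p * log2(1 / p) for p in probs)

# Best integer-length code for this alphabet has lengths 1, 2, 2 bits.
vlc_rate = 0.9 * 1 + 0.05 * 2 + 0.05 * 2

print(f"entropy ~ {entropy:.3f} bits/symbol, integer-length VLC ~ {vlc_rate:.3f}")
```

Here the VLC cannot spend less than one bit on the 90%-probable symbol, while arithmetic coding effectively spends a fraction of a bit on it, which is where the reported ~10% gains come from in practice.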
  • 93. Enhance H.263 with additional options (Draft 20, Sept. ’97) Coding efficiency • Advanced intra coding mode • Deblocking filter mode • Improved PB-frames mode • Reference picture resampling mode • Alternative inter VLC mode • Modified quantization mode Error robustness • Slice structured mode • Reference picture selection mode • Independently segmented decoding mode Enhanced Communication • Temporal, SNR, and spatial scalability mode • Reduced-resolution update mode 93 H.263 Ver. 2 (H.263+)
  • 94. − H.263+ was standardized in January, 1998. − The expected enhancements of H.263+ over H.263 fall into two basic categories: • enhancing quality within existing applications; • broadening the current range of applications. − Adds negotiable options and features while still retaining a backwards compatibility mode. − A few examples of the enhancements are as follows: • improving perceptual compression efficiency; • reducing video coding delay; • providing greater resilience to bit errors and data losses. 94 H.263 Ver. 2 (H.263+)
  • 95. 95 H.263 Ver. 2 (H.263+)
  • 96. Annex I: Advanced Intra Coding mode Annex J: Deblocking Filter mode Annex K: Slice Structured mode Annex L: Supplemental Enhancement Information Specification Annex M: Improved PB Frames mode Annex N: Reference Picture Selection mode Annex O: Temporal, SNR, and Spatial Scalability mode Annex P: Reference Picture Resampling Annex Q: Reduced-Resolution Update mode Annex R: Independent Segment Decoding mode Annex S: Alternative Inter VLC mode Annex T: Modified Quantization mode 96 H.263+ (v2) Optional Tools
  • 97. − In addition to the multiples of CIF, H.263+ permits • any frame size from 4x4 to 2048x1152 pixels in increments of 4. − Besides the 12:11 pixel aspect ratio (PAR), H.263+ supports • Square (1:1) • 525-line 4:3 picture (10:11) • CIF for 16:9 picture (16:11) • 525-line for 16:9 picture (40:33) • and other arbitrary ratios − In addition to picture clock frequencies of 29.97 Hz (NTSC), H.263+ supports • 25 Hz (PAL) • 30 Hz • and other arbitrary frequencies 97 Arbitrary Frame Size, Pixel Aspect Ratio, Clock Frequency
  • 98. 98 Level 1 Level 2 Level 3 Advanced INTRA Coding Yes Yes Yes Deblocking Filter Yes Yes Yes Supplemental Enhancement Information (Full-Frame Freeze Only) Yes Yes Yes Modified Quantization Yes Yes Yes Unrestricted Motion Vectors No Yes Yes Slice Structured Mode No Yes Yes Reference Picture Resampling (Implicit Factor-of-4 Mode Only) No Yes Yes Advanced Prediction No No Yes Improved PB-frames No No Yes Independent Segment Decoding No No Yes Alternate INTER VLC No No Yes H.263v2 specified a set of recommended modes in an informative appendix (Appendix II, since deprecated) The prior informative Appendix II (recommended optional enhancement) was obsoleted by the creation of the normative Annex X. H.263 Ver. 2 (H.263+)
  • 99. − In this mode, either the DC coefficient, the 1st column, or the 1st row of coefficients is predicted from neighboring blocks (DC only, Vertical DC & AC, Horizontal DC & AC). − Prediction is determined on a MB-by-MB basis. − Essentially DPCM of Intra DCT coefficients. − Can save up to 40% of the bits on Intra frames. − A separate VLC table for intra DCT − Modified quantization for intra coefficients − Spatial prediction of DCT coefficients 99 Advanced Intra Coding Mode Three neighboring blocks in the DCT domain u 0 1 2 3 4 5 6 7 Block A Rec A(u, v) Block C Rec C(u, v) v 0 1 2 3 4 5 6 7 Block B Rec B(u, v) Index Prediction mode Code 0 0 (DC Only) 0 1 1 (Vertical DC & AC) 10 2 2 (Horizontal DC & AC) 11
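The three prediction modes can be sketched roughly as follows. This is a hypothetical simplification (the function and argument names are illustrative; Annex I additionally rescales predictors by the quantizer ratio and defines how the mode is chosen):

```python
def predict_intra_coeffs(block, left, above, mode):
    """Form the coefficient residual for advanced intra coding.

    block/left/above: 8x8 lists of DCT coefficients (left/above may be None).
    mode 0: DC only; mode 1: vertical DC & AC (first row predicted from the
    block above); mode 2: horizontal DC & AC (first column from the left).
    """
    residual = [row[:] for row in block]      # copy; only row 0 / col 0 change
    if mode == 0 and above is not None:
        residual[0][0] -= above[0][0]         # predict DC term
    elif mode == 1 and above is not None:
        for u in range(8):
            residual[0][u] -= above[0][u]     # predict entire first row
    elif mode == 2 and left is not None:
        for v in range(8):
            residual[v][0] -= left[v][0]      # predict entire first column
    return residual
```

When neighboring blocks share the same DC/low-frequency content, the predicted row or column residual collapses to zeros, which is the source of the bit savings on intra frames.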
  • 100. − At very low bit rates, the block of pixels is mainly made of low-frequency DCT coefficients. − In these areas, when there is a significant difference between the DC levels of the adjacent blocks, they appear as block borders. − The overlapped block matching motion compensation to some extent reduces these blocking artefacts. − For further reduction in the blockiness, the H.263 specification recommends deblocking of the picture through the block edge filter. − The Deblocking Filter mode improves subjective quality by removing blocking and mosquito artifacts common to block-based video coding at low bit rates. 100 Deblocking Filter Mode
  • 101. − Deblocking Filter Mode introduces a deblocking filter inside the coding loop. − Unlike in post-filtering, predicted pictures are computed based on filtered versions of the previous ones. − Like the Advanced Prediction mode of H.263, the Deblocking Filter mode involves using four motion vectors per macroblock. − The filtering is performed on 8×8 block edges and assumes that 8×8 DCT is used and the motion vectors may have either 8×8 or 16×16 resolution. − Filtering is equally applied to both luminance and chrominance data. − No filtering is permitted on the frame and slice edges. 101 Deblocking Filter Mode
  • 102. − Consider four pixels A, B, C and D on a line (horizontal or vertical) of the reconstructed picture, where A and B belong to block 1 and C and D belong to a neighbouring block 2, which is either to the right of or below block 1. − It filters pixels along block boundaries while preserving edges in the image content. − The filter is in the coding loop, which means it filters the decoded reference frame used for motion compensation. − It can be used in conjunction with a post-filter to further reduce coding artifacts. 102 Deblocking Filter Mode A B C D (block 1 | block boundary | block 2) Example of filtered pixels on a horizontal block edge (vertical filtering) and on a vertical block edge (horizontal filtering)
  • 103. Deblocking Filter − To turn the filter on for a particular edge, either block 1 or block 2 should be an intra or a coded macroblock with the code COD =0. − A, B, C and D are replaced by new values, A1, B1, C1, and D1 based on a set of non-linear equations. − The strength of the filter is proportional to the quantization strength. − The sign of d1 is the same as the sign of d. 103 H.263 Options A B C D A B C D block2block1 block1
  • 104. Deblocking Filter − Figure shows how the value of d1 changes with d and the quantiser parameter QP, to make sure that only block edges which may suffer from blocking artefacts are filtered and not the natural edges. − As a result of this modification, only the pixels on the edge are filtered so that their luminance changes are less than the quantisation parameter, QP. 104 H.263 Options d1 as a function of d
  • 105. − To turn the filter on for a particular edge, either block 1 or block 2 should be an intra or a coded macroblock with the code COD = 0. − A, B, C and D are replaced by new values A1, B1, C1 and D1 based on a set of non-linear equations. − The strength of the filter is proportional to the quantization strength. B1 = clip(B + d1); C1 = clip(C − d1); A1 = A − d2; D1 = D + d2; d1 = Filter((A − 4B + 4C − D) / 8, Strength(QUANT)); d2 = clipd1((A − D) / 4, d1 / 2); Filter(x, Strength) = SIGN(x) × (MAX(0, abs(x) − MAX(0, 2 × (abs(x) − Strength)))), where clip() limits the result to the pixel range and clipd1(x, lim) limits x to [−abs(lim), +abs(lim)]. 105 Deblocking Filter Mode
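A minimal sketch of this edge filter in Python, assuming 8-bit pixels and taking Strength directly as a parameter (an assumption for illustration: the standard derives Strength from QUANT via a table and defines its own exact integer rounding, whereas this sketch uses Python floor division):

```python
def sign(x):
    return (x > 0) - (x < 0)

def up_down_ramp(x, strength):
    """Filter(x, Strength): passes small x through, tapers to 0 at 2*Strength."""
    return sign(x) * max(0, abs(x) - max(0, 2 * (abs(x) - strength)))

def clip255(x):
    return min(255, max(0, x))           # clip to the 8-bit pixel range

def clipd1(x, lim):
    return min(abs(lim), max(-abs(lim), x))

def deblock_edge(a, b, c, d, strength):
    """Filter one 4-pixel line across a block edge (A, B | C, D)."""
    d1 = up_down_ramp((a - 4 * b + 4 * c - d) // 8, strength)
    d2 = clipd1((a - d) // 4, d1 // 2)
    return a - d2, clip255(b + d1), clip255(c - d1), d + d2
```

The ramp shape is what keeps natural edges intact: a small step across the boundary is smoothed fully, but once the step exceeds twice the strength the correction falls back to zero and the edge is left untouched.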
  • 106. − The Deblocking Filter mode improves subjective quality by removing blocking and mosquito artifacts common to block-based video coding at low bit rates. − Many applications make use of a post filter to reduce these artifacts. − The post-filtering is useful in error-free and error-prone environments. − This post filter is usually present at the decoder and is outside the coding loop. Therefore, prediction is not based on the post filtered version of the picture. − The one-dimensional version of the filter will be described. − To obtain a two-dimensional effect, the filter is first used in the horizontal direction and then in the vertical direction. − The post filter is applied to all pixels within the picture. − Edge pixels should be repeated when the filter is applied at picture boundaries. 106 Post-Filter
  • 107. − The pixels A, B, C, D, E, F, G, (H) are aligned horizontally or vertically. − The post-filter strength is proportional to the quantization: Strength(QUANT) − The Strength1 and Strength2 may be different to better adapt the total filter strength to QUANT. − The Strength1, 2 may be related to QUANT for the macroblock where D belongs or to some average value of QUANT over parts of the picture or over the whole picture. 107 Post-Filter 𝑫𝟏 = 𝑫 + 𝑭𝒊𝒍𝒕𝒆𝒓 𝑨 + 𝑩 + 𝑪 + 𝑬 + 𝑭 + 𝑮 − 𝟔𝑫 𝟖 , 𝑺𝒕𝒓𝒆𝒏𝒈𝒕𝒉𝟏 when filtering in the first direction 𝑫𝟏 = 𝑫 + 𝑭𝒊𝒍𝒕𝒆𝒓 𝑨 + 𝑩 + 𝑪 + 𝑬 + 𝑭 + 𝑮 − 𝟔𝑫 𝟖 , 𝑺𝒕𝒓𝒆𝒏𝒈𝒕𝒉𝟐 when filtering in the second direction The relation between Strength1, 2 and QUANT
  • 108. 108 Deblocking Loop Filter Demo No Filter Deblocking Loop Filter
  • 109. 109 Deblocking Loop Filter and Post Filter Demo Deblocking Loop Filter and Post FilterNo Filter
  • 110. 110 Deblocking Loop Filter and Post Filter Demo No Filter Loop Filter Only Deblocking Loop Filter and Post Filter
  • 111. 111 No Filter Deblocking Loop Filter and TMN-8 Post Filter Deblocking Loop Filter Only TMN-8 Post Filter Only Deblocking Loop Filter and Post Filter Demo (sequence Foreman, 24 kbps, 10 fps; TMN-8: Video Codec Test Model, Near-Term, Version 8) − The deblocking filter alone reduces blocking artifacts significantly, mainly due to the use of four motion vectors per macroblock. − The filtering process provides smoothing, further improving subjective quality. − The effects of the post filter are less noticeable, and adding the post filter may actually result in blurriness. − Therefore, the use of the deblocking filter alone is usually sufficient.
  • 112. − Allows insertion of resynchronization markers at macroblock boundaries to improve network packetization and reduce overhead. More on this later • Allows more flexible tiling of video frames into independently decodable areas to support “view ports”, a.k.a. “local decode.” • Improves error resiliency by reducing intra-frame dependence. • Permits out-of-order transmission to reduce latency. 112 Slice Structured Mode
  • 113. 113 Slice Structured Mode Slice Boundaries No INTRA or MV Prediction Across Slice Boundaries. Slices Start And End on Macroblock Boundaries. Slice Boundaries No INTRA or MV Prediction Across Slice Boundaries. Slice Sizes Remain Fixed Between INTRA Frames.
  • 114. Backwards compatible with H.263 but permits indication of supplemental information for features such as: • Partial and full picture freeze requests • Partial and full picture snapshot tags • Video segment start and end tags for off-line storage • Progressive refinement segment start and end tags • Chroma keying info for transparency • The Chroma Keying Information Function (CKIF) indicates that the "chroma keying" technique is used to represent "transparent" and "semi-transparent" pixels in the decoded video pictures. • When being presented on the display, "transparent" pixels are not displayed. • Instead, a background picture which is either a prior reference picture or is an externally controlled picture is revealed. • Semitransparent pixels are displayed by blending the pixel value in the current picture with the corresponding value in the background picture. 114 Supplemental Enhancement Information
  • 115. − Resampling of a temporally previous reference picture prior to its use as a reference for encoding, enabling global motion compensation, predictive dynamic resolution conversion, predictive picture area alteration and registration, and special-effect warping; − Allows frame size changes of a compressed video sequence without inserting an Intra frame (No Intra frame required when changing video frame sizes). − Permits the warping of the reference frame via affine transformations to address special effects such as zoom, rotation, translation. − Can be used for emergency rate control by dropping frame sizes adaptively when the bit rate gets too high. 115 Reference Picture Resampling
  • 116. − Specifies generalized method applied to previous reference picture to generate warped picture for use in predicting current picture − Special case of factor of 4 resampling, which converts horizontal and vertical size by factor of 2 (upsampling) or ½(downsampling) in each direction. 116 Reference Picture Resampling Pixel positions of the reference picture Pixel positions of the downsamped predicted picture a=(A+B+C+D+1+RCRPR)/4 . . Downsampling a A B C D Pixel positions of the reference picture Pixel positions of the upsamped predicted picture a=(9A+3B+3C+D+7+RCRPR)/16 b=(3A+9B+C+3D+7+RCRPR)/16 c=(3A+B+9C+3D+7+RCRPR)/16 d=(A+3B+3C+9D+7+RCRPR)/16 a c b d A B C D Upsampling
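The factor-of-4 resampling formulas above can be sketched per 2×2 neighbourhood (the function names are illustrative; RCRPR is the rounding-control bit from the slide's formulas):

```python
def upsample_quad(A, B, C, D, rcrpr=0):
    """One 2x2 quad of upsampled pixels from four neighbouring reference
    pixels A, B, C, D (factor-of-4 upsampling: 2x in each direction)."""
    a = (9 * A + 3 * B + 3 * C +     D + 7 + rcrpr) // 16
    b = (3 * A + 9 * B +     C + 3 * D + 7 + rcrpr) // 16
    c = (3 * A +     B + 9 * C + 3 * D + 7 + rcrpr) // 16
    d = (    A + 3 * B + 3 * C + 9 * D + 7 + rcrpr) // 16
    return a, b, c, d

def downsample_quad(A, B, C, D, rcrpr=0):
    """One output pixel from a 2x2 block of reference pixels (downsampling)."""
    return (A + B + C + D + 1 + rcrpr) // 4
```

Each upsampled pixel is a 9:3:3:1 weighted average biased toward its nearest reference pixel, so a flat area stays flat and the mapping is exactly invertible in the constant case.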
  • 117. − Specify arbitrary warping parameters via displacement vectors from corners. − For source format changes − Global motion compensation − Special-effect warping 117 Reference Picture Resampling with Warping MV00 MV10 MV11 MV01
  • 118. No Intra frame required when changing video frame sizes 118 Reference Picture Resampling Factor of 4 Size Change P P P P P
  • 119. − Allows more flexibility in adapting quantizers on a macroblock by macroblock basis, by enabling large quantizer changes through the use of escape codes. − A mode which improves the control of the bit rate by changing the method for controlling the quantizer step size on a macroblock basis. − Reduces the quantizer step size for chrominance blocks, compared to luminance blocks, to reduce the prevalence of chrominance artifacts. − Modifies the allowable DCT coefficient range to avoid clipping, yet disallows illegal coefficient/quantizer combinations. − Increases the range of representable DCT coefficient values for use with small quantizer step sizes, and increases error detection performance and reduces decoding complexity by prohibiting certain unreasonable coefficient representations. 119 Modified Quantization (MQ)
  • 120. − Allows modification of the quantizer at the macroblock layer to any value, not limited to +1, −1, +2 and −2. • DQUANT uses 2 bits (starting with “1”) to specify small changes. − It uses 6 bits (starting with “0”) to specify other changes. − Codeword: 0xxxxx, where the last 5 bits specify the new QUANT value. Change of QUANT as a function of the prior QUANT: prior QUANT 1 → +2 (DQUANT = 10), +1 (DQUANT = 11); 2–10 → −1, +1; 11–20 → −2, +2; 21–28 → −3, +3; 29 → −3, +2; 30 → −3, +1; 31 → −3, −5. 120 Modified Quantization (MQ)
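The DQUANT decoding rule and table above can be sketched as follows (illustrative Python; reading the bits from the actual stream is omitted, and the codewords are passed as strings for clarity):

```python
# Prior-QUANT range -> (change for DQUANT='10', change for DQUANT='11')
DQUANT_TABLE = {
    range(1, 2):   (+2, +1),
    range(2, 11):  (-1, +1),
    range(11, 21): (-2, +2),
    range(21, 29): (-3, +3),
    range(29, 30): (-3, +2),
    range(30, 31): (-3, +1),
    range(31, 32): (-3, -5),
}

def next_quant(prior, codeword):
    """Decode a 2-bit relative change ('10'/'11') or a 6-bit escape
    '0xxxxx' whose last 5 bits carry the new QUANT value directly."""
    if codeword[0] == '0':                     # escape codeword
        return int(codeword[1:], 2)
    for rng, (d10, d11) in DQUANT_TABLE.items():
        if prior in rng:
            return prior + (d10 if codeword == '10' else d11)
    raise ValueError("prior QUANT out of range")
```

Note how the table saturates near the ends of the 1–31 range (e.g. prior QUANT 31 can only decrease), keeping every 2-bit result a legal quantizer value.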
  • 121. − Enhance chrominance quality by a finer quantizer. − Improve picture quality by extending the range of representable quantized DCT coefficients, not limited to [−127, +127]. Value of QUANT_C as a function of QUANT: 1–6 → QUANT_C = QUANT; 7–9 → QUANT_C = QUANT − 1; 10–11 → 9; 12–13 → 10; 14–15 → 11; 16–18 → 12; 19–21 → 13; 22–26 → 14; 27–31 → 15. 121 Modified Quantization (MQ)
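The QUANT-to-QUANT_C mapping above as a small lookup function (an illustrative sketch of the table, not normative code):

```python
def quant_c(quant):
    """Chrominance quantizer derived from the luminance QUANT, per the
    Modified Quantization table: identity up to 6, then progressively
    coarser luminance steps map to a capped chrominance step."""
    if 1 <= quant <= 6:
        return quant
    if 7 <= quant <= 9:
        return quant - 1
    # (upper bound of QUANT range, resulting QUANT_C)
    for upper, qc in ((11, 9), (13, 10), (15, 11), (18, 12),
                      (21, 13), (26, 14), (31, 15)):
        if quant <= upper:
            return qc
    raise ValueError("QUANT out of range")
```

The cap at QUANT_C = 15 is the point of the mode: even at the coarsest luminance quantizer, chrominance is quantized no coarser than step 15, suppressing color artifacts.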
  • 122. − Used for bit rate control by reducing the size of the residual frame adaptively when bit rate gets too high. − A mode which allows an encoder to maintain a high frame rate during heavy motion by encoding a low- resolution update to a higher-resolution picture while maintaining high resolution in stationary areas 122 Reduced-Resolution Update (RRU) Up sampling 16*16 reconstructed block 8*8 Coefficients block Result of inverse transform Coefficients decoding Block layer decoding Bitstream Scaling-up Macroblock Layer Decoding Pseudo- Vector 16*16 Reconstructed prediction error block 16*16 prediction blockMotion Compensation Reconstructed Vector
  • 123. − A scalable bit stream consists of layers representing different levels of video quality. − Everything can be discarded except for the base layer and still have reasonable video. − If bandwidth permits, one or more enhancement layers can also be decoded which refines the base layer in one of three ways: temporal, SNR, or spatial 123 Scalability Mode Enh. Layer 1 Enhancement Layer 3 Enhancement Layer 4 Base Layer Enhancement Layer 2 H.263+Encoder 40kb/s 20kb/s 90kb/s 200kb/s 320kb/s Layered Video Bitstreams
  • 124. − Scalability is typically used when one bit stream must support several different transmission bandwidths simultaneously, or some process downstream needs to change the data rate unbeknownst to the encoder. 124 Scalability Mode Example: Conferencing Multipoint Control Unit
  • 125. 125 384 kb/s 384 kb/s 128 kb/s 28.8 kb/s Scalability Mode Layered Video Bit Streams in Multipoint Conferencing
  • 126. 126 Scalability Mode Higher Frame Rate! Base Layer + B Frames Better Spatial Quality! Base Layer + SNR Layer SNR Enhancement More Spatial Resolution!! Base Layer + Spatial Layer Spatial Enhancement Temporal Enhancement
  • 127. SNR Scalability EI EP EP Enhancement Layer I P P Base Layer Spatial Scalability Base Layer I P P EI EP EPEnhancement Layer Temporal Scalability B2 B4I1 P3 P5 Scalability Mode Low Temporal Resolution High Temporal Resolution 127
  • 128. Two or more frame rates can be supported by the same bit stream. − It is achieved using bidirectionally predicted pictures or B-pictures. − The B-frames can be discarded (to lower the frame rate) and the bit stream remains usable. − These B-pictures differ from the B-picture part of PB-frames in that they are separate entities in the bitstream. − These B-pictures are not syntactically intermixed with a subsequent P or its enhancement part EP. − B-pictures and the B part of PB-frames are not used as reference pictures for the prediction of any other pictures. This property allows B-pictures to be discarded if necessary without adversely affecting any subsequent pictures, thus providing temporal scalability. − Since H.263 is normally used for low bit rate, low frame rate applications (e.g. mobile), the separation between the base layer I- and P-pictures is large, so there is normally only one B-picture between them. 128 Temporal Scalability I or P B B P ...... • I and P frames form the base layer • B-frames form the temporal enhancement layer • B-frames can be discarded Temporal Scalability Demonstration • layer 0, 3.25 fps, P-frames • layer 1, 15 fps, B-frames
  • 129. The difference between the input picture and the lower quality base layer picture is coded. − The picture in the base layer which is used for the prediction of the enhancement layer pictures may be an I-picture, a P-picture, or the P part of PB frames, but should not be a B-picture or the B part of a PB frame. − In the enhancement layer two types of picture are identified, EI (enhancement I-picture) and EP (enhancement P-picture). − If prediction is only formed from the base layer, then the enhancement layer picture is referred to as an EI-picture. − In this case, the base layer picture can be an I- or a P-picture (or the P part of PB frames). − For both EI- and EP-pictures, prediction from the reference layer uses no motion vectors (no motion-compensated prediction from the base layer). − However, an EP-picture may be predictively coded with respect to its previous reconstructed picture at the same layer, called forward prediction. 129 SNR Scalability Base Layer (15 kbit/s) Enhancement Layer (40 kbit/s) EI EP EP PPI EI EP P P I - Intracoded or Key Frame P - Predicted Frame EI - Enhancement layer key frame (enhancement I-picture) EP - Enhancement layer predicted frame (enhancement P-picture) SNR Scalability Demonstration • layer 0, 10 fps, 40 kbps • layer 1, 10 fps, 400 kbps
  • 130. − The arrangement of the enhancement layer pictures in spatial scalability is similar to that of SNR scalability. − The only difference is that before the picture in the reference layer is used to predict the picture in the spatial enhancement layer, it is interpolated (upsampled) by a factor of 2 either horizontally or vertically (one-dimensional spatial scalability), or both horizontally and vertically (two-dimensional spatial scalability). − If the enhancement layer is 2X the size of the base layer in each dimension, the base layer is interpolated (by 2X) before predicting the spatial enhancement layer. 130 Spatial Scalability Base Layer Enhancement Layer EI EP EP PPI EI EP P P I - Intracoded or Key Frame P - Predicted Frame EI - Enhancement layer key frame (enhancement I-picture) EP - Enhancement layer predicted frame (enhancement P-picture) Spatial Scalability Demonstration • layer 0, QCIF, 10 fps, 60 kbps • layer 1, CIF, 10 fps, 300 kbps
  • 131. − It will increase the robustness of H.263 against channel errors. − It is possible for B-pictures to be temporally inserted not only between the base layer pictures of type I, P and PB, but also between the enhancement picture types of EI and EP, whether these consist of SNR or spatial enhancement pictures. − It is also possible to have more than one SNR or spatial enhancement layer in conjunction with the base layer. Thus, a multilayer scalable bitstream can be a combination of SNR layers, spatial layers and B-pictures. − As with the two-layer case, B-pictures may occur in any layer. − However, any picture in an enhancement layer which is temporally simultaneous with a B-picture in its reference layer must be a B-picture or the B-picture part of PB frames. This is to preserve the disposable nature of B-pictures. − Note, however, that B-pictures may occur in any layers that have no corresponding picture in the lower layers. This allows an encoder to send enhancement video with a higher picture rate than the lower layers. 131 Hybrid or Multilayer Scalability EP E I P EI E P P B E P P EI E I I Base Layer Enhancement Layer1 Enhancement Layer2 E P P B I - Intracoded or Key Frame P - Predicted Frame EI - Enhancement layer key frame (enhancement I-picture) EP - Enhancement layer predicted frame (enhancement P-picture) Scalability Demonstration • SNR/Spatial Scalability, 10 fps • layer 0, 88x72, ~5 kbit/s, layer 1, 176x144, ~15 • layer 2, 176x144, ~40, layer 3, 352x288, ~80 • layer 4, 352x288, ~200
  • 132. Pictures, which are dependent on other pictures, are located in the bitstream after the pictures on which they depend. − The bitstream syntax order is specified such that for reference pictures (i.e. pictures having types I, P, EI, EP or the P part of PB) the following two rules shall be obeyed: 1. All reference pictures with the same temporal reference appear in the bitstream in increasing enhancement layer order. This is because each lower layer reference picture is needed to decode the next higher layer reference picture. 2. All temporally simultaneous reference pictures as discussed in item 1 appear in the bitstream prior to any B-pictures for which any of these reference pictures is the first temporally subsequent reference picture in the reference layer of the B-picture. This is done to reduce the delay of decoding all reference pictures, which may be needed as references for B-pictures. 132 Transmission Order of Pictures Enhancement Layer 2 Base Layer Enhancement Layer 1 4 3 2 1 8 7 6 5 EI EP P I B B B B Enhancement Layer 2 Base Layer Enhancement Layer 1 4 3 2 1 5 8 7 6 EI EP P I B B B B Two Allowable Picture Transmission Orders
  • 133. − Then, the B-pictures with earlier temporal references follow (temporally ordered within each enhancement layer). − The bitstream location of each B-picture complies with the following rules: • Be after that of its first temporally subsequent reference pictures in the reference layer. This is because the decoding of the B-pictures generally depends on the prior decoding of that reference picture. • Be after that of all reference pictures that are temporally simultaneous with the first temporally subsequent reference picture in the reference layer. This is to reduce the delay of decoding all reference pictures, which may be needed as references for B- pictures. • Precede the location of any additional temporally subsequent pictures other than B- pictures in its reference layer. Otherwise, it would increase picture storage memory requirement for the reference layer pictures. • Be after that of all EI- and EP-pictures that are temporally simultaneous with the first temporally subsequent reference picture. • Precede the location of all temporally subsequent pictures within the same enhancement layer. Otherwise, it would introduce needless delay and increase picture storage memory requirements for the enhancement layer. 133 Transmission Order of Pictures Enhancement Layer 2 Base Layer Enhancement Layer 1 4 3 2 1 8 7 6 5 EI EP P I B B B B Enhancement Layer 2 Base Layer Enhancement Layer 1 4 3 2 1 5 8 7 6 EI EP P I B B B B Two Allowable Picture Transmission Orders
  • 134. I B Base Layer P EI EP Enhancement Layer 1 EP SNR Scalability Spatial Scalability Enhancement Layer 2 EI EP EI Temporal Scalability B B Temporal Scalability Enhancement Layer 3 Hybrid or Multilayer Scalability Example 134
  • 135. I PBBase Layer EI EPEnhancement Layer 1 SNR Scalability Enhancement Layer 3 B B Temporal Scalability Enhancement Layer 2 EI EI EPSpatial Scalability Multilayer Transmission Order Example I B P EI EP EP EI EP EI Temporal Scalability B B Temporal Scalability EP 135
  • 136. Method for interpolating pixels for 2-D scalability 136 Interpolation for Spatial Scalability a b c d A B C D Original pixel positions Interpolated pixel positions a =(9A+3B+3C+D+8)/16 b =(3A+9B+C+3D+8)/16 c =(3A+B+9C+3D+8)/16 d =(A+3B+3C+9D+8)/16 Interpolation Formulation (Filtering)
  • 137. Method for 2-D interpolation at boundaries 137 Interpolation for Spatial Scalability Original Pixel Positions Interpolated Pixel Positions a = A b = (3*A +B + 2) / 4 c = (A + 3*B + 2) / 4 d = (3*A + C + 2) / 4 e = (A +3*C + 2) / 4 Picture Boundary a b c d e A B C D Picture Boundary
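The interior and boundary interpolation formulas above can be sketched as follows (illustrative helper names; the interior rule differs from the reference-picture-resampling kernel only in its fixed +8 rounding term):

```python
def interp_quad(A, B, C, D):
    """One 2x2 quad of interpolated pixels from four neighbouring
    base-layer pixels (interior of the picture)."""
    a = (9 * A + 3 * B + 3 * C +     D + 8) // 16
    b = (3 * A + 9 * B +     C + 3 * D + 8) // 16
    c = (3 * A +     B + 9 * C + 3 * D + 8) // 16
    d = (    A + 3 * B + 3 * C + 9 * D + 8) // 16
    return a, b, c, d

def interp_boundary(A, B):
    """Interpolation along a picture boundary: the edge sample is copied
    unchanged, interior samples use a 3:1 blend of the two pixels."""
    a = A                        # on the boundary itself
    b = (3 * A + B + 2) // 4     # quarter of the way toward B
    c = (A + 3 * B + 2) // 4     # three quarters of the way toward B
    return a, b, c
```

Copying the edge sample at the boundary avoids inventing pixel values outside the picture, which is why the boundary rule degenerates from the 9:3:3:1 kernel to a 1-D 3:1 blend.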
  • 138. − Improved PB-frames • Improves upon the previous PB-frame mode by permitting forward prediction of “B” frame with a new vector. − Reference Picture Selection (RPS) • A lower latency method for dealing with error prone environments by using some type of back- channel to indicate to an encoder when a frame has been received and can be used for motion estimation. • In RPS Mode, a frame is not used for prediction in the encoder until it’s been acknowledged to be error free. 138 Other Miscellaneous Features
  • 139. − Independently Decodable Segments • When signaled, it restricts the use of data outside of a current Group-of-Blocks segment or slice segment. Useful for error resiliency. − Alternative INTER VLC (AIV): • Permits use of an alternative VLC table that is better suited for Intra coded blocks, or blocks with low quantization. • A mode which reduces the number of bits needed for encoding predictively-coded blocks when there are many large coefficients in the block. 139 Other Miscellaneous Features
  • 141. − Phone lines are “circuit-switched”. − A (virtual) circuit is established at call initiation and remains for the duration of the call. 141 Internet Basics Source Dest.switch switch switch
  • 142. − Computer networks are “packet-switched”. − Data is fragmented into packets, and each packet finds its way to the destination using different routes. − Lots of implications... 142 Internet Basics Source Dest.switch switch switchX
  • 143. 143 The Internet Is Heterogeneous Router Router Router Corporate LAN INTERNET (Global Public) AOL HyperStream TYMNET MCI Mail LAN Mail Gateway Host Dial-up IP X.25 E-mail Abbreviations: SLIP: Serial Line Internet Protocol; PPP: Point-to-Point Protocol; FR: Frame Relay; SMDS: Switched Multimegabit Data Service; ATM: Asynchronous Transfer Mode; SMTP: Simple Mail Transfer Protocol
  • 144. − MCI Mail was one of the first ever commercial email services in the United States and one of the largest telecommunication services in the world. − AOL Mail is a free web-based email service provided by AOL, a division of Verizon Communications. − X.25 is an ITU-T standard protocol suite for packet-switched data communication in wide area networks (WAN). − Frame Relay (FR) is a standardized wide area network technology that specifies the physical and data link layers of digital telecommunications channels using a packet switching methodology. − Asynchronous Transfer Mode (ATM) is a telecommunications standard defined by ANSI and ITU standards for carriage of user traffic, including telephony, data, and video signals. − Switched Multimegabit Data Service (SMDS) is a wide area networking (WAN) connection service designed for LAN interconnection through the public telephone network. SMDS is designed for moderate bandwidth connections, between 1 and 34 Mbps, although SMDS has been and is being extended to support both lower and higher bandwidth connections. − Tymnet was an international data communications network headquartered in Cupertino, California that used virtual call packet switched technology and X.25, SNA/SDLC, ASCII and BSC interfaces to connect host computers at thousands of large companies, educational institutions, and government agencies. 144 The Internet Is Heterogeneous
  • 145. OSI (Open System Interconnection) Model 145
  • 146. Comparison Between OSI and TCP/IP Model 146
  • 147. 147 Layers in the Internet Protocol Architecture 1. Network Access Layer: consists of routines for accessing physical networks. 2. Internet Layer: defines the datagram and handles the routing of data. 3. Host-to-Host Transport Layer: provides end-to-end data delivery services. 4. Application Layer: consists of applications and processes that use the network. (Each layer prepends its own header to the data passed down from the layer above.)
  • 148. 148 Internet Protocol Architecture I P FDDI Ethernet Token Ring HDLC SMDS X.25 ATM FR TCP UDP SNMP DNS TELNET FTP SMTP MIME . . . . . . Network Access Layer Internet Host-Host Transport Utility/Application RTP MBone VIC/VAT
  • 149. 149 Specific Protocols for Multimedia IP TCP UDP RTP Physical Network Data IP UDP RTP payload RTP payload UDP RTP payload Payload header
  • 150. − IP implements two basic functions • Addressing • Fragmentation − IP treats each packet as an independent entity. − Internet routers choose the best path to send each packet based on its address. Each packet may take a different route. − Routers may fragment and reassemble packets when necessary for transmission on smaller packet networks. − No guarantee a packet will reach its destination, and no guarantee of when it will get there. • IP packets have a Time-to-Live, after which they are deleted by a router. • IP does not ensure secure transmission. • IP only error-checks headers, not payload. 150 The Internet Protocol (IP) IP TCP UDP RTP Physical Network Data IP UDP RTP payload RTP payload UDP RTP payload Payload header
  • 151. − TCP is connection-oriented, end-to-end reliable, in-order protocol. − TCP does not make any reliability assumptions of the underlying networks. − Acknowledgment is sent for each packet. − A transmitter places a copy of each packet sent in a timed buffer. − If no “ack” is received before the time is out, the packet is re-transmitted. − TCP has inherently large latency → not well suited for streaming multimedia. 151 Transmission Control Protocol (TCP) IP TCP UDP RTP Physical Network Data IP UDP RTP payload RTP payload UDP RTP payload Payload header
  • 152. − UDP is a simple protocol for transmitting packets over IP. − Smaller header than TCP, hence lower overhead. − Does not re-transmit packets. − This is OK for multimedia since a late packet usually must be discarded anyway. − Performs a check-sum of the data. 152 User Datagram Protocol (UDP) IP TCP UDP RTP Physical Network Data IP UDP RTP payload RTP payload UDP RTP payload Payload header
  • 153. 153 Transmission Control Protocol (TCP) and User Datagram Protocol (UDP)
  • 154. − RTP carries data that has real-time properties. − Typically runs on UDP/IP. − Does not ensure timely delivery or QoS. − Does not prevent out-of-order delivery. − Profiles and payload formats must be defined. • Profiles define extensions to the RTP header for a particular class of applications such as audio/video conferencing (IETF RFC 1890). • Payload formats define how a particular kind of payload, such as H.261 video, should be carried in RTP. − Used by Netscape LiveMedia, Microsoft NetMeeting®, Intel VideoPhone, ProShare® Video Conferencing applications and public-domain conferencing tools such as VIC and VAT. 154 Real-time Transport Protocol (RTP)
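The 12-byte fixed RTP header defined in RFC 3550 can be packed with Python's struct module. The sketch below assumes the static H.263 payload type 34 from RFC 3551; sequence number, timestamp, and SSRC values are arbitrary examples:

```python
import struct

def make_rtp_header(pt, seq, timestamp, ssrc, marker=0):
    """Pack the 12-byte fixed RTP header (RFC 3550):
    V=2, no padding, no extension, zero CSRCs."""
    b0 = (2 << 6)                      # version 2, P=0, X=0, CC=0
    b1 = (marker << 7) | (pt & 0x7F)   # marker bit + 7-bit payload type
    return struct.pack("!BBHII", b0, b1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)

hdr = make_rtp_header(pt=34, seq=1, timestamp=90000, ssrc=0x1234)
# A full RTP packet is then hdr + payload_header + payload
# (e.g. an H.263 payload header followed by H.263 bit-stream data).
```

The sequence number lets the receiver detect loss and reordering, and the timestamp drives playout timing; neither causes a retransmission.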
  • 155. − RTCP is a companion protocol to RTP which • monitors the quality of service • conveys information about the participants in an on-going session − It allows participants to send transmission and reception statistics to other participants. − It also sends information that allows participants to associate media types such as audio/video for lip-sync. − Sender reports allow senders to derive round-trip propagation times. − Receiver reports include a count of lost packets and inter-arrival jitter. − Scales to a large number of users by reducing the rate of reports as the number of participants increases. 155 Real-time Transport Control Protocol (RTCP)
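The inter-arrival jitter carried in receiver reports is the running estimator of RFC 3550: for each packet the receiver computes the change in transit time relative to the previous packet and folds its magnitude into the estimate with gain 1/16. A minimal Python version (the function name is illustrative):

```python
def update_jitter(jitter, prev_transit, transit):
    """One step of the RFC 3550 inter-arrival jitter estimator.
    transit = arrival_time - rtp_timestamp, in timestamp units;
    J = J + (|D| - J) / 16, where D is the change in transit time."""
    d = abs(transit - prev_transit)
    return jitter + (d - jitter) / 16.0

# Two packets: the second arrives 32 timestamp units "later" than expected.
j = update_jitter(0.0, prev_transit=500, transit=532)   # 32/16 = 2.0
```

The 1/16 gain smooths out individual spikes, so a single late packet nudges the reported jitter rather than dominating it.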
  • 156. − Most IP-based communication is unicast. A packet is intended for a single destination. − In unicasting, the router forwards the received packet through only one of its interfaces. − The relationship between the source and the destination is one-to-one. 156 Unicasting
  • 157. − For multi-participant applications, streaming multimedia to each destination individually can waste network resources. − A multicast address is designed to enable the delivery of packets to a set of hosts that have been configured as members of a multicast group across various subnetworks. − In multicasting, the router may forward the received packet through several of its interfaces. − The source address is a unicast address, but destination address is a group address. 157 Multicast Packets are duplicated in routers One source and a group of destination
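A receiver joins a multicast group with a socket option. The Python sketch below (group address and port are arbitrary examples) issues an IGMP join so that local routers forward the group's traffic onto this subnet; the join is wrapped in try/except because hosts without a multicast-capable interface may refuse it:

```python
import socket
import struct

GROUP = "224.1.1.1"   # any address in 224.0.0.0/4 is a multicast group
PORT = 5004

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))

# ip_mreq structure: 4-byte group address + 4-byte local interface
# (0.0.0.0 = let the kernel choose).
mreq = struct.pack("4s4s", socket.inet_aton(GROUP),
                   socket.inet_aton("0.0.0.0"))
try:
    # Joining tells routers (via IGMP) to duplicate the group's packets here.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    joined = True
except OSError:
    joined = False    # no multicast route available on this host
# sock.recvfrom(2048) would now receive the group's traffic.
sock.close()
```

Note the asymmetry the slide describes: the sender transmits to the group address with an ordinary sendto(), while each receiver's join is what causes routers to duplicate packets toward its subnet.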
  • 158. 158 Unicast Example, Streaming Media to Multi-participants S1 D1 S2 D1 D2 R R R 1 1 2 S1 sends duplicate packets because there are two participants, D1 and D2. D2 sees excess traffic on this subnet.
  • 159. 159 Multicast Example, Streaming Media to Multi-participants S1 D1 S2 D1 D2 R R R 1 2 S1 sends single set of packets to a multicast group. D2 doesn’t see any excess traffic on this subnet. Both D1 receivers subscribe to the same multicast group.
  • 160. − A multicast router may not find another multicast router in the neighborhood to forward the multicast packet. − We make a multicast backbone (Mbone) out of these isolated routers using the concept of tunneling. − The multicast backbone (Mbone) was an experimental backbone and virtual network built on top of the Internet for carrying IP multicast traffic. It required specialized hardware and software (early 1990s). 160 Multicast Backbone (Mbone) [Diagram: isolated islands of multicast routers joined by virtual point-to-point links (tunnels) through nonmulticast routers]
  • 161. − Easy to deploy (no explicit router support). − Manual tunnel creation/maintenance. − No routing policy – single tree. 161 Multicast Backbone (Mbone) MBONE
  • 162. 162 Multicast Backbone (Mbone): IP-in-IP Tunneling [Diagram: a multicast packet (IP header with group address G=224.x.x.x, plus data) is wrapped in an outer IP header by the encapsulator (router at the entry point of the tunnel), carried through nonmulticast routers, and unwrapped by the decapsulator (router at the exit point of the tunnel)]
  • 163. Real-time applications • Interactive applications are sensitive to packet delays (telephone). • Non-interactive applications can adapt to a wider range of packet delays (audio, video broadcasts). • A guarantee of maximum delay is useful. 163 Quality of Service Requirements (1) [Diagram: arrival-offset graph of sampled audio with a playout point; the playout buffer must be small for interactive applications]
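A playout buffer trades delay for smoothness: packets arriving before the playout point are played, later ones are lost. This hypothetical Python sketch (percentiles and delay values are illustrative assumptions) picks the playout offset from measured packet delays, using a smaller fraction of the delay distribution for interactive use:

```python
def playout_delay(observed_delays, interactive):
    """Pick a playout offset from measured one-way packet delays.
    Interactive apps keep the buffer small (lower percentile, lower
    latency, more late packets dropped); non-interactive apps can
    wait out the tail of the delay distribution."""
    ordered = sorted(observed_delays)
    q = 0.90 if interactive else 1.00     # fraction of packets played on time
    idx = int(q * (len(ordered) - 1))
    return ordered[idx]

# Delays in ms, with one moderate and one extreme outlier:
delays_ms = [40, 42, 45, 41, 60, 43, 44, 120, 46, 47]
phone = playout_delay(delays_ms, interactive=True)      # drops the 120 ms tail
broadcast = playout_delay(delays_ms, interactive=False) # absorbs it
```

The interactive choice keeps mouth-to-ear delay low at the cost of discarding the slowest packets; the broadcast choice accepts a larger buffer so nearly nothing is lost.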
  • 164. Elastic applications − Interactive data transfer (e.g. HTTP, FTP) • Sensitive to the average delay, not to the distribution tail − Bulk data transfer (e.g. mail and news delivery) • Delay insensitive − Best effort works well 164 Quality of Service Requirements (2) [Diagram: a document is only useful when it is completely received, so average packet delay matters, not maximum packet delay]
  • 165. Used by hosts to obtain a certain QoS from the underlying networks for a multimedia stream (it operates over IPv4 or IPv6). − It provides receiver-initiated setup of resource reservations for multicast or unicast data flows. − At each node, the RSVP daemon attempts to make a resource reservation for the stream. − It communicates with two local modules: • Admission Control: determines whether the node has sufficient resources available. “The Internet Busy Signal” • Policy Control: determines whether the user has administrative permission to make the reservation. 165 Resource ReSerVation Protocol (RSVP) [RSVP functional diagram: on both host and router, the application/routing process talks to the RSVP daemon (RSVPD), which consults policy control and admission control before configuring the packet classifier and packet scheduler on the data path]
  • 166. 166 Resource ReSerVation Protocol (RSVP) R4 R5 R3 R2 R1 Host A 24.1.70.210 Host B 128.32.32.69 PATH PATH 1. An application on Host A creates a session, 128.32.32.69/4078, by communicating with the RSVP daemon on Host A. 2. The Host A RSVP daemon generates a PATH message that is sent to the next-hop RSVP router, R1, in the direction of the session address, 128.32.32.69. 3. The PATH message follows the next-hop path through R5 and R4 until it gets to Host B. Each router on the path creates soft session state with the reservation parameters.
  • 167. 167 Resource ReSerVation Protocol (RSVP) R4 R5 R3 R2 R1 Host A 24.1.70.210 Host B 128.32.32.69 PATH PATH RESV RESV 4. An application on Host B communicates with the local RSVP daemon and asks for a reservation in session 128.32.32.69/4078. The daemon checks for and finds existing session state. 5. The Host B RSVP daemon generates a RESV message that is sent to the next-hop RSVP router, R4, in the direction of the source address, 24.1.70.210. 6. The RESV message continues to follow the next-hop path through R5 and R1 until it gets to Host A. Each router on the path makes a resource reservation.
  • 168. − HTTP generally runs on TCP/IP and is the protocol upon which World-Wide-Web data is transmitted. − Defines a “stateless” connection between receiver and sender. − Sends and receives MIME-like messages and handles caching, etc. − No provisions for latency or QoS guarantees. 168 Hypertext Transfer Protocol (HTTP)
  • 169. 169 Real-time Streaming Protocol (RTSP) [Diagram: RTSP control channel alongside meta files and media file download] A “network remote control” for multimedia servers. − Establishes and controls either a single stream or several time-synchronized streams of continuous media such as audio and video. − Supports the following operations: • Request a presentation from a media server. • Invite a media server to join a conference and play back or record. • Notify clients that additional media is available for an existing presentation.
  • 170. 170 Real-time Streaming Protocol (RTSP) [Diagram: RTSP control alongside meta files and media file download]
  • 171. 171 Real-time Streaming Protocol (RTSP) RTSP - Example
  • 172. − How do we handle the special cases of • unicasting? • multicasting? − What about • packet loss? • quality of service? • congestion? We’ll look at some solutions... 172 How Do We Stream Video Over the Internet?
  • 173. − HTTP was not designed for streaming multimedia; nevertheless, because of its widespread deployment via Web browsers, many applications stream via HTTP. − It uses a custom browser plug-in which can start decoding video as it arrives, rather than waiting for the whole file to download. − Operates on TCP, so it doesn’t have to deal with errors, but the side effect is high latency and large inter-arrival jitter. − Usually a receive buffer is employed which can hold enough data (usually several seconds) to compensate for latency and jitter. − Not applicable to two-way communication! − Firewalls are not a problem with HTTP. 173 HTTP Streaming
  • 174. − RTP was designed for streaming multimedia. − Does not resend lost packets, since this would add latency and a late packet might as well be lost in streaming video. − Used by Intel Videophone, Microsoft NetMeeting, Netscape LiveMedia, RealNetworks, etc. − Forms the basis for network video conferencing systems (ITU-T H.323). − Subject to packet loss, and has no quality-of-service guarantees. − Can deal with network congestion via RTCP reports under some conditions: • Encoding should happen in real time so the video rate can be changed dynamically. • Needs a payload format defined for each media type it carries. 174 RTP Streaming
  • 175. − Payloads must be defined in the IETF (Internet Engineering Task Force) for all media carried by RTP. − A payload has been defined for H.263 and H.263+. − An RTP packet typically consists of... − The H.263 payload header contains redundant information about the H.263 bit stream which can assist a payload handler and decoder in the event that related packets are lost. − The slice mode of H.263+ aids RTP packetization by allowing fragmentation on MB boundaries (instead of MB rows) and restricting data dependencies between slices. − But what do we do when packets are lost or arrive too late to use? 175 H.263 Payload for RTP RTP Header H.263 Payload Header H.263 Payload (bit stream)
  • 177. − Depends on network topology. − On the Mbone • 2-5% packet loss • single packet loss most common − For end-to-end transmission, loss rates of 10% not uncommon. − For ISPs, loss rates may be even higher during high periods of congestion. 177 Internet Packet Loss
  • 178. 178 Internet Packet Loss: Packet Loss Burst Lengths [Plots: distribution of the length of loss bursts observed at a receiver (probability of bursts of length b vs. b), and conditional loss probability (probability of losing packet n+1 vs. the number of consecutive packets already lost, n)]
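Bursty losses like those in the plots are often simulated with a two-state (Gilbert-style) channel model: once a packet is lost, the next one is lost with a higher probability, so single losses dominate but longer bursts occur. A Python sketch with illustrative parameter values:

```python
import random

def gilbert_losses(n, p_loss=0.05, p_stay_lost=0.3, seed=1):
    """Two-state (Gilbert) loss model: in the good state a packet is
    lost with p_loss; once in the lossy state, the next packet is lost
    with the higher p_stay_lost, producing bursty loss patterns."""
    rng = random.Random(seed)
    trace = []
    in_burst = False
    for _ in range(n):
        p = p_stay_lost if in_burst else p_loss
        in_burst = rng.random() < p      # loss puts us in / keeps us in a burst
        trace.append(in_burst)
    return trace

trace = gilbert_losses(10000)
rate = sum(trace) / len(trace)           # overall loss rate, roughly p/(p + 1 - p_stay_lost)
```

Feeding such a trace to a codec simulation is a simple way to test error-resiliency schemes against correlated (rather than purely random) loss.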
  • 179. Error resiliency and compression have conflicting requirements. − Video compression attempts to remove as much redundancy from a video sequence as possible. − Error resiliency techniques at some point must reconstruct data that has been lost, and must rely on extrapolations from redundant data. 179 Error Resiliency [Diagram: resiliency adds redundancy while compression removes it]
  • 180. − Errors tend to propagate in video compression because of its predictive nature. 180 Error Resiliency [Diagram: one block lost in an I or P frame; the error propagates to two blocks in the next P frame]
  • 181. There are essentially two approaches to dealing with errors from packet loss: • Error Redundancy Methods • These are preventative measures that add extra information at the encoder to make it easier to recover when data is lost. • The extra overhead decreases compression efficiency but should improve overall quality in the presence of packet loss. • Error Concealment Techniques • These are methods used to hide the errors that occur once packets are lost. − Usually both methods are employed. 181 Error Resiliency
  • 182. 182 Intra Coding Resiliency [Plot: average PSNR vs. data rate (kbps) for resilience settings of 0, 5, and 10, under 0% and 10-20% packet loss]
  • 183. − Increasing the number of Intra-coded blocks that the encoder produces will reduce error propagation, since Intra blocks are not predicted. − Blocks that are lost at the decoder are treated as empty Inter-coded blocks (skipped blocks). − The block is simply copied from the previous frame. − Very simple to implement. 183 Simple Intra Coding & Skipped Blocks
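Concealing a lost block as a skipped Inter block is just a co-located copy from the previous decoded frame (zero motion, no residual). A minimal Python sketch, using 8x8 blocks and frames as nested lists for illustration:

```python
def conceal_lost_block(prev_frame, frame, x0, y0, size=8):
    """Treat a lost block as a skipped inter-coded block: copy the
    co-located pixels from the previous decoded frame."""
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            frame[y][x] = prev_frame[y][x]

# A 16x16 previous frame of gray pixels; the current frame lost one block.
prev = [[10] * 16 for _ in range(16)]
cur = [[None] * 16 for _ in range(16)]   # None marks undecoded/lost pixels
conceal_lost_block(prev, cur, 8, 8)      # fill the 8x8 block at (8, 8)
```

The copy is invisible in static regions; in moving regions it leaves a visible artifact, which is why the slide pairs it with periodic Intra refresh to flush the error.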
  • 184. 184 Reference Picture Selection (RPS) Mode of H.263+ − Select one of several picture memories/prediction structures to reduce error propagation. [Diagram: in RPS mode, a frame is not used for prediction in the encoder until it has been acknowledged to be error-free; prediction uses the last acknowledged error-free frame, skipping a bad picture and frames for which no acknowledgment has been received yet]
  • 185. • Back-channel message types – Neither: no back-channel message is returned from decoder to encoder – ACK: decoder returns only acknowledgement messages – NACK: decoder returns only non-acknowledgement messages – ACK+NACK: decoder returns both types of messages • Channel for back-channel messages – Separate Logical Channel: uses a separate logical channel in the multiplex layer of the system – VideoMux: sends back-channel data within the forward data of a coded video stream • ACK-based: a picture is assumed to contain errors, and thus is not used for prediction unless an ACK is received. • NACK-based: a picture will be used for prediction unless a NACK is received, in which case the previous picture that didn’t receive a NACK will be used. 185 Reference Picture Selection (RPS) Mode of H.263+
  • 186. 186 Reference Picture Selection (RPS) Mode of H.263+ [Diagram: encoder block diagram with coding control (CC), transform (T), quantizer (Q), inverse quantizer and inverse transform, prediction memory P, and additional picture memories AP1, AP2, ..., APn feeding the video multiplex coder]
  • 187. Reference pictures are interleaved to create two or more independently decodable threads. − If a frame is lost, the frame rate drops to 1/2 rate until a sync frame is reached. − Same syntax as Reference Picture Selection, but without ACK/NACK. − Adds some overhead since prediction is not based on most recent frame. 187 Multi-threaded Video 1 3 2 5 7 9 4 6 8 10 I P P P P P P P P I
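The two-thread interleaving can be expressed as a simple reference-selection rule: each thread predicts from the previous frame of the same thread, so a loss in one thread leaves the other intact until the next sync (I) frame. This illustrative Python function uses 0-based frame numbers; the thread count and sync period are assumptions, not from the H.263 syntax:

```python
def reference_frame(n, threads=2, sync_period=10):
    """Which frame predicts frame n under simple multi-threaded coding:
    sync (I) frames have no reference; the first frames after a sync
    predict from it; all others predict from n - threads (the previous
    frame of the same thread)."""
    if n % sync_period == 0:
        return None            # I frame, independently decodable
    if n < threads:
        return 0               # thread start: predict from the I frame
    return n - threads

refs = [reference_frame(n) for n in range(10)]
# Losing frame 3 breaks only the odd thread (5 <- 3, 7 <- 5, 9 <- 7);
# the even thread 2, 4, 6, 8 still decodes, giving half frame rate.
```

This shows the cost the slide mentions: prediction from n-2 instead of n-1 is a worse match, so compression efficiency drops slightly in exchange for loss containment.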
  • 188. − A video encoder contains a decoder (called the loop decoder) to create decoded previous frames which are then used for motion estimation and compensation. − The loop decoder must stay in sync with the real decoder, otherwise errors propagate. 188 Conditional Replenishment ME/MC DCT, etc. Decoder Decoder Encoder
  • 189. − One solution is to discard the loop decoder. − Can do this if we restrict ourselves to just two macroblock types: • Intra coded • Empty (just copy the same block from the previous frame) − The technique is to check if the current block has changed substantially since the previous frame and then code it as Intra if it has changed. Otherwise mark it as empty. − A periodic refresh of Intra coded blocks ensures all errors eventually disappear. 189 Conditional Replenishment ME/MC DCT, etc. Decoder Decoder Encoder
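The encoder-side decision then reduces to a change test plus a periodic refresh. A hypothetical Python sketch (the SAD threshold, refresh schedule, and block layout are arbitrary illustration choices):

```python
def classify_block(cur_block, prev_block, frame_no, block_no,
                   change_thresh=500, refresh_period=30):
    """Conditional replenishment: no loop decoder, only two block types.
    A block is INTRA coded if it changed enough since the previous frame,
    or if it is due for periodic refresh (which flushes any accumulated
    decoder error); otherwise it is marked EMPTY (copy previous frame)."""
    sad = sum(abs(c - p) for c, p in zip(cur_block, prev_block))
    due_refresh = (frame_no + block_no) % refresh_period == 0
    return "INTRA" if sad > change_thresh or due_refresh else "EMPTY"

# 8x8 blocks flattened to 64 samples:
flat = [128] * 64
moved = [200] * 64
a = classify_block(moved, flat, frame_no=1, block_no=3)   # large change
b = classify_block(flat, flat, frame_no=1, block_no=3)    # unchanged
```

Because the encoder never predicts from a decoded frame, a loss at the decoder can never desynchronize the two, and the periodic Intra refresh bounds how long any concealment error survives.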
  • 190. − Lost macroblocks are reported back to the encoder using a reliable back-channel. − The encoder catalogs spatial propagation of each macroblock over the last M frames. − When a macroblock is reported missing, the encoder calculates the accumulated error in each MB of the current frame. − If an error threshold is exceeded, the block is coded as Intra. − Additionally, the erroneous macroblocks are not used as prediction for future frames in order to contain the error. 190 Error Tracking Appendix II, H.263
  • 191. − Some parts of a bit stream contribute more to image artifacts than others if lost. − The bit stream can be prioritized, and more protection can be added for the higher-priority portions. 191 Prioritized Encoding [Diagram: increasing error protection from AC coefficients, through DC coefficients, MB information, and motion vectors, up to the picture header; unprotected encoding vs. prioritized encoding (23% overhead)] Prioritized Encoding Demo (videos used with permission of ICSI, UC Berkeley)
  • 192. − The goal is to hide the image degradation from the viewer. − The main idea behind error concealment is to replace the damaged pixels with pixels from parts of the video that have maximum resemblance. − In general, the substituted pixels may come from the same frame or from the previous frame. − These are called intraframe and interframe error concealment, respectively. 192 Error Concealment by Interpolation [Diagram: a lost block concealed by taking the weighted average of 4 neighboring pixels at distances d1 and d2]
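The weighted average in the figure is commonly taken with weights inversely proportional to the distance from each good boundary pixel, so nearer neighbors count more. A minimal Python sketch (the function name and exact weighting are illustrative, not specified on the slide):

```python
def conceal_pixel(left, right, top, bottom, dl, dr, dt, db):
    """Intraframe concealment of one lost pixel: average the four
    nearest correctly received boundary pixels, each weighted by the
    inverse of its distance to the lost pixel."""
    weights = [1.0 / d for d in (dl, dr, dt, db)]
    values = (left, right, top, bottom)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# A lost pixel 1 sample from the left edge (value 100), 7 from the
# right edge (60), and 4 from the top and bottom edges (both 80):
p = conceal_pixel(100, 60, 80, 80, dl=1, dr=7, dt=4, db=4)
# p lies closest to the nearby left-edge value of 100.
```

Applied across a lost block, this produces a smooth (low-pass) fill: good for flat regions, blurry across edges, which motivates the edge-based and nonlinear methods on the next slide.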
  • 193. Error Concealment with • Least Square Constraints • Bayesian Estimators • Polynomial Interpolation • Edge-Based Interpolation • Multi-directional Recursive Nonlinear Filter (MRNF) 193 Other Error Concealment Techniques [Example images: MRNF filtering; MPQT at 0.5 bpp with 10% block loss, MRNF-GMLOS, PSNR = 34.94 dB]
  • 194. − Most multimedia applications place the burden of rate adaptivity on the source. − For multicasting over heterogeneous networks and receivers, it is impossible to meet all the conflicting requirements, which forces the source to encode at a least-common-denominator level. − The smallest network pipe dictates the quality for all the other participants of the multicast session. − If congestion occurs, the quality of service degrades as more packets are lost. 194 Network Congestion