EGE UNIVERSITY
DEPARTMENT OF ELECTRICAL AND ELECTRONICS ENGINEERING
MULTIMEDIA INFORMATION SYSTEMS
FINAL PROJECT
Tree-Structured Partitioning into Transform Blocks
and Units and Interpicture Prediction in HEVC
LAÇİN ACAROĞLU
05150000537
Course Teacher
DR. NÜKHET ÖZBEK
Contents

BLOCK STRUCTURES in HEVC
1. Block Partitioning for Prediction and Transform Coding
2. Coding Tree Blocks and Coding Tree Units
3. Coding Trees, Coding Blocks, and Coding Units
3.1. Coding units/blocks (CU/CB)
3.2. Prediction units/blocks (PU/PB)
3.3. Transform units/blocks (TU/TB)
INTER-PICTURE PREDICTION in HEVC
1. Motion Data Coding
1.1. Advanced Motion Vector Prediction
1.1.1. AMVP Candidate List Construction
1.1.1.a. Spatial Candidates
1.1.1.b. Temporal Candidate
1.2. Inter-picture Prediction Block Merging
1.2.1. Merge Candidate List Construction
1.2.1.a. Spatial Candidates
1.2.1.b. Temporal Candidate
1.2.1.c. Additional Candidates
1.2.2. Motion Data Storage Reduction (MDSR)
2. Fractional Sample Interpolation
2.1. Redesigned Filters
2.2. High Precision Filtering Operations
2.3. Complexity of HEVC Interpolation Filter
3. Weighted Sample Prediction
EXPERIMENTAL STUDY
1. Coding/Access Settings
1.1. Low Delay Configuration Files
1.1.1. TZ Search Algorithm
1.2. Random Access Configuration Files
2. Conclusion
References
BLOCK STRUCTURES in HEVC
1.) Block Partitioning for Prediction and Transform Coding :
One significant difference between the different generations of video coding standards is
that they provide different sets of coding modes for a block of samples. On the one hand,
the selected coding mode determines whether the block of samples is predicted using intra-
picture or inter-picture prediction. On the other hand, it can also determine the subdivision
of a given block into subblocks used for prediction and/or transform coding. Blocks that are
used for prediction are typically additionally associated with prediction parameters, such as
motion vectors or intra prediction modes.
2.) Coding Tree Blocks and Coding Tree Units
In prior standards such as H.264/AVC, each picture of a video sequence is partitioned into so-called macroblocks. A macroblock consists of a 16x16 block of luma samples and, in the 4:2:0 chroma sampling format, two associated 8x8 blocks of chroma samples (one for each chroma component). The macroblocks can be considered the basic processing units in these standards. For each macroblock of a picture, a coding mode has to be selected by the encoder. The chosen macroblock coding mode determines whether all samples of a macroblock are predicted using intra-picture prediction or motion-compensated prediction.
Why was HEVC developed? Due to the popularity of HD video and the growing interest in Ultra HD (UHD) formats with resolutions of, for example, 3840x2160 or even 7680x4320 luma samples, HEVC has been designed with a focus on high-resolution video.

What was the main goal of HEVC? The standard has been designed to provide improved coding efficiency relative to its predecessor, H.264/MPEG-4 AVC, for all existing video coding applications. While increasing the largest supported block size is advantageous for high-resolution video, it may have a negative impact on coding efficiency for low-resolution video, in particular if low-complexity encoder implementations are used that are not capable of evaluating all supported subpartitioning modes. For this reason, HEVC includes a flexible mechanism for partitioning video pictures into basic processing units of variable sizes.
The HEVC coding tree structure is essentially a replacement for the macroblocks and blocks of prior standards. Unlike 10 years ago, we now have much larger frame sizes to deal with, and we need larger blocks to efficiently encode the motion vectors for these frame sizes. On the other hand, small detail is still important, and we sometimes want to perform prediction and transformation at a granularity of 4×4.
3.) Coding Trees, Coding Blocks, and Coding Units
HEVC divides the picture into CTUs (Coding Tree Units). The CTU size is signaled in the sequence parameter set, meaning that all the CTUs in a video sequence have the same size.

In the HEVC standard, anything called an xxxUnit is a logical coding unit that is encoded into the HEVC bitstream, while anything called an xxxBlock is the portion of the video frame buffer that a process targets. The CTU (Coding Tree Unit) is therefore a logical unit. It usually consists of three blocks, namely one luma (Y) block and two chroma (Cb and Cr) blocks, plus the associated syntax elements. Each of these blocks is called a CTB (Coding Tree Block).

In the Main profile, the minimum and maximum sizes are specified by syntax elements in the sequence parameter set (SPS), chosen among the sizes 8×8, 16×16, 32×32, and 64×64.
A direct application of the H.264/MPEG-4 AVC macroblock syntax to the coding tree units in HEVC would cause some problems. For one thing, choosing between intra-picture and motion-compensated prediction for large blocks is unfavorable in a rate-distortion sense. The decision between intra-picture and inter-picture coding at units smaller than a 16x16 macroblock can increase the coding efficiency.
In HEVC, each picture is partitioned into square-shaped coding tree blocks (CTBs) such that the
resulting number of CTBs is identical for both the luma and chroma picture components. Each CTB of
luma samples together with its two corresponding CTBs of chroma samples and the syntax associated
with these sample blocks is subsumed under a so-called coding tree unit (CTU).
The main advantage of introducing the coding tree structure is that in this way an elegant and
unified syntax is obtained for specifying the partitioning of CTUs into blocks that are used for intra-
picture prediction, motion-compensated prediction, and transform coding.
Figure: Example of the partitioning of a 64x64 coding tree unit (CTU) into coding units (CUs) of 8x8 to 32x32 luma samples. The partitioning can be described by a quadtree, also referred to as a coding tree, shown on the right. The numbers indicate the coding order of the CUs.
3.1. Coding units/blocks (CU/CB):
Each luma CTB has the same size as the CTU: 64×64, 32×32, or 16×16. Depending on the content of that part of the video frame, however, a CTB may be too big for deciding whether to perform inter-picture or intra-picture prediction. Thus, each CTB can be split into multiple CBs (Coding Blocks), and each CB becomes the decision point for the prediction type. For example, some CTBs are split into 16×16 CBs while others are split into 8×8 CBs. HEVC supports CB sizes all the way from the size of the CTB down to 8×8.
The quadtree splitting process can be iterated until the size for a luma CB reaches a minimum
allowed luma CB size that is selected by the encoder using syntax in the SPS and is always 8×8 or
larger (in units of luma samples).
The CB is the point at which the choice between inter-picture and intra-picture prediction is made. More precisely, the prediction type is coded in the CU (Coding Unit). A CU consists of three CBs (Y, Cb, and Cr) and the associated syntax elements.
The CB is a fine enough granularity for the prediction-type decision, but it could still be too large to store motion vectors (inter prediction) or an intra prediction mode. For example, a very small object such as falling snow may be moving in the middle of an 8×8 CB, and we want to use different MVs depending on the position within the CB.
Some properties:

Recursive Partitioning from the CTU: Let the CTU size be 2N×2N, where N is 32, 16, or 8. The CTU can be a single CU or can be split into four smaller units of equal size N×N, which are nodes of the coding tree. If the units are leaf nodes of the coding tree, they become CUs; otherwise, each can be split again into four smaller units as long as the split size is equal to or larger than the minimum CU size specified in the SPS. This representation results in a recursive structure specified by a coding tree.
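The recursive CTU-to-CU partitioning described above can be sketched as a small quadtree routine. This is an illustrative sketch, not decoder or HM reference code: the `should_split` callback stands in for the encoder's rate-distortion decision, and the child order follows the z-scan (top-left, top-right, bottom-left, bottom-right).

```python
def split_ctu(x, y, size, min_cu, should_split):
    """Recursively partition a CTU into leaf CUs (quadtree sketch).

    should_split(x, y, size) stands in for the encoder's RD decision.
    Returns the leaf CUs as (x, y, size) tuples in z-scan coding order.
    """
    if size > min_cu and should_split(x, y, size):
        half = size // 2
        cus = []
        for dy in (0, half):          # z-scan: TL, TR, BL, BR
            for dx in (0, half):
                cus += split_ctu(x + dx, y + dy, half, min_cu, should_split)
        return cus
    return [(x, y, size)]

# Example: split a 64x64 CTU uniformly down to 16x16 CUs.
cus = split_ctu(0, 0, 64, 8, lambda x, y, s: s > 16)
```

With the uniform decision above, the 64×64 CTU yields sixteen 16×16 CUs; a decision function that never splits would leave the CTU as a single CU.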
Benefits of the Flexible CU Partitioning Structure: The first benefit comes from the support of CU sizes greater than the conventional 16×16. When a region is homogeneous, a large CU can represent it using fewer symbols than several small blocks would require. Supporting large CU sizes is an effective means of increasing coding efficiency for higher-resolution content.
The CTU is one of the strong points of HEVC in terms of coding efficiency and adaptability to content and applications. This flexibility is especially useful for low-resolution video services, which are still common in the market. By choosing an appropriate CTU size and maximum hierarchical depth, the hierarchical block partitioning structure can be optimized for the target application.
The splitting process of coding tree can be specified recursively and all other syntax elements can be
represented in the same way regardless of the size of CU. This kind of recursive representation is
very useful in terms of reducing parsing complexity and improving clarity when the quadtree depth is
large.
3.2. Prediction units/blocks (PU/PB):
We want to use different MVs depending on the position within a CB, which is why the PB was introduced. Each CB can be split into PBs differently depending on the temporal and/or spatial predictability. For each CU, a prediction mode is signaled in the bitstream, indicating whether the CU is coded using intra-picture prediction or motion-compensated prediction.
One or more PUs are specified for each CU, which is a leaf node of the coding tree. Coupled with the CU, the PU works as a basic representative block for sharing prediction information. Inside one PU, the same prediction process is applied, and the relevant information is transmitted to the decoder on a PU basis.
A CU can be split into one, two or four PUs according to the PU splitting type. HEVC defines
two splitting shapes for the intra coded CU and eight splitting shapes for inter coded CU.
Unlike the CU, the PU may only be split once.
PU Splitting Type: Similar to prior standards, each CU in HEVC can be classified into one of three categories: skipped CU, inter coded CU, and intra coded CU. An inter coded CU uses motion compensation for the prediction of the current block, while an intra coded CU uses neighboring reconstructed samples for the prediction. A skipped CU is a special form of inter coded CU in which both the motion vector difference and the residual energy are equal to zero.
For each category, the PU splitting type is specified differently, as shown in Fig. 5, for a CU size of 2N×2N. As the figure shows, only the PART−2N×2N splitting type is allowed for a skipped CU. For an intra coded CU, two splitting types, PART−2N×2N and PART−N×N, are supported. For an inter coded CU, a total of eight splitting types are defined: two square shapes (PART−2N×2N, PART−N×N), two rectangular shapes (PART−2N×N and PART−N×2N), and four asymmetric shapes (PART−2N×nU, PART−2N×nD, PART−nL×2N, and PART−nR×2N).
Constraints According to CU Size: In PART−N×N, a CU is split into four equal-size PUs, which is conceptually similar to the case of four equal-size CUs when the CU size is not equal to the minimum CU size. Thus, HEVC disallows PART−N×N except when the CU size equals the minimum CU size. It was observed that this design choice reduces encoding complexity significantly while the coding efficiency loss is marginal.

To reduce worst-case complexity, HEVC further restricts the use of PART−N×N and the asymmetric shapes. For an inter coded CU, PART−N×N is disabled when the CU size is 8×8. Moreover, asymmetric shapes for an inter coded CU are only allowed when the CU size is not equal to the minimum CU size.
HEVC supports eight different modes for partitioning a CU into PUs, as shown in the figure below.

Figure: Supported partitioning modes for splitting a coding unit (CU) into one, two, or four prediction units (PUs). The (M/2)×(M/2) mode and the modes shown in the bottom row are not supported for all CU sizes.
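The splitting-type rules and CU-size constraints above can be collected into one table-driven check. The sketch below mirrors the rules as stated in this section; the mode labels follow the standard's PART_* naming, but the function itself and its parameters are illustrative, not an HM API.

```python
def allowed_pu_modes(cu_size, min_cu_size, pred_type, amp_enabled=True):
    """List the PU splitting modes allowed for a CU (illustrative sketch).

    pred_type is one of "skip", "intra", "inter".
    """
    if pred_type == "skip":
        return ["PART_2Nx2N"]            # skipped CUs are never split
    if pred_type == "intra":
        modes = ["PART_2Nx2N"]
        if cu_size == min_cu_size:       # NxN only at the minimum CU size
            modes.append("PART_NxN")
        return modes
    # inter coded CU
    modes = ["PART_2Nx2N", "PART_2NxN", "PART_Nx2N"]
    if cu_size == min_cu_size and cu_size > 8:
        modes.append("PART_NxN")         # NxN disabled for 8x8 inter CUs
    if amp_enabled and cu_size != min_cu_size:
        modes += ["PART_2NxnU", "PART_2NxnD",
                  "PART_nLx2N", "PART_nRx2N"]
    return modes
```

For example, a 32×32 inter CU above the minimum size gets the two square-or-rectangular splits plus the four asymmetric shapes, while an 8×8 inter CU is limited to 2N×2N, 2N×N, and N×2N.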
3.3. Transform units/blocks (TU/TB):
Once the prediction is made, we need to code the residual (the difference between the predicted image and the actual image) with a DCT-like transform. Again, the CB could be too big for this, because a CB may contain both a detailed part (high frequency) and a flat part (low frequency). Therefore, each CB can be split into TBs (Transform Blocks). Note that a TB does not have to be aligned with a PB; it is possible, and often makes sense, to perform a single transform across residuals from multiple PBs.
Similar to the PU, one or more TUs are specified for the CU. HEVC allows a residual block to be split into multiple units recursively, forming another quadtree analogous to the coding tree for the CU. The TU is the basic representative block holding residual or transform coefficients for applying the integer transform and quantization. For each TU, one integer transform of the same size as the TU is applied to obtain residual coefficients. These coefficients are transmitted to the decoder, after quantization, on a TU basis.
Residual Quadtree: After the residual block is obtained by the prediction process based on the PU splitting type, it is split into multiple TUs according to a quadtree structure, and an integer transform is applied to each TU. The tree is called the transform tree, or residual quadtree (RQT), since the residual block is partitioned by a quadtree structure and a transform is applied to each leaf node of the quadtree.
Nonsquare Partitioning: A square RQT (SRQT) is constructed when the PU splitting type is a square shape, while a nonsquare RQT (NSRQT) is used for rectangular and asymmetric shapes. For NSRQT, the transform shape is horizontal when the partition mode is a horizontal type such as PART−2N×N, PART−2N×nU, or PART−2N×nD; the same rule applies to the vertical types PART−N×2N, PART−nL×2N, and PART−nR×2N. Although the syntax of SRQT and NSRQT is the same, the shapes of the TUs at each transform tree depth are defined differently. Fig. 6 illustrates an example of a transform tree and the corresponding TU splitting: Fig. 6(a) shows the transform tree, Fig. 6(b) the TU splitting when the PU shape is square, and Fig. 6(c) the TU splitting when the PU shape is rectangular or asymmetric. Although they share the same transform tree, the actual TU splitting differs depending on the PU splitting type.
Transform Across Boundaries: In HEVC, both the PU size and the TU size can reach the size of the corresponding CU. As a consequence, a TU may be larger than a PU in the same CU, i.e., residuals from different PUs in the same CU can be transformed together. For example, when the TU size equals the CU size, the transform is applied to the residual block covering the whole CU, regardless of the PU splitting type. Note that this case exists only for inter coded CUs, since for intra coded CUs the prediction is always coupled with the TU splitting.
Maximum Depth of the Transform Tree: The maximum depth of the transform tree is closely related to encoding complexity. To provide flexibility, HEVC specifies two syntax elements in the SPS that control the maximum transform tree depth for intra coded and inter coded CUs, respectively. The case where the maximum depth equals 1 is denoted implicit TU splitting, since no information needs to be transmitted on whether a TU is split; in this case, the transform size is automatically adjusted to fit inside the PU rather than allowing a transform across the PU boundary.
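The residual quadtree with its depth limit can be sketched the same way as the coding tree. In the sketch below, `max_depth` plays the role of the SPS depth-limit syntax elements and `should_split` stands in for the encoder decision; both names are illustrative.

```python
def split_rqt(size, depth, max_depth, min_tb, should_split):
    """Recursively split a residual block into TB sizes (sketch).

    Splitting stops at the maximum transform tree depth or when a
    further split would fall below the minimum TB size.
    """
    if (depth < max_depth and size // 2 >= min_tb
            and should_split(size, depth)):
        return [tb for _ in range(4)
                for tb in split_rqt(size // 2, depth + 1, max_depth,
                                    min_tb, should_split)]
    return [size]

# A 32x32 residual, depth limit 2, always splitting: 16 TBs of 8x8.
tbs = split_rqt(32, 0, 2, 4, lambda s, d: True)
```

Setting `max_depth` to 0 leaves the residual as one transform covering the whole block, which corresponds to the case where the transform spans the entire CU.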
INTER-PICTURE PREDICTION IN HEVC
Hybrid video coding is known to be a combination of video sample prediction and
transformation of the prediction error. While intra-picture prediction exploits the correlation
between spatially neighboring samples, inter-picture prediction makes use of the temporal
correlation between pictures in order to derive a motion-compensated prediction (MCP) for
a block of image samples.
For this block-based MCP, a video picture is divided into rectangular blocks. Assuming homogeneous motion inside one block and moving objects that are larger than one block, for each block a corresponding block can be found in a previously decoded picture to serve as a predictor.
1. Motion Data Coding

1.1. Advanced Motion Vector Prediction:

In HEVC, motion vectors are coded in terms of horizontal (x) and vertical (y) components as a difference to a so-called motion vector predictor (MVP).
Why are neighboring blocks important? Because the motion vectors of the current block are usually correlated with the motion vectors of neighboring blocks in the current picture or in earlier coded pictures: neighboring blocks are likely to correspond to the same moving object with similar motion, and the motion of an object is not likely to change abruptly over time. Consequently, using the motion vectors of neighboring blocks as predictors reduces the size of the signaled motion vector difference. The MVPs are usually derived from already decoded motion vectors of spatially neighboring blocks or of temporally neighboring blocks in the co-located picture.
AVC and HEVC differ here in an important way: the method of deriving the MVP.

➢ In H.264/AVC: the component-wise median of three spatially neighboring motion vectors.
➢ In HEVC: motion vector competition.

Motion vector competition explicitly signals which MVP from a list of MVPs is used for motion vector derivation. The variable quadtree coding block structure in HEVC can result in one block having several neighboring blocks with motion vectors as potential MVP candidates.
1.1.1. AMVP Candidate List Construction:

The initial design of AMVP included five MVPs from three different classes of predictors: three motion vectors from spatial neighbors, the median of the three spatial predictors, and a scaled motion vector from a co-located, temporally neighboring block. Furthermore, the list of predictors was modified by reordering, to place the most probable motion predictor in the first position, and by removing redundant candidates, to assure minimal signaling overhead. During standardization, this led to significant simplifications of the AMVP design: removing the median predictor, reducing the number of candidates in the list from five to two, fixing the candidate order in the list, and reducing the number of redundancy checks.
The final AMVP candidate list construction includes:

• Two spatial candidate MVPs derived from five spatially neighboring blocks.
• One temporal candidate MVP, derived from two temporal, co-located blocks, when both spatial candidates are unavailable or identical.
• Zero motion vectors when the spatial, the temporal, or both candidates are not available.
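These rules map directly onto a small list-building routine. A sketch under the rules above, with MVs as (x, y) tuples and None meaning "not available"; it mirrors the bullet points, not the normative spec pseudo-code:

```python
def build_amvp_list(cand_a, cand_b, temporal):
    """Build the two-entry AMVP candidate list (illustrative sketch)."""
    mvps = [c for c in (cand_a, cand_b) if c is not None]
    if len(mvps) == 2 and mvps[0] == mvps[1]:
        mvps.pop()                       # drop the duplicate spatial MVP
    if len(mvps) < 2 and temporal is not None:
        mvps.append(temporal)            # temporal candidate fills a slot
    while len(mvps) < 2:
        mvps.append((0, 0))              # pad with zero motion vectors
    return mvps
```

The list always ends up with exactly two entries, so a single one-bit index suffices to signal the chosen MVP.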
1.1.1.a. Spatial Candidates:

Candidates A and B are derived from five spatially neighboring blocks.

How does it work? For candidate A, motion data from the two blocks A0 and A1 at the bottom-left corner is considered in a two-pass approach. In the first pass, it is checked whether any of the candidate blocks contains a reference index equal to the reference index of the current block; the first motion vector found is taken as candidate A. When all reference indices of A0 and A1 point to a different reference picture than the reference index of the current block, the associated motion vector cannot be used as is. Therefore, in a second pass, the motion vectors are scaled according to the temporal distances between the candidate reference picture and the current reference picture.
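The two-pass rule for candidate A can be sketched as follows. Each neighbor is either None or a (mv, ref_idx) pair, and the `scale` callback stands in for the temporal-distance scaling; all names here are illustrative, not standard-defined APIs.

```python
def derive_candidate_a(a0, a1, cur_ref_idx, scale):
    """Two-pass derivation of spatial AMVP candidate A (sketch)."""
    neighbors = [a0, a1]
    # Pass 1: take an MV that already points at the current reference.
    for n in neighbors:
        if n is not None and n[1] == cur_ref_idx:
            return n[0]
    # Pass 2: otherwise take the first available MV, scaled to the
    # current reference picture's temporal distance.
    for n in neighbors:
        if n is not None:
            return scale(n[0], n[1])
    return None
```

Pass 1 avoids scaling whenever possible; pass 2 only runs when no neighbor references the current picture's reference.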
A tricky point: only motion vectors of spatially neighboring blocks to the left of and above the current block are considered as spatial MVP candidates. This is because the blocks to the right of and below the current block are not yet decoded, so their motion data is not available. Keep this in mind; we will return to it in the part on temporal candidates.
Mathematically, the operation scales the candidate motion vector by the ratio of temporal distances, MV_scaled ≈ MV × tb / td, where:

td: temporal distance between the current picture and the reference picture of the candidate block
tb: temporal distance between the current picture and the reference picture of the current block
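HEVC performs this scaling in fixed-point arithmetic rather than with a per-vector division. The sketch below follows the spec's fixed-point formulation for one MV component (tb and td are picture order count differences; the clipping of tb and td themselves to [-128, 127] in the real decoder is omitted here):

```python
def clip3(lo, hi, v):
    """Clip v to the inclusive range [lo, hi]."""
    return max(lo, min(hi, v))

def scale_mv(mv, tb, td):
    """Scale one MV component by roughly tb/td in fixed point.

    tb: POC distance between current picture and its reference.
    td: POC distance between current picture and the candidate's reference.
    """
    num = 16384 + (abs(td) >> 1)
    tx = num // td if td > 0 else -(num // -td)   # ~ 16384 / td, truncated
    dist_scale = clip3(-4096, 4095, (tb * tx + 32) >> 6)
    prod = dist_scale * mv
    sign = 1 if prod >= 0 else -1
    return clip3(-32768, 32767, sign * ((abs(prod) + 127) >> 8))
```

For example, a candidate MV of 8 over a distance td = 4, reused for a distance tb = 2, comes out as 4, matching the intuitive 8 × 2/4.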
For candidate B, the candidates B0 to B2 are checked sequentially in the same way as A0 and A1 in the first pass. The second pass, however, is only performed when blocks A0 and A1 do not contain any motion information. In that case, candidate A is set equal to the non-scaled candidate B, if found, and candidate B is set equal to a second, non-scaled or scaled variant of candidate B. Since the second pass may be entered while non-scaled candidates are still possible, it searches for non-scaled as well as scaled MVs derived from B0 to B2.

Overall, this design allows A0 and A1 to be processed independently from B0, B1, and B2. The derivation of B only needs to know the availability of A0 and A1 in order to decide whether to search for a scaled or an additional non-scaled MV from B0 to B2. This dependency is acceptable given that it significantly reduces the motion vector scaling operations for candidate B; reducing the number of motion vector scalings represents a significant complexity reduction in the motion vector predictor derivation process.
1.1.1.b. Temporal Candidates:

Since the co-located picture is a reference picture that is already decoded, it is possible to also consider motion data from the block at the same position, from blocks to the right of the co-located block, or from the blocks below. In HEVC, the block at the bottom right and the block at the center of the current block have been determined to be the most suitable for providing a good temporal motion vector predictor (TMVP). C0 denotes the bottom-right neighbor and C1 the center block.
Motion data of C0 is considered first; if it is not available, motion data of the co-located candidate block at the center is used to derive the temporal MVP candidate C. The motion data of C0 is also treated as unavailable when the associated PU belongs to a CTU beyond the current CTU row, which limits the memory bandwidth required to store the co-located motion data. In contrast to the spatial MVP candidates, whose motion vectors may refer to the same reference picture, motion vector scaling is mandatory for the TMVP.
H.264 vs. H.265 for temporal candidates: while the temporal direct mode in H.264/AVC always refers to the first reference picture in the second reference picture list (list 1) and is only allowed in bi-predictive slices, HEVC offers the possibility to indicate, for each picture, which reference picture is considered the co-located picture. This is done by signaling the co-located reference picture list and reference picture index in the slice header, and by requiring that these syntax elements be the same in all slices of a picture.
1.2. Inter-picture Prediction Block Merging:

HEVC uses a quadtree structure to describe the partitioning of a region into subblocks. In terms of bit rate this is a very low-cost structure, while at the same time it allows partitioning into a wide range of differently sized subblocks. While this simplicity is an advantage, for example for encoder design, it also bears the disadvantage of over-segmenting the image, potentially leading to redundant signaling and ineffective borders.
The main motivation for the use of block-based partitioning in image or video compression is
to code each block with one specific choice out of a set of simple models since, in general, a
single model cannot be expected to capture the properties of a whole image efficiently. The
quadtree data structure allows partitioning of an image into blocks of variable size, and is
thus a suitable framework for optimizing the tradeoff between model accuracy and model
coding cost.
However, quadtree-based image and video coding has intrinsic drawbacks. For instance, the systematic decomposition of every block into four child blocks does not allow child blocks belonging to two different parent blocks to be represented jointly. Also, if a given block is divided into four child blocks, all child blocks are typically coded separately, even if two or three of them share the same coding parameters. Thus, the suboptimal RD properties of an initial quadtree-based subdivision can be substantially improved by joining (or merging) nodes belonging to potentially different parents.
(a) Details of a picture in the Cactus sequence containing a moving object in the foreground with motion
indicated by the arrow. (b) Quadtree partitioning for motion parameters is indicated by the white squares. (c)
Only those block borders are shown that separate blocks with different motion parameters.
Figure (b) reveals that RD-optimized tree pruning forces the quadtree segmentation to subdivide regions unnecessarily. Over a range of sequences and coding conditions for HEVC, blocks that are neighbored by a previously coded block with exactly the same motion parameters were observed to cover approximately 40% of all samples on average.
Figure (c) shows the same partitioning as in (b), but with boundaries between blocks with
identical motion parameters removed. This reveals regions of identical motion parameters,
better capturing the distinct types of motion in the scene. Ideally, the boundary of each
region should coincide with the motion discontinuities in the given video signal. Using
quadtree-based block partitioning combined with merging, the picture can be subdivided
into smaller and smaller blocks, thereby approximating the motion boundary and minimizing
the size of partitions containing the boundary. Subsequently, each created quadtree leaf
block can be merged into a region on either side of the boundary.
The merging operation: all the PBs coded before the current PB along the block scanning pattern form the set of causal PBs. The block scanning pattern for the CTUs (the tree roots) is raster scan, and within a CTU the CUs (leaves) and their associated PBs are processed in depth-first order along a z-scan. Hence, the PBs in the striped area of the figure are successors (anticausal blocks) of PB X and do not yet have associated prediction data.
Figure: Block merging example.
In any case, the current PB X can be merged only with one of the causal PBs. When X is
merged with one of these causal PBs, the PU of X copies and uses the same motion
information as the PU of this particular block. Out of the set of causal PBs, only a subset is
used to build a list of merge candidates. In order to identify a candidate to merge with, an
index to the list is signaled to the decoder for block X.
Relationship between block merging (H.265) and the spatial direct mode (H.264): it should be noted that a scheme merging spatially neighboring blocks is conceptually similar to spatial prediction modes such as the spatial direct mode in H.264/AVC, which also tries to reduce coding cost by exploiting redundancies of motion parameters in neighboring blocks.
1.2.1. Merge Candidate List Construction:
While block merging and AMVP appear similar, there is one main difference: the AMVP list
only contains motion vectors for one reference list, whereas a merge candidate contains all
motion data, including whether one or two reference picture lists are used, as well as a
reference index and a motion vector for each list. This significantly reduces the motion data
signaling overhead.
The output of this process is a list of merge candidates, each of them being a tuple of motion
parameters, which can be used for motion-compensated prediction (MCP) of a block. That is,
the information on whether one or two motion-compensated hypotheses are used for
prediction, and, for each hypothesis, an MV and a reference picture index. The number of
candidates within this list is controlled by NumMergeCands, which is signaled in the slice
header. The whole process of adding candidates will stop as soon as the number of
candidates reaches NumMergeCands during the following ordered steps: the process starts
with the derivation of the initial candidates from spatially neighboring PBs, called spatial
candidates. Then, a candidate from a PB in a temporally collocated picture can be included,
which is called the temporal candidate. Because certain candidates may be unavailable or
redundant, the number of initial candidates can be less than NumMergeCands. In this case,
additional "virtual" candidates are inserted into the list so that the number of candidates in
the list always equals NumMergeCands.
Initial Candidates: The initial candidates are obtained by first analyzing the spatial neighbors
of the current PB. (A1, B1, B0, A0, and B2 --> same as AMVP) A candidate is only available to
be added to the list when the corresponding PU is interpicture predicted and does not lie
outside of the current slice. Moreover, a position-specific redundancy check is performed so
that redundant candidates are excluded from the list to improve the coding efficiency.
Redundancy checks prior to adding a candidate to the list.
For each candidate to check, outgoing arrows point to a candidate for comparison.
First, the candidate at position A1 is added to the list if available. Then, the candidates at the
positions B1, B0, A0, and B2 are processed in that order. Each of these candidates is
compared to one or two candidates at a previously tested position and is added to the list
only if these comparisons show a difference. Each of these redundancy checks is depicted in
the figure by an arrow. Note that instead of performing a comparison for every candidate
pair, the process is constrained to a subset of comparisons to reduce complexity while
maintaining coding efficiency.
Major steps for the construction of the block merging candidate list in HEVC.
Merge candidate list construction includes:
• Four spatial merge candidates that are derived from five spatial neighboring blocks
• One temporal merge candidate derived from two temporal, co-located blocks
• Additional merge candidates including combined bi-predictive candidates and zero
motion vector candidates
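The ordered steps above can be sketched as follows. This is a minimal Python sketch with hypothetical helper names; the real derivation also depends on slice type and block position, and the simple duplicate check here stands in for the position-specific pruning described below.

```python
# Hedged sketch of ordered merge-list construction, capped at NumMergeCands.
def build_merge_list(spatial, temporal, num_merge_cands):
    """spatial: candidates in order A1, B1, B0, A0, B2 (None = unavailable);
    temporal: the TMVP-based candidate, or None."""
    cands = []
    for c in spatial:
        if c is not None and c not in cands:   # simplified redundancy check
            cands.append(c)
        if len(cands) == num_merge_cands:      # stop as soon as the cap is hit
            return cands
        if len(cands) == 4:                    # at most four spatial candidates
            break
    if temporal is not None and len(cands) < num_merge_cands:
        cands.append(temporal)                 # TMVP added without redundancy check
    idx = 0
    while len(cands) < num_merge_cands:        # pad with "virtual" candidates
        cands.append(("zero_mv", idx))
        idx += 1
    return cands
```

The list length the decoder sees is therefore always NumMergeCands, so the merge index can be parsed without knowing which candidates were available.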
1.2.1.a. Spatial Candidates:
The first candidates in the merge candidate list are the spatial neighbors. For AMVP, one
MVP is derived from A0 and A1 and one from B0, B1, and B2, in that order. For
inter-prediction block merging, however, up to four candidates are inserted into the merge
list by sequentially checking A1, B1, B0, A0, and B2,
in that order. Instead of just checking whether a neighboring block is available and contains
motion information, some additional redundancy checks are performed before taking all the
motion data of the neighboring block as a merge candidate.
Redundancy check purposes:
• Avoid having candidates with redundant motion data in the list
• Prevent merging two partitions that could be expressed by other means, which
would create redundant syntax
During the development of HEVC, the checks for redundant motion data have been reduced
to a subset in a way that the coding efficiency is kept while the comparison logic is
significantly reduced. In the final design, no more than two comparisons are performed per
candidate, resulting in five comparisons overall. Given the order {A1, B1, B0, A0, B2}, B1
checks only A1, B0 checks only B1, A0 checks only A1, and B2 checks A1 and B1.
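This fixed comparison subset can be sketched as follows. It is an illustration only: here motion data is an opaque value, whereas the real comparison covers vectors, reference indices, and prediction direction.

```python
# The five pruning comparisons kept in the final HEVC design.
CHECK_AGAINST = {"A1": [], "B1": ["A1"], "B0": ["B1"], "A0": ["A1"], "B2": ["A1", "B1"]}

def spatial_merge_candidates(nb):
    """nb: dict mapping position -> motion data, or None if unavailable."""
    out = []
    for pos in ("A1", "B1", "B0", "A0", "B2"):
        md = nb.get(pos)
        if md is None:
            continue
        # prune only against the fixed subset, not every earlier candidate
        if any(nb.get(q) == md for q in CHECK_AGAINST[pos]):
            continue
        out.append(md)
        if len(out) == 4:                  # up to four spatial candidates
            break
    return out
```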
1.2.1.b. Temporal Candidates:
Since a merge candidate comprises all motion data and the TMVP is only one motion vector,
the derivation of the whole motion data only depends on the slice type. For bi-predictive
slices, a TMVP is derived for each reference picture list. Depending on the availability of the
TMVP for each list, the prediction type is set to bi-prediction or to the list for which the
TMVP is available. All associated reference picture indices are set equal to zero.
Conversely, for uni-predictive slices, only the TMVP for list 0 is derived, together with a
reference picture index equal to zero. When at least one TMVP is available and the temporal
merge candidate is added to the list, no redundancy check is performed. This makes the
merge list construction independent of the co-located picture which improves error
resilience. Consider the case where the temporal merge candidate would be redundant and
therefore not included in the merge candidate list. In the event of a lost co-located picture,
the decoder could not derive the temporal candidates and hence not check whether it would
be redundant. The indexing of all subsequent candidates would be affected by this.
1.2.1.c. Additional Candidates:
After the spatial and the temporal merge candidates have been added, the list may not yet
have reached its fixed length. To compensate for the coding efficiency loss that comes with
the non-length-adaptive list index signaling, additional candidates are generated.
Depending on the slice type, two kinds of candidates are used to fully populate the list:
• Combined bi-predictive candidates
• Zero motion vector candidates
In bi-predictive slices, additional candidates can be generated from the existing ones by
combining the reference picture list 0 motion data of one candidate with the list 1 motion
data of another.
When the list is still not full after adding the combined bi-predictive candidates, or for uni-
predictive slices, zero motion vector candidates are calculated to complete the list. All zero
motion vector candidates have one zero displacement motion vector for uni-predictive slices
and two for bi-predictive slices.
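A sketch of this padding step, with an illustrative data layout: candidates are modeled as dicts mapping a list name to an (MV, refIdx) tuple.

```python
# Hedged sketch: pad with combined bi-predictive candidates, then zero-MV ones.
def pad_merge_list(cands, num_merge_cands, bi_predictive):
    if bi_predictive:
        n = len(cands)                     # combine only the original candidates
        for i in range(n):
            for j in range(n):
                if len(cands) == num_merge_cands:
                    return cands
                if i != j and "L0" in cands[i] and "L1" in cands[j]:
                    combo = {"L0": cands[i]["L0"], "L1": cands[j]["L1"]}
                    if combo not in cands:
                        cands.append(combo)
    ref_idx = 0
    while len(cands) < num_merge_cands:    # zero-displacement candidates
        zero = {"L0": ((0, 0), ref_idx)}
        if bi_predictive:
            zero["L1"] = ((0, 0), ref_idx)
        cands.append(zero)
        ref_idx += 1
    return cands
```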
1.2.2. Motion Data Storage Reduction (MDSR):
The usage of the TMVP, in AMVP as well as in the merge mode, requires the storage of the
motion data (including motion vectors, reference indices and coding modes) in co-located
reference pictures. Considering the granularity of motion representation, the memory size
needed for storing motion data could be significant. HEVC employs motion data storage
reduction (MDSR) to reduce the size of the motion data buffer and the associated memory
access bandwidth by sub-sampling motion data in the reference pictures.
H.264 and H.265 difference:
While H.264/AVC stores this information on a 4x4 block basis, HEVC uses a 16x16
granularity: when sub-sampling the 4x4 grid, the information of the top-left 4x4 block of
each 16x16 region is stored. Due to this sub-sampling, MDSR impacts the quality of the
temporal prediction. Furthermore, there is a tight correlation between the position of the
MV used in the co-located picture and the position of the MV stored by MDSR. TMVP
candidates provide the best trade-off between coding efficiency and memory bandwidth
reduction.
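The 16x16 sub-sampling can be sketched as:

```python
# MDSR stores one set of motion data per 16x16 region: that of its top-left
# 4x4 sub-block. Any luma position in the region maps to that corner.
def mdsr_stored_position(x, y):
    return (x // 16) * 16, (y // 16) * 16

# Storage reduction vs. H.264/AVC's 4x4 grid: (16 * 16) // (4 * 4) = 16x fewer entries.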
2. Fractional Sample Interpolation
Interpolation tasks arise naturally in the context of video coding because the true
displacements of objects from one picture to another are independent of the sampling grid
of cameras. Therefore, in MCP, fractional-sample accuracy is used to more accurately
capture continuous motion. Samples available at integer positions are filtered to estimate
values at fractional positions. This spatial domain operation can be seen in the frequency
domain as introducing phase delays to individual frequency components. An ideal
interpolation filter for band-limited signals induces a constant phase delay to all frequencies
and does not alter their magnitudes. The efficiency of MCP is limited by many factors—the
spectral content of original and already reconstructed pictures, camera noise level, motion
blur, quantization noise in reconstructed pictures, etc.
Similar to H.264/AVC, HEVC supports motion vectors with quarter-pixel accuracy for the
luma component and one-eighth pixel accuracy for chroma components. If the motion
vector has a half or quarter-pixel accuracy, samples at fractional positions need to be
interpolated using the samples at integer-sample positions. The interpolation process in
HEVC introduces several improvements over H.264/AVC that contribute to the significant
coding efficiency increase of HEVC.
In order to improve the filter response in the high frequency range, luma and chroma
interpolation filters have been re-designed and the tap-lengths were increased. The luma
interpolation process in HEVC uses a symmetric 8-tap filter for half-sample positions and an
asymmetric 7-tap filter for quarter-sample positions. For chroma samples, a 4-tap filter was
introduced. The intermediate values used in interpolation process are kept at a higher
accuracy in HEVC to improve coding efficiency.
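For reference, the luma coefficients defined in the standard, applied here in a simplified 1-D form (the actual process is separable 2-D filtering; offset and clipping stages are omitted):

```python
# HEVC luma interpolation coefficients (filter gain 64, i.e. a 6-bit shift):
HALF_PEL = [-1, 4, -11, 40, 40, -11, 4, -1]   # symmetric 8-tap
QUARTER_PEL = [-1, 4, -10, 58, 17, -5, 1]     # asymmetric 7-tap

def interpolate_1d(samples, center, coeffs):
    """Filter around integer position `center`; taps start 3 samples to the left.
    The result is kept at high precision; shift by 6 (and clip) at the final stage."""
    start = center - 3
    return sum(c * samples[start + i] for i, c in enumerate(coeffs))

# On a flat signal the filters are gain-preserving:
flat = [100] * 16
assert interpolate_1d(flat, 8, HALF_PEL) >> 6 == 100
assert interpolate_1d(flat, 8, QUARTER_PEL) >> 6 == 100
```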
H.264 and H.265 difference:
➢ H.264/AVC obtains the quarter-sample values by first obtaining the values of the
nearest half-pixel samples and averaging those with the nearest integer samples,
according to the position of the quarter-pixel. HEVC, however, obtains the
quarter-pixel samples without such cascaded steps, by directly applying a 7- or
8-tap filter to the integer pixels.
➢ In H.264/AVC, a bi-predictively coded block is calculated by averaging two
uni-predicted blocks. If interpolation is performed to obtain the samples of the
uni-prediction blocks, those samples are shifted and clipped to input bit-depth after
interpolation, prior to averaging. On the other hand, HEVC keeps the samples of each
one of the uni-prediction blocks at a higher accuracy and only performs rounding to
input bit-depth at the final stage, improving the coding efficiency by reducing the
rounding error.
2.1. Redesigned Filters:
An important parameter for interpolation filters is the number of filter taps as it has a direct
influence on both coding efficiency and implementation complexity. In terms of
implementation, it not only has an impact on the arithmetic operations but also on the
memory bandwidth required to access the reference samples.
Increasing the number of taps can yield filters that produce desired response for a larger
range of frequencies which can help to predict the corresponding frequencies in the samples
to be coded. Considering modern computing capabilities, the performance of many MCP
filters were evaluated in the context of HEVC and a coding efficiency/complexity trade-off
was targeted during the standardization.
The basic idea is to forward transform the known integer samples to the DCT domain and
inverse transform the DCT coefficients to the spatial domain using DCT basis sampled at
desired fractional positions instead of integer positions. These operations can be combined
into a single FIR filtering step.
The half-pel interpolation filter of HEVC comes closer to the desired response than the
H.264/AVC filter.
2.2. High Precision Filtering Operations:
In H.264/AVC, some of the intermediate values used within interpolation are shifted to
lower accuracy, which introduces rounding error and reduces coding efficiency. This loss of
accuracy is due to several reasons. Firstly, the half-pixel samples obtained by 6-tap FIR filter
are first rounded to input bit-depth, prior to using those for obtaining the quarter-pixel
samples.
Instead of using a two-stage cascaded filtering process, the HEVC interpolation filter
computes the quarter-pixel samples directly with a 7-tap filter, which significantly reduces
the rounding error to 1/128. The second reason for the reduced accuracy in the H.264/AVC
motion compensation process is the averaging in bi-prediction: the prediction signal of
bi-predictively coded motion blocks is obtained by averaging prediction signals from two
prediction lists.
If the motion vectors have fractional pixel accuracy, then S1 and S2 are obtained using
interpolation and the intermediate values are rounded to input bit-depth. In HEVC, instead
of averaging each prediction signal at the precision of the bit-depth, they are averaged at a
higher precision if fractional motion vectors are used for the corresponding block.
In HEVC the only clipping operation is at the very end of the motion compensation process,
with no clipping in intermediate stages. As there is also no rounding in intermediate stages,
HEVC interpolation filter allows certain implementation optimizations. Consider the case
where bi-prediction is used and motion vectors of each prediction direction points to the
same fractional position. In these cases, final prediction could be obtained by first adding
two reference signals and performing interpolation and rounding once, instead of
interpolating each reference block, thus saving one interpolation process.
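The two averaging strategies can be contrasted in a sketch. The shift amounts assume an 8-bit input and the 6-bit filter gain; the exact offsets are illustrative, not the normative values.

```python
BIT_DEPTH_MAX = 255   # 8-bit input assumed

def clip(v):
    return min(max(v, 0), BIT_DEPTH_MAX)

def bi_average_cascaded(p0, p1):
    """H.264/AVC style: round each uni-prediction to bit-depth, then average
    (two rounding steps, accumulating rounding error)."""
    return (clip((p0 + 32) >> 6) + clip((p1 + 32) >> 6) + 1) >> 1

def bi_average_high_precision(p0, p1):
    """HEVC style: keep intermediates at high precision; round and clip once
    at the very end of the motion compensation process."""
    return clip((p0 + p1 + 64) >> 7)
```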
2.3. Complexity of HEVC Interpolation Filter:
When evaluating the complexity of a video coding algorithm, there are several parameters to consider:
• Memory bandwidth
• Number of operations
• Storage buffer size
In terms of memory bandwidth,
H.264/AVC => 6-tap filter for luma sub-pixels and bilinear filter for chroma
H.265/HEVC => 7–8 tap filter for luma sub-pixels and 4-tap filter for chroma sub-pixels
This difference increases the amount of data that needs to be fetched from the reference
memory.
The worst case happens when a small motion block is bi-predicted and its corresponding
motion vector points to a sub-pixel position where two-dimensional filtering needs to be
performed. In order to reduce the worst case memory bandwidth, HEVC introduces several
restrictions. Firstly, the smallest prediction block size is fixed to be 4x8 or 8x4, instead of 4x4.
In addition, these smallest block sizes of size 4x8 and 8x4 can only be predicted with uni-
prediction. With these restrictions in place, the worst-case memory bandwidth of HEVC
interpolation filter is around 51% higher than that of H.264/AVC. The increase in memory
bandwidth is not very high for larger block sizes. For example, for a 32x32 motion block,
HEVC requires around 13% increased memory bandwidth over H.264/AVC.
Similarly, the longer tap-length filters increase the number of arithmetic operations required
to obtain the interpolated sample. The high-precision bi-directional averaging increases the
size of intermediate storage buffers for storing the temporary uni-prediction signals as each
one of the prediction signals need to be stored at a higher bit-depth compared to H.264/AVC
before the bi-directional averaging takes place.
HEVC uses 7-tap FIR filter for interpolating samples at quarter-pixel locations, which has an
impact on motion estimation of a video encoder. An H.264/AVC encoder could store only the
integer and half-pel samples in the memory and generate the quarter-pixels on-the-fly
during motion estimation. This would be significantly more costly in HEVC because of the
complexity of generating each quarter-pixel sample on-the-fly with a 7-tap FIR filter. Instead,
an HEVC encoder could store the quarter-pixel samples in addition to integer and half-pixel
samples and use those in motion estimation. Alternatively, an HEVC encoder could estimate
the values of quarter-pixel samples during motion estimation by low complexity non-
normative means.
3. Weighted Sample Prediction
Similar to H.264/AVC, HEVC includes a weighted prediction (WP) tool that is particularly
useful for coding sequences with fades. In WP, a multiplicative weighting factor and an
additive offset are applied to the motion compensated prediction.
(P) = inter prediction signal
(P’) = linearly weighted prediction signal
(w) = multiplicative weighting factor
(o) = additive offset
Mathematical Expression => P’ = w x P + o
WP incurs very small overhead: the PPS and slice headers carry only non-default WP scaling
values. WP is an optional PPS tool and may be switched on or off as needed.
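A hedged sketch of the explicit weighting: the shift models the log2 weight denominator signaled in the slice header, and rounding offsets are omitted for clarity.

```python
# Sketch of P' = w * P + o with an integer weight and a log2 denominator.
def weighted_prediction(p, w, o, log2_denom=6):
    """Apply the multiplicative weight and additive offset, clipped to 8-bit."""
    return min(max(((w * p) >> log2_denom) + o, 0), 255)
```

For example, with `log2_denom = 6` a weight of 64 is the identity, so `w = 32` halves the prediction signal, which is the kind of scaling useful for a fade to black.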
The inputs to the WP process are:
• The width and the height of the luma prediction block
• Prediction samples to be weighted
• The prediction list utilization flags
• The reference indices for each list
• The color component index
In H.264/AVC, weight and offset parameters are either derived from the relative distances
between the current picture and the reference pictures (implicit mode) or explicitly signaled
(explicit mode). Unlike H.264/AVC, HEVC only includes the explicit mode, as the coding
efficiency provided by deriving the weighted prediction parameters with the implicit mode
was considered negligible. It should be noted that the weighted prediction process defined
in HEVC version 1 was found to be suboptimal for higher bit-depths, as the offset parameter
is calculated at low precision. The next version of the standard will likely modify the
derivation of the offset parameter for higher bit-depths.
The determination of appropriate WP parameters in an encoder is outside the scope of the
HEVC standard. Several algorithms for estimating WP parameters have been proposed in
literature. Optimal solutions are obtained when the WP weights, motion estimation, and
rate-distortion optimization (RDO) are considered jointly. However, practical systems usually
employ simplified techniques, such as deriving approximate weights from the
picture-to-picture mean variation.
EXPERIMENTAL STUDY
I use the HM 16.8 reference codec with several different configuration files, and I will also
examine differences across the configuration files.
Below you can see part of the configuration files. This part concerns "motion search", and I
will show these parameters' effects in this section.
Motion Search - Encoder Parameter Description
FastSearch: Enables the TZ search, similar to the corresponding search mode in the
reference software for H.264/AVC scalable video coding.
SearchRange: Specifies the search range used for motion estimation.
BipredSearchRange: Specifies the search range used for bi-prediction refinement in motion
estimation.
HadamardME: Enable (1) or disable (0) the Hadamard transform as the cost measure for
fractional motion estimation. Disabling it reduces complexity while giving away a little
coding efficiency.
FEN: Enable (1) or disable (0) fast encoding: use subsampled SAD (Sum of Absolute
Differences) for blocks with more than eight rows in integer motion estimation, and use one
iteration instead of four for the bi-predictive search.
FDM: Enable (1) or disable (0) fast encoder decisions for 2Nx2N merge mode: stop the
merge search if skip mode is chosen.
1. Coding/Access Settings:
Intra only → uses no temporal prediction
Random Access → allows the use of future frames as reference in temporal prediction
Low Delay → uses only previous frames as reference in temporal prediction
I will examine only “Random Access” and “Low Delay” in this section.
1.1. Low Delay Configuration Files:
1) Fast Search = 0 → Full Search , 1 → TZ Search
There are two different search algorithms. As you can see, at the same quality (bitrate and
PSNR values are nearly identical), there are huge differences in encoding time.
The TZ search algorithm is much faster than the full search algorithm, as is easily seen
above: with full search, the total time increases by about 2000 percent (about 20 times
longer).
TZ search algorithm:
TZ Search is the algorithm used in the motion estimation (ME) process in HEVC. In this section we discuss it briefly.
A. Prediction Motion Vector:
The TZ Search algorithm employs the left, upper, and upper-right predictors to compute the
median predictor of the corresponding block.
Median Calculation → Median( A, B, C ) = A + B + C − Min( A, Min( B, C ) ) − Max( A, Max( B, C ) )
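This identity avoids sorting and can be applied component-wise to motion vectors:

```python
def median3(a, b, c):
    """Median of three via the identity above (no sorting needed)."""
    return a + b + c - min(a, min(b, c)) - max(a, max(b, c))

def median_mv(mv_a, mv_b, mv_c):
    """Apply the identity component-wise to (x, y) motion vectors."""
    return tuple(median3(x, y, z) for x, y, z in zip(mv_a, mv_b, mv_c))
```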
B. Initial Grid Search
After selecting the start position, the second step is to perform the first search and to determine the
search range and the search pattern.
Search patterns could be diamond or square search patterns.
C. Raster Search
When the previous search is completed, the next step is the raster search, which uses the
best distance corresponding to the best-matched point from the last search. Depending on
this distance, three cases arise.
Case 1 → If the best distance is equal to zero, the process stops.
Case 2 → If 1 < BestDistance < iRaster, the refinement stage proceeds directly. Here, iRaster
is a parameter bounding the distance between the central point and the best point found by
the first search.
Case 3 → If BestDistance > iRaster, a raster scan is performed using the iRaster value as the
stride length. The raster search is applied when the distance between the motion vector
from the first phase and the starting position is too large.
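The three cases and the strided scan can be sketched as follows (hypothetical function names; the boundary handling of case 2 is simplified):

```python
def raster_decision(best_distance, i_raster):
    """The three cases after the initial grid search (TZ search)."""
    if best_distance == 0:
        return "stop"
    if best_distance < i_raster:
        return "refine"                # go straight to the refinement stage
    return "raster"                    # scan the window with stride i_raster

def raster_positions(search_range, stride):
    """Stride-subsampled positions of the full search window."""
    return [(x, y)
            for y in range(-search_range, search_range + 1, stride)
            for x in range(-search_range, search_range + 1, stride)]
```

For a 16x16 window (`search_range = 8`) with `iRaster = 4`, the raster stage visits a 5x5 grid of 25 positions instead of the 289 positions a full search would test.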
This raster search is carried out over the complete search window. A "Full Search" over such
a window is shown in the figure (below) for a 16x16 search window with iRaster equal to 4.
This is why we use such algorithms: the goal is the same result obtained more efficiently.
2) FEN = 0 (Fast Encoder Decision)
In this configuration file the search algorithm is TZ Search (Fast Search = 1), so to see the
difference I need to look at the related files.
The "FEN" value (it can be 0 or 1, off or on) has been changed (set to 0, i.e. disabled) on the
right. When "Fast Encoder Decision" is disabled, the encoding time increases; I measured
this increase at 38 percent.
3) FDM = 0 (Fast Decision for Merge RD cost)
If this mode is turned off, an increase in encoding time is observed, meaning that the fast
decision for "merge" blocks has a decisive effect on encoding time. This increase is around
13 percent.
Comparison between FEN and FDM:
The time increase when FEN is turned off is greater than that for FDM (by about 25
percentage points). This means the FEN speed-up matters more for the encoding operation
than FDM, probably because FEN affects more blocks than FDM does; FEN has broader
coverage of the encoding process.
4) SearchRange = 128
1.2. Random Access Configuration Files:
1) Fast Search = 0 → Full Search, 1 → TZ Search
The total time is increased about 2850 percent (About 28.5 times longer) for full search
algorithm.
2) FEN = 0 (Fast Encoder Decision)
The total time is increased about 37 percent.
3) FDM = 0 (Fast Decision for Merge RD cost)
The total time is increased about 15 percent.
4) SearchRange = 128
2. Conclusion:
Reference parameter configuration is shown below.
I prefer to use this as a reference because this is the fastest way for encoding. So if I change
any parameter, then I will see “encoding time increasing”.
Low Delay (QP=36, GOP=4):
                     Bitrate   PSNR    Time
Reference            223.89    38.21   1151.6
Fast Search = 0      223.34    38.21   24369.1
FEN = 0              222.85    38.17   1592.47
FDM = 0              223.54    38.19   1305.1
Search Range = 128   223.47    38.19   1038.81

Random Access (QP=38, GOP=8):
                     Bitrate   PSNR    Time
Reference            190.01    37.61   633.82
Fast Search = 0      190.05    37.61   18690.00
FEN = 0              190.02    37.61   868.42
FDM = 0              190.26    37.61   732.144
Search Range = 128   189.86    37.60   907.46
Note: As you can see, the PSNR and bitrate values do not change much; these differences are
negligible for now. I will therefore examine the change in encoding time.
Changed parameter     Low Delay   Random Access
Fast Search = 0       +2000%      +2850%
FEN = 0               +38%        +37%
FDM = 0               +13%        +15%
Search Range = 128    -9%         +7%
As you can see, the same encoder parameters have nearly the same effect on the
encoding-time change in both configurations.
Let us continue interpreting the other parameters, starting with a general evaluation.
What are our initial expectations from the low-delay and random-access modes? Once we
understand this, we can understand the effects of the parameters we changed and the
impact they had.
Do not forget that in this part we compare the outputs of the reference config files!
The bitrate and PSNR values are as expected: as under normal conditions, the Low Delay
bitrate is larger than that of Random Access. A smaller contributing factor is that the QP of
Low Delay is lower than that of Random Access, but it is not the only reason.
The total times behave differently. Normally the LD total time would be less than the RA
total time, but two parameters differ here: first, the QP is smaller for LD, and second, the LD
GOP size is half of RA's. These parameters affect the total time, so LD ends up with a longer
encoding time than RA, which is understandable.
Note for Search Range: Changing the Search Range parameter did not make much difference
because of the TZ search algorithm we used. We would probably see a difference with the
Full Search algorithm, since in that case every position of the search range is scanned and a
change in the range makes a difference.
As seen above, there is no big difference between the search range values. Minor differences
can be caused by other processes running on the system while the program is running.
References:
P. Helle, S. Oudin, B. Bross, D. Marpe, M. O. Bici, K. Ugur, J. Jung, G. Clare, and T. Wiegand,
"Block Merging for Quadtree-Based Partitioning in HEVC," IEEE Transactions on Circuits and
Systems for Video Technology, vol. 22, no. 12, December 2012.
S. T. Uou, E. Y. T. Juan, and J. Z. C. Lai, "Motion Vector Prediction Using Reselection Among
Candidate Set," Journal of Marine Science and Technology, vol. 23, no. 5, pp. 727-731, 2015.
DOI: 10.6119/JMST-015-0604-2.
V. Sze, M. Budagavi, and G. J. Sullivan (eds.), High Efficiency Video Coding (HEVC):
Algorithms and Architectures, pp. 49-66 and pp. 113-141.
S. Ahn, B. Lee, and M. Kim, "A Novel Fast CU Encoding Scheme Based on Spatiotemporal
Encoding Parameters for HEVC Inter Coding," IEEE Transactions on Circuits and Systems for
Video Technology, vol. 25, no. 3, pp. 422-435, March 2015.
B. Bross, P. Helle, H. Lakshman, and K. Ugur. Inter-picture prediction in HEVC. In High
Efficiency Video Coding (HEVC), pages 113–140. Springer, 2014.
R. Khemiri, N. Bahri, F. Belghith, F. Sayadi, M. Atri, et al., "Fast Motion Estimation for HEVC
Video Coding," 2016 International Conference on Image Processing, Applications and
Systems (IPAS), Nov. 2016, Hammamet, Tunisia. DOI: 10.1109/IPAS.2016.7880120.
B. Haskell, P. Gordon, R. Schmidt, J. Scattaglia, “Interframe Coding of 525-Line, Monochrome
Television at 1.5 Mbits/s,” IEEE Trans. Communications, vol. 25, no. 11, pp. 1339-1348, Nov.
1977.
F. Sampaio and M. Grellert, "Motion Vectors Merging: Low Complexity Prediction Unit
Decision Heuristic for the Inter-Prediction of HEVC Encoders," 2012 IEEE International
Conference on Multimedia and Expo.
Gwo-Long Li, Chuen-Ching Wang, and Kuang-Hung Chiang “AN EFFICIENT MOTION VECTOR
PREDICTION METHOD FOR AVOIDING AMVP DATA DEPENDENCY FOR HEVC” 2014 IEEE
International Conference on Acoustic, Speech and Signal Processing (ICASSP)
Web Links:
https://slideplayer.com/slide/10724934/
https://www.slideshare.net/mmv-lab-univalle/slides7-ccc-v8

Specification of the Linked Media Layer
Specification of the Linked Media LayerSpecification of the Linked Media Layer
Specification of the Linked Media LayerLinkedTV
 
Data over dab
Data over dabData over dab
Data over dabDigris AG
 
Im-ception - An exploration into facial PAD through the use of fine tuning de...
Im-ception - An exploration into facial PAD through the use of fine tuning de...Im-ception - An exploration into facial PAD through the use of fine tuning de...
Im-ception - An exploration into facial PAD through the use of fine tuning de...Cooper Wakefield
 
matconvnet-manual.pdf
matconvnet-manual.pdfmatconvnet-manual.pdf
matconvnet-manual.pdfKhamis37
 
High Performance Traffic Sign Detection
High Performance Traffic Sign DetectionHigh Performance Traffic Sign Detection
High Performance Traffic Sign DetectionCraig Ferguson
 
eclipse.pdf
eclipse.pdfeclipse.pdf
eclipse.pdfPerPerso
 
Db2 udb backup and recovery with ess copy services
Db2 udb backup and recovery with ess copy servicesDb2 udb backup and recovery with ess copy services
Db2 udb backup and recovery with ess copy servicesbupbechanhgmail
 
Protel 99 se_traning_manual_pcb_design
Protel 99 se_traning_manual_pcb_designProtel 99 se_traning_manual_pcb_design
Protel 99 se_traning_manual_pcb_designhoat6061
 
Protel 99 se_traning_manual_pcb_design
Protel 99 se_traning_manual_pcb_designProtel 99 se_traning_manual_pcb_design
Protel 99 se_traning_manual_pcb_designemmansraj
 
Qgis 1.7.0 user-guide_en
Qgis 1.7.0 user-guide_enQgis 1.7.0 user-guide_en
Qgis 1.7.0 user-guide_enxp5000
 
Distributed Mobile Graphics
Distributed Mobile GraphicsDistributed Mobile Graphics
Distributed Mobile GraphicsJiri Danihelka
 
Robust link adaptation in HSPA Evolved
Robust link adaptation in HSPA EvolvedRobust link adaptation in HSPA Evolved
Robust link adaptation in HSPA EvolvedDaniel Göker
 

Similar to Tree structured partitioning into transform blocks and units and interpicture prediction in hevc (20)

49529487 abis-interface
49529487 abis-interface49529487 abis-interface
49529487 abis-interface
 
Specification of the Linked Media Layer
Specification of the Linked Media LayerSpecification of the Linked Media Layer
Specification of the Linked Media Layer
 
Milan_thesis.pdf
Milan_thesis.pdfMilan_thesis.pdf
Milan_thesis.pdf
 
Software guide 3.20.0
Software guide 3.20.0Software guide 3.20.0
Software guide 3.20.0
 
Liebman_Thesis.pdf
Liebman_Thesis.pdfLiebman_Thesis.pdf
Liebman_Thesis.pdf
 
Data over dab
Data over dabData over dab
Data over dab
 
Ns doc
Ns docNs doc
Ns doc
 
test6
test6test6
test6
 
jc_thesis_final
jc_thesis_finaljc_thesis_final
jc_thesis_final
 
Im-ception - An exploration into facial PAD through the use of fine tuning de...
Im-ception - An exploration into facial PAD through the use of fine tuning de...Im-ception - An exploration into facial PAD through the use of fine tuning de...
Im-ception - An exploration into facial PAD through the use of fine tuning de...
 
AAPM-2005-TG18.pdf
AAPM-2005-TG18.pdfAAPM-2005-TG18.pdf
AAPM-2005-TG18.pdf
 
matconvnet-manual.pdf
matconvnet-manual.pdfmatconvnet-manual.pdf
matconvnet-manual.pdf
 
High Performance Traffic Sign Detection
High Performance Traffic Sign DetectionHigh Performance Traffic Sign Detection
High Performance Traffic Sign Detection
 
eclipse.pdf
eclipse.pdfeclipse.pdf
eclipse.pdf
 
Db2 udb backup and recovery with ess copy services
Db2 udb backup and recovery with ess copy servicesDb2 udb backup and recovery with ess copy services
Db2 udb backup and recovery with ess copy services
 
Protel 99 se_traning_manual_pcb_design
Protel 99 se_traning_manual_pcb_designProtel 99 se_traning_manual_pcb_design
Protel 99 se_traning_manual_pcb_design
 
Protel 99 se_traning_manual_pcb_design
Protel 99 se_traning_manual_pcb_designProtel 99 se_traning_manual_pcb_design
Protel 99 se_traning_manual_pcb_design
 
Qgis 1.7.0 user-guide_en
Qgis 1.7.0 user-guide_enQgis 1.7.0 user-guide_en
Qgis 1.7.0 user-guide_en
 
Distributed Mobile Graphics
Distributed Mobile GraphicsDistributed Mobile Graphics
Distributed Mobile Graphics
 
Robust link adaptation in HSPA Evolved
Robust link adaptation in HSPA EvolvedRobust link adaptation in HSPA Evolved
Robust link adaptation in HSPA Evolved
 

Recently uploaded

Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 

Recently uploaded (20)

Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 

Tree structured partitioning into transform blocks and units and interpicture prediction in hevc

  • 1. EGE UNIVERSITY
DEPARTMENT OF ELECTRICAL AND ELECTRONICS ENGINEERING
MULTIMEDIA INFORMATION SYSTEMS
FINAL PROJECT
Tree-Structured Partitioning into Transform Blocks and Units and Inter-Picture Prediction in HEVC
LAÇİN ACAROĞLU
05150000537
Course Teacher: DR. NÜKHET ÖZBEK
  • 2. Contents
BLOCK STRUCTURES in HEVC .......... 3
1. Block Partitioning for Prediction and Transform Coding .......... 3
2. Coding Tree Blocks and Coding Tree Units .......... 3
3. Coding Trees, Coding Blocks, and Coding Units .......... 4
3.1. Coding units/blocks (CU/CB) .......... 5
3.2. Prediction units/blocks (PU/PB) .......... 7
3.3. Transform units/blocks (TU/TB) .......... 9
INTER-PICTURE PREDICTION in HEVC .......... 11
1. Motion Data Coding .......... 12
1.1. Advanced Motion Vector Prediction .......... 12
1.1.1. AMVP Candidate List Construction .......... 12
1.1.1.a. Spatial Candidates .......... 13
1.1.1.b. Temporal Candidate .......... 14
1.2. Inter-picture Prediction Block Merging .......... 15
1.2.1. Merge Candidate List Construction .......... 17
1.2.1.a. Spatial Candidates .......... 19
1.2.1.b. Temporal Candidate .......... 19
1.2.1.c. Additional Candidates .......... 19
1.2.2. Motion Data Storage Reduction (MDSR) .......... 20
2. Fractional Sample Interpolation .......... 20
2.1. Redesigned Filters .......... 21
2.2. High Precision Filtering Operations .......... 22
2.3. Complexity of HEVC Interpolation Filter .......... 23
3. Weighted Sample Prediction .......... 24
EXPERIMENTAL STUDY .......... 24
1. Coding/Access Settings .......... 25
1.1. Low Delay Configuration Files .......... 26
1.1.1. TZ Search Algorithm .......... 26
1.2. Random Access Configuration Files .......... 29
2. Conclusion .......... 30
References .......... 33
  • 3. BLOCK STRUCTURES in HEVC

1.) Block Partitioning for Prediction and Transform Coding:
One significant difference between the generations of video coding standards is the set of coding modes they provide for a block of samples. On the one hand, the selected coding mode determines whether the block of samples is predicted using intra-picture or inter-picture prediction. On the other hand, it can also determine the subdivision of a given block into subblocks used for prediction and/or transform coding. Blocks used for prediction are typically associated with additional prediction parameters, such as motion vectors or intra prediction modes.

2.) Coding Tree Blocks and Coding Tree Units:
In prior standards such as H.264/MPEG-4 AVC, each picture of a video sequence is partitioned into so-called macroblocks. A macroblock consists of a 16x16 block of luma samples and, in the 4:2:0 chroma sampling format, two associated 8x8 blocks of chroma samples (one for each chroma component). The macroblock is the basic processing unit in these standards: for each macroblock of a picture, the encoder has to select a coding mode, and the chosen mode determines whether all samples of the macroblock are predicted using intra-picture prediction or motion-compensated prediction.

Why was HEVC invented? Due to the popularity of HD video and the growing interest in Ultra HD (UHD) formats with resolutions of, for example, 3840x2160 or even 7680x4320 luma samples, HEVC has been designed with a focus on high-resolution video. Its main goal was an improved coding efficiency relative to its predecessor, H.264/MPEG-4 AVC, for all existing video coding applications.
While increasing the largest supported block size is advantageous for high-resolution video, it may have a negative impact on coding efficiency for low-resolution video, in particular if low-complexity encoder implementations are used that are not capable of evaluating all supported subpartitioning modes. For this reason, HEVC includes a flexible mechanism for partitioning video pictures into basic processing units of variable sizes. In effect, this replaces the macroblocks and blocks of the prior standards. Unlike 10 years ago, we now have much larger frame sizes to deal with: we need larger blocks to efficiently encode the
  • 4. motion vectors for these frame sizes. On the other hand, small detail is still important, and we sometimes want to perform prediction and transformation at the granularity of 4×4.

3.) Coding Trees, Coding Blocks, and Coding Units:
HEVC divides each picture into CTUs (Coding Tree Units). The CTU width and height are signaled in the sequence parameter set, meaning that all CTUs in a video sequence have the same size. In the HEVC standard, a name of the form xxxUnit indicates a logical coding unit that is in turn encoded into the HEVC bitstream, while a name of the form xxxBlock indicates the portion of the video frame buffer that a process targets. The CTU is therefore a logical unit: it usually consists of three blocks, namely one luma (Y) and two chroma (Cb and Cr) sample blocks, plus the associated syntax elements. Each such block is called a CTB (Coding Tree Block). In the Main profile, the SPS specifies the CTB size and the minimum coding block size: the luma CTB can be 16×16, 32×32, or 64×64 samples, while coding blocks can be as small as 8×8.

A direct application of the H.264/MPEG-4 AVC macroblock syntax to the coding tree units in HEVC would cause some problems. On the one hand, choosing between intra-picture and motion-compensated prediction for large blocks is unfavorable in the rate-distortion sense. The decision
  • 5. between intra-picture and inter-picture coding at units smaller than a 16x16 macroblock can increase the coding efficiency. In HEVC, each picture is partitioned into square-shaped coding tree blocks (CTBs) such that the resulting number of CTBs is identical for both the luma and chroma picture components. Each CTB of luma samples, together with its two corresponding CTBs of chroma samples and the syntax associated with these sample blocks, is subsumed under a so-called coding tree unit (CTU). The main advantage of introducing the coding tree structure is that it yields an elegant and unified syntax for specifying the partitioning of CTUs into blocks used for intra-picture prediction, motion-compensated prediction, and transform coding.

[Figure: Example of the partitioning of a 64x64 coding tree unit (CTU) into coding units (CUs) of 8x8 to 32x32 luma samples. The partitioning can be described by a quadtree, also referred to as a coding tree, shown on the right; the numbers indicate the coding order of the CUs.]

3.1. Coding units/blocks (CU/CB):
Each CTB has the same size as the CTU: 64×64, 32×32, or 16×16. Depending on the part of the video frame, however, a CTB may be too big for deciding whether to perform inter-picture or intra-picture prediction. Thus, each CTB can be split differently into multiple CBs (Coding Blocks)
  • 6. and each CB becomes the decision-making point for the prediction type. For example, some CTBs are split into 16×16 CBs while others are split into 8×8 CBs. HEVC supports CB sizes all the way from the CTB size down to 8×8. Let the CTU size be 2N×2N, where N is 32, 16, or 8. The CTU can be a single CU or can be split into four smaller units of equal size N×N, which are nodes of the coding tree. If a unit is a leaf node of the coding tree, it becomes a CU; otherwise, it can be split again into four smaller units, as long as the split size is equal to or larger than the minimum CU size specified in the SPS. The quadtree splitting process can be iterated until the luma CB reaches the minimum allowed luma CB size, which is selected by the encoder using syntax in the SPS and is always 8×8 or larger (in units of luma samples). The CB is the decision point between inter-picture and intra-picture prediction. More precisely, the prediction type is coded in the CU (Coding Unit): a CU consists of three CBs (Y, Cb, and Cr) and the associated syntax elements.
  • 7. A CB is good enough for the prediction-type decision, but it could still be too large to store motion vectors (inter prediction) or an intra prediction mode. For example, a very small object like snowfall may be moving in the middle of an 8×8 CB, and we may want different MVs for different portions of the CB.

Some properties:

Recursive partitioning from the CTU: As described above, a CTU of size 2N×2N is either a single CU or is split into four N×N nodes of the coding tree, each of which can be split again as long as the result is not smaller than the minimum CU size specified in the SPS. This representation results in a recursive structure specified by a coding tree.

Benefits of the flexible CU partitioning structure: The first benefit comes from the support of CU sizes greater than the conventional 16×16 size. When a region is homogeneous, a large CU can represent the region using a smaller number of symbols than several small blocks would need. Adding large CU sizes is an effective means of increasing coding efficiency for higher-resolution content. The CTU is one of the strong points of HEVC in terms of coding efficiency and adaptability to content and applications, and this flexibility is also useful for low-resolution video services, which are still commonly used in the market. By choosing an appropriate CTU size and maximum hierarchical depth, the hierarchical block partitioning structure can be optimized for the target application. The splitting process of the coding tree can be specified recursively, and all other syntax elements can be represented in the same way regardless of the CU size. This kind of recursive representation is very useful for reducing parsing complexity and improving clarity when the quadtree depth is large.
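The recursive CU splitting described above can be sketched as follows. This is a simplified illustration with hypothetical function names, not the normative decoding process: split decisions are consumed in z-scan traversal order (in the real bitstream they are CABAC-coded split_cu_flag values), and a node becomes a CU when its split flag is 0 or the minimum CU size is reached.

```python
def parse_coding_tree(x, y, size, min_cu_size, split_flags, cus):
    """Recursively partition a CTU region into CUs, quadtree style.

    split_flags is a list of 0/1 decisions consumed in traversal order.
    """
    if size > min_cu_size and split_flags.pop(0):
        half = size // 2
        # Four equal-size child nodes, visited in z-scan order.
        for dy in (0, half):
            for dx in (0, half):
                parse_coding_tree(x + dx, y + dy, half,
                                  min_cu_size, split_flags, cus)
    else:
        cus.append((x, y, size))  # leaf node -> one coding unit

cus = []
# 64x64 CTU: split once, then split only the first 32x32 quadrant again.
parse_coding_tree(0, 0, 64, 8, [1, 1, 0, 0, 0, 0, 0, 0, 0], cus)
print(cus)  # four 16x16 CUs followed by three 32x32 CUs, in coding order
```

The list of resulting CUs comes out in exactly the coding order shown in the quadtree figure, which is the same order in which an encoder or decoder would process them.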
3.2. Prediction units/blocks (PU/PB):
We may want to use different MVs for different portions of a CB; this is why the PB was introduced. Each CB can be split into PBs differently depending on the temporal and/or spatial predictability. For each CU, a prediction mode is signaled in the bitstream; it indicates whether the CU is coded using intra-picture prediction or motion-compensated prediction. One or more PUs are specified for each CU, which is a leaf node of the coding tree. Coupled with the CU, the PU works as a basic representative block for sharing the prediction
  • 8. information. Inside one PU, the same prediction process is applied, and the relevant information is transmitted to the decoder on a PU basis. A CU can be split into one, two, or four PUs according to the PU splitting type. HEVC defines two splitting shapes for an intra coded CU and eight splitting shapes for an inter coded CU. Unlike the CU, the PU may only be split once.

PU splitting type: Similar to prior standards, each CU in HEVC can be classified into three categories: skipped CU, inter coded CU, and intra coded CU. An inter coded CU uses a motion compensation scheme for the prediction of the current block, while an intra coded CU uses neighboring reconstructed samples for the prediction. A skipped CU is a special form of inter coded CU in which both the motion vector difference and the residual energy are zero. For each category, the PU splitting type is specified differently, as shown in Fig. 5 for a CU of size 2N×2N. As shown in the figure, only the PART−2N×2N splitting type is allowed for a skipped CU. For an intra coded CU, two PU splitting types, PART−2N×2N and PART−N×N, are supported. Finally, a total of eight PU splitting types are defined for an inter coded CU: two square shapes (PART−2N×2N, PART−N×N), two rectangular shapes (PART−2N×N, PART−N×2N), and four asymmetric shapes (PART−2N×nU, PART−2N×nD, PART−nL×2N, PART−nR×2N).
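The eight inter PU splitting types can be described purely by the sub-block geometry they produce. The sketch below uses a hypothetical helper; the asymmetric modes take one quarter of the CU size (n = N/2) as the short side:

```python
def pu_partitions(size, mode):
    """Return (x, y, w, h) PU rectangles for a size x size CU.

    Geometry of the eight HEVC inter PU splitting types; for the
    asymmetric modes the short side is one quarter of the CU size.
    """
    s, h, q = size, size // 2, size // 4
    modes = {
        "PART_2Nx2N": [(0, 0, s, s)],
        "PART_NxN":   [(0, 0, h, h), (h, 0, h, h), (0, h, h, h), (h, h, h, h)],
        "PART_2NxN":  [(0, 0, s, h), (0, h, s, h)],
        "PART_Nx2N":  [(0, 0, h, s), (h, 0, h, s)],
        "PART_2NxnU": [(0, 0, s, q), (0, q, s, s - q)],
        "PART_2NxnD": [(0, 0, s, s - q), (0, s - q, s, q)],
        "PART_nLx2N": [(0, 0, q, s), (q, 0, s - q, s)],
        "PART_nRx2N": [(0, 0, s - q, s), (s - q, 0, q, s)],
    }
    return modes[mode]

# A 32x32 CU with PART_2NxnU: a 32x8 PU on top of a 32x24 PU.
print(pu_partitions(32, "PART_2NxnU"))  # [(0, 0, 32, 8), (0, 8, 32, 24)]
```

For every mode the rectangles tile the CU exactly, which is why the prediction information can be signaled per PU without any gaps or overlaps.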
  • 9. Constraints according to CU size: In PART−N×N, a CU is split into four equal-size PUs, which is conceptually similar to the case of four equal-size CUs when the CU size is not the minimum CU size. Thus, HEVC disallows PART−N×N except when the CU size equals the minimum CU size. It was observed that this design choice reduces encoding complexity significantly with only a marginal coding-efficiency loss. To reduce the worst-case complexity, HEVC further restricts the use of PART−N×N and the asymmetric shapes: for an inter coded CU, PART−N×N is disabled when the CU size is 8×8, and asymmetric shapes are only allowed when the CU size is not the minimum CU size. HEVC thus supports eight different modes for partitioning a CU into PUs, as shown in the figure below.

[Figure: Supported partitioning modes for splitting a coding unit (CU) into one, two, or four prediction units (PUs). The (M/2)×(M/2) mode and the modes shown in the bottom row are not supported for all CU sizes.]

3.3. Transform units/blocks (TU/TB):
Once the prediction is made, we need to code the residual (the difference between the predicted image and the actual image) with a DCT-like transform. Again, a CB could be too big for this, because a CB may contain both a detailed part (high frequency) and a flat part (low frequency). Therefore, each CB can be split differently into TBs (Transform Blocks). Note that a TB does not have to be aligned with a PB; it is possible, and often makes sense, to perform a single transform across residuals from multiple PBs.
  • 10. 10 Similar to the PU, one or more TUs are specified for the CU. HEVC allows a residual block to be split recursively into multiple units, forming another quadtree analogous to the coding tree for the CU. The TU is the basic representative block of residual or transform coefficients to which the integer transform and quantization are applied. For each TU, one integer transform of the same size as the TU is applied to obtain residual coefficients. These coefficients are transmitted to the decoder after quantization on a TU basis. Residual Quadtree: After the residual block is obtained by the prediction process based on the PU splitting type, it is split into multiple TUs according to a quadtree structure. For each TU, an integer transform is applied. The tree is called the transform tree or residual quadtree (RQT), since the residual block is partitioned by a quadtree structure and a transform is applied to each leaf node of the quadtree. Nonsquare Partitioning: A square RQT (SRQT) is constructed when the PU splitting type is a square shape, while a nonsquare RQT (NSRQT) is used for the rectangular and asymmetric shapes. For the NSRQT, the transform shape is horizontal when the chosen partition mode is a horizontal type such as PART−2N×N, PART−2N×nU, and PART−2N×nD. The same rule applies to the vertical-type case, i.e., PART−N×2N, PART−nL×2N, and PART−nR×2N. Although the syntax of the SRQT and the NSRQT is the same, the shapes of the TUs at each transform tree depth are defined differently. Fig. 6 illustrates an example of a transform tree and the corresponding TU splitting. Fig. 6(a) represents the transform tree. Fig. 6(b) shows the TU splitting when the PU shape is square. Fig. 6(c) shows the TU splitting when the PU shape is rectangular or asymmetric. Although they share the same transform tree, the actual TU splitting differs depending on the PU splitting type.
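The recursive nature of the RQT can be illustrated with a short sketch. The split decision is abstracted into a caller-supplied predicate, since in the real codec it comes from the decoded split flags; all names here are illustrative:

```python
def rqt_leaves(x, y, size, split, min_tb=4):
    """Yield (x, y, size) of leaf TUs for a square residual quadtree.

    `split(x, y, size)` stands in for the decoded split flag of a node;
    recursion stops at the minimum transform block size `min_tb`.
    """
    if size > min_tb and split(x, y, size):
        h = size // 2
        for dy in (0, h):
            for dx in (0, h):
                yield from rqt_leaves(x + dx, y + dy, h, split, min_tb)
    else:
        yield (x, y, size)

# split every node larger than 16 -> four uniform 16x16 TUs over a 32x32 CU
tus = list(rqt_leaves(0, 0, 32, lambda x, y, s: s > 16))
print(len(tus))  # 4
```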
  • 11. 11 Transform Across Boundary: In HEVC, both the PU size and the TU size can reach the size of the corresponding CU. As a result, the TU may be larger than the PUs in the same CU, i.e., residuals from different PUs in the same CU can be transformed together. For example, when the TU size is equal to the CU size, the transform is applied to the residual block covering the whole CU regardless of the PU splitting type. Note that this case exists only for the inter coded CU, since the prediction is always coupled with the TU splitting for the intra coded CU. Maximum Depth of Transform Tree: The maximum depth of the transform tree is closely related to the encoding complexity. To provide flexibility here, HEVC specifies two syntax elements in the SPS which control the maximum depth of the transform tree for the intra coded CU and the inter coded CU, respectively. The case where the maximum depth of the transform tree is equal to 1 is denoted implicit TU splitting, since there is no need to transmit any information on whether the TU is split. In this case, the transform size is automatically adjusted to fit inside the PU rather than allowing a transform across the boundary. INTER-PICTURE PREDICTION IN HEVC Hybrid video coding is known to be a combination of video sample prediction and transformation of the prediction error. While intra-picture prediction exploits the correlation between spatially neighboring samples, inter-picture prediction makes use of the temporal correlation between pictures in order to derive a motion-compensated prediction (MCP) for a block of image samples. For this block-based MCP, a video picture is divided into rectangular blocks. Assuming homogeneous motion inside one block and that moving objects are larger than one block, for each block a corresponding block in a previously decoded picture can be found that serves as a predictor.
  • 12. 12 1. Motion Data Coding 1.1. Advanced Motion Vector Prediction: In HEVC, motion vectors are coded in terms of their horizontal (x) and vertical (y) components as a difference to a so-called motion vector predictor (MVP). Why are neighboring blocks important here? Because the motion vectors of the current block are usually correlated with the motion vectors of neighboring blocks in the current picture or in earlier coded pictures. This is because neighboring blocks are likely to correspond to the same moving object with similar motion, and the motion of the object is not likely to change abruptly over time. Consequently, using the motion vectors of neighboring blocks as predictors reduces the size of the signaled motion vector difference. The MVPs are usually derived from already decoded motion vectors of spatially neighboring blocks or of temporally neighboring blocks in the co-located picture. H.264/AVC and HEVC differ in an important way here: the way MVPs are derived is different. ➢ In H.264/AVC => component-wise median of three spatially neighboring motion vectors ➢ In HEVC => motion vector competition Motion vector competition explicitly signals which MVP from a list of MVPs is used for motion vector derivation. The variable coding quadtree block structure in HEVC can result in one block having several neighboring blocks with motion vectors as potential MVP candidates. 1.1.1. AMVP Candidate List Construction: The initial design of AMVP included five MVPs from three different classes of predictors: three motion vectors from spatial neighbors, the median of the three spatial predictors, and a scaled motion vector from a co-located, temporally neighboring block. Furthermore, the list of predictors was modified by reordering, to place the most probable motion predictor in the first position, and by removing redundant candidates, to assure minimal signaling overhead.
  • 13. 13 This led to significant simplifications of the AMVP design, such as removing the median predictor, reducing the number of candidates in the list from five to two, fixing the candidate order in the list, and reducing the number of redundancy checks. The AMVP candidate list construction (final design) includes: • Two spatial candidate MVPs that are derived from five spatially neighboring blocks. • One temporal candidate MVP derived from two temporal, co-located blocks when the spatial candidate MVPs are not both available or when they are identical. • Zero motion vectors when the spatial, the temporal, or both candidates are not available. 1.1.1.a. Spatial Candidates: Candidates A and B are derived from five spatially neighboring blocks. How does this work? For candidate A, motion data from the two blocks A0 and A1 at the bottom-left corner is taken into account in a two-pass approach. In the first pass, it is checked whether any of the candidate blocks contains a reference index that is equal to the reference index of the current block. The first motion vector found is taken as candidate A. When all reference indices from A0 and A1 point to a different reference picture than the reference index of the current block, the associated motion vectors cannot be used as is. Therefore, in a second pass, the motion vectors need to be scaled according to the temporal distances between the candidate reference picture and the current reference picture. Tricky point: only motion vectors from spatially neighboring blocks to the left of and above the current block are considered as spatial MVP candidates. This is explained by the fact that the blocks to the right of and below the current block are not yet decoded and hence their motion data is not available. Keep this in mind, because we will return to it in the ‘temporal candidates’ part.
  • 14. 14 The mathematical expression for this scaling operation (reconstructed from the definitions below) is mv_scaled = mv_candidate × tb / td, where td -> temporal distance between the current picture and the reference picture of the candidate block tb -> temporal distance between the current picture and the reference picture of the current block For candidate B, the candidates B0 to B2 are checked sequentially in the same way as A0 and A1 are checked in the first pass. The second pass, however, is only performed when blocks A0 and A1 do not contain any motion information. Then, candidate A is set equal to the non-scaled candidate B, if found, and candidate B is set equal to a second, non-scaled or scaled variant of candidate B. Since the second pass may also be entered when there are still potential non-scaled candidates, it searches for non-scaled as well as scaled MVs derived from candidates B0 to B2. Overall, this design allows A0 and A1 to be processed independently of B0, B1, and B2. The derivation of B only needs to be aware of the availability of both A0 and A1 in order to search for a scaled or an additional non-scaled MV derived from B0 to B2. This dependency is acceptable given that it significantly reduces the number of complex motion vector scaling operations for candidate B. Reducing the number of motion vector scalings represents a significant complexity reduction in the motion vector predictor derivation process. 1.1.1.b. Temporal Candidates: Since the co-located picture is a reference picture which is already decoded, it is possible to also consider motion data from the block at the same position, from blocks to the right of the co-located block, or from the blocks below. In HEVC, the block to the bottom right of and the block at the center of the current block have been determined to be the most suitable to provide a good temporal motion vector predictor (TMVP). C0 represents the bottom-right neighbor and C1 represents the center block
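The scaling of a spatial candidate MV by the ratio of temporal distances can be sketched as follows. This is a simplified, illustrative version; the HEVC specification uses a particular fixed-point formulation with clipping of the scale factor, which is not reproduced here. Positive tb and td are assumed:

```python
def scale_mv(mv, tb, td):
    """Scale a candidate MV (x, y) by the distance ratio tb/td (tb, td > 0).

    Simplified sketch of the idea only; the normative process uses a
    specific fixed-point formulation with clipping.
    """
    def s(c):
        mag = (abs(c) * tb + td // 2) // td  # round magnitude to nearest
        return mag if c >= 0 else -mag
    return (s(mv[0]), s(mv[1]))

# candidate points twice as far back as the current reference -> halve it
print(scale_mv((8, -4), tb=1, td=2))  # (4, -2)
```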
  • 15. 15 The motion data of C0 is considered first and, if it is not available, motion data from the co-located candidate block at the center is used to derive the temporal MVP candidate C. The motion data of C0 is also considered unavailable when the associated PU belongs to a CTU beyond the current CTU row. This minimizes the memory bandwidth requirements for storing the co-located motion data. In contrast to the spatial MVP candidates, where the motion vectors may refer to the same reference picture, motion vector scaling is mandatory for the TMVP. H.264 and H.265 difference in temporal candidates: While the temporal direct mode in H.264/AVC always refers to the first reference picture in the second reference picture list, list 1, and is only allowed in bi-predictive slices, HEVC offers the possibility to indicate for each picture which reference picture is considered as the co-located picture. This is done by signaling the co-located reference picture list and reference picture index in the slice header, and by requiring that these syntax elements in all slices of a picture specify the same reference picture. 1.2. Inter-picture Prediction Block Merging: HEVC uses a quadtree structure to describe the partitioning of a region into subblocks. In terms of bit rate, this is a very low-cost structure, while at the same time it allows partitioning into a wide range of differently sized sub-blocks. While this simplicity is an advantage, for example for encoder design, it also bears the disadvantage of over-segmenting the image, potentially leading to redundant signaling and ineffective borders. The main motivation for the use of block-based partitioning in image or video compression is to code each block with one specific choice out of a set of simple models since, in general, a single model cannot be expected to capture the properties of a whole image efficiently.
The quadtree data structure allows partitioning of an image into blocks of variable size, and is thus a suitable framework for optimizing the tradeoff between model accuracy and model coding cost. However, the concept of quadtree-based image and video coding is intrinsically tied to certain drawbacks. For instance, the systematic decomposition of every block into four child blocks does not allow child blocks belonging to two different parent blocks to be represented jointly. Also, if a given block is divided into four child blocks, all child blocks are typically coded separately, even if two or three of them share the same coding parameters. Thus, the suboptimal RD properties of an initial quadtree-based subdivision can be substantially improved by joining (or merging) nodes belonging to potentially different parents.
  • 16. 16 (a) Details of a picture in the Cactus sequence containing a moving object in the foreground, with motion indicated by the arrow. (b) Quadtree partitioning for motion parameters, indicated by the white squares. (c) Only those block borders are shown that separate blocks with different motion parameters. Figure (b) reveals that an RD-optimized tree pruning forces the quadtree segmentation to subdivide regions unnecessarily. It was observed over a range of sequences and coding conditions for HEVC that the blocks neighbored by a previously coded block with exactly the same motion parameters covered approximately 40% of all samples on average. Figure (c) shows the same partitioning as in (b), but with the boundaries between blocks with identical motion parameters removed. This reveals regions of identical motion parameters, better capturing the distinct types of motion in the scene. Ideally, the boundary of each region should coincide with the motion discontinuities in the given video signal. Using quadtree-based block partitioning combined with merging, the picture can be subdivided into smaller and smaller blocks, thereby approximating the motion boundary and minimizing the size of the partitions containing the boundary. Subsequently, each created quadtree leaf block can be merged into a region on either side of the boundary. Merging operation: All the PBs coded before the current PB along the block scanning pattern form the set of causal PBs. The block scanning pattern for the CTUs (the tree roots) is raster scan, and within a CTU, the CUs (leaves) and the associated PBs are processed in depth-first order along a z-scan. Hence, the PBs in the striped area of the figure are successors (anticausal blocks) of PB X and do not yet have associated prediction data.
  • 17. 17 Merging figure In any case, the current PB X can be merged with only one of the causal PBs. When X is merged with one of these causal PBs, the PU of X copies and uses the same motion information as the PU of that particular block. Out of the set of causal PBs, only a subset is used to build a list of merge candidates. In order to identify the candidate to merge with, an index into the list is signaled to the decoder for block X. Relationship between merging neighboring blocks (H.265) and the spatial direct mode (H.264): It should be noted that a scheme merging spatially neighboring blocks is conceptually similar to spatial prediction modes such as the spatial direct mode in H.264/AVC. That mode also tries to reduce the coding cost by exploiting redundancies in the motion parameters of neighboring blocks. 1.2.1. Merge Candidate List Construction: Although block merging and AMVP appear similar, there is one main difference. The AMVP list only contains motion vectors for one reference list, while a merge candidate contains all the motion data, including the information on whether one or two reference picture lists are used as well as a reference index and a motion vector for each list. This significantly reduces the motion data signaling overhead. The output of this process is a list of merge candidates, each of them being a tuple of motion parameters which can be used for motion-compensated prediction (MCP) of a block: the information on whether one or two motion-compensated hypotheses are used for prediction, and, for each hypothesis, an MV and a reference picture index. The number of candidates in this list is controlled by NumMergeCands, which is signaled in the slice header. The process of adding candidates stops as soon as the number of candidates reaches NumMergeCands during the following ordered steps: the process starts with the derivation of the initial candidates from spatially neighboring PBs, called spatial candidates.
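The difference in payload between an AMVP candidate (one motion vector) and a merge candidate (the complete motion data) can be made concrete with a small data structure. The field names here are illustrative, not taken from the standard:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]

@dataclass
class MergeCandidate:
    """Full motion data carried by one merge candidate (illustrative names)."""
    mv_l0: Optional[MV]   # motion vector for reference list 0 (None if unused)
    ref_idx_l0: int       # reference index for list 0 (-1 if unused)
    mv_l1: Optional[MV]   # motion vector for reference list 1 (None if unused)
    ref_idx_l1: int       # reference index for list 1 (-1 if unused)

    @property
    def bi_predicted(self) -> bool:
        # two hypotheses are used only when both lists carry motion data
        return self.mv_l0 is not None and self.mv_l1 is not None

c = MergeCandidate((3, -1), 0, None, -1)
print(c.bi_predicted)  # False
```

An AMVP candidate, by contrast, would be just one `MV` value; everything else is signaled separately.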
Then, a candidate from a PB in a temporally co-located picture can be included, which is called the temporal candidate. Due to the unavailability of certain candidates, or because some of them are redundant, the number of initial candidates can be less than NumMergeCands. In this case, additional “virtual” candidates are inserted into the list so that the number of candidates in the list is always equal to NumMergeCands.
  • 18. 18 Initial Candidates: The initial candidates are obtained by first analyzing the spatial neighbors of the current PB (A1, B1, B0, A0, and B2 --> the same positions as in AMVP). A candidate is only available to be added to the list when the corresponding PU is inter-picture predicted and does not lie outside of the current slice. Moreover, a position-specific redundancy check is performed so that redundant candidates are excluded from the list to improve the coding efficiency. These redundancy checks are performed prior to adding a candidate to the list; for each candidate to check, outgoing arrows point to a candidate for comparison. First, the candidate at position A1 is added to the list if available. Then, the candidates at positions B1, B0, A0, and B2 are processed in that order. Each of these candidates is compared to one or two candidates at previously tested positions and is added to the list only if these comparisons show a difference. These redundancy checks are depicted in the figure by an arrow each. Note that instead of performing a comparison for every candidate pair, this process is constrained to a subset of comparisons to reduce the complexity while maintaining the coding efficiency. Major steps for the construction of the block merging candidate list in HEVC. The merge candidate list construction includes: • Four spatial merge candidates that are derived from five spatially neighboring blocks • One temporal merge candidate derived from two temporal, co-located blocks
  • 19. 19 • Additional merge candidates, including combined bi-predictive candidates and zero motion vector candidates 1.2.1.a. Spatial Candidates: The first candidates in the merge candidate list are the spatial neighbors. In order to derive the list of motion vector predictors for AMVP, one MVP is derived from A0 and A1 and one from B0, B1, and B2, respectively, in that order. For inter-prediction block merging, however, up to four candidates are inserted in the merge list by sequentially checking A1, B1, B0, A0, and B2, in that order. Instead of just checking whether a neighboring block is available and contains motion information, some additional redundancy checks are performed before all the motion data of the neighboring block is taken as a merge candidate. The purposes of these redundancy checks are to: • avoid having candidates with redundant motion data in the list • prevent merging two partitions that could be expressed by other means, which would create redundant syntax During the development of HEVC, the checks for redundant motion data were reduced to a subset such that the coding efficiency is kept while the comparison logic is significantly reduced. In the final design, no more than two comparisons are performed per candidate, resulting in five comparisons overall. Given the order {A1, B1, B0, A0, B2}, B0 is only checked against B1, A0 only against A1, and B2 only against A1 and B1. 1.2.1.b. Temporal Candidates: Since a merge candidate comprises all the motion data and the TMVP is only one motion vector, the derivation of the whole motion data depends only on the slice type. For bi-predictive slices, a TMVP is derived for each reference picture list. Depending on the availability of the TMVP for each list, the prediction type is set to bi-prediction or to the list for which the TMVP is available. All associated reference picture indices are set equal to zero. Consequently, for uni-predictive slices, only the TMVP for list 0 is derived, together with a reference picture index equal to zero.
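The position-specific comparison pattern described above (five comparisons in total) can be sketched as follows. This is an illustrative simplification: each position's motion data is reduced to a single comparable value, and availability checks beyond presence in the input dictionary are omitted:

```python
# which previously tested position(s) each spatial merge candidate is
# compared against, per the text: B1 vs A1, B0 vs B1, A0 vs A1, B2 vs A1/B1
COMPARE_WITH = {
    "A1": [],
    "B1": ["A1"],
    "B0": ["B1"],
    "A0": ["A1"],
    "B2": ["A1", "B1"],
}

def spatial_merge_candidates(motion):
    """motion: dict position -> motion data (None / missing = unavailable)."""
    out = []
    for pos in ["A1", "B1", "B0", "A0", "B2"]:
        m = motion.get(pos)
        if m is None:
            continue
        if any(motion.get(p) == m for p in COMPARE_WITH[pos]):
            continue  # redundant with a previously tested position
        out.append(m)
        if len(out) == 4:  # at most four spatial merge candidates
            break
    return out

print(spatial_merge_candidates({"A1": 1, "B1": 1, "B0": 2, "A0": 3, "B2": 4}))
```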
When at least one TMVP is available and the temporal merge candidate is added to the list, no redundancy check is performed. This makes the merge list construction independent of the co-located picture, which improves error resilience. Consider the case where the temporal merge candidate would be redundant and therefore not included in the merge candidate list. In the event of a lost co-located picture, the decoder could not derive the temporal candidate and hence could not check whether it would be redundant. The indexing of all subsequent candidates would then be affected. 1.2.1.c. Additional Candidates: After the spatial and the temporal merge candidates have been added, it can happen that the list has not yet reached its fixed length. In order to compensate for the coding efficiency loss
  • 20. 20 that comes along with the non-length-adaptive list index signaling, additional candidates are generated. Depending on the slice type, two kinds of candidates are used to fully populate the list: • Combined bi-predictive candidates • Zero motion vector candidates In bi-predictive slices, additional candidates can be generated from the existing ones by combining the reference picture list 0 motion data of one candidate with the list 1 motion data of another. When the list is still not full after adding the combined bi-predictive candidates, or for uni-predictive slices, zero motion vector candidates are calculated to complete the list. All zero motion vector candidates have one zero-displacement motion vector for uni-predictive slices and two for bi-predictive slices. 1.2.2. Motion Data Storage Reduction (MDSR): The usage of the TMVP, in AMVP as well as in the merge mode, requires the storage of the motion data (including motion vectors, reference indices, and coding modes) of co-located reference pictures. Considering the granularity of the motion representation, the memory size needed for storing motion data can be significant. HEVC employs motion data storage reduction (MDSR) to reduce the size of the motion data buffer and the associated memory access bandwidth by sub-sampling the motion data in the reference pictures. H.264 and H.265 difference: While H.264/AVC stores this information on a 4×4 block basis, HEVC uses a 16×16 block where, sub-sampling the 4×4 grid, the information of the top-left 4×4 block is stored. Due to this sub-sampling, MDSR impacts the quality of the temporal prediction. Furthermore, there is a tight correlation between the position of the MV used in the co-located picture and the position of the MV stored by MDSR. The chosen TMVP candidate positions provide the best trade-off between coding efficiency and memory bandwidth reduction. 2.
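The list padding described above can be sketched as follows. Candidates are modeled as dictionaries with per-list motion data, which is an assumption for illustration only; the real process also handles reference indices and a defined combination order:

```python
def pad_merge_list(cands, num_merge_cands, bi_slice):
    """Fill the merge list up to NumMergeCands (sketch with assumed names)."""
    if bi_slice:
        # combine list-0 data of one candidate with list-1 data of another
        for i, a in enumerate(list(cands)):
            for j, b in enumerate(list(cands)):
                if len(cands) >= num_merge_cands:
                    return cands
                if i != j and a["l0"] is not None and b["l1"] is not None:
                    cands.append({"l0": a["l0"], "l1": b["l1"]})
    # zero-MV candidates: one hypothesis for uni-, two for bi-predictive slices
    zero = {"l0": (0, 0), "l1": (0, 0) if bi_slice else None}
    while len(cands) < num_merge_cands:
        cands.append(dict(zero))
    return cands

lst = pad_merge_list([{"l0": (1, 1), "l1": None},
                      {"l0": None, "l1": (2, 2)}], 5, bi_slice=True)
print(len(lst))  # 5
```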
Fractional Sample Interpolation Interpolation tasks arise naturally in the context of video coding because the true displacements of objects from one picture to another are independent of the sampling grid of cameras. Therefore, in MCP, fractional-sample accuracy is used to more accurately capture continuous motion. Samples available at integer positions are filtered to estimate values at fractional positions. This spatial-domain operation can be seen in the frequency domain as introducing phase delays to individual frequency components. An ideal interpolation filter for band-limited signals induces a constant phase delay for all frequencies and does not alter their magnitudes. The efficiency of MCP is limited by many factors, including the
  • 21. 21 spectral content of the original and already reconstructed pictures, the camera noise level, motion blur, quantization noise in the reconstructed pictures, etc. Similar to H.264/AVC, HEVC supports motion vectors with quarter-sample accuracy for the luma component and one-eighth-sample accuracy for the chroma components. If the motion vector has half- or quarter-sample accuracy, samples at fractional positions need to be interpolated from the samples at integer-sample positions. The interpolation process in HEVC introduces several improvements over H.264/AVC that contribute to the significant coding efficiency increase of HEVC. In order to improve the filter response in the high-frequency range, the luma and chroma interpolation filters were redesigned and the tap lengths were increased. The luma interpolation process in HEVC uses a symmetric 8-tap filter for half-sample positions and an asymmetric 7-tap filter for quarter-sample positions. For chroma samples, a 4-tap filter was introduced. The intermediate values used in the interpolation process are kept at a higher accuracy in HEVC to improve coding efficiency. H.264 and H.265 difference: ➢ H.264/AVC obtains the quarter-sample values by first obtaining the values of the nearest half-sample positions and averaging those with the nearest integer samples, according to the position of the quarter sample. HEVC, however, obtains the quarter-sample values without such cascaded steps, by instead directly applying a 7- or 8-tap filter to the integer samples. ➢ In H.264/AVC, a bi-predictively coded block is calculated by averaging two uni-predicted blocks. If interpolation is performed to obtain the samples of the uni-prediction blocks, those samples are shifted and clipped to the input bit-depth after interpolation, prior to averaging.
HEVC, on the other hand, keeps the samples of each of the uni-prediction blocks at a higher accuracy and only performs rounding to the input bit-depth at the final stage, improving the coding efficiency by reducing the rounding error. 2.1. Redesigned Filters: An important parameter for interpolation filters is the number of filter taps, as it has a direct influence on both coding efficiency and implementation complexity. In terms of implementation, it has an impact not only on the arithmetic operations but also on the memory bandwidth required to access the reference samples. Increasing the number of taps can yield filters that produce the desired response for a larger range of frequencies, which can help to predict the corresponding frequencies in the samples to be coded. Considering modern computing capabilities, the performance of many MCP filters was evaluated in the context of HEVC, and a coding efficiency/complexity trade-off was targeted during the standardization.
  • 22. 22 The basic idea is to forward transform the known integer samples to the DCT domain and inverse transform the DCT coefficients to the spatial domain using DCT basis functions sampled at the desired fractional positions instead of the integer positions. These operations can be combined into a single FIR filtering step. The half-pel interpolation filter of HEVC comes closer to the desired response than the H.264/AVC filter. 2.2. High-Precision Filtering Operations: In H.264/AVC, some of the intermediate values used within interpolation are shifted to lower accuracy, which introduces rounding error and reduces coding efficiency. This loss of accuracy has several causes. Firstly, the half-sample values obtained by the 6-tap FIR filter are first rounded to the input bit-depth before being used to obtain the quarter-sample values. Instead of using such a two-stage cascaded filtering process, the HEVC interpolation filter computes the quarter-sample values directly using a 7-tap filter, which significantly reduces the rounding error to 1/128. The second reason for the reduced accuracy of the H.264/AVC motion compensation process is the averaging in bi-prediction. In H.264/AVC, the prediction signal of a bi-predictively coded motion block is obtained by averaging the prediction signals from the two prediction lists.
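As a concrete illustration of these DCT-based interpolation filters, the sketch below applies the HEVC luma half-pel (8-tap) and quarter-pel (7-tap) coefficient sets, which sum to 64 so that a 6-bit right shift restores the sample scale. The helper function and its indexing convention are illustrative; consult the specification for the normative process:

```python
# HEVC luma interpolation filter coefficients (each set sums to 64)
HALF_PEL = [-1, 4, -11, 40, 40, -11, 4, -1]   # symmetric 8-tap
QUARTER_PEL = [-1, 4, -10, 58, 17, -5, 1]     # asymmetric 7-tap

def interp(samples, center, coeffs):
    """Filter integer samples around `center`; the >>6 undoes the x64 gain."""
    off = len(coeffs) // 2
    # even-length filters sit between `center` and `center+1`
    start = center - off + 1 if len(coeffs) % 2 == 0 else center - off
    acc = sum(c * samples[start + k] for k, c in enumerate(coeffs))
    return (acc + 32) >> 6

row = [100] * 16
print(interp(row, 8, HALF_PEL))  # 100 (a flat signal is preserved)
```

On a linear ramp the half-pel filter lands on the midpoint of the two neighboring samples, as an ideal interpolator should.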
  • 23. 23 If the motion vectors have fractional-sample accuracy, then the two list prediction signals S1 and S2 are obtained using interpolation, and the intermediate values are rounded to the input bit-depth before the average (S1 + S2 + 1) >> 1 is taken. In HEVC, instead of averaging the prediction signals at the precision of the bit-depth, they are averaged at a higher precision if fractional motion vectors are used for the corresponding block. In HEVC, the only clipping operation is at the very end of the motion compensation process, with no clipping in the intermediate stages. As there is also no rounding in the intermediate stages, the HEVC interpolation filter allows certain implementation optimizations. Consider the case where bi-prediction is used and the motion vectors of both prediction directions point to the same fractional position. In such cases, the final prediction can be obtained by first adding the two reference signals and then performing interpolation and rounding once, instead of interpolating each reference block, thus saving one interpolation process. 2.3. Complexity of the HEVC Interpolation Filter: When evaluating the complexity of a video coding algorithm, there are several parameters to consider: • Memory bandwidth • Number of operations • Storage buffer size In terms of memory bandwidth: H.264/AVC => 6-tap filter for luma sub-pel positions and a bilinear filter for chroma H.265/HEVC => 7–8-tap filters for luma sub-pel positions and a 4-tap filter for chroma sub-pel positions This difference increases the amount of data that needs to be fetched from the reference memory. The worst case happens when a small motion block is bi-predicted and its corresponding motion vectors point to a sub-pel position where two-dimensional filtering needs to be performed. In order to reduce the worst-case memory bandwidth, HEVC introduces several restrictions. Firstly, the smallest prediction block size is fixed to 4×8 or 8×4, instead of 4×4. In addition, these smallest block sizes of 4×8 and 8×4 can only be predicted with uni-prediction.
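The precision difference in bi-prediction averaging can be shown directly. The sketch below contrasts averaging after rounding (as in H.264/AVC) with averaging at an intermediate precision and rounding once (as in HEVC); using a single `shift` parameter to stand for the intermediate headroom is an illustrative simplification:

```python
def bi_average_h264(s1, s2):
    """Each uni-prediction is already rounded to the input bit-depth."""
    return (s1 + s2 + 1) >> 1

def bi_average_hevc(s1_hp, s2_hp, shift):
    """Inputs kept at `shift` extra bits of precision; round once at the end."""
    return (s1_hp + s2_hp + (1 << shift)) >> (shift + 1)

# with 6 bits of headroom (as left by an unnormalized 64-gain filter):
print(bi_average_hevc(100 << 6, 101 << 6, 6))  # 101
```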
With these restrictions in place, the worst-case memory bandwidth of the HEVC interpolation filter is around 51% higher than that of H.264/AVC. The increase in memory bandwidth is not very high for larger block sizes; for example, for a 32×32 motion block, HEVC requires around 13% more memory bandwidth than H.264/AVC. Similarly, the longer-tap filters increase the number of arithmetic operations required to obtain the interpolated samples. The high-precision bi-directional averaging increases the size of the intermediate storage buffers for the temporary uni-prediction signals, as each of the prediction signals needs to be stored at a higher bit-depth than in H.264/AVC before the bi-directional averaging takes place.
  • 24. 24 HEVC uses a 7-tap FIR filter for interpolating samples at quarter-sample locations, which has an impact on the motion estimation of a video encoder. An H.264/AVC encoder could store only the integer and half-pel samples in memory and generate the quarter-pel samples on the fly during motion estimation. This would be significantly more costly in HEVC because of the complexity of generating each quarter-pel sample on the fly with a 7-tap FIR filter. Instead, an HEVC encoder could store the quarter-pel samples in addition to the integer and half-pel samples and use those in motion estimation. Alternatively, an HEVC encoder could estimate the values of the quarter-pel samples during motion estimation by low-complexity, non-normative means. 3. Weighted Sample Prediction Similar to H.264/AVC, HEVC includes a weighted prediction (WP) tool that is particularly useful for coding sequences with fades. In WP, a multiplicative weighting factor and an additive offset are applied to the motion-compensated prediction. (P) = inter prediction signal (P’) = linearly weighted prediction signal (o) = additive offset (w) = multiplicative weighting factor Mathematical expression => P’ = w × P + o WP has a very small overhead, as the PPS and slice headers contain only the non-default WP scaling values. WP is an optional PPS parameter and may be switched on/off when necessary. The inputs to the WP process are: • The width and the height of the luma prediction block • The prediction samples to be weighted • The prediction list utilization flags • The reference indices for each list • The color component index In H.264/AVC, the weight and offset parameters are either derived from the relative distances between the current picture and the reference pictures (implicit mode) or explicitly signaled (explicit mode). Unlike H.264/AVC, HEVC only includes the explicit mode, as the coding efficiency provided by deriving the weighted prediction parameters with the implicit mode was considered negligible.
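The explicit-mode weighting above (P’ = w × P + o) is sketched below, including a fixed-point form in which the weight is given at a precision of 1/(1<<log2_wd) and the result is clipped to the sample range. The function name and defaults are illustrative, not the normative derivation:

```python
def weighted_pred(p, w, o, log2_wd=0, bit_depth=8):
    """P' = w*P + o, with the weight at 1/(1 << log2_wd) precision."""
    if log2_wd > 0:
        val = ((p * w + (1 << (log2_wd - 1))) >> log2_wd) + o
    else:
        val = p * w + o
    return max(0, min((1 << bit_depth) - 1, val))  # clip to sample range

# a fade-to-black picture predicted from a brighter reference
# (weight 32 at 1/64 precision halves the prediction signal):
print(weighted_pred(200, w=32, o=0, log2_wd=6))  # 100
```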
It should be noted that the weighted prediction process defined in HEVC version 1 was found not to be optimal for higher bit-depths, as the offset parameter is calculated at low precision. The next version of the standard will likely modify the derivation of the offset parameter for higher bit-depths. The determination of appropriate WP parameters in an encoder is outside the scope of the HEVC standard. Several algorithms for estimating WP parameters have been proposed in the
  • 25. 25 literature. Optimal solutions are obtained when the illumination compensation weights, motion estimation, and rate-distortion optimization (RDO) are considered jointly. However, practical systems usually employ simplified techniques, such as determining approximate weights from the picture-to-picture mean variation.

EXPERIMENTAL STUDY

I use the HM 16.8 codec with several different configuration files, and I will also try to show some differences across the codec configuration files. Below is the part of the configuration files that concerns "motion search"; I will show the effect of these parameters in this section.

Motion Search - Encoder Parameters:
• FastSearch — Enables the TZ search, similar to the corresponding search mode in the reference software for H.264/AVC scalable video coding.
• SearchRange — Specifies the search range used for motion estimation.
• BipredSearchRange — Specifies the search range used for bi-prediction refinement in motion estimation.
• HadamardME — Enable (1) or disable (0) the Hadamard transform for fractional motion estimation. Disabling it reduces complexity while giving away a little coding efficiency.
• FEN — Enable (1) or disable (0) fast encoding decisions: use a subsampled SAD (Sum of Absolute Differences) for blocks with more than eight rows during integer motion estimation, and use one iteration instead of four for the bi-predictive search.
• FDM — Enable (1) or disable (0) fast encoder decisions for the 2Nx2N merge mode: stop the merge search if the skip mode is chosen.
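These options live in the "Motion Search" block of the HM encoder configuration files. The fragment below is a sketch of that block with the fast settings used as the reference in this study; the exact default values may differ between the individual HM configuration files.

```
#======== Motion Search =========
FastSearch        : 1   # 0 = Full search, 1 = TZ search
SearchRange       : 64  # Motion estimation search range
BipredSearchRange : 4   # Search range for bi-prediction refinement
HadamardME        : 1   # Hadamard transform for fractional ME
FEN               : 1   # Fast encoder decision
FDM               : 1   # Fast decision for merge RD cost
```

Each experiment below changes exactly one of these values and compares the resulting encoding time against this reference.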
  • 26. 26 1. Coding/Access Settings:
• Intra Only → uses no temporal prediction
• Random Access → allows the use of future frames as references in temporal prediction
• Low Delay → uses only previous frames as references in temporal prediction

I will examine only "Random Access" and "Low Delay" in this section.

1.1. Low Delay Configuration Files:

1) FastSearch: 0 → Full Search, 1 → TZ Search

There are two different search algorithms. At the same quality (the bitrates and PSNR values are nearly identical), there is a huge difference in encoding time: the TZ search algorithm is much faster than the full search algorithm, as can easily be seen above. With the full search, the total time increases to about 2000 percent of the reference (about 20 times longer).

TZ search algorithm: TZ Search is the algorithm used in the motion estimation (ME) process in HEVC. In this section we will discuss it briefly.

A. Motion Vector Prediction: The TZ Search algorithm uses the left, up, and upper-right predictors of the corresponding block to compute the median predictor.
  • 27. 27 Median Calculation → Median(A, B, C) = A + B + C − Min(A, Min(B, C)) − Max(A, Max(B, C))

B. Initial Grid Search: After selecting the start position, the second step is to perform the first search and to determine the search range and the search pattern. The search pattern can be a diamond or a square pattern.

C. Raster Search: When the previous search is completed, the next step is the raster search, which uses the best distance corresponding to the best-matched point from the last search. Depending on this distance, three different cases arise:
Case 1 → If the best distance is equal to zero, the process stops.
Case 2 → If 1 < BestDistance < iRaster, a refinement stage proceeds directly. Here iRaster is a parameter giving the threshold that the distance between the search center and the best point from the first stage should not exceed.
Case 3 → If BestDistance > iRaster (which is set appropriately), a raster scan is performed using the value of iRaster as the stride length.

The raster search is carried out when the difference between the motion vector obtained in the first phase and the starting position is too large; it is performed over the complete search window. The figure below illustrates this scan for a 16x16 search window with iRaster equal to 4. This is why we use such an algorithm for this operation; our motto is "the same work, but in a more efficient way."
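The median calculation and the three raster-search cases above can be sketched as follows. This is an illustrative sketch only; the iRaster default of 5 is an assumption, and HM's actual implementation carries more state than shown here.

```python
def median_mv(a, b, c):
    """Median of three predictor components, as in
    A + B + C - Min(A, Min(B, C)) - Max(A, Max(B, C))."""
    return a + b + c - min(a, min(b, c)) - max(a, max(b, c))

def raster_decision(best_distance, i_raster=5):
    """Pick the next TZ-search stage from the best distance found
    by the initial grid search (the three cases described above)."""
    if best_distance == 0:
        return "stop"        # case 1: start position is already best
    if best_distance < i_raster:
        return "refinement"  # case 2: refine directly around the best point
    return "raster_scan"     # case 3: raster scan with stride i_raster

# Median of the left, up, and upper-right predictors (x components):
print(median_mv(3, 7, 5))  # 5
print(raster_decision(0))  # stop
print(raster_decision(8))  # raster_scan
```

The median keeps the start position robust against a single outlier predictor, which is what makes the subsequent grid search converge in few iterations.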
  • 28. 28 2) FEN = 0 (Fast Encoder Decision)

In this configuration file the search algorithm is TZ Search (FastSearch = 1), so to see the difference I need to look at the related files. The "FEN" value (which can be 0 or 1, on or off) has been changed to 0 (disabled) on the right. When the "Fast Encoder Decision" is disabled, the encoding time increases; I measured this increase at about 38 percent.

3) FDM = 0 (Fast Decision for Merge RD Cost)

If this mode is turned off, an increase in encoding time is observed. This means that the fast decision for the "merge" blocks has a decisive effect on encoding time. The increase is around 13 percent.

Comparison between FEN and FDM: The time increase when FEN is turned off is greater than the increase for FDM, by about 25 percentage points. This means that FEN speeds up the encoding operation much more than FDM does, probably because FEN affects more blocks than FDM; in other words, FEN has more general coverage than FDM.
  • 29. 29 4) SearchRange = 128

1.2. Random Access Configuration Files:

1) FastSearch: 0 → Full Search, 1 → TZ Search
The total time increases by about 2850 percent (about 28.5 times longer) with the full search algorithm.

2) FEN = 0 (Fast Encoder Decision)
The total time increases by about 37 percent.
  • 30. 30 3) FDM = 0 (Fast Decision for Merge RD Cost)
The total time increases by about 15 percent.

4) SearchRange = 128

2. Conclusion:

The reference parameter configuration is shown below. I prefer to use it as the reference because it is the fastest encoding configuration, so whenever I change a parameter I expect to see an increase in encoding time.
  • 31. 31 Low Delay (QP=36, GOP=4) vs. Random Access (QP=38, GOP=8):

Configuration     | Low Delay: Bitrate / PSNR / Time | Random Access: Bitrate / PSNR / Time
Reference         | 223.89 / 38.21 / 1151.6          | 190.01 / 37.61 / 633.82
FastSearch = 0    | 223.34 / 38.21 / 24369.1         | 190.05 / 37.61 / 18690.00
FEN = 0           | 222.85 / 38.17 / 1592.47         | 190.02 / 37.61 / 868.42
FDM = 0           | 223.54 / 38.19 / 1305.1          | 190.26 / 37.61 / 732.144
SearchRange = 128 | 223.47 / 38.19 / 1038.81         | 189.86 / 37.60 / 907.46

Note: As you can see, the PSNR and bitrate values do not change much; these differences are negligible for now. I will therefore examine the change in encoding time.

Changed parameter  | Low Delay | Random Access
FastSearch = 0     | +2000%    | +2850%
FEN = 0            | +38%      | +37%
FDM = 0            | +13%      | +15%
SearchRange = 128  | −9%       | +7%

As you can see, the same encoder parameters have nearly the same effect on the encoding time in both modes. Let us continue to interpret the other parameters, starting with a general evaluation: what are our initial expectations from the low-delay and random-access modes? Once we understand this, we can understand the effects of the parameters we changed and how they impacted the results.
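The percentages in the second table are computed from the timing columns of the first. A minimal sketch of that calculation, using the full-search timings from the results above:

```python
def percent_change(ref_time, new_time):
    """Encoding-time change relative to the reference, in percent."""
    return (new_time - ref_time) / ref_time * 100

# Low Delay: reference vs. FastSearch = 0 (full search)
print(round(percent_change(1151.6, 24369.1)))   # 2016, i.e. about +2000%
# Random Access: reference vs. FastSearch = 0
print(round(percent_change(633.82, 18690.00)))  # 2849, i.e. about +2850%
```

The same function reproduces the smaller FEN and FDM rows; only SearchRange = 128 in Low Delay yields a negative value, i.e. a speed-up.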
  • 32. 32 Do not forget that in this part we compared the outputs of the reference configuration files! The bitrate and PSNR values are as expected: under normal conditions the low-delay bitrate would already be larger than that of random access. A smaller additional reason is that the QP value of low delay (36) is less than that of random access (38), but that is not the only reason.

When we look at the total time, things change. In the normal case the LD total time should be less than the RA total time, but two parameters differ here: first, the QP is smaller for LD, and second, its GOP size is half of RA's GOP size. These parameters affect the total time, so in the end LD has a longer encoding time than RA, which is understandable and logical.

Note on SearchRange: Changing the SearchRange parameter did not make much difference because of the TZ search algorithm we used. We would probably see a difference with the full search algorithm, because in that case every position within the search range is scanned, so a change in the range makes a difference.
  • 33. 33 As seen above, there is no big difference between the different search range values. The minor differences may be caused by other processes running on the system while the program was running.

References:

P. Helle, S. Oudin, B. Bross, D. Marpe, M. O. Bici, K. Ugur, J. Jung, G. Clare, and T. Wiegand, "Block Merging for Quadtree-Based Partitioning in HEVC," IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, December 2012.

S. T. Uou, E. Y. T. Juan, and J. Z. C. Lai, "Motion Vector Prediction Using Reselection Among Candidate Set," Journal of Marine Science and Technology, vol. 23, no. 5, pp. 727-731, 2015. DOI: 10.6119/JMST-015-0604-2.

V. Sze, M. Budagavi, and G. J. Sullivan, High Efficiency Video Coding (HEVC): Algorithms and Architectures, pp. 49-66 and pp. 113-141.

S. Ahn, B. Lee, and M. Kim, "A Novel Fast CU Encoding Scheme Based on Spatiotemporal Encoding Parameters for HEVC Inter Coding," IEEE Transactions on Circuits and Systems for Video Technology, vol. 25, no. 3, pp. 422-435, March 2015.

B. Bross, P. Helle, H. Lakshman, and K. Ugur, "Inter-Picture Prediction in HEVC," in High Efficiency Video Coding (HEVC), pp. 113-140, Springer, 2014.

R. Khemiri, N. Bahri, F. Belghith, F. Sayadi, M. Atri, et al., "Fast Motion Estimation for HEVC Video Coding," 2016 International Image Processing,
  • 34. 34 Applications and Systems (IPAS), Hammamet, Tunisia, Nov. 2016. DOI: 10.1109/IPAS.2016.7880120.

B. Haskell, P. Gordon, R. Schmidt, and J. Scattaglia, "Interframe Coding of 525-Line, Monochrome Television at 1.5 Mbits/s," IEEE Transactions on Communications, vol. 25, no. 11, pp. 1339-1348, Nov. 1977.

F. Sampaio and M. Grellert, "Motion Vectors Merging: Low Complexity Prediction Unit Decision Heuristic for the Inter-Prediction of HEVC Encoders," 2012 IEEE International Conference on Multimedia and Expo.

G.-L. Li, C.-C. Wang, and K.-H. Chiang, "An Efficient Motion Vector Prediction Method for Avoiding AMVP Data Dependency for HEVC," 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

Web Links:
https://slideplayer.com/slide/10724934/
https://www.slideshare.net/mmv-lab-univalle/slides7-ccc-v8