4K Ultra High Definition Video Coding using
Homogeneous Motion Discovery Oriented Prediction
Ashek Ahmmed†, Afrin Rahman†, Mark PickeringΨ, Aous Thabit Naman∗
† Department of Computer Science and Engineering, Rajshahi University of Engineering and Technology, Rajshahi, Bangladesh.
Ψ School of Engineering and Information Technology, The University of New South Wales, Canberra, Australia.
∗ School of Electrical Engineering and Telecommunications, The University of New South Wales, Sydney, Australia.
Abstract
State-of-the-art video compression techniques use the motion model to approximate the geometric boundaries
of moving objects, where motion discontinuities occur. The motion hint based inter-frame prediction paradigm
moves away from this redundant approach and employs an innovative framework consisting of motion hint
fields that are continuous and invertible, at least over their respective domains. However, estimating
motion hints is computationally demanding, in particular for high resolution video sequences. Discovering
homogeneous motion models and their associated masks over the current frame, and then using these models
and masks to form a prediction of the current frame, provides a computationally simpler approach to video
coding than motion hints. In this paper, the potential of this coherent motion model based approach,
equipped with bigger blocks, is investigated for coding 4K Ultra High Definition (UHD) video sequences.
Experimental results show that a bit rate saving of up to 4.68% is achievable over standalone HEVC.
Introduction
The block-based translational motion model assigns a single motion vector to all the pixels inside a block,
on the assumption that the constituent pixels are moving in the same direction at a constant
speed. This hypothesis of uniform motion within a block does not hold if the block lies on an object
boundary, i.e. where a motion discontinuity exists. Hence, this model fails to efficiently represent the
actual locations of discontinuities in the motion field.
Partitioning motion blocks that contain object boundaries into smaller square or rectangular sub-blocks
is a popular approach to improving compression efficiency, since the sub-blocks can be better
matched to the objects in the scene.
A motion hint can provide a global description of motion over a specific domain and is related to
foreground-background segmentation, where the foreground and background motions are the hints.
The inspiration behind motion hints is to avoid using the motion model to describe
object boundaries, since the spatial structure of previously decoded reference frames can be exploited
to infer appropriate boundaries in the frames to be predicted.
The motion hint based prediction paradigm introduced in [2] is promising. Each reference frame is
segmented into super-pixels, which are then grouped iteratively into homogeneous motion groups.
However, for high definition, full high definition and 4K ultra high definition
video sequences, the number of super-pixels becomes too large for the segmentation algorithm to
produce sufficiently representative foreground-background shapes within a viable number
of iterations. This phenomenon is depicted in Figure 1.
Figure 1: Outcome of the motion hint segmentation approach described in [2] after 4 iterations. The example sequence
is the Kimono 1080p sequence. Many background super-pixels are still misclassified as foreground, resulting in a poor quality
motion hint segmentation.
In this paper we investigate the applicability of a prediction paradigm [1], in which a bi-directional
affine motion model compensated prediction is used as a reference frame and the prediction generation
process does not require any foreground-background segmentation, for coding 4K ultra high definition
(UHD) video sequences.
Structure of the coding/decoding architecture
In the considered approach, depicted with a simplified block diagram in Figure 2, the affine motion
field between the reference frame $R_i$ and the current B-frame, $C$, is estimated using a standard
gradient-based image registration technique. The mask, $f_1^{(R_i \to C)}$, used to estimate the associated
6-parameter affine model is the entire frame $C$, i.e. $f_1^{(R_i \to C)}$ is a binary image with all values equal to 1.
The resultant affine motion model $M_1^{(R_i \to C)}$ is approximated by the 3-corner motion vectors (MVs),
specifically by the MVs of the top left, top right and center pixels. The fractional parts of these MVs
are quantized to an accuracy of 1/16-th of a pixel and coded using the Exponential Golomb coding technique.
This quantized motion model, $M_1^{(R_i \to C)}$, is employed to warp $R_i$ to generate an affine motion
compensated prediction, $C_1^{R_i \to C}$, of $C$:

$$C_1^{R_i \to C} = M_1^{(R_i \to C)} R_i \tag{1}$$
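As a rough illustration of how three motion vectors can stand in for a 6-parameter affine model, the sketch below recovers a 2×3 affine matrix from MVs at the top-left, top-right and center pixels and rounds MV components to 1/16-pixel units. The function names, the (x, y) point convention and the assumption that each MV points from a pixel position to its displaced position are illustrative choices, not details given in the poster.

```python
import numpy as np


def quantize_mv(mv, precision=16):
    """Round each MV component to the nearest 1/precision of a pixel."""
    return np.round(np.asarray(mv, dtype=float) * precision) / precision


def affine_from_three_mvs(points, mvs):
    """Return the 2x3 matrix A with A @ [x, y, 1] = (x, y) + mv at all three points.

    `points` and `mvs` are 3x2 arrays; the three points must not be collinear.
    The displacement convention (position -> position + mv) is an assumption.
    """
    points = np.asarray(points, dtype=float)
    targets = points + np.asarray(mvs, dtype=float)   # displaced positions
    homog = np.hstack([points, np.ones((3, 1))])      # rows are [x, y, 1]
    # Solve homog @ A.T = targets for the six affine parameters.
    return np.linalg.solve(homog, targets).T


# Control points for a 3840 x 2160 frame: top left, top right and center.
width, height = 3840, 2160
control_points = [(0, 0), (width - 1, 0), ((width - 1) / 2, (height - 1) / 2)]

# Hypothetical MVs (a pure translation), quantized before the model is rebuilt.
example_mvs = quantize_mv([(1.5, -0.25), (1.5, -0.25), (1.5, -0.25)])
example_model = affine_from_three_mvs(control_points, example_mvs)
```

For a pure translation the recovered matrix reduces to the identity in its left 2×2 part with the translation in the last column, which is a convenient sanity check for the convention chosen here.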
Figure 2: Block diagram of the coding/decoding framework that uses the bi-directional affine motion model compensated
prediction as a reference frame, along with the usual temporal reference(s), for the B-frames.
Next, an error analysis is carried out on the prediction error associated with the prediction $C_1^{R_i \to C}$.
Prediction error blocks whose sum-squared error (SSE) is greater than the block-wise mean SSE
of this error image are identified. An example of these blocks, marked with white boundary pixels,
for a 4K UHD sequence is shown in Figure 3. Blocks with high SSE are then used to form another
mask, $f_2^{(R_i \to C)}$, which is fed into the affine motion model estimation process involving $R_i$ and $C$
to generate a second affine model, namely $M_2^{(R_i \to C)}$. The reference frame $R_i$ is warped by this
newly estimated and quantized affine motion model to generate another prediction of $C$, specifically
$C_2^{R_i \to C}$:

$$C_2^{R_i \to C} = M_2^{(R_i \to C)} R_i \tag{2}$$
Figure 3: The affine motion model $M_1^{(R_i \to C)}$ performed poorly in the blocks with white boundary pixels, i.e. these blocks
have high prediction error energy. The example scenario is the prediction of frame 5 from coded frames 1 and 9 of the
Vehicles (3840 × 2160) sequence. The block size used is 240 × 240 pixels.
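The block selection step described above can be sketched as follows. This is a minimal NumPy version, assuming single-channel frames stored as 2-D arrays. The function name and the handling of partial blocks at the frame border are assumptions, although a 3840 × 2160 frame divides evenly into 240 × 240 blocks.

```python
import numpy as np


def high_error_mask(current, prediction, block=240):
    """Return (mask, sse): a binary pixel mask of high-error blocks and the block SSE map.

    A block is selected when its sum-squared error exceeds the mean SSE over all blocks.
    """
    current = np.asarray(current, dtype=np.float64)
    prediction = np.asarray(prediction, dtype=np.float64)
    height, width = current.shape
    squared_error = (current - prediction) ** 2

    rows = -(-height // block)   # ceiling division, so border blocks are included
    cols = -(-width // block)
    sse = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            sse[r, c] = squared_error[r * block:(r + 1) * block,
                                      c * block:(c + 1) * block].sum()

    selected = sse > sse.mean()
    # Expand the block-level decision back to pixel resolution and crop to the frame.
    mask = np.repeat(np.repeat(selected, block, axis=0), block, axis=1)
    return mask[:height, :width].astype(np.uint8), sse
```

The returned mask plays the role of the second mask: it is 1 inside the selected blocks and 0 elsewhere, so it can be passed directly to a masked motion estimation routine.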
Now, with the help of the mask $f_2^{(R_i \to C)}$, the predictions $C_1^{R_i \to C}$ and $C_2^{R_i \to C}$ are fused into a single
prediction of $C$ from the reference frame $R_i$ in the following way:

$$C^{R_i \to C} = \left(1 - f_2^{(R_i \to C)}\right) \cdot C_1^{R_i \to C} + f_2^{(R_i \to C)} \cdot C_2^{R_i \to C} \tag{3}$$
Similarly, using the reference frame $R_j$ and $C$, the prediction of $C$, namely $C^{R_j \to C}$, is formed. The
predictions $C^{R_i \to C}$ and $C^{R_j \to C}$ are then combined through a weighting scheme to generate the
bi-directional affine motion models compensated prediction, $R_{\text{affine}}$, of $C$:

$$R_{\text{affine}} = w_i \cdot C^{R_i \to C} + w_j \cdot C^{R_j \to C} \tag{4}$$

where $w_i, w_j \in [0, 1]$ are real-valued weights.
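The two combination steps, Eq. (3) and Eq. (4), amount to per-pixel arithmetic on arrays. The sketch below reads Eq. (3) as a mask-driven selection, with the second prediction used inside the mask and the first elsewhere. The function names and the default weights of 0.5 are assumptions, since the poster only states that the weights lie in [0, 1].

```python
import numpy as np


def fuse_predictions(pred1, pred2, mask2):
    """Eq. (3): take pred2 where mask2 is 1 and pred1 where mask2 is 0."""
    mask2 = np.asarray(mask2, dtype=np.float64)
    pred1 = np.asarray(pred1, dtype=np.float64)
    pred2 = np.asarray(pred2, dtype=np.float64)
    return (1.0 - mask2) * pred1 + mask2 * pred2


def bidirectional_reference(pred_i, pred_j, w_i=0.5, w_j=0.5):
    """Eq. (4): weighted combination of the two single-direction predictions.

    The default weights are illustrative only.
    """
    if not (0.0 <= w_i <= 1.0 and 0.0 <= w_j <= 1.0):
        raise ValueError("weights must lie in [0, 1]")
    pred_i = np.asarray(pred_i, dtype=np.float64)
    pred_j = np.asarray(pred_j, dtype=np.float64)
    return w_i * pred_i + w_j * pred_j
```

With a binary mask the fused frame is a hard switch between the two predictions. A mask with fractional values would blend them instead, which the poster does not describe.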
What is communicated from the encoder are the 4 sets of 3-corner motion vectors $M_1^{(R_i \to C)}$,
$M_2^{(R_i \to C)}$, $M_1^{(R_j \to C)}$ and $M_2^{(R_j \to C)}$. That means that in total 4 × 3 = 12 MVs, each with 1/16-th
of a pixel accuracy, are necessary to yield the reference frame $R_{\text{affine}}$ at the decoder.
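To give a feel for the size of this side information, the sketch below codes MV components expressed in 1/16-pixel units with an order-0 Exponential Golomb code. The poster does not state the code order or how signed values are mapped, so the signed mapping used here (the one used for signed syntax elements in H.264/HEVC) and all function names are assumptions. Each of the 12 MVs has a horizontal and a vertical component, so 24 values are coded per B-frame.

```python
def signed_to_code_num(value):
    """Map a signed integer to a non-negative code number: 0, 1, -1, 2, -2 -> 0, 1, 2, 3, 4."""
    return 2 * value - 1 if value > 0 else -2 * value


def exp_golomb_bits(code_num):
    """Order-0 Exp-Golomb codeword of a non-negative integer, returned as a bit string."""
    binary = bin(code_num + 1)[2:]            # binary form of code_num + 1
    return "0" * (len(binary) - 1) + binary   # prefix of zeros, one fewer than its length


def encode_mv_component(mv, precision=16):
    """Quantize one MV component to 1/precision pixel units and code it."""
    units = int(round(mv * precision))
    return exp_golomb_bits(signed_to_code_num(units))


def total_bits(mv_components, precision=16):
    """Total number of bits needed for a list of MV components."""
    return sum(len(encode_mv_component(mv, precision)) for mv in mv_components)
```

Small displacements produce short codewords under this scheme, so the cost of the 24 components depends on the magnitude of the motion rather than being fixed.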
Experimental Analysis
The rate-distortion (RD) performance of the employed coder is investigated on three 4K UHD video sequences. The
first 49 frames of each sequence are coded by the HM 16.10 reference software for HEVC. The HM
encoder is configured using the random access main configuration, i.e. a hierarchical GOP structure is
used with GOP size = 8 and Intra period = 32, as per the common test conditions. Four different quantization
parameter values (QP = 22, 27, 32, 37) are used. For each B-frame, the available pair of reference
frames is fed into the bi-directional affine motion models based prediction process to generate the
additional reference frame, $R_{\text{affine}}$. All the obtained results are summarized in Table 1.
Sequence              Delta rate    Delta PSNR
Park and Buildings    −2.70%        0.06 dB
Vehicles              −4.68%        0.15 dB
Book                  −3.97%        0.08 dB
Average               −3.78%        0.10 dB
Table 1: The Bjøntegaard delta gains obtained on the test sequences over standalone HEVC by using the bi-directional
affine motion models compensated reference frames.
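For readers unfamiliar with the metric, a Bjøntegaard delta rate is commonly computed by fitting a cubic polynomial to log-rate as a function of PSNR for each codec and averaging the gap between the two fits over the shared PSNR range. The sketch below follows that common recipe. Whether the poster used exactly this variant is not stated, the function name is illustrative, and the RD points in the usage note are synthetic, not the poster's measurements.

```python
import numpy as np


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bit rate difference (percent) of the test codec against the anchor at equal PSNR.

    Each curve needs at least four RD points, e.g. one per QP value.
    """
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)
    log_ra = np.log10(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log10(np.asarray(rate_test, dtype=float))

    # Integrate only over the PSNR interval covered by both curves.
    lo = max(pa.min(), pt.min())
    hi = min(pa.max(), pt.max())

    def mean_log_rate(psnr, log_rate):
        # Cubic fit of log10(rate) against PSNR; shifting by `lo` keeps the fit well conditioned.
        coeffs = np.polyfit(psnr - lo, log_rate, 3)
        antiderivative = np.polyint(coeffs)   # integration constant is 0
        return np.polyval(antiderivative, hi - lo) / (hi - lo)

    diff = mean_log_rate(pt, log_rt) - mean_log_rate(pa, log_ra)
    return (10.0 ** diff - 1.0) * 100.0
```

As a sanity check with synthetic input, a test curve whose rate is 10% lower than the anchor at every PSNR point yields a delta rate of about −10%. Negative values therefore indicate a bit rate saving, as in Table 1.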
Conclusions
• investigated the RD performance of a hybrid coder, which uses an affine motion compensated reference
frame along with the typical references, for coding 4K UHD video sequences.
• the additional reference is generated by finding homogeneous motion regions over the
current frame and does not require any super-pixel level segmentation.
• experimental results are encouraging, e.g. a bit rate saving of up to 4.68% is achieved
over standalone HEVC.
• the associated motion information and signalling overhead could be further optimized to
increase the bit rate savings.
References
[1] A. Ahmmed, D. Taubman, A. T. Naman, and M. Pickering. Homogeneous motion discovery oriented
reference frame for high efficiency video coding. In Picture Coding Symposium (PCS),
2016, pages 1–5, 2016.
[2] A. Ahmmed, R. Xu, A. T. Naman, M. J. Alam, M. Pickering, and D. Taubman. Motion segmentation
initialization strategies for bi-directional inter-frame prediction. In IEEE International
Workshop on Multimedia Signal Processing, pages 58–63, Sept 2013.
Acknowledgements
The authors would like to thank Dr. Matteo Naccari of the BBC Research and Development team for providing the 4K
UHD video sequences and for valuable suggestions during this work.

9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 

DICTA 2017 poster

4K Ultra High Definition Video Coding using Homogeneous Motion Discovery Oriented Prediction
Ashek Ahmmed†, Afrin Rahman†, Mark PickeringΨ, Aous Thabit Naman∗
† Department of Computer Science and Engineering, Rajshahi University of Engineering and Technology, Rajshahi, Bangladesh.
Ψ School of Engineering and Information Technology, The University of New South Wales, Canberra, Australia.
∗ School of Electrical Engineering and Telecommunications, The University of New South Wales, Sydney, Australia.

Abstract
State-of-the-art video compression techniques use the motion model to approximate the geometric boundaries of moving objects, where motion discontinuities occur. The motion hints based inter-frame prediction paradigm moves away from this redundant approach and employs a framework consisting of motion hint fields that are continuous and invertible, at least over their respective domains. However, estimating motion hints is computationally demanding, in particular for high resolution video sequences. Discovering homogeneous motion models and their associated masks over the current frame, and then using these models and masks to form a prediction of the current frame, provides a computationally simpler approach to video coding than motion hints. In this paper, the potential of this coherent motion model based approach, equipped with bigger blocks, is investigated for coding 4K Ultra High Definition (UHD) video sequences. Experimental results show that a bit rate saving of up to 4.68% is achievable over standalone HEVC.

Introduction
The block-based translational motion model assigns a single motion vector to all the pixels inside a block, on the assumption that those pixels are moving in the same direction at a constant speed. This assumption of uniform motion within a block does not hold if the block lies on an object boundary, i.e. where a motion discontinuity exists.
Hence, this model fails to efficiently represent the actual locations of discontinuities in the motion field. Partitioning motion blocks that contain object boundaries into smaller square or rectangular sub-blocks is a popular way to improve compression efficiency, since the blocks can then be better matched to the objects in the scene.

A motion hint can provide a global description of motion over a specific domain and is related to foreground-background segmentation, where the foreground and background motions are the hints. The inspiration behind motion hints is to avoid using the motion model to describe object boundaries, since the spatial structure of previously decoded reference frames can be exploited to infer appropriate boundaries in the frames to be predicted.

The motion hint based prediction paradigm introduced in [2] is promising. Each reference frame is segmented into super-pixels, and those super-pixels are then iteratively grouped into homogeneous motion groups. However, for high definition, full high definition and 4K ultra high definition video sequences, the number of super-pixels becomes too large for the segmentation algorithm to produce sufficiently representative foreground-background shapes within a viable number of iterations. This phenomenon is depicted in Figure 1.

Figure 1: Outcome of the motion hint segmentation approach described in [2] after 4 iterations. The example is the Kimono 1080p sequence. Many background super-pixels are still misclassified as foreground, yielding a poor-quality motion hint segmentation.

In this paper we investigate the applicability of the prediction paradigm of [1] to coding 4K ultra high definition (UHD) video sequences. In this paradigm, a bi-directional affine motion model compensated prediction is used as a reference frame, and the prediction generation process does not require any foreground-background segmentation.
Structure of the coding/decoding architecture
In the considered approach, depicted as a simplified block diagram in Figure 2, the affine motion field between the reference frame $R_i$ and the current B-frame $C$ is estimated using a standard gradient-based image registration technique. The mask $f_1^{(R_i \to C)}$ used to estimate the associated 6-parameter affine model is the entire frame $C$, i.e. $f_1^{(R_i \to C)}$ is a binary image with all values equal to 1. The resulting affine motion model $M_1^{(R_i \to C)}$ is approximated by 3-corner motion vectors (MVs), specifically the MVs of the top-left, top-right and center pixels. The fractional parts of these MVs are quantized to an accuracy of 1/16 of a pixel and coded using the Exponential-Golomb technique. This quantized motion model $M_1^{(R_i \to C)}$ is employed to warp $R_i$, generating an affine motion compensated prediction $C_1^{R_i \to C}$ of $C$:

$$C_1^{R_i \to C} = M_1^{(R_i \to C)} R_i \tag{1}$$

Figure 2: Block diagram of the coding/decoding framework that uses the bi-directional affine motion model compensated prediction as a reference frame, along with the usual temporal reference(s), for the B-frames.

Next, an error analysis is carried out on the prediction error associated with the prediction $C_1^{R_i \to C}$. Prediction error blocks whose sum-squared error (SSE) is greater than the block-wise mean SSE of this error image are identified. An example of these blocks, marked with white boundary pixels, is shown for a 4K UHD sequence in Figure 3. The blocks with high SSE are then used to form another mask, $f_2^{(R_i \to C)}$, which is fed into the affine motion model estimation process involving $R_i$ and $C$ to generate a second affine model, $M_2^{(R_i \to C)}$. The reference frame $R_i$ is warped by this newly estimated and quantized affine motion model to generate another prediction of $C$, namely $C_2^{R_i \to C}$:

$$C_2^{R_i \to C} = M_2^{(R_i \to C)} R_i \tag{2}$$

Figure 3: The affine motion model $M_1^{(R_i \to C)}$ performed poorly in the blocks with white boundary pixels, i.e. these blocks have high prediction error energy. The example scenario is the prediction of frame 5 from coded frames 1 and 9 of the Vehicles (3840 × 2160) sequence. The block size used is 240 × 240 pixels.

With the help of the mask $f_2^{(R_i \to C)}$, the predictions $C_1^{R_i \to C}$ and $C_2^{R_i \to C}$ are then fused into a single prediction of $C$ from the reference frame $R_i$ in the following way:

$$C^{R_i \to C} = \left(1 - f_2^{(R_i \to C)}\right) \cdot C_1^{R_i \to C} + f_2^{(R_i \to C)} \cdot C_2^{R_i \to C} \tag{3}$$

Similarly, the prediction $C^{R_j \to C}$ of $C$ is formed using the reference frame $R_j$ and $C$. The predictions $C^{R_i \to C}$ and $C^{R_j \to C}$ are then combined through a weighting scheme to generate the bi-directional affine motion models compensated prediction, $R_{\text{affine}}$, of $C$:

$$R_{\text{affine}} = w_i \cdot C^{R_i \to C} + w_j \cdot C^{R_j \to C} \tag{4}$$

where $w_i, w_j \in [0, 1]$ are real-valued weights. What the encoder communicates are the 4 sets of 3-corner motion vectors $M_1^{(R_i \to C)}$, $M_2^{(R_i \to C)}$, $M_1^{(R_j \to C)}$ and $M_2^{(R_j \to C)}$. In total, therefore, 4 × 3 = 12 MVs, each with 1/16-pixel accuracy, are needed to produce the reference frame $R_{\text{affine}}$ at the decoder.

Experimental Analysis
The RD performance of the employed coder is investigated on three 4K UHD video sequences. The first 49 frames of each sequence are coded with the HM 16.10 reference software for HEVC. The HM encoder is configured using the random access main configuration, i.e. a hierarchical GOP structure is used with GOP size = 8 and Intra period = 32, as per the common test conditions. Four different quantization parameter values (QP = 22, 27, 32, 37) are used. For each B-frame, the available pair of reference frames is fed into the bi-directional affine motion models based prediction process to generate the additional reference frame, $R_{\text{affine}}$. All the obtained results are summarized in Table 1.
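The error analysis, the mask-based fusion of equation (3) and the weighting of equation (4) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`quantize_mv`, `high_error_mask`, `fuse`, `blend`) and the list-of-rows frame representation are assumptions made for illustration, and the affine estimation and warping steps are omitted, so the two predictions are simply taken as inputs.

```python
from typing import Dict, List, Tuple

Frame = List[List[float]]  # a frame as rows of samples (illustrative type)


def quantize_mv(component: float, steps_per_pixel: int = 16) -> float:
    """Round one motion vector component to 1/16-pixel accuracy."""
    return round(component * steps_per_pixel) / steps_per_pixel


def high_error_mask(current: Frame, prediction: Frame, block: int) -> Frame:
    """Binary mask: 1 inside blocks whose SSE exceeds the mean block SSE."""
    height, width = len(current), len(current[0])
    sse: Dict[Tuple[int, int], float] = {}
    for by in range(0, height, block):
        for bx in range(0, width, block):
            sse[(by, bx)] = sum(
                (current[y][x] - prediction[y][x]) ** 2
                for y in range(by, min(by + block, height))
                for x in range(bx, min(bx + block, width))
            )
    mean_sse = sum(sse.values()) / len(sse)
    mask = [[0.0] * width for _ in range(height)]
    for (by, bx), value in sse.items():
        if value > mean_sse:
            for y in range(by, min(by + block, height)):
                for x in range(bx, min(bx + block, width)):
                    mask[y][x] = 1.0
    return mask


def fuse(pred1: Frame, pred2: Frame, mask: Frame) -> Frame:
    """Equation (3): (1 - f2) * C1 + f2 * C2, sample by sample."""
    return [
        [(1.0 - m) * p1 + m * p2 for p1, p2, m in zip(r1, r2, rm)]
        for r1, r2, rm in zip(pred1, pred2, mask)
    ]


def blend(pred_i: Frame, pred_j: Frame, w_i: float, w_j: float) -> Frame:
    """Equation (4): weighted sum of the two single-reference predictions."""
    return [
        [w_i * a + w_j * b for a, b in zip(ra, rb)]
        for ra, rb in zip(pred_i, pred_j)
    ]


# Toy 2 x 4 frame split into two 2 x 2 blocks (values are made up).
current = [[10.0, 10.0, 50.0, 50.0], [10.0, 10.0, 50.0, 50.0]]
first_prediction = [[10.0, 10.0, 40.0, 40.0], [10.0, 10.0, 40.0, 40.0]]
second_prediction = [[0.0, 0.0, 50.0, 50.0], [0.0, 0.0, 50.0, 50.0]]

mask = high_error_mask(current, first_prediction, block=2)
fused = fuse(first_prediction, second_prediction, mask)
```

In this toy case only the right-hand block exceeds the mean SSE, so the fused prediction takes the left block from the first prediction and the right block from the second. Equal weights in `blend` are just one possible choice; the text only requires the weights to lie in [0, 1].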
Sequence              Delta rate   Delta PSNR
Park and Buildings    −2.70%       0.06 dB
Vehicles              −4.68%       0.15 dB
Book                  −3.97%       0.08 dB
Average               −3.78%       0.10 dB

Table 1: The Bjøntegaard delta gains obtained on the test sequences over standalone HEVC by using the bi-directional affine motion models compensated reference frames.

Conclusions
• Investigated the RD performance of a hybrid coder that uses an affine motion compensated reference frame, along with the typical references, for coding 4K UHD video sequences.
• The additional reference is generated by finding homogeneous motion regions over the current frame, and it does not require any super-pixel level segmentation.
• Experimental results are encouraging, e.g. a bit rate saving of up to 4.68% is achieved over standalone HEVC.
• The associated motion information and signalling overhead could be optimized further to increase the bit rate savings.

References
[1] A. Ahmmed, D. Taubman, A. T. Naman, and M. Pickering. Homogeneous motion discovery oriented reference frame for high efficiency video coding. In Picture Coding Symposium (PCS), 2016, pages 1–5, 2016.
[2] A. Ahmmed, R. Xu, A. T. Naman, M. J. Alam, M. Pickering, and D. Taubman. Motion segmentation initialization strategies for bi-directional inter-frame prediction. In IEEE International Workshop on Multimedia Signal Processing, pages 58–63, Sept 2013.

Acknowledgements
The authors would like to thank Dr. Matteo Naccari of the B.B.C. Research and Development team for providing the 4K UHD video sequences and valuable suggestions during this work.