Fast block motion estimation with 8 bit partial sums using SIMD architecture


  1. Fast Block Motion Estimation With 8-Bit Partial Sums Using SIMD Architectures
     Presented by:
     • Ahmed Abdel-Hafeez
     • Ahmed El-Bohy
     • Ahmed Emam
     • Ahmed Kandil
     Supervised by / presented to: Prof. Dr. Attalah Hashaad
     Published by: Chunjiang J. Duanmu et al.
     Published in August 2007.
  2. Outline
     • Abstract
     • Introduction
     • 8-bit partial sums
     • Multilevel 8-bit partial sums
     • Computational complexity
     • Simulation results
     • Conclusion
     (Slide footer: ARAB ACADEMY-CAIRO, Fast Block Motion Estimation With 8-Bit Partial Sums Using SIMD Architectures, spring 2013)
  3. Abstract
     • Fast block motion estimation algorithms are needed for real-time implementations of video coding standards due to the high computational complexity of the full-search algorithm for block motion estimation.
     • In this paper, an algorithm using 8-bit partial sums of 16 luminance values for fast block motion estimation is proposed. The technique of using the partial sums is employed to reduce the computational complexity of not only the full-search algorithm but also some of the fast block motion estimation algorithms while maintaining their accuracy.
     • Furthermore, it is shown that the byte-type data-parallelism on an SIMD architecture can be utilized to access and process these partial sums concurrently to accelerate the process of motion estimation.
     • Simulation results are presented to demonstrate that the use of the partial sums can significantly accelerate the execution of the full-search and other search algorithms on an SIMD architecture.
  4. Introduction - Basics - Applications
  5. Chronological Table of Video Coding Standards
     The objective of video coding is to compress moving images.
     Standards bodies shown: ISO/IEC MPEG and ITU-T VCEG
     • H.261 (1990)
     • MPEG-1 (1993)
     • MPEG-2 (H.262) (1994/95)
     • H.263 (1995/96)
     • H.263+ (1997/98)
     • MPEG-4 v1 (1998/99)
     • MPEG-4 v2 (1999/00)
     • H.263++ (2000)
     • MPEG-4 v3 (2001)
     • H.264 (MPEG-4 Part 10) (2002)
     Timeline axis: 1990 to 2003
  6. Introduction - Basics - Video
     Figure: Frame 1, Frame 2, Frame 3, Frame 4
     • Luminance (Y): describes the brightness of the pixel.
     • Chrominance (CbCr): describes the color of the pixel.
  7. Introduction - Basics - Video Data
     Drawback
     • Uncompressed video data is large.
       – This is due to data redundancy. There are two general types of data redundancy in a video:
     • Spatial redundancy: in a frame, adjacent pixels are usually correlated, e.g. the grass is green across the background of a frame.
     • Temporal (time-based) redundancy: in a video, adjacent frames are usually correlated, e.g. the green background persists frame after frame.
     Figure: Frame 1, Frame 2, Frame 3, Frame 4
  8. Introduction - Basics - Video Compression
     • Predict the current frame based on previously coded frames
     • Types of coded frames:
       – I-frame: intra-coded frame, coded independently of all other frames
       – P-frame: predictively coded frame, coded based on a previously coded frame
       – B-frame: bi-directionally predicted frame, coded based on both previous and future coded frames
  9. Block Matching
  10. Motion Estimation
      • What is motion estimation?
        – Predict the current frame from the previous frame
        – Determine the displacement of an object in the video sequence
        – The amount of data to be coded can be reduced significantly if the previous frame is subtracted from the current frame.
  11. Motion Estimation Classification
      • Block Based Motion Estimation Algorithms
        – Time-domain Algorithms
          Matching Algorithms: block-matching, feature-matching
          Gradient Based Algorithms: pel-recursive, block-recursive
        – Frequency-domain Algorithms: phase-correlation (DFT), matching in DCT domain, matching in wavelet domain
      • Mesh Based Motion Estimation Algorithms
  12. Motion Estimation (ctd)
  13. Motion Estimation (ctd)
  14. Motion Estimation (ctd)
      Figure labels: Reference Frame, Current Frame, Current 16x16 Block, Search Window, Sum of Absolute Difference (SAD)
  15. Distortion criteria for measuring the distance between the previous block and the search-area block
      • CCF (Cross-Correlation Function)
      • MSE (Mean Square Error)
      • MAE (Mean Absolute Error)
      • SAD (Sum of Absolute Differences)
      • PDC (Pixel Difference Classification)
      • MAE (or MAD) and SAD are commonly employed due to their simplicity in hardware implementation.
  16. SAD
      $$\mathrm{SAD}(dx, dy) = \sum_{m=x}^{x+N-1} \sum_{n=y}^{y+N-1} \left| I_k(m, n) - I_{k-1}(m + dx, n + dy) \right|$$
      $$(MV_x, MV_y) = \arg\min_{(dx, dy) \in R^2} \mathrm{SAD}(dx, dy)$$
      Here $I_k$ and $I_{k-1}$ denote the current and previous (reference) frames.
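The SAD criterion above can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' code: the function name `sad`, the list-of-rows frame layout and the argument order are assumptions made for the example.

```python
def sad(cur, ref, x, y, dx, dy, n=16):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (x, y) and the block of `ref` displaced by (dx, dy).

    Frames are lists of rows, so a pixel is addressed as frame[row][column],
    i.e. frame[y][x]. No bounds checking is done here (assumed by the caller).
    """
    total = 0
    for j in range(n):          # rows of the block
        for i in range(n):      # columns of the block
            total += abs(cur[y + j][x + i] - ref[y + dy + j][x + dx + i])
    return total
```

A candidate displacement with a smaller SAD is a better match; the motion vector is the displacement that minimises it over the search range.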
  17. Search Algorithms
      Diagram labels: Search Algorithms; FAST; MULTISTEP; 3SS, 4SS, HBS, UDS; EXHAUSTIVE; SE, MSE, VF, PFGSE; FULL
  18. Search Algorithms (ctd)
      • There is a trade-off between run time and accuracy.
      • Full search is the most accurate because it searches exhaustively, but it requires more time.
      • Fast search is faster, but its accuracy is reduced because only a subset of the search locations is examined.
  19. Full-Search
      Not suitable for real time.
  20. Exhaustive Search
      • Simplest algorithm, but computationally the most expensive
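To make the cost of exhaustive search concrete, here is a minimal sketch that evaluates the SAD at every displacement in a ±w window. The names `full_search` and `sad`, the default window `w=7` and the tie-breaking rule (first minimum in scan order) are assumptions for illustration only.

```python
def sad(cur, ref, x, y, dx, dy, n):
    """SAD between the n x n block of `cur` at (x, y) and `ref` at (x+dx, y+dy)."""
    return sum(abs(cur[y + j][x + i] - ref[y + dy + j][x + dx + i])
               for j in range(n) for i in range(n))


def full_search(cur, ref, x, y, n=16, w=7):
    """Try every displacement in [-w, w] x [-w, w] and keep the smallest SAD.

    Returns ((dx, dy), sad_value). Candidates that would fall outside the
    reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, float("inf")
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            if not (0 <= x + dx <= width - n and 0 <= y + dy <= height - n):
                continue
            cost = sad(cur, ref, x, y, dx, dy, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

With w = 7 this is up to 15 x 15 = 225 SAD evaluations per block, each over n x n pixels, which is why the slides describe it as the most expensive option.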
  21. Three Step Search (3SSA)
  22. Three Step Search (3SSA) (ctd)
  23. Three Step Search (3SSA) (ctd)
  24. Three Step Search (3SSA) (ctd)
  25. 3SSA Block Matching
      ► Three-Step Search (3SS)
        – 9 points: the central point and its 8 surrounding points
        – Distance: w/2
        – Find the best match
        – Use the previous best match as the new center
        – Halve the distance and select 8 new points
        – Repeat the algorithm 3 times
        – Examines 25 points
        – Assumes a uniform distribution of MVs
      Figure: search points labelled 1, 2 and 3 according to the step in which they are examined
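The three-step procedure listed above can be sketched as follows. This is an illustrative reading of the slide, not the presenters' implementation: the function name `three_step_search`, the `cost(dx, dy)` callback (for example a SAD evaluation) and the initial step of 4 (w/2 rounded up for w = 7) are assumptions.

```python
def three_step_search(cost, step=4):
    """Three-step search over a cost function cost(dx, dy).

    Starts at (0, 0), examines the center and its 8 neighbours at the current
    step distance, moves the center to the best point, halves the step and
    repeats until the step reaches 1.
    Returns ((dx, dy), best_cost, number_of_points_examined).
    """
    center = (0, 0)
    evaluated = {center: cost(0, 0)}      # cache so no point is costed twice
    while step >= 1:
        best = center
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                cand = (center[0] + ox, center[1] + oy)
                if cand not in evaluated:
                    evaluated[cand] = cost(*cand)
                if evaluated[cand] < evaluated[best]:
                    best = cand
        center = best
        step //= 2
    return center, evaluated[center], len(evaluated)
```

With steps 4, 2 and 1 the search costs 9 + 8 + 8 = 25 points, matching the count on the slide, compared with 225 for a full search over the same ±7 window.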
  26. 4SSA
  27. Unrestricted Center-Biased Diamond Search Algorithm (UDSA)
  28. Hexagon-Based Search Algorithm (HBSA)
  29. Problem Definition
      • The high computational requirement of the Full Search (FS) algorithm does not allow it to work in real-time applications, despite its high accuracy.
      • Fast block motion estimation algorithms have lower computational complexity, but lower accuracy.
      • Fast block motion estimation algorithms are chosen for real-time applications, and hence in this paper too.
  30. Aim
      • To improve the accuracy of some of the fast block motion estimation techniques without increasing the computational complexity.
      • To make best use of Single Instruction Multiple Data (SIMD) architecture and to take advantage of byte-type data-parallelism to further accelerate the execution of the algorithms to achieve the main goal.
  31. Limitation
      • If the partial sums used by an algorithm are wider than 8 bits, the partial sums for a reference block cannot be stored, accessed, and manipulated in a contiguous memory space, since partial sums of other reference blocks lie in between. As a result, a large number of CPU cycles are lost in manipulating these data, and such algorithms are not suitable for SIMD implementations.
  32. Procedure
      • Devise a scheme that uses only 8-bit partial sums and discards as many SAD computations as possible, without excluding the optimal motion vector.
        – The proposed partial sums can be utilized not only in the full-search algorithm but also in some of the fast block motion-estimation algorithms.
      • Devise a scheme that generalises the previous scheme to the multi-level case and utilises it optimally.
  33. Partial Sums
      Worked example: 268 + 483
      • Add the hundreds (200 + 400): 600
      • Add the tens (60 + 80): 140
      • Add the ones (8 + 3): 11
      • Add the partial sums (600 + 140 + 11): 751
  34. 8-Bit Partial Sums - Objective
      • The objective of this paper is to find new partial sums of only eight bits, so that they can be of the packed byte type on an SIMD architecture.
      • In this way, eight additions or subtractions on the partial sums can be executed in one SIMD instruction.
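The idea of handling eight 8-bit values in one operation can be emulated in plain Python by packing the bytes into a 64-bit integer. This sketch only imitates what a hardware packed-byte add does (wrap-around addition per byte lane); the helper names `pack`, `unpack` and `packed_add_bytes` are hypothetical and are not taken from the paper or from any SIMD library.

```python
LOW7 = 0x7F7F7F7F7F7F7F7F   # low 7 bits of every byte lane
HIGH = 0x8080808080808080   # top bit of every byte lane


def pack(values):
    """Pack eight values in 0..255 into one 64-bit integer (lane 0 = lowest byte)."""
    return int.from_bytes(bytes(values), "little")


def unpack(word):
    """Split a 64-bit integer back into its eight byte lanes."""
    return list(word.to_bytes(8, "little"))


def packed_add_bytes(a, b):
    """Add eight byte lanes at once, each lane wrapping modulo 256.

    The low 7 bits are added without any carry crossing into the next lane;
    the top bit of each lane is then fixed up with an XOR.
    """
    return ((a & LOW7) + (b & LOW7)) ^ ((a ^ b) & HIGH)
```

For example, `unpack(packed_add_bytes(pack([250, 1, 2, 3, 4, 5, 6, 7]), pack([10, 1, 1, 1, 1, 1, 1, 1])))` yields eight independent sums from a single integer addition, which is the kind of byte-level parallelism the objective refers to.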
  35. 8-bit Partial Sums
      Figure: a 16 x 16 block with indices 0 to 15 and the partial sum ∑(n)
  36. Lower Bound
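The lower-bound formula itself did not survive in the transcript, so the following is only a sketch of how such a bound can work, under two loudly stated assumptions: (1) an 8-bit partial sum is the sum of the 16 luminance values of one block row with its 4 least significant bits dropped, and (2) the bound is derived from the triangle inequality with a correction for that truncation. The paper's exact definitions may differ. The names `partial_sums_8bit` and `sad_lower_bound` are hypothetical.

```python
def partial_sums_8bit(block):
    """One 8-bit value per row of a 16 x 16 block.

    Assumption: the 12-bit sum of 16 luminance values (0..4080) is shifted
    right by 4 bits so that it fits in a byte (0..255).
    """
    return [sum(row) >> 4 for row in block]


def sad_lower_bound(ps_cur, ps_ref):
    """A value that can never exceed the true SAD of the two blocks.

    For one row with full sums A and B: sum|a_i - b_i| >= |A - B|, and
    |A - B| >= 16 * |A >> 4 - B >> 4| - 15 because each truncated sum
    loses at most 15. Negative row bounds are clamped to zero.
    """
    return sum(max(0, 16 * abs(a - b) - 15) for a, b in zip(ps_cur, ps_ref))
```

If this bound already reaches the best SAD found so far, the candidate cannot be a better match and its full SAD need not be computed, which is the pruning idea the schemes rely on.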
  37. Scheme One - Algorithm
      • Step 1) Initialization
        a) Compute all of the 8-bit partial sums of sixteen luminance values for the current frame and save them in a contiguous memory space.
        b) Retrieve all of the 8-bit partial sums of sixteen luminance values for the reference frame from a saved contiguous memory space.
  38. Scheme One - Algorithm (ctd)
      • Step 2) For every current block, execute the block motion-estimation process.
        – Step 2.1) Initialization
  39. Scheme One - Algorithm (ctd)
        – Step 2.2) Search
          • For each search location in a motion-estimation algorithm
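The loop body of Step 2.2 is not preserved in the transcript. The sketch below shows one plausible shape of the scheme applied to full search: partial sums are computed once per frame (Step 1), and inside the search a candidate is skipped whenever a lower bound on its SAD is already no better than the best SAD so far. The bound used here (row sums truncated to 8 bits, triangle inequality with a truncation correction) is an assumption, as are all function names.

```python
def sad(cur, ref, x, y, dx, dy, n):
    """SAD between the n x n block of `cur` at (x, y) and `ref` at (x+dx, y+dy)."""
    return sum(abs(cur[y + j][x + i] - ref[y + dy + j][x + dx + i])
               for j in range(n) for i in range(n))


def row_sums_8bit(frame, n):
    """Step 1: for every pixel position, the sum of the n pixels to its right
    (inclusive), shifted right by log2(n) bits so that it fits in 8 bits.
    n is assumed to be a power of two."""
    shift = n.bit_length() - 1
    return [[sum(row[x:x + n]) >> shift for x in range(len(row) - n + 1)]
            for row in frame]


def search_with_rejection(cur, ref, ps_cur, ps_ref, x, y, n=16, w=7):
    """Step 2: full search that skips candidates ruled out by the lower bound.

    Returns ((dx, dy), best_sad, number_of_skipped_candidates).
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_sad, skipped = None, float("inf"), 0
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            if not (0 <= x + dx <= width - n and 0 <= y + dy <= height - n):
                continue
            # Lower bound on the SAD from the truncated row sums.
            bound = sum(max(0, n * abs(ps_cur[y + j][x]
                                       - ps_ref[y + dy + j][x + dx]) - (n - 1))
                        for j in range(n))
            if bound >= best_sad:       # cannot beat the current best
                skipped += 1
                continue
            cost = sad(cur, ref, x, y, dx, dy, n)
            if cost < best_sad:
                best_mv, best_sad = (dx, dy), cost
    return best_mv, best_sad, skipped
```

Because a candidate is only skipped when its true SAD provably cannot be smaller than the current best, the motion vector returned is the same one a plain full search would return.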
  40. Scheme One - Flow Chart
  41. Multilevel 8-bit Partial Sums
      Figure: 16 x 16 block
  42. Multi-level Visualisation
  43. Multi-level Visualisation
  44. Multi-level Visualisation (ctd)
  45. Multi-level Visualisation (ctd)
  46. Multi-level Visualisation (ctd)
  47. Multi-level Visualisation (ctd)
  48. Multi-level Visualisation (ctd)
  49. Partial Sum Pyramid
      Figure labels: 8 x 16, 4 x 16, 2 x 16, 1 x 16; Level 1, Level 2, Level 3, Level 4
  50. Multilevel 8-bit Partial Sums - Upper Bound (UB)
  51. Scheme Two - Algorithm
      • Step 1) Initialization
        a) Compute all of the 8-bit partial sums of levels one and four for the current frame and save them in a contiguous memory space.
        b) Retrieve all of the 8-bit partial sums of levels one and four for the reference frame from a saved contiguous memory space.
  52. Scheme Two - Algorithm (ctd)
      • Step 2) For every current block, execute the block motion-estimation process.
        – Step 2.1) Initialization
  53. Scheme Two - Algorithm (ctd)
        – Step 2.2) Search
          • For each search location in a motion-estimation algorithm
  54. Scheme Two - Flow Chart
  55. Possible Conditions
      Conditions 1 to 4 (shown as formulas on the slide)
  56. Possible Combinations
  57. Results
      Table: average execution time (in milliseconds) per frame for various methods
  58. Possible Combinations
  59. SIMD
  60. Table: computational complexity and average number of CPU cycles per block using FSA
  61. Table: computational complexity and average number of CPU cycles per block using SEA
  62. Table: computational complexity and average number of CPU cycles per block using 3SSA
  63. Table: computational complexity and average number of CPU cycles per block using 4SSA
  64. Table: computational complexity and average number of CPU cycles per block using UDSA
  65. Table: computational complexity and average number of CPU cycles per block using HBSA
  66. Table: the percentage of speedup offered by the SIMD implementation for a motion estimation algorithm with Scheme 2 incorporated
  67. Conclusion
      • Introduced a new technique of 8-bit partial sums.
      • The partial sums were used to make best use of the SIMD architecture, thereby improving the speed of motion estimation algorithms.
      • Since these partial sums have only 8 bits, eight of them can be processed concurrently using a single 64-bit SIMD register.
  68. Conclusion
      • The notion of the 8-bit partial sums has then been extended to the four-level case, and it has been shown that there are 15 possible methods of utilizing these multilevel partial sums to accelerate the block motion-estimation algorithms without any loss of accuracy.
      • The full-search algorithm has then been used to determine which one of these 15 methods provides the lowest computational complexity, so that it can be chosen to accelerate various motion-estimation algorithms.
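One way to arrive at the figure of 15 is to count the non-empty subsets of the four pyramid levels: 2^4 - 1 = 15. Reading the "15 possible methods" this way is an assumption, since the slide does not spell out the enumeration; the function name `level_subsets` is hypothetical.

```python
from itertools import combinations


def level_subsets(levels=4):
    """All non-empty subsets of the level numbers 1..levels, smallest first."""
    numbers = range(1, levels + 1)
    return [subset
            for size in range(1, levels + 1)
            for subset in combinations(numbers, size)]


methods = level_subsets(4)
print(len(methods))   # 4 + 6 + 4 + 1 subsets
```

Under this reading, a scheme that uses levels one and four corresponds to the subset `(1, 4)` in the list.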
  69. Conclusion
      • Extensive simulations have been carried out to find the average number of CPU cycles needed per block for various algorithms incorporating the chosen method.
      • These simulations have shown that the proposed scheme is capable of providing a substantial speed-up for the various existing motion-estimation algorithms through the reduction of their computational complexities.
      • The simulation results also demonstrate that the implementation on an SIMD architecture can further accelerate the proposed scheme by more than 93%.