
# Block Matching Project

Block matching for motion estimation

### Block Matching Project

1. ECE 565 Computer Vision and Image Processing, Spring 2010. Computer Assignment #2: Block Matching Based Motion Prediction Method. By Dhawal Subodh Wazalwar (A20249257), ECE Dept, Illinois Institute of Technology. Due date: 30th April 2010.
2. Block Matching Based Motion Estimation

   1. Introduction

   The main aim of Block Matching (BM) motion estimation is to compare images taken at two different time instants and estimate the direction of the motion that has taken place between the two frames. The challenge is to obtain the best motion vector by using a pixel-domain search method and a proper choice of the BM parameters. Fig 1.1 demonstrates the basic concept: two blocks, the Reference block and the Target block, are shown, and the Target block is a translated version of the Reference block. The key question is how accurately the direction of motion can be estimated with appropriate motion vectors.

   Fig 1.1 Basic Understanding of Block Matching (Reference Block at (i, j); Target Block at (i+v1, j+v2); the motion vector (v1, v2) gives the direction and magnitude of the motion)

   Any BM motion estimation method needs the following parameters to be selected beforehand:

   • The Matching Criterion: Various metrics such as Sum of Absolute Differences (SAD), Mean Square Error (MSE) and Mean Absolute Difference (MAD) can be used to quantify how well the two blocks under comparison match. This implementation uses SAD, which is represented as

   $$\mathrm{SAD}(v_1, v_2) = \sum_{i=k}^{k+N-1} \sum_{j=l}^{l+N-1} \left| I_{t+1}(i, j) - I_t(i + v_1, j + v_2) \right|$$

   where I_{t+1}(i, j) is the Target Frame, I_t(i, j) is the Reference Frame, (k, l) is the block location (taken here as the top-left corner of the block), N is the block size and [v1; v2] is the motion vector for that block.

   • Search Strategy: The search strategy influences the quality of the motion field and also the overall computation time, as demonstrated in later parts. There are various search strategies, such as Full Exhaustive Search, 3-point Search, the Cross Search method and many variants of these. This implementation uses the full exhaustive search method.
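
The SAD criterion described above can be sketched in a few lines of Python. This is only an illustration, not the author's Matlab code: the function name `sad`, the use of nested lists for frames, the top-left-corner convention for (k, l) and the tiny test frames are all assumptions made for the example.

```python
def sad(target, reference, k, l, n, v1, v2):
    """Sum of absolute differences between the n-by-n target block whose
    top-left corner is (k, l) and the reference block displaced by (v1, v2)."""
    total = 0
    for i in range(k, k + n):
        for j in range(l, l + n):
            total += abs(target[i][j] - reference[i + v1][j + v2])
    return total


# Hypothetical frames: the target content sits one column to the left of
# where it appears in the reference, so the target block at (0, 0) matches
# the reference block at (0, 1).
reference = [
    [0, 10, 20, 30],
    [0, 40, 50, 60],
    [0, 70, 80, 90],
]
target = [
    [10, 20, 30, 0],
    [40, 50, 60, 0],
    [70, 80, 90, 0],
]

sad_zero = sad(target, reference, 0, 0, 2, 0, 0)   # candidate vector (0, 0)
sad_shift = sad(target, reference, 0, 0, 2, 0, 1)  # candidate vector (0, 1)
print(sad_zero, sad_shift)
```

The candidate (0, 1) gives a SAD of zero, so it would be chosen as the motion vector for this block.
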
3. • Search Window and Block Size: The block size (N) is the size of the blocks of the target and reference frames that are compared to estimate the motion vectors. The search window (M), on the other hand, is the range of displacements over which the search for a matching block is carried out.

   2. Algorithm

   2.1 Implementation Steps

   The flow chart in Fig 2.1 shows the important steps in the Full Exhaustive Search Block Matching motion estimation method.

   Fig 2.1 Flow Chart showing the Implementation steps for Full Exhaustive BM Method. The steps recoverable from the chart are:

   • Inputs: Reference Image and Target Image.
   • Initialize N, M, th.
   • Select the appropriate Target block It+1(i,j) of N×N.
   • Calculate SAD(0).
   • Calculate the SAD for the Reference block It(i+v1,j+v2), repeating until all motion vectors have been checked, and find the minimum SAD.
   • If SAD(v) > SAD(0) - th*N*N, set the motion vector to 0; otherwise store the motion vector.
   • Assign the block in the Predicted Image.
   • Determine the DFD and FD.
   • Display the Motion Field.
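
The steps of the flow chart can be sketched as follows. This is a minimal sketch under stated assumptions, not the author's implementation: frames are nested lists, candidates that would leave the frame are skipped (the source does not say how borders are handled), the search covers displacements from -M to +M in each direction, and "Total SAD" and "Total SAD(0)" are taken to be the summed absolute DFD and FD. All function names are hypothetical.

```python
def block_sad(target, reference, k, l, n, v1, v2):
    """SAD between the target block at (k, l) and the reference block at (k+v1, l+v2)."""
    return sum(
        abs(target[i][j] - reference[i + v1][j + v2])
        for i in range(k, k + n)
        for j in range(l, l + n)
    )


def full_search(target, reference, n, m, th):
    """Exhaustive search; returns (motion_field, predicted_image)."""
    rows, cols = len(target), len(target[0])
    field = {}
    predicted = [[0] * cols for _ in range(rows)]
    for k in range(0, rows - n + 1, n):
        for l in range(0, cols - n + 1, n):
            sad0 = block_sad(target, reference, k, l, n, 0, 0)
            best_sad, best_v = sad0, (0, 0)
            for v1 in range(-m, m + 1):
                for v2 in range(-m, m + 1):
                    # Assumption: ignore candidates that fall outside the frame.
                    if not (0 <= k + v1 <= rows - n and 0 <= l + v2 <= cols - n):
                        continue
                    s = block_sad(target, reference, k, l, n, v1, v2)
                    if s < best_sad:
                        best_sad, best_v = s, (v1, v2)
            # Threshold test: keep the vector only if it beats SAD(0) by th*N*N.
            if not best_sad < sad0 - th * n * n:
                best_v = (0, 0)
            field[(k // n, l // n)] = best_v
            for i in range(k, k + n):
                for j in range(l, l + n):
                    predicted[i][j] = reference[i + best_v[0]][j + best_v[1]]
    return field, predicted


def total_abs_difference(a, b):
    """Sum of absolute pixel differences between two frames."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical 4x8 frames: a 2x2 object at columns 4-5 in the reference
# appears at columns 2-3 in the target.
reference = [[0] * 8 for _ in range(4)]
target = [[0] * 8 for _ in range(4)]
for i in range(2):
    for j in range(2):
        reference[i][4 + j] = 100
        target[i][2 + j] = 100

field, predicted = full_search(target, reference, n=2, m=2, th=0)
total_sad = total_abs_difference(target, predicted)   # summed |DFD|
total_sad0 = total_abs_difference(target, reference)  # summed |FD|
suppressed, _ = full_search(target, reference, n=2, m=2, th=100)
```

With th=0 the block containing the object receives the vector (0, 2) and the predicted image equals the target. With th=100 the term th*N*N equals SAD(0) for that block, so the vector is forced to zero, which is the suppression effect the threshold is meant to provide.
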
5. Fig 3.2 Target and Predicted Image for Example 1 with N=16, M=16, th=0

   Fig 3.3 FD and DFD for Image 1 with N=16, M=16, th=0

   Finally, the most important result, the Motion Vector Field, is shown in Fig 3.4.

   Fig 3.4 Motion Vector Field showing the estimated motion direction for N=16, M=16, th=0

   This motion vector field clearly shows the motion due to the displacement of the face and shoulders, while the random vectors near the boundary belong to the background, which has a camera
6. effect and is not a part of the actual motion displacement.

   Note: The direction shown is for the motion from the Predicted (i.e. Target) Image to the Reference Image, since the algorithm tries to match the target block with the reference block translated by the appropriate motion vector.

   Comparing the reference and target images, the actual information lies approximately in the following regions:

   • Face: Rows 15:100, Columns 50:150
   • Shoulders: Rows 95:144, Columns 1:176

   So ideally, we should have motion vectors in the coordinate range shown in Fig 3.5.

   Fig 3.5 Range of Ideal Motion Vector Field

   By changing the threshold value, we can suppress the unwanted motion vectors caused by the random background motion, which we need not consider.

   Case 2: Block size N×N = 16×16, Search Window M=16, Threshold=0.5

   The results for Case 2 are shown in Fig 3.6 and Fig 3.7.

   Fig 3.6 Predicted and DFD for N=16, M=16, th=0.5
7. Fig 3.7 Motion Vector Field for Image 1 with N=16, M=16, th=0.5 (axes: Columns, Rows)

   Fig 3.7 gives a better motion field representation than Fig 3.4 for Case 1. Also, comparing the Reference and Target Images in Fig 3.1 and Fig 3.2, the maximum displacement in any direction was found to be no more than approximately 5-6 pixels. This was determined by using the 'pixval' command in Matlab to find the displacement at key positions in the image, such as the person's face, eyes and other visible features. Therefore, in the optimal motion field matrix, we expect the maximum absolute value to be no more than 5-6 in this case. In Case 2 with th=0.5, although the motion vector field looks almost as expected, the matrix values show a maximum absolute value of 8. So th=0.5 is not the optimal threshold.

   Note: One important observation can be made regarding the search range M. As mentioned above, the maximum displacement was found to be approximately 5-6 pixels. It is therefore interesting to see the motion field obtained when the search range is smaller than this value (say M=4), shown in Fig 3.8.

   Fig 3.8 Motion Vector Field for N=16, M=4, th=0.5
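
The note above considers a search range smaller than the true displacement. A toy sketch of that situation is given here; the frames, the helper names and the skipping of out-of-frame candidates are assumptions for illustration only. The true displacement is 3 columns, and the search is run once with M=2 and once with M=4.

```python
def block_sad(target, reference, k, l, n, v1, v2):
    """SAD between the target block at (k, l) and the reference block at (k+v1, l+v2)."""
    return sum(
        abs(target[i][j] - reference[i + v1][j + v2])
        for i in range(k, k + n)
        for j in range(l, l + n)
    )


def best_vector(target, reference, k, l, n, m):
    """Return (minimum SAD, vector) over all in-frame displacements within +/- m."""
    rows, cols = len(reference), len(reference[0])
    best = None
    for v1 in range(-m, m + 1):
        for v2 in range(-m, m + 1):
            if not (0 <= k + v1 <= rows - n and 0 <= l + v2 <= cols - n):
                continue
            s = block_sad(target, reference, k, l, n, v1, v2)
            if best is None or s < best[0]:
                best = (s, (v1, v2))
    return best


# Hypothetical frames: the textured 2x2 patch sits at columns 6-7 in the
# reference and at columns 3-4 in the target (a displacement of 3 columns).
reference = [
    [0, 0, 0, 0, 0, 0, 50, 90, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 70, 30, 0, 0, 0, 0],
]
target = [
    [0, 0, 0, 50, 90, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 70, 30, 0, 0, 0, 0, 0, 0, 0],
]

narrow = best_vector(target, reference, 0, 3, 2, 2)  # M smaller than the displacement
wide = best_vector(target, reference, 0, 3, 2, 4)    # M larger than the displacement
print(narrow, wide)
```

With M=2 the search can only return a partial match with a non-zero SAD and the wrong vector, whereas M=4 reaches the exact match at (0, 3).
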
8. Comparing the direction of the motion vectors in Fig 3.7 and Fig 3.8, we can see the difference: for M=4, we do not get the expected direction. Also, in this case Total SAD=86609, compared to 67616 for N=16, M=16, th=0.5. Therefore, we should always keep the search range M greater than the maximum absolute displacement between the Reference and Target Images to get an accurate estimate of the motion direction.

   Continuing to vary the threshold, the motion vector field was found to improve for threshold values in the range 1.5 to 2, with the best results at th=1.7, after which there was not much change in the motion field.

   Case 3: N=16, M=16, th=1.7

   The Predicted and DFD images and the corresponding Motion Vector Field are shown in Fig 3.9 and Fig 3.10.

   Fig 3.9 Predicted and DFD Image for N=16, M=16, th=1.7

   Fig 3.10 Example 1 Motion Vector Field for N=16, M=16, th=1.7 (close to desired results; axes: Columns, Rows)

   Here, the maximum absolute displacement shown by the motion vectors is 6. So, visually as well as theoretically, the optimum threshold for the Example 1 image is found to be 1.7.

   Next, we change the block size to N=8, keeping M=16 and th=1.7. Fig 3.11 and Fig 3.12 show the results.
9. Case 4: Image 1 with N=8, M=16, th=1.7

   Comparing the DFD in Fig 3.11 with that in Fig 3.9 for N=16, we seem to get better results for N=8, because only a very sketchy outline of the shoulder area remains compared with Fig 3.9. This is also reflected in the Total SAD value, which has reduced from 70343 to 60450, indicating a better Predicted Image. The reason is that the smaller the block of the Target Image used for comparison, the greater the chance of finding a match for it in the Reference Image; when we reduce the block size, we start capturing minor details that were missed for N=16.

   Fig 3.11 Example 1 Predicted and DFD Image with N=8, M=16, th=1.7

   Fig 3.12 Motion Field for N=8, M=16, th=1.7

   On the other hand, comparing the motion vector fields in Fig 3.12 for N=8 and Fig 3.10 for N=16, we actually see the reverse effect: we get finer details but miss a few border motion vectors for the same threshold value (here th=1.7). That means the same threshold value is not optimal in all cases, but only for a particular block size. The quality of the motion vectors depends on the condition

   SAD(V') < SAD(0) - th*N*N

   If all the terms were reduced by the same factor, the same threshold value would give optimal results. However, the Total SAD value reduces by a different factor (here from 70343 to 60450) than 'th*N*N', which becomes 1/4th, while the Total SAD(0) value remains the same. The terms therefore scale differently, and the result depends on the particular block situation. Therefore, we see a mismatch in the
10. Motion Vector Field results for N=8 and N=16 for the same threshold value. So, every time we change the block size, we should determine the optimal threshold value afresh for that block size.

    Another thing to take into consideration is the computational time. For example, for N=16, M=16 the number of iterations/computations was found to be 107812, while for N=8, M=16 the number was 431245. In terms of time, the latter analysis took almost 25 seconds more. This is because, with a smaller block size, motion vectors must be found for a larger number of blocks, which increases the computation time. Also, if we decrease the search window (say M=8), the computation time is reduced. Fig 3.13 and Fig 3.14 show the results for N=8, M=8, th=1.7.

    Fig 3.13 Predicted and DFD for N=8, M=8 and th=1.7

    Fig 3.14 Motion Field Vector for N=8, M=8, th=1.7

    Comparing the results for N=8, M=8 (Fig 3.13 and Fig 3.14) with those for N=8, M=16 (Fig 3.11 and Fig 3.12), we do not see much change, because the actual displacement between the Reference and Target frames is no more than approximately 6 pixels. So, for fixed N, increasing M beyond 6-7 should not make much difference. This is also demonstrated by the fact that the Total SAD value does not change much, as seen in the tabular summary of all cases applied to Image 1 in Tab 3.1.
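
Two pieces of arithmetic from this discussion can be checked with a short sketch. First, the threshold term th*N*N shrinks by exactly 4 when N goes from 16 to 8, while the reported Total SAD changes by a much smaller factor. Second, the reported iteration counts are consistent with one SAD evaluation per block for each of the (2M+1)^2 displacements from -M to +M. The 144×176 frame size is an assumption inferred from the row and column ranges quoted for Example Image 1, and the function names are hypothetical.

```python
def threshold_term(th, n):
    """The term th*N*N subtracted from SAD(0) in the threshold test."""
    return th * n * n


def search_positions(rows, cols, n, m):
    """Number of blocks times the (2M+1)^2 candidate displacements per block."""
    blocks = (rows // n) * (cols // n)
    return blocks * (2 * m + 1) ** 2


ROWS, COLS = 144, 176  # assumed frame size (not stated explicitly in the source)

# Threshold term versus Total SAD when N goes from 16 to 8 at th = 1.7.
term_ratio = threshold_term(1.7, 8) / threshold_term(1.7, 16)
total_sad_ratio = 60450 / 70343  # Total SAD values reported for N=8 and N=16

# Candidate positions for the three (N, M) combinations used on Image 1.
positions = {
    (16, 16): search_positions(ROWS, COLS, 16, 16),
    (8, 16): search_positions(ROWS, COLS, 8, 16),
    (8, 8): search_positions(ROWS, COLS, 8, 8),
}
print(term_ratio, round(total_sad_ratio, 3), positions)
```

Under these assumptions the counts come out as 107811, 431244 and 114444, each exactly one less than the reported 107812, 431245 and 114445. The constant offset of one might come from how the counter was initialised, but that is a guess.
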
11. Tab 3.1 Overall Summary of all cases analyzed

    | Block Size (N) | Search Range (M) | Threshold (th) | Number of Iterations | Time in secs | Total SAD | Total SAD(0) |
    |---|---|---|---|---|---|---|
    | 16 | 16 | 0 | 107812 | 10.13 | 65986 | 231549 |
    | 16 | 16 | 0.5 | 107812 | 9.43 | 67616 | 231549 |
    | 16 | 16 | 1.7 | 107812 | 16 | 70343 | 231549 |
    | 8 | 16 | 1.7 | 431245 | 30.98 | 60450 | 231549 |
    | 8 | 8 | 1.7 | 114445 | 4.92 | 60491 | 231549 |

    3.2 Analysis on Example Image 2

    A similar analysis was done for Example Image 2; the results for all the cases are shown in Fig 3.15 to Fig 3.24.

    Case 1: N=16, M=16, th=0

    Fig 3.15 Reference Example Image 2

    Fig 3.16 Target and Predicted Image for Example 2 with N=16, M=16, th=0
12. Fig 3.17 DFD and FD Image for Example 2 with N=16, M=16, th=0

    Fig 3.18 Motion Vector Field for Image 2 with N=16, M=16 and th=0 (axes: Columns, Rows)

    Comparing the Reference and Target Images, we can see that some regions become occluded because of the motion, for example the surrounding region behind the vehicle. We cannot estimate motion for such areas, even though some motion vectors appear in that region. This is called the "Occlusion" problem and is one of the limitations of the basic Block Matching method.

    Another limitation is the "Aperture Problem", in which blocks with the same intensity values occur repeatedly in successive regions, so no motion is estimated for them, since mathematically there is no detectable motion. However, if such a region is a sub-block of a bigger moving block, it should have motion vectors, since there is motion. A similar, though not identical, case occurs at the vehicle's door region, at approximately Block(10,5) in the motion vector field. In Fig 3.18 we still see some motion vectors of small magnitude there, because the intensity is not actually the same throughout that region; it only has a similar texture. This is supported by the fact that the region at Block(10,5) is hardly visible even in the Frame Difference (FD).

    For N=16, M=16 and th=0, we obtain almost the desired motion vector field, indicating motion at all places, which is correct if we compare the Reference and Target Images. This is unlike Example Image 1, where we were concerned with estimating motion only in a particular region.
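
The uniform-intensity situation described above can be illustrated with a toy sketch. The frame, the helper name and the chosen block position are assumptions for the example. Inside a perfectly flat region every candidate displacement gives the same SAD, so no candidate is strictly better than SAD(0) and block matching reports no motion, whatever the region is actually doing.

```python
def block_sad(target, reference, k, l, n, v1, v2):
    """SAD between the target block at (k, l) and the reference block at (k+v1, l+v2)."""
    return sum(
        abs(target[i][j] - reference[i + v1][j + v2])
        for i in range(k, k + n)
        for j in range(l, l + n)
    )


# Hypothetical flat region: every pixel has the same intensity in both frames.
flat = [[128] * 8 for _ in range(8)]

# Evaluate all displacements within +/- 2 for the 2x2 block at (3, 3).
sads = {
    (v1, v2): block_sad(flat, flat, 3, 3, 2, v1, v2)
    for v1 in range(-2, 3)
    for v2 in range(-2, 3)
}
distinct_values = set(sads.values())
print(len(sads), distinct_values)
```

All 25 candidates tie, so the SAD surface carries no information about the direction of motion for this block.
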
13. Hence, here th=0 is the best possible threshold value. If we change the threshold, we expect slightly degraded results, since some required motion vectors will not satisfy the threshold condition.

    Case 2: Image 2 with N=16, M=16, th=1

    Fig 3.19 and Fig 3.20 show the results for Case 2.

    Fig 3.19 Predicted and DFD Image for N=16, M=16, th=1

    Fig 3.20 Motion Vector Field for N=16, M=16, th=1

    Comparing the DFD for th=1 in Fig 3.19 with that for th=0 in Fig 3.17, we cannot see much difference visually. However, the Total SAD value has increased from 692509 to 694021. Also, comparing the motion vector fields in Fig 3.20 and Fig 3.18, we see, as expected, that some required motion vectors are missing. This supports the point made earlier that th=0 is the optimal threshold value.

    Case 3: N=8, M=16, th=0

    Fig 3.21 and Fig 3.22 show the Predicted and DFD Images and the Motion Vector Field for N=8, M=16, th=0.
14. Fig 3.21 Predicted and DFD Image for N=8, M=16, th=0

    Comparing the DFD image for N=8 in Fig 3.21 with that for N=16 in Fig 3.17 (both with th=0), we can clearly see that for N=8 the bright spot on the vehicle is missing, which indicates a more accurate Predicted Image. This is because, for the smaller 8×8 block size, the bright spot could not be part of a matching block in the Target Frame, so we do not see it in the Predicted Image. Thus, with a smaller block size, we start capturing finer details of the Target Image.

    Comparing the motion vector fields in Fig 3.22 and Fig 3.18, we can see that for the same threshold value, th=0, we get the optimal results. So, in this case, the quality of the motion field is actually improved for N=8, M=16 and th=0. This is unlike the Example 1 image, where the situation was the reverse and we had to track motion in only part of the whole image.

    Finally, the results for N=8, M=8, th=0 are shown in Fig 3.23 and Fig 3.24, and all the cases analyzed are tabulated in Table 3.2.

    Fig 3.22 Motion Vector Field for N=8, M=16, th=0

    Fig 3.23 Predicted and DFD Image with N=8, M=8, th=0
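
The observation that a smaller block size lowers the Total SAD can be reproduced on a toy example. This is a sketch under assumptions: the frames are invented, the helper names are hypothetical, out-of-frame candidates are skipped, and th=0 is assumed so that each block simply takes its minimum SAD. The upper and lower halves of the frame move in opposite directions, so a single vector per 4×4 block cannot match both, while 2×2 blocks can follow each half separately.

```python
def block_sad(target, reference, k, l, n, v1, v2):
    """SAD between the target block at (k, l) and the reference block at (k+v1, l+v2)."""
    return sum(
        abs(target[i][j] - reference[i + v1][j + v2])
        for i in range(k, k + n)
        for j in range(l, l + n)
    )


def total_sad(target, reference, n, m):
    """Sum over all blocks of the smallest SAD found in an exhaustive +/- m search."""
    rows, cols = len(target), len(target[0])
    total = 0
    for k in range(0, rows - n + 1, n):
        for l in range(0, cols - n + 1, n):
            total += min(
                block_sad(target, reference, k, l, n, v1, v2)
                for v1 in range(-m, m + 1)
                for v2 in range(-m, m + 1)
                if 0 <= k + v1 <= rows - n and 0 <= l + v2 <= cols - n
            )
    return total


# Hypothetical frames: the upper pattern moves one column left and the
# lower pattern moves one column right between reference and target.
reference = [
    [0, 0, 0, 50, 90, 0, 0, 0],
    [0, 0, 0, 50, 90, 0, 0, 0],
    [0, 0, 0, 70, 30, 0, 0, 0],
    [0, 0, 0, 70, 30, 0, 0, 0],
]
target = [
    [0, 0, 50, 90, 0, 0, 0, 0],
    [0, 0, 50, 90, 0, 0, 0, 0],
    [0, 0, 0, 0, 70, 30, 0, 0],
    [0, 0, 0, 0, 70, 30, 0, 0],
]

sad_n4 = total_sad(target, reference, n=4, m=2)
sad_n2 = total_sad(target, reference, n=2, m=2)
print(sad_n4, sad_n2)
```

The larger blocks leave a residual of 480, while the smaller blocks predict the target exactly, at the cost of four times as many blocks to search.
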
15. Fig 3.24 Motion Vector Field for N=8, M=8, th=0 (axes: Columns, Rows)

    Tab 3.2 Summary for all cases analyzed for Example Image 2

    | Block Size (N) | Search Range (M) | Threshold (th) | Number of Iterations | Time in secs | Total SAD | Total SAD(0) |
    |---|---|---|---|---|---|---|
    | 16 | 16 | 0 | 375706 | 35.98 | 692509 | 1714664 |
    | 16 | 16 | 0.5 | 375706 | 35.57 | 692812 | 1714664 |
    | 16 | 16 | 1 | 375706 | 36.39 | 694021 | 1714664 |
    | 8 | 16 | 0 | 1470151 | 114.9 | 619854 | 1714664 |
    | 8 | 8 | 0 | 390151 | 17.11 | 623930 | 1714664 |

    3.3 Changing the Search Method

    From the analysis shown in Tab 3.1 and Tab 3.2, we can see that the Full Exhaustive Search method gives very good motion estimation. The main reason is that we search over the entire search window. On the flip side, this method takes a lot of computational effort. So there are various methods, as mentioned in the introduction, for obtaining results faster that are close to those of the Full Exhaustive method. As regards the quality of the motion vector field, with such methods we may miss some object motion vectors, since we scan only a few specific points rather than every location in the search range M, as is done in the Full Exhaustive Search method. Thus, the quality of the motion vector field depends on the accuracy of the search method.

    For example, in the same implementation, suppose we put the threshold condition (SAD(v') < SAD(0) - th*N*N) inside the loop over the motion vector search range and, whenever the condition is not satisfied, break out of the loop, assume there is no motion and assign '0' to the motion vector. We then get the motion vector field for N=16, M=16, th=0 shown in Fig 3.25. Many required motion vectors are missing, but we still get a rough estimate of the motion. The number of iterations/computations performed reduces to 46547, compared to 107812 for the Full Search method. If we use methods like 3-point Search or Cross Search, we can get better accuracy in the motion vector field.
16. Fig 3.25 Motion Vector Field for Modified Search Method with N=16, M=16, th=0

    4. Conclusion

    Based on the summaries in Table 3.1 for Example Image 1 and Table 3.2 for Example Image 2, the following conclusions can be drawn:

    • Effect of changing the BM parameters on motion estimation accuracy
      ◦ Block Size (N): The smaller the block size, the finer the details of the Target Image captured in the Predicted Image, i.e. the lower the Total SAD value. However, the computational effort increases.
      ◦ Search Window (M): Theoretically, the greater the value of M, the better the quality of the Predicted Image, the lower the Total SAD value and the more accurate the motion vector field. Practically, we need to ensure that M is greater than the maximum absolute displacement between the Reference and Target Images in any direction. Also, the higher the value, the greater the computational effort.
      ◦ Threshold (th): The optimal value of 'th' depends on the images. If we need to estimate motion for the entire image, the optimal value is independent of N and is equal to zero. However, if we have to consider only a particular region of the image for motion estimation, we need to find the optimal threshold value every time we change N and M. In general, increasing 'th' results in an increased Total SAD value.
    • The Full Exhaustive Search method gives a very accurate motion vector field, but requires more computational effort. Using other search methods, such as the 3-point search method, can help obtain results faster.
    • The Block Matching based motion estimation method cannot estimate motion for a region that is occluded between the frames or that suffers from the aperture problem.

    5. References

    [1] A. Murat Tekalp, 'Digital Video Processing', Chapter 6: Block Based Methods
    [2] Professor Jovan Brankov, 'ECE 565 Computer Vision and Image Processing' class notes