Computer Vision
Chap. 6: Motion Representation
SUB CODE: 3171614
SEMESTER: 7TH IT
PREPARED BY:
PROF. KHUSHALI B. KATHIRIYA
Outline
• The motion field of rigid objects
• Motion parallax
• Optical flow
• The image brightness constancy equation
• Affine flow
• Differential techniques
• Feature-based techniques
• Regularization and robust estimation
Prepared by: Prof. Khushali B Kathiriya
Motion Field and Optical Flow
What is Motion Representation?
• Motion analysis was motivated by the need to track objects and by advances in image-processing hardware.
• Analyzing human motion is a challenging task with
a wide variety of applications in computer vision
and in graphics. One such application, of particular
importance in computer animation, is the
retargeting of motion from one performer to
another. While humans move in three dimensions,
the vast majority of human motions are captured
using video, requiring 2D-to-3D pose and camera
recovery, before existing retargeting approaches
may be applied.
The Motion Field
• In computer vision, the motion field is an ideal representation of 3D motion as it is projected onto a camera image. Given a simplified camera model, each point (y₁, y₂) in the image is the projection of some point in the 3D scene, but the position of the projection of a fixed point in space can vary with time.
• The motion field can formally be defined as the time derivative of the image position of all image
points given that they correspond to fixed 3D points. This means that the motion field can be
represented as a function which maps image coordinates to a 2-dimensional vector. The motion
field is an ideal description of the projected 3D motion in the sense that it can be formally defined, but in practice it is normally only possible to determine an approximation of it from the image data.
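To make the definition concrete, here is a minimal NumPy sketch; the pinhole model, focal length, point position, and velocity are illustrative assumptions, not from the slides. It approximates the motion-field vector at one image point as the time derivative of the projected position of a fixed 3D point.

```python
import numpy as np

def project(P, f=1.0):
    """Pinhole projection of a 3D point P = (X, Y, Z) to image coordinates."""
    X, Y, Z = P
    return np.array([f * X / Z, f * Y / Z])

# A 3D point moving with constant velocity V (toward the camera here).
P0 = np.array([1.0, 0.5, 4.0])   # position at time t = 0
V = np.array([0.2, 0.0, -0.5])   # 3D velocity

# Motion-field vector = time derivative of the image position,
# approximated by a finite difference over a small time step dt.
dt = 1e-6
u = (project(P0 + V * dt) - project(P0)) / dt
print(u)  # 2D motion-field vector at this image point
```

Differentiating the projection analytically gives du/dt = f(ẊZ − XŻ)/Z², which the finite difference above approximates.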
Optical Flow
• Motion of brightness patterns in the image
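The brightness constancy assumption listed in the outline leads to the constraint Ix·u + Iy·v + It = 0. A minimal Lucas-Kanade-style sketch in NumPy (the synthetic image pair and the single-patch setup are illustrative assumptions): it solves the constraint in the least-squares sense over all pixels of a patch.

```python
import numpy as np

def lucas_kanade(I1, I2):
    """Estimate one flow vector (u, v) for a whole patch by solving the
    brightness constancy constraint Ix*u + Iy*v + It = 0 in the
    least-squares sense over all pixels."""
    Ix = np.gradient(I1, axis=1)   # spatial derivative, x direction
    Iy = np.gradient(I1, axis=0)   # spatial derivative, y direction
    It = I2 - I1                   # temporal derivative
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Synthetic test: a smooth pattern shifted one pixel to the right.
x = np.arange(32, dtype=float)
I1 = np.sin(0.3 * x)[None, :] * np.ones((32, 1))
I2 = np.sin(0.3 * (x - 1.0))[None, :] * np.ones((32, 1))
u, v = lucas_kanade(I1, I2)
print(u, v)  # u should be close to 1, v close to 0
```

One equation per pixel but two unknowns: this is the aperture problem, which is why the constraint is pooled over a patch before solving.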
When Optical Flow ≠ Motion Field?
Motion Illusion
Affine Flow
An affine (or first-order) optic flow model has 6 parameters, describing image translation, dilation, rotation, and shear. The class affine_flow provides methods to estimate these parameters for two frames of an image sequence (as seen in Chapter 1).
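The slides reference an affine_flow class without naming its library, so as a stand-in, here is a small least-squares fit of the same 6-parameter model, u = a1 + a2·x + a3·y and v = a4 + a5·x + a6·y (the function name, point sampling, and synthetic dilation field are assumptions for illustration):

```python
import numpy as np

def fit_affine_flow(pts, flow):
    """Fit the 6-parameter affine flow model
        u = a1 + a2*x + a3*y
        v = a4 + a5*x + a6*y
    to observed flow vectors at the given image points (least squares)."""
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([np.ones_like(x), x, y])
    a_u, *_ = np.linalg.lstsq(A, flow[:, 0], rcond=None)
    a_v, *_ = np.linalg.lstsq(A, flow[:, 1], rcond=None)
    return np.concatenate([a_u, a_v])

# Synthetic flow: pure dilation about the origin (u = 0.1*x, v = 0.1*y).
rng = np.random.default_rng(0)
pts = rng.uniform(-10, 10, size=(50, 2))
flow = 0.1 * pts
params = fit_affine_flow(pts, flow)
print(params)  # translation terms ~ 0, dilation terms ~ 0.1
```

Because the synthetic flow is exactly affine, the recovered parameters match the generating model.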
Motion Parallax
Motion Parallax
• Motion parallax refers to the fact that objects moving at a constant speed across the
frame will appear to move a greater amount if they are closer to an observer (or
camera) than they would if they were at a greater distance.
• This effect occurs whether it is the object itself that is moving or the observer/camera that is moving relative to the object. It arises because, at a greater distance, the same physical displacement sweeps across a smaller fraction of the camera's field of view.
• Ref. video: https://youtu.be/ANQtiQqfEtA
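A one-line way to see the depth dependence (the focal length and the specific speeds and depths are illustrative assumptions): under a pinhole model, a point translating parallel to the image plane with speed V at depth Z has image speed f·V/Z, so closer objects sweep across the image faster.

```python
# Motion parallax: image speed scales as 1/Z for a fixed 3D speed V,
# assuming a pinhole camera with focal length f.
def image_speed(V, Z, f=1.0):
    return f * V / Z

near = image_speed(V=1.0, Z=2.0)    # object 2 units away
far = image_speed(V=1.0, Z=10.0)    # same speed, 10 units away
print(near, far)  # the near object moves 5x faster in the image
```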
Feature-based Techniques
Feature-based Techniques
• The method of finding image displacements which is easiest to understand is the feature-
based approach. This finds features (for example, image edges, corners, and other structures
well localized in two dimensions) and tracks these as they move from frame to frame. This
involves two stages. Firstly, the features are found in two or more consecutive images.
• The act of feature extraction, if done well, both reduces the amount of information to be processed (and so the workload) and goes some way towards a higher-level understanding of the scene, by its very nature eliminating the unimportant parts.
Secondly, these features are matched between the frames. In the simplest and commonest
case, two frames are used and two sets of features are matched to give a single set of
motion vectors.
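The second (matching) stage can be sketched as a tiny sum-of-squared-differences patch matcher. The synthetic square, patch size, and search radius are assumptions for illustration; a real system would first run a corner detector and match many features, not one.

```python
import numpy as np

def match_feature(img1, p, img2, radius=3, patch=2):
    """Match the patch around point p in img1 against candidate positions
    within `radius` pixels in img2; return the displacement (dy, dx)
    that minimizes the sum of squared differences (SSD)."""
    y, x = p
    ref = img1[y - patch:y + patch + 1, x - patch:x + patch + 1]
    best, best_d = np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cand = img2[y + dy - patch:y + dy + patch + 1,
                        x + dx - patch:x + dx + patch + 1]
            ssd = np.sum((ref - cand) ** 2)
            if ssd < best:
                best, best_d = ssd, (dy, dx)
    return best_d

# Synthetic pair: a bright square shifted down-right by (2, 1) pixels.
img1 = np.zeros((30, 30)); img1[10:15, 10:15] = 1.0
img2 = np.zeros((30, 30)); img2[12:17, 11:16] = 1.0
p = (10, 10)  # the top-left corner of the square in frame 1
print(match_feature(img1, p, img2))  # → (2, 1)
```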
Feature-based Techniques
• Successive video frames may contain the same objects (still or moving). Motion estimation
examines the movement of objects in an image sequence to try to obtain vectors
representing the estimated motion. Motion compensation uses the knowledge of object
motion so obtained to achieve data compression. In interframe coding, motion estimation
and compensation have become powerful techniques to eliminate the temporal redundancy
due to high correlation between consecutive frames.
• In real video scenes, motion can be a complex combination of translation and rotation. Such
motion is difficult to estimate and may require large amounts of processing. However,
translational motion is easily estimated and has been used successfully for motion
compensated coding.
Feature-based Techniques
• Most of the motion estimation algorithms make the following assumptions:
1. Objects move in translation in a plane that is parallel to the camera plane, i.e., the effects of camera zoom and object rotation are not considered.
2. Illumination is spatially and temporally uniform.
3. Occlusion of one object by another, and uncovered background are neglected.
Feature-based Techniques
• There are two mainstream techniques of motion estimation:
1. pel-recursive algorithm (PRA)
2. block-matching algorithm (BMA).
• PRAs iteratively refine the motion estimate for individual pels (pixels) using gradient methods. BMAs assume that all the pels within a block have the same motion activity. BMAs estimate motion on the basis of rectangular blocks and produce one motion vector for each block. PRAs involve more computational complexity and less regularity, so they are difficult to realize in hardware. In general, BMAs are more suitable for a simple hardware realization because of their regularity and simplicity.
[Figure: block-matching motion estimation between a current-frame block and a search area in the reference frame]
Feature-based Techniques
• Figure illustrates the process of a block-matching algorithm. In a typical BMA, each frame is divided into blocks, each of which consists of luminance and chrominance blocks. Usually, for coding efficiency, motion estimation is performed only on the luminance block. Each luminance block in the present frame is matched against candidate blocks in a search area on the reference frame. These candidate blocks are just displaced versions of the original block.
• The best (lowest-distortion, i.e., best-matched) candidate block is found and its displacement (motion vector) is recorded. In a typical interframe coder, the prediction formed from the reference frame is subtracted from the input frame. Consequently, the motion vector and the resulting prediction error can be transmitted instead of the original luminance block; thus interframe redundancy is removed and data compression is achieved. At the receiver end, the decoder builds the frame-difference signal from the received data and adds it to the reconstructed reference frame. The summation gives an exact replica of the current frame. The better the prediction, the smaller the error signal, and hence the lower the transmission bit rate.
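A full-search BMA with a sum-of-absolute-differences (SAD) criterion can be sketched as follows; the block size, search range, and synthetic frames are illustrative assumptions.

```python
import numpy as np

def block_match(ref, cur, block=8, search=4):
    """Full-search block matching: for each block of the current frame,
    find the displacement into the reference frame that minimizes the
    sum of absolute differences (SAD); return one motion vector per block."""
    H, W = cur.shape
    vectors = {}
    for by in range(0, H - block + 1, block):
        for bx in range(0, W - block + 1, block):
            blk = cur[by:by + block, bx:bx + block]
            best, best_mv = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if y < 0 or x < 0 or y + block > H or x + block > W:
                        continue  # candidate block falls outside the frame
                    sad = np.abs(blk - ref[y:y + block, x:x + block]).sum()
                    if sad < best:
                        best, best_mv = sad, (dy, dx)
            vectors[(by, bx)] = best_mv
    return vectors

# Synthetic frames: the reference pattern shifted down 2 and right 3 pixels.
rng = np.random.default_rng(1)
ref = rng.random((16, 16))
cur = np.roll(ref, shift=(2, 3), axis=(0, 1))
mv = block_match(ref, cur, block=8, search=4)
print(mv[(8, 8)])  # → (-2, -3): this block's content came from 2 up, 3 left
```

Each motion vector points from a current-frame block back to its best match in the reference frame, which is exactly what an interframe coder transmits along with the prediction error.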
