Use of 3D vision
• Shape from X
• Shape from X is a generic name for techniques that aim
to extract shape from intensity images and other cues
such as focus.
• Some of these methods estimate local surface orientation
(e.g., surface normal) rather than absolute depth.
• Shape from motion
• 3D vision tasks
1 Marr’s theory
2 Other vision paradigms: Active and purposive vision
• Basics of projective geometry
1 Points and hyperplanes in projective space
2 Homography
3 Estimating homography from point correspondences
• Scene reconstruction from multiple views
1 Triangulation
2 Projective reconstruction
3 Matching constraints
4 Bundle adjustment
5 Upgrading the projective reconstruction, self-calibration
• Shape from X
1 Shape from motion
2 Shape from texture
3 Other shape from X techniques
• Full 3D objects
1 3D objects, models, and related issues
2 Line labeling
3 Volumetric representation, direct measurements
4 Volumetric modeling strategies
5 Surface modeling strategies
6 Registering surface patches and their fusion to get a full
3D model
• 2D view-based representations of a 3D scene
1 Viewing space
2 Multi-view representations and aspect graphs
• 3D reconstruction from an unorganized set of 2D
views, and Structure from Motion
There are many serious reasons why 3D vision using intensity
images as input is regarded as difficult.
1. The imaging system of a camera and the human eye
performs perspective projection, which leads to considerable
loss of information.
2. The relationship between image intensity and the 3D
geometry of the corresponding scene point is very
complicated.
3. Mutual occlusion of objects in the scene, and even
self-occlusion of one object, further complicates the vision task.
4. Noise in images, and the high time complexity of many
algorithms, contribute further to the problem, although this is
not specific to 3D vision.
• Marr [Marr, 1982] defines 3D vision as ‘From an image (or
a series of images) of a scene, derive an accurate three-
dimensional geometric description of the scene and
quantitatively determine the properties of the object in the
scene’.
Marr’s theory
• Marr proposed that a computer vision system was just an example of an
information processing device that could be understood at three levels:
1. Computational theory. The theory describes what the device is
supposed to do: what information it provides from other information
provided as input. It should also describe the logic of the strategy that
performs this task.
2. Representation and algorithm. These address precisely how the
computation may be carried out: in particular, information representations
and algorithms to manipulate them.
3. Implementation. The physical realization of the algorithm: specifically,
programs and hardware.
• Having derived some such description, it is then
necessary to remove the dependence on the vantage
point and to transform the description into an object-
centered one.
• The requirement, then, is to move from pixels to surface
delineation, then to surface characteristic description
(orientation), then to a full 3D description. These
transformations are effected by moving from the 2D image
to a primal sketch, then to a 2.5D sketch, and thence to a
full 3D representation.
The primal sketch
• The primal sketch aims to capture, in as general a way
as possible, the significant intensity changes in an image.
Hitherto, such changes have been referred to as ‘edges’,
but Marr makes the observation that this word implies a
physical meaning that cannot be inferred at this stage.
The 2.5D sketch
• The 2.5D sketch reconstructs the relative distances from
the viewer of surfaces detected in the scene, and may be
called a depth map.
The 3D representation
• At this stage the Marr paradigm overlaps with top-down,
model-based approaches. It is required to take the
evidence derived so far and identify objects within it. This
can only be achieved with some knowledge about what
‘objects’ are, and, consequently, some means of describing
them. The important point is that this is a transition to an
object-centered coordinate system, allowing object
descriptions to be viewer independent.
The Marr paradigm advocates a set of relatively independent
modules; the low-level modules aim to recover a meaningful
description of the input intensity image, the middle-level
modules use different cues such as intensity changes,
contours, texture, motion to recover shape or location in space.
The Marr paradigm is a nice theoretic framework, but
unfortunately does not lead to successful vision applications
performing, e.g., recognition and navigation tasks.
It was shown later that most low-level and middle-level tasks
are ill-posed, with no unique solution.
One popular way developed in the eighties to make the task
well-posed is regularization. A constraint requiring continuity
and smoothness of the solution is often added.
Other vision paradigms: Active and purposive vision
• When consistent geometric information has to be explicitly modeled (as
for manipulation of the object), an object-centered co-ordinate system
seems to be appropriate.
• Two schools are trying to explain the vision mechanism:
 The first and older one tries to use explicit metric information in the
early stages of the visual task (lines, curvatures, normals, etc.).
 Geometry is typically extracted in a bottom-up fashion without any
information about the purpose of this representation.
 The output is a geometric model.
• The second and younger school does not extract metric (geometric)
information from visual data until needed for a specific task.
• A database or collection of intrinsic images (or views) is the model.
• Many traditional computer vision systems and theories capture data
with cameras with fixed characteristics while active perception and
purposive vision may be appropriate.
• Active vision system ... characteristics of the data acquisition are
dynamically controlled by the scene interpretation.
• Many visual tasks tend to be simpler if the observer is active and
controls its visual sensors.
• The controlled eye (or camera) movement is an example.
• If there is not enough data to interpret the scene, the camera can look at
it from another viewpoint.
• Active vision is an intelligent data acquisition controlled by the
measured, partially interpreted scene parameters and their errors from
the scene.
• The active approach can make most ill-posed vision
tasks tractable.
• There is no established theory that provides a mathematical
(computational) model explaining the understanding aspects of
human vision.
• Two recent developments towards new vision theory are:
• Qualitative vision
 that looks for a qualitative description of objects or scenes.
 The motivation is not to represent geometry that is not needed
for qualitative (non-geometric) tasks or decisions.
 Qualitative information is more invariant to various unwanted
transformations (e.g. slightly differing viewpoints) or noise than
quantitative ones.
 Qualitativeness (or invariance) enables interpretation of observed
events at several levels of complexity
• Purposive paradigm
 The key question is to identify the goal of the task, the
motivation being to ease the task by making explicit just that
piece of information that is needed.
 Collision avoidance for autonomous vehicle navigation is an
example where precise shape description is not needed.
 The approach may be heterogeneous and a qualitative
answer may be sufficient in some cases.
 The paradigm does not yet have a solid theoretical basis, but
the study of biological vision is a rich source of inspiration
55:148 Digital Image Processing
Chapter 11
3D Vision, Geometry
Topics:
Basics of projective geometry
Points and hyperplanes in projective space
Homography
Estimating homography from point correspondence
Basics of projective geometry
Single or multiple view geometry deals with mathematics of relation between
• 3D geometric features (points, lines, corners) in the scene
• their camera projections
• relations among multiple camera projections of a 3D scene
Points and hyperplanes in projective space
Scene: $(d+1)$-dimensional space excluding the origin, i.e., $\mathbb{R}^{d+1} \setminus \{\mathbf{0}\}$
Why is the origin excluded?
Origin ≈ pinhole ≈ optical center
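An equivalence relation “≅” is defined as follows:
$$[x_1, \ldots, x_{d+1}]^\top \cong [x_1', \ldots, x_{d+1}']^\top \quad \text{iff} \quad \exists\, \alpha \neq 0 \ \text{s.t.} \ [x_1, \ldots, x_{d+1}]^\top = \alpha\, [x_1', \ldots, x_{d+1}']^\top$$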
The area developed from photogrammetry, which measures 3D distances from
photographs.
The mathematical vehicle for multiple view geometry is projective geometry.
We need to study perspective projection (also called central projection),
which describes image formation by a pinhole camera or a thin lens.
Projective space: $\mathcal{P}^d$ is the quotient space of this equivalence relation. It can be
imagined as the set of all lines in $\mathbb{R}^{d+1}$ passing through the origin.
Perspective projection of parallel lines
Homogeneous points
Each equivalence class of the relation “≅” generates an open line from the origin.
Note that the origin is not included in any of these lines, and thus the disjointness
property of equivalence classes is satisfied.
For each line or equivalence class, exactly one point is projected in the acquired
image: the point where the projective hyperplane intersects the line.
These points in the projective space are referred to as homogeneous points.
What is the property of homogeneous points?
Homogeneous points are coplanar, lying on the projection plane.
For simplicity, let us assume that our projection plane is 𝒛 = 𝟏
Homogeneous points
Note that homogeneous points form the image hyperplane.
Thus, to determine the perspective projection of a scene point, we need to
determine the corresponding homogeneous point
$$[x_1, \ldots, x_{d+1}]^\top \ \xrightarrow{\;P\;} \ [x_1', \ldots, x_{d+1}' = 1]^\top,$$
where $x_i = \alpha x_i'$ and $\alpha$ is a constant.
Note that the points $[x_1, \ldots, x_d, 0]^\top$ do not have a Euclidean counterpart.
• Consider the limiting case $[x_1, \ldots, x_d, \alpha]^\top$, which is projectively equivalent to
$[x_1/\alpha, \ldots, x_d/\alpha, 1]^\top$, and assume that $\alpha \to 0$.
• This corresponds to a point on the projective hyperplane $\mathcal{P}^d$ going to infinity
in the direction of the radius vector $[x_1, \ldots, x_d, 0]^\top$.
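The equivalence relation and the projection onto the plane with last coordinate 1 can be sketched in a few lines. This is a minimal illustration, not part of the original slides; the function names `project_to_plane` and `equivalent` and the tolerance values are assumptions chosen for the example.

```python
def project_to_plane(x, eps=1e-12):
    """Return the representative of x's equivalence class whose last
    coordinate is 1 (the homogeneous point on the projection plane).

    Returns None when the last coordinate is (numerically) zero: such a
    point has no Euclidean counterpart (a point at infinity).
    """
    if abs(x[-1]) < eps:
        return None
    return [xi / x[-1] for xi in x]


def equivalent(x, y, eps=1e-9):
    """True if x = alpha * y for some alpha != 0 (x and y must be non-zero)."""
    # Two vectors are parallel iff every 2x2 minor of the matrix [x; y] vanishes.
    n = len(x)
    return all(abs(x[i] * y[j] - x[j] * y[i]) < eps
               for i in range(n) for j in range(i + 1, n))


p = [2.0, 4.0, 2.0]
q = [-1.0, -2.0, -1.0]                       # q = -0.5 * p, same equivalence class
print(project_to_plane(p))                   # [1.0, 2.0, 1.0]
print(equivalent(p, q))                      # True
print(project_to_plane([1.0, 1.0, 0.0]))     # None
```

Dividing by the last coordinate is exactly the choice of α in the relation above; the `None` case corresponds to the points with last coordinate 0.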
Properties of projection
A line in the scene space through (but
not including) the origin is mapped
onto a point in the projective plane.
A plane in the scene space through
(but not including) the origin is
mapped to a line on the projection
plane.
Homography
Homography ≈ Collineation ≈ Projective
transformation
is a mapping from one projection plane to
another projection plane for the same
$(d+1)$-dimensional scene and the common
origin:
$$\mathcal{P}^d \xrightarrow{\;H\;} \mathcal{P}^d.$$
Also expressed as
$$\mathbf{u}' \cong H\mathbf{u},$$
where $H$ is a $(d+1) \times (d+1)$ matrix.
Property:
Any three collinear points in $\mathcal{P}^d$ remain
collinear after the mapping to $\mathcal{P}^d$.
Prove!
Satisfies the cross ratio property (see the
figure)
Matrix formulation for Homography
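$$\alpha \begin{bmatrix} u' \\ v' \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$$
The scale factor $\alpha \neq 0$ and $\det H \neq 0$; otherwise everything is mapped onto a
single point.
Eliminating the scale factor $\alpha$, we get
$$u' = \frac{h_{11}u + h_{12}v + h_{13}}{h_{31}u + h_{32}v + h_{33}} \quad \text{and} \quad v' = \frac{h_{21}u + h_{22}v + h_{23}}{h_{31}u + h_{32}v + h_{33}}$$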
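The elimination of the scale factor can be checked numerically. The sketch below applies a 3×3 homography to image points and tests that collinear points stay collinear; the matrix `H` is a hypothetical example chosen only to have a non-zero determinant, and the helper names are assumptions.

```python
def apply_homography(H, u, v):
    """Map (u, v) through the 3x3 matrix H (nested lists) and divide out alpha."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]    # w plays the role of alpha
    if w == 0:
        raise ValueError("point maps to infinity (alpha = 0)")
    return x / w, y / w


def collinear(p, q, r, eps=1e-9):
    """True if three 2D points lie on one line (zero signed triangle area)."""
    area = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(area) < eps


# Hypothetical homography with a non-trivial last row (det H != 0).
H = [[1.0, 0.2, 3.0],
     [0.1, 1.5, -2.0],
     [0.001, 0.002, 1.0]]

line_pts = [(0.0, 0.0), (1.0, 2.0), (3.0, 6.0)]     # collinear: v = 2u
mapped = [apply_homography(H, u, v) for u, v in line_pts]
print(collinear(*mapped))                           # True
```

The check illustrates the collinearity property for one example only; it is not a proof.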
Various linear transformations
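Subgroups of homographies
Any homography can be uniquely decomposed as
$$H = H_P H_A H_S$$
where
$$H_P = \begin{bmatrix} I & \mathbf{0} \\ \mathbf{a}^\top & b \end{bmatrix}, \quad H_A = \begin{bmatrix} K & \mathbf{0} \\ \mathbf{0}^\top & 1 \end{bmatrix}, \quad H_S = \begin{bmatrix} R & -R\mathbf{t} \\ \mathbf{0}^\top & 1 \end{bmatrix}$$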
Estimating homography from point correspondence
Given a set of ordered pairs of points $\{(\mathbf{u}_i, \mathbf{u}_i')\}_{i=1}^{m}$,
solve the homogeneous system of linear equations
$$\alpha_i \mathbf{u}_i' = H\mathbf{u}_i, \quad i = 1, \ldots, m$$
for $H$ and $\alpha_i$.
Number of equations: $m(d+1)$
Number of unknowns: $m + (d+1)^2 - 1$
Degenerate configuration: $H$ may not be uniquely determined even if $m \geq d+2$;
this is caused when $d$ or more points are coplanar.
Correspondences of more than the minimum number of points lead to the notion of
optimal fitting, which reduces the effect of noise.
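For the planar case d = 2, four correspondences give 4 · 3 = 12 equations in 4 + 9 − 1 = 12 unknowns, so four points in general position suffice. A common way to solve the system, sketched below, is to eliminate each α_i and take the least-squares null vector with an SVD (often called the direct linear transformation). This is one possible method, not necessarily the one intended on the slide; it assumes NumPy, the function name `estimate_homography` is hypothetical, and coordinate normalisation, which is usually advisable with noisy data, is omitted.

```python
import numpy as np


def estimate_homography(src, dst):
    """Least-squares homography from m >= 4 correspondences (d = 2).

    src, dst: arrays of shape (m, 2) holding (u, v) and (u', v').
    Returns a 3x3 matrix H, defined up to scale (unit Frobenius norm).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (u, v), (up, vp) in zip(src, dst):
        # Two linear equations per pair, obtained by eliminating alpha_i.
        rows.append([-u, -v, -1, 0, 0, 0, up * u, up * v, up])
        rows.append([0, 0, 0, -u, -v, -1, vp * u, vp * v, vp])
    A = np.array(rows)
    # The right singular vector of the smallest singular value minimises
    # ||A h|| subject to ||h|| = 1.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 3)


# Synthetic check with a hypothetical ground-truth homography.
H_true = np.array([[1.0, 0.2, 3.0],
                   [0.1, 1.5, -2.0],
                   [0.001, 0.002, 1.0]])
src = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0.5, 0.3]], dtype=float)
hom = np.c_[src, np.ones(len(src))] @ H_true.T
dst = hom[:, :2] / hom[:, 2:]

H_est = estimate_homography(src, dst)
H_est = H_est / H_est[2, 2]          # fix the free scale for comparison
print(np.allclose(H_est, H_true, atol=1e-6))
```

With noise-free data the estimate matches the ground truth up to scale; with noisy data this linear solution is typically used only as a starting point for a non-linear refinement.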
Maximum likelihood estimation
$[u_i, v_i]^\top$ and $[u_i', v_i']^\top$, $i = 1, \ldots, m$, are identified corresponding points in two different
projection planes.
Principle: Find the homography (i.e., the transformation matrix $H$) that
maximizes the likelihood of mapping the points $[u_i, v_i]^\top$ on the first plane to
$[u_i', v_i']^\top$ on the second plane.
Model:
Ideal points are in the vicinity of the identified points, i.e., there is noise in the
process of locating the points $[u_i, v_i]^\top$ and $[u_i', v_i']^\top$.
Method to solve the problem
• Determine the ML function using Gaussian model
• It contains several multiplicative terms
• Take log → multiplications are converted to addition
• Remove the minus sign (see the Gaussian expression)
• Maximization is converted to a minimization term
Final expression for maximum likelihood estimation
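$$\min_{\mathbf{h},\, u_i,\, v_i} \sum_{i=1}^{m} \left[ (u_i - \hat{u}_i)^2 + (v_i - \hat{v}_i)^2 + \left( \frac{h_{11}u_i + h_{12}v_i + h_{13}}{h_{31}u_i + h_{32}v_i + h_{33}} - \hat{u}_i' \right)^2 + \left( \frac{h_{21}u_i + h_{22}v_i + h_{23}}{h_{31}u_i + h_{32}v_i + h_{33}} - \hat{v}_i' \right)^2 \right]$$
Here the hatted symbols denote the identified (measured) points, and the unhatted $u_i, v_i$ are the unknown ideal points over which the minimization runs.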
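The cost being minimized can be written directly as a function of the homography and the unknown ideal points. The sketch below only evaluates this cost; it does not perform the minimization, which in practice needs a non-linear least-squares solver. The names `transfer` and `ml_cost` and the argument layout are assumptions made for the example.

```python
def transfer(h, u, v):
    """Map the ideal point (u, v) to the second plane with the 3x3 matrix h."""
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    return ((h[0][0] * u + h[0][1] * v + h[0][2]) / w,
            (h[1][0] * u + h[1][1] * v + h[1][2]) / w)


def ml_cost(h, ideal, measured, measured_prime):
    """Sum of squared distances between ideal and measured points in both planes.

    ideal:          unknown ideal points (u_i, v_i) in the first plane
    measured:       points identified in the first plane
    measured_prime: points identified in the second plane
    """
    total = 0.0
    for (u, v), (um, vm), (upm, vpm) in zip(ideal, measured, measured_prime):
        up, vp = transfer(h, u, v)
        total += (u - um) ** 2 + (v - vm) ** 2      # error in the first plane
        total += (up - upm) ** 2 + (vp - vpm) ** 2  # error in the second plane
    return total


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(ml_cost(identity, [(1.0, 2.0)], [(1.5, 2.0)], [(1.0, 2.0)]))   # 0.25
```

Under the Gaussian noise model, minimizing this sum is equivalent to maximizing the likelihood, which is why the logarithm and sign change in the steps above lead to a least-squares problem.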
Scene reconstruction from multiple views
• Triangulation
Projective reconstruction
Matching constraints
• Matching constraints are relations satisfied by collections
of corresponding image points in n views. They have the
property that a multilinear function of homogeneous
image coordinates must vanish; the coefficients of these
functions form multiview tensors.
Bundle adjustment
• The non-linear least squares specialized for this task is
known from photogrammetry as bundle adjustment.
Upgrading the projective reconstruction, self-
calibration
• There are several kinds of additional knowledge,
permitting the projective ambiguity to be refined to an
affine, similarity, or Euclidean one. Methods that use
additional knowledge to compute a similarity
reconstruction instead of mere projective one are also
known as self-calibration because this is in fact
equivalent to finding intrinsic camera parameters
• Self-calibration methods can be divided into two groups:
constraints on the cameras and constraints on the
scene.
Shape from X
• Shape from X is a generic name for techniques that aim
to extract shape from intensity images and other cues
such as focus.
• Some of these methods estimate local surface orientation
(e.g., surface normal) rather than absolute depth.
• Shape may be extracted from motion, optical flow,
texture, focus/de-focus, vergence, and contour.
• Each of these techniques may be used to derive a 2.5D
sketch for Marr’s vision theory; they are also of practical
use on their own.
Shape from motion
• Motion is a primary property exploited by human
observers of the 3D world.
• The real world we see is dynamic in many respects, and
the relative movement of objects in view, their translation
and rotation relative to the observer, the motion of the
observer relative to other static and moving objects all
provide very strong clues to shape and depth.
• Extracting 3D information from moving scenes can be done as a
two-phase process:
1. Finding correspondences or calculating the nature of
the flow is a lower-level phase that operates on pixel arrays.
2. The shape extraction phase follows as a separate,
higher-level process. This phase is examined here.
Rigidity, and the structure from motion theorem
• Ullman’s success in this area was based on the psycho-physical observation that the human
visual system seems to assume that objects are rigid.
• This rigidity constraint prompted the proof of an elegant structure from motion theorem
saying that three orthographic projections of four non-co-planar points have a unique 3D
interpretation as belonging to one rigid body.
• First note that the body’s motion may be decomposed into translational and rotational
movement; the former gives the movement of a fixed point with respect to the observer, and
the latter relative rotation of the body (for example, about the chosen fixed point).
• Ullman’s result is the best possible in the sense that unique reconstruction of a rigid
body cannot be guaranteed with fewer than three projections of four points, or with
three projections of fewer than four points. It should also be remembered that the
result refers to orthographic projection, whereas in general image projections are
perspective.
Shape from optical flow
Full 3D objects
• Volumetric modeling strategies include constructive solid geometry, super-quadrics
and generalized cylinders.
• Surface modeling strategies include boundary representations, triangulated surfaces, and
quadric patches.
• Line labeling is an outmoded but accessible technique for reconstructing objects with planar
faces.
• Transitions to 3D objects need a co-ordinate system that is object centered.
• 3D objects may be measured mechanically by computed tomography, by range finders or by
shape from motion techniques.
3D model-based vision
• To create a full 3D model from a set of range images, the
surfaces must first be registered: rotations and translations
should be found that match one surface to another.
• Model-based vision uses a priori knowledge about an
object to ease its recognition.
• Techniques exist to locate curved objects from range
images.
2D view-based representations of a 3D scene
• 2D view-based representations of 3D scenes may be
achieved with multi-view representations.
• It is possible to select a few stored reference images, and
render any view from them.
• Interpolation of views is not enough and view extrapolation is
needed. This requires knowledge of geometry, and the view-
based approach does not differ significantly from 3D
geometry reconstruction.
• It is possible to perform a 3D reconstruction from an
unorganized set of 2D views. This approach has been used
widely recently by, e.g., Google StreetView.
Reconstructing scene geometry
• Large scale scene features such as plane parameters
may be recaptured from properties of known objects such
as straight lines and approximate size.
• Well known geometric results identify vanishing points
and ground orientation.
• Similar approaches may well work even if large scale
clues are unavailable.
Shape from optical flow
• In a continuous sequence, we are therefore interested in
the apparent movement of each pixel (x, y) which is given
by the optical flow field (dx/dt, dy/dt).
Determining shape from optical flow is mathematically non-
trivial, and here an early simplification of the subject is
presented as an illustration [Clocksin, 1980]. The simpli-
fication is in two parts:
• Motion is due to the observer travelling in a straight line
through a static landscape. Without loss of generality, suppose
the motion is in the direction of the z axis of a viewer-centered
co-ordinate system (i.e., the observer is positioned at the origin).
• Rather than being projected onto a 2D plane, the image is seen
on the surface of a unit sphere, centered at the observer (a
‘spherical retina’). Points in 3D are represented in spherical polar
rather than Cartesian co-ordinates—spherical polar co-ordinates
(r, θ, ϕ) (see Figure 12.1) are related to (x, y, z) by the equations
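Under one common convention, assumed here rather than taken from Figure 12.1 (θ the azimuth in the x-y plane, ϕ the angle measured from the z axis, which is the direction of travel), these relations are r² = x² + y² + z², y = x tan θ, and z = r cos ϕ. A minimal sketch of the conversion in both directions, with hypothetical function names:

```python
import math


def cartesian_to_spherical(x, y, z):
    """Return (r, theta, phi) with theta the azimuth in the x-y plane and
    phi the angle from the z axis (assumed convention)."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("the origin (observer position) has no direction")
    theta = math.atan2(y, x)                       # satisfies y = x tan(theta)
    phi = math.acos(max(-1.0, min(1.0, z / r)))    # satisfies z = r cos(phi)
    return r, theta, phi


def spherical_to_cartesian(r, theta, phi):
    """Inverse of cartesian_to_spherical under the same convention."""
    return (r * math.sin(phi) * math.cos(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(phi))


r, theta, phi = cartesian_to_spherical(1.0, 2.0, 3.0)
print(spherical_to_cartesian(r, theta, phi))   # approximately (1.0, 2.0, 3.0)
```

On the unit-sphere retina only (θ, ϕ) are observed; r is the depth that shape from optical flow tries to recover.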
Shape from texture
• The angle at which the surface is seen would cause a
(perspective) distortion of the texture primitive (texel), and
the relative size of the primitives would vary according to
distance from the observer.
• Considering a textured surface patterned with identical
texels which have been recovered by lower-level
processing, note that with respect to a viewer it has three
properties at any point projected onto a retinal image:
distance from the observer; slant, the angle at which the
surface is sloping away from the viewer (the angle between
the surface normal and the line of sight); and tilt, the
direction in which the slant takes place.
Attempts to re-capture some of this information are based on
the texture gradient, that is, the direction of maximum rate
of change of the perceived size of the texels, and a scalar
measurement of this rate.
• Texture is usually used as an additional or complementary
feature, augmenting another, stronger clue in shape
extraction.
Other shape from X techniques
• Shape from focus/de-focus techniques are based on the
fact that lenses have finite depth of field, and only objects at
the correct distance are in focus; others are blurred in
proportion to their distance.
• Two main approaches can be distinguished:
• Shape from focus measures depth in one location in an
active manner; this technique is used in 3D measuring
machines in mechanical engineering. The object to be
measured is fixed on a motorized table that moves along x,
y, z axes.
• Shape from de-focus typically estimates depth using two
input images captured at different depths. The relative
depth of the whole scene can be reconstructed from
image blur. The image is modeled as a convolution of the
image with a proper point spread function; the function is
either known from capturing setup parameters or
estimated.
• Shape from vergence uses two cameras fixed on a
common rod. Using two servo-mechanisms, the
cameras can change the direction of their optical axes
(verge) in the plane containing a line segment joining their
optical centers. Such devices are called stereo heads.
• Shape from contour aims to describe a 3D shape from
contours seen from one or more view directions. Objects
with smooth bounding surfaces are quite difficult to
analyze.
• The set of all points on the object surface where the surface
normal is perpendicular to the observer’s visual ray is
called a rim.
Assuming orthographic projection, the rim points generate a
silhouette of an object in the image. Silhouettes can be
easily and reliably captured if back-light illumination is used,
although there is a possible complication in the special case in
which two distinct rim points project to a single image point.
• The inherent difficulty in shape from contour comes from the
loss of information in projecting 3D to 2D.
Humans are surprisingly successful at perceiving clear 3D shapes from
contours, and it seems that tremendous background knowledge is used to
assist. Understanding this human ability is one of the major challenges for
computer vision.
Full 3D objects
3D objects, models, and related issues:
The notion of a 3D object allows us to consider a 3D
volume as a part of the entire 3D world.
This volume has a particular interpretation (semantics,
purpose) for the task in hand.
We have treated geometric and radiometric techniques that
provide intermediate 3D cues, and it was implicitly assumed
that such cues help to understand the nature of a 3D
object.
Shape is another informal concept that humans typically
connect with a 3D object.
• Computer vision aims at scientific methods for 3D object
description, but there are no mathematical tools yet
available to express shape in its general sense.
• Curvilinear surfaces with no restriction on surface shape
are called free-form surfaces.
• Roughly speaking, the 3D vision task distinguishes two
classes of approach:
1. Reconstruction of the 3D object model or representation
from real-world measurements with the aim of estimating a
continuous function representing the surface.
2. Recognition of an instance of a 3D object in the scene. It
is assumed that object classes are known in advance, and
that they are represented by a suitable 3D model.
Humans often meet and recognize deformable objects
that change their shape.
• Computer vision as well as computer graphics use 3D
models to encapsulate the shape of a 3D object.
• 3D models serve in computer graphics to generate detailed
surface descriptions used to render realistic 2D images.
• In computer vision, the model is used either for
reconstruction (copying, displaying an object from a different
viewpoint, modifying an object slightly during animation) or for
recognition purposes, where features are used that
distinguish objects from different classes.
• There are two main classes of models: volumetric and
surface.
• Volumetric models represent the ‘inside’ of a 3D object
explicitly, while surface models use only object surfaces,
as most vision-based measuring techniques can only see
the surface of a non-transparent solid.
• 3D models make a transition towards an object-centered
co-ordinate system, allowing object descriptions to be
viewer independent. This is the most difficult phase
within Marr’s paradigm.
• 3D models of objects are common in other areas besides
computer vision, notably computer-aided design (CAD)
and computer graphics, where image synthesis is
required, that is, an exact (2D) pictorial representation of
some modeled 3D object.
• Various representation schemes exist, with different
properties. A representation is called complete if two
different objects cannot correspond to the same model, so
a particular model is unambiguous.
• A representation is called unique if an object cannot
correspond to two different models.
• Most 3D representation methods sacrifice either the
completeness or the uniqueness property.
• Commercial CAD systems frequently sacrifice uniqueness.
Line labeling
• blocks world approach.
• Line labeling is an outmoded but accessible technique for
reconstructing objects with planar faces.
Independently, other researchers built on these ideas to develop what is now a very well known
line labeling algorithm.
• Line labeling is able to detect ‘impossible’ objects.
Volumetric representation, direct measurements
• An object is placed in some reference co-ordinate system
and its volume is subdivided into small volume elements
called voxels—it is usual for these to be cubes.
• The most straightforward representation of voxel-based
volumetric models is the 3D occupancy grid, which is
implemented as a 3D Boolean array
• The object is fixed to a measuring machine, and an
absolute co-ordinate system is attached to it. Points on the
object surface are touched by a measuring needle which
provides 3D co-ordinates;
Another 3D measurement technique, computed tomography, looks
inside the object and thus yields more detailed information than the
binary occupancy grid.
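The 3D occupancy grid mentioned above maps naturally onto a Boolean array. The sketch below voxelizes a sphere as an example; it assumes NumPy, and the function name, grid size, and radius are arbitrary choices for illustration.

```python
import numpy as np


def sphere_occupancy(n, radius):
    """Return an n x n x n Boolean grid that is True where the voxel centre
    lies inside a sphere of the given radius (in voxel units) centred in
    the grid."""
    c = (n - 1) / 2.0                      # grid centre in index coordinates
    i, j, k = np.indices((n, n, n))
    return (i - c) ** 2 + (j - c) ** 2 + (k - c) ** 2 <= radius ** 2


grid = sphere_occupancy(32, 10.0)
print(grid.dtype, grid.shape)              # bool (32, 32, 32)
# The number of occupied voxels approximates the sphere volume 4/3 * pi * r^3.
print(int(grid.sum()), 4.0 / 3.0 * np.pi * 10.0 ** 3)
```

Each voxel stores one bit of information, which is why a tomographic volume, holding a measured value per voxel, is more detailed than the binary grid.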
Volumetric modeling strategies
• Constructive Solid Geometry
• The principal idea of Constructive Solid Geometry (CSG), which
has found some success, is to construct 3D bodies from a
selection of solid primitives.
• A CSG model is stored as a tree, with leaf nodes representing the
primitive solids and edges enforcing precedence among the
set-theoretical operations.
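A CSG tree of this kind can be sketched with two node types: leaves that hold a primitive solid and inner nodes that hold a set operation. The class names, the choice of primitives (sphere and axis-aligned box), and the point-membership query are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Point = Tuple[float, float, float]


@dataclass
class Primitive:
    """Leaf node: a solid described by a point-membership test."""
    contains: Callable[[Point], bool]


@dataclass
class Node:
    """Inner node: a set operation applied to two subtrees."""
    op: str            # "union", "intersection" or "difference"
    left: object
    right: object


def inside(node, p):
    """Evaluate the tree bottom-up for a single query point p."""
    if isinstance(node, Primitive):
        return node.contains(p)
    a, b = inside(node.left, p), inside(node.right, p)
    if node.op == "union":
        return a or b
    if node.op == "intersection":
        return a and b
    if node.op == "difference":
        return a and not b
    raise ValueError(f"unknown operation: {node.op}")


def sphere(centre, r):
    return Primitive(lambda p: sum((pi - ci) ** 2 for pi, ci in zip(p, centre)) <= r * r)


def box(lo, hi):
    return Primitive(lambda p: all(l <= pi <= h for pi, l, h in zip(p, lo, hi)))


# A unit cube with a spherical bite taken out of one corner.
model = Node("difference", box((0, 0, 0), (1, 1, 1)), sphere((1, 1, 1), 0.5))
print(inside(model, (0.2, 0.2, 0.2)))   # True
print(inside(model, (0.9, 0.9, 0.9)))   # False
```

The tree structure fixes the order in which the operations are applied, which matters because set difference is not commutative.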
• Super-quadrics
• Super-quadrics are geometric bodies that can be understood as a
generalization of basic quadric solids, introduced in computer
graphics [Barr, 1981].
• Super-ellipsoids are instances of super-quadrics used in computer
vision.
• where a1, a2, and a3 define the super-quadric size in the x,
y, and z directions, respectively. εvert is the squareness
parameter in the latitude plane and εhori is the squareness
parameter in the longitude plane.
• The squareness values used in respective planes are 0 (i.e.,
square) ≤ ε ≤ 2 (i.e., deltoid), as only those are convex
bodies. If squareness parameters are greater than 2, the
body changes to a cross-like shape.
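The role of the size and squareness parameters can be illustrated with the implicit (inside-outside) function commonly used for super-ellipsoids, F = ((x/a1)^(2/εhori) + (y/a2)^(2/εhori))^(εhori/εvert) + (z/a3)^(2/εvert), with F < 1 inside and F = 1 on the surface. This form is an assumption for the sketch, and the exact equation intended on the slide may differ; the function name is hypothetical.

```python
def superellipsoid_f(p, a, eps_vert, eps_hori):
    """Inside-outside function: < 1 inside, 1 on the surface, > 1 outside.

    p: point (x, y, z); a: sizes (a1, a2, a3).
    Squareness values must be > 0 here; the value 0 is only a limiting case.
    """
    if eps_vert <= 0 or eps_hori <= 0:
        raise ValueError("squareness parameters must be positive")
    x, y, z = (abs(pi / ai) for pi, ai in zip(p, a))
    xy = (x ** (2.0 / eps_hori) + y ** (2.0 / eps_hori)) ** (eps_hori / eps_vert)
    return xy + z ** (2.0 / eps_vert)


a = (1.0, 2.0, 3.0)
# With both squareness values equal to 1 the body is an ordinary ellipsoid.
print(superellipsoid_f((1.0, 0.0, 0.0), a, 1.0, 1.0))   # 1.0

corner = (0.9, 0.9, 0.9)
unit = (1.0, 1.0, 1.0)
# Small squareness gives a box-like body that contains the near-corner point;
# the unit sphere (squareness 1) does not contain it.
print(superellipsoid_f(corner, unit, 0.1, 0.1) < 1.0)   # True
print(superellipsoid_f(corner, unit, 1.0, 1.0) < 1.0)   # False
```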
Generalized cylinders
• Generalized cylinders, or generalized cones, are often also called
sweep representations.
• A cone is defined by a circle whose radius changes linearly with
distance traveled, moving along a straight line.
• These generalized cones turn out to be very good at
representing some classes of solid body.
• The advantage of symmetrical volumetric primitives, such
as generalized cylinders and super-quadrics, is their ability
to capture common symmetries and represent certain
shapes with few parameters.
• An influential early vision system called ACRONYM used
generalized cones as its modeling scheme.
• There is a modification of the sweep representation called
a skeleton representation, which stores only the spines of
the objects.
Surface modeling strategies
• A solid object can be represented by surfaces bounding it;
such a description can vary from simple triangular patches
to visually appealing structures such as non-uniform
rational B-splines (NURBS) popular in geometric modeling.
Computer vision solves two main problems with surfaces:
1. reconstruction creates surface description from sparse
depth measurements that are typically corrupted by outliers;
2. segmentation aims to classify surface or surface patches
into surface types.
• Boundary representations (B-reps) can be viewed
conceptually as a triple:
• A set of surfaces of the object.
• A set of space curves representing intersections between
the surfaces.
• A graph describing the surface connectivity.
B-reps are an appealing and intuitively natural way of
representing 3D bodies in that they consist of an explicit list
of the bodies’ faces.
In the simplest case, ‘faces’ are taken to be planar, so
bodies are always polyhedral, and we are dealing the
whole time with piecewise planar surfaces.
• Triangulation of irregular data points (e.g., a 3D point
cloud obtained from a range scanner) is an example of an
interpolation method.
• The best-known technique is called Delaunay
triangulation, which can be defined in two, three, or more
space dimensions.
Registering surface patches and their fusion
to get a full 3D model
• A range image represents distance measurements from
an observer to an object; it yields a partial 3D description
of the surface from one view only.
• Several range images are needed to capture the whole
surface of an object.
• Range image registration finds a rigid geometric
transformation between two range images of the same
object captured from two different viewpoints.
• The method automates the construction of a 3D model of a
3D free-form object from a set of range images as follows.
1. The object is placed on a turntable and a set of range
images from different viewpoints is measured by a
structured-light (laser-plane) range finder.
2. A triangulated surface is constructed over the range
images.
3. Large data sets are reduced by decimation of triangular
meshes in each view.
4. Surfaces are registered into a common object-centered
co-ordinate system and outliers in measurements are
removed.
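The core of the registration step, finding the rigid transformation between two views, has a closed-form least-squares solution when point correspondences are already known. The sketch below shows that SVD-based solution (often called the Kabsch or Procrustes method); it assumes NumPy and known correspondences, which real range image registration must usually estimate as well, for example by iterating closest-point matching. The function name is hypothetical.

```python
import numpy as np


def rigid_transform(P, Q):
    """Rotation R and translation t minimising sum ||R p_i + t - q_i||^2.

    P, Q: (n, 3) arrays of corresponding 3D points, n >= 3, not all collinear.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                  # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # -1 would indicate a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t


# Synthetic check: rotate by 30 degrees about z and translate.
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -1.0, 2.0])
P = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=float)
Q = P @ R_true.T + t_true

R, t = rigid_transform(P, Q)
print(np.allclose(R, R_true), np.allclose(t, t_true))
```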
• A 4-connected mesh cannot represent all objects; e.g., a
sphere cannot be covered by a four-sided polygon.
• By splitting each polygon by an edge, a triangulation of
the surface, which is able to represent any surface, is
easily obtained.
• A polygon may be split two ways; it is preferable to
choose the shortest edge because this results in triangles
with larger inner angles.
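The splitting rule for a four-sided mesh polygon can be written in a few lines. This is a minimal sketch with a hypothetical function name; vertices are assumed to be given in order around the polygon, and `math.dist` requires Python 3.8 or later.

```python
import math


def split_quad(a, b, c, d):
    """Split the quadrilateral a-b-c-d (vertices in order) into two
    triangles along its shorter diagonal."""
    if math.dist(a, c) <= math.dist(b, d):
        return [(a, b, c), (a, c, d)]      # cut along diagonal a-c
    return [(a, b, d), (b, c, d)]          # cut along diagonal b-d


# A sheared quadrilateral: diagonal b-d is shorter than a-c.
a, b, c, d = (0, 0, 0), (2, 0, 0), (3, 1, 0), (1, 1, 0)
print(split_quad(a, b, c, d))
# [((0, 0, 0), (2, 0, 0), (1, 1, 0)), ((2, 0, 0), (3, 1, 0), (1, 1, 0))]
```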
2D view-based representations of a 3D scene
• Viewing space
• The trouble is that there is potentially an infinite number of
possible viewpoints that induce an infinite number of
object appearances.
• To cope with the huge number of viewpoints and
appearances it is necessary to sample a viewpoint space
and group together similar neighboring views.
• A simplified model is a viewing sphere model that is
often used in the orthographic projection case
Multi-view representations and aspect graphs
• Other representation methods attempt to combine all the
viewpoint-specific models into a single data structure.
One of them is the characteristic view technique in which
all possible 2D projections of the convex polyhedral object
are grouped into a finite number of topologically
equivalent classes.
• A similar approach is based on aspect, which is defined as
the topological structure of singularities in a single view of
an object; aspect has useful invariance properties.
• Most small changes in vantage point will not affect aspect,
and such vantage points (that is, most) are referred to as
stable.
3D reconstruction from an unorganized set of 2D
views, and Structure from Motion

3d vision.pptxvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv

  • 1.
  • 2. Use of 3D vision • Shape from X • Shape from X is a generic name for techniques that aim to extract shape from intensity images and other cues such as focus. • Some of these methods estimate local surface orientation (e.g., surface normal) rather than absolute depth. • Shape from motion
  • 3. • 3D vision tasks 1 Marr’s theory 2 Other vision paradigms: Active and purposive vision • Basics of projective geometry 1 Points and hyperplanes in projective space 2 Homography 3 Estimating homography from point correspondences
  • 4. • Scene reconstruction from multiple views 1 Triangulation 2 Projective reconstruction 3 Matching constraints 4 Bundle adjustment 5 Upgrading the projective reconstruction, self-calibration
  • 5. • Shape from X 1 Shape from motion 2 Shape from texture 3 Other shape from X techniques
  • 6. • Full 3D objects 1 3D objects, models, and related issues 2 Line labeling 3 Volumetric representation, direct measurements 4 Volumetric modeling strategies 5 Surface modeling strategies 6 Registering surface patches and their fusion to get a full 3D model
  • 7. • 2D view-based representations of a 3D scene 1 Viewing space 2 Multi-view representations and aspect graphs • 3D reconstruction from an unorganized set of 2D views, and Structure from Motion
  • 8. There are many serious reasons why 3D vision using intensity images as input is regarded as difficult. 1. The imaging system of a camera and the human eye performs perspective projection, which leads to considerable loss of information. 2. The relationship between image intensity and the 3D geometry of the corresponding scene point is very complicated. 3. Mutual occlusion of objects in the scene, and even self-occlusion of one object, further complicates the vision task. 4. Noise in images, and the high time complexity of many algorithms, contribute further to the problem, although this is not specific to 3D vision.
  • 9. • Marr [Marr, 1982] defines 3D vision as ‘From an image (or a series of images) of a scene, derive an accurate three-dimensional geometric description of the scene and quantitatively determine the properties of the object in the scene’.
  • 10. Marr’s theory • Marr proposed that a computer vision system was just an example of an information processing device that could be understood at three levels: 1. Computational theory. The theory describes what the device is supposed to do: what information it provides from other information provided as input. It should also describe the logic of the strategy that performs this task. 2. Representation and algorithm. These address precisely how the computation may be carried out; in particular, information representations and algorithms to manipulate them. 3. Implementation. The physical realization of the algorithm; specifically, programs and hardware.
  • 11.
  • 12. • Having derived some such description, it is then necessary to remove the dependence on the vantage point and to transform the description into an object-centered one.
  • 13. • The requirement, then, is to move from pixels to surface delineation, then to surface characteristic description (orientation), then to a full 3D description. These transformations are effected by moving from the 2D image to a primal sketch, then to a 2.5D sketch, and thence to a full 3D representation.
  • 14. The primal sketch • The primal sketch aims to capture, in as general a way as possible, the significant intensity changes in an image. Hitherto, such changes have been referred to as ‘edges’, but Marr makes the observation that this word implies a physical meaning that cannot be inferred at this stage.
  • 15. The 2.5D sketch • The 2.5D sketch reconstructs the relative distances from the viewer of surfaces detected in the scene, and may be called a depth map.
  • 16. The 3D representation • At this stage the Marr paradigm overlaps with top-down, model-based approaches. It is required to take the evidence derived so far and identify objects within it. This can only be achieved with some knowledge about what ‘objects’ are, and, consequently, some means of describing them. The important point is that this is a transition to an object-centered coordinate system, allowing object descriptions to be viewer independent.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27. The Marr paradigm advocates a set of relatively independent modules; the low-level modules aim to recover a meaningful description of the input intensity image, the middle-level modules use different cues such as intensity changes, contours, texture, motion to recover shape or location in space. The Marr paradigm is a nice theoretic framework, but unfortunately does not lead to successful vision applications performing, e.g., recognition and navigation tasks. It was shown later that most low-level and middle-level tasks are ill-posed, with no unique solution. One popular way developed in the eighties to make the task well-posed is regularization. A constraint requiring continuity and smoothness of the solution is often added.
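The role of a smoothness constraint can be illustrated on a toy 1D problem. This is a minimal sketch, assuming a quadratic (Tikhonov-style) regularizer: it finds the signal f that minimizes sum((f[i] - data[i])^2) + lam * sum((f[i+1] - f[i])^2). The function names, the weight `lam`, and the tridiagonal solver are illustrative choices, not taken from the slides.

```python
def smooth_regularized(data, lam):
    """Minimize data misfit plus lam times squared first differences.
    Assumes len(data) >= 2. Solves the tridiagonal normal equations
    (I + lam * D^T D) f = data with the Thomas algorithm."""
    n = len(data)
    diag = [1.0 + lam * (1.0 if i in (0, n - 1) else 2.0) for i in range(n)]
    off = -lam                      # constant sub- and super-diagonal
    c = [0.0] * n                   # modified super-diagonal
    d = [0.0] * n                   # modified right-hand side
    c[0] = off / diag[0]
    d[0] = data[0] / diag[0]
    for i in range(1, n):           # forward elimination
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (data[i] - off * d[i - 1]) / denom
    f = [0.0] * n
    f[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        f[i] = d[i] - c[i] * f[i + 1]
    return f

def roughness(f):
    """Sum of squared first differences (the smoothness penalty)."""
    return sum((f[i + 1] - f[i]) ** 2 for i in range(len(f) - 1))

noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
smooth = smooth_regularized(noisy, lam=5.0)
```

With `lam = 0` the data are returned unchanged; larger `lam` trades fidelity to the data for a smoother, uniquely determined solution.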
  • 28. Other vision paradigms: Active and purposive vision • When consistent geometric information has to be explicitly modeled (as for manipulation of the object), an object-centered co-ordinate system seems to be appropriate. • Two schools are trying to explain the vision mechanism: ◦ The first and older one tries to use explicit metric information in the early stages of the visual task (lines, curvatures, normals, etc.). ◦ Geometry is typically extracted in a bottom-up fashion without any information about the purpose of this representation. ◦ The output is a geometric model. • The second and younger school does not extract metric (geometric) information from visual data until needed for a specific task.
  • 29. • A database or collection of intrinsic images (or views) is the model. • Many traditional computer vision systems and theories capture data with cameras with fixed characteristics, while active perception and purposive vision may be appropriate. • Active vision system ... characteristics of the data acquisition are dynamically controlled by the scene interpretation. • Many visual tasks tend to be simpler if the observer is active and controls its visual sensors. • The controlled eye (or camera) movement is an example. • If there is not enough data to interpret the scene, the camera can look at it from another viewpoint. • Active vision is intelligent data acquisition controlled by the measured, partially interpreted scene parameters and their errors from the scene.
  • 30. • The active approach can make most ill-posed vision tasks tractable.
  • 31. • There is no established theory that provides a mathematical (computational) model explaining the understanding aspects of human vision. • Two recent developments towards new vision theory are: • Qualitative vision ◦ that looks for a qualitative description of objects or scenes. ◦ The motivation is not to represent geometry that is not needed for qualitative (non-geometric) tasks or decisions. ◦ Qualitative information is more invariant to various unwanted transformations (e.g. slightly differing viewpoints) or noise than quantitative ones. ◦ Qualitativeness (or invariance) enables interpretation of observed events at several levels of complexity.
  • 32. • Purposive paradigm ◦ The key question is to identify the goal of the task, the motivation being to ease the task by making explicit just that piece of information that is needed. ◦ Collision avoidance for autonomous vehicle navigation is an example where precise shape description is not needed. ◦ The approach may be heterogeneous and a qualitative answer may be sufficient in some cases. ◦ The paradigm does not yet have a solid theoretical basis, but the study of biological vision is a rich source of inspiration.
  • 33. 55:148 Digital Image Processing Chapter 11 3D Vision, Geometry Topics: Basics of projective geometry Points and hyperplanes in projective space Homography Estimating homography from point correspondence
  • 34. Basics of projective geometry Single or multiple view geometry deals with the mathematics of the relation between • 3D geometric features (points, lines, corners) in the scene • their camera projections • relations among multiple camera projections of a 3D scene Points and hyperplanes in projective space Scene: $(d+1)$-dimensional space excluding the origin, i.e., $\mathbb{R}^{d+1} \setminus \{\mathbf{0}\}$. Why is the origin excluded? Origin ≈ pinhole ≈ optical center. An equivalence relation “≅” is defined as follows: $[x_1, \dots, x_{d+1}]^T \cong [x'_1, \dots, x'_{d+1}]^T$ iff $\exists\, \alpha \neq 0$ s.t. $[x_1, \dots, x_{d+1}]^T = \alpha\, [x'_1, \dots, x'_{d+1}]^T$
  • 35. The area developed from photogrammetry, which measures 3D distances from photographs. The mathematical vehicle for multiple view geometry is projective geometry. We need to study perspective projection (also called central projection), which describes image formation by a pinhole camera or a thin lens.
  • 36. Projective space: $\mathcal{P}^d$ is the quotient space of this equivalence relation. It can be imagined as the set of all lines in $\mathbb{R}^{d+1}$ passing through the origin.
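The equivalence relation “≅” can be checked numerically. The sketch below assumes vectors are plain Python lists and tests proportionality by requiring every 2×2 minor x_i·y_j − x_j·y_i to vanish; the function name `projectively_equal` and the tolerance are illustrative assumptions.

```python
def projectively_equal(x, y, tol=1e-9):
    """True if x = alpha * y for some alpha != 0, i.e. x and y lie on the
    same line through the origin and represent one projective point."""
    if len(x) != len(y):
        return False
    if all(abs(v) <= tol for v in x) or all(abs(v) <= tol for v in y):
        # The origin is excluded from the scene space.
        raise ValueError("the origin does not represent a projective point")
    n = len(x)
    # Two nonzero vectors are proportional iff all 2x2 minors vanish.
    return all(abs(x[i] * y[j] - x[j] * y[i]) <= tol
               for i in range(n) for j in range(i + 1, n))

same_point = projectively_equal([1.0, 2.0, 3.0], [-0.5, -1.0, -1.5])
other_point = projectively_equal([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

Here `same_point` is true because the second vector is the first scaled by −0.5, while `other_point` is false.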
  • 38. Perspective projection of parallel lines
  • 39. Homogeneous points Each equivalence class of the relation “≅” generates an open line through the origin. Note that the origin is not included in any of these lines, and thus the disjointness property of equivalence classes is satisfied. For each line (equivalence class), exactly one point is projected in the acquired image: the point where the projective hyperplane intersects the line. These points in the projective space are referred to as homogeneous points. What is the property of homogeneous points? Homogeneous points are coplanar, lying on the projection plane. For simplicity, let us assume that our projection plane is $z = 1$.
  • 40. Homogeneous points Note that homogeneous points form the image hyperplane. Thus, to determine the perspective projection of a scene point, we need to determine the corresponding homogeneous point: $[x_1, \ldots, x_{d+1}]^{\mathsf T} \xrightarrow{P} [x'_1, \ldots, x'_{d+1} = 1]^{\mathsf T}$, where $x_i = \alpha x'_i$ for a constant $\alpha$. Note that the points $[x_1, \ldots, x_d, 0]^{\mathsf T}$ do not have a Euclidean counterpart. • Consider the limiting case $[x_1, \ldots, x_d, \alpha]^{\mathsf T}$, which is projectively equivalent to $[x_1/\alpha, \ldots, x_d/\alpha, 1]^{\mathsf T}$, and assume that $\alpha \to 0$. • This corresponds to a point on the projective hyperplane $\mathcal{P}^d$ going to infinity in the direction of the radius vector $[x_1, \ldots, x_d, 0]^{\mathsf T}$.
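The normalization to the plane where the last coordinate equals 1, and the limiting behaviour as the last coordinate tends to zero, can be sketched as follows. This is only an illustration: the function name `to_euclidean` and the sample points are assumptions, not part of the slides, and exact rational arithmetic is used so the printed values are easy to read.

```python
from fractions import Fraction


def to_euclidean(x):
    """Map a homogeneous point [x1, ..., xd, w] to its Euclidean counterpart.

    Returns None when w == 0: such points lie at infinity and have no
    Euclidean counterpart.
    """
    *coords, w = x
    if w == 0:
        return None
    return [Fraction(c) / Fraction(w) for c in coords]


# Two representatives of the same equivalence class give the same point.
print(to_euclidean([2, 4, 2]))   # same point as [1, 2, 1]
print(to_euclidean([1, 2, 1]))

# A point with last coordinate 0 has no Euclidean counterpart.
print(to_euclidean([2, 4, 0]))

# Limiting case: as alpha shrinks, the point runs off to infinity
# in the direction (2, 4).
for alpha in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
    print(alpha, to_euclidean([2, 4, alpha]))
```

The last loop shows the Euclidean coordinates growing without bound while keeping the ratio 1 : 2, which is the direction of the radius vector.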
  • 41. Properties of projection A line in the scene space through (but not including) the origin is mapped onto a point in the projective plane. A plane in the scene space through (but not including) the origin is mapped to a line on the projection plane.
  • 42. Homography Homography ≈ collineation ≈ projective transformation: a mapping from one projection plane to another projection plane for the same $(d+1)$-dimensional scene and a common origin, $\mathcal{P}^d \xrightarrow{H} \mathcal{P}^d$. It is also expressed as $\mathbf{u}' \cong H\mathbf{u}$, where $H$ is a $(d+1) \times (d+1)$ matrix. Property: any three collinear points in $\mathcal{P}^d$ remain collinear after the mapping (prove!). A homography also satisfies the cross-ratio property (see the figure).
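The collinearity property can be checked numerically for the planar case (d = 2). The sketch below is not a proof; it only verifies one instance, using the fact that three homogeneous points are collinear exactly when the determinant of the matrix formed from them is zero. The matrix `H` and the sample points are arbitrary choices made for illustration.

```python
def apply_h(H, p):
    """Multiply a 3x3 matrix H by a homogeneous 3-vector p."""
    return [sum(H[r][c] * p[c] for c in range(3)) for r in range(3)]


def det3(a, b, c):
    """Determinant of the 3x3 matrix with columns a, b, c.

    It is zero if and only if the three homogeneous points are collinear.
    """
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))


# An arbitrary non-singular homography (its determinant is 4).
H = [[2, 1, 3],
     [0, 1, -1],
     [1, 0, 4]]

# Three points on the line v = 2u + 1, and three points that form a triangle.
collinear = [[0, 1, 1], [1, 3, 1], [2, 5, 1]]
triangle = [[0, 0, 1], [1, 0, 1], [0, 1, 1]]

mapped_line = [apply_h(H, p) for p in collinear]
mapped_triangle = [apply_h(H, p) for p in triangle]

print(det3(*collinear), det3(*mapped_line))        # both zero
print(det3(*triangle), det3(*mapped_triangle))     # both non-zero
```

Because det(HA) = det(H) det(A), a zero determinant stays zero under any H, which is the idea behind the proof requested on the slide.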
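  • 43. Matrix formulation for homography $\alpha \begin{bmatrix} u' \\ v' \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$ The scale factor $\alpha \neq 0$ and $\det H \neq 0$; otherwise everything is mapped onto a single point. Eliminating the scale factor $\alpha$, we get $u' = \dfrac{h_{11}u + h_{12}v + h_{13}}{h_{31}u + h_{32}v + h_{33}}$ and $v' = \dfrac{h_{21}u + h_{22}v + h_{23}}{h_{31}u + h_{32}v + h_{33}}$.
  • 45. Subgroups of homographies Any homography can be uniquely decomposed as $H = H_P H_A H_S$, where $H_P = \begin{bmatrix} I & \mathbf{0} \\ \mathbf{a}^{\mathsf T} & b \end{bmatrix}$, $H_A = \begin{bmatrix} K & \mathbf{0} \\ \mathbf{0}^{\mathsf T} & 1 \end{bmatrix}$, $H_S = \begin{bmatrix} R & -R\mathbf{t} \\ \mathbf{0}^{\mathsf T} & 1 \end{bmatrix}$.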
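The two quotient formulas for u′ and v′ translate directly into code. The following minimal sketch assumes a 3×3 matrix stored as nested lists; the function name `map_point` and the example matrices are illustrative assumptions.

```python
def map_point(H, u, v):
    """Map the image point (u, v) through the 3x3 homography H.

    Implements u' = (h11 u + h12 v + h13) / (h31 u + h32 v + h33) and
    v' = (h21 u + h22 v + h23) / (h31 u + h32 v + h33).
    """
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if w == 0:
        raise ZeroDivisionError("the point is mapped to infinity")
    u_new = (H[0][0] * u + H[0][1] * v + H[0][2]) / w
    v_new = (H[1][0] * u + H[1][1] * v + H[1][2]) / w
    return u_new, v_new


# A pure translation by (2, 3): the last row is [0, 0, 1].
H_shift = [[1, 0, 2], [0, 1, 3], [0, 0, 1]]
print(map_point(H_shift, 1, 1))

# A genuinely projective map: the denominator depends on u.
H_proj = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]
print(map_point(H_proj, 1, 1))

# Multiplying H by a non-zero constant does not change the mapping,
# because the scale factor cancels in the quotient.
H_scaled = [[5 * h for h in row] for row in H_proj]
print(map_point(H_scaled, 1, 1))
```

The last call illustrates why H is defined only up to scale and therefore has (d+1)² − 1 free parameters.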
  • 46. Estimating homography from point correspondences Given a set of ordered pairs of points $\{(\mathbf{u}_i, \mathbf{u}'_i)\}_{i=1}^{m}$, solve the homogeneous system of linear equations $\alpha_i \mathbf{u}'_i = H\mathbf{u}_i$, $i = 1, \ldots, m$, for $H$ and $\alpha_i$. Number of equations: $m(d+1)$. Number of unknowns: $m + (d+1)^2 - 1$. Hence at least $m = d + 2$ correspondences are needed. Degenerate configuration: $H$ may not be uniquely determined even if $m \geq d + 2$; this is caused when $d$ or more points are coplanar. Using more correspondences than the minimum leads to the notion of optimal fitting, which reduces the effect of noise.
  • 47. Maximum likelihood estimation $[u_i, v_i]^{\mathsf T}$ and $[u'_i, v'_i]^{\mathsf T}$, $i = 1, \ldots, m$, are identified corresponding points in two different projection planes. Principle: find the homography (i.e., the transformation matrix $H$) that maximizes the likelihood of mapping the points $[u_i, v_i]^{\mathsf T}$ on the first plane to $[u'_i, v'_i]^{\mathsf T}$ on the second plane. Model: the ideal points are in the vicinity of the identified points, i.e., there is noise in the process of locating the points $[u_i, v_i]^{\mathsf T}$ and $[u'_i, v'_i]^{\mathsf T}$. Method to solve the problem • Determine the ML function using a Gaussian model • It contains several multiplicative terms • Take the log → multiplications are converted to additions • Remove the minus sign (see the Gaussian expression) • Maximization is converted to minimization
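The equation and unknown counts above can be tabulated to see where the minimum number of correspondences comes from. This is a small counting sketch; the helper names `counts` and `min_correspondences` are assumptions introduced for illustration.

```python
def counts(m, d):
    """Return (equations, unknowns) for m correspondences in P^d."""
    equations = m * (d + 1)
    unknowns = m + (d + 1) ** 2 - 1   # m scale factors plus H up to scale
    return equations, unknowns


def min_correspondences(d):
    """Smallest m for which there are at least as many equations as unknowns."""
    m = 1
    while True:
        equations, unknowns = counts(m, d)
        if equations >= unknowns:
            return m
        m += 1


for d in (1, 2, 3):
    m = min_correspondences(d)
    print(f"d={d}: m={m}, (equations, unknowns)={counts(m, d)}")
```

For images (d = 2) this gives four point pairs, in agreement with the m ≥ d + 2 condition quoted on the slide.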
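  • 48. Final expression for maximum likelihood estimation $\min_{\mathbf{h},\, \hat{u}_i,\, \hat{v}_i} \sum_{i=1}^{m} \left[ (u_i - \hat{u}_i)^2 + (v_i - \hat{v}_i)^2 + \left( \dfrac{h_{11}\hat{u}_i + h_{12}\hat{v}_i + h_{13}}{h_{31}\hat{u}_i + h_{32}\hat{v}_i + h_{33}} - u'_i \right)^2 + \left( \dfrac{h_{21}\hat{u}_i + h_{22}\hat{v}_i + h_{23}}{h_{31}\hat{u}_i + h_{32}\hat{v}_i + h_{33}} - v'_i \right)^2 \right]$, where $(\hat{u}_i, \hat{v}_i)$ are the ideal (noise-free) points in the first plane.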
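The objective can be evaluated for given values of H and of the ideal points, which is the building block any numerical minimizer would call repeatedly. The sketch below only evaluates the cost; it does not perform the minimization, and the function name `ml_cost` and the sample data are assumptions.

```python
def ml_cost(H, observed, observed_prime, ideal):
    """Sum of squared distances used in the maximum likelihood criterion.

    observed       -- measured points (u_i, v_i) in the first plane
    observed_prime -- measured points (u'_i, v'_i) in the second plane
    ideal          -- current estimates of the noise-free points in the first plane
    """
    total = 0.0
    for (u, v), (up, vp), (uh, vh) in zip(observed, observed_prime, ideal):
        w = H[2][0] * uh + H[2][1] * vh + H[2][2]
        mapped_u = (H[0][0] * uh + H[0][1] * vh + H[0][2]) / w
        mapped_v = (H[1][0] * uh + H[1][1] * vh + H[1][2]) / w
        total += (u - uh) ** 2 + (v - vh) ** 2
        total += (mapped_u - up) ** 2 + (mapped_v - vp) ** 2
    return total


# A translation by (2, 3) and noise-free data: the cost vanishes.
H = [[1, 0, 2], [0, 1, 3], [0, 0, 1]]
points = [(0, 0), (1, 0)]
exact = [(2, 3), (3, 3)]
print(ml_cost(H, points, exact, points))

# Shifting one measurement in the second image by one pixel raises the cost by 1.
noisy = [(2, 3), (3, 4)]
print(ml_cost(H, points, noisy, points))
```

In practice the nine entries of H and the 2m ideal coordinates would all be handed to a non-linear least-squares solver; that step is omitted here.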
  • 49. Scene reconstruction from multiple views • Triangulation
  • 53. Matching constraints • Matching constraints are relations satisfied by collections of corresponding image points in n views. They have the property that a multilinear function of homogeneous image coordinates must vanish; the coefficients of these functions form multiview tensors.
  • 56. Bundle adjustment • The non-linear least-squares method specialized for this task is known from photogrammetry as bundle adjustment.
  • 57. Upgrading the projective reconstruction, self-calibration • There are several kinds of additional knowledge that permit the projective ambiguity to be refined to an affine, similarity, or Euclidean one. Methods that use additional knowledge to compute a similarity reconstruction instead of a mere projective one are also known as self-calibration, because this is in fact equivalent to finding the intrinsic camera parameters.
  • 58. • Self-calibration methods can be divided into two groups: constraints on the cameras and constraints on the scene.
  • 59. Shape from X • Shape from X is a generic name for techniques that aim to extract shape from intensity images and other cues such as focus. • Some of these methods estimate local surface orientation (e.g., surface normal) rather than absolute depth. • Shape may be extracted from motion, optical flow, texture, focus/de-focus, vergence, and contour. • Each of these techniques may be used to derive a 2.5D sketch for Marr’s vision theory; they are also of practical use on their own.
  • 60. Shape from motion • Motion is a primary property exploited by human observers of the 3D world. • The real world we see is dynamic in many respects, and the relative movement of objects in view, their translation and rotation relative to the observer, the motion of the observer relative to other static and moving objects all provide very strong clues to shape and depth.
  • 61. • Extracting 3D information from moving scenes can be done as a two-phase process: 1. Finding correspondences or calculating the nature of the flow is a lower-level phase that operates on pixel arrays. 2. The shape extraction phase follows as a separate, higher-level process. This phase is examined here.
  • 62. Rigidity, and the structure from motion theorem • Ullman’s success in this area was based on the psycho-physical observation that the human visual system seems to assume that objects are rigid. • This rigidity constraint prompted the proof of an elegant structure from motion theorem saying that three orthographic projections of four non-co-planar points have a unique 3D interpretation as belonging to one rigid body. • First note that the body’s motion may be decomposed into translational and rotational movement; the former gives the movement of a fixed point with respect to the observer, and the latter the relative rotation of the body (for example, about the chosen fixed point). • Ullman’s result is the best possible in the sense that unique reconstruction of a rigid body cannot be guaranteed with fewer than three projections of four points, or with three projections of fewer than four points. It should also be remembered that the result refers to orthographic projection, whereas in general image projections are perspective.
  • 68. Full 3D objects • Volumetric modeling strategies include constructive solid geometry, super-quadrics, and generalized cylinders. • Surface modeling strategies include boundary representations, triangulated surfaces, and quadric patches. • Line labeling is an outmoded but accessible technique for reconstructing objects with planar faces. • Transitions to 3D objects need a co-ordinate system that is object centered. • 3D objects may be measured mechanically, by computed tomography, by range finders, or by shape from motion techniques.
  • 69. 3D model-based vision • To create a full 3D model from a set of range images, the surfaces must first be registered: rotations and translations must be found that match one surface to another. • Model-based vision uses a priori knowledge about an object to ease its recognition. • Techniques exist to locate curved objects from range images.
  • 70. 2D view-based representations of a 3D scene • 2D view-based representations of 3D scenes may be achieved with multi-view representations. • It is possible to select a few stored reference images, and render any view from them. • Interpolation of views is not enough and view extrapolation is needed. This requires knowledge of geometry, and the view-based approach does not differ significantly from 3D geometry reconstruction. • It is possible to perform a 3D reconstruction from an unorganized set of 2D views. This approach has recently been used widely, e.g., by Google StreetView.
  • 71. Reconstructing scene geometry • Large scale scene features such as plane parameters may be recaptured from properties of known objects such as straight lines and approximate size. • Well known geometric results identify vanishing points and ground orientation. • Similar approaches may well work even if large scale clues are unavailable.
  • 106. Shape from optical flow • In a continuous sequence, we are therefore interested in the apparent movement of each pixel (x, y), which is given by the optical flow field (dx/dt, dy/dt). Determining shape from optical flow is mathematically non-trivial, and here an early simplification of the subject is presented as an illustration [Clocksin, 1980]. The simplification is in two parts:
  • 108. • Motion is due to the observer travelling in a straight line through a static landscape. Without loss of generality, suppose the motion is in the direction of the z axis of a viewer-centered co-ordinate system (i.e., the observer is positioned at the origin). • Rather than being projected onto a 2D plane, the image is seen on the surface of a unit sphere, centered at the observer (a ‘spherical retina’). Points in 3D are represented in spherical polar rather than Cartesian co-ordinates; spherical polar co-ordinates (r, θ, ϕ) (see Figure 12.1) are related to (x, y, z) by the equations
  • 109. Shape from texture • The angle at which the surface is seen would cause a (perspective) distortion of the texture primitive (texel), and the relative size of the primitives would vary according to distance from the observer.
  • 110. • Considering a textured surface patterned with identical texels which have been recovered by lower-level processing, note that with respect to a viewer it has three properties at any point projected onto a retinal image: distance from the observer; slant, the angle at which the surface is sloping away from the viewer (the angle between the surface normal and the line of sight); and tilt, the direction in which the slant takes place. Attempts to recapture some of this information are based on the texture gradient, that is, the direction of maximum rate of change of the perceived size of the texels, and a scalar measurement of this rate.
  • 111. • Texture is usually used as an additional or complementary feature, augmenting another, stronger clue in shape extraction.
  • 112. Other shape from X techniques • Shape from focus/de-focus techniques are based on the fact that lenses have finite depth of field, and only objects at the correct distance are in focus; others are blurred in proportion to their distance. • Two main approaches can be distinguished: • Shape from focus measures depth in one location in an active manner; this technique is used in 3D measuring machines in mechanical engineering. The object to be measured is fixed on a motorized table that moves along x, y, z axes.
  • 113. • Shape from de-focus typically estimates depth using two input images captured at different depths. The relative depth of the whole scene can be reconstructed from image blur. The image is modeled as a convolution of the sharp image with a suitable point spread function; the function is either known from the capturing setup parameters or estimated. • Shape from vergence uses two cameras fixed on a common rod. Using two servo-mechanisms, the cameras can change the direction of their optical axes (verge) in the plane containing a line segment joining their optical centers. Such devices are called stereo heads.
  • 114. • Shape from contour aims to describe a 3D shape from contours seen from one or more view directions. Objects with smooth bounding surfaces are quite difficult to analyze. • The set of all points on the object surface where the surface normal is perpendicular to the observer’s visual ray is called a rim.
  • 115. Assuming orthographic projection, the rim points generate a silhouette of an object in the image. Silhouettes can be easily and reliably captured if back-light illumination is used, although there is a possible complication in the special case in which two distinct rim points project to a single image point.
  • 116. • The inherent difficulty in shape from contour comes from the loss of information in projecting 3D to 2D. Humans are surprisingly successful at perceiving clear 3D shapes from contours, and it seems that tremendous background knowledge is used to assist. Understanding this human ability is one of the major challenges for computer vision.
  • 117. Full 3D objects 3D objects, models, and related issues: The notion of a 3D object allows us to consider a 3D volume as a part of the entire 3D world. This volume has a particular interpretation (semantics, purpose) for the task at hand. We have treated geometric and radiometric techniques that provide intermediate 3D cues, and it was implicitly assumed that such cues help to understand the nature of a 3D object. Shape is another informal concept that humans typically connect with a 3D object.
  • 118. • Computer vision aims at scientific methods for 3D object description, but there are no mathematical tools yet available to express shape in its general sense. • Curvilinear surfaces with no restriction on surface shape are called free-form surfaces. • Roughly speaking, the 3D vision task distinguishes two classes of approach:
  • 119. 1. Reconstruction of the 3D object model or representation from real-world measurements with the aim of estimating a continuous function representing the surface. 2. Recognition of an instance of a 3D object in the scene. It is assumed that object classes are known in advance, and that they are represented by a suitable 3D model. Humans often meet and recognize deformable objects that change their shape.
  • 120. • Computer vision as well as computer graphics use 3D models to encapsulate the shape of a 3D object. • 3D models serve in computer graphics to generate detailed surface descriptions used to render realistic 2D images. • In computer vision, the model is used either for reconstruction (copying, displaying an object from a different viewpoint, modifying an object slightly during animation) or for recognition purposes, where features are used that distinguish objects from different classes.
  • 121. • There are two main classes of models: volumetric and surface. • Volumetric models represent the ‘inside’ of a 3D object explicitly, while surface models use only object surfaces, as most vision-based measuring techniques can only see the surface of a non-transparent solid. • 3D models make a transition towards an object-centered co-ordinate system, allowing object descriptions to be viewer independent. This is the most difficult phase within Marr’s paradigm.
  • 122. • 3D models of objects are common in other areas besides computer vision, notably computer-aided design (CAD) and computer graphics, where image synthesis is required, that is, an exact (2D) pictorial representation of some modeled 3D object. • Various representation schemes exist, with different properties. A representation is called complete if two different objects cannot correspond to the same model, so a particular model is unambiguous. • A representation is called unique if an object cannot correspond to two different models.
  • 123. • Most 3D representation methods sacrifice either the completeness or the uniqueness property. • Commercial CAD systems frequently sacrifice uniqueness.
  • 124. Line labeling • blocks world approach. • Line labeling is an outmoded but accessible technique for reconstructing objects with planar faces. Independently, other researchers built on these ideas to develop what is now a very well known line labeling algorithm
  • 126. • Line labeling is able to detect ‘impossible’ objects.
  • 127. Volumetric representation, direct measurements • An object is placed in some reference co-ordinate system and its volume is subdivided into small volume elements called voxels—it is usual for these to be cubes. • The most straightforward representation of voxel-based volumetric models is the 3D occupancy grid, which is implemented as a 3D Boolean array
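A 3D occupancy grid can be sketched with nested Python lists of Booleans. The grid size, the sphere used to fill it, and the helper names below are assumptions chosen only to keep the example small.

```python
def make_grid(nx, ny, nz):
    """Create an nx x ny x nz occupancy grid with every voxel empty (False)."""
    return [[[False] * nz for _ in range(ny)] for _ in range(nx)]


def fill_sphere(grid, center, radius):
    """Mark as occupied every voxel whose index lies within `radius` of `center`."""
    cx, cy, cz = center
    for x, plane in enumerate(grid):
        for y, row in enumerate(plane):
            for z in range(len(row)):
                if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius ** 2:
                    row[z] = True


def occupied_voxels(grid):
    """Count the occupied voxels (True counts as 1)."""
    return sum(cell for plane in grid for row in plane for cell in row)


grid = make_grid(5, 5, 5)
fill_sphere(grid, center=(2, 2, 2), radius=1)
print(occupied_voxels(grid))             # the center voxel and its 6 face neighbours
print(grid[2][2][2], grid[0][0][0])
```

The representation is simple but memory grows with the cube of the resolution, which is one reason more compact volumetric models are of interest.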
  • 128. • The object is fixed to a measuring machine, and an absolute co-ordinate system is attached to it. Points on the object surface are touched by a measuring needle, which provides 3D co-ordinates. • Another 3D measurement technique, computed tomography, looks inside the object and thus yields more detailed information than the binary occupancy grid.
  • 129. Volumetric modeling strategies • Constructive Solid Geometry • The principal idea of Constructive Solid Geometry (CSG), which has found some success, is to construct 3D bodies from a selection of solid primitives. • A CSG model is stored as a tree, with leaf nodes representing the primitive solids and edges enforcing precedence among the set-theoretic operations.
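A CSG tree can be sketched with primitives as point-membership tests at the leaves and set operations at the inner nodes. The tuple-based tree encoding, the two primitives, and the example model (a cube with a spherical cavity) are assumptions made for this illustration.

```python
def sphere(cx, cy, cz, r):
    """Leaf primitive: returns a test for membership in a solid sphere."""
    return lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2 + (p[2] - cz) ** 2 <= r * r


def box(lo, hi):
    """Leaf primitive: returns a test for membership in an axis-aligned box."""
    return lambda p: all(l <= c <= h for l, c, h in zip(lo, p, hi))


# Set-theoretic operations applied at the inner nodes of the tree.
OPS = {
    "union": lambda a, b: a or b,
    "intersection": lambda a, b: a and b,
    "difference": lambda a, b: a and not b,
}


def contains(node, p):
    """Evaluate the tree bottom-up: leaves are callables, inner nodes are
    (operation, left_subtree, right_subtree) tuples."""
    if callable(node):
        return node(p)
    op, left, right = node
    return OPS[op](contains(left, p), contains(right, p))


# A cube of side 2 with a spherical cavity of radius 0.5 at its center.
model = ("difference", box((0, 0, 0), (2, 2, 2)), sphere(1, 1, 1, 0.5))

print(contains(model, (1, 1, 1)))         # inside the cavity
print(contains(model, (0.1, 0.1, 0.1)))   # in the solid part of the cube
print(contains(model, (3, 3, 3)))         # outside the cube
```

The nesting of the tuples plays the role of the tree edges: a subtree is always evaluated before the operation that consumes it.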
  • 130. • Super-quadrics • Super-quadrics are geometric bodies that can be understood as a generalization of basic quadric solids, introduced in computer graphics [Barr, 1981]. • Super-ellipsoids are instances of super-quadrics used in computer vision.
  • 131. • where a1, a2, and a3 define the super-quadric size in the x, y, and z directions, respectively. εvert is the squareness parameter in the latitude plane and εhori is the squareness parameter in the longitude plane. • The squareness values used in respective planes are 0 (i.e., square) ≤ ε ≤ 2 (i.e., deltoid), as only those are convex bodies. If squareness parameters are greater than 2, the body changes to a cross-like shape.
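The effect of the squareness parameters can be explored with an inside-outside test. The sketch below assumes the commonly used implicit super-ellipsoid form F = ((x/a1)^(2/ε_hori) + (y/a2)^(2/ε_hori))^(ε_hori/ε_vert) + (z/a3)^(2/ε_vert), with F = 1 on the surface; this form, the function name, and the parameter names are assumptions and should be checked against the equation used in the lecture. Squareness values must be strictly positive here, since ε = 0 is only a limiting case.

```python
def superellipsoid_f(p, a, eps_vert, eps_hori):
    """Inside-outside function of a super-ellipsoid (assumed implicit form).

    Returns a value < 1 for points inside, 1 on the surface, > 1 outside.
    p = (x, y, z), a = (a1, a2, a3), both squareness parameters > 0.
    """
    x, y, z = (abs(c) / s for c, s in zip(p, a))
    xy = x ** (2 / eps_hori) + y ** (2 / eps_hori)
    return xy ** (eps_hori / eps_vert) + z ** (2 / eps_vert)


size = (1, 1, 1)
probe = (0.5, 0.5, 0.5)

# eps = 1 gives an ordinary ellipsoid (here the unit sphere): probe is inside.
print(superellipsoid_f(probe, size, 1, 1))

# eps = 2 gives the diamond-like body |x| + |y| + |z| = 1: probe is outside.
print(superellipsoid_f(probe, size, 2, 2))

# A small eps approaches a cube, so even a point near the corner is inside.
print(superellipsoid_f((0.9, 0.9, 0.9), size, 0.1, 0.1))
```

Sweeping ε between these values shows the body morphing from box-like through round to diamond-like while remaining convex.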
  • 132. Generalized cylinders • Generalized cylinders, or generalized cones, are often also called sweep representations. • A cone is defined by a circle whose radius changes linearly with distance traveled, moving along a straight line.
  • 133. • These generalized cones turn out to be very good at representing some classes of solid body. • The advantage of symmetrical volumetric primitives, such as generalized cylinders and super-quadrics, is their ability to capture common symmetries and represent certain shapes with few parameters. • An influential early vision system called ACRONYM used generalized cones as its modeling scheme. • There is a modification of the sweep representation called a skeleton representation, which stores only the spines of the objects.
  • 134. Surface modeling strategies • A solid object can be represented by the surfaces bounding it; such a description can vary from simple triangular patches to visually appealing structures such as non-uniform rational B-splines (NURBS), popular in geometric modeling. Computer vision solves two main problems with surfaces: 1. Reconstruction creates a surface description from sparse depth measurements that are typically corrupted by outliers. 2. Segmentation aims to classify a surface or surface patches into surface types.
  • 135. • Boundary representations (B-reps) can be viewed conceptually as a triple: • A set of surfaces of the object. • A set of space curves representing intersections between the surfaces. • A graph describing the surface connectivity. B-reps are an appealing and intuitively natural way of representing 3D bodies in that they consist of an explicit list of the bodies’ faces. In the simplest case, ‘faces’ are taken to be planar, so bodies are always polyhedral, and we are dealing the whole time with piecewise planar surfaces.
  • 136. • Triangulation of irregular data points (e.g., a 3D point cloud obtained from a range scanner) is an example of an interpolation method. • The best-known technique is called Delaunay triangulation, which can be defined in two, three, or more space dimensions.
  • 137. Registering surface patches and their fusion to get a full 3D model • A range image represents distance measurements from an observer to an object; it yields a partial 3D description of the surface from one view only. • Several range images are needed to capture the whole surface of an object. • Range image registration finds a rigid geometric transformation between two range images of the same object captured from two different viewpoints.
  • 138. • The method automates the construction of a 3D model of a 3D free-form object from a set of range images as follows. 1. The object is placed on a turntable and a set of range images from different viewpoints is measured by a structured-light (laser-plane) range finder. 2. A triangulated surface is constructed over the range images. 3. Large data sets are reduced by decimation of triangular meshes in each view. 4. Surfaces are registered into a common object-centered co-ordinate system and outliers in measurements are removed.
  • 139. • A 4-connected mesh cannot represent all objects; e.g., a sphere cannot be covered by a four-sided polygon. • By splitting each polygon by an edge, a triangulation of the surface, which is able to represent any surface, is easily obtained. • A polygon may be split two ways; it is preferable to choose the shortest edge because this results in triangles with larger inner angles.
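Splitting a four-sided mesh cell along its shorter diagonal can be sketched as follows. The function name, the vertex ordering convention (vertices listed consecutively around the cell), and the sample cell are assumptions for illustration.

```python
import math


def split_quad(p0, p1, p2, p3):
    """Split the quadrilateral p0-p1-p2-p3 into two triangles.

    The vertices are given in order around the cell. The cut follows the
    shorter of the two diagonals (p0-p2 or p1-p3), which tends to avoid
    thin triangles with small inner angles.
    """
    if math.dist(p0, p2) <= math.dist(p1, p3):
        return [(p0, p1, p2), (p0, p2, p3)]
    return [(p1, p2, p3), (p1, p3, p0)]


# A skewed cell: diagonal p0-p2 has length sqrt(10), p1-p3 has length sqrt(5).
quad = [(0, 0, 0), (2, 0, 0), (3, 1, 0), (0, 1, 0)]
triangles = split_quad(*quad)
for t in triangles:
    print(t)
```

Applying this to every cell of a range-image grid yields a triangulated surface without changing the measured vertices.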
  • 145. 2D view-based representations of a 3D scene • Viewing space • The trouble is that there is potentially an infinite number of possible viewpoints that induce an infinite number of object appearances. • To cope with the huge number of viewpoints and appearances it is necessary to sample a viewpoint space and group together similar neighboring views. • A simplified model is a viewing sphere model that is often used in the orthographic projection case
  • 146. Multi-view representations and aspect graphs • Other representation methods attempt to combine all the viewpoint-specific models into a single data structure. One of them is the characteristic view technique, in which all possible 2D projections of a convex polyhedral object are grouped into a finite number of topologically equivalent classes. • A similar approach is based on aspect, which is defined as the topological structure of singularities in a single view of an object; aspect has useful invariance properties.
  • 147. • Most small changes in vantage point will not affect aspect, and such vantage points (that is, most) are referred to as stable.
  • 148. 3D reconstruction from an unorganized set of 2D views, and Structure from Motion