VTOP MODULE 2
COMPUTER VISION
MODULE 2
• Depth Estimation and Multi-Camera Views: Perspective, Binocular Stereopsis: Camera and
Epipolar Geometry; Homography, Rectification, DLT, RANSAC, 3-D reconstruction framework;
Auto-calibration.
• Depth estimation is the task of measuring the distance of each pixel relative to the camera. Depth is extracted from either monocular (single-view) or stereo (multi-view) images of a scene. Traditional methods use multi-view geometry to find the relationship between the images; newer learning-based methods estimate depth directly by minimizing a regression loss, or by learning to synthesize a novel view from an image sequence.
• It is an important task in computer vision, with applications such as 3D reconstruction, augmented reality, autonomous navigation, and more.
• There are several techniques for depth estimation; one commonly used approach is stereo vision. Stereo vision uses a pair of cameras, known as a stereo camera setup, to capture images of a scene from slightly different viewpoints. The disparity between corresponding pixels in the left and right images can then be used to calculate depth.
Depth Estimation
• To estimate depth using stereo vision, the following steps are typically involved:
Camera calibration: Accurate calibration of the stereo camera setup is necessary to
determine the intrinsic and extrinsic parameters of each camera. This calibration
process establishes the relationship between the 3D world coordinates and the
corresponding 2D image points.
Image rectification: Rectification transforms the stereo image pair so that corresponding
epipolar lines become horizontal scanlines. This simplifies matching by reducing it to a 1D
search along each image row.
Disparity calculation: Matching algorithms are used to find correspondences
between the left and right images. These algorithms aim to identify the pixel
disparities, i.e., the horizontal shift of a point between the two images. Common
techniques include block matching, semi-global matching, and graph cuts.
Depth Estimation
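To make these steps concrete, here is a minimal sketch of the pipeline in Python with OpenCV. The image file names, calibration values (K1, D1, K2, D2, R, T), and matcher settings are illustrative assumptions, not values taken from these slides.

import cv2
import numpy as np

# Load a stereo pair (hypothetical file names).
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)
h, w = left.shape

# Step 1: assume intrinsics/extrinsics from a prior calibration step.
K1 = K2 = np.array([[700.0, 0, w / 2], [0, 700.0, h / 2], [0, 0, 1]])
D1 = D2 = np.zeros(5)                 # assume no lens distortion
R = np.eye(3)                         # relative rotation between cameras
T = np.array([[0.06], [0.0], [0.0]])  # 6 cm baseline along x (assumed)

# Step 2: rectify so that epipolar lines become horizontal scanlines.
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, D1, K2, D2, (w, h), R, T)
mx1, my1 = cv2.initUndistortRectifyMap(K1, D1, R1, P1, (w, h), cv2.CV_32FC1)
mx2, my2 = cv2.initUndistortRectifyMap(K2, D2, R2, P2, (w, h), cv2.CV_32FC1)
left_r = cv2.remap(left, mx1, my1, cv2.INTER_LINEAR)
right_r = cv2.remap(right, mx2, my2, cv2.INTER_LINEAR)

# Step 3: disparity via semi-global matching (one of the matchers named above).
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = sgbm.compute(left_r, right_r).astype(np.float32) / 16.0

# Step 4: depth from disparity by triangulation (the next slide gives the formula).
depth = cv2.reprojectImageTo3D(disparity, Q)[:, :, 2]

cv2.StereoSGBM_create implements the semi-global matching mentioned above; cv2.StereoBM_create is the simpler block-matching alternative.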
Depth computation: Once the disparity map is obtained, the depth can be calculated
using triangulation. By knowing the baseline distance (distance between the two
camera centers) and the focal length of the cameras, the depth at each pixel can be
computed using simple geometry.
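Written out, the "simple geometry" is a single formula: for focal length f (in pixels), baseline B, and disparity d (in pixels),

Z = f · B / d

With illustrative numbers (assumed for this example, not from the slides) f = 700 px, B = 0.06 m, and d = 35 px, the depth is Z = 700 × 0.06 / 35 = 1.2 m. Doubling the disparity halves the depth, which is why nearby objects show large disparities.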
Apart from stereo vision, there are other methods for depth estimation, including
structured light, time-of-flight, and monocular depth estimation using a single
camera. Monocular depth estimation relies on various cues, such as texture, motion,
perspective, and object size, to infer depth information. Deep learning-based
approaches, especially convolutional neural networks (CNNs), have shown
promising results in monocular depth estimation by learning from large-scale
datasets.
Depth Estimation
Multi-camera views refer to the use of multiple cameras positioned at different
locations or angles to capture a scene simultaneously. By combining the views from
multiple cameras, it becomes possible to obtain a more comprehensive
understanding of the scene, including depth information and different perspectives.
Here are some key points about multi-camera views:
Enhanced Coverage: With multiple cameras, it is possible to cover a larger area of
the scene compared to a single camera. Each camera can capture a different portion
or angle of the scene, providing a wider field of view.
Improved Depth Perception: By utilizing multiple cameras, depth information can
be extracted using techniques like stereo vision or structure from motion. By
comparing the views from different cameras, it becomes possible to estimate the
depth of objects in the scene, enabling 3D reconstruction and depth-based
applications.
Redundancy and Robustness: Having multiple camera views provides redundancy
in capturing the scene. If one camera fails or its view is obstructed, other cameras
can still provide information about the scene. This redundancy enhances the
robustness and reliability of the system.
Multi Camera View
Viewpoint Diversity: Each camera in a multi-camera setup can have a different
perspective or viewpoint of the scene. This diversity of viewpoints can be beneficial
for various applications, such as object tracking, activity recognition, or scene
understanding. By combining different perspectives, a more comprehensive
representation of the scene can be obtained.
Multi-Modal Information: Multi-camera views can also capture different
modalities of the scene, such as visible light, infrared, depth sensors, or thermal
imaging. By combining these different modalities, richer and more detailed
information about the scene can be obtained, leading to improved understanding and
analysis.
Applications of multi-camera views include surveillance systems, autonomous
vehicles, virtual reality, augmented reality, robotics, sports analysis, and many more.
The synchronized and coordinated use of multiple cameras enables a deeper
understanding of the scene, enhances accuracy and robustness, and opens up new
possibilities in computer vision and imaging applications.
Multi Camera View
Multi-camera views refer to the use of multiple cameras to capture different
perspectives simultaneously. These multiple camera angles are then often edited
together to create a dynamic and engaging visual experience for the audience. Each
camera provides a unique perspective, allowing viewers to see different angles,
details, and reactions.
Multi-camera setups are commonly used in various media productions, including
television shows, live events, sports broadcasts, and films. Here are some key
perspectives achieved through multi-camera views:
Wide Shots: A wide shot provides an overall view of the scene, capturing the entire
set or location. It establishes the context, shows the spatial relationships between
characters or objects, and sets the stage for more detailed shots.
Medium Shots: Medium shots focus on characters or objects from a medium
distance. They offer a balanced view, showing the subject from the waist up or from
the knees up. Medium shots are often used for dialogue scenes and allow viewers to
see facial expressions and body language.
Perspective
Close-ups: Close-up shots zoom in on a specific subject, such as a person's face or
an object. They highlight details and emotions, creating an intimate connection
between the viewer and the subject. Close-ups are particularly effective for
conveying emotions or emphasizing important story elements.
Over-the-Shoulder Shots: Over-the-shoulder shots are commonly used in dialogue
scenes. They capture the back of one person's shoulder and part of their head, with
the main focus on the person they are facing. This perspective provides a sense of
depth and helps viewers feel like they are part of the conversation.
Reaction Shots: Reaction shots capture the emotional responses or reactions of
characters to a particular event or dialogue. They are usually close-ups of a
character's face, emphasizing their expressions and adding depth to the scene.
Point-of-View Shots: Point-of-view shots provide the audience with the perspective
of a particular character. The camera becomes the character's eyes, showing what
they see and their subjective experience of the situation. These shots can create a
sense of immersion and empathy.
Perspective
By combining and switching between these different camera perspectives, directors
and editors can create engaging visual narratives that enhance the storytelling
experience. Multi-camera views provide flexibility in post-production, allowing for
the selection of the best shots and angles to convey the intended message and evoke
the desired emotions from the audience.
Perspective
Binocular stereopsis is the ability of humans (and some animals) to perceive depth and
three-dimensional structure by exploiting the binocular disparity that results from having
two horizontally separated eyes. Each eye captures a slightly different view of the world,
and the brain fuses these two images into a single percept that carries depth information.
The process of binocular stereopsis involves several steps:
Binocular Disparity: Binocular disparity refers to the differences in the retinal
images between the two eyes. Because the eyes are horizontally separated, they
receive slightly different perspectives of the same scene. These disparities are due to
the parallax effect and provide important depth cues.
• The parallax effect is the apparent displacement of an object's position when it is viewed from different angles. It is a visual cue that helps us perceive depth and distance in a scene.
Binocular Stereopsis
• The parallax effect is closely related to binocular disparity, which is the primary
mechanism behind binocular stereopsis (the ability to perceive depth using two
eyes). When we view objects with binocular vision, each eye has a slightly
different perspective, resulting in a disparity between the images captured by
each eye. The brain processes these disparities to compute depth information and
create a perception of three-dimensional space.
Here's an example to illustrate the parallax effect:
Hold your finger in front of your face and look at it first with your left eye and then
with your right eye, alternating between the two. You will notice that the finger
appears to shift its position relative to the background. This apparent shift is the
parallax effect in action. The amount of shift or displacement is greater when the
object is closer to you and smaller when it is farther away.
Binocular Stereopsis
Correspondence Matching: The brain's visual processing system compares the
images from each eye and matches corresponding points or features between them. It
searches for similar patterns, textures, or edges in both images to establish
correspondences.
Binocular Stereopsis
Disparity Calculation: Once the corresponding points are identified, the brain measures the
horizontal displacement, or disparity, between them. The magnitude of the disparity depends
on the distance between the object and the observer: for a fixed eye separation, nearer
objects produce larger disparities.
Depth Perception: By analyzing the magnitude of the disparity, the brain estimates
the relative depth of objects in the visual scene. Objects that appear closer will have
a larger disparity, while objects farther away will have a smaller disparity.
Binocular Stereopsis
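In geometric terms, the same triangulation relation as in the stereo-camera case applies approximately (a standard idealization, not derived on these slides): for an interocular baseline b and an object at distance Z, the disparity is roughly d ≈ f · b / Z, with f playing the role of the eye's focal length. Disparity therefore falls off inversely with distance, consistent with the observation above that closer objects yield larger disparities.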
Fusion and 3D Perception: The brain combines the information from both eyes,
integrating the two slightly different perspectives into a single perception. This
fusion of the images creates the perception of depth, allowing us to see the world in
three dimensions.
Binocular stereopsis is an important component of human vision and provides us
with valuable depth cues, allowing us to navigate and interact with the environment
effectively. It enables us to judge distances, perceive the relative positions of objects,
and experience a sense of depth and solidity in our visual perception.
In addition to human vision, binocular stereopsis has applications in fields such as
computer vision and robotics. By using stereo cameras or other depth-sensing
techniques, machines can replicate the principles of binocular stereopsis to perceive
depth and reconstruct three-dimensional representations of the world around them.
Binocular Stereopsis
Camera geometry refers to the mathematical and physical properties that describe the
behavior and characteristics of a camera. It encompasses both intrinsic and extrinsic
parameters that define how the camera captures and projects the 3D world onto a 2D
image.
Intrinsic Parameters: Intrinsic parameters are internal to the camera and describe its
optical characteristics. These parameters include:
Camera Geometry
• Focal Length: The focal length determines the camera's field of view and degree of magnification. It represents the distance between the camera's lens and the image sensor when the subject is in focus.
• Principal Point: The principal point is the point where the optical axis intersects the image plane; it acts as the optical center of the image.
• Lens Distortion: Lens distortion refers to imperfections in the camera lens that cause image distortions. Common types include radial distortion (barrel or pincushion distortion) and tangential distortion.
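These intrinsic parameters are conventionally collected into a 3×3 intrinsic matrix K (standard notation; the symbols below are not defined on the slides). With focal lengths fx, fy in pixels, principal point (cx, cy), and a skew term s that is usually zero for modern sensors:

K = | fx  s   cx |
    | 0   fy  cy |
    | 0   0   1  |

K maps normalized camera coordinates to pixel coordinates and is one of the quantities recovered during camera calibration.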
Camera Geometry
Tangential Distortion: Tangential distortion is a different type of distortion that
occurs due to misalignments or irregularities in the lens elements. It causes the
image to appear skewed or stretched asymmetrically, typically in a non-linear
manner. Tangential distortion can result from factors such as slight tilting or
displacement of the lens elements or inconsistencies in lens manufacturing.
Radial Distortion: Radial distortion refers to the distortion that occurs when straight lines
near the edges of an image appear curved or bent. It is caused by imperfections in the lens that
cause light rays to refract differently depending on their distance from the center of the lens.
Radial distortion is typically classified into two subtypes:
Camera Geometry
Barrel Distortion: Barrel distortion causes straight lines to curve outward, resembling the
shape of a barrel. It occurs when magnification decreases toward the edges of the image, so
the outer portions are magnified less than the center. This distortion is commonly observed
in wide-angle lenses.
Pincushion Distortion: Pincushion distortion causes straight lines to curve inward,
resembling the shape of a pincushion. It occurs when magnification increases toward the
edges, so the outer portions are magnified more than the center. Pincushion distortion is
often observed in telephoto lenses.
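Both kinds of distortion are commonly described by the Brown–Conrady polynomial model (the convention used by many calibration tools; the coefficients below are standard notation, not taken from these slides). For a normalized image point (x, y) with r² = x² + y², the distorted coordinates are:

x_d = x(1 + k1·r² + k2·r⁴) + 2·p1·x·y + p2·(r² + 2x²)
y_d = y(1 + k1·r² + k2·r⁴) + p1·(r² + 2y²) + 2·p2·x·y

Here k1 and k2 are the radial coefficients (negative k1 gives barrel distortion, positive k1 gives pincushion), and p1, p2 are the tangential coefficients.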
Extrinsic Parameters: Extrinsic parameters describe the position and orientation of
the camera in the 3D world. These parameters include:
Camera Center: The camera center, also known as the optical center or camera position, is
the 3D location of the camera's center of projection, through which the optical axis passes.
Camera Pose: The camera pose describes the position (translation) and orientation
(rotation) of the camera relative to a reference coordinate system.
Projection Model: The projection model defines how the 3D world is projected
onto the 2D image plane. The most common projection model used is the pinhole
camera model, which assumes a perspective projection. It assumes that light rays
pass through a single point (pinhole) in the camera and project onto the image plane.
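In homogeneous coordinates the pinhole model can be written in one line (standard notation): a world point X = (X, Y, Z, 1)ᵀ projects, up to scale, to the image point x = (u, v, 1)ᵀ as

x ≅ K [R | t] X

where K holds the intrinsic parameters and [R | t] the extrinsic rotation and translation, tying together the quantities defined above.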
Camera Calibration: Camera calibration is the process of determining the intrinsic
and extrinsic parameters of a camera. It involves capturing calibration images with
known calibration patterns, such as a chessboard, and using mathematical algorithms
to estimate the camera parameters.
Camera Geometry
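As an illustration of the chessboard procedure just described, here is a minimal calibration sketch in Python with OpenCV. The board size (9×6 inner corners), square size, and image directory are assumptions for the example.

import glob
import cv2
import numpy as np

board = (9, 6)    # inner corners per row and column (assumed)
square = 0.025    # square size in metres (assumed)

# 3D board-corner coordinates in the board's own frame (the Z = 0 plane).
objp = np.zeros((board[0] * board[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2)
objp *= square

obj_pts, img_pts = [], []
for path in glob.glob("calib/*.png"):   # hypothetical capture directory
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, board)
    if found:
        obj_pts.append(objp)
        img_pts.append(corners)

# Estimate the intrinsics K, distortion coefficients, and per-view extrinsics
# (assumes at least one board was detected).
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)
print("reprojection RMS:", rms)
print("intrinsic matrix K:\n", K)

The per-view rvecs and tvecs are the extrinsic parameters (camera pose) for each calibration image, while K and dist are the intrinsic quantities discussed earlier.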
Understanding camera geometry and its parameters is crucial for various
applications, including computer vision, 3D reconstruction, camera calibration,
augmented reality, and robotics. By accurately modeling the camera's behavior, it
becomes possible to interpret and manipulate images and accurately estimate the
position and geometry of objects in the 3D world.
Camera Geometry
Epipolar geometry is a fundamental concept in computer vision and stereo imaging
that describes the geometric relationship between two camera views observing the
same scene. It provides constraints on the possible locations of corresponding points
in the two images, enabling depth estimation and 3D reconstruction.
Key elements of epipolar geometry include:
Epipolar Geometry
Epipole: The epipole is the point representing the projection of one camera center onto the
image plane of the other camera. It is the point of intersection between the line connecting
the camera centers (the baseline) and the image plane. Each camera has its own epipole in
the other camera's image. In the usual two-camera figure, 𝑒𝑙 and 𝑒𝑟 denote the left and
right epipoles.
Epipolar Plane: An epipolar plane is the 3D plane that contains the baseline (the line
connecting the camera centers) and a given point in the 3D scene. It constrains the possible
locations of the corresponding points in the two camera views.
Epipolar Geometry
Epipolar line: The epipolar line is the straight line of intersection of the epipolar
plane with the image plane. It is the image in one camera of a ray through the optical
center and image point in the other camera. All epipolar lines intersect at the epipole.
Epipolar Geometry
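Algebraically, these constraints are summarized by the 3×3 fundamental matrix F (standard epipolar algebra, not spelled out on these slides). For corresponding homogeneous points x_l and x_r in the left and right images:

x_rᵀ F x_l = 0

The epipolar line in the right image for a left-image point is l_r = F x_l, and the epipoles satisfy F e_l = 0 and Fᵀ e_r = 0. In practice F is estimated from point correspondences, for example with the eight-point algorithm (a DLT-style method) inside a RANSAC loop, which connects to the DLT and RANSAC topics listed in this module's syllabus.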