The document discusses using surface normal vectors for real-time object detection in autonomous driving applications. The goals are to:
1. Develop a stixel-based stereo vision module running at 15-30 fps for detecting objects and estimating their 3D positions.
2. Validate hypothesis regions of interest (ROIs) using surface normal vectors to improve precision by 10%.
3. Analyze object geometry features and classify objects using surface normal vectors.
Three key points about structure from motion:
1. Given multiple images of 3D points, structure from motion aims to estimate the 3D structure and camera motion from 2D point correspondences across images.
2. For affine cameras, factorization methods can be used to decompose the measurement matrix and obtain the motion and structure matrices up to an affine ambiguity.
3. For projective cameras, an iterative procedure alternates between factorization to estimate motion/structure and re-solving for depths to handle the projective ambiguity. At least 7 point correspondences are needed for a two-camera case.
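The affine factorization in point 2 can be sketched numerically. The sketch below uses synthetic data (all names and dimensions are illustrative, not from the original slides): a noise-free measurement matrix of centered affine image coordinates has rank 3, so a rank-3 SVD recovers motion and structure up to an unknown invertible 3×3 affine transform.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup: P random 3D points observed by F affine cameras.
F, P = 3, 10
X = rng.standard_normal((3, P))            # true structure, 3 x P
M_true = rng.standard_normal((2 * F, 3))   # stacked 2x3 affine motion blocks
W = M_true @ X                             # centered measurement matrix, 2F x P

# Tomasi-Kanade-style factorization: rank-3 SVD splits W into motion and
# structure factors.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
M_hat = U[:, :3] * np.sqrt(s[:3])          # estimated motion, 2F x 3
X_hat = np.sqrt(s[:3])[:, None] * Vt[:3]   # estimated structure, 3 x P

# The product is recovered exactly, even though M_hat and X_hat each differ
# from the originals by an invertible 3x3 matrix A: W = (M A)(A^-1 X).
print(np.allclose(M_hat @ X_hat, W))       # True
```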
This document provides an agenda for a presentation on geostatistics for mineral deposits. The presentation will cover topics such as sampling, geostatistics part 1 and 2, and estimations. It will include breaks between sessions and conclude with a discussion period. Sampling topics include an overview of sampling theory and nomographs, while geostatistics sessions will cover variograms, kriging, and simulations. Estimation methods like inverse distance, kriging, and recoverable resources will also be discussed.
This document discusses methods for identifying and removing hidden surfaces when rendering 3D scenes to create a realistic 2D image. It describes two approaches: object-space methods that compare whole objects, and image-space methods that decide visibility point-by-point. It focuses on the depth buffer/z-buffer method, which processes surfaces one point at a time, comparing depth values to determine visibility and storing the color of visible points. It also discusses using scan line coherence to solve hidden surfaces one scan line at a time from top to bottom.
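The z-buffer idea summarized above fits in a few lines. This is a minimal sketch (not the document's code): each pixel keeps the nearest depth seen so far, and a new point is drawn only if it is closer.

```python
import numpy as np

H, W = 4, 4
depth = np.full((H, W), np.inf)      # z-buffer, initialized to "infinitely far"
color = np.zeros((H, W), dtype=int)  # frame buffer

def draw_point(y, x, z, c):
    """Write color c at pixel (y, x) only if z is closer than the stored depth."""
    if z < depth[y, x]:
        depth[y, x] = z
        color[y, x] = c

# Two overlapping "surfaces" rasterized in arbitrary order:
for y in range(4):
    for x in range(4):
        draw_point(y, x, z=2.0, c=1)   # far surface covers everything
for y in range(2):
    for x in range(2):
        draw_point(y, x, z=1.0, c=2)   # near surface covers top-left quadrant

print(color[0, 0], color[3, 3])  # 2 1  -> near surface wins where it overlaps
```

Submission order does not matter: visibility is resolved per pixel by the depth comparison alone.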
This document discusses two-view geometry and epipolar constraints. It defines key concepts like epipolar planes, epipoles, epipolar lines, and the baseline. It explains that corresponding points in two images must lie on corresponding epipolar lines. It describes how the essential and fundamental matrices encode the epipolar geometry and constrain correspondences. It introduces algorithms like the eight-point algorithm and normalized eight-point algorithm to estimate the fundamental matrix from point correspondences. It concludes by explaining how camera calibration allows estimating the essential matrix and extrinsic camera parameters from the fundamental matrix.
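The normalized eight-point algorithm mentioned above can be sketched as follows. The synthetic two-view setup (identity intrinsics, small rotation, 20 points) is an assumption for illustration only; the algorithm itself is the standard one: normalize points, solve the homogeneous system for F, enforce rank 2, denormalize.

```python
import numpy as np

def normalize(pts):
    """Hartley normalization: centroid to origin, mean distance sqrt(2)."""
    c = pts.mean(axis=0)
    s = np.sqrt(2) / np.linalg.norm(pts - c, axis=1).mean()
    T = np.array([[s, 0, -s * c[0]], [0, s, -s * c[1]], [0, 0, 1.0]])
    ph = np.hstack([pts, np.ones((len(pts), 1))])
    return ph @ T.T, T

def eight_point(x1, x2):
    """Normalized eight-point estimate of the fundamental matrix F."""
    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    # Each correspondence contributes one row of the homogeneous system A f = 0.
    A = np.column_stack([p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
                         p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
                         p1[:, 0], p1[:, 1], np.ones(len(p1))])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)   # smallest right singular vector
    U, s, Vt = np.linalg.svd(F)
    F = U @ np.diag([s[0], s[1], 0.0]) @ Vt     # enforce rank 2
    return T2.T @ F @ T1                        # undo normalization

# Synthetic two-view check: 20 points in front of both cameras.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (20, 3)) + np.array([0.0, 0.0, 5.0])
c, s_ = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0, s_], [0, 1, 0], [-s_, 0, c]])   # small y-axis rotation
t = np.array([1.0, 0.2, 0.1])
x1 = X[:, :2] / X[:, 2:3]
X2 = X @ R.T + t
x2 = X2[:, :2] / X2[:, 2:3]
F = eight_point(x1, x2)
h1 = np.hstack([x1, np.ones((20, 1))])
h2 = np.hstack([x2, np.ones((20, 1))])
resid = np.abs(np.einsum('ni,ij,nj->n', h2, F / np.linalg.norm(F), h1))
print(resid.max() < 1e-8)  # epipolar constraint x2^T F x1 = 0 on clean data
```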
This document provides an overview of geostatistics and variogram analysis. It discusses how the variogram describes the spatial correlation of a phenomenon through parameters like the nugget effect and range. Experimental variograms are calculated from data and theoretical models like spherical, exponential, and power models are fitted. The variogram can identify different correlation scales through nested models. Components at different scales can be extracted through kriging. As an example, fertility data from France is analyzed to filter its large-scale spatial structure.
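The experimental variogram described above uses the classical estimator: half the mean squared difference between values at point pairs separated by (approximately) a given lag. A minimal sketch on hypothetical 1D data:

```python
import numpy as np

def experimental_variogram(coords, values, lags, tol):
    """Classical estimator: gamma(h) = 0.5 * mean[(z_i - z_j)^2] over all
    point pairs whose separation lies within tol of lag h."""
    i, j = np.triu_indices(len(values), k=1)
    d = np.linalg.norm(coords[i] - coords[j], axis=1)  # pair separations
    sq = (values[i] - values[j]) ** 2
    gamma = []
    for h in lags:
        m = np.abs(d - h) < tol
        gamma.append(0.5 * sq[m].mean() if m.any() else np.nan)
    return np.array(gamma)

# Hypothetical smooth 1D signal: spatially correlated, so the variogram
# grows with lag until the correlation range is reached.
x = np.linspace(0, 10, 200)
z = np.sin(x)
g = experimental_variogram(x[:, None], z, lags=[0.5, 2.0], tol=0.1)
print(g[0] < g[1])  # short-lag semivariance is smaller -> True
```

A theoretical model (spherical, exponential, power) would then be fitted to these experimental values before kriging.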
The document describes how snakes, or active contours, can be used to model shapes in images. It discusses how snakes work by defining an energy function along a curve and minimizing that energy to find the optimal curve. The energy includes an internal term based on curvature and an external term from image features. Level sets are used to propagate the curves towards the minimum energy configuration using gradient descent. Key steps include modeling the shape as a curve, defining the energy function, deriving the curve to minimize energy via calculus of variations, and propagating the curves using level sets.
The document summarizes key concepts in image formation, including how light interacts with objects and lenses to form images, and how different imaging systems like the human eye and digital cameras work. It discusses factors that affect image quality such as point spread functions and noise. Methods for analyzing the effects of noise propagation and algorithms on image quality are presented, such as error propagation techniques and Monte Carlo simulations.
Advanced Approach for Slopes Measurement by Non-Contact Optical Technique (IJERA Editor)
This document describes an advanced non-contact optical technique for measuring slopes. It introduces a numerical computation method to acquire surface shapes using optical moiré reflection. The method uses coherent illumination and fine pitch gratings to project a reference grating onto a surface and observe interference fringes. The sensitivity and accuracy of this slope measurement method are high. Equations are derived that relate the measured slopes to the optical and geometric parameters of the system.
The Harris corner detector provides rotation-invariant feature detection by analyzing the eigenvalues of the second-moment (autocorrelation) matrix computed at each point. Scale-invariant detectors like SIFT find maxima of scale-space functions like the Laplacian of Gaussian or Difference of Gaussians to identify keypoints independently across scale. Affine-invariant detectors search for intensity extrema along radial lines from seed points and approximate corresponding image regions with ellipses related by the geometric moment invariants. Descriptors aim to provide distinctive yet invariant representations of local image patches centered on detected keypoints to enable reliable matching across variations.
This document discusses feature extraction and edge detection techniques in computer vision. It provides details on:
1) Edge detection methods including first and second derivative operators, Sobel edge detector, Laplacian of Gaussian (LoG), and Canny edge detector.
2) Edge descriptors such as edge normal, direction, position, and strength.
3) Types of edges like step, ramp, line, and roof edges.
4) Corner detection using an eigenvalue analysis of the gradient matrix within a neighborhood.
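The eigenvalue-based corner detection in point 4 can be sketched with the Harris response, which avoids computing eigenvalues explicitly by using their product (determinant) and sum (trace). The image and window size below are illustrative assumptions:

```python
import numpy as np

def harris_response(img, k=0.04):
    """Corner response R = det(M) - k*trace(M)^2, where M is the 2x2
    gradient (second-moment) matrix averaged over a 3x3 neighborhood."""
    Iy, Ix = np.gradient(img.astype(float))

    def box(a):
        """3x3 box average with edge padding."""
        p = np.pad(a, 1, mode='edge')
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0

    Sxx, Syy, Sxy = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)
    det = Sxx * Syy - Sxy ** 2    # product of the two eigenvalues
    tr = Sxx + Syy                # sum of the two eigenvalues
    return det - k * tr ** 2

# Bright square on dark background: response is high near corners (two large
# eigenvalues), low or negative on straight edges (one large eigenvalue).
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
print(R[5, 5] > R[5, 10])  # corner beats edge midpoint -> True
```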
Use of Specularities and Motion in the Extraction of Surface Shape (Damian T. Gordon)
This document discusses using specular reflections or "highlights" and motion to determine surface shape. It describes structured highlight inspection which uses a spherical array of point light sources and images of highlights to calculate surface orientation at each point. A structured highlight inspection system extracts highlights from images and uses lookup tables from calibration to reconstruct the 3D surface shape. Stereo highlight techniques can improve on approximations by using two camera views to uniquely determine illumination angles.
This document discusses techniques for fitting parametric models to sets of image features. It begins by introducing the concept of fitting, where a simple parametric model (like a line or circle) is used to represent multiple detected features. It describes how fitting involves choosing the best model, assigning features to model instances, and determining the number of instances. The document then discusses specific issues that arise, such as noise, outliers, missing data, and model selection. It presents techniques for line fitting, including least squares, total least squares, and robust methods like RANSAC. It also discusses fitting general curves and conics using least squares approaches.
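The RANSAC line-fitting approach mentioned above can be sketched as follows. The data, thresholds, and iteration count are illustrative assumptions; the structure is the standard one: sample a minimal set (two points), count inliers, keep the best consensus set, then refit.

```python
import numpy as np

def ransac_line(pts, n_iter=200, thresh=0.05, seed=0):
    """RANSAC sketch: fit a line through two random points per iteration,
    keep the candidate with the most inliers, then refit on the inliers."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iter):
        p, q = pts[rng.choice(len(pts), 2, replace=False)]
        d = q - p
        if np.allclose(d, 0):
            continue
        n = np.array([-d[1], d[0]])
        n = n / np.linalg.norm(n)                 # unit normal of the line
        inliers = np.abs((pts - p) @ n) < thresh  # perpendicular distances
        if inliers.sum() > best.sum():
            best = inliers
    # Final least-squares refit on the consensus set only.
    a, b = np.polyfit(pts[best, 0], pts[best, 1], 1)
    return a, b, best

# Hypothetical data: 50 points on y = 2x + 1 plus 15 gross outliers.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
line_pts = np.column_stack([x, 2 * x + 1])
outliers = rng.uniform(0.0, 5.0, (15, 2))
a, b, mask = ransac_line(np.vstack([line_pts, outliers]))
print(abs(a - 2) < 0.1, abs(b - 1) < 0.1)   # True True
```

An ordinary least-squares fit on the same data would be dragged toward the outliers; the consensus step is what makes the estimate robust.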
Setting the lower order bit plane to zero would have the effect of reducing the number of distinct gray levels by half. This would cause the histogram to become more peaked, with more pixels concentrated in fewer bins.
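The effect can be verified directly: clearing the lowest-order bit maps each gray-level pair {2k, 2k+1} to 2k, so 256 levels collapse to 128. A minimal sketch:

```python
import numpy as np

img = np.arange(256, dtype=np.uint8)   # one pixel at every gray level
out = img & 0b11111110                 # zero bit plane 0 (the LSB plane)
print(len(np.unique(img)), len(np.unique(out)))  # 256 128
```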
This document discusses moments of inertia, which are a measure of an object's resistance to rotational acceleration about an axis. It defines the moment of inertia of an area and introduces key concepts like the parallel axis theorem, radius of gyration, and calculating moments of inertia through integration or for composite areas. Examples are provided to demonstrate calculating moments of inertia for various shapes, including rectangles, triangles, L-shapes, and composite profiles, about different axes. The document also covers determining moments of inertia at the centroidal axes versus other axes using the parallel axis theorem.
The document presents a new approach called Multi-Scale Oriented Patches (MOPS) for multi-image matching. MOPS uses Harris corners detected at multiple image scales and orientations, and descriptor vectors are generated from image patches around each interest point. The approach was tested successfully for panoramic image stitching.
6161103 10.4 Moments of Inertia for an Area by Integration (etcenterrbru)
This document discusses calculating moments of inertia for planar areas using integration. It describes:
1) Choosing a differential element for integration that has size in only one direction to simplify the calculation.
2) The procedure involves specifying a rectangular differential element and orienting it parallel or perpendicular to the axis of rotation.
3) Moments of inertia are calculated through single or double integration, depending on whether the element has thickness in one or two directions.
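The single-integration procedure in the points above can be checked numerically. This sketch uses a horizontal strip element dA = b·dy (size in one direction only) for a b × h rectangle about its base, where the closed form is I_x = b·h³/3; the dimensions are illustrative.

```python
import numpy as np

b, h = 2.0, 3.0
y = np.linspace(0.0, h, 10001)
yc = 0.5 * (y[:-1] + y[1:])          # strip centroid heights (midpoint rule)
dA = b * np.diff(y)                  # strip areas
I_x = np.sum(yc ** 2 * dA)           # numerical single integration of y^2 dA
print(round(I_x, 4), b * h ** 3 / 3)  # 18.0 18.0  (matches the closed form)
```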
The document analyzes geometric distortions in imagery from the HJ-1A/B satellites and compares methods for geometric correction. It finds:
1) HJ-1A/B CCD imagery has both global and local geometric distortions even after initial correction.
2) Polynomial, thin plate splines, and finite element models were tested for correction using control points. Polynomial modeling performed worst while finite element modeling produced the best results with enough evenly distributed points.
3) Finite element modeling is recommended for precise geometric correction of HJ-1A/B imagery as it is a local method that provides stability and accuracy, especially with over 1,000 control points.
This document discusses methods for dynamically calculating daylight glare over the course of a year. It presents three methods: 1) A timestep-by-timestep RADIANCE simulation that serves as a reference method but is very computationally intensive. 2) A simplified daylight glare probability (DGPs) method based only on vertical eye illuminance, similar to average luminance methods. 3) An enhanced simplified DGP method that also considers simplified images to account for peak glare sources in addition to vertical eye illuminance. The enhanced method is validated against full-year RADIANCE simulations using different shading systems. A histogram analysis and glare rating classification is proposed to evaluate dynamic glare results over a year.
6161103 10.5 Moments of Inertia for Composite Areas (etcenterrbru)
1) Moments of inertia for composite areas can be determined by dividing the area into its composite parts, finding the moment of inertia of each part about its centroidal axis and the reference axis using the parallel axis theorem, and taking the algebraic sum.
2) The procedure was demonstrated by calculating the moment of inertia of a composite area made of a rectangle and circle, and another made of three rectangles.
3) For the second example, the cross-sectional area was divided into three rectangles, the moment of inertia of each was found about the x and y axes using the parallel axis theorem, and summed to find the total moment of inertia.
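The composite-area procedure above reduces to a sum of parallel-axis terms. The T-section below is a hypothetical example (not the document's): for each rectangular part, add its centroidal moment of inertia b·h³/12 and the transfer term A·d², then sum.

```python
def rect_Ix_centroid(b, h):
    """Moment of inertia of a b x h rectangle about its own centroidal x axis."""
    return b * h ** 3 / 12.0

# Hypothetical T-section, base taken as the reference x axis:
# flange 6 x 1 with centroid at height d = 4.5, web 1 x 4 at d = 2.0.
parts = [(6.0, 1.0, 4.5), (1.0, 4.0, 2.0)]   # (b, h, d)
I_x = sum(rect_Ix_centroid(b, h) + b * h * d ** 2 for b, h, d in parts)
print(round(I_x, 2))  # 143.33
```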
ANISOTROPIC SURFACES DETECTION USING INTENSITY MAPS ACQUIRED BY AN AIRBORNE L... (grssieee)
The document discusses methods for estimating the spatial anisotropy of surfaces using near-infrared LiDAR intensity maps over coastal environments. It presents two estimators - one based on 1D correlations of columns and lines in sliding windows, and another based on 2D correlations of windows and their transposes. The estimators are evaluated on synthetic data with varying anisotropy, relative anisotropy, and signal-to-noise ratio. The estimators are then applied to LiDAR intensity maps from coastal areas to characterize anisotropic surfaces independently of intensity variations. Future work involves combining these methods with multi-resolution wavelet approaches and comparing LiDAR intensity to DEM and dual-polarization SAR data.
This document summarizes research analyzing the accuracy of 3D models reconstructed from spherical video images. The study acquired 134 spherical images of an indoor environment using a Garmin VIRB 360 camera. Images were extracted from video and reference points were collected. Aerial triangulation, dense image matching, and registration with TLS point clouds were performed. Results showed reconstruction accuracy was higher when reference points were distributed across images rather than clustered in the middle. However, the model had significant noise due to glass surfaces and stitching challenges. While geometric detail and accuracy were low, the 3D model could still enable some applications. Factors like calibration, stitching, resolution, and illumination variability affected the results.
This document describes work on developing spectrum-based regularization approaches for linear inverse problems. The author proposes using a learned distribution of singular values to build regularization models that are better suited for recovering signals correlated with medium frequencies, not just low frequencies as in traditional models. Algorithms are presented for learning the singular value profile from training data and for solving the resulting regularization models. Experimental results demonstrate that the proposed spectrum-learning regularization and SLR-TV hybrid models can provide improved reconstruction accuracy over total variation and Tikhonov regularization.
Determination of System Geometrical Parameters and Consistency between Scans ... (David Scaduto)
Digital breast tomosynthesis (DBT) requires precise knowledge of acquisition geometry for accurate image reconstruction. Further, image subtraction techniques employed in dual-energy contrast-enhanced tomosynthesis require that scans be performed under nearly identical geometrical conditions. A geometrical calibration algorithm is developed to investigate system geometry and geometrical consistency of image acquisition between consecutive digital breast tomosynthesis scans, according to requirements for dual-energy contrast-enhanced tomosynthesis. Investigation of geometrical accuracy and consistency on a prototype DBT unit reveals accurate angular measurement, but potentially clinically significant differences in acquisition angles between scans. Further, a slight gantry wobble is observed, suggesting the need for incorporation of gantry wobble into image reconstruction, or improvements to system hardware.
Photogrammetry - Space Resection by Collinearity Equations (Ahmed Nassar)
Space resection is commonly used to determine the exterior orientation parameters (position and orientation relative to an exterior coordinate system) associated with one or more photos, based on measurements of ground control points (GCPs). Because space resection is a nonlinear problem, existing methods linearize the collinearity condition and use an iterative least-squares process to determine the final solution. The process also requires initial approximate values of the unknown parameters, some of which must be estimated by a separate least-squares solution.
Detailed Description on Cross Entropy Loss Function (범준 김)
The document discusses cross entropy loss function which is commonly used in classification problems. It derives the theoretical basis for cross entropy by formulating it as minimizing the cross entropy between the predicted probabilities and true labels. For binary classification problems, cross entropy is shown to be equivalent to maximizing the likelihood of the training data which can be written as minimizing the binary cross entropy. This concept is extended to multiclass classification problems by defining the prediction as a probability distribution over classes and label as a one-hot encoding.
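The binary and multiclass forms described above can be written in a few lines, and the one-hot multiclass form reduces to the binary form when K = 2. A minimal sketch with illustrative predictions:

```python
import numpy as np

def binary_cross_entropy(y, p, eps=1e-12):
    """Mean negative log-likelihood of labels y under predicted probs p."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def categorical_cross_entropy(Y, P, eps=1e-12):
    """Multiclass form: Y is one-hot (N x K), rows of P are distributions."""
    return -np.mean(np.sum(Y * np.log(np.clip(P, eps, 1.0)), axis=1))

y = np.array([1, 0, 1])
p = np.array([0.9, 0.1, 0.8])
print(round(binary_cross_entropy(y, p), 4))       # 0.1446

# Recast as two-class one-hot labels and distributions: same loss.
Y = np.column_stack([1 - y, y])
P = np.column_stack([1 - p, p])
print(round(categorical_cross_entropy(Y, P), 4))  # 0.1446
```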
This document discusses digital image processing of satellite images. It describes how satellite images are represented digitally as pixels with brightness values. It outlines three main categories of image processing: image rectification and restoration to correct distortions; enhancement to improve visual interpretation; and information extraction to automate feature identification. Specific techniques discussed include image rectification, contrast enhancement, spatial filtering, edge enhancement, and band ratioing. The overall aim is to analyze satellite images both visually and quantitatively.
The term "full frame" is used by users of digital single-lens reflex cameras (DSLRs) as shorthand for an image sensor format the same size as 35 mm format (36 mm × 24 mm) film.
Panoramic imagery is created either by digitally stitching together multiple images taken from the same position (left/right, up/down) or by rotating a camera with conventional optics and an area or line sensor.
- The document describes using the MicMac photogrammetry software to create orthophotographs, point clouds, and digital surface models (DSMs) from Pleiades satellite images.
- It reviews two papers on using MicMac and discusses their methods, which include image orientation, tie point calculation, sparse point cloud generation, georeferencing, and dense image correlation to produce outputs.
- The results section shows sample outputs including tie points, sparse point clouds, DEMs, shaded relief images, and orthophotographs produced for case studies in the two papers.
Optic Flow Estimation by Deep Learning outlines several key concepts in optical flow estimation including:
- Optical flow is the apparent motion of brightness patterns in images. Estimating optical flow involves making assumptions like brightness constancy and spatial coherence.
- Classical algorithms like Lucas-Kanade and Horn-Schunck use techniques like regularization, coarse-to-fine processing, and descriptor matching to address challenges like the aperture problem, large displacements, and occlusions.
- Recent deep learning approaches like FlowNet, DeepFlow, and EpicFlow use convolutional neural networks to directly learn optical flow, achieving state-of-the-art performance on benchmarks. These approaches combine descriptor matching and variational optimization with learned convolutional features.
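The brightness-constancy reasoning behind the classical algorithms above can be sketched with a single-window Lucas-Kanade solve: every pixel supplies one linearized constraint Ix·u + Iy·v + It = 0, and the flow (u, v) is the least-squares solution. The synthetic shifted pattern is an illustrative assumption.

```python
import numpy as np

def lucas_kanade(I1, I2):
    """Single-window Lucas-Kanade sketch: least-squares solve of
    [Ix Iy][u v]^T = -It, from the brightness constancy assumption."""
    Iy, Ix = np.gradient(I1.astype(float))
    It = I2.astype(float) - I1.astype(float)
    A = np.column_stack([Ix.ravel(), Iy.ravel()])
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Smooth synthetic pattern shifted by one pixel in x:
x, y = np.meshgrid(np.arange(32), np.arange(32))
I1 = np.sin(0.3 * x) + np.cos(0.2 * y)
I2 = np.sin(0.3 * (x - 1)) + np.cos(0.2 * y)   # pattern moved +1 px in x
u, v = lucas_kanade(I1, I2)
print(round(u, 1), round(v, 1))  # expect approximately 1.0 and 0.0
```

Aggregating all pixels into one system sidesteps the aperture problem only because this synthetic pattern has gradients in both directions; real implementations solve per-window and use coarse-to-fine pyramids for large displacements.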
This presentation is an analysis of the paper "SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing".
This document discusses digital image processing of satellite images. It describes how satellite images are represented digitally as pixels with brightness values. It outlines three main categories of image processing: image rectification and restoration to correct distortions; enhancement to improve visual interpretation; and information extraction to automate feature identification. Specific techniques discussed include image rectification, contrast enhancement, spatial filtering, edge enhancement, and band ratioing. The overall aim is to analyze satellite images both visually and quantitatively.
The frame camera is used by users of digital single-lens reflex cameras (DSLRs) as a shorthand for an image sensor format which is the same size as 35mm format (36 mm × 24 mm) film.
Panoramic imagery is created either by digitally stitching together multiple images from the same position (left/right, up/down) or by rotating a camera with conventional optics, and an area or line sensor.
- The document describes using the MicMac photogrammetry software to create orthophotographs, point clouds, and digital surface models (DSMs) from Pleiades satellite images.
- It reviews two papers on using MicMac and discusses their methods, which include image orientation, tie point calculation, sparse point cloud generation, georeferencing, and dense image correlation to produce outputs.
- The results section shows sample outputs including tie points, sparse point clouds, DEMs, shaded relief images, and orthophotographs produced for case studies in the two papers.
Optic Flow Estimation by Deep Learning outlines several key concepts in optical flow estimation including:
- Optical flow is the apparent motion of brightness patterns in images. Estimating optical flow involves making assumptions like brightness constancy and spatial coherence.
- Classical algorithms like Lucas-Kanade and Horn-Schunck use techniques like regularization, coarse-to-fine processing, and descriptor matching to address challenges like the aperture problem, large displacements, and occlusions.
- Recent deep learning approaches like FlowNet, DeepFlow, and EpicFlow use convolutional neural networks to directly learn optical flow, achieving state-of-the-art performance on benchmarks. These approaches combine descriptor matching, variational optimization,
This presentation is an analysis of the paper,"SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing"
A STUDY AND ANALYSIS OF DIFFERENT EDGE DETECTION TECHNIQUEScscpconf
In the first study [1], a combination of K-means, watershed segmentation method, and Difference In Strength (DIS) map were used to perform image segmentation and edge detection
tasks. We obtained an initial segmentation based on K-means clustering technique. Starting from this, we used two techniques; the first is watershed technique with new merging
procedures based on mean intensity value to segment the image regions and to detect their boundaries. The second is edge strength technique to obtain accurate edge maps of our images without using watershed method. In this technique: We solved the problem of undesirable over segmentation results produced by the watershed algorithm, when used directly with raw data images. Also, the edge maps we obtained have no broken lines on entire image. In the 2nd study level set methods are used for the implementation of curve/interface evolution under various forces. In the third study the main idea is to detect regions (objects) boundaries, to isolate and extract individual components from a medical image. This is done using an active contours to detect regions in a given image, based on techniques of curve evolution, Mumford–Shah functional for segmentation and level sets. Once we classified our images into different intensity regions based on Markov Random Field. Then we detect regions whose boundaries are not necessarily defined by gradient by minimize an energy of Mumford–Shah functional forsegmentation, where in the level set formulation, the problem becomes a mean-curvature which will stop on the desired boundary. The stopping term does not depend on the gradient of the image as in the classical active contour. The initial curve of level set can be anywhere in the image, and interior contours are automatically detected. The final image segmentation is one
closed boundary per actual region in the image.
The document discusses various techniques for image segmentation including discontinuity-based approaches, similarity-based approaches, thresholding methods, region-based segmentation using region growing and region splitting/merging. Key techniques covered include edge detection using gradient operators, the Hough transform for edge linking, optimal thresholding, and split-and-merge segmentation using quadtrees.
This document discusses algorithms for visible surface determination (VSD) to determine which surfaces are visible during 3D rendering. It describes two main approaches: image precision, which operates at the display resolution, and object precision, which operates at the object level. It also discusses techniques like the depth buffer and depth sorting algorithms. The depth buffer method uses two buffers - a depth buffer and frame buffer - to track pixel depth and color values. It processes objects and surfaces, testing pixels and updating the buffers. Depth sorting paints surfaces in order of decreasing depth to resolve visibility.
The document discusses image segmentation techniques including thresholding. Thresholding divides an image into foreground and background regions based on pixel intensity values. Global thresholding uses a single threshold value for the entire image, while adaptive or local thresholding uses variable thresholds that change across the image. Multilevel thresholding can extract objects within a specific intensity range using multiple threshold values. The Hough transform is also presented as a way to connect disjointed edge points and detect shapes like lines in an image.
Real-time large scale dense RGB-D SLAM with volumetric fusion extends KinectFusion to larger scales. It represents the volumetric reconstruction as a rolling buffer that translates as the camera moves. It estimates camera pose through combined geometric and photometric constraints. It closes loops by non-rigidly deforming the map with constraints from loop closures and jointly optimizes the camera poses and map. Evaluation shows it produces large, globally consistent, real-time dense reconstructions.
Fisheye Omnidirectional View in Autonomous DrivingYu Huang
This document discusses several papers related to using omnidirectional/fisheye camera views for autonomous driving applications. The papers propose methods for tasks like image classification, object detection, scene understanding from 360 degree camera data. Specific approaches discussed include graph-based classification of omnidirectional images, learning spherical convolutions for 360 degree imagery, spherical CNNs, and networks for scene understanding and 3D object detection using around view monitoring camera systems.
안녕하세요 딥러닝 논문읽기 모임 입니다! 오늘 소개할 논문은 3D관련 업무를 진행 하시는/ 희망하시는 분들의 필수 논문인 VoxelNET 입니다.
발표자료:https://www.slideshare.net/taeseonryu/mcsemultimodal-contrastive-learning-of-sentence-embeddings
안녕하세요! 딥러닝 논문읽기 모임입니다.
오늘은 자율 주행, 가정용 로봇, 증강/가상 현실과 같은 다양한 응용 분야에서 중요한 문제인 3D 포인트 클라우드에서의 객체 탐지에 대한 획기적인 진전을 소개하고자 합니다. 이를 위해 'VoxelNet'이라는 새로운 3D 탐지 네트워크에 대해 알아보겠습니다.
1. 기존 방법의 한계
기존의 많은 노력은 수동으로 만들어진 특징 표현, 예를 들어 새의 눈 시점 투영 등에 집중해 왔습니다. 하지만 이러한 방법들은 LiDAR 포인트 클라우드와 영역 제안 네트워크(RPN) 사이의 연결을 효과적으로 수행하기 어렵습니다.
2. VoxelNet의 혁신적 접근법
VoxelNet은 3D 포인트 클라우드를 위한 수동 특징 공학의 필요성을 없애고, 특징 추출과 바운딩 박스 예측을 단일 단계, end-to-end 학습 가능한 깊은 네트워크로 통합합니다. VoxelNet은 포인트 클라우드를 균일하게 배치된 3D 복셀로 나누고, 새롭게 도입된 복셀 특징 인코딩(VFE) 레이어를 통해 각 복셀 내의 포인트 그룹을 통합된 특징 표현으로 변환합니다.
3. 효과적인 기하학적 표현 학습
이 방식을 통해 포인트 클라우드는 서술적인 체적 표현으로 인코딩되며, 이는 RPN에 연결되어 탐지를 생성합니다. VoxelNet은 다양한 기하학적 구조를 가진 객체의 효과적인 구별 가능한 표현을 학습합니다.
4. 성능 평가
KITTI 자동차 탐지 벤치마크에서의 실험 결과, VoxelNet은 기존의 LiDAR 기반 3D 탐지 방법들을 큰 차이로 능가했습니다. 또한, LiDAR만을 기반으로 한 보행자와 자전거 탐지에서도 희망적인 결과를 보였습니다.
VoxelNet의 도입은 3D 포인트 클라우드에서의 객체 탐지를 혁신적으로 개선하고 있으며, 이 분야에서의 미래 발전에 중요한 영향을 미칠 것으로 기대됩니다.
오늘 논문 리뷰를 위해 이미지처리 허정원님이 자세한 리뷰를 도와주셨습니다 많은 관심 미리 감사드립니다!
https://youtu.be/yCgsCyoJoMg
본 논문은 single depth map으로부터의 정확한 3D hand pose estimation을 목표로 한다. 3D hand pose estimation은 HCI, AR등의 기술을 구현함에 있어서 매우 중요한 기술이다. 이를 위해 많은 연구자들이 정확도를 높이기 위해 여러 방법을 제시하였지만, 여전히 손가락들의 비슷한 생김새, 가려짐, 다양한 손가락의 움직임으로 인한 복잡성 때문에 정확도를 올리는데 한계가 있었다. 본 논문은 기존 방법들의 한계를 극복하기 위해 기존 방법들이 사용하는 입력 형태와 출력 형태를 바꾸었다. 2d depth image를 입력으로 받아 hand joint의 3D coordinate를 직접 regress하는 대부분의 기존 방법들과는 달리, 제안하는 모델은 3D voxelized depth map을 입력으로 받아 3D heatmap을 출력한다. 이를 위해 encoder-decoder 형식의 3D CNN을 사용하였고, 달라진 입력과 출력 형태로 인해 제안하는 모델은 널리 사용되는 3개의 3d hand pose estimation dataset, 1개의 3d human pose estimation dataset에서 가장 높은 성능을 내었다. 또한 ICCV 2017에서 주최된 HANDS 2017 challenge에서 우승 하였다.
This paper presents a technique to create panoramic video in real-time by stitching together video frames from multiple webcams. The system has two stages: an initialization stage and a real-time stage. In the initialization stage, features are detected and matched between webcam frames to compute a perspective matrix describing their geometric relationship. In the real-time stage, frames are registered and blended using the perspective matrix to display the combined wide field of view in real-time. The technique was demonstrated using two ordinary webcams and allows for inexpensive panoramic video without specialized hardware.
Data Processing Using THEOS Satellite Imagery for Disaster Monitoring (Case S...NopphawanTamkuan
This content shows the specification of THEOS/Thaichote (Thai satellite), information of flood in Vietnam, comparison of pre-disaster image (Landsat-8) and post-disaster image (THEOS) by different methods such as color composite, thresholding, and segmentation for flooded areas classification.
An Efficient Algorithm for the Segmentation of Astronomical ImagesIOSR Journals
This document proposes an efficient algorithm for segmenting celestial objects from astronomical images. The algorithm uses multiple preprocessing steps including removing bright point sources, stationary wavelet transform, total variation denoising, and adaptive histogram equalization. Level set segmentation is then used as the key technique for segmentation. Preprocessing helps overcome issues like noise, weak object edges, and low contrast. Level set segmentation can segment objects while retaining their texture and shape information for subsequent classification. The algorithm is tested on various celestial objects and shown to effectively segment them.
Digital images can be manipulated mathematically by treating pixel brightness values as numbers. This document discusses various digital image processing techniques including:
1. Image rectification to correct geometric distortions and calibrate radiometric data. This involves techniques like geometric corrections to adjust for sensor distortions and radiometric corrections to standardize brightness values.
2. Image enhancement techniques like contrast stretching to increase image contrast and improve feature detection. Methods include linear stretches, histogram equalization, and logarithmic transforms.
3. Local operations called spatial filtering that modify pixel values based on neighboring pixels. This can emphasize or de-emphasize certain image features or spatial frequencies to enhance details or reduce noise.
The document discusses refraction modeling and experiments to better understand refraction effects. Key points:
- Experiments using reciprocal simultaneous observations found refraction coefficients (k values) close to zero rather than the typical 0.13 value, indicating refraction is more complex.
- Adjusting a monitoring network found the optimal k value was -2.12 rather than the assumed 0.13, and a stochastic model accounting for zenith angle noise improved results.
- An experiment measuring a triangle closer to a wall found increasing angular errors correlated with decreasing distance from the wall, supporting an asymmetrical refraction model.
- A proposed generalized 3D refraction model accounts for gradient direction, station orientation, and as
Scan conversion algorithms convert graphical primitives defined in terms of coordinates into pixels on a raster display. The midpoint line algorithm uses integer calculations to scan convert lines of varying slopes. Area primitives like rectangles are filled by iterating through pixels within the boundary. Anti-aliasing aims to reduce jagged edges by weighting pixel intensities based on overlap with graphical elements.
The document summarizes object and face detection techniques including:
- Lowe's SIFT descriptor for specific object recognition using histograms of edge orientations.
- Viola and Jones' face detector which uses boosted classifiers with Haar-like features and an attentional cascade for fast rejection of non-faces.
- Earlier face detection work including eigenfaces, neural networks, and distribution-based methods.
Data-Driven Motion Estimation With Spatial AdaptationCSCJournals
The pel-recursive computation of 2-D optical flow raises a wealth of issues, such as the treatment of outliers, motion discontinuities and occlusion. Our proposed approach deals with these issues within a common framework. It relies on the use of a data-driven technique called Generalised Cross Validation to estimate the best regularisation scheme for a given pixel. In our model, the regularisation parameter is a general matrix whose entries can account for different sources of error. The motion vector estimation takes into consideration local image properties following a spatially adaptive approach where each moving pixel is supposed to have its own regularisation matrix. Preliminary experiments indicate that this approach provides robust estimates of the optical flow.
This document presents a summary of a research paper on shape from focus. Shape from focus is a technique that uses differences in focus levels across a series of images to obtain depth information and reconstruct the 3D shape of an object. The paper develops a sum-modified Laplacian (SML) operator to provide local measures of image focus quality. The SML operator is applied to images captured at different focus levels to determine focus measures. A depth estimation algorithm then interpolates the focus measures to obtain accurate depth estimates for each point. Results show the SML operator provides robust focus measures and the overall shape from focus approach can effectively reconstruct shapes, making it suitable for challenging visual inspection problems.
Computer Graphics - Lecture 03 - Virtual Cameras and the Transformation Pipeline💻 Anton Gerdelan
Slides from when I was teaching CS4052 Computer Graphics at Trinity College Dublin in Ireland.
These slides aren't used any more so they may as well be available to the public!
There are some mistakes in the slides, I'll try to comment below these.
Stixel based real time object detection for ADAS using surface normal
1. Stixel-based Real Time Object detection for
ADAS using Surface Normal Vectors
CVLab. at Inha Univ.
Tae-Kang Woo
2016.12.
Keywords: Surface vector, Detection validation, Disparity confidence (mid-level representation confidence), Stixel, Real-time ADAS, 3D reconstruction, Extrinsic parameter estimation, Object detection
2. Contents
I. Problem Definition
1. Problem
2. Goal
3. Related work
II. System Design
III. Surface Normal Vector
1. SNV map using integral image
2. Local SNV computation
IV. Super-SNV
1. S-SNV computation
2. Parametric issue
3. Adaptive mean shift
V. Evaluation
1. Test scenario & Database
2. Evaluation method
3. Experimental result & discussion
VI. Conclusion
3. Introduction
❖ Flow chart of ADAS Stereo vision
[Pipeline diagram] Left image → disparity map → V-disparity → ground-line fit → ground removal in the disparity map (remove pixels below the ground plane, i.e., the V-disparity line) → height constraint (remove sky regions more than 2.5 m above the ground) → stixel segmentation
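As a rough sketch of the ground-removal stage above, the V-disparity step can be written in a few lines. This is a minimal illustration under assumed conventions, not the authors' implementation: the disparity map is a dense NumPy array, a least-squares fit stands in for whatever line-fitting the pipeline actually uses, and `v_disparity`, `fit_ground_line`, and `ground_mask` are hypothetical helper names.

```python
import numpy as np

def v_disparity(disp, max_d):
    """V-disparity histogram: row v, column d counts pixels in image row v
    whose (integer-truncated) disparity equals d."""
    h = disp.shape[0]
    vd = np.zeros((h, max_d + 1), dtype=np.int32)
    for v in range(h):
        d = disp[v]
        d = d[(d >= 0) & (d <= max_d)].astype(int)
        np.add.at(vd[v], d, 1)
    return vd

def fit_ground_line(vd, min_votes=5):
    """Least-squares fit d = a*v + b through each row's dominant disparity."""
    rows, peaks = [], []
    for v in range(vd.shape[0]):
        d = int(np.argmax(vd[v]))
        if vd[v, d] >= min_votes:
            rows.append(v)
            peaks.append(d)
    a, b = np.polyfit(rows, peaks, 1)
    return a, b

def ground_mask(disp, a, b, tol=1.0):
    """Pixels whose disparity matches the fitted ground line within tol."""
    v = np.arange(disp.shape[0])[:, None]
    return np.abs(disp - (a * v + b)) <= tol
```

Pixels matching the fitted line are treated as ground; everything above it survives into stixel segmentation.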
5. Purpose of Stereo vision
❖ Goal
▪ Stixel-based stereo vision module for real-time ADAS
• 15 fps on TX1 or 30 fps on PC
▪ Stable hypothesis ROIs for the recognition module
• 10% improvement in precision rate by removing erroneous ROIs
▪ Object geometry feature analysis & classification using SNVs
• Propose 3 classes of forward situation based on surface normal vectors
• Hypothesis ROI validation using surface-vector object classification
▪ Extrinsic parameter output between camera and ROI (ground, object)
• The representative vector of the surface vectors in the ROI is selected
6. Previous approach – object detection
❖ A disparity map refinement to enhance weakly-textured urban environment data (2013)
▪ Research on overcoming disparity errors is the most active direction.
▪ Defines a refinement term using edge-based segmentation
▪ Performance improves, but processing takes more than 1,700 ms
7. Previous approach – object detection
❖ Disparity confidence map (2010)
▪ Confidence map based on matching cost to enable disparity validation
▪ Drawback: reliability is not provided at the object level
▪ Reliability is not provided for interpolated disparity estimates
[Figure] Image / Disparity / Confidence
8. Previous approach – object detection
❖ U-V disparity map analysis (2010, 2015)
▪ Super-pixel method based on 2D projection
▪ Assumes that disparity exists within the object; the object is detected by fitting a line on each axis after projection
▪ Multiple errors occur when the disparity is interpolated to improve performance
[Figure] Test image / Result / disparity / V-disparity / U-disparity
9. Problem – object detection
❖ Hypothesis Error
Stereo matching error rates of the Deep Embedding algorithm:

Error      Out-Noc    Out-All
2 pixels   24.83 %    28.39 %
3 pixels   17.14 %    20.78 %
4 pixels   13.18 %    16.70 %
5 pixels   10.79 %    14.14 %

Inevitable errors remain in reflective regions, despite the state-of-the-art method.
Z. Chen, X. Sun, Y. Yu, L. Wang and C. Huang: A Deep Visual Correspondence Embedding Model for Stereo Matching Costs. ICCV 2015.
10. System design
❖ INNOVATION: Develop a validation method with physical meaning

[System diagram] Left/right images → stereo matching → disparity map → stixel estimation → stixel segmentation → (objectness) stixels. In parallel, the disparity map feeds surface normal computation → surface normal map → NERV (Normal-based Efficient Re-Validation) hypothesis validation → bounding boxes with 3D position, distance, and multiple ROIs. By-products: depth features for RGB-D processing and extrinsic parameters (camera pose).
11. Hypothesis ROI validation
❖ Surface normal vector
[Figure] A triangle of 3D points A, B, C and its surface normal, shown both in the image and in real (metric) coordinates.

N = AB × AC

Equivalently, with tangent vectors u and v:

N = u × v = ( u_y·v_z − u_z·v_y, u_z·v_x − u_x·v_z, u_x·v_y − u_y·v_x )
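The cross-product computation above fits in a few lines of code. A minimal sketch with NumPy (the helper name `surface_normal` is mine, not from the slides):

```python
import numpy as np

def surface_normal(a, b, c):
    """Unit normal of the plane through 3D points a, b, c: N = AB x AC."""
    u = np.asarray(b, dtype=float) - np.asarray(a, dtype=float)
    v = np.asarray(c, dtype=float) - np.asarray(a, dtype=float)
    # explicit cross product, matching the component formula above
    n = np.array([u[1] * v[2] - u[2] * v[1],
                  u[2] * v[0] - u[0] * v[2],
                  u[0] * v[1] - u[1] * v[0]])
    return n / np.linalg.norm(n)
```

For three points spanning a flat patch in the x-z plane (y pointing down, a common camera-coordinate convention), the normal comes out parallel to the y axis.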
12. Hypothesis ROI validation
❖ Surface normal at Hypothesis Error
[Figure] In a region with disparity errors, the triangle A, B, C spans erroneous 3D points, so the resulting surface normal vector of the error area is distorted.
13. Hypothesis ROI validation
❖ Assumption
▪ Normal vectors can represent differences in object attributes.
▪ Each normal vector carries information: {position, direction, scale}
▪ Surface normals can be divided into 3 classes (i.e., object, ground, error)
❖ Goal
▪ Find the differences among the three classes of surface normals

Classes of surface normal vectors — Object, Ground, and Error: each class is characterized by its surface normals' position {x, y, z}, direction {i, j, k}, and scale s.
14. Hypothesis ROI validation
❖ Feature of normal in error region
▪ Higher density than at other positions
▪ Their normals have no horizontal component
15. SNV Map Computation
❖ How to compute normal vector? – Naïve SNV
[Figure] 3D point cloud from disparity → surface normals. Processing time: 29 ms.
• It has generally been considered difficult to compute surface vectors over a whole image in real time.
16. SNV Map Computation
❖ How to compute normal vectors efficiently? – Integral image
S(I_O, m, n, r) = 1/(4r²) · ( I_O(m+r, n+r) − I_O(m−r, n+r) − I_O(m+r, n−r) + I_O(m−r, n−r) )

Horizontal tangent u:
u_x = ( P_x(m+r, n) − P_x(m−r, n) ) / 2
u_y = ( P_y(m+r, n) − P_y(m−r, n) ) / 2
u_z = ( S(I_Pz, m+1, n, r−1) − S(I_Pz, m−1, n, r−1) ) / 2

Vertical tangent v:
v_x = ( P_x(m, n+r) − P_x(m, n−r) ) / 2
v_y = ( P_y(m, n+r) − P_y(m, n−r) ) / 2
v_z = ( S(I_Pz, m, n+1, r−1) − S(I_Pz, m, n−1, r−1) ) / 2

N = u × v

• where P_x, P_y, and P_z are two-dimensional maps storing the x-, y-, and z-coordinates of the organized point cloud, and I_Pz is the integral image of the z-components of the point cloud.
• The radius r is adaptive: R(m, n) = min( B(m, n), T(m, n)/2 ), a smoothing radius depending on depth and depth change.

Holzer, S., Rusu, R. B., Dixon, M., Gedikli, S., & Navab, N. (2012, October). Adaptive neighborhood selection for real-time surface normal estimation from organized point cloud data using integral images. In Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on (pp. 2684-2689). IEEE.

Processing time: 28 ms on a 2.26 GHz Intel Core 2 Quad CPU with 4 GB RAM (VGA image, 307,200 pixels)
Processing time: 12 ms on a 2.7 GHz Intel Core i7 CPU with 16 GB RAM (VGA image, 307,200 pixels)
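A compact sketch of the integral-image normal estimation described above. It is a simplification under assumptions of my own: a fixed interior radius r ≥ 2 replaces the adaptive R(m, n), and `integral_image`, `box_term`, and `normal_at` are my names, not from the slides or the cited paper.

```python
import numpy as np

def integral_image(img):
    """Summed-area table: ii[m, n] = sum of img[:m+1, :n+1]."""
    return np.cumsum(np.cumsum(np.asarray(img, dtype=float), axis=0), axis=1)

def box_term(ii, m, n, r):
    """S(I, m, n, r): smoothed value from four integral-image lookups."""
    return (ii[m + r, n + r] - ii[m - r, n + r]
            - ii[m + r, n - r] + ii[m - r, n - r]) / (4.0 * r * r)

def normal_at(Px, Py, Pz_ii, m, n, r):
    """Unit surface normal at interior pixel (m, n) of an organized point
    cloud; u and v are smoothed central differences along the two axes."""
    u = np.array([(Px[m + r, n] - Px[m - r, n]) / 2.0,
                  (Py[m + r, n] - Py[m - r, n]) / 2.0,
                  (box_term(Pz_ii, m + 1, n, r - 1)
                   - box_term(Pz_ii, m - 1, n, r - 1)) / 2.0])
    v = np.array([(Px[m, n + r] - Px[m, n - r]) / 2.0,
                  (Py[m, n + r] - Py[m, n - r]) / 2.0,
                  (box_term(Pz_ii, m, n + 1, r - 1)
                   - box_term(Pz_ii, m, n - 1, r - 1)) / 2.0])
    nvec = np.cross(u, v)
    return nvec / np.linalg.norm(nvec)
```

Each smoothed term costs only four lookups regardless of r, which is what makes per-frame timings in the tens of milliseconds plausible.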
17. SNV Map Computation
❖ How to compute normal vectors in real time? – Local SNV
N = u × v

PROCESS
1. Compute SNVs in the stixel ROI
2. Convert coordinates to angles
3. Find the mode angle using a histogram
4. Remove outliers with adaptive mean shift
5. Select the converged value as the main extrinsic parameter
18. Ground information in surface vector
❖ How to compute normal vector in real time? – Local SNV
[Figure] Ground-region SNVs: original image and full search (28 ms) vs. sampling the ground at 2-pixel and 5-pixel intervals (5 ms).
19. Ground information in surface vector
❖ How to compute normal vector efficiently? – Local SNV + Super-SNV
[Figure] Ground-normal component ranges: X ≈ −0.0–0.0, Y ≈ −1.0–−0.9, Z ≈ −0.1–−0.0.
• The surface angle of the object can be calculated, so automatic computation of extrinsic parameters is possible.
• Estimated pitch angle: −1.89°
20. Direction of ground and object
❖ Stixel-area-based SNV
[Figure] Direction of the ground vector vs. direction of the object vector, plotted on x-y-z axes. Pitch angle: −1.89°
21. Super Surface Normal vector
❖ SSNV Selection method
• Although the resolution on the Cartesian component is a constant 0.1, the corresponding angular resolution varies as cos⁻¹(x). Computing the histogram directly over the component would therefore significantly affect reliability.
• The histogram is therefore computed over the angle θ, with a uniform interval of 0.1°.
[Figure] Histogram over θ from −5° to 5°, interval 0.1°.

Conversion from the (x, y, z) Cartesian coordinates to pitch, yaw, and roll angles:
pitch(°) = 90° − cos⁻¹( z / √(y² + z²) )
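The pitch conversion and the angle-domain histogram can be sketched as below. This is an illustrative rendering, not the authors' code: the helper names and the −5°…5° bin range (taken from the figure) are assumptions.

```python
import numpy as np

def pitch_deg(n):
    """pitch(deg) = 90 - arccos(z / sqrt(y^2 + z^2)) for a normal (x, y, z)."""
    x, y, z = n
    return 90.0 - np.degrees(np.arccos(z / np.hypot(y, z)))

def pitch_histogram(normals, lo=-5.0, hi=5.0, step=0.1):
    """Histogram over theta with a uniform 0.1-degree bin width, so the
    resolution is constant in angle rather than in the vector component."""
    angles = np.array([pitch_deg(n) for n in normals])
    edges = np.arange(lo, hi + step, step)
    hist, _ = np.histogram(angles, bins=edges)
    mode = edges[np.argmax(hist)] + step / 2.0  # centre of the peak bin
    return hist, mode
```

A ground normal pointing along −y tilted by θ toward −z maps to pitch −θ, so a flat road slightly pitched down reports a small negative angle, as in the −1.89° example.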
22. Issue of vector interval
❖ How to set the interval of the vectors?
▪ The optimal interval is a trade-off between execution time and accuracy
▪ Super surface normal vector confidence
[Figure] SNVs sampled at a 5-pixel interval — what should the standard for choosing the interval be?
23. Issue of vector interval
❖ How to set the interval of the vectors? – Processing time
23
10.71
6.53
5.36
4.48
4.12
4.01
6.83
1.7
0.97
0.71
0.52
0.41
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
1
2
3
4
5
6
Time(ms)
Interval(pixel)
1 2 3 4 5 6
SNV cal time 10.71 6.53 5.36 4.48 4.12 4.01
Super-SNV time 6.83 1.7 0.97 0.71 0.52 0.41
Suface normal processing time
SNV cal time Super-SNV time
• The surface vector must be computed over the entire image while maintaining real-time performance.
• Since the surface-vector calculation is a sub-module of the whole pipeline, it cannot be allowed to be costly.
• Intervals below 3 pixels are difficult to use, because only intervals of 3 pixels or more fit within the 6.6 ms budget, i.e. 10% of the real-time frame budget.
• Can reliability still be guaranteed at intervals of 3 pixels or more?
Real-time boundary: 6.6 ms
24. Issue of vector interval
❖ How to set the interval of the vectors? – Accuracy
▪ Histogram mode
• Inliers are defined as the ±α range around the histogram mode (the bin with the maximum count), with α chosen so that the range covers 95% of the total number of vectors. The mean and standard deviation of the inlier vectors are then computed.
• Here the inlier mean is -1.94° and the variance is (64.06?).
[Figure: histogram with mode -1.3°; 95% inlier range defined around the mode; inlier mean -1.94°]
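The mode-based inlier statistic described above can be sketched like this (an illustrative reconstruction; the function name and the way α is grown bin-by-bin are my assumptions):

```python
import numpy as np

def mode_inlier_stats(pitches, bin_width=0.1, coverage=0.95):
    """Mode-based inlier statistics for pitch angles in degrees.

    Builds a histogram with the given bin width, takes the bin with the
    maximum count as the mode, then grows a symmetric +/- alpha window
    around it until it covers `coverage` of all samples. Returns the
    mode and the mean/std of the inlier vectors.
    """
    pitches = np.asarray(pitches, dtype=float)
    lo, hi = pitches.min(), pitches.max()
    bins = max(1, int(np.ceil((hi - lo) / bin_width)))
    counts, edges = np.histogram(pitches, bins=bins)
    k = int(np.argmax(counts))
    mode = 0.5 * (edges[k] + edges[k + 1])
    # Grow alpha until the window holds the requested fraction of samples.
    alpha = bin_width
    while np.mean(np.abs(pitches - mode) <= alpha) < coverage:
        alpha += bin_width
    inliers = pitches[np.abs(pitches - mode) <= alpha]
    return mode, inliers.mean(), inliers.std()
```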
25. Issue of vector interval
❖ How to set the interval of the vectors? – Accuracy
▪ Histogram Distribution
• Because of the physical observation geometry of the camera, disparity and surface vectors cannot be extracted on the far side of curved surfaces; the distribution therefore becomes skewed and biased in one direction.
• The vectors of the ideal ground, rather than the camera's observation of it, would be expected to follow a normal distribution, but the sample is distorted by the observation.
• For such a skewed distribution the representative values order as mode < median < mean (or the reverse), and the median is known to lie near the mean, roughly at the point dividing the mean-to-mode interval into three equal parts.
[Figure: camera at height h_c with optical axis, vertical FOV, angle 𝜃, and the ground plane]
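The mode/median/mean ordering claimed above can be illustrated on synthetic data (an illustration only, not from the slides; the lognormal sample stands in for a skewed pitch distribution):

```python
import numpy as np

# A right-skewed sample: lognormal, for which mode < median < mean.
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=0.0, sigma=0.6, size=200_000)

mean = sample.mean()
median = np.median(sample)
# Estimate the mode as the center of the fullest histogram bin.
counts, edges = np.histogram(sample, bins=200)
k = int(np.argmax(counts))
mode = 0.5 * (edges[k] + edges[k + 1])
# Ordering: mode < median < mean, with the median nearer the mean
# (Pearson's empirical rule: mean - mode ≈ 3 * (mean - median)).
```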
26. Issue of vector interval
❖ How to set the interval of the vectors? – Accuracy
▪ Advanced mean-shift
1. Using the mode as the initial value, find the mean of the inliers within a surrounding radius 𝑟, where 𝑟 is determined as the range containing 50% of the total number of vectors (in this example, about ±3.5°).
2. Perform step 1 again, centered on the new mean value.
3. Repeat steps 1 and 2 until the mean changes by 0.01° or less.
[Figure: initial value -1.3°; 50% inlier range of radius r around the center; after the first step: -1.13°]
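The three steps above can be sketched as follows. This is a reconstruction under assumptions: the function name is mine, and I recompute the 50%-coverage radius at each step, which the slide leaves ambiguous.

```python
import numpy as np

def advanced_mean_shift(pitches, init_mode, coverage=0.5, tol=0.01, max_iter=100):
    """Mean-shift refinement of a pitch estimate (degrees).

    Starting from the histogram mode, repeatedly take the mean of the
    samples within a radius r around the current center, where r is
    chosen to cover `coverage` of all samples, until the center moves
    by `tol` degrees or less.
    """
    pitches = np.asarray(pitches, dtype=float)
    n = len(pitches)
    center = float(init_mode)
    for _ in range(max_iter):
        # Radius covering the requested fraction of all samples.
        d = np.sort(np.abs(pitches - center))
        r = d[int(coverage * (n - 1))]
        new_center = pitches[np.abs(pitches - center) <= r].mean()
        if abs(new_center - center) <= tol:
            return new_center
        center = new_center
    return center
```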
27. Issue of vector interval
❖ How to set the interval of the vectors? – Accuracy
▪ Advanced mean-shift
• Confidence measure: entropy
𝐻(𝑋) = 𝐸[𝐼(𝑋)] = Σ_{k=1}^{K} 𝑃(𝑋 = 𝑘) ln( 1 / 𝑃(𝑋 = 𝑘) ) = −Σ_{k=1}^{K} 𝑃(𝑋 = 𝑘) ln 𝑃(𝑋 = 𝑘)
𝐻(mean) ≥ 𝐻(mode) > 𝐻(mean-shift)
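The entropy confidence measure above is the Shannon entropy of the vector histogram; a sketch (function name mine — a tighter, more concentrated distribution yields lower entropy and hence higher confidence):

```python
import numpy as np

def histogram_entropy(values, bins):
    """Shannon entropy H(X) = -sum_k P(X=k) ln P(X=k) of a value
    histogram. Pass shared bin edges when comparing distributions,
    so that both are discretized the same way."""
    counts, _ = np.histogram(values, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum())
```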
28. Experiment Result
❖ Surface Normal Result
[Figure: left image, disparity, normal vector map, point cloud, pitch histogram, and ground direction]
32. Experiment result
❖ Result on KITTI Dataset
                    Stixel only   Stixel with SNV
Number of objects   9873          9873
True positive       9579          9562
False positive      1805          396
False negative      294           311
Precision           0.841         0.960
Recall              0.970         0.968
F1 measure          0.901         0.964
[Figure: STIXEL detections (true positives and false positives) vs. SNV-validated detections (false positives removed, one false negative)]
Average time: 24 ms
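The precision, recall, and F1 figures in the table can be reproduced from the raw counts (a sketch; the function name is mine):

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 from true positive, false positive,
    and false negative counts, as in the KITTI comparison table."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Stixel only vs. stixel with SNV, using the table's counts:
print(detection_metrics(9579, 1805, 294))  # ≈ (0.841, 0.970, 0.901)
print(detection_metrics(9562, 396, 311))   # ≈ (0.960, 0.968, 0.964)
```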
33. Discussion about Experiment result
❖ Discussion on KITTI Dataset
                    Stixel only   Stixel with SNV
Number of objects   9873          9873
True positive       9579          9562
False positive      1805          396
False negative      294           311
Precision           0.841         0.960
Recall              0.970         0.968
F1 measure          0.901         0.964
[Figure: STIXEL detections vs. SNV-validated detections, annotated as follows]
• A detection that should not have been removed was removed, because its region contains many ground-direction vectors (a new false negative).
• A detection that should be removed is removed correctly, because it contains many ground-direction vectors.
• A detection that should be removed was not removed, because it contains many object-direction vectors (a remaining false positive).
36. Conclusion
❖Hypothesis ROI validation
• The method uses surface normals to find the difference in direction between an object and other surfaces.
• Surface normals can be computed by either a global or a local method.
• The method depends on only two inputs: a disparity map and a bounding box.
• It can therefore be applied to any 3D recognition system to validate its results.
• The method appears to resolve disparity errors in reflective regions.
• In the global method, the surface normal map can also be used by a recognition module.
❖Future work
• Develop a 3D ROI for ADAS based on collision-risk analysis.