This document is a report on real-time 3D segmentation authored by three students: Gunjan Kumar Singh, Saurabh Bhardwaj, and Divya Sanghi. It was prepared for Practice School-I at CEERI under the guidance of Dr. Jagdish Raheja. The report describes an algorithm for segmenting cluttered 3D scenes in real time by first segmenting depth images into surface patches and then combining surface patches into object hypotheses using adjacency, co-planarity, and curvature matching while handling occlusion. Code implementation details and results are also provided.
Perceptual Weights Based On Local Energy For Image Quality Assessment (CSCJournals)
This paper proposes an image quality metric that correlates well with human judgment of an image's appearance. The present work adds a new dimension to structural-approach-based full-reference image quality assessment for gray scale images. The proposed method assigns more weight to distortions present in the visual regions of interest of the reference (original) image than to distortions present in the other regions of the image; these weights are referred to as perceptual weights. The perceptual features and their weights are computed based on local energy modeling of the original image. The proposed model is validated on the image database provided by LIVE (Laboratory for Image & Video Engineering, The University of Texas at Austin), using the evaluation metrics suggested in the Video Quality Experts Group (VQEG) Phase I FR-TV test.
Optimization of Macro Block Size for Adaptive Rood Pattern Search Block Match... (IJERA Editor)
In the area of video compression, motion estimation is one of the most important modules and plays a key role in the design and implementation of any video encoder. It consumes more than 85% of video encoding time, because a candidate block must be searched for in the search window of the reference frame. Various block matching methods have been developed to minimize this search time. In this context, Adaptive Rood Pattern Search is one of the less expensive block matching methods and is widely accepted for motion estimation in video data processing. In this paper we propose to optimize the macro block size used in the adaptive rood pattern search method to improve motion estimation.
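To make the block-matching idea concrete, the sketch below runs a minimal exhaustive SAD search for one macroblock in NumPy. This is a simplified stand-in, not the paper's ARPS (which evaluates only a rood-shaped subset of candidates to cut cost); the frame data, block size, and search range are invented for illustration.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return np.abs(a.astype(int) - b.astype(int)).sum()

def best_motion_vector(ref, cur, top, left, size, search=4):
    """Exhaustively search a +/- `search` pixel window in the reference
    frame for the best match to one macroblock of the current frame."""
    block = cur[top:top + size, left:left + size]
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y and 0 <= x and y + size <= ref.shape[0] and x + size <= ref.shape[1]:
                cost = sad(ref[y:y + size, x:x + size], block)
                if best_cost is None or cost < best_cost:
                    best_cost, best_mv = cost, (dy, dx)
    return best_mv

# A frame that is a shifted copy of the reference should recover the shift.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (32, 32), dtype=np.uint8)
cur = np.roll(ref, shift=(1, 2), axis=(0, 1))
print(best_motion_vector(ref, cur, top=8, left=8, size=8))  # (-1, -2)
```

The inner double loop is what makes exhaustive search expensive, which is why fast patterns such as ARPS, and the choice of macroblock `size` itself, matter for encoding time.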
OBJECT SEGMENTATION USING MULTISCALE MORPHOLOGICAL OPERATIONS (ijcseit)
Object segmentation plays an important role in human visual perception, medical image processing and content-based image retrieval, and provides information for recognition and interpretation. This paper uses mathematical morphology for image segmentation. Object segmentation is difficult because one usually does not know a priori what type of object exists in an image, how many different shapes there are, and what regions the image has. To carry out discrimination and segmentation, several innovative morphology-based segmentation methods are proposed. The present study proposes a segmentation method based on multiscale morphological reconstructions. Various sizes of structuring elements are used to segment simple and complex shapes; this enhances local boundaries and may improve segmentation accuracy. The method is tested on various datasets, and the results show that it can be used for both interactive and automatic segmentation.
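The effect of varying the structuring-element size can be sketched with a plain grayscale opening (erosion followed by dilation with a flat square element); each scale suppresses objects smaller than the element. This is a minimal numpy-only illustration of the multiscale idea, not the paper's reconstruction-based method, and the synthetic image is invented.

```python
import numpy as np

def erode(img, k):
    """Flat grayscale erosion: minimum over a k x k neighborhood."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.full(img.shape, np.inf)
    for dy in range(k):
        for dx in range(k):
            out = np.minimum(out, padded[dy:dy + img.shape[0], dx:dx + img.shape[1]])
    return out

def dilate(img, k):
    """Flat grayscale dilation: maximum over a k x k neighborhood."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.full(img.shape, -np.inf)
    for dy in range(k):
        for dx in range(k):
            out = np.maximum(out, padded[dy:dy + img.shape[0], dx:dx + img.shape[1]])
    return out

def opening(img, k):
    """Opening removes bright structures smaller than the k x k element."""
    return dilate(erode(img, k), k)

# Synthetic scene: a small 3x3 bright object and a large 12x12 one.
img = np.zeros((40, 40))
img[5:8, 5:8] = 1.0
img[20:32, 20:32] = 1.0

# Scale 5 removes only the small object; scale 15 removes both.
print(int(opening(img, 5).sum()), int(opening(img, 15).sum()))  # 144 0
```

Comparing the image against its openings at successive scales is one simple way to separate shapes by size, which is the intuition behind multiscale morphological segmentation.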
COMPUTER VISION PERFORMANCE AND IMAGE QUALITY METRICS: A RECIPROCAL RELATION (csandit)
Computer vision algorithms are essential components of many systems in operation today. Predicting the robustness of such algorithms for different visual distortions is a task which can
be approached with known image quality measures. We evaluate the impact of several image distortions on object segmentation, tracking and detection, and analyze the predictability of this impact given by image statistics, error parameters and image quality metrics. We observe that
existing image quality metrics have shortcomings when predicting the visual quality of virtual or augmented reality scenarios. These shortcomings can be overcome by integrating computer vision approaches into image quality metrics. We thus show that image quality metrics can be
used to predict the success of computer vision approaches, and computer vision can be employed to enhance the prediction capability of image quality metrics – a reciprocal relation.
Iris recognition systems have attracted much attention for their uniqueness, stability and reliability. However, the performance of such a system depends on the quality of the iris image, so there is a need to select good quality images before features can be extracted. In this paper, iris quality assessment is done by evaluating the effect of standard deviation, contrast, area ratio, occlusion, blur, dilation and sharpness on iris images. A fusion method based on principal component analysis (PCA) is proposed to determine the quality score. The CASIA, IID and UBIRIS databases are used to test the proposed algorithm, and an SVM was used to evaluate its performance. The experimental results demonstrate that the proposed algorithm yields a Correct Rate of over 84% and an Area under the Curve of over 90%. The use of the character component to assess quality has been found to be sufficient for quality detection.
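A PCA-based fusion of several per-image quality measures into one score can be sketched as follows. The feature matrix here is synthetic and the column meanings are placeholders; the abstract does not specify the exact fusion rule, so this is only one plausible reading (projection onto the first principal component).

```python
import numpy as np

# Hypothetical per-image quality features (rows: images; columns might
# stand for contrast, area ratio, occlusion, blur; values are synthetic).
rng = np.random.default_rng(1)
features = rng.normal(size=(50, 4))

# PCA via SVD of the mean-centered feature matrix; projecting onto the
# first principal component collapses the measures into a single score
# that captures the direction of greatest variation across images.
centered = features - features.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
scores = centered @ vt[0]

print(scores.shape)  # (50,)
```

Images could then be ranked or thresholded on `scores` before feature extraction; the sign and scale of a principal component are arbitrary, so in practice the score would be oriented against a few known good images.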
Comparative Study and Analysis of Image Inpainting Techniques (IOSR Journals)
Abstract: Image inpainting is a technique to fill in a missing region or reconstruct a damaged area of an image. It removes an undesirable object from an image in a visually plausible way, using information from the neighboring area to fill the affected part. In this dissertation work, we present an exemplar-based method for filling in the missing information in an image, which combines structure synthesis and texture synthesis. The exemplar-based approach uses local information from the image for patch propagation. We have also implemented a non-local means approach for exemplar-based image inpainting, which finds multiple samples of the best exemplar patches for patch propagation and weights their contribution according to their similarity to the neighborhood under evaluation. We have further extended this algorithm with a collaborative filtering method to synthesize and propagate multiple samples of the best exemplar patches. We performed experiments on many images and found that our algorithm successfully inpaints the target region. We tested the accuracy of our algorithm by computing PSNR and compared the PSNR values for all three approaches.
Keywords: Texture Synthesis, Structure Synthesis, Patch Propagation, Image Inpainting, Non-local Approach, Collaborative Filtering.
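The PSNR figure used to compare the three inpainting approaches has a standard definition, sketched below for 8-bit images; the toy arrays are invented for illustration.

```python
import numpy as np

def psnr(original, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10*log10(peak^2 / MSE)."""
    mse = np.mean((original.astype(float) - restored.astype(float)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

a = np.full((8, 8), 100, dtype=np.uint8)
b = a.copy()
b[0, 0] = 110                        # a single corrupted pixel
print(round(psnr(a, b), 2))          # 46.19
```

Higher PSNR between the ground-truth image and the inpainted result indicates a closer reconstruction, which is how the three approaches above would be ranked.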
AUTOMATED IMAGE MOSAICING SYSTEM WITH ANALYSIS OVER VARIOUS IMAGE NOISE (ijcsa)
Mosaicing is the blending together of several arbitrarily shaped images to form one large balanced image such that the boundaries between the original images are not seen. Image mosaicing creates a large field of view of a scene, and the resulting image can also be used for texture mapping of a 3D environment. Blended images have become widely necessary for images captured from real-time sensor devices, bio-medical equipment, satellite imagery, aerospace, security systems, brain mapping, genetics, etc. The idea behind this work is to automate the image mosaicing system so that blending is fast, easy and efficient even when a large number of images is considered. This work also provides an analysis of blending over images containing different kinds of distortion and noise, which further enhances the quality of the system and makes it more reliable and robust.
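The core blending step, hiding the seam between overlapping images, can be sketched as a linear cross-fade (feathering) over the overlap region. This is a minimal illustration with invented data and a fixed horizontal overlap; a real mosaicing system would first register the images and may use more sophisticated multi-band blending.

```python
import numpy as np

def feather_blend(left, right, overlap):
    """Join two equal-height images whose last/first `overlap` columns
    depict the same region, cross-fading linearly so no seam is visible."""
    alpha = np.linspace(1.0, 0.0, overlap)            # weight for `left`
    blended = left[:, -overlap:] * alpha + right[:, :overlap] * (1 - alpha)
    return np.hstack([left[:, :-overlap], blended, right[:, overlap:]])

left = np.full((4, 6), 100.0)
right = np.full((4, 6), 200.0)
mosaic = feather_blend(left, right, overlap=2)
print(mosaic.shape)  # (4, 10)
```

The weights sum to one at every column, so constant-brightness regions stay constant and the transition between the two exposures is gradual rather than abrupt.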
Soft computing is likely to play a progressively important role in many applications, including image enhancement. The paradigm for soft computing is the human mind. The soft computing critique has been particularly strong with fuzzy logic. Fuzzy logic is a knowledge-representation rule for the management of uncertainty. In this paper the multi-dimensional optimization problem is addressed by discussing optimal thresholding using fuzzy entropy for image enhancement. This technique is compared with bi-level and multi-level thresholding, and optimal thresholding values are obtained for different levels of speckle-noisy and low-contrast images. The fuzzy entropy method produced better results than the bi-level and multi-level thresholding techniques.
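For intuition, the bi-level baseline can be sketched with an entropy criterion: choose the threshold that maximizes the summed Shannon entropy of the background and foreground histograms (Kapur's criterion). This is only a crisp stand-in, not the paper's fuzzy-entropy formulation, and the test image is synthetic.

```python
import numpy as np

def entropy_threshold(img, levels=256):
    """Return the gray level maximizing the sum of class entropies."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()
    best_t, best_h = 1, -np.inf
    for t in range(1, levels):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue                       # both classes must be non-empty
        q0 = p[:t][p[:t] > 0] / w0         # normalized class histograms
        q1 = p[t:][p[t:] > 0] / w1
        h = -(q0 * np.log(q0)).sum() - (q1 * np.log(q1)).sum()
        if h > best_h:
            best_h, best_t = h, t
    return best_t

# A clean bimodal image: the chosen threshold separates the two modes.
img = np.concatenate([np.full(100, 50), np.full(100, 200)]).astype(np.uint8)
print(entropy_threshold(img))  # 51
```

The fuzzy-entropy version replaces the crisp split at `t` with membership functions, which makes the criterion better behaved on speckle-noisy, low-contrast histograms.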
Developing 3D Viewing Model from 2D Stereo Pair with its Occlusion Ratio (CSCJournals)
We intend to make a 3D model from a stereo pair of images using a novel method of local matching in the pixel domain to calculate horizontal disparities. We also find the occlusion ratio from the stereo pair, followed by use of the Edge Detection and Image SegmentatiON (EDISON) system on one of the images, which provides a complete toolbox for discontinuity-preserving filtering, segmentation and edge detection. Instead of assigning a disparity value to each pixel, a disparity plane is assigned to each segment. We then warp the segment disparities to the original image to get our final 3D viewing model.
A Fuzzy Set Approach for Edge Detection (CSCJournals)
Image segmentation is one of the most studied problems in image analysis, computer vision, pattern recognition, etc. Edge detection is a discontinuity-based approach used for image segmentation. In this paper, edge detection using fuzzy sets is proposed, where an image is considered as a fuzzy set and pixels are taken as elements of the fuzzy set. The fuzzy approach converts the color image to a partially segmented image; finally an edge detector is convolved over the partially segmented image to obtain an edged image. The approach is implemented using MATLAB 7.11 (R2010b). For qualitative and quantitative comparison, BSD (Berkeley Segmentation Database) images are used for experimentation. The performance parameters used are PSNR (dB) and the performance ratio (PR) of true to false edges. It is shown that the proposed approach performs better than Canny's edge detection algorithm under almost all scenarios, and reduces false edge detection and double edges.
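The pipeline "map pixels to fuzzy memberships, then run an edge operator over the membership image" can be sketched as follows. The membership function here is a plain min-max normalization and the edge operator a gradient-magnitude threshold; both are simplifying assumptions, since the paper does not pin down its exact membership function in this abstract.

```python
import numpy as np

def fuzzy_membership(img):
    """Map gray levels to membership values in [0, 1]
    (min-max normalization stands in for the paper's membership function)."""
    img = img.astype(float)
    return (img - img.min()) / (img.max() - img.min())

def gradient_edges(mu, thresh=0.2):
    """Gradient magnitude on the membership image, thresholded
    to keep only strong discontinuities."""
    gy, gx = np.gradient(mu)
    return np.hypot(gx, gy) > thresh

# A vertical step edge is detected only along the step.
img = np.zeros((8, 8), dtype=np.uint8)
img[:, 4:] = 255
edges = gradient_edges(fuzzy_membership(img))
print(edges.sum())  # 16
```

Working on memberships rather than raw intensities makes the threshold scale-free: the same `thresh` applies regardless of the image's original dynamic range.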
ADOPTING AND IMPLEMENTATION OF SELF ORGANIZING FEATURE MAP FOR IMAGE FUSION (ijistjournal)
A different image fusion algorithm based on self-organizing feature maps is proposed in this paper, aiming to produce quality images. Image fusion integrates complementary and redundant information from multiple images of the same scene to create a single composite image that contains all the important features of the originals. The resulting fused image is thus more suitable for human and machine perception and for further image processing tasks. Existing fusion techniques based on direct operation on either pixels or segments fail to produce fused images of the required quality and are mostly application specific, and existing segmentation algorithms become complicated and time consuming when multiple images are to be fused. A new method of segmenting and fusing gray scale images using Self-Organizing Feature Maps (SOM) is proposed in this paper. The SOM is used to produce multiple slices of the source and reference images based on various combinations of gray scale, which can be fused dynamically depending on the application. The proposed technique is applied and analyzed for the fusion of multiple images. It is robust in the sense that no information is lost, owing to the properties of Self-Organizing Feature Maps; noise removal in the source images is done during the processing stage, and the fusion of multiple images is performed dynamically to get the desired results. Experimental results demonstrate that, for quality multifocus image fusion, the proposed method performs better than some popular image fusion methods in both subjective and objective quality.
An Analysis and Comparison of Quality Index Using Clustering Techniques for S... (CSCJournals)
In this paper, the proposed approach consists of three main steps: preprocessing, gridding and segmentation of microarray images. Initially, the microarray image is preprocessed using filtering and morphological operators, and is then gridded to fit a grid on the image using a hill-climbing algorithm. Subsequently, segmentation is carried out using fuzzy c-means clustering. The enhanced fuzzy c-means clustering algorithm (EFCMC) is implemented to cluster the image effectively whether or not it is affected by noise. The EFCM method was then applied to real and noisy microarray images in order to investigate the efficiency of the segmentation. Finally, the segmentation efficiency of the proposed approach was compared with various algorithms in terms of quality index, and the obtained results show that the proposed algorithm performs better in terms of quality index than the other algorithms.
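The fuzzy c-means step can be sketched on 1-D intensities: alternate between computing weighted cluster centers and updating memberships until convergence. This is the classic FCM, not the enhanced EFCM variant the paper proposes, and the data is synthetic.

```python
import numpy as np

def fuzzy_cmeans(x, c=2, m=2.0, iters=50, seed=0):
    """Minimal fuzzy c-means on a 1-D intensity array.
    m > 1 is the fuzzifier; returns (centers, membership matrix)."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(x), c))
    u /= u.sum(axis=1, keepdims=True)          # memberships sum to 1 per point
    for _ in range(iters):
        um = u ** m
        centers = (um.T @ x) / um.sum(axis=0)  # membership-weighted means
        d = np.abs(x[:, None] - centers[None, :]) + 1e-12
        inv = d ** (-2.0 / (m - 1))            # standard FCM update
        u = inv / inv.sum(axis=1, keepdims=True)
    return centers, u

# Two well-separated intensity groups yield centers near 0 and 10.
x = np.concatenate([np.zeros(50), np.full(50, 10.0)])
centers, u = fuzzy_cmeans(x)
print(np.round(np.sort(centers), 2))
```

For segmentation, each pixel would be assigned to the cluster where its membership is highest; the soft memberships are what lets FCM tolerate the noise discussed above better than hard k-means.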
Invariant Recognition of Rectangular Biscuits with Fuzzy Moment Descriptors, ... (CSCJournals)
In this paper a new approach for invariant recognition of broken rectangular biscuits is proposed using fuzzy membership-distance products, called fuzzy moment descriptors. The existing methods for recognition of flawed rectangular biscuits are mostly based on the Hough transform; however, these methods are prone to error due to noise and/or variation in illumination. Fuzzy moment descriptors are less sensitive to noise, making the approach invariant to these stray external disturbances. Further, the normalization and sorting of the moment vectors make it a size- and rotation-invariant recognition process. In earlier studies fuzzy moment descriptors have been successfully applied to image matching problems. In this paper the algorithm is applied to the recognition of flawed and non-flawed rectangular biscuits. The proposed algorithm has potential applications in industrial quality control.
Image enhancement is one of the challenging issues in image processing. Its objective is to process an image so that the result is more suitable than the original for a specific application. Digital image enhancement techniques provide many choices for improving the visual quality of images, and the appropriate choice of technique is very important. Image enhancement plays a fundamental role in vision applications, and many techniques have been proposed for enhancing digital images. This paper provides a survey and analysis of the techniques commonly used for image enhancement.
Statistical Feature based Blind Classifier for JPEG Image Splice Detection (rahulmonikasharma)
Digital imaging, image forgery and its forensics have become an established field of research nowadays. Digital imaging is used to enhance and restore images to make them more meaningful, while image forgery is done to produce fake facts by tampering with images. Digital forensics is then required to examine the questioned images and classify them as authentic or tampered. This paper aims to design and implement a blind classifier to classify original and spliced Joint Photographic Experts Group (JPEG) images. The classifier is based on statistical features obtained by exploiting image compression artifacts, which are extracted as a Blocking Artifact Characteristics Matrix. The experimental results show that the proposed classifier outperforms the existing one, with improved accuracy and area under the curve when classifying images. It supports .bmp and .tiff file formats and is fairly robust to noise.
Analysis of Multi-focus Image Fusion Method Based on Laplacian Pyramid (Rajyalakshmi Reddy)
This paper presents a simple and efficient algorithm for multi-focus image fusion, which uses a multiresolution signal decomposition scheme called the Laplacian pyramid method. The principle of the Laplacian pyramid transform is introduced, and the fusion strategy based on it is described in detail. Analysis of the experimental results shows that this method performs well, and that the quality of the fused image is better than the results of other methods.
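A minimal Laplacian-pyramid fusion can be sketched in NumPy: decompose each image into band-pass detail levels plus a low-pass residual, keep the detail with the larger magnitude at each level (the in-focus image contributes more detail), average the residuals, and reconstruct. The down/up operators here are simple block averaging and nearest-neighbor replication rather than the Gaussian filtering the paper would use, and the fusion rule is an assumption.

```python
import numpy as np

def down(img):
    """Downsample by 2 with 2x2 block averaging (stand-in for Gaussian blur)."""
    return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

def up(img):
    """Upsample by 2 with nearest-neighbor replication."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, levels):
    """List of band-pass detail images plus a final low-pass residual."""
    pyr = []
    for _ in range(levels):
        small = down(img)
        pyr.append(img - up(small))   # detail lost by downsampling
        img = small
    pyr.append(img)                   # low-pass residual
    return pyr

def fuse(a, b, levels=2):
    pa, pb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    # Per level, keep the stronger detail; average the low-pass residuals.
    fused = [np.where(np.abs(la) >= np.abs(lb), la, lb)
             for la, lb in zip(pa[:-1], pb[:-1])]
    fused.append((pa[-1] + pb[-1]) / 2)
    # Reconstruct coarse to fine by upsampling and adding details back.
    img = fused[-1]
    for detail in reversed(fused[:-1]):
        img = up(img) + detail
    return img

a = np.zeros((8, 8)); a[:, :4] = 1.0   # detail concentrated on the left
b = np.zeros((8, 8)); b[:, 4:] = 1.0   # detail concentrated on the right
out = fuse(a, b)
print(out.shape)  # (8, 8)
```

The decomposition is exactly invertible (each level stores what the downsample discarded), so fusing an image with itself reconstructs it perfectly; the max-magnitude rule is what transfers the in-focus regions of each source into the result.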
Laplacian pyramid method. The principle of Laplacian
pyramid transform is introduced, and based on it the
fusion strategy is described in detail. By analyzing the
experimental results, it showed that this method has
good performance, and the quality of the fused image is
better than the results of other methods
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
An interactive image segmentation using multiple user inputªseSAT Journals
Abstract In this paper, we consider the Interactive image segmentation with multiple user inputs. The proposed system is the use of multiple intuitive user inputs to better reflect the user’s intention. The use of multiple types of intuitive inputs provides the user’s intention under different scenario. The proposed method is developed as a combined segmentation and editing tool. It incorporates a simple user interface and a fast and reliable segmentation based on 1D segment matching. The user is required to click just a few "control points" on the desired object border, and let the algorithm complete the rest. The user can then edit the result by adding, removing and moving control points, where each interaction follows by an automatic, real-time segmentation by the algorithm. Interactive image segmentation involves a proposed algorithm, Constrained Random walks algorithm. The Constrained Random Walks algorithm facilitates the use of three types of user inputs. 1. Foreground and Background seed input 2. Soft Constraint input 3. Hard Constraint input. The effectiveness of the proposed method is validated by experimental results. The proposed algorithm is algorithmically simple, efficient and less time consuming. Keywords: Interactive image segmentation, Interactive image segmentation, digital image editing, multiple user inputs, random walks algorithm.
An algorithm to quantify the swelling by reconstructing 3D model of the face with stereo images is presented. We
analyzed the primary problems in computational stereo, which include correspondence and depth calculation. Work has been carried out to determine suitable methods for depth estimation and standardizing volume estimations. Finally we designed software for reconstructing 3D images from 2D stereo images, which was built on Matlab and Visual C++. Utilizing
techniques from multi-view geometry, a 3D model of the face was constructed and refined. An explicit analysis of the stereo
disparity calculation methods and filter elimination disparity estimation for increasing reliability of the disparity map was
used. Minimizing variability in position by using more precise positioning techniques and resources will increase the accuracy of this technique and is a focus for future work
Implementation of Object Tracking for Real Time VideoIDES Editor
Real-time tracking of object boundaries is an
important task in many vision applications. Here we propose
an approach to implement the level set method. This approach
does not need to solve any partial differential equations (PDFs),
thus reducing the computation dramatically compared with
optimized narrow band techniques proposed before. With our
approach, real-time level-set based video tracking can be
achieved.
The efficiency and quality of a feature descriptor are critical to the user experience of many computer vision applications. However, the existing descriptors are either too computationally expensive to achieve real-time performance, or not sufficiently distinctive to identify correct matches from a large database with various transformations. In this paper, we propose a highly efficient and distinctive binary descriptor, called local difference binary (LDB). LDB directly computes a binary string for an image patch using simple intensity and gradient difference tests on pair wise grid cells within the patch. A multiple-gridding strategy and a salient bit-selection method are applied to capture the distinct patterns of the patch at different spatial granularities. Experimental results demonstrate that compared to the existing state-of-the-art binary descriptors, primarily designed for speed, LDB has similar construction efficiency, while achieving a greater accuracy and faster speed for mobile object recognition and tracking tasks.
Super-resolution (SR) is the process of obtaining a high resolution (HR) image or
a sequence of HR images from a set of low resolution (LR) observations. The block
matching algorithms used for motion estimation to obtain motion vectors between the
frames in Super-resolution. The implementation and comparison of two different types of
block matching algorithms viz. Exhaustive Search (ES) and Spiral Search (SS) are
discussed. Advantages of each algorithm are given in terms of motion estimation
computational complexity and Peak Signal to Noise Ratio (PSNR). The Spiral Search
algorithm achieves PSNR close to that of Exhaustive Search at less computation time than
that of Exhaustive Search. The algorithms that are evaluated in this paper are widely used
in video super-resolution and also have been used in implementing various video standards
like H.263, MPEG4, H.264.
Semi-Supervised Method of Multiple Object Segmentation with a Region Labeling...sipij
Efficient and efficient multiple object segmentation is an important task in computer vision and object recognition. In this work; we address a method to effectively discover a user’s concept when multiple objects of interest are involved in content based image retrieval. The proposed method incorporate a framework for multiple object retrieval using semi-supervised method of similar region merging and flood fill which models the spatial and appearance relations among image pixels. To improve the effectiveness of similarity based region merging we propose a new similarity based object retrieval. The users only need to roughly indicate the after which steps desired objects contour is obtained during the automatic merging of similar regions. A novel similarity based region merging mechanism is proposed to guide the merging process with the help of mean shift technique and objects detection using region labeling and flood fill. A region R is merged with its adjacent regions Q if Q has highest similarity with Q (using Bhattacharyya descriptor) among all Q’s adjacent regions. The proposed method automatically merges the regions that are initially segmented through mean shift technique, and then effectively extracts the object contour by merging all similar regions. Extensive experiments are performed on 12 object classes (224 images total) show promising results.
Integration of poses to enhance the shape of the object tracking from a singl...eSAT Journals
Abstract In computer vision, tracking human pose has received a growing attention in recent years. The existing methods used multi-view videos and camera calibrations to enhance the shape of the object in 3D view. In this paper, tracking and partial reconstruction of the shape of the object from a single view video is identified. The goal of the proposed integrated method is to detect the movement of a person more accurately in 2D view. The integrated method is a combination of Silhouette based pose estimation and Scene flow based pose estimation. The silhouette based pose estimation is used to enhance the shape of the object for 3D reconstruction and scene flow based pose estimation is used to capture the size as well as the stability of the object. By integrating these two poses, the accurate shape of the object has been calculated from a single view video. Keywords: Pose Estimation, optical Flow, Silhouette, Object Reconstruction, 3D Objects
A REPORT
ON
Realtime 3D Segmentation
By
Gunjan Kumar Singh 2012B5A7521P M.Sc. (Hons.) in Physics and B.E. (Hons.) in Computer Science
Saurabh Bhardwaj 2012B5A7848P M.Sc. (Hons.) in Physics and B.E. (Hons.) in Computer Science
Divya Sanghi 2012B4A7958H M.Sc. (Hons.) in Mathematics and B.E. (Hons.) in Computer Science
Prepared in partial fulfilment of
Practice School-I
(BITS F221)
Under the guidance of
DR. JAGDISH RAHEJA
PRINCIPAL SCIENTIST, DIGITAL SYSTEMS GROUP
AT
CSIR-Central Electronics Engineering Research Institute (CEERI)
Pilani-333031
A Practice School-I station of
BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI
23rd May - 17th July, 2014
BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE
PILANI (RAJASTHAN)
Practice School Division
Station: CENTRAL ELECTRONICS ENGINEERING RESEARCH INSTITUTE (CEERI)  Centre: Pilani
Duration: From 23rd May, 2014 to 17th July, 2014
Date of Submission: 15th July, 2014
Title of the Project: REALTIME 3D SEGMENTATION
Name of the Student / ID No. / Discipline:
Gunjan Kumar Singh / 2012B5A7521P / M.Sc. (Hons.) in Physics and B.E. (Hons.) in Computer Science
Saurabh Bhardwaj / 2012B5A7848P / M.Sc. (Hons.) in Physics and B.E. (Hons.) in Computer Science
Divya Sanghi / 2012B4A7958H / M.Sc. (Hons.) in Mathematics and B.E. (Hons.) in Computer Science
Name of the Expert: Dr. Jagdish Raheja  Designation: Principal Scientist
Name of the PS Faculty: Mr. Parikshit Kishor Singh
Key words: Object segmentation, occlusion check, graph, adjacency matrix
Project Area: 3D Image Processing
Abstract: A real-time algorithm that segments unstructured and highly cluttered scenes is discussed in this paper. The algorithm robustly separates objects of unknown shape in congested scenes of stacked and occluded objects. The model-free approach finds smooth surface patches, using a depth image from a Kinect camera, which are subsequently combined to form highly probable object hypotheses. Co-planarity and curvature matching are used to recombine surfaces separated by occlusion. The real-time capabilities are proven and the quality of the algorithm is evaluated on a benchmark database. Advantages compared to existing approaches as well as weaknesses are discussed.
Date: 15th July, 2014
Acknowledgement
Firstly, we are very grateful to the Practice School Division (PSD), BITS Pilani, for providing us an opportunity to pursue our Practice School-1 (PS-1) under the guidance of eminent scientists at the Central Electronics Engineering Research Institute (CEERI), Pilani.
We would like to express our sincere thanks to Dr. Chandrashekhar, Director, CEERI, Pilani, for giving us the opportunity to carry out a project in this esteemed organization. We would also like to thank Dr. J.L. Raheja, our Project Guide, for suggesting the project and providing us valuable guidance and support throughout our work. We would like to extend our gratitude to Ms. Zeba and all others who were directly or indirectly related to this project.
We are grateful to Mr. Vinod Verma for helping us with our daily attendance and supporting us throughout our tenure at CEERI. We would also like to thank our PS-1 instructor, Mr. Parikshit Kishor Singh, for being a constant source of guidance and motivation for us.
Introduction
In computer vision, image segmentation is the process of partitioning a digital image into
multiple segments (sets of pixels, also known as super-pixels). The goal
of segmentation is to simplify and/or change the representation of an image into
something that is more meaningful and easier to analyze.
In the present work the model-free and real-time capable segmentation approach
presented in the previous work of the authors is extended to a general probabilistic
framework, which considers multimodal cues in a uniform manner. The algorithm
combines two segmentation methods: the identification of smooth object surfaces and the
composition of these surfaces into sensible object hypotheses.
In this work, region growing is replaced by connected component analysis and motion
sensitive temporal smoothing is implemented to avoid the motion blur effect.
While the high-level segmentation extracted support planes and decomposed the remaining blobs using binary space partitioning, the second contribution introduced the idea of composing cut-free neighboring surfaces. In the current work, a graph cut is applied to a probabilistically weighted similarity graph considering adjacency, curvature and co-planarity of the found surface patches, enabling the method to handle occluded and open curved objects.
Additionally, the algorithms are further optimized for real-time challenges. The main advantage of the method, in contrast to existing ones, is the capability to segment unknown, stacked, nearby, and occluded objects in a model-free manner. Naturally, this approach has its limitations compared to model-based approaches, especially if very complex object heaps are to be considered. However, it provides a meaningful initial object hypothesis in arbitrary situations, which can be refined by active exploration or used as input to model-based adaptive methods.
The probabilistic nature of the method allows these refinement methods to be focused on selectively disambiguating uncertain object hypotheses. The algorithm operates in real-time, facilitating interactive usage in human-robot cooperation tasks.
Pre-Segmentation
The objective of the first processing step is to segment the depth image into regions of
(smoothly curved) surfaces, continuously enclosed by sharp object edges. We deal with
depth images as they possess low noise levels. Additionally, the raw depth image is
transformed into a 3D point cloud, which is represented w.r.t. a robot-defined coordinate
frame.
a) Median Filter: The median filter is the first step in implementing this work; it is used to remove noise from the image. A mask of N × N dimension is constructed, where N is an odd number; generally a 3 × 3 mask is preferred. The mask consists of all 8 neighbors of the concerned pixel, including the pixel itself. All the pixel values in the mask are then sorted, and the concerned pixel is replaced with the median value of the mask. The median filter is preferred over the box filter because it replaces the pixel value with the value of one of its neighbors, while a box filter may produce a value that appears nowhere in the entire image.
b) Determination of Surface Normals and Temporal Smoothing: As a basis for computing "surface normal edges", surface normals for every image point are determined from the plane spanned by three points in the 3 × 3 neighborhood of the considered central image point, using the cross product.
The determination of surface normals is directly performed on the raw depth image,
instead of the 3D point cloud. That is, the 2D image coordinates are augmented by the
depth value to yield valid three-dimensional vectors. This procedure yields much more
distinct changes of the normal direction at the boundary of objects, because the
smoothing effect due to 3D projection is avoided.
In order to reduce sensor noise and to obtain smooth and stable surface normal
estimations, a three stage smoothing procedure is applied.
First, a 3 × 3 median filter is applied to the raw depth image, as described earlier. Secondly, motion-sensitive temporal smoothing is used, averaging the depth values of each individual image pixel over the last n = 6 frames if the difference of the depth values is smaller than d = 10. The normals are calculated by taking the coordinates of the concerned pixel and any two of its neighbor pixels, and computing the cross product of the two resulting vectors. Finally, the calculated normals are smoothed by applying a convolution with a 5 × 5 Gaussian kernel.
c) Detection of Surface Normal Edges: The next step is the fast detection of surface
normal edges, which is based on the computation of the scalar product of adjacent surface
normals. To obtain clear, uninterrupted edges, suitable for subsequent application of a
region growing algorithm, we look for edges in all eight directions defined by the
neighboring pixels of a point, i.e. north (N), east (E), south (S), west (W), as well as NE,
SE, SW, NW.
The final result of the edge filter is obtained from averaging the results of all eight scalar
products. While large values, close to one, correspond to flat surfaces, smaller values
indicate increasingly sharp object edges.
Finally, binarizing the obtained edge image by employing a threshold value θmax = 0.85 (31.8°), we can easily separate edges from smoothly curved surfaces.
Object edges are clearly visible as bold lines, as shown in the figure below: on the left the actual depth map is shown, while the right picture shows the detected edges. Smooth and large surfaces form homogeneous white regions. Some false edges may still be detected due to noise; however, those regions are small and disjointed and thus can be easily filtered out in subsequent processing steps.
d) Segmentation into Surface Patches: Finally, a fast connected component analysis algorithm is applied. The fast surface patch segmentation based on normal edges already provides a detailed segmentation of the scene into surface patches, which is then employed in the subsequent object segmentation step. The algorithm consists of two iterations of processing, which are explained with the help of the flow charts below.
1st Iteration
- Scan the image pixel by pixel; only pixels that are not edge or background points receive a label.
- For each such pixel, check its top and left neighbors:
  - If both neighbors are boundary or background points, assign a new label.
  - If one neighbor is a boundary or background point, assign the label of the other neighbor.
  - If neither neighbor is a boundary or background point, assign the label of the neighbor with the smaller value, and record the larger label as a child of the smaller label.
2nd Iteration
Both iterations are also explained below with the help of a sample image. Suppose we have an image as shown to the right, in which the black pixels represent the boundary. First of all we visit the first pixel; after making sure that it is not a boundary pixel, we assign the first label to it. Moving to the next pixel, we check its left neighbor (as it doesn't have any top neighbor). Since the neighbor is not a boundary pixel, we give it the same label as its neighbor. After that we encounter a boundary, and after a boundary we need to generate a new label. So a new label is generated and assigned to the pixel, as shown in the image below (left). Proceeding similarly, we assign labels to all the non-boundary pixels in the 1st row.
- Scan the image pixel by pixel.
- If the pixel is labelled, look up the label's parent.
- Assign the parent's label in place of the child's.
Similarly we mark pixels in the 2nd row as well, comparing the current pixel with its top and left neighbors. In the third row we encounter a pixel (just below label 2) whose top neighbor has label 2 while the left neighbor has label 1, so the lower value is assigned to it (upper right figure). We proceed similarly, and at the end of the 1st iteration we get an image like the one shown below.
The 2nd iteration is all about merging the different labels that make up the same surface: in the image shown above, labels 1 and 2 represent the same plane, so they must be represented by a single label. In this iteration we merge such labels into the minimum of the two values. We first find a pixel where the top or right pixel has a different label than the current pixel; then we replace all the pixels of that region with the minimum of the two labels. The lower left image shows the image after merging the pixels with labels 1 and 2, while the lower right image shows the resulting image after completion of merging.
High-Level Object Segmentation
In the second processing block, the aim is segmentation on an object level, which means
that the previously found surface patches need to be combined to form proper object
regions. A weighted graph is created, modeling the probability of two surface patches
belonging to the same object region.
Subsequently this graph structure is analyzed to find the most probable segmentation into object regions using a graph cut algorithm. Co-planarity and curvature cues are also employed to successfully combine object patches which are separated due to occlusion.
a) Adjacency Matrix and Assignment of Edge Points:
An initial adjacency matrix representing the basic connectivity of surface patches is determined as follows: for every edge point pr, all neighboring surface points pi within a radius r in image space are considered which have a Euclidean distance ||pr − pi|| smaller than a threshold dmax. All possible surface pairs obtained from this list are marked as adjacent.
For example, in the given figure, surfaces 2 and 10 are neighbors in image space but not in 3D space, and are therefore not considered adjacent. Faces 7, 9 and 11 fulfill the conditions and become connected in the graph.
b) Cutfree Neighbors: To further improve the adjacency matrix, a plausible heuristic
check is applied. The central idea of this heuristic approach is that visible surface patches
are part of an object’s outer hull, such that points belonging to this object should either lie
on the one or the other side of the associated plane. If we conversely find enough points
on both sides of the plane, we assume two (or more) separated objects and split the blob
into two blobs for further processing.
One surface is said to cut the other if a considerable number of points lie on both sides of the plane fitted to the former surface. For illustration, consider surfaces 4 and 12 in the figure above: while all points of face 4 lie on top of the supporting face 12, the plane fitted into surface 4 cuts surface 12. Hence this surface combination is disregarded. On the other hand, surfaces 7, 9 and 11 are all pairwise cut-free.
c) Improving the Adjacency Matrix: In case of occlusion, a single face of an object is separated into two parts, which will not have a link in the adjacency matrix. To overcome this limitation, further links are added to the matrix based on additional cues, namely co-planarity of flat faces and similar curvature of curved surfaces.
d) Co-Planarity: To check for co-planarity of two flat surfaces we proceed in two steps:
- Do both surfaces have similar mean normals (up to a small noise margin)? Because the normal of the spanned plane may crucially depend on the actual selection of points, this criterion is checked for a set of 50 randomly selected triples of points. If any of the calculated normals deviates too much, the two surfaces are not considered coplanar.
- Are the faces aligned, i.e. do they indeed span a common plane? In this case, any plane spanned by three points from both surfaces should have a normal similar to the two original mean normals.
Then, the occlusion check described above is carried out along several lines connecting two randomly selected points from both surfaces. If this check is passed as well, a corresponding link in the connectivity matrix is added for the given pair of surfaces.
This figure shows the resulting graphs before and after the co-planarity extension. While the first graph results in four final objects, the second graph correctly results in three objects.
e) Curvature Matching: In order to handle curved surface patches in a similar fashion, their curvatures are compared. To this end, a curvature histogram is computed for every curved surface, representing the distribution of surface normals within the surface. The 2D histogram of 11 × 11 bins describes the relative frequency of observing surface normals with given x and y components. The associated frequency of z components is determined by the fact that normals are normalized to magnitude one.
The distance of two histograms is estimated by the mutual overlap of their distributions:
D(A,B) = Σij |aij − bij|
Exploiting the normalisation of the histograms and the identity
min(a,b) = ½((a + b) − |a − b|)
we compute the similarity index:
S(A,B) = 1 − ½·D(A,B) = Σij min(aij, bij)
This is a score between 0 and 1. Surface pairs with a score larger than h = 0.5 are considered for recombination.
We differentiate between open curved objects and occluded curved objects. To recombine the inner and outer surfaces of an open object (like a cup or bowl), two conditions must be fulfilled: (1) both surfaces are neighboring in image space, and (2) the surfaces are concave and convex, respectively. The first condition reflects the fact that the calculation of the initial adjacency matrix is restricted to neighboring surfaces in Euclidean space.
To assess the convexity/concavity of a surface, we again consider the curvature histogram, namely the two extremal bins hmin and hmax along the major axis of the histogram blob. Back-projecting these bins into image space, we obtain the point sets Pmin and Pmax whose normals are mapped onto the corresponding bins, and compute the mean image coordinates pmin and pmax of these point sets. Accordingly, convexity is assessed by considering the scalar product
(pmax − pmin) · (hmax − hmin)
between the directional vectors formed by the extremal points in image vs. histogram coordinates. If this value is positive, i.e. both vectors point in a similar direction, the surface is convex; otherwise it is concave. This is also shown in the figure to the right.
The picture below shows the actual image, the depth map and the histograms of the objects in the image. The left two histograms belong to the lying cylinder (one for each occluded part), the third belongs to the vertical bottle, and the last one to the ball.
f) Probabilistic Object Composition (Graph Cut): The result of the previous steps is an
adjacency matrix representing a graph with edges for all possible surface combinations
arising from cut free neighborhood, co-planarity and curvature matching. This graph is
turned into a weighted graph, such that edge weights represent the strength of
connectivity between two connected nodes.
To determine the connectivity weights, initially a common weight wij = 1/n is assigned to all edges (i, j) originating from node i, where n denotes the number of nodes adjacent to node i. This results in a directed graph, where all outgoing edges of a node have the same weight and thus the same probability for composition with this node. To create an undirected graph, we average the weights of incoming and outgoing edges:
Wsym = ½ (Win + Wout)
The higher the connectivity of two nodes, the higher their connecting weight. Exploiting the weighted graph, we set a threshold θc = 0.5 and then apply a graph cut algorithm. Starting with individual nodes, the algorithm calculates all connected subgraphs in ascending size and their corresponding cuts. A cut is the set of all edges outgoing from the subgraph, and the associated cost is the sum of the corresponding edge weights. If the cost is smaller than θc, a cut is found and the subgraph is extracted as a single object. If the subgraph exceeds n/2 in size, the algorithm aborts, because all potential cuts have been considered. The figure on the previous page shows a cut of an edge with probability 0.29; this cut was made as its cost was less than 0.5 (consistent with our threshold value).
This threshold balances between under- and over-segmentation: a very small value, close to zero, generates a single segment for every initially connected subgraph, while a very high value generates an individual segment for every surface node. The chosen value creates subgraphs which represent different objects.
g) Remaining Edge Points: In the final processing step, all remaining edge points have
to be processed to obtain the final segmentation result.
Firstly, the remaining points are segmented using a region growing algorithm working in the image plane and using the Euclidean distance as the criterion of uniformity. These segments are then processed according to the following rules:
- If a segment has no neighboring faces (caused by missing depth information), it becomes a separate object.
- If a segment has one neighboring face and comprises only very few points, they are assigned to this neighbor.
- If a segment is completely enclosed by a single neighboring face, it becomes a new object. If it is not completely enclosed, all points are assigned to the single neighboring region.
- If a segment has more than one neighbor and all neighbors are part of a common object, it is assigned to this object.
- If a segment has more than one neighbor corresponding to different objects, all points are assigned to the best matching neighboring plane using RANSAC.
Code Explanation
Structures:
vctr: Holds the normalized surface normal calculated at each pixel. It contains the x, y and z components in float data type.
coordinate: A structure to hold the coordinates of different pixels.
list: A node with fields label, node_no and a pointer next. label holds the assigned label value, node_no counts the number of nodes (acting as a counter), and next points to the next node.
arr: Has two fields, value and no.; value contains an integer label, while no. acts as a counter.
Functions:
vec: Takes as input a pointer to an array of 3 coordinates, then calculates the differences between them (to obtain vectors along the surface). It stores the differences of the x, y and z components in an array and calls the CrossProduct function. It returns the surface normal of type vctr.
CrossProduct: Takes as input 2 arrays containing the x, y and z components of 2 vectors and calculates the cross product to return the surface normal.
edge: Takes as input pointers to 2 vctr and calculates the dot product between them. This is done to check the angle between two surface normals.
create: Creates a list of all the labels that are assigned. It has 2 pointers, head and temp: head points to the first element of the list, while temp points to the most recently added node. It takes as input double pointers to the head and temp pointers, a pointer to an integer label, and an integer n.
When the list is empty, a new node is created; the values of the label and n pointers are assigned to label and node_no respectively, and head and temp both point at this newly created node.
When the list is non-empty, the same thing happens, except that head keeps
pointing at the first node while temp moves to the new last node. temp exists so
that the list does not have to be traversed every time a new element is added. The
function returns a pointer to temp.
replace : It takes as input 2 integer pointers and the list created by the create
function. It compares the two values the pointers point at and replaces the bigger
value with the smaller one in the list's label field.
dist : It calculates the distance between 2 coordinate points.
del : This function takes both the head and temp pointers and deletes the list
created for storing labels, node by node. As the list was created dynamically, it
must be deleted by the programmer at the end of the program; the compiler is not
responsible for this task.
add : The add function takes a pointer to an array, an integer value (the size of
that array) and a pointer to an integer (the label). It adds the label to the array if it
is not already present, to keep a count of the number of different labels in the
image.
mod : This function takes two integer values and one integer pointer. It calculates
the absolute difference between the two integers and stores it at the location
pointed to by the pointer.
thinning : explained in a subsequent section.
thinningIteration : explained in a subsequent section.
Insertionsort : For sorting, which is used to find the median of the values in the
mask of the median filter.
Thinning Algorithm
The method for extracting the skeleton of a picture consists of removing all the contour
points of the picture except those points that belong to the skeleton. In order to preserve
the connectivity of the skeleton, each iteration is divided into two sub-iterations.
In the first sub-iteration, the contour point P1 is deleted from the digital pattern if it
satisfies the following conditions:
(a) 2 ≤ B(P1) ≤ 6
(b) A(P1)= 1
(c) P2*P4*P6 = 0
(d) P4*P6*P8 = 0
where A(P1) is the number of 01 patterns in the ordered set P2, P3, P4, ..., P8, P9 of
the eight neighbors of P1 (Figure 1), and B(P1) is the number of nonzero neighbors of
P1, that is, B(P1) = P2 + P3 + P4 + ... + P8 + P9. If any condition is not satisfied, e.g.
if A(P1) = 2, P1 is not deleted from the picture.
In the second sub-iteration, only conditions (c) and (d) are changed as follows:
(c') P2*P4*P8 = 0
(d') P2*P6*P8 = 0
and the rest remain the same.
By conditions (c) and (d) of the first sub-iteration, only the south-east boundary points
and the north-west corner points which do not belong to an ideal skeleton are removed.
By condition (a), the endpoints of a skeleton line are preserved. Also, condition (b)
prevents the deletion of those points that lie between the endpoints of a skeleton
line, as shown in Figure 5. The iterations continue until no more points can be removed.
0 0 0 0 0 0 0 0 0 0
0 1 1 1 1 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0
This represents a one-pixel-thick line in a binary image. For an endpoint of the line,
B(P1) = 1, so condition (a) fails and it won't be deleted. For an interior point of the
line, A(P1) = 2, so condition (b) fails and it won't be deleted either. In this way the
single-pixel boundary is preserved.
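The two sub-iterations above can be sketched as follows; this is a straightforward reading of the stated conditions, not the report's exact code:

```cpp
#include <utility>
#include <vector>

using Img = std::vector<std::vector<int>>;

// One Zhang-Suen sub-iteration. iter == 0 applies conditions (c), (d);
// iter == 1 applies (c'), (d'). Returns true if any point was removed.
bool thinningIteration(Img &im, int iter) {
    int H = (int)im.size(), W = (int)im[0].size();
    std::vector<std::pair<int, int>> marked;
    for (int i = 1; i < H - 1; ++i)
        for (int j = 1; j < W - 1; ++j) {
            if (im[i][j] != 1) continue;
            int p2 = im[i-1][j],   p3 = im[i-1][j+1], p4 = im[i][j+1];
            int p5 = im[i+1][j+1], p6 = im[i+1][j],   p7 = im[i+1][j-1];
            int p8 = im[i][j-1],   p9 = im[i-1][j-1];
            // B(P1): number of nonzero neighbors.
            int B = p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9;
            // A(P1): number of 01 patterns in the ordered neighbor set.
            int A = (p2 == 0 && p3 == 1) + (p3 == 0 && p4 == 1) +
                    (p4 == 0 && p5 == 1) + (p5 == 0 && p6 == 1) +
                    (p6 == 0 && p7 == 1) + (p7 == 0 && p8 == 1) +
                    (p8 == 0 && p9 == 1) + (p9 == 0 && p2 == 1);
            int c = (iter == 0) ? p2 * p4 * p6 : p2 * p4 * p8;
            int d = (iter == 0) ? p4 * p6 * p8 : p2 * p6 * p8;
            if (B >= 2 && B <= 6 && A == 1 && c == 0 && d == 0)
                marked.push_back({i, j});
        }
    for (auto &p : marked) im[p.first][p.second] = 0;  // delete in parallel
    return !marked.empty();
}

// Repeat both sub-iterations until no more points can be removed.
void thinning(Img &im) {
    bool changed = true;
    while (changed) {
        bool a = thinningIteration(im, 0);
        bool b = thinningIteration(im, 1);
        changed = a || b;
    }
}
```

Running this on the one-pixel line above leaves it unchanged, as the text argues.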
Main function
In the main function we first load the source image and apply a median filter to it.
An array of integers, window, is created and filled with the 9 values of the 3×3
kernel centered at each pixel. This array is sorted using insertion sort, and the value
at the center of the kernel is then replaced by the fifth value in sorted order, i.e. the
median.
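A minimal version of this filtering step might look like the following (std::sort stands in for the report's insertion sort, and leaving border pixels untouched is an assumption):

```cpp
#include <algorithm>
#include <vector>

// 3x3 median filter on a grayscale image stored row-major. For each
// interior pixel, the nine neighborhood values are sorted and the pixel
// is replaced by the fifth value in sorted order, i.e. the median.
std::vector<int> medianFilter3x3(const std::vector<int> &src, int W, int H) {
    std::vector<int> dst = src;
    for (int y = 1; y < H - 1; ++y)
        for (int x = 1; x < W - 1; ++x) {
            int window[9], k = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    window[k++] = src[(y + dy) * W + (x + dx)];
            std::sort(window, window + 9);
            dst[y * W + x] = window[4];  // median of the nine values
        }
    return dst;
}
```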
Now we loop through the image in a for loop. At each pixel, we store in an array of
coordinate the coordinates of the point and of its right and down neighbors. This
array is then passed to the function vec, which returns the normal. This normal is
stored in an array of vctr. So after looping through the entire image, we have the
unit normal at each point in a 2D array of vctr.
Now we take the dot product of a pixel's normal with those of all 8 neighbors, one
by one, using the edge function, and store the values in an array of floats. Then the
average of the 8 dot products is taken. If the average value is smaller than the
threshold (a smaller dot product means a larger angle), we mark the point as a
boundary point (black) and the remaining points as white, or the other way round
(just a matter of convention).
Hence after looping through the entire image we have the edges in the image. But as
the edges are quite thick, a thinning algorithm is also applied to reduce the thickness
of the edges to one pixel only; it is explained in the preceding section. After
completing all this we get a binary image.
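Put together, the boundary-marking loop could be sketched like this (the function name markEdges and the flat row-major layout are assumptions):

```cpp
#include <vector>

struct vctr { float x, y, z; };

// Mark a pixel as a boundary point (0) when the average dot product
// between its normal and those of its 8 neighbors falls below a
// threshold, i.e. when the surface orientation changes sharply there;
// all other pixels are marked white (255).
std::vector<unsigned char> markEdges(const std::vector<vctr> &n,
                                     int W, int H, float thresh) {
    std::vector<unsigned char> out(W * H, 255);
    for (int y = 1; y < H - 1; ++y)
        for (int x = 1; x < W - 1; ++x) {
            const vctr &c = n[y * W + x];
            float sum = 0.0f;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    if (dx == 0 && dy == 0) continue;
                    const vctr &q = n[(y + dy) * W + (x + dx)];
                    sum += c.x * q.x + c.y * q.y + c.z * q.z;
                }
            if (sum / 8.0f < thresh)  // small dot product => large angle
                out[y * W + x] = 0;   // boundary point (black)
        }
    return out;
}
```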
Our next step is to identify the set of pixels in a connected region as a single patch.
For this we define an array of integer pointers having exactly the same size as the
original image. The algorithm for assigning labels to the non-boundary pixels has
already been discussed.
To label each pixel with a value, we follow the Connected Component labelling code
explained in the first part. It proceeds as follows:
If the pixel is a boundary or background point, continue the loop without doing
anything else, and store in the patch matrix a pointer to an integer whose value is
255.
Else, at x=y=0, assign a new label and create a node that stores this label value.
Also, a pointer to this value is stored in the patch matrix.
In the first row or column, only one of the left or top neighbors is available. In
that case, if the available neighbor is a boundary point, increment the value of
label, create a new node and assign it; also store the pointer in the patch matrix.
If either the top or left neighbor of the concerned pixel is a boundary point, assign
the label of the other neighbor and store a pointer to it in the patch matrix.
If both neighbors are boundary points, increment the value of label and assign it to
the pixel.
If both neighbors are non-boundary points, then assign the smaller label of the two
and record that the bigger value is a child of the smaller value.
So in the main function, as soon as we assign a new label, we call the function
create, which adds a node to the current linked list if a list already exists and
creates the first node if no list exists at the time the label is assigned. Then we make
the corresponding pointers point to the label stored in that list. As further labels are
assigned, the list grows in size, and when the last pixel is processed we get a
complete list holding all the assigned label values.
For merging the labels such that regions which are not completely disjoint have the same
label, we follow this approach (in the patch matrix):
If the pixel is in the first row or column, has both top and left neighbors as
boundary, or is itself a boundary, continue the loop. This is because in the first row
or column the pixel will have either a new value or the same value as its available
neighbor; in either case it won't be directly involved in a change of values or a call
to replace. It would only be involved as a neighbor.
Else, if the top or left neighbor is a boundary and the value at the pixel is not equal
to the value of the other neighbor, use the replace function to replace the pointer of
the larger value with the smaller value. In this way these components are merged.
Else, if neither neighbor is a boundary, then by the rule of label assignment the
label value has to be equal to the value of either the top or left neighbor. If the
value equals that of the left neighbor, call the replace function on the top neighbor
and the current pixel. In this way these regions are joined.
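The labelling-plus-merging scheme above is essentially two-pass connected component labelling; a compact sketch using an equivalence table in place of the linked list (the union-find helper is an assumption made for brevity) is:

```cpp
#include <algorithm>
#include <vector>

// Two-pass connected component labelling over a binary image, where 0
// marks boundary/background pixels. Provisional labels are taken from
// the top/left neighbors; when both carry different labels, the larger
// one is recorded as equivalent to the smaller (the role of replace()).
std::vector<int> labelComponents(const std::vector<int> &img, int W, int H) {
    std::vector<int> lab(W * H, 0);
    std::vector<int> parent(1, 0);  // parent[l] == l for a root label
    auto find = [&](int l) { while (parent[l] != l) l = parent[l]; return l; };
    int next = 0;
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x) {
            int i = y * W + x;
            if (img[i] == 0) continue;               // boundary/background
            int left = (x > 0) ? lab[i - 1] : 0;
            int top  = (y > 0) ? lab[i - W] : 0;
            if (!left && !top) {                     // no labelled neighbor
                parent.push_back(++next);            // assign a new label
                lab[i] = next;
            } else if (left && top) {                // merge the two labels
                int a = find(left), b = find(top);
                int s = std::min(a, b), g = std::max(a, b);
                parent[g] = s;                       // bigger -> smaller
                lab[i] = s;
            } else {
                lab[i] = left ? left : top;          // copy the one label
            }
        }
    // Second pass: resolve every provisional label to its root.
    for (int &l : lab)
        if (l) l = find(l);
    return lab;
}
```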
So by now we have identified different surfaces in the scene. Since objects are
made up of surfaces, we need to combine the surfaces that belong to the same
object. For this we visit each boundary pixel in the patch matrix, consider all
neighbors at a distance of one unit from the concerned pixel (a circle of unit
radius), and then follow this approach:
If the neighborhood contains more than two boundary pixels, we continue
through the loop and skip that pixel, as it lies at the boundary of more than two
surfaces and all these surfaces will be merged in subsequent steps.
If the neighborhood contains at most two boundary pixels, then for each
non-boundary neighbor we call the mod function, which returns the absolute
difference between the depth values of that neighbor and the concerned boundary
pixel. The difference values of all non-boundary neighbors are summed and
averaged.
If the average is less than a threshold (20 in our case), the minimum label value
among all the neighbors is assigned to all the non-boundary pixels.
If the average is more than the specified threshold, the program considers them
surfaces belonging to different objects and hence leaves them as they are.
After this step, the surfaces belonging to the same patch form a single object.
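The depth test in the rules above might be sketched as follows (sameObject and its signature are assumptions made for illustration):

```cpp
#include <cstdlib>

// The report's mod(): absolute difference of two integers, written
// through a pointer.
void mod(int a, int b, int *out) { *out = std::abs(a - b); }

// Decide whether the surfaces around a boundary pixel belong to the
// same object: average the depth differences between the boundary
// pixel and its n non-boundary neighbors and compare to a threshold.
bool sameObject(const int *neighborDepths, int n, int boundaryDepth,
                int thresh) {
    if (n == 0) return false;
    int sum = 0, d = 0;
    for (int i = 0; i < n; ++i) {
        mod(neighborDepths[i], boundaryDepth, &d);
        sum += d;
    }
    return (sum / n) < thresh;  // the report uses a threshold of 20
}
```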
Now we come to the part of 3D projection. Here a right and a left stereo image are
taken. Both images are split into their RGB channels using the split function. Then
another array is created in which the red component from the left image and the
blue and green components from the right image are merged. This creates an
anaglyph which, when viewed through red-cyan glasses, creates the perception of
depth.
Results
This section presents the final results that were obtained from the applied algorithm. First
the input image is shown with five objects in it and then the segmented images for each
object are shown.
But these results represent just the 2D output. So to convert it into 3D output we use
the technique of anaglyph.
Anaglyph
Since we did not have the RGB image of the scene shown above, we show below the
results of our function tested on a different set of images. The left image is shown on
the left, the right image on the right, and the anaglyph at the bottom.
Conclusion
• This report presented an extension of a model-free segmentation algorithm for
cluttered scenes which is not restricted by a given set of object models or world
knowledge.
• A fast algorithm to determine object edges using edge detection on surface normals
was combined with a novel graph-based method to combine surface patches to
form highly probable object hypotheses.
• Coplanarity checks and curvature matching were added to handle occluded and
open curved objects.
• The algorithm can deal with stacked, nearby, and occluded objects, which is
achieved by finding object edges in depth images and the novel idea to identify
adjacent and cut-free surface patches, as well as coplanar surfaces, separated by
occlusion, which can be combined to form object regions.
• The algorithm was evaluated w.r.t. real-time capabilities and segmentation quality.
References
• Real-time 3D Segmentation for Human-Robot Interaction, A. Ückermann, R.
Haschke and H. Ritter
• Real-Time 3D Segmentation of Cluttered Scenes for Robot Grasping, A.
Ückermann, R. Haschke and H. Ritter
• A Fast Parallel Algorithm for Thinning Digital Patterns, T. Y. Zhang and C. Y.
Suen
• http://www.aishack.in/2010/03/labelling-connected-components-example
• Digital Image Processing, 1st edition, S. Sridhar
• Digital Image Processing, 2nd edition, R. C. Gonzalez and R. E. Woods
• Object Oriented Programming with C++, 6th edition, E. Balagurusamy
• Data Structures with C, 2nd edition, Yashwant Kanetkar
• Learning OpenCV, G. Bradski and A. Kaehler