A REPORT
ON
Realtime 3D segmentation
By
Gunjan Kumar Singh 2012B5A7521P Msc. (Hons.) In Physics and B.E. (Hons.) in Computer Science
Saurabh Bhardwaj 2012B5A7848P Msc. (Hons.) In Physics and B.E. (Hons.) in Computer Science
Divya Sanghi 2012B4A7958H Msc. (Hons.) In Mathematics and B.E. (Hons.) in Computer Science
AT
CSIR-Central Electronics Engineering Research Institute
Pilani-333031
A Practice School-I station of
BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI
23rd May - 17th July, 2014
A REPORT
ON
Realtime 3D segmentation
By
Gunjan Kumar Singh 2012B5A7521P Msc. (Hons.) In Physics and B.E. (Hons.) in Computer Science
Saurabh Bhardwaj 2012B5A7848P Msc. (Hons.) In Physics and B.E. (Hons.) in Computer Science
Divya Sanghi 2012B4A7958H Msc. (Hons.) In Mathematics and B.E. (Hons.) in Computer Science
Prepared in the partial fulfilment of
Practice School-I
(BITS F221)
Under the guidance of
DR. JAGDISH RAHEJA
PRINCIPAL SCIENTIST, DIGITAL SYSTEMS GROUP
AT
CSIR-Central Electronics Engineering Research Institute (CEERI)
Pilani-333031
A Practice School-I station of
BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI
23rd May - 17th July, 2014
BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE
PILANI (RAJASTHAN)
Practice School Division
Station: CENTRAL ELECTRONICS ENGINEERING RESEARCH INSTITUTE (CEERI) Centre: Pilani
Duration: From: 23rd May, 2014 To: 17th July, 2014
Date of Submission: 15th July, 2014
Title of the Project: REALTIME 3D SEGMENTATION
Name of the Student ID No. Discipline
Gunjan Kumar Singh 2012B5A7521P Msc. (Hons.) In Physics and B.E. (Hons.) in Computer Science
Saurabh Bhardwaj 2012B5A7848P Msc. (Hons.) In Physics and B.E. (Hons.) in Computer Science
Divya Sanghi 2012B4A7958H Msc. (Hons.) In Mathematics and B.E. (Hons.) in Computer Science
Name of the Expert: Dr. Jagdish Raheja Designation: Principal Scientist
Name of the PS Faculty: Mr. Parikshit Kishor Singh
Key words: Object segmentation, occlusion check, Graph, Adjacency Matrix
Project Area: 3D Image Processing
Abstract: A real-time algorithm that segments unstructured and highly cluttered scenes is discussed in
this paper. The algorithm robustly separates objects of unknown shape in congested scenes of stacked
and occluded objects. The model-free approach finds smooth surface patches, using a depth image
from a Kinect camera, which are subsequently combined to form highly probable object hypotheses.
Co-planarity and curvature matching are used to recombine surfaces separated by occlusion. The
real-time capabilities are proven and the quality of the algorithm is evaluated on a benchmark database.
Advantages compared to existing approaches as well as weaknesses are discussed.
Date: 15th July 2014
Table of Contents
Acknowledgement
Introduction
Pre-Segmentation
 Median filter
 Determining surface normals and temporal smoothing
 Detection of surface normal edges
 Segmentation into surface patches
 Connected component analysis algorithm
High-Level Object Segmentation
 Adjacency matrix and assignment of edge points
 Cutfree neighbors
 Improving the adjacency matrix
 Co-planarity
 Curvature matching
 Probabilistic object composition (graph cut)
 Remaining edge points
Code Explanation
 Structures
 Functions
 Thinning algorithm
 Main function
Results
Conclusion
References
Acknowledgement
Firstly, we are very grateful to Practice School Division (PSD), BITS-Pilani for
providing us an opportunity to pursue our Practice School-1(PS-1) under guidance of
eminent scientists at Central Electronics Engineering Research Institute (CEERI), Pilani.
We would like to express our sincere thanks to Dr. Chandrashekhar, Director,
CEERI, Pilani for giving us the opportunity to carry out a project in this esteemed
organization. We would also like to thank Dr. J.L. Raheja our Project Guide for
suggesting us the project and providing us valuable guidance and support throughout
our work. We would like to extend our gratitude to Ms. Zeba and all others who were
directly or indirectly related to this project.
We are grateful to Mr. Vinod Verma for helping us in our daily attendance and
support throughout our tenure in CEERI. We would also like to thank our PS-1
instructor, Mr. Parikshit Kishor Singh, for being a constant source of guidance and
motivation for us.
Introduction
In computer vision, image segmentation is the process of partitioning a digital image into
multiple segments (sets of pixels, also known as super-pixels). The goal
of segmentation is to simplify and/or change the representation of an image into
something that is more meaningful and easier to analyze.
In the present work the model-free and real-time capable segmentation approach
presented in the previous work of the authors is extended to a general probabilistic
framework, which considers multimodal cues in a uniform manner. The algorithm
combines two segmentation methods: the identification of smooth object surfaces and the
composition of these surfaces into sensible object hypotheses.
In this work, region growing is replaced by connected component analysis and motion
sensitive temporal smoothing is implemented to avoid the motion blur effect.
While the high-level segmentation previously extracted support planes and decomposed the
remaining blobs using binary space partitioning, the second contribution introduced the
idea of composing cutfree neighboring surfaces. In the current work, graph-cut is
applied to a probabilistically weighted similarity graph considering adjacency, curvature
and co-planarity of the found surface patches, enabling the method to handle occluded and
open curved objects.
Additionally, the algorithms are further optimized for real-time challenges. The main
advantage of the method, in contrast to existing ones, is the capability to segment unknown,
stacked, nearby, and occluded objects in a model-free manner. Naturally, this approach
has its limitations compared to model-based approaches, especially if very complex
object heaps are to be considered. However, it provides a meaningful initial object
hypothesis in arbitrary situations, which can be refined by active exploration or used as
input to model-based adaptive methods.
The probabilistic nature of the method allows such refinements to be focused selectively on
disambiguating uncertain object hypotheses. The algorithm operates in real time, facilitating
interactive usage in human-robot-cooperation tasks.
Pre-Segmentation
The objective of the first processing step is to segment the depth image into regions of
(smoothly curved) surfaces, continuously enclosed by sharp object edges. We deal with
depth images as they possess low noise levels. Additionally, the raw depth image is
transformed into a 3D point cloud, which is represented w.r.t. a robot-defined coordinate
frame.
a) Median Filter: The median filter is the first step in implementing this work. It is used to
remove noise from the image (if any). In this method we construct a mask of N × N
dimension, where N is an odd number; generally a 3 × 3 mask is preferred. The mask
consists of all 8 neighbors of the concerned pixel, including the pixel itself. All the
pixel values in the mask are then sorted and the concerned pixel is replaced with the
median value of the mask. The median filter is preferred over the box filter because it
replaces the pixel value with the value of one of its neighbors, while a box filter may
produce a value which appears nowhere in the entire image.
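A minimal sketch of this 3 × 3 median filter in C++ with OpenCV (our own illustration; OpenCV's built-in cv::medianBlur(src, dst, 3) achieves the same in one call):

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <vector>

// Replace each pixel by the median of its 3x3 neighborhood.
// Border pixels are left untouched for simplicity.
cv::Mat medianFilter3x3(const cv::Mat& src) {        // src: 8-bit, single channel
    cv::Mat dst = src.clone();
    for (int y = 1; y < src.rows - 1; ++y)
        for (int x = 1; x < src.cols - 1; ++x) {
            std::vector<uchar> window;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    window.push_back(src.at<uchar>(y + dy, x + dx));
            std::sort(window.begin(), window.end()); // the report uses insertion sort
            dst.at<uchar>(y, x) = window[4];         // 5th of 9 sorted values = median
        }
    return dst;
}
```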
b) Determination of Surface Normals and Temporal Smoothing: As a basis for
computing “surface normal edges”, surface normals for every image point are determined
from the plane spanned by three points in the 3 ×3 neighborhood of the considered
central image point using cross product.
The determination of surface normals is directly performed on the raw depth image,
instead of the 3D point cloud. That is, the 2D image coordinates are augmented by the
depth value to yield valid three-dimensional vectors. This procedure yields much more
distinct changes of the normal direction at the boundary of objects, because the
smoothing effect due to 3D projection is avoided.
In order to reduce sensor noise and to obtain smooth and stable surface normal
estimations, a three stage smoothing procedure is applied.
First, the 3 × 3 median filter described above is applied to the raw depth image. Secondly, a
motion-sensitive temporal smoothing is used, averaging the depth values of each individual
image pixel over the last n = 6 frames, provided the difference of the depth values is smaller
than d = 10. The normals are calculated by taking the coordinates of the concerned pixel and
any two of its neighbor pixels; the normal is then the cross product of the two resulting
vectors. Finally, the calculated normals are smoothed by applying a convolution
with a 5 × 5 Gaussian kernel.
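The normal computation and the temporal smoothing rule might be sketched as follows (a minimal illustration with our own names, not the exact implementation):

```cpp
#include <opencv2/opencv.hpp>
#include <cmath>
#include <vector>

// Normal at (x, y) from the plane spanned by the pixel and its right and
// bottom neighbors; 2D image coordinates are augmented with the raw depth.
cv::Vec3f surfaceNormal(const cv::Mat& depth, int x, int y) {   // depth: CV_32F
    cv::Vec3f p (float(x),     float(y),     depth.at<float>(y, x));
    cv::Vec3f px(float(x + 1), float(y),     depth.at<float>(y, x + 1));
    cv::Vec3f py(float(x),     float(y + 1), depth.at<float>(y + 1, x));
    cv::Vec3f n = (px - p).cross(py - p);          // cross product of edge vectors
    float len = std::sqrt(n.dot(n));
    return len > 0 ? n * (1.0f / len) : cv::Vec3f(0, 0, 0);    // unit normal
}

// Motion-sensitive temporal smoothing: average a pixel over the last n = 6
// frames, but only over frames whose depth change stays below d = 10.
float smoothDepth(const std::vector<cv::Mat>& lastFrames, int x, int y) {
    float cur = lastFrames.back().at<float>(y, x), sum = 0.0f;
    int cnt = 0;
    for (const cv::Mat& f : lastFrames) {          // lastFrames holds up to 6 entries
        float v = f.at<float>(y, x);
        if (std::fabs(v - cur) < 10.0f) { sum += v; ++cnt; }
    }
    return cnt ? sum / cnt : cur;
}
```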
c) Detection of Surface Normal Edges: The next step is the fast detection of surface
normal edges, which is based on the computation of the scalar product of adjacent surface
normals. To obtain clear, uninterrupted edges, suitable for subsequent application of a
region growing algorithm, we look for edges in all eight directions defined by the
neighboring pixels of a point, i.e. north (N), east (E), south (S), west (W), as well as NE,
SE, SW, NW.
The final result of the edge filter is obtained from averaging the results of all eight scalar
products. While large values, close to one, correspond to flat surfaces, smaller values
indicate increasingly sharp object edges.
Finally, binarizing the obtained edge image by employing a threshold value θmax = 0.85
(31.8°), we can easily separate edges from smoothly curved surfaces.
Object edges are clearly visible as bold lines, as shown in the figure below: on the left the
actual depth map is shown, while the right picture shows the detected edges. Smooth
and large surfaces form homogeneous white regions. Some false edges may still be
detected due to noise; however, those regions are small and disjointed and thus can be
easily filtered out in subsequent processing steps.
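The per-pixel edge test can be sketched as follows, assuming the per-pixel normals from the previous sketch:

```cpp
#include <opencv2/opencv.hpp>

// Edge test at (x, y): average the scalar products between the pixel's normal
// and the normals of its 8 neighbors; small averages indicate sharp edges.
bool isEdge(const cv::Mat_<cv::Vec3f>& normals, int x, int y,
            float thetaMax = 0.85f) {            // 0.85 corresponds to ~31.8 degrees
    static const int dx[8] = {-1, 0, 1, -1, 1, -1, 0, 1};
    static const int dy[8] = {-1, -1, -1, 0, 0, 1, 1, 1};
    float sum = 0.0f;
    for (int k = 0; k < 8; ++k)
        sum += normals(y, x).dot(normals(y + dy[k], x + dx[k]));
    return (sum / 8.0f) < thetaMax;              // true => binarized edge point
}
```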
d) Segmentation into Surface Patches: Finally, a fast connected component analysis
algorithm is applied. The fast surface patch segmentation based on normal edges already
provides a detailed segmentation of the scene into surface patches, and is then employed
in the subsequent object segmentation step. The algorithm consists of two iterations of
processing, which are outlined below.
1st Iteration
Scan the image pixel by pixel. For each pixel that is not an edge or background point, check its top and left neighbors:
 If both neighbors are boundary or background points, assign a new label.
 If exactly one neighbor is a boundary or background point, assign the label of the other neighbor.
 If neither neighbor is a boundary or background point, assign the label of the neighbor with the smaller value, and record the larger label as a child of the smaller label.
2nd Iteration
Scan the image pixel by pixel. For every labelled pixel, get the label's parent and assign the parent's label in place of the child's.
Both iterations are also explained below with the help of a sample image. Suppose we have
an image as shown to the right, where the black pixels represent the boundary. First of all
we visit the first pixel; after making sure that it is not a boundary pixel, we assign the
first label to it. Then, moving to the next pixel, we check its left neighbor (as it doesn't
have any top neighbor). Since the neighbor is not a boundary pixel, we give it the same
label. After that we encounter a boundary, and after a boundary we need to generate a new
label. So a new label is generated and assigned to the pixel, as shown in the image below
(left). Proceeding similarly, we assign labels to all the non-boundary pixels in the 1st row.
Similarly we mark pixels in the 2nd row as well, comparing the current pixel with its top
and left neighbors.
In the third row we encounter a pixel (just below label 2) whose top neighbor has label 2
while the left neighbor has label 1, so the lower value is assigned to it (upper right
figure). We proceed similarly, and at the end of the 1st iteration we get an image like the
one shown below.
The 2nd iteration is all about merging the different labels which make up the same surface.
In the image shown above, labels 1 and 2 represent the same plane, so they must be
represented by a single label. In this iteration we merge such labels into the minimum of
the two values. We first find a pixel where the top or left neighbor has a different
label than the current pixel. Then we replace all the pixels of that region with the
minimum value of the two labels. The lower left image shows the image after merging
pixels with labels 1 and 2, while the lower right image shows the resulting image after
completion of merging.
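The two passes can be sketched compactly in C++ with OpenCV (a minimal illustration of the labelling scheme above; the actual implementation stores labels through a linked list and a pointer matrix, as described in the Code Explanation section):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>
#include <algorithm>

// Two-pass connected component labelling on a binary edge image
// (255 = boundary pixel, anything else = surface pixel).
cv::Mat labelComponents(const cv::Mat& edges) {
    cv::Mat labels(edges.size(), CV_32S, cv::Scalar(0));
    std::vector<int> parent(1, 0);                  // parent[l] for label merging
    auto find = [&](int l) { while (parent[l] != l) l = parent[l]; return l; };
    int next = 0;
    // 1st pass: provisional labels plus parent/child relations.
    for (int y = 0; y < edges.rows; ++y)
        for (int x = 0; x < edges.cols; ++x) {
            if (edges.at<uchar>(y, x) == 255) continue;         // boundary pixel
            int top  = (y > 0) ? labels.at<int>(y - 1, x) : 0;  // 0 = no label
            int left = (x > 0) ? labels.at<int>(y, x - 1) : 0;
            int l;
            if (!top && !left) { l = ++next; parent.push_back(l); }  // new label
            else if (top && left) {
                int a = find(top), b = find(left);
                l = std::min(a, b);
                parent[std::max(a, b)] = l;          // larger label becomes child
            }
            else l = top ? top : left;               // copy the one valid neighbor
            labels.at<int>(y, x) = l;
        }
    // 2nd pass: replace every label by its root parent.
    for (int y = 0; y < edges.rows; ++y)
        for (int x = 0; x < edges.cols; ++x)
            if (int l = labels.at<int>(y, x)) labels.at<int>(y, x) = find(l);
    return labels;
}
```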
High-Level Object Segmentation
In the second processing block, the aim is segmentation on an object level, which means
that the previously found surface patches need to be combined to form proper object
regions. A weighted graph is created, modeling the probability of two surface patches
belonging to the same object region.
Subsequently, this graph structure is analyzed to find the most probable segmentation into
object regions using a graph cut algorithm. Co-planarity and curvature cues are also
employed to combine object patches which are separated due to occlusion.
a) Adjacency Matrix and Assignment of Edge Points:
An initial adjacency matrix representing the basic connectivity of surface patches is
determined as follows: for every edge point pr, all neighboring surface points pi within a
radius r in image space are considered which have a Euclidean distance ||pr − pi|| smaller
than a threshold dmax. All possible surface pairs obtained from this list are marked as
adjacent.
For example, in the given figure, surfaces 2 and 10 are neighbors in image space, but not
in 3D space, and therefore aren't considered adjacent. Faces 7, 9 and 11 fulfill the
conditions and become connected in the graph.
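The construction might be sketched as follows (r and dmax are placeholders here; the report does not fix their values):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// For every edge point, collect the labels of nearby surface points whose 3D
// distance to the edge point is below dmax, and mark all pairs adjacent.
void buildAdjacency(const cv::Mat_<cv::Vec3f>& points3d,   // per-pixel 3D points
                    const cv::Mat& labels,                 // CV_32S patch labels
                    const std::vector<cv::Point>& edgePoints,
                    cv::Mat& adj,                          // CV_8U, NxN, zeroed
                    int r = 2, float dmax = 3.0f) {        // placeholder values
    for (const cv::Point& e : edgePoints) {
        std::vector<int> near;                             // labels near this point
        for (int dy = -r; dy <= r; ++dy)
            for (int dx = -r; dx <= r; ++dx) {
                cv::Point q(e.x + dx, e.y + dy);
                if (q.x < 0 || q.y < 0 || q.x >= labels.cols || q.y >= labels.rows)
                    continue;
                int l = labels.at<int>(q);
                if (l && cv::norm(points3d(q) - points3d(e)) < dmax)
                    near.push_back(l);
            }
        for (int a : near)                                 // all pairs adjacent
            for (int b : near)
                if (a != b) adj.at<uchar>(a, b) = 1;
    }
}
```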
b) Cutfree Neighbors: To further improve the adjacency matrix, a plausible heuristic
check is applied. The central idea of this heuristic approach is that visible surface patches
are part of an object’s outer hull, such that points belonging to this object should either lie
on the one or the other side of the associated plane. If we conversely find enough points
on both sides of the plane, we assume two (or more) separated objects and split the blob
into two blobs for further processing.
A surface pair is rejected if one surface cuts the other, i.e. if a considerable number of
points lie on both sides of the fitted plane. For illustration, consider surfaces 4 and 12 in
the figure above: while all points of face 4 are on top of the supporting face 12, the plane
fitted to surface 4 cuts surface 12. Hence this surface combination is disregarded. On the
other hand, surfaces 7, 9 and 11 are all pairwise cut-free.
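A minimal sketch of this cut test, assuming the plane of one patch is summarized by a mean point and a unit normal (eps and the 10% minority fraction are illustrative values, not taken from the report):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>
#include <algorithm>

// Returns true if patchB does not have substantial point mass on both sides
// of the plane fitted to the other patch (planePoint, planeNormal).
bool cutFree(const std::vector<cv::Vec3f>& patchB,
             const cv::Vec3f& planePoint, const cv::Vec3f& planeNormal,
             float eps = 1.0f) {                       // noise margin (assumed)
    int above = 0, below = 0;
    for (const cv::Vec3f& p : patchB) {
        float d = (p - planePoint).dot(planeNormal);   // signed plane distance
        if (d >  eps) ++above;
        if (d < -eps) ++below;
    }
    int minority = std::min(above, below);
    return minority < 0.1 * patchB.size();             // few points on the far side
}
```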
c) Improving the Adjacency Matrix: In case of occlusion, a single face of an object is
separated into two parts, which will not have a link in the adjacency matrix. To overcome
this limitation, further links are added to the matrix based on additional cues, namely
co-planarity of flat faces and similar curvature of curved surfaces.
d) Co-Planarity: To check for co-planarity of two flat surfaces we proceed in two steps:
 Whether both surfaces have similar mean normals (up to a small noise margin). Because
the normal of the spanned plane may crucially depend on the actual selection of
points, this criterion is checked for a set of 50 randomly selected triples of points.
If any of the calculated normals deviates too much, the two surfaces are not
considered coplanar.
 Whether the faces are aligned, i.e. indeed span a common plane. In this case, any
plane spanned by three points from both surfaces should have a similar normal as
the two original mean normals.
Additionally, the above-described occlusion check is carried out along several lines
connecting two randomly selected points from both surfaces. If this check is passed as
well, a corresponding link in the connectivity matrix is added for the given pair of
surfaces.
This figure shows the resulting graphs before and after the co-planarity extension. While
the first graph results in four final objects, the second graph correctly results in three objects.
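The first step of the check might be sketched as follows (a hypothetical helper; the 0.95 dot-product margin is an assumed noise threshold, not a value from the report):

```cpp
#include <opencv2/opencv.hpp>
#include <cmath>
#include <vector>

// Step one of the co-planarity check: span planes from 50 random point
// triples of one surface and reject as soon as any normal deviates too much
// from the other surface's mean normal. Assumes surf is non-empty.
bool similarMeanNormals(const std::vector<cv::Vec3f>& surf,
                        const cv::Vec3f& otherMeanNormal,
                        float minDot = 0.95f) {     // assumed noise margin
    cv::RNG rng;
    for (int t = 0; t < 50; ++t) {
        const cv::Vec3f& a = surf[rng.uniform(0, (int)surf.size())];
        const cv::Vec3f& b = surf[rng.uniform(0, (int)surf.size())];
        const cv::Vec3f& c = surf[rng.uniform(0, (int)surf.size())];
        cv::Vec3f n = (b - a).cross(c - a);          // normal of the spanned plane
        float len = std::sqrt(n.dot(n));
        if (len == 0) continue;                      // degenerate triple
        if (std::fabs((n * (1.0f / len)).dot(otherMeanNormal)) < minDot)
            return false;                            // normal deviates too much
    }
    return true;    // step two (alignment via cross-surface triples) follows
}
```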
e) Curvature Matching: In order to handle curved surface patches in a similar fashion,
their curvatures are compared. To this end, a curvature histogram is computed for every
curved surface, representing the distribution of surface normals within the surface. The
2D histogram of 11×11 bins describes the relative frequency of observing surface
normals with given x and y components. The associated frequency of z components is
determined by the fact that normals are normalized to magnitude one.
The distance of the histograms is estimated by the mutual overlap of their distributions:
D(A,B) = ∑ij |aij − bij|
Exploiting the normalisation of the histograms (∑ij aij = ∑ij bij = 1) together with the identity
min(a,b) = 1/2 ((a+b) − |a−b|),
we compute the similarity index:
S(A,B) = 1 − 1/2 D(A,B) = ∑ij min(aij, bij)
It is a score between 0 and 1. Surface pairs with a score larger than h = 0.5 are
considered for recombination.
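This similarity score can be computed directly (assuming normalized 11 × 11 histograms stored as float matrices):

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>

// S(A,B) = sum_ij min(a_ij, b_ij) = 1 - D(A,B)/2 for normalized histograms.
// Pairs with S > 0.5 are considered for recombination.
float histogramSimilarity(const cv::Mat& A, const cv::Mat& B) {  // CV_32F, 11x11
    float s = 0.0f;
    for (int i = 0; i < A.rows; ++i)
        for (int j = 0; j < A.cols; ++j)
            s += std::min(A.at<float>(i, j), B.at<float>(i, j));
    return s;       // lies in [0, 1] because both histograms sum to one
}
```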
We differentiate between open curved objects and occluded curved objects. To
recombine the inner and outer surfaces of an open object (like a cup or bowl), two
conditions must be fulfilled: (1) both surfaces are neighboring in image space and (2) the
surfaces are concave and convex, respectively. The first condition reflects the fact
that the calculation of the initial adjacency matrix is restricted to neighboring surfaces in
Euclidean space.
To assess the convexity / concavity of a surface, we again consider the curvature
histogram, namely the two extremal bins hmin and hmax along the major axis of the
histogram blob. Back-projecting these bins into image space, we obtain the point sets Pmin
and Pmax whose normals are mapped onto the corresponding bins.
Let pmin and pmax denote the mean image coordinates of these point sets. Convexity is
then assessed by considering the sign of the scalar product
(pmax − pmin) · (hmax − hmin)
between the directional vectors formed by the extremal points in image vs. histogram
coordinates. If this value is positive, i.e. both vectors point in a similar direction, the
surface is convex; otherwise it is concave. This is also shown in the figure to the right.
The picture below shows the actual image, the depth map and histograms of the objects in
the image. The left two histograms belong to the lying cylinder (one for each occluded
part), the third belongs to the vertical bottle, and the last one to the ball.
f) Probabilistic Object Composition (Graph Cut): The result of the previous steps is an
adjacency matrix representing a graph with edges for all possible surface combinations
arising from cut free neighborhood, co-planarity and curvature matching. This graph is
turned into a weighted graph, such that edge weights represent the strength of
connectivity between two connected nodes.
To determine the connectivity weights, initially a common weight wij = 1/n is assigned to
all edges (i, j) originating from node i, where n denotes the number of nodes adjacent to
node i. This results in a directed graph, where all outgoing edges of a node have the same
weight and thus the same probability for composition with this node. To create an
undirected graph, we average the weights of incoming and outgoing edges:
Wsym = 1/2 (Win + Wout)
The higher the connectivity of two nodes, the higher their connecting weight. Exploiting
the weighted graph, we set a threshold θc = 0.5 and then apply the graph cut algorithm.
Starting with individual nodes, the algorithm calculates all connected sub-graphs in
ascending size and their corresponding cuts. A cut is the set of all edges outgoing from
the sub-graph, and the associated cost is the sum of the corresponding edge weights. If
the cost is smaller than θc, a cut is found and the sub-graph is extracted as a single object.
If the sub-graph exceeds n/2 in size, the algorithm aborts, because all potential cuts have
been considered. The figure on the previous page shows a cut of an edge with probability
0.29; this cut was made because its cost was less than 0.5 (consistent with our threshold
value).
This threshold balances between under- and over-segmentation. A very small value, close
to zero, generates a single segment for every initially connected sub-graph, while a very
high value generates an individual segment for every surface node. This creates
sub-graphs which represent different objects.
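The symmetric weight construction can be sketched directly from the formula above (a minimal illustration; the exhaustive sub-graph cut search itself is omitted):

```cpp
#include <opencv2/opencv.hpp>

// Symmetric edge weights from the adjacency matrix: node i spreads weight 1/n
// over its n outgoing edges, then incoming and outgoing weights are averaged,
// Wsym = 1/2 (Win + Wout). Sub-graphs whose total outgoing cut cost stays
// below the threshold 0.5 are then extracted as single objects.
cv::Mat symmetricWeights(const cv::Mat& adj) {       // adj: CV_8U, NxN, 0/1
    int N = adj.rows;
    cv::Mat w(N, N, CV_32F, cv::Scalar(0));
    for (int i = 0; i < N; ++i) {
        int n = cv::countNonZero(adj.row(i));        // out-degree of node i
        for (int j = 0; j < N; ++j)
            if (adj.at<uchar>(i, j)) w.at<float>(i, j) = 1.0f / n;
    }
    cv::Mat wsym = 0.5f * (w + w.t());               // average in/out weights
    return wsym;
}
```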
g) Remaining Edge Points: In the final processing step, all remaining edge points have
to be processed to obtain the final segmentation result.
Firstly, the remaining points are segmented using a region growing algorithm working in
the image plane and using the Euclidean distance as the criterion of uniformity. These
segments are then processed according to the following rules:
 If a segment has no neighboring faces (caused by missing depth information), it
becomes a separate object.
 If a segment has one neighboring face and comprises very few points only, they are
assigned to this neighbor.
 If a segment is completely enclosed by a single neighboring face, it becomes a new
object. If it is not completely enclosed, all points are assigned to the single
neighboring region.
 If a segment has more than one neighbor and all neighbors are part of a common
object, it will be assigned to this object.
 If a segment has more than one neighbor corresponding to different objects, all
points are assigned to the best matching neighboring plane using RANSAC.
Code Explanation
Structures:
 vctr: Holds the normalized surface normal calculated at each pixel. It contains
the x, y and z components as floats.
 coordinate: A structure to hold the coordinates of different pixels.
 list: It has the fields label and node_no and a pointer next. label holds the assigned
label value, node_no counts the number of nodes (it acts as a counter), and next
is a pointer to the next node.
 arr: It has two fields, value and no.; value contains an integer label, while no. acts
as a counter.
Functions:
 vec: It takes as input a pointer to an array of 3 coordinates, then calculates the
differences between them (to obtain vectors along the surface). It stores the differences
of the x, y and z components in an array and calls the CrossProduct function. It
returns the surface normal of type vctr.
 CrossProduct: It takes as input 2 arrays which contain the x, y and z
components of 2 vectors and calculates their cross product to return the surface
normal.
 edge: It takes as input pointers to 2 vctr and calculates the dot product between
them. This is done to check the angle between two surface normals.
 create: It creates a list of all the labels that are assigned. It has 2 pointers, head and
temp; head points to the first element of the list while temp points to the most
recently added node. It takes as input double pointers to the head and temp pointers,
a pointer to an integer label and an integer n.
When the list is empty, a new node is created. The values of the label and n pointers
are assigned to label and node_no respectively, and head and temp both point at
this newly created node.
When the list is non-empty, the same thing happens except that head keeps pointing
at the first node while temp points at the last node. temp is maintained so that, when
adding a new element, the list does not have to be traversed every time. The function
returns a pointer to temp.
 replace: It takes as input 2 integer pointers and the list created using the create
function. It compares the 2 values the pointers are pointing at and replaces the
bigger value with the smaller value in the list's label field.
 dist: It calculates the distance between 2 coordinate points.
 del: This function takes both the head and temp pointers and deletes the list created
for storing labels, node by node. As the list was created dynamically, it must be
deleted at the end of the program by the programmer; the compiler is not
responsible for this task.
 add: The add function takes a pointer to an array, an integer value (the size of that
array) and a pointer to an integer (label). It adds the label to the array if it is not
already present, to keep a count of the number of different labels in the image.
 mod: This function takes two integer values and one integer pointer. It calculates
the absolute difference between the two integer values and stores it at the location
pointed to by the integer pointer.
 thinning: explained in a subsequent section.
 thinningIteration: explained in a subsequent section.
 Insertionsort: Used for sorting, to find the median of the values in the mask of the
median filter.
Thinning Algorithm
The method for extracting the skeleton of a picture consists of removing all the contour
points of the picture except those points that belong to the skeleton. In order to preserve
the connectivity of the skeleton, each iteration is divided into two sub-iterations.
In the first sub-iteration, the contour point P1 is deleted from the digital pattern if it
satisfies the following conditions:
(a) 2 ≤ B(P1) ≤ 6
(b) A(P1) = 1
(c) P2*P4*P6 = 0
(d) P4*P6*P8 = 0
where A(P1) is the number of 01 patterns in the ordered set P2, P3, P4, ..., P8, P9 that
are the eight neighbors of P1 (Figure 1), and B(P1) is the number of nonzero neighbors of
P1, that is, B(P1) = P2 + P3 + P4 + ... + P8 + P9. If any condition is not satisfied, e.g., if
A(P1) = 2, then P1 is not deleted from the picture.
In the second sub-iteration, only conditions (c) and (d) are changed as follows:
(c') P2*P4*P8 = 0
(d') P2*P6*P8 = 0
and the rest remain the same.
By conditions (c) and (d) of the first sub-iteration, only the south-east boundary points
and the north-west corner points which do not belong to an ideal skeleton are removed.
By condition (a), the endpoints of a skeleton line are preserved. Also, condition (b)
prevents the deletion of those points that lie between the endpoints of a skeleton
line, as shown in Figure 5. The iterations continue until no more points can be removed.
0 0 0 0 0 0 0 0 0 0
0 1 1 1 1 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0
This is like a boundary of some binary image. In the first gray kernel, B(P1) = 1, hence it
won't be deleted. In the second kernel, A(P1) = 2, so it also won't be deleted. In this way
the single-pixel boundary is preserved.
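A standard C++/OpenCV sketch of this algorithm on a 0/1 binary image (this mirrors the thinning and thinningIteration functions mentioned earlier; our actual implementation may differ in details):

```cpp
#include <opencv2/opencv.hpp>

// One Zhang-Suen sub-iteration; iter = 0 applies conditions (a)-(d),
// iter = 1 the variant with (c') and (d').
void thinningIteration(cv::Mat& img, int iter) {          // img: 0/1, CV_8U
    cv::Mat marker(img.size(), CV_8U, cv::Scalar(0));
    for (int y = 1; y < img.rows - 1; ++y)
        for (int x = 1; x < img.cols - 1; ++x) {
            uchar p2 = img.at<uchar>(y-1, x),   p3 = img.at<uchar>(y-1, x+1);
            uchar p4 = img.at<uchar>(y,   x+1), p5 = img.at<uchar>(y+1, x+1);
            uchar p6 = img.at<uchar>(y+1, x),   p7 = img.at<uchar>(y+1, x-1);
            uchar p8 = img.at<uchar>(y,   x-1), p9 = img.at<uchar>(y-1, x-1);
            int A = (p2 == 0 && p3 == 1) + (p3 == 0 && p4 == 1) +
                    (p4 == 0 && p5 == 1) + (p5 == 0 && p6 == 1) +
                    (p6 == 0 && p7 == 1) + (p7 == 0 && p8 == 1) +
                    (p8 == 0 && p9 == 1) + (p9 == 0 && p2 == 1);
            int B = p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9;
            int c = (iter == 0) ? (p2 * p4 * p6) : (p2 * p4 * p8);
            int d = (iter == 0) ? (p4 * p6 * p8) : (p2 * p6 * p8);
            if (img.at<uchar>(y, x) == 1 &&
                B >= 2 && B <= 6 && A == 1 && c == 0 && d == 0)
                marker.at<uchar>(y, x) = 1;               // flag for deletion
        }
    img -= marker;                                        // delete flagged points
}

// Repeat both sub-iterations until the image stops changing.
void thinning(cv::Mat& img) {
    cv::Mat prev;
    do {
        prev = img.clone();
        thinningIteration(img, 0);
        thinningIteration(img, 1);
    } while (cv::countNonZero(img != prev) > 0);
}
```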
Main function
In the main function we first load the source image and apply median filter on it. An
array of integers, window is created. It is given the 9 values of the 3×3 kernel centered at
each pixel. This array is sorted using insertion sort and then value at the center of the
kernel is replaced by the fifth value in sorted order.
Now we loop through the image in a for loop. At each pixel, we store in an array of
coordinate structures the coordinates of the point and of its right and down neighbors.
This array is then passed to the function vec, which returns the normal. This normal is
stored in an array of vctr. So after looping through the entire image, we have in a 2D
array of vctr the unit normal at each point.
Now we take the dot product of a pixel's normal with those of all 8 neighbors, one by
one, using the edge function (which computes the dot product), and store the values in an
array of floats. Then the average of the 8 dot products is taken. If the average value is
smaller than the threshold (a smaller dot product means that the angle is larger), we mark
the point as a boundary point (black) and the rest of the points as white, or the other way
round (just a matter of convention).
Hence after looping through the entire image we have the edges in the image. But as the
edges are quite thick, a thinning algorithm is also applied to reduce the thickness
of edges to one pixel only. The thinning algorithm is explained in a later section. After
completion of all this we get a binary image.
Our next step is to identify the set of pixels in a connected region as a single patch. For
this we define an array of pointers to integer values having exactly the same size as the
original image. The algorithm for assigning labels to the non-boundary pixels has already
been discussed.
To label each pixel with a value, we follow the Connected Component labelling code
explained in the first part. It proceeds as follows:
 If the pixel is a boundary or background point, continue the loop without doing
anything, assigning in the patch matrix a pointer to an integer whose value is 255.
 Else, at x=y=0, assign a new label and create a node that stores this label value.
Also, a pointer to this value is stored in the patch matrix.
 In the first row or column, only one of the left or top neighbors is available (the left
or the top, respectively). In that case, if the available neighbor is a boundary point,
increment the value of label, create a new node and assign it. Also store the pointer
in the patch matrix.
 If either the top or left neighbor of the concerned pixel is a boundary point, assign
the label of the other neighbor and store pointer to that in the patch matrix.
 If both the neighbors are boundary, increment the value of label and assign it to the
pixel.
 If both neighbors are non-boundary points, then assign the smaller label of the two
and store that the bigger value is a child of the smaller value.
So in the main function, as soon as we assign a new label, we call the function create,
which adds a node to the current linked list if a list already exists and creates the first
node if no list exists at the time of assignment. Then we make the corresponding pointers
point to the label stored in that list. As more labels are assigned, the list grows in size,
and when the last pixel is processed we get a complete list holding the values of all the
assigned labels.
For merging the labels such that regions which are not completely disjoint have the same
label, we follow this approach (in the patch matrix):
 If the pixel is in the first row or column, or has both top and left neighbors as
boundary, or is itself a boundary, continue the loop. This is so because in the first
row or column the pixel will have either a new value or the same value as its
available neighbor. In either case, it won't be involved directly in a change of values
or a call to replace; it would only be involved as a neighbor.
 Else, if the top or left neighbor is a boundary and the value at the pixel is not equal
to the value of the other neighbor, use the replace function and replace the pointer of
the larger value by the smaller value. In this way these components are merged.
 Else, if neither neighbor is a boundary, then by the rule of assignment of labels the
label value has to be equal to the value of either the top or left neighbor. If the
value is equal to the left neighbor, then call the replace function on the top neighbor
and the current pixel. In this way these regions will be joined.
So by now we have identified different surfaces in the scene. Now, since objects are
made up of surfaces, we need to combine the surfaces that belong to the same object.
For this we visit each boundary pixel in the patch matrix and construct a circle of unit
radius (considering all the neighbors at a distance of one unit from the concerned pixel),
and then we follow the approach below.
 If the neighborhood contains more than two boundary pixels, then we continue
through the loop and skip that pixel, as it lies at the boundary of more than two
surfaces and all these surfaces will be merged in subsequent steps.
 If the neighborhood contains at most two boundary pixels, then for each
non-boundary neighbor we call the mod function, which returns the absolute
difference between the depth values of the concerned neighbor and the concerned
boundary pixel. The difference values of all non-boundary neighbors are added and
averaged.
 If the average is less than a threshold (20 in our case), the minimum label value
among all the neighbors is assigned to all the non-boundary pixels.
 If the average is more than the specified threshold, the program considers them as
surfaces belonging to different objects and hence leaves them as they are.
After this step, surfaces carrying the same label form a single object.
Now we come to the part of 3D projection. Here a right and a left stereo image are taken.
Both images are split into their RGB channels using the split function. Then another array
is created in which the red component from the left image and the blue and green
components from the right image are merged. This creates an anaglyph which, when
viewed through red-cyan glasses, creates the perception of depth.
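A minimal OpenCV sketch of this channel recombination (file names are placeholders):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Red-cyan anaglyph: red channel from the left image, green and blue
// channels from the right image.
int main() {
    cv::Mat left  = cv::imread("left.png");    // BGR images of equal size
    cv::Mat right = cv::imread("right.png");
    std::vector<cv::Mat> l, r;
    cv::split(left, l);                         // l[0]=B, l[1]=G, l[2]=R
    cv::split(right, r);
    std::vector<cv::Mat> channels = { r[0], r[1], l[2] };  // B, G right; R left
    cv::Mat anaglyph;
    cv::merge(channels, anaglyph);
    cv::imwrite("anaglyph.png", anaglyph);
    return 0;
}
```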
Results
This section presents the final results that were obtained from the applied algorithm. First
the input image is shown with five objects in it and then the segmented images for each
object are shown.
But these results represent just the 2D output, so to convert it into 3D output we use the
technique of anaglyph.
Anaglyph
Since we did not have the RGB image of the scene shown above, below we show the
results of our function tested on a different set of images. The left image is shown to the
left, the right image to the right, and the anaglyph at the bottom.
Conclusion
• In this paper, an extension of a model-free segmentation algorithm for cluttered scenes
was presented which is not restricted by a given set of object models or world knowledge.
• A fast algorithm to determine object edges using edge detection on surface normals
was combined with a novel graph-based method to combine surface patches to
form highly probable object hypotheses.
• Coplanarity checks and curvature matching were added to handle occluded and
open curved objects.
• The algorithm can deal with stacked, nearby, and occluded objects, which is
achieved by finding object edges in depth images and by the novel idea of identifying
adjacent and cut-free surface patches, as well as coplanar surfaces separated by
occlusion, which can be combined to form object regions.
• The algorithm was evaluated w.r.t. real-time capabilities and segmentation quality.
References
• A. Ückermann, R. Haschke and H. Ritter, Realtime 3D Segmentation for Human-Robot Interaction
• A. Ückermann, R. Haschke and H. Ritter, Real-Time 3D Segmentation of Cluttered Scenes for Robot Grasping
• T. Y. Zhang and C. Y. Suen, A Fast Parallel Algorithm for Thinning Digital Patterns
• http://www.aishack.in/2010/03/labelling-connected-components-example
• S. Sridhar, Digital Image Processing, 1st edition
• R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd edition
• E. Balagurusamy, Object Oriented Programming with C++, 6th edition
• Yashwant Kanetkar, Data Structures with C, 2nd edition
• G. Bradski and A. Kaehler, Learning OpenCV

More Related Content

What's hot

International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
AUTOMATED IMAGE MOSAICING SYSTEM WITH ANALYSIS OVER VARIOUS IMAGE NOISE
AUTOMATED IMAGE MOSAICING SYSTEM WITH ANALYSIS OVER VARIOUS IMAGE NOISEAUTOMATED IMAGE MOSAICING SYSTEM WITH ANALYSIS OVER VARIOUS IMAGE NOISE
AUTOMATED IMAGE MOSAICING SYSTEM WITH ANALYSIS OVER VARIOUS IMAGE NOISE
ijcsa
 
Fuzzy entropy based optimal
Fuzzy entropy based optimalFuzzy entropy based optimal
Fuzzy entropy based optimal
ijsc
 
Developing 3D Viewing Model from 2D Stereo Pair with its Occlusion Ratio
Developing 3D Viewing Model from 2D Stereo Pair with its Occlusion RatioDeveloping 3D Viewing Model from 2D Stereo Pair with its Occlusion Ratio
Developing 3D Viewing Model from 2D Stereo Pair with its Occlusion Ratio
CSCJournals
 
A Fuzzy Set Approach for Edge Detection
A Fuzzy Set Approach for Edge DetectionA Fuzzy Set Approach for Edge Detection
A Fuzzy Set Approach for Edge Detection
CSCJournals
 
ADOPTING AND IMPLEMENTATION OF SELF ORGANIZING FEATURE MAP FOR IMAGE FUSION
ADOPTING AND IMPLEMENTATION OF SELF ORGANIZING FEATURE MAP FOR IMAGE FUSIONADOPTING AND IMPLEMENTATION OF SELF ORGANIZING FEATURE MAP FOR IMAGE FUSION
ADOPTING AND IMPLEMENTATION OF SELF ORGANIZING FEATURE MAP FOR IMAGE FUSION
ijistjournal
 
An Analysis and Comparison of Quality Index Using Clustering Techniques for S...
An Analysis and Comparison of Quality Index Using Clustering Techniques for S...An Analysis and Comparison of Quality Index Using Clustering Techniques for S...
An Analysis and Comparison of Quality Index Using Clustering Techniques for S...
CSCJournals
 
IRJET- Comparison and Simulation based Analysis of an Optimized Block Mat...
IRJET-  	  Comparison and Simulation based Analysis of an Optimized Block Mat...IRJET-  	  Comparison and Simulation based Analysis of an Optimized Block Mat...
IRJET- Comparison and Simulation based Analysis of an Optimized Block Mat...
IRJET Journal
 
Image segmentation using normalized graph cut
Image segmentation using normalized graph cutImage segmentation using normalized graph cut
Image segmentation using normalized graph cut
Mahesh Dananjaya
 
Invariant Recognition of Rectangular Biscuits with Fuzzy Moment Descriptors, ...
Invariant Recognition of Rectangular Biscuits with Fuzzy Moment Descriptors, ...Invariant Recognition of Rectangular Biscuits with Fuzzy Moment Descriptors, ...
Invariant Recognition of Rectangular Biscuits with Fuzzy Moment Descriptors, ...
CSCJournals
 
A review on image enhancement techniques
A review on image enhancement techniquesA review on image enhancement techniques
A review on image enhancement techniques
IJEACS
 
J017426467
J017426467J017426467
J017426467
IOSR Journals
 
Quality assessment of stereoscopic 3 d image compression by binocular integra...
Quality assessment of stereoscopic 3 d image compression by binocular integra...Quality assessment of stereoscopic 3 d image compression by binocular integra...
Quality assessment of stereoscopic 3 d image compression by binocular integra...
Shakas Technologies
 
Statistical Feature based Blind Classifier for JPEG Image Splice Detection
Statistical Feature based Blind Classifier for JPEG Image Splice DetectionStatistical Feature based Blind Classifier for JPEG Image Splice Detection
Statistical Feature based Blind Classifier for JPEG Image Splice Detection
rahulmonikasharma
 
Analysis of Multi-focus Image Fusion Method Based on Laplacian Pyramid
Analysis of Multi-focus Image Fusion Method Based on Laplacian PyramidAnalysis of Multi-focus Image Fusion Method Based on Laplacian Pyramid
Analysis of Multi-focus Image Fusion Method Based on Laplacian Pyramid
Rajyalakshmi Reddy
 
A study and comparison of different image segmentation algorithms
A study and comparison of different image segmentation algorithmsA study and comparison of different image segmentation algorithms
A study and comparison of different image segmentation algorithms
Manje Gowda
 
An application of morphological
An application of morphologicalAn application of morphological
An application of morphologicalNaresh Chilamakuri
 

What's hot (18)

International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
40120130406009
4012013040600940120130406009
40120130406009
 
AUTOMATED IMAGE MOSAICING SYSTEM WITH ANALYSIS OVER VARIOUS IMAGE NOISE
AUTOMATED IMAGE MOSAICING SYSTEM WITH ANALYSIS OVER VARIOUS IMAGE NOISEAUTOMATED IMAGE MOSAICING SYSTEM WITH ANALYSIS OVER VARIOUS IMAGE NOISE
AUTOMATED IMAGE MOSAICING SYSTEM WITH ANALYSIS OVER VARIOUS IMAGE NOISE
 
Fuzzy entropy based optimal
Fuzzy entropy based optimalFuzzy entropy based optimal
Fuzzy entropy based optimal
 
Developing 3D Viewing Model from 2D Stereo Pair with its Occlusion Ratio
Developing 3D Viewing Model from 2D Stereo Pair with its Occlusion RatioDeveloping 3D Viewing Model from 2D Stereo Pair with its Occlusion Ratio
Developing 3D Viewing Model from 2D Stereo Pair with its Occlusion Ratio
 
A Fuzzy Set Approach for Edge Detection
A Fuzzy Set Approach for Edge DetectionA Fuzzy Set Approach for Edge Detection
A Fuzzy Set Approach for Edge Detection
 
ADOPTING AND IMPLEMENTATION OF SELF ORGANIZING FEATURE MAP FOR IMAGE FUSION
ADOPTING AND IMPLEMENTATION OF SELF ORGANIZING FEATURE MAP FOR IMAGE FUSIONADOPTING AND IMPLEMENTATION OF SELF ORGANIZING FEATURE MAP FOR IMAGE FUSION
ADOPTING AND IMPLEMENTATION OF SELF ORGANIZING FEATURE MAP FOR IMAGE FUSION
 
An Analysis and Comparison of Quality Index Using Clustering Techniques for S...
An Analysis and Comparison of Quality Index Using Clustering Techniques for S...An Analysis and Comparison of Quality Index Using Clustering Techniques for S...
An Analysis and Comparison of Quality Index Using Clustering Techniques for S...
 
IRJET- Comparison and Simulation based Analysis of an Optimized Block Mat...
IRJET-  	  Comparison and Simulation based Analysis of an Optimized Block Mat...IRJET-  	  Comparison and Simulation based Analysis of an Optimized Block Mat...
IRJET- Comparison and Simulation based Analysis of an Optimized Block Mat...
 
Image segmentation using normalized graph cut
Image segmentation using normalized graph cutImage segmentation using normalized graph cut
Image segmentation using normalized graph cut
 
Invariant Recognition of Rectangular Biscuits with Fuzzy Moment Descriptors, ...
Invariant Recognition of Rectangular Biscuits with Fuzzy Moment Descriptors, ...Invariant Recognition of Rectangular Biscuits with Fuzzy Moment Descriptors, ...
Invariant Recognition of Rectangular Biscuits with Fuzzy Moment Descriptors, ...
 
A review on image enhancement techniques
A review on image enhancement techniquesA review on image enhancement techniques
A review on image enhancement techniques
 
J017426467
J017426467J017426467
J017426467
 
Quality assessment of stereoscopic 3 d image compression by binocular integra...
Quality assessment of stereoscopic 3 d image compression by binocular integra...Quality assessment of stereoscopic 3 d image compression by binocular integra...
Quality assessment of stereoscopic 3 d image compression by binocular integra...
 
Statistical Feature based Blind Classifier for JPEG Image Splice Detection
Statistical Feature based Blind Classifier for JPEG Image Splice DetectionStatistical Feature based Blind Classifier for JPEG Image Splice Detection
Statistical Feature based Blind Classifier for JPEG Image Splice Detection
 
Analysis of Multi-focus Image Fusion Method Based on Laplacian Pyramid
Analysis of Multi-focus Image Fusion Method Based on Laplacian PyramidAnalysis of Multi-focus Image Fusion Method Based on Laplacian Pyramid
Analysis of Multi-focus Image Fusion Method Based on Laplacian Pyramid
 
A study and comparison of different image segmentation algorithms
A study and comparison of different image segmentation algorithmsA study and comparison of different image segmentation algorithms
A study and comparison of different image segmentation algorithms
 
An application of morphological
An application of morphologicalAn application of morphological
An application of morphological
 

Viewers also liked

intervalo (matematica)
intervalo (matematica)intervalo (matematica)
intervalo (matematica)
Victorrolando
 
PPT- Decision + Interpersonal
PPT- Decision + InterpersonalPPT- Decision + Interpersonal
PPT- Decision + InterpersonalSudip Das
 
A Little Photo Essay about Intensification and infill
A Little Photo Essay about Intensification and infillA Little Photo Essay about Intensification and infill
A Little Photo Essay about Intensification and infill
Ian Cooper
 
Advantages of Floor Heating
Advantages of Floor HeatingAdvantages of Floor Heating
Advantages of Floor Heating
ThermoSoft
 
Diffusion of research-based instructional strategies_The case of SCALE-UP
Diffusion of research-based instructional strategies_The case of SCALE-UPDiffusion of research-based instructional strategies_The case of SCALE-UP
Diffusion of research-based instructional strategies_The case of SCALE-UPXaver Neumeyer
 
online education ppt
online education pptonline education ppt
online education pptSudip Das
 
中文三年级 期末作文- final
中文三年级 期末作文- final中文三年级 期末作文- final
中文三年级 期末作文- finalSime Luketa
 

Viewers also liked (8)

intervalo (matematica)
intervalo (matematica)intervalo (matematica)
intervalo (matematica)
 
PPT- Decision + Interpersonal
PPT- Decision + InterpersonalPPT- Decision + Interpersonal
PPT- Decision + Interpersonal
 
A Little Photo Essay about Intensification and infill
A Little Photo Essay about Intensification and infillA Little Photo Essay about Intensification and infill
A Little Photo Essay about Intensification and infill
 
Advantages of Floor Heating
Advantages of Floor HeatingAdvantages of Floor Heating
Advantages of Floor Heating
 
Diffusion of research-based instructional strategies_The case of SCALE-UP
Diffusion of research-based instructional strategies_The case of SCALE-UPDiffusion of research-based instructional strategies_The case of SCALE-UP
Diffusion of research-based instructional strategies_The case of SCALE-UP
 
online education ppt
online education pptonline education ppt
online education ppt
 
中文三年级 期末作文- final
中文三年级 期末作文- final中文三年级 期末作文- final
中文三年级 期末作文- final
 
Journal paper_1
Journal paper_1Journal paper_1
Journal paper_1
 

Similar to PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H

IRJET - Deep Learning Approach to Inpainting and Outpainting System
IRJET -  	  Deep Learning Approach to Inpainting and Outpainting SystemIRJET -  	  Deep Learning Approach to Inpainting and Outpainting System
IRJET - Deep Learning Approach to Inpainting and Outpainting System
IRJET Journal
 
An interactive image segmentation using multiple user input’s
An interactive image segmentation using multiple user input’sAn interactive image segmentation using multiple user input’s
An interactive image segmentation using multiple user input’s
eSAT Publishing House
 
An interactive image segmentation using multiple user inputªs
An interactive image segmentation using multiple user inputªsAn interactive image segmentation using multiple user inputªs
An interactive image segmentation using multiple user inputªs
eSAT Journals
 
Vision based non-invasive tool for facial swelling assessment
Vision based non-invasive tool for facial swelling assessment Vision based non-invasive tool for facial swelling assessment
Vision based non-invasive tool for facial swelling assessment
University of Moratuwa
 
A Review Paper on Real-Time Hand Motion Capture
A Review Paper on Real-Time Hand Motion CaptureA Review Paper on Real-Time Hand Motion Capture
A Review Paper on Real-Time Hand Motion Capture
IRJET Journal
 
Implementation of Object Tracking for Real Time Video
Implementation of Object Tracking for Real Time VideoImplementation of Object Tracking for Real Time Video
Implementation of Object Tracking for Real Time Video
IDES Editor
 
Report
ReportReport
OBJECT DETECTION AND RECOGNITION: A SURVEY
OBJECT DETECTION AND RECOGNITION: A SURVEYOBJECT DETECTION AND RECOGNITION: A SURVEY
OBJECT DETECTION AND RECOGNITION: A SURVEY
Journal For Research
 
557 480-486
557 480-486557 480-486
557 480-486
idescitation
 
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Koteswar Rao Jerripothula
 
Final_draft_Practice_School_II_report
Final_draft_Practice_School_II_reportFinal_draft_Practice_School_II_report
Final_draft_Practice_School_II_reportRishikesh Bagwe
 
Semi-Supervised Method of Multiple Object Segmentation with a Region Labeling...
Semi-Supervised Method of Multiple Object Segmentation with a Region Labeling...Semi-Supervised Method of Multiple Object Segmentation with a Region Labeling...
Semi-Supervised Method of Multiple Object Segmentation with a Region Labeling...
sipij
 
IRJET- Image Feature Extraction using Hough Transformation Principle
IRJET- Image Feature Extraction using Hough Transformation PrincipleIRJET- Image Feature Extraction using Hough Transformation Principle
IRJET- Image Feature Extraction using Hough Transformation Principle
IRJET Journal
 
Ijnsa050207
Ijnsa050207Ijnsa050207
Ijnsa050207
IJNSA Journal
 
Integration of poses to enhance the shape of the object tracking from a singl...
Integration of poses to enhance the shape of the object tracking from a singl...Integration of poses to enhance the shape of the object tracking from a singl...
Integration of poses to enhance the shape of the object tracking from a singl...
eSAT Journals
 
Dj31514517
Dj31514517Dj31514517
Dj31514517IJMER
 
Dj31514517
Dj31514517Dj31514517
Dj31514517IJMER
 

Similar to PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H (20)

IRJET - Deep Learning Approach to Inpainting and Outpainting System
IRJET -  	  Deep Learning Approach to Inpainting and Outpainting SystemIRJET -  	  Deep Learning Approach to Inpainting and Outpainting System
IRJET - Deep Learning Approach to Inpainting and Outpainting System
 
An interactive image segmentation using multiple user input’s
An interactive image segmentation using multiple user input’sAn interactive image segmentation using multiple user input’s
An interactive image segmentation using multiple user input’s
 
An interactive image segmentation using multiple user inputªs
An interactive image segmentation using multiple user inputªsAn interactive image segmentation using multiple user inputªs
An interactive image segmentation using multiple user inputªs
 
2001714
20017142001714
2001714
 
3 video segmentation
3 video segmentation3 video segmentation
3 video segmentation
 
Morpho
MorphoMorpho
Morpho
 
Vision based non-invasive tool for facial swelling assessment
Vision based non-invasive tool for facial swelling assessment Vision based non-invasive tool for facial swelling assessment
Vision based non-invasive tool for facial swelling assessment
 
A Review Paper on Real-Time Hand Motion Capture
A Review Paper on Real-Time Hand Motion CaptureA Review Paper on Real-Time Hand Motion Capture
A Review Paper on Real-Time Hand Motion Capture
 
Implementation of Object Tracking for Real Time Video
Implementation of Object Tracking for Real Time VideoImplementation of Object Tracking for Real Time Video
Implementation of Object Tracking for Real Time Video
 
Report
ReportReport
Report
 
OBJECT DETECTION AND RECOGNITION: A SURVEY
OBJECT DETECTION AND RECOGNITION: A SURVEYOBJECT DETECTION AND RECOGNITION: A SURVEY
OBJECT DETECTION AND RECOGNITION: A SURVEY
 
557 480-486
557 480-486557 480-486
557 480-486
 
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
 
Final_draft_Practice_School_II_report
Final_draft_Practice_School_II_reportFinal_draft_Practice_School_II_report
Final_draft_Practice_School_II_report
 
Semi-Supervised Method of Multiple Object Segmentation with a Region Labeling...
Semi-Supervised Method of Multiple Object Segmentation with a Region Labeling...Semi-Supervised Method of Multiple Object Segmentation with a Region Labeling...
Semi-Supervised Method of Multiple Object Segmentation with a Region Labeling...
 
IRJET- Image Feature Extraction using Hough Transformation Principle
IRJET- Image Feature Extraction using Hough Transformation PrincipleIRJET- Image Feature Extraction using Hough Transformation Principle
IRJET- Image Feature Extraction using Hough Transformation Principle
 
Ijnsa050207
Ijnsa050207Ijnsa050207
Ijnsa050207
 
Integration of poses to enhance the shape of the object tracking from a singl...
Integration of poses to enhance the shape of the object tracking from a singl...Integration of poses to enhance the shape of the object tracking from a singl...
Integration of poses to enhance the shape of the object tracking from a singl...
 
Dj31514517
Dj31514517Dj31514517
Dj31514517
 
Dj31514517
Dj31514517Dj31514517
Dj31514517
 

PS1_2014_2012B5A7521P_2012B5A7848P_2012B4A7958H

  • 1. A REPORT ON Realtime 3D segmentation By Gunjan Kumar Singh 2012B5A7521P Msc. (Hons.) In Physics and B.E. (Hons.) in Computer Science Saurabh Bhardwaj 2012B5A7848P Msc. (Hons.) In Physics and B.E. (Hons.) in Computer Science Divya Sanghi 2012B4A7958H Msc. (Hons.) In Mathematics and B.E. (Hons.) in Computer Science AT CSIR-Central Electronics Engineering Research Institute Pilani-333031 A Practice School-I station of BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI 23rd May - 17th July, 2014
  • 2. A REPORT ON Realtime 3D segmentation By Gunjan Kumar Singh 2012B5A7521P Msc. (Hons.) In Physics and B.E. (Hons.) in Computer Science Saurabh Bhardwaj 2012B5A7848P Msc. (Hons.) In Physics and B.E. (Hons.) in Computer Science Divya Sanghi 2012B4A7958H Msc. (Hons.) In Mathematics and B.E. (Hons.) in Computer Science Prepared in the partial fulfilment of Practice School-I (BITS F221) Under the guidance of DR. JAGDISH RAHEJA PRINCIPAL SCIENTIST, DIGITAL SYSTEMS GROUP AT CSIR-Central Electronics Engineering Research Institute (CEERI) Pilani-333031 A Practice School-I station of BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE, PILANI 23rd May - 17th July, 2014
  • 3. BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE PILANI (RAJASTHAN) Practice School Division Station: CENTRAL ELECTRONICS ENGINEERING RESEARCH INSTITUTE (CEERI) Centre: Pilani Duration: From: 23rd May, 2014 To: 17th July, 2014 Date of Submission: 15th July, 2014 Title of the Project: REALTIME 3D SEGMENTATION Name of the Student ID No. Discipline Gunjan Kumar Singh 2012B5A7521P Msc. (Hons.) In Physics and B.E. (Hons.) in Computer Science Saurabh Bhardwaj 2012B5A7848P Msc. (Hons.) In Physics and B.E. (Hons.) in Computer Science Divya Sanghi 2012B4A7958H Msc. (Hons.) In Mathematics and B.E. (Hons.) in Computer Science Name of the Expert: Dr. Jagdish Raheja Designation: Principal Scientist Name of the PS Faculty: Mr. Parikshit Kishor Singh Key words: Object segmentation, occlusion check, Graph, Adjacency Matrix Project Area: 3D Image Processing Abstract: A real-time algorithm that segments unstructured and highly cluttered scenes is discussed in this paper. The algorithm robustly separates objects of unknown shape in congested scenes of stacked and occluded objects. The model-free approach finds smooth surface patches, using a depth image from a Kinect camera, which are subsequently combined to form highly probable object hypotheses. Co-planarity and curvature matching is used to recombine surfaces separated by occlusion. The real- time capabilities are proven and the quality of the algorithm is evaluated on a benchmark database. Advantages compared to existing approaches as well as weaknesses are discussed. Date 15th July 2014
  • 4. Table of Contents Topic Page No. Acknowledgement..................... .................................................. 1 Introduction............................... .................................................. 2 Pre Segmentation....................... .................................................. 3  Median filter.................... .................................................. 3  Determining surface normal and temporal smoothing ...... 3  Detection of surface normal edges .................................... 4  Segmentation into surface patches..................................... 4  Connected component analysis algorithm......................... 5 High Segmentation.................... .................................................. 9  Adjacency matrix and assignment of edge points ............. 9  Cutfree neighbors............ .................................................. 9  Improving adjacency matrix.............................................. 10  Co-Planarity.................... .................................................. 10  Curvature matching......... .................................................. 11  Probabilistic Object Composition (Graph Cut) ................. 12  Remaining edge points.... .................................................. 13 Code Explanation ..................... .................................................. 14  Structures ........................ .................................................. 14  Functions......................... .................................................. 14  Thinning Algorithm ........ .................................................. 16  Main function.................. .................................................. 17 Results ....................................... .................................................. 20 Conclusion................................. .................................................. 22 References ................................. .................................................. 23
Acknowledgement

Firstly, we are very grateful to the Practice School Division (PSD), BITS Pilani, for providing us with the opportunity to pursue our Practice School-1 (PS-1) under the guidance of eminent scientists at the Central Electronics Engineering Research Institute (CEERI), Pilani. We would like to express our sincere thanks to Dr. Chandrashekhar, Director, CEERI, Pilani, for giving us the opportunity to carry out a project in this esteemed organization. We would also like to thank Dr. J.L. Raheja, our Project Guide, for suggesting the project and providing valuable guidance and support throughout our work. We extend our gratitude to Ms. Zeba and all others who were directly or indirectly involved in this project. We are grateful to Mr. Vinod Verma for managing our daily attendance and supporting us throughout our tenure at CEERI. We would also like to thank our PS-1 instructor, Mr. Parikshit Kishor Singh, for being a constant source of guidance and motivation.
Introduction

In computer vision, image segmentation is the process of partitioning a digital image into multiple segments (sets of pixels, also known as super-pixels). The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze.

In the present work, the model-free, real-time-capable segmentation approach presented in the previous work of the authors is extended to a general probabilistic framework which considers multimodal cues in a uniform manner. The algorithm combines two segmentation methods: the identification of smooth object surfaces and the composition of these surfaces into sensible object hypotheses. In this work, region growing is replaced by connected component analysis, and motion-sensitive temporal smoothing is implemented to avoid the motion blur effect. While the high-level segmentation of the earlier work extracted support planes and decomposed the remaining blobs using binary space partitioning, the second contribution introduced the idea of composing cut-free neighboring surfaces. In the current work, a graph cut is applied to a probabilistically weighted similarity graph, considering adjacency, curvature and co-planarity of the found surface patches, to enable the method to handle occluded and open curved objects. Additionally, the algorithms are further optimized for real-time challenges.

The main advantage of the method, in contrast to existing ones, is the capability to segment unknown, stacked, nearby, and occluded objects in a model-free manner. Naturally, this approach has its limitations compared to model-based approaches, especially if very complex object heaps are to be considered. However, it provides a meaningful initial object hypothesis in arbitrary situations, which can be refined by active exploration or used as input to model-based adaptive methods. The probabilistic nature of the method allows these refinement methods to be focused on selectively disambiguating uncertain object hypotheses. The algorithm operates in real time, facilitating interactive usage in human-robot cooperation tasks.
Pre-Segmentation

The objective of the first processing step is to segment the depth image into regions of (smoothly curved) surfaces, continuously enclosed by sharp object edges. We deal with depth images as they possess low noise levels. Additionally, the raw depth image is transformed into a 3D point cloud, which is represented w.r.t. a robot-defined coordinate frame.

a) Median Filter: The median filter is the first step in implementing this work. It is used to remove noise from the image. In this method we construct a mask of N × N dimension, where N is an odd number; generally a 3 × 3 mask is preferred. The mask consists of all 8 neighbors of the concerned pixel, including the pixel itself. All the pixel values in the mask are sorted, and the concerned pixel is replaced with the median value of the mask. The median filter is preferred over the box filter because it replaces the pixel value with the value of one of its neighbors, whereas the box filter may produce a value that occurs nowhere in the entire image.

b) Determination of Surface Normals and Temporal Smoothing: As a basis for computing "surface normal edges", surface normals for every image point are determined from the plane spanned by three points in the 3 × 3 neighborhood of the considered central image point, using the cross product. The determination of surface normals is performed directly on the raw depth image, instead of the 3D point cloud: the 2D image coordinates are augmented by the depth value to yield valid three-dimensional vectors. This procedure yields much more distinct changes of the normal direction at the boundary of objects, because the smoothing effect due to 3D projection is avoided. In order to reduce sensor noise and to obtain smooth and stable surface normal estimations, a three-stage smoothing procedure is applied. First, the 3 × 3 median filter has already been applied to the raw depth image. Second, a motion-sensitive temporal smoothing is used, averaging the depth values of each individual image pixel over the last n = 6 frames, if the difference of the depth values is smaller than d = 10. The normal at a pixel is calculated from the coordinates of the concerned pixel and any two of its neighbor pixels, by taking the cross product of the two spanned vectors. Finally, the calculated normals are smoothed by applying a convolution with a 5 × 5 Gaussian kernel.
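To make steps (a) and (b) concrete, the following is a minimal sketch of the per-pixel normal computation, assuming the depth image is available as an OpenCV CV_32F matrix; the function name and conventions are illustrative assumptions, not the report's actual code.

```cpp
#include <opencv2/opencv.hpp>

// Sketch: estimate a unit surface normal at every pixel of the raw depth
// image by crossing the vectors towards the right and bottom neighbours,
// working directly in (u, v, depth) coordinates as described above.
cv::Mat computeNormals(const cv::Mat& depth)          // depth: CV_32F
{
    cv::Mat normals(depth.size(), CV_32FC3, cv::Scalar::all(0));
    for (int y = 0; y < depth.rows - 1; ++y)
        for (int x = 0; x < depth.cols - 1; ++x) {
            float d = depth.at<float>(y, x);
            // Vectors from the centre pixel to its right and bottom neighbours.
            cv::Vec3f du(1.0f, 0.0f, depth.at<float>(y, x + 1) - d);
            cv::Vec3f dv(0.0f, 1.0f, depth.at<float>(y + 1, x) - d);
            cv::Vec3f n = du.cross(dv);               // normal of the spanned plane
            double len = cv::norm(n);
            if (len > 1e-6) normals.at<cv::Vec3f>(y, x) = n / len;
        }
    // Third smoothing stage: 5 x 5 Gaussian convolution of the normal field
    // (re-normalising afterwards would restore unit length if required).
    cv::GaussianBlur(normals, normals, cv::Size(5, 5), 0);
    return normals;
}
```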
c) Detection of Surface Normal Edges: The next step is the fast detection of surface normal edges, which is based on the computation of the scalar product of adjacent surface normals. To obtain clear, uninterrupted edges, suitable for the subsequent application of a region growing algorithm, we look for edges in all eight directions defined by the neighboring pixels of a point, i.e. north (N), east (E), south (S), west (W), as well as NE, SE, SW and NW. The final result of the edge filter is obtained by averaging the results of all eight scalar products. While large values, close to one, correspond to flat surfaces, smaller values indicate increasingly sharp object edges. Finally, binarizing the obtained edge image with a threshold value θmax = 0.85 (corresponding to an angle of 31.8°), we can easily separate edges from smoothly curved surfaces. Object edges are clearly visible as bold lines in the figure below: on the left the actual depth map is shown, while the right picture shows the detected edges, in which smooth and large surfaces form homogeneous white regions. Some false edges may still be detected due to noise; however, those regions are small and disjoint and can thus easily be filtered out in subsequent processing steps.

d) Segmentation into Surface Patches: Finally, a fast connected component analysis algorithm is applied. The fast surface patch segmentation based on normal edges already provides a detailed segmentation of the scene into surface patches, which is then employed in the subsequent object segmentation step. The algorithm consists of two passes, which are explained with the help of the flow charts below, after the short code sketch.
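Before turning to the flow charts, here is a hedged sketch of the edge filter from part (c): each normal is compared with its eight neighbours via scalar products, the eight results are averaged, and the image is binarised with θmax = 0.85. Names and image conventions are assumptions for illustration.

```cpp
#include <opencv2/opencv.hpp>

// Sketch: mark a pixel as a surface-normal edge (black) when the average
// scalar product with its eight neighbours' normals falls below thetaMax.
cv::Mat detectNormalEdges(const cv::Mat& normals, float thetaMax = 0.85f)
{
    cv::Mat edges(normals.size(), CV_8U, cv::Scalar(255));   // white = smooth surface
    const int dx[8] = {-1, 0, 1, -1, 1, -1, 0, 1};
    const int dy[8] = {-1, -1, -1, 0, 0, 1, 1, 1};           // the eight directions
    for (int y = 1; y < normals.rows - 1; ++y)
        for (int x = 1; x < normals.cols - 1; ++x) {
            const cv::Vec3f& n = normals.at<cv::Vec3f>(y, x);
            float sum = 0.0f;
            for (int k = 0; k < 8; ++k)
                sum += n.dot(normals.at<cv::Vec3f>(y + dy[k], x + dx[k]));
            if (sum / 8.0f < thetaMax)
                edges.at<uchar>(y, x) = 0;                   // black = edge point
        }
    return edges;
}
```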
1st Iteration:
- Scan the image pixel by pixel.
- If the pixel is an edge or background point, skip it; otherwise check its top and left neighbors.
- If both neighbors are boundary or background points, assign a new label.
- If exactly one neighbor is a boundary or background point, assign the label of the other neighbor.
- If neither neighbor is a boundary or background point, assign the label of the neighbor with the smaller value, and record the larger label as a child of the smaller label.
2nd Iteration:
- Scan the image pixel by pixel.
- If the pixel is labelled, get the label's parent.
- Assign the parent's label in place of the child's.

Both iterations are also explained below with the help of a sample image. Suppose we have an image as shown to the right, in which the black pixels represent the boundary. First we visit the first pixel and, after making sure that it is not a boundary pixel, assign the first label to it. Moving to the next pixel, we check its left neighbor (it has no top neighbor); since that neighbor is not a boundary pixel, we give the pixel the same label. After that we encounter a boundary, and beyond the boundary a new label has to be generated; it is assigned to the pixel as shown in the image below (left). Proceeding similarly, we assign labels to all the non-boundary pixels in the first row.
Similarly, we mark the pixels in the second row, comparing each pixel with its top and left neighbors. In the third row we encounter a pixel (just below label 2) whose top neighbor has label 2 while the left neighbor has label 1, so the lower value is assigned to it (upper right figure). We proceed in the same way and, at the end of the first iteration, obtain an image like the one shown below.
The 2nd iteration is all about merging the different labels that make up the same surface: in the image shown above, labels 1 and 2 represent the same plane, so they must be represented by a single label. In this iteration we merge such labels into the minimum of the two values. We first find a pixel whose top or left neighbor has a different label than the current pixel; then we replace all the pixels of that region with the minimum of the two labels. The lower left image shows the result after merging the pixels with labels 1 and 2, while the lower right image shows the resulting image after the merging is complete.
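The two passes just described can be condensed into the following sketch, in which a small union-find structure plays the role of the parent/child label list; the code is an illustrative reconstruction, not the report's implementation.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>
#include <numeric>
#include <algorithm>

// Union-find over label values: merging always keeps the smaller label,
// mirroring the "larger label becomes child of the smaller" rule above.
struct UnionFind {
    std::vector<int> parent;
    explicit UnionFind(int n) : parent(n) { std::iota(parent.begin(), parent.end(), 0); }
    int find(int a) { return parent[a] == a ? a : parent[a] = find(parent[a]); }
    void merge(int a, int b) {
        a = find(a); b = find(b);
        if (a != b) parent[std::max(a, b)] = std::min(a, b);
    }
};

// Two-pass connected component labelling on the binarised edge image
// (0 = boundary point, anything else = surface point).
cv::Mat labelSurfaces(const cv::Mat& edges)
{
    cv::Mat labels(edges.size(), CV_32S, cv::Scalar(0));     // 0 = boundary/background
    UnionFind uf(edges.rows * edges.cols + 1);
    int next = 1;
    for (int y = 0; y < edges.rows; ++y)                     // first pass
        for (int x = 0; x < edges.cols; ++x) {
            if (edges.at<uchar>(y, x) == 0) continue;        // boundary point
            int top  = (y > 0) ? labels.at<int>(y - 1, x) : 0;
            int left = (x > 0) ? labels.at<int>(y, x - 1) : 0;
            if (!top && !left)      labels.at<int>(y, x) = next++;          // new label
            else if (!top || !left) labels.at<int>(y, x) = std::max(top, left);
            else {                                           // both labelled: keep smaller
                labels.at<int>(y, x) = std::min(top, left);
                uf.merge(top, left);                         // remember equivalence
            }
        }
    for (int y = 0; y < edges.rows; ++y)                     // second pass
        for (int x = 0; x < edges.cols; ++x)
            if (int l = labels.at<int>(y, x)) labels.at<int>(y, x) = uf.find(l);
    return labels;
}
```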
High-Level Object Segmentation

In the second processing block, the aim is segmentation on an object level, which means that the previously found surface patches need to be combined to form proper object regions. A weighted graph is created, modeling the probability that two surface patches belong to the same object region. Subsequently, this graph structure is analyzed to find the most probable segmentation into object regions using a graph cut algorithm. Co-planarity and curvature cues are also employed to combine object patches that are separated due to occlusion.

a) Adjacency Matrix and Assignment of Edge Points: An initial adjacency matrix representing the basic connectivity of surface patches is determined as follows. For every edge point pr, all neighboring surface points pi within a radius r in image space are considered which have a Euclidean distance ||pr − pi|| smaller than a threshold dmax. All possible surface pairs obtained from this list are marked as adjacent. For example, in the given figure, surfaces 2 and 10 are neighbors in image space but not in 3D space and are therefore not considered adjacent; surfaces 7, 9 and 11 fulfill the conditions and become connected in the graph. A code sketch of this step follows after part (b).

b) Cut-free Neighbors: To further improve the adjacency matrix, a plausible heuristic check is applied. The central idea of this heuristic is that visible surface patches are part of an object's outer hull, such that points belonging to the same object should lie on one and the same side of the associated plane. If we conversely find enough points on both sides of the plane, we assume two (or more) separated objects and split the blob into two blobs for further processing. One surface cuts another if a considerable number of points lie on both sides of the plane fitted to the former surface. For illustration, consider surfaces 4 and 12 in the figure above: while all points of surface 4 are on top of the supporting surface 12, the plane fitted to surface 4 cuts surface 12; hence this surface combination is disregarded. On the other hand, surfaces 7, 9 and 11 are all pairwise cut-free.
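As an illustration of part (a), a possible sketch of the adjacency computation is given below; the radius r, the threshold dmax and the data layout (a per-pixel point cloud in a CV_32FC3 matrix, label 0 marking edge points) are assumptions for illustration.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Sketch: for every edge point, collect labels of nearby surface points
// whose 3D distance stays below dmax, and mark all pairs as adjacent.
cv::Mat buildAdjacency(const cv::Mat& labels, const cv::Mat& cloud,
                       int numLabels, int r = 2, float dmax = 0.02f)
{
    cv::Mat adj = cv::Mat::zeros(numLabels + 1, numLabels + 1, CV_8U);
    for (int y = r; y < labels.rows - r; ++y)
        for (int x = r; x < labels.cols - r; ++x) {
            if (labels.at<int>(y, x) != 0) continue;         // not an edge point
            const cv::Vec3f& p = cloud.at<cv::Vec3f>(y, x);
            std::vector<int> near;
            for (int v = -r; v <= r; ++v)                    // neighbours in image space
                for (int u = -r; u <= r; ++u) {
                    int l = labels.at<int>(y + v, x + u);
                    if (l && cv::norm(p - cloud.at<cv::Vec3f>(y + v, x + u)) < dmax)
                        near.push_back(l);                   // close in 3D space too
                }
            for (size_t i = 0; i < near.size(); ++i)         // mark all pairs adjacent
                for (size_t j = i + 1; j < near.size(); ++j)
                    adj.at<uchar>(near[i], near[j]) = adj.at<uchar>(near[j], near[i]) = 1;
        }
    return adj;
}
```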
c) Improving the Adjacency Matrix: In the case of occlusion, a single face of an object is separated into two parts which will not have a link in the adjacency matrix. To overcome this limitation, further links are added to the matrix based on additional cues, namely co-planarity of flat faces and similar curvature of curved surfaces.

d) Co-Planarity: To check two flat surfaces for co-planarity we proceed in two steps, as sketched in the code after this list:
 First, we check whether both surfaces have similar mean normals (up to a small noise margin). Because the normal of a spanned plane may depend crucially on the actual selection of points, this criterion is checked for a set of 50 randomly selected triples of points; if any of the calculated normals deviates too much, the two surfaces are not considered coplanar.
 Second, we check whether the faces are aligned, i.e. indeed span a common plane. In this case, any plane spanned by three points drawn from both surfaces should have a normal similar to the two original mean normals. Subsequently, the occlusion check described above is carried out along several lines connecting randomly selected points from the two surfaces.
If this check is passed as well, a corresponding link is added to the connectivity matrix for the given pair of surfaces. The figure shows the resulting graphs before and after the co-planarity extension: while the first graph results in four final objects, the second graph correctly results in three objects.
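The following sketch illustrates the two-step test; the angular tolerance, the number of samples and the representation of each surface as a point list with a precomputed mean normal are assumptions, and the final occlusion check along connecting lines is omitted.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>
#include <random>
#include <cmath>

// Sketch of the co-planarity test: similar mean normals first, then
// mixed point triples from both surfaces must span a similar plane.
bool coplanar(const std::vector<cv::Vec3f>& A, const std::vector<cv::Vec3f>& B,
              const cv::Vec3f& meanA, const cv::Vec3f& meanB, float cosTol = 0.95f)
{
    if (std::abs(meanA.dot(meanB)) < cosTol) return false;   // step 1: mean normals agree
    std::mt19937 rng(42);
    std::uniform_int_distribution<size_t> ia(0, A.size() - 1), ib(0, B.size() - 1);
    for (int k = 0; k < 50; ++k) {                           // step 2: 50 random triples
        cv::Vec3f a1 = A[ia(rng)], a2 = A[ia(rng)], b = B[ib(rng)];
        cv::Vec3f n = (a2 - a1).cross(b - a1);               // plane through mixed triple
        double len = cv::norm(n);
        if (len < 1e-6) continue;                            // degenerate triple, skip
        n = n / len;
        if (std::abs(n.dot(meanA)) < cosTol) return false;   // deviates too much
    }
    return true;                                             // candidate for a new link
}
```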
e) Curvature Matching: In order to handle curved surface patches in a similar fashion, their curvatures are compared. To this end, a curvature histogram is computed for every curved surface, representing the distribution of surface normals within the surface. The 2D histogram of 11 × 11 bins describes the relative frequency of observing surface normals with given x and y components; the associated z component is determined by the fact that normals are normalized to magnitude one. The distance of two histograms A and B is estimated by the mutual overlap of their distributions:

D(A,B) = Σij |aij − bij|

Exploiting the normalisation of the histograms and the identity min(a,b) = ½ ((a + b) − |a − b|), we compute the similarity index

S(A,B) = 1 − ½ D(A,B) = Σij min(aij, bij)

This is a score between 0 and 1; surface pairs with a score larger than h = 0.5 are considered for recombination. We differentiate between open curved objects and occluded curved objects. To recombine the inner and outer surfaces of an open object (like a cup or bowl), two conditions must be fulfilled: (1) both surfaces are neighboring in image space, and (2) the surfaces are concave and convex respectively. The first condition reflects the fact that the calculation of the initial adjacency matrix is restricted to surfaces neighboring in Euclidean space. To assess the convexity or concavity of a surface, we again consider the curvature histogram, namely the two extremal bins hmin and hmax along the major axis of the histogram blob. Back-projecting these bins into image space yields the point sets Pmin and Pmax whose normals are mapped onto the corresponding bins, and from these the mean image coordinates pmin and pmax are computed. Convexity is then assessed by the scalar product (pmax − pmin) · (hmax − hmin) between the directional vectors formed by the extremal points in image versus histogram coordinates. If this value is positive, i.e. both vectors point in a similar direction, the surface is convex; otherwise it is concave. This is also shown in the figure to the right.
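The similarity index defined above reduces to a bin-wise minimum over two normalised 11 × 11 histograms; a minimal sketch, assuming the histograms are stored as CV_32F matrices:

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>

// Sketch: S(A,B) = sum_ij min(a_ij, b_ij) = 1 - D(A,B)/2 for normalised
// histograms; pairs with a score above 0.5 are recombination candidates.
float histogramSimilarity(const cv::Mat& A, const cv::Mat& B)   // CV_32F, entries sum to 1
{
    float s = 0.0f;
    for (int i = 0; i < A.rows; ++i)
        for (int j = 0; j < A.cols; ++j)
            s += std::min(A.at<float>(i, j), B.at<float>(i, j));
    return s;                                                   // score in [0, 1]
}
```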
The picture below shows the actual image, the depth map and the histograms of the objects in the image. The two left histograms belong to the lying cylinder (one for each occluded part), the third belongs to the vertical bottle, and the last one to the ball.

f) Probabilistic Object Composition (Graph Cut): The result of the previous steps is an adjacency matrix representing a graph with edges for all possible surface combinations arising from cut-free neighborhood, co-planarity and curvature matching. This graph is turned into a weighted graph, such that edge weights represent the strength of connectivity between two connected nodes. To determine the connectivity weights, initially a common weight wij = 1/n is assigned to all edges (i, j) originating from node i, where n denotes the number of nodes adjacent to node i. This results in a directed graph, where all outgoing edges of a node have the same weight and thus the same probability for composition with this node. To create an undirected graph, we average the weights of incoming and outgoing edges:

Wsym = ½ (Win + Wout)

The higher the connectivity of two nodes, the higher their connecting weight. Exploiting the weighted graph, we set a threshold θc = 0.5 and then apply a graph cut algorithm. Starting with individual nodes, the algorithm calculates all connected sub-graphs in ascending size and their corresponding cuts. A cut is the set of all edges leaving the sub-graph, and the associated cost is the sum of the
corresponding edge weights. If the cost is smaller than θc, a cut is found and the sub-graph is extracted as a single object. If the sub-graph exceeds n/2 in size, the algorithm aborts, because all potential cuts have then been considered. The figure on the previous page shows a cut of an edge with weight 0.29; this cut was made because its cost was less than 0.5, consistent with our threshold value. The threshold balances between under- and over-segmentation: a very small value, close to zero, generates a single segment for every initially connected sub-graph, while a very high value generates an individual segment for every surface node. The resulting sub-graphs represent the different objects.

g) Remaining Edge Points: In the final processing step, all remaining edge points have to be processed to obtain the final segmentation result. First, the remaining points are segmented using a region growing algorithm working in the image plane, with the Euclidean distance as the criterion of uniformity. These segments are then processed according to the following rules:
 If a segment has no neighboring faces (caused by missing depth information), it becomes a separate object.
 If a segment has one neighboring face and comprises only very few points, they are assigned to this neighbor.
 If a segment is completely enclosed by a single neighboring face, it becomes a new object. If it is not completely enclosed, all points are assigned to the single neighboring region.
 If a segment has more than one neighbor and all neighbors are part of a common object, it is assigned to this object.
 If a segment has more than one neighbor corresponding to different objects, all points are assigned to the best matching neighboring plane using RANSAC.
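Returning to the weighted graph of part (f), the symmetric weight matrix Wsym = ½ (Win + Wout) can be sketched as follows, taking the 0/1 adjacency matrix from the previous steps as input:

```cpp
#include <opencv2/opencv.hpp>

// Sketch: every node spreads weight 1/n over its n outgoing edges; the
// directed weights are then symmetrised by averaging W with its transpose.
cv::Mat symmetricWeights(const cv::Mat& adj)          // adj: CV_8U, entries 0/1
{
    int n = adj.rows;
    cv::Mat w = cv::Mat::zeros(n, n, CV_32F);
    for (int i = 0; i < n; ++i) {
        int deg = cv::countNonZero(adj.row(i));       // number of adjacent nodes
        if (deg == 0) continue;
        for (int j = 0; j < n; ++j)
            if (adj.at<uchar>(i, j)) w.at<float>(i, j) = 1.0f / deg;
    }
    cv::Mat wsym = 0.5f * (w + w.t());                // Wsym = (Win + Wout) / 2
    return wsym;                                      // cut costs are compared to thetaC = 0.5
}
```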
Code Explanation

Structures:
 vctr: Holds the normalized surface normal calculated at each pixel. It contains the x, y and z components as floats.
 coordinate: A structure to hold the coordinates of different pixels.
 list: It has the fields label and node_no, and a pointer next. label holds the assigned label value, node_no counts the number of nodes and so acts as a counter, and next is a pointer to the next node.
 arr: It has two fields, value and no; value contains an integer label, while no acts as a counter.

Functions:
 vec: Takes as input a pointer to an array of 3 coordinates and calculates the differences between them (to obtain vectors along the surface). It stores the differences of the x, y and z components in an array and calls the CrossProduct function. It returns the surface normal as a vctr.
 CrossProduct: Takes as input 2 arrays containing the x, y and z components of 2 vectors and calculates their cross product to return the surface normal.
 edge: Takes as input pointers to 2 vctr values and calculates the dot product between them. This is done to check the angle between two surface normals.
 create: Creates a list of all the labels that are assigned. It has 2 pointers, head and temp; head points to the first element of the list while temp points to the most recently added node. It takes as input double pointers to the head and temp pointers, a pointer to an integer label, and an integer n. When the list is empty, a new node is created; the values pointed to by label and n are assigned to label and node_no respectively, and both head and temp point at this newly created node.
When the list is non-empty, the same happens except that head keeps pointing at the first node and temp points at the last node. temp is maintained so that the list need not be traversed every time a new element is added. The function returns a pointer to temp.
 replace: Takes as input 2 integer pointers and the list created using the create function. It compares the two values the pointers point at and replaces the bigger value by the smaller value in the list's label field.
 dist: Calculates the distance between 2 coordinate points.
 del: Takes both the head and temp pointers and deletes the list created for storing labels, node by node. Since the list was created dynamically, it must be freed by the programmer at the end of the program; the compiler does not do this automatically.
 add: Takes a pointer to an array, an integer value (the size of that array) and a pointer to an integer (label). It adds the label to the array if it is not already present, to keep a count of the number of different labels in the image.
 mod: Takes two integer values and one integer pointer. It calculates the absolute difference between the two integer values and stores it at the location pointed to by the integer pointer.
 thinning: Explained in a subsequent section.
 thinningIteration: Explained in a subsequent section.
 Insertionsort: Used for sorting, to find the median of the values in the mask of the median filter.
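Based on the descriptions above, the structures might be declared as follows; the field names follow the report's text, while the exact types are assumptions.

```cpp
// Sketch of the structures described above (types are illustrative).
typedef struct vctr {            /* normalised surface normal at a pixel */
    float x, y, z;
} vctr;

typedef struct coordinate {      /* pixel coordinates */
    int x, y;
} coordinate;

typedef struct list {            /* linked list of assigned labels */
    int label;                   /* the assigned label value      */
    int node_no;                 /* running node counter          */
    struct list *next;           /* pointer to the next node      */
} list;

typedef struct arr {             /* label bookkeeping entry */
    int value;                   /* an integer label        */
    int no;                      /* occurrence counter      */
} arr;
```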
Thinning Algorithm

The method for extracting the skeleton of a picture consists of removing all the contour points of the picture except those points that belong to the skeleton. In order to preserve the connectivity of the skeleton, each iteration is divided into two sub-iterations. In the first sub-iteration, the contour point P1 is deleted from the digital pattern if it satisfies the following conditions:
(a) 2 ≤ B(P1) ≤ 6
(b) A(P1) = 1
(c) P2 * P4 * P6 = 0
(d) P4 * P6 * P8 = 0
where A(P1) is the number of 01 patterns in the ordered set P2, P3, P4, ..., P8, P9 of the eight neighbors of P1, and B(P1) is the number of nonzero neighbors of P1, that is, B(P1) = P2 + P3 + ... + P9. If any condition is not satisfied, e.g. A(P1) = 2, then P1 is not deleted from the picture. In the second sub-iteration, only conditions (c) and (d) are changed as follows:
(c') P2 * P4 * P8 = 0
(d') P2 * P6 * P8 = 0
and the rest remain the same. By conditions (c) and (d) of the first sub-iteration, only the south-east boundary points and the north-west corner points which do not belong to an ideal skeleton are removed. By condition (a), the endpoints of a skeleton line are preserved; condition (b) prevents the deletion of those points that lie between the endpoints of a skeleton line. The iterations continue until no more points can be removed. Consider, for example, the following pattern, which resembles a one-pixel-thick boundary of a binary image:

0 0 0 0 0 0 0 0 0 0
0 1 1 1 1 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0

For an endpoint of the line of ones, B(P1) = 1, so condition (a) fails and the point is not deleted. For an interior point of the line, A(P1) = 2, so condition (b) fails and it is not deleted either. In this way the single-pixel boundary is preserved.
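The two sub-iterations translate directly into code. Below is a compact OpenCV sketch of this thinning scheme, operating on a 0/1 binary image; it mirrors the structure of the thinning and thinningIteration functions described earlier, though the report's exact code may differ.

```cpp
#include <opencv2/opencv.hpp>

// One thinning sub-iteration on a 0/1 binary image (CV_8U).
// step == 0 uses conditions (c),(d); step == 1 uses (c'),(d').
static void thinningIteration(cv::Mat& img, int step)
{
    cv::Mat marker = cv::Mat::zeros(img.size(), CV_8U);
    for (int y = 1; y < img.rows - 1; ++y)
        for (int x = 1; x < img.cols - 1; ++x) {
            uchar p2 = img.at<uchar>(y-1, x),   p3 = img.at<uchar>(y-1, x+1);
            uchar p4 = img.at<uchar>(y,   x+1), p5 = img.at<uchar>(y+1, x+1);
            uchar p6 = img.at<uchar>(y+1, x),   p7 = img.at<uchar>(y+1, x-1);
            uchar p8 = img.at<uchar>(y,   x-1), p9 = img.at<uchar>(y-1, x-1);
            int A = (p2 == 0 && p3 == 1) + (p3 == 0 && p4 == 1) +   // number of
                    (p4 == 0 && p5 == 1) + (p5 == 0 && p6 == 1) +   // 01 patterns
                    (p6 == 0 && p7 == 1) + (p7 == 0 && p8 == 1) +
                    (p8 == 0 && p9 == 1) + (p9 == 0 && p2 == 1);
            int B = p2 + p3 + p4 + p5 + p6 + p7 + p8 + p9;          // nonzero neighbours
            int c = (step == 0) ? p2 * p4 * p6 : p2 * p4 * p8;      // (c) vs (c')
            int d = (step == 0) ? p4 * p6 * p8 : p2 * p6 * p8;      // (d) vs (d')
            if (img.at<uchar>(y, x) == 1 &&
                B >= 2 && B <= 6 && A == 1 && c == 0 && d == 0)
                marker.at<uchar>(y, x) = 1;                         // mark for deletion
        }
    img -= marker;                      // delete all marked contour points at once
}

// Repeat both sub-iterations until the image stops changing.
void thinning(cv::Mat& img)
{
    cv::Mat prev;
    do {
        img.copyTo(prev);
        thinningIteration(img, 0);
        thinningIteration(img, 1);
    } while (cv::countNonZero(img != prev) > 0);
}
```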
Main Function

In the main function we first load the source image and apply the median filter to it. An integer array, window, is created and filled with the 9 values of the 3 × 3 kernel centered at each pixel. This array is sorted using insertion sort, and the value at the center of the kernel is then replaced by the fifth value in sorted order. Next we loop through the image. At each pixel, we store, in an array of coordinate structures, the coordinates of the point and of its right and bottom neighbors. This array is passed to the function vec, which returns the normal; the normal is stored in an array of vctr. After looping through the entire image we thus have, in a 2D array of vctr, the unit normal at each point. We then take the dot product of each normal with those of its 8 neighbors, one by one, using the edge function, and store the values in an array of floats, from which the average of the 8 dot products is computed. If the average value is smaller than the threshold (a smaller dot product means a larger angle), we mark the point as a boundary point (black) and the remaining points as white, or the other way round (a matter of convention). Hence, after looping through the entire image, we have the edges in the image. As the edges are quite thick, the thinning algorithm described above is applied to reduce them to a thickness of one pixel.

After all this we obtain a binary image. Our next step is to identify the set of pixels in a connected region by a single label. For this we define an array of pointers to integer values, of exactly the same size as the original image. The algorithm for assigning labels to the non-boundary pixels has already been discussed; to label each pixel, we follow the connected component labelling scheme explained in the first part. It proceeds as follows:
 If the pixel is a boundary or background point, continue the loop without doing anything, storing in the patch matrix a pointer to an integer whose value is 255.
 Otherwise, at x = y = 0, assign a new label and create a node that stores this label value. A pointer to this value is also stored in the patch matrix.
 In the first row or column, only the left or top neighbor is available, respectively. In that case, if the available neighbor is a boundary point, increment
the value of label, create a new node and assign it, and store the pointer in the patch matrix.
 If either the top or left neighbor of the concerned pixel is a boundary point, assign the label of the other neighbor and store a pointer to it in the patch matrix.
 If both neighbors are boundary points, increment the value of label and assign it to the pixel.
 If both neighbors are non-boundary points, assign the smaller of the two labels and record that the bigger value is a child of the smaller value.
So in the main function, as soon as we assign a new label we call the function create, which adds a node to the current linked list if a list already exists and creates the first node if no list exists at the time of assignment. Then we make the corresponding pointers point to the label stored in that list. As more labels are assigned the list grows, and when the last pixel has been processed we have a complete list of all the assigned labels.

For merging the labels, so that regions which are not completely disjoint carry the same label, we follow this approach (in the patch matrix):
 If the pixel is in the first row or column, has both its top and left neighbors as boundary points, or is itself a boundary point, continue the loop. In the first row or column the pixel will have either a new label or the same label as its single available neighbor; in either case it is not directly involved in a change of values or a call to replace, only as a neighbor.
 Otherwise, if the top or left neighbor is a boundary point, and the value at the pixel is not equal to the value of the other neighbor, use the replace function and replace the pointer of the larger value by the smaller value. In this way those components are merged.
 Otherwise, if neither neighbor is a boundary point, then by the rule of assignment the pixel's label must equal the value of either the top or the left neighbor. If it equals the left neighbor, call the replace function on the top neighbor and the current pixel. In this way those regions are joined.
By now we have identified the different surfaces in a scene. Since objects are made up of surfaces, we next need to combine the surfaces that belong to the same object. For this we visit each boundary pixel in the patch matrix, consider all the neighbors at a distance of one unit from the concerned pixel, and proceed as follows:
 If this neighborhood contains more than two boundary pixels, we continue the loop and skip the pixel, as it lies at the boundary of more than two surfaces; all these surfaces will be merged in subsequent steps.
 If the neighborhood contains at most two boundary pixels, then for each non-boundary neighbor we call the mod function, which returns the absolute difference between the depth values of that neighbor and the concerned boundary pixel. The difference values of all non-boundary neighbors are summed and averaged.
 If the average is less than a threshold (20 in our case), the minimum label among all the neighbors is assigned to all the non-boundary pixels.
 If the average is more than the specified threshold, the program considers them surfaces belonging to different objects and leaves them as they are.
After this step, surfaces carrying the same label form a single object.

Now we come to the 3D projection part. A left and a right stereo image are taken, and both are split into their RGB channels using the split function. Another array is then created in which the red component from the left image and the blue and green components from the right image are merged. This creates an anaglyph which, when viewed through red-cyan glasses, creates the perception of depth.
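A minimal sketch of this anaglyph step using OpenCV's split and merge, assuming 8-bit BGR stereo images (the channel order follows OpenCV's convention):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Sketch: red channel from the left image, green and blue channels from
// the right image, merged into a red-cyan anaglyph.
cv::Mat makeAnaglyph(const cv::Mat& left, const cv::Mat& right)   // CV_8UC3, BGR
{
    std::vector<cv::Mat> l, r, out(3);
    cv::split(left, l);          // l[0] = B, l[1] = G, l[2] = R
    cv::split(right, r);
    out[0] = r[0];               // blue  from the right image
    out[1] = r[1];               // green from the right image
    out[2] = l[2];               // red   from the left image
    cv::Mat anaglyph;
    cv::merge(out, anaglyph);
    return anaglyph;
}
```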
Results

This section presents the final results obtained from the applied algorithm. First the input image, containing five objects, is shown; then the segmented image for each object is shown.
These results represent only the 2D output, so to convert it into a 3D output we use the anaglyph technique. Since we did not have the RGB image of the scene shown above, the results of our anaglyph function, tested on a different set of images, are shown below: the left image on the left, the right image on the right, and the anaglyph at the bottom.
Conclusion

• In this paper, an extension of a model-free segmentation algorithm for cluttered scenes was presented, which is not restricted by a given set of object models or world knowledge.
• A fast algorithm to determine object edges using edge detection on surface normals was combined with a novel graph-based method that combines surface patches to form highly probable object hypotheses.
• Coplanarity checks and curvature matching were added to handle occluded and open curved objects.
• The algorithm can deal with stacked, nearby, and occluded objects. This is achieved by finding object edges in depth images and by the novel idea of identifying adjacent and cut-free surface patches, as well as coplanar surfaces separated by occlusion, which can be combined to form object regions.
• The algorithm was evaluated w.r.t. real-time capabilities and segmentation quality.
References

• A. Ückermann, R. Haschke and H. Ritter, "Realtime 3D Segmentation for Human-Robot Interaction".
• A. Ückermann, R. Haschke and H. Ritter, "Real-Time 3D Segmentation of Cluttered Scenes for Robot Grasping".
• T. Y. Zhang and C. Y. Suen, "A Fast Parallel Algorithm for Thinning Digital Patterns".
• http://www.aishack.in/2010/03/labelling-connected-components-example
• S. Sridhar, Digital Image Processing, 1st edition.
• R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd edition.
• E. Balagurusamy, Object Oriented Programming with C++, 6th edition.
• Yashwant Kanetkar, Data Structures with C, 2nd edition.
• G. Bradski and A. Kaehler, Learning OpenCV.