The document discusses applying the sum of absolute differences (SAD) algorithm to perform stereo matching on images using disparity maps. Key points include:
- The author proposes using a neighbor-based ranking approach to simplify pixel comparisons for SAD, addressing issues with intensity variations.
- An initial, admittedly naive test of the approach yielded promising results, though the training data set was very small.
- Further experiments increasing the size and quality of the training data set showed accuracy rates from 66-91% for k-nearest neighbor classification, providing evidence the method is viable.
- However, more work is needed to improve feature selection and evaluation to fully validate the approach.
3. S.A.D. What?
• Computer vision
– Linear Algebra out the wazoo!
• Simultaneous Localization and Mapping
• Real-time Point Cloud Processing and Streaming
• Polygon and Voxel Reconstruction
• Movement and Depth Processing
• SPIN object recognition
• Etcetera…
4. Plenty of data for all!
• A plethora of information is at everyone’s fingertips. More pixels than we know what to do with. Unless you’re Google training their cat-seeking neural net.
• Some algorithms are not so easy to implement from scratch or on the fly…
– There are two hard problems in CS: cache invalidation, naming conventions, and off-by-one errors
5. Why S.A.D?
• It’s elementary and needs no computer vision background
• I help my high-school robotics club learn programming
• Not impossible to implement as portable code. I’m a C kind of guy, so I’ll write a portable version (which errors more than my non-SAD implementations).
• It’s fast enough
• It’s one of the more error-prone algorithms, but its errors are well documented
• DON’T SEE THAT EVERY DAY! I hope everyone can take something away from this
8. It’s Simple. It’s useful.
• Multiple disparities can be used to determine 3D points in space, be it for distance, reconstruction, or more.
I used triangle “zipping” reconstruction, so it isn’t smooth, but faaaaast
15. Objective
• Attempt to classify the image characteristics. This could be useful in a lower-level language: a machine without laser optics could process information rapidly via two cameras. This makes the code less dense and the hardware less expensive, and it works within reasonable parameters.
• ALSO! I’ve been wondering if I really need to use SPIN to perform recognition on point models, which takes an eon and a half…
16. The Challenges
• Balance efficiency and performance
– If the training set is to be expanded, this is critical
• Ensure the algorithm cleanses/normalizes/adjusts the data throughout execution
– Images are data; the pixels should be treated as such when processing and relating them
• Select a suitable classification algorithm in R
– This is a class on R; something has to be in R
• Have the data formatted enough before R gets its hands on it
– There is no need to re-loop in a higher-level scripting language
17. Gray the Images!
• RGB definitely makes simple comparison difficult. Intensities could mangle things without overhead to correct for them. Graying offers a rather elegant way to clean and simplify inputs.
20. The decision
• Go ahead with NCC and utilize training data to perform KKNN to boot. This representation best suits debugging. There is already an immense number of steps involved just to get the data.
22. Rectification
• I eventually attributed the poor results in part to bugs. NCC also appeared to be of little help for SAD, yielding dirty disparities and KKNN results of 0-15%. For this simple version it wouldn’t do any good, nor did it cleanse the data as I required.
• Notice that the intensity varies greatly. I had an epiphany! A classmate had shared with me the idea of treating data as a binary problem when dealing with classification. A wonderfully simple idea. This would work. It would be fast. It would keep things simple.
23. Neighbor Based Ranking
• Generate a map: for each pixel, add one to the pixel’s map location for every neighbor in the surrounding window whose intensity is greater.
• Use the ranks during SAD instead of the pixel intensities.
25. Training methodology
• As in the initial attempt, I would be using KKNN.
• I would manually select regions and write the points in for processing via the main C program.
• The results were loaded into R as the deviations between four selections, similar to quartiles. Parts of the selections were compared against themselves, producing further measures of deviation. Finally, the widths of pixel ranges were used to derive slightly more about the sections.
– This is super naïve, but it’s for the sake of proving the approach viable.
26. The Second Results
• With a very, very, very small number of points, around 50 (selecting is a tedious process), KKNN was averaging 22-60% accuracy.
• Success?
– For such a small sample this is promising, though not conclusive.
– Time became a constraint. I underestimated the
work involved and had to act fast.
27. What I Gathered
• There is some validity to the idea of feature comparison through KKNN and disparity maps
• Improvements can be made…
– Increase the data set
– Select very defined cases to represent each
feature
– Re-evaluate features and improve them
– Time!
28. First Rectification Attempt
• Use KLT tracking to automatically populate features
– Bugs in the code swatted this down, given that all I could get working was edge detection
KLT tracking is a means to perform edge detection; however, it can be advanced to assist in detecting additional features.
29. Second Attempt
• Perform tedious by-hand selection
– I feared this might not produce a notable number of points for the level of differentiation in the pictures
30. Success!
• This yielded promising samples over 10,000 iterations each
• Results...
– [1] "K 13 -> 91.666667"
– Results...
– [1] "K 13 -> 91.666667"
– Results...
– [1] "K 13 -> 66.666667"
– Results...
– [1] "K 13 -> 75.000000"
– Results...
– [1] "K 13 -> 83.333333"
– Results...
– [1] "K 13 -> 83.333333"
• This became further evidence of the viability of this method