Analyzing and Refining an
Application of the Sum of
Absolute Differences
Zachary Job
A short background…
S.A.D. What?
• Computer vision
– Linear algebra out the wazoo!
• Simultaneous Localization and Mapping
• Real-time Point Cloud Processing and Streaming
• Polygon and Voxel Reconstruction
• Movement and Depth Processing
• SPIN object recognition
• Etcetera…
Plenty of data for all!
• A plethora of information is at everyone’s
fingertips. More pixels than we know what to
do with, unless you’re Google training their
cat-seeking neural net.
• Some algorithms are not so easy to
implement from scratch or on the fly…
– There are two hard problems in CS: cache invalidation,
naming conventions, and off-by-one errors
Why S.A.D?
• It’s elementary and needs no computer vision
background
• I help my high-school robotics club learn programming
• Not impossible to implement as portable code. I’m a C
kind of guy, so I’ll have to write a portable version that
errors more than my non-SAD implementations.
• It’s fast enough
• This is one of the more error-prone algorithms, though its
errors are well documented
• DON’T SEE THAT EVERY DAY! I hope everyone can take
something away from this
A quick explanation…
Ground Truth Disparity
Base Image
It’s Simple. It’s useful.
• Multiple disparities can be used to determine
3D points in space, be it for distance,
reconstruction, or more.
I used triangle “zipping” reconstruction,
so it isn’t smooth, but faaaaast
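The matcher behind those disparity maps can be sketched in a few lines of C. This is a minimal, illustrative version, not the code from the talk; the window size, the out-of-bounds penalty, and all names are assumptions:

```c
#include <stdlib.h>

/* Sum of absolute differences between a window around (x, y) in the left
 * image and the same window shifted left by disparity d in the right image.
 * Images are row-major grayscale buffers of size w x h; win is the window
 * half-width. */
static int sad_cost(const unsigned char *left, const unsigned char *right,
                    int w, int h, int x, int y, int d, int win)
{
    int cost = 0;
    for (int dy = -win; dy <= win; ++dy)
        for (int dx = -win; dx <= win; ++dx) {
            int lx = x + dx, rx = x + dx - d, yy = y + dy;
            if (lx < 0 || lx >= w || rx < 0 || rx >= w || yy < 0 || yy >= h)
                cost += 255;                 /* penalize samples off the image */
            else
                cost += abs(left[yy * w + lx] - right[yy * w + rx]);
        }
    return cost;
}

/* Pick the disparity in [0, max_d) with the lowest SAD cost. */
static int best_disparity(const unsigned char *left, const unsigned char *right,
                          int w, int h, int x, int y, int max_d, int win)
{
    int best = 0, best_cost = sad_cost(left, right, w, h, x, y, 0, win);
    for (int d = 1; d < max_d; ++d) {
        int c = sad_cost(left, right, w, h, x, y, d, win);
        if (c < best_cost) { best_cost = c; best = d; }
    }
    return best;
}
```

A winning disparity d then gives depth via Z = f * B / d for focal length f and camera baseline B, which is what feeds the reconstruction.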
Regions with texture
Similar textures
Repeating Patterns
Occlusion
I could go on…
Questions before I continue?
Objective
• Attempt to classify the image characteristics. This
could be useful in a lower-level language: a machine
without laser optics could process information rapidly
via two cameras, keeping the code less dense and the
hardware less expensive while working within
reasonable parameters.
• ALSO! I’ve been wondering if I really need to use
SPIN to perform recognition on point models,
which takes an eon and a half…
The Challenges
• Balance efficiency and performance
– If the training set is to be expanded this is critical
• Ensure the algorithm cleanses/normalizes/adjusts the
data throughout execution.
– Images are data. The pixels should be treated as such
when trying to process and relate them.
• Select a suitable classification algorithm in R
– This is a class on R, something has to be in R
• Have the data formatted enough before R gets its
hands on it
– There is no need to re-loop in a higher level scripting
language
Gray the Images!
• RGB definitely makes simple comparison
difficult. Intensities could mangle things
without overhead to correct for them. Graying
leaves a rather elegant way to clean and
simplify inputs.
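A graying pass can be as simple as a weighted sum per pixel. A minimal sketch, assuming the common BT.601 luma weights (the slides don’t say which weights were actually used):

```c
/* Convert an interleaved 8-bit RGB buffer to grayscale using BT.601 luma
 * weights (0.299 R + 0.587 G + 0.114 B), in fixed point to stay fast in C. */
static void rgb_to_gray(const unsigned char *rgb, unsigned char *gray,
                        int n_pixels)
{
    for (int i = 0; i < n_pixels; ++i) {
        int r = rgb[3 * i], g = rgb[3 * i + 1], b = rgb[3 * i + 2];
        gray[i] = (unsigned char)((299 * r + 587 * g + 114 * b) / 1000);
    }
}
```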
Initial Ideas
Zero-mean normalized cross-correlation (NCC)
However…
The decision
• Go ahead with NCC and utilize training data to
perform KKNN to boot. This representation
best suits debugging. There is already an
immense number of steps involved just to get
the data.
The test images
Rectification
• Although I eventually attributed some of the poor
results to bugs, NCC also appeared to be of little help
for SAD, yielding dirty disparities and KKNN results of
0-15%. For this simple version it wouldn’t do any good,
nor did it cleanse the data as I required.
• It can be noticed that the intensity varies greatly.
Then I had an epiphany! A classmate had shared with me
the idea of treating data as a binary problem when
dealing with classification. A wonderfully simple idea.
This would work. It would be fast. It would keep things
simple.
Neighbor Based Ranking
• Generate a map. For each pixel, if a neighbor
within the window surrounding the pixel is greater,
add one to the pixel’s map location.
• Use the ranks during SAD instead of the pixel
intensities.
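The ranking step above (essentially the classic rank transform) can be sketched as follows. Window handling and names are assumptions; the point is that ranks depend only on the ordering of neighbors, so per-camera intensity differences cancel out before SAD ever runs:

```c
/* Neighbor-based ranking: each output pixel is the count of neighbors in
 * the surrounding (2*win+1)^2 window whose intensity exceeds its own.
 * img and ranks are row-major w x h grayscale buffers. */
static void rank_transform(const unsigned char *img, unsigned char *ranks,
                           int w, int h, int win)
{
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            int rank = 0;
            for (int dy = -win; dy <= win; ++dy)
                for (int dx = -win; dx <= win; ++dx) {
                    int nx = x + dx, ny = y + dy;
                    if (nx < 0 || nx >= w || ny < 0 || ny >= h)
                        continue;       /* neighbors off the image don't vote */
                    if (img[ny * w + nx] > img[y * w + x])
                        ++rank;
                }
            ranks[y * w + x] = (unsigned char)rank;
        }
}
```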
Naïve Results
This looks good!
Training methodology
• As in the initial attempt, I would be using KKNN.
• I would manually select regions and write the
points in for processing via the main C program.
• The results were loaded into R as the deviations
between four selections, similar to quartiles. Parts
of the selections were compared against themselves,
producing further measures of deviation. Finally, the
widths of pixel ranges were used to derive slightly
more about the sections.
– This is super naïve, but it’s enough to prove the
approach is viable.
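The classification itself happens in R via KKNN, but as a rough stand-in, here is what a plain k-nearest-neighbor vote over those feature vectors looks like if folded into the C side. The feature count, array bounds, and all names are assumptions for illustration; this is an unweighted sketch, not the weighted scheme KKNN uses:

```c
#define N_FEAT 3    /* assumed number of features per sample */

/* Squared Euclidean distance between two feature vectors. */
static double dist2(const double *a, const double *b)
{
    double s = 0;
    for (int i = 0; i < N_FEAT; ++i) {
        double d = a[i] - b[i];
        s += d * d;
    }
    return s;
}

/* Classify `query` by majority vote among its k nearest training rows.
 * Assumes n_classes <= 16 and n <= 64 for the scratch arrays. */
static int knn_classify(double train[][N_FEAT], const int *labels,
                        int n, const double *query, int k, int n_classes)
{
    int votes[16] = {0};
    char used[64] = {0};
    for (int j = 0; j < k; ++j) {
        int best = -1;
        double best_d = 0;
        for (int i = 0; i < n; ++i) {       /* next-nearest unused row */
            if (used[i]) continue;
            double d = dist2(train[i], query);
            if (best < 0 || d < best_d) { best = i; best_d = d; }
        }
        used[best] = 1;
        ++votes[labels[best]];
    }
    int cls = 0;
    for (int c = 1; c < n_classes; ++c)
        if (votes[c] > votes[cls]) cls = c;
    return cls;
}
```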
The Second Results
• With a very, very, very small number of points,
around 50 (selecting is a tedious process),
KKNN was averaging 22-60% accuracy.
• Success?
– For such a small sample this is promising, though
not conclusive.
– Time became a constraint. I underestimated the
work involved and had to act fast.
What I Gathered
• There is some validity to the idea of
feature comparison through KKNN and
disparity maps
• Improvements can be made…
– Increase the data set
– Select very defined cases to represent each
feature
– Re-evaluate features and improve them
– Time!
First Rectification Attempt
• Use of KLT tracking to automatically populate
features
– Bugs in the code swatted this down, given that all
I could get working was edge detection
KLT tracking is a means to perform edge detection; however, it can be advanced
to assist in detecting additional features.
Second Attempt
• Perform tedious by-hand selection
– I feared this might not produce a notable number of
points given the level of differentiation in the
pictures
Success!
• This yielded promising samples over 10,000 iterations each
• Results...
– [1] "K 13 -> 91.666667"
– [1] "K 13 -> 91.666667"
– [1] "K 13 -> 66.666667"
– [1] "K 13 -> 75.000000"
– [1] "K 13 -> 83.333333"
– [1] "K 13 -> 83.333333"
• This became further evidence of the viability of this method
Summary
Questions?
