Ever wondered how computers recognize and track human faces? Netlighter Saksham Gautam gave a talk at Netlight Grand Edge Data Munich and broke this seemingly complex problem of object recognition and tracking into digestible pieces. You will get a good understanding of what happens behind the scenes when, for instance, you walk through automatic passport control at the airport, or how Google's self-driving car 'sees' road signs. We will see how these generic algorithms and techniques can be applied to other machine learning and pattern recognition problems as well.
2. 08.06.2016 Saksham Gautam
LET’S START WITH A QUICK SHOW OF HANDS
HOW MANY OF YOU…
▸ have heard about machine learning?
▸ have used machine learning in your projects?
▸ have implemented any ML algorithm from scratch?
▸ have done Andrew Ng’s (or other) courses on ML?
▸ understand that deep learning uses neural networks?
▸ still remember what the kernel trick is?
3. 08.06.2016 Saksham Gautam
SHOW OF HANDS ON YOUR FAMILIARITY WITH COMPUTER VISION
HOW MANY OF YOU…
▸ know how an image can be represented as a matrix?
▸ have used OpenCV or MATLAB?
▸ understand how convolution can be used to detect edges?
▸ know the role of scale space in computer vision?
▸ remember how eigenvectors can be used for face recognition?
4. 08.06.2016 Saksham Gautam
WHAT DO WE WANT TO ACHIEVE?
FACE DETECTION & RECOGNITION
http://docs.opencv.org/master/d7/d8b/tutorial_py_face_detection.html#gsc.tab=0
FACE
MONA LISA
NOT A FACE!
6. 08.06.2016 Saksham Gautam
BASIC STEPS FOR FACE RECOGNITION
BUT HOW EXACTLY?
1. Capture image
2. Filter out noise
3. Find face in the image
4. Create a similarity metric and a model (Training)
5. Match any given face to one from the database
6. Return the closest match with the probability
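Steps 4 to 6 can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration: the names, the three-element feature vectors, the Euclidean distance as similarity metric, and the softmax over negative distances used as a rough stand-in for a probability. The talk does not specify any of these.

```python
import math

# Hypothetical database: one made-up feature vector per known person.
database = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.2, 0.8, 0.5],
    "carol": [0.4, 0.4, 0.9],
}

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_match(query, database):
    """Return the nearest entry and a score in (0, 1).

    The score is a softmax over negative distances. It is an
    illustrative confidence value, not a calibrated probability.
    """
    distances = {name: distance(query, vec) for name, vec in database.items()}
    weights = {name: math.exp(-d) for name, d in distances.items()}
    total = sum(weights.values())
    best = min(distances, key=distances.get)
    return best, weights[best] / total

name, score = closest_match([0.85, 0.15, 0.35], database)
print(name, round(score, 2))
```

In a real system the feature vectors would come from the feature extraction stage rather than being typed in by hand.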
7. 08.06.2016 Saksham Gautam
FACE RECOGNITION CAN BE BROKEN DOWN INTO SIMPLE STEPS
BUILDING BLOCKS
RAW IMAGE → PROCESSED IMAGE → FEATURES → MACHINE LEARNING ALGORITHM (Training, Validation) → MODEL
RAW IMAGE → MODEL → DECISION
▸ How can I capture an image?
▸ Remove any noise?
▸ What’s the information in the image?
▸ Can we match patterns?
14. 08.06.2016 Saksham Gautam
CONVOLUTION CAN BE USED FOR COMPUTING IMAGE GRADIENT
IMAGE GRADIENT
[  0   0   0] * [-1 0 1] =    0
[100 100 100] * [-1 0 1] =    0
[  0  50 100] * [-1 0 1] =  100
[100  50   0] * [-1 0 1] = -100
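The four pixel rows and the [-1 0 1] kernel from the slide can be checked with a few lines of plain Python. This is a minimal sketch; the helper name `correlate_valid` is made up for illustration. Note that the slide's numbers correspond to sliding the kernel without flipping it (cross-correlation), which is the convention most image libraries use for filtering. A strict mathematical convolution flips the kernel first and would give the opposite sign.

```python
def correlate_valid(signal, kernel):
    """Slide the kernel over the signal without flipping it."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

kernel = [-1, 0, 1]
rows = [
    [0, 0, 0],        # flat dark region
    [100, 100, 100],  # flat bright region
    [0, 50, 100],     # brightness rising left to right
    [100, 50, 0],     # brightness falling left to right
]

# Each row has exactly three pixels, so each yields a single gradient value.
gradients = [correlate_valid(row, kernel)[0] for row in rows]
print(gradients)  # [0, 0, 100, -100]
```

Flat regions give a gradient of zero regardless of brightness, while the sign of the result tells the direction of the intensity change.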
15. 08.06.2016 Saksham Gautam
EDGES AND CORNERS ARE FEATURES IN AN IMAGE
SOBEL FILTER FOR DETECTING EDGES
HARRIS CORNER DETECTOR
Gx =
-1 0 1
-2 0 2
-1 0 1
Gy =
-1 -2 -1
0 0 0
1 2 1
$ python sobel-filter.py
$ python harris-corner.py
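The scripts named on the slide are not reproduced here. As a rough stand-in, the following sketch applies the Gx and Gy kernels to a small synthetic image in plain Python. The image, the helper name `apply_kernel`, and the choice to skip the border pixels are assumptions for illustration. In practice one would more likely call OpenCV's `cv2.Sobel`.

```python
import math

GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def apply_kernel(image, kernel):
    """Correlate a 3x3 kernel with the image, skipping the 1-pixel border."""
    h, w = len(image), len(image[0])
    out = [[0] * (w - 2) for _ in range(h - 2)]
    for y in range(h - 2):
        for x in range(w - 2):
            out[y][x] = sum(
                image[y + dy][x + dx] * kernel[dy][dx]
                for dy in range(3)
                for dx in range(3)
            )
    return out

# Hypothetical 5x6 test image: dark left half, bright right half.
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]

gx = apply_kernel(image, GX)  # responds to the vertical edge
gy = apply_kernel(image, GY)  # stays zero, there is no horizontal edge
magnitude = [
    [math.hypot(a, b) for a, b in zip(row_x, row_y)]
    for row_x, row_y in zip(gx, gy)
]
print(gx[0])  # [0, 400, 400, 0]
```

Gx responds only where the window straddles the dark-to-bright boundary, which is how the filter localizes the edge.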
16. 08.06.2016 Saksham Gautam
MORE ROBUST FEATURES CAN BE USED FOR OBJECT RECOGNITION
SIFT, SURF, HOG
▸ More advanced features can be used for scale invariance
▸ Some are robust even under varying lighting conditions
▸ These serve as the starting point for the ML part
17. 08.06.2016 Saksham Gautam
CASCADES OF FILTERS ON AN IMAGE CAN BE USED FOR DETECTING FACES
DETECTING FACES
http://siret.ms.mff.cuni.cz/facereco/method/
$ python viola-jones.py
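The Viola-Jones detector evaluates simple rectangular (Haar-like) features very quickly by using an integral image, and rejects a window as soon as one stage of the cascade fails. The sketch below shows those two ideas only. The 4x4 patch, the rectangles, and the thresholds are made up for illustration; the real method learns its features and thresholds with AdaBoost, and this is not the `viola-jones.py` script from the talk.

```python
def integral_image(image):
    """Return a table ii with one extra row and column, where
    ii[y][x] is the sum of all pixels above and to the left of (y, x)."""
    h, w = len(image), len(image[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Haar-like feature: sum of the lower half minus sum of the upper half."""
    half = height // 2
    upper = rect_sum(ii, top, left, half, width)
    lower = rect_sum(ii, top + half, left, half, width)
    return lower - upper

def passes_cascade(ii, stages):
    """Stop at the first stage whose feature falls below its threshold."""
    return all(
        two_rect_feature(ii, *rect) >= threshold for rect, threshold in stages
    )

# Hypothetical patch: a dark band (e.g. eyes) above a bright band (e.g. cheeks).
patch = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
ii = integral_image(patch)
feature = two_rect_feature(ii, 0, 0, 4, 4)
print(feature)  # 1520

# Hypothetical stages: (top, left, height, width) and a hand-picked threshold.
stages = [((0, 0, 4, 4), 1000), ((0, 0, 2, 4), -50)]
print(passes_cascade(ii, stages))  # True
```

Because every rectangle sum costs four lookups regardless of its size, thousands of such features can be evaluated per window. OpenCV ships a trained detector of this kind via `cv2.CascadeClassifier`.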
18. 08.06.2016 Saksham Gautam
FEATURES FROM THE FACE CAN BE FED TO AN ML ALGORITHM
BUILDING BLOCKS
RAW IMAGE → PROCESSED IMAGE → FEATURES → MACHINE LEARNING ALGORITHM (Training, Validation) → MODEL
RAW IMAGE → MODEL → DECISION
19. PERFORMANCE (P) OF A METHOD FOR A TASK (T) INCREASES WITH EXPERIENCE (E)
Tom Mitchell
BTW, WHO LEARNS? THE MACHINE, REALLY?
20. 08.06.2016 Saksham Gautam
PROBABILITY AND STATISTICS CAN HELP ANSWER MANY QUESTIONS
T-SHIRT SIZE FOR THE SUMMIT
MAYBE I SHOULD HAVE PICKED ‘M’ INSTEAD OF ‘S’
22. 08.06.2016 Saksham Gautam
MAXIMUM LIKELIHOOD ESTIMATE HELPS IN THE FACE OF UNCERTAINTY
CLASSIFICATION PROBLEM?
[Figure: plots with axes Length #1 and Length #2; classes S and M]
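One way to read the T-shirt classification problem is as a maximum likelihood fit: estimate a Gaussian per size from measured lengths, then assign a new shirt to the size under which its length is most likely. The sketch below uses a single length for simplicity, although the slide plots two. The measurements are made up, and the Gaussian assumption is an illustration, not something the talk states.

```python
import math

def fit_gaussian(samples):
    """Maximum likelihood estimates of mean and variance.

    The ML variance divides by n, not by n - 1.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, var

def log_likelihood(x, mean, var):
    """Log of the Gaussian density at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

# Hypothetical shirt lengths in cm for each size (made-up numbers).
training = {
    "S": [66, 67, 68, 69, 70],
    "M": [70, 71, 72, 73, 74],
}
params = {size: fit_gaussian(values) for size, values in training.items()}

def classify(length):
    """Pick the size whose fitted Gaussian gives the highest likelihood."""
    return max(params, key=lambda size: log_likelihood(length, *params[size]))

print(params["S"])   # (68.0, 2.0)
print(classify(67))  # S
print(classify(73))  # M
```

A length right between the two means is equally likely under both sizes, which is exactly the uncertainty the slide title refers to.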
31. 08.06.2016 Saksham Gautam
GO WITH THE HYPE, BUT WITH CARE
DEEP LEARNING ~ MASSIVE NEURAL NETWORK
▸ Learning algorithm is the same, i.e. backpropagation
▸ Has the same problem with overfitting
▸ Can be used for feature extraction and selection
▸ Mathematical foundations for neural networks are still not “perfect”
▸ Pointer: https://www.tensorflow.org from Google
32. 08.06.2016 Saksham Gautam
MACHINE LEARNING PIPELINE
SUMMARY
RAW IMAGE → PROCESSED IMAGE → FEATURES → MACHINE LEARNING ALGORITHM (Training, Validation) → MODEL
RAW IMAGE → MODEL → DECISION
33. REFERENCES
• OpenCV Documentation. http://docs.opencv.org/3.1.0/#gsc.tab=0
• Andrew Ng. Machine Learning Course on Coursera. http://www.coursera.org/learn/machine-learning
• Christopher Bishop. Machines that Learn. https://www.youtube.com/watch?v=icaA7gVxqSs
• Video Lecture on Face Detection and Tracking. https://www.youtube.com/watch?v=WfdYYNamHZ8
• Adam Harvey explains Viola-Jones Face Detection. http://www.makematics.com/research/viola-jones/
• Bishop, Christopher M. Pattern Recognition and Machine Learning. Springer, 2006.
• Bradski, Gary, and Adrian Kaehler. Learning OpenCV: Computer Vision with the OpenCV Library. O'Reilly Media, Inc., 2008.
• Solem, Jan Erik. Programming Computer Vision with Python: Tools and Algorithms for Analyzing Images. O'Reilly Media, Inc., 2012.
• Hartley, Richard, and Andrew Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2003.
37. 08.06.2016 Saksham Gautam
EVERY SIGNAL CAN BE DECOMPOSED INTO SINES AND COSINES
FOURIER TRANSFORM
▸ Frequency can be thought of as information in the image
▸ Fourier Transform can be used to decompose a signal into these components
▸ Signal can be multiplied with filter in frequency domain
▸ Multiplication in frequency domain is convolution in time domain
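The last bullet can be checked numerically on a short 1-D signal. The sketch below uses a naive discrete Fourier transform written in plain Python; the signal and the two-tap smoothing filter are made up for illustration. For the discrete transform, the statement holds for circular convolution, where indices wrap around at the ends.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform, O(n^2)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

def circular_convolve(a, b):
    """Convolution in the time domain with wrap-around indexing."""
    n = len(a)
    return [sum(a[j] * b[(t - j) % n] for j in range(n)) for t in range(n)]

# Hypothetical signal (a box) and a two-tap averaging filter, zero padded.
signal = [0, 0, 100, 100, 100, 0, 0, 0]
smoothing = [0.5, 0.5, 0, 0, 0, 0, 0, 0]

# Route 1: convolve directly in the time domain.
direct = circular_convolve(signal, smoothing)

# Route 2: multiply the spectra, then transform back.
product = [s * f for s, f in zip(dft(signal), dft(smoothing))]
via_frequency = [value.real for value in idft(product)]

print(direct)
print([round(v, 6) for v in via_frequency])
```

Both routes give the same smoothed box up to floating-point rounding. With a fast Fourier transform the frequency-domain route becomes cheaper than direct convolution for large filters.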