Inside the ANN (Artificial Neural Network): a visual and intuitive journey* to understand how it stores knowledge and how it makes decisions. *No code, no math included. [Following BCN Python Group's request after my presentation on Machine Learning last September 25th, 2014]
Presented by Xavier Arrufat
BCN Python Meetup – November 2014
Barcelona, November 20th, 2014
Questions from last Meetup (Python BCN – September 25th, 2014)
1. How does an ANN work? Examples?
2. You must be an engineer… a mathematician would never say ANNs are easy [to understand]!
Real case 
Mail sorting by ZIP (postal) code. Photos: NY, 1912 (U.S. Postal Service); Mt. Pleasant sorting office, 1951 (The British Postal Museum); 1960s? (photo credit: Patrick S. McCabe, U.S. Postal Service); 1990s? (U.S. Postal Service); Royal Mail at Christmas, Glasgow, 2010s? (The Telegraph – picture: PA)
Real case 
Mail sorting by ZIP (postal) code. Photo: Royal Mail at Christmas, Glasgow, 2010s? (The Telegraph – picture: PA)
Agenda 
10 min – How humans learn to read
30 min – How ANNs learn to read (digits)
0 min – Is there much difference?
How humans learn to read 
This is letter ‘a’… 
A a 
4 years old
How humans learn to read 
And these are letter ‘a’ too…
How humans learn to read 
Reading test
How humans learn to read 
Reading test completed: ‘Paranoia’ 
Easy? Are these really ‘a’s?
d b
q p
How humans learn to read 
Challenging letters…
d b
q p
How humans learn to read 
Remark: symmetry about the vertical axis leads to confusion much more often
Human writing at age 4 
Notice: only vertical-axis symmetry gets (spontaneously) generated
Vision circuitry 
Observe symmetric ‘wiring’ (only in one plane, I infer) 
http://webvision.med.utah.edu/book/part-ix-psychophysics-of-vision/the-primary-visual-cortex/
http://www1.appstate.edu/~kms/classes/psy3203/EyePhysio/VisualPathways.jpg
Real case 
Mail sorting by ZIP (postal) code. Photos: NY, 1912 (U.S. Postal Service); Mt. Pleasant sorting office, 1951 (The British Postal Museum); 1960s? (photo credit: Patrick S. McCabe, U.S. Postal Service); 1990s? (U.S. Postal Service); Royal Mail at Christmas, Glasgow, 2010s? (The Telegraph – picture: PA)
MNIST dataset 
http://yann.lecun.com/exdb/mnist/ 
Training set of 60,000 examples (roughly 6,000 per digit)
Test set of 10,000 examples (roughly 1,000 per digit)
Each character is a 28x28 pixel box =>
784 numbers per character, each within the range [0, 255] (0: white, background; 255: black, foreground)
(N.B.: when using ANNs, normalize values to the range [0, 1] or [-1, 1] before continuing)
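A minimal sketch of that normalization step in Python/NumPy; the `images` array is a made-up stand-in for whatever MNIST loader you use, and only the scaling is the point:

import numpy as np

# Stand-in for 60,000 MNIST images loaded as flat uint8 vectors in [0, 255].
# (Random placeholder data; substitute the output of your own loader.)
images = np.random.randint(0, 256, size=(60000, 784), dtype=np.uint8)

x01 = images.astype(np.float32) / 255.0   # normalized to [0, 1]
x11 = x01 * 2.0 - 1.0                     # or, alternatively, to [-1, 1]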
Machine Learning 
Problem definition – handwritten digit identification
Input: an image of a digit (classes 0 through 9)
Which class does it belong to?
(classification problem)
Machine Learning 
Expected solution: probabilistic
Input: an image of a digit. Which class does it belong to? (classification problem)

Class:       0     1     2     3     4     5     6     7     8     9
Probability: 0.01  0.10  0.07  0.06  0.31  0.04  0.03  0.15  0.02  0.21
Machine Learning 
What’s the solving black box like?
Input: a 28 x 28 pixel box = 784 numbers → [ Machine Learning Black Box ] → output:

Class:       0     1     2     3     4     5     6     7     8     9
Probability: 0.01  0.10  0.07  0.06  0.31  0.04  0.03  0.15  0.02  0.21
ANN – Artificial Neural Network
Standard schema (scary, huh?) – forward computation
Θ1 (Theta1): lower-layer weight matrix, 15 x 784 numbers
Θ2 (Theta2): upper-layer weight matrix, 10 x (15+1) numbers
Credit on directed graph (text overlaid is mine instead): Michael Nielsen on http://neuralnetworksanddeeplearning.com/chap1.html
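Despite the deck's 'no code' promise, here is a minimal NumPy sketch of that forward computation, assuming sigmoid activations and random stand-in weights; the extra hidden value is the '+1' bias unit:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.random(784)                       # one normalized input image (made up)
theta1 = rng.standard_normal((15, 784))   # lower-layer weight matrix
theta2 = rng.standard_normal((10, 16))    # upper-layer weight matrix: 10 x (15+1)

hidden = sigmoid(theta1 @ x)              # 15 internal responses, squashed into (0, 1)
hidden = np.append(hidden, 1.0)           # the '+1' bias unit
scores = sigmoid(theta2 @ hidden)         # 10 squashed class scores
probs = scores / scores.sum()             # normalized, as in the slides ahead
print(probs.argmax())                     # the network's guess (meaningless with random weights)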
Digression 
Cellulose acetate
Cellulose acetate 
Replaced by PET nowadays 
(Polyethylene terephthalate a.k.a. ‘polyester’) 
Image from Unimed (http://unimed.eu.com/products/radiology-supplies/x-ray-film-cassette-with-screen-4682.html)
Semitransparent patterns: Image 1, Image 2
Semitransparent patterns – 'transparency' values:

Image 1:            Image 2:
0.00  0.50  0.00    0.00  0.00  0.00
0.50  1.00  0.50    0.50  1.00  0.50
0.00  0.50  0.00    0.50  0.50  0.50

(Food for thought to complete the analogy: with negative values, does the film emit light instead of absorbing it?)
Superposing patterns: Image 1, Image 2 → Superpose → Result
Superposing math: Image 1, Image 2 → Superpose → Result (pixel-by-pixel multiplication)

Image 1:            Image 2:            Result:
0.00  0.50  0.00    0.00  0.00  0.00    0.00  0.00  0.00
0.50  1.00  0.50    0.50  1.00  0.50    0.25  1.00  0.25
0.00  0.50  0.00    0.50  0.50  0.50    0.00  0.25  0.00
'Coincidence' level: Image 1, Image 2 → Result
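In NumPy, the superposition and the resulting 'coincidence' level are one multiply and one sum, using the transparency values from the slides above:

import numpy as np

img1 = np.array([[0.00, 0.50, 0.00],
                 [0.50, 1.00, 0.50],
                 [0.00, 0.50, 0.00]])
img2 = np.array([[0.00, 0.00, 0.00],
                 [0.50, 1.00, 0.50],
                 [0.50, 0.50, 0.50]])

result = img1 * img2        # pixel-by-pixel multiplication (superposed films)
coincidence = result.sum()  # overall 'coincidence' level: 1.75 here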
Machine Learning 
What’s the solving black box like?
Input: a 28 x 28 pixel box = 784 numbers → [ Machine Learning Black Box ] → output:

Class:       0     1     2     3     4     5     6     7     8     9
Probability: 0.01  0.10  0.07  0.06  0.31  0.04  0.03  0.15  0.02  0.21
Internal 'sensors'
(or filters, or bases, or internal features, or neurons… or lower layer)
Sensor committee 1: 20 members. Images from Sheng-hua Zhong, Yan Liu, Yang Liu. Bilinear Deep Learning for Image Classification. In ACM International Conference on Multimedia (SIG MM'11), 2011
Internal ‘sensors’ 
(or filters, or bases, or internal features, or neurons …) 
Sensor committee 2: 900 members
Sensor committee 3: 64 members
Image from Picalike
http://www.picalike.com/ 
Image from Tom Lahore http://evolvingstuff.blogspot.com.es/2012/12/mnist-features.html
Internal sensors 
(reduced selection for the sake of simplicity). 'Neurons' extracted from Picalike's image on the previous page
http://www.picalike.com/
Internal sensors: matching input
Level of response as per pattern matching
Input → sensor responses: high, low, mid, mid, low, mid, low, low
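As a sketch, each sensor's level of response can be computed as the 'coincidence' level between the input and that sensor's pattern, both flattened to 784-vectors (all values made up):

import numpy as np

rng = np.random.default_rng(1)
x = rng.random(784)              # the input digit, flattened (made up)
sensors = rng.random((8, 784))   # 8 internal sensor patterns (made up)

responses = sensors @ x          # one pattern-matching level per sensor
for i, r in enumerate(responses):
    print(f"sensor {i}: response {r:.1f}")   # the slide's high / mid / low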
Internal sensors: voting (classes 0 through 9)
'Sensor committee' (response as per pattern matching)
Input → sensor responses: high, low, mid, mid, low, mid, low, low

Individual 'votes' cast on classes 0-9:
Class: 0     1     2     3     4     5     6     7     8     9
Vote:  0.70  0.80  0.12  0.11  0.85  0.03  0.32  0.65  0.15  0.08

Note: all numbers in this slide are made up. They do not correspond to actual results.
Internal sensors: voting (classes 0 through 9)
'Voting committee' (as per pattern recognition)
Input → sensor responses: high, low, mid, mid, low, mid, low, low

Accumulated 'votes':
Class: 0     1      2      3      4      5     6     7     8     9
Votes: 2.42  20.40  15.73  12.52  63.61  7.93  6.06  2.94  4.11  41.56

Note: all numbers in this slide are made up. They do not correspond to actual results.
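One hedged reading of the accumulation step: each sensor holds a row of per-class vote weights, and its response scales them before they are summed over the committee (all numbers made up):

import numpy as np

rng = np.random.default_rng(2)
responses = rng.random(8)           # the 8 sensors' matching levels (made up)
vote_weights = rng.random((8, 10))  # each sensor's votes for classes 0-9 (made up)

accumulated = responses @ vote_weights   # weighted votes, summed over the committee
print(accumulated)                       # 10 numbers, one per class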
Normalizing votes (probability) (classes 0 through 9)
'Voting committee' (as per pattern recognition)
Input → sensor responses: high, low, mid, mid, low, mid, low, low

Class:                       0     1      2      3      4      5     6     7     8     9
Accumulated 'votes':         2.42  20.40  15.73  12.52  63.61  7.93  6.06  2.94  4.11  41.56
Probability (votes / total): 0.01  0.10   0.07   0.06   0.31   0.04  0.03  0.15  0.02  0.21

Note: all numbers in this slide are made up. They do not correspond to actual results.
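The normalization itself, using the accumulated votes from the slide (the slide's probability row is also made up, so the output only roughly matches it):

import numpy as np

votes = np.array([2.42, 20.40, 15.73, 12.52, 63.61,
                  7.93, 6.06, 2.94, 4.11, 41.56])
probs = votes / votes.sum()   # 'votes / total'
print(probs.round(2))         # class 4 comes out on top, at about 0.36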
Decision making (classes 0 through 9)
'Voting committee' (as per pattern recognition)
Input → sensor responses: high, low, mid, mid, low, mid, low, low

Class:                       0     1      2      3      4      5     6     7     8     9
Accumulated 'votes':         2.42  20.40  15.73  12.52  63.61  7.93  6.06  2.94  4.11  41.56
Probability (votes / total): 0.01  0.10   0.07   0.06   0.31   0.04  0.03  0.15  0.02  0.21

Note: all numbers in this slide are made up. They do not correspond to actual results.
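And the decision step is simply picking the most probable class:

import numpy as np

probs = np.array([0.01, 0.10, 0.07, 0.06, 0.31,
                  0.04, 0.03, 0.15, 0.02, 0.21])
print(probs.argmax())   # -> 4: the input is read as a '4'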
All steps in one page

0) Get a convenient set of filters and voting rules (a.k.a. 'ANN training')
1) Multiply input * filter, pixel by pixel
2) Add up the resulting values. You get a simple real number R (e.g. 7.35)
3) (Optional) Apply a non-linear activation function: squash R into the range (0, 1) [ or (-1, 1) ]
4) Cast weighted votes on your filter's favorite classes
5) Normalize votes: compute class probabilities
6) Make a decision based on the probabilities

Note: all numbers in this slide are made up. They do not correspond to actual results.
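All of it together, as one hedged NumPy sketch; the filters and vote weights would come from training (step 0), so random placeholders stand in here:

import numpy as np

def classify(x, filters, vote_weights):
    """Steps 1-6 from the slide, for one flattened input image x."""
    r = filters @ x                   # 1-2) multiply pixel by pixel, then add up
    r = 1.0 / (1.0 + np.exp(-r))      # 3) optional squashing into (0, 1)
    votes = r @ vote_weights          # 4) weighted votes per class
    probs = votes / votes.sum()       # 5) normalize into probabilities
    return probs.argmax(), probs      # 6) decide

rng = np.random.default_rng(3)
x = rng.random(784)                        # made-up input digit
filters = rng.standard_normal((15, 784))   # placeholder for trained filters
vote_weights = rng.random((15, 10))        # placeholder for trained voting rules
digit, probs = classify(x, filters, vote_weights)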
ANN – Artificial Neural Network
Standard schema (now a bit more intuitive?)
Forward computation: pattern comparison → internal matching level → voting → rating → decision making
Θ1 (Theta1): lower-layer weight matrix, 15 x 784 numbers
Θ2 (Theta2): upper-layer weight matrix, 10 x (15+1) numbers
Credit on arrowed graph (text overlaid is mine instead): Michael Nielsen on http://neuralnetworksanddeeplearning.com/chap1.html
Q & A
