Artificial Neural Network basics

Qingkai Kong

2016-12-02

h"p://seismo.berkeley.edu/qingkaikong/	
  
Workshop time

Learning curve

Gentle introduction

Step by step ANN

Real world example

h"ps://github.com/qingkaikong/20161202_ANN_basics	
  
Workshop time

Learning curve

Gentle introduction

•  What’s ML

•  ANN history

•  ANN overview

Step by step ANN

Real world example

h"ps://github.com/qingkaikong/20161202_ANN_basics	
  
What is machine learning?

h"ps://github.com/qingkaikong/20161202_ANN_basics	
  
Self-driving car

Voice recognition 

…

h"ps://github.com/qingkaikong/20161202_ANN_basics	
  
ANN history: not always working

•  1940s: Birth

•  1970s: Winter

•  1980s: Rebirth

•  2006: Deep learning
ANN in simple view

ANN jargon

What are the weights?
Intuitive Artificial Neural Network

[Diagram: facial features (eye, nose, …, mouth) enter as inputs, each multiplied by a weight w1, w2, …, wn; the network computes F(eye×w1 + nose×w2 + … + mouth×wn) and outputs an answer to "Sheldon Cooper?". The error between the output and the target is fed back to adjust the weights.]
Workshop time

Learning curve

Gentle introduction

•  What’s ML

•  ANN history

•  ANN overview

Step by step ANN

•  Perceptron

•  Backpropagation

Real world example
Application: Learn art

https://arxiv.org/abs/1508.06576v1
https://deepart.io/
http://junkhost.com/2016/03/man-combines-random-peoples-photos-using-neural-networks-and-the-results-are-amazing/
Perceptron

[Diagram: features 1–3 and a constant bias input of 1 form X; each is multiplied by its weight ω0…ω3, summed (Σ), and passed through the activation f to produce the output y.]

X – input data
y – output target
ωi – weights
Σ – summation
f – activation function
blue circle – bias

z = Σ = ω0x0 + ω1x1 + ω2x2 + ω3x3 + … + ωnxn
f(z) = 1/(1 + e^(−z))
More activation functions

f(z) = 1/(1 + e^(−z))

df(z)/dz = f(z)(1 − f(z))
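As a quick sketch of these two formulas in Python (assuming NumPy; the helper names are mine):

import numpy as np

def sigmoid(z):
    # f(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_slope(z):
    # f'(z) = f(z) * (1 - f(z))
    fz = sigmoid(z)
    return fz * (1.0 - fz)

print(sigmoid(-1.12))        # ~0.246, as in the worked example below
print(sigmoid_slope(-1.12))  # ~0.185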
Perceptron

The output y = f(z) is our estimation.

Error = Target − Estimation
How the ANN learns

Learning: update the weights to reduce the error next time!

Weights update rules

Weights delta = Error × slope × input
(how much we will update the weights next time)
  
Look at errors closer

Error = Target − Estimation, and the target is either 0 or 1.

Three cases:

•  Error < 0: target is 0, estimation is not 0

•  Error > 0: target is 1, estimation is not 1

•  Error = 0: estimation is correct
  
Look at errors closer (assume inputs are positive)

Case 1:

•  Target is 0

•  Estimation is 0.3

Error = 0 − 0.3 = −0.3

z = ω0x0 + ω1x1 + ω2x2 + ω3x3

[Figure: on the sigmoid curve 1/(1 + e^(−z)), z must shift left to z_update so that f(z) moves toward 0.]

We need to reduce the weights! If we add the (negative) error to the weights, we reduce them.

But what if the inputs are negative?
  
Weights update rules

Weights delta = Error × input
(multiplying by the input keeps the update direction correct when inputs are negative)

[Figure: the sigmoid has a flat slope far from z = 0 and a steep slope near z = 0, so we also scale the update by the slope.]

Weights delta = Error × slope × input
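A minimal sketch of one learning step under this rule (assuming NumPy; the bias is treated as a constant input of 1 inside x, and the function name is illustrative):

import numpy as np

def learning_step(weights, x, target):
    # Forward pass: weighted sum, then sigmoid activation.
    z = np.dot(weights, x)
    estimation = 1.0 / (1.0 + np.exp(-z))
    # Error and the sigmoid slope at the current z.
    error = target - estimation
    slope = estimation * (1.0 - estimation)
    # Weights delta = error * slope * input; multiplying by the input
    # keeps the update direction correct for negative inputs.
    return weights + error * slope * x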
Learn from example

Sample      Feature 1   Feature 2   Feature 3   Target
Sample 1    0           0           1           0
Sample 2    1           1           1           1
Sample 3    1           0           1           1
Sample 4    0           1           1           0
Sample 5    1           0           0           1
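The same five samples as NumPy arrays, with a constant 1 appended to each sample for the bias input (a convention assumed here, matching the four weights used below):

import numpy as np

# Columns: feature 1, feature 2, feature 3, bias input
X = np.array([[0, 0, 1, 1],
              [1, 1, 1, 1],
              [1, 0, 1, 1],
              [0, 1, 1, 1],
              [1, 0, 0, 1]])
targets = np.array([0, 1, 1, 0, 1])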
  
How to deal with errors

[Diagram: the perceptron with initial weights −0.166, 0.441, −1.000, −0.395, and all four inputs (three features plus the bias) equal to 1.]

Σ = (−0.166)×1 + 0.441×1 + (−1)×1 + (−0.395)×1 = −1.12
f(−1.12) = 1/(1 + e^(1.12)) = 0.246
Error = 1 − 0.246 = 0.754
Sample      Feature 1   Feature 2   Feature 3   Target
Sample 2    1           1           1           1
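Reproducing this arithmetic (a sketch assuming NumPy; the starting weights are the values from the slide):

import numpy as np

w = np.array([-0.166, 0.441, -1.000, -0.395])  # initial weights
x = np.array([1, 1, 1, 1])                     # sample 2 plus bias input

z = np.dot(w, x)                    # -1.12
estimation = 1 / (1 + np.exp(-z))   # f(-1.12) ~= 0.246
error = 1 - estimation              # target is 1, so error ~= 0.754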
  
[Figure: on the sigmoid curve 1/(1 + e^(−z)), z must shift right to z_update so that f(z) moves toward 1.]

•  Target: 1

•  Estimation: 0.246

Error = 0.754

We want to increase the weights next time to have a larger z.

Weights delta = 0.754 × slope × input
  
df(z)/dz = f(z)(1 − f(z))

z = −1.12, f(z) = 0.246
slope = 0.246 × (1 − 0.246) = 0.185
Weights delta = 0.754 × 0.185 × [1, 1, 1, 1] = [0.139, 0.139, 0.139, 0.139]

Original weights + updates = updated weights:
[−0.166, 0.441, −1.000, −0.395] + [0.139, 0.139, 0.139, 0.139] = [−0.027, 0.580, −0.861, −0.256]
Changes of the error

With the updated weights:

z = (−0.027)×1 + 0.580×1 + (−0.861)×1 + (−0.256)×1 = −0.564
f(z) = f(−0.564) = 0.363
Error = 1 − 0.363 = 0.637

Error of next iteration: the error drops from 0.754 to 0.637.
  
Iterate many times
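Notebook 01 builds this loop; a minimal self-contained sketch (assuming NumPy, using the table's data and the update rule above, with illustrative random starting weights) might look like:

import numpy as np

X = np.array([[0, 0, 1, 1],        # features 1-3 plus a bias column
              [1, 1, 1, 1],
              [1, 0, 1, 1],
              [0, 1, 1, 1],
              [1, 0, 0, 1]])
targets = np.array([0, 1, 1, 0, 1])

rng = np.random.default_rng(42)
w = rng.uniform(-1, 1, size=4)     # random starting weights

for _ in range(1000):
    z = X @ w
    estimation = 1 / (1 + np.exp(-z))
    error = targets - estimation
    slope = estimation * (1 - estimation)
    # delta = error * slope * input, summed over all samples
    w += X.T @ (error * slope)

print(1 / (1 + np.exp(-(X @ w))))  # estimations move toward the targets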
Go to notebook 01	
  
Application: DeepDrumpf

https://twitter.com/DeepDrumpf
Perceptron limitations

A single perceptron can only separate linearly separable data (it cannot learn XOR, for example); this limitation contributed to the winter of ANN.
Multi-Layer Perceptron
[Diagram: features 1–3 plus a bias feed a hidden layer of units (Hidden1–Hidden4), each computing its own Σ and f; the hidden outputs plus a bias then feed the output unit (Σ, f) to produce y.]

X – input data
y – output target
Σ – summation
f – activation function
blue circle – bias
Go to notebook 02	
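Notebook 02 implements such a network; a minimal two-layer sketch with backpropagation (assuming NumPy; the layer sizes are illustrative and biases are omitted for brevity):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

X = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1], [1, 0, 0]])
y = np.array([[0], [1], [1], [0], [1]])

rng = np.random.default_rng(0)
W1 = rng.uniform(-1, 1, size=(3, 4))   # input -> 4 hidden units
W2 = rng.uniform(-1, 1, size=(4, 1))   # hidden -> output

for _ in range(10000):
    hidden = sigmoid(X @ W1)           # forward pass
    output = sigmoid(hidden @ W2)
    # Backpropagation: compute the output delta first, then push the
    # error back through W2, scaling by each layer's sigmoid slope.
    output_delta = (y - output) * output * (1 - output)
    hidden_delta = (output_delta @ W2.T) * hidden * (1 - hidden)
    W2 += hidden.T @ output_delta
    W1 += X.T @ hidden_delta

print(output.round(2))                 # close to the targets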
  
Workshop time

Learning curve

Gentle introduction

•  What’s ML

•  ANN history

•  ANN overview

Step by step ANN

•  Perceptron

•  Backpropagation

Real world example

•  Sklearn example
Application: Colourization

http://richzhang.github.io/colorization/
http://whaogive.com/videoColourization/
Go to notebook 03	
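Notebook 03 uses scikit-learn; a minimal sketch with its MLPClassifier (the parameters here are illustrative, not the notebook's exact settings):

from sklearn.neural_network import MLPClassifier

X = [[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1], [1, 0, 0]]
y = [0, 1, 1, 0, 1]

clf = MLPClassifier(hidden_layer_sizes=(4,), activation='logistic',
                    max_iter=2000, random_state=0)
clf.fit(X, y)
print(clf.predict(X))   # should recover the targets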
  
If you want to learn more …

http://dlab.berkeley.edu/training
Some useful resources

•  http://iamtrask.github.io/2015/07/12/basic-python-network/

•  https://seat.massey.ac.nz/personal/s.r.marsland/MLBook.html

•  http://sebastianraschka.com/Articles/2015_singlelayer_neurons.html

•  http://www.emergentmind.com/neural-network

•  http://neuralnetworksanddeeplearning.com/

•  https://www.coursera.org/learn/neural-networks
You can also find most of today's workshop material on my blog:
http://qingkaikong.blogspot.com/2016/10/machine-learning-1-what-is-machine.html
I thank all the authors of the above links, as well as the authors of many of the images I got from the internet.
Introduction to Artificial Neural Network

Editor's Notes

  • #14 It's a mathematical construct to fit a model to historical data, and use this model to forecast the future.