Intro to Neural Networks

Dean Wyatte
Boulder Data Science
@drwyatte

June 9, 2016
  
Neural Networks

•  AI summer is here!
•  In the last year, NNs have
   –  Continued state-of-the-art advancements in image and speech recognition
   –  Beaten a human player in Go
   –  Provided some quantification of “art”
  
	
  
About me
  
How does your brain work?

•  100,000,000,000 neurons
•  10,000 dendritic inputs per neuron
•  1 electrical output
  
One simple abstraction

[Diagram of a neuron: dendritic input → synaptic weights → soma → axonal output]
  
Digression into regression

•  Linear regression
•  Logistic regression
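The bridge from this slide to neural networks: logistic regression is exactly one “neuron”, a weighted sum of inputs pushed through a sigmoid. A minimal pure-Python sketch (the inputs and weights below are hypothetical, picked only for illustration):

```python
import math

def logistic_neuron(x, w, b):
    """Logistic regression = weighted sum of inputs squashed through a sigmoid."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b  # linear part (like linear regression)
    return 1.0 / (1.0 + math.exp(-z))             # sigmoid squashes to (0, 1)

# Hand-picked example weights: output rises with x[0], falls with x[1]
p = logistic_neuron(x=[2.0, 1.0], w=[1.5, -0.5], b=-1.0)
print(round(p, 3))  # -> 0.818, interpretable as a probability
```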
  
How to learn the weights?

•  If we know what the output should look like, we can compute the error and update the weights to minimize it
   –  Optimization problem; typically solved with gradient descent

[Diagram: network Output compared against Correct output; the difference is the Error]
Gradient descent

•  Given a cost function
   –  MSE
   –  Cross-entropy
   –  etc.
•  Can take a step in the opposite direction of the cost gradient by computing the derivative w.r.t. the weights
•  Scale by a learning rate (tiny step)
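The bullet points above amount to a few lines of code. A minimal sketch, fitting y = w·x to made-up data with an MSE cost, plain gradient descent, and a small learning rate:

```python
# Fit y = w * x by gradient descent on the MSE cost.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # toy data generated with true w = 2

w = 0.0          # initial weight
lr = 0.01        # learning rate: keep each step tiny

for _ in range(500):
    # dMSE/dw = mean of 2 * (w*x - y) * x over the data
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad   # step opposite the gradient

print(round(w, 3))  # -> 2.0, the weight that minimizes the cost
```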
  
A brief history of neural networks: The Perceptron

~1960: “The perceptron”
Universal function approximator

AND
  x1  x2 | y
   0   0 | 0
   0   1 | 0
   1   0 | 0
   1   1 | 1
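A single perceptron can realize the AND table above: weighted sum of inputs, then a hard threshold. A minimal sketch (the weights and threshold are one of many workable choices, not taken from the slides):

```python
def perceptron(x1, x2, w1=1.0, w2=1.0, threshold=1.5):
    """Classic perceptron: fire (1) iff the weighted input sum clears the threshold."""
    return 1 if w1 * x1 + w2 * x2 > threshold else 0

# Reproduces the AND truth table
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, perceptron(x1, x2))
```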
  
A brief history of neural networks: The Perceptron

~1960: “The perceptron”
Universal function approximator
  
A brief history of neural networks: The Perceptron

XOR?
  x1  x2 | y
   0   0 | 0
   0   1 | 1
   1   0 | 1
   1   1 | 0

…but only if the function is linearly separable
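A hidden layer is what rescues XOR: no single line separates its classes, but two threshold units feeding a third do the job. A sketch with hand-set weights (one textbook construction, XOR = OR and not AND; the specific thresholds are illustrative, not from the slides):

```python
def step(z, threshold):
    """Hard-threshold activation, as in the original perceptron."""
    return 1 if z > threshold else 0

def xor_mlp(x1, x2):
    """Two-layer perceptron: hidden units compute OR and AND, output combines them."""
    h_or = step(x1 + x2, 0.5)       # fires unless both inputs are 0
    h_and = step(x1 + x2, 1.5)      # fires only when both inputs are 1
    return step(h_or - h_and, 0.5)  # OR and not AND = XOR

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_mlp(x1, x2))
```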
  
A brief history of neural networks: Next ~30 years

•  Neural network research halts (AI winter)
•  Meanwhile…
   –  Support Vector Machine (SVM) invented; solves non-linear problems
•  Shift toward separation of feature representation and classification
   –  Handcraft the best features, train the SVM (or current state-of-the-art) to do the classification
•  Eventually, the multi-layer perceptron generalization is realized; solves non-linear problems
   –  Nobody cares…

https://www.youtube.com/watch?v=3liCbRZPrZA
  
Handcrafted artisanal features

•  Discovering good features is hard!
   –  Requires a lot of domain knowledge
   –  State of the art in computer vision was the culmination of years of collaboration between computer vision scientists, neuroscientists, etc.
•  Neural networks automatically learn features (weights) from examples based on the task
   –  Each neuron is a “feature detector” that activates proportionately to how well its input matches its weights
   –  Deep learning: shift back from hand-crafted features to features learned from the task

General learning methods for robust feature representation and classification
[Diagram: deep network with layers Hidden 1, Hidden 2, Hidden 3]
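“Activates proportionately to how well its input matches its weights” is, concretely, a dot product. A toy sketch (the “edge template” weights are invented for illustration):

```python
# A neuron's weights act as a template; activation = dot(input, weights).
weights = [-1.0, -1.0, 1.0, 1.0]   # hypothetical "edge" template: dark-to-bright

def activation(inputs, weights):
    """Response is largest when the input pattern lines up with the weights."""
    return sum(i * w for i, w in zip(inputs, weights))

print(activation([0.0, 0.0, 1.0, 1.0], weights))  # matching input -> strong response: 2.0
print(activation([1.0, 1.0, 0.0, 0.0], weights))  # opposite input -> suppressed: -2.0
print(activation([0.5, 0.5, 0.5, 0.5], weights))  # flat input -> no response: 0.0
```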
  
A brief history of neural networks: Deep learning bandwagon

•  A handful of researchers still toiling away on neural networks with little-to-no recognition
   –  2012: One grad student studying how to implement neural networks on GPUs submits the first “deep learning” architecture to an image recognition challenge, wins by a landslide
   –  2013: Almost every submission is a deep neural network executed on a GPU (continuing trend)

First deep neural network
  
AlexNet

•  8 layers
•  650,000 “neurons” (units)
•  60,000,000 learned parameters
•  630,000,000 connections
•  Uses the same basic algorithm as the multi-layer perceptron to learn weights
•  Finally caught on because
   –  Can do it “fast” (~1 week in 2012) thanks to GPU-based computation
   –  Actually works, with less overfitting, thanks to tricks and massive amounts of data
  
AlexNet

[Figure: 96 learned 11x11-pixel filter weights from ImageNet, alongside handcrafted Textons]
[Figure: classifications of unseen images]
  
Neural Networks in 2016

•  Variety of libraries that specify inputs as tensor minibatches and automatically compute gradients
   –  Tensorflow
   –  Theano (Keras/Lasagne)
   –  Torch
•  Libraries also available for common neural network layer types
   –  Convolutional, activation, pooling, dropout, RNN, etc.
•  Almost too easy
   –  Mind the danger zone!
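What these libraries automate can be imitated by hand: take a minibatch, define a cost, and obtain a gradient for every weight without deriving it yourself. A pure-Python sketch that uses finite differences as a crude stand-in for the libraries’ real automatic differentiation (model, data, and learning rate are all made up for illustration):

```python
# Frameworks like TensorFlow/Theano/Torch automate this: given a cost over a
# minibatch, produce d(cost)/d(weight) for every weight. Finite differences
# stand in here for their (much faster, exact) automatic differentiation.
batch = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]  # minibatch of (input, target)

def cost(w):
    # mean squared error of a linear model over the minibatch
    return sum((w[0] * x[0] + w[1] * x[1] - y) ** 2 for x, y in batch) / len(batch)

def numeric_grad(f, w, eps=1e-6):
    # central finite difference in each weight dimension
    grads = []
    for i in range(len(w)):
        w_hi = list(w); w_hi[i] += eps
        w_lo = list(w); w_lo[i] -= eps
        grads.append((f(w_hi) - f(w_lo)) / (2 * eps))
    return grads

w = [0.0, 0.0]
for _ in range(200):                  # plain gradient descent
    g = numeric_grad(cost, w)
    w = [wi - 0.5 * gi for wi, gi in zip(w, g)]

print([round(wi, 2) for wi in w])  # -> [1.0, 0.0]
```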
  
Data science due diligence

“Neural Networks sound awesome and will solve all our problems!”

•  Significant investment in resources: GPU (TPU?) cluster, ramp-up on niche/rapidly-evolving tools
•  Long feedback loop for architecture improvement: typically launch many jobs and terminate bad models (see above)
•  Need a lot of high-dimensional data with variability (millions of unique observations and/or heavy data augmentation); delicate balance between increased predictive power and overfitting
•  Hard to debug when not working: millions of reasons (literally) a model can be wrong, few ways it can be right. “Black magic”
•  Deep nonlinear models suffer from interpretability issues: black-box models (although active research here)
  
Thanks

Manuel Ruder, Alexey Dosovitskiy, Thomas Brox (2016). Artistic style transfer for videos.
http://arxiv.org/abs/1604.08610
https://www.youtube.com/watch?v=Khuj4ASldmU
  
Resources

“This is cool, but I don’t (want to) code”
http://playground.tensorflow.org

“I am comfortable with the SciPy stack and want to understand more”
A Neural Network in 11 lines of Python
http://iamtrask.github.io/2015/07/12/basic-python-network/

“I am comfortable with ML libraries and want to build a model”

MNIST
•  Keras
https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py
•  Tensorflow
https://www.tensorflow.org/versions/r0.8/tutorials/mnist/pros/index.html

Variational Autoencoders (also using MNIST)
•  Keras
http://blog.keras.io/building-autoencoders-in-keras.html
•  Tensorflow
https://jmetzen.github.io/2015-11-27/vae.html
  
