Deep Learning Primer

Anantharaman Narayana Iyer

7th June 2014
What is Deep Learning?

Deep learning is a Machine Learning technique distinguished by 2 defining characteristics:

1.  Deep Architecture
•  Multiple layers of learning.
•  Methodologies to train these layers that get close to the global optimum, alleviating the effect of local minima arising from the non-convex objective function.

2.  Feature Learning (aka Representation Learning)
•  Traditional machine learning system designs, such as Logistic Regression, involve manual feature design. In contrast, a deep learning system automatically learns the features given the input.
  
Automatic Feature Extraction

[Diagram: a Machine Learning System takes Input, automatically extracts Features, and produces Output]
  
Why is there a phenomenal interest?

•  Considered the next big thing in Machine Learning by several experts
•  Breakthrough results reported in:
   –  Speech Recognition
      •  Microsoft Audio Video Indexing Service (MAVIS) reduced word error rates by about 30% on 4 major benchmarks
   –  Object Recognition
      •  MNIST digit recognition: error rate of 0.27%
      •  Successful image recognition by Google
   –  Natural Language Processing
      •  SENNA system, which reported state-of-the-art results in tasks like POS tagging, Chunking, Named Entity Recognition, etc.
•  Substantial investments in this technology recently by top technology companies
  
Building a deep learning system

•  There are many ways to build a deep learning system, with the defining characteristics being:
   –  Multiple layers, where each layer performs a nonlinear transformation of the output generated by its preceding layer (see the sketch after this list).
   –  Automatic feature learning, where the features are progressively more abstract.
   –  Hierarchical in nature.
•  Broad approaches/categorizations:
   –  Unsupervised or generative models
   –  Supervised discriminative models
   –  Hybrid (use an unsupervised model as an aid to perform superior discrimination)
•  Common building blocks for unsupervised and hybrid approaches:
   –  Restricted Boltzmann Machines (RBM)
   –  Autoencoders
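To make the first characteristic concrete, here is a minimal sketch in Python/NumPy of layers that each apply a nonlinear transformation to the output of the preceding layer. The layer sizes and the sigmoid nonlinearity are illustrative assumptions, not something the slides prescribe:

```python
import numpy as np

def sigmoid(z):
    # Elementwise logistic nonlinearity (one common choice; tanh or ReLU also work).
    return 1.0 / (1.0 + np.exp(-z))

def layer(x, W, b):
    # One layer: affine map of the preceding layer's output, then a nonlinearity.
    return sigmoid(W @ x + b)

rng = np.random.default_rng(0)
x = rng.random(64)                                   # raw input (e.g. pixel values)

# Illustrative sizes: 64 -> 32 -> 16; each layer transforms the one below it.
W1, b1 = 0.1 * rng.standard_normal((32, 64)), np.zeros(32)
W2, b2 = 0.1 * rng.standard_normal((16, 32)), np.zeros(16)

h1 = layer(x, W1, b1)      # first, less abstract representation
h2 = layer(h1, W2, b2)     # second, more abstract representation
print(h1.shape, h2.shape)  # (32,) (16,)
```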
  
Application Example

Problem: Suppose we need to build a deep learning system to detect whether a given digital image contains a human face or not. The inputs are the image pixels and the output is binary.

•  We can think of the human face as being composed of a few key facial constituents such as ears, eyes, nose, etc. These in turn can be thought of as contours with well-defined edges, which are themselves constituted by specific patterns of pixels.
•  We can think of this as generating edges from the input pixels, generating the facial parts from the edges, and detecting a human face from those parts.
•  The role of a hidden layer in this system is to perform a nonlinear transform of its inputs (a lower level of abstraction) and produce a more abstract output (e.g. generating a nose object from the given contours).
•  Thus we progressively move up in abstraction, starting from raw pixels and ending up with a face object, as in the sketch below.
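The abstraction hierarchy can be written down as a forward pass. This is a conceptual sketch only: the layer sizes are hypothetical, the weights are random rather than learned, biases are omitted for brevity, and a real network learns its edge and part detectors instead of having them built in:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical sizes for a small 28x28 grayscale image.
n_pixels, n_edges, n_parts = 28 * 28, 200, 50

rng = np.random.default_rng(1)
pixels = rng.random(n_pixels)                          # flattened input image

W_edges = 0.01 * rng.standard_normal((n_edges, n_pixels))
W_parts = 0.01 * rng.standard_normal((n_parts, n_edges))
w_face  = 0.01 * rng.standard_normal(n_parts)

edges  = sigmoid(W_edges @ pixels)   # pixels -> edge-like features
parts  = sigmoid(W_parts @ edges)    # edges  -> facial-part features
p_face = sigmoid(w_face @ parts)     # parts  -> probability the image is a face
print(f"P(face) = {p_face:.3f}")
```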
  	
  
High level implementation steps

•  Suppose we implement the given application as a deep neural network as follows:
   –  The pixel values constitute the input layer.
   –  A single output unit constitutes the output layer.
   –  We will have 2 hidden layers.
•  We will use a stacked autoencoder as the basic building block.
   –  An autoencoder (AE) neural network learns to produce an output that is the same as its input, using unsupervised learning. Thus, given pixel values x as input, the goal of the AE is to produce an output image that is the same as the input (see the sketch below).
   –  As we have 2 hidden layers, we will require 2 AEs (say AE1 and AE2). We will create a bottleneck by having a smaller number of hidden units compared to the number of input units.
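A minimal NumPy sketch of such a bottleneck autoencoder, trained to reconstruct its input with a squared-error loss and plain gradient descent. All sizes, the sigmoid activation, and the hyperparameters are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
n_in, n_hidden = 64, 25          # bottleneck: fewer hidden units than inputs
X = rng.random((500, n_in))      # stand-in for flattened image data

# Encoder and decoder weights (untied here for simplicity).
W1 = 0.1 * rng.standard_normal((n_in, n_hidden)); b1 = np.zeros(n_hidden)
W2 = 0.1 * rng.standard_normal((n_hidden, n_in)); b2 = np.zeros(n_in)

lr = 0.5
for epoch in range(200):
    H = sigmoid(X @ W1 + b1)         # encode: compressed representation
    X_hat = sigmoid(H @ W2 + b2)     # decode: reconstruction of the input
    d_out = (X_hat - X) * X_hat * (1 - X_hat)   # backprop through output sigmoid
    d_hid = (d_out @ W2.T) * H * (1 - H)        # backprop through hidden sigmoid
    W2 -= lr * H.T @ d_out / len(X); b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * X.T @ d_hid / len(X); b1 -= lr * d_hid.mean(axis=0)

print("reconstruction MSE:", np.mean((sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) - X) ** 2))
```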
  
•  Layerwise pretraining (sketched below):
   –  Train AE1 with the available images (which may or may not contain a human face), unsupervised. The outputs of the hidden units of AE1 now constitute the "learnt" features at an abstraction higher than the input pixels (e.g. edges from pixels).
   –  Cascade the output of the hidden layer of the AE in the previous step into AE2, and train AE2 to learn more abstract features (e.g. facial components from edges).
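A sketch of the layerwise pretraining loop, under the same illustrative assumptions as the autoencoder sketch above: AE1 is trained on the raw pixels, then AE2 is trained on AE1's hidden activations:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.5, epochs=200, seed=0):
    """Train one bottleneck AE on X; return its encoder weights and hidden codes."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W1 = 0.1 * rng.standard_normal((n_in, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = 0.1 * rng.standard_normal((n_hidden, n_in)); b2 = np.zeros(n_in)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)
        X_hat = sigmoid(H @ W2 + b2)
        d_out = (X_hat - X) * X_hat * (1 - X_hat)
        d_hid = (d_out @ W2.T) * H * (1 - H)
        W2 -= lr * H.T @ d_out / len(X); b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * X.T @ d_hid / len(X); b1 -= lr * d_hid.mean(axis=0)
    return W1, b1, sigmoid(X @ W1 + b1)

rng = np.random.default_rng(3)
images = rng.random((500, 64))                 # stand-in for unlabeled images

# AE1 learns low-level features (e.g. edges) directly from the pixels ...
W1, b1, features1 = train_autoencoder(images, n_hidden=32, seed=1)
# ... then AE2 is trained on AE1's hidden codes to learn more abstract
# features (e.g. facial components from edges).
W2, b2, features2 = train_autoencoder(features1, n_hidden=16, seed=2)
print(features1.shape, features2.shape)        # (500, 32) (500, 16)
```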
  
•  Add a logistic regression layer as the output layer, and stack the 2 AEs and the output layer to constitute a Neural Network.
•  Fine-tune this network using backpropagation with a smaller number of labeled images (a sketch follows).
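Finally, a sketch of stacking the two pretrained encoders with a logistic output unit and fine-tuning the whole network by backpropagation on labeled data. In practice W1, b1, W2, b2 come from the pretraining sketch above; here they are random, and the labels are placeholders, only so the snippet runs standalone:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(4)
X = rng.random((100, 64))                  # small labeled set of flattened images
y = rng.integers(0, 2, 100)                # placeholder face / no-face labels

# Random stand-ins for the pretrained AE1 and AE2 encoder weights.
W1 = 0.1 * rng.standard_normal((64, 32)); b1 = np.zeros(32)
W2 = 0.1 * rng.standard_normal((32, 16)); b2 = np.zeros(16)
w3 = 0.1 * rng.standard_normal(16);       b3 = 0.0   # logistic regression output unit

lr = 0.5
for epoch in range(300):
    # Forward pass through the stacked network.
    H1 = sigmoid(X @ W1 + b1)
    H2 = sigmoid(H1 @ W2 + b2)
    p = sigmoid(H2 @ w3 + b3)              # P(face) for each image

    # Backpropagate the cross-entropy loss through all layers.
    d3 = (p - y) / len(X)                  # gradient at the logistic output
    d2 = np.outer(d3, w3) * H2 * (1 - H2)
    d1 = (d2 @ W2.T) * H1 * (1 - H1)

    w3 -= lr * H2.T @ d3; b3 -= lr * d3.sum()
    W2 -= lr * H1.T @ d2; b2 -= lr * d2.sum(axis=0)
    W1 -= lr * X.T @ d1;  b1 -= lr * d1.sum(axis=0)

print("training accuracy:", np.mean((p > 0.5) == y))
```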
  
