Machine Learning and its Applications
Presented by
Ankita Tiwari
What is Machine Learning?
“Learning is a process by which a system improves
its performance from experience.”
– Herbert Simon
Definition by Tom Mitchell (1998):
Machine Learning is the study of algorithms that
• improve their performance P
• at some task T
• with experience E
A well-defined learning task is given by <P, T, E>.
When Do We Use Machine Learning?
ML is used when:
• Human expertise does not exist (e.g., navigation on Mars)
• Humans cannot explain their expertise (speech recognition)
• Models must be customized (personalized medicine)
• Models are based on huge amounts of data (genomics)
Learning isn't always useful:
• There is no need to "learn" to calculate payroll
1. What exactly is deep learning?
2. Why is it generally better than other methods on
image, speech, and certain other types of data?
The short answers
• ‘Deep Learning’ means using a neural network with
several layers of nodes between input and output.
• The combination of layers between input and output
performs feature identification and processing in
stages, just as our brains seem to.
Types of Learning
Unsupervised Learning
Unsupervised learning is the training of a
machine on information that is neither
classified nor labeled, allowing the
algorithm to act on that information
without guidance. Here the task of the
machine is to group unsorted information
according to similarities, patterns, and
differences, without any prior training on the
data. Unlike supervised learning, no
teacher is provided, which means the
machine receives no training signal.
The machine must therefore find the
hidden structure in the unlabeled
data by itself.
Supervised Learning
In supervised learning a supervisor acts
as a teacher: we teach or train the machine
using data that is well labeled, meaning
some data is already tagged with the
correct answer. The machine is then
provided with a new set of
examples (data), so that the supervised
learning algorithm can analyse the training
data (the set of training examples) and
produce a correct outcome from the labeled
data.
Unsupervised Learning
Unsupervised learning is classified into two categories of algorithms:
• Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such
as grouping customers by purchasing behavior.
• Association: An association rule learning problem is where you want to discover rules that describe
large portions of your data, such as “people who buy X also tend to buy Y”.
Types of Unsupervised Learning:
Clustering
• Exclusive (partitioning)
• Agglomerative
• Overlapping
• Probabilistic
Common unsupervised algorithms:
• Hierarchical clustering
• K-means clustering
• Principal Component Analysis (dimensionality reduction)
• Singular Value Decomposition (dimensionality reduction)
• Independent Component Analysis (dimensionality reduction)
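As a concrete illustration of clustering, the sketch below runs a minimal k-means on a handful of made-up 2-D points. The data, the choice of k = 2, and the first-k initialisation are all illustrative assumptions, not anything taken from the slides:

```python
# Minimal k-means sketch (pure Python, 2-D points only).

def kmeans(points, k, iters=20):
    # Deterministic initialisation: use the first k points as centroids.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[d.index(min(d))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cl in enumerate(clusters):
            if cl:
                centroids[i] = [sum(coord) / len(cl) for coord in zip(*cl)]
    return centroids, clusters

# Two obvious groups of customers, e.g. (visits, spend) -- made-up data.
data = [(1, 2), (1.5, 1.8), (1.2, 2.1), (8, 8), (8.5, 7.8), (7.9, 8.2)]
cents, groups = kmeans(data, k=2)
print(sorted(len(g) for g in groups))  # → [3, 3]
```

A library implementation (e.g. scikit-learn's KMeans) would add random restarts and convergence checks; the two alternating steps are the same.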
Supervised Learning
Supervised learning is classified into two categories of algorithms:
• Classification: A classification problem is when the output variable is a category, such
as “red” or “blue”, or “disease” and “no disease”.
• Regression: A regression problem is when the output variable is a real value, such as
“dollars” or “weight”.
Supervised learning deals with or learns with “labeled” data. This implies that some
data is already tagged with the correct answer.
Types:
• Regression
• Logistic Regression
• Classification
• Naive Bayes Classifiers
• K-NN (k nearest neighbors)
• Decision Trees
• Support Vector Machine
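A minimal supervised example: the k-NN classifier below labels a query point by majority vote among its k nearest labelled neighbours. The training points and the "red"/"blue" labels are illustrative assumptions:

```python
# Minimal k-NN classification sketch (pure Python).

def knn_predict(train, query, k=3):
    qx, qy = query
    # Sort training points by squared distance to the query, keep the k nearest.
    nearest = sorted(train,
                     key=lambda t: (t[0][0] - qx) ** 2 + (t[0][1] - qy) ** 2)[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)   # majority vote

train = [((1, 1), "blue"), ((1, 2), "blue"), ((2, 1), "blue"),
         ((8, 8), "red"), ((8, 9), "red"), ((9, 8), "red")]
print(knn_predict(train, (1.5, 1.5)))  # → blue
print(knn_predict(train, (8.5, 8.5)))  # → red
```

Because the data is labelled, the "correct answer" for a new point can be read straight off its neighbours; this is the sense in which supervised learning learns from tagged data.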
Artificial Neural Network (ANN)
• It is a type of supervised learning.
• An ANN is an information-processing paradigm
inspired by the brain.
• It learns by example. It is configured for a specific
application, such as pattern recognition or data
classification, through a learning process.
• Learning largely involves adjustments to the
synaptic connections that exist between the
neurons.
Analogy Between Biological and ANN
[Figure: a biological neuron alongside an artificial neuron; the artificial
neuron applies weights (e.g. w1 = 1.4, w2 = -2.5, w3 = -0.06) to its inputs
and passes the weighted sum through an activation function f(x).]
Working of Perceptron
[Figure: a perceptron with inputs 2.7, -8.6, and 0.002, weights -0.06, -2.5,
and 1.4, and a sigmoid activation function f(x).]
Weighted sum: x = (-0.06)(2.7) + (-2.5)(-8.6) + (1.4)(0.002) = 21.34
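The weighted sum can be checked directly. The sketch below pairs the weights with the inputs as in the slide's own equation and then applies a sigmoid activation:

```python
import math

# Reproducing the slide's perceptron: weights (-0.06, -2.5, 1.4) paired
# with inputs (2.7, -8.6, 0.002), followed by a sigmoid activation.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

weights = [-0.06, -2.5, 1.4]
inputs  = [2.7, -8.6, 0.002]

x = sum(w * i for w, i in zip(weights, inputs))  # weighted sum of inputs
print(round(x, 2))           # → 21.34
print(round(sigmoid(x), 4))  # → 1.0 (sigmoid saturates for large sums)
```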
A dataset
Fields            class
1.4  2.7  1.9     0
3.8  3.4  3.2     0
6.4  2.8  1.7     1
4.1  0.1  0.2     0
etc …
Training and weights update
Training the neural network proceeds by repeatedly presenting rows of the
training data:
1. Initialise with random weights.
2. Present a training pattern, e.g. (1.4, 2.7, 1.9).
3. Feed it through to get an output, e.g. 0.8.
4. Compare with the target output (0): error = 0.8.
5. Adjust the weights based on the error.
6. Present the next pattern, e.g. (6.4, 2.8, 1.7): output 0.9, target 1,
error = -0.1. Adjust the weights again, and so on …
• Repeat the process many times, maybe millions of
times, each time taking a random training instance and
making slight weight adjustments
• Algorithms for weight adjustment are designed to make
changes that will reduce the error
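The present/compare/adjust loop can be sketched as a single sigmoid unit trained with the delta rule on the four rows of the dataset above. The fixed initial weights and the learning rate are illustrative assumptions (a real run would initialise the weights randomly):

```python
import math

# One sigmoid unit trained with the delta rule on the slide's dataset.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

data = [([1.4, 2.7, 1.9], 0),
        ([3.8, 3.4, 3.2], 0),
        ([6.4, 2.8, 1.7], 1),
        ([4.1, 0.1, 0.2], 0)]

w = [0.1, -0.2, 0.05]   # stand-in for "initialise with random weights"
b = 0.0
lr = 0.2

def output(x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def total_error():
    return sum((target - output(x)) ** 2 for x, target in data)

before = total_error()
for epoch in range(500):
    for x, target in data:             # present a training pattern
        out = output(x)                # feed it through to get output
        err = target - out             # compare with target output
        delta = err * out * (1 - out)  # sigmoid slope scales the update
        w = [wi + lr * delta * xi for wi, xi in zip(w, x)]  # adjust weights
        b += lr * delta
after = total_error()
print(after < before)   # → True: the tiny adjustments reduce the error
```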
The decision boundary perspective
• Start from initial random weights.
• Each time a training instance is presented and the weights are adjusted,
the decision boundary shifts slightly.
• Eventually the boundary settles into a position that separates the classes.
Summary
• In general, the weight-learning algorithms of ANNs are simple,
garbage-in-garbage-out procedures.
• They work by making thousands and thousands of tiny adjustments,
each making the network do better on the most recent pattern, but
perhaps a little worse on many others.
• Remarkably, this eventually tends to be good enough to learn
effective classifiers for many real-life applications.
Feature detectors
Hand-written data set
What is this unit doing?
Processing of hidden layer
The hidden layer makes a neuron's connections strong or weak
based on its inputs: each hidden unit has one weight per input
pixel (pixels 1 … 63), with strong +ve weights on some pixels and
low/zero weights elsewhere. This turns each unit into a feature detector:
• A unit with strong +ve weights along the top row of pixels will send a
strong signal for a horizontal line in the top row, ignoring everywhere else.
• A unit with strong +ve weights in the top-left pixels will send a strong
signal for a dark area in the top-left corner.
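A feature-detecting hidden unit can be sketched directly: give it strong positive weights on the top row of a small image and zero weights elsewhere. The 8×8 image size and the weight values are illustrative assumptions:

```python
# A hidden unit with strong +ve weight on the top row of an 8x8 image
# and low/zero weight everywhere else: a horizontal-line detector.

SIZE = 8
weights = [1.0] * SIZE + [0.0] * (SIZE * SIZE - SIZE)

def unit_activation(image):
    # image: flat list of SIZE*SIZE pixel intensities in [0, 1]
    return sum(w * p for w, p in zip(weights, image))

# A horizontal line across the top row of the image.
top_line = [1.0] * SIZE + [0.0] * (SIZE * SIZE - SIZE)
# A vertical line down the left column (only its corner pixel overlaps).
side_line = [1.0 if i % SIZE == 0 else 0.0 for i in range(SIZE * SIZE)]

print(unit_activation(top_line))   # → 8.0 strong signal
print(unit_activation(side_line))  # → 1.0 weak signal (corner pixel only)
```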
So: multiple layers make sense
• Your brain works that way.
• Multi-layer neural network architectures can learn the true underlying features and
‘feature logic’, and therefore generalise very well.
But, until very recently, our weight-learning
algorithms simply did not work on multi-layer
architectures.
New methods have therefore been proposed to
train multi-layer NNs.
The new way to train multi-layer NN
Train one layer at a time: train the first layer first,
then the next layer, and so on through the stack,
and finally the output layer.
EACH of the (non-output) layers is
trained to be an auto-encoder.
Basically, each layer is forced to learn good
features that describe what comes from
the previous layer.
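One auto-encoder layer can be sketched as a tiny 3-2-3 network trained to reproduce its own input, so the 2-unit hidden layer must learn a compressed description of the data; in greedy layer-wise training, the hidden codes it produces would become the inputs for the next layer. Network sizes, data, and learning rate are illustrative assumptions; the seed just makes the sketch repeatable:

```python
import math, random

# A 3-2-3 auto-encoder trained by gradient descent on squared
# reconstruction error (sigmoid encoder, linear decoder).

random.seed(0)
data = [[0.9, 0.8, 0.1], [1.0, 0.9, 0.0], [0.1, 0.2, 0.9], [0.0, 0.1, 1.0]]

n_in, n_hid = 3, 2
W1 = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
W2 = [[random.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_in)]

def encode(x):    # hidden code: sigmoid(W1 · x)
    return [1 / (1 + math.exp(-sum(w * xi for w, xi in zip(row, x))))
            for row in W1]

def decode(h):    # linear reconstruction: W2 · h
    return [sum(w * hi for w, hi in zip(row, h)) for row in W2]

def recon_error():
    return sum(sum((xi - ri) ** 2 for xi, ri in zip(x, decode(encode(x))))
               for x in data)

before = recon_error()
lr = 0.1
for _ in range(300):
    for x in data:
        h = encode(x)
        r = decode(h)
        d_out = [ri - xi for ri, xi in zip(r, x)]       # reconstruction error
        # Backpropagate into the hidden layer (using W2 before updating it).
        d_hid = [sum(d_out[i] * W2[i][j] for i in range(n_in)) * h[j] * (1 - h[j])
                 for j in range(n_hid)]
        for i in range(n_in):                           # update decoder weights
            for j in range(n_hid):
                W2[i][j] -= lr * d_out[i] * h[j]
        for j in range(n_hid):                          # update encoder weights
            for k in range(n_in):
                W1[j][k] -= lr * d_hid[j] * x[k]
after = recon_error()
print(after < before)   # → True: the layer learns to reconstruct its input
```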
Conclusion
• There are many types of deep learning models:
different kinds of autoencoder, variations on
architectures and training algorithms, etc.
• It is a very fast-growing area.
Applications
Autonomous Cars
Autonomous Car
Sensors
Autonomous Car
Technology
Deep Learning in the Headlines
Scene Labeling via Deep Learning
Machine Learning in
Automatic Speech Recognition
A Typical Speech Recognition System
ML is used to predict phone states from the sound spectrogram.
Deep learning has achieved state-of-the-art results:
Zeiler et al. “On rectified linear units for speech
recognition” ICASSP 2013
Impact of Deep Learning in Speech Technology
Continuous Observation
Continuous transfer of data for observation and monitoring
costs delay, power, and area. This motivates the requirement
for an artificially intelligent device: one that analyses and
computes at the same level as the human brain.
Candidate architectures: ANN, CNN, SNN.
ANN / CNN:
1) Large computation
2) Power hungry
3) Large area
Alternative, the SNN (spiking neural network):
1) Highly parallel computation
2) Power efficient
3) Event-driven
4) More brain-inspired
The SNN is the most remarkable (most brain-like) neural network.
21-02-2023
Challenges
Proposed solutions
• Reduce Data Movement
• Maximize Data Reuse within PE
• Maximize Data Reuse with Buffer
• Multicast Network Design
• Exploit Data Statistics
• Data Compression
• Data gating / zero skipping
• Operations exhibit high parallelism
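Data gating / zero skipping can be illustrated in a few lines: when an activation is zero, the multiply-accumulate is skipped entirely, and the count of skipped operations stands in for the energy saving. The weights and sparse activations below are made-up values:

```python
# Sketch of zero skipping in a dot product: gate off multiplies whose
# activation operand is zero and count how many MACs actually run.

def dot_with_skipping(weights, activations):
    acc, macs = 0.0, 0
    for w, a in zip(weights, activations):
        if a == 0.0:        # gate: no multiply issued for a zero activation
            continue
        acc += w * a
        macs += 1
    return acc, macs

w = [0.5, -1.0, 2.0, 0.25]
a = [0.0, 3.0, 0.0, 4.0]    # ReLU-style sparse activations (illustrative)
value, macs_done = dot_with_skipping(w, a)
print(value, macs_done)     # → -2.0 2 (half the multiplies were skipped)
```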
Proposed work
Thank you