Introduction to Data Science
Frank Kienle
Machine Learning
Machine Learning vs. Statistical Modeling

Artificial intelligence is …
the term "artificial intelligence" is applied when a machine mimics "cognitive" functions that humans associate with other human minds, such as "learning" and "problem solving".

Machine Learning is …
an algorithm that can learn from data without relying on rules-based programming.

Statistical Modeling is …
the formalization of relationships between variables in the form of mathematical equations.
Machine Learning

A computer program is said to learn from experience (E) with respect to some class of tasks (T) and a performance measure (P) if its performance at tasks in T, as measured by P, improves with E.

Learning = improving with experience at some task
•  Improve over task T
•  With respect to performance measure P
•  Based on experience E

Example, spam filtering: spam is all email the user does not want to receive and has not asked to receive.
•  T: identify spam emails
•  P: % of spam emails that were filtered out - % of ham (non-spam) emails that were incorrectly filtered out
•  E: a database of emails that were labelled by users
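As a minimal sketch of the performance measure P above (the label convention 1 = spam, 0 = ham and the toy data are assumptions for illustration, not from the slides):

```python
def spam_filter_performance(y_true, y_pred):
    """P = % of spam emails filtered out minus % of ham emails wrongly filtered out.

    y_true, y_pred: sequences of labels, 1 = spam, 0 = ham.
    """
    spam_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    ham_preds = [p for t, p in zip(y_true, y_pred) if t == 0]
    spam_caught = sum(spam_preds) / len(spam_preds)   # fraction of spam filtered
    ham_filtered = sum(ham_preds) / len(ham_preds)    # fraction of ham wrongly filtered
    return spam_caught - ham_filtered

# E: a small labelled "database" of emails (toy data)
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0]
print(spam_filter_performance(y_true, y_pred))  # 2/3 - 1/3 ≈ 0.33
```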
Examples of Machine Learning

Optical character recognition:
•  categorize images of handwritten characters by the letters represented
Face detection:
•  find faces in images (or indicate if a face is present)
Customer segmentation:
•  predict, for instance, which customers will respond to a particular promotion
Fraud detection:
•  identify credit card transactions (for instance) which may be fraudulent in nature
Demand prediction:
•  predict demand for individual products
Batch Processing vs Stream Processing

Batch processing:
Most machine learning algorithms assume that we are mining a database; that is, all our data is available when and if we want it.

Stream processing (e.g. for machinery sensors):
Data arrives in one or more streams, and if it is not processed immediately or stored, it is lost forever.

Both can be embedded in fault-tolerant architectures: see, for example, the Lambda architecture (http://lambda-architecture.net) or the Kappa architecture (kappa-architecture.com) for further discussion (covered in a separate lecture).
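A minimal sketch of the difference (the running-mean task and sensor values are hypothetical): a batch job sees the whole dataset at once, while a stream processor must update a small state one record at a time, without storing the records:

```python
# Batch: all data is available when we want it.
data = [3.1, 2.7, 5.0, 4.2]
batch_mean = sum(data) / len(data)

# Stream: records arrive one by one; keep only a small running state.
count, total = 0, 0.0

def on_sensor_reading(value):
    global count, total
    count += 1
    total += value
    return total / count  # running mean so far

for reading in data:              # stand-in for a live sensor feed
    stream_mean = on_sensor_reading(reading)

assert abs(batch_mean - stream_mean) < 1e-9
```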
Machine Learning Overview

•  Supervised learning: Regression, Classification
•  Unsupervised learning: Clustering, Dimension Reduction

What is the difference between supervised and unsupervised learning?
What is the difference between a regression problem and a classification problem?
Machine Learning Algorithms (small excerpt)

Unsupervised
•  Clustering & Dimensionality Reduction
   •  SVD
   •  PCA
   •  K-means
•  Association Analysis
   •  Apriori
   •  FP-Growth
•  Hidden Markov Model

Supervised
•  Regression (continuous target)
   •  Linear
   •  Polynomial
   •  Decision Trees
   •  Random Forests
•  Classification (categorical target)
   •  KNN
   •  Trees
   •  Logistic Regression
   •  Naïve Bayes
   •  SVM
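As a small illustration of the two branches (the toy data is an assumption, not from the slides), scikit-learn exposes algorithms from both columns behind the same fit/predict interface:

```python
import numpy as np
from sklearn.cluster import KMeans                    # unsupervised
from sklearn.linear_model import LogisticRegression   # supervised

X = np.array([[1.0, 2.0], [1.2, 1.8], [8.0, 9.0], [8.2, 9.1]])
y = np.array([0, 0, 1, 1])   # labels, used only by the supervised model

# Unsupervised: K-means groups the rows without ever seeing y.
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)

# Supervised: logistic regression learns the mapping X -> y.
clf = LogisticRegression().fit(X, y)
print(clusters, clf.predict([[7.9, 9.0]]))
```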
Machine Learning

It is all about the assumptions of the underlying model.
Input to output example

input: x
output: y

What is the best relation (function) between x and y that can be used to map new examples of x to an inferred output y?
Input to output example

[Diagram: input x → model hypothesis h(x) → output y]

By making an initial hypothesis about the model structure h(x), we can infer the model parameters w.

The process of inferring the model parameters is denoted as learning in the following.
Input to output example

[Diagram: input x → model hypothesis h(x) → output ŷ]

By applying the model to a new input variable we obtain a new estimate ŷ.

The process of inferring the model parameters is denoted as learning in the following. Applying the learned model to new input data leads to an inferred result ŷ; this process is denoted as prediction. The terms inference and prediction are used as synonyms in the following.
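A minimal sketch of learning and prediction for a linear hypothesis h(x) = w0 + w1·x (the training pairs are toy data assumed for illustration):

```python
import numpy as np

# Training data: paired examples (x, y).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 8.1])

# Learning: infer the parameters w of the hypothesis h(x) = w0 + w1*x
# by least squares over the training examples.
A = np.column_stack([np.ones_like(x), x])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# Prediction: apply the learned model to a new input.
x_new = 5.0
y_hat = w[0] + w[1] * x_new
print(w, y_hat)
```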
Supervised learning

Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples.

How can we derive the 'best' model parameters? Choose the model parameters so that all training samples x result in h(x) close to y; y supervises the learning process.
Supervised learning: cost function and MSE

The mean squared error (MSE) is the average of the squares of the errors or deviations.

Cost function:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$$

Finding the parameters w which minimize this cost function results in the estimator with the smallest possible MSE.
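The cost function translates directly into code (the observed targets and estimates here are toy values assumed for illustration):

```python
import numpy as np

y = np.array([3.0, 5.0, 5.0, 100.0])      # observed targets
y_hat = np.array([4.0, 6.0, 7.0, 90.0])   # model estimates

mse = np.mean((y_hat - y) ** 2)            # (1/n) * sum of squared errors
print(mse)
```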
Typical regression scenario with more input variables

x0    x1 (Rpm)    x2 (Gas)    x3 (Valve)    x4 (Temp)    y (Watt)
1     500         5.8         5             200          3
1     900         4.5         9             400          5
1     2500        13          15            400          5
1     3000        95          90            400          100

$$X = \begin{pmatrix} 1 & 500 & 5.8 & 5 & 200 \\ 1 & 900 & 4.5 & 9 & 400 \\ 1 & 2500 & 13 & 15 & 400 \\ 1 & 3000 & 95 & 90 & 400 \end{pmatrix}, \qquad y = \begin{pmatrix} 3 \\ 5 \\ 5 \\ 100 \end{pmatrix}$$
Typical classification scenario with more input variables

x0    x1 (Rpm)    x2 (Gas)    x3 (Valve)    x4 (Temp)    x5 (Watt)    y (Status)
1     500         5.8         5             200          3            0
1     900         4.5         9             400          5            0
1     2500        13          15            400          5            0
1     3000        95          90            400          100          1

$$X = \begin{pmatrix} 1 & 500 & 5.8 & 5 & 200 & 3 \\ 1 & 900 & 4.5 & 9 & 400 & 5 \\ 1 & 2500 & 13 & 15 & 400 & 5 \\ 1 & 3000 & 95 & 90 & 400 & 100 \end{pmatrix}$$
Supervised Learning: terminology

m: training samples (rows)
n: features (columns)
X: design matrix, feature matrix
y: target vector (sometimes denoted t)
$$\lVert y - h(X, w) \rVert^2 = \left\lVert \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix} - \begin{pmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,n} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m,1} & x_{m,2} & \cdots & x_{m,n} \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix} \right\rVert^2 = \sum_{i=1}^{m} \Bigl( y_i - \sum_{j=1}^{n} x_{i,j} w_j \Bigr)^2 .$$
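A minimal sketch tying this together, using the design matrix and target vector from the regression table above (np.linalg.lstsq is one of several ways to minimize this cost; with only 4 samples it returns the minimum-norm solution):

```python
import numpy as np

# Design matrix X (m = 4 samples, n = 5 features incl. the constant column)
# and target vector y, taken from the regression table above.
X = np.array([
    [1, 500, 5.8, 5, 200],
    [1, 900, 4.5, 9, 400],
    [1, 2500, 13, 15, 400],
    [1, 3000, 95, 90, 400],
])
y = np.array([3, 5, 5, 100])

# Learning: choose w to minimize ||y - X w||^2.
w, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)

# Prediction on a new measurement (hypothetical sensor reading).
x_new = np.array([1, 1200, 6.0, 10, 400])
print(w, x_new @ w)
```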
Machine Learning

A computer program is said to learn from experience (E) with respect to some class of tasks (T) and a performance measure (P) if its performance at tasks in T, as measured by P, improves with E.

Learning = improving with experience at some task
•  Improve over task T → model the target
•  With respect to performance measure P → define the cost function
•  Based on experience E → by using historic data
Machine Learning (technical steps)

Training phase:
•  Data
•  Pre-processing: prepare cleaned/correct information and provide the correct data format
•  Learning: develop a new, or decide on an appropriate, mathematical model
•  Validation: control the quality and correctness of the model
•  Output: the (trained) model

Prediction phase:
•  Apply the (trained) model to new data to obtain a prediction
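Since the following slides reference scikit-learn, here is a hedged sketch of these steps with its API (the synthetic data, the StandardScaler pre-processing, and the linear model are assumptions for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler    # pre-processing
from sklearn.linear_model import LinearRegression   # learning

# Synthetic training data (stand-in for the prepared, cleaned dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Training phase: pre-processing + learning, validated on held-out data.
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_train, y_train)                          # learning
print("validation R^2:", model.score(X_val, y_val))  # validation

# Prediction phase: apply the trained model to new data.
X_new = rng.normal(size=(2, 3))
print(model.predict(X_new))
```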
[Figures omitted; source: scikit-learn]