Introduction to Machine Learning
Book
Course Assessment
Course total grade 150
Grading:
- Attendance 10
- Lab + Assignments 20
- Midterm Exam 30
- Final Exam 90
What is Machine Learning?
“Learning is any process by which a system improves
performance from experience.”
- Herbert Simon
Definition by Tom Mitchell (1998):
Machine Learning is the study of algorithms that
• improve their performance P
• at some task T
• with experience E.
A well-defined learning task is given by <P, T, E>.
3
Traditional Programming
Machine Learning
Computer
Data
Program
Output
Computer
Data
Output
Program
4
7
Some examples of tasks that are best solved
by using a learning algorithm
• Recognizing patterns:
– Facial identities or facial expressions
– Handwritten or spoken words
– Medical images
• Generating patterns:
– Generating images or motion sequences
• Recognizing anomalies:
– Unusual credit card transactions
– Unusual patterns of sensor readings in a nuclear power plant
• Prediction:
– Future stock prices or currency exchange rates
State of the Art Applications of
Machine Learning
11
Example (Spam Filter)
Traditional way of Programming
1. First you would look at what spam typically looks like. You might notice that
some words or phrases (such as “4U,” “credit card,” “free,” and “amazing”)
tend to come up a lot in the subject.
2. You would write a detection algorithm for each of the patterns that you
noticed, and your program would flag emails as spam if a number of these
patterns are detected.
3. You would test your program, and repeat steps 1 and 2 until it is good enough.
Example (Spam Filter)
Machine way of Programming
In contrast, a spam filter based on Machine Learning techniques automatically learns
which words and phrases are good predictors of spam by detecting unusually frequent
patterns of words in the spam examples compared to the ham examples.
The program is much shorter, easier to maintain, and most likely more accurate.
Example (Spam Filter)
Machine way of Programming (Automatically
Adapting to change (Classification))
If spammers notice that all their emails containing “4U” are blocked, they
might start writing “For U” instead. A spam filter using traditional programming
techniques would need to be updated to flag “For U” emails. If spammers keep
working around your spam filter, you will need to keep writing new rules forever.
In contrast, a spam filter based on Machine Learning techniques automatically notices
that “For U” has become unusually frequent in spam flagged by users, and it starts
flagging them without your intervention.
Example (Spam Filter)
A well-defined learning task is given by <P, T, E>.
T: Task is Flag spam for new emails
E: Experience is the training Data.
P: Performance for example would be the ration of correctly
classified emails.
Autonomous Cars
• Nevada made it legal for
autonomous cars to drive on
roads in June 2011
• As of 2013, four states (Nevada,
Florida, California, and
Michigan) have legalized
autonomous cars
Penn’s Autonomous Car 
(Ben Franklin Racing Team) 12
Autonomous Car Sensors
13
Machine Learning in
Automatic Speech Recognition
A Typical Speech Recognition System
ML used to predict of phone states from the sound spectrogram
Deep learning has state-of-the-art results
# Hidden Layers 1 2 4 8 10 12
Word Error Rate % 16.0 12.8 11.4 10.9 11.0 11.1
Baseline GMM performance = 15.4%
[Zeiler et al. “On rectified linear units for speech
recognition” ICASSP 2013]
Types of Learning
Types of Learning
• Supervised (inductive) learning
–Given: training data + desired outputs (labels)
• Unsupervised learning
–Given: training data (without desired outputs)
• Semi-supervised learning
–Given: training data + a few desired outputs
• Reinforcement learning
–Rewards from sequence of actions
Supervised Learning
The supervised learning is the ML model where the training
data you feed to the algorithm includes the desired
solutions, called labels is divided into:
 Regression.
 Classification
Supervised Learning: Regression
• Given (x1, y1), (x2, y2), ..., (xn, yn)
• Learn a function f (x) to predict y given x
– y is real-valued == regression
9
8
7
6
5
4
3
2
1
0
1970 1980 1990 2000 2010 2020
September
Arctic
Sea
Ice
Extent
(1,000,000
sq
km)
Supervised Learning
Classification example:
The spam filter is a good example of this: it is trained with many example emails along
with their class (spam or ham), and it must learn how to classify new emails.
A labeled training set for supervised learning (e.g., spam classification)
Supervised Learning: Classification
• Given (x1, y1), (x2, y2), ..., (xn, yn)
• Learn a function f (x) to predict y given x
– y is categorical == classification
Breast Cancer (Malignant / Benign)
1(Malignant)
0(Benign)
Tumor Size
Predict Benign Predict Malignant
Tumor Size
Supervised Learning
Tumor Size
Age
- Clump Thickness
- Uniformity of Cell Size
- Uniformity of Cell Shape
…
• x can be multi-dimensional
– Each dimension corresponds to an attribute
Unsupervised Learning
In unsupervised learning, as you might guess, the
training data is unlabeled.
The Model tries to learn without a teacher. So, the
Model find structure and pattern in the data on its own.
Given input
Output hidden
An unlabeled training set for unsupervised
learning
Unsupervised Learning
• Given x1, x2, ..., xn (without labels)
• Output hidden structure behind the x’s
– E.g., clustering
Unsupervised Learning
• Independent component analysis – separate a combined
signal into its original sources
Reinforcement Learning
• Given a sequence of states and actions with (delayed)
rewards, output a policy
– Policy is a mapping from states  actions that tells you what to
do in a given state
• Examples:
– Credit assignment problem
– Game playing
– Robot in a maze
– Balance a pole on your hand
Reinforcement Learning
Reinforcement Learning
Key components of Reinforcement learning:
Agent: The learner or decision- maker.
Environment: The space in which the agent operates.
State: A representation of the current situation or condition in which the
agent is.
Actions: What the agent can do or the choices it can make.
Reward: Feedback from the environment in response to the agent’s
actions.
Policy: A strategy used by the agent to determine its actions based on
the current state.
Reinforcement Learning
The Agent-Environment Interface
Agent and environment interact at discrete time steps
Agent observes state at step t: st S
: t  0, 1, 2, K
produces action at step t : at  A(st )
gets resulting reward :
and resulting next state :
rt1 
st1
. . . st at
rt +1 st +1
at +1
rt +2 st +2
at +2
rt +3 st +3
. . .
at +3
Framing a Learning Problem
Machine learning model maps from
features to prediction
Examples
•Classification
•Is this a dog or a cat?
•Is this email spam or not?
•Regression
•What will the stock price be tomorrow?
•What will be the high temperature tomorrow?
𝑓 𝑥
Features
→ 𝑦
Prediction
Learning has three stages
Training: optimize model parameters
Validation: intermediate evaluations to design/select model
Test: final performance evaluation
Training: the model is fit to data to
minimize a loss or maximize an objective
function
𝜃∗ = argmin 𝐿𝑜𝑠𝑠(𝑓 𝑋; 𝜃 , 𝑌)
𝜃
Model parameters
that minimize loss
Features of all training
examples
“Ground Truth” predictions of
all training examples
37.5 41.2 51.0 48.3 50.5
47.0 46.5 48.9 50.5 47.6
… …
67.0 64.7 63.0 61.4 60.2
𝑋 𝑌
1 row per
example
1 column per
feature
Example
Learn to predict next day’s
temperature given
preceding days’
temperatures
𝑖
𝑖
𝐿𝑜𝑠𝑠 𝑓 𝑋;𝜃 , 𝑌 = ∑ 𝑓 𝑋 ;𝜃 − 𝑦𝑖
2
Model: linear
𝑓(𝑋𝑖; 𝜃) = 𝐴𝑋𝑖 + 𝑏
Optimization via ordinary least
squares regression
Loss: sum squared error
Model design and “hyper parameter”
tuning is performed using a validation set
Select model
Set training parameters
Feature selection
Learning rate, regularization parameters, …
Sometimes, there are clear “train”, “val”, and “test” sets. Other
times, you need to split “train” into a train set for learning
parameters and a val set for checking model performance
linear regression
Testing: The effectiveness of the model
is evaluated on a held out test set
“Held out”: not used in training; ideally not viewed by developers, e.g. in
a private test server
Common performance measures
N 𝑖 𝑖 𝑖
1
Classificationerror: ∑ 𝑓 𝑋 ≠ 𝑦 (for classification model, 𝑦 is target/true label)
N 𝑖
Cross-entropy: − 1
∑ log 𝑓 𝑦 = 𝑦 |𝑋
𝑖 𝑖
𝑖
(for probabilistic model)
RMSE:
N
1
∑𝑖 𝑓 𝑋 − 𝑦
𝑖 𝑖
2 (regression measure)
𝑦𝑖 − 𝒚
̅
𝑅2: 1 − ∑𝑖 𝑓 𝑋𝑖 − 𝑦𝑖
2 / ∑𝑖
expectation/mean/avg)
2 (unitless regression measure; 𝒚
̅ is
In machine learning research, usually data is collected once and then
randomly sampled into train and test partitions
Train and test samples are “i.i.d.”, independent and identically distributed
In many real-world applications, the input to the model in deployment comes from a
different distribution than training
Various Function Representations
• Numerical functions
– Linear regression
– Neural networks
– Support vector machines
• Symbolic functions
– Decision trees
• Instance-based functions
– Nearest-neighbor
• Probabilistic Graphical Models
– Naïve Bayes
– Bayesian networks
Evaluation
• Accuracy
• Precision and recall
• Squared error
• Entropy
• etc.
ML in Practice
• Understand domain, prior knowledge, and goals
• Data integration, selection, cleaning, pre-processing, etc.
• Learn models
• Interpret results
• Consolidate and deploy discovered knowledge
Loop
Complete example to understand
steps of Machine Learning
Suppose you want to know if money makes people happy or not
Step1 :download
the Better Life Index data from the OECD’s website
OECD Data Explorer - Archive • Better Life Index
as well as stats about GDP( Gross Domestic Product) per capita ‫الناتج‬
‫للفرد‬ ‫االجمالى‬ ‫المحلى‬
)
from the IMF’s website.
World Economic Outlook Databases (imf.org)
Step 2: join the tables and sort by GDP per capita.
Table shows an excerpt of what you get.
Complete example to understand
steps of Machine Learning
It is observed that the graph tends to be linear means as the GDP per capita
increase the Life Satisfaction increase also, there are some noisy Data. So we will
chose Linear model.
This step is called model selection: you selected a linear model of life satisfaction
with just one attribute, GDP per capita.
Complete example to understand
steps of Machine Learning
linear model equation.
Second step determine model parameters which are chose best values to make
line covers all scattered points. (This step is the training step)
Complete example to understand
steps of Machine Learning
How can you know which values of the parameter values θ0 and θ1. will make
your model perform best ?
 You need to specify a performance measure (utility function (or fitness
function) that measures how good your model is.
 Or you can define a cost function that measures how bad it is (error).
For linear regression problems, people typically use a cost function that
measures the distance between the linear model’s predictions and the training
examples; the objective is to minimize this distance.
Complete example to understand
steps of Machine Learning
it finds the parameters that make the linear model fit best to your data with least
MSE for example.
This is called training the model. In our case the algorithm finds that the optimal
parameter values are θ0 = 4.85 and θ1 = 4.91 × 10–5.
Complete example to understand
steps of Machine Learning
After training the model we can use now.
For example to know how happy Cypriots are, and the OECD data does not have
the answer.
Fortunately, you can use your model to make a good prediction: you look up
Cyprus’s GDP per capita, find $22,587, and then apply your model and find that
life satisfaction.
Is likely to be somewhere around 4.85 + 22,587 × 4.91 × 10-5 = 5.96.
Complete example to understand
steps of Machine Learning
Python code that loads the data, prepares it, creates a scatterplot for visualization, and
then trains a linear model and makes a prediction.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import sklearn.linear_model
# Load the data
oecd_bli = pd.read_csv("oecd_bli_2015.csv", thousands=',')
gdp_per_capita = pd.read_csv("gdp_per_capita.csv",thousands=',',delimiter='t',
encoding='latin1', na_values="n/a")
# Prepare the data
country_stats = prepare_country_stats(oecd_bli, gdp_per_capita)
X = np.c_[country_stats["GDP per capita"]]
y = np.c_[country_stats["Life satisfaction"]]
# Visualize the data
country_stats.plot(kind='scatter', x="GDP per capita", y='Life satisfaction')
plt.show()
# Select a linear model
model = sklearn.linear_model.LinearRegression()
# Train the model
model.fit(X, y)
# Make a prediction for Cyprus
X_new = [[22587]] # Cyprus' GDP per capita
print(model.predict(X_new)) # outputs [[ 5.96242338]]
Complete example to understand
steps of Machine Learning
Replacing the Linear Regression model with k-Nearest Neighbors
regression in the previous code is as simple as replacing these two
lines:
import sklearn.linear_model
model = sklearn.linear_model.LinearRegression()
with these two:
import sklearn.neighbors
model = sklearn.neighbors.KNeighborsRegressor(n_neighbors=3)
Complete example to understand
steps of Machine Learning
If all went well, your model will make good predictions.
If not, you may need to use:
 more attributes (employment rate, health, air pollution, etc.),
 get more or better quality training data,
 or perhaps select a more powerful model (e.g., a Polynomial Regression model).
In summary:
 You studied the data.
 You selected a model.
 You trained it on the training data (i.e., the learning algorithm searched for the
model parameter values that minimize a cost function).
 Finally, you applied the model to make predictions on new cases (this is called
inference), hoping that this model will generalize well.
Another Complete example to understand
steps of Machine Learning (Heart attack)
Problem:
A heart attack may happens, when the flow of blood to a part of the
heart muscle gets blocked.
Solution:
We want to use ML model to predict if this person may happens to
him heart attack or not.
So , we are going to follow the following steps (workflow):
Step 1: Data collection and features.
Step 2 : Data preprocessing.
Step 3: Feature extraction.
Step 4: Model prediction.
Step 5: Model evaluation.
Step 6: Deployment and monitoring.
Another Complete example to
understand steps of Machine
Learning (Heart attack)
Step one: Data collection and Features
Which may be
 Age
 Gender
 Blood pressure
 Cholesterol levels
 and others
Another Complete example to
understand steps of Machine Learning
(Heart attack)
Step two: Data Preprocessing
Which includes
Cleaning and normalizing the collected data.
May include dealing with missing values.
May include dealing with categorical data to be encoded
appropriately to a numerical value the PC can deal with.
Another Complete example to
understand steps of Machine Learning
(Heart attack)
Step three: Feature selection
Which includes
Identifying the most predictive features, which will
help in predicting the heart attack.
Which means exclude the irrelevant features.
Another Complete example to understand
steps of Machine Learning (Heart attack)
Step four: Model training
Which includes
The data must be divided into training and test.
For, example 80% training and 20% testing.
The processed data will be used to train the ML model.
From common ML models are:
 Logistic regression
 Decision Tree
 Random forest
 Gradient Boosting models
Another Complete example to understand
steps of Machine Learning (Heart attack)
Step five: Model Evaluation
After training the model, we need to evaluate its performance, so
we used different evaluation metrics for classification:
 Accuracy
 Precision
 Recall
 AUC-ROC curve
Note: other metrics will be used for Regression.
Lessons Learned about Learning
• Learning can be viewed as using direct or indirect experience to
approximate a chosen target function.
• Function approximation can be viewed as a search through a space
of hypotheses (representations of functions) for one that best fits a
set of training data.
• Different learning methods assume different hypothesis spaces
(representation languages) and/or employ different search techniques.

01_introduction_Machine learning course.pdf

  • 1.
  • 2.
  • 3.
    Course Assessment Course totalgrade 150 Grading: - Attendance 10 - Lab + Assignments 20 - Midterm Exam 30 - Final Exam 90
  • 6.
    What is MachineLearning? “Learning is any process by which a system improves performance from experience.” - Herbert Simon Definition by Tom Mitchell (1998): Machine Learning is the study of algorithms that • improve their performance P • at some task T • with experience E. A well-defined learning task is given by <P, T, E>. 3
  • 7.
  • 8.
    7 Some examples oftasks that are best solved by using a learning algorithm • Recognizing patterns: – Facial identities or facial expressions – Handwritten or spoken words – Medical images • Generating patterns: – Generating images or motion sequences • Recognizing anomalies: – Unusual credit card transactions – Unusual patterns of sensor readings in a nuclear power plant • Prediction: – Future stock prices or currency exchange rates
  • 9.
    State of theArt Applications of Machine Learning 11
  • 10.
    Example (Spam Filter) Traditionalway of Programming 1. First you would look at what spam typically looks like. You might notice that some words or phrases (such as “4U,” “credit card,” “free,” and “amazing”) tend to come up a lot in the subject. 2. You would write a detection algorithm for each of the patterns that you noticed, and your program would flag emails as spam if a number of these patterns are detected. 3. You would test your program, and repeat steps 1 and 2 until it is good enough.
  • 11.
    Example (Spam Filter) Machineway of Programming In contrast, a spam filter based on Machine Learning techniques automatically learns which words and phrases are good predictors of spam by detecting unusually frequent patterns of words in the spam examples compared to the ham examples. The program is much shorter, easier to maintain, and most likely more accurate.
  • 12.
    Example (Spam Filter) Machineway of Programming (Automatically Adapting to change (Classification)) If spammers notice that all their emails containing “4U” are blocked, they might start writing “For U” instead. A spam filter using traditional programming techniques would need to be updated to flag “For U” emails. If spammers keep working around your spam filter, you will need to keep writing new rules forever. In contrast, a spam filter based on Machine Learning techniques automatically notices that “For U” has become unusually frequent in spam flagged by users, and it starts flagging them without your intervention.
  • 13.
    Example (Spam Filter) Awell-defined learning task is given by <P, T, E>. T: Task is Flag spam for new emails E: Experience is the training Data. P: Performance for example would be the ration of correctly classified emails.
  • 14.
    Autonomous Cars • Nevadamade it legal for autonomous cars to drive on roads in June 2011 • As of 2013, four states (Nevada, Florida, California, and Michigan) have legalized autonomous cars Penn’s Autonomous Car  (Ben Franklin Racing Team) 12
  • 15.
  • 16.
    Machine Learning in AutomaticSpeech Recognition A Typical Speech Recognition System ML used to predict of phone states from the sound spectrogram Deep learning has state-of-the-art results # Hidden Layers 1 2 4 8 10 12 Word Error Rate % 16.0 12.8 11.4 10.9 11.0 11.1 Baseline GMM performance = 15.4% [Zeiler et al. “On rectified linear units for speech recognition” ICASSP 2013]
  • 17.
  • 18.
    Types of Learning •Supervised (inductive) learning –Given: training data + desired outputs (labels) • Unsupervised learning –Given: training data (without desired outputs) • Semi-supervised learning –Given: training data + a few desired outputs • Reinforcement learning –Rewards from sequence of actions
  • 19.
    Supervised Learning The supervisedlearning is the ML model where the training data you feed to the algorithm includes the desired solutions, called labels is divided into:  Regression.  Classification
  • 20.
    Supervised Learning: Regression •Given (x1, y1), (x2, y2), ..., (xn, yn) • Learn a function f (x) to predict y given x – y is real-valued == regression 9 8 7 6 5 4 3 2 1 0 1970 1980 1990 2000 2010 2020 September Arctic Sea Ice Extent (1,000,000 sq km)
  • 21.
    Supervised Learning Classification example: Thespam filter is a good example of this: it is trained with many example emails along with their class (spam or ham), and it must learn how to classify new emails. A labeled training set for supervised learning (e.g., spam classification)
  • 22.
    Supervised Learning: Classification •Given (x1, y1), (x2, y2), ..., (xn, yn) • Learn a function f (x) to predict y given x – y is categorical == classification Breast Cancer (Malignant / Benign) 1(Malignant) 0(Benign) Tumor Size Predict Benign Predict Malignant Tumor Size
  • 23.
    Supervised Learning Tumor Size Age -Clump Thickness - Uniformity of Cell Size - Uniformity of Cell Shape … • x can be multi-dimensional – Each dimension corresponds to an attribute
  • 24.
    Unsupervised Learning In unsupervisedlearning, as you might guess, the training data is unlabeled. The Model tries to learn without a teacher. So, the Model find structure and pattern in the data on its own. Given input Output hidden An unlabeled training set for unsupervised learning
  • 25.
    Unsupervised Learning • Givenx1, x2, ..., xn (without labels) • Output hidden structure behind the x’s – E.g., clustering
  • 26.
    Unsupervised Learning • Independentcomponent analysis – separate a combined signal into its original sources
  • 27.
    Reinforcement Learning • Givena sequence of states and actions with (delayed) rewards, output a policy – Policy is a mapping from states  actions that tells you what to do in a given state • Examples: – Credit assignment problem – Game playing – Robot in a maze – Balance a pole on your hand Reinforcement Learning
  • 28.
    Reinforcement Learning Key componentsof Reinforcement learning: Agent: The learner or decision- maker. Environment: The space in which the agent operates. State: A representation of the current situation or condition in which the agent is. Actions: What the agent can do or the choices it can make. Reward: Feedback from the environment in response to the agent’s actions. Policy: A strategy used by the agent to determine its actions based on the current state.
  • 29.
  • 30.
    The Agent-Environment Interface Agentand environment interact at discrete time steps Agent observes state at step t: st S : t  0, 1, 2, K produces action at step t : at  A(st ) gets resulting reward : and resulting next state : rt1  st1 . . . st at rt +1 st +1 at +1 rt +2 st +2 at +2 rt +3 st +3 . . . at +3
  • 31.
  • 32.
    Machine learning modelmaps from features to prediction Examples •Classification •Is this a dog or a cat? •Is this email spam or not? •Regression •What will the stock price be tomorrow? •What will be the high temperature tomorrow? 𝑓 𝑥 Features → 𝑦 Prediction
  • 33.
    Learning has threestages Training: optimize model parameters Validation: intermediate evaluations to design/select model Test: final performance evaluation
  • 34.
    Training: the modelis fit to data to minimize a loss or maximize an objective function 𝜃∗ = argmin 𝐿𝑜𝑠𝑠(𝑓 𝑋; 𝜃 , 𝑌) 𝜃 Model parameters that minimize loss Features of all training examples “Ground Truth” predictions of all training examples 37.5 41.2 51.0 48.3 50.5 47.0 46.5 48.9 50.5 47.6 … … 67.0 64.7 63.0 61.4 60.2 𝑋 𝑌 1 row per example 1 column per feature Example Learn to predict next day’s temperature given preceding days’ temperatures 𝑖 𝑖 𝐿𝑜𝑠𝑠 𝑓 𝑋;𝜃 , 𝑌 = ∑ 𝑓 𝑋 ;𝜃 − 𝑦𝑖 2 Model: linear 𝑓(𝑋𝑖; 𝜃) = 𝐴𝑋𝑖 + 𝑏 Optimization via ordinary least squares regression Loss: sum squared error
  • 35.
    Model design and“hyper parameter” tuning is performed using a validation set Select model Set training parameters Feature selection Learning rate, regularization parameters, … Sometimes, there are clear “train”, “val”, and “test” sets. Other times, you need to split “train” into a train set for learning parameters and a val set for checking model performance linear regression
  • 36.
    Testing: The effectivenessof the model is evaluated on a held out test set “Held out”: not used in training; ideally not viewed by developers, e.g. in a private test server Common performance measures N 𝑖 𝑖 𝑖 1 Classificationerror: ∑ 𝑓 𝑋 ≠ 𝑦 (for classification model, 𝑦 is target/true label) N 𝑖 Cross-entropy: − 1 ∑ log 𝑓 𝑦 = 𝑦 |𝑋 𝑖 𝑖 𝑖 (for probabilistic model) RMSE: N 1 ∑𝑖 𝑓 𝑋 − 𝑦 𝑖 𝑖 2 (regression measure) 𝑦𝑖 − 𝒚 ̅ 𝑅2: 1 − ∑𝑖 𝑓 𝑋𝑖 − 𝑦𝑖 2 / ∑𝑖 expectation/mean/avg) 2 (unitless regression measure; 𝒚 ̅ is In machine learning research, usually data is collected once and then randomly sampled into train and test partitions Train and test samples are “i.i.d.”, independent and identically distributed In many real-world applications, the input to the model in deployment comes from a different distribution than training
  • 37.
    Various Function Representations •Numerical functions – Linear regression – Neural networks – Support vector machines • Symbolic functions – Decision trees • Instance-based functions – Nearest-neighbor • Probabilistic Graphical Models – Naïve Bayes – Bayesian networks
  • 38.
    Evaluation • Accuracy • Precisionand recall • Squared error • Entropy • etc.
  • 39.
    ML in Practice •Understand domain, prior knowledge, and goals • Data integration, selection, cleaning, pre-processing, etc. • Learn models • Interpret results • Consolidate and deploy discovered knowledge Loop
  • 40.
    Complete example tounderstand steps of Machine Learning Suppose you want to know if money makes people happy or not Step1 :download the Better Life Index data from the OECD’s website OECD Data Explorer - Archive • Better Life Index as well as stats about GDP( Gross Domestic Product) per capita ‫الناتج‬ ‫للفرد‬ ‫االجمالى‬ ‫المحلى‬ ) from the IMF’s website. World Economic Outlook Databases (imf.org) Step 2: join the tables and sort by GDP per capita. Table shows an excerpt of what you get.
  • 41.
    Complete example tounderstand steps of Machine Learning It is observed that the graph tends to be linear means as the GDP per capita increase the Life Satisfaction increase also, there are some noisy Data. So we will chose Linear model. This step is called model selection: you selected a linear model of life satisfaction with just one attribute, GDP per capita.
  • 42.
    Complete example tounderstand steps of Machine Learning linear model equation. Second step determine model parameters which are chose best values to make line covers all scattered points. (This step is the training step)
  • 43.
    Complete example tounderstand steps of Machine Learning How can you know which values of the parameter values θ0 and θ1. will make your model perform best ?  You need to specify a performance measure (utility function (or fitness function) that measures how good your model is.  Or you can define a cost function that measures how bad it is (error). For linear regression problems, people typically use a cost function that measures the distance between the linear model’s predictions and the training examples; the objective is to minimize this distance.
  • 44.
    Complete example tounderstand steps of Machine Learning it finds the parameters that make the linear model fit best to your data with least MSE for example. This is called training the model. In our case the algorithm finds that the optimal parameter values are θ0 = 4.85 and θ1 = 4.91 × 10–5.
  • 45.
    Complete example tounderstand steps of Machine Learning After training the model we can use now. For example to know how happy Cypriots are, and the OECD data does not have the answer. Fortunately, you can use your model to make a good prediction: you look up Cyprus’s GDP per capita, find $22,587, and then apply your model and find that life satisfaction. Is likely to be somewhere around 4.85 + 22,587 × 4.91 × 10-5 = 5.96.
  • 46.
    Complete example tounderstand steps of Machine Learning Python code that loads the data, prepares it, creates a scatterplot for visualization, and then trains a linear model and makes a prediction. import matplotlib.pyplot as plt import numpy as np import pandas as pd import sklearn.linear_model # Load the data oecd_bli = pd.read_csv("oecd_bli_2015.csv", thousands=',') gdp_per_capita = pd.read_csv("gdp_per_capita.csv",thousands=',',delimiter='t', encoding='latin1', na_values="n/a") # Prepare the data country_stats = prepare_country_stats(oecd_bli, gdp_per_capita) X = np.c_[country_stats["GDP per capita"]] y = np.c_[country_stats["Life satisfaction"]] # Visualize the data country_stats.plot(kind='scatter', x="GDP per capita", y='Life satisfaction') plt.show() # Select a linear model model = sklearn.linear_model.LinearRegression() # Train the model model.fit(X, y) # Make a prediction for Cyprus X_new = [[22587]] # Cyprus' GDP per capita print(model.predict(X_new)) # outputs [[ 5.96242338]]
  • 47.
    Complete example tounderstand steps of Machine Learning Replacing the Linear Regression model with k-Nearest Neighbors regression in the previous code is as simple as replacing these two lines: import sklearn.linear_model model = sklearn.linear_model.LinearRegression() with these two: import sklearn.neighbors model = sklearn.neighbors.KNeighborsRegressor(n_neighbors=3)
  • 48.
    Complete example tounderstand steps of Machine Learning If all went well, your model will make good predictions. If not, you may need to use:  more attributes (employment rate, health, air pollution, etc.),  get more or better quality training data,  or perhaps select a more powerful model (e.g., a Polynomial Regression model). In summary:  You studied the data.  You selected a model.  You trained it on the training data (i.e., the learning algorithm searched for the model parameter values that minimize a cost function).  Finally, you applied the model to make predictions on new cases (this is called inference), hoping that this model will generalize well.
  • 49.
    Another Complete exampleto understand steps of Machine Learning (Heart attack) Problem: A heart attack may happens, when the flow of blood to a part of the heart muscle gets blocked. Solution: We want to use ML model to predict if this person may happens to him heart attack or not. So , we are going to follow the following steps (workflow): Step 1: Data collection and features. Step 2 : Data preprocessing. Step 3: Feature extraction. Step 4: Model prediction. Step 5: Model evaluation. Step 6: Deployment and monitoring.
  • 50.
    Another Complete exampleto understand steps of Machine Learning (Heart attack) Step one: Data collection and Features Which may be  Age  Gender  Blood pressure  Cholesterol levels  and others
  • 51.
    Another Complete exampleto understand steps of Machine Learning (Heart attack) Step two: Data Preprocessing Which includes Cleaning and normalizing the collected data. May include dealing with missing values. May include dealing with categorical data to be encoded appropriately to a numerical value the PC can deal with.
  • 52.
    Another Complete exampleto understand steps of Machine Learning (Heart attack) Step three: Feature selection Which includes Identifying the most predictive features, which will help in predicting the heart attack. Which means exclude the irrelevant features.
  • 53.
    Another Complete exampleto understand steps of Machine Learning (Heart attack) Step four: Model training Which includes The data must be divided into training and test. For, example 80% training and 20% testing. The processed data will be used to train the ML model. From common ML models are:  Logistic regression  Decision Tree  Random forest  Gradient Boosting models
  • 54.
    Another Complete exampleto understand steps of Machine Learning (Heart attack) Step five: Model Evaluation After training the model, we need to evaluate its performance, so we used different evaluation metrics for classification:  Accuracy  Precision  Recall  AUC-ROC curve Note: other metrics will be used for Regression.
  • 55.
    Lessons Learned aboutLearning • Learning can be viewed as using direct or indirect experience to approximate a chosen target function. • Function approximation can be viewed as a search through a space of hypotheses (representations of functions) for one that best fits a set of training data. • Different learning methods assume different hypothesis spaces (representation languages) and/or employ different search techniques.