What is Machine Learning?
Soumya Mukherjee
Md Shamimuddin
https://www.linkedin.com/in/soumyarmukherjee/
https://www.linkedin.com/in/mdshamimuddin/
Agenda
• Overview of AI and ML
• Terminology awareness
• Applications in real world
• Use cases within Nokia
• Types of Learning
• Regression
• Classification
• Clustering
• Linear Regression Single Variable with Python
Machine Learning Definition
• Arthur Samuel (1959)
Machine Learning: Field of study that gives computers the
ability to learn without being explicitly programmed.
• Tom Mitchell (1997)
A computer program is said to learn from experience E with
respect to some task T and some performance measure P, if its
performance on T, as measured by P, improves with experience E.
Artificial Intelligence Vs Machine Learning Vs Deep Learning
Terminology Awareness
Big Data
Implies huge data volumes that cannot be processed effectively with traditional applications. Big Data processing begins with raw data that is not aggregated, and it is often impossible to store such data in the memory of a single computer.
Data Mining
Is about using statistics as well as other programming methods to find patterns hidden in the data so that you can explain some phenomenon. Machine Learning uses Data Mining techniques and other learning algorithms to build models of what is happening behind some data.
Machine Learning
Is an artificial intelligence technique that is broadly used in Data Mining. ML uses a training dataset to build a model that can predict values of target variables. Data Mining uses the predictive force of Machine Learning by applying various ML algorithms on Big Data.
WHAT IS ARTIFICIAL INTELLIGENCE
• Artificial intelligence (AI) is an area of computer science that emphasizes the creation of intelligent
machines that work and react like humans. Some of the activities computers with artificial intelligence
are designed for include:
• Knowledge gain
• Reasoning
• Problem solving
• Learning
[Diagram: Artificial Intelligence > Machine Learning > Supervised, Unsupervised, Reinforcement]
Types of Learning
Supervised Learning
The target/outcome variable to be predicted from a set of predictors is known at the training phase.
E.g. Regression, Decision Tree, Random Forest, KNN
Unsupervised Learning
The target/outcome variable to be predicted from a set of predictors is unknown at the training phase.
E.g. Clustering (K-means), association rule mining (Apriori)
Reinforcement Learning
The machine is trained to take specific decisions. It is exposed to an environment where it trains itself continually using trial and error.
E.g. Markov Decision Process
Applications in real world
• Google search engine
• Self-driving cars
• Facebook auto tagging
• Netflix movie recommendation
• Amazon product recommendation
• Healthcare diagnosis
• Speech recognition
• StackOverflow QA tagging
• Chatbot
Pipeline for solving an ML problem
1. Data as input (text files, spreadsheets, SQL database)
2. Feature engineering (removing unwanted data, handling missing values, normalization or standardization)
3. Algorithm
4. Output/Model
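The pipeline stages above (data input, feature engineering, algorithm, output model) can be sketched in a few lines of standard-library Python. This is only an illustration: the in-memory CSV, the function names and the choice of a least-squares line as the "algorithm" are assumptions made for the example, not part of the original slides.

```python
import csv
import io
from statistics import mean, pstdev

# Hypothetical input: a small CSV held in memory instead of a real file.
RAW_CSV = """area,price
2600,550000
3000,565000
3200,610000
,640000
3600,680000
4000,725000
"""

def load_rows(text):
    """Stage 1: data as input."""
    return list(csv.DictReader(io.StringIO(text)))

def engineer_features(rows):
    """Stage 2: drop rows with missing values, then standardize the predictor."""
    complete = [r for r in rows if r["area"] and r["price"]]
    areas = [float(r["area"]) for r in complete]
    prices = [float(r["price"]) for r in complete]
    mu, sigma = mean(areas), pstdev(areas)
    scaled = [(a - mu) / sigma for a in areas]
    return scaled, prices, (mu, sigma)

def fit_line(xs, ys):
    """Stage 3: the algorithm, here a least-squares straight line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def build_model(text):
    """Stage 4: the output is a model, here a function from area to price."""
    xs, ys, (mu, sigma) = engineer_features(load_rows(text))
    slope, intercept = fit_line(xs, ys)
    return lambda area: intercept + slope * (area - mu) / sigma

model = build_model(RAW_CSV)
print(round(model(3300), 2))
```

The scaling parameters learned in stage 2 are kept inside the model so that new inputs are transformed the same way as the training data.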
Data Exploration/Feature Engineering
1. Variable Identification
• Predictor(s) and target
• Type and category of variable
2. Univariate Analysis
• Central tendency
• Measure of dispersion
• Visualization method
• Frequency table (categorical)
3. Bivariate Analysis
• Relation between 2 variables
• Correlation
• Chi-square test
• Z-test
4. Missing Value Treatment
• Deletion
• Imputation
• Prediction model
• KNN imputation
5. Outlier Handling
Detection
• Handling outliers is very important
• Visualization techniques such as box plot, scatter plot, histogram
• Any value below Q1 - 1.5*IQR or above Q3 + 1.5*IQR is an outlier
Treatment
• Remove
• Scale or normalize
• Transform
• Impute
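As a minimal sketch of two of the steps listed above, the following standard-library Python fills missing values with the median and then flags outliers using the usual box-plot convention of fences at Q1 - 1.5*IQR and Q3 + 1.5*IQR. The sample values and function names are made up for illustration, and the quartile method ("inclusive") is one of several conventions.

```python
from statistics import median, quantiles

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical column with one missing value and one suspicious reading.
raw = [10, 12, None, 13, 12, 95, 11, 11]
filled = impute_median(raw)
outliers = iqr_outliers(filled)
print(filled)
print(outliers)
```

Whether a flagged value should be removed, transformed or imputed depends on the data; the code only performs detection.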
SUPERVISED LEARNING
• Supervised learning is used whenever we want to predict a certain outcome from
a given input, and we have examples of input/output pairs.
• We build a machine learning model from these input/output pairs, which
comprise our training set.
• Our goal is to make accurate predictions for new, never-before-seen data.
• Supervised learning often requires human effort to build the training set, but
afterward automates and often speeds up an otherwise laborious or infeasible
task.
TYPES OF SUPERVISED MODEL
• Regression: predicts a continuous value.
• Classification: predicts a class label, which is a choice from a predefined list of possibilities.
CLASSIFICATION
• Binary Classification : Distinguishing between exactly two classes
• Multiclass classification : Classification between more than two classes.
Types of regression
1. Simple Linear Regression
Single predictor + single target
y = m*x + c
2. Multiple Linear Regression
Multiple predictors + single target
y = m1*x1 + m2*x2 + c
3. Polynomial Regression
One or many predictors + single target
y = mn*x^n + … + m2*x^2 + m1*x + c
4. Stepwise Regression
Useful in case of multiple predictors
Add or Remove predictors as needed
Forward selection
Backward elimination
5. Lasso Regression
6. Ridge Regression
7. ElasticNet Regression
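The multiple and polynomial forms above can be fitted with ordinary least squares. The sketch below uses NumPy on synthetic data generated from known coefficients (m1 = 2, m2 = 3, c = 5 and a quadratic 0.5*x^2 - x + 2), so the recovered values can be compared against the truth. The data and variable names are assumptions chosen for the example.

```python
import numpy as np

# Multiple linear regression: y = m1*x1 + m2*x2 + c
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
y = 2.0 * x1 + 3.0 * x2 + 5.0  # synthetic, noise-free target

# Design matrix with a column of ones for the intercept c.
A = np.column_stack([x1, x2, np.ones_like(x1)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
m1, m2, c = coef
print(m1, m2, c)

# Polynomial regression with a single predictor: y = m2*x^2 + m1*x + c
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_poly = 0.5 * x**2 - x + 2.0
poly_coef = np.polyfit(x, y_poly, deg=2)  # coefficients, highest power first
print(poly_coef)
```

Lasso, Ridge and ElasticNet add penalty terms to this least-squares objective; they are not shown here.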
Simple Linear Regression
• Single predictor and single target
• Y = b0 + b1*X
• Fitted by minimizing the sum of squared errors
• Standard packages are already available
• Formula
• Programming example
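For reference, the standard least-squares solution for Y = b0 + b1*X can be written as follows. The slide's own formula image is not available, so this is the textbook form, with x̄ and ȳ denoting the sample means of the predictor and the target.

```latex
\begin{aligned}
\mathrm{SSE}(b_0, b_1) &= \sum_{i=1}^{n} \bigl(y_i - (b_0 + b_1 x_i)\bigr)^2 \\
b_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \\
b_0 &= \bar{y} - b_1 \bar{x}
\end{aligned}
```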
Classification
• Type of supervised learning
• Output or target is a categorical outcome
Example
• Mail: spam or not spam
• Weather: rainy, sunny, humid
• Stock price: up or down
Predictor(s) -> Algorithm -> Categorical target
Types of Classification
1. K-nearest Neighbor Classifier
2. Logistic Regression
3. Naïve Bayes Classifier
4. Decision Tree Classifier
5. Random Forest Classifier
6. Support Vector Machine
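To make the first entry in the list concrete, here is a minimal K-nearest-neighbor classifier in plain Python. The two features (number of links and number of exclamation marks in a mail) and the labelled examples are hypothetical, chosen only to mirror the spam example above; real classifiers would normally come from a library such as scikit-learn.

```python
from collections import Counter
from math import dist

def knn_predict(training, query, k=3):
    """Return the majority label among the k training points closest to query."""
    nearest = sorted(training, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set: (links, exclamation marks) -> label
training = [
    ((0, 1), "not spam"),
    ((1, 0), "not spam"),
    ((1, 1), "not spam"),
    ((7, 8), "spam"),
    ((8, 6), "spam"),
    ((9, 9), "spam"),
]

print(knn_predict(training, (8, 7)))
print(knn_predict(training, (0, 0)))
```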
Clustering (Unsupervised learning)
[Figure: three groups labeled Cluster 1, Cluster 2 and Cluster 3]
Unsupervised learning
• Unsupervised learning is the training of a machine using
information that is neither classified nor labelled.
For instance, given an image containing both dogs and cats that it has never seen before,
the machine tries to find patterns based on the shape of the head, ears, body structure, etc.
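K-means, mentioned earlier as a clustering example, is one way a machine can group unlabelled points. The sketch below is a bare-bones version in plain Python; the 2-D points and the starting centroids are invented for illustration, and the initial centroids are passed in explicitly so that the run is deterministic.

```python
from math import dist

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return labels, centroids

# Hypothetical unlabelled data forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
labels, centroids = k_means(points, centroids=[(1.0, 1.0), (8.0, 8.0)])
print(labels)
print(centroids)
```

No labels are supplied at any point; the grouping comes only from distances between the points.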
Reinforcement Learning
• Reinforcement learning (RL) is an area of machine learning concerned with
how software agents ought to take actions in an environment so as to maximize some
notion of cumulative reward. (source : Wikipedia)
E.g.: you go near a fire and it is warm: positive reinforcement.
You touch the fire and it burns your hand: negative reinforcement, so you learn not to touch fire.
• RL problems are commonly modelled as Markov Decision Processes; algorithms for RL include Monte Carlo methods, Q-learning, etc.
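A tiny tabular Q-learning example illustrates the trial-and-error idea. The environment is an assumption made up for this sketch: a corridor of five states where the agent starts at the left end and receives a reward of 1 only when it reaches the right end. The learning rate, discount factor and episode count are likewise illustrative choices.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (terminal)
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA = 0.5, 0.9

def step(state, action):
    """Move within the corridor; reward 1.0 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # purely random exploration
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward reward + discounted best future value.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q

q = q_learning()
# Greedy policy: in each non-terminal state pick the action with the highest Q-value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

Even though the agent acts randomly while learning, the learned Q-values favour moving right in every state, because that is the route to the reward.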
Tools And Packages
ML in Python:
• NumPy
• pandas
• scikit-learn
• Matplotlib
• Seaborn
Non-Programming:
• Weka
• Orange
• RapidMiner
• Qlik Sense
• xls
Deep Learning:
• TensorFlow
• Keras
• PyTorch
• Theano
LINEAR REGRESSION: SINGLE VARIABLE
LINEAR REGRESSION
• Linear regression, or ordinary least squares (OLS), is the simplest and most classic
linear method for regression. Linear regression finds the parameters m (slope) and b
(intercept) of the line y = m*x + b that minimize the mean squared error between
predictions and the true regression targets, y, on the training set.
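The idea that OLS picks the m and b with the smallest mean squared error can be checked directly. The sketch below computes the closed-form least-squares estimates in plain Python and compares the MSE at that solution with the MSE at slightly shifted parameters. The five data points are invented for the example.

```python
from statistics import mean

def fit_ols(xs, ys):
    """Closed-form least-squares slope m and intercept b for y = m*x + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    return m, y_bar - m * x_bar

def mse(xs, ys, m, b):
    """Mean squared error of the line y = m*x + b on the given data."""
    return mean([(y - (m * x + b)) ** 2 for x, y in zip(xs, ys)])

# Hypothetical training set lying roughly on y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

m, b = fit_ols(xs, ys)
best = mse(xs, ys, m, b)
print(m, b, best)

# Any nearby parameter choice gives a larger error.
for dm, db in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    assert mse(xs, ys, m + dm, b + db) > best
```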
HOME PRICES
area price
2600 550000
3000 565000
3200 610000
3600 680000
4000 725000
Given these home prices, find out the price of homes whose area is:
• 3300 square feet
• 5000 square feet
SCATTER PLOT
BEST FIT LINE.
PREDICT HOME PRICES FOR A GIVEN AREA
SLOPE-INTERCEPT EQUATION OF A STRAIGHT LINE
y = m*x + b (m is the slope, b is the y-intercept)
PROGRAM IN PYTHON
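The original program is not included in the transcript. A minimal sketch of how the home-price problem could be solved with scikit-learn is given here, assuming NumPy and scikit-learn are installed; the variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Home prices from the table: area in square feet, price as listed.
area = np.array([2600, 3000, 3200, 3600, 4000]).reshape(-1, 1)  # 2-D: one column per feature
price = np.array([550000, 565000, 610000, 680000, 725000])

reg = LinearRegression()
reg.fit(area, price)

slope = reg.coef_[0]        # m in price = m * area + b
intercept = reg.intercept_  # b

# Predict prices for the two areas asked about.
predictions = reg.predict(np.array([[3300], [5000]]))
print(f"price = {slope:.2f} * area + {intercept:.2f}")
print(predictions.round(2))
```

With this data the fitted line is approximately price = 135.79 * area + 180616.44, which gives about 628,716 for 3300 square feet and about 859,555 for 5000 square feet. The second value is an extrapolation beyond the largest area in the training data, so it deserves more caution.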
EVALUATING MODEL PERFORMANCE
• The performance of a regression model can be understood from the error
of the predictions made by the model. You can also measure performance
by how well your regression line fits the dataset.
• Let’s try to understand how to measure the performance of regression models.
• A good regression model is one where the difference between the actual or
observed values and predicted values for the selected model is small and unbiased
for train, validation and test data sets.
EVALUATING MODEL PERFORMANCE
• To measure the performance of your regression model, some statistical metrics are used:
• Mean Absolute Error (MAE)
• Root Mean Square Error (RMSE)
• Coefficient of determination or R^2
• Adjusted R^2
MEAN ABSOLUTE ERROR(MAE)
• This is the simplest of all the metrics. It is measured by taking the average of the absolute
differences between the actual values and the predictions:
MAE = (1/n) * sum(|y_i - yhat_i|), where y_i is the actual value, yhat_i the prediction and n the number of samples.
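A short sketch of MAE in plain Python, using made-up actual and predicted values:

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all samples."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical values: errors are 0.5, 0.0 and 1.0.
actual = [3.0, 5.0, 7.0]
predicted = [2.5, 5.0, 8.0]

mae = mean_absolute_error(actual, predicted)
print(mae)
```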
ROOT MEAN SQUARE ERROR(RMSE)
• The Root Mean Square Error is measured by taking the square root of the average
of the squared differences between the predictions and the actual values.
• It represents the sample standard deviation of the differences between
predicted values and observed values (also called residuals). It is
calculated using the following formula:
RMSE = sqrt((1/n) * sum((y_i - yhat_i)^2)), where y_i is the actual value, yhat_i the prediction and n the number of samples.
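A short sketch of RMSE in plain Python, using made-up actual and predicted values. Because the errors are squared before averaging, large errors weigh more heavily than they do in MAE.

```python
from math import sqrt

def root_mean_square_error(actual, predicted):
    """Square root of the mean of the squared residuals."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))

# Hypothetical values: squared errors are 0.25, 0.0 and 1.0.
actual = [3.0, 5.0, 7.0]
predicted = [2.5, 5.0, 8.0]

rmse = root_mean_square_error(actual, predicted)
print(rmse)
```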
COEFFICIENT OF DETERMINATION OR R^2
• It measures how well the actual outcomes are replicated by the regression line.
• It tells you how much of the variance in the target is explained by the
independent variable(s) in your model.
• In other words, it indicates how good your model is for a dataset.
• The mathematical representation for R^2 is
R^2 = 1 - SSR / SST
Here, SSR = Sum of Squares of Residuals (the squared differences between the actual and the predicted values)
SST = Total Sum of Squares (the squared differences between the actual values and their average)
COEFFICIENT OF DETERMINATION OR R^2 (CONT.)
• Here the green line represents the regression line
and the red line represents the average line. The
differences in data points from these lines are
taken in the equation.
• Usually, the value of R^2 lies between 0 and 1 (it can be negative if the regression
line somehow has a worse fit than the average line). The closer its value is to one,
the better your model is. This happens when the regression line fits the dataset well,
that is, when the data points lie close to the line, which reduces the sum of squared
residuals. Hence, R^2 gets closer to one.
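The behaviour described above can be reproduced with a few lines of plain Python, taking R^2 = 1 - SSR/SST, where SSR is the sum of squared residuals (actual minus predicted) and SST is the sum of squared deviations of the actual values from their mean. The three prediction sets are invented to show a good fit, a prediction equal to the average, and a fit worse than the average.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SSR / SST."""
    avg = sum(actual) / len(actual)
    ssr = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    sst = sum((a - avg) ** 2 for a in actual)                   # total sum of squares
    return 1 - ssr / sst

actual = [3.0, 5.0, 7.0]

good = r_squared(actual, [2.5, 5.0, 8.0])      # close to the data
baseline = r_squared(actual, [5.0, 5.0, 5.0])  # always predicts the average
bad = r_squared(actual, [7.0, 5.0, 3.0])       # worse than the average line

print(good, baseline, bad)
```

Predicting the average everywhere gives exactly 0, and a model that does worse than that gives a negative value.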
THANK YOU

Machine learning and linear regression programming

  • 2. Agenda  Overview of AI and ML  Terminology awareness  Applications in real world  Use cases within Nokia  Types of Learning  Regression  Classification  Clustering  Linear Regression Single Variable with python
  • 3. • Arthur Samuel (1959) Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed. • Tom Mitchell (1998) A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E. Machine Learning Definition
  • 4. Artificial Intelligence Vs Machine Learning Vs Deep Learning
  • 6. Implies huge data volumes that cannot be processed effectively with traditional applications. Big Data processing begins with raw data that is not aggregated and it is often impossible to store such data in the memory of a single computer Is about using Statistics as well as other programming methods to find patterns hidden in the data so that you can explain some phenomenon. Machine Learning uses Data Mining techniques and other learning algorithms to build models of what is happening behind some data. Big Data Data Mining Is an artificial intelligence technique that is broadly used in Data Mining. ML uses a training dataset to build a model that can predict values of target variables. Data Mining uses the predictive force of Machine Learning by applying various ML algorithms on Big data. Machine Learning
  • 7. WHAT IS ARTIFICIAL INTELLIGENCE • Artificial intelligence (AI) is an area of computer science that emphasizes the creation of intelligent machines that work and react like humans. Some of the activities computers with artificial intelligence are designed for include: Knowledge Gain Reasoning Problem Solving Learning
  • 9. Types of Learning Supervised Learning Unsupervised Learning Reinforcement Learning Target/outcome variable to be predicted from set of predictors is known at training phase. E.g. Regression, Decision Tree, Random Forest, KNN Target/outcome variable to be predicted from set of predictors is unknown at training phase. E.g. Clustering (K- means, Apriori) Machine is trained to take specific decision Exposed to an environment where it trains itself continually using trial and error. E.g. Markov Decision process
  • 10. Applications in real world • Google search engine • Self driving cars • Facebook auto tagging • Netflix movie recommendation • Amazon product recommendation • Healthcare diagnosis • Speech recognition • StackOverflow QA tagging • Chatbot
  • 11. Pipeline in solving ML Problem: Data as input (text files, spreadsheet, SQL database) → Feature Engineering (removing unwanted data, handling missing values, normalization or standardization) → Algorithm → Output/Model
  • 12. Pipeline in solving ML Problem
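The pipeline stages above (input data, feature engineering, algorithm, model) can be sketched roughly as follows. This is a minimal illustration, assuming scikit-learn is installed; the toy data and the step names "impute", "scale" and "model" are made up for the example and do not come from the slides.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy input: one predictor column with a missing value (np.nan).
X = np.array([[1.0], [2.0], [np.nan], [4.0], [5.0]])
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Feature engineering (imputation, standardization) followed by the algorithm.
pipeline = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="mean")),  # fill missing values with the column mean
    ("scale", StandardScaler()),                 # standardize to zero mean, unit variance
    ("model", LinearRegression()),               # the learning algorithm
])

# Output: a fitted model that can predict on new data.
pipeline.fit(X, y)
prediction = pipeline.predict(np.array([[3.0]]))
```

Chaining the steps in one object means the same preprocessing is applied at training time and at prediction time.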
  • 13. Data Exploration/Feature Engineering 1. Variable Identification • Predictor(s) and Target • Type and Category of variable 2. Univariate Analysis • Central tendency • Measure of Dispersion • Visualization Method • Frequency table (categorical) 3. Bivariate Analysis • Relation between 2 variables • Correlation • Chi-square test • Z-test 4. Missing Value Treatment • Deletion • Imputation • Prediction Model • KNN Imputation 5. Outlier Handling Detection • Very important to handle outliers • Visualization techniques like box plot, scatter plot, histogram • Any value below Q1 - 1.5*IQR or above Q3 + 1.5*IQR is an outlier (Q1 and Q3 are the first and third quartiles, IQR = Q3 - Q1) Treatment • Remove • Scale or Normalize • Transform • Impute
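As a rough sketch of the IQR rule for outlier detection, the snippet below flags values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] using only the Python standard library. The function name `iqr_outliers` and the extra value 12000 are hypothetical, added only to show an outlier; quartile conventions differ between tools, and this example assumes the "inclusive" method.

```python
import statistics

def iqr_outliers(values):
    """Return the values lying outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]

# Five plausible areas plus one hypothetical extreme value.
areas = [2600, 3000, 3200, 3600, 4000, 12000]
outliers = iqr_outliers(areas)
```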
  • 14. SUPERVISED LEARNING • Supervised learning is used whenever we want to predict a certain outcome from a given input, and we have examples of input/output pairs. • We build a machine learning model from these input/output pairs, which comprise our training set. • Our goal is to make accurate predictions for new, never-before-seen data. • Supervised learning often requires human effort to build the training set, but afterward automates and often speeds up an otherwise laborious or infeasible task.
  • 15. TYPES OF SUPERVISED MODEL • Regression: the process of predicting a continuous value. • Classification: predicting a class label, which is a choice from a predefined list of possibilities.
  • 16. CLASSIFICATION • Binary Classification : Distinguishing between exactly two classes • Multiclass classification : Classification between more than two classes.
  • 17. Types of regression 1. Simple Linear Regression: single predictor + single target, y = m*x + c 2. Multiple Linear Regression: multiple predictors + single target, y = m1*x1 + m2*x2 + c 3. Polynomial Regression: one or many predictors + single target, y = mn*x^n + … + m2*x^2 + m1*x + c 4. Stepwise Regression: useful in case of multiple predictors; add or remove predictors as needed (forward selection, backward elimination) 5. Lasso Regression 6. Ridge Regression 7. ElasticNet Regression
  • 18. Simple Linear Regression • Single predictor and single target • Y = b0 + b1*X • Minimum sum squared error • Standard packages are already available • Formula • Programming example
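For the single-predictor case Y = b0 + b1*X, the least-squares estimates have a closed form: b1 = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and b0 = ȳ - b1*x̄. The sketch below implements that formula in plain Python; the function name and the toy data are illustrative assumptions, not taken from the slides.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) minimizing the sum of squared errors for y = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx            # slope
    b0 = mean_y - b1 * mean_x  # intercept
    return b0, b1

# Toy data generated from y = 1 + 2*x, so the fit should recover b0 = 1, b1 = 2.
b0, b1 = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
```

Standard packages such as scikit-learn compute the same estimates, so in practice this function is mainly useful for understanding the formula.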
  • 19. Classification • Type of supervised learning • Output or target is a categorical outcome Example • Mail: spam or no spam • Weather: rainy, sunny, humid • Stock price: up or down Predictor(s) → Algorithm → Categorical Target
  • 20. Types of Classification 1. K-nearest Neighbor Classifier 2. Logistic Regression 3. Naïve Bayes 4. Decision Tree Classifier 5. Random Forest Classifier 6. Support Vector Machine Classifier
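To make the first classifier in the list concrete, here is a small k-nearest neighbor sketch in plain Python: a new point gets the majority label of its k closest training points. The two-feature points and the "spam" / "no spam" labels are invented for illustration, and `knn_predict` is a hypothetical helper, not a library function.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical training set: two well-separated groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.5), (7.0, 7.0), (6.5, 8.0)]
labels = ["no spam", "no spam", "no spam", "spam", "spam", "spam"]

label_near_first_group = knn_predict(points, labels, (1.2, 1.4))
label_near_second_group = knn_predict(points, labels, (6.8, 7.2))
```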
  • 22. Unsupervised learning • Unsupervised learning is the training of a machine using information that is neither classified nor labelled. For instance, given an image containing both dogs and cats that it has never seen before, the machine tries to find patterns based on the shape of the head, ears, body structure, etc.
  • 23. Reinforcement Learning • Reinforcement learning (RL) is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. (source: Wikipedia) E.g.: you go near a fire and it is warm: positive reinforcement; you touch the fire and it burns your hand: negative reinforcement → you learn not to touch fire • Methods used in RL include Monte Carlo methods, Markov Decision Processes, Q-learning, etc.
  • 24. Tools And Packages ML in Python: • NumPy • pandas • scikit-learn • Matplotlib • seaborn Non-Programming: • Weka • Orange • RapidMiner • Qlik Sense • xls Deep Learning: • TensorFlow • Keras • PyTorch • Theano
  • 26. LINEAR REGRESSION • Linear regression, or ordinary least squares (OLS), is the simplest and most classic linear method for regression. Linear regression finds the parameters m and b that minimize the mean squared error between predictions and the true regression targets, y, on the training set.
  • 27. HOME PRICES (area, price): (2600, 550000), (3000, 565000), (3200, 610000), (3600, 680000), (4000, 725000)
  • 28. HOME PRICES (area, price): (2600, 550000), (3000, 565000), (3200, 610000), (3600, 680000), (4000, 725000) Given these home prices, find out the price of homes whose areas are 3300 square feet and 5000 square feet.
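One way to answer this question is to fit a single-variable linear regression on the five rows above. The sketch below assumes pandas and scikit-learn are installed and reuses the names `df` and `model` that appear in the editor's notes; `new_homes` and `predicted_prices` are names chosen for this example.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# The home price table from the slide.
df = pd.DataFrame({
    "area": [2600, 3000, 3200, 3600, 4000],
    "price": [550000, 565000, 610000, 680000, 725000],
})

# Fit price = slope * area + intercept by ordinary least squares.
model = LinearRegression()
model.fit(df[["area"]], df["price"])

# Predict prices for the two requested areas.
new_homes = pd.DataFrame({"area": [3300, 5000]})
predicted_prices = model.predict(new_homes)

slope = model.coef_[0]
intercept = model.intercept_
```

With this data the fitted slope is about 135.79 and the intercept about 180616.44, which gives predicted prices of roughly 628,716 for 3300 square feet and 859,555 for 5000 square feet. Note that 5000 lies outside the range of the training areas, so that prediction is an extrapolation.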
  • 31. PREDICT HOME PRICES FOR A GIVEN AREA
  • 32. PREDICT HOME PRICES FOR A GIVEN AREA (CONT.)
  • 33. PREDICT HOME PRICES FOR A GIVEN AREA (CONT.)
  • 34. SLOPE-INTERCEPT EQUATION OF A STRAIGHT LINE: y = m*x + b, where m is the slope and b is the intercept
  • 36. EVALUATING MODEL PERFORMANCE • The performance of a regression model can be understood by knowing the error rate of the predictions made by the model. You can also measure the performance by knowing how well your regression line fits the dataset. • Let’s try to understand how to measure the performance of regression models. • A good regression model is one where the difference between the actual or observed values and the predicted values is small and unbiased for the train, validation and test data sets.
  • 37. EVALUATING MODEL PERFORMANCE • To measure the performance of your regression model, some statistical metrics are used. They are: • Mean Absolute Error (MAE) • Root Mean Square Error (RMSE) • Coefficient of determination or R^2 • Adjusted R^2
  • 38. MEAN ABSOLUTE ERROR(MAE) • This is the simplest of all the metrics. It is measured by taking the average of the absolute difference between actual values and the predictions.
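Written out, MAE = (1/n) * Σ |y_i - ŷ_i|, where y_i is an actual value and ŷ_i the corresponding prediction. A minimal plain-Python sketch, with made-up numbers:

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Absolute errors are 1, 0 and 2, so the average is 1.0.
mae = mean_absolute_error([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```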
  • 40. ROOT MEAN SQUARE ERROR(RMSE) • The Root Mean Square Error is measured by taking the square root of the average of the squared difference between the prediction and the actual value. • It represents the sample standard deviation of the differences between predicted values and observed values(also called residuals). It is calculated using the following formula:
  • 41. ROOT MEAN SQUARE ERROR(RMSE)
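Following the description above, RMSE = sqrt((1/n) * Σ (y_i - ŷ_i)^2). The sketch below computes it in plain Python on made-up numbers; because the errors are squared before averaging, large errors weigh more heavily than in MAE.

```python
import math

def root_mean_square_error(actual, predicted):
    """Square root of the average squared difference between actual and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Squared errors are 1, 0 and 4, so RMSE = sqrt(5/3).
rmse = root_mean_square_error([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```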
  • 42. COEFFICIENT OF DETERMINATION OR R^2 • It measures how well the actual outcomes are replicated by the regression line. • It helps you understand how much of the variance in the target is explained by the independent variable(s) in your model. • In other words, it tells you how good your model is for a dataset. • The mathematical representation for R^2 is R^2 = 1 - SSR/SST. Here, SSR = Sum of Squared Residuals (the sum of squared differences between the actual and the predicted values) and SST = Total Sum of Squares (the sum of squared differences between the actual values and their average).
  • 43. COEFFICIENT OF DETERMINATION OR R^2 (CONT.) • Here the green line represents the regression line and the red line represents the average line. The differences of the data points from these lines are taken in the equation. • Usually, the value of R^2 lies between 0 and 1 (it can be negative if the regression line somehow fits worse than the average line!). The closer its value is to one, the better your model is. A value close to one means the regression line fits the dataset well, that is, the data points lie close to the line. This lessens the sum of squared residuals, so the equation gets closer to one.
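A small plain-Python sketch of the coefficient of determination, assuming the usual form R^2 = 1 - SSR/SST with SSR taken as the sum of squared residuals (actual minus predicted) and SST as the sum of squared differences between the actual values and their mean. The data are made up; the second call shows how predictions worse than the average line give a negative value.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SSR/SST."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total sum of squares
    return 1 - ss_res / ss_tot

actual = [1.0, 2.0, 3.0, 4.0]
good_fit = r_squared(actual, [1.1, 1.9, 3.2, 3.8])         # predictions close to the actual values
worse_than_mean = r_squared(actual, [4.0, 3.0, 2.0, 1.0])  # predictions in reverse order
```

Here `good_fit` comes out at 0.98 and `worse_than_mean` at -3.0.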

Editor's Notes

  1. Classification predicts a class label chosen from a predefined list of possibilities. The classification approach can be thought of as a means of categorizing or "classifying" some unknown items into a discrete set of "classes."
  2. plt.scatter(df['area'], df['price'], marker='*', color='red')
  3. plt.xlabel('area'); plt.ylabel('price'); plt.scatter(df['area'], df['price'], marker='*', color='red'); plt.plot(df['area'], model.predict(df[['area']]))