- Harsh Makadia
Basics of Machine Learning
Why machine learning matters
Artificial intelligence will shape our future more powerfully than any other innovation this century.
AlphaGo defeated one of the best human players at Go — an
extraordinary achievement in a game dominated by humans for
two decades after machines first conquered chess.
OpenAI reached yet another incredible milestone by defeating
the world’s top professionals in 1v1 matches of the online
multiplayer game Dota 2.
Google Translate app & Robots for
Hotel Service
Supervised Learning
The two tasks of supervised learning:
1. Regression
2. Classification
Try answering the questions below:
1. How much money will I make by spending more dollars on digital
advertising?
2. Will this loan applicant pay back the loan or not?
3. What’s going to happen to the stock market tomorrow?
Supervised learning = learning from a data set of training examples
with associated correct labels
For example, learning to classify handwritten digits.
- Train a model on the data set until it can reliably predict the
result for previously unseen inputs.
How does supervised learning work?
Let’s examine the problem of predicting a person’s annual income based on
the number of years of higher education he/she has completed.
In short:
Build a model that approximates the relationship f between the number of
years of higher education X and the corresponding annual income Y.
Solutions
Engineering Solution
income = ($5,000 * years_of_education) + baseline_income
Learning Solution
Degree type,
Years of work experience,
School tiers, etc.
For example: “If they completed a
Bachelor’s degree or higher, give the
income estimate a 1.5x multiplier.”
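A hand-written rule of this kind might look like the sketch below. It only combines the fixed formula and the quoted 1.5x multiplier from this slide; the function name and the default baseline of $20,000 are made up for illustration.

```python
def estimate_income(years_of_education, has_bachelors_or_higher,
                    baseline_income=20_000.0):
    """Hand-written rule, no learning involved.

    baseline_income is a hypothetical default chosen for this example.
    """
    # Fixed formula from the slide: $5,000 per year of education plus a baseline.
    income = 5_000.0 * years_of_education + baseline_income
    # Extra hand-crafted rule quoted on the slide: 1.5x for a Bachelor's or higher.
    if has_bachelors_or_higher:
        income *= 1.5
    return income


four_year_degree = estimate_income(4, True)
two_years_no_degree = estimate_income(2, False)
```

Every new factor needs another hand-written rule, which is the motivation for learning the relationship from data instead.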
Goal of supervised learning
- Learn the relationship between income and education from scratch, by
running labeled training data through a learning algorithm.
- The learned function can then be used to estimate the income of people whose
income Y is unknown, as long as we have their years of education X as input.
- In short, we can apply our model to the unlabeled test data to estimate Y.
Goal:
Predict Y as accurately as possible when given new examples
where X is known and Y is unknown.
Two tasks of supervised learning
1. Regression - predicting continuous values
- Predicts a continuous target variable Y.
- Estimate a value, such as housing prices or human lifespan, based on
input data X.
- Continuous means there aren’t gaps (discontinuities) in the values that Y
can take on. E.g. a person’s weight and height are continuous values.
- A discrete variable, by contrast, can only take on a finite number of values.
E.g. the number of kids somebody has is a discrete variable.
- Predicting income is a classic regression problem. Your input data X
includes all relevant information about individuals in the data set that can
be used to predict income, such as years of education, years of work
experience, job title, or zip code.
- These attributes are called features, which can be numerical (e.g.
years of work experience) or categorical (e.g. job title or field of study).
You’ll want as many training observations as possible relating these features
to the target output Y, so that your model can learn the relationship f
between X and Y.
Linear regression (ordinary least squares)
We have our data set X, and corresponding target values Y. The goal of ordinary
least squares (OLS) regression is to learn a linear model that we can use to predict a
new y given a previously unseen x with as little error as possible. We want to guess
how much income someone earns based on how many years of education they
received.
Linear regression is a parametric method, which means it makes an assumption about
the form of the function relating X and Y (we’ll cover examples of non-parametric
methods later). Our model will be a function that predicts ŷ given a specific x:
ŷ = β0 + β1 * x
β0 is the y-intercept and β1 is the slope of our line, i.e. how much income increases (or
decreases) with one additional year of education.
Our goal is to learn the model parameters (in this case, β0 and β1) that minimize error in the
model’s predictions.
To find the best parameters:
1. Define a cost function, or loss function, that measures how inaccurate our model’s
predictions are.
2. Find the parameters that minimize loss, i.e. make our model as accurate as possible.
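The two steps above can be sketched for a one-feature model. This is a minimal illustration, assuming mean squared error as the cost function and the standard closed-form least-squares estimates; the data points (years of education, income in $1,000s) are invented for the example.

```python
def mean_squared_error(xs, ys, beta0, beta1):
    """Step 1, the cost function: average squared gap between prediction and target."""
    n = len(xs)
    return sum((beta0 + beta1 * x - y) ** 2 for x, y in zip(xs, ys)) / n


def fit_ols(xs, ys):
    """Step 2, the parameters that minimize the cost (closed form, one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx                # slope
    beta0 = mean_y - beta1 * mean_x  # y-intercept
    return beta0, beta1


# Invented data: years of higher education and annual income in $1,000s.
years = [0, 2, 4, 6, 8]
income = [31, 39, 50, 61, 69]

beta0, beta1 = fit_ols(years, income)
fitted_loss = mean_squared_error(years, income, beta0, beta1)
guess_loss = mean_squared_error(years, income, 30.0, 5.0)  # a hand-picked guess
print(beta0, beta1, fitted_loss, guess_loss)
```

The fitted line has a lower loss on this data than the hand-picked guess, which is what "minimize loss" means in practice.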
Gradient descent - learn the parameters
- TensorFlow (an open-source library for machine intelligence) uses this in the background.
- Find the minimum of our model’s loss function by iteratively getting a better and better
approximation of it.
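A minimal gradient descent loop for the one-feature income model could look like this. It is a sketch under stated assumptions: mean squared error as the loss, a hand-picked learning rate and step count, and invented data; it is not how TensorFlow implements its optimizers.

```python
def gradient_descent(xs, ys, learning_rate=0.01, steps=5000):
    """Iteratively improve beta0 and beta1 by stepping against the gradient
    of the mean squared error. learning_rate and steps are illustrative choices."""
    beta0, beta1 = 0.0, 0.0  # start from an arbitrary guess
    n = len(xs)
    for _ in range(steps):
        errors = [beta0 + beta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad0 = 2.0 / n * sum(errors)
        grad1 = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        # Move a small step downhill.
        beta0 -= learning_rate * grad0
        beta1 -= learning_rate * grad1
    return beta0, beta1


# Invented data: years of higher education and annual income in $1,000s.
years = [0, 2, 4, 6, 8]
income = [31, 39, 50, 61, 69]

beta0, beta1 = gradient_descent(years, income)
print(beta0, beta1)
```

On this data the loop settles on the same intercept and slope that a closed-form least-squares fit gives; a learning rate that is too large would make the updates diverge instead.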
Overfitting
- Common problem in Machine Learning.
- Overfitting happens when a model overlearns from the training data to the
point that it starts picking up idiosyncrasies that aren’t representative of
patterns in the real world.
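One way to see overfitting is to compare a straight line with a polynomial that passes exactly through every noisy training point. The sketch below uses invented data in which the "real world" pattern is assumed to be income = 30 + 5 * years (in $1,000s); the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares straight line (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope


def polynomial_through_points(xs, ys, x_new):
    """Lagrange interpolation: the polynomial that hits every training point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x_new - xj) / (xi - xj)
        total += term
    return total


def mse(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Noisy training data around the assumed pattern 30 + 5 * x.
train_x = [0, 2, 4, 6, 8]
train_y = [31, 39, 50, 61, 69]
# Held-out points that follow the assumed pattern exactly.
test_x = [1, 3, 5, 7, 10]
test_y = [35, 45, 55, 65, 80]

beta0, beta1 = fit_line(train_x, train_y)

train_mse_line = mse([beta0 + beta1 * x for x in train_x], train_y)
test_mse_line = mse([beta0 + beta1 * x for x in test_x], test_y)
train_mse_poly = mse([polynomial_through_points(train_x, train_y, x) for x in train_x], train_y)
test_mse_poly = mse([polynomial_through_points(train_x, train_y, x) for x in test_x], test_y)

print("line:", train_mse_line, test_mse_line)
print("poly:", train_mse_poly, test_mse_poly)
```

The polynomial has zero training error because it memorizes the noise, yet it does worse than the simple line on the held-out points.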
2. Classification
More questions to answer:
- Is this email spam or not?
- Is that borrower going to repay their loan?
- Will those users click on the ad or not?
- Who is that person in your Facebook picture?
Two methods:
1. Logistic regression
2. Support vector machines (SVMs)
- Classification predicts a discrete target label Y.
- Classification is the problem of assigning new observations to the class to
which they most likely belong, based on a classification model built from
labeled training data.
Logistic regression: 0 or 1?
- Deals with binary classification, i.e. two classes labeled 0 and 1 (e.g. spam or not spam).
- Classification can also be multi-class, e.g. when assigning handwritten digits a label
between 0 and 9, or using facial recognition to detect which friends are in a Facebook picture.
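As a rough sketch of the "0 or 1?" idea: logistic regression passes a linear score through the sigmoid function to get a probability between 0 and 1, then applies a threshold. The parameter values and the spam example below (x = number of suspicious words in an email) are hypothetical, not learned from data.

```python
import math


def sigmoid(z):
    """Squash any real-valued score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, beta0, beta1):
    """Estimated probability that the label is 1."""
    return sigmoid(beta0 + beta1 * x)


def predict_label(x, beta0, beta1, threshold=0.5):
    """Turn the probability into a 0/1 decision."""
    return 1 if predict_proba(x, beta0, beta1) >= threshold else 0


# Hypothetical parameters for illustration only.
beta0, beta1 = -4.0, 1.0

few_words = predict_label(2, beta0, beta1)   # low score, probability below 0.5
many_words = predict_label(6, beta0, beta1)  # high score, probability above 0.5
```

In a real model, beta0 and beta1 would be learned from labeled training data rather than picked by hand.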
SVM (support vector machines)
- The algorithm is geometrically motivated in nature, rather than being
driven by probabilistic thinking.
- SVMs use a separating line (or, in more than two dimensions, a
multi-dimensional hyperplane) to split the space into a red zone and a blue
zone.
[Figure: points in 2D space that are either blue or red, separated by a line in two different ways (Way 1 and Way 2)]
- The distance to the nearest point on either side of the line is called the margin, and SVM
tries to maximize the margin.
- Think about it like a safety space: the bigger that space, the less likely that noisy points
get misclassified.
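To make the margin concrete, the sketch below measures it for two candidate separating lines on a small invented set of red and blue points. It only compares two given lines; it does not implement SVM training, which searches for the maximum-margin line. A line is written as w[0]*x + w[1]*y + b = 0.

```python
import math


def margin(points, w, b):
    """Distance from the line w[0]*x + w[1]*y + b = 0 to its nearest point."""
    norm = math.hypot(w[0], w[1])
    return min(abs(w[0] * x + w[1] * y + b) / norm for x, y in points)


def separates(red, blue, w, b):
    """True if all red points fall on the negative side and all blue on the positive."""
    return (all(w[0] * x + w[1] * y + b < 0 for x, y in red)
            and all(w[0] * x + w[1] * y + b > 0 for x, y in blue))


# Invented 2D points.
red = [(1, 1), (2, 1), (1, 2)]
blue = [(4, 4), (5, 4), (4, 5)]

way_1 = ((1.0, 0.0), -3.0)  # vertical line x = 3
way_2 = ((1.0, 1.0), -5.5)  # diagonal line x + y = 5.5

margin_1 = margin(red + blue, *way_1)
margin_2 = margin(red + blue, *way_2)
print(margin_1, margin_2)
```

Both lines separate the two colors, but the diagonal one leaves more safety space, so it is the one an SVM would prefer of these two.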
Classification vs Regression
Classification:
- Groups the output into classes
- If the target is a discrete/categorical variable, it is a classification problem
Regression:
- Predicts a numeric output value using training data
- If the target is a real number/continuous variable, it is a regression problem
Unsupervised learning
Clustering
- Items inside a cluster should be very similar to each other, but very different
from those outside.
- Groups the data without being told ahead of time what the groups are
- Used for knowledge discovery rather than prediction
- Clustering is useful whenever there is diverse and varied data
A well-known clustering method: k-means
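A bare-bones version of k-means alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. The sketch below assumes 2D points and takes the starting centroids as an argument; the data and starting positions are invented for illustration.

```python
def k_means(points, centroids, max_iters=100):
    """Plain k-means on 2D points. Returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iters):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append((sum(p[0] for p in members) / len(members),
                                      sum(p[1] for p in members) / len(members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return labels, centroids


# Invented data: two visually obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = k_means(points, centroids=[(1.0, 1.0), (8.0, 8.0)])
print(labels, centroids)
```

No labels are supplied anywhere; the grouping comes only from which points lie close to each other.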
Thank You!
