SlideShare a Scribd company logo
State of the Art in Machine Learning
Poul Petersen
BigML
2
What is ML?
“a field of study that gives computers the
ability to learn without being explicitly
programmed”
Professor Arthur Samuel, 1959
•What “ability to learn” do computers have?
•What does “explicitly programmed” mean?
Square Feet
Price
Sacramento Real Estate Prices by sq_ft
3
Ability to Learn
4
Ability to Learn
• Provide computer with examples of the relationship between
square footage X and price Y.
• The computer learns the equation of a line F() which fits the
examples.
• The computer can now predict the price of any home
knowing only the square footage: F( x ) = y
• This model is known as Linear Regression. There are other
types of models.
• You may have noticed there was some points that did not fit.
This is important!
5
Learning Problems (fit)
Under-fitting Over-fitting
• Model does not fit well enough
• Does not capture the underlying trend
of the data
• Change algorithm or features
• Model fits too well does not “generalize”
• Captures the noise or outliers of the
data
• Change algorithm or filter outliers
6
Learning Problems (missing)
• Missing values at training/prediction time
• Some algorithms can handle missing values, some no
• Missing data is sometimes important
• Replace missing values
• Predict missing values
7
Learning Problems (missing)
Missing@
Decision
Trees
KNN
Logistic
Regression
Naive
Bayes
Neural
Networks
Training Yes No No Yes Yes*
Prediction Yes No No Yes No
8
Not Explicitly Programmed
Control System
“Customers go on vacation… We need a program that
predicts if the light will come on when the control
system flips the switch.”
9
Not Explicitly Programmed
@iLoveRuby
Switch Light?
on TRUE
off FALSE
• @iLoveRuby reasoned the rules by experience
• Then programmed the rules explicitly
10
Not Explicitly Programmed
Switch Light?
on TRUE
off FALSE
• The ML System reasoned the rules from the data and
created a Model
• Functionally the Model is the same as @iLoveRuby’s
explicit model
@iLoveML
question
answer
ML System Model
11
Not Explicitly Programmed
• how long has the bulb has been in service
• reliability of brand
• rated power: higher power shorter life
• duty cycle
• room conditions: temperature, humidity
Goal is to predict if the bulb will come on

but the switch is not the important variable:
Even worse: None of these conditions is absolute.
12
Not Explicitly Programmed
brand power age duty temp humidity FAIL?
koala 45 338 1 16 0.03 FALSE
otter 15 140 1 27 0.27 FALSE
koala 15 315 1 19 0.37 TRUE
otter 45 338 1 29 0.27 TRUE
koala 45 211 1 23 0.85 TRUE
otter 15 328 1 17 0.56 FALSE
koala 15 318 2 22 0.45 TRUE
koala 15 273 1 27 0.18 FALSE
koala 45 102 1 21 0.48 FALSE
koala 15 110 2 15 0.99 TRUE
otter 45 355 2 15 0.01 FALSE
otter 15 69 1 24 0.70 FALSE
koala 15 69 1 24 0.70 FALSE
koala 15 337 2 27 0.83 TRUE
13
Not Explicitly Programmed
@iLoveRuby
• Multi-dimensional data is much harder to find rules
• Explicit program requires modification
brand power age duty temp humidity FAIL?
koala 45 338 1 16 0.03 FALSE
otter 15 140 1 27 0.27 FALSE
koala 15 315 1 19 0.37 TRUE
otter 45 338 1 29 0.27 TRUE
koala 45 211 1 23 0.85 TRUE
otter 15 328 1 17 0.56 FALSE
koala 15 318 2 22 0.45 TRUE
koala 15 273 1 27 0.18 FALSE
koala 45 102 1 21 0.48 FALSE
koala 15 110 2 15 0.99 TRUE
otter 45 355 2 15 0.01 FALSE
otter 15 69 1 24 0.70 FALSE
koala 15 69 1 24 0.70 FALSE
koala 15 337 2 27 0.83 TRUE
@iLoveML ML System Model
14
Not Explicitly Programmed
• ML System easily re-trains on new data
question
answer
brand power age duty temp humidity FAIL?
koala 45 338 1 16 0.03 FALSE
otter 15 140 1 27 0.27 FALSE
koala 15 315 1 19 0.37 TRUE
otter 45 338 1 29 0.27 TRUE
koala 45 211 1 23 0.85 TRUE
otter 15 328 1 17 0.56 FALSE
koala 15 318 2 22 0.45 TRUE
koala 15 273 1 27 0.18 FALSE
koala 45 102 1 21 0.48 FALSE
koala 15 110 2 15 0.99 TRUE
otter 45 355 2 15 0.01 FALSE
otter 15 69 1 24 0.70 FALSE
koala 15 69 1 24 0.70 FALSE
koala 15 337 2 27 0.83 TRUE
@iLoveML ML System Model
15
Terminology
input
prediction
brand power age duty temp humidity FAIL?
koala 45 338 1 16 0.03 FALSE
otter 15 140 1 27 0.27 FALSE
koala 15 315 1 19 0.37 TRUE
otter 45 338 1 29 0.27 TRUE
koala 45 211 1 23 0.85 TRUE
otter 15 328 1 17 0.56 FALSE
koala 15 318 2 22 0.45 TRUE
koala 15 273 1 27 0.18 FALSE
koala 45 102 1 21 0.48 FALSE
koala 15 110 2 15 0.99 TRUE
otter 45 355 2 15 0.01 FALSE
otter 15 69 1 24 0.70 FALSE
koala 15 69 1 24 0.70 FALSE
koala 15 337 2 27 0.83 TRUE
datasource
training
“putting into
production”
16
Supervised Learning
brand power age duty temp humidity FAIL?
koala 45 338 1 16 0.03 FALSE
otter 15 140 1 27 0.27 FALSE
koala 15 315 1 19 0.37 TRUE
otter 45 338 1 29 0.27 TRUE
koala 45 211 1 23 0.85 TRUE
otter 15 328 1 17 0.56 FALSE
koala 15 318 2 22 0.45 TRUE
features
instances
label
Labeled Data
17
Supervised Learning
animal state … proximity action
tiger hungry … close run
elephant happy … far take picture
Classification
animal state … proximity min_kmh
tiger hungry … close 70
hippo angry … far 10
Regression
label
animal state … proximity action1 action2
tiger hungry … close run look untasty
elephant happy … far take picture call friends
Multi-Label Classification
18
Unsupervised Learning
date customer account auth class zip amount
Mon Bob 3421 pin clothes 46140 135
Tue Bob 3421 sign food 46140 401
Tue Alice 2456 pin food 12222 234
Wed Sally 6788 pin gas 26339 94
Wed Bob 3421 pin tech 21350 2459
Wed Bob 3421 pin gas 46140 83
The Sally 6788 sign food 26339 51
features
instances
Unlabeled Data
19
Unsupervised Learning
date customer account auth class zip amount
Mon Bob 3421 pin clothes 46140 135
Tue Bob 3421 sign food 46140 401
Tue Alice 2456 pin food 12222 234
Wed Sally 6788 pin gas 26339 94
Wed Bob 3421 pin tech 21350 2459
Wed Bob 3421 pin gas 46140 83
The Sally 6788 sign food 26339 51
Clustering
date customer account auth class zip amount
Mon Bob 3421 pin clothes 46140 135
Tue Bob 3421 sign food 46140 401
Tue Alice 2456 pin food 12222 234
Wed Sally 6788 pin gas 26339 94
Wed Bob 3421 pin tech 21350 2459
Wed Bob 3421 pin gas 46140 83
The Sally 6788 sign food 26339 51
Anomaly Detection
similar
unusual
20
Semi-Supervised Learning
brand power age duty temp humidity FAIL?
koala 45 338 1 16 0.03 FALSE
otter 15 140 1 27 0.27
koala 15 315 1 19 0.37
otter 45 338 1 29 0.27 TRUE
koala 45 211 1 23 0.85 TRUE
otter 15 328 1 17 0.56
koala 15 318 2 22 0.45 TRUE
label
Labeled and Unlabeled Data
}infer
21
Data Types
numeric
1 2 3
1, 2.0, 3, -5.4 categoricaltrue, yes, red, mammal categoricalcategorical
A B C
DATE-TIME2013-09-25 10:02
DATE-TIME
YEAR
MONTH
DAY-OF-MONTH
YYYY-MM-DD
DAY-OF-WEEK
HOUR
MINUTE
YYYY-MM-DD
YYYY-MM-DD
M-T-W-T-F-S-D
HH:MM:SS
HH:MM:SS
2013
September
25
Wednesday
10
02
text
Be not afraid of greatness:
some are born great, some
achieve greatness, and
some have greatness
thrust upon 'em.
text
“great”
“afraid”
“born”
“some”
appears 2 times
appears 1 time
appears 1 time
appears 2 times
22
Text Analysis
Be not afraid of greatness:
some are born great, some
achieve greatness, and
some have greatness
thrust upon 'em.
22
Text Analysis
Be not afraid of greatness:
some are born great, some
achieve greatness, and
some have greatness
thrust upon 'em.
great: appears 4 times
23
Text Analysis
great afraid born achieve
4 1 1 1
… … … …
Be not afraid of greatness:
some are born great, some achieve
greatness, and some have greatness
thrust upon ‘em.
Model
The token “great” 

does not occur
The token “afraid” 

occurs more than once
24
Topic Modeling
25
Brief History of ML
1950 1960 1970 1980 1990 2000 2010
Perceptron
Neural
Networks
Ensembles
Support Vector Machines
Boosting
Interpretability
Rosenblatt, 1957
Quinlan, 1979 (ID3),
Minsky, 1969
Vapnik, 1963 Corina & Vapnik, 1995
Schapire, 1989 (Boosting)
Schapire, 1995 (Adaboost)
Breiman, 2001 (Random Forests)
Breiman, 1994 (Bagging)
Deep Learning
Hinton, 2006Fukushima, 1989 (ANN)
Breiman, 1984 (CART)
2020
+
-
Decision Trees
26
Why ML Now?
• Decreasing cost of data
• Abundant computing power, especially cloud
• Machine Learning APIs
• Abundance of APIs + internet to combine easily
27
Composability
28
The Stages of a ML App
State the problem
Data Wrangling
Feature Engineering
Learning
Deploying
Predicting
Measuring Impact

More Related Content

What's hot

358 33 powerpoint-slides_16-files-their-organization_chapter-16
358 33 powerpoint-slides_16-files-their-organization_chapter-16358 33 powerpoint-slides_16-files-their-organization_chapter-16
358 33 powerpoint-slides_16-files-their-organization_chapter-16
sumitbardhan
 
Knowledge representation and Predicate logic
Knowledge representation and Predicate logicKnowledge representation and Predicate logic
Knowledge representation and Predicate logic
Amey Kerkar
 
Database Normalization by Dr. Kamal Gulati
Database Normalization by Dr. Kamal GulatiDatabase Normalization by Dr. Kamal Gulati
Predicate logic
 Predicate logic Predicate logic
Predicate logic
Harini Balamurugan
 
Presentation on "Knowledge acquisition & validation"
  Presentation on "Knowledge acquisition & validation"  Presentation on "Knowledge acquisition & validation"
Presentation on "Knowledge acquisition & validation"
Aditya Sarkar
 
Knowledge representation using predicate logic
Knowledge representation using predicate logicKnowledge representation using predicate logic
Knowledge representation using predicate logic
HarshitaSharma285596
 
Spatial Indexing
Spatial IndexingSpatial Indexing
Spatial Indexing
torp42
 
DBMS Question bank
DBMS Question bankDBMS Question bank
DBMS Question bank
Sara Sahu
 
Umldiagrams ForOOAD Lab B.Tech 4-1
Umldiagrams ForOOAD Lab B.Tech 4-1Umldiagrams ForOOAD Lab B.Tech 4-1
Umldiagrams ForOOAD Lab B.Tech 4-1
ShashikanthKaninde
 
Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)
Jargalsaikhan Alyeksandr
 
Searching, Sorting and Hashing Techniques
Searching, Sorting and Hashing TechniquesSearching, Sorting and Hashing Techniques
Searching, Sorting and Hashing Techniques
Selvaraj Seerangan
 
Regular Expressions To Finite Automata
Regular Expressions To Finite AutomataRegular Expressions To Finite Automata
Regular Expressions To Finite Automata
International Institute of Information Technology (I²IT)
 
Primary Key & Foreign Key part10
Primary Key & Foreign Key part10Primary Key & Foreign Key part10
Primary Key & Foreign Key part10
DrMohammed Qassim
 
Integrity Constraints
Integrity ConstraintsIntegrity Constraints
Integrity Constraints
madhav bansal
 
Relational algebra ppt
Relational algebra pptRelational algebra ppt
Relational algebra ppt
GirdharRatne
 
Feature selection
Feature selectionFeature selection
Feature selection
Dong Guo
 
15 puzzle problem using branch and bound
15 puzzle problem using branch and bound15 puzzle problem using branch and bound
15 puzzle problem using branch and bound
Abhishek Singh
 
State space search and Problem Solving techniques
State space search and Problem Solving techniquesState space search and Problem Solving techniques
State space search and Problem Solving techniques
Kirti Verma
 
Active database
Active databaseActive database
Active database
Dabbal Singh Mahara
 

What's hot (20)

358 33 powerpoint-slides_16-files-their-organization_chapter-16
358 33 powerpoint-slides_16-files-their-organization_chapter-16358 33 powerpoint-slides_16-files-their-organization_chapter-16
358 33 powerpoint-slides_16-files-their-organization_chapter-16
 
Knowledge representation and Predicate logic
Knowledge representation and Predicate logicKnowledge representation and Predicate logic
Knowledge representation and Predicate logic
 
Database Normalization by Dr. Kamal Gulati
Database Normalization by Dr. Kamal GulatiDatabase Normalization by Dr. Kamal Gulati
Database Normalization by Dr. Kamal Gulati
 
Predicate logic
 Predicate logic Predicate logic
Predicate logic
 
Presentation on "Knowledge acquisition & validation"
  Presentation on "Knowledge acquisition & validation"  Presentation on "Knowledge acquisition & validation"
Presentation on "Knowledge acquisition & validation"
 
Knowledge representation using predicate logic
Knowledge representation using predicate logicKnowledge representation using predicate logic
Knowledge representation using predicate logic
 
Spatial Indexing
Spatial IndexingSpatial Indexing
Spatial Indexing
 
DBMS Question bank
DBMS Question bankDBMS Question bank
DBMS Question bank
 
Umldiagrams ForOOAD Lab B.Tech 4-1
Umldiagrams ForOOAD Lab B.Tech 4-1Umldiagrams ForOOAD Lab B.Tech 4-1
Umldiagrams ForOOAD Lab B.Tech 4-1
 
Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)
 
Searching, Sorting and Hashing Techniques
Searching, Sorting and Hashing TechniquesSearching, Sorting and Hashing Techniques
Searching, Sorting and Hashing Techniques
 
Regular Expressions To Finite Automata
Regular Expressions To Finite AutomataRegular Expressions To Finite Automata
Regular Expressions To Finite Automata
 
Primary Key & Foreign Key part10
Primary Key & Foreign Key part10Primary Key & Foreign Key part10
Primary Key & Foreign Key part10
 
Integrity Constraints
Integrity ConstraintsIntegrity Constraints
Integrity Constraints
 
Relational algebra ppt
Relational algebra pptRelational algebra ppt
Relational algebra ppt
 
Normalization in databases
Normalization in databasesNormalization in databases
Normalization in databases
 
Feature selection
Feature selectionFeature selection
Feature selection
 
15 puzzle problem using branch and bound
15 puzzle problem using branch and bound15 puzzle problem using branch and bound
15 puzzle problem using branch and bound
 
State space search and Problem Solving techniques
State space search and Problem Solving techniquesState space search and Problem Solving techniques
State space search and Problem Solving techniques
 
Active database
Active databaseActive database
Active database
 

Viewers also liked

L5. Data Transformation and Feature Engineering
L5. Data Transformation and Feature EngineeringL5. Data Transformation and Feature Engineering
L5. Data Transformation and Feature Engineering
Machine Learning Valencia
 
Feature Engineering
Feature Engineering Feature Engineering
Feature Engineering
odsc
 
Introduction to Machine Learning and Deep Learning
Introduction to Machine Learning and Deep LearningIntroduction to Machine Learning and Deep Learning
Introduction to Machine Learning and Deep Learning
Terry Taewoong Um
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
Rahul Jain
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningLior Rokach
 
A field guide the machine learning zoo
A field guide the machine learning zoo A field guide the machine learning zoo
A field guide the machine learning zoo
Theodoros Vasiloudis
 
Introduction to Machine Learning* Prof. D. Spears
Introduction to Machine Learning* Prof. D. SpearsIntroduction to Machine Learning* Prof. D. Spears
Introduction to Machine Learning* Prof. D. Spearsbutest
 
introducción a Machine Learning
introducción a Machine Learningintroducción a Machine Learning
introducción a Machine Learningbutest
 
Machine Learning - Where to Next?, May 2015
Machine Learning  - Where to Next?, May 2015Machine Learning  - Where to Next?, May 2015
Machine Learning - Where to Next?, May 2015
Peter Morgan
 
Ai history to-m-learning
Ai history to-m-learningAi history to-m-learning
Ai history to-m-learning
Kyung Eun Park
 
An explanation of machine learning for business
An explanation of machine learning for businessAn explanation of machine learning for business
An explanation of machine learning for business
Clement Levallois
 
Machine learning Lecture 1
Machine learning Lecture 1Machine learning Lecture 1
Machine learning Lecture 1
Srinivasan R
 
BSSML16 L7. Feature Engineering
BSSML16 L7. Feature EngineeringBSSML16 L7. Feature Engineering
BSSML16 L7. Feature Engineering
BigML, Inc
 
A brief history of machine learning
A brief history of  machine learningA brief history of  machine learning
A brief history of machine learning
Robert Colner
 
Make Sense Out of Data with Feature Engineering
Make Sense Out of Data with Feature EngineeringMake Sense Out of Data with Feature Engineering
Make Sense Out of Data with Feature Engineering
DataRobot
 
Machine Learning in R
Machine Learning in RMachine Learning in R
Machine Learning in R
Alexandros Karatzoglou
 
Microservices, containers, and machine learning
Microservices, containers, and machine learningMicroservices, containers, and machine learning
Microservices, containers, and machine learning
Paco Nathan
 
Machine learning workshop @DYP Pune
Machine learning workshop @DYP PuneMachine learning workshop @DYP Pune
Machine learning workshop @DYP Pune
Ganesh Raskar
 
機器學習速遊
機器學習速遊機器學習速遊
機器學習速遊
台灣資料科學年會
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
Anastasia Jakubow
 

Viewers also liked (20)

L5. Data Transformation and Feature Engineering
L5. Data Transformation and Feature EngineeringL5. Data Transformation and Feature Engineering
L5. Data Transformation and Feature Engineering
 
Feature Engineering
Feature Engineering Feature Engineering
Feature Engineering
 
Introduction to Machine Learning and Deep Learning
Introduction to Machine Learning and Deep LearningIntroduction to Machine Learning and Deep Learning
Introduction to Machine Learning and Deep Learning
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
A field guide the machine learning zoo
A field guide the machine learning zoo A field guide the machine learning zoo
A field guide the machine learning zoo
 
Introduction to Machine Learning* Prof. D. Spears
Introduction to Machine Learning* Prof. D. SpearsIntroduction to Machine Learning* Prof. D. Spears
Introduction to Machine Learning* Prof. D. Spears
 
introducción a Machine Learning
introducción a Machine Learningintroducción a Machine Learning
introducción a Machine Learning
 
Machine Learning - Where to Next?, May 2015
Machine Learning  - Where to Next?, May 2015Machine Learning  - Where to Next?, May 2015
Machine Learning - Where to Next?, May 2015
 
Ai history to-m-learning
Ai history to-m-learningAi history to-m-learning
Ai history to-m-learning
 
An explanation of machine learning for business
An explanation of machine learning for businessAn explanation of machine learning for business
An explanation of machine learning for business
 
Machine learning Lecture 1
Machine learning Lecture 1Machine learning Lecture 1
Machine learning Lecture 1
 
BSSML16 L7. Feature Engineering
BSSML16 L7. Feature EngineeringBSSML16 L7. Feature Engineering
BSSML16 L7. Feature Engineering
 
A brief history of machine learning
A brief history of  machine learningA brief history of  machine learning
A brief history of machine learning
 
Make Sense Out of Data with Feature Engineering
Make Sense Out of Data with Feature EngineeringMake Sense Out of Data with Feature Engineering
Make Sense Out of Data with Feature Engineering
 
Machine Learning in R
Machine Learning in RMachine Learning in R
Machine Learning in R
 
Microservices, containers, and machine learning
Microservices, containers, and machine learningMicroservices, containers, and machine learning
Microservices, containers, and machine learning
 
Machine learning workshop @DYP Pune
Machine learning workshop @DYP PuneMachine learning workshop @DYP Pune
Machine learning workshop @DYP Pune
 
機器學習速遊
機器學習速遊機器學習速遊
機器學習速遊
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 

Similar to L1. State of the Art in Machine Learning

MLSEV. Models, Evaluations and Ensembles
MLSEV. Models, Evaluations and Ensembles MLSEV. Models, Evaluations and Ensembles
MLSEV. Models, Evaluations and Ensembles
BigML, Inc
 
DutchMLSchool. Models, Evaluations, and Ensembles
DutchMLSchool. Models, Evaluations, and EnsemblesDutchMLSchool. Models, Evaluations, and Ensembles
DutchMLSchool. Models, Evaluations, and Ensembles
BigML, Inc
 
BSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and EvaluationsBSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and Evaluations
BigML, Inc
 
DutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End MLDutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End ML
BigML, Inc
 
2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine
2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine
2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine
lucenerevolution
 
DutchMLSchool. Automating Decision Making
DutchMLSchool. Automating Decision MakingDutchMLSchool. Automating Decision Making
DutchMLSchool. Automating Decision Making
BigML, Inc
 
VSSML17 L3. Clusters and Anomaly Detection
VSSML17 L3. Clusters and Anomaly DetectionVSSML17 L3. Clusters and Anomaly Detection
VSSML17 L3. Clusters and Anomaly Detection
BigML, Inc
 
Deep Learning and Design Thinking
Deep Learning and Design ThinkingDeep Learning and Design Thinking
Deep Learning and Design Thinking
Yen-lung Tsai
 
23T1W3 Mathematics ppt on Number Sequence.ppt
23T1W3 Mathematics ppt on Number Sequence.ppt23T1W3 Mathematics ppt on Number Sequence.ppt
23T1W3 Mathematics ppt on Number Sequence.ppt
RuthAdelekun1
 
Finding needles in haystacks with deep neural networks
Finding needles in haystacks with deep neural networksFinding needles in haystacks with deep neural networks
Finding needles in haystacks with deep neural networks
Calvin Giles
 

Similar to L1. State of the Art in Machine Learning (10)

MLSEV. Models, Evaluations and Ensembles
MLSEV. Models, Evaluations and Ensembles MLSEV. Models, Evaluations and Ensembles
MLSEV. Models, Evaluations and Ensembles
 
DutchMLSchool. Models, Evaluations, and Ensembles
DutchMLSchool. Models, Evaluations, and EnsemblesDutchMLSchool. Models, Evaluations, and Ensembles
DutchMLSchool. Models, Evaluations, and Ensembles
 
BSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and EvaluationsBSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and Evaluations
 
DutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End MLDutchMLSchool 2022 - End-to-End ML
DutchMLSchool 2022 - End-to-End ML
 
2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine
2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine
2013 11-06 lsr-dublin_m_hausenblas_solr as recommendation engine
 
DutchMLSchool. Automating Decision Making
DutchMLSchool. Automating Decision MakingDutchMLSchool. Automating Decision Making
DutchMLSchool. Automating Decision Making
 
VSSML17 L3. Clusters and Anomaly Detection
VSSML17 L3. Clusters and Anomaly DetectionVSSML17 L3. Clusters and Anomaly Detection
VSSML17 L3. Clusters and Anomaly Detection
 
Deep Learning and Design Thinking
Deep Learning and Design ThinkingDeep Learning and Design Thinking
Deep Learning and Design Thinking
 
23T1W3 Mathematics ppt on Number Sequence.ppt
23T1W3 Mathematics ppt on Number Sequence.ppt23T1W3 Mathematics ppt on Number Sequence.ppt
23T1W3 Mathematics ppt on Number Sequence.ppt
 
Finding needles in haystacks with deep neural networks
Finding needles in haystacks with deep neural networksFinding needles in haystacks with deep neural networks
Finding needles in haystacks with deep neural networks
 

More from Machine Learning Valencia

From Turing To Humanoid Robots - Ramón López de Mántaras
From Turing To Humanoid Robots - Ramón López de MántarasFrom Turing To Humanoid Robots - Ramón López de Mántaras
From Turing To Humanoid Robots - Ramón López de Mántaras
Machine Learning Valencia
 
Artificial Intelligence Progress - Tom Dietterich
Artificial Intelligence Progress - Tom DietterichArtificial Intelligence Progress - Tom Dietterich
Artificial Intelligence Progress - Tom Dietterich
Machine Learning Valencia
 
LR2. Summary Day 2
LR2. Summary Day 2LR2. Summary Day 2
LR2. Summary Day 2
Machine Learning Valencia
 
L15. Machine Learning - Black Art
L15. Machine Learning - Black ArtL15. Machine Learning - Black Art
L15. Machine Learning - Black Art
Machine Learning Valencia
 
L14. Anomaly Detection
L14. Anomaly DetectionL14. Anomaly Detection
L14. Anomaly Detection
Machine Learning Valencia
 
L13. Cluster Analysis
L13. Cluster AnalysisL13. Cluster Analysis
L13. Cluster Analysis
Machine Learning Valencia
 
L9. Real World Machine Learning - Cooking Predictions
L9. Real World Machine Learning - Cooking PredictionsL9. Real World Machine Learning - Cooking Predictions
L9. Real World Machine Learning - Cooking Predictions
Machine Learning Valencia
 
L11. The Future of Machine Learning
L11. The Future of Machine LearningL11. The Future of Machine Learning
L11. The Future of Machine Learning
Machine Learning Valencia
 
L7. A developers’ overview of the world of predictive APIs
L7. A developers’ overview of the world of predictive APIsL7. A developers’ overview of the world of predictive APIs
L7. A developers’ overview of the world of predictive APIs
Machine Learning Valencia
 
LR1. Summary Day 1
LR1. Summary Day 1LR1. Summary Day 1
LR1. Summary Day 1
Machine Learning Valencia
 
L6. Unbalanced Datasets
L6. Unbalanced DatasetsL6. Unbalanced Datasets
L6. Unbalanced Datasets
Machine Learning Valencia
 
L4. Ensembles of Decision Trees
L4. Ensembles of Decision TreesL4. Ensembles of Decision Trees
L4. Ensembles of Decision Trees
Machine Learning Valencia
 
L3. Decision Trees
L3. Decision TreesL3. Decision Trees
L3. Decision Trees
Machine Learning Valencia
 
L2. Evaluating Machine Learning Algorithms I
L2. Evaluating Machine Learning Algorithms IL2. Evaluating Machine Learning Algorithms I
L2. Evaluating Machine Learning Algorithms I
Machine Learning Valencia
 

More from Machine Learning Valencia (14)

From Turing To Humanoid Robots - Ramón López de Mántaras
From Turing To Humanoid Robots - Ramón López de MántarasFrom Turing To Humanoid Robots - Ramón López de Mántaras
From Turing To Humanoid Robots - Ramón López de Mántaras
 
Artificial Intelligence Progress - Tom Dietterich
Artificial Intelligence Progress - Tom DietterichArtificial Intelligence Progress - Tom Dietterich
Artificial Intelligence Progress - Tom Dietterich
 
LR2. Summary Day 2
LR2. Summary Day 2LR2. Summary Day 2
LR2. Summary Day 2
 
L15. Machine Learning - Black Art
L15. Machine Learning - Black ArtL15. Machine Learning - Black Art
L15. Machine Learning - Black Art
 
L14. Anomaly Detection
L14. Anomaly DetectionL14. Anomaly Detection
L14. Anomaly Detection
 
L13. Cluster Analysis
L13. Cluster AnalysisL13. Cluster Analysis
L13. Cluster Analysis
 
L9. Real World Machine Learning - Cooking Predictions
L9. Real World Machine Learning - Cooking PredictionsL9. Real World Machine Learning - Cooking Predictions
L9. Real World Machine Learning - Cooking Predictions
 
L11. The Future of Machine Learning
L11. The Future of Machine LearningL11. The Future of Machine Learning
L11. The Future of Machine Learning
 
L7. A developers’ overview of the world of predictive APIs
L7. A developers’ overview of the world of predictive APIsL7. A developers’ overview of the world of predictive APIs
L7. A developers’ overview of the world of predictive APIs
 
LR1. Summary Day 1
LR1. Summary Day 1LR1. Summary Day 1
LR1. Summary Day 1
 
L6. Unbalanced Datasets
L6. Unbalanced DatasetsL6. Unbalanced Datasets
L6. Unbalanced Datasets
 
L4. Ensembles of Decision Trees
L4. Ensembles of Decision TreesL4. Ensembles of Decision Trees
L4. Ensembles of Decision Trees
 
L3. Decision Trees
L3. Decision TreesL3. Decision Trees
L3. Decision Trees
 
L2. Evaluating Machine Learning Algorithms I
L2. Evaluating Machine Learning Algorithms IL2. Evaluating Machine Learning Algorithms I
L2. Evaluating Machine Learning Algorithms I
 

Recently uploaded

Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
Tiktokethiodaily
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 

Recently uploaded (20)

Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 

L1. State of the Art in Machine Learning

  • 1. State of the Art in Machine Learning Poul Petersen BigML
  • 2. 2 What is ML? “a field of study that gives computers the ability to learn without being explicitly programmed” Professor Arthur Samuel, 1959 •What “ability to learn” do computers have? •What does “explicitly programmed” mean?
  • 3. Square Feet Price Sacramento Real Estate Prices by sq_ft 3 Ability to Learn
  • 4. 4 Ability to Learn • Provide computer with examples of the relationship between square footage X and price Y. • The computer learns the equation of a line F() which fits the examples. • The computer can now predict the price of any home knowing only the square footage: F( x ) = y • This model is known as Linear Regression. There are other types of models. • You may have noticed there was some points that did not fit. This is important!
  • 5. 5 Learning Problems (fit) Under-fitting Over-fitting • Model does not fit well enough • Does not capture the underlying trend of the data • Change algorithm or features • Model fits too well does not “generalize” • Captures the noise or outliers of the data • Change algorithm or filter outliers
  • 6. 6 Learning Problems (missing) • Missing values at training/prediction time • Some algorithms can handle missing values, some no • Missing data is sometimes important • Replace missing values • Predict missing values
  • 8. 8 Not Explicitly Programmed Control System “Customers go on vacation… We need a program that predicts if the light will come on when the control system flips the switch.”
  • 9. 9 Not Explicitly Programmed @iLoveRuby Switch Light? on TRUE off FALSE • @iLoveRuby reasoned the rules by experience • Then programmed the rules explicitly
  • 10. 10 Not Explicitly Programmed Switch Light? on TRUE off FALSE • The ML System reasoned the rules from the data and created a Model • Functionally the Model is the same as @iLoveRuby’s explicit model @iLoveML question answer ML System Model
  • 11. 11 Not Explicitly Programmed • how long has the bulb has been in service • reliability of brand • rated power: higher power shorter life • duty cycle • room conditions: temperature, humidity Goal is to predict if the bulb will come on but the switch is not the important variable: Even worse: None of these conditions is absolute.
  • 12. 12 Not Explicitly Programmed brand power age duty temp humidity FAIL? koala 45 338 1 16 0.03 FALSE otter 15 140 1 27 0.27 FALSE koala 15 315 1 19 0.37 TRUE otter 45 338 1 29 0.27 TRUE koala 45 211 1 23 0.85 TRUE otter 15 328 1 17 0.56 FALSE koala 15 318 2 22 0.45 TRUE koala 15 273 1 27 0.18 FALSE koala 45 102 1 21 0.48 FALSE koala 15 110 2 15 0.99 TRUE otter 45 355 2 15 0.01 FALSE otter 15 69 1 24 0.70 FALSE koala 15 69 1 24 0.70 FALSE koala 15 337 2 27 0.83 TRUE
  • 13. 13 Not Explicitly Programmed @iLoveRuby • Multi-dimensional data is much harder to find rules • Explicit program requires modification brand power age duty temp humidity FAIL? koala 45 338 1 16 0.03 FALSE otter 15 140 1 27 0.27 FALSE koala 15 315 1 19 0.37 TRUE otter 45 338 1 29 0.27 TRUE koala 45 211 1 23 0.85 TRUE otter 15 328 1 17 0.56 FALSE koala 15 318 2 22 0.45 TRUE koala 15 273 1 27 0.18 FALSE koala 45 102 1 21 0.48 FALSE koala 15 110 2 15 0.99 TRUE otter 45 355 2 15 0.01 FALSE otter 15 69 1 24 0.70 FALSE koala 15 69 1 24 0.70 FALSE koala 15 337 2 27 0.83 TRUE
  • 14. @iLoveML ML System Model 14 Not Explicitly Programmed • ML System easily re-trains on new data question answer brand power age duty temp humidity FAIL? koala 45 338 1 16 0.03 FALSE otter 15 140 1 27 0.27 FALSE koala 15 315 1 19 0.37 TRUE otter 45 338 1 29 0.27 TRUE koala 45 211 1 23 0.85 TRUE otter 15 328 1 17 0.56 FALSE koala 15 318 2 22 0.45 TRUE koala 15 273 1 27 0.18 FALSE koala 45 102 1 21 0.48 FALSE koala 15 110 2 15 0.99 TRUE otter 45 355 2 15 0.01 FALSE otter 15 69 1 24 0.70 FALSE koala 15 69 1 24 0.70 FALSE koala 15 337 2 27 0.83 TRUE
  • 15. @iLoveML ML System Model 15 Terminology input prediction brand power age duty temp humidity FAIL? koala 45 338 1 16 0.03 FALSE otter 15 140 1 27 0.27 FALSE koala 15 315 1 19 0.37 TRUE otter 45 338 1 29 0.27 TRUE koala 45 211 1 23 0.85 TRUE otter 15 328 1 17 0.56 FALSE koala 15 318 2 22 0.45 TRUE koala 15 273 1 27 0.18 FALSE koala 45 102 1 21 0.48 FALSE koala 15 110 2 15 0.99 TRUE otter 45 355 2 15 0.01 FALSE otter 15 69 1 24 0.70 FALSE koala 15 69 1 24 0.70 FALSE koala 15 337 2 27 0.83 TRUE datasource training “putting into production”
  • 16. 16 Supervised Learning brand power age duty temp humidity FAIL? koala 45 338 1 16 0.03 FALSE otter 15 140 1 27 0.27 FALSE koala 15 315 1 19 0.37 TRUE otter 45 338 1 29 0.27 TRUE koala 45 211 1 23 0.85 TRUE otter 15 328 1 17 0.56 FALSE koala 15 318 2 22 0.45 TRUE features instances label Labeled Data
  • 17. 17 Supervised Learning animal state … proximity action tiger hungry … close run elephant happy … far take picture Classification animal state … proximity min_kmh tiger hungry … close 70 hippo angry … far 10 Regression label animal state … proximity action1 action2 tiger hungry … close run look untasty elephant happy … far take picture call friends Multi-Label Classification
  • 18. 18 Unsupervised Learning date customer account auth class zip amount Mon Bob 3421 pin clothes 46140 135 Tue Bob 3421 sign food 46140 401 Tue Alice 2456 pin food 12222 234 Wed Sally 6788 pin gas 26339 94 Wed Bob 3421 pin tech 21350 2459 Wed Bob 3421 pin gas 46140 83 The Sally 6788 sign food 26339 51 features instances Unlabeled Data
  • 19. 19 Unsupervised Learning date customer account auth class zip amount Mon Bob 3421 pin clothes 46140 135 Tue Bob 3421 sign food 46140 401 Tue Alice 2456 pin food 12222 234 Wed Sally 6788 pin gas 26339 94 Wed Bob 3421 pin tech 21350 2459 Wed Bob 3421 pin gas 46140 83 The Sally 6788 sign food 26339 51 Clustering date customer account auth class zip amount Mon Bob 3421 pin clothes 46140 135 Tue Bob 3421 sign food 46140 401 Tue Alice 2456 pin food 12222 234 Wed Sally 6788 pin gas 26339 94 Wed Bob 3421 pin tech 21350 2459 Wed Bob 3421 pin gas 46140 83 The Sally 6788 sign food 26339 51 Anomaly Detection similar unusual
  • 20. 20 Semi-Supervised Learning brand power age duty temp humidity FAIL? koala 45 338 1 16 0.03 FALSE otter 15 140 1 27 0.27 koala 15 315 1 19 0.37 otter 45 338 1 29 0.27 TRUE koala 45 211 1 23 0.85 TRUE otter 15 328 1 17 0.56 koala 15 318 2 22 0.45 TRUE label Labeled and Unlabeled Data }infer
  • 21. 21 Data Types numeric 1 2 3 1, 2.0, 3, -5.4 categoricaltrue, yes, red, mammal categoricalcategorical A B C DATE-TIME2013-09-25 10:02 DATE-TIME YEAR MONTH DAY-OF-MONTH YYYY-MM-DD DAY-OF-WEEK HOUR MINUTE YYYY-MM-DD YYYY-MM-DD M-T-W-T-F-S-D HH:MM:SS HH:MM:SS 2013 September 25 Wednesday 10 02 text Be not afraid of greatness: some are born great, some achieve greatness, and some have greatness thrust upon 'em. text “great” “afraid” “born” “some” appears 2 times appears 1 time appears 1 time appears 2 times
  • 22. 22 Text Analysis Be not afraid of greatness: some are born great, some achieve greatness, and some have greatness thrust upon 'em.
  • 23. 22 Text Analysis Be not afraid of greatness: some are born great, some achieve greatness, and some have greatness thrust upon 'em. great: appears 4 times
  • 24. 23 Text Analysis great afraid born achieve 4 1 1 1 … … … … Be not afraid of greatness: some are born great, some achieve greatness, and some have greatness thrust upon ‘em. Model The token “great” does not occur The token “afraid” occurs more than once
  • 26. 25 Brief History of ML 1950 1960 1970 1980 1990 2000 2010 Perceptron Neural Networks Ensembles Support Vector Machines Boosting Interpretability Rosenblatt, 1957 Quinlan, 1979 (ID3), Minsky, 1969 Vapnik, 1963 Corina & Vapnik, 1995 Schapire, 1989 (Boosting) Schapire, 1995 (Adaboost) Breiman, 2001 (Random Forests) Breiman, 1994 (Bagging) Deep Learning Hinton, 2006Fukushima, 1989 (ANN) Breiman, 1984 (CART) 2020 + - Decision Trees
  • 27. 26 Why ML Now? • Decreasing cost of data • Abundant computing power, especially cloud • Machine Learning APIs • Abundance of APIs + internet to combine easily
  • 29. 28 The Stages of a ML App State the problem Data Wrangling Feature Engineering Learning Deploying Predicting Measuring Impact