My journey learning Machine Learning
The Good, The Bad and The Ugly
© 2021 THE DATA TOUCH LTD ALL RIGHTS RESERVED
Brief intro
about me
• French native living in London
• Passionate about adding business
value through data
• Passionate about empowering
corporates and Business Schools’
students with data knowledge
• Digital Analytics background
• Founded The Data Touch in 2016 to
keep doing the above but more freely!
• Strong interest in Machine Learning
• Delighted that you are here today!
Content
• A challenging start
• The Machine Learning process (in business terms)
• Key Machine Learning concepts
• Introduction to Python, Anaconda and Jupyter
• Tips and Resources
It’s a journey…
• Made up of maths and
programming at the same
time!
• BUT… any breakthrough
will feel amazing.
What my first
Python/ML day
looked and felt
like
Keep calm and
carry on coding!
“It always seems impossible
until it is done”
Nelson Mandela
“It’s always too soon to quit!”
Norman Vincent Peale
The Machine Learning Process (in business terms)
1) Define the problem
2) Acquire, prepare and explore the data
3) Apply the algorithms
4) Interpret the results
Key Concepts
• Train/Learn and Test
• Supervised vs Unsupervised Learning
• Mathematical Distance
• Dummy Variables
The Train/Learn
and Test Process
• Training Phase:
• You split your dataset into 2 portions: one for training, one for testing.
• You show the model the training portion in full: features (x)
and target variable (y).
• Testing Phase:
• You test the performance of your model on the
other portion of the dataset.
• You show your model the features (x) but
not the target variable (y) and have the
model predict the target variable based on
what it learnt in the training phase.
• You then score the model by comparing its
predictions with the real target values.
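The two phases can be sketched in plain Python. This is a minimal illustration, not a recommended implementation: the data is made up, `split_train_test` and `ThresholdModel` are hypothetical names, and the toy model assumes that the sessions labelled 1 have the higher x values.

```python
import random


def split_train_test(rows, test_fraction=0.25, seed=42):
    """Shuffle the rows, then cut them into a training and a test portion."""
    shuffled = list(rows)                  # copy, so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


class ThresholdModel:
    """Toy classifier (hypothetical): predicts 1 when x is above a learnt threshold."""

    def fit(self, xs, ys):
        # Training phase: the model sees the features (x) AND the target (y).
        mean_of_0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
        mean_of_1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
        self.threshold = (mean_of_0 + mean_of_1) / 2
        return self

    def predict(self, xs):
        # Testing phase: the model only sees the features (x).
        return [1 if x > self.threshold else 0 for x in xs]

    def score(self, xs, ys):
        # Share of predictions that match the real target values.
        predictions = self.predict(xs)
        return sum(p == y for p, y in zip(predictions, ys)) / len(ys)


# Made-up data: (pages viewed in a session, did the session convert?)
sessions = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]

train, test = split_train_test(sessions)
x_train, y_train = [x for x, _ in train], [y for _, y in train]
x_test, y_test = [x for x, _ in test], [y for _, y in test]

model = ThresholdModel().fit(x_train, y_train)
accuracy = model.score(x_test, y_test)
print(f"Accuracy on the test portion: {accuracy:.2f}")
```

In practice a library such as scikit-learn provides the splitting and scoring for you; the point here is only that the target (y) is visible during training and hidden during testing.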
Supervised vs.
Unsupervised Learning
• Supervised:
• The target variable is well defined.
• E.g. an amount of money, or membership of
a given category.
• Linear Regression and K Nearest Neighbours
are supervised learning algorithms.
• Unsupervised:
• There is no target variable per se!
• E.g. clusters of customers identified by
the algorithm that we had not
necessarily thought of.
• K-Means is a popular Unsupervised
Learning algorithm.
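To make "supervised" concrete, here is a small sketch of a one-feature Linear Regression fitted by ordinary least squares. The figures are invented and `fit_line` is a hypothetical helper; what matters is that the algorithm is handed a well-defined target (y, an amount of money) alongside the feature (x).

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: ad spend is the feature (x), revenue is the target (y).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [3.0, 5.0, 7.0, 9.0, 11.0]

# Supervised: the algorithm learns from x AND y together.
slope, intercept = fit_line(ad_spend, revenue)

# Predict the target for a spend level it has not seen.
predicted_revenue = slope * 6.0 + intercept
print(slope, intercept, predicted_revenue)
```

An unsupervised algorithm such as K-Means would be given only the features, with no revenue column to learn from.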
Mathematical Distance between data points
Euclidean Distance Reminder
Solution:
Using the good old Pythagorean Theorem, we can calculate the distance between A = (x1, y1) and B = (x2, y2) as follows:
d(A, B) = √((x2 - x1)^2 + (y2 - y1)^2)
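The same calculation as a short Python sketch. The function name `euclidean_distance` and the two example points are illustrative choices, not taken from the slides.

```python
import math


def euclidean_distance(a, b):
    """Distance between two 2D points, via the Pythagorean Theorem."""
    (x1, y1), (x2, y2) = a, b
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)


point_a = (1, 2)
point_b = (4, 6)

distance = euclidean_distance(point_a, point_b)
print(distance)  # 5.0, because 3^2 + 4^2 = 5^2
```

From Python 3.8 onwards, the standard library's `math.dist(point_a, point_b)` computes the same quantity for points with any number of coordinates.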
Mathematical Distance in “Action”
Visualising how KNN works:
[Figure: white wine, red wine and sparkling wine data points, plus a new point marked "?"]
• With KNN (K Nearest Neighbours), K is the
number of neighbours the algorithm will
look at in the training set to predict which
category each new data point is likely to
belong to.
• KNN is a Supervised Learning algorithm.
Visualising how K-Means works:
[Figure: two clusters, around Centroid 1 and Centroid 2]
• With K-Means, K is the number of clusters
we want the algorithm to identify.
• K-Means aims to create coherent and
well-separated clusters.
• A lot harder to interpret but potentially
much richer insights.
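Both algorithms lean on the distance between data points, which the following sketch tries to show using only the standard library (`math.dist` needs Python 3.8+). The wine measurements are invented, `knn_predict` and `k_means` are hypothetical helpers, and the starting centroids are passed in by hand, whereas real implementations usually choose them automatically.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, new_point, k=3):
    """KNN: look at the k closest training points and take a majority vote."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], new_point),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def k_means(points, centroids, n_iterations=10):
    """K-Means: assign each point to its nearest centroid, then move the centroids."""
    for _ in range(n_iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Each centroid moves to the mean of its cluster (it stays put if the cluster is empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Made-up wines described by two numeric features each.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8), (1, 8), (2, 9), (1, 9)]
labels = ["white"] * 3 + ["red"] * 3 + ["sparkling"] * 3

# Supervised: KNN uses the labels to classify the "?" point.
predicted_label = knn_predict(points, labels, new_point=(7.5, 8.5), k=3)
print(predicted_label)

# Unsupervised: K-Means gets no labels, only points and K (here K = 2 centroids).
unlabelled = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
final_centroids, final_clusters = k_means(unlabelled, centroids=[(0, 0), (10, 10)])
print(final_centroids)
```

Note that K means something different in each function: the number of neighbours consulted in `knn_predict`, and the number of centroids supplied to `k_means`.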
Dummy variables
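Session   Device_Desktop   Device_Tablet
Mobile    0                0
Desktop   1                0
Tablet    0                1

Session   Region_America   Region_Europe
America   1                0
Europe    0                1
Asia      0                0

For a categorical variable with n categories, you create n-1 dummy
variables so that an algorithm can use the data. The remaining category
(Mobile and Asia above) is the one where every dummy variable is 0.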
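A minimal sketch of how the n-1 dummy variables above could be built by hand. `make_dummies` is a hypothetical helper, and the choice of baseline category (the one encoded as all zeros) is passed in explicitly so that the output matches the Device table.

```python
def make_dummies(values, prefix, baseline):
    """Turn one categorical column into n-1 dummy (0/1) columns.

    The baseline category gets no column of its own: it is the row of all zeros.
    """
    categories = sorted(set(values) - {baseline})
    return [
        {f"{prefix}_{category}": int(value == category) for category in categories}
        for value in values
    ]


devices = ["Mobile", "Desktop", "Tablet"]
device_dummies = make_dummies(devices, prefix="Device", baseline="Mobile")

for device, row in zip(devices, device_dummies):
    print(device, row)
```

With pandas, `pandas.get_dummies(column, drop_first=True)` does a similar job, although it drops the first category in sorted order rather than one you choose.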
Introduction to Python, Anaconda and Jupyter
Python/R Anaconda Jupyter
The Machine Learning Process in Machine
Learning terms this time
1) Import key libraries
2) Import algorithms
3) Import your dataset
4) Define your target variable: the variable you want to predict (Supervised Learning)
5) Explore your dataset
6) Split dataset in Train and Test
7) Create an instance of your model and fit it on your training data
8) Evaluate the performance of your model by testing it on your test data
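The eight steps might look like this in a Jupyter notebook. This is a sketch under assumptions: it presumes pandas and scikit-learn are installed (both ship with Anaconda), it uses scikit-learn's bundled wine dataset (three cultivars, not the white/red/sparkling example from the earlier slide), and KNN with 5 neighbours is an arbitrary choice of algorithm.

```python
# 1) Import key libraries
import pandas as pd

# 2) Import algorithms (plus the helpers used for loading and splitting)
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# 3) Import your dataset
wine = load_wine()
X = pd.DataFrame(wine.data, columns=wine.feature_names)  # features (x)

# 4) Define your target variable: the variable you want to predict
y = pd.Series(wine.target, name="target")                # target (y)

# 5) Explore your dataset
print(X.shape)
print(X.describe())
print(y.value_counts())

# 6) Split dataset in Train and Test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# 7) Create an instance of your model and fit it on your training data
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# 8) Evaluate the performance of your model by testing it on your test data
accuracy = model.score(X_test, y_test)
print(f"Accuracy on the test data: {accuracy:.2f}")
```

Swapping in another supervised algorithm usually means changing only the import in step 2 and the model created in step 7.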
Quick Python and Jupyter Demo
Tips and Resources
Getting
started
• Kaggle
• Udemy
• Towards Data Science (Medium)
• Stack Overflow
• Don’t be put off by the heavy
maths:
• Don’t ignore the maths, but pause and revisit
it later if need be.
Once you
have
started
• Practice and Practice again
• Teach it!!
• Play with some real datasets
• Always Stackoverflow it!
Thanks!
Linkedin.com/in/penelopebellegarde/
@thedatatouch
penelope.bellegarde@thedatatouch.com
Machine Learning can be a lot of fun when you start getting somewhere
