Aly Osama
Machine Learning for Everyone
Teaching Assistant, Ain Shams University
 Computational biology and deep learning
Former Research Software Development
Engineer, Microsoft Research (ATLC)
 Speech Recognition Team “Arabic Models”
 Natural Language Processing Team “Virtual Bot”
ABOUT ME
aly.osama@eng.asu.edu.eg
AGENDA
1. Introduction
2. Machine learning tools
 Scikit-learn
3. Case Study applications
 Computer vision
 Natural language processing
 Speech Recognition
4. Learning resources
5. Next step
INTRODUCTION
https://docs.google.com/presentation/d/1kSuQyW5DTnkVaZEjGYCkfOxvzCqGEFzWBy4e9Uedd9k/edit
BREAK
Quick Recap
MACHINE LEARNING
It is hard for people to explicitly write the 'rules' for making decisions
The solution is dependent on lots of complex cases
We don't have the expertise to fully write 'the rules' but we have lots of
examples
Learning from ‘examples’
HOW TO LEARN ?
HOW TO LEARN ?
Nearest neighbor
LEARNING THROUGH LINEAR SEPARATION
HOW TO LEARN ?
Regression
MODEL QUALITY
OVERFITTING PROBLEM
NEURAL NETWORKS
GRADIENT DESCENT
NEURAL NETWORK TRAINING
DEEP LEARNING AS A BLACK BOX
INTERPRETABILITY OF DEEP LEARNING
DEEP LEARNING REQUIRES LARGER TRAINING SETS
RESOURCES
Resources for learning Python
•Codecademy's Python course: browser-based, tons of exercises
•DataQuest: browser-based, teaches Python in the context of data science
•Google's Python class: slightly more advanced, includes videos and downloadable
exercises (with solutions)
•Python for Informatics: beginner-oriented book, includes slides and videos
MACHINE LEARNING TOOLS: SCIKIT-LEARN
SCIKIT-LEARN
Benefits
• Consistent interface to machine learning
models
• Provides many tuning parameters but
with sensible defaults
• Exceptional documentation
• Rich set of functionality for companion
tasks
• Active community for development and
support
Drawbacks
• Harder (than R) to get started with
machine learning
• Less emphasis (than R) on model
interpretability
GETTING STARTED IN SCIKIT-LEARN WITH THE
FAMOUS IRIS DATASET
• 50 samples of each of 3 different species of iris (150 samples total)
• Measurements: sepal length, sepal width, petal length, petal
width
Machine learning on the iris dataset
• Framed as a supervised learning problem: Predict the species of
an iris using the measurements.
• Famous dataset for machine learning because prediction is easy
• Learn more about the iris dataset: UCI Machine Learning
Repository
LOADING THE IRIS DATASET INTO SCIKIT-LEARN
Machine learning terminology
• Each row is an observation (also known as:
sample, example, instance, record)
• Each column is a feature (also known as:
predictor, attribute, independent variable,
input, regressor, covariate)
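A minimal sketch of loading the dataset, assuming scikit-learn is installed and using its bundled copy of iris (`load_iris`). It shows the rows-as-observations, columns-as-features layout described above.

```python
from sklearn.datasets import load_iris

iris = load_iris()   # container object holding the data and its metadata
X = iris.data        # feature matrix: one row per observation, one column per feature

print(type(X))             # a NumPy array
print(X.shape)             # (150, 4): 150 observations, 4 features
print(iris.feature_names)  # names of the four feature columns
print(X[:5])               # the first five observations
```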
…
…
LOADING THE IRIS DATASET INTO SCIKIT-LEARN
Machine learning terminology
• Each value we are predicting is the
response (also known as: target, outcome,
label, dependent variable)
• Classification is supervised learning in
which the response is categorical
• Regression is supervised learning in which
the response is ordered and continuous
REQUIREMENTS FOR WORKING WITH DATA IN SCIKIT-LEARN
• Features and response are separate
objects
• Features and response should be numeric
• Features and response should be NumPy
arrays
• Features and response should have
specific shapes
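The four requirements can be checked directly on the iris data. This is a sketch only; the variable names `X` and `y` are a common convention, not something the slides prescribe.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data    # features
y = iris.target  # response: species encoded as 0, 1, 2

# 1. Features and response are separate objects
assert X is not y
# 2. Both are numeric
assert np.issubdtype(X.dtype, np.number) and np.issubdtype(y.dtype, np.number)
# 3. Both are NumPy arrays
assert isinstance(X, np.ndarray) and isinstance(y, np.ndarray)
# 4. Shapes: X is 2-D (observations x features), y is 1-D with one value per observation
assert X.ndim == 2 and y.ndim == 1 and X.shape[0] == y.shape[0]

print(X.shape, y.shape)    # (150, 4) (150,)
print(iris.target_names)   # the species names behind the codes 0, 1, 2
```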
TRAINING A MACHINE LEARNING MODEL WITH
SCIKIT-LEARN
K-nearest neighbors (KNN) classification
• Pick a value for K.
• Search for the K observations in the training data that are "nearest" to the
measurements of the unknown iris.
• Use the most popular response value from the K nearest neighbors as the predicted
response value for the unknown iris.
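The three steps above can be sketched from scratch with NumPy. This is an illustration of the idea, not scikit-learn's implementation: `knn_predict` is a hypothetical helper, "nearest" is assumed to mean Euclidean distance, and the measurements of the unknown iris are made up.

```python
from collections import Counter

import numpy as np
from sklearn.datasets import load_iris


def knn_predict(X_train, y_train, x_new, k):
    """Predict the response for x_new by majority vote among its k nearest neighbors."""
    # Euclidean distance from x_new to every training observation
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Most popular response value among those neighbors (ties broken arbitrarily)
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


iris = load_iris()
unknown = np.array([3.0, 5.0, 4.0, 2.0])  # made-up measurements of an unknown iris
pred = knn_predict(iris.data, iris.target, unknown, k=5)
print(iris.target_names[pred])
```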
LOADING THE DATA
SCIKIT-LEARN 4-STEP MODELING PATTERN
SCIKIT-LEARN 4-STEP MODELING PATTERN
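A sketch of the pattern, assuming the four steps are: import the model class, instantiate it, fit it to the data, and predict on new observations. The two new observations in `X_new` are made up for illustration.

```python
# Step 1: import the class you plan to use
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

# Step 2: instantiate the estimator, setting any tuning parameters (here K=1)
knn = KNeighborsClassifier(n_neighbors=1)

# Step 3: fit the model to the data (the model learns the relationship between X and y)
knn.fit(X, y)

# Step 4: predict the response for new observations
X_new = [[3, 5, 4, 2], [5, 4, 3, 2]]  # made-up measurements
pred = knn.predict(X_new)
print(pred)                      # encoded species, one per new observation
print(iris.target_names[pred])   # the same predictions as species names
```

Trying a different value for K, or a different model such as `LogisticRegression`, only changes steps 1 and 2; `fit` and `predict` are called the same way.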
USING A DIFFERENT VALUE FOR K
USING A DIFFERENT CLASSIFICATION MODEL
COMPARING MACHINE LEARNING MODELS IN
SCIKIT-LEARN
• Classification task: Predicting the species of an unknown iris
• Used three classification models: KNN (K=1), KNN (K=5), logistic regression
• Need a way to choose between the models
Solution:
Model evaluation procedures:
1. Evaluation procedure #1: Train and test on the entire dataset
2. Evaluation procedure #2: Train/test split
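Both evaluation procedures can be sketched for the three models as follows. The 40% test size, the `random_state` value, and `max_iter=1000` are assumptions chosen for the illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

models = {
    "KNN (K=1)": KNeighborsClassifier(n_neighbors=1),
    "KNN (K=5)": KNeighborsClassifier(n_neighbors=5),
    "Logistic regression": LogisticRegression(max_iter=1000),
}

# Procedure #2 needs a held-out testing set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=4
)

results = {}
for name, model in models.items():
    # Procedure #1: train and test on the entire dataset (training accuracy)
    model.fit(X, y)
    train_acc = accuracy_score(y, model.predict(X))

    # Procedure #2: train on one piece, test on the other (testing accuracy)
    model.fit(X_train, y_train)
    test_acc = accuracy_score(y_test, model.predict(X_test))

    results[name] = (train_acc, test_acc)
    print(f"{name}: training accuracy={train_acc:.3f}, testing accuracy={test_acc:.3f}")
```

KNN with K=1 reaches a training accuracy of 1.0 because each observation is its own nearest neighbor, which is why procedure #1 alone cannot be used to choose between models.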
https://github.com/justmarkham/scikit-learn-videos/blob/master/05_model_evaluation.ipynb
DATA SCIENCE PIPELINE: PANDAS, SEABORN,
SCIKIT-LEARN
Types of supervised learning
• Classification: Predict a categorical response
• Regression: Predict a continuous response
DATA SCIENCE PIPELINE: PANDAS, SEABORN,
SCIKIT-LEARN
Reading data using pandas
Pandas: popular Python library for data exploration, manipulation, and analysis
Visualizing data using seaborn
Seaborn: Python library for statistical data visualization built on top of Matplotlib
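A small sketch of the exploration step. To stay self-contained it builds the pandas DataFrame from scikit-learn's bundled iris data rather than reading a file, which is an assumption of this example. `plot_pairs` is a hypothetical helper that needs seaborn and Matplotlib installed; it is defined but not called here.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()

# Put the features in a DataFrame with named columns, plus a readable species column
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["species"] = iris.target_names[iris.target]

print(df.head())                     # first five rows
print(df.shape)                      # (150, 5)
print(df.groupby("species").mean())  # average measurements per species


def plot_pairs(frame):
    """Scatter plots of every pair of features, colored by species."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.pairplot(frame, hue="species")
    plt.show()
```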
https://github.com/justmarkham/scikit-learn-videos/blob/master/06_linear_regression.ipynb
CROSS-VALIDATION FOR PARAMETER TUNING,
MODEL SELECTION, AND FEATURE SELECTION
Review of model evaluation procedures
Motivation: Need a way to choose between machine learning models
Goal is to estimate likely performance of a model on out-of-sample data
Initial idea: Train and test on the same data
But, maximizing training accuracy rewards overly complex models which overfit the training
data
CROSS-VALIDATION FOR PARAMETER TUNING,
MODEL SELECTION, AND FEATURE SELECTION
Alternative idea: Train/test split
Split the dataset into two pieces, so that the model can be trained and tested on different
data
Testing accuracy is a better estimate than training accuracy of out-of-sample performance
But, it provides a high variance estimate since changing which observations happen to be in
the testing set can significantly change testing accuracy
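The high variance of a single train/test split can be seen by repeating the split with different random seeds. A sketch, assuming KNN with K=5 and a 40% testing set:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

scores = []
for seed in range(10):
    # A different seed puts different observations in the testing set
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.4, random_state=seed
    )
    knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
    scores.append(knn.score(X_test, y_test))  # testing accuracy for this split

print([round(s, 3) for s in scores])
print("lowest:", min(scores), "highest:", max(scores))
```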
https://github.com/justmarkham/scikit-learn-videos/blob/master/07_cross_validation.ipynb
EFFICIENTLY SEARCHING FOR OPTIMAL TUNING
PARAMETERS
Review of K-fold cross-validation
• Steps for cross-validation:
• Dataset is split into K "folds" of equal size
• Each fold acts as the testing set 1 time, and as part of the training set K-1 times
• Average testing performance is used as the estimate of out-of-sample performance
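The steps above can be sketched with scikit-learn. The first part uses `KFold` on 25 made-up observation indices just to show how the folds rotate; the second part runs 10-fold cross-validation on iris with `cross_val_score`, assuming KNN with K=5.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Part 1: how 25 observations are split into 5 folds
fold_sizes = []
for fold, (train_idx, test_idx) in enumerate(KFold(n_splits=5).split(np.arange(25)), start=1):
    fold_sizes.append((len(train_idx), len(test_idx)))
    print(f"fold {fold}: test on {test_idx.tolist()}, train on the other {len(train_idx)}")

# Part 2: 10-fold cross-validation on iris
X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5)
scores = cross_val_score(knn, X, y, cv=10, scoring="accuracy")

print(scores)         # one testing accuracy per fold
print(scores.mean())  # the estimate of out-of-sample accuracy
```

For classifiers, passing an integer as `cv` makes `cross_val_score` use stratified folds, so each fold keeps roughly the same class proportions as the full dataset.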
EFFICIENTLY SEARCHING FOR OPTIMAL TUNING
PARAMETERS
Benefits of cross-validation:
• More reliable estimate of out-of-sample performance than train/test split
• Can be used for selecting tuning parameters, choosing between models, and selecting features
Drawbacks of cross-validation:
• Can be computationally expensive
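Searching for a tuning parameter with cross-validation can be sketched with `GridSearchCV`. The range of K from 1 to 30 and the choice of 10 folds are assumptions for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Candidate values for K; each one is scored with 10-fold cross-validation
param_grid = {"n_neighbors": list(range(1, 31))}
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10, scoring="accuracy")
grid.fit(X, y)

print(grid.best_params_)  # the value of K with the highest mean accuracy
print(grid.best_score_)   # that mean cross-validated accuracy
```

This fits 30 x 10 = 300 models, which illustrates why cross-validation can become computationally expensive on larger datasets.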
https://github.com/justmarkham/scikit-learn-videos/blob/master/08_grid_search.ipynb
EVALUATING A CLASSIFICATION MODEL
Review of model evaluation
• Need a way to choose between models: different model types, tuning parameters, and
features
• Use a model evaluation procedure to estimate how well a model will generalize to out-of-
sample data
• Requires a model evaluation metric to quantify the model performance
EVALUATING A CLASSIFICATION MODEL
Model evaluation procedures
Training and testing on the same data
 Rewards overly complex models that "overfit" the training data and won't necessarily generalize
Train/test split
 Split the dataset into two pieces, so that the model can be trained and tested on different data
 Better estimate of out-of-sample performance, but still a "high variance" estimate
 Useful due to its speed, simplicity, and flexibility
K-fold cross-validation
 Systematically create "K" train/test splits and average the results together
 Even better estimate of out-of-sample performance
 Runs "K" times slower than train/test split
EVALUATING A CLASSIFICATION MODEL
Model evaluation metrics
• Regression problems: Mean Absolute Error, Mean Squared Error, Root Mean Squared Error
• Classification problems: Classification accuracy
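The metrics above can be computed with `sklearn.metrics`. The true and predicted values below are made up purely to make the arithmetic easy to follow.

```python
import numpy as np
from sklearn.metrics import accuracy_score, mean_absolute_error, mean_squared_error

# Regression: made-up true and predicted response values
y_true = [100, 50, 30, 20]
y_pred = [90, 50, 50, 30]

mae = mean_absolute_error(y_true, y_pred)  # (10 + 0 + 20 + 10) / 4 = 10.0
mse = mean_squared_error(y_true, y_pred)   # (100 + 0 + 400 + 100) / 4 = 150.0
rmse = np.sqrt(mse)                        # square root of 150, about 12.25
print(mae, mse, rmse)

# Classification: made-up true and predicted class labels
labels_true = [0, 1, 1, 0]
labels_pred = [0, 1, 0, 0]

acc = accuracy_score(labels_true, labels_pred)  # 3 of 4 correct = 0.75
print(acc)
```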
https://github.com/justmarkham/scikit-learn-videos/blob/master/09_classification_metrics.ipynb
LEARNING RESOURCES
4. LEARNING RESOURCES
•Courses
•Stanford Machine Learning: Available via Coursera and taught by Andrew Ng.
•Caltech Learning from Data: Available via edX and taught by Yaser Abu-Mostafa
•Machine Learning Category on VideoLectures.Net: This is an easy place to drown in the overload of content.
•Blogs:
•Machinelearningmastery
•https://machinelearningmastery.com/best-machine-learning-resources-for-getting-started/
•More:
•https://github.com/josephmisiti/awesome-machine-learning/blob/master/courses.md
•https://github.com/josephmisiti/awesome-machine-learning/blob/master/books.md
•https://github.com/josephmisiti/awesome-machine-learning/blob/master/blogs.md
NEXT STEP
5. NEXT STEP
The next step is to learn deep learning
Project:
1. Finish one of these courses (Stanford Machine Learning) or (Caltech Learning from
Data)
2. Submit to all 4 of these Kaggle competitions (we will select the top 20 students on their
leaderboards)
• https://www.kaggle.com/c/titanic
• https://www.kaggle.com/c/ghouls-goblins-and-ghosts-boo
• https://www.kaggle.com/c/digit-recognizer
• https://www.kaggle.com/c/leaf-classification
5. NEXT STEP
Machine Learning for Everyone

 

Recently uploaded (20)

CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
Introduction to Machine Learning Unit-5 Notes for II-II Mechanical Engineering
Introduction to Machine Learning Unit-5 Notes for II-II Mechanical EngineeringIntroduction to Machine Learning Unit-5 Notes for II-II Mechanical Engineering
Introduction to Machine Learning Unit-5 Notes for II-II Mechanical Engineering
 
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdfONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
ONLINE VEHICLE RENTAL SYSTEM PROJECT REPORT.pdf
 
A case study of cinema management system project report..pdf
A case study of cinema management system project report..pdfA case study of cinema management system project report..pdf
A case study of cinema management system project report..pdf
 
Explosives Industry manufacturing process.pdf
Explosives Industry manufacturing process.pdfExplosives Industry manufacturing process.pdf
Explosives Industry manufacturing process.pdf
 
İTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering WorkshopİTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering Workshop
 
Laundry management system project report.pdf
Laundry management system project report.pdfLaundry management system project report.pdf
Laundry management system project report.pdf
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdf
 
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdfA CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
 
RESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdf
RESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdfRESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdf
RESORT MANAGEMENT AND RESERVATION SYSTEM PROJECT REPORT.pdf
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
 
Automobile Management System Project Report.pdf
Automobile Management System Project Report.pdfAutomobile Management System Project Report.pdf
Automobile Management System Project Report.pdf
 
Construction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptxConstruction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptx
 
2024 DevOps Pro Europe - Growing at the edge
2024 DevOps Pro Europe - Growing at the edge2024 DevOps Pro Europe - Growing at the edge
2024 DevOps Pro Europe - Growing at the edge
 
Fruit shop management system project report.pdf
Fruit shop management system project report.pdfFruit shop management system project report.pdf
Fruit shop management system project report.pdf
 
Scaling in conventional MOSFET for constant electric field and constant voltage
Scaling in conventional MOSFET for constant electric field and constant voltageScaling in conventional MOSFET for constant electric field and constant voltage
Scaling in conventional MOSFET for constant electric field and constant voltage
 
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptxCloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
 
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data StreamKIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
 

Machine Learning for Everyone

  • 2. ABOUT ME
      • Teaching Assistant, Ain Shams University: computational biology and deep learning
      • Former Research Software Development Engineer, Microsoft Research (ATLC)
          • Speech Recognition Team, “Arabic Models”
          • Natural Language Processing Team, “Virtual Bot”
      • aly.osama@eng.asu.edu.eg
  • 3. AGENDA
      1. Introduction
      2. Machine learning tools
          • Scikit-learn
      3. Case study applications
          • Computer vision
          • Natural language processing
          • Speech recognition
      4. Learning resources
      5. Next step
  • 7. MACHINE LEARNING
      • It is hard for people to explicitly write the 'rules' for making decisions
      • The solution depends on lots of complex cases
      • We don't have the expertise to fully write 'the rules', but we have lots of examples
      • Learning from 'examples'
  • 10. HOW TO LEARN? Nearest neighbor
  • 12. HOW TO LEARN? Regression
  • 21. DEEP LEARNING AS A BLACK BOX
  • 24. DEEP LEARNING REQUIRES LARGER TRAINING SETS
  • 25. RESOURCES
      Resources for learning Python:
      • Codecademy's Python course: browser-based, tons of exercises
      • DataQuest: browser-based, teaches Python in the context of data science
      • Google's Python class: slightly more advanced, includes videos and downloadable exercises (with solutions)
      • Python for Informatics: beginner-oriented book, includes slides and videos
  • 26. MACHINE LEARNING TOOLS: scikit-learn
  • 29. SCIKIT-LEARN
      Benefits:
      • Consistent interface to machine learning models
      • Provides many tuning parameters but with sensible defaults
      • Exceptional documentation
      • Rich set of functionality for companion tasks
      • Active community for development and support
      Drawbacks:
      • Harder (than R) to get started with machine learning
      • Less emphasis (than R) on model interpretability
  • 30. GETTING STARTED IN SCIKIT-LEARN WITH THE FAMOUS IRIS DATASET
      • 50 samples of each of 3 different species of iris (150 samples total)
      • Measurements: sepal length, sepal width, petal length, petal width
      Machine learning on the iris dataset:
      • Framed as a supervised learning problem: predict the species of an iris using the measurements
      • Famous dataset for machine learning because prediction is easy
      • Learn more about the iris dataset: UCI Machine Learning Repository
  • 31. LOADING THE IRIS DATASET INTO SCIKIT-LEARN
      Machine learning terminology:
      • Each row is an observation (also known as: sample, example, instance, record)
      • Each column is a feature (also known as: predictor, attribute, independent variable, input, regressor, covariate)
  • 32. LOADING THE IRIS DATASET INTO SCIKIT-LEARN
      Machine learning terminology:
      • Each value we are predicting is the response (also known as: target, outcome, label, dependent variable)
      • Classification is supervised learning in which the response is categorical
      • Regression is supervised learning in which the response is ordered and continuous
  • 33. REQUIREMENTS FOR WORKING WITH DATA IN SCIKIT-LEARN
      • Features and response are separate objects
      • Features and response should be numeric
      • Features and response should be NumPy arrays
      • Features and response should have specific shapes
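The requirements above can be sketched as follows. This is a minimal example, assuming scikit-learn is installed and using the copy of the iris data bundled with it (`load_iris`). The variable names `X` and `y` are a common convention, not something the slides prescribe.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# Features and response are stored as separate NumPy arrays.
X = iris.data      # feature matrix: one row per observation, one column per feature
y = iris.target    # response vector: species encoded as the integers 0, 1, 2

print(X.shape)             # (150, 4)
print(y.shape)             # (150,)
print(iris.feature_names)  # names of the four measurements
print(iris.target_names)   # names of the three species
```

The feature matrix is two-dimensional (observations by features) and the response is one-dimensional, with one entry per observation.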
  • 34. TRAINING A MACHINE LEARNING MODEL WITH SCIKIT-LEARN
      K-nearest neighbors (KNN) classification:
      1. Pick a value for K.
      2. Search for the K observations in the training data that are "nearest" to the measurements of the unknown iris.
      3. Use the most popular response value from the K nearest neighbors as the predicted response value for the unknown iris.
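To make the three KNN steps concrete, here is a small from-scratch sketch using only the Python standard library. It is not how scikit-learn implements KNN. The function name `knn_predict`, the use of Euclidean distance, and the tiny training set are all assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Step 2: rank the training observations by Euclidean distance to the query.
    by_distance = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    # Step 3: take the most common response among the k nearest neighbors.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up training set: (petal length, petal width) -> species.
train_X = [(1.4, 0.2), (1.3, 0.2), (1.5, 0.3), (4.7, 1.4), (4.5, 1.5), (4.9, 1.5)]
train_y = ["setosa", "setosa", "setosa", "versicolor", "versicolor", "versicolor"]

# Step 1: pick a value for K (here K=3).
print(knn_predict(train_X, train_y, query=(1.6, 0.25), k=3))  # setosa
print(knn_predict(train_X, train_y, query=(4.6, 1.3), k=3))   # versicolor
```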
  • 39. USING A DIFFERENT VALUE FOR K
  • 40. USING A DIFFERENT CLASSIFICATION MODEL
  • 41. COMPARING MACHINE LEARNING MODELS IN SCIKIT-LEARN
      • Classification task: predicting the species of an unknown iris
      • Used three classification models: KNN (K=1), KNN (K=5), logistic regression
      • Need a way to choose between the models
      Solution: model evaluation procedures
      1. Evaluation procedure #1: Train and test on the entire dataset
      2. Evaluation procedure #2: Train/test split
      https://github.com/justmarkham/scikit-learn-videos/blob/master/05_model_evaluation.ipynb
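A sketch of both evaluation procedures applied to the three models named above, assuming scikit-learn is installed. The split size (`test_size=0.4`), the seed (`random_state=4`), and `max_iter=1000` for logistic regression are illustrative choices, not values taken from the slides.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Procedure #2 needs a held-out testing set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=4)

def make_models():
    """Return fresh, unfitted copies of the three models being compared."""
    return {
        "KNN (K=1)": KNeighborsClassifier(n_neighbors=1),
        "KNN (K=5)": KNeighborsClassifier(n_neighbors=5),
        "Logistic regression": LogisticRegression(max_iter=1000),
    }

# Procedure #1: train and test on the entire dataset (training accuracy).
training_accuracy = {}
for name, model in make_models().items():
    model.fit(X, y)
    training_accuracy[name] = accuracy_score(y, model.predict(X))

# Procedure #2: train on one piece, test on the other (testing accuracy).
testing_accuracy = {}
for name, model in make_models().items():
    model.fit(X_train, y_train)
    testing_accuracy[name] = accuracy_score(y_test, model.predict(X_test))

for name in training_accuracy:
    print(f"{name}: train={training_accuracy[name]:.3f} test={testing_accuracy[name]:.3f}")
```

Comparing the two columns shows why training accuracy alone is a weak basis for choosing a model: it is measured on data the model has already seen.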
  • 42. DATA SCIENCE PIPELINE: PANDAS, SEABORN, SCIKIT-LEARN
      Types of supervised learning:
      • Classification: predict a categorical response
      • Regression: predict a continuous response
  • 43. DATA SCIENCE PIPELINE: PANDAS, SEABORN, SCIKIT-LEARN
      • Reading data using pandas. Pandas: popular Python library for data exploration, manipulation, and analysis
      • Visualizing data using seaborn. Seaborn: Python library for statistical data visualization built on top of Matplotlib
      https://github.com/justmarkham/scikit-learn-videos/blob/master/06_linear_regression.ipynb
  • 44. CROSS-VALIDATION FOR PARAMETER TUNING, MODEL SELECTION, AND FEATURE SELECTION
      Review of model evaluation procedures:
      • Motivation: need a way to choose between machine learning models
      • Goal is to estimate the likely performance of a model on out-of-sample data
      • Initial idea: train and test on the same data
      • But maximizing training accuracy rewards overly complex models, which overfit the training data
  • 45. CROSS-VALIDATION FOR PARAMETER TUNING, MODEL SELECTION, AND FEATURE SELECTION
      Alternative idea: train/test split
      • Split the dataset into two pieces, so that the model can be trained and tested on different data
      • Testing accuracy is a better estimate of out-of-sample performance than training accuracy
      • But it provides a high-variance estimate, since changing which observations happen to be in the testing set can significantly change testing accuracy
      https://github.com/justmarkham/scikit-learn-videos/blob/master/07_cross_validation.ipynb
  • 46. EFFICIENTLY SEARCHING FOR OPTIMAL TUNING PARAMETERS
      Review of K-fold cross-validation. Steps for cross-validation:
      • Dataset is split into K "folds" of equal size
      • Each fold acts as the testing set once, and is part of the training set the other K-1 times
      • Average testing performance is used as the estimate of out-of-sample performance
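The fold bookkeeping described above can be sketched in plain Python. This illustration uses contiguous, unshuffled folds. The helper name `k_fold_indices` is hypothetical, and real workflows often shuffle or stratify the data first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]                     # this fold is the testing set
        train = indices[:start] + indices[start + size:]       # all other folds form the training set
        yield train, test
        start += size

for fold, (train, test) in enumerate(k_fold_indices(n_samples=10, k=5), start=1):
    print(f"fold {fold}: train={train} test={test}")
```

Each observation lands in a testing set exactly once, and averaging the per-fold test scores gives the cross-validated estimate.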
  • 47. EFFICIENTLY SEARCHING FOR OPTIMAL TUNING PARAMETERS
      Benefits of cross-validation:
      • More reliable estimate of out-of-sample performance than train/test split
      • Can be used for selecting tuning parameters, choosing between models, and selecting features
      Drawbacks of cross-validation:
      • Can be computationally expensive
      https://github.com/justmarkham/scikit-learn-videos/blob/master/08_grid_search.ipynb
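As a sketch of using cross-validation to select a tuning parameter, the example below scores KNN on the iris data with `cross_val_score` and then searches over K with `GridSearchCV`. It assumes scikit-learn is installed. The choice of 10 folds and the candidate range of K from 1 to 30 are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# 10-fold cross-validated accuracy for a single setting, K=5.
cv_scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=10, scoring="accuracy")
print(cv_scores.mean())

# Cross-validate every candidate value of K and keep the best one.
param_grid = {"n_neighbors": list(range(1, 31))}
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=10, scoring="accuracy")
grid.fit(X, y)

print(grid.best_params_)  # the value of K with the highest mean accuracy
print(grid.best_score_)   # that mean cross-validated accuracy
```

The search fits one model per candidate per fold (300 fits here), which is the computational cost mentioned under drawbacks.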
  • 48. EVALUATING A CLASSIFICATION MODEL
      Review of model evaluation:
      • Need a way to choose between models: different model types, tuning parameters, and features
      • Use a model evaluation procedure to estimate how well a model will generalize to out-of-sample data
      • Requires a model evaluation metric to quantify the model performance
  • 49. EVALUATING A CLASSIFICATION MODEL
      Model evaluation procedures:
      • Training and testing on the same data
          • Rewards overly complex models that "overfit" the training data and won't necessarily generalize
      • Train/test split
          • Split the dataset into two pieces, so that the model can be trained and tested on different data
          • Better estimate of out-of-sample performance, but still a "high variance" estimate
          • Useful due to its speed, simplicity, and flexibility
      • K-fold cross-validation
          • Systematically create "K" train/test splits and average the results together
          • Even better estimate of out-of-sample performance
          • Runs "K" times slower than train/test split
  • 50. EVALUATING A CLASSIFICATION MODEL
      Model evaluation metrics:
      • Regression problems: Mean Absolute Error, Mean Squared Error, Root Mean Squared Error
      • Classification problems: Classification accuracy
      https://github.com/justmarkham/scikit-learn-videos/blob/master/09_classification_metrics.ipynb
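The metrics named above follow directly from their definitions. Here is a small standard-library sketch. The function names mirror the metric names but are defined locally, and the sample values are made up for illustration.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average of the absolute errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Average of the squared errors; penalizes large errors more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    """Square root of the MSE, expressed in the same units as the response."""
    return math.sqrt(mean_squared_error(y_true, y_pred))

def classification_accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up regression values.
y_true = [100, 50, 30, 20]
y_pred = [90, 50, 50, 30]
print(mean_absolute_error(y_true, y_pred))      # 10.0
print(mean_squared_error(y_true, y_pred))       # 150.0
print(root_mean_squared_error(y_true, y_pred))  # about 12.25

# Made-up class labels: three of the four predictions are correct.
print(classification_accuracy([0, 1, 1, 2], [0, 1, 2, 2]))  # 0.75
```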
  • 52. 4. LEARNING RESOURCES
      Courses:
      • Stanford Machine Learning: available via Coursera and taught by Andrew Ng
      • Caltech Learning from Data: available via edX and taught by Yaser Abu-Mostafa
      • Machine Learning category on VideoLectures.Net: this is an easy place to drown in the overload of content
      Blogs:
      • Machinelearningmastery
      • https://machinelearningmastery.com/best-machine-learning-resources-for-getting-started/
      More:
      • https://github.com/josephmisiti/awesome-machine-learning/blob/master/courses.md
      • https://github.com/josephmisiti/awesome-machine-learning/blob/master/books.md
      • https://github.com/josephmisiti/awesome-machine-learning/blob/master/blogs.md
  • 54. 5. NEXT STEP
      The next step is learning deep learning.
      Project:
      1. Finish one of these courses: Stanford Machine Learning or Caltech Learning from Data
      2. Submit entries to all four of these Kaggle competitions (we will select the top 20 students on their leaderboards)
          • https://www.kaggle.com/c/titanic
          • https://www.kaggle.com/c/ghouls-goblins-and-ghosts-boo
          • https://www.kaggle.com/c/digit-recognizer
          • https://www.kaggle.com/c/leaf-classification