Use Machine learning to solve classification problems through building binary and multi-class classifiers.
Does your company face business-critical decisions that rely on dynamic transactional data? If you answered “yes,” you need to attend this free event featuring Microsoft analytics tools. We’ll focus on Azure Machine Learning capabilities and explore the following topics: - Introduction of two class classification problems.
- Classification Algorithms (Two Class Classification)
- Available algorithms in Azure ML.
- Real business problems that is solved using two class classification.
Use Machine learning to solve classification problems through building binary and multi-class classifiers.
Does your company face business-critical decisions that rely on dynamic transactional data? If you answered “yes,” you need to attend this free event featuring Microsoft analytics tools. We’ll focus on Azure Machine Learning capabilities and explore the following topics: - Introduction of two class classification problems.
- Classification Algorithms (Two Class Classification)
- Available algorithms in Azure ML.
- Real business problems that is solved using two class classification.
1.
Building Machine Learning
Classifiers
Mostafa Elzoghbi
Sr. Technical Evangelist – Microsoft
@MostafaElzoghbi
http://mostafa.rocks
2.
Session Objectives & Takeaways
• What is Machine Learning?
• Azure Machine Learning (AML) for ML Solutions
• Machine Learning Classifiers
• Business Use Cases
• Let’s build smart apps initiative!
3.
What is Machine Learning ?
• Using known data, develop a model to predict unknown data.
Known Data: Big enough archive, previous observations, past data
Model: Known data + Algorithms (ML algorithms)
Unknown Data: Missing, Unseen, not existing, future data
4.
Microsoft Azure Machine Learning
• Web based UI accessible from different browsers
• Share|collaborate to any other ML workspace
• Drag & Drop visual design|development
• Wide range of ML Algorithms catalog
• Extend with OSS R|Python scripts
• Share|Document with IPython|Jupyter
• Deploy|Publish|Scale rapidly (APIs)
5.
Azure Machine Learning Ecosystem
Get/Prepare
Data
Build/Edit
Experiment
Create/Update
Model
Evaluate
Model
Results
Publish Web
Service
Build ML Model Deploy as Web ServiceProvision Workspace
Get Azure
Subscription
Create
Workspace
Publish an App
Azure Data
Marketplace
6.
Blobs and Tables
Hadoop (HDInsight)
Relational DB (Azure SQL DB)
Data Clients
Model is now a web
service that is callable
Monetize the API through
our marketplace
API
Integrated development
environment for Machine
Learning
ML STUDIO
9.
Model (Decision Tree)
Age<30
Income >
$50K
Xbox-One
Customer
Not Xbox-One
Customer
Days Played >
728
Income >
$50K
Xbox-One
Customer
Not Xbox-One
Customer
Xbox-One
Customer
11.
Classify a news article as (politics, sports, technology, health, …)
Politics Sports Tech Health
Model (Classification)
Using known data, develop a model to predict unknown data.
12.
Known data (Training data)
Using known data, develop a model to predict unknown data.
Documents Labels
Tech
Health
Politics
Politics
Sports
Documents consist of
unstructured text. Machine
learning typically assumes a
more structured format of
examples
Process the raw data
13.
Known data (Training data)
Using known data, develop a model to predict unknown data.
LabelsDocuments
Feature
Documents Labels
Tech
Health
Politics
Politics
Sports
Process each data instance to represent it as a feature vector
14.
Feature vector
Known data
Data instance
i.e.
{40, (180, 82), (11,7), 70, …..} : Healthy
Age Height/Weight
Blood Pressure
Hearth Rate
LabelFeatures
Feature Vector
15.
Developing a Model
Using known data, develop a model to predict unknown data.
Documents Labels
Tech
Health
Politics
Politics
Sports
Training data
Train
the
Model
Feature Vectors
Base
Model
Adjust
Parameters
16.
Model’s Performance
Known data with true labels
Tech
Health
Politics
Politics
Sports
Tech
Health
Politics
Politics
Sports
Tech
Health
Politics
Politics
Sports
Model’s
Performance
Difference between
“True Labels” and
“Predicted Labels”
True
labels
Tech
Health
Politics
Politics
Sports
Predicted
labels
Train the Model
Split
Detach
+/-
+/-
+/-
17.
Steps to Build a Machine Learning Solution
1
Problem
Framing
2
Get/Prepare
Data
3
Develop
Model
4
Deploy
Model
5
Evaluate /
Track
Performance
3.1
Analysis/
Metric
definition
3.2
Feature
Engineering
3.3
Model
Training
3.4
Parameter
Tuning
3.5
Evaluation
18.
Machine Learning Algorithms
Flavors of machine learning algorithms
• Supervised
• Unsupervised
• Reinforcement learning (n/a in AML)
Most commonly used machine learning algorithms are supervised
(requires labels)
• Supervised learning examples
• This customer will like coffee
• This network traffic indicates a
denial of service attack
• Unsupervised learning examples
• These customers are similar
• This network traffic is unusual
19.
Common Classes of Algorithms
(Supervised|Unsupervised)
Classification Regression Anomaly
Detection
Clustering
Supervised Supervised SupervisedUnsupervised
20.
Classification
A classification technique (or classifier) is a systematic approach to
build classification models from an input data set.
Examples include decision tree classifiers, rule-based classifiers, neural
networks, support vector machines, and naıve Bayes classifiers.
Scenarios:
▪ Which customer are more likely to buy, stay, leave (churn analysis)
▪ Which transactions|actions are fraudulent
▪ Which quotes are more likely to become orders
▪ Recognition of patterns: speech, speaker, image, movement, etc.
Algorithms: Boosted Decision Tree, Decision Forest, Decision Jungle,
Logistic Regression, SVM, ANN, etc. (14 algorithms so far)
Classification
21.
Binary versus Multiclass Classification
Does your customer want a yes|no answer?
• Binary examples
• click prediction
• yes|no
• over|under
• win|loss
• Multiclass examples
• kind of tree
• kind of network attack
• type of heart disease
22.
ML Classifier Types
• Two Class Classifiers:
• Answer: Yes/No, T/F
• Multi-Class Classifiers:
• Multiple answers
• Definitive list of options
23.
DEMOPredict an individual Income (>=50K) - Binary Classifier
29.
Bayesian methods
Bayesian methods have a highly desirable quality: they avoid overfitting.
They do this by making some assumptions beforehand about the likely distribution of the answer.
Another byproduct of this approach is that they have very few parameters.
Azure Machine Learning has both Bayesian algorithms for both classification (Two-class Bayes' point
machine) and regression (Bayesian linear regression).
Note that these assume that the data can be split or fit with a straight line.
32.
DEMO
Letter Recognition (Multi-Class Classifier)
33.
References
• Free e-book “Azure Machine Learning”
• https://mva.microsoft.com/ebooks#9780735698178
• Azure Machine Learning documentation
• https://azure.microsoft.com/en-us/documentation/services/machine-learning/
• Data Science and Machine Learning Essentials
• www.edx.org
• Azure ML Camp Files (labs & presentation) in GitHub:
https://github.com/melzoghbi/DataCamp
• Azure ML HOL (GitHub):
• https://github.com/Azure-Readiness/hol-azure-machine-learning/
34.
Thank you
• Check out my blog for Azure ML articles: http://mostafa.rocks
• Follow me on Twitter: @MostafaElzoghbi
• Want some help in building ML Solutions? Contact me to know more.
It appears that you have an ad-blocker running. By whitelisting SlideShare on your ad-blocker, you are supporting our community of content creators.
Hate ads?
We've updated our privacy policy.
We’ve updated our privacy policy so that we are compliant with changing global privacy regulations and to provide you with insight into the limited ways in which we use your data.
You can read the details below. By accepting, you agree to the updated privacy policy.