AzureML – Zero to Hero
Govind Kanshi
MTC Bangalore
2nd August 2014
What we will cover
• AzureML
• What it enables
• Examples
• Upload data/understand explore it
• Develop model/evaluate it/deploy it
What this discussion is not about
• Data Science/Big Data definitions, uses, etc.
• ML Advanced topics
• Feature Engineering – which features are useful/cleaning/dropping
• For PCA kind of work – use R today
• Individual algorithm discussion/deep dive.
• Model tuning (parameter sweep) or other techniques – boosting/bagging
• Overcoming Data vagaries
What you should walk out with
• Excitement and confidence that ML with AzureML is doable by all of
us, as long as we are curious and patient.
• AzureML is a democratized platform for learning from data, enabling
better-informed decisions. It brings sophisticated algorithms and
mechanisms, in an easy-to-use form, to both the masses and high-end
researchers today.
What are we trying to do
• Learn from existing Data to do prediction on data
• Classification – Put labels
• Regression – predict a value such as price
• Recommendation – Rank choices
• Examples – classify different behavior, predict price, recommend, find anomalies
• Explore data – form natural groupings based on some distance formula
• Clustering
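To make "natural groupings based on some distance formula" concrete, here is a minimal k-means sketch in plain Python. It is an illustration only, not what AzureML runs internally; the sample points, the choice of squared Euclidean distance, and the starting centers are all assumptions made for the example.

```python
def kmeans(points, centers, iterations=10):
    """Group points around the nearest center, then move each center to its group's mean."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            # Squared Euclidean distance is the assumed "distance formula".
            nearest = min(
                range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            groups[nearest].append(p)
        # Recompute each center as the mean of its group (keep it if the group is empty).
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return centers, groups


# Hypothetical 2-D data with two visually obvious clumps.
points = [[1, 1], [1.5, 2], [1, 0.5], [8, 8], [9, 9.5], [8.5, 9]]
centers, groups = kmeans(points, centers=[points[0], points[3]])
```

With these points, the three small-valued points end up in one group and the three large-valued points in the other.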
Demo
• Deployed model for public dataset to classify if person has diabetes
• Deployed model to predict Decibels of noise
• How old is this stuff? The term “regression” first appears in the
biological works of Galton (1822–1911).
• Y = a_1 * X_1 + ... + a_n * X_n...
• Solve for ...
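As a rough illustration of "solving for" the coefficients in Y = a_1 * X_1 + ... + a_n * X_n, the sketch below fits them by ordinary least squares using the normal equations. This is one common way to do it and is an assumption here; the slides do not say which solver the demo used. The function name and the data are made up for the example, and no intercept term is included, matching the formula above.

```python
def fit_linear(X, y):
    """Return coefficients a minimizing sum((y - X a)^2) via the normal equations."""
    n = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(n)]
        + [sum(row[i] * t for row, t in zip(X, y))]
        for i in range(n)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    a = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * a[j] for j in range(i + 1, n))
        a[i] = (A[i][n] - s) / A[i][i]
    return a


# Hypothetical data generated exactly from Y = 2 * X_1 + 3 * X_2.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
y = [2 * x1 + 3 * x2 for x1, x2 in X]
coeffs = fit_linear(X, y)  # approximately [2.0, 3.0]
```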
What did we see
• Exposed the model as a Web service (Raw format) that does prediction as request-response
Demo
• Walkthrough of the model creation for Classification
• Possibly choose another algorithm to compare/evaluate
What did we see
AzureML studio – Experiments/Datasets/Web services
Web Services – Request-Response (RR) or Batch mode
Algorithms – Classification, Regression, Recommendation, Ranking
Data – Ingestion, cleansing, massaging
R Integration
Dataset/Experiments are immutable – new versions can be deployed
What did we do (typical AzureML path)
• Define the goal – regression or classification or recommendation
• Create a model and train it using dataset
• Get data –
• Cleanup the data or replace missing data if required
• Use the appropriate algorithm/train it
• Score the model with test data
• Looked at the algorithm parameters
• Evaluate Model using metrics
• Add more algorithms to compare
• Deploy Model as web service for request-response mechanism
• What about batch – yes you can.
• Data exploration – visualization of data/results
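The path above (get data, replace missing values, train, score on test data, evaluate) can be sketched outside the Studio in a few lines of plain Python. Everything here is a stand-in: the tiny dataset and its two features are invented, mean imputation is just one way to replace missing data, and the nearest-centroid "model" is a deliberately simple placeholder for whichever algorithm you would pick in AzureML.

```python
def impute_mean(rows):
    """Replace None in each column with the mean of that column's observed values."""
    cols = list(zip(*rows))
    means = [
        sum(v for v in c if v is not None) / sum(v is not None for v in c)
        for c in cols
    ]
    return [[m if v is None else v for v, m in zip(r, means)] for r in rows]


def train_centroids(X, y):
    """Train: compute the mean feature vector for each class label."""
    centroids = {}
    for label in set(y):
        members = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def score(centroids, x):
    """Score: predict the label whose centroid is closest to x."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
    )


# Hypothetical data: two made-up features per row, None marks a missing value.
rows = [[85, 22], [90, None], [88, 24], [160, 33],
        [155, None], [170, 35], [92, 23], [165, 34]]
labels = [0, 0, 0, 1, 1, 1, 0, 1]

clean = impute_mean(rows)                      # clean up / replace missing data
X_train, y_train = clean[:6], labels[:6]       # training split
X_test, y_test = clean[6:], labels[6:]         # held-out test split

model = train_centroids(X_train, y_train)      # train
predictions = [score(model, x) for x in X_test]  # score with test data
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)  # evaluate
```

Swapping `train_centroids` and `score` for another pair of functions is the plain-Python analogue of adding a second algorithm to compare.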
Evaluate Models – summary (classification)
• Confusion Matrix
• Precision - (TP / (TP+FP) )
• Recall - (TP / (TP + FN))
• F1-score
• ROC curve + AUC - Area under ROC curve
              Predicted: yes         Predicted: no
Actual: yes   True positive (TP)     False negative (FN)
Actual: no    False positive (FP)    True negative (TN)
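A small sketch of how these metrics follow from the confusion-matrix counts. Precision and recall use the formulas above; F1 is the harmonic mean of the two. The AUC helper uses the rank interpretation (the fraction of positive/negative pairs in which the positive example gets the higher score), which is one standard way to compute it. The counts and scores are invented for the example.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def auc(scores, labels):
    """Area under the ROC curve as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    # A tie between a positive and a negative counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts: 40 TP, 10 FP, 20 FN, 30 TN.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
# Hypothetical scores for two positives (label 1) and two negatives (label 0).
area = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```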
Issues to think about
• Cleaning/choosing right data points
• Missing data/transforming data/dropping data/relationship between features
• Evaluating the algorithm, comparing, tuning the parameters, relearning
• Which algorithm to choose (Boolean classification vs 10-class vs
ranking); data with many attributes (thousands up to five-digit counts)
vs very little data or very sparse/noisy data
• Which loss function and hyperparameters to aim for
• Explain the output – black box vs decision trees
• Online/Active Learning
Machine Learning Resources
• Coursera Machine Learning class
https://www.coursera.org/course/ml
• Access to AzureML – it is in preview
• http://www.youtube.com/watch?v=wjTJVhmu1JM
• Draft of Alex Smola and Vishy book on ML: http://alex.smola.org/drafts/thebook.pdf
• Elements of Statistical Learning – Hastie, Tibshirani et al: http://www-stat.stanford.edu/~tibs/ElemStatLearn/
• Information Theory, Inference, and Learning Algorithms – David MacKay: http://www.inference.phy.cam.ac.uk/mackay/itila/
• Datasets - http://archive.ics.uci.edu/ml/datasets.html
• Official AzureML – tutorials/Video walkthroughs - https://azure.microsoft.com/en-us/documentation/services/machine-learning/
Advanced topics
• Other topics
• How to use various input data cleanup procedures (dropping/adding/correlated features)
• How to publish Web service to Azure Marketplace ($) - https://azure.microsoft.com/en-us/documentation/articles/machine-learning-publish-web-service-to-azure-marketplace/
• How do you version assets/“DAG”
• Techniques to overcome vagaries of data
• Stratification – sampling for training and testing within each class, so that every class is
properly represented in both samples
• k-fold CV – data is split randomly into k subsets; each subset is used once for testing and the
remainder for training. This is repeated for every subset and the results are averaged. CV uses
sampling without replacement.
• Bootstrapping – uses sampling with replacement to form the training set.
• Increasing performance of Model
• Bagging - Combining predictions by voting or averaging (for numeric prediction).
• Boosting - Uses voting/averaging but models are weighted according to their performance.
• Parameter sweeping
• Regularization parameter handling – Penalty for overfitting
• Understanding the algorithm performance/visualization of the algorithm path when possible.
• Associated statistics (confidence/distributions)
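The resampling ideas in this list can be sketched with the Python standard library alone. The helpers below are illustrative assumptions, not AzureML modules: `k_fold_splits` produces train/test index lists (sampling without replacement), `bootstrap_sample` draws indices with replacement, and `majority_vote` shows the voting step that bagging uses to combine class predictions.

```python
import random
from collections import Counter


def k_fold_splits(n, k, seed=0):
    """Shuffle indices 0..n-1 and return k (train, test) pairs; each index is tested once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [
        ([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
        for i in range(k)
    ]


def bootstrap_sample(n, seed=0):
    """Draw n indices with replacement; some repeat, some are left out."""
    rng = random.Random(seed)
    return [rng.randrange(n) for _ in range(n)]


def majority_vote(predictions):
    """Combine class predictions from several models by taking the most common label."""
    return Counter(predictions).most_common(1)[0][0]


splits = k_fold_splits(n=10, k=5)
sample = bootstrap_sample(n=10)
combined = majority_vote(["yes", "no", "yes"])
```

In a bagging setup, each model would be trained on its own `bootstrap_sample` and the per-model predictions passed to `majority_vote` (or averaged, for numeric prediction).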

AzureML – zero to hero

  • 1. AzureML – Zero to Hero Govind Kanshi MTC Bangalore 2nd August 2014
  • 2. What we will cover • AzureML- • What it enables • Examples • Upload data/understand explore it • Develop model/evaluate it/deploy it
  • 3. What this discussion is not about • Data Science/Big Data defn/use etc • ML Advanced topics • Feature Engineering – which features are useful/cleaning/dropping • For PCA kind of work – use R today • Individual algorithm discussion/deep dive. • Model tuning(Parameter sweep) or other techniques – boosting/bagging • Overcoming Data vagaries
  • 4. What you should walk out with • Excitement and confidence that ML with AzureML is doable by all of us as long as we are curious and patient. • AzureML is democratized platform for learning from data ensuring better informed decisions. It helps to bring sophisticated algorithms and mechanisms in easy to use way for masses and high end researchers today.
  • 5. What are we trying to do • Learn from existing Data to do prediction on data • Classification – Put labels • Regression - price, • Recommendation – Rank choices • Examples – classify different behavior, price,recommend, find anamoly • Explore data form natural groupings based on some distance formula • Clustering
  • 6. Demo • Deployed model for public dataset to classify if person has diabetes • Deployed model to predict Decibels of noise • How old is this stuff term “regression ” firstly appears in the Galton´s (1822- 1911) biological works. • Y = a_1 * X_1 + ... + a_n * X_n... • Solve for ...
  • 7. What did we see • Exposed Web service in Raw format to do prediction as request- response
  • 8. Demo • Walkthrough of the model creation for Classification • Possibly choose another algorithm to compare/evaluate
  • 9. What did we see AzureML studio – Experiments/Datasets/Web services Web Services – RR or Batch mode Algorithms – Classification, Regression, Recommendation, Ranking Data – Ingestion, cleansing, massaging, R Integration Dataset/Experiments are immutable – new versions can be deployed
  • 10. What did we do(typical AzureML path) • Define the goal – regression or classification or recommendation • Create a model and train it using dataset • Get data – • Cleanup the data or replace missing data if required • Use the appropriate algorithm/train it • Score the model with test data • Looked at the algorithm parameters • Evaluate Model using metrics • Add more algorithms to compare • Deploy Model as webservice for request-response mechanism • What about batch – yes you can. • Data exploration – visualization of data/results
  • 11. Evaluate Models – summary (classification) • Confusion Matrix • Precision – TP / (TP + FP) • Recall – TP / (TP + FN) • F1-score • ROC curve + AUC – area under the ROC curve • Confusion matrix layout (rows: actual class; columns: predicted class):
                     Predicted yes          Predicted no
      Actual yes     True positive (TP)     False negative (FN)
      Actual no      False positive (FP)    True negative (TN)
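The metrics on slide 11 follow directly from the four confusion-matrix counts. The sketch below is illustrative only: the function name `classification_metrics` and the example counts are assumptions, and the F1 formula used is the standard harmonic mean of precision and recall, which the slide names but does not spell out.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute summary metrics from confusion-matrix counts.

    Zero denominators are mapped to 0.0 so the function never raises
    on degenerate inputs (e.g. no predicted positives).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for a two-class (yes/no) classifier.
metrics = classification_metrics(tp=40, fn=10, fp=20, tn=30)
```

With these counts, precision is 40/60 and recall is 40/50, which shows how a model can find most positives (high recall) while still raising many false alarms (lower precision).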
  • 12. Issues to think about • Cleaning/choosing the right data points • Missing data/transforming data/dropping data/relationships between features • Evaluating the algorithm, comparing, tuning the parameters, relearning • Which algorithm to choose (two-class classification vs 10-class vs ranking); data with many attributes (thousands to tens of thousands) vs very little data or very sparse/noisy data • Which loss function and hyperparameters to aim for • Explaining the output – black box vs decision trees • Online/active learning
  • 13. Machine Learning Resources • Coursera Machine Learning class https://www.coursera.org/course/ml • Access to AzureML – it is in preview • http://www.youtube.com/watch?v=wjTJVhmu1JM • Draft of Alex Smola and Vishy's book on ML: http://alex.smola.org/drafts/thebook.pdf • Elements of Statistical Learning – Hastie, Tibshirani et al: http://www-stat.stanford.edu/~tibs/ElemStatLearn/ • Information Theory, Inference, and Learning Algorithms – David MacKay: http://www.inference.phy.cam.ac.uk/mackay/itila/ • Datasets - http://archive.ics.uci.edu/ml/datasets.html • Official AzureML tutorials/video walkthroughs - https://azure.microsoft.com/en-us/documentation/services/machine-learning/
  • 14. Advanced topics • Other topics • How to use various input data cleanup procedures (dropping/adding/correlated features) • How to publish a web service to Azure Marketplace ($) - https://azure.microsoft.com/en-us/documentation/articles/machine-learning-publish-web-service-to-azure-marketplace/ • How do you version assets/“DAG” • Techniques to overcome vagaries of data • Stratification – sampling for training and testing within classes, so that each class is properly represented in the samples • k-fold CV – data is split randomly into k subsets; each subset is used in turn for testing and the remainder for training. This is repeated k times and the results averaged. CV uses sampling without replacement. • Bootstrapping – uses sampling with replacement to form the training set. • Increasing performance of the model • Bagging – combining predictions by voting, or by averaging for numeric prediction. • Boosting – uses voting/averaging, but models are weighted according to their performance. • Parameter sweeping • Regularization parameter handling – penalty for overfitting • Understanding the algorithm performance/visualization of the algorithm path when possible. • Associated statistics (confidence/distributions)
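The difference between k-fold CV (sampling without replacement) and bootstrapping (sampling with replacement) on slide 14 can be sketched in a few lines. This is an illustrative sketch, not how AzureML partitions data: the function names `k_fold_indices` and `bootstrap_indices`, the fixed seeds, and the sample size of 10 are assumptions.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Indices are shuffled once and dealt into k folds, so every index
    appears in exactly one test set (sampling without replacement).
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


def bootstrap_indices(n, seed=0):
    """Draw n training indices with replacement.

    Indices that were never drawn form the out-of-bag set, which can
    serve as test data.
    """
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    chosen = set(train)
    out_of_bag = [i for i in range(n) if i not in chosen]
    return train, out_of_bag


splits = list(k_fold_indices(10, 5))
boot_train, boot_oob = bootstrap_indices(10)
```

In practice a model would be trained and scored once per `(train, test)` pair and the k scores averaged; a bootstrap training set usually contains repeated rows, which is what bagging relies on to get varied models.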

Editor's Notes

  1. AzureML – where experiments are done and deployed as web services. AzureML studio has a “toolbar” with modules for data ingestion/transformation, statistics and machine learning. Some of them have properties which can be set. AzureML has datasets, which can be brought in at runtime or persisted inside. It has public datasets too.
  2. Classification algorithms can be measured by these metrics. Regression has just RMSE (the square root of the mean, over all instances, of the squared difference between the actual and the predicted value), which many people are questioning in present circumstances. Clustering has a different mechanism and requires tests/re-runs to ensure grouped/clustered points have cohesion of some kind. Types of classification errors often incur different costs. Total error = (FP+FN)/(TP+FP+TN+FN). Lift charts: sort instances by their predicted probability of being a true positive (TP); X axis is sample size and Y axis is number of true positives (TP). ROC curves (ROC means receiver operating characteristic, a term from signal processing): X axis shows % of false positives (FP), Y axis shows % of true positives (TP). Recall/precision (the IR/search world has these terms too): Precision (retrieved relevant / total retrieved) = TP / (TP+FP); Recall (retrieved relevant / total relevant) = TP / (TP + FN).
  3. Desirables: model interpretation, more visualization, HMM, native time series (?), text analysis – IR integration
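Note 2 above describes the ROC curve by its axes; the sketch below shows one way such a curve and its AUC could be computed from scored instances. It is an illustration under assumptions: the function names `roc_points` and `auc`, the trapezoidal area rule, and the example labels and scores are not taken from the deck or from AzureML.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points.

    Instances are ranked by descending score and the threshold is
    lowered one distinct score at a time. Assumes labels are 0/1 and
    that both classes are present.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last instance of a tied score.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical labels (1 = positive) and predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)
area = auc(points)
```

An AUC of 1.0 means every positive is ranked above every negative, while 0.5 corresponds to random ranking; here one negative outranks one positive, so the area falls a little short of 1.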