SlideShare a Scribd company logo
 Introduction to Azure ML Studio
 How to Build an ML App in 15 mins
 Data Science Process:
Data Ingress and Egress
Data Visualization, Transformation and Feature Selection
Model Building, Evaluating, and Tuning
Operationalizing a Predictive Model
 Using R in ML Studio
Which statement best matches your relationship to R?
- I’m completely new to R, but want to learn
- I’m learning R
- I’m an experienced R user for personal interests
- I’m an experienced R user for works
- I won’t be using R (but I’m interested in what it can do)
4
• Free and open source R distribution
• Enhanced and distributed by Revolution Analytics
Microsoft R Open
• Built in Advanced Analytics and Stand Alone Server Capability
• Leverages the Benefits of SQL 2016 Enterprise Edition
SQL Server R Services
Microsoft R Products
mran.microsoft.com
Microsoft R Server
Big-data analytics and distributed computing on Linux,
Hadoop and Teradata
SQL Server 2016
Big-data analytics integrated with SQL Server database
(coming soon)
PowerBI Computations and charts from R scripts in dashboards
Azure ML Studio R Scripts in cloud-based Experiment workflows
Visual Studio
R Tools for Visual Studio: integrated development
environment for R (coming soon)
HDInsights R integrated with cloud-based Hadoop clusters
Cortana Analytics Cloud-based R APIs and Virtual Machines
Benchmarks details at MRAN
CRAN, MRO, MRS Comparison
Datasize
In-memory
In-memory In-Memory or Disk Based
Speed of Analysis
Single threaded Multi-threaded
Multi-threaded, parallel
processing 1:N servers
Support
Community Community Community + Commercial
Analytic Breadth
& Depth 7500+ innovative analytic
packages
7500+ innovative analytic
packages
7500+ innovative packages +
commercial parallel high-speed
functions
License
Open Source
Open Source
Commercial license.
Supported release with
indemnity
Microsoft
R Open
Microsoft
R Server
Enterprise
proven
Hybrid
Hyper-scale
Microsoft
Azure
Azure regions
Enterprise
proven
Hybrid
Hyper-scale
Open + Flexible
Open &
flexible Applications
Infrastructure
Management
Databases & Middleware
App Frameworks
Linux
Enterprise
proven
Hybrid
Hyper-scale
Open &
flexible
Trustworthy
More compliance certifications
than any other cloud
Trustworthy
Developer & IT
productivity
Trustworthy
Platform for SaaS
extensibility
Vision Analytics
Recommenda-
tion engines
Advertising
analysis
Weather
forecasting for
business
planning
Social network
analysis
Legal
discovery and
document
archiving
Pricing analysis
Fraud
detection
Churn
analysis
Equipment
monitoring
Location-based
tracking and
services
Personalized
Insurance
Machine learning &
predictive analytics are core
capabilities that are needed
throughout your business
Get/Prepare
Data
Build/Edit
Experiment
Create/Updat
e Model
Evaluate
Model
Results
Publish Web
Service
Build ML Model Deploy as Web ServiceProvision Workspace
Get Azure
Subscription
Create
Workspace
Publish an App
Azure Data
Marketplace
Classification Regression Recommenders Clustering
• State of art ML Algorithms
• Support of open source tools: R (and Python)
• Build a model by visually drag and drop
• Deploy as web service with clicks of button
• Use the service by calling the web API
• Publish an app in Azure Data Marketplace
• Goal
• Data sources
Census Income Dataset
UCI Machine Learning Repository
• Dataset Description
http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names
age workclass fnlwgt education education-num marital-status occupation relationship race sex capital-gain capital-loss hours-per-week native-country income
39 State-gov 77516 Bachelors 13 Never-married Adm-clerical Not-in-family White Male 2174 0 40 United-States <=50K
50 Self-emp-not-inc83311 Bachelors 13 Married-civ-spouseExec-managerialHusband White Male 0 0 13 United-States <=50K
38 Private 215646 HS-grad 9 Divorced Handlers-cleanersNot-in-family White Male 0 0 40 United-States <=50K
53 Private 234721 11th 7 Married-civ-spouseHandlers-cleanersHusband Black Male 0 0 40 United-States <=50K
28 Private 338409 Bachelors 13 Married-civ-spouseProf-specialty Wife Black Female 0 0 40 Cuba <=50K
37 Private 284582 Masters 14 Married-civ-spouseExec-managerialWife White Female 0 0 40 United-States <=50K
49 Private 160187 9th 5 Married-spouse-absentOther-service Not-in-family Black Female 0 0 16 Jamaica <=50K
52 Self-emp-not-inc209642 HS-grad 9 Married-civ-spouseExec-managerialHusband White Male 0 0 45 United-States >50K
31 Private 45781 Masters 14 Never-married Prof-specialty Not-in-family White Female 14084 0 50 United-States >50K
42 Private 159449 Bachelors 13 Married-civ-spouseExec-managerialHusband White Male 5178 0 40 United-States >50K
37 Private 280464 Some-college 10 Married-civ-spouseExec-managerialHusband Black Male 0 0 80 United-States >50K
30 State-gov 141297 Bachelors 13 Married-civ-spouseProf-specialty Husband Asian-Pac-IslanderMale 0 0 40 India >50K
23 Private 122272 Bachelors 13 Never-married Adm-clerical Own-child White Female 0 0 30 United-States <=50K
32 Private 205019 Assoc-acdm 12 Never-married Sales Not-in-family Black Male 0 0 50 United-States <=50K
40 Private 121772 Assoc-voc 11 Married-civ-spouseCraft-repair Husband Asian-Pac-IslanderMale 0 0 40 ? >50K
34 Private 245487 7th-8th 4 Married-civ-spouseTransport-movingHusband Amer-Indian-EskimoMale 0 0 45 Mexico <=50K
25 Self-emp-not-inc176756 HS-grad 9 Never-married Farming-fishingOwn-child White Male 0 0 35 United-States <=50K
32 Private 186824 HS-grad 9 Never-married Machine-op-inspctUnmarried White Male 0 0 40 United-States <=50K
38 Private 28887 11th 7 Married-civ-spouseSales Husband White Male 0 0 50 United-States <=50K
43 Self-emp-not-inc292175 Masters 14 Divorced Exec-managerialUnmarried White Female 0 0 45 United-States >50K
40 Private 193524 Doctorate 16 Married-civ-spouseProf-specialty Husband White Male 0 0 60 United-States >50K
54 Private 302146 HS-grad 9 Separated Other-service Unmarried Black Female 0 0 20 United-States <=50K
35 Federal-gov 76845 9th 5 Married-civ-spouseFarming-fishingHusband Black Male 0 0 40 United-States <=50K
43 Private 117037 11th 7 Married-civ-spouseTransport-movingHusband White Male 0 2042 40 United-States <=50K
59 Private 109015 HS-grad 9 Divorced Tech-support Unmarried White Female 0 0 40 United-States <=50K
56 Local-gov 216851 Bachelors 13 Married-civ-spouseTech-support Husband White Male 0 0 40 United-States >50K
19 Private 168294 HS-grad 9 Never-married Craft-repair Own-child White Male 0 0 40 United-States <=50K
54 ? 180211 Some-college 10 Married-civ-spouse? Husband Asian-Pac-IslanderMale 0 0 60 South >50K
39 Private 367260 HS-grad 9 Divorced Exec-managerialNot-in-family White Male 0 0 80 United-States <=50K
• Read from various data sources including:







• Write out the results:




• Data sources









• Data sinks




• Data storage


• Element Types



• Categorical aka “factors”


• Dense and sparse

Overview of Azure ML
Data Ingress and Egress
• Data Insights


• Data Scrubbing




• Data Transformation





• Feature Selection
• Problem


• Goal

 Predicting Flight Delays
• Data sources
1.
 Bureau of Transportation Statistics (BTS)
2.
 National Oceanic and Atmospheric Administration (NOAA)
 FTP
Year Month DayofMonth DayOfWeek Carrier OriginAirportID DestAirportID CRSDepTime DepDelay DepDel15 CRSArrTime ArrDelay ArrDel15 Cancelled
2013 7 16 2 DL 13487 14747 2155 -1 0 2336 5 0 0
2013 7 16 2 DL 12889 13487 1555 -6 0 2057 -7 0 0
2013 7 16 2 DL 11278 10397 1600 -5 0 1752 -19 0 0
2013 7 16 2 DL 13851 10397 600 -3 0 904 4 0 0
2013 7 16 2 DL 14747 12478 2330 49 1 736 40 1 0
2013 7 16 2 DL 12478 14747 1735 -4 0 2108 -41 0 0
2013 7 16 2 DL 15016 13487 1656 33 1 1831 17 1 0
2013 7 16 2 DL 11278 11433 659 -7 0 837 -28 0 0
2013 7 16 2 DL 10397 14107 805 -2 0 859 -25 0 0
2013 7 16 2 DL 14107 10397 1005 -6 0 1650 -8 0 0
2013 7 16 2 DL 12953 10397 700 112 1 930 108 1 0
2013 7 16 2 DL 11433 12953 1725 1914 1 1
2013 7 16 2 DL 13495 12892 720 917 1 1
2013 7 16 2 DL 12889 13487 720 0 0 1222 -7 0 0
2013 7 16 2 DL 13487 12889 1750 -9 0 1911 -26 0 0
2013 7 16 2 DL 12889 12892 620 -1 0 730 -5 0 0
2013 7 16 2 DL 12889 10397 715 -7 0 1415 -1 0 0
2013 7 16 2 DL 10397 12892 940 -2 0 1110 -11 0 0
2013 7 16 2 DL 15304 10397 1445 -5 0 1615 -5 0 0
2013 7 16 2 DL 14869 14747 2155 -5 0 2300 31 1 0
2013 7 16 2 DL 15304 12892 1930 -8 0 2125 -17 0 0
2013 7 16 2 DL 13487 12892 1135 2 0 1321 -17 0 0
2013 7 16 2 DL 12892 12173 1442 4 0 1723 2 0 0
2013 7 16 2 DL 11433 10529 720 -4 0 900 -9 0 0
2013 7 16 2 DL 10529 11433 941 -4 0 1129 -12 0 0
2013 7 16 2 DL 10397 14100 1117 -4 0 1320 -5 0 0
2013 7 16 2 DL 14100 10397 1415 -5 0 1616 -12 0 0
2013 7 16 2 DL 14869 13487 2016 6 0 2346 0 0 0
Sample flight data
Year Month Day AirportID Time TimeZone SkyCondition Visibility WeatherType DryBulbFarenheit DryBulbCelsius WetBulbFarenheit WetBulbCelsius DewPointFarenheit DewPointCelsius RelativeHumidity
2013 7 1 14843 56 -4 FEW020SCT035 10 78 25.6 75 23.6 73 22.8 85
2013 7 1 14843 156 -4 FEW035 10 78 25.6 74 23.2 72 22.2 82
2013 7 1 14843 256 -4 FEW050 10 78 25.6 75 23.6 73 22.8 85
2013 7 1 14843 356 -4 FEW055SCT070 10 78 25.6 75 23.6 73 22.8 85
2013 7 1 14843 456 -4 FEW050 10 77 25 74 23.4 73 22.8 88
2013 7 1 14843 556 -4 FEW025SCT065 10 78 25.6 75 23.6 73 22.8 85
2013 7 1 14843 656 -4 FEW025SCT042SCT065 10 78 25.6 75 24 74 23.3 88
2013 7 1 14843 756 -4 FEW034SCT044SCT065 10 81 27.2 77 24.8 75 23.9 82
2013 7 1 14843 856 -4 FEW034SCT045SCT070 9 85 29.4 77 25.1 74 23.3 70
2013 7 1 14843 956 -4 FEW022SCT032 9 86 30 78 25.6 75 23.9 70
2013 7 1 14843 1056 -4 FEW025SCT032 9 87 30.6 78 25.8 75 23.9 68
2013 7 1 14843 1156 -4 SCT025SCT032CBBKN050 9 TS 84 28.9 77 25 74 23.3 72
2013 7 1 14843 1215 -4 SCT025SCT038BKN055 9 84 29 76 24.6 73 23 70
2013 7 1 14843 1256 -4 FEW025SCT035BKN075 7 =-RA 81 27.2 77 24.8 75 23.9 82
2013 7 1 14843 1356 -4 SCT028SCT040BKN080 9 83 28.3 76 24.4 73 22.8 72
2013 7 1 14843 1456 -4 FEW038SCT060BKN110 9 84 28.9 77 24.9 74 23.3 72
2013 7 1 14843 1556 -4 FEW036SCT070BKN110 9 83 28.3 76 24.4 73 22.8 72
2013 7 1 14843 1656 -4 FEW042BKN100 9 83 28.3 77 24.8 74 23.3 74
2013 7 1 14843 1756 -4 FEW055BKN100 9 83 28.3 77 24.8 74 23.3 74
2013 7 1 14843 1856 -4 FEW038SCT075BKN100 9 82 27.8 76 24.3 73 22.8 74
2013 7 1 14843 1956 -4 FEW055 10 81 27.2 75 24.1 73 22.8 77
2013 7 1 14843 2056 -4 FEW045SCT065 10 81 27.2 75 24.1 73 22.8 77
2013 7 1 14843 2156 -4 FEW025SCT055 10 80 26.7 75 23.9 73 22.8 79
2013 7 1 14843 2256 -4 FEW025SCT055 10 80 26.7 76 24.3 74 23.3 82
2013 7 1 14843 2356 -4 FEW025SCT060 10 79 26.1 76 24.1 74 23.3 85
2013 7 2 14843 56 -4 FEW022BKN070 10 79 26.1 76 24.1 74 23.3 85
2013 7 2 14843 156 -4 FEW040 10 79 26.1 75 23.8 73 22.8 82
2013 7 2 14843 256 -4 FEW040 10 =-RA 78 25.6 75 23.6 73 22.8 85
2013 7 2 14843 356 -4 FEW022SCT040 10 78 25.6 75 24 74 23.3 88
Sample weather data
Data Visualization and Transformation
• Filter Based Feature Selection








• Data Partitioning


• Training a Binary Classifier




• Model Comparison


• Parameter Sweeping



Building a Classification Model
Model Evaluation and Parameter Sweeping
Get/Prepare
Data
Build/Edit
Experiment
Create/Updat
e Model
Evaluate
Model
Results
Create
Scoring
Graph
Add
Input/output
Ports
Publish Web
Service
Deploy to
Production
Build ML Model Publish as Web ServiceProvision Workspace
Get Azure
Subscription
Create
Workspace







Operationalizing a Predictive Model
























# Connect Adult dataset - visible as data.frame
# Clean up unfortunate variable names (R does not like ‘-’ in variable names)
# Plot some variable distributions by income

Using R in Azure ML Studio
Azure Machine Learning using R
Azure Machine Learning using R

More Related Content

Similar to Azure Machine Learning using R

Human Resources Performance Management Metrics PowerPoint Presentation Slides
Human Resources Performance Management Metrics PowerPoint Presentation Slides Human Resources Performance Management Metrics PowerPoint Presentation Slides
Human Resources Performance Management Metrics PowerPoint Presentation Slides
SlideTeam
 
Human Resources Performance Management Metrics Powerpoint Presentation Slides
Human Resources Performance Management Metrics Powerpoint Presentation SlidesHuman Resources Performance Management Metrics Powerpoint Presentation Slides
Human Resources Performance Management Metrics Powerpoint Presentation Slides
SlideTeam
 

Similar to Azure Machine Learning using R (20)

Den moderne dataplatform - gør din dataplatform til det mest værdifulde asset
Den moderne dataplatform - gør din dataplatform til det mest værdifulde asset Den moderne dataplatform - gør din dataplatform til det mest værdifulde asset
Den moderne dataplatform - gør din dataplatform til det mest værdifulde asset
 
Den moderne dataplatform - gør din dataplatform til det mest værdifulde asset
Den moderne dataplatform - gør din dataplatform til det mest værdifulde asset Den moderne dataplatform - gør din dataplatform til det mest værdifulde asset
Den moderne dataplatform - gør din dataplatform til det mest værdifulde asset
 
Presentation.pptx (2)
Presentation.pptx (2)Presentation.pptx (2)
Presentation.pptx (2)
 
Build It And They Will Come: User Adoption SharePoint 2013 (SPS Charlotte)
Build It And They Will Come:  User Adoption SharePoint 2013 (SPS Charlotte)Build It And They Will Come:  User Adoption SharePoint 2013 (SPS Charlotte)
Build It And They Will Come: User Adoption SharePoint 2013 (SPS Charlotte)
 
Data Culture Series - Keynote & Panel - Reading - 12th May 2015
Data Culture Series  - Keynote & Panel - Reading - 12th May 2015Data Culture Series  - Keynote & Panel - Reading - 12th May 2015
Data Culture Series - Keynote & Panel - Reading - 12th May 2015
 
Data-driven model-based restructuring of enterprise transaction operations
Data-driven model-based restructuring of enterprise transaction operationsData-driven model-based restructuring of enterprise transaction operations
Data-driven model-based restructuring of enterprise transaction operations
 
ASUG 2014 - Big Data and Advanced Analytics
ASUG 2014 - Big Data and Advanced AnalyticsASUG 2014 - Big Data and Advanced Analytics
ASUG 2014 - Big Data and Advanced Analytics
 
Audience Measurement: Nielsen Online vs Google Analytics
Audience Measurement: Nielsen Online vs Google AnalyticsAudience Measurement: Nielsen Online vs Google Analytics
Audience Measurement: Nielsen Online vs Google Analytics
 
Piano rubyslava final
Piano rubyslava finalPiano rubyslava final
Piano rubyslava final
 
Human Resources Performance Management Metrics PowerPoint Presentation Slides
Human Resources Performance Management Metrics PowerPoint Presentation Slides Human Resources Performance Management Metrics PowerPoint Presentation Slides
Human Resources Performance Management Metrics PowerPoint Presentation Slides
 
Realestate appraiser course list
Realestate appraiser course listRealestate appraiser course list
Realestate appraiser course list
 
zilla parisad pune notification 2015
zilla parisad pune notification 2015zilla parisad pune notification 2015
zilla parisad pune notification 2015
 
Human Resources Performance Management Metrics Powerpoint Presentation Slides
Human Resources Performance Management Metrics Powerpoint Presentation SlidesHuman Resources Performance Management Metrics Powerpoint Presentation Slides
Human Resources Performance Management Metrics Powerpoint Presentation Slides
 
Data cleansing project
Data cleansing project Data cleansing project
Data cleansing project
 
Les00 Intoduction
Les00 IntoductionLes00 Intoduction
Les00 Intoduction
 
さくナレの5年間を振り返る
さくナレの5年間を振り返るさくナレの5年間を振り返る
さくナレの5年間を振り返る
 
SPSToronto - Build It and They Will Come: SharePoint User Adoption
SPSToronto - Build It and They Will Come:  SharePoint User AdoptionSPSToronto - Build It and They Will Come:  SharePoint User Adoption
SPSToronto - Build It and They Will Come: SharePoint User Adoption
 
Adding Complex Data to Spark Stack by Tug Grall
Adding Complex Data to Spark Stack by Tug GrallAdding Complex Data to Spark Stack by Tug Grall
Adding Complex Data to Spark Stack by Tug Grall
 
Operational Analytics at Credit Suisse from ThousandEyes Connect
Operational Analytics at Credit Suisse from ThousandEyes ConnectOperational Analytics at Credit Suisse from ThousandEyes Connect
Operational Analytics at Credit Suisse from ThousandEyes Connect
 
Creating a Big data Strategy with Tactics for Quick Implementation
Creating a Big data Strategy with Tactics for Quick ImplementationCreating a Big data Strategy with Tactics for Quick Implementation
Creating a Big data Strategy with Tactics for Quick Implementation
 

More from Herman Wu

More from Herman Wu (9)

Use MLflow to manage and deploy Machine Learning model on Spark
Use MLflow to manage and deploy Machine Learning model on Spark Use MLflow to manage and deploy Machine Learning model on Spark
Use MLflow to manage and deploy Machine Learning model on Spark
 
ML-Ops how to bring your data science to production
ML-Ops  how to bring your data science to productionML-Ops  how to bring your data science to production
ML-Ops how to bring your data science to production
 
Deep Learning at Scale
Deep Learning at ScaleDeep Learning at Scale
Deep Learning at Scale
 
運用MMLSpark 來加速Spark 上 機器學習專案
運用MMLSpark 來加速Spark 上機器學習專案運用MMLSpark 來加速Spark 上機器學習專案
運用MMLSpark 來加速Spark 上 機器學習專案
 
運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...
運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...
運用CNTK 實作深度學習物件辨識 Deep Learning based Object Detection with Microsoft Cogniti...
 
運用對話機器人提供線上客服服務
運用對話機器人提供線上客服服務運用對話機器人提供線上客服服務
運用對話機器人提供線上客服服務
 
選擇正確的Solution 來建置現代化的雲端資料倉儲
選擇正確的Solution 來建置現代化的雲端資料倉儲選擇正確的Solution 來建置現代化的雲端資料倉儲
選擇正確的Solution 來建置現代化的雲端資料倉儲
 
貫通物聯網每一哩路 with Microsfot Azure IoT Sutie
貫通物聯網每一哩路 with Microsfot Azure IoT Sutie貫通物聯網每一哩路 with Microsfot Azure IoT Sutie
貫通物聯網每一哩路 with Microsfot Azure IoT Sutie
 
物聯網應用全貌以及微軟全球案例
物聯網應用全貌以及微軟全球案例物聯網應用全貌以及微軟全球案例
物聯網應用全貌以及微軟全球案例
 

Recently uploaded

Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
StarCompliance.io
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
Computer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sComputer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage s
MAQIB18
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
nscud
 

Recently uploaded (20)

Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
Uber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis ReportUber Ride Supply Demand Gap Analysis Report
Uber Ride Supply Demand Gap Analysis Report
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
2024-05-14 - Tableau User Group - TC24 Hot Topics - Tableau Pulse and Einstei...
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
 
Computer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage sComputer Presentation.pptx ecommerce advantage s
Computer Presentation.pptx ecommerce advantage s
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
Slip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp ClaimsSlip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp Claims
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 

Azure Machine Learning using R

  • 1.
  • 2.  Introduction to Azure ML Studio  How to Build an ML App in 15 mins  Data Science Process: Data Ingress and Egress Data Visualization, Transformation and Feature Selection Model Building, Evaluating, and Tuning Operationalizing a Predictive Model  Using R in ML Studio
  • 3. Which statement best matches your relationship to R? - I’m completely new to R, but want to learn - I’m learning R - I’m an experienced R user for personal interests - I’m an experienced R user for works - I won’t be using R (but I’m interested in what it can do)
  • 4. 4 • Free and open source R distribution • Enhanced and distributed by Revolution Analytics Microsoft R Open • Built in Advanced Analytics and Stand Alone Server Capability • Leverages the Benefits of SQL 2016 Enterprise Edition SQL Server R Services Microsoft R Products
  • 6. Microsoft R Server Big-data analytics and distributed computing on Linux, Hadoop and Teradata SQL Server 2016 Big-data analytics integrated with SQL Server database (coming soon) PowerBI Computations and charts from R scripts in dashboards Azure ML Studio R Scripts in cloud-based Experiment workflows Visual Studio R Tools for Visual Studio: integrated development environment for R (coming soon) HDInsights R integrated with cloud-based Hadoop clusters Cortana Analytics Cloud-based R APIs and Virtual Machines
  • 8. CRAN, MRO, MRS Comparison Datasize In-memory In-memory In-Memory or Disk Based Speed of Analysis Single threaded Multi-threaded Multi-threaded, parallel processing 1:N servers Support Community Community Community + Commercial Analytic Breadth & Depth 7500+ innovative analytic packages 7500+ innovative analytic packages 7500+ innovative packages + commercial parallel high-speed functions License Open Source Open Source Commercial license. Supported release with indemnity Microsoft R Open Microsoft R Server
  • 9.
  • 12. Enterprise proven Hybrid Hyper-scale Open + Flexible Open & flexible Applications Infrastructure Management Databases & Middleware App Frameworks Linux
  • 13. Enterprise proven Hybrid Hyper-scale Open & flexible Trustworthy More compliance certifications than any other cloud Trustworthy Developer & IT productivity Trustworthy Platform for SaaS extensibility
  • 14.
  • 15. Vision Analytics Recommenda- tion engines Advertising analysis Weather forecasting for business planning Social network analysis Legal discovery and document archiving Pricing analysis Fraud detection Churn analysis Equipment monitoring Location-based tracking and services Personalized Insurance Machine learning & predictive analytics are core capabilities that are needed throughout your business
  • 16.
  • 17.
  • 18. Get/Prepare Data Build/Edit Experiment Create/Updat e Model Evaluate Model Results Publish Web Service Build ML Model Deploy as Web ServiceProvision Workspace Get Azure Subscription Create Workspace Publish an App Azure Data Marketplace
  • 19. Classification Regression Recommenders Clustering • State of art ML Algorithms
  • 20. • Support of open source tools: R (and Python) • Build a model by visually drag and drop • Deploy as web service with clicks of button • Use the service by calling the web API • Publish an app in Azure Data Marketplace
  • 21.
  • 22. • Goal • Data sources Census Income Dataset UCI Machine Learning Repository • Dataset Description http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names
  • 23. age workclass fnlwgt education education-num marital-status occupation relationship race sex capital-gain capital-loss hours-per-week native-country income 39 State-gov 77516 Bachelors 13 Never-married Adm-clerical Not-in-family White Male 2174 0 40 United-States <=50K 50 Self-emp-not-inc83311 Bachelors 13 Married-civ-spouseExec-managerialHusband White Male 0 0 13 United-States <=50K 38 Private 215646 HS-grad 9 Divorced Handlers-cleanersNot-in-family White Male 0 0 40 United-States <=50K 53 Private 234721 11th 7 Married-civ-spouseHandlers-cleanersHusband Black Male 0 0 40 United-States <=50K 28 Private 338409 Bachelors 13 Married-civ-spouseProf-specialty Wife Black Female 0 0 40 Cuba <=50K 37 Private 284582 Masters 14 Married-civ-spouseExec-managerialWife White Female 0 0 40 United-States <=50K 49 Private 160187 9th 5 Married-spouse-absentOther-service Not-in-family Black Female 0 0 16 Jamaica <=50K 52 Self-emp-not-inc209642 HS-grad 9 Married-civ-spouseExec-managerialHusband White Male 0 0 45 United-States >50K 31 Private 45781 Masters 14 Never-married Prof-specialty Not-in-family White Female 14084 0 50 United-States >50K 42 Private 159449 Bachelors 13 Married-civ-spouseExec-managerialHusband White Male 5178 0 40 United-States >50K 37 Private 280464 Some-college 10 Married-civ-spouseExec-managerialHusband Black Male 0 0 80 United-States >50K 30 State-gov 141297 Bachelors 13 Married-civ-spouseProf-specialty Husband Asian-Pac-IslanderMale 0 0 40 India >50K 23 Private 122272 Bachelors 13 Never-married Adm-clerical Own-child White Female 0 0 30 United-States <=50K 32 Private 205019 Assoc-acdm 12 Never-married Sales Not-in-family Black Male 0 0 50 United-States <=50K 40 Private 121772 Assoc-voc 11 Married-civ-spouseCraft-repair Husband Asian-Pac-IslanderMale 0 0 40 ? >50K 34 Private 245487 7th-8th 4 Married-civ-spouseTransport-movingHusband Amer-Indian-EskimoMale 0 0 45 Mexico <=50K 25 Self-emp-not-inc176756 HS-grad 9 Never-married Farming-fishingOwn-child White Male 0 0 35 United-States <=50K 32 Private 186824 HS-grad 9 Never-married Machine-op-inspctUnmarried White Male 0 0 40 United-States <=50K 38 Private 28887 11th 7 Married-civ-spouseSales Husband White Male 0 0 50 United-States <=50K 43 Self-emp-not-inc292175 Masters 14 Divorced Exec-managerialUnmarried White Female 0 0 45 United-States >50K 40 Private 193524 Doctorate 16 Married-civ-spouseProf-specialty Husband White Male 0 0 60 United-States >50K 54 Private 302146 HS-grad 9 Separated Other-service Unmarried Black Female 0 0 20 United-States <=50K 35 Federal-gov 76845 9th 5 Married-civ-spouseFarming-fishingHusband Black Male 0 0 40 United-States <=50K 43 Private 117037 11th 7 Married-civ-spouseTransport-movingHusband White Male 0 2042 40 United-States <=50K 59 Private 109015 HS-grad 9 Divorced Tech-support Unmarried White Female 0 0 40 United-States <=50K 56 Local-gov 216851 Bachelors 13 Married-civ-spouseTech-support Husband White Male 0 0 40 United-States >50K 19 Private 168294 HS-grad 9 Never-married Craft-repair Own-child White Male 0 0 40 United-States <=50K 54 ? 180211 Some-college 10 Married-civ-spouse? Husband Asian-Pac-IslanderMale 0 0 60 South >50K 39 Private 367260 HS-grad 9 Divorced Exec-managerialNot-in-family White Male 0 0 80 United-States <=50K
  • 24.
  • 25.
  • 26. • Read from various data sources including:        • Write out the results:    
  • 28. • Data storage   • Element Types    • Categorical aka “factors”   • Dense and sparse 
  • 29. Overview of Azure ML Data Ingress and Egress
  • 30.
  • 31. • Data Insights   • Data Scrubbing     • Data Transformation      • Feature Selection
  • 32. • Problem   • Goal   Predicting Flight Delays • Data sources 1.  Bureau of Transportation Statistics (BTS) 2.  National Oceanic and Atmospheric Administration (NOAA)  FTP
  • 33. Year Month DayofMonth DayOfWeek Carrier OriginAirportID DestAirportID CRSDepTime DepDelay DepDel15 CRSArrTime ArrDelay ArrDel15 Cancelled 2013 7 16 2 DL 13487 14747 2155 -1 0 2336 5 0 0 2013 7 16 2 DL 12889 13487 1555 -6 0 2057 -7 0 0 2013 7 16 2 DL 11278 10397 1600 -5 0 1752 -19 0 0 2013 7 16 2 DL 13851 10397 600 -3 0 904 4 0 0 2013 7 16 2 DL 14747 12478 2330 49 1 736 40 1 0 2013 7 16 2 DL 12478 14747 1735 -4 0 2108 -41 0 0 2013 7 16 2 DL 15016 13487 1656 33 1 1831 17 1 0 2013 7 16 2 DL 11278 11433 659 -7 0 837 -28 0 0 2013 7 16 2 DL 10397 14107 805 -2 0 859 -25 0 0 2013 7 16 2 DL 14107 10397 1005 -6 0 1650 -8 0 0 2013 7 16 2 DL 12953 10397 700 112 1 930 108 1 0 2013 7 16 2 DL 11433 12953 1725 1914 1 1 2013 7 16 2 DL 13495 12892 720 917 1 1 2013 7 16 2 DL 12889 13487 720 0 0 1222 -7 0 0 2013 7 16 2 DL 13487 12889 1750 -9 0 1911 -26 0 0 2013 7 16 2 DL 12889 12892 620 -1 0 730 -5 0 0 2013 7 16 2 DL 12889 10397 715 -7 0 1415 -1 0 0 2013 7 16 2 DL 10397 12892 940 -2 0 1110 -11 0 0 2013 7 16 2 DL 15304 10397 1445 -5 0 1615 -5 0 0 2013 7 16 2 DL 14869 14747 2155 -5 0 2300 31 1 0 2013 7 16 2 DL 15304 12892 1930 -8 0 2125 -17 0 0 2013 7 16 2 DL 13487 12892 1135 2 0 1321 -17 0 0 2013 7 16 2 DL 12892 12173 1442 4 0 1723 2 0 0 2013 7 16 2 DL 11433 10529 720 -4 0 900 -9 0 0 2013 7 16 2 DL 10529 11433 941 -4 0 1129 -12 0 0 2013 7 16 2 DL 10397 14100 1117 -4 0 1320 -5 0 0 2013 7 16 2 DL 14100 10397 1415 -5 0 1616 -12 0 0 2013 7 16 2 DL 14869 13487 2016 6 0 2346 0 0 0 Sample flight data
  • 34. Year Month Day AirportID Time TimeZone SkyCondition Visibility WeatherType DryBulbFarenheit DryBulbCelsius WetBulbFarenheit WetBulbCelsius DewPointFarenheit DewPointCelsius RelativeHumidity 2013 7 1 14843 56 -4 FEW020SCT035 10 78 25.6 75 23.6 73 22.8 85 2013 7 1 14843 156 -4 FEW035 10 78 25.6 74 23.2 72 22.2 82 2013 7 1 14843 256 -4 FEW050 10 78 25.6 75 23.6 73 22.8 85 2013 7 1 14843 356 -4 FEW055SCT070 10 78 25.6 75 23.6 73 22.8 85 2013 7 1 14843 456 -4 FEW050 10 77 25 74 23.4 73 22.8 88 2013 7 1 14843 556 -4 FEW025SCT065 10 78 25.6 75 23.6 73 22.8 85 2013 7 1 14843 656 -4 FEW025SCT042SCT065 10 78 25.6 75 24 74 23.3 88 2013 7 1 14843 756 -4 FEW034SCT044SCT065 10 81 27.2 77 24.8 75 23.9 82 2013 7 1 14843 856 -4 FEW034SCT045SCT070 9 85 29.4 77 25.1 74 23.3 70 2013 7 1 14843 956 -4 FEW022SCT032 9 86 30 78 25.6 75 23.9 70 2013 7 1 14843 1056 -4 FEW025SCT032 9 87 30.6 78 25.8 75 23.9 68 2013 7 1 14843 1156 -4 SCT025SCT032CBBKN050 9 TS 84 28.9 77 25 74 23.3 72 2013 7 1 14843 1215 -4 SCT025SCT038BKN055 9 84 29 76 24.6 73 23 70 2013 7 1 14843 1256 -4 FEW025SCT035BKN075 7 =-RA 81 27.2 77 24.8 75 23.9 82 2013 7 1 14843 1356 -4 SCT028SCT040BKN080 9 83 28.3 76 24.4 73 22.8 72 2013 7 1 14843 1456 -4 FEW038SCT060BKN110 9 84 28.9 77 24.9 74 23.3 72 2013 7 1 14843 1556 -4 FEW036SCT070BKN110 9 83 28.3 76 24.4 73 22.8 72 2013 7 1 14843 1656 -4 FEW042BKN100 9 83 28.3 77 24.8 74 23.3 74 2013 7 1 14843 1756 -4 FEW055BKN100 9 83 28.3 77 24.8 74 23.3 74 2013 7 1 14843 1856 -4 FEW038SCT075BKN100 9 82 27.8 76 24.3 73 22.8 74 2013 7 1 14843 1956 -4 FEW055 10 81 27.2 75 24.1 73 22.8 77 2013 7 1 14843 2056 -4 FEW045SCT065 10 81 27.2 75 24.1 73 22.8 77 2013 7 1 14843 2156 -4 FEW025SCT055 10 80 26.7 75 23.9 73 22.8 79 2013 7 1 14843 2256 -4 FEW025SCT055 10 80 26.7 76 24.3 74 23.3 82 2013 7 1 14843 2356 -4 FEW025SCT060 10 79 26.1 76 24.1 74 23.3 85 2013 7 2 14843 56 -4 FEW022BKN070 10 79 26.1 76 24.1 74 23.3 85 2013 7 2 14843 156 -4 FEW040 10 79 26.1 75 23.8 73 22.8 82 2013 7 2 14843 256 -4 FEW040 10 =-RA 78 25.6 75 23.6 73 22.8 85 2013 7 2 14843 356 -4 FEW022SCT040 10 78 25.6 75 24 74 23.3 88 Sample weather data
  • 35. Data Visualization and Transformation
  • 36.
  • 37. • Filter Based Feature Selection        
  • 38.
  • 39. • Data Partitioning   • Training a Binary Classifier     • Model Comparison   • Parameter Sweeping   
  • 41.
  • 42. Model Evaluation and Parameter Sweeping
  • 43.
  • 44.
  • 45. Get/Prepare Data Build/Edit Experiment Create/Updat e Model Evaluate Model Results Create Scoring Graph Add Input/output Ports Publish Web Service Deploy to Production Build ML Model Publish as Web ServiceProvision Workspace Get Azure Subscription Create Workspace
  • 49.
  • 53.    # Connect Adult dataset - visible as data.frame # Clean up unfortunate variable names (R does not like ‘-’ in variable names) # Plot some variable distributions by income 
  • 54.
  • 55. Using R in Azure ML Studio