Azure Global Bootcamp
Hands-on with Cortana Intelligence Suite
Jan Pieter Posthuma – 22 April 2016
Welcome @ Azure Global Bootcamp 2017
Apr-17 – DataScenarios – CC BY 4.0
#GlobalAzure
Rubicon - Gast
Welkom@Rubicon
Timetable
• 09:00 – Welcome
• 09:30 – Introduction
• 09:45 – Azure Stream Analytics
• 11:15 – Azure Data Lake Store
• 12:30 – Lunch
• 13:15 – Azure Data Lake Analytics
• 14:30 – Azure Machine Learning
• 15:45 – Wrap-up / AzureML UDF in ASA
• 16:00 – Drinks
Who am I?
• Jan Pieter Posthuma – Microsoft Data Consultant
• DataScenarios – Data Consultancy Company
• Architect roles at multiple projects
• Creator of Power BI Custom Visuals
• HierarchySlicer
• Box and Whisker chart
• Contact
• mail@datascenarios.nl
• https://twitter.com/jppp
• https://linkedin.com/in/jpposthuma
• https://github.com/liprec
• https://docs.com/liprec
Prerequisites
• Azure Subscription
(Free signup: https://azure.microsoft.com/pricing/free-trial/)
• Azure ML subscription
(Free signup: https://studio.azureml.net/?selectAccess=true&o=2)
• Visual Studio 2017 (2015) with Azure Data Lake Tools
(2017: via Tools and Extensions;
2015: https://www.microsoft.com/en-us/download/details.aspx?id=49504)
From data to decisions to action
Cortana Intelligence Suite
Hands-on labs scenario
• Lab 1 – Ingest: Azure Stream Analytics
• Lab 2 – Store and Prepare: Azure Data Lake Store, Azure Data Lake Analytics
• Lab 3 – Learning: Azure Machine Learning
• Power BI
• Close the loop
Lab information
• Find Azure Event Hub information here:
https://bit.ly/GAB-NL
https://bit.ly/GAB-NL-HOL
Azure Stream Analytics
• Fully managed, cost-effective real-time event processing engine
• T-SQL-like query language: SAQL (Stream Analytics Query Language)
• Scalable cloud solution
• Input: Azure Event Hubs, Azure Storage Blobs
• Output: Azure Event Hubs, Power BI, Azure Storage Blobs, Azure Data Lake Store, DocumentDB, Azure SQL DB
• AzureML integration via UDF
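
To illustrate the kind of result a real-time aggregation produces, here is a minimal Python sketch. It is not SAQL and does not call the Azure service; it only counts events per device in fixed, non-overlapping time windows. The event shape, the field names `device` and `ts`, and the 10-second window are assumptions made for this example.

```python
from collections import Counter, defaultdict


def count_per_window(events, window_seconds):
    """Count events per device in fixed, non-overlapping time windows.

    `events` is an iterable of dicts with hypothetical keys:
    'device' (str) and 'ts' (seconds since stream start).
    Returns a dict mapping window start -> Counter of device counts.
    """
    windows = defaultdict(Counter)
    for event in events:
        # Align the timestamp to the start of the window it falls in.
        window_start = int(event["ts"] // window_seconds) * window_seconds
        windows[window_start][event["device"]] += 1
    return dict(windows)


# Hypothetical sample events, not taken from the lab's Event Hub.
sample_events = [
    {"device": "sensor-a", "ts": 1},
    {"device": "sensor-b", "ts": 4},
    {"device": "sensor-a", "ts": 9},
    {"device": "sensor-a", "ts": 12},
]
result = count_per_window(sample_events, window_seconds=10)
```

In the hosted service the same idea would be expressed as a query over an input stream; this sketch only shows the grouping logic on an in-memory list.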
Azure Data Lake Store
• Enterprise-wide hyper-scale repository for big data analytic workloads
• Apache Hadoop file system compatible (HDFS)
• Unlimited storage
• Highly available: cloud scale, redundant copies
• Secure: authentication (AAD), access control (ACL) and encryption (Azure Key Vault)
• All data
Azure Data Lake Analytics
• Big data analytics designed to run on top of Azure Data Lake Store
• Dynamic scaling: cloud scale, dynamically provisions resources
• U-SQL: simple and familiar, powerful, and extensible (via .NET)
• Price is determined by job duration; a job is defined by its query, priority and maximum parallelism
• Also works with: Azure Blob Storage, Azure SQL DB
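
The pricing bullet can be made concrete with a small arithmetic sketch. The slide only says that job duration defines the price; the multiplicative model below (parallelism × duration × rate) and the rate value used in the example are assumptions for illustration, not actual Azure prices.

```python
def estimate_job_cost(parallelism, duration_hours, rate_per_unit_hour):
    """Estimate a job's cost as parallelism x duration x rate.

    All three inputs are supplied by the caller; `rate_per_unit_hour`
    is a hypothetical placeholder, not a published price.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    if duration_hours < 0 or rate_per_unit_hour < 0:
        raise ValueError("duration and rate must be non-negative")
    return parallelism * duration_hours * rate_per_unit_hour


# Hypothetical job: parallelism 10, running for half an hour,
# at a made-up rate of 2.0 currency units per unit-hour.
cost = estimate_job_cost(parallelism=10, duration_hours=0.5, rate_per_unit_hour=2.0)
```

Under this assumed model, halving the run time at the same parallelism halves the cost, which is why query tuning matters for the bill.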
Azure Machine Learning
• Machine Learning: ‘the study of systems that can learn from data’
• ML Studio: ‘IDE’ for designing, training and validating models
• Data consumption via upload or Azure Storage (no ADLS yet)
• Operationalizing via web service integration
• Retraining model via API
Supervised learning
• Infer a target function from a labeled dataset
• Example algorithms:
• Classification – identifying which category an observation belongs to
• Regression – estimating the relationships among variables
• Dataset split needed
• Training
• Validation
• Test
(Diagram: data and target train a model; the model predicts the target for test data)
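
The training / validation / test split can be sketched in plain Python as follows. The 60/20/20 proportions, the fixed seed and the toy dataset are assumptions for this example; the slides do not prescribe specific ratios.

```python
import random


def split_dataset(rows, train_frac=0.6, validation_frac=0.2, seed=42):
    """Shuffle rows and split them into training, validation and test sets.

    The fractions are illustrative defaults; whatever remains after the
    training and validation portions becomes the test set.
    """
    if train_frac <= 0 or validation_frac <= 0 or train_frac + validation_frac >= 1:
        raise ValueError("fractions must be positive and sum to less than 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_validation = int(len(shuffled) * validation_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]
    return train, validation, test


# Toy labeled dataset: (feature, label) pairs invented for the example.
labeled_rows = [(x, x % 2) for x in range(10)]
train, validation, test = split_dataset(labeled_rows)
```

The model is fitted on `train`, tuned against `validation`, and evaluated once on `test`, so the final score comes from rows the model has never seen.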
Unsupervised learning
• Identify naturally occurring patterns in data
• Example algorithms:
• Clustering – grouping a set of objects into clusters based on similarity
• Outlier detection – identification of items which do not conform to an expected pattern
• No data split needed
(Diagram: data (no labels) → model → naturally occurring hidden structure)
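
As one simple illustration of outlier detection on unlabeled data, the sketch below flags values whose z-score exceeds a threshold. The z-score rule, the threshold of 2.0 and the sample readings are assumptions chosen for this example; this is not necessarily the technique used by Azure Machine Learning.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations
    from the mean (a basic z-score rule; no labels are required)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing deviates from the pattern.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one value that breaks the pattern.
readings = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 25.0]
outliers = find_outliers(readings)
```

No training/test split is involved: the rule is derived from, and applied to, the same unlabeled data.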
Drinks

Editor's Notes

  • #7 Cortana Analytics is a fully managed big data and advanced analytics suite that transforms your data into intelligent action. It is a comprehensive suite that brings together technologies from across Microsoft and offers fast, flexible deployment with a simple monthly subscription to reduce time and cost. Cortana Analytics builds on years of research and innovation in technology and infrastructure for advanced analytics, including machine learning, big data storage and processing in the cloud, perceptual intelligence (e.g. vision, face and speech recognition), and integration with Cortana, Microsoft’s personal digital assistant. The goal is to help enterprise customers make better, faster decisions and accelerate their speed of business.