Big Data Analytics from Azure
Data Platform to Power BI
Azure Batch, Azure Data Lake, Azure HDInsight, ML, Power BI
March 22, 2017
Roy Kim
@RoyKimYYZ
roykimtoronto@gmail.com
Agenda
• Overview of Big Data + Azure + Data Insights
• Job Postings demo solution architecture & implementation
• Mobile Demo with Power BI
• Q&A
Author: Roy Kim
Bio
• Roy Kim
• 14+ Years of Microsoft Technology Solutions
• .NET, SharePoint, BI, Office 365, Azure Solutions
• IT Consultant
• University of Toronto – Computer Science Degree
Data to Insight
[Diagram: Big Data, Data Platform Technologies, Solution, Data Insights]
Job Postings Demo Solution
[Diagram: Job Postings; Azure Data Platform (Data Lake, HDInsight, SQL, Power BI); job trends and analysis]
References:
https://softwarestrategiesblog.com/2015/09/05/10-ways-big-data-is-revolutionizing-supply-chain-management
Big Data Spectrum
[Figure: big data spectrum; surviving label: Veracity]
References:
https://softwarestrategiesblog.com/2015/09/05/10-ways-big-data-is-revolutionizing-supply-chain-management
Azure Cloud Platform
References:
https://blogs.technet.microsoft.com/cansql/2015/06/03/microsoft-data-platform-overview/
Many services, growing and maturing
Azure Data Platform
References:
https://blogs.technet.microsoft.com/cansql/2015/06/03/microsoft-data-platform-overview/
Two Illustrations:
Analytics Platform Gartner Magic Quadrant
Data Analytics
Job Postings Data Set
Volume
• Many national job sites
• New job postings daily
• Metadata and full text
Velocity
• New job postings created every minute
Variety
• Semi-structured
  • Job Title
  • Location
  • Company
• Unstructured
  • Job Description
Veracity
• Incomplete/Imprecise
  • Salary, per hour
  • FT, PT, Temp, Contract, Seasonal
  • Main profession
Power BI – Job Postings Demo Reports
Job Postings Big Data Solution Architecture
[Architecture diagram. Tiers: Storage, Services, Presentation.
Data sources: Internet data sets (REST/HTML).
Components: Azure Batch (.NET Console App), Storage Account (Blob Store, HDFS), Azure Data Lake Store (WebHDFS), Azure Data Lake Analytics (U-SQL), Azure HDInsight (Hive, Pig, Sqoop), Azure SQL Database, SQL Data Warehouse, Analysis Services (Tabular), Machine Learning (ML Studio), Azure Data Factory (pipeline), Azure Active Directory, Application Insights, Visual Studio Online.
Presentation: visualization and reporting tools on mobile, desktop, browser, and service.
Users: business users, report builders, data analysts.]
Job Postings from Internet Job Boards
• Web sites that offer APIs
• Use any server-side programming language to retrieve data, such as .NET, Java, Node.js, etc.
• If no APIs, consider HTML web page scraping
REST API
HTTP endpoints typically return data in JSON or XML format.
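As a minimal sketch of handling such a response, the Python snippet below parses a JSON payload into flat records. The payload shape and the field names (`results`, `title`, `company`, `location`) are hypothetical and chosen only for illustration; a real job board API defines its own schema.

```python
import json

# Hypothetical payload; real job board APIs define their own schema.
SAMPLE_RESPONSE = """
{"results": [
  {"title": "Data Analyst", "company": "Contoso", "location": "Toronto, ON"},
  {"title": "BI Developer", "company": "Fabrikam"}
]}
"""


def parse_postings(payload: str) -> list:
    """Extract the fields of interest, tolerating missing keys."""
    data = json.loads(payload)
    postings = []
    for item in data.get("results", []):
        postings.append({
            "title": item.get("title", ""),
            "company": item.get("company", ""),
            "location": item.get("location", ""),
        })
    return postings


if __name__ == "__main__":
    for posting in parse_postings(SAMPLE_RESPONSE):
        print(posting)
```

Defaulting missing keys to an empty string reflects the incomplete data noted under Veracity: postings often lack some fields.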
HTML Web Page Scraping
HTML Agility Pack assists in parsing the Document Object Model (DOM) for data points.
https://www.nuget.org/packages/HtmlAgilityPack
It supports XPath for traversing the DOM.
E.g. doc.DocumentNode.SelectSingleNode("//div[@id='Total Sales']")
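The same XPath idea can be sketched in Python using only the standard library. This is an illustration of the lookup, not the HTML Agility Pack API: `xml.etree.ElementTree` supports only a subset of XPath and requires well-formed markup, so the page fragment below is a hypothetical, cleaned-up example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment. Real-world HTML is often not
# valid XML and would need a tolerant parser such as HTML Agility Pack.
PAGE = """
<html><body>
  <div id="Header">Job Board</div>
  <div id="Total Sales">1,234</div>
</body></html>
"""


def select_text_by_id(markup: str, element_id: str):
    """Return the text of the first <div> whose id matches, or None."""
    root = ET.fromstring(markup.strip())
    # ".//div[@id='...']" searches all descendants for a matching attribute.
    node = root.find(f".//div[@id='{element_id}']")
    return node.text if node is not None else None


if __name__ == "__main__":
    print(select_text_by_id(PAGE, "Total Sales"))
```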
Job Postings Data Collector .NET Console Application
.NET console application that reads data from the internet and stores it in Azure Storage accounts
• Concurrent requests to job postings public API and HTML pages
• Multi-threaded to increase speed and throughput
• Parses HTML pages and JSON
• Stores JSON files directly in Azure Data Lake Store with the ADLS .NET SDK
• Uses Azure Application Insights to log trace and exception messages
• To store files in Azure Data Lake Store, the .NET application must authenticate as an Azure AD service principal with the appropriate access control
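The concurrent-request pattern described above can be sketched as follows. This is a Python illustration rather than the demo's .NET code; `fetch` is a hypothetical stand-in that returns a canned string instead of calling the network, and the URLs are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(url: str) -> str:
    """Stand-in for an HTTP GET; a real collector would call the network here."""
    return f"payload from {url}"


def collect(urls, max_workers: int = 4) -> dict:
    """Fetch all URLs concurrently and map each URL to its payload."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so zip pairs each URL with its result.
        return dict(zip(urls, pool.map(fetch, urls)))


if __name__ == "__main__":
    # Hypothetical endpoints, used only to drive the example.
    pages = [f"https://example.com/jobs?page={n}" for n in range(1, 4)]
    for url, payload in collect(pages).items():
        print(url, "->", payload)
```

Threads suit this workload because the collector spends most of its time waiting on network responses rather than computing.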
Job Postings Data Collector App Architecture
Azure Application Insights
Application Insights Core API. This package provides core functionality for transmission of all
Application Insights Telemetry Types and is a dependent package for all other Application Insights
packages.
Azure Batch
A managed Azure service that executes command-line applications.
For batch processing or batch computing: running a large volume of similar tasks to get some desired result.
Commonly used by organizations that regularly process, transform, and analyze large volumes of data.
Simply put, a set of Azure Virtual Machines running a console application to process data, optionally on a recurring schedule and in parallel.
References:
https://github.com/Microsoft/azure-docs/blob/master/articles/batch/batch-technical-overview.md
Azure Batch – Demo Implementation
Azure Batch runs the console application on a daily schedule on one node (a 2-core virtual machine).
To run the console application in parallel across compute nodes, the demo used the sample ParallelTasks .NET solution, which uses the Azure Batch Client SDK.
https://github.com/Azure/azure-batch-samples/tree/master/CSharp/ArticleProjects/ParallelTasks
Azure Batch is an architecture option that supports data collection in terms of velocity and volume.
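One way to picture the parallel option is to split the list of data sources into one bucket per task, so that each compute node runs the console application against its own bucket. The sketch below shows only that partitioning step in plain Python; it does not use the Azure Batch SDK, and the source names are hypothetical.

```python
def partition(items, task_count: int) -> list:
    """Round-robin items into task_count buckets, one per parallel task."""
    if task_count < 1:
        raise ValueError("task_count must be at least 1")
    buckets = [[] for _ in range(task_count)]
    for index, item in enumerate(items):
        buckets[index % task_count].append(item)
    return buckets


if __name__ == "__main__":
    # Hypothetical job board identifiers.
    sources = ["board-a", "board-b", "board-c", "board-d", "board-e"]
    for task_number, bucket in enumerate(partition(sources, 2)):
        print(f"task {task_number}: {bucket}")
```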
Azure Data Lake
Intended for storing data in its raw format for future analysis, processing, or data modelling.
For developers, data scientists, and analysts to store data of any size, shape, and speed.
To do all types of processing and analytics across different platforms and languages.
Extract and load, with minimal transformations.
To manage data characterized by variety, velocity, and volume.
Two Components
1. Azure Data Lake Store
2. Azure Data Lake Analytics
References:
https://azure.microsoft.com/en-us/solutions/data-lake/
Azure Data Lake Store
• Azure Data Lake Store is a hyper-scale repository for big data analytic workloads. It enables you to capture data of any size, type, and ingestion speed in one single place for operational and exploratory analytics.
• Azure Data Lake Store is an Apache Hadoop file system compatible with Hadoop Distributed File System (HDFS).
• Can be accessed from Hadoop (available with an HDInsight cluster) using the WebHDFS-compatible REST APIs.
References:
https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview
Azure Data Lake Store
Use Cases
• Store social media posts, log files, sensor data
• Store corporate data such as relational databases (as flat files)
References:
https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview
Azure Data Lake Analytics
• Azure Data Lake Analytics is built to make big data analytics easy.
• Focus on writing, running, and managing jobs rather than deploying, configuring, and tuning distributed infrastructure.
• Write queries to transform your data and extract valuable insights. The analytics service can handle jobs of any scale instantly by setting the dial for how much power you need.
• U-SQL: a big data query language that combines SQL with C#
• "Schema on read"
• Pay only while your job is running, which makes it cost-effective.
• Data Collector app stores .json files in respective folders
• U-SQL script logic:
  • Reads 1000s of JSON files in a given folder
  • Outputs to one TSV (tab-delimited) file
  • Creates tables to schematize the TSV files
  • Queries against tables to analyze or transform to a new output file
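The first two steps of the script logic (many JSON files in, one tab-delimited file out) can be sketched in Python for readers who want to try the idea locally. This is not U-SQL and not the demo's script; it assumes one JSON object per file, and the column names in `FIELDS` are hypothetical.

```python
import csv
import json
from pathlib import Path

FIELDS = ["title", "company", "location"]  # hypothetical column names


def json_folder_to_tsv(folder: Path, output: Path) -> int:
    """Merge every *.json file in folder into one TSV; return rows written."""
    rows = 0
    with output.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(FIELDS)  # header row
        for path in sorted(folder.glob("*.json")):
            # Assumes each file holds a single JSON object.
            record = json.loads(path.read_text(encoding="utf-8"))
            writer.writerow([record.get(name, "") for name in FIELDS])
            rows += 1
    return rows
```

In Azure Data Lake Analytics the same work is distributed across many compute nodes, which is what makes thousands of input files practical.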
References:
https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview
Azure Data Lake Analytics – Demo Implementation
• U-SQL script: processes JSON files into a tab-delimited file
[Job screenshot: number of JSON files, single output file, 50 compute nodes, 3.4 minutes duration]
Azure HDInsight
Hadoop refers to an ecosystem of open-source software that is a framework for
distributed processing, storing, and analysis of big data sets on clusters of
commodity computer hardware.
Azure HDInsight makes the Hadoop components from the Hortonworks Data
Platform (HDP) distribution available in Azure, deploys managed clusters with high
reliability and availability, and provides enterprise-grade security and governance
with Active Directory.
HDInsight offers cluster types including Hadoop, HBase, Spark, Kafka, Interactive Hive, and Storm, as well as customized clusters.
Supports integration with BI tools such as Power BI, Excel, SQL Server Analysis
Services, and SQL Server Reporting Services.
Azure HDInsight – Demo Implementation
Hadoop cluster type
Data source: Windows Azure Storage account
Data Lake Store access: Data Lake Store account
Hive tables
• JobPostings internal table: schema definition; data loaded from the .TSV file in Azure Data Lake Store into the Hadoop cluster's WASB storage
• JobPostings external table: data referenced in the .TSV file in Azure Data Lake Store, which is external to the HDInsight storage account
Azure HDInsight – Demo Implementation
Considerations
• To manage compute costs, script the provisioning and de-provisioning of the cluster.
• While a cluster is running, execute scripts and query the data into self-service BI tools and other data warehouses.
• Compared with HDInsight, ADL Analytics may be more cost-effective since it is pay-per-use at a more granular level: number of nodes and execution time. E.g. running against 100 nodes may cost a few dollars per minute in ADL Analytics, whereas 13 nodes of a small VM size in HDInsight may cost a few dollars an hour.
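The trade-off is easier to see as arithmetic: a per-job service bills nodes times run time, while a cluster bills nodes times the hours it stays provisioned. The sketch below uses placeholder rates (`ADLA_RATE`, `HDI_RATE`) picked only to be roughly in line with the figures quoted above; they are assumptions, not actual Azure pricing.

```python
def job_cost(nodes: int, minutes: float, rate_per_node_hour: float) -> float:
    """Cost of a job billed per node for the time it actually runs."""
    return nodes * (minutes / 60.0) * rate_per_node_hour


def cluster_cost(nodes: int, hours_provisioned: float,
                 rate_per_node_hour: float) -> float:
    """Cost of a cluster billed for every hour it stays provisioned."""
    return nodes * hours_provisioned * rate_per_node_hour


# Placeholder rates in dollars per node-hour; NOT real Azure pricing.
ADLA_RATE = 2.00
HDI_RATE = 0.25

if __name__ == "__main__":
    # A short job on many nodes versus a small cluster left up for a workday.
    print(f"per-job:     ${job_cost(50, 3.4, ADLA_RATE):.2f}")
    print(f"per-cluster: ${cluster_cost(13, 8, HDI_RATE):.2f}")
```

The comparison flips if the cluster is de-provisioned promptly or the per-job workload runs for hours, which is why scripting cluster provisioning matters.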
Azure SQL Database
A relational database-as-a-service in the cloud built on the Microsoft SQL Server
engine
No need to manage the infrastructure.
Scale up or down based on Database Transaction Units (DTUs).
1 TB storage maximum
Can be used as a simple data warehouse.
Azure SQL Database – Demo Implementation
Developed a simple data warehouse model
Job Postings data loaded from ADLS
Star schema
Added a date dimension table
Table of the number of jobs for each province by a date hierarchy
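A minimal sketch of this kind of model, using Python's built-in `sqlite3` in place of Azure SQL Database. The table and column names (`dim_date`, `fact_job_posting`, `province`) and the sample rows are hypothetical; the demo's actual schema is not shown in the slides, and a fuller star schema would likely move province into its own dimension.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create an in-memory fact table plus a date dimension with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,  -- e.g. 20170322
            year INTEGER, month INTEGER, day INTEGER
        );
        CREATE TABLE fact_job_posting (
            posting_id INTEGER PRIMARY KEY,
            date_key INTEGER REFERENCES dim_date(date_key),
            province TEXT
        );
    """)
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?, ?)",
        [(20170321, 2017, 3, 21), (20170322, 2017, 3, 22)],
    )
    conn.executemany(
        "INSERT INTO fact_job_posting VALUES (?, ?, ?)",
        [(1, 20170321, "ON"), (2, 20170322, "ON"), (3, 20170322, "BC")],
    )
    conn.commit()
    return conn


def jobs_by_province_and_month(conn: sqlite3.Connection) -> list:
    """Count postings per province, rolled up to the year/month level."""
    return conn.execute("""
        SELECT f.province, d.year, d.month, COUNT(*) AS jobs
        FROM fact_job_posting AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY f.province, d.year, d.month
        ORDER BY f.province
    """).fetchall()


if __name__ == "__main__":
    for row in jobs_by_province_and_month(build_demo_warehouse()):
        print(row)
```

Grouping on different date dimension columns (year, month, day) gives the date hierarchy that reporting tools drill through.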
Azure Data Factory
• Cloud-based data integration service that orchestrates and automates the movement and transformation of data.
• Create data pipelines that move and transform data, and then run the pipelines on a specified schedule (hourly, daily, weekly, etc.)
Azure Data Factory
Category    Data store                     Supported as a source   Supported as a sink
Azure       Azure Blob storage             ✓                       ✓
            Azure Data Lake Store          ✓                       ✓
            Azure SQL Database             ✓                       ✓
            Azure SQL Data Warehouse       ✓                       ✓
            Azure Table storage            ✓                       ✓
            Azure DocumentDB               ✓                       ✓
            Azure Search Index                                     ✓
Databases   SQL Server*                    ✓                       ✓
            Oracle*                        ✓                       ✓
            MySQL*                         ✓
            DB2*                           ✓
            Teradata*                      ✓
            PostgreSQL*                    ✓
            Sybase*                        ✓
            Cassandra*                     ✓
            MongoDB*                       ✓
            Amazon Redshift                ✓
File        File System*                   ✓                       ✓
            HDFS*                          ✓
            Amazon S3                      ✓
            FTP                            ✓
Others      Salesforce                     ✓
            Generic ODBC*                  ✓
            Generic OData                  ✓
            Web Table (table from HTML)    ✓
            GE Historian*                  ✓
Azure Machine Learning – Demo Implementation
Predicting salary for a given set of parameters such as job title and location
Power BI Mobile
Power BI App Service
The main features of your Power BI service UI:
1. Navigation bar
2. Dashboard with tiles
3. Q&A question box
4. Help and feedback buttons
5. Dashboard title
6. Office 365 app launcher
7. Power BI home buttons
8. Additional dashboard actions
Key Mobile Scenarios
• Frequently updated and accessed reports
• Minutes, hours, daily, weekly
• Fast and easy access of reports and dashboards
• IoT and sensor data
• Retail and customer analytics
• Team and organizational performance and productivity, e.g. ticket management
• Collaborative analysis and decision making
• Not always in front of a large screen device
Mobile App iOS Key Features & Demo
• Navigation
• Dashboards and Reports
• Responsive design
• Visualization interaction
• Sharing
• Annotations
• Q&A
• Alerts
• Favourites
Architecture
[Diagram: Governance, Security, Capacity, Business Processes, Data, Performance, Application Design, Operations]
Security Architecture
[Architecture diagram: the solution architecture annotated with the credential type used on each connection.
Credential types shown: API key, AAD app service principal, AAD user, SQL account, storage account key.
Actors: BI developers, IT ops, end users.
Components: Internet data sources (REST/HTML), Azure Batch (.NET Console App), Storage Account (Blob Store, HDFS), Azure Data Lake Store, Azure Data Lake Analytics (U-SQL), Azure HDInsight (Hive, Pig, Sqoop), Azure SQL Database, SQL Data Warehouse, Analysis Services (preview, Tabular), Azure Data Factory (pipeline), Azure Active Directory, Application Insights, Visual Studio Online, visualization and reporting tools (desktop).]
Data Processing & Formats
[Diagram: data format at each stage of the pipeline, by service.
Formats shown: HTML, JSON, TSV, Hive external table, Hive internal table, relational DB table.
Services: Internet data sources (REST/HTML), Azure Batch (.NET Console App), Azure Data Lake Analytics (U-SQL), Azure HDInsight (Hive, Pig, Sqoop), Azure SQL Database, SQL Data Warehouse, Analysis Services (preview, Tabular), Azure Data Factory (pipeline).]
Closing Remarks
• Cloud services such as Azure Data Platform provide new capabilities in data analytics in terms of scale, cost, and agility.
• Azure Data Lake is a productive option for organizations new to Hadoop. Still, continue to plan for other Hadoop offerings that best fit other scenarios.
• Many Azure services fit together to make the appropriate solution: SaaS, PaaS, IaaS, data, app, operational, etc.
• As part of planning and design, be aware of the Microsoft roadmap and industry trends.
Q&A
• @RoyKimYYZ
• roykimtoronto@gmail.com
roykim.ca
Appendix - Azure Data Lake Analytics
[Job screenshot: 50 DLAUs assigned for the job]
Appendix – BI Tooling

Azure AD App Proxy Login Scenarios with an On Premises Applications - TSPUG
 
Azure Key Vault with a PaaS Architecture and ARM Template Deployment
Azure Key Vault with a PaaS Architecture and ARM Template DeploymentAzure Key Vault with a PaaS Architecture and ARM Template Deployment
Azure Key Vault with a PaaS Architecture and ARM Template Deployment
 
Azure App Gateway and Log Analytics under Penetration Tests
Azure App Gateway and Log Analytics under Penetration TestsAzure App Gateway and Log Analytics under Penetration Tests
Azure App Gateway and Log Analytics under Penetration Tests
 
Applying Advanced Techniques to Azure Web Apps
Applying Advanced Techniques to Azure Web AppsApplying Advanced Techniques to Azure Web Apps
Applying Advanced Techniques to Azure Web Apps
 
Design and Configure Azure App Service Web Apps
Design and Configure Azure App Service Web AppsDesign and Configure Azure App Service Web Apps
Design and Configure Azure App Service Web Apps
 
SharePoint 2016 Hybrid Overview
SharePoint 2016 Hybrid OverviewSharePoint 2016 Hybrid Overview
SharePoint 2016 Hybrid Overview
 
SharePoint Hosted Add-in with AngularJS and Bootstrap
SharePoint Hosted Add-in with AngularJS and BootstrapSharePoint Hosted Add-in with AngularJS and Bootstrap
SharePoint Hosted Add-in with AngularJS and Bootstrap
 
Designing for SharePoint Provider Hosted Apps
Designing for SharePoint Provider Hosted AppsDesigning for SharePoint Provider Hosted Apps
Designing for SharePoint Provider Hosted Apps
 
Microsoft Azure For Solutions Architects
Microsoft Azure For Solutions ArchitectsMicrosoft Azure For Solutions Architects
Microsoft Azure For Solutions Architects
 
SharePoint 2013 Hosted App Presentation by Roy Kim
SharePoint 2013 Hosted App Presentation by Roy KimSharePoint 2013 Hosted App Presentation by Roy Kim
SharePoint 2013 Hosted App Presentation by Roy Kim
 
Networking For Application Developers by Roy Kim
Networking For Application Developers by Roy KimNetworking For Application Developers by Roy Kim
Networking For Application Developers by Roy Kim
 
SharePoint Saturday 2010 - SharePoint 2010 Content Organizer Feature
SharePoint Saturday 2010 - SharePoint 2010 Content Organizer FeatureSharePoint Saturday 2010 - SharePoint 2010 Content Organizer Feature
SharePoint Saturday 2010 - SharePoint 2010 Content Organizer Feature
 

Recently uploaded

SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 

Recently uploaded (20)

The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 

Big Data Analytics from Azure Cloud to Power BI Mobile

  • 1. Big Data Analytics from Azure Data Platform to Power BI: Azure Batch, Azure Data Lake, Azure HDInsight, ML, Power BI. March 22, 2017. Roy Kim, @RoyKimYYZ, roykimtoronto@gmail.com
  • 2. Agenda (Author: Roy Kim)
      • Overview of Big Data + Azure + Data Insights
      • Job Postings demo solution architecture & implementation
      • Mobile Demo with Power BI
      • Q&A
  • 3. Bio: Roy Kim
      • 14+ years of Microsoft technology solutions
      • .NET, SharePoint, BI, Office 365, Azure solutions
      • IT consultant
      • University of Toronto – Computer Science degree
  • 4. Data to Insight: Big Data → Data Platform Technologies → Solution → Data Insights
  • 5. Job Postings Demo Solution: Job Postings → Azure Data Platform → Data Lake, HDInsight, SQL, Power BI → Job trends, analysis. References: https://softwarestrategiesblog.com/2015/09/05/10-ways-big-data-is-revolutionizing-supply-chain-management
  • 9. Analytics Platform Gartner Magic Quadrant
  • 11. Job Postings Data Set
      • Volume: many national job sites; new job postings daily; metadata and full text
      • Velocity: new job postings created every minute
      • Variety: semi-structured (Job Title, Location, Company); unstructured (Job Description)
      • Veracity: incomplete/imprecise: salary, per hour; FT, PT, Temp, Contract, Seasonal; main profession
  • 12. Power BI – Job Postings Demo Reports
  • 13. Power BI – Job Postings Demo Reports
  • 14. Job Postings Big Data Solution Architecture (diagram)
      • Sources: Internet Data Sets (REST/HTML/..), collected by Azure Batch running a .NET Console App
      • Storage Tier: Storage Account Blob Store (HDFS/WebHDFS), Azure Data Lake Store, Azure SQL Database, SQL Data Warehouse
      • Services Tier: Azure Data Lake Analytics (U-SQL), Azure HDInsight (Hive, Pig, Sqoop), Azure Data Factory (Pipeline), Machine Learning (ML Studio), Analysis Services (Tabular), Azure Active Directory, Application Insights, Visual Studio Online
      • Presentation Tier: Visualization/Reporting Tools on Mobile, Desktop, and Browser, for Business Users, Report Builders, and Data Analysts
  • 15. Job Postings from Internet Job Boards
      • Web sites that offer APIs: use any server-side programming language (.NET, Java, Node.js, etc.) to retrieve data
      • If no APIs, consider HTML web page scraping
      • REST API: HTTP endpoints typically return JSON or XML data formats
      • HTML web page scraping: HTML Agility Pack assists in parsing the Document Object Model (DOM) for data points (https://www.nuget.org/packages/HtmlAgilityPack); its HTML parsing supports XPath to traverse the DOM, e.g. doc.DocumentNode.SelectSingleNode("//div[@id='Total Sales']")
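
The XPath lookup on this slide can be sketched in Python. This is a minimal illustration, not the presenter's .NET code: it uses the standard library's xml.etree.ElementTree, which accepts only well-formed markup and a subset of XPath, so real job board pages would likely need a more tolerant HTML parser. The PAGE fragment and the select_single_node helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment; real job board HTML is usually messier.
PAGE = """
<html>
  <body>
    <div id="Total Sales">1,234</div>
    <div id="Other">ignored</div>
  </body>
</html>
""".strip()


def select_single_node(markup: str, xpath: str):
    """Return the text of the first element matching the XPath, or None."""
    root = ET.fromstring(markup)
    node = root.find(xpath)
    return None if node is None else (node.text or "").strip()


# ElementTree paths are relative to the root element, hence the leading "."
total = select_single_node(PAGE, ".//div[@id='Total Sales']")
print(total)
```

Returning None for a missing node keeps the caller's handling explicit, which matters when a site changes its page layout.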
  • 16. Job Postings Data Collector .NET Console Application: a .NET console application that reads data from the internet and stores it into Azure Storage accounts
      • Concurrent requests to job postings public API and HTML pages
      • Multi-threaded to increase speed and throughput
      • Parses HTML pages and JSON
      • Stores JSON files directly into Azure Data Lake Store with the ADLS .NET SDK
      • Leverages Azure Application Insights for logging trace and exception error messages
      • To store files into Azure Data Lake Store, the .NET application needs to authenticate with an Azure AD service principal that has the appropriate access control
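
The collector's concurrent fetch-and-store loop might look roughly like the following Python sketch. It is an illustration under stated assumptions, not the demo's .NET code: fetch_posting is a hypothetical stand-in for an HTTP call to a job board, and a local temporary folder stands in for Azure Data Lake Store.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fetch_posting(posting_id: int) -> dict:
    """Hypothetical stand-in for an HTTP call to a job board API."""
    return {"id": posting_id, "title": f"Job {posting_id}", "location": "Toronto"}


def collect(posting_ids, out_dir: Path, workers: int = 4) -> list:
    """Fetch postings concurrently and write one JSON file per posting."""
    out_dir.mkdir(parents=True, exist_ok=True)

    def fetch_and_store(posting_id: int) -> Path:
        posting = fetch_posting(posting_id)
        path = out_dir / f"posting-{posting_id}.json"
        path.write_text(json.dumps(posting), encoding="utf-8")
        return path

    # pool.map returns results in input order, even though the work overlaps.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_and_store, posting_ids))


# A local folder stands in for the data lake in this sketch.
out = Path(tempfile.mkdtemp()) / "jobpostings"
written = collect(range(1, 6), out)
print(len(written))
```

Threads suit this workload because the time is spent waiting on network responses rather than on computation.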
  • 17. Job Postings Data Collector App Architecture
  • 18. Azure Application Insights: Application Insights Core API. This package provides core functionality for transmission of all Application Insights telemetry types and is a dependency of all other Application Insights packages.
  • 19. Azure Batch
      • A managed Azure service for executing command-line applications.
      • Intended for batch processing (batch computing): running a large volume of similar tasks to get a desired result.
      • Commonly used by organizations that regularly process, transform, and analyze large volumes of data.
      • Put simply, a set of Azure virtual machines running a console application to process data, optionally on a recurring schedule and in parallel.
      • References: https://github.com/Microsoft/azure-docs/blob/master/articles/batch/batch-technical-overview.md
  • 20. Azure Batch – Demo Implementation
      • Azure Batch runs the console application on a daily schedule against one node (virtual machine, 2 cores).
      • To run the console application in parallel across compute nodes, the demo used the sample ParallelTasks .NET solution, which uses the Azure Batch client SDK: https://github.com/Azure/azure-batch-samples/tree/master/CSharp/ArticleProjects/ParallelTasks
      • Azure Batch is an architecture option to support data collection in terms of velocity and volume.
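
Running the collector in parallel comes down to splitting the work into tasks, one command line each. The Python sketch below shows only that partitioning step; it does not call the Azure Batch SDK, and the board names and the collector.exe command are hypothetical.

```python
def partition(items, task_count):
    """Split items round-robin into at most task_count non-empty chunks."""
    chunks = [[] for _ in range(task_count)]
    for index, item in enumerate(items):
        chunks[index % task_count].append(item)
    return [chunk for chunk in chunks if chunk]


# Hypothetical job board names; each chunk becomes one task's arguments.
boards = ["board-a", "board-b", "board-c", "board-d", "board-e"]
tasks = partition(boards, 2)

# Hypothetical executable name for the console application.
command_lines = ["collector.exe " + " ".join(chunk) for chunk in tasks]
print(command_lines)
```

Round-robin assignment keeps the chunks within one item of each other in size, so no single node ends up with most of the work.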
  • 21. Azure Data Lake
      • Intended for storing data in its raw format for future analysis, processing, or data modelling.
      • For developers, data scientists, and analysts to store data of any size, shape, and speed, and to do all types of processing and analytics across different platforms and languages.
      • Extract and load, with minimal transformations.
      • Manages data characterized by variety, velocity, and volume.
      • Two components: 1. Azure Data Lake Store 2. Azure Data Lake Analytics
      • References: https://azure.microsoft.com/en-us/solutions/data-lake/
  • 22. Azure Data Lake Store
      • Azure Data Lake Store is a hyper-scale repository for big data analytic workloads. Azure Data Lake enables you to capture data of any size, type, and ingestion speed in one single place for operational and exploratory analytics.
      • The Azure Data Lake Store is an Apache Hadoop file system compatible with Hadoop Distributed File System (HDFS).
      • Can be accessed from Hadoop (available with an HDInsight cluster) using the WebHDFS-compatible REST APIs.
      • References: https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview
  • 23. Azure Data Lake Store Use Cases
      • Store social media posts, log files, sensor data
      • Store corporate data such as relational databases (as flat files)
      • References: https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview
  • 24. Azure Data Lake Analytics
      • Azure Data Lake Analytics is built to make big data analytics easy.
      • Focus on writing, running, and managing jobs instead of deploying, configuring, and tuning hardware or operating distributed infrastructure.
      • Write queries to transform your data and extract valuable insights. The analytics service can handle jobs of any scale instantly by setting the dial for how much power you need.
      • U-SQL: a big data query language that resembles SQL + C#
      • "Schema on read"
      • Pay for your job only while it is running, making it cost-effective.
      • The Data Collector app stores .json files in respective folders.
      • U-SQL script logic: reads 1000s of JSON files in a given folder; outputs to one TSV (tab-delimited) file; creates tables to schematize the TSV files; queries against the tables to analyze or transform to a new output file.
      • References: https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-overview
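
The first two steps of the script logic above (read many JSON files in a folder, write one tab-delimited file) can be sketched in Python. This is a local stand-in for illustration, not U-SQL and not the demo's script; the FIELDS column list and the sample postings are assumptions.

```python
import csv
import json
import tempfile
from pathlib import Path

# Assumed columns, borrowed from the semi-structured fields on the data set slide.
FIELDS = ["title", "location", "company"]


def json_folder_to_tsv(folder: Path, tsv_path: Path) -> int:
    """Merge every *.json file in folder into one TSV file; return the row count."""
    rows = 0
    with tsv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(FIELDS)
        for path in sorted(folder.glob("*.json")):
            posting = json.loads(path.read_text(encoding="utf-8"))
            # Missing fields become empty strings.
            writer.writerow([posting.get(field, "") for field in FIELDS])
            rows += 1
    return rows


# Demo with two hypothetical postings in a temporary folder.
folder = Path(tempfile.mkdtemp())
(folder / "1.json").write_text(
    json.dumps({"title": "Developer", "location": "Toronto", "company": "Contoso"}),
    encoding="utf-8",
)
(folder / "2.json").write_text(
    json.dumps({"title": "Analyst", "location": "Ottawa"}), encoding="utf-8"
)
tsv_file = folder / "jobpostings.tsv"
row_count = json_folder_to_tsv(folder, tsv_file)
print(row_count)
```

The second posting has no company field, which shows how incomplete records still produce a row with a fixed column count.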
  • 25. Azure Data Lake Analytics – Demo Implementation: U-SQL script that processes JSON files into a tab-delimited file
  • 26. Azure Data Lake Analytics – Demo Implementation: # of JSON files; single output file; 50 compute nodes; 3.4 mins duration
  • 27. Azure HDInsight
      • Hadoop refers to an ecosystem of open-source software that is a framework for distributed processing, storing, and analysis of big data sets on clusters of commodity computer hardware.
      • Azure HDInsight makes the Hadoop components from the Hortonworks Data Platform (HDP) distribution available in Azure, deploys managed clusters with high reliability and availability, and provides enterprise-grade security and governance with Active Directory.
      • HDInsight offers cluster types including Hadoop, HBase, Spark, Kafka, Interactive Hive, Storm, and customized clusters.
      • Supports integration with BI tools such as Power BI, Excel, SQL Server Analysis Services, and SQL Server Reporting Services.
  • 28. Azure HDInsight – Demo Implementation
      • Cluster type: Hadoop
      • Data source: Windows Azure Storage account
      • Data Lake Store access: Data Lake Store account
      • Hive table JobPostings (internal): table schema definition; data loaded from the Azure Data Lake Store .TSV file into the Hadoop cluster's WASB storage
      • Hive table JobPostings (external): data referenced in the Azure Data Lake Store .TSV file, which is external to the HDInsight storage account
  • 29. Azure HDInsight – Demo Implementation Considerations
      • To manage the compute costs, script the provisioning and de-provisioning of the cluster.
      • While a cluster is running, execute scripts and query the data into self-service BI tools and into other data warehouses.
      • Compared with HDInsight, Azure Data Lake Analytics may be more cost-effective because it is pay-per-use at a more granular level (number of nodes and execution time). E.g. running against 100 nodes may cost a few dollars per minute in ADL Analytics, whereas in HDInsight 13 nodes of a small VM size may cost a few dollars an hour.
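
The cost trade-off can be made concrete with a little arithmetic. The Python sketch below uses the 50-node, 3.4-minute job from the demo slide; both hourly rates are hypothetical placeholders chosen only to match the "few dollars" orders of magnitude above, not actual Azure prices.

```python
def adla_job_cost(nodes: int, minutes: float, rate_per_node_hour: float) -> float:
    """Pay-per-job cost: nodes x run time in hours x hourly rate per node."""
    return nodes * (minutes / 60.0) * rate_per_node_hour


def cluster_cost(hours_running: float, rate_per_cluster_hour: float) -> float:
    """A cluster is billed for every hour it is provisioned, busy or idle."""
    return hours_running * rate_per_cluster_hour


ADLA_RATE = 2.0     # hypothetical $ per node-hour
CLUSTER_RATE = 3.0  # hypothetical $ per hour for the whole cluster

job = adla_job_cost(nodes=50, minutes=3.4, rate_per_node_hour=ADLA_RATE)
always_on = cluster_cost(hours_running=24, rate_per_cluster_hour=CLUSTER_RATE)
scripted = cluster_cost(hours_running=1, rate_per_cluster_hour=CLUSTER_RATE)
print(round(job, 2), always_on, scripted)
```

Under these assumed rates a cluster left running all day costs far more than the single job, while a cluster provisioned for one hour and then removed is comparable, which is why the slide recommends scripting provisioning and de-provisioning.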
  • 30. Azure SQL Database
      • A relational database-as-a-service in the cloud built on the Microsoft SQL Server engine
      • No need to manage the infrastructure
      • Scale up or down based on Database Transaction Units (DTUs)
      • 1 TB storage maximum
      • Can be used as a simpler data warehouse
  • 31. Azure SQL Database – Demo Implementation
      • Developed a simple data warehouse modelling Job Postings data loaded from ADLS
      • Star schema
      • Added a date dimension table
      • Table of # of jobs for each province by a date hierarchy
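
A star schema with a date dimension, and the "jobs per province by date hierarchy" query, can be sketched with Python's built-in sqlite3 module. This is an illustration only: the demo used Azure SQL Database, and the table names, column names, and sample rows here are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year INTEGER, month INTEGER, day INTEGER
);
CREATE TABLE fact_job_posting (
    posting_id INTEGER PRIMARY KEY,
    date_key INTEGER REFERENCES dim_date(date_key),
    province TEXT,
    job_title TEXT
);
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?, ?)",
    [(20170301, 2017, 3, 1), (20170302, 2017, 3, 2), (20170401, 2017, 4, 1)],
)
conn.executemany(
    "INSERT INTO fact_job_posting VALUES (?, ?, ?, ?)",
    [(1, 20170301, "ON", "Developer"), (2, 20170302, "ON", "Analyst"),
     (3, 20170302, "BC", "Developer"), (4, 20170401, "ON", "Architect")],
)

# Number of jobs for each province, rolled up to the year/month level.
rows = conn.execute("""
    SELECT f.province, d.year, d.month, COUNT(*) AS job_count
    FROM fact_job_posting AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY f.province, d.year, d.month
    ORDER BY f.province, d.year, d.month
""").fetchall()
print(rows)
```

Grouping on the dimension's year and month columns rather than on the raw date is what lets a report drill up and down the date hierarchy.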
  • 32. Azure Data Factory
      • Cloud-based data integration service that orchestrates and automates the movement and transformation of data.
      • Create data pipelines that move and transform data, and then run the pipelines on a specified schedule (hourly, daily, weekly, etc.)
  • 33. Azure Data Factory: data stores supported as a source and/or a sink, by category
      • Azure: Azure Blob storage, Azure Data Lake Store, Azure SQL Database, Azure SQL Data Warehouse, Azure Table storage, Azure DocumentDB, Azure Search Index
      • Databases: SQL Server*, Oracle*, MySQL*, DB2*, Teradata*, PostgreSQL*, Sybase*, Cassandra*, MongoDB*, Amazon Redshift
      • File: File System*, HDFS*, Amazon S3, FTP
      • Others: Salesforce, Generic ODBC*, Generic OData, Web Table (table from HTML), GE Historian*
  • 34. Azure Machine Learning – Demo Implementation: predicting salary for a given set of parameters such as job title and location
  • 35. Azure Machine Learning – Demo Implementation
  • 37. Power BI App Service: the main features of the Power BI service UI
      • 1. Navigation bar
      • 2. Dashboard with tiles
      • 3. Q&A question box
      • 4. Help and feedback buttons
      • 5. Dashboard title
      • 6. Office 365 app launcher
      • 7. Power BI home buttons
      • 8. Additional dashboard actions
  • 38. Key Mobile Scenarios
      • Frequently updated and accessed reports (minutes, hours, daily, weekly)
      • Fast and easy access to reports and dashboards
      • IoT and sensor data
      • Retail and customer analytics
      • Team and organizational performance and productivity, e.g. ticket management
      • Collaborative analysis and decision making
      • Not always in front of a large-screen device
  • 39. Mobile App iOS Key Features & Demo
      • Navigation
      • Dashboards and Reports
      • Responsive design
      • Visualization interaction
      • Sharing
      • Annotations
      • Q&A
      • Alerts
      • Favourites
  • 41. Security Architecture (diagram)
      • Sources: Internet Data Sources (REST/HTML/..), Azure Batch (.NET Console App)
      • Storage Tier: Storage Account Blob Store (HDFS), Azure Data Lake Store, Azure SQL Database, SQL Data Warehouse
      • Services Tier: Azure Data Lake Analytics (U-SQL), Azure HDInsight (Hive, Pig, Sqoop), Azure Data Factory (Pipeline), Analysis Services (preview, Tabular), Azure Active Directory, Application Insights, Visual Studio Online
      • Credentials shown between components: API Key, AAD App Service Principal, AAD User, SQL account, Account Key
      • Roles: BI Developers, IT Ops, End Users
  • 42. Data Processing & Formats (diagram)
      • Services Tier: Internet Data Sources (REST/HTML/..), Azure Batch (.NET Console App), Azure Data Lake Analytics (U-SQL), Azure HDInsight (Hive, Pig, Sqoop), Azure SQL Database, SQL Data Warehouse, Analysis Services (preview, Tabular), Azure Data Factory (Pipeline)
      • Data Format: JSON, HTML, JSON, TSV, Hive Ext. Table, Hive Int. Table, Relational DB Table
  • 43. Closing Remarks
      • Cloud services such as the Azure Data Platform provide new capabilities in data analytics in terms of scale, cost, and agility.
      • Azure Data Lake is a productive option for organizations new to Hadoop, but continue to plan for other Hadoop offerings that best fit other scenarios.
      • Many Azure services fit together to make the appropriate solution: SaaS, PaaS, IaaS, Data, App, Operational, etc.
      • As part of planning and design, be aware of the Microsoft roadmap and industry trends.
  • 44. Q&A: @RoyKimYYZ, roykimtoronto@gmail.com, roykim.ca
  • 45. Appendix - Azure Data Lake Analytics: 50 assigned DLAU for job
  • 46. Appendix - Azure Data Lake Analytics