SQL Server 2014 Faster Insights from Any Data

Stéphane Fréchette
Data & Business Intelligence Solutions Architect | Consultant | Big Data | NoSQL | Data Science | Data Platform MVP
Friday May 9, 2014
Email: stephanefrechette@ukubu.com
Twitter: @sfrechette
Blog: stephanefrechette.com
Stéphane Fréchette
Founder, CEO | Strategic consultant
Microsoft SQL Server MVP
Session Overview
Excel BI | Capabilities
Microsoft Power BI for Office 365
1 in 4 enterprise customers on Office 365
1 Billion Office Users
Analyze Visualize Share Find
Q&A
Mobile
Discover
Scalable | Manageable | Trusted
Extend with Hybrid Cloud Solutions
Power Query, PowerPivot,
Power View, and Power Map
Powerful Self-Service BI with Excel 2013
Power Query
Enable self-service data discovery, query, transformation and mashup experiences for Information
Workers, via Excel and PowerPivot
Discovery and connectivity to a wide range of data sources, spanning volume as well as
variety of data.
Highly interactive and intuitive experience for rapidly and iteratively building queries over
any data source, any size.
Consistency of experience, and parity of query capabilities over all data sources.
Joins across different data sources; ability to create custom views over data that can then be
shared with team/department.
Power Query
Discover, combine, and refine Big Data, small data, and any data with Data
Explorer for Excel.
Data Sources
Windows Azure Marketplace
Windows Active Directory
Azure SQL Database
Azure HDInsight
Powerful Self-Service BI with Excel 2013
Introducing PowerPivot
PowerPivot for SharePoint
Powerful Self-Service BI with Excel 2013
Introducing Power View
Power View for Multidimensional Models
• Power View on Analysis Services via BISM
• Native support for DAX in Analysis Services
• Better flexibility: Choice of DAX on Tabular or Multidimensional (cubes)
Architecture
Internet Explorer
Analysis Services
BI Semantic Model
Tabular
SharePoint
(2010 or 2013)
Reporting
Services
Power View
Analysis Services
BI Semantic Model
Multidimensional
SQL Server Data Tools
SQL Server Data Tools
BI Semantic Model: Architecture
Third-party applications
Reporting Services
(Power View) Excel PowerPivot
Databases LOB Applications Files OData Feeds Cloud Services
SharePoint
Insights
Multidimensional-Tabular Mapping
BISM-MD Object                               Tabular Object
Cube                                         Model
Cube Dimension                               Table
Attributes (Key(s), Name)                    Columns
Measure Group                                Table
Measure                                      Measure
Measure without Measure Group                Within Table called "Measures"
Measure Group to Cube Dimension relationship Relationship
Perspective                                  Perspective
KPI                                          KPI
User/Parent-Child Hierarchies                Hierarchies
Powerful Self-Service BI with Excel 2013
Power Map for Microsoft Excel enables information workers to discover and share new insights
from geographical and temporal data through three-dimensional storytelling.
What Is Power Map?
Map Data
• Data in Excel
• Geo-Code
• 3D and 3 Visuals
Discover Insights
• Play over Time
• Annotate points
• Capture scenes
Share Stories
• Cinematic Effects
• Interactive Tours
• Share Workbook
Power Map: Steps to 3D insights
Share Stories
• Export to Video for Viral!
Power Map
Excel Add-in to Enhance Data Visualization
Power BI Site
Power BI for Office 365 | Capabilities
Corporate
Data Sources
Data Management Gateway
Enabling Corporate
OData Feeds
Enabling Excel Workbook
Data Refresh using
SharePoint Online
Enabling
Discovery in
Power Query
capabilities
Power BI Admin Center
Data Management
Gateway
Data Management Gateway - Conceptual
Power BI Admin Center
Allows IT to configure, manage
and monitor access to
corporate data sources.
Data Management Gateway
Connects to corporate data
sources and sends data to
Microsoft cloud services through a
secure channel (Service Bus).
Corporate Data Sources
The Gateway can connect to
a variety of data sources.
Secure Credential Store
All credentials used by
the gateway are stored
on-premises.
Data Management Gateway
Network Topology
MICROSOFT DATA CENTER
INTERNET
PERIMETER
NETWORK
INTRANET
Data Management
Gateway
Data Management
Gateway Cloud
Services
Customer network
Power Query
Outgoing connection to cloud services
(Registration, Regular Heartbeat, Data
Source definition requests)
Connect to
Corporate
OData feed
Data
Per Machine: Single gateway installed
Credential
Management
Saves
credentials
Corporate OData Feeds and
Data Management Gateway
Data Management
Gateway
Power Query
(1) Using Power Query Anna connects to OData feed
(URL: http://feedgwMyDB )
Example: ContosoAnna
(2) The Data Management
Gateway connects to SQL
Server using either Windows
account or Database account
setup by Patrick when creating
the feed
Example: DB1_Reader
(3) Returns Result
(4) Returns OData feed
Scheduled Refresh
Scenario: workbook is refreshed on schedule as configured by the author in BI Sites
• Scheduler runs in BI Azure and triggers refresh as configured in the BI Sites application
• The flow assumes the workbook has been added to Power BI, thus save back is done directly to SPO
• When refresh is called by BI Azure, SPO rehydrates the user identity and calls WAC in a back channel (i.e. redirect equivalent)
3. Refresh workbook
BI Azure
Office Web Apps
Service (WAC)
Excel Services
5. Get shadow
workbook refresh
Data Model
SPO
Azure Active
Directory(AAD)
OrgID, MSODS,ACS
Excel Services SOAP API
1. Verify user existence and license in MSODS and get
access token to target URL in SPO from ACS
2. Construct the user part of the access token, and trigger
refresh for a workbook on behalf of the scheduled refresh user
On-Prem
Data
Sources
Cloud Data
Sources
6. Get data from
cloud/on-prem
sources and re-
process the data
model
7. Save updated workbook to SPO
4. Power BI workbook?
On-premises Data Access from BI Azure
Scenario: Interactive refresh from Excel Web Access where the data source is on-premises
• For interactive refresh, shared data sources are configured in advance in the Power BI Admin Center
• For scheduled refresh, personal data sources can be configured by the workbook owner
Azure Active
Directory (AAD)
OrgID, MSODS,
ACS
BI Azure
Hybrid Proxy
ADO.NET
Provider
Discovery API Tenant
Configuration
SQL Azure
Hybrid Data Integration Service
Hybrid Proxy
Hybrid Delivery
1. Determine whether data
source is cloud or on-prem,
and retrieve registered ID
2. Authenticate &
retrieve tenant
information
3. Get registered
data source info
On-Prem
Cloud
4. Issue refresh query
Data Management Gateway
Windows Azure Service Bus
5. Send request to Gateway
(via Service Bus)
Hybrid Delivery
Client API
6. Read query request from
Service Bus queue
7. Retrieve data
source
credentials
Credential Manager
8. Run query and retrieve the data
9. Coordinate
transfer job
Azure Storage
(temporary)
10. Compress &
stream data in
multiple chunks
11. Receive & decompress
data
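Steps 10 and 11 above (compress and stream in multiple chunks, then receive and decompress) can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say which compression format or chunk size the gateway uses, so `zlib`, the 4 KB chunk size, and the function names `compress_in_chunks` and `receive_and_decompress` are all hypothetical.

```python
import zlib


def compress_in_chunks(payload: bytes, chunk_size: int = 64 * 1024):
    """Yield compressed pieces of the payload, feeding one input chunk at a time."""
    compressor = zlib.compressobj()
    for start in range(0, len(payload), chunk_size):
        piece = compressor.compress(payload[start:start + chunk_size])
        if piece:  # the compressor may buffer and return nothing for a chunk
            yield piece
    tail = compressor.flush()
    if tail:
        yield tail


def receive_and_decompress(chunks) -> bytes:
    """Reassemble the original payload from the stream of compressed pieces."""
    decompressor = zlib.decompressobj()
    parts = [decompressor.decompress(chunk) for chunk in chunks]
    parts.append(decompressor.flush())
    return b"".join(parts)


# Hypothetical query result: many similar pipe-delimited rows.
rows = b"www.microsoft.com|2014-05-09|192.168.0.1\n" * 10_000
chunks = list(compress_in_chunks(rows, chunk_size=4096))
restored = receive_and_decompress(chunks)
```

Streaming through a single compressor object keeps memory use bounded by the chunk size on the sending side, which is the usual reason to transfer large results in pieces.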
Data Refresh in SPO: How does it work?
Data Management
Gateway
Excel Workbook in
SharePoint Online
Gateway
Cloud Service
(1) Excel workbook
uploaded to
SharePoint
Online
(2) Click Data
Refresh for
Excel workbook
(3) Connects to Gateway
Cloud Service
(4) Checks whether user is
authorized to perform a
refresh
(5) Sends command (SQL
statement, connection string)
to on-premises Data
Management Gateway
(6) Sends SQL to
SQL Server
(7) Return Results
(8) Efficiently
transfer this to
cloud service
(9) Returns data to Excel
Workbook
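The nine-step refresh flow above can be walked through with a small simulation. This is a sketch under loud assumptions, not the actual Power BI implementation: an in-memory SQLite database stands in for the on-premises SQL Server, and `AUTHORIZED_USERS`, `gateway_run`, `cloud_service_refresh`, and the `sales` table are hypothetical names invented for the illustration.

```python
import sqlite3

# Stand-in for the on-premises SQL Server (hypothetical sample data).
on_prem_db = sqlite3.connect(":memory:")
on_prem_db.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
on_prem_db.executemany(
    "INSERT INTO sales VALUES (?, ?)", [("East", 100), ("West", 250)]
)

# Hypothetical set of users who are allowed to refresh the workbook (step 4).
AUTHORIZED_USERS = {"anna"}


def gateway_run(sql: str):
    """Steps 6-7: the gateway sends the SQL to the database and returns the rows."""
    return on_prem_db.execute(sql).fetchall()


def cloud_service_refresh(user: str, sql: str):
    """Steps 3-5 and 8-9: check authorization, forward the command, return the data."""
    if user not in AUTHORIZED_USERS:
        raise PermissionError(f"{user} is not authorized to refresh")
    return gateway_run(sql)


# Step 2: the user clicks Data Refresh, which triggers the chain below.
refreshed = cloud_service_refresh(
    "anna", "SELECT region, amount FROM sales ORDER BY region"
)
```

The point of the sketch is the ordering: the authorization check happens in the cloud service before any command reaches the on-premises side.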
Data Management Gateway - OData
Power BI for Office 365 | Capabilities
Engage customers with smart,
contextual mobile experiences
Boost agility with real-time access to
apps and data from anywhere
Enable Deep Business and Customer Connections
Virtually Anytime, Anywhere
Stay Productive on the Go
Deliver Familiar, Connected Experiences to a Mobile Workforce
…while ensuring enterprise security, manageability, and compliance
Mobile BI Capabilities Available Today
Browser-based corporate
BI solutions on iOS,
Android and Windows:
• SharePoint Mobile enhancements
• PerformancePoint Services
• Excel Services
• SQL Server Reporting Services
“Ultimately, the new Microsoft mobile BI solution leads to more revenue for Recall
and gives us deeper customer insight, helping us stay ahead of our competitors.”
Recall Records Management Company Gets Real-Time BI, Boosts Sales with Mobile Solution case study. Full Case study.
Excel Web App
Excel Web App
Quick Explore
Mobile-Friendly Apps for Office
Power BI for Office 365 | Capabilities
Tabular models for Power BI
Datasources
Creating & managing models in Power BI
Reliable Persistent Storage (RPS)
Power BI Tabular Model Architecture
SSDT
SQL Azure
HDInsight
Azure Tables
External Data Sources
AS Instance AS Instance AS Instance AS Instance
…
On Prem SQL
Gateway
Power BI Portal
in O365
Excel
XMLA REST
Service Health Monitoring
At a glance view of the health of IT managed gateways
Enabler of Self Service BI
Varying levels of control across data sources,
departments
Oversight and monitoring of cloud data access
Ability to make corporate data sources easier
to discover, and easier to access
Role of the IT Admin in Power BI
https://itadmin.clouddatahub.net/
Power BI Admin Center
Power BI Admin Portal & Data Management
Gateway
Power BI Admin Center
HDInsight, Polybase, and
StreamInsight
Key Trends
Big Data Analytics
Internet of things
Audio / Video
Log Files
Text/Image
Social Sentiment
Data Market Feeds
eGov Feeds
Weather
Wikis / Blogs
Click Stream
Sensors / RFID / Devices
Spatial & GPS Coordinates
Web 2.0
Mobile
Advertising
Collaboration
eCommerce
Digital Marketing
Search Marketing
Web Logs
Recommendations
ERP / CRM
Sales Pipeline
Payables
Payroll
Inventory
Contacts
Deal Tracking
Gigabytes (10^9)
Terabytes (10^12)
Petabytes (10^15)
Exabytes (10^18)
Velocity - Variety - Variability
Volume
Storage cost per GB
1980: $190,000
1990: $9,000
2000: $15
2010: $0.07
ERP / CRM
Web 2.0
Internet of things
What Is Big Data?
Modern Data Warehousing
Hadoop Distributed Architecture
MapReduce: Move Code to the Data
So How Does It Work?
Distributed Storage
(HDFS)
Query
(Hive)
Distributed Processing
(MapReduce)
ODBC
Legend
Red = Core Hadoop
Blue = Data processing
Gray = Microsoft integration points and value adds
Orange = Data Movement
Green = Packages
HDInsight and Hadoop Ecosystem
Record reader
Map
Combiner
Partitioner
Shuffle and sort
Reduce
Output format
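The stages listed above can be traced with a single-process word count. This is a minimal sketch of the MapReduce pattern in plain Python, not Hadoop or HDInsight code: the function names, the two-reducer setup, and the CRC32-based partitioner are assumptions made for the illustration.

```python
import zlib
from collections import defaultdict
from itertools import groupby


def record_reader(split):
    """Record reader: turn an input split into (key, value) records."""
    for offset, line in enumerate(split.splitlines()):
        yield offset, line


def map_phase(key, line):
    """Map: emit (word, 1) for every word in the record."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs):
    """Combiner: pre-aggregate counts locally before they leave the mapper."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def partition(word, num_reducers):
    """Partitioner: pick the reducer for a key with a stable hash."""
    return zlib.crc32(word.encode("utf-8")) % num_reducers


def reduce_phase(word, counts):
    """Reduce: sum all counts seen for one key."""
    return word, sum(counts)


def run_job(splits, num_reducers=2):
    buckets = [[] for _ in range(num_reducers)]
    for split in splits:
        mapped = [
            pair
            for key, line in record_reader(split)
            for pair in map_phase(key, line)
        ]
        for word, count in combine(mapped):
            buckets[partition(word, num_reducers)].append((word, count))

    output = {}  # output format: a plain dict of word -> total
    for bucket in buckets:
        bucket.sort(key=lambda kv: kv[0])  # shuffle and sort by key
        for word, group in groupby(bucket, key=lambda kv: kv[0]):
            key, total = reduce_phase(word, (count for _, count in group))
            output[key] = total
    return output


result = run_job(["big data big insights", "any data"])
```

Because every occurrence of a key is routed to the same reducer, each reducer can compute a final total without talking to the others.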
MapReduce Summary
Programming HDInsight
Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus…
C#, F# Map/Reduce, LINQ to Hive, Microsoft .NET
management clients
JavaScript Map/Reduce, browser hosted console, Node.js
management clients
PowerShell, cross-platform CLI tools
RDBMS vs. Hadoop
Microsoft Hadoop Vision
Insights to all users by activating new types of data
Polybase
DB | HDFS
SQL Server PDW querying HDFS data, in situ
Polybase in PDW V2
Hadoop
HDFS DB
(a) PDW query in, results out
Hadoop
HDFS DB
(b) PDW query in, results stored in HDFS
Sensor
& RFID
Web
Apps
Unstructured data Structured data
Traditional schema-
based DW applications
RDBMS
Hadoop
Social
Apps
Mobile
Apps
How to overcome the
“impedance mismatch”
Increasingly massive amounts of
unstructured data driven by new
sources
At the same time, vast amounts of
corporate data and data sources,
and the bulk of their data analysis
Polybase addresses this challenge for advanced data analytics by allowing native query across
PDW and Hadoop, integrating structured and unstructured data
Native Query Across Hadoop and PDW
• Querying data in Hadoop from PDW using regular SQL queries, including
• Full SQL query access to data stored in HDFS, represented as ‘external tables’ in
PDW
• Basic statistics support for data coming from HDFS
• Querying across PDW and Hadoop tables (joining ‘on the fly’)
• Fully parallelized, high performance import of data from HDFS files into PDW tables
• Fully parallelized, high performance export of data in PDW tables into HDFS files
• Integration with various Hadoop distributions: Hadoop on Windows Server,
Hortonworks and Cloudera.
• Supporting Hadoop 1.0 and 2.0
Native Query Across Hadoop and PDW
Polybase Features in SQL Server PDW
Native Query Across Hadoop and PDW
Creating “External Tables”
• Internal representation of data residing in Hadoop/HDFS (delimited text files only)
• High-level permissions required for creating external tables
• ADMINISTER BULK OPERATIONS & ALTER SCHEMA
• Different than ‘regular SQL tables’: essentially read only (no DML support)
CREATE EXTERNAL TABLE table_name ({<column_definition>} [,...n ])
{WITH (LOCATION = '<URI>', [FORMAT_OPTIONS = (<VALUES>)])}
[;]
1. CREATE EXTERNAL TABLE: indicates an "External" table
2. LOCATION: required location of the Hadoop cluster and file
3. FORMAT_OPTIONS: optional format options associated with data import from HDFS
Native Query Across Hadoop and PDW
Querying Unstructured Data
1. Querying data in HDFS and displaying results in table form (using external tables)
2. Joining data from HDFS with relational PDW data
Example: creating the external table 'ClickStream' (text file in HDFS with | as field delimiter):
CREATE EXTERNAL TABLE ClickStream (url varchar(50), event_date date, user_IP varchar(50))
WITH (LOCATION = 'hdfs://MyHadoop:5000/tpch1GB/employee.tbl',
FORMAT_OPTIONS (FIELD_TERMINATOR = '|'));
Query Examples
1. Filter query against data in HDFS:
SELECT TOP 10 (url) FROM ClickStream WHERE user_IP = '192.168.0.1';
2. Join data coming from files in HDFS (Url_Description is a second text file in HDFS):
SELECT url.description FROM ClickStream cs, Url_Description url
WHERE cs.url = url.name AND cs.url = 'www.cars.com';
3. Join data from HDFS with a relational PDW table (Users is a distributed PDW table):
SELECT user_name FROM ClickStream cs, Users u
WHERE cs.user_IP = u.user_IP AND cs.url = 'www.microsoft.com';
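The join between file-based ClickStream data and the relational Users table can be imitated locally to see what the query returns. This is a rough analogy only: it uses SQLite from the Python standard library, not PDW or Polybase, and it loads the delimited text into a table first, whereas an external table is queried in place. The sample rows and user names are made up.

```python
import csv
import io
import sqlite3

# Hypothetical contents of a '|'-delimited text file, as it might sit in HDFS.
clickstream_file = io.StringIO(
    "www.microsoft.com|2014-05-01|192.168.0.1\n"
    "www.cars.com|2014-05-02|192.168.0.2\n"
    "www.microsoft.com|2014-05-03|192.168.0.3\n"
)

db = sqlite3.connect(":memory:")

# Relational side: stands in for the distributed PDW table Users.
db.execute("CREATE TABLE Users (user_name TEXT, user_IP TEXT)")
db.executemany(
    "INSERT INTO Users VALUES (?, ?)",
    [("anna", "192.168.0.1"), ("patrick", "192.168.0.2")],
)

# File side: parse the delimited text with '|' as the field terminator.
db.execute("CREATE TABLE ClickStream (url TEXT, event_date TEXT, user_IP TEXT)")
db.executemany(
    "INSERT INTO ClickStream VALUES (?, ?, ?)",
    csv.reader(clickstream_file, delimiter="|"),
)

# Same shape as the join against the Users table in the query examples.
rows = db.execute(
    "SELECT u.user_name FROM ClickStream cs JOIN Users u "
    "ON cs.user_IP = u.user_IP WHERE cs.url = 'www.microsoft.com'"
).fetchall()
```

Only the click from 192.168.0.1 matches both the URL filter and a known user, so the unmatched IP 192.168.0.3 drops out of the inner join.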
Native Query Across Hadoop and PDW
Parallel Data Import from HDFS into PDW
Persistently storing data from HDFS in PDW tables
Fully parallelized via CREATE TABLE AS SELECT (CTAS) with external tables as source table and PDW tables (either distributed
or replicated) as destination
CREATE TABLE ClickStream_PDW WITH (DISTRIBUTION = HASH(url))
AS SELECT url, event_date, user_IP FROM ClickStream;
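To make the effect of hash distribution on `url` concrete, the sketch below assigns rows to nodes by hashing the distribution column. It is an assumption-laden illustration: PDW's real hash function and node layout are not described in the slides, so CRC32, the four-node setup, and the names `node_for` and `distribute` are hypothetical.

```python
import zlib


def node_for(url: str, num_nodes: int) -> int:
    """Map a distribution-column value to a node using a stable hash."""
    return zlib.crc32(url.encode("utf-8")) % num_nodes


def distribute(rows, num_nodes):
    """Place each (url, event_date, user_IP) row on the node chosen by its url."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[0], num_nodes)].append(row)
    return nodes


# Hypothetical ClickStream rows.
rows = [
    ("www.microsoft.com", "2014-05-01", "192.168.0.1"),
    ("www.cars.com", "2014-05-02", "192.168.0.2"),
    ("www.microsoft.com", "2014-05-03", "192.168.0.3"),
    ("www.bing.com", "2014-05-04", "192.168.0.1"),
]
nodes = distribute(rows, num_nodes=4)
```

Rows that share a `url` always land on the same node, so filters and joins on that column can be answered without moving data between nodes.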
Retrieval of data in HDFS “on-the-fly”
[Diagram: the enhanced PDW query engine reads the external table through DMS Readers 1..N and the HDFS bridge (parallel HDFS reads, parallel importing) and stores the CTAS results in PDW. Unstructured data from Sensor & RFID, Web Apps, Social Apps, and Mobile Apps sits in Hadoop (HDFS data nodes); structured data from traditional DW applications sits in PDW.]
Native Query Across Hadoop and PDW
Parallel Data Export from PDW into HDFS
• Fully parallelized via CREATE EXTERNAL TABLE AS SELECT (CETAS) with external tables as
destination table and PDW tables as source
• ‘Round-trip of data’ possible with first importing data from HDFS, joining it with relational
data, and then exporting results back to HDFS
CREATE EXTERNAL TABLE ClickStream (url, event_date, user_IP)
WITH (LOCATION = 'hdfs://MyHadoop:5000/users/outputDir', FORMAT_OPTIONS
(FIELD_TERMINATOR = '|')) AS SELECT url, event_date, user_IP FROM ClickStream_PDW;
[Diagram: the enhanced PDW query engine reads PDW tables in parallel and writes the CETAS results to the external table through DMS Writers 1..N and the HDFS bridge (parallel HDFS writes). Structured data from traditional DW applications sits in PDW.]
In-Memory for big data analytics
Interactive Analytics over “Big Data”
• SQL Server Analysis Services scaled out to very
large data volumes
• Sourced from “Big Data” sources, e.g.
• Hadoop, Isotope, etc.
• Enterprise data sources (SQL Server, Oracle, SAP,
etc.)
• Built upon the In-Memory Analytics engine
• In-memory, column-store, 10x compression
• Deployment vehicles: Box, Appliance, Cloud
• Customers:
• Skype, Klout, Halo 4, UBS, AdCenter, Windows
Update
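The "in-memory, column-store, 10x compression" bullet above rests on the fact that a single column often holds few distinct values in long runs. The sketch below shows two generic column encodings, dictionary encoding followed by run-length encoding. It is not the Analysis Services engine's actual scheme, which the slides do not describe; the function names and sample column are hypothetical.

```python
def dictionary_encode(column):
    """Replace each value with a small integer code; return (dictionary, codes)."""
    dictionary = {}
    codes = []
    for value in column:
        codes.append(dictionary.setdefault(value, len(dictionary)))
    return dictionary, codes


def run_length_encode(codes):
    """Collapse consecutive repeats into [code, run_length] pairs."""
    runs = []
    for code in codes:
        if runs and runs[-1][0] == code:
            runs[-1][1] += 1
        else:
            runs.append([code, 1])
    return runs


def decode(dictionary, runs):
    """Rebuild the original column from the dictionary and the runs."""
    values = {code: value for value, code in dictionary.items()}
    return [values[code] for code, length in runs for _ in range(length)]


# Hypothetical column with long runs of repeated values.
column = ["Canada"] * 5 + ["France"] * 3 + ["Canada"] * 2
dictionary, codes = dictionary_encode(column)
runs = run_length_encode(codes)
```

Ten string values shrink to a two-entry dictionary plus three runs here; how close a real workload gets to the quoted 10x depends on how repetitive its columns are.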
XMLA
Web services
External
Data Sources
GW
Mgmt
Deploy
Monitor
AS
Instance
AS
Instance
AS
Instance
Reliable Persistent Storage
Excel, PV
3rd party apps,
tools, etc.
StreamInsight
Managing Streaming Data In-Memory
Customer benefits
Event
Output
stream
Input
stream
Complete and Consistent Data Platform
What Questions Do You Have?
Thank You
for attending this session
Transcript: The Details of Description Techniques tips and tangents on altern...
BookNet Canada136 views
Five Things You SHOULD Know About Postman by Postman
Five Things You SHOULD Know About PostmanFive Things You SHOULD Know About Postman
Five Things You SHOULD Know About Postman
Postman33 views
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ... by Jasper Oosterveld
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...
Voice Logger - Telephony Integration Solution at Aegis by Nirmal Sharma
Voice Logger - Telephony Integration Solution at AegisVoice Logger - Telephony Integration Solution at Aegis
Voice Logger - Telephony Integration Solution at Aegis
Nirmal Sharma39 views
Empathic Computing: Delivering the Potential of the Metaverse by Mark Billinghurst
Empathic Computing: Delivering  the Potential of the MetaverseEmpathic Computing: Delivering  the Potential of the Metaverse
Empathic Computing: Delivering the Potential of the Metaverse
Mark Billinghurst478 views
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLive by Network Automation Forum
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLiveAutomating a World-Class Technology Conference; Behind the Scenes of CiscoLive
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLive
PharoJS - Zürich Smalltalk Group Meetup November 2023 by Noury Bouraqadi
PharoJS - Zürich Smalltalk Group Meetup November 2023PharoJS - Zürich Smalltalk Group Meetup November 2023
PharoJS - Zürich Smalltalk Group Meetup November 2023
Noury Bouraqadi127 views

SQL Server 2014 Faster Insights from Any Data

  • 1. SQL Server 2014 Faster Insight from Any Data Stéphane Fréchette Friday May 9, 2014
  • 2. Email: stephanefrechette@ukubu.com Twitter: @sfrechette Blog: stephanefrechette.com Stéphane Fréchette Founder, CEO | Strategic consultant Microsoft SQL Server MVP
  • 6. Excel BI | Capabilities
  • 7. Microsoft Power BI for Office 365: 1 in 4 enterprise customers on Office 365 | 1 Billion Office Users | Discover, Analyze, Visualize, Share, Find, Q&A, Mobile | Scalable | Manageable | Trusted
  • 8. Extend with Hybrid Cloud Solutions
  • 9. Extend with Hybrid Cloud Solutions
  • 10. Extend with Hybrid Cloud Solutions
  • 11. Power Query, PowerPivot, Power View, and Power Map
  • 12. Powerful Self-Service BI with Excel 2013
  • 13. Power Query: Enable self-service data discovery, query, transformation and mashup experiences for Information Workers, via Excel and PowerPivot • Discovery and connectivity to a wide range of data sources, spanning volume as well as variety of data. • Highly interactive and intuitive experience for rapidly and iteratively building queries over any data source, any size. • Consistency of experience, and parity of query capabilities over all data sources. • Joins across different data sources; ability to create custom views over data that can then be shared with team/department.
  • 14. Power Query Discover, combine, and refine Big Data, small data, and any data with Data Explorer for Excel.
  • 15. Data Sources: Windows Azure Marketplace, Windows Active Directory, Azure SQL Database, Azure HDInsight
  • 16. Powerful Self-Service BI with Excel 2013
  • 19. Powerful Self-Service BI with Excel 2013
  • 21. Power View for Multidimensional Models • Power View on Analysis Services via BISM • Native support for DAX in Analysis Services • Better flexibility: Choice of DAX on Tabular or Multidimensional (cubes)
  • 22. Architecture (diagram with numbered components 1 to 6): Internet Explorer; SharePoint (2010 or 2013); Reporting Services Power View; Analysis Services BI Semantic Model, Tabular; Analysis Services BI Semantic Model, Multidimensional; SQL Server Data Tools
  • 23. BI Semantic Model: Architecture (diagram): Third-party applications, Reporting Services (Power View), Excel, PowerPivot, Databases, LOB Applications, Files, OData Feeds, Cloud Services, SharePoint Insights
  • 24. Multidimensional-Tabular Mapping (BISM-MD Object → Tabular Object): Cube → Model; Cube Dimension → Table; Attributes (Key(s), Name) → Columns; Measure Group → Table; Measure → Measure; Measure without MeasureGroup → Within Table called “Measures”; MeasureGroup-Cube Dimension relationship → Relationship; Perspective → Perspective; KPI → KPI; User/Parent-Child Hierarchies → Hierarchies
  • 25. Powerful Self-Service BI with Excel 2013
  • 26. What Is Power Map? Power Map for Microsoft Excel enables information workers to discover and share new insights from geographical and temporal data through three-dimensional storytelling.
  • 27. Power Map: Steps to 3D insights. Map Data: • Data in Excel • Geo-Code • 3D and 3 Visuals. Discover Insights: • Play over Time • Annotate points • Capture scenes. Share Stories: • Cinematic Effects • Interactive Tours • Share Workbook
  • 31. Power Map Excel Add-in to Enhance Data Visualization
  • 33. Power BI for Office 365 | Capabilities
  • 34. Power BI for Office 365 | Capabilities
  • 35. Power BI for Office 365 | Capabilities
  • 36. Power BI for Office 365 | Capabilities Corporate Data Sources
  • 37. Data Management Gateway • Enabling Corporate OData Feeds • Enabling Excel Workbook Data Refresh using SharePoint Online • Enabling Discovery in Power Query capabilities (diagram: Power BI Admin Center, Data Management Gateway)
  • 38. Data Management Gateway - Conceptual • Power BI Admin Center: Allows IT to configure, manage and monitor access to corporate data sources. • Data Management Gateway: Connects to corporate data sources and sends data to Microsoft cloud services through a secure channel (Service Bus). • Corporate Data Sources: The Gateway can connect to a variety of data sources. • Secure Credential Store: All credentials used by the gateway are stored on-premises.
  • 39. Data Management Gateway Network Topology. Zones: Microsoft data center, Internet, perimeter network, intranet (customer network). Components: Data Management Gateway Cloud Services, Data Management Gateway, Power Query • Outgoing connection to cloud services (Registration, Regular Heartbeat, Data Source definition requests) • Connect to Corporate OData feed • Data • Per Machine: Single gateway installed • Credential Management: Saves credentials
  • 40. Corporate OData Feeds and Data Management Gateway (components: Power Query, Data Management Gateway) (1) Using Power Query, Anna connects to the OData feed (URL: http://feedgwMyDB ). Example: ContosoAnna (2) The Data Management Gateway connects to SQL Server using either the Windows account or the database account set up by Patrick when creating the feed. Example: DB1_Reader (3) Returns result (4) Returns OData feed
  • 41. Scheduled Refresh. Scenario: workbook is refreshed on schedule as configured by the author in BI Sites • Scheduler runs in BI Azure and triggers refresh as configured in the BI Sites application • The flow assumes the workbook has been added to Power BI, thus save back is done directly to SPO • When refresh is called by BI Azure, SPO rehydrates the user identity and calls WAC in a back channel (i.e. redirect equivalent). Components: BI Azure; Office Web Apps Service (WAC); Excel Services (SOAP API); SPO; Azure Active Directory (AAD): OrgID, MSODS, ACS; On-Prem Data Sources; Cloud Data Sources. Steps: 1. Verify user existence and license in MSODS and get access token to target URL in SPO from ACS 2. Construct the user part of the access token, and trigger refresh for a workbook on behalf of the scheduled refresh user 3. Refresh workbook 4. Power BI workbook? 5. Get shadow workbook refresh Data Model 6. Get data from cloud/on-prem sources and reprocess the data model 7. Save updated workbook to SPO
  • 42. On-premises Data Access from BI Azure. Scenario: Interactive refresh from Excel Web Access where the data source is on-premises • For interactive refresh, shared data sources are configured in advance in the Power BI Admin Center • For scheduled refresh, personal data sources can be configured by the workbook owner. Components (diagram split into On-Prem and Cloud): Azure Active Directory (AAD): OrgID, MSODS, ACS; BI Azure; Hybrid Proxy ADO.NET Provider; Discovery API; Tenant Configuration; SQL Azure; Hybrid Data Integration Service (Hybrid Proxy, Hybrid Delivery); Windows Azure Service Bus; Data Management Gateway; Hybrid Delivery Client API; Credential Manager; Azure Storage (temporary). Steps: 1. Determine whether data source is cloud or on-prem, and retrieve registered ID 2. Authenticate & retrieve tenant information 3. Get registered data source info 4. Issue refresh query 5. Send request to Gateway (via Service Bus) 6. Read query request from Service Bus queue 7. Retrieve data source credentials 8. Run query and retrieve the data 9. Coordinate transfer job 10. Compress & stream data in multiple chunks 11. Receive & decompress data
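Steps 10 and 11 of slide 42 (compress and stream data in multiple chunks, then receive and decompress) can be pictured with a small sketch. This is only an illustration under stated assumptions: the deck does not say which compression algorithm or chunk size the Data Management Gateway uses, so zlib, the 64 KiB chunk size, and the function names below are all hypothetical stand-ins.

```python
import zlib

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size; the slides do not state one


def compress_in_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Sender side: yield compressed pieces of `data`, one input chunk at a time."""
    compressor = zlib.compressobj()
    for start in range(0, len(data), chunk_size):
        piece = compressor.compress(data[start:start + chunk_size])
        if piece:  # zlib may buffer input and emit nothing for a given chunk
            yield piece
    yield compressor.flush()  # emit whatever is still buffered


def decompress_chunks(chunks) -> bytes:
    """Receiver side: rebuild the original bytes from the compressed stream."""
    decompressor = zlib.decompressobj()
    parts = [decompressor.decompress(chunk) for chunk in chunks]
    parts.append(decompressor.flush())
    return b"".join(parts)


if __name__ == "__main__":
    rows = b"url|event_date|user_IP\n" * 50_000  # made-up query result
    stream = list(compress_in_chunks(rows))
    assert decompress_chunks(stream) == rows
    print(f"{len(rows)} bytes sent as {len(stream)} compressed pieces")
```

Streaming through a single compressor object keeps memory use bounded by the chunk size on the sending side, which is presumably why a gateway would chunk a large result set instead of compressing it in one call.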
  • 43. Data Refresh in SPO: How does it work? (components: Excel Workbook in SharePoint Online, Gateway Cloud Service, Data Management Gateway) (1) Excel workbook uploaded to SharePoint Online (2) Click Data Refresh for Excel workbook (3) Connects to Gateway Cloud Service (4) Checks whether user is authorized to perform a refresh (5) Sends command (SQL statement, connection string) to on-premises Data Management Gateway (6) Sends SQL to SQL Server (7) Returns results (8) Efficiently transfers them to the cloud service (9) Returns data to Excel Workbook
  • 45. Power BI for Office 365 | Capabilities
  • 46. Enable Deep Business and Customer Connections Virtually Anytime, Anywhere • Engage customers with smart, contextual mobile experiences • Boost agility with real-time access to apps and data from anywhere
  • 47. Stay Productive on the Go Deliver Familiar, Connected Experiences to a Mobile Workforce …while ensuring enterprise security, manageability, and compliance
  • 48. Mobile BI Capabilities Available Today. Browser-based corporate BI solutions on iOS, Android and Windows: • SharePoint Mobile enhancements • PerformancePoint Services • Excel Services • SQL Server Reporting Services. “Ultimately, the new Microsoft mobile BI solution leads to more revenue for Recall and gives us deeper customer insight, helping us stay ahead of our competitors.” Recall, “Records Management Company Gets Real-Time BI, Boosts Sales with Mobile Solution” case study.
  • 52. Power BI for Office 365 | Capabilities
  • 53. Tabular models for Power BI
  • 55. Creating & managing models in Power BI
  • 56. Power BI Tabular Model Architecture (diagram): SSDT; Excel; Power BI Portal in O365; XMLA; REST; AS Instances; Reliable Persistent Storage (RPS); Gateway; External Data Sources: SQL Azure, HDInsight, Azure Tables; On Prem SQL
  • 57. Service Health Monitoring: At-a-glance view of the health of IT managed gateways
  • 58. Role of the IT Admin in Power BI • Enabler of Self Service BI • Varying levels of control across data sources, departments • Oversight and monitoring of cloud data access • Ability to make corporate data sources easier to discover, and easier to access
  • 60. Power BI Admin Portal & Data Management Gateway: Power BI Admin Center
  • 64. What Is Big Data? Volume: Gigabytes (10^9), Terabytes (10^12), Petabytes (10^15), Exabytes (10^18). Velocity, variety, variability. ERP / CRM: Sales Pipeline, Payables, Payroll, Inventory, Contacts, Deal Tracking. Web 2.0: Mobile, Advertising, Collaboration, eCommerce, Digital Marketing, Search Marketing, Web Logs, Recommendations. Internet of things: Audio / Video, Log Files, Text/Image, Social Sentiment, Data Market Feeds, eGov Feeds, Weather, Wikis / Blogs, Click Stream, Sensors / RFID / Devices, Spatial & GPS Coordinates. Storage cost per GB: 1980 $190,000; 1990 $9,000; 2000 $15; 2010 $0.07
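The storage prices on slide 64 are easier to appreciate as ratios. The following is a small worked calculation using only the four per-gigabyte figures from the slide; the function names and the choice of 1 TB = 1,000 GB (decimal units) are assumptions made for the illustration.

```python
# Storage cost per gigabyte in US dollars, as quoted on slide 64.
COST_PER_GB = {1980: 190_000.00, 1990: 9_000.00, 2000: 15.00, 2010: 0.07}

GB_PER_TB = 1_000  # assumption: decimal units (1 TB = 1,000 GB)


def decline_factor(start_year: int, end_year: int) -> float:
    """How many times cheaper a gigabyte became between the two years."""
    return COST_PER_GB[start_year] / COST_PER_GB[end_year]


def cost_of_terabytes(terabytes: float, year: int) -> float:
    """Cost in dollars of storing `terabytes` at the per-GB price of `year`."""
    return terabytes * GB_PER_TB * COST_PER_GB[year]


if __name__ == "__main__":
    for year in sorted(COST_PER_GB):
        print(f"{year}: 1 TB would cost ${cost_of_terabytes(1, year):,.2f}")
    print(f"1980 to 2010: roughly {decline_factor(1980, 2010):,.0f}x cheaper")
```

At the quoted prices, a terabyte that would have cost $190 million in 1980 comes to about $70 in 2010, a drop of more than six orders of magnitude.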
  • 67. MapReduce: Move Code to the Data
  • 68. So How Does It Work?
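Slide 67 names the MapReduce idea without spelling it out. Below is a minimal single-process sketch of the map, shuffle, and reduce phases, assuming a word-count job. The job, the function names, and the sample partitions are illustrative and do not come from the deck. On a real Hadoop cluster the map function would be shipped to the nodes that hold each data block ("move code to the data") rather than run in one loop as it is here.

```python
from collections import defaultdict


def map_phase(line: str):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(partitions):
    """Run the three phases; each partition stands in for a block on one node."""
    mapped = (
        pair
        for block in partitions
        for line in block
        for pair in map_phase(line)
    )
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    blocks = [["big data big insights"], ["any data"]]  # two made-up blocks
    print(word_count(blocks))
```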
  • 69. HDInsight and Hadoop Ecosystem (diagram): Distributed Storage (HDFS), Query (Hive), Distributed Processing (MapReduce), ODBC. Legend: Red = Core Hadoop; Blue = Data processing; Gray = Microsoft integration points and value adds; Orange = Data Movement; Green = Packages
  • 73. Programming HDInsight • Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus… • C#, F# Map/Reduce, LINQ to Hive, Microsoft .NET management clients • JavaScript Map/Reduce, browser hosted console, Node.js management clients • PowerShell, cross-platform CLI tools
  • 75. Microsoft Hadoop Vision: Insights to all users by activating new types of data
  • 76. Polybase: SQL Server PDW querying HDFS data, in-situ (diagram: DB, HDFS)
  • 77. Polybase in PDW V2 (diagram: Hadoop HDFS, DB): (a) PDW query in, results out (b) PDW query in, results stored in HDFS
  • 78. Native Query Across Hadoop and PDW: How to overcome the “impedance mismatch” • Increasingly massive amounts of unstructured data driven by new sources • At the same time, vast amounts of corporate data and data sources, and the bulk of their data analysis • Polybase addresses this challenge for advanced data analytics by allowing native query across PDW and Hadoop, integrating structured and unstructured data. Diagram: Hadoop (unstructured data: Sensor & RFID, Web Apps, Social Apps, Mobile Apps); RDBMS (structured data: traditional schema-based DW applications)
  • 79. Native Query Across Hadoop and PDW: Polybase Features in SQL Server PDW • Querying data in Hadoop from PDW using regular SQL queries, including: • Full SQL query access to data stored in HDFS, represented as ‘external tables’ in PDW • Basic statistics support for data coming from HDFS • Querying across PDW and Hadoop tables (joining ‘on the fly’) • Fully parallelized, high performance import of data from HDFS files into PDW tables • Fully parallelized, high performance export of data in PDW tables into HDFS files • Integration with various Hadoop distributions: Hadoop on Windows Server, Hortonworks and Cloudera • Supporting Hadoop 1.0 and 2.0
  • 80. Native Query Across Hadoop and PDW: Creating “External Tables” • Internal representation of data residing in Hadoop/HDFS (delimited text files only) • High-level permissions required for creating external tables: ADMINISTER BULK OPERATIONS & ALTER SCHEMA • Different from ‘regular SQL tables’: essentially read-only (no DML support). Syntax: CREATE EXTERNAL TABLE table_name ({<column_definition>} [,...n ]) {WITH (LOCATION = '<URI>', [FORMAT_OPTIONS = (<VALUES>)])} [;] Callouts: (1) EXTERNAL indicates an “External” table (2) LOCATION is the required location of the Hadoop cluster and file (3) FORMAT_OPTIONS are optional format options associated with data import from HDFS
  • 81. Native Query Across Hadoop and PDW: Querying Unstructured Data 1. Querying data in HDFS and displaying results in table form (using external tables) 2. Joining data from HDFS with relational PDW data. Example, creating external table ‘ClickStream’ (text file in HDFS with | as field delimiter): CREATE EXTERNAL TABLE ClickStream (url varchar(50), event_date date, user_IP varchar(50)) WITH (LOCATION = 'hdfs://MyHadoop:5000/tpch1GB/employee.tbl', FORMAT_OPTIONS (FIELD_TERMINATOR = '|')); Query examples: (1) Filter query against data in HDFS: SELECT TOP 10 (url) FROM ClickStream WHERE user_IP = '192.168.0.1'; (2) Join data coming from files in HDFS (Url_Description is a second text file in HDFS): SELECT url.description FROM ClickStream cs, Url_Description url WHERE cs.url = url.name AND cs.url = 'www.cars.com'; (3) Join data from HDFS with a relational PDW table (Users is a distributed PDW table): SELECT user_name FROM ClickStream cs, Users u WHERE cs.user_IP = u.user_IP AND cs.url = 'www.microsoft.com';
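The third query on slide 81 joins rows parsed from a pipe-delimited text file with an ordinary relational table. As a rough analogy only, the sketch below does the same thing with Python's built-in sqlite3 module. It is not Polybase: SQLite has no external tables, so the delimited text is copied into a table first, whereas PDW queries the file in place in HDFS. The sample rows, user names, and helper names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows standing in for a pipe-delimited file in HDFS.
CLICKSTREAM_FILE = (
    "www.microsoft.com|2014-05-01|192.168.0.1\n"
    "www.cars.com|2014-05-02|192.168.0.2\n"
    "www.microsoft.com|2014-05-03|192.168.0.2\n"
)


def load_delimited(conn, text, terminator="|"):
    """Parse delimited text into a ClickStream table (plays the FIELD_TERMINATOR role)."""
    rows = csv.reader(io.StringIO(text), delimiter=terminator)
    conn.execute("CREATE TABLE ClickStream (url TEXT, event_date TEXT, user_IP TEXT)")
    conn.executemany("INSERT INTO ClickStream VALUES (?, ?, ?)", rows)


def build_demo():
    """Create an in-memory database with the file data plus a relational Users table."""
    conn = sqlite3.connect(":memory:")
    load_delimited(conn, CLICKSTREAM_FILE)
    conn.execute("CREATE TABLE Users (user_name TEXT, user_IP TEXT)")
    conn.executemany(
        "INSERT INTO Users VALUES (?, ?)",
        [("anna", "192.168.0.1"), ("patrick", "192.168.0.2")],  # made-up users
    )
    return conn


def users_who_visited(conn, url):
    """Same shape as query 3 on the slide, written with an explicit JOIN."""
    query = """
        SELECT DISTINCT u.user_name
        FROM ClickStream AS cs
        JOIN Users AS u ON cs.user_IP = u.user_IP
        WHERE cs.url = ?
        ORDER BY u.user_name
    """
    return [name for (name,) in conn.execute(query, (url,))]


if __name__ == "__main__":
    print(users_who_visited(build_demo(), "www.microsoft.com"))
```

The explicit JOIN ... ON form is equivalent to the comma-separated FROM list used on the slide and tends to make the join condition easier to spot.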
  • 82. Native Query Across Hadoop and PDW: Parallel Data Import from HDFS into PDW • Persistently storing data from HDFS in PDW tables • Fully parallelized via CREATE TABLE AS SELECT (CTAS) with external tables as source table and PDW tables (either distributed or replicated) as destination • Retrieval of data in HDFS “on-the-fly”. Example: CREATE TABLE ClickStream_PDW WITH (DISTRIBUTION = HASH(url)) AS SELECT url, event_date, user_IP FROM ClickStream; Diagram: Hadoop (unstructured data: Sensor & RFID, Web Apps, Social Apps, Mobile Apps); HDFS bridge with parallel HDFS reads; DMS Reader 1 … DMS Reader N (parallel importing); enhanced PDW query engine (External Table, CTAS, Results); PDW (structured data, traditional DW applications)
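The CTAS example on slide 82 distributes the imported rows by HASH(url). The general idea of hash distribution is that a hash of the chosen column picks the distribution a row is stored on, so rows with the same url always end up together. The sketch below illustrates that idea only; PDW's actual hash function and distribution count are not given in the deck, so MD5, the count of 8, and the function names are assumptions.

```python
import hashlib
from collections import defaultdict

NUM_DISTRIBUTIONS = 8  # hypothetical; the slides do not give a number


def distribution_for(key: str, n: int = NUM_DISTRIBUTIONS) -> int:
    """Pick a distribution from a stable hash of the distribution column."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n


def distribute(rows, key_index: int = 0, n: int = NUM_DISTRIBUTIONS):
    """Place each row in the bucket chosen by hashing its key column."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[distribution_for(row[key_index], n)].append(row)
    return buckets


if __name__ == "__main__":
    clicks = [  # made-up (url, event_date, user_IP) rows
        ("www.microsoft.com", "2014-05-01", "192.168.0.1"),
        ("www.cars.com", "2014-05-02", "192.168.0.2"),
        ("www.microsoft.com", "2014-05-03", "192.168.0.2"),
    ]
    for bucket, rows in sorted(distribute(clicks).items()):
        print(bucket, [row[0] for row in rows])
```

Because equal keys hash to the same distribution, a later join or aggregation on url can run on each distribution independently, which is the usual motivation for choosing the distribution column with care.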
  • 83. Native Query Across Hadoop and PDW: Parallel Data Export from PDW into HDFS • Fully parallelized via CREATE EXTERNAL TABLE AS SELECT (CETAS) with external tables as destination table and PDW tables as source • ‘Round-trip of data’ possible with first importing data from HDFS, joining it with relational data, and then exporting results back to HDFS. Example: CREATE EXTERNAL TABLE ClickStream (url, event_date, user_IP) WITH (LOCATION = 'hdfs://MyHadoop:5000/users/outputDir', FORMAT_OPTIONS (FIELD_TERMINATOR = '|')) AS SELECT url, event_date, user_IP FROM ClickStream_PDW; Diagram: PDW (structured data, traditional DW applications); enhanced PDW query engine (External Table, CETAS, Results); DMS Writer 1 … DMS Writer N (parallel reading); HDFS bridge with parallel HDFS writes; HDFS data nodes (unstructured data: Sensor & RFID, Web Apps, Social Apps, Mobile Apps)
  • 84. In-Memory for big data analytics: Interactive Analytics over “Big Data” • SQL Server Analysis Services scaled out to very large data volumes • Sourced from “Big Data” sources, e.g. Hadoop, Isotope, etc. • Enterprise data sources (SQL Server, Oracle, SAP, etc.) • Built upon the In-Memory Analytics engine: in-memory, column-store, 10x compression • Deployment vehicles: Box, Appliance, Cloud • Customers: Skype, Klout, Halo 4, UBS, AdCenter, Windows Update. Diagram: Excel, PV, 3rd party apps, tools, etc.; XMLA; Web services; GW; Mgmt, Deploy, Monitor; AS Instances; Reliable Persistent Storage; External Data Sources
  • 85. StreamInsight: Managing Streaming Data In-Memory. Customer benefits (diagram: Input stream, Event, Output stream)
  • 86. Complete and Consistent Data Platform
  • 87. What Questions Do You Have?
  • 88. Thank You for attending this session