Hybrid Modern Data Architecture
with Microsoft and Apache Hadoop

© Hortonworks Inc. 2014
Microsoft and Hortonworks Delivers the Modern Data Architecture for Big Data

Joint webinar with Microsoft and Hortonworks on the power of combining the Hortonworks Data Platform with Microsoft’s ubiquitous Windows, Office, SQL Server, Parallel Data Warehouse, and Azure platform to build the Modern Data Architecture for Big Data.

Published in: Technology


  1. Hybrid Modern Data Architecture with Microsoft and Apache Hadoop © Hortonworks Inc. 2014
  2. Your Presenters • Oliver Chiu (twitter name ) – Title – Years of experience – Fun Fact • John Kreisa (@marked_man) – VP Strategic Marketing, Hortonworks – Over 20 years in data management as a developer and a marketer – Avid camper
  3. Poll 1: What stage are you at in looking into Hadoop? • Research • Evaluation • Trial • Haven’t started research
  4. Today’s Topics • Introduction • What is a Hybrid Modern Data Architecture (MDA)? • Apache Hadoop in the Hybrid MDA • The Hybrid MDA and Microsoft • Q&A
  5. Existing Data Architecture • APPLICATIONS: Custom Applications, Business Analytics, Packaged Applications • DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP) • SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs) • 2.8 ZB in 2012 • 85% from New Data Types • 15x Machine Data by 2020 • 40 ZB by 2020 • Source: IDC
  6. Modern Data Architecture Enabled • APPLICATIONS: Custom Applications, Business Analytics, Packaged Applications • DEV & DATA TOOLS: BUILD & TEST • OPERATIONAL TOOLS: MANAGE & MONITOR • DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP) • SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
  7. Hadoop Powers Modern Data Architecture • [Diagram: Hadoop Cluster of compute & storage nodes] • Hadoop clusters provide scale-out storage and distributed data processing on commodity hardware • Apache Hadoop is an open source project governed by the Apache Software Foundation (ASF) that allows you to gain insight from massive amounts of structured and unstructured data quickly and without significant investment.
  8. 3 Requirements for Hadoop Adoption: Requirements for Hadoop’s Role in the Modern Data Architecture • Integrated: Interoperable with existing data center investments • Key Services: Platform, operational and data services essential for the enterprise • Skills: Leverage your existing skills: development, operations, analytics
  9. Use Cases for the MDA Industry Use Case New Account Risk Screens Infrastructure Investment Government Server Logs, Text, Social Clickstream, Text Localized, Personalized Promotions Geographic Clickstream Sensor Assembly Line Quality Assurance Sensor Crowdsourced Quality Assurance Oil & Gas Machine, Server Logs Supply Chain and Logistics Pharmaceuticals Machine, Geographic Website Optimization Healthcare Geographic, Sensor, Text 360° View of the Customer Manufacturing Server Logs Real-time Bandwidth Allocation Retail Trading Risk Call Detail Records (CDRs) Telecom Text, Server Logs Insurance Underwriting Social Use Genomic Data in Medical Trials Structured Monitor Patient Vitals in Real-Time Sensor Recruit and Retain Patients for Drug Trials Social, Clickstream Improve Prescription Adherence Social, Unstructured, Geographic Unify Exploration & Production Data Sensor, Geographic & Unstructured Monitor Rig Safety in Real-Time Sensor, Unstructured ETL Offload in Response to Federal Budgetary Pressures Financial Services Type of Data Structured Sentiment Analysis for Government Programs © Hortonworks Inc. 2013 Social
  10. Microsoft in the Modern Data Architecture • New! Power BI Public Preview • Microsoft Applications • APPLICATIONS • DEV & DATA TOOLS • DATA SYSTEM • OPERATIONAL TOOLS • INFRASTRUCTURE • SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
  11. Today’s Topics • Introduction • What is a Hybrid Modern Data Architecture (MDA)? • Apache Hadoop in the Hybrid MDA • The Hybrid MDA and Microsoft • Q&A
  12. Hortonworks and Microsoft • Engineering alignment • Corporate alignment • Field alignment
  13. End-to-End Data Platform • SQL Server • PDW • SQL Server for DW in Azure • Hortonworks Data Platform • PDW vNext (PDW + HDInsight) • Windows Azure HDInsight
  14. Hadoop Solutions From Microsoft • Hortonworks Data Platform • PDW vNext (PDW + HDInsight) • Windows Azure HDInsight
  15. Hortonworks Data Platform for Windows • Hortonworks Data Platform
  16. Parallel Data Warehouse Next w/ HDInsight • PDW vNext (PDW + HDInsight)
  17. PolyBase • Select … • Result Set • Hadoop Data • Relational Data • Microsoft Confidential
  18. Scale out technologies in SQL Server Parallel Data Warehouse
  19. Windows Azure HDInsight
  20. Master Chief meets Big Data • In-game analysis detects cheaters and improves experience for everyone • Enables targeted campaigns that improve customer retention
  21. Hadoop Solutions From Microsoft • Hortonworks Data Platform • PDW vNext (PDW + HDInsight) • Windows Azure HDInsight
  22. Hortonworks & Microsoft Reference Architecture Management and Monitoring Development and Data Tools SOURCE DATA Query/Visualization/ Reporting/Analytics AMBARI Databases DATA SERVICES HBASE Files LOAD Servers & Mainframe PIG HCATALOG MAPREDUCE SQOOP JDBC TEZ HADOOP Data Services INTERFACE Governance HDFS SQOOP Java RPC FLUME Web HDFS Sensor data ODBC YARN JMS Queues Social HIVE Exchange JAVA RPC Replication Enterprise Repositories
  23. More about Microsoft and Hortonworks http://hortonworks.com/labs/Microsoft • Get started with Hortonworks Sandbox http://hortonworks.com/hadoop-tutorial/partner-tutorial-microsoft/ • Follow us: @hortonworks @MicrosoftBI • Question & Answer session will be conducted electronically, using the panel to the right of your screen
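
Slide 7 describes Hadoop clusters as providing scale-out storage and distributed data processing, and the reference architecture on slide 22 lists MAPREDUCE among the data services. As a rough illustration of that processing model only, here is a minimal single-process Python sketch of the map, shuffle, and reduce steps applied to a word count. It does not call Hadoop or any Hortonworks or Microsoft API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample log lines are hypothetical and do not come from the slides.

```python
from collections import defaultdict
from itertools import chain


def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split.

    On a real cluster each split would be processed on the node
    that stores it; here a split is just a list of text lines.
    """
    return [(word.lower(), 1) for line in split for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(splits):
    """Run map over every split, then shuffle and reduce the combined output."""
    mapped = chain.from_iterable(map_phase(split) for split in splits)
    return reduce_phase(shuffle(mapped))


# Hypothetical input: two splits standing in for blocks stored on two nodes.
splits = [
    ["server log error", "server log ok"],
    ["sensor error"],
]
counts = word_count(splits)
print(counts)
```

Because each split is mapped independently, adding more nodes (more splits) spreads the map work without changing the reduce logic, which is the scale-out property the slide refers to.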