Building a Hybrid Modern Data Architecture for Apache Hadoop with Hortonworks and Microsoft
Transcript of "Building a Hybrid Modern Data Architecture for Apache Hadoop with Hortonworks and Microsoft"

  1. © Hortonworks Inc. 2014. Hybrid Modern Data Architecture with Microsoft and Apache Hadoop
  2. Your Presenters • Oliver Chiu (twitter name) – Title – Years of experience – Fun Fact • John Kreisa (@marked_man) – VP Strategic Marketing, Hortonworks – Over 20 years in data management as a developer and a marketer – Avid camper
  3. Poll 1: What stage are you at in looking at Hadoop? • Research • Evaluation • Trial • Haven’t started research
  4. Today’s Topics • Introduction • What is a Hybrid Modern Data Architecture (MDA)? • Apache Hadoop in the Hybrid MDA • The Hybrid MDA and Microsoft • Q&A
  5. Existing Data Architecture. Applications: Business Analytics, Custom Applications, Packaged Applications. Data System: Repositories (RDBMS, EDW, MPP). Sources: Existing Sources (CRM, ERP, Clickstream, Logs). Source: IDC: 2.8 ZB in 2012; 85% from New Data Types; 15x Machine Data by 2020; 40 ZB by 2020.
  6. Modern Data Architecture Enabled. Applications: Business Analytics, Custom Applications, Packaged Applications. Data System: Repositories (RDBMS, EDW, MPP). Sources: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured). Operational Tools: Manage & Monitor. Dev & Data Tools: Build & Test.
  7. Hadoop Powers Modern Data Architecture. Apache Hadoop is an open source project governed by the Apache Software Foundation (ASF) that allows you to gain insight from massive amounts of structured and unstructured data quickly and without significant investment. [Diagram: Hadoop cluster of compute & storage nodes.] Hadoop clusters provide scale-out storage and distributed data processing on commodity hardware.
  8. Requirements for Hadoop’s Role in the Modern Data Architecture. 3 Requirements for Hadoop Adoption: • Integrated: interoperable with existing data center investments • Skills: leverage your existing skills (development, operations, analytics) • Key Services: platform, operational and data services essential for the enterprise
  9. © Hortonworks Inc. 2013. Use Cases for the MDA, listed by industry as use case (type of data):
     • Financial Services: New Account Risk Screens (Text, Server Logs); Trading Risk (Server Logs); Insurance Underwriting (Geographic, Sensor, Text)
     • Telecom: Call Detail Records (CDRs) (Machine, Geographic); Infrastructure Investment (Machine, Server Logs); Real-time Bandwidth Allocation (Server Logs, Text, Social)
     • Retail: 360° View of the Customer (Clickstream, Text); Localized, Personalized Promotions (Geographic); Website Optimization (Clickstream)
     • Manufacturing: Supply Chain and Logistics (Sensor); Assembly Line Quality Assurance (Sensor); Crowdsourced Quality Assurance (Social)
     • Healthcare: Use Genomic Data in Medical Trials (Structured); Monitor Patient Vitals in Real-Time (Sensor)
     • Pharmaceuticals: Recruit and Retain Patients for Drug Trials (Social, Clickstream); Improve Prescription Adherence (Social, Unstructured, Geographic)
     • Oil & Gas: Unify Exploration & Production Data (Sensor, Geographic & Unstructured); Monitor Rig Safety in Real-Time (Sensor, Unstructured)
     • Government: ETL Offload in Response to Federal Budgetary Pressures (Structured); Sentiment Analysis for Government Programs (Social)
  10. Microsoft in the Modern Data Architecture. Diagram layers: Applications (Microsoft Applications; New! Power BI Public Preview), Data System, Operational Tools, Dev & Data Tools, Infrastructure. Sources: Emerging Sources (Sensor, Sentiment, Geo, Unstructured); Existing Sources (CRM, ERP, Clickstream, Logs).
  11. Today’s Topics • Introduction • What is a Hybrid Modern Data Architecture (MDA)? • Apache Hadoop in the Hybrid MDA • The Hybrid MDA and Microsoft • Q&A
  12. Hortonworks and Microsoft • Engineering alignment • Corporate alignment • Field alignment
  13. End-to-End Data Platform • PDW vNext (PDW + HDInsight) • Windows Azure HDInsight • Hortonworks Data Platform • PDW • SQL Server for DW in Azure • SQL Server
  14. Hadoop Solutions From Microsoft • PDW vNext (PDW + HDInsight) • Windows Azure HDInsight • Hortonworks Data Platform
  15. Hortonworks Data Platform for Windows. Hortonworks Data Platform.
  16. Parallel Data Warehouse Next w/ HDInsight. PDW vNext (PDW + HDInsight).
  17. Microsoft Confidential. PolyBase. Diagram labels: Select …; Result Set; Hadoop Data; Relational Data.
  18. Scale-out technologies in SQL Server Parallel Data Warehouse
  19. Windows Azure HDInsight
  20. Master Chief meets Big Data • In-game analysis detects cheaters and improves experience for everyone • Enables targeted campaigns that improve customer retention
  21. Hadoop Solutions From Microsoft • PDW vNext (PDW + HDInsight) • Windows Azure HDInsight • Hortonworks Data Platform
  22. Reference Architecture: Hortonworks & Microsoft. Diagram labels: Development and Data Tools; Management and Monitoring; Hadoop (Ambari, MapReduce, YARN, Tez, HDFS; Data Services: Hive, HBase, Pig, HCatalog); Interface (ODBC, JDBC, Java RPC); Source Data (JMS queues, servers & mainframe, files, databases, sensor data, social); Load (Sqoop, Flume, WebHDFS); Enterprise Repositories; Query/Visualization/Reporting/Analytics; Data Services; Governance; Exchange; Replication; Sqoop.
  23. Question & Answer session will be conducted electronically, using the panel to the right of your screen. More about Microsoft and Hortonworks: http://hortonworks.com/labs/Microsoft. Get started with Hortonworks Sandbox: http://hortonworks.com/hadoop-tutorial/partner-tutorial-microsoft/. Follow us: @hortonworks @MicrosoftBI
