Building a Hybrid Modern Data Architecture for Apache Hadoop with Hortonworks and Microsoft
Building a Hybrid Modern Data Architecture for Apache Hadoop with Hortonworks and Microsoft Presentation Transcript

  • 1. © Hortonworks Inc. 2014 Hybrid Modern Data Architecture with Microsoft and Apache Hadoop
  • 2. Your Presenters • Oliver Chiu (twitter name ) – Title – Years of experience – Fun Fact • John Kreisa (@marked_man) – VP Strategic Marketing, Hortonworks – Over 20 years in data management as a developer and a marketer – Avid camper
  • 3. Poll 1: What stage are you at with Hadoop? • Research • Evaluation • Trial • Haven’t started research
  • 4. Today’s Topics • Introduction • What is a Hybrid Modern Data Architecture (MDA)? • Apache Hadoop in the Hybrid MDA • The Hybrid MDA and Microsoft • Q&A
  • 5. © Hortonworks Inc. 2014 Existing Data Architecture • Applications: Business Analytics, Custom Applications, Packaged Applications • Data System: Repositories (RDBMS, EDW, MPP) • Sources: Existing Sources (CRM, ERP, Clickstream, Logs) • Source: IDC: 2.8 ZB in 2012; 85% from New Data Types; 15x Machine Data by 2020; 40 ZB by 2020
  • 6. © Hortonworks Inc. 2014 Modern Data Architecture Enabled • Applications: Business Analytics, Custom Applications, Packaged Applications • Data System: Repositories (RDBMS, EDW, MPP) • Sources: Existing Sources (CRM, ERP, Clickstream, Logs), Emerging Sources (Sensor, Sentiment, Geo, Unstructured) • Operational Tools: Manage & Monitor • Dev & Data Tools: Build & Test
  • 7. Hadoop Powers Modern Data Architecture • Apache Hadoop is an open source project governed by the Apache Software Foundation (ASF) that allows you to gain insight from massive amounts of structured and unstructured data quickly and without significant investment. • Hadoop clusters provide scale-out storage and distributed data processing on commodity hardware. • [Diagram: a Hadoop cluster made up of many compute & storage nodes]
  • 8. 3 Requirements for Hadoop’s Role in the Modern Data Architecture (Requirements for Hadoop Adoption) • Integrated: interoperable with existing data center investments • Key Services: platform, operational and data services essential for the enterprise • Skills: leverage your existing skills in development, operations and analytics
  • 9. © Hortonworks Inc. 2013 Use Cases for the MDA (listed as Industry: Use Case [Type of Data])
    Financial Services: New Account Risk Screens [Text, Server Logs]; Trading Risk [Server Logs]; Insurance Underwriting [Geographic, Sensor, Text]
    Telecom: Call Detail Records (CDRs) [Machine, Geographic]; Infrastructure Investment [Machine, Server Logs]; Real-time Bandwidth Allocation [Server Logs, Text, Social]
    Retail: 360° View of the Customer [Clickstream, Text]; Localized, Personalized Promotions [Geographic]; Website Optimization [Clickstream]
    Manufacturing: Supply Chain and Logistics [Sensor]; Assembly Line Quality Assurance [Sensor]; Crowdsourced Quality Assurance [Social]
    Healthcare: Use Genomic Data in Medical Trials [Structured]; Monitor Patient Vitals in Real-Time [Sensor]
    Pharmaceuticals: Recruit and Retain Patients for Drug Trials [Social, Clickstream]; Improve Prescription Adherence [Social, Unstructured, Geographic]
    Oil & Gas: Unify Exploration & Production Data [Sensor, Geographic & Unstructured]; Monitor Rig Safety in Real-Time [Sensor, Unstructured]
    Government: ETL Offload in Response to Federal Budgetary Pressures [Structured]; Sentiment Analysis for Government Programs [Social]
  • 10. © Hortonworks Inc. 2014 Microsoft in the Modern Data Architecture • Applications: Microsoft Applications (New! Power BI Public Preview) • Data System • Operational Tools • Dev & Data Tools • Infrastructure • Sources: Emerging Sources (Sensor, Sentiment, Geo, Unstructured), Existing Sources (CRM, ERP, Clickstream, Logs)
  • 11. Today’s Topics • Introduction • What is a Hybrid Modern Data Architecture (MDA)? • Apache Hadoop in the Hybrid MDA • The Hybrid MDA and Microsoft • Q&A
  • 12. Hortonworks and Microsoft • Engineering alignment • Corporate alignment • Field alignment
  • 13. End-to-End Data Platform • PDW vNext (PDW + HDInsight) • Windows Azure HDInsight • Hortonworks Data Platform • PDW • SQL Server for DW in Azure • SQL Server
  • 14. Hadoop Solutions From Microsoft • PDW vNext (PDW + HDInsight) • Windows Azure HDInsight • Hortonworks Data Platform
  • 15. Hortonworks Data Platform for Windows Hortonworks Data Platform
  • 16. Parallel Data Warehouse Next w/ HDInsight PDW vNext (PDW + HDInsight)
  • 17. PolyBase • [Diagram: Select … over Hadoop Data and Relational Data, returning a Result Set]
  • 18. Scale out technologies in SQL Server Parallel Data Warehouse
  • 19. Windows Azure HDInsight
  • 20. Master Chief meets Big Data • In-game analysis detects cheaters and improves experience for everyone • Enables targeted campaigns that improve customer retention
  • 21. Hadoop Solutions From Microsoft • PDW vNext (PDW + HDInsight) • Windows Azure HDInsight • Hortonworks Data Platform
  • 22. Hortonworks & Microsoft Reference Architecture • Source Data: JMS Queues, Servers & Mainframe, Files, Databases, Sensor data, Social, Web • Load: Sqoop, Flume, WebHDFS • Hadoop: HDFS, YARN, MapReduce, Tez, Ambari; Data Services: Hive, HBase, Pig, HCatalog • Interface: ODBC, JDBC, Java RPC • Data Services: Governance, Exchange, Replication • Enterprise Repositories (via Sqoop) • Query/Visualization/Reporting/Analytics • Development and Data Tools • Management and Monitoring
  • 23. Question & Answer session will be conducted electronically, using the panel to the right of your screen • More about Microsoft and Hortonworks: http://hortonworks.com/labs/Microsoft • Get started with Hortonworks Sandbox: http://hortonworks.com/hadoop-tutorial/partner-tutorial-microsoft/ • Follow us: @hortonworks @MicrosoftBI
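
Slide 22 names WebHDFS, alongside Sqoop and Flume, as one of the ways to load source data into the cluster. The deck does not show how a client addresses it, so the following is only a minimal sketch of building WebHDFS REST URLs of the form `/webhdfs/v1/<path>?op=...`. The helper name `webhdfs_url`, the host `namenode.example.com`, and the file path are hypothetical, and port 50070 is assumed as the NameNode HTTP port typical of Hadoop 2.x clusters; a real cluster may use a different port and require authentication.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, params=None, port=50070):
    """Build a WebHDFS REST URL (hypothetical helper, not part of any library).

    host/port identify the NameNode HTTP endpoint (50070 is an assumed default),
    path is an absolute HDFS path, and op is a WebHDFS operation name such as
    CREATE, OPEN or LISTSTATUS.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # The operation always goes first, followed by any extra query parameters.
    query = {"op": op}
    query.update(params or {})
    # quote() percent-encodes unsafe characters but leaves "/" separators intact.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(query)}"


# Example: the URL a client would send a PUT to when starting a file upload.
upload = webhdfs_url(
    "namenode.example.com", "/data/logs/app.log", "CREATE", {"overwrite": "true"}
)
print(upload)
```

The sketch only constructs URLs and makes no network calls; in practice a file upload with `op=CREATE` is a two-step exchange in which the NameNode redirects the client to a DataNode that receives the data.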