EDW Optimization Solution

In 2017, more and more corporations are looking to reduce operational overhead in their enterprise data warehouse (EDW) installations. Hortonworks just launched the industry’s first turnkey EDW Optimization solution together with our partners Syncsort and AtScale. Join Hortonworks CTO Scott Gnau to learn more about this exciting solution and its three use cases.

Published in: Technology

  1. Scott Gnau, CTO, @Scott_Gnau. © Hortonworks Inc. 2011 – 2016. All Rights Reserved. © Hortonworks Inc. 2011 – 2017. All Rights Reserved.
  2. The Next Gen EDW is the Big Data Warehouse
     - In Forrester’s 2016 global survey, 59% of respondents stated that leveraging big data and analytics was a critical or high priority.
  3. Companies Are Looking to Big Data for EDW Optimization
     - 82% of 2,550+ respondents are looking to Big Data for EDW Optimization rather than a straight replacement. (2016 Big Data Maturity Survey)
  4. Hortonworks Connected Data Platforms and Solutions
     - Hortonworks Solutions: Enterprise Data Warehouse Optimization; Cyber Security and Threat Management; Internet of Things and Streaming Analytics
     - Hortonworks Connection: Subscription Support; SmartSense; Premier Support; Educational Services; Professional Services; Community Connection
     - Cloud: Hortonworks Data Cloud (AWS); HDInsight
     - Data Center: Hortonworks Data Suite (HDF, HDP)
  5. Drivers of a Modern BI Infrastructure
     - Deeper and Broader Data Sets
     - Complete Data ‘Provenance’
     - Leading Analytics and Tools
     - Integrate non-EDW data and EDW data
     - Total Cost of Ownership
  6. Open Source Transformational Impact to EDW
     - Unmatched Economics: supports low-cost data-center and cloud architectures for Enterprise Apache Hadoop
     - Eliminates Risk and Ensures Integration: prevents vendor lock-in and speeds ecosystem adoption of ODPi-compliant core
     [Chart: cost efficiency vs. data variety. Labels: EDW, Proprietary Hadoop, Hortonworks Open Source, RDBMS]
  7. But, why aren’t more companies running to this solution?
     - Risky
     - Hadoop requires a bunch of new skill sets
     - It’ll take a long time
     - There’s too much manual coding required
     - It’s hard to integrate to my BI tool stack
  8. Legacy EDW vs. EDW Optimization Solution with Connected Data Platforms
  9. EDW Optimization: Fast BI on Hadoop
     The Problem:
       - Legacy EDW systems were adopted for Fast BI and deep slice-and-dice analytics, but EDW costs can limit breadth and depth of these analytics.
     The Solution:
       - Interactive SQL is a reality on Hadoop today.
       - AtScale Intelligence Platform adds OLAP capabilities for deep drilldown at scale.
     The Result:
       - Query terabytes of data in seconds.
       - Connect your favorite BI tools like Tableau and Excel through SQL and MDX interfaces.
       - The EDW Optimization Solution is tailor-made to deliver Fast BI on Hadoop.
     [Diagram: EDW Optimization Solution. Labels: ETL/ELT, Data Mart, Data Landing & Deep Archive, Cube Mart, End Users and Apps]
  10. EDW Optimization: ETL Offload
      The Problem:
        - EDWs can consume between 50% and 90% of resources just on ETL/ELT tasks.
        - These jobs interfere with more business-critical tasks like BI and advanced analytics.
      The Solution:
        - Hive and HDP deliver ETL that scales to petabytes.
        - Syncsort DMX-h provides simple drag-and-drop ETL workflows.
        - Economical scale-out processing on commodity servers.
      The Result:
        - Better SLAs for mission-critical analytics.
        - Limit EDW expansion or retire old systems.
      [Diagram: EDW Optimization Solution. Labels: ETL/ELT, Data Mart, Data Landing & Deep Archive, Cube Mart, End Users and Apps]
  11. EDW Optimization: Active Archive
      The Problem:
        - Increasing data volumes and cost pressure force data to be archived to tape.
        - Archived data is not available for analytics, or must be retrieved at great expense.
      The Solution:
        - Adopting Hadoop delivers cost per terabyte on par with tape backup solutions.
        - Data in Hadoop can be analyzed by all major BI tools, allowing analytics on archive data.
      The Result:
        - Data always available for analytics.
        - Store years of data rather than months.
      [Diagram: EDW Optimization Solution. Labels: ETL/ELT, Data Mart, Data Landing & Deep Archive, Cube Mart, End Users and Apps]
  12. Multi-Channel Behavioral Analysis
      Industry: Mass Media
        - Largest broadcasting and cable company in the world by revenue
        - Multiple channels: cable (set-top box), wireless devices, streaming programming
        - 22 million+ subscribers (internet and video)
      Results:
        - Scalability: 480B rows, 500 nodes
        - 60x query performance improvement
        - Insights: New information improves negotiations
        - Loyalty: Outreach to customers viewing competitive streams; lower churn, higher revenue
      [Before/after diagram: Leading Media Company. Labels: Hortonworks HDP, AtScale Intelligence Server, Netezza Data Mart, Channel Feeds, Tableau + MS Excel + R, Tableau + MS Excel]
  13. Campaign Paid-Search Effectiveness: Retail
      Industry: Retail / eCommerce
        - Top US department store (by revenue)
        - Online sales of $4B+ and growing (11%+ of total)
        - 800+ department stores nationwide
      Results:
        - Scale: Millions of paid keywords analyzed
        - Speed: Eliminate the extract step
        - Insight: Operationalized closed-loop analysis -> insight -> decision -> action
        - Impact: Make and save millions of dollars with instant bid decisions over the 6-week season that drives 60% of annual revenue
      [Before/after diagram: Leading Retailer. Labels: Hortonworks HDP, AtScale Intelligence Server, Vertica Data Marts, Ad & Paid Keywords, Cognos + Tableau + Excel, Tableau + Excel]
  14. Client and Patient Analysis
      Industry: Managed Health Care
        - Member of the Fortune 100
        - Health, life, and other insurance products
        - ~52 million members; medical, dental, and pharmacy
      Results:
        - Scalable: BI directly on data across 264+ nodes
        - Time: Eliminate the data movement step
        - 62x query performance improvement
        - Speed: <2.2 second average query time
        - Insight: Tableau on Hadoop for 1000+
        - Security: Access control by user; HIPAA
      [Before/after diagram: Leading Managed Healthcare Provider. Labels: Hortonworks HDP, AtScale Intelligence Server, Netezza Data Mart, Client / Patient Details, Tableau + MS Excel]
  15. Solution Architecture
      Components:
        - EDW / Legacy source
        - DMX Data Funnel and DMX-h Engine
        - Inbound HDFS (base data and aggregates stored in ORC)
        - Hive (batch and interactive SQL)
        - Hortonworks Data Platform (HDP) multitenant processing: YARN (Syncsort, LLAP, Spark, Tez)
        - AtScale virtual cube
      High Level Flow:
        1. Install HDP, Syncsort and AtScale
        2. Install EDW/Hive drivers on the edge node
        3. Bring all tables involved in the use case into Hive using the Syncsort data funnel
        4. Build a virtual cube using AtScale
        5. Build aggregates in AtScale for optimization
        6. Query data using a BI tool like Tableau or Excel through an ODBC/JDBC connection
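Steps 4 and 5 of the flow on slide 15 rest on one idea: answer BI queries from pre-built aggregates instead of rescanning base rows. The following is a minimal Python sketch of that idea only. The fact rows, column layout, and the functions `build_aggregate` and `query_total` are hypothetical illustrations, not AtScale's or Hive's API; in the solution itself the base data and aggregates are stored in ORC and queried through Hive.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, sales_amount).
FACT_ROWS = [
    ("east", "tv", 100.0),
    ("east", "phone", 50.0),
    ("west", "tv", 70.0),
    ("west", "tv", 30.0),
]


def build_aggregate(rows, key_indexes, measure_index):
    """Pre-compute SUM(measure) grouped by the chosen dimension columns."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[measure_index]
    return dict(totals)


def query_total(aggregate, key):
    """Answer a grouped query from the aggregate; base rows are not rescanned."""
    return aggregate.get(key, 0.0)


# One aggregate per grouping: by region, and by (region, product).
SALES_BY_REGION = build_aggregate(FACT_ROWS, key_indexes=(0,), measure_index=2)
SALES_BY_REGION_PRODUCT = build_aggregate(FACT_ROWS, key_indexes=(0, 1), measure_index=2)

if __name__ == "__main__":
    print(query_total(SALES_BY_REGION, ("west",)))
    print(query_total(SALES_BY_REGION_PRODUCT, ("east", "phone")))
```

The trade-off this sketch makes visible is that each aggregate serves only the groupings it was built for, so a query at a finer grain still needs the full-fidelity base data.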
  16. Hortonworks EDW Optimization Solution Components
      Components:
        - Source Data Systems
        - Syncsort: High-Performance Data Movement
        - Hadoop: Scalable Storage and Compute
        - Hive LLAP: High Performance SQL Data Mart
        - AtScale Intelligence Platform: OLAP Cubes for Higher Performance
      Capabilities:
        - Fast, scalable SQL analytics
        - Intelligent in-memory caching
        - Define OLAP cubes for 10x faster queries
        - Unified semantic layer for all BI tools
        - High performance data import from all major EDW platforms
        - Pre-aggregated data ... Or, full-fidelity data
  17. ETL Workflow Onboarding: Syncsort DMX-h
  18. Hybrid Query Service
      - Choice of BI Tool
      - Zero Client Install
      - Secure Data Access
      - Optimized Queries
  19. Enterprise Data Optimization Solution Components
      12 month license and support offering:
        - Hortonworks: 24 nodes of Enterprise Plus Support
        - Syncsort: 24 nodes of DMX-h
        - AtScale: 24 nodes of AtScale Intelligence Platform
      Pre-packaged Professional Services:
        - Single legacy data source
        - 1 fact table with 5 dimensions
        - Load up to 15 tables
        - One-time data dump
        - Up to 1 cube with 10 measures
        - 1 BI connection
        - 5TB total cube limit
  20. Future Proof
      - Hive Optimizations
        - Hive, Tez, ORC, LLAP
        - Additional SQL coverage
      - ACID MERGE for SQL:2011 compliance (upsert)
      - Business Continuity Options
        - Replication
        - Backup/Restore
      - Additional Hive options in tech preview in 2.6
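The ACID merge (upsert) item on slide 20 refers to merge semantics: rows from a source batch update the target rows they match and are inserted where there is no match. The following minimal Python sketch models that behavior in memory. The function `merge_upsert`, the `id` key column, and the sample rows are hypothetical; this is an illustration of the semantics, not Hive's implementation or API.

```python
def merge_upsert(target, source, key="id"):
    """Return a new list of rows: matched keys are updated, unmatched keys inserted."""
    # Copy each row so the caller's target rows are left untouched.
    merged = {row[key]: dict(row) for row in target}
    for row in source:
        if row[key] in merged:
            merged[row[key]].update(row)  # matched: update existing row
        else:
            merged[row[key]] = dict(row)  # not matched: insert new row
    return list(merged.values())


if __name__ == "__main__":
    customers = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    changes = [{"id": 2, "name": "B"}, {"id": 3, "name": "c"}]
    for row in merge_upsert(customers, changes):
        print(row)
```

Returning a new list rather than mutating `target` keeps the sketch easy to reason about; a transactional engine would instead apply the change atomically to the stored table.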
  21. EDW Package: Professional Services ‘Proof of Value’
      1. Install HDP, AtScale and Syncsort
      2. Configure drivers for appropriate EDW and Hive on Edge Node
      3. Enable and configure Interactive Hive (LLAP)
      4. Ingest data from 1 legacy system
      5. Create up to 3 BI cubes
      6. Support connection to BI Tool
      7. Demo of capabilities (functionality and performance). Under 10 second response time.
      8. Solution Architecture Document and Schema definition
  22. EDW Optimization Solution - Try It Now!
      - Tool-based approach means we can leverage existing skillsets
      - Proof points in 60 days
      - Integrated into my BI tool stack
      - Hive supports scaled queries and fast queries
      - It works!
  23. To Learn More
      - Everyone will receive a free copy of the Forrester white paper titled “The Next-Generation EDW Is The Big Data Warehouse”
      - EDW Optimization with HDP
        - http://hortonworks.com/solutions/edw-optimization/
        - EDW Optimization 7 min video
      - AtScale Intelligence Platform
        - http://hortonworks.com/partner/atscale/
      - Syncsort DMX-h
        - http://hortonworks.com/partner/syncsort/
  24. Hortonworks Connected Data Platforms and Solutions (repeat of slide 4)
  25. Thank You
