Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: Today's ETL Does it All!


So you built your Hadoop cluster. How do you get data from hundreds of database tables, streaming Kafka sources, and data shared by 20-year-old COBOL programs into it, and have it all working together quickly, efficiently and securely? With many customers asking this same question, Hortonworks recently expanded its partnership with Syncsort to provide optimized ETL onboarding for Hadoop. During this talk, we'll discuss how a next-generation ETL tool, built on contributions to the open source community and natively integrated in Hadoop, can drive lasting value for your organization:

1) Seamlessly onboard data from all your enterprise sources, batch and streaming, into Hadoop for fast and easy analytics.
2) Stay agile and simplify your environment with a "design once, deploy anywhere" approach that minimizes disruption and risk in the face of a rapidly evolving big data ecosystem.
3) Secure, govern and manage your data through full integration with Apache Ambari, Apache Ranger, and more.

These benefits come to life with real customer case studies. Learn how a national insurance company and a global hotel chain are using Hortonworks HDP and Syncsort DMX-h to get bigger insights from their enterprise data securely, efficiently, and cost-effectively, without spending hundreds of man-hours.

Published in: Software

  1. Powering the Connected Data Platform With ETL Onboarding. @Scott_Gnau, CTO, Hortonworks; @TenduYogurtcu, Big Data GM, Syncsort
  2. Global Leader in Big Iron to Big Data Solutions
     • Provider of enterprise software and leader in Big Iron to Big Data solutions in more than 85 countries around the world
     • Global presence in 87% of enterprise Fortune 500 companies
     • High performance & scalable software harnessing valuable data assets to power business and operational analytics, while dramatically reducing the cost of mainframe and legacy systems
     • Unique focus on customer value through cost-effective solutions and unparalleled support; trusted leader for nearly 50 years
     • Global customer base of leaders and emerging businesses across all major industries
     • Strategic partnerships in Big Iron and Big Data ecosystems
     • Map labels: Woodcliff Lake, NJ; Japan; Singapore
     Syncsort Confidential and Proprietary - do not copy or distribute
  3. Meet Today’s Presenters: Scott Gnau, CTO, Hortonworks; Tendu Yogurtcu, PhD, GM, Big Data, Syncsort
  4. Open and Connected Data Platforms: The Future of the Enterprise is About All Data. Diagram labels: Data at Rest, Data in Motion, Actionable Intelligence, Modern Data Applications. © Hortonworks Inc. 2011 – 2016. All Rights Reserved
  5. The Shift to the Modern Data Architecture
     • Modern Data Architecture: ALL Data (Data-at-Rest & Data-in-Motion); Cloud & Data Center; Powered by Open Source
     • Big Data Analytics & IoT, Next Generation Data Use-Cases: Predictive Retail; Factory Automation; Connected Cars; Predictive Analytics; Artificial Intelligence
     • Diagram labels: System-centric, User-centric, Relational Database, Mainframe, Client/Server, Web & SaaS, IDMS, Data at Rest, Data in Motion, Actionable Intelligence, Modern Data Applications
  6. Connected Data Platforms Enable Enterprise Transformations. Diagram labels: Data in Motion, Data at Rest, Machine Learning, Deep Historical Analysis, Cloud, Data Center, Stream Analytics, Edge Data, Edge Analytics
  7. Data is the new Raw Material for Commerce (All Data)
     • Easy Onboarding of New Data from New Sources
     • Access to Data from Legacy Systems and Apps
     • Successful Modern Data Apps
     • New Business and Revenue models
  8. Data – Raw Material for Advanced Analytics
  9. Syncsort Makes ALL Data Accessible & Usable – Ready for Analytics
  10. Our Strategy: Simplify Big Data Integration
      • Deploy on premises or in the cloud
      • Choose among multiple execution frameworks – Hadoop, Spark, Linux, Unix, Windows
      • Integrate streaming and batch data with a single data pipeline for innovative applications, like IoT
      • Future-proof applications to avoid re-writing jobs in order to take advantage of innovations in new execution frameworks
      • Access and integrate ALL enterprise data sources – including mainframe – for advanced analytics
  11. Three Commitments Underpin Our Big Data Integration Strategy
      • (1) Ongoing Contributions to the Open Source Community: JIRA: MAPREDUCE-2454, MAPREDUCE-4807, MAPREDUCE-4049, MAPREDUCE-5455, HIVE-8347, SQOOP-1272, PARQUET-134, Spark-packages and more!
      • (2) Leverage Syncsort Technology Innovations & Mainframe Heritage: Light footprint; Self-tuning engine; Single install, no 3rd party dependencies; World-class data processing, mainframe expertise
      • (3) Strong Partnerships with Strategic Big Data & Hadoop Players
  12. ETL Onboarding with Syncsort
  13. Insurance: Easy Access to ALL Data for Better Analytics
      • Challenge: Needed hard-to-access operational data for advanced analytics
      • Solution:
        • Quickly load ~1000 database tables into HDP with the click of a button
        • Access & integrate complex Mainframe VSAM files, data from DB2/z, Oracle & SQL Server
        • Track changes & keep data up to date
      • Benefits:
        • Insight: Better and faster analytics
        • Agility: Reclaim development time; single tool to ingest, detect changes and populate the data lake
        • Compliance: Build audit trails, keep EDW current
        • Productivity: No need for deep understanding of Hadoop
  14. Leading Media Company: Accelerate New Business Initiatives
      • Challenge: Build scalable platform to support new business initiatives & scale for double-digit data growth, while reducing escalating EDW & ELT costs
      • Solution:
        • Shift data storage & processing out of the EDW into Hadoop
        • Migrate 500+ SQL ELT workloads to DMX-h on HDP
      • Benefits:
        • Agility: Scalable architecture to deploy new business initiatives – analyze more set top box data, blend website user activity data, etc.
        • Cost: Millions of dollars in savings from EDW, including SQL tuning & maintenance costs
        • Productivity: ETL developers can stop coding & tuning, and get up & running on Hadoop quickly
  15. Hotel Chain: Ease of Use, Timely & Up-to-Date Reporting
      • Challenge: More timely collection & reporting on room availability, event bookings, inventory and other hotel data from 4,000+ properties globally
      • Solution:
        • Near real-time reporting
        • DMX-h consumes property updates from Kafka every 10 seconds
        • DMX-h processes data on HDP, loading to TD every 30 minutes
        • Deployed on Google Cloud Platform
      • Benefits:
        • Time to Value: DMX-h ease of use drastically cut development time
        • Agility: Reports updated every 30 minutes vs every 24 hours
        • Productivity: Leveraging ETL team for Hadoop (Spark), visual understanding of data pipeline
        • Insight: Up-to-date data = better business decisions = happier customers
  16. Syncsort DMX-h: Benefits to Business
      • Faster Time to Value: Faster & better insights with readily-accessible data
      • Compliance: Secure data access, ability to build audit trails
      • Increased Productivity:
        • Reclaim development time by automating, optimizing and future-proofing development
        • Across platforms, on premises and in the cloud
      • Cost:
        • Lower archival costs
        • Reduced development time
        • Reduced Total Cost of Ownership, higher ROI
  17. See For Yourself! Take a 30-day Free Trial @