Transform Your Business with Big Data and Hortonworks

Customer insight and marketplace predictions are among the profitable benefits of big data technology. Leading companies are using advanced analytics to find new revenue streams, increase customer satisfaction, and optimize the supply chain.

Published in: Technology

  1. 1. CONSULTING, SOLUTIONS, OUTSOURCING: PARTNER FOR A NEW ERA. Transform Your Business with Big Data and Hortonworks. Tom Kersnick, Pactera, Director Big Data Solutions. Robby Richardson, Hortonworks, Enterprise Account Manager.
  2. 2. Topics: 1 Hortonworks Intro • 2 Who is Hortonworks? • 3 Hortonworks HDP: Enterprise Hadoop Distribution • 4 Hadoop 2.0: The Enterprise Generation • 5 Pactera Intro • 6 Big Data Deep Dive. © Pactera. Confidential. All Rights Reserved.
  3. 3. Hortonworks Snapshot • We distribute the only 100% Open Source Enterprise Hadoop Distribution: Hortonworks Data Platform • We engineer, test & certify HDP for enterprise usage • We employ the core architects, builders and operators of Apache Hadoop • We drive innovation within Apache Software Foundation projects • We are uniquely positioned to deliver the highest quality of Hadoop support • We enable the ecosystem to work better with Hadoop • Develop, Distribute, Support: We develop, distribute and support the ONLY 100% open source Enterprise Hadoop distribution • Endorsed by Strategic Partners • Headquarters: Palo Alto, CA • Employees: 200+ and growing • Investors: Benchmark, Index, Yahoo
  4. 4. Rapid Customer Growth
  5. 5. Hortonworks HDP: Enterprise Hadoop 1.x Distribution. Hortonworks Data Platform (HDP), Enterprise Hadoop: • The ONLY 100% open source and complete distribution • Enterprise grade, proven and tested at scale • Ecosystem endorsed to ensure interoperability. [Diagram: HORTONWORKS DATA PLATFORM (HDP). Hadoop Core: HDFS, MapReduce. Platform Services (Enterprise Readiness): High Availability, Disaster Recovery, Security and Snapshots. Data Services: Hive (HCatalog), Pig, HBase. Load & Extract: Sqoop, Flume, NFS, WebHDFS. Operational Services: Oozie, Ambari. Runs on: OS, Cloud, VM, Appliance.]
  6. 6. Hadoop 2.0: The Enterprise Generation. Business value from big data (transactions, interactions, observations) on a single platform with multiple uses: BATCH, INTERACTIVE, ONLINE. 1.0: Architected for the Large Web Properties. 2.0: Architected for the Broad Enterprise. Enterprise requirements and the Hadoop 2.0 features that address them: • Mixed workloads: YARN • Interactive query: Hive on Tez • Reliability: Full Stack HA • Point in time recovery: Snapshots • Multi data center: Disaster Recovery • ZERO downtime: Rolling Upgrades • Security: Knox Gateway
  7. 7. HDP: Enterprise Hadoop 2.0 Distribution. Hortonworks Data Platform (HDP), Enterprise Hadoop: • The ONLY 100% open source and complete distribution • Enterprise grade, proven and tested at scale • Ecosystem endorsed to ensure interoperability. [Diagram: HORTONWORKS DATA PLATFORM (HDP). Layers: Hadoop Core, Platform Services, Data Services, Operational Services. Platform Services (Enterprise Readiness): High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots. Components shown: HDFS, YARN*, MapReduce, Tez*, Other, Hive & HCatalog, Pig, HBase, Oozie, Ambari, Falcon*. Load & Extract: Sqoop, Flume, NFS, WebHDFS, Knox*. Runs on: OS/VM, Cloud, Appliance.]
  8. 8. Seamless Interoperability with Microsoft Tools • Integrated with Microsoft tools for native big data analysis » Bi-directional connectors for SQL Server and SQL Azure through Sqoop » Excel ODBC integration through Hive • Addressing demand for Hadoop on Windows » Ideal for Windows customers with Hadoop operational experience • Enables most common Hadoop workloads in the Enterprise » Data refinement and ETL offload for high-volume data landing » Data exploration for discovery of new business opportunities » Data enrichment for finely tuned delivery and recommendation engines. [Diagram layers: APPLICATIONS, DATA SYSTEMS, DATA SOURCES. Labels: Microsoft Applications; HORTONWORKS DATA PLATFORM for Windows; MOBILE DATA; OLTP, POS SYSTEMS; Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media).]
  9. 9. Transferring Our Hadoop Expertise to You. The expert source for Apache Hadoop training & certification • World class training programs designed to help you learn fast • Role-based hands-on classes with 50% lab time • Certification to demonstrate Hadoop expertise in Development and Administration • Expert consulting services • Programs designed to transfer knowledge • Industry leading Hadoop Sandbox • Free download • Fastest way to learn Apache Hadoop • Personal, portable Hadoop environment
  10. 10. Hortonworks Summary • Leading the Innovation in Core Hadoop • Addressing the requirements for Enterprise usage • Enabling interoperability of the ecosystem • No lock-in. 100% Open Source. • Best in industry support with flexible pricing model • Find out more: …works.com » www.hortonworks.com/hadoop-training/ » www.hortonworks.com/sandbox
  11. 11. Big Data is Critical. Challenges to Using Big Data: Given that nearly one-third of businesses are in the dark about their available data, it makes sense that silos are the primary hurdle in using this information. • Lack of sharing data is an obstacle to measuring marketing ROI: 51% • Not using data effectively to personalize marketing communications: 45% • Not able to link data together at the individual customer level: 42% • Data collected infrequently or not quickly enough: 39% • Too little or no customer/consumer data: 29%
  12. 12. What Initiatives Are Using Big Data
  13. 13. Obstacles to Define Big Data ROI. Not enough skilled resources for adaptation: • Advance competencies. Traditional IT Architectures cause limitations: • Identifying the right technologies • Adapting to particular needs • Assemble business use cases • Silos. Optimizing Solutions: • Strong internal use cases • Inability to effectively automate data
  14. 14. Keys to a Successful Big Data Initiative. Define the Impact: • Short term vs. long term measures. What cannot be answered today? • This is your starting point. Create User Centric Internal Applications: • Decision support framework. Predicting the Consumer: • Algorithms, Models, Testing, and More Testing!
  15. 15. Solution Architecture using Multiple Ecosystems. [Diagram labels: incoming, outgoing, Real Time In-Memory Solution, EDW, Hadoop, Sandbox, Models, Algorithms, Simulations; numbered callouts 1 to 9 match the steps that follow.] 1. Data feeds into a real-time in-memory solution that will ingest data into EDW, Hadoop, and other platforms such as mobile, APIs, etc. 2. ELT streaming into the in-memory solution to provide visibility to real-time social, mobile, and shell approaches to algorithms, models, and simulations. 3. In-memory real-time solution such as YARN or Storm to digest data to EDW, Hadoop, social media, and other such platforms. 4. EDW for structured information from the sources in 1. 5. Hadoop for semi-structured and unstructured data. Solution architecture including Sandbox availability. 6. Shell UI interfaces utilizing data from the real-time in-memory solution as well as EDW, Hadoop, etc. for models, algorithms, and simulations. 7. Structured and unstructured reporting in reporting interfaces. 8. Deep-dive analytics in Hadoop and real-time streaming. 9. Real-time customer interaction for social and other similar platforms.
  16. 16. Predictive Analysis Use Case for Online Travel Company
  17. 17. Flight Cost by Variants Determination. Data feeds use real-time in-memory streaming to execute matching algorithms, which determine which one-way and round-trip flights users viewed within a session. Predictive analytics algorithms determine how to increase or decrease prices based on views, market pricing, time, and availability. [Diagram labels: incoming (http logs, partners, custom); Real Time In-Memory Solution (Storm); outgoing destinations (rdbms, hadoop, application, mobile).]
  18. 18. Solution Architecture using YARN • Created to manage resource needs across all uses • Ensures predictable performance & QoS for all apps • Enables apps to run “IN” Hadoop rather than “ON” » Key to leveraging all other common services of the Hadoop platform: security, data lifecycle management, etc. [Diagram: Applications Run Natively IN Hadoop. Storage: HDFS2 (Redundant, Reliable Storage). Resource management: YARN (Cluster Resource Management). Applications: BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4,…), GRAPH (Giraph), IN-MEMORY (Spark), HPC MPI (OpenMPI), ONLINE (HBase), OTHER (Search) (Weave…).]
  19. 19. Pactera Big Data Capability. Big Data Solution Architecture: • In-Memory Solutions • Scalable Distributed Platforms. Next Generation Analytics: • Models, Algorithms, and Simulations • Visualization. Improving Operational Ability: • Help companies drive more operational efficiencies from existing investments. • Moving from the realm of data scientists into everyday business transactions and encounters. New Business Processes: • Impact on both customer intelligence and operational efficiency by making everything immediately actionable. • Armed with immediate decision-making capability and intelligence, companies will be able to implement new business processes that will change how business is done. • We ask the Right Questions
  20. 20. How Pactera can help with Big Data. Executive Workshop: Strategies, Planning, and Expectations • Big Data strategy on what tomorrow will look like • Using Big Data to establish market dominance • Big Data project takeaways • Roadblocks to implementing Big Data analytics • Defining an ROI for Big Data • Getting the right ROI on Big Data. Technical Workshop: End-To-End Management • System tuning/auto-tuning and configuration management • Dealing with both structured and unstructured data • Monitoring, diagnosis, and automated behavior detection. Solution Architecture • Processor, memory, and system architectures for data analysis • Benchmarks, metrics, and workload characterization for big data • Availability, fault tolerance and recovery issues • Data management and analytics for vast amounts of unstructured data. Workshop (4 Hours). Proof of Concept (2-4 Weeks). Projects: • Benchmark & Monitoring • Integrations & Migrations • Implementation & Architecture • Project Management • Analytics • Reporting
  21. 21. Thank You. Tom Kersnick: tom.kersnick@pactera.com. Robby Richardson: rrichardson@hortonworks.com
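
Slide 15 describes a real-time in-memory layer that ingests incoming feeds and passes structured information to the EDW and semi-structured or unstructured data to Hadoop. The following is a minimal Python sketch of that routing idea only, not the presenters' implementation. The class name `FeedRouter`, the in-memory lists standing in for the EDW and Hadoop, and the rule that "a flat dictionary of scalar values counts as structured" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List

# Hypothetical rule: values of these types make a record "flat" and therefore structured.
SCALAR_TYPES = (str, int, float, bool)


@dataclass
class FeedRouter:
    """Toy stand-in for the in-memory ingestion layer on slide 15."""

    edw: List[Any] = field(default_factory=list)     # stands in for the EDW
    hadoop: List[Any] = field(default_factory=list)  # stands in for Hadoop

    def ingest(self, record: Any) -> str:
        """Route one incoming record and return the name of its destination."""
        is_structured = isinstance(record, dict) and all(
            isinstance(value, SCALAR_TYPES) for value in record.values()
        )
        if is_structured:
            self.edw.append(record)
            return "edw"
        # Raw log lines, nested documents, free text and similar go to Hadoop.
        self.hadoop.append(record)
        return "hadoop"
```

A production system would replace the two lists with real sinks and would classify records by source or schema rather than by Python type.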

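
Slide 17 says that matching algorithms count flight views within a session and that predictive algorithms raise or lower prices based on views, market pricing, time, and availability. The sketch below shows one way such a rule could be expressed in Python. It is a hand-written heuristic, not a trained predictive model and not the method used in the presentation. Every threshold, weight, field name, and function name here (`count_route_views`, `adjust_price`, `max_change`) is a hypothetical choice for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Route = Tuple[str, str, str]  # (origin, destination, trip_type)


def count_route_views(session_events: Iterable[Dict[str, str]]) -> Counter:
    """Count how many times each route was viewed within one session."""
    return Counter(
        (event["origin"], event["destination"], event["trip_type"])
        for event in session_events
    )


def adjust_price(
    base_price: float,
    views: int,
    market_price: float,
    days_to_departure: int,
    seats_left: int,
    max_change: float = 0.15,
) -> float:
    """Return an adjusted price using illustrative, hard-coded rules."""
    change = 0.0
    if views >= 3:               # repeated views suggest strong interest
        change += 0.05
    if seats_left <= 5:          # low availability
        change += 0.05
    if days_to_departure <= 7:   # close to departure
        change += 0.05
    if base_price > market_price:  # priced above the market
        change -= 0.10
    # Clamp the total adjustment so a single update stays bounded.
    change = max(-max_change, min(max_change, change))
    return round(base_price * (1 + change), 2)
```

In the architecture on the slide, the view counts would come from the streaming layer (Storm) and the adjustment step would be a model fitted to historical data; the fixed rules above only make the inputs and the direction of each effect concrete.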