Hortonworks Presentation at Big Data London

Transcript

  • 1. Hortonworks: Enterprise Apache Hadoop. March 5, 2013. © Hortonworks Inc. 2013 Page 1
  • 2. Hortonworks • Who is Hortonworks • Our Approach • Customer Use Cases
  • 3. Housekeeping Items • Restrooms on 2nd and 4th Floors • Hadoop Summit – March 20-21 in Amsterdam – Pre-conference training on March 18-19 – Discount Code Amst13Spon20 • Download Sandbox – QR Code at postcode on table
  • 4. A Brief History of Apache Hadoop. [Timeline, 2004 to 2013; milestones shown: Apache Project Established; Yahoo! begins to Operate at scale; Hortonworks Data Platform; Enterprise Hadoop (2013).] 2005: Yahoo! creates team under E14 to work on Hadoop (Focus on INNOVATION). 2008: Yahoo team extends focus to operations to support multiple projects & growing clusters (Focus on OPERATIONS). 2011: Hortonworks created to focus on “Enterprise Hadoop”. Starts with 24 key Hadoop engineers from Yahoo (STABILITY).
  • 5. Hortonworks Snapshot. We develop, distribute and support the ONLY 100% open source Enterprise Hadoop distribution. Headquarters: Palo Alto, CA. Employees: 180+ and growing. Investors: Benchmark, Index, Yahoo. Develop • We employ the core architects, builders and operators of Apache Hadoop • We drive innovation within Apache Software Foundation projects. Distribute • We distribute the only 100% Open Source Enterprise Hadoop Distribution: Hortonworks Data Platform • We engineer, test & certify HDP for enterprise usage. Support • We are uniquely positioned to deliver the highest quality of Hadoop support • We enable the ecosystem to work better with Hadoop. Endorsed by Strategic Partners.
  • 6. Hortonworks • Who is Hortonworks • Our approach – Leading Open Source Hadoop innovation – Addressing “Enterprise Hadoop” Requirements – Enabling Interoperability of the Ecosystem – Ensuring No Lock-In: 100% Open Source • Patterns of Use
  • 7. Apache Community Leadership. Apache Software Foundation Guiding Principles • Release early & often • Transparency, respect, meritocracy. Key Roles held by Hortonworkers • VP & PMC Members – Arun Murthy (Hadoop), Daniel Dai (Pig), Mahadev Konar (ZooKeeper) • Release Managers – Matt Foley (Hadoop 1.x), Arun Murthy (Hadoop 2.x), Ashutosh Chauhan (Hive), Daniel Dai (Pig), Alan Gates (HCatalog), Mahadev Konar (Ambari) • Committers – 54 across all Hadoop-related projects. [Diagram: Design & Develop, Test & Patch, Release cycle around Apache Hadoop, Pig, Hive, HBase, HCatalog, Ambari and other Apache projects.] “We have noticed more activity over the last year from Hortonworks’ engineers on building out Apache Hadoop’s more innovative features. These include YARN, Ambari and HCatalog.” (Jeff Kelly, Wikibon)
  • 8. Leadership that Starts at the Core • Driving next generation Hadoop – YARN, MapReduce2, HDFS2, High Availability, Disaster Recovery • 420k+ lines authored since 2006 – More than twice nearest contributor • Deeply integrating w/ecosystem – Enabling new deployment platforms (ex. Windows & Azure, Linux & VMware HA) – Creating deeply engineered solutions (ex. Teradata big data appliance) • All Apache, NO holdbacks – 100% of code contributed to Apache
  • 9. Driving Enterprise Hadoop Innovation. Committers by project (Hortonworks / Cloudera): Hadoop Core 19 / 9; Pig 5 / 1; Hive 1 / 0; HCatalog 5 / 0; HBase 3 / 7; Ambari 14 / 0. [Chart: Lines Of Code By Company, 0% to 100% per project, split among Hortonworks, Yahoo!, Cloudera and Other. Source: Apache Software Foundation.]
  • 10. Hortonworks Process for Enterprise Hadoop. Upstream Community Projects; Downstream Enterprise Product. Virtuous cycle when development & fixed issues done upstream & stable project releases flow downstream. [Diagram: upstream cycle of Design & Develop, Test & Patch, Release around Apache Hadoop, Pig, Hive, HBase, HCatalog, Ambari and other Apache projects; Stable Project Releases flow downstream to Hortonworks Data Platform (Integrate & Test, Package & Certify, Distribute); Fixed Issues flow back upstream.] No Lock-in: Integrated, tested & certified distribution lowers risk by ensuring close alignment with Apache projects.
  • 11. Hortonworks • Who is Hortonworks • Our approach – Leading Open Source Hadoop Innovation – Addressing “Enterprise Hadoop” Requirements – Enabling Interoperability of the Ecosystem – Ensuring NO LOCK-IN: 100% Open Source • Patterns of use
  • 12. Enhancing the Core of Apache Hadoop. Deliver high-scale storage & processing with enterprise-ready platform services. [Diagram: HADOOP CORE (Distributed Storage & Processing); PLATFORM SERVICES (Enterprise Readiness).] Unique Focus Areas: • Bigger, faster, more flexible: Continued focus on speed & scale and enabling near-real-time apps • Tested & certified at scale: Run ~1300 system tests on large Yahoo clusters for every release • Enterprise-ready services: High availability, disaster recovery, snapshots, security, … Hortonworkers are the architects, operators, and builders of core Hadoop.
  • 13. Data Services for Full Data Lifecycle. Provide data services to store, process & access data in many ways. [Diagram: DATA SERVICES (Store, Process and Access Data); HADOOP CORE (Distributed Storage & Processing); PLATFORM SERVICES (Enterprise Readiness).] Unique Focus Areas: • Apache HCatalog: Metadata services for consistent table access to Hadoop data • Apache Hive: Explore & process Hadoop data via SQL & ODBC-compliant BI tools. Hortonworks enables Hadoop data to be accessed via existing tools & systems.
  • 14. Operational Services for Ease of Use. Include complete operational services for productive operations & management. [Diagram: OPERATIONAL SERVICES (Manage & Operate at Scale); DATA SERVICES (Store, Process and Access Data); HADOOP CORE (Distributed Storage & Processing); PLATFORM SERVICES (Enterprise Readiness).] Unique Focus Area: • Apache Ambari: Provision, manage & monitor a cluster; complete REST APIs to integrate with existing operational tools; job & task visualizer to diagnose issues. Only Hortonworks provides a complete open source Hadoop management tool.
  • 15. Deployable Across a Range of Options. Only Hortonworks allows you to deploy seamlessly across any deployment option • Linux & Windows • Azure, Rackspace & other clouds • Virtual platforms • Big data appliances. [Diagram: HORTONWORKS DATA PLATFORM (HDP), comprising OPERATIONAL SERVICES (Manage & Operate at Scale), DATA SERVICES (Store, Process and Access Data), HADOOP CORE (Distributed Storage & Processing) and PLATFORM SERVICES (Enterprise Readiness), deployable on OS, Cloud, VM, Appliance.]
  • 16. HDP: Enterprise Hadoop Distribution. Hortonworks Data Platform (HDP): Enterprise Hadoop • The ONLY 100% open source and complete distribution • Enterprise grade, proven and tested at scale • Ecosystem endorsed to ensure interoperability. [Diagram: HORTONWORKS DATA PLATFORM (HDP), comprising OPERATIONAL SERVICES (Manage & Operate at Scale), DATA SERVICES (Store, Process and Access Data), HADOOP CORE (Distributed Storage & Processing) and PLATFORM SERVICES (Enterprise Readiness), deployable on OS, Cloud, VM, Appliance.]
  • 17. HDP 1.2: Data Services Improvements. Hortonworks Data Platform (HDP): Enterprise Hadoop • The ONLY 100% open source and complete distribution • Enterprise grade, proven and tested at scale • Ecosystem endorsed to ensure interoperability. [Diagram: HORTONWORKS DATA PLATFORM (HDP) with its OPERATIONAL SERVICES, DATA SERVICES and HADOOP CORE layers; components shown: Ambari, Flume, Pig, Hive, HBase, Oozie, Sqoop, HCatalog, WebHDFS, MapReduce, HDFS, YARN (in 2.0); PLATFORM SERVICES (Enterprise Readiness): High Availability, Disaster Recovery, Snapshots, Security, etc…; deployable on OS, Cloud, VM, Appliance.]
  • 18. Latest Hortonworks Announcements: Two releases in January 2013. January 15: Hortonworks Data Platform 1.2. Hortonworks Brings Enterprise Manageability to 100% Open Source Apache Hadoop Distribution. January 22: Hortonworks Sandbox. Hortonworks accelerates Hadoop skills development with an easy-to-use, flexible and extensible platform to learn, evaluate and use Apache Hadoop.
  • 19. Latest Hortonworks Announcements: February 2013. February 20: New Apache projects. Hortonworks fuels open source by releasing three new projects: KNOX / TEZ / STINGER. February 25: HDP available on Microsoft Windows. To help Hadoop adoption, Hortonworks releases HDP on Microsoft Windows.
  • 20. Hortonworks • Who is Hortonworks • Our approach – Leading Open Source Hadoop Innovation – Addressing “Enterprise Hadoop” Requirements – Enabling Interoperability of the Ecosystem – Ensuring No Lock-in: 100% Open Source • Patterns of use
  • 21. Existing Data Architecture. [Diagram layers: APPLICATIONS (Business Analytics, Custom Applications, Enterprise Applications); DATA SYSTEMS (TRADITIONAL REPOS: RDBMS, EDW, MPP); DATA SOURCES (Traditional Sources: RDBMS, OLTP, OLAP; OLTP, POS SYSTEMS); alongside DEV & DATA TOOLS (BUILD & TEST) and OPERATIONAL TOOLS (MANAGE & MONITOR).]
  • 22. An Emerging Data Architecture. [Diagram layers: APPLICATIONS (Business Analytics, Custom Applications, Enterprise Applications); DATA SYSTEMS (HORTONWORKS DATA PLATFORM next to TRADITIONAL REPOS: RDBMS, EDW, MPP); DATA SOURCES (Traditional Sources: RDBMS, OLTP, OLAP; OLTP, POS SYSTEMS; New Sources: web logs, email, sensor data, social media; MOBILE DATA); alongside DEV & DATA TOOLS (BUILD & TEST) and OPERATIONAL TOOLS (MANAGE & MONITOR).]
  • 23. Interoperating With Your Tools. [Diagram: the same layers (APPLICATIONS; DEV & DATA TOOLS; DATA SYSTEMS with HORTONWORKS DATA PLATFORM and TRADITIONAL REPOS; OPERATIONAL TOOLS; DATA SOURCES with Traditional Sources, New Sources, OLTP, POS SYSTEMS and MOBILE DATA), populated with partner products; surviving labels include Microsoft Applications and Viewpoint.]
  • 24. Hortonworks • Who is Hortonworks • Our approach – Leading Open Source Hadoop Innovation – Addressing “Enterprise Hadoop” Requirements – Enabling Interoperability of the Ecosystem – Ensuring No Lock-In: 100% Open Source • Patterns of use
  • 25. Hortonworks • Who is Hortonworks • Our approach • Patterns of use
  • 26. Operational Data Refinery (pattern: Refine, of Refine / Explore / Enrich). Collect data and apply a known algorithm to it in trusted operational process. 1. Capture: Capture all data. 2. Process: Parse, cleanse, apply structure & transform. 3. Exchange: Push to existing data warehouse for use with existing analytic tools. [Diagram layers: APPLICATIONS (Business Analytics, Custom Applications, Enterprise Applications); DATA SYSTEMS (HORTONWORKS DATA PLATFORM; TRADITIONAL REPOS: RDBMS, EDW, MPP); DATA SOURCES (Traditional Sources: RDBMS, OLTP, OLAP; New Sources: web logs, email, sensor data, social media).]
  • 27. Big Data Exploration & Visualization (pattern: Explore, of Refine / Explore / Enrich). Collect data and perform iterative investigation for value. 1. Capture: Capture all data. 2. Process: Parse, cleanse, apply structure & transform. 3. Exchange: Explore and visualize with analytics tools supporting Hadoop. [Diagram layers: APPLICATIONS (Business Analytics, Custom Applications, Enterprise Applications); DATA SYSTEMS (HORTONWORKS DATA PLATFORM; TRADITIONAL REPOS: RDBMS, EDW, MPP); DATA SOURCES (Traditional Sources: RDBMS, OLTP, OLAP; New Sources: web logs, email, sensor data, social media).]
  • 28. Application Enrichment (pattern: Enrich, of Refine / Explore / Enrich). Collect data, analyze and present salient results for online apps. 1. Capture: Capture all data. 2. Process: Parse, cleanse, apply structure & transform. 3. Exchange: Incorporate data directly into applications. [Diagram layers: APPLICATIONS (Custom Applications, Enterprise Applications); DATA SYSTEMS (HORTONWORKS DATA PLATFORM; TRADITIONAL REPOS: RDBMS, EDW, MPP, NOSQL); DATA SOURCES (Traditional Sources: RDBMS, OLTP, OLAP; New Sources: web logs, email, sensor data, social media).]
  • 29. Key 2013 “Enterprise Hadoop” Initiatives. Invest In: – Platform Services – DR, Snapshot, … – Data Services – In support of Refine, Explore, Enrich – Operational Services – Manageability, Security, … [Diagram: initiatives placed around the HORTONWORKS DATA PLATFORM (HDP) layers (OPERATIONAL SERVICES, DATA SERVICES, HADOOP CORE, PLATFORM SERVICES): Tez / “Stinger” (Interactive Query); Ambari (Manage & Operate); HBase (Online Data); “Gateway” (Secure Access); “Herd” (Data Integration); “Continuum” (Biz Continuity).]
  • 30. Stinger: Make Hive Best for All Needs. Interactive (5s – 1m) • Parameterized Reports • Drilldown • Visualization • Exploration. Non-Interactive (1m – 1h) • Data preparation • Incremental batch processing • Dashboards / Scorecards. Batch (1h+) • Operational batch processing • Enterprise Reports • Data Mining. [Axis label: Data Size.] Improve Latency & Throughput • Query engine improvements • New “Optimized RCFile” column store • Next-gen runtime (elim’s M/R latency). Extend Deep Analytical Ability • Analytics functions • Improved SQL coverage • Continued focus on core Hive use cases
  • 31. Flexible Support Subscription Programs. Leverage Hortonworks Expertise: Subscription and Support delivered and backed by Hadoop experts; subscriptions based on nodes or storage • Developer Support: 12 x 5, Web only; All Sev: 1 business day; 1 seat; Application Code Review, Design Advice; “How to” guidance for developers and archs • Enterprise Support: 24 x 7, Phone & Web; Sev 1: 1 Hour, Sev 2: 4 Bus Hour; 5 Contacts; Patches & Updates; Cluster Design, Install, Maintain, Performance; Operations support for critical clusters • Additional Options – Standard Support: 12 x 5, Web only; All Sev: 1 business day; 3 Contacts; Patches & Updates; Cluster Design, Install, Maintain, Performance; Operations support for dev & test clusters – Essential Support*: 12 x 5, Web only; All Sev: 1 business day; 3 Contacts; Patches & Updates; Cluster Design, Install, Maintain, Performance; Operations support for small research clusters. * Limited in size and no expansion
  • 32. Hortonworks: Best In Class Hadoop Support • Experienced enterprise support team – Experience supporting enterprise clients in production – Core engineers have real operational experience: built and supported 44+K nodes in production – Extensive experience in commercial big data offerings including HDP, MapR, Karmasphere • Global 24x7 operation – support based in Sunnyvale, UK & India • Stringent case management processes ensure high quality customer service & responsiveness
  • 33. Transferring Our Hadoop Expertise to You. The expert source for Apache Hadoop training & certification • World class training programs designed to help you learn fast – Role-based hands-on classes with 50% lab time • Expert consulting services – Programs designed to transfer knowledge • Industry leading Hadoop Sandbox program – Fastest way to learn Apache Hadoop – Multi-level tutorials for wide applicability – Customizable and updateable
  • 34. Summary • Leading the Innovation in Core Hadoop • Addressing the requirements for Enterprise usage • Enabling interoperability of the ecosystem • No lock-in. 100% Open Source. • Best in industry support with flexible pricing model • Find out more – www.hortonworks.com – http://hortonworks.com/hadoop-training/
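
The Capture, Process, Exchange steps named on slides 26 to 28 can be illustrated with a small sketch. This is not code from the deck and it does not use Hadoop: the log format, the function names (`capture`, `process`, `exchange`) and the in-memory list standing in for a data warehouse table are all assumptions made purely for illustration of the Refine pattern's three stages.

```python
import re
from collections import Counter

# Hypothetical raw web-log lines; the layout is an assumption for illustration.
RAW_LOGS = [
    "2013-03-05 10:00:01 GET /products/42 200",
    "2013-03-05 10:00:02 GET /products/42 200",
    "malformed line",
    "2013-03-05 10:00:03 GET /cart 500",
]

# date, time, method, path, three-digit status
LOG_PATTERN = re.compile(r"^(\S+) (\S+) (\S+) (\S+) (\d{3})$")


def capture(lines):
    """Step 1, Capture: keep every raw record, including malformed ones."""
    return list(lines)


def process(raw):
    """Step 2, Process: parse, cleanse, apply structure and transform."""
    hits = Counter()
    for line in raw:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # cleanse: skip lines that do not parse
        date, _time, _method, path, status = match.groups()
        if status == "200":
            hits[(date, path)] += 1  # transform: successful hits per day and path
    return [{"date": d, "path": p, "hits": n} for (d, p), n in sorted(hits.items())]


def exchange(rows, warehouse):
    """Step 3, Exchange: push refined rows to the warehouse (a list stands in here)."""
    warehouse.extend(rows)
    return len(rows)


warehouse_table = []
loaded = exchange(process(capture(RAW_LOGS)), warehouse_table)
print(loaded, warehouse_table)
```

In a real deployment each stage would be backed by cluster tooling rather than plain functions; the sketch only shows how raw data is kept in full at capture time while structure is applied later, before anything reaches the warehouse.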