Apache Hadoop Now Next and Beyond

With the rise of Apache Hadoop, a next-generation enterprise data architecture is emerging that connects the systems powering business transactions and business intelligence. Hadoop is uniquely capable of storing, aggregating, and refining multi-structured data sources into formats that fuel new business insights, and it is fast becoming the de facto platform for processing Big Data. Hadoop started from a relatively humble beginning as a point solution for small search systems. Its growth into an important technology for the broader enterprise community dates back to Yahoo!’s 2006 decision to evolve Hadoop into a system for solving its internet-scale big data problems. Eric will discuss the current state of Hadoop and what is coming from a development standpoint as Hadoop evolves to meet more workloads.

  1. Hadoop Now, Next & Beyond
     Community Driven Enterprise Apache Hadoop
     Eric Baldeschwieler, “Eric14”, Hortonworks CTO, @jeric14
     © Hortonworks Inc. 2013
  2. Quick History: Hadoop at Yahoo!
     Source: http://developer.yahoo.com/blogs/ydn/posts/2013/02/hadoop-at-yahoo-more-than-ever-before/
  3. Hortonworks Approach to Enterprise Hadoop
     Community Driven Enterprise Apache Hadoop
     • Identify and introduce enterprise requirements into the public domain
     • Work with the community to advance and incubate open source projects
     • Apply Enterprise Rigor to provide the most stable and reliable distribution
  4. Making Hadoop Enterprise-Ready
     Hortonworks Data Platform (HDP) stack diagram:
     • Operational Services (Manage & Operate at Scale): Ambari, Oozie
     • Data Services (Store, Process and Access Data): Flume, Hive, Pig, HBase, Sqoop, HCatalog
     • Hadoop Core (Distributed Storage & Processing): MapReduce, HDFS
     • Platform Services (Enterprise Readiness): HA, DR, Snapshots, Security, …
     • Deployment targets: OS / VM, Cloud, Appliance
  5. HCatalog: Table-level Abstractions
     • Consistency of data models across tools (MapReduce, Pig, HBase and Hive)
     • Accessibility: share data as tables inside and out of Hadoop
     Diagram: HCatalog's shared table and schema management opens the platform
     • Before: raw Hadoop data; inconsistent, unknown; tool-specific access
     • After: table access; consistent schema; REST API
  6. Ambari: Provision > Manage > Monitor
     A framework for operating Hadoop…with APIs for integration
     Diagram labels: Provision, Manage, Monitor, Integrate; Ambari; Hadoop Cluster
  7. Ambari: Latest Highlights
     • Job Diagnostics
     • Cluster History
     • Instant Insight
     • Cluster Navigation
     • REST interface
     Screenshot: Apache Ambari Dashboard
  8. See Hadoop > Learn Hadoop > Do Hadoop
     • Hands-on, step-by-step tutorials to learn
     • Full environment to evaluate Hadoop
  9. Hadoop 2.0 Innovations - YARN
     • Focus on scale and innovation
       – Support clusters of 10,000+ computers
       – Extensible to encourage innovation
     • Next generation execution
       – Improves MapReduce performance
     • Supports new frameworks beyond MapReduce
       – Low latency, Streaming, Services
       – Do more with a single Hadoop cluster
     Diagram: MapReduce, Tez, Graph Processing and Other frameworks run on YARN (Cluster Resource Management), which runs on HDFS (Redundant, Reliable Storage)
  10. Tez on YARN: Going Beyond Batch
     • Tez Task
     • Tez Optimizes Execution: new runtime engine for more efficient data processing
     • Always-On Tez Service: low latency processing for all Hadoop data processing
  11. Stinger Initiative
     • Community initiative around Hive
     • Enables Hive to support interactive workloads
     • Enhances Hive’s standard SQL interface for Hadoop
     • Improves existing tools & preserves investments
     Diagram: Execution Engine (Tez) + Query Planner (Hive) + File Format (ORC file) = 100X
  12. Stinger: Make Hive Fly For All BI Needs
     • BI use cases: Parameterized Reports, Enterprise Reports, Dashboard / Scorecard, Visualization, Data Mining
     • More SQL + 100X Faster
     • Interactive and Batch
  13. Knox: Make Hadoop Security Simple
     Diagram labels: Client; {REST}; Knox Gateway; Hadoop Cluster; Authentication & Verification; User Store (KDC, AD, LDAP)
  14. Hortonworks Data Platform 2.0 Alpha 2
     Key New Features
       – Hive performance
       – First distribution to include Tez
     From HDP 2.0 Alpha
       – YARN
       – Full Stack HA
       – Snapshots
       – Disaster Recovery
       – Rolling Upgrades
     Business Value
       – Single Platform, Multiple Use: BATCH, INTERACTIVE, ONLINE
       – Big Data: Transactions, Interactions, Observations
     Available today: http://Hortonworks.com/products
  15. Falcon: Data Lifecycle Management
     • New Apache Incubator Project
     • Introduced by InMobi, Hortonworks and Yahoo!
     • Data Lifecycle Management Framework for Hadoop
     • Configure and Manage Workflows & Policies for:
       – Data Movement
       – Disaster Recovery
       – Data Retention
  16. Join the Community & Get Involved!
     • INNOVATE
     • INTEGRATE
     • COMMUNICATE
     Diagram labels: Open Source, Vendors, End Users
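
Slides 6 and 7 mention that Ambari offers APIs for integration and a REST interface, but the deck shows no request. The sketch below builds, without sending, a request for the cluster list. It assumes the `/api/v1/clusters` path, port 8080 and HTTP Basic authentication described in Ambari's REST documentation; the host name, credentials and the helper name `build_cluster_list_request` are hypothetical, and the release shown in the talk may differ.

```python
import base64
import urllib.request


def build_cluster_list_request(host, user, password, port=8080):
    """Build (but do not send) a GET request for Ambari's cluster list.

    Assumption: the server exposes /api/v1/clusters over plain HTTP on the
    given port and accepts HTTP Basic authentication.
    """
    url = f"http://{host}:{port}/api/v1/clusters"
    # HTTP Basic auth: base64 of "user:password"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    # Hypothetical host and credentials, for illustration only.
    req = build_cluster_list_request("ambari.example.com", "admin", "admin")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. Building and sending are kept separate so the URL and headers can be inspected without a running server.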
