Hadoop - Now, Next and Beyond

Shaun Connolly of Hortonworks presents at the 2012 Big Analytics Roadshow


Published in: Technology

Transcript

  • 1. Apache Hadoop: Now, Next, and Beyond. Shaun Connolly, VP Corporate Strategy, Hortonworks. April 19, 2012. © Hortonworks Inc. 2012
  • 2. Big Data: Transactions + Interactions + Observations
      [Figure: data sources plotted by volume (megabytes, gigabytes, terabytes, petabytes) against increasing data variety and complexity, in groups labeled ERP, CRM, Web, and Big Data.]
      Sources shown in the figure: purchase detail, purchase record, payment record, segmentation, offer details, offer history, customer touches, support contacts, web logs, A/B testing, dynamic pricing, affiliate networks, search marketing, behavioral targeting, dynamic funnels, user generated content, mobile web, sentiment, user click stream, social interactions & feeds, spatial & GPS coordinates, sensors / RFID / devices, external demographics, business data feeds, HD video, audio, images, speech to text, product/service logs, SMS/MMS.
  • 3. What is Apache Hadoop?
      • Collection of open source projects
        – Apache Software Foundation (ASF)
        – Loosely coupled, ship early/often
      • Solution for big data
        – Stores petabytes of data reliably
        – Runs highly distributed applications
        – Enables a rational economics model
        – Powers data-driven business
      Callout: One of the best examples of open source driving innovation and creating a market.
  • 4. Key Hadoop Stack Components
      Core components:
        – HDFS (Hadoop Distributed File System)
        – MapReduce (Distributed Programming Framework)
        – Pig (Data Flow)
        – Hive (SQL-like Access)
        – HBase (Columnar NoSQL Store)
        – Zookeeper (Cluster Coordination)
        – HCatalog (Table & Schema Management)
      Extended components:
        – Ambari & other monitoring & management
        – Oozie & other workflow scheduling
        – Sqoop & other ingest, ETL tools
        – Mahout & other libraries
  • 5. Hadoop Now, Next, and Beyond
      The Apache community, including Hortonworks, is investing to improve Hadoop:
      • Make Hadoop an open, extensible, and enterprise-viable platform
      • Enable more applications to run on Apache Hadoop
      Roadmap stages:
        – “Hadoop.Now” (Hadoop 1.0), HDP 1: most stable Hadoop ever
        – “Hadoop.Next” (Hadoop 0.23), HDP 2: next-gen HDFS & MapReduce
        – “Hadoop.Beyond”: integrate w/ecosystem
  • 6. Unifying Classic & Big Data Methods
      Classic Method: Structured & Repeatable Analysis
        – Business determines what questions to ask
        – IT structures the data to answer those questions
        – SQL: performance and structure
        – “Capture only what’s needed”
      Big Data Method: Multi-structured & Iterative Analysis
        – IT delivers a platform for storing, refining, and analyzing all data sources
        – Business explores data for questions worth answering
        – MapReduce: processing flexibility
        – “Capture in case it’s needed”
  • 7. Unified Big Data Architecture
      Enable Developers, Data Scientists, & Information Workers
      Tools: Java, C/C++, Pig, JavaScript, Python, R, SAS, SQL, Excel, BI Tools, Reporting, etc.
      Capture, Store, Refine, Discover, Analyze, Report, Retain
      Batch:
        • Fast data loading
        • ELT/ETL and refinement
        • Image/video analysis
        • Online retention
      Interactive:
        • Path & pattern analysis
        • Graph analysis
        • Text analysis
        • Iterative discovery
      Active:
        • Operational analysis
        • Transactional analysis
        • High volume ad-hoc
        • Elastic data marts
      Data sources: Audio, Video & Images; Docs & Text; Machine Logs; Coords & Sensors; Social Content; Web & Mobile; CRM; SCM; ERP
  • 8. Hortonworks Vision
      We believe that by the end of 2015, more than half the world's data will be processed by Apache Hadoop.
      Q: How to achieve that vision?
      A: Ecosystem enablement around an enterprise-viable open source data platform
  • 9. • 2-day event (June 13-14, 2012) in San Jose, CA
      • 84 breakout sessions
      • Showcasing real-world examples, developments and best practices of Apache Hadoop
      • Plus, Geoffrey Moore to keynote and more to be announced
      • Register now at: http://www.hadoopsummit.org
  • 10. June 13-14, 2012, San Jose, CA
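Slide 4 lists MapReduce as Hadoop's distributed programming framework without showing what a job looks like. The sketch below illustrates the map, shuffle, and reduce steps as a word count in plain Python, running in a single process. It does not use any Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions made for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key (done by the framework in a real cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the slides.
    sample = ["web logs and click streams", "web logs and sensors"]
    print(word_count(sample))
```

In a real cluster the map and reduce functions would run in parallel on many nodes over data stored in HDFS, and the framework would perform the grouping step across the network; here everything runs in memory only to show the shape of the model.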
