Hadoop for the disillusioned
Usage Rights

© All Rights Reserved

  • Hadoop is not new - NY Times. Source: http://open.blogs.nytimes.com/2007/11/01/self-service-prorated-super-computing-fun/
  • Wired. Source: http://www.wired.com/wired/issue/16-07
  • Source: Gartner Hype Cycle - http://www.gartner.com/technology/research/methodologies/hype-cycle.jsp
    “Big Data is a fad”, “It’s just BI 2.0”, “This is all just hype”, “We can’t figure out how to use it”, “There’s nothing new here”, “It’s not ready”, “Too few support options”, “It’s too hard”
  • - You’re sharding your RDBMS infrastructure and it’s becoming brittle and a nightmare to maintain.
    - Twitter has a good quote where they stated it used to take them 2 weeks to run an ALTER TABLE statement.
  • Using Hadoop for ETL to save money by displacing ETL vendors.
    Using Hive to offload datasets and their corresponding queries from your EDW and lower your EDW bill.
  • A great way to competitively differentiate with arbitrarily structured data.
  • Hadoop’s power is in its single storage repository and its support for arbitrary data structures. You have the technology to ask any question if you just have the data.
  • http://escience.washington.edu/get-help-now/astronomical-image-processing-hadoop
  • http://strataconf.com/stratany2013/public/schedule/detail/30810
  • http://vimeo.com/16861296

Hadoop for the disillusioned Presentation Transcript

  • 1. Hadoop for the disillusioned. Steve Watt, Red Hat, @wattsteve (CC flickr rubenswieringa)
  • 2. @wattsteve
  • 3. Wired Magazine - July 2008
  • 4. Hadoop in 2013 (CC flickr lowfatbrains)
      Platform Layers: Technologies
      Computational Runtimes: YARN, Giraph, MapReduce, HBase, Phoenix, Spark/BDAS, Drill, Impala, Stinger & more
      FileSystems: Azure, CassandraFS, CephFS, CleverSafe, GlusterFS, GridGain, HDFS, Lustre, MapR FS, S3, SWIFT, Quantcast FS, Symantec VCFS & more
      Infrastructures: System on a Chip, x86, Virtualization and Cloud
      Distributions: Cloudera, Hortonworks, IBM, Intel, MapR, WanDisco
  • 5. Source: Gartner Hype Cycle
  • 6. Your data is growing beyond your ability to manage & query it (CC flickr kakadu)
  • 7. Save money when asking the same questions of your data (CC flickr martijnsnels)
  • 8. Hadoop customer: “Great, but now what?”
      Geoffrey Moore’s Technology Adoption Lifecycle: Innovators, Early Adopters, Early Majority, Late Majority, Laggards, with the CHASM marked on the curve
  • 9. new and build data products (CC flickr cbcastro)
  • 10.
      - Ask your domain experts and LOB folks what unanswered questions they have.
      - Where can you get the data you need to answer that question? (Domain experts should know where to get it.)
      - Some of this data may be outside your organization (social media, sensor data, data brokerages/marketplaces, web pages) and some of it may be inside.
      - If the data for the query doesn’t exist, figure out how to instrument or gather it.
      - Pair your domain experts with your data engineers so they can work out how to obtain and massage the data given the types of queries desired.
      (CC flickr birdwatcher63)
  • 11.
      - Building data products is a similar exercise, except that it involves typical product planning, such as identifying a market.
      - This is also a great way for an organization to explore what assets it has within its data.
      (CC flickr syume)
  • 12. Mapping the night sky (CC flickr bobfamiliar)
  • 13. Analyzing farm soil content to predict human conflict (CC flickr oxfam)
  • 14. Crisis Management for the Chilean Earthquake (CC flickr flodigrip)
  • 15. Thanks for listening. Steve Watt, swatt@redhat.com, @wattsteve