Is There Room For Another Elephant In Tucson


Would you like to scale data-intensive tasks horizontally? Would you like an open source project that gave you that foundation?

Well, there is: Apache Hadoop. It's a Java software framework for supporting data-intensive distributed applications. The framework was inspired by Google papers on their MapReduce framework and Google File System.

Who uses Hadoop? Here's a short list: Yahoo!, LinkedIn, Facebook, ImageShack, eHarmony, Hulu, and The New York Times. The highest-profile user, Yahoo!, is also a major contributor to the project. They use it extensively in their web search and advertising divisions.

In this talk, titled "Is there room for another elephant in Tucson?", Andrew Lenards will tell us about Hadoop and describe how it could be applied to several practical problems, even if you aren't as big as Google.

Published in: Technology

  1. 1. Is There Room for Another Elephant in Tucson? Tucson Java User Group Andrew Lenards
  2. 2. andrew.lenards@gmail <ul><li>UA grad, Dec 2001 </li></ul><ul><li>former teaching assistant UA CS </li></ul><ul><li>former instructor UA CS </li></ul><ul><li>reformed .NET developer </li></ul><ul><li>10 years on/off coding Java  </li></ul><ul><li>Co-founder UA Student ACM </li></ul><ul><li>Active in:  </li></ul><ul><ul><li>Tucson Java User Group </li></ul></ul><ul><ul><li>Tucson Startup Drinks </li></ul></ul><ul><li>Semi-active in:  </li></ul><ul><ul><li>Tucson Free Unix Group </li></ul></ul><ul><ul><li>Ubuntu Arizona Local Community </li></ul></ul>
  3. 3. Why do I care? <ul><li>Hadoop holds the TeraSort, MinuteSort, GraySort benchmarks...  </li></ul><ul><li>  </li></ul><ul><li>Captured TeraSort in 2008 </li></ul><ul><li>Captured MinuteSort, GraySort in 2009 </li></ul><ul><li>Metrics: </li></ul><ul><li>GraySort : Sort rate (TBs / minute) achieved while sorting a very large amount of data (currently 100 TB minimum).  </li></ul><ul><li>MinuteSort : Amount of data that can be sorted in 60.00 seconds or less. </li></ul><ul><li>TeraSort *: Elapsed time to sort 10^12 bytes of data.  </li></ul><ul><li>  </li></ul><ul><li>* Now deprecated </li></ul>
  4. 4. Establishing our setting...
  5. 5. Differing World Views
  6. 6. Brendan's quote... <ul><ul><li>&quot;When a computer fails, they don't bother trying to find it and fix it.  They just add more machines.&quot; </li></ul></ul><ul><ul><ul><ul><li>-- former UA student, former Google intern, circa 2000 </li></ul></ul></ul></ul>
  7. 7. End of Moore's Law?
  8. 8. End of Moore's Law?
  9. 9. Scaling Up Endangered Practice? Polar Bear Endangered Species
  10. 10. DATA!
  11. 11. DATA!
  12. 12. Zettabytes??? <ul><ul><li>The New York Stock Exchange generates about one terabyte of new trade data per day. </li></ul></ul><ul><ul><li>Facebook hosts approximately 10 billion photos, taking up one petabyte of storage </li></ul></ul><ul><ul><li> stores around 2.5 petabytes of data. </li></ul></ul><ul><ul><li>The Internet Archive stores around 2 petabytes of data, and is growing at a rate of 20 terabytes per month </li></ul></ul><ul><ul><li>The Large Hadron Collider will produce about 15 petabytes of data per year </li></ul></ul><ul><li>The &quot;digital universe&quot; is estimated to be 1.8 zettabytes by 2011 </li></ul><ul><li>Source: &quot;Hadoop: The Definitive Guide, Tom White&quot; </li></ul>
  13. 14. MapReduce: the Abstraction
  14. 15. MapReduce <ul><li>Introduced in a 2004 Google paper: </li></ul><ul><ul><li>&quot;MapReduce: Simplified Data Processing on Large Clusters&quot; </li></ul></ul><ul><li>  </li></ul><ul><li>&quot;Structured as functional programming meets distributed processing&quot; (Aaron Kimball, Cloudera) </li></ul><ul><li>Designed for batch processing, not for interactive use </li></ul>
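  15. 16. MapReduce + RDBMS, not versus <table><tr><th></th><th>Traditional RDBMS</th><th>MapReduce</th></tr><tr><td>Data Size</td><td>Gigabytes</td><td>Petabytes</td></tr><tr><td>Access</td><td>Interactive & batch</td><td>Batch</td></tr><tr><td>Updates</td><td>Read and write many times</td><td>Write once, read many times</td></tr><tr><td>Structure</td><td>Static schema</td><td>Dynamic schema</td></tr><tr><td>Integrity</td><td>High</td><td>Low</td></tr><tr><td>Scaling</td><td>Nonlinear</td><td>Linear</td></tr></table> Source: &quot;Hadoop: The Definitive Guide, Tom White&quot;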
  16. 17. Shared-state makes everything hard... <ul><li>Sharing state requires communication mechanisms between processes (which we know complicates things). </li></ul><ul><li>The MapReduce abstraction limits communication to keep the benefits of independent, parallel work.  </li></ul><ul><li>Mappers do not need to communicate with each other </li></ul><ul><li>  </li></ul><ul><li>Reducers do not need to communicate with each other </li></ul>
  17. 18. Shared Nothing Architecture <ul><li>Introduced in a 1986 paper by Michael Stonebraker on distributed computing architectures, but it also applies to large-scale web applications. </li></ul><ul><li>  </li></ul><ul><li>  </li></ul><ul><li>  </li></ul><ul><li>Note: Stonebraker was co-author of  &quot;MapReduce: A Major Step Backwards&quot; in January 2008. </li></ul>
  18. 19. Functional inspiration, but not dogmatic <ul><li>Functions w/ no side-effects are pure functions </li></ul><ul><ul><li>Map is an n-to-n operation  </li></ul></ul><ul><ul><li>Fold is an n-to-1 operation (often called a &quot;reduce&quot;) </li></ul></ul><ul><li>    </li></ul><ul><li>With MapReduce, we define a problem in Mappers & Reducers </li></ul><ul><li>However, a Mapper can produce more than 1 key per element.  And a Reducer may produce many values.  So the abstraction is not married to the functional model. </li></ul>
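The map/fold distinction on this slide can be sketched in plain Python. This is only an illustration of the functional idea, not Hadoop code; the word list and the `char_mapper` helper are made up for the example.

```python
from functools import reduce

words = ["hadoop", "map", "reduce"]

# Map: n inputs -> n outputs, one pure function applied to each element.
lengths = list(map(len, words))

# Fold (often called "reduce"): n inputs -> 1 output via an accumulator.
total = reduce(lambda acc, n: acc + n, lengths, 0)

# A MapReduce-style mapper is looser than a functional map:
# a single input element may yield several (key, value) pairs.
def char_mapper(word):
    return [(ch, 1) for ch in word]

pairs = char_mapper("map")

print(lengths, total, pairs)
```

The last function is where MapReduce departs from the strict n-to-n model: one word goes in, three pairs come out.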
  19. 20. Partitioning work... <ul><li>Scaling out horizontally with MapReduce is done by breaking large files into chunks (or blocks) and bringing computation to the data (data locality).   </li></ul><ul><li>  </li></ul><ul><li>The &quot;blocks&quot; are the input to Mappers, so work partitioning is implicit in the system. </li></ul>
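A toy sketch of implicit work partitioning, assuming a tiny in-memory "file" and a 10-byte block size chosen only for illustration (real HDFS blocks are tens of megabytes or more). The function names are hypothetical, not part of any Hadoop API.

```python
def split_into_blocks(data: bytes, block_size: int):
    """Cut a byte string into fixed-size blocks; the last one may be short."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def count_newlines(block: bytes) -> int:
    """Stand-in for a mapper: does its work using only its own block."""
    return block.count(b"\n")

data = b"line one\nline two\nline three\n"
blocks = split_into_blocks(data, block_size=10)

# Each block could be processed on a different machine; the partial
# results are combined afterwards.
partial_counts = [count_newlines(b) for b in blocks]
total_lines = sum(partial_counts)

print(len(blocks), partial_counts, total_lines)
```

Counting newlines works per block because no record has to be read whole. A real job also has to deal with records that straddle a block boundary, which this sketch ignores.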
  20. 21. Raising the level of abstraction <ul><li>MapReduce allows you to focus on the problem and let the library deal with the messy details </li></ul><ul><li>  </li></ul><ul><li>An understanding of the high-level domain and the low-level details does not need to exist within the same person anymore. </li></ul>
  21. 22. MapReduce Usage
  22. 23. Example usage... <ul><ul><li>Distributed Grep </li></ul></ul><ul><ul><li>Word Count / Count URL Frequency </li></ul></ul><ul><ul><li>Inverted Index </li></ul></ul><ul><ul><li>Term-Vector per Website </li></ul></ul><ul><ul><li>Reverse Web-Link Graph </li></ul></ul>
  23. 24. Apache Web Server Logfiles <ul><li>Suppose we want to do a simple analysis of visits per host. </li></ul><ul><li>An abstract view of the inputs and outputs would be: </li></ul><ul><li><k1, v1> -> Mapper -> <k2, v2> -> Reducer -> <k3, v3> </li></ul><ul><li>or </li></ul><ul><li>(<line-number>, <line>) --> Mapper --> (<hostname>, 1) </li></ul><ul><li>  </li></ul><ul><li>(<hostname>, 1) --> Reducer --> (<hostname>, count) </li></ul>
  24. 25. - - [16/Aug/2009:04:40:36 -0700] &quot;GET /tree/home.pages/searchTOL?taxon=Arna&Submit2=Find&startline=26 HTTP/1.1&quot; 200 14693 &quot;-&quot; &quot;Mozilla/5.0 (compatible; Googlebot/2.1; +;
      - - [16/Aug/2009:04:40:36 -0700] &quot;GET /onlinecontributors/img/quicknav/RightArrow.png HTTP/1.1&quot; 200 321 &quot;; &quot;Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6)&quot;
      localhost.localdomain - - [16/Aug/2009:04:40:36 -0700] &quot;GET /onlinecontributors/app?service=external&page=ViewBranchOrLeaf&sp=SSiphonophorida&sp=S8149&sp=S HTTP/1.1&quot; 200 5894 &quot;-&quot; &quot;Mozilla/5.0 (compatible; Yahoo! Slurp/3.0;;
      - - [16/Aug/2009:04:40:36 -0700] &quot;GET /Siphonophorida/8149 HTTP/1.0&quot; 200 5894 &quot;-&quot; &quot;Mozilla/5.0 (compatible; Yahoo! Slurp/3.0;;
      localhost.localdomain - - [16/Aug/2009:04:40:36 -0700] &quot;GET /onlinecontributors/app?service=external&page=ViewBranchOrLeaf&sp=SCampanulotes&sp=S73605&sp=S HTTP/1.1&quot; 200 8571 &quot;-&quot; &quot;Mozilla/4.0&quot;
      - - [16/Aug/2009:04:40:36 -0700] &quot;GET /Campanulotes/73605 HTTP/1.1&quot; 200 8571 &quot;-&quot; &quot;Mozilla/4.0&quot;
      - - [16/Aug/2009:04:40:36 -0700] &quot;GET /tree/img/magnify.gif HTTP/1.1&quot; 200 124 &quot;http://www.; &quot;Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6)&quot;
      localhost.localdomain - - [16/Aug/2009:04:40:37 -0700] &quot;GET /onlinecontributors/app?service=external&page=ViewBranchOrLeaf&sp=SHomo&sp=S16418&sp=S HTTP/1.1&quot; 200 10283 &quot;-&quot; &quot;Mozilla/5.0 (compatible; Ask Jeeves/Teoma; + &quot;
      - - [16/Aug/2009:04:40:37 -0700] &quot;GET /Homo/16418 HTTP/1.0&quot; 200 10283 &quot;-&quot; &quot;Mozilla/5.0 (compatible; Ask Jeeves/Teoma; +;
      - - [16/Aug/2009:04:40:37 -0700] &quot;GET /tree/img/tinylink.png HTTP/1.1&quot; 200 207 &quot;http://www; &quot;Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6)&quot;
      - - [16/Aug/2009:04:40:37 -0700] &quot;GET /onlinecontributors/app?page=ImageGallery&service=external&sp=l27570&state:ImageGallery=ZH4sIAAAAAAAAAFvzloG1nJeBgYGJgYEtLz8l1TOluIiBLyuxLFEvJzEvXc8nPy%2FduvvJhDP9yveZGBi9GFjLEnNKUyuKGAQQivxKc5NSi9rWTJXlnvKgG2hURQEDGGRfKhdgYODNTU3JTHTOSSwu9swrAZoviNAKFEhNTy0SerRgyffGdgugFZ4wKwoZ6hgYQaYAAKhZ4XSlAAAA HTTP/1.1&quot; 200 7899 &quot;-&quot; &quot;msnbot/1.1 (+;
  25. 26. Input to the Mappers... <ul><li>(0, &quot; - - [16/Aug...&quot;) </li></ul><ul><li>(1, &quot;localhost.localdomain - - [16/Aug/2009:04:40...&quot;) </li></ul><ul><li>(2, &quot; - - [16/Aug/2009:04:40:3...&quot;)  </li></ul><ul><li>(3, &quot; - - [16/...&quot;) </li></ul><ul><li>(4, &quot; ...&quot;) </li></ul><ul><li>  ... </li></ul>
  26. 27. Output from Mappers, Input to the Reducers... <ul><li>(&quot;com.googlebot.crawl-66-249-71-34&quot;, 1) </li></ul><ul><li>(&quot;com.googlebot.crawl-66-249-71-34&quot;, 1) </li></ul><ul><li>(&quot;com.ask.crawler5108&quot;, 1) </li></ul><ul><li>(&quot;com.ask.crawler5108&quot;, 1) </li></ul><ul><li>(&quot;;, 1) </li></ul><ul><li>(&quot;;, 1) </li></ul><ul><li>(&quot;;, 1) </li></ul><ul><li>(&quot;;, 1) </li></ul><ul><li>(&quot;fr.wanadoo.abo.w81-248.astdenis-105-1-18-94&quot;, 1) </li></ul>
  27. 28. Output from Reducers... <ul><li>(&quot;com.ask.crawler5108&quot;, 2)  </li></ul><ul><li>(&quot;com.googlebot.crawl-66-249-71-34&quot;, 2)  </li></ul><ul><li>(&quot;;, 3) </li></ul><ul><li>(&quot;;, 1) </li></ul><ul><li>(&quot;fr.wanadoo.abo.w81-248.astdenis-105-1-18-94&quot;, 1) </li></ul><ul><li>... </li></ul>
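The visits-per-host job traced across the previous slides can be simulated on one machine. This is a minimal sketch, not the code behind the talk: `run_job` stands in for the framework's shuffle step, the sample hostnames are invented, and reversing the dotted hostname is an assumption inferred from keys such as "com.googlebot.crawl-66-249-71-34" on the slides.

```python
from collections import defaultdict

def mapper(line_number, line):
    """Emit (reversed-hostname, 1) for one access-log line."""
    host = line.split(" ", 1)[0]          # first field of the log line
    reversed_host = ".".join(reversed(host.split(".")))
    yield reversed_host, 1

def reducer(hostname, counts):
    """Sum the 1s emitted for a single hostname."""
    yield hostname, sum(counts)

def run_job(lines):
    """Group mapper output by key, then feed each group to the reducer."""
    groups = defaultdict(list)
    for line_number, line in enumerate(lines):
        for key, value in mapper(line_number, line):
            groups[key].append(value)
    results = []
    for key in sorted(groups):
        results.extend(reducer(key, groups[key]))
    return results

# Hypothetical log lines; the hosts are placeholders.
sample_log = [
    'crawl-1.example.com - - [16/Aug/2009:04:40:36 -0700] "GET /a HTTP/1.1" 200 10',
    'localhost.localdomain - - [16/Aug/2009:04:40:36 -0700] "GET /b HTTP/1.1" 200 20',
    'crawl-1.example.com - - [16/Aug/2009:04:40:37 -0700] "GET /c HTTP/1.0" 200 30',
]
visits = run_job(sample_log)
print(visits)
```

Neither `mapper` nor `reducer` shares state with other calls, which is what lets a framework run many copies of each in parallel.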
  28. 29. Using the analysis... <ul><li>We know that analysis of logfiles is a particularly well-suited problem for MapReduce.  But what do companies use the resulting analysis for? </li></ul><ul><li>  </li></ul><ul><li>Rackspace's mail division, Mailtrust, used Hadoop for processing email logs.  They used an ad hoc query to determine the geographic distribution of their users.  Then they scheduled this MapReduce job to run monthly and used it to help decide where to place new mail servers in their data centers.  </li></ul><ul><li>  </li></ul><ul><li>Source: &quot;Hadoop: The Definitive Guide, by Tom White&quot; </li></ul>
  29. 30. A Yellow Elephant Enters...
  30. 31. Apache Hadoop Project <ul><li>3 years old... Grew out of the Lucene & Nutch projects. </li></ul><ul><li>&quot;[I]n a nutshell... Hadoop provides: a reliable shared storage and analysis system.&quot; </li></ul><ul><ul><ul><ul><li>-- &quot;Hadoop: The Definitive Guide, Tom White&quot; </li></ul></ul></ul></ul><ul><li>  </li></ul><ul><li>Storage: Hadoop Distributed Filesystem (HDFS) </li></ul><ul><li>Analysis: MapReduce implementation </li></ul><ul><li>... and a small ecosystem of supporting sub-projects </li></ul>
  31. 32. Hadoop's assumptions <ul><ul><li>Hardware is going to fail </li></ul></ul><ul><ul><li>Access is going to be in batch processing, so high throughput trumps low-latency data access </li></ul></ul><ul><ul><li>Data sets are large; files will be gigabytes to terabytes in size </li></ul></ul><ul><ul><li>Write-once-read-many is the file access needed by applications </li></ul></ul><ul><ul><li>Moving computation is cheaper than moving data </li></ul></ul><ul><ul><li>Must be portable from one platform to another (both software & hardware) </li></ul></ul>
  32. 33. Coke/Pepsi, Google/Hadoop <ul><li>There is nearly a one-to-one mapping between the Google architecture and Apache Hadoop </li></ul>
  33. 34. Google/Hadoop Decoder Ring <table><tr><th>Google</th><th>Hadoop</th></tr><tr><td>MapReduce</td><td>Hadoop MapReduce</td></tr><tr><td>Google Filesystem (GFS)</td><td>HDFS</td></tr><tr><td>BigTable</td><td>HBase</td></tr><tr><td>Chubby Lock System</td><td>ZooKeeper</td></tr><tr><td>Sawzall</td><td>Pig</td></tr><tr><td>....</td><td>....</td></tr></table>
  34. 35. NameNode, DataNodes <ul><li>Only one dedicated machine runs the NameNode service for an entire cluster.  Each machine in the cluster runs the DataNode service. The NameNode plays the role of arbitrator & metadata repository (for HDFS).  User data never flows through the NameNode. </li></ul><ul><li>The NameNode maintains the file system namespace. Any change to the file system namespace or its properties is recorded by the NameNode. </li></ul><ul><li>Yes, this means there is a Single Point of Failure.  </li></ul>
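The split between metadata and user data can be sketched with two toy classes. Everything here is hypothetical (class shapes, method names, block ids, the `/logs/a.txt` path); it only illustrates that a reader asks the NameNode where blocks live and then fetches the bytes from DataNodes directly.

```python
class DataNode:
    """Holds the actual block bytes."""
    def __init__(self, name):
        self.name = name
        self.blocks = {}          # block id -> bytes

class NameNode:
    """Holds only metadata: which blocks make up a file, and where they live."""
    def __init__(self):
        self.namespace = {}       # path -> ordered list of block ids
        self.locations = {}       # block id -> list of DataNode names

    def register(self, path, block_id, datanode_names):
        self.namespace.setdefault(path, []).append(block_id)
        self.locations[block_id] = list(datanode_names)

    def block_locations(self, path):
        return [(b, self.locations[b]) for b in self.namespace[path]]

def read_file(namenode, datanodes, path):
    """Ask the NameNode for locations, then read bytes from DataNodes."""
    data = b""
    for block_id, holders in namenode.block_locations(path):
        data += datanodes[holders[0]].blocks[block_id]
    return data

datanodes = {name: DataNode(name) for name in ("dn1", "dn2")}
namenode = NameNode()
for block_id, payload in (("blk_1", b"hello "), ("blk_2", b"world")):
    for dn in datanodes.values():         # store every block on both nodes
        dn.blocks[block_id] = payload
    namenode.register("/logs/a.txt", block_id, datanodes.keys())

content = read_file(namenode, datanodes, "/logs/a.txt")
print(content)
```

The `NameNode` object never sees a payload, only names and ids. The same property makes it a single point of failure: lose its two dictionaries and the blocks can no longer be assembled into files.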
  35. 36. Big Files, Narrow Access Pattern <ul><li>HDFS is optimized to store LARGE files, on the order of gigabytes.   </li></ul><ul><li>Files are written to disk, start-to-finish, and are then immutable.   </li></ul><ul><li>Files are read from disk, start-to-finish, by client applications (like MapReduce jobs). </li></ul><ul><li>Files are redundantly stored. </li></ul>
  36. 37. HDFS <ul><li>Filesystem is an unfortunate name because it makes us think about files and directories.  We really should think about HDFS as a 'dataset system.' </li></ul>
  37. 38. JobTracker, TaskTracker <ul><li>JobTracker runs on a master machine (in small clusters, often the same machine as the NameNode) </li></ul><ul><li>A TaskTracker runs on each DataNode machine </li></ul><ul><li>JobTracker pushes out work to available TaskTrackers in the cluster.  It attempts to keep the computation close to the data (again, data locality).  If it cannot find an available TaskTracker with the data block needed for the task, it will attempt to schedule the task on a machine in the same rack.  </li></ul><ul><li>So, this means that JobTracker is &quot;rack-aware&quot; (that is, it understands the network topology of the cluster). </li></ul>
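The placement preference described on this slide (same node first, then same rack) can be sketched as a small function. This is an illustration under assumptions, not the JobTracker's actual scheduler: `pick_tracker`, the node and rack names, and the final "any free tracker" fallback are all hypothetical.

```python
def pick_tracker(block_hosts, free_trackers, rack_of):
    """Return (tracker, locality) for a task that reads one block.

    block_hosts   -- set of nodes holding a replica of the block
    free_trackers -- list of nodes with a free task slot
    rack_of       -- dict mapping node name -> rack name
    """
    # 1. Prefer a free tracker that already holds the block.
    for tracker in free_trackers:
        if tracker in block_hosts:
            return tracker, "data-local"
    # 2. Otherwise prefer a free tracker in the same rack as a replica.
    replica_racks = {rack_of[host] for host in block_hosts}
    for tracker in free_trackers:
        if rack_of[tracker] in replica_racks:
            return tracker, "rack-local"
    # 3. Assumed fallback: any free tracker at all.
    if free_trackers:
        return free_trackers[0], "off-rack"
    return None, "none"

rack_of = {"n1": "r1", "n2": "r1", "n3": "r2", "n4": "r2"}
print(pick_tracker({"n1"}, ["n2", "n1"], rack_of))
print(pick_tracker({"n1"}, ["n4", "n2"], rack_of))
print(pick_tracker({"n1"}, ["n3"], rack_of))
```

The `rack_of` table is the "rack-aware" part: without some description of the network topology, step 2 could not be evaluated.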
  38. 39. Re-execute slow running tasks <ul><li>To avoid the &quot;Convoy Effect&quot;, slow-running tasks may be reassigned for execution on another node holding a replica of the block.  This means that failing or slow hardware will not hold up the rest of the computations for the job. </li></ul><ul><li>Re-execution of tasks can be done when &quot;speculative execution&quot; is enabled. </li></ul>
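One way to picture straggler detection: compare each task's progress with the average and flag the ones that trail badly. The rule below (more than 0.2 behind the mean) and the task names are assumptions made for this sketch; it is not presented as Hadoop's actual heuristic.

```python
def find_stragglers(progress, lag=0.2):
    """Return ids of tasks whose progress trails the mean by more than `lag`.

    progress -- dict mapping task id -> fraction complete (0.0 to 1.0)
    """
    if not progress:
        return []
    mean = sum(progress.values()) / len(progress)
    return sorted(task for task, done in progress.items() if done < mean - lag)

# Hypothetical snapshot of a running job: one map task is far behind.
progress = {"map-0": 0.9, "map-1": 0.95, "map-2": 0.25, "map-3": 0.9}
stragglers = find_stragglers(progress)
print(stragglers)
```

A scheduler could launch a second attempt for each flagged task on another node and keep whichever attempt finishes first. That is safe only because the tasks have no side effects on shared state.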
  39. 40. Hadoop Ecosystem <ul><ul><li>HBase: A distributed column-oriented database (BigTable impl) </li></ul></ul><ul><ul><li>Hive: A distributed data warehouse. </li></ul></ul><ul><ul><li>Pig: A data flow language & execution environment for exploring very large datasets. </li></ul></ul><ul><ul><li>ZooKeeper: A distributed, highly available coordination service. </li></ul></ul><ul><ul><li>Chukwa: A distributed data collection & analysis system. </li></ul></ul><ul><ul><li>Avro: A data serialization system for efficient, cross-language RPC, and persistent data storage </li></ul></ul>
  40. 41. Where to next? <ul><ul><li>Cloudera Training Videos </li></ul></ul><ul><ul><li>Hadoop: The Definitive Guide, Tom White, O’Reilly/Yahoo! </li></ul></ul><ul><ul><li>Intro to Parallel Programming & MapReduce </li></ul></ul><ul><ul><li>Google Papers: MapReduce, Google Filesystem, BigTable </li></ul></ul><ul><ul><li>Trending Topics </li></ul></ul><ul><ul><li>Tutorials everywhere! </li></ul></ul>
  41. 42. Acknowledgments <ul><li>iPlant Collaborative for a job (& allowing me to research Hadoop) </li></ul><ul><li>Cloudera and Aaron Kimball for training videos </li></ul><ul><li>  </li></ul><ul><li>Tom White for &quot;Hadoop: The Definitive Guide&quot; </li></ul>
  42. 43. Photo Acknowledgments <ul><li>S#01: </li></ul><ul><li>S#02: taken by Alex Yelich </li></ul><ul><li>S#05: </li></ul><ul><ul><li> & </li></ul></ul><ul><li>S#07: </li></ul><ul><li>S#08: </li></ul><ul><li>S#09: </li></ul><ul><li>S#10: </li></ul><ul><li>S#11: Andrew Lenards </li></ul><ul><li>S#13: </li></ul><ul><li>S#44: </li></ul>
  43. 45. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site.