The Hadoop Ecosystem


My Hadoop Ecosystem presentation at the 2011 BreizhCamp.

See the talk video (in French):

http://mediaserver.univ-rennes1.fr/videos/?video=MEDIA110628093346744

  1. The ecosystem
  2. First of all...
  3. Hadoop is not an acronym
  4. Highly Available Data Object Oriented Program
  5. Hadoop is an elephant
  6. http://www.flickr.com/photos/18555810@N00/2145683109/in/photostream/
  7. How it all began...
  8. 1997
  9. 2002
  10. 2003
  11. Nutch Distributed FileSystem
  12. 2004
  13. 2005
  14. Nutch Algorithms ported to M/R
  15. 2006
  16. as subprojects of Lucene
  17. Doug Cutting joins Yahoo!
  18. One month later they adopt Hadoop and set up a dedicated research team
  19. 2007
  20. 2008
  21. Top Level Project
  22. Yahoo! announces its Web index is built using Hadoop (2008-02)
  23. Hadoop sorts 1TB in 209 seconds using 910 nodes (2008-04)
  24. 2009
  25. Doug Cutting joins Cloudera
  26. Today
  27. Big Data is here to stay
  28. http://salsahpc.indiana.edu/CloudCom2010/slides/PDF/tutorials/Yahoo_business_seminar.pdf
  29. More and more commercial solutions
  30. Hadoop Distributed FileSystem
  31. Fault tolerant, very large file support
  32. Write once, read many access model
  33. Hierarchical namespace
  34. File permissions
  35. Files are divided into blocks; typical block size is 64 or 128 MB
  36. Blocks are replicated on multiple machines, across multiple racks
  37. Blocks are managed by DataNodes
  38. DataNodes report to a NameNode
  39. Write Data Flow
  40. Read Data Flow
  41. HDFS Thrift Server
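As a rough illustration of the block model above, here is a minimal Python sketch (the `BLOCK_SIZE` and `split_into_blocks` names are mine, not HDFS API) that cuts a file size into blocks the way HDFS does, with a smaller tail block at the end:

```python
BLOCK_SIZE = 128 * 1024 * 1024  # a typical HDFS block size (128 MB)
REPLICATION = 3                 # HDFS's default replication factor

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Cut a file of file_size bytes into block sizes, HDFS-style:
    full blocks followed by one smaller tail block if needed."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])

# A 300 MB file yields two full 128 MB blocks plus a 44 MB tail,
# and with 3-way replication the cluster stores 9 block copies.
blocks = split_into_blocks(300 * 1024 * 1024)
print(len(blocks), len(blocks) * REPLICATION)  # 3 9
```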
  42. Map Reduce
  43. Origins in FP and λ-calculus
  44. Map: Map(k, v) ⇒ list(k', v')
  45. Sort: { list(k, v) } ⇒ { (k, list(v)) }
  46. Reduce: Reduce(k, list(v)) ⇒ list(k'', v'')
  47. The Hello, World of MR: word count
  48. MR word count
      Map: <∅, line> ⇒ <word_0, 1> <word_1, 1> ... <word_k, 1>
      Reduce: <word_i, (1, 1, ..., 1)> ⇒ <word_i, N>
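The Map/Sort/Reduce signatures above can be played through in plain Python. This is a toy in-memory simulation of word count, not the Hadoop API:

```python
from collections import defaultdict

def map_phase(records):
    # Map: <∅, line> ⇒ <word, 1> for every word in the line
    for _, line in records:
        for word in line.split():
            yield (word, 1)

def sort_phase(pairs):
    # Sort/shuffle: { list(k, v) } ⇒ { (k, list(v)) }
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())

def reduce_phase(grouped):
    # Reduce: <word, (1, 1, ..., 1)> ⇒ <word, N>
    for word, ones in grouped:
        yield (word, sum(ones))

lines = [(0, "the quick brown fox"), (1, "the lazy dog the end")]
counts = dict(reduce_phase(sort_phase(map_phase(lines))))
print(counts["the"])  # 3
```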
  49. MapReduce in practice
  50. Data moves slowly on the network
  51. So we bring computation to the data
  52. Data is read according to an InputFormat
  53. An InputFormat splits the data
  54. Each InputSplit is fed to a Mapper
  55. HDFS blocks make ideal splits
  56. Run Mappers on DataNodes
  57. Each DataNode is also a TaskTracker
  58. A JobTracker dispatches Mappers & Reducers
  59. Mapper output is written locally
  60. A Partitioner splits it per Reducer
  61. The Reducers retrieve those partitions
  62. The Shuffle
  63. This step can be costly depending on volume
  64. Compress Mapper output
  65. Run Combiners on Mapper output
  66. Combiners: Reducers executing at the Mappers
  67. Reducers sort data retrieved during the Shuffle
  68. A MapReduce job ends with one file per reducer
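To make the combiner's benefit concrete, here is a small sketch (function names are illustrative, not Hadoop's API): a combiner is just the reducer run locally on one mapper's output before the shuffle, and a partitioner decides which reducer each key goes to:

```python
from collections import Counter

def partition(key, num_reducers):
    # Hadoop's default behaviour: hash the key modulo the reducer count
    return hash(key) % num_reducers

def combine(pairs):
    # A combiner pre-aggregates one mapper's output before the shuffle
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

mapper_output = [("a", 1), ("b", 1), ("a", 1), ("a", 1), ("b", 1)]
combined = combine(mapper_output)
# 5 records would cross the network; only 2 after combining
print(len(mapper_output), len(combined), combined)
```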
  69. Hadoop Streaming
  70. http://www.flickr.com/photos/19melissa68/2476168474/sizes/l/in/photostream/
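Hadoop Streaming runs any executable that reads key/value lines on stdin and writes them to stdout. A streaming word-count pair might look like the following sketch, written as functions over line iterators so the logic is testable (a real streaming job would read `sys.stdin` instead):

```python
import itertools

def mapper(lines):
    # Streaming mapper: emit one tab-separated "word<TAB>1" line per word
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(sorted_lines):
    # Streaming reducer: input arrives sorted by key, so consecutive
    # lines for the same word can be grouped and summed
    pairs = (line.split("\t") for line in sorted_lines)
    for word, group in itertools.groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

# The framework performs the sort between the two phases
map_out = sorted(mapper(["to be or not to be"]))
print(list(reducer(map_out)))  # ['be\t2', 'not\t1', 'or\t1', 'to\t2']
```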
  71. Pig / Hive
  72. Pig Latin is easy to learn
  73. I bet you can understand the following:
      A = LOAD 'mydata' USING PigStorage() AS (url, time, size);
      B = GROUP A BY url;
      C = FOREACH B GENERATE group, COUNT(A), SUM(size), SUM(time)/COUNT(A);
      DUMP C;
  74. Pig Latin is converted to MR jobs
  75. Pig supports Streaming and UDFs
  76. Offers HiveQL, close to SQL:
      CREATE TABLE page_view(viewTime INT, userid BIGINT,
          page_url STRING, referrer_url STRING,
          friends ARRAY<BIGINT>, properties MAP<STRING, STRING>,
          ip STRING COMMENT 'IP Address of the User')
      COMMENT 'This is the page view table'
      PARTITIONED BY(dt STRING, country STRING)
      CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS
      ROW FORMAT DELIMITED
          FIELDS TERMINATED BY '1'
          COLLECTION ITEMS TERMINATED BY '2'
          MAP KEYS TERMINATED BY '3'
      STORED AS SEQUENCEFILE;

      INSERT OVERWRITE TABLE xyz_com_page_views
      SELECT page_views.*
      FROM page_views
      WHERE page_views.date >= '2008-03-01'
        AND page_views.date <= '2008-03-31'
        AND page_views.referrer_url like '%xyz.com';
  77. HiveQL converted to MR jobs
  78. Hive also supports UDFs
  79. MapReduce is powerful
  80. But MapReduce works in batch mode
  81. Need something similar for real time streams
  82. ZooKeeper
  83. ZooKeeper provides a highly available filesystem abstraction
  84. ZooKeeper has no files or directories, but znodes
  85. Znodes form a hierarchical namespace
  86. A znode can have data AND children
  87. A znode has Access Control Lists:
      CREATE (create children)
      READ (list children and get data)
      WRITE (set data)
      DELETE (delete children)
      ADMIN (setACL)
  88. Znodes can be persistent or ephemeral
      Ephemeral znodes vanish with their creator's session
      Persistent znodes outlive their creator's session
      Ephemeral znodes cannot have children
  89. Znodes can be sequential
      A monotonically increasing counter is appended to the znode name:
      /zk/foo-1 /zk/foo-3 /zk/foo-4 ...
      This can be used to impose a global order among direct children
  90. Znodes can have watches
      Watches allow clients to be notified when znodes change:
      NodeCreated, NodeDeleted, NodeDataChanged, NodeChildrenChanged
  91. ZooKeeper has a simple API:
      exists, getData, getChildren (these can set watches)
      create, delete, setData, setACL, getACL, sync
  92. ZooKeeper consistency
      Sequential Consistency: updates from a client are applied in the order sent
      Atomicity: updates either succeed or fail
      Single System Image: unique view, regardless of the server we connect to
      Durability: updates, once succeeded, will not be undone
      Timeliness: lag is bounded
  93. ZooKeeper use cases
      Configuration service: get latest config and get notified when it changes
      Lock service: provide mutual exclusion
      Leader election: there can be only one...
      Group membership: dynamically determine members of a group
      Queue: producer/consumer paradigm
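The leader-election use case leans on sequential znodes. The following sketch fakes that behaviour in memory (`FakeZooKeeper` is a stand-in, not the real client API) to show why the lowest sequence number wins:

```python
import itertools

class FakeZooKeeper:
    """In-memory stand-in for ZooKeeper's sequential znodes (not the real API)."""
    def __init__(self):
        self._seq = itertools.count()
        self.znodes = {}

    def create_sequential(self, prefix, owner):
        # ZooKeeper appends a monotonically increasing, zero-padded counter
        name = f"{prefix}-{next(self._seq):010d}"
        self.znodes[name] = owner
        return name

def elect_leader(zk):
    # The client that created the lowest-numbered znode is the leader;
    # the others would each watch the znode just below their own.
    return zk.znodes[min(zk.znodes)]

zk = FakeZooKeeper()
for client in ("alice", "bob", "carol"):
    zk.create_sequential("/election/member", client)
print(elect_leader(zk))  # alice
```

In real ZooKeeper the election znodes would also be ephemeral, so a crashed leader's znode vanishes and the next-lowest client takes over.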
  94. http://www.flickr.com/photos/7-how-7/2308008668/sizes/o/in/photostream/
  95. HBase
  96. Data Model
  97. “a sparse, distributed multi-dimensional sorted map”
  98. Row keys, column qualifiers, and values are byte[]
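One way to picture the "sparse, sorted map" model is a plain dict keyed by (row, column, timestamp). This toy sketch (not the HBase client API) keeps timestamped versions and serves the newest:

```python
# Toy model of HBase's data model: a sorted map from
# (row key, "family:qualifier", timestamp) to a byte[] value.
table = {}

def put(row, column, value, ts):
    table[(row, column, -ts)] = value  # negate ts so newest sorts first

def get(row, column):
    # Return the newest version stored for this row/column
    for key in sorted(table):
        if key[0] == row and key[1] == column:
            return table[key]
    return None

def scan(start_row, stop_row):
    # Rows come back in sorted order: a direct consequence of the model
    return sorted({key[0] for key in table if start_row <= key[0] < stop_row})

put("row1", "cf:a", b"v1", ts=1)
put("row1", "cf:a", b"v2", ts=2)
put("row9", "cf:b", b"x", ts=1)
print(get("row1", "cf:a"), scan("row0", "row5"))  # b'v2' ['row1']
```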
  99. Regions
  100. A region is a slice of row keys
  101. A region is served by a Region Server
  102. HBase Master
  103. Coordinates the slaves (Region Servers)
  104. Assigns Regions to Region Servers
  105. Does so using ZooKeeper
  106. .META.
  107. Special table, stores region assignments
  108. -ROOT-
  109. Special single-region table
  110. Stores .META. region assignments
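At each level of the -ROOT-/.META. chain, locating a row is a range lookup: find the region whose start key is the greatest one not above the row. A hypothetical sketch, with invented server names and start keys:

```python
import bisect

# Each catalog maps a region's start row key to the server holding it
ROOT = {"": "meta-server"}                 # -ROOT-: one region, never splits
USER_REGIONS = {"": "rs-a", "g": "rs-b", "t": "rs-c"}

def region_for(catalog, row):
    # Pick the region whose start key is the greatest key <= row
    starts = sorted(catalog)
    return catalog[starts[bisect.bisect_right(starts, row) - 1]]

# A client walks -ROOT- to locate .META., then .META. to locate the row
assert region_for(ROOT, "anything") == "meta-server"
print(region_for(USER_REGIONS, "hello"))  # rs-b: "g" <= "hello" < "t"
```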
  111. Region Server
  112. MemStore flushed when full
  113. New immutable files written
  114. When there are too many files per region
  115. Perform a minor compaction
  116. Merge the most recent N files
  117. Periodically merge all files in a region
  118. That's a major compaction
  119. When the aggregate size of a region is too big
  120. Split the region into two new regions
  121. Parent region files are eventually discarded
  122. Region Server Crashes
  123. The Master will split the WAL into per-region logs
  124. Regions will be reassigned
  125. HBase API
  126. Put, Get, Delete
  127. CheckAndPut, Lock
  128. Counters
  129. Scan
  130. Filters, row/column, run in the Region Server
  131. Coprocessors
  132. REST / Thrift gateways for non-Java access
  133. Map Reduce NextGen
  134. NodeManager
  135. Resource Containers
  136. Scheduler
  137. Application Manager
  138. More generic and more scalable
  139. Monitor Everything
  140. Automate as much as possible
  141. Test Drive on
  142. We're hiring: www.arkea.com
  143. @herberts
