Big Data & Hadoop

Published in: Technology, Business

  1. BIG DATA & HADOOP: The future of the information economy, by Thanakrit Lersmethasakul (lersmethasakul@live.com)
  2. A Technology Blueprint
  3. Big Data Storymap
  4. Big Data Concept
  5. Big Data Concept
  6. Big Data Concept
  7. Big Data Architecture
  8. Big Data Ecosystem
  9. Big Data Landscape
  10. Big Data Life-cycle Management
  11. Hadoop Concept
  12. Hadoop Concept
  13. Hadoop Concept
  14. Hadoop Architecture
  15. Hadoop Architecture
      • Hadoop Client: contacts the Name Node for data, or the Job Tracker to submit jobs
      • Name Node: maintains the mapping of file blocks to Data Node slaves
      • Job Tracker: schedules jobs across Task Tracker slaves
      • Data Node: stores and serves blocks of data
      • Task Tracker: runs tasks (work units) within a job
      • A Data Node and a Task Tracker share a physical node
  16. Hadoop Process: MapReduce Example for Word Count
      • Shell equivalent: cat *.txt | mapper.pl | sort | reducer.pl > out.txt
      • Input text: “To Be Or Not To Be?”
      • Split 1 ... Split i ... Split N: each split holds (docid, text)
      • Map 1 ... Map i ... Map M: each emits (words, counts), e.g. Be, 5; Be, 12; Be, 7; Be, 6
      • Shuffle: takes (words, counts) and delivers (sorted words, counts) to the reducers
      • Reduce 1 ... Reduce i ... Reduce R: each emits (sorted words, sum of counts), e.g. Be, 30
      • Output File 1 ... Output File i ... Output File R: one per reducer
      • Map(in_key, in_value) => list of (out_key, intermediate_value)
      • Reduce(out_key, list of intermediate_values) => out_value(s)
  17. Hadoop Ecosystem
  18. Hadoop Ecosystem
  19. Hadoop Ecosystem
  20. Thank You
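
The word-count flow on the Hadoop Process slide (split, map, shuffle, reduce) can be sketched in plain Python. This is a minimal single-process illustration, not Hadoop code and not the deck's own mapper.pl or reducer.pl, which are not shown. The function names (map_words, shuffle, reduce_counts, word_count) are hypothetical, and lower-casing words and ignoring punctuation are assumptions made here.

```python
import re
from collections import defaultdict


def map_words(doc_id, text):
    """Map(in_key, in_value) => list of (out_key, intermediate_value).

    Emits (word, 1) for every word in the split. Lower-casing is an
    assumption of this sketch, not something the slide specifies.
    """
    return [(word.lower(), 1) for word in re.findall(r"[A-Za-z']+", text)]


def shuffle(pairs):
    """Group intermediate values by key and sort the keys,
    playing the role of `sort` in the shell pipeline."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_counts(word, counts):
    """Reduce(out_key, list of intermediate_values) => out_value."""
    return word, sum(counts)


def word_count(splits):
    """Run map over every (docid, text) split, then shuffle and reduce."""
    intermediate = []
    for doc_id, text in splits:
        intermediate.extend(map_words(doc_id, text))
    return dict(reduce_counts(word, counts) for word, counts in shuffle(intermediate))


if __name__ == "__main__":
    splits = [("doc1", "To Be Or Not To Be?")]
    for word, total in word_count(splits).items():
        print(f"{word}, {total}")
```

In a real cluster the map calls would run in parallel on different nodes and the shuffle would move data across the network; here everything runs in one process so that only the data flow is visible.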
