Intro to MapReduce

Delhi Hadoop User Group MeetUp - 10th Sept. 2011 - Slides

Published in: Technology, Business

Intro to Map/Reduce
- Somil Asthana

Agenda
  • Brief introduction to Hadoop architecture
  • Map/Reduce pipe
  • Map/Reduce computation and cube generation (example)

Why Map/Reduce?
  • Implicitly parallelizes data processing.
  • Works with a model in which computation moves to the data, rather than data moving to the computing machine.
  • Takes care of the issues that arise from distributed computing.
  • Performs load balancing (makes the system reliable and fault tolerant).

HDFS Architecture
  • Diagram labels: Job Tracker, Task Tracker, Task Tracker
  • Courtesy: Hadoop Org

Map Reduce Pipe
  • Raw Data
  • Mapper (Key, Value format)
  • Shuffle & Sort (based on Key)
  • Reducer (for each Key, a list of Values)
  • Output (Key, Value format)
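
The five stages of the pipe can be sketched in a few lines of plain Python. This is a minimal single-process illustration, not Hadoop code: `run_pipeline`, `word_mapper`, and `sum_reducer` are hypothetical names, and the word-count job is an assumed example that does not appear in the slides.

```python
from collections import defaultdict

def run_pipeline(raw_data, mapper, reducer):
    """Run raw records through map, shuffle & sort, and reduce in one process."""
    # Mapper: every raw record is turned into zero or more (key, value) pairs.
    mapped = []
    for record in raw_data:
        mapped.extend(mapper(record))

    # Shuffle & sort: collect all values that share a key, then order the keys.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)

    # Reducer: called once per key with the full list of values for that key.
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output

def word_mapper(line):
    # Emit (word, 1) for each word in the line.
    for word in line.split():
        yield (word.lower(), 1)

def sum_reducer(key, values):
    # Emit a single (key, total) pair.
    yield (key, sum(values))

result = run_pipeline(["a b a", "b c"], word_mapper, sum_reducer)
print(result)
```

In a real cluster the mapper and reducer run on many machines and the framework performs the shuffle and sort between them; the sketch only shows the data flow.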

ABC Ecommerce Company interested in analytics
  • They get daily customer sales data, which they want to analyze.
  • The data is in 9-tuple format: <OrderID, EmailID, MobileNum, ProductID, PayableAmount, DeliveryCharges, ModeofPayment, OrderStatus, OrderSite>
  • ModeofPayment = (COD, Credit, Check)
  • OrderStatus = (Ordered, Clicked, Verified, Rejected, Dispatched, Returned from Client, Delivered)
  • OrderSite = (Facebook, Google, Yahoo, MSN, Komli, HT, Store)

Cube Generation for ABC Company
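
Before a cube can be generated, each daily sales record has to be read into its nine fields. The sketch below shows one way to do that, under assumptions the slides do not state: records arrive as comma-separated text lines, and PayableAmount and DeliveryCharges are numeric. `Order` and `parse_order` are hypothetical names, and the sample line is made up for illustration.

```python
from typing import NamedTuple

class Order(NamedTuple):
    """One sales record in the 9-tuple format listed on the slide."""
    order_id: str
    email_id: str
    mobile_num: str
    product_id: str
    payable_amount: float
    delivery_charges: float
    mode_of_payment: str
    order_status: str
    order_site: str

# Allowed values, copied from the slide.
MODES_OF_PAYMENT = {"COD", "Credit", "Check"}
ORDER_STATUSES = {"Ordered", "Clicked", "Verified", "Rejected",
                  "Dispatched", "Returned from Client", "Delivered"}
ORDER_SITES = {"Facebook", "Google", "Yahoo", "MSN", "Komli", "HT", "Store"}

def parse_order(line, sep=","):
    """Parse one delimited line into an Order (the delimiter is an assumption)."""
    fields = [field.strip() for field in line.split(sep)]
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    order = Order(fields[0], fields[1], fields[2], fields[3],
                  float(fields[4]), float(fields[5]),
                  fields[6], fields[7], fields[8])
    if order.mode_of_payment not in MODES_OF_PAYMENT:
        raise ValueError(f"unknown ModeofPayment: {order.mode_of_payment}")
    if order.order_status not in ORDER_STATUSES:
        raise ValueError(f"unknown OrderStatus: {order.order_status}")
    if order.order_site not in ORDER_SITES:
        raise ValueError(f"unknown OrderSite: {order.order_site}")
    return order

# Made-up sample record, for illustration only.
order = parse_order("O1001,a@example.com,9999999999,P42,250.0,30.0,COD,Delivered,Facebook")
print(order.order_site, order.payable_amount)
```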

Cube Generation for ABC Company
  • Mapper
  • Reducer
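
The mapper and reducer from this slide are not preserved in the transcript, so the following is only a sketch of how a cube could be generated with the map/reduce model. It assumes the cube dimensions are ModeofPayment, OrderStatus, and OrderSite, and that the measure is the sum of PayableAmount; neither choice is confirmed by the slides. `cube_mapper`, `cube_reducer`, `generate_cube`, and the sample records are hypothetical.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # wildcard meaning "rolled up over this dimension"

def cube_mapper(record):
    """Emit (cube_key, amount) for every roll-up combination of the dimensions."""
    amount = float(record[4])   # PayableAmount
    dims = record[6:9]          # ModeofPayment, OrderStatus, OrderSite
    # Each dimension is either kept or replaced by the wildcard: 2^3 = 8 keys.
    for mask in product((True, False), repeat=len(dims)):
        key = tuple(d if keep else ALL for d, keep in zip(dims, mask))
        yield key, amount

def cube_reducer(key, amounts):
    """Sum the amounts for one cube cell."""
    yield key, sum(amounts)

def generate_cube(records):
    # Local stand-in for the framework's shuffle & sort step.
    groups = defaultdict(list)
    for record in records:
        for key, amount in cube_mapper(record):
            groups[key].append(amount)
    cube = {}
    for key in sorted(groups):
        for cell, total in cube_reducer(key, groups[key]):
            cube[cell] = total
    return cube

# Made-up records in the 9-tuple format, for illustration only.
sample_records = [
    ("O1", "a@example.com", "9000000001", "P1", "100.0", "10.0", "COD", "Delivered", "Facebook"),
    ("O2", "b@example.com", "9000000002", "P2", "200.0", "10.0", "Credit", "Delivered", "Google"),
    ("O3", "c@example.com", "9000000003", "P1", "50.0", "5.0", "COD", "Rejected", "Facebook"),
]
cube = generate_cube(sample_records)
print(cube[(ALL, ALL, ALL)])         # total payable amount over all orders
print(cube[("COD", ALL, "Facebook")])  # COD orders placed via Facebook
```

Emitting every roll-up key from the mapper keeps the reducer trivial, at the cost of eight intermediate pairs per record; with more dimensions that fan-out grows as 2^n.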

Questions?