Map-Reduce recipes with C#

Published in: Technology
Transcript

  • 1. MAP-REDUCE RECIPES WITH C# Erik LeBel, elebel@pyxis-tech.com
  • 2. WHAT IS MAP-REDUCE? © Pyxis Technologies inc. Outlined in a paper published by Google in 2004. In tandem with the Google File System (also described in a white paper circa 2004), it makes crunching large data sets possible across multiple systems. It is not very complicated, nor unique to Google. It is how we define two operations in a data processing pipeline.
  • 3. WHY SHOULD I CARE? It is a tool worth having under your belt. It is useful for analyzing large volumes of varied data.
  • 4. WHAT IS IT USED FOR? Data transformations: searches; indexing; restructuring data; data aggregation; anything that would traditionally have been solved by ETL.
  • 5. SO ABOUT THIS MAP-REDUCE? A two-operation pattern for processing data. It runs as a series of jobs.
  • 6. THE MAPPING OPERATION Extracts data from source files. Can be run in parallel across multiple processes and systems. Emits a collection of data associated with a key. The key is used to shuffle the data and join the results later.
  • 7. THE REDUCE OPERATION The reduce function is applied to each group of data, grouped by key. Typically a reduce function returns a list of values for a specified key. When gathered together, the results of all reduce operations make up the result of the Map-Reduce job.
  • 8. SOLVING A PROBLEM Word Count
  • 9. WORD COUNT (SIMPLE) [Diagram] Input: abacab. Map: (a, 1) (b, 1) (a, 1) (c, 1) (a, 1) (b, 1). Shuffle: (a, 1), (a, 1), (a, 1); (b, 1), (b, 1); (c, 1). Reduce (a): (a, 3). Reduce (b): (b, 2). Reduce (c): (c, 1). Combiner: (a, 3) (b, 2) (c, 1).
  • 10. DISTRIBUTED WORD COUNT [Diagram] Inputs: abbacc, babbcc, abacab. Map (abbacc): (a, 1) (b, 1) (b, 1) (a, 1) (c, 1) (c, 1). Map (babbcc): (b, 1) (a, 1) (b, 1) (b, 1) (c, 1) (c, 1). Map (abacab): (a, 1) (b, 1) (a, 1) (c, 1) (a, 1) (b, 1). Shuffle: (a, 1), (a, 1)..; (b, 1), (b, 1)..; (c, 1), (c, 1).. Reduce (a): (a, 6). Reduce (b): (b, 7). Reduce (c): (c, 5). Combiner: (a, 6) (b, 7) (c, 5).
  • 11. AND NOW FOR SOME CODE
  • 12. QUESTIONS? Thank you! Erik LeBel, elebel@pyxis-tech.com, pyxis-tech.com
  • 13. LIVE MAP REDUCE (10 PARTICIPANTS) Roles: Mappers (4), Queues (2), Reducers (2), Combiners (1), Result (1). Tasks shown on the slide: summarize data; keep buffer; split red and black; count by suit; store result.
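
The transcript does not include the code shown during the talk. As a stand-in, here is a minimal sketch of the simple word count from slide 9, written with LINQ. The class and method names (`WordCount`, `Map`, `Shuffle`, `Reduce`, `Run`) are illustrative assumptions, not the presenter's code, and everything runs in a single process.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical names throughout; a sketch of the pattern, not the talk's code.
public static class WordCount
{
    // Map: emit a (key, 1) pair for every character of the input.
    public static IEnumerable<KeyValuePair<char, int>> Map(string input) =>
        input.Select(c => new KeyValuePair<char, int>(c, 1));

    // Shuffle: group the emitted values by their key.
    public static IEnumerable<IGrouping<char, int>> Shuffle(
        IEnumerable<KeyValuePair<char, int>> pairs) =>
        pairs.GroupBy(p => p.Key, p => p.Value);

    // Reduce: collapse all values of one key into a single count.
    public static KeyValuePair<char, int> Reduce(char key, IEnumerable<int> values) =>
        new KeyValuePair<char, int>(key, values.Sum());

    // Run the job and gather the reducer outputs, ordered by key.
    public static SortedDictionary<char, int> Run(string input)
    {
        var result = new SortedDictionary<char, int>();
        foreach (var group in Shuffle(Map(input)))
        {
            var reduced = Reduce(group.Key, group);
            result[reduced.Key] = reduced.Value;
        }
        return result;
    }

    public static void Main()
    {
        // Prints (a, 3), (b, 2), (c, 1), one per line.
        foreach (var entry in Run("abacab"))
            Console.WriteLine($"({entry.Key}, {entry.Value})");
    }
}
```

Keeping the three steps as separate functions mirrors the slide: only `Map` and `Reduce` carry problem-specific logic, while the shuffle is generic grouping by key.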

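The distributed word count from slide 10 can be approximated in one process with PLINQ, where worker threads stand in for the separate mapper machines. This is a sketch under that assumption only: a real cluster would also move data between systems, which this example does not model. The names `DistributedWordCount` and `Run` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical names; threads simulate the separate mappers of slide 10.
public static class DistributedWordCount
{
    public static Dictionary<char, int> Run(IEnumerable<string> chunks)
    {
        return chunks
            .AsParallel()                                    // chunks may be mapped concurrently
            .SelectMany(chunk =>
                chunk.Select(c => (Key: c, Value: 1)))       // map: emit (char, 1) pairs
            .GroupBy(p => p.Key, p => p.Value)               // shuffle: group values by key
            .Select(g => (Key: g.Key, Count: g.Sum()))       // reduce: sum each group
            .ToDictionary(r => r.Key, r => r.Count);         // gather the reducer outputs
    }

    public static void Main()
    {
        var counts = Run(new[] { "abbacc", "babbcc", "abacab" });

        // Prints (a, 6), (b, 7), (c, 5), one per line.
        foreach (var entry in counts.OrderBy(e => e.Key))
            Console.WriteLine($"({entry.Key}, {entry.Value})");
    }
}
```

Because the counts are combined by addition, the order in which chunks are mapped does not affect the result, which is what makes the map step safe to parallelize.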