Introducing Map Reduce: CloudCamp St. Louis 2009
 

This introduction to Map/Reduce was presented by Michael Groner of Appistry at CloudCamp St. Louis on December 10, 2009.

    Presentation Transcript

    • Introducing MapReduce, Michael Groner, CloudCamp St. Louis 2009
    • Description
      MapReduce: A software framework to support processing of massive data sets across distributed computers
      Simple, powerful programming model
      Language independent
      Break down the processing problem into embarrassingly parallel atomic operations
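To illustrate what "embarrassingly parallel atomic operations" can mean in practice, here is a minimal Python sketch. It is not from the talk: the function names (`count_words`, `parallel_word_total`) are hypothetical, and a local thread pool merely stands in for separate machines. The point is only that each operation depends on nothing but its own input chunk, so the chunks can be handed out in any order.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk: str) -> int:
    """Atomic operation: the result depends only on this one chunk."""
    return len(chunk.split())


def parallel_word_total(chunks, workers=4):
    """Apply count_words to every chunk independently, then combine."""
    # Assumption: threads stand in for cluster nodes in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    return sum(partial_counts)


chunks = ["the quick brown fox", "jumps over", "the lazy dog"]
total = parallel_word_total(chunks)
print(total)
```

Because no chunk needs data from any other chunk, adding workers requires no coordination beyond collecting the partial results.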
    • MapReduce installations
      Google – Index building
      Visa – Transaction Processing
      Facebook – Facebook Lexicon
      Intelligence Community
      Yahoo/Google – Terabyte Sort
        10 billion 100-byte records (1 TB in total)
        Yahoo: 910 nodes, 206 seconds
        Google: ~1,000 nodes, 68 seconds
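The Terabyte Sort figures can be turned into a rough per-node rate with some simple arithmetic. The sketch below uses the numbers exactly as quoted on the slide (treating Google's "~1,000" as 1,000) and a hypothetical helper, `per_node_mb_per_s`. It is a naive estimate: a real sort reads, shuffles, and writes the data, so the actual I/O per node is higher than this figure suggests.

```python
RECORDS = 10_000_000_000   # 10 billion records
RECORD_BYTES = 100         # 100 bytes each
total_bytes = RECORDS * RECORD_BYTES  # 10**12 bytes, i.e. 1 TB (decimal)


def per_node_mb_per_s(total_bytes, nodes, seconds):
    """Naive sorted-data rate per node, in MB/s (1 MB = 10**6 bytes)."""
    return total_bytes / nodes / seconds / 1e6


yahoo_rate = per_node_mb_per_s(total_bytes, nodes=910, seconds=206)
google_rate = per_node_mb_per_s(total_bytes, nodes=1000, seconds=68)

print(f"Yahoo:  {yahoo_rate:.1f} MB/s per node")
print(f"Google: {google_rate:.1f} MB/s per node")
```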
    • Algorithm
      Map Phase
        Raw data is analyzed and converted to key/value pairs
      Shuffle Phase
        All key/value pairs are sorted and grouped by their keys
      Reduce Phase
        All values associated with a key are processed to produce results
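The three phases can be sketched in a few lines of single-process Python, using word counting as the example problem. This is an illustrative sketch only, not the implementation used by any framework mentioned in the talk; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and a real system would run the map and reduce steps on many machines.

```python
from collections import defaultdict


def map_phase(documents):
    """Map: turn raw text into (key, value) pairs, here (word, 1)."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)


def shuffle_phase(pairs):
    """Shuffle: group all values by key and order the groups by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(groups):
    """Reduce: process all values for each key, here by summing them."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    return reduce_phase(shuffle_phase(map_phase(documents)))


counts = word_count(["the cat sat", "the dog sat down"])
print(counts)
```

Only the map and reduce functions are specific to the problem; the shuffle step is generic, which is why frameworks can provide it once for every job.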
    • MapReduce Benefits
      Scale
        Processing speed increases with the number of machines involved
      Reliable
        Loss of any one machine doesn’t stop processing
      Cost
        Often built from heterogeneous, commodity-grade computers
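The reliability claim rests on tasks being independent, so a task that was running on a lost machine can simply be run again elsewhere. The sketch below simulates that idea in one process. Everything in it is an assumption made for illustration: the names `run_with_reassignment` and `flaky_run`, the node labels, and the use of `RuntimeError` to signal a lost worker. Real frameworks detect failures and reschedule work in their own ways.

```python
def run_with_reassignment(tasks, workers, run_task):
    """Run every task, moving on to the next worker if one fails."""
    results = {}
    for task in tasks:
        for worker in workers:
            try:
                results[task] = run_task(worker, task)
                break  # task finished; go to the next task
            except RuntimeError:
                continue  # worker considered lost; try another one
        else:
            raise RuntimeError(f"no worker could run task {task!r}")
    return results


def flaky_run(worker, task):
    """Hypothetical task runner in which node-1 has failed."""
    if worker == "node-1":
        raise RuntimeError("node-1 is down")
    return f"{task} done on {worker}"


results = run_with_reassignment(["t1", "t2"], ["node-1", "node-2"], flaky_run)
print(results)
```

Re-running a task is safe here because each task produces the same output regardless of which worker executes it.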
    • More Information
      Paper: "MapReduce: Simplified Data Processing on Large Clusters" by Dean and Ghemawat
      Slides: tech.michaelgroner.com
    • Thank You