MapReduce
Usage Rights

CC Attribution-ShareAlike License


MapReduce Presentation Transcript

  • MapReduce: Simplified Data Processing on Large Clusters
    Rob Keisler, CSCI 638, Summer 2011
  • Outline
    ● Background
    ● Model
    ● Examples
    ● Execution
    ● Conclusions
  • Background
    ● Transformation operations are conceptually straightforward
      ○ Until the data is large and the computation must be distributed over hundreds or thousands of machines
    ● So, Google created MapReduce
    ● MapReduce is a programming abstraction
      ○ Expresses simple computations
      ○ Hides complexity details
  • Model
    ● Utilizes higher-order shaping functions Map and Reduce to take a set of input key/value pairs and produce a set of output key/value pairs
    ● Map
      ○ Takes an input key/value pair and produces a set of intermediate key/value pairs
    ● Reduce
      ○ Accepts an intermediate key I and a set of values for that key, and merges those values to form a possibly smaller set of values
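The Map and Reduce contract on this slide can be sketched as a small single-process Python program. This is only an illustration of the model, not Google's implementation: the driver `map_reduce`, the word-count functions, and the sample documents are hypothetical names and data chosen for the example, and grouping happens in memory instead of across machines.

```python
from collections import defaultdict


def map_reduce(inputs, mapper, reducer):
    """Hypothetical in-memory driver: run mapper, group by key, run reducer."""
    # Map phase: each input key/value pair yields intermediate key/value pairs.
    intermediate = defaultdict(list)
    for key, value in inputs:
        for ikey, ivalue in mapper(key, value):
            intermediate[ikey].append(ivalue)

    # Reduce phase: each intermediate key is merged with all of its values.
    output = []
    for ikey in sorted(intermediate):
        for result in reducer(ikey, intermediate[ikey]):
            output.append((ikey, result))
    return output


def word_count_map(doc_name, text):
    # Emit (word, 1) for every word in the document.
    for word in text.split():
        yield word.lower(), 1


def word_count_reduce(word, counts):
    # Merge the list of counts into a single total.
    yield sum(counts)


# Illustrative input: (document name, document contents) pairs.
docs = [
    ("doc1", "the quick brown fox"),
    ("doc2", "the lazy dog and the fox"),
]
counts = map_reduce(docs, word_count_map, word_count_reduce)

if __name__ == "__main__":
    for word, total in counts:
        print(word, total)
```

Here the reducer collapses many values for one key into a single count, which is the "possibly smaller set of values" the slide refers to.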
  • Examples
    ● Distributed Grep
    ● Count of URL Access Frequency
    ● Reverse Web-Link Graph
    ● Term-Vector per Host
    ● Inverted Index
    ● Distributed Sort
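Three of the listed examples can be sketched as map/reduce function pairs, following the descriptions in the MapReduce paper: grep emits matching lines and uses an identity reduce, the reverse web-link graph emits (target, source) pairs, and the inverted index emits (word, document ID) pairs. The driver `run_job`, the search pattern, and all sample inputs below are assumptions made for illustration, and everything runs in a single process.

```python
import re
from collections import defaultdict


def run_job(inputs, mapper, reducer):
    """Hypothetical in-memory driver: group mapper output by key, then reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        for ikey, ivalue in mapper(key, value):
            groups[ikey].append(ivalue)
    return {ikey: reducer(ikey, values) for ikey, values in sorted(groups.items())}


# Distributed grep: map emits a line if it matches; reduce is the identity.
PATTERN = re.compile(r"error")  # illustrative pattern


def grep_map(location, line):
    if PATTERN.search(line):
        yield location, line


def grep_reduce(location, lines):
    return lines


# Reverse web-link graph: map emits (target, source); reduce lists the sources.
def links_map(source, targets):
    for target in targets:
        yield target, source


def links_reduce(target, sources):
    return sorted(sources)


# Inverted index: map emits (word, doc_id); reduce sorts the document IDs.
def index_map(doc_id, text):
    for word in set(text.lower().split()):
        yield word, doc_id


def index_reduce(word, doc_ids):
    return sorted(doc_ids)


# Illustrative inputs.
log_lines = [("log:1", "ok start"), ("log:2", "error disk full"), ("log:3", "error timeout")]
pages = [("a.html", ["b.html", "c.html"]), ("b.html", ["c.html"])]
docs = [("d1", "map reduce"), ("d2", "reduce only")]

grep_result = run_job(log_lines, grep_map, grep_reduce)
links_result = run_job(pages, links_map, links_reduce)
index_result = run_job(docs, index_map, index_reduce)
```

Only the mapper and reducer change between the three jobs; the driver stays the same, which is the point of the abstraction.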
  • Execution Overview
  • Conclusions
    ● The MapReduce programming model proved to be a useful abstraction for many different purposes
      ○ Easy to use
        ■ even for programmers without experience with parallel and distributed systems
      ○ A large variety of problems are easily expressible as MapReduce computations
      ○ The implementation scales to large clusters of machines
    ● Greatly simplifies large-scale computations at Google
  • Questions?
    http://labs.google.com/papers/mapreduce.html