Map/Confused? A practical approach to Map/Reduce with MongoDB

Talk given at MongoDB Munich on 16 October 2012 about the different ways to apply the Map/Reduce algorithm in MongoDB. Using a practical use case, the talk compares the performance of built-in MongoDB Map/Reduce, group(), aggregate(), find() and the MongoDB-Hadoop Adapter.

Published in: Technology

Transcript

  • 1.      
  • 2.      
  • 3.      
  • 4. {
         "_id" : ObjectId("4fb9fb91d066d657de8d6f36"),
         "text" : "MongoDB uses Map/Reduce #epic #win",
         …
         "user" : {
           "friends_count" : 73,
           …
           "followers_count" : 102,
           "id" : 53507833,
         },
         …
       }
  • 5. mongod --rest --shardsvr --port 27017 --dbpath /tmp/shard1/ --smallfiles
       mongod --rest --shardsvr --port 27017 --dbpath /tmp/shard1/ --smallfiles
       mongod --configsvr --port 10000 --dbpath /tmp/config/ --smallfiles
       mongos --port 22222 --configdb localhost:10000

       1. db.tweets.mapReduce()
       2. db.tweets.group()
       3. db.tweets.aggregate()
       4. MongoDB-Hadoop Adapter
       5. db.tweets.find()
  • 6. // Run the callback c once and report its result and wall-clock time in ms.
       var measure = function(c) {
         var a = Date.now();
         var results = c.apply();
         var d = Date.now() - a;
         return { results: results, duration: d };
       };
  • 7. // Map: emit every tweet's user under the single key "user".
       function() {
         if (this.user != null) {
           emit("user", {
             userName: this.user.name,
             followers: this.user.followers_count
           });
         }
       }
  • 8. // Reduce: keep the emitted value with the highest follower count.
       function(key, values) {
         var result = null;
         values.forEach(function(value) {
           if (result == null || result.followers < value.followers) {
             result = value;
           }
         });
         return result;
       }
  • 9. db.tweets.group({
         key: {},
         initial: { name: "", followers_count: 0 },
         // Fold every tweet into prev, keeping the user with the most followers.
         reduce: function(obj, prev) {
           if (obj.user != null &&
               prev.followers_count < obj.user.followers_count) {
             prev.name = obj.user.name;
             prev.followers_count = obj.user.followers_count;
           }
         }
       })
  • 10. db.tweets.aggregate(
          {$group: {
            _id: {user_name: "$user.name"},
            followers_count: {$max: "$user.followers_count"}
          }},
          {$sort: {"followers_count" : -1}},
          {$limit : 1},
          {$project: {
            _id : 0,
            user_name : "$_id.user_name",
            followers_count : "$followers_count"
          }}
        )
  • 11. #!/usr/bin/env python
        # encoding: utf-8
        # Python 2 script (it uses the "print >>" statement).
        import sys
        sys.path.append(".")
        from pymongo_hadoop import BSONMapper

        def mapper(documents):
            # Emit one record per tweet that has a user, keyed by user name.
            for doc in documents:
                if doc['user'] is not None:
                    yield {'_id': doc['user']['name'].encode('utf-8'),
                           'followers': doc['user']['followers_count']}

        BSONMapper(mapper)
        print >> sys.stderr, "Done Mapping!"
  • 12. #!/usr/bin/env python
        # encoding: utf-8
        # Python 2 script (it uses the "print >>" statement).
        import sys
        sys.path.append(".")
        from pymongo_hadoop import BSONReducer

        def reducer(key, values):
            # Keep the highest follower count seen for this key.
            print >> sys.stderr, "Processing key %s" % key.encode('utf-8')
            _count = 0
            for v in values:
                if _count < v['followers']:
                    _count = v['followers']
            return {'_id': key.encode('utf-8'), 'count': _count}

        BSONReducer(reducer)
        print >> sys.stderr, "Done Reducing!"
  • 13. hadoop jar /usr/lib/hadoop/lib/mongo-hadoop-streaming-assembly-1.1.0-SNAPSHOT.jar \
          -files mapper.py,reducer.py \
          -inputURI mongodb://localhost:27017/twitter.tweets \
          -outputURI mongodb://localhost:27017/twitter.top_user \
          -mapper mapper.py \
          -reducer reducer.py
  • 14. db.tweets.find().sort( {"user.followers_count": -1} ).limit(1)
  • 15. db.tweets.mapReduce(), db.tweets.group(), db.tweets.aggregate(), MongoDB-Hadoop Adapter, db.tweets.find()
  • 16. db.tweets.mapReduce(), db.tweets.group(), db.tweets.aggregate(), MongoDB-Hadoop Adapter, db.tweets.find()
  • 17. db.tweets.mapReduce(), db.tweets.group(), db.tweets.aggregate(), MongoDB-Hadoop Adapter, db.tweets.find()
  • 18. db.tweets.mapReduce(), db.tweets.group(), db.tweets.aggregate(), MongoDB-Hadoop Adapter, db.tweets.find()
  • 19. db.tweets.mapReduce(), db.tweets.group(), db.tweets.aggregate(), MongoDB-Hadoop Adapter, db.tweets.find()
  • 20.   
  • 21. 
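
The map and reduce functions on slides 7 and 8 can be traced without a database. The following is a minimal Python sketch, not the speaker's code: the sample documents and the helper names (`map_tweet`, `reduce_users`, `run_map_reduce`) are hypothetical, and the grouping step only imitates what MongoDB does between map and reduce. It also checks that reducing partial results gives the same answer, which matters because MongoDB may call reduce more than once for the same key.

```python
from collections import defaultdict

# Hypothetical sample documents, shaped like the tweet on slide 4.
tweets = [
    {"text": "a", "user": {"name": "alice", "followers_count": 102}},
    {"text": "b", "user": {"name": "bob", "followers_count": 2500}},
    {"text": "c", "user": None},
    {"text": "d", "user": {"name": "carol", "followers_count": 73}},
]


def map_tweet(doc):
    """Mirror of the slide 7 map function: one pair per tweet that has a user."""
    user = doc.get("user")
    if user is not None:
        yield "user", {"userName": user["name"],
                       "followers": user["followers_count"]}


def reduce_users(key, values):
    """Mirror of the slide 8 reduce function: keep the most-followed value."""
    result = None
    for value in values:
        if result is None or result["followers"] < value["followers"]:
            result = value
    return result


def run_map_reduce(docs):
    """Group emitted values by key, then reduce each group once."""
    emitted = defaultdict(list)
    for doc in docs:
        for key, value in map_tweet(doc):
            emitted[key].append(value)
    return {key: reduce_users(key, values) for key, values in emitted.items()}


top = run_map_reduce(tweets)["user"]
print(top)

# Re-reduce check: reduce two halves separately, then reduce the partial results.
first_half = [value for doc in tweets[:2] for _, value in map_tweet(doc)]
second_half = [value for doc in tweets[2:] for _, value in map_tweet(doc)]
partial = [reduce_users("user", first_half), reduce_users("user", second_half)]
rereduced = reduce_users("user", partial)
```

Because every tweet is emitted under the single key "user", the whole collection funnels into one reduce group; the reduce output has the same shape as the emitted values, which is what makes the re-reduce step safe.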

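The mapper and reducer on slides 11 and 12 can be dry-run locally before submitting the Hadoop job. This sketch is an assumption-laden Python 3 port: it drops `BSONMapper`/`BSONReducer` and the `.encode('utf-8')` calls, uses `doc.get("user")` so that tweets without a user field are skipped, and the `run_locally` driver and sample data are hypothetical stand-ins for Hadoop's sort-and-group step.

```python
from itertools import groupby
from operator import itemgetter


def mapper(documents):
    """Port of the slide 11 mapper: one record per tweet, keyed by user name."""
    for doc in documents:
        if doc.get("user") is not None:
            yield {"_id": doc["user"]["name"],
                   "followers": doc["user"]["followers_count"]}


def reducer(key, values):
    """Port of the slide 12 reducer: highest follower count seen for the key."""
    count = 0
    for v in values:
        if count < v["followers"]:
            count = v["followers"]
    return {"_id": key, "count": count}


def run_locally(documents):
    """Hypothetical driver: sort mapped records by key, group, reduce each group."""
    mapped = sorted(mapper(documents), key=itemgetter("_id"))
    return [reducer(key, list(group))
            for key, group in groupby(mapped, key=itemgetter("_id"))]


# Hypothetical sample input; alice appears twice with different counts.
sample = [
    {"user": {"name": "alice", "followers_count": 102}},
    {"user": {"name": "bob", "followers_count": 2500}},
    {"user": {"name": "alice", "followers_count": 110}},
    {"user": None},
]

results = run_locally(sample)
for row in results:
    print(row)
```

Unlike the shell version on slides 7 and 8, this job is keyed by user name, so it produces one document per user rather than a single winner; picking the top user from the output collection would take a further sort-and-limit step.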