MongoSF - mongodb @ foursquare

Talk at MongoSF 2011, on how Foursquare uses MongoDB. 5/24/2011


    1. mongodb @ foursquare
        • MongoSF, 5/24/2011
        • Jorge Ortiz (@jorgeortiz85)
    2. what is foursquare?
        • location-based social network: "check-in" to bars, restaurants, museums, parks, etc.
        • friend-finder (where are my friends right now?)
        • virtual game (badges, points, mayorships)
        • city guide (local, personalized recommendations)
        • location diary + stats engine (where was I a year ago?)
        • specials (get rewards at your favorite restaurant)
    3. foursquare: the numbers
        • >9M users
        • ~3M checkins/day
        • >15M venues
        • >300k merchants
        • >60 employees
    4. foursquare: the tech
        • Nginx, HAProxy
        • Scala, Lift
        • MongoDB, PostgreSQL (legacy)
        • (Kestrel, Munin, Ganglia, Python, Memcache, ...)
        • All on EC2
    5. foursquare <3's mongodb
        • fast
        • indexes & rich queries
        • sharding, auto-balancing
        • replication (see: http://engineering.foursquare.com/)
        • geo-indexes
        • amazing support
    6. mongodb: our numbers
        • 8 clusters (some sharded, some not; some master/slave, some replica set)
        • ~40 machines (68.4 GB, m2.4xl on EC2)
        • 2.3 billion records
        • ~15k QPS
    7. mongodb: lessons learned
        • keep working set in memory
        • avoid long-running queries (reads or writes)
        • monitor everything (especially per-collection stats)
        • shard from day 1
        • beware EBS
        • use small field names for large collections
    8. keep working set in memory
    9. avoid long-running queries
    10. monitor everything (per-collection stats)
    11. shard from day 1
    12. beware EBS
    13. use small field names for large collections
    14. mongodb: pain points
        • mongos: failover and thundering herds
        • index creation: production impact unclear
        • auto-balancing: getting there
        • replication chains: use replica sets
    15. rogue: a scala dsl for mongo
        • type-safe
        • all mongo query features
        • logging & validation hooks
        • pagination
        • index-aware
        • http://github.com/foursquare/rogue
    16. mongo-java-driver query
        import com.mongodb.BasicDBObjectBuilder

        val query = (BasicDBObjectBuilder
          .start
          .push("mayorid")
            .add("$lte", Int.box(100))  // box the Int: add takes an Object
          .pop
          .push("venuename")
            .add("$eq", "Starbucks")
          .pop
          .get)
    17. rogue: code example
        (Venue
          where (_.mayorid <= 100)
          and (_.venuename eqs "Starbucks")
          and (_.tags contains "wifi")
          and (_.latlng near (39.0, -74.0, Degrees(0.2)))
          orderDesc (_._id)
          fetch (5))
    18. rogue: schema example
        class Venue extends MongoRecord[Venue] {
          object _id extends ObjectIdField(this)
          object venuename extends StringField(this)
          object mayorid extends LongField(this)
          object tags extends ListField[String](this)
          object latlng extends LatLngField(this)
        }
    19. rogue: logging & validation
        • logging: slf4j, Tracer
        • validation: radius, $in size, index checks
    20. rogue: pagination
        val query: Query[Venue] = ...
        val vs: List[Venue] = (query
          .countPerPage(20)
          .setPage(5)
          .fetch())
    21. rogue: cursors
        val query: Query[Checkin] = ...
        for (checkin <- query) {
          ... f(checkin) ...
        }
    22. rogue: index-aware
        val vs: List[Checkin] = (Checkin
          where (_.userid eqs 646)
          and (_.venueid eqs vid)
          fetch ())
    23. rogue: index-aware
        val vs: List[Checkin] = (Checkin
          where (_.userid eqs 646)
          and (_.venueid eqs vid)  // hidden scan!
          fetch ())
    24. rogue: index-aware
        val vs: List[Checkin] = (Checkin
          where (_.userid eqs 646)  // known index scan
          (_.venueid eqs vid)       // known scan
          fetch ())
    25. rogue: future directions
        • iteratees for cursors
        • compile-time index checking
        • select partial objects
        • generate full javascript for mapreduce
    26. we're hiring (nyc & sf)
        • http://foursquare.com/jobs
        • jorge@foursquare.com
        • @jorgeortiz85
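
The advice to use small field names for large collections follows from how BSON lays out a document: each document carries its own field names as NUL-terminated strings, so every byte in a name is paid for once per document. The Scala sketch below works through that arithmetic. The short name `vn`, the helper names, and the count of one billion documents are hypothetical, chosen only for illustration. The estimate also ignores indexes, padding, and anything else the storage layer does.

```scala
object FieldNameSavings {
  // A BSON element stores its field name as a NUL-terminated string,
  // so a name costs its UTF-8 length plus one byte in every document.
  def nameBytes(name: String): Long =
    name.getBytes("UTF-8").length.toLong + 1L

  // Bytes saved across a collection by renaming one field.
  def bytesSaved(longName: String, shortName: String, documents: Long): Long =
    (nameBytes(longName) - nameBytes(shortName)) * documents

  def main(args: Array[String]): Unit = {
    val documents = 1000000000L // hypothetical collection size
    // "venuename" comes from the schema slide; "vn" is an assumed replacement.
    val saved = bytesSaved("venuename", "vn", documents)
    println(s"bytes saved: $saved")
  }
}
```

Under these assumptions, renaming `venuename` to `vn` saves 7 bytes per document, or about 7 GB across a billion documents. That is a meaningful amount when the goal is to keep the working set in memory.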
