mongodb + ex.fm

_id, padding factor, and bucketing, oh my! Slides from my talk at MongoPGH http://www.10gen.com/events/mongodb-pgh May 15, 2012

Transcript

  • 1. mongo @ ex.fm Lucas Hrabovsky CTO #MongoPGH
  • 2. ex.fm turns websites into CDs
  • 3. browser extensions
  • 4. _id and indexes
    • Bad Ideas
      – ObjectId("4fb284…")
      – Big Compound Indexes
      – Long, Variable Width Strings Miss Indexes
    • Good Ideas
      – Make _id mean something
      – Fixed Width Hashes
      – Use _id as a compound index
  • 5. activity feeds: first attempt
    {"_id": "201109122304-lucas-dan-c7dede43…", "username": "lucas", "created": 201109122304, "actor": "dan", "verb": "love"}
    db.user.feed.find({'username': 'lucas', 'verb': 'love'}).sort({'created': -1})
    Working just fine for 4MM documents, but getting slow…
  • 6. new version of activity feeds
    {"_id": "201109122304-lucas-dan-c7dede43…", "uid": "lucas-201109122304", "vid": "lucas-love-201109122304", "actor": "dan"}
    db.user.feed.find({'vid': /^lucas-/}).sort({'vid': -1})
    Fast for all 3 use cases!
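The composite key on slide 6 can be sketched outside the database. This is a minimal illustration, not the ex.fm code: `make_vid`, `prefix_filter`, and `feed` are hypothetical helper names, and `feed` is an in-memory stand-in for `find(...).sort({'vid': -1})`. It assumes timestamps are fixed width (12 digits, as in the slides) so that string order matches time order.

```python
import re


def make_vid(username, verb, created):
    """Build a 'vid'-style key: username-verb-timestamp (hypothetical helper)."""
    return f"{username}-{verb}-{created}"


def prefix_filter(field, prefix):
    """MongoDB-style filter for values of `field` starting with `prefix`.

    The pattern is anchored with ^, which is what lets a prefix match
    walk an index instead of testing every value.
    """
    return {field: {"$regex": "^" + re.escape(prefix)}}


def feed(docs, username, verb=None):
    """In-memory stand-in for find(prefix filter).sort({'vid': -1})."""
    prefix = f"{username}-{verb}-" if verb else f"{username}-"
    matching = [d for d in docs if d["vid"].startswith(prefix)]
    # Descending string sort: newest first within a single verb.
    return sorted(matching, key=lambda d: d["vid"], reverse=True)


docs = [
    {"vid": make_vid("lucas", "love", 201109122304), "actor": "dan"},
    {"vid": make_vid("lucas", "love", 201109122305), "actor": "ann"},
    {"vid": make_vid("dan", "love", 201109122306), "actor": "lucas"},
]
for doc in feed(docs, "lucas", "love"):
    print(doc["vid"], doc["actor"])
```

Note that with only a username prefix, the descending sort groups entries by verb before ordering them by time, because the verb comes first in the key.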
  • 7. removing indexes pays off
    Don't need to buy more/bigger machines!
  • 8. sites! sites! sites!
  • 9. padding factor
    • Variable document size
    • Allocate for the latest and fattest
    • Document moves
    • Can be very inefficient
    • More RAM!
    • Pre-allocate to prevent moves
  • 10. unbounded embedded lists
    • Useful for followers, favorites
    • Good for a few things, bad for lots
    • Constantly bumping up padding factor
    • Lots of document moves
  • 11. a metaphor
    • You run a coffee shop and can buy only one size of cup. Which size do you buy?
    • On average, each customer has only one cup
    • Heavy drinkers have hundreds of cups
    credit: Macintex, macintex.deviantart.com
  • 12. bucketing!
    • Split list across multiple documents
    • Median number of items = bucket size
    • Pre-allocate
    • Easy seeking and traversal
    • Much faster
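A rough sketch of the bucketing idea on slide 12, under stated assumptions: the helper names (`bucket_id`, `locate`, `to_buckets`) and the document fields (`count`, `items`) are invented for illustration, and padding each bucket with `None` slots only stands in for pre-allocation. How much space a real server reserves depends on the storage engine.

```python
def bucket_id(owner, number):
    """_id for one bucket, e.g. 'site1-2' is the second bucket of site1."""
    return f"{owner}-{number}"


def locate(position, bucket_size):
    """Map a 0-based list position to (1-based bucket number, offset in bucket)."""
    bucket, offset = divmod(position, bucket_size)
    return bucket + 1, offset


def to_buckets(owner, items, bucket_size):
    """Split one long list into fixed-size bucket documents."""
    buckets = []
    for start in range(0, len(items), bucket_size):
        chunk = items[start:start + bucket_size]
        buckets.append({
            "_id": bucket_id(owner, start // bucket_size + 1),
            "count": len(chunk),
            # Placeholder slots so every bucket has the same shape up front.
            "items": chunk + [None] * (bucket_size - len(chunk)),
        })
    return buckets


buckets = to_buckets("site1", [101, 102, 103, 104, 105], bucket_size=2)
for b in buckets:
    print(b["_id"], b["items"])
```

Seeking to an item becomes arithmetic: `locate(4, 2)` says the fifth item lives at offset 0 of bucket 3, so only that one small document needs to be read or rewritten.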
  • 13. hey charts!
    [Chart: allocated space for site.meta 1, site.meta 2, site.songs 1, and site.songs 2. Legend: allocated and unused; allocated and full of data.]
  • 14. same charts when using bucketing
    [Chart: site.meta 1 and site.meta 2, with site.songs 1 split into buckets 1-1 and 1-2, and site.songs 2 split into buckets 2-1 through 2-6. Legend: allocated and unused; allocated and full of data.]
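The difference between the two charts can be approximated with some arithmetic. This is a deliberately simplified model, not a description of how MongoDB computes its padding factor: it assumes the unbucketed case ends up sizing every document for the longest list ("the latest and fattest"), and that the bucketed case uses the median list length as the bucket size. The list lengths are made up for the example.

```python
import math
from statistics import median


def slots_unbucketed(lengths):
    """Simplified model: every document is sized for the longest list."""
    return max(lengths) * len(lengths)


def slots_bucketed(lengths, bucket_size):
    """Each list gets as many fixed-size buckets as it needs (at least one)."""
    return sum(max(1, math.ceil(n / bucket_size)) * bucket_size for n in lengths)


# Made-up list lengths: four ordinary sites and one very heavy one.
lengths = [8, 10, 12, 10, 600]
used = sum(lengths)
bucket_size = int(median(lengths))

print("items stored:     ", used)
print("bucket size:      ", bucket_size)
print("unbucketed slots: ", slots_unbucketed(lengths))
print("bucketed slots:   ", slots_bucketed(lengths, bucket_size))
```

In this toy model the heavy list no longer dictates the size of every other document; the cost is that it now spans many small documents.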
  • 15. doesn’t work for everything…
    • Picking the right bucket size
    • Defragging
    • Random insertion
      – Easy for things you don't much care about the order of
      – More difficult if you're going to insert and change the order later
  • 16. micro documents
    db.site.songs.find({_id: /^bfc25de08d964a8a41226c6016dd7753-/}).sort({_id: -1})
    { "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029114", "s" : 18436532 }
    { "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029113", "s" : 18804590 }
    { "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029112", "s" : 18804591 }
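A sketch of how micro documents like these might be built and queried. Several things here are assumptions: the slides show only a 32-character hex prefix, so using an MD5 of the site URL in `site_key` is a guess; `song_doc` and `prefix_range` are hypothetical helpers; and the `$gte`/`$lt` range is offered as an alternative to the anchored regex on the slide, not something the talk describes. The range relies on plain byte-wise string comparison.

```python
import hashlib


def site_key(url):
    """Fixed-width key for a site. Assumption: MD5 hex digest of the URL."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def song_doc(site, timestamp, song_id):
    """One tiny document per (site, song): the _id carries site and time."""
    return {"_id": f"{site}-{timestamp}", "s": song_id}


def prefix_range(prefix):
    """Half-open range covering every string that starts with `prefix`.

    The upper bound is the prefix with its last character bumped by one,
    so for 'abc-' the range is ['abc-', 'abc.').
    """
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return {"$gte": prefix, "$lt": upper}


site = "bfc25de08d964a8a41226c6016dd7753"
doc = song_doc(site, 1337029114, 18436532)
bounds = prefix_range(site + "-")
print(doc)
print(bounds["$gte"] <= doc["_id"] < bounds["$lt"])
```

Because the site key is fixed width and the timestamp follows it, sorting on `_id` alone orders one site's songs by time without any secondary index.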
  • 17. paying it back
    • Bent mongoengine to make this easy
    • Follow github.com/exfm
    • Also added tooling for
      – Trace all queries
      – Aggregate tracing by request middleware
      – Raise exceptions when queries miss an index
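The last tooling item, raising when a query misses an index, could look roughly like the following. This is not the ex.fm mongoengine code; it is a minimal sketch that inspects an `explain()` result passed in as a dict. It assumes a full scan shows up either as `"cursor": "BasicCursor"` (older servers, as at the time of the talk) or as a `COLLSCAN` stage inside `queryPlanner.winningPlan` (newer servers). `MissedIndexError` and both function names are made up for the example.

```python
class MissedIndexError(Exception):
    """Raised when a query plan falls back to scanning the whole collection."""


def uses_collection_scan(plan):
    """Recursively look for a full-scan marker in an explain() plan."""
    if isinstance(plan, dict):
        if plan.get("stage") == "COLLSCAN" or plan.get("cursor") == "BasicCursor":
            return True
        return any(uses_collection_scan(value) for value in plan.values())
    if isinstance(plan, list):
        return any(uses_collection_scan(item) for item in plan)
    return False


def assert_indexed(explain_output):
    """Raise MissedIndexError if the chosen plan is a full collection scan."""
    planner = explain_output.get("queryPlanner")
    # Newer explain output: check only the plan the server actually chose.
    plan = planner.get("winningPlan", {}) if planner else explain_output
    if uses_collection_scan(plan):
        raise MissedIndexError("query would scan the whole collection")


legacy_scan = {"cursor": "BasicCursor", "n": 3}
modern_indexed = {
    "queryPlanner": {
        "winningPlan": {"stage": "FETCH", "inputStage": {"stage": "IXSCAN"}}
    }
}

assert_indexed(modern_indexed)  # passes silently
try:
    assert_indexed(legacy_scan)
except MissedIndexError as exc:
    print("caught:", exc)
```

Wiring a check like this into a test suite or development middleware turns a slow query into a loud failure before it reaches production.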
  • 18. thanks!
    lucas@ex.fm
    github.com/exfm