mongodb + ex.fm
_id, padding factor, and bucketing, oh my! Slides from my talk at MongoPGH http://www.10gen.com/events/mongodb-pgh May 15, 2012

Transcript

  • 1. mongo @ ex.fm Lucas Hrabovsky CTO #MongoPGH
  • 2. ex.fm turns websites into CDs
  • 3. browser extensions
  • 4. _id and indexes
      • Bad Ideas
          – ObjectId("4fb284…")
          – Big Compound Indexes
          – Long,VariableWidthStringsMissIndexes
      • Good Ideas
          – Make _id mean something
          – Fixed Width Hashes
          – Use _id as a compound index
  • 5. activity feeds: first attempt
      {"_id": "201109122304-lucas-dan-c7dede43…", "username": "lucas", "created": 201109122304, "actor": "dan", "verb": "love"}
      db.user.feed.find({'username': 'lucas', 'verb': 'love'}).sort({'created': -1})
      Working just fine for 4MM documents, but getting slow…
  • 6. new version of activity feeds
      {"_id": "201109122304-lucas-dan-c7dede43…", "uid": "lucas-201109122304", "vid": "lucas-love-201109122304", "actor": "dan"}
      db.user.feed.find({'vid': /^lucas-/}).sort({'vid': -1})
      Fast for all 3 use cases!
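The redesign on slide 6 packs the fields a query needs into one string, so an anchored prefix match and a sort can both be served by a single index on that string. The following is a minimal sketch of that key construction in plain JavaScript with no database involved. `makeFeedDoc` and `findByPrefix` are hypothetical helpers, the second and third sample documents and all hash values are made up, and the in-memory filter only imitates what the prefix query asks an index to do.

```javascript
// Hypothetical helper: builds a feed document shaped like the one on slide 6.
// `created` is a fixed-width YYYYMMDDHHMM stamp, so string order matches time order.
function makeFeedDoc(username, verb, actor, created, hash) {
  return {
    _id: `${created}-${username}-${actor}-${hash}`,
    uid: `${username}-${created}`,
    vid: `${username}-${verb}-${created}`,
    actor: actor,
  };
}

// In-memory stand-in for find({vid: /^prefix/}).sort({vid: -1}).
function findByPrefix(docs, prefix) {
  return docs
    .filter((d) => d.vid.startsWith(prefix))
    .sort((a, b) => (a.vid < b.vid ? 1 : a.vid > b.vid ? -1 : 0));
}

// Sample data: hashes and the "amy" actor are placeholders for illustration.
const feed = [
  makeFeedDoc("lucas", "love", "dan", "201109122304", "c7dede43"),
  makeFeedDoc("lucas", "love", "amy", "201109130915", "0a1b2c3d"),
  makeFeedDoc("dan", "love", "lucas", "201109122310", "9f8e7d6c"),
];

// Newest "love" events for lucas come first.
const lucasLoves = findByPrefix(feed, "lucas-love-");
console.log(lucasLoves.map((d) => d.actor));
```

Because the timestamp is the last, fixed-width component, a shorter prefix such as `lucas-` still matches every verb for that user, while a longer one narrows to a single verb.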
  • 7. removing indexes pays off
      Don't need to buy more/bigger machines!
  • 8. sites! sites! sites!
  • 9. padding factor
      • Variable document size
      • Allocate for the latest and fattest
      • Document moves
      • Can be very inefficient
      • More RAM!
      • Pre-allocate to prevent moves
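One way to read "pre-allocate to prevent moves" is to insert each document at the size it is expected to reach, using a throwaway filler field, and then remove the filler so later growth fits in the space already reserved. The following is a minimal sketch of building such a padded document in plain JavaScript. `withPadding`, the `_pad` field name, and the 1024-byte target are hypothetical, and JSON length is only a rough proxy for the stored BSON size.

```javascript
// Hypothetical helper: returns a copy of `doc` with a filler string sized so the
// JSON-serialized document is at least `targetBytes` characters long.
// Assumes ASCII content, so characters and bytes coincide.
function withPadding(doc, targetBytes) {
  const currentSize = JSON.stringify(doc).length;
  const fillerLength = Math.max(0, targetBytes - currentSize);
  return Object.assign({}, doc, { _pad: "x".repeat(fillerLength) });
}

// Illustrative document; the _id value is a placeholder.
const site = { _id: "example-site", songs: [] };
const padded = withPadding(site, 1024);

console.log(JSON.stringify(site).length, JSON.stringify(padded).length);
```

In the shell, the padded document would be inserted first and the filler then dropped with an update such as `{$unset: {_pad: 1}}`, assuming a storage engine that updates documents in place.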
  • 10. unbounded embedded lists
      • Useful for followers, favorites
      • Good for a few things, bad for lots
      • Constantly bumping up padding factor
      • Lots of document moves
  • 11. a metaphor
      • You run a coffee shop and can buy only one size of cup. Which size do you buy?
      • On average, each customer has only one cup
      • Heavy drinkers have hundreds of cups
      credit: Macintex macintex.deviantart.com
  • 12. bucketing!
      • Split list across multiple documents
      • Median number of items = bucket size
      • Pre-allocate
      • Easy seeking and traversal
      • Much faster
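The bucketing idea on slide 12 can be sketched in a few lines: pick the median list length as the bucket size, then split a long list into several small documents instead of one ever-growing one. This is an illustration in plain JavaScript, not the ex.fm implementation. `median`, `toBuckets`, the `items` field, the `owner-n` `_id` scheme, and the sample counts are all assumptions.

```javascript
// Median of a list of counts; slide 12 suggests using it as the bucket size.
function median(counts) {
  const sorted = [...counts].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Split one long list into bucket documents of at most `bucketSize` items.
// The `${ownerId}-${n}` _id scheme is an assumption for illustration.
function toBuckets(ownerId, items, bucketSize) {
  if (!Number.isInteger(bucketSize) || bucketSize < 1) {
    throw new Error("bucketSize must be a positive integer");
  }
  const buckets = [];
  for (let i = 0; i < items.length; i += bucketSize) {
    buckets.push({
      _id: `${ownerId}-${buckets.length + 1}`,
      items: items.slice(i, i + bucketSize),
    });
  }
  return buckets;
}

// Made-up distribution: most sites have a few songs, one has hundreds.
const songsPerSite = [1, 2, 3, 3, 400];
const bucketSize = median(songsPerSite);

// Eight placeholder song ids for one site, split into buckets.
const songs = Array.from({ length: 8 }, (_, i) => 1000 + i);
const buckets = toBuckets("site2", songs, bucketSize);
console.log(buckets.map((b) => `${b._id}: ${b.items.length} items`));
```

Every bucket except possibly the last holds the same number of items, so each document can be allocated at its final size up front, and seeking to the nth item means computing which bucket number to fetch.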
  • 13. hey charts! [Chart of allocated space for site.meta 1, site.meta 2, site.songs 1, and site.songs 2. Legend: allocated and unused; allocated and full of data.]
  • 14. same charts when using bucketing [Chart of allocated space for site.meta 1 and site.meta 2, with site.songs 1 split into buckets 1-1 and 1-2 and site.songs 2 split into buckets 2-1 through 2-6. Legend: allocated and unused; allocated and full of data.]
  • 15. doesn’t work for everything…
      • Picking the right bucket size
      • Defragging
      • Random insertion
          – Easy for things you don't much care about the order of
          – More difficult if you're going to insert and change the order later
  • 16. micro documents
      db.site.songs.find({_id: /^bfc25de08d964a8a41226c6016dd7753-/}).sort({_id: -1})
      { "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029114", "s" : 18436532 }
      { "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029113", "s" : 18804590 }
      { "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029112", "s" : 18804591 }
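The `_id` values on slide 16 look like a fixed-width 32-character hex prefix followed by a 10-digit Unix timestamp. The following is a minimal sketch of building ids of that shape with Node's built-in `crypto` module. The slides do not say how the prefix is computed, so hashing a site URL with MD5 is an assumption chosen only because it yields 32 hex characters. `siteKey`, `songDocId`, `sitePrefixPattern`, and the example URL are hypothetical.

```javascript
const crypto = require("crypto");

// Assumption: the 32-hex-character prefix identifies the site. MD5 of a URL is
// used here only because it produces 32 hex characters; the talk does not say.
function siteKey(siteUrl) {
  return crypto.createHash("md5").update(siteUrl).digest("hex");
}

// One micro document id: fixed-width site key, a dash, then Unix seconds.
function songDocId(siteUrl, unixSeconds) {
  return `${siteKey(siteUrl)}-${unixSeconds}`;
}

// Anchored pattern of the kind passed to find({_id: ...}) on slide 16.
function sitePrefixPattern(siteUrl) {
  return new RegExp(`^${siteKey(siteUrl)}-`);
}

const exampleUrl = "http://example.com/";
const id = songDocId(exampleUrl, 1337029114);
console.log(id, sitePrefixPattern(exampleUrl).test(id));
```

Because both the hash and the timestamp have a fixed width, sorting these strings in descending order returns a site's newest entries first, which is what the `.sort({_id: -1})` on the slide relies on.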
  • 17. paying it back
      • Bent mongoengine to make this easy
      • Follow github.com/exfm
      • Also added tooling for
          – Trace all queries
          – Aggregate tracing by request middleware
          – Raise exceptions when queries miss an index
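The last tooling item, raising an exception when a query misses an index, can be illustrated with a small check on explain output. This is not the ex.fm tooling, which the slide says was built into mongoengine. It is a sketch in JavaScript that assumes the explain format of MongoDB releases from around the time of the talk, where a full collection scan is reported as `cursor: "BasicCursor"`. `assertUsesIndex` and the sample explain objects are hypothetical.

```javascript
// Hypothetical guard: throws if an explain() result reports a full collection scan.
// Assumes the older explain shape with a top-level `cursor` field.
function assertUsesIndex(explainResult) {
  if (explainResult.cursor === "BasicCursor") {
    throw new Error("query missed an index (full collection scan)");
  }
  return explainResult;
}

// Hand-written sample explain results, not captured from a real server.
const indexed = { cursor: "BtreeCursor vid_1", nscanned: 12 };
const unindexed = { cursor: "BasicCursor", nscanned: 4000000 };

assertUsesIndex(indexed); // passes silently

try {
  assertUsesIndex(unindexed);
} catch (err) {
  console.log(err.message);
}
```

Running a check like this in development or test environments turns a silent slow query into an immediate failure, before the collection grows large enough for the scan to hurt.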
  • 18. thanks! lucas@ex.fm github.com/exfm