mongo @ ex.fm
 Lucas Hrabovsky
      CTO
   #MongoPGH
ex.fm turns websites into CDs
browser extensions
_id and indexes
• Bad Ideas
  – ObjectId("4fb284…")
  – Big Compound Indexes
  – Long,VariableWidthStringsMissIndexes
• Good Ideas
  – Make _id mean something
  – Fixed Width Hashes
  – Use _id as a compound index
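
As a rough sketch of the "Good Ideas" above, a meaningful `_id` could be built from a fixed-width hash followed by a zero-padded timestamp. The helper names `fixedWidthHash` and `makeId` are hypothetical, and MD5 is an assumption: the slides do not say which hash ex.fm used.

```javascript
const crypto = require("crypto");

// Hypothetical helper: hash an arbitrary-length key down to a fixed width.
// MD5 is an assumption, chosen here for its 32-character hex output.
function fixedWidthHash(key) {
  return crypto.createHash("md5").update(key).digest("hex");
}

// Hypothetical helper: build a string _id of the form "<hash>-<timestamp>".
// Both parts have a fixed width, so plain string order on _id matches
// (hash, timestamp) order and the default _id index can stand in for a
// compound index on those two values.
function makeId(key, unixSeconds) {
  const ts = String(unixSeconds).padStart(10, "0");
  return `${fixedWidthHash(key)}-${ts}`;
}

const a = makeId("http://example.com/", 999999999);
const b = makeId("http://example.com/", 1337029114);
console.log(a.length, a < b);
```

The zero-padding is what keeps string order and numeric order in agreement; without it, a 9-digit timestamp would sort after a 10-digit one.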
activity feeds: first attempt
{"_id": "201109122304-lucas-dan-c7dede43…",
"username": "lucas", "created": 201109122304,
"actor": "dan", "verb": "love"}


db.user.feed.find({'username': 'lucas', 'verb': 'love'})
.sort({'created': -1})



Working just fine for 4 million documents, but getting slow…
new version of activity feeds
{"_id": "201109122304-lucas-dan-c7dede43…",
"uid": "lucas-201109122304",
"vid": "lucas-love-201109122304", "actor": "dan"}


db.user.feed.find({'vid': /^lucas-/})
.sort({'vid': -1})

Fast for all 3 use cases!
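
The redesign folds username, verb, and timestamp into prefix-searchable strings. The sketch below shows one way such keys might be composed, with an in-memory stand-in for the anchored-prefix query and descending sort. `feedDoc` and `byPrefixDesc` are hypothetical names, `suffix` stands in for the part of `_id` that the slide elides, and nothing here talks to MongoDB.

```javascript
// Hypothetical helper: build a feed document shaped like the slide's example.
// `suffix` is a placeholder for the elided tail of _id.
function feedDoc({ username, actor, verb, created, suffix }) {
  return {
    _id: `${created}-${username}-${actor}-${suffix}`,
    uid: `${username}-${created}`,
    vid: `${username}-${verb}-${created}`,
    actor,
  };
}

// In-memory stand-in for find({field: /^prefix/}).sort({field: -1}).
function byPrefixDesc(docs, field, prefix) {
  return docs
    .filter((d) => d[field].startsWith(prefix))
    .sort((x, y) => (x[field] < y[field] ? 1 : x[field] > y[field] ? -1 : 0));
}

const docs = [
  feedDoc({ username: "lucas", actor: "dan", verb: "love", created: "201109122304", suffix: "a" }),
  feedDoc({ username: "lucas", actor: "amy", verb: "love", created: "201109130915", suffix: "b" }),
  feedDoc({ username: "dan", actor: "lucas", verb: "love", created: "201109140000", suffix: "c" }),
];

const lucasLoves = byPrefixDesc(docs, "vid", "lucas-love-");
console.log(lucasLoves.map((d) => d.vid));
```

Because the timestamp part has a fixed width, descending string order on `vid` is also newest-first, so one indexed field can serve both the filter and the sort.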
removing indexes pays off




Don't need to buy more/bigger machines!
sites! sites! sites!
padding factor
•   Variable document size
•   Allocate for the latest and fattest
•   Document moves
•   Can be very inefficient
•   More RAM!
•   Pre-allocate to prevent moves
unbounded embedded lists
•   Useful for followers, favorites
•   Good for a few things, bad for lots
•   Constantly bumping up padding factor
•   Lots of document moves
a metaphor
     • You run a coffee shop and can buy only
       one size of cup. Which size do you buy?
     • On average, each customer has only one
       cup
     • Heavy drinkers have hundreds of cups




credit: Macintex macintex.deviantart.com
bucketing!
•   Split list across multiple documents
•   Median number of items = bucket size
•   Pre-allocate
•   Easy seeking and traversal
•   Much faster
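
A minimal sketch of the bucketing idea, under stated assumptions: the `"<listId>-<bucketNumber>"` id layout, the `count` field, and the helper names `bucketIdFor` and `toBuckets` are all hypothetical, and the placeholder value `0` assumes the real items are numbers of the same stored size.

```javascript
// Hypothetical helper: _id of the bucket that holds the item at `index`.
// Seeking is just integer division, which is what makes traversal easy.
function bucketIdFor(listId, index, bucketSize) {
  return `${listId}-${Math.floor(index / bucketSize)}`;
}

// Split a flat list into bucket documents. Every bucket is pre-allocated
// with `bucketSize` slots holding a same-typed placeholder (0), and `count`
// records how many slots hold real items.
function toBuckets(listId, items, bucketSize) {
  const buckets = [];
  for (let start = 0; start < items.length; start += bucketSize) {
    const chunk = items.slice(start, start + bucketSize);
    const slots = new Array(bucketSize).fill(0);
    chunk.forEach((item, i) => {
      slots[i] = item;
    });
    buckets.push({
      _id: bucketIdFor(listId, start, bucketSize),
      count: chunk.length,
      items: slots,
    });
  }
  return buckets;
}

const buckets = toBuckets("site42", [11, 12, 13, 14, 15], 2);
console.log(buckets.map((b) => [b._id, b.count]));
```

With the bucket size near the median list length, most lists would fit in a single bucket while the rare very long list spreads over many fixed-size documents instead of one ever-growing one.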
hey charts!
[Chart: allocated space for site.meta 1, site.meta 2, site.songs 1, and site.songs 2. Legend: allocated and unused; allocated and full of data.]
same charts when using bucketing
[Chart: allocated space for site.meta 1 and site.meta 2, with site.songs 1 split into buckets 1 - 1 and 1 - 2, and site.songs 2 split into buckets 2 - 1 through 2 - 6. Legend: allocated and unused; allocated and full of data.]
doesn’t work for everything…
• Picking right bucket size
• Defragging
• Random insertion
  – Easy for things you don't much care about the order of
  – More difficult if you're going to insert and change the order later
micro documents
db.site.songs.find({_id: /^bfc25de08d964a8a41226c6016dd7753-/}).sort({_id: -1})

{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029114", "s" : 18436532 }
{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029113", "s" : 18804590 }
{ "_id" : "bfc25de08d964a8a41226c6016dd7753-1337029112", "s" : 18804591 }
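
An anchored prefix match like the one in the query above selects a contiguous range of `_id` values, which is why it can be answered from the `_id` index. The sketch below computes that range explicitly and checks it in memory. `prefixRange` is a hypothetical helper; its result is shaped like a `{$gte, $lt}` filter but is only compared against plain strings here, and the non-matching id is made up for the example.

```javascript
// Hypothetical helper: turn a string prefix into half-open bounds
// [prefix, next) covering the strings that start with that prefix.
// Assumes a non-empty prefix whose last character is not U+FFFF.
function prefixRange(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  const next = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { $gte: prefix, $lt: next };
}

const site = "bfc25de08d964a8a41226c6016dd7753";
const range = prefixRange(`${site}-`);

// In-memory check of the bounds against ids shaped like the slide's output.
const ids = [
  `${site}-1337029114`,
  `${site}-1337029112`,
  "c0ffee00000000000000000000000000-1337029113", // made-up id for another site
  `${site}-1337029113`,
];
const matched = ids
  .filter((id) => id >= range.$gte && id < range.$lt)
  .sort()
  .reverse();
console.log(range, matched);
```

The comparison relies on the ids being plain ASCII with fixed-width parts, as in the slide; other character sets or collations may order differently.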
paying it back
• Bent mongoengine to make this easy
• Follow github.com/exfm
• Also added tooling for
  – Trace all queries
  – Aggregate tracing by request middleware
  – Raise exceptions when queries miss an index
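
The ex.fm tooling itself is not shown in the slides. As an illustration only, a guard that raises when a query misses an index could inspect an explain result along these lines. It assumes the `queryPlanner.winningPlan` explain format with `stage`, `inputStage`, and `inputStages` fields; older server versions report plans differently. `stagesOf` and `assertUsesIndex` are hypothetical names, and the sample plans are hand-written.

```javascript
// Walk an explain plan tree and collect every stage name, parent first.
function stagesOf(plan) {
  if (!plan) return [];
  const children = [plan.inputStage, ...(plan.inputStages || [])].filter(Boolean);
  return [plan.stage, ...children.flatMap((child) => stagesOf(child))];
}

// Hypothetical guard: throw when the winning plan contains a collection scan.
function assertUsesIndex(explainOutput) {
  const stages = stagesOf(explainOutput.queryPlanner.winningPlan);
  if (stages.includes("COLLSCAN")) {
    throw new Error(`query misses an index: ${stages.join(" > ")}`);
  }
  return stages;
}

// Hand-written sample plans standing in for real explain() results.
const indexed = {
  queryPlanner: { winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN" } } },
};
const unindexed = {
  queryPlanner: { winningPlan: { stage: "SORT", inputStage: { stage: "COLLSCAN" } } },
};

console.log(assertUsesIndex(indexed));
try {
  assertUsesIndex(unindexed);
} catch (err) {
  console.log(err.message);
}
```

A check like this would most naturally run in development or test environments, where failing loudly on an unindexed query is cheaper than finding it in production.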
thanks!

  lucas@ex.fm
github.com/exfm

mongodb + ex.fm
