SlideShare a Scribd company logo
1 of 47
The Rough Guide to MongoDB
Simeon Simeonov
@simeons
Founding. Funding.
Growing. Startups.
Why MongoDB?
I am @simeons
recruit amazing people
solve hard problems
ship
make users happy
repeat
Why MongoDB?
Again,
please
SQL is slow
(for our business)
SQL is slow
(for our developer workflow)
SQL is slow
(for our analytics system)
So what’s Swoop?
Display Advertising
Makes the Web Suck
User-focused optimization
Tens of millions of users
1000+% better than average
200+% better than Google
Swoop Fixes That
Mobile SDKs
iOS & Android
Web SDK
RequireJS & jQuery
Components
AngularJS
NLP, etc.
Python
Targeting
High-Perf Java
Analytics
Ruby 2.0
Internal Apps
Ruby 2.0 / Rails 3
Pub Portal
Ruby 2.0 / Rails 3
Ad Portal
Ruby 2.0 / Rails 4
MongoDB: the Good
Fast
Flexible
JavaScript
MongoDB: the Bad
Not Quite Enterprise-Grade
Not Quite Enterprise-Grade
Not Cheap to Run Well
I will write more robust code
I will write more robust code
I will write more robust code
I will write more robust code
I will write more robust code
I will write more robust code
I will write more robust code
I will write more robust code
I will write more robust code
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
I will design a better map-reduce
RAM + locks == $$$
Five Steps to Happiness
Sharding
Native Relationships
Atomic Update Buffering
Content-Addressed Storage
Shell Tricks
// Google AdWords object model
Account
Campaign
AdGroup // this joins ads & keywords
Ad
Keyword
// For example
AdGroup has an Account
AdGroup has a Campaign
AdGroup has many Ads
AdGroup has many Keywords
Slam dunk
for SQL
// Let’s play a bit
Account
Campaign
AdGroup
Ad
Keyword
// Let’s play some more
Account
Campaign
AdGroup
Ad
Keyword
// There is just one bit left
Account
Campaign
AdGroup
1 Ad
0 Keyword
// build a hierarchical ID
accountIDcampaignIDadGroupID((0keywordID)|(1adID))
// a binary ID!
10100100001100000000101001100110101010010100
< accountID >< campaignID >< …
// Encode it in base 16, 32 or 64
{"_id" : "a4300a66a94d20f1", … }
// Example
The 5th
ad
Of the 3rd
ad group
Of the 7th
campaign
Of the 255th
account
could have the _id 0x00ff000700031005
The _id for the 10th
keyword of the same ad group
would be 0x00ff00070003000a
// Neat: the ad’s and keyword’s _ids contain the
// IDs of all of their ancestors in the hierarchy.
keywordId = 0x00ff00070003000a
adGroupId = keywordId & 0xffffffffffff0000
campaignId = keywordId & 0xffffffff00000000
accountId = keywordId & 0xffff000000000000
// has-a relationship is a simple lookup
account = db.accounts.findOne({_id: accountId})
// Neater: has-many relationships are just
// range queries on the _id index.
adGroupId = keywordId & 0xffffffffffff0000
startOfAds = adGroupId + 0x1000
endOfAds = adGroupId + 0x1fff
adsForKeyword = db.ads.find({
_id: {$gte: startOfAds, $lte: endOfAds}
})
// Technically, that was a join via the ad group.
// Who said Mongo can’t do joins???
> db.reports.findOne()
{
"_id" : …,
"period" : "hour",
"shard" : 0, // 16Mb doc limit protection
"topic" : "ce-1",
"ts" : ISODate("2012-06-12T05:00:00Z"),
"variations" : {
"2" : { // variationID (dimension set)
"hint" : {
"present" : 311, // hint.present is a metric
"clicks" : 1
}
},
"4" : {
"hint" : {
"present" : 331
}
}
}
}
Content Addressed Storage
Lazy join abstraction
Very space efficient
Extremely (pre-)cacheable
Join only happens during reporting
// Step 1: take a set of dimensions worth tracking
data = {
"domain_id" : "SW-28077508-16444",
"hint" : "Find an organic alternative",
"theme" : "red"
}
// Step 2: compute a digital signature, e.g., MD5
sig = "000069569F4835D16E69DF704187AC2F”
// Step 3: if new sig, increment a counter
counter = 264034
// Step 4: create a document in the context-
// addressed store collection for these
> db.cas.findOne()
{
"_id" : "000069569F4835D16E69DF704187AC2F", // MD5 hash
"data" : { // data that was digested to the hash above
"domain_id" : "SW-28077508-16444",
"hint" : "Find an organic alternative",
"theme” : "red"
},
"meta_data" : {
"id" : 264034 // variationID
},
"created_at" : ISODate("2013-02-04T12:05:34.752Z")
}
// Elsewhere, in the reports collection…
"variations" : {
"264034" : {
// metrics here
},
…
lazy join
// Use underscore.js in the shell
// See http://underscorejs.org/
function underscore() {
load("/mongo_hacks/underscore.js");
}
// Loads underscore.js on the MongoDB server
function server_underscore(force) {
force = force || false;
if (force || typeof(underscoreLoaded) === 'undefined') {
db.eval(cat("/mongo_hacks/underscore.js"));
underscoreLoaded = true;
}
}
// Callstack printing on exception -- wraps a function
function dbg(f) {
try {
f();
} catch (e) {
print("n**** Exception: " + e.toString());
print("n");
print(e.stack);
print("n");
if (arguments.length > 1) {
printjson(arguments);
print("n");
}
throw e;
}
}
function minutesAgo(minutes, d) {
d = d || new Date();
return new Date(d.valueOf() - minutes * 60 * 1000);
}
function hoursAgo(hours, d) {
d = d || new Date();
return minutesAgo(60 * hours, d);
}
function daysAgo(days, d) {
d = d || new Date();
return hoursAgo(24 * days, d);
}
// Don’t write in the shell.
// Use your fav editor, save & type t() in mongo
function t() {
load("/mongo_hacks/bag_of_tricks.js");
}
@simeons
sim@swoop.com

More Related Content

Similar to The Rough Guide to MongoDB

Retail referencearchitecture productcatalog
Retail referencearchitecture productcatalogRetail referencearchitecture productcatalog
Retail referencearchitecture productcatalogMongoDB
 
Beyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the codeBeyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the codeWim Godden
 
Real-time search in Drupal with Elasticsearch @Moldcamp
Real-time search in Drupal with Elasticsearch @MoldcampReal-time search in Drupal with Elasticsearch @Moldcamp
Real-time search in Drupal with Elasticsearch @MoldcampAlexei Gorobets
 
When dynamic becomes static - the next step in web caching techniques
When dynamic becomes static - the next step in web caching techniquesWhen dynamic becomes static - the next step in web caching techniques
When dynamic becomes static - the next step in web caching techniquesWim Godden
 
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.GeeksLab Odessa
 
Real-time search in Drupal. Meet Elasticsearch
Real-time search in Drupal. Meet ElasticsearchReal-time search in Drupal. Meet Elasticsearch
Real-time search in Drupal. Meet ElasticsearchAlexei Gorobets
 
Supercharging your Organic CTR
Supercharging your Organic CTRSupercharging your Organic CTR
Supercharging your Organic CTRPhil Pearce
 
Lessons Learned - Building YDN
Lessons Learned - Building YDNLessons Learned - Building YDN
Lessons Learned - Building YDNDan Theurer
 
moma-django overview --> Django + MongoDB: building a custom ORM layer
moma-django overview --> Django + MongoDB: building a custom ORM layermoma-django overview --> Django + MongoDB: building a custom ORM layer
moma-django overview --> Django + MongoDB: building a custom ORM layerGadi Oren
 
mongoDB Performance
mongoDB PerformancemongoDB Performance
mongoDB PerformanceMoshe Kaplan
 
Joining the Club: Using Spark to Accelerate Big Data at Dollar Shave Club
Joining the Club: Using Spark to Accelerate Big Data at Dollar Shave ClubJoining the Club: Using Spark to Accelerate Big Data at Dollar Shave Club
Joining the Club: Using Spark to Accelerate Big Data at Dollar Shave ClubData Con LA
 
Back to Basics, webinar 2: La tua prima applicazione MongoDB
Back to Basics, webinar 2: La tua prima applicazione MongoDBBack to Basics, webinar 2: La tua prima applicazione MongoDB
Back to Basics, webinar 2: La tua prima applicazione MongoDBMongoDB
 
Implementing Your Full Stack App with MongoDB Stitch (Tutorial)
Implementing Your Full Stack App with MongoDB Stitch (Tutorial)Implementing Your Full Stack App with MongoDB Stitch (Tutorial)
Implementing Your Full Stack App with MongoDB Stitch (Tutorial)MongoDB
 
IOOF IT System Modernisation
IOOF IT System ModernisationIOOF IT System Modernisation
IOOF IT System ModernisationMongoDB
 
Big Data Seminar: Analytics, Hadoop, Map Reduce, Mongo and other great stuff
Big Data Seminar: Analytics, Hadoop, Map Reduce, Mongo and other great stuffBig Data Seminar: Analytics, Hadoop, Map Reduce, Mongo and other great stuff
Big Data Seminar: Analytics, Hadoop, Map Reduce, Mongo and other great stuffMoshe Kaplan
 
Building LinkedIn's Learning Platform with MongoDB
Building LinkedIn's Learning Platform with MongoDBBuilding LinkedIn's Learning Platform with MongoDB
Building LinkedIn's Learning Platform with MongoDBMongoDB
 
API-Entwicklung bei XING
API-Entwicklung bei XINGAPI-Entwicklung bei XING
API-Entwicklung bei XINGMark Schmidt
 
MongoDB and Spring - Two leaves of a same tree
MongoDB and Spring - Two leaves of a same treeMongoDB and Spring - Two leaves of a same tree
MongoDB and Spring - Two leaves of a same treeMongoDB
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDBMongoDB
 

Similar to The Rough Guide to MongoDB (20)

Retail referencearchitecture productcatalog
Retail referencearchitecture productcatalogRetail referencearchitecture productcatalog
Retail referencearchitecture productcatalog
 
Beyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the codeBeyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the code
 
Real-time search in Drupal with Elasticsearch @Moldcamp
Real-time search in Drupal with Elasticsearch @MoldcampReal-time search in Drupal with Elasticsearch @Moldcamp
Real-time search in Drupal with Elasticsearch @Moldcamp
 
When dynamic becomes static - the next step in web caching techniques
When dynamic becomes static - the next step in web caching techniquesWhen dynamic becomes static - the next step in web caching techniques
When dynamic becomes static - the next step in web caching techniques
 
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
 
Real-time search in Drupal. Meet Elasticsearch
Real-time search in Drupal. Meet ElasticsearchReal-time search in Drupal. Meet Elasticsearch
Real-time search in Drupal. Meet Elasticsearch
 
Supercharging your Organic CTR
Supercharging your Organic CTRSupercharging your Organic CTR
Supercharging your Organic CTR
 
Lessons Learned - Building YDN
Lessons Learned - Building YDNLessons Learned - Building YDN
Lessons Learned - Building YDN
 
moma-django overview --> Django + MongoDB: building a custom ORM layer
moma-django overview --> Django + MongoDB: building a custom ORM layermoma-django overview --> Django + MongoDB: building a custom ORM layer
moma-django overview --> Django + MongoDB: building a custom ORM layer
 
mongoDB Performance
mongoDB PerformancemongoDB Performance
mongoDB Performance
 
Joining the Club: Using Spark to Accelerate Big Data at Dollar Shave Club
Joining the Club: Using Spark to Accelerate Big Data at Dollar Shave ClubJoining the Club: Using Spark to Accelerate Big Data at Dollar Shave Club
Joining the Club: Using Spark to Accelerate Big Data at Dollar Shave Club
 
Back to Basics, webinar 2: La tua prima applicazione MongoDB
Back to Basics, webinar 2: La tua prima applicazione MongoDBBack to Basics, webinar 2: La tua prima applicazione MongoDB
Back to Basics, webinar 2: La tua prima applicazione MongoDB
 
Implementing Your Full Stack App with MongoDB Stitch (Tutorial)
Implementing Your Full Stack App with MongoDB Stitch (Tutorial)Implementing Your Full Stack App with MongoDB Stitch (Tutorial)
Implementing Your Full Stack App with MongoDB Stitch (Tutorial)
 
IOOF IT System Modernisation
IOOF IT System ModernisationIOOF IT System Modernisation
IOOF IT System Modernisation
 
Big Data Seminar: Analytics, Hadoop, Map Reduce, Mongo and other great stuff
Big Data Seminar: Analytics, Hadoop, Map Reduce, Mongo and other great stuffBig Data Seminar: Analytics, Hadoop, Map Reduce, Mongo and other great stuff
Big Data Seminar: Analytics, Hadoop, Map Reduce, Mongo and other great stuff
 
Building LinkedIn's Learning Platform with MongoDB
Building LinkedIn's Learning Platform with MongoDBBuilding LinkedIn's Learning Platform with MongoDB
Building LinkedIn's Learning Platform with MongoDB
 
API-Entwicklung bei XING
API-Entwicklung bei XINGAPI-Entwicklung bei XING
API-Entwicklung bei XING
 
MongoDB and Spring - Two leaves of a same tree
MongoDB and Spring - Two leaves of a same treeMongoDB and Spring - Two leaves of a same tree
MongoDB and Spring - Two leaves of a same tree
 
MongoDB + Spring
MongoDB + SpringMongoDB + Spring
MongoDB + Spring
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDB
 

More from Simeon Simeonov

HyperLogLog Intuition Without Hard Math
HyperLogLog Intuition Without Hard MathHyperLogLog Intuition Without Hard Math
HyperLogLog Intuition Without Hard MathSimeon Simeonov
 
High accuracy ML & AI over sensitive data
High accuracy ML & AI over sensitive dataHigh accuracy ML & AI over sensitive data
High accuracy ML & AI over sensitive dataSimeon Simeonov
 
Memory Issues in Ruby on Rails Applications
Memory Issues in Ruby on Rails ApplicationsMemory Issues in Ruby on Rails Applications
Memory Issues in Ruby on Rails ApplicationsSimeon Simeonov
 
Three Tips for Winning Startup Weekend
Three Tips for Winning Startup WeekendThree Tips for Winning Startup Weekend
Three Tips for Winning Startup WeekendSimeon Simeonov
 
Swoop: Solve Hard Problems & Fly Robots
Swoop: Solve Hard Problems & Fly RobotsSwoop: Solve Hard Problems & Fly Robots
Swoop: Solve Hard Problems & Fly RobotsSimeon Simeonov
 
Build a Story Factory for Inbound Marketing in Five Easy Steps
Build a Story Factory for Inbound Marketing in Five Easy StepsBuild a Story Factory for Inbound Marketing in Five Easy Steps
Build a Story Factory for Inbound Marketing in Five Easy StepsSimeon Simeonov
 
Strategies for Startup Success by Simeon Simeonov
Strategies for Startup Success by Simeon SimeonovStrategies for Startup Success by Simeon Simeonov
Strategies for Startup Success by Simeon SimeonovSimeon Simeonov
 
Patterns of Successful Angel Investing by Simeon Simeonov
Patterns of Successful Angel Investing by Simeon SimeonovPatterns of Successful Angel Investing by Simeon Simeonov
Patterns of Successful Angel Investing by Simeon SimeonovSimeon Simeonov
 
Customer Development: The Second Decade by Bob Dorf
Customer Development: The Second Decade by Bob DorfCustomer Development: The Second Decade by Bob Dorf
Customer Development: The Second Decade by Bob DorfSimeon Simeonov
 

More from Simeon Simeonov (10)

HyperLogLog Intuition Without Hard Math
HyperLogLog Intuition Without Hard MathHyperLogLog Intuition Without Hard Math
HyperLogLog Intuition Without Hard Math
 
High accuracy ML & AI over sensitive data
High accuracy ML & AI over sensitive dataHigh accuracy ML & AI over sensitive data
High accuracy ML & AI over sensitive data
 
Memory Issues in Ruby on Rails Applications
Memory Issues in Ruby on Rails ApplicationsMemory Issues in Ruby on Rails Applications
Memory Issues in Ruby on Rails Applications
 
Three Tips for Winning Startup Weekend
Three Tips for Winning Startup WeekendThree Tips for Winning Startup Weekend
Three Tips for Winning Startup Weekend
 
Swoop: Solve Hard Problems & Fly Robots
Swoop: Solve Hard Problems & Fly RobotsSwoop: Solve Hard Problems & Fly Robots
Swoop: Solve Hard Problems & Fly Robots
 
Build a Story Factory for Inbound Marketing in Five Easy Steps
Build a Story Factory for Inbound Marketing in Five Easy StepsBuild a Story Factory for Inbound Marketing in Five Easy Steps
Build a Story Factory for Inbound Marketing in Five Easy Steps
 
Strategies for Startup Success by Simeon Simeonov
Strategies for Startup Success by Simeon SimeonovStrategies for Startup Success by Simeon Simeonov
Strategies for Startup Success by Simeon Simeonov
 
Patterns of Successful Angel Investing by Simeon Simeonov
Patterns of Successful Angel Investing by Simeon SimeonovPatterns of Successful Angel Investing by Simeon Simeonov
Patterns of Successful Angel Investing by Simeon Simeonov
 
Customer Development: The Second Decade by Bob Dorf
Customer Development: The Second Decade by Bob DorfCustomer Development: The Second Decade by Bob Dorf
Customer Development: The Second Decade by Bob Dorf
 
Beyond Bootstrapping
Beyond BootstrappingBeyond Bootstrapping
Beyond Bootstrapping
 

Recently uploaded

ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 

Recently uploaded (20)

ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 

The Rough Guide to MongoDB

  • 1. The Rough Guide to MongoDB Simeon Simeonov @simeons
  • 5. recruit amazing people solve hard problems ship make users happy repeat
  • 6.
  • 8. SQL is slow (for our business)
  • 9. SQL is slow (for our developer workflow)
  • 10. SQL is slow (for our analytics system)
  • 12.
  • 13. Display Advertising Makes the Web Suck User-focused optimization Tens of millions of users 1000+% better than average 200+% better than Google Swoop Fixes That
  • 14. Mobile SDKs iOS & Android Web SDK RequireJS & jQuery Components AngularJS NLP, etc. Python Targeting High-Perf Java Analytics Ruby 2.0 Internal Apps Ruby 2.0 / Rails 3 Pub Portal Ruby 2.0 / Rails 3 Ad Portal Ruby 2.0 / Rails 4
  • 16. MongoDB: the Bad Not Quite Enterprise-Grade Not Quite Enterprise-Grade Not Cheap to Run Well
  • 17. I will write more robust code I will write more robust code I will write more robust code I will write more robust code I will write more robust code I will write more robust code I will write more robust code I will write more robust code I will write more robust code
  • 18. I will design a better map-reduce I will design a better map-reduce I will design a better map-reduce I will design a better map-reduce I will design a better map-reduce I will design a better map-reduce I will design a better map-reduce I will design a better map-reduce I will design a better map-reduce
  • 19. RAM + locks == $$$
  • 20.
  • 21. Five Steps to Happiness Sharding Native Relationships Atomic Update Buffering Content-Addressed Storage Shell Tricks
  • 22.
  • 23.
  • 24. // Google AdWords object model Account Campaign AdGroup // this joins ads & keywords Ad Keyword // For example AdGroup has an Account AdGroup has a Campaign AdGroup has many Ads AdGroup has many Keywords Slam dunk for SQL
  • 25. // Let’s play a bit Account Campaign AdGroup Ad Keyword
  • 26. // Let’s play some more Account Campaign AdGroup Ad Keyword
  • 27. // There is just one bit left Account Campaign AdGroup 1 Ad 0 Keyword
  • 28. // build a hierarchical ID accountIDcampaignIDadGroupID((0keywordID)|(1adID)) // a binary ID! 10100100001100000000101001100110101010010100 < accountID >< campaignID >< … // Encode it in base 16, 32 or 64 {"_id" : "a4300a66a94d20f1", … }
  • 29. // Example The 5th ad Of the 3rd ad group Of the 7th campaign Of the 255th account could have the _id 0x00ff000700031005 The _id for the 10th keyword of the same ad group would be 0x00ff00070003000a
  • 30. // Neat: the ad’s and keyword’s _ids contain the // IDs of all of their ancestors in the hierarchy. keywordId = 0x00ff00070003000a adGroupId = keywordId & 0xffffffffffff0000 campaignId = keywordId & 0xffffffff00000000 accountId = keywordId & 0xffff000000000000 // has-a relationship is a simple lookup account = db.accounts.findOne({_id: accountId})
  • 31. // Neater: has-many relationships are just // range queries on the _id index. adGroupId = keywordId & 0xffffffffffff0000 startOfAds = adGroupId + 0x1000 endOfAds = adGroupId + 0x1fff adsForKeyword = db.ads.find({ _id: {$gte: startOfAds, $lte: endOfAds} }) // Technically, that was a join via the ad group. // Who said Mongo can’t do joins???
  • 32.
  • 33.
  • 34.
  • 35.
  • 36. > db.reports.findOne() { "_id" : …, "period" : "hour", "shard" : 0, // 16Mb doc limit protection "topic" : "ce-1", "ts" : ISODate("2012-06-12T05:00:00Z"), "variations" : { "2" : { // variationID (dimension set) "hint" : { "present" : 311, // hint.present is a metric "clicks" : 1 } }, "4" : { "hint" : { "present" : 331 } } } }
  • 37. Content Addressed Storage Lazy join abstraction Very space efficient Extremely (pre-)cacheable Join only happens during reporting
  • 38. // Step 1: take a set of dimensions worth tracking data = { "domain_id" : "SW-28077508-16444", "hint" : "Find an organic alternative", "theme" : "red" } // Step 2: compute a digital signature, e.g., MD5 sig = "000069569F4835D16E69DF704187AC2F” // Step 3: if new sig, increment a counter counter = 264034 // Step 4: create a document in the context- // addressed store collection for these
  • 39. > db.cas.findOne() { "_id" : "000069569F4835D16E69DF704187AC2F", // MD5 hash "data" : { // data that was digested to the hash above "domain_id" : "SW-28077508-16444", "hint" : "Find an organic alternative", "theme” : "red" }, "meta_data" : { "id" : 264034 // variationID }, "created_at" : ISODate("2013-02-04T12:05:34.752Z") } // Elsewhere, in the reports collection… "variations" : { "264034" : { // metrics here }, … lazy join
  • 40.
  • 41. // Use underscore.js in the shell // See http://underscorejs.org/ function underscore() { load("/mongo_hacks/underscore.js"); }
  • 42. // Loads underscore.js on the MongoDB server function server_underscore(force) { force = force || false; if (force || typeof(underscoreLoaded) === 'undefined') { db.eval(cat("/mongo_hacks/underscore.js")); underscoreLoaded = true; } }
  • 43. // Callstack printing on exception -- wraps a function function dbg(f) { try { f(); } catch (e) { print("n**** Exception: " + e.toString()); print("n"); print(e.stack); print("n"); if (arguments.length > 1) { printjson(arguments); print("n"); } throw e; } }
  • 44. function minutesAgo(minutes, d) { d = d || new Date(); return new Date(d.valueOf() - minutes * 60 * 1000); } function hoursAgo(hours, d) { d = d || new Date(); return minutesAgo(60 * hours, d); } function daysAgo(days, d) { d = d || new Date(); return hoursAgo(24 * days, d); }
  • 45. // Don’t write in the shell. // Use your fav editor, save & type t() in mongo function t() { load("/mongo_hacks/bag_of_tricks.js"); }
  • 46.