MongoDB: How We Did It – 
Reanimating Identity at AOL
Topics 
• Motivation 
• Challenges 
• Approach 
• MongoDB Testing 
• Deployment 
• Collections 
• Problem/Solution 
• Lessons Learned 
• Going Forward
Motivation
• Cluttered data 
• Ambiguous data 
• Functionally shared data 
• Immutable data model
Challenges
• Leaving behind fault-tolerant (Non-Stop) 
platform/Transactional integrity 
• Merge/extricate Identity data 
• Scaling to handle consolidated traffic 
• Continue to support Legacy
Approach
• Document-based data model – use MongoDB 
• Migrate data 
• Build adapter/interceptor layer 
• Production testing with no impacts
Approach 
• Audit of setup with MongoDB 
• Tweak mongo settings, including driver, to 
optimize for performance 
• Leverage eventual consistency to overcome 
transactional integrity loss 
• Switch Identity to new data model using 
MongoDB
Migration
• Adapters support 4 stages: 
1. Read/write legacy
2. Read/write legacy, write MongoDB (shadow read MongoDB)
3. Read/write MongoDB, write legacy
4. Read/write MongoDB
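The four stages above can be sketched as a small routing adapter. This is a minimal illustration, not the adapter AOL built: the class and method names are hypothetical, and plain dicts stand in for the legacy store and MongoDB. It assumes "shadow read" means reading MongoDB alongside legacy and recording disagreements while still returning the legacy result.

```python
from enum import Enum


class Stage(Enum):
    LEGACY_ONLY = 1    # 1. read/write legacy
    SHADOW_MONGO = 2   # 2. read/write legacy, write MongoDB, shadow read MongoDB
    MONGO_PRIMARY = 3  # 3. read/write MongoDB, write legacy
    MONGO_ONLY = 4     # 4. read/write MongoDB


class MigrationAdapter:
    """Routes reads and writes to two dict-like stores according to the stage."""

    def __init__(self, legacy, mongo, stage):
        self.legacy = legacy  # stand-in for the legacy data store
        self.mongo = mongo    # stand-in for MongoDB
        self.stage = stage
        self.mismatches = []  # keys whose shadow read disagreed with legacy

    def write(self, key, doc):
        # Legacy keeps receiving writes until the final stage.
        if self.stage is not Stage.MONGO_ONLY:
            self.legacy[key] = doc
        # MongoDB starts receiving writes from stage 2 onward.
        if self.stage is not Stage.LEGACY_ONLY:
            self.mongo[key] = doc

    def read(self, key):
        if self.stage in (Stage.LEGACY_ONLY, Stage.SHADOW_MONGO):
            result = self.legacy.get(key)
            # Shadow read: compare against MongoDB but still serve legacy data.
            if self.stage is Stage.SHADOW_MONGO and self.mongo.get(key) != result:
                self.mismatches.append(key)
            return result
        return self.mongo.get(key)
```

Keeping the stage as a single switch means each step can be rolled back by changing one value, which fits the goal of production testing with no impact.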
[Diagrams: Stage 1, Stage 2, Stage 3, and Stage 4 of the migration]
MongoDB Testing
Production Testing 
• “Chaos Monkey” testing of MongoDB 
• 4 million requests/minute (production load, about 99% reads)
• Test primary failover (graceful) 
• Kill Primary
Production Testing 
• Test secondary failure 
• Shutdown all secondaries 
• Manually shutdown interface on primary 
• Performance benchmarking
Production Testing 
• Performance very good: shard key reads ~2-3 ms
• Scatter-gather reads ~12ms 
• Writes good as well, ~3-20ms 
• Failovers 4-5 minutes
MongoDB Healthcheck 
• Use dedicated machines for Config servers 
• Place Config servers in different data centers 
• Handle failover in the application: on a network exception, fall back to a secondary
• Set lower TCP keepalive values (5 minutes)
Deployment
• Version 2.4.9 
• All 75 mongod instances on separate switches
• 2 x 12 Core CPUs, 192GB of RAM and internal 
controller based RAID 10 Ext4 File Systems 
• Using default chunk size (64MB)
Deployment 
• Have dedicated secondaries for backup (configured as hidden members with priority 0). Backup runs during the 6-8am window
• Enable usePowerOf2Sizes for collections to reduce fragmentation
• Balancer restricted to 4-6am daily
Collections
Document Model 
• Entire data set must be in memory to meet 
performance demands 
• Document field names abbreviated, but 
descriptive 
• Don’t store default values (Legacy document is 
80% defaults) 
• Working hard to keep legacy artifacts out, but 
always about trade-offs
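One way to avoid storing default values is to strip them before a write and merge them back after a read. The sketch below assumes a hypothetical defaults table; the slides do not list the real fields or their default values.

```python
# Hypothetical defaults table; the real fields and values are not given in the slides.
DEFAULTS = {"lang": "en_US", "cc": "US", "d": False}


def strip_defaults(doc, defaults=DEFAULTS):
    """Return a copy of doc without the fields that still hold their default value."""
    return {
        key: value
        for key, value in doc.items()
        if key not in defaults or defaults[key] != value
    }


def apply_defaults(stored, defaults=DEFAULTS):
    """Rebuild the full document by layering stored fields over the defaults."""
    return {**defaults, **stored}
```

With a document that is mostly defaults, only the fields that differ are written, which helps keep the whole data set in memory.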
UserIdentity Collection 
• Core data model for Identity 
• Heterogeneous collection (some documents are “aliases”, which are pointers to the primary document)
• Index on user+namespace 
• Shard key is guid (UUID Type 1, flipped – 
node then time)
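The sample ids in this deck are consistent with swapping the two 16-hex-digit halves of a standard type 1 UUID, so that the clock sequence and node come first and the timestamp fields follow. That layout is inferred from the samples rather than stated by the authors, so treat the sketch below as an assumption. The function names are hypothetical.

```python
import uuid


def flip_uuid1(u: uuid.UUID) -> str:
    """Move the clock-sequence and node half in front of the timestamp half.

    A standard type 1 UUID in hex is 16 digits of timestamp fields followed by
    16 digits of clock sequence plus node. Swapping the halves (an assumed
    layout) puts the slowly changing node part first.
    """
    hex_digits = u.hex
    return hex_digits[16:] + hex_digits[:16]


def new_guid() -> str:
    """Generate a fresh flipped type 1 UUID as a 32-character hex string."""
    return flip_uuid1(uuid.uuid1())
```

Leading with the node rather than the timestamp means keys generated at the same moment on different hosts do not all sort next to each other, which is a plausible reason to flip a time-based id before using it as a shard key.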
UserIdentity 
{
  _id: "baebc8bcc8e14f6e9bf70221d81711e2",
  user: "jdoe",
  ns: "aol",
  …
  "profile" : {
    "cc" : "US",
    "firstNm" : "John",
    "lang" : "en_US",
    "lastNm" : "Doe" },
  "sysTime" : ISODate("2014-05-03T04:43:49.899Z")
}
Relationship Collection 
• Support all cardinalities 
• Equivalent to RDBMS intersection table (guid 
on each end of relationship) 
• Use eventually consistent framework for non-atomic 
writes 
• Shard key is parent+child+type (parent lookup 
is primary use case)
Relationship Collection 
{
  "_id" : "baa000163e5ff405b8083d5f164c11e3",
  "child" : "8a9e00237d617f08df7f1685527711e2",
  "createTime" : ISODate("2013-09-05T17:00:51.209Z"),
  "modTime" : ISODate("2013-09-05T17:00:51.209Z"),
  "attributes" : null,
  "parent" : "baebc8bcc8e14f6e9bf70221d81711e2",
  "type" : "CLASSROOM"
}
Legacy Collection 
• Bridge collection to facilitate migration from 
old data model to new 
• Near-image of old data model but with some 
refactoring (3 tables into 1 document) 
• Once migration is complete, plan is to drop 
this collection 
• Defaults not stored, 1-2 character field names
Legacy Collection 
{
  "_id" : "jdoe",
  "subData" : {
    "f" : NumberLong(1018628731),
    "g" : "jdoe",
    "d" : false,
    "e" : NumberLong(1018628731),
    "b" : NumberLong(434077116),
    "a" : "JDoe",
    "l" : NumberLong("212200907100000000"),
    "i" : NumberLong(659952670)
  },
  "guid" : "baebc8bcc8e14f6e9bf70221d81711e2",
  "st" : ISODate("2013-06-24T20:13:16.627Z")
}
Reservation Collection 
• Namespace protection 
• Enforce uniqueness of user/namespace from 
application side because shard key for 
UserIdentity collection is guid 
• Shard key is username+namespace
Reservation Collection 
{
  "_id" : "b13a00163e062d8ee9dc9eaf3e2411e1",
  "createTime" : ISODate("2012-01-13T20:26:46.111Z"),
  "user" : "jdoe",
  "expires" : ISODate("2012-01-13T21:26:46.111Z"),
  "rsvId" : "e9bddfe1-1c84-42c9-8f4c-1a7a96920ff4",
  "data" : { "k1": "v1", "k2" : "v2" },
  "ns" : "aol",
  "type" : "R"
}
Problems/Solutions
Problem 
Writes spanning multiple documents sometimes 
fail part way
Solution 
• Developed eventually consistent framework 
“synchronizer” 
• Events sent to framework to validate, repair, 
or finish 
• Events retryable until success or ttl is expired
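The retry rule in the last bullet can be sketched as a loop that keeps invoking an event's action until it succeeds or the event's TTL runs out. This is only an illustration of that rule: `run_event` and its parameters are hypothetical, and the real "synchronizer" also validates and repairs documents, which is not shown.

```python
import time


def run_event(action, ttl_seconds, retry_delay=0.0,
              clock=time.monotonic, sleep=time.sleep):
    """Retry action until it returns True or the event's TTL expires.

    Returns a (succeeded, attempts) tuple. Exceptions raised by action are
    treated as retryable failures in this sketch.
    """
    deadline = clock() + ttl_seconds
    attempts = 0
    while True:
        attempts += 1
        try:
            if action():
                return True, attempts
        except Exception:
            pass  # assumption: every failure is worth retrying
        if clock() >= deadline:
            return False, attempts
        sleep(retry_delay)
```

For retries to be safe, the action has to be idempotent, for example "make sure the relationship document exists" rather than "insert a relationship document".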
Problem 
Scatter-gather queries slower, 100% 
performance impact on failover
Solution 
• Use Memcached to map non-shard key to 
shard key (99% hit ratio for one mapping, 55% 
for other) 
• Use Memcached to map potentially expensive 
intermediary results (88% hit ratio)
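The first bullet describes a cache-aside lookup: consult the cache for the shard key, and only fall back to the scatter-gather query on a miss. In the sketch below a plain dict stands in for a Memcached client, and the class name, key format, and lookup callback are all hypothetical.

```python
class ShardKeyCache:
    """Maps a non-shard key (user + namespace) to the shard key (guid)."""

    def __init__(self, cache, scatter_gather_lookup):
        self.cache = cache                   # dict standing in for Memcached
        self.lookup = scatter_gather_lookup  # slow query that hits every shard
        self.hits = 0
        self.misses = 0

    def guid_for(self, user, ns):
        key = f"{ns}:{user}"
        guid = self.cache.get(key)
        if guid is not None:
            self.hits += 1
            return guid
        self.misses += 1
        guid = self.lookup(user, ns)
        if guid is not None:
            # Only cache positive results so a later signup is not masked.
            self.cache[key] = guid
        return guid
```

Once the guid is known, the follow-up read can target a single shard instead of fanning out to all of them.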
Problem 
Querying lists of users required parallel 
processing for performance -- increasing 
connection requirements
Solution 
Use $in operator to query lists of users rather 
than looping through individual queries
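A single `$in` filter for a list of users might be built as follows. The field names `user` and `ns` are taken from the UserIdentity sample document; the helper name is hypothetical, and the returned dict is the kind of filter a driver's find call accepts.

```python
def users_filter(usernames, ns):
    """Build one query filter that matches every listed user in a namespace."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(usernames))
    return {"ns": ns, "user": {"$in": unique}}
```

One query with `$in` uses one connection and one round trip, where a loop of individual queries needs either many round trips or many parallel connections.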
Problem 
At application startup a large number of 
requests failed because of overhead in creating 
mongos connections
Solution 
Build into application a “warm-up” stage that 
executes stock queries prior to going online and 
taking traffic
Problem 
During failovers or other slow periods, 
application queues back up and recovery takes 
too long
Solution 
Track each request's time in the queue; if it exceeds the client's timeout, drop the request without processing it
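A minimal sketch of that check, assuming each queued request records when it was enqueued and how long its client is willing to wait. The `QueuedRequest` type and the function name are hypothetical.

```python
import time
from collections import deque, namedtuple

QueuedRequest = namedtuple("QueuedRequest", "payload enqueued_at client_timeout")


def next_live_request(queue, now=None):
    """Pop requests until one is found whose client is still waiting.

    Returns (request, dropped), where request is None if the queue ran dry and
    dropped counts the stale requests that were discarded along the way.
    """
    now = time.monotonic() if now is None else now
    dropped = 0
    while queue:
        request = queue.popleft()
        if now - request.enqueued_at <= request.client_timeout:
            return request, dropped
        dropped += 1  # the client has already given up, so skip the work
    return None, dropped
```

Dropping stale work lets the queue drain quickly after a failover, because workers spend time only on requests whose answers can still be delivered.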
Problem 
Application-applied optimistic locking hits lock errors during concurrent writes (the entire document is updated)
Solution 
Use the $set operator to target writes to just the impacted elements, and let MongoDB enforce atomicity
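As an illustration, a targeted update document can be derived by comparing the document as read with the document as modified, and emitting only the changed fields in dotted-path form. The helper name is hypothetical, and removed fields are ignored in this sketch (they would need `$unset`).

```python
def set_update(before, after):
    """Build a {"$set": {...}} update containing only changed or added fields."""
    changes = {}

    def walk(old, new, path):
        for key, value in new.items():
            dotted = f"{path}{key}"
            if isinstance(value, dict) and isinstance(old.get(key), dict):
                # Recurse so that only the changed leaf is written.
                walk(old[key], value, dotted + ".")
            elif key not in old or old[key] != value:
                changes[dotted] = value

    walk(before, after, "")
    return {"$set": changes} if changes else {}
```

Two writers that touch different fields of the same document then no longer overwrite each other, because each update names only the fields it changes.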
Problem 
Reads go to the primary, but when the secondaries are lost the primary steps down, and reads fail
Solution 
Use primaryPreferred for reads. Want the 
freshest data (password for example), but still 
want reads to work if no primary exists
Problem 
Large number of connections to 
mongos/mongod is extending the failover times 
and nearing limits
Solution 
• Application DAOs share connections to same 
Mongo cluster 
• Connection params initially set too high 
• Set connectionsPerHost and 
connectionMultiplier plus a buffer to cover 
the fixed number of worker threads per 
application (15/5 for 32 worker threads). 
• Went from 15K connections to 2K connections
Benefits
• Unanticipated benefit was ability for all 
eligible users to use the AOL client 
• Easily added Identity extensions leveraging 
the new data model 
• Support for multiple namespaces made 
building APIs for multi-tenancy 
straightforward 
• Model is positioned in such a way to make 
vision for AOL Identity feasible
Lessons Learned
• Keep connections as low as possible 
– Higher connection numbers increase failover 
times 
• Avoid scatter-gather reads (use cache if 
possible to get to shard key) 
• Keep data set in memory 
• Fail fast on application side to lower recovery 
time
Going Forward
• Implement tagging to target secondaries 
• Further reduction in scatter-gather reads 
• Reduce failover window to as short as possible 
• Contact: doug.haydon@teamaol.com

More Related Content

What's hot

MongoDB Evenings DC: Get MEAN and Lean with Docker and Kubernetes
MongoDB Evenings DC: Get MEAN and Lean with Docker and KubernetesMongoDB Evenings DC: Get MEAN and Lean with Docker and Kubernetes
MongoDB Evenings DC: Get MEAN and Lean with Docker and KubernetesMongoDB
 
MongoDB Evenings DC: MongoDB - The New Default Database for Giant Ideas
MongoDB Evenings DC: MongoDB - The New Default Database for Giant IdeasMongoDB Evenings DC: MongoDB - The New Default Database for Giant Ideas
MongoDB Evenings DC: MongoDB - The New Default Database for Giant IdeasMongoDB
 
Building a Scalable and Modern Infrastructure at CARFAX
Building a Scalable and Modern Infrastructure at CARFAXBuilding a Scalable and Modern Infrastructure at CARFAX
Building a Scalable and Modern Infrastructure at CARFAXMongoDB
 
Building LinkedIn's Learning Platform with MongoDB
Building LinkedIn's Learning Platform with MongoDBBuilding LinkedIn's Learning Platform with MongoDB
Building LinkedIn's Learning Platform with MongoDBMongoDB
 
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB present...
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB  present...MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB  present...
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB present...MongoDB
 
eHarmony - Messaging Platform with MongoDB Atlas
eHarmony - Messaging Platform with MongoDB Atlas eHarmony - Messaging Platform with MongoDB Atlas
eHarmony - Messaging Platform with MongoDB Atlas MongoDB
 
An Agile Supply Chain at The Gap
An Agile Supply Chain at The GapAn Agile Supply Chain at The Gap
An Agile Supply Chain at The GapMongoDB
 
How to deliver a Single View in Financial Services
 How to deliver a Single View in Financial Services How to deliver a Single View in Financial Services
How to deliver a Single View in Financial ServicesMongoDB
 
Jumpstart: MongoDB BI Connector & Tableau
Jumpstart: MongoDB BI Connector & TableauJumpstart: MongoDB BI Connector & Tableau
Jumpstart: MongoDB BI Connector & TableauMongoDB
 
MongoATL: How Sourceforge is Using MongoDB
MongoATL: How Sourceforge is Using MongoDBMongoATL: How Sourceforge is Using MongoDB
MongoATL: How Sourceforge is Using MongoDBRick Copeland
 
MongoDB and Our Journey from Old, Slow and Monolithic to Fast and Agile Micro...
MongoDB and Our Journey from Old, Slow and Monolithic to Fast and Agile Micro...MongoDB and Our Journey from Old, Slow and Monolithic to Fast and Agile Micro...
MongoDB and Our Journey from Old, Slow and Monolithic to Fast and Agile Micro...MongoDB
 
Benefits of Using MongoDB Over RDBMSs
Benefits of Using MongoDB Over RDBMSsBenefits of Using MongoDB Over RDBMSs
Benefits of Using MongoDB Over RDBMSsMongoDB
 
MongoDB Evenings Dallas: What's the Scoop on MongoDB & Hadoop
MongoDB Evenings Dallas: What's the Scoop on MongoDB & HadoopMongoDB Evenings Dallas: What's the Scoop on MongoDB & Hadoop
MongoDB Evenings Dallas: What's the Scoop on MongoDB & HadoopMongoDB
 
Designing Cloud Products
Designing Cloud Products Designing Cloud Products
Designing Cloud Products MongoDB
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceMongoDB
 
Migrating from RDBMS to MongoDB
Migrating from RDBMS to MongoDBMigrating from RDBMS to MongoDB
Migrating from RDBMS to MongoDBMongoDB
 
NoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQLNoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQLEDB
 
How companies use NoSQL and Couchbase
How companies use NoSQL and CouchbaseHow companies use NoSQL and Couchbase
How companies use NoSQL and CouchbaseDipti Borkar
 
MongoDB Evenings Minneapolis: MongoDB is Cool But When Should I Use It?
MongoDB Evenings Minneapolis: MongoDB is Cool But When Should I Use It?MongoDB Evenings Minneapolis: MongoDB is Cool But When Should I Use It?
MongoDB Evenings Minneapolis: MongoDB is Cool But When Should I Use It?MongoDB
 
Webinar: What's New in MongoDB 3.2
Webinar: What's New in MongoDB 3.2Webinar: What's New in MongoDB 3.2
Webinar: What's New in MongoDB 3.2MongoDB
 

What's hot (20)

MongoDB Evenings DC: Get MEAN and Lean with Docker and Kubernetes
MongoDB Evenings DC: Get MEAN and Lean with Docker and KubernetesMongoDB Evenings DC: Get MEAN and Lean with Docker and Kubernetes
MongoDB Evenings DC: Get MEAN and Lean with Docker and Kubernetes
 
MongoDB Evenings DC: MongoDB - The New Default Database for Giant Ideas
MongoDB Evenings DC: MongoDB - The New Default Database for Giant IdeasMongoDB Evenings DC: MongoDB - The New Default Database for Giant Ideas
MongoDB Evenings DC: MongoDB - The New Default Database for Giant Ideas
 
Building a Scalable and Modern Infrastructure at CARFAX
Building a Scalable and Modern Infrastructure at CARFAXBuilding a Scalable and Modern Infrastructure at CARFAX
Building a Scalable and Modern Infrastructure at CARFAX
 
Building LinkedIn's Learning Platform with MongoDB
Building LinkedIn's Learning Platform with MongoDBBuilding LinkedIn's Learning Platform with MongoDB
Building LinkedIn's Learning Platform with MongoDB
 
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB present...
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB  present...MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB  present...
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB present...
 
eHarmony - Messaging Platform with MongoDB Atlas
eHarmony - Messaging Platform with MongoDB Atlas eHarmony - Messaging Platform with MongoDB Atlas
eHarmony - Messaging Platform with MongoDB Atlas
 
An Agile Supply Chain at The Gap
An Agile Supply Chain at The GapAn Agile Supply Chain at The Gap
An Agile Supply Chain at The Gap
 
How to deliver a Single View in Financial Services
 How to deliver a Single View in Financial Services How to deliver a Single View in Financial Services
How to deliver a Single View in Financial Services
 
Jumpstart: MongoDB BI Connector & Tableau
Jumpstart: MongoDB BI Connector & TableauJumpstart: MongoDB BI Connector & Tableau
Jumpstart: MongoDB BI Connector & Tableau
 
MongoATL: How Sourceforge is Using MongoDB
MongoATL: How Sourceforge is Using MongoDBMongoATL: How Sourceforge is Using MongoDB
MongoATL: How Sourceforge is Using MongoDB
 
MongoDB and Our Journey from Old, Slow and Monolithic to Fast and Agile Micro...
MongoDB and Our Journey from Old, Slow and Monolithic to Fast and Agile Micro...MongoDB and Our Journey from Old, Slow and Monolithic to Fast and Agile Micro...
MongoDB and Our Journey from Old, Slow and Monolithic to Fast and Agile Micro...
 
Benefits of Using MongoDB Over RDBMSs
Benefits of Using MongoDB Over RDBMSsBenefits of Using MongoDB Over RDBMSs
Benefits of Using MongoDB Over RDBMSs
 
MongoDB Evenings Dallas: What's the Scoop on MongoDB & Hadoop
MongoDB Evenings Dallas: What's the Scoop on MongoDB & HadoopMongoDB Evenings Dallas: What's the Scoop on MongoDB & Hadoop
MongoDB Evenings Dallas: What's the Scoop on MongoDB & Hadoop
 
Designing Cloud Products
Designing Cloud Products Designing Cloud Products
Designing Cloud Products
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-Service
 
Migrating from RDBMS to MongoDB
Migrating from RDBMS to MongoDBMigrating from RDBMS to MongoDB
Migrating from RDBMS to MongoDB
 
NoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQLNoSQL and Spatial Database Capabilities using PostgreSQL
NoSQL and Spatial Database Capabilities using PostgreSQL
 
How companies use NoSQL and Couchbase
How companies use NoSQL and CouchbaseHow companies use NoSQL and Couchbase
How companies use NoSQL and Couchbase
 
MongoDB Evenings Minneapolis: MongoDB is Cool But When Should I Use It?
MongoDB Evenings Minneapolis: MongoDB is Cool But When Should I Use It?MongoDB Evenings Minneapolis: MongoDB is Cool But When Should I Use It?
MongoDB Evenings Minneapolis: MongoDB is Cool But When Should I Use It?
 
Webinar: What's New in MongoDB 3.2
Webinar: What's New in MongoDB 3.2Webinar: What's New in MongoDB 3.2
Webinar: What's New in MongoDB 3.2
 

Similar to MongoDB: How We Did It – Reanimating Identity at AOL

MySQL Performance Tuning at COSCUP 2014
MySQL Performance Tuning at COSCUP 2014MySQL Performance Tuning at COSCUP 2014
MySQL Performance Tuning at COSCUP 2014Ryusuke Kajiyama
 
MongoDB at MapMyFitness
MongoDB at MapMyFitnessMongoDB at MapMyFitness
MongoDB at MapMyFitnessMapMyFitness
 
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelSilicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelDaniel Coupal
 
Mongodb at-gilt-groupe-seattle-2012-09-14-final
Mongodb at-gilt-groupe-seattle-2012-09-14-finalMongodb at-gilt-groupe-seattle-2012-09-14-final
Mongodb at-gilt-groupe-seattle-2012-09-14-finalMongoDB
 
Webinar: Enterprise Data Management in the Era of MongoDB and Data Lakes
Webinar: Enterprise Data Management in the Era of MongoDB and Data LakesWebinar: Enterprise Data Management in the Era of MongoDB and Data Lakes
Webinar: Enterprise Data Management in the Era of MongoDB and Data LakesMongoDB
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesDavid Martínez Rego
 
Building FoundationDB
Building FoundationDBBuilding FoundationDB
Building FoundationDBFoundationDB
 
MongoDB at Gilt Groupe
MongoDB at Gilt GroupeMongoDB at Gilt Groupe
MongoDB at Gilt GroupeMongoDB
 
Mongo DB at Community Engine
Mongo DB at Community EngineMongo DB at Community Engine
Mongo DB at Community EngineCommunity Engine
 
MongoDB at community engine
MongoDB at community engineMongoDB at community engine
MongoDB at community enginemathraq
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Databricks
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware ProvisioningMongoDB
 
Hpc lunch and learn
Hpc lunch and learnHpc lunch and learn
Hpc lunch and learnJohn D Almon
 
Development of concurrent services using In-Memory Data Grids
Development of concurrent services using In-Memory Data GridsDevelopment of concurrent services using In-Memory Data Grids
Development of concurrent services using In-Memory Data Gridsjlorenzocima
 
Cloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inCloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inRahulBhole12
 

Similar to MongoDB: How We Did It – Reanimating Identity at AOL (20)

MySQL Performance Tuning at COSCUP 2014
MySQL Performance Tuning at COSCUP 2014MySQL Performance Tuning at COSCUP 2014
MySQL Performance Tuning at COSCUP 2014
 
MongoDB at MapMyFitness
MongoDB at MapMyFitnessMongoDB at MapMyFitness
MongoDB at MapMyFitness
 
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelSilicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
 
Mongodb at-gilt-groupe-seattle-2012-09-14-final
Mongodb at-gilt-groupe-seattle-2012-09-14-finalMongodb at-gilt-groupe-seattle-2012-09-14-final
Mongodb at-gilt-groupe-seattle-2012-09-14-final
 
Webinar: Enterprise Data Management in the Era of MongoDB and Data Lakes
Webinar: Enterprise Data Management in the Era of MongoDB and Data LakesWebinar: Enterprise Data Management in the Era of MongoDB and Data Lakes
Webinar: Enterprise Data Management in the Era of MongoDB and Data Lakes
 
Building Big Data Streaming Architectures
Building Big Data Streaming ArchitecturesBuilding Big Data Streaming Architectures
Building Big Data Streaming Architectures
 
Building FoundationDB
Building FoundationDBBuilding FoundationDB
Building FoundationDB
 
MongoDB at Gilt Groupe
MongoDB at Gilt GroupeMongoDB at Gilt Groupe
MongoDB at Gilt Groupe
 
Mongo DB at Community Engine
Mongo DB at Community EngineMongo DB at Community Engine
Mongo DB at Community Engine
 
MongoDB at community engine
MongoDB at community engineMongoDB at community engine
MongoDB at community engine
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
No sq lv1_0
No sq lv1_0No sq lv1_0
No sq lv1_0
 
Hardware Provisioning
Hardware ProvisioningHardware Provisioning
Hardware Provisioning
 
Hpc lunch and learn
Hpc lunch and learnHpc lunch and learn
Hpc lunch and learn
 
MongoDB
MongoDBMongoDB
MongoDB
 
Development of concurrent services using In-Memory Data Grids
Development of concurrent services using In-Memory Data GridsDevelopment of concurrent services using In-Memory Data Grids
Development of concurrent services using In-Memory Data Grids
 
Cloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inCloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation in
 
Fastest Servlets in the West
Fastest Servlets in the WestFastest Servlets in the West
Fastest Servlets in the West
 
Intro to Databases
Intro to DatabasesIntro to Databases
Intro to Databases
 

More from MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

More from MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 

MongoDB: How We Did It – Reanimating Identity at AOL

  • 1. MongoDB: How We Did It – Reanimating Identity at AOL
  • 2. Topics • Motivation • Challenges • Approach • MongoDB Testing • Deployment • Collections • Problem/Solution • Lessons Learned • Going Forward
  • 4. Motivation • Cluttered data • Ambiguous data • Functionally shared data • Immutable data model
  • 6. Challenges • Leaving behind fault-tolerant (Non-Stop) platform/Transactional integrity • Merge/extricate Identity data • Scaling to handle consolidated traffic • Continue to support Legacy
  • 8. Approach • Document-based data model – use MongoDB • Migrate data • Build adapter/interceptor layer • Production testing with no impacts
  • 9. Approach • Audit of setup with MongoDB • Tweak mongo settings, including driver, to optimize for performance • Leverage eventual consistency to overcome transactional integrity loss • Switch Identity to new data model using MongoDB
  • 11. Migration • Adapters support 4 stages: 1. Read/write legacy 2. Read/write legacy, write MongoDB (shadow read MongoDB) 3. Read/write MongoDB, write legacy 4. Read/write MongoDB
  • 17. Production Testing • “Chaos Monkey” testing of MongoDB • 4 Million requests/Minute (production load, read to write ratio 99%) • Test primary failover (graceful) • Kill Primary
  • 18. Production Testing • Test secondary failure • Shutdown all secondaries • Manually shutdown interface on primary • Performance benchmarking
  • 19. Production Testing • Performance very good, shard key reads ~2-3 ms • Scatter-gather reads ~12 ms • Writes good as well, ~3-20 ms • Failovers 4-5 minutes
  • 20. MongoDB Healthcheck • Use dedicated machines for Config servers • Place Config servers in different data centers • Handle failover in application, if network exception, fallback to secondary • Set lower TCP keepalive values (5 minutes)
  • 23. Deployment • Version 2.4.9 • All 75 mongod’s on separate switches • 2 x 12 Core CPUs, 192GB of RAM and internal controller based RAID 10 Ext4 File Systems • Using default chunk size (64MB)
  • 24. Deployment • Have dedicated slaves for backup (configured as hidden members with priority 0). Backup runs during 6-8am window • Enable powerOf2Sizes for collections to reduce fragmentation • Balancer restricted to 4-6am daily
  • 26. Document Model • Entire data set must be in memory to meet performance demands • Document field names abbreviated, but descriptive • Don’t store default values (Legacy document is 80% defaults) • Working hard to keep legacy artifacts out, but always about trade-offs
  • 27. UserIdentity Collection • Core data model for Identity • Heterogeneous collection (some documents are “aliases” which are pointers to primary document) • Index on user+namespace • Shard key is guid (UUID Type 1, flipped – node then time)
  • 28. UserIdentity { _id: "baebc8bcc8e14f6e9bf70221d81711e2", user: "jdoe", ns: "aol", … "profile" : { "cc" : "US", "firstNm" : "John", "lang" : "en_US", "lastNm" : "Doe" }, "sysTime" : ISODate("2014-05-03T04:43:49.899Z") }
  • 29. Relationship Collection • Support all cardinalities • Equivalent to RDBMS intersection table (guid on each end of relationship) • Use eventually consistent framework for non-atomic writes • Shard key is parent+child+type (parent lookup is primary use case)
  • 30. Relationship Collection { "_id" : "baa000163e5ff405b8083d5f164c11e3", "child" : "8a9e00237d617f08df7f1685527711e2", "createTime" : ISODate("2013-09-05T17:00:51.209Z"), "modTime" : ISODate("2013-09-05T17:00:51.209Z"), "attributes" : null, "parent" : "baebc8bcc8e14f6e9bf70221d81711e2", "type" : "CLASSROOM" }
  • 31. Legacy Collection • Bridge collection to facilitate migration from old data model to new • Near-image of old data model but with some refactoring (3 tables into 1 document) • Once migration is complete, plan is to drop this collection • Defaults not stored, 1-2 character field names
  • 32. Legacy Collection { "_id" : "jdoe", "subData" : { "f" : NumberLong(1018628731), "g" : "jdoe", "d" : false, "e" : NumberLong(1018628731), "b" : NumberLong(434077116), "a" : "JDoe", "l" : NumberLong("212200907100000000"), "i" : NumberLong(659952670) }, "guid" : "baebc8bcc8e14f6e9bf70221d81711e2", "st" : ISODate("2013-06-24T20:13:16.627Z") }
  • 33. Reservation Collection • Namespace protection • Enforce uniqueness of user/namespace from application side because shard key for UserIdentity collection is guid • Shard key is username+namespace
  • 34. Reservation Collection { "_id" : "b13a00163e062d8ee9dc9eaf3e2411e1", "createTime" : ISODate("2012-01-13T20:26:46.111Z"), "user" : "jdoe", "expires" : ISODate("2012-01-13T21:26:46.111Z"), "rsvId" : "e9bddfe1-1c84-42c9-8f4c-1a7a96920ff4", "data" : { "k1": "v1", "k2" : "v2" }, "ns" : "aol", "type" : "R" }
  • 36. Problem Writes spanning multiple documents sometimes fail part way
  • 37. Solution • Developed eventually consistent framework “synchronizer” • Events sent to framework to validate, repair, or finish • Events are retried until they succeed or their TTL expires
  • 38. Problem Scatter-gather queries slower, 100% performance impact on failover
  • 39. Solution • Use Memcached to map non-shard key to shard key (99% hit ratio for one mapping, 55% for other) • Use Memcached to map potentially expensive intermediary results (88% hit ratio)
  • 40. Problem Querying lists of users required parallel processing for performance -- increasing connection requirements
  • 41. Solution Use $in operator to query lists of users rather than looping through individual queries
  • 42. Problem At application startup a large number of requests failed because of overhead in creating mongos connections
  • 43. Solution Build into application a “warm-up” stage that executes stock queries prior to going online and taking traffic
  • 44. Problem During failovers or other slow periods, application queues back up and recovery takes too long
  • 45. Solution Determine each request’s time in queue; if it exceeds the client’s timeout, drop the request instead of processing it
  • 46. Problem Application-applied optimistic locking encounters lock errors during concurrent writes (entire document updated)
  • 47. Solution Use the $set operator to target writes to just those impacted elements, use MongoDB to enforce atomicity
  • 48. Problem Reads from primary, but when secondaries lost, reads fail
  • 49. Solution Use primaryPreferred for reads. Want the freshest data (password for example), but still want reads to work if no primary exists
  • 50. Problem Large number of connections to mongos/mongod is extending the failover times and nearing limits
  • 51. Solution • Application DAOs share connections to same Mongo cluster • Connection params initially set too high • Set connectionsPerHost and connectionMultiplier plus a buffer to cover the fixed number of worker threads per application (15/5 for 32 worker threads). • Went from 15K connections to 2K connections
  • 53. Benefits • Unanticipated benefit was ability for all eligible users to use the AOL client • Easily added Identity extensions leveraging the new data model • Support for multiple namespaces made building APIs for multi-tenancy straightforward • Model is positioned in such a way to make vision for AOL Identity feasible
  • 55. Lessons Learned • Keep connections as low as possible – Higher connection numbers increase failover times • Avoid scatter-gather reads (use cache if possible to get to shard key) • Keep data set in memory • Fail fast on application side to lower recovery time
  • 57. Going forward • Implement tagging to target secondaries • Further reduction in scatter-gather reads • Reduce failover window to as short as possible • Contact: doug.haydon@teamaol.com
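
Slide 27 describes the shard key as a Type 1 UUID "flipped – node then time". The sketch below shows one way such a flip could work. The exact layout is an assumption inferred from the sample ids in the slides: they appear to lead with four hex digits that read as the clock sequence, followed by the 12-digit node and then the three time fields. The function names `flip_uuid1` and `unflip` are hypothetical and do not come from the deck.

```python
import uuid


def flip_uuid1(u: uuid.UUID) -> str:
    """Move clock_seq+node in front of the time fields (assumed layout)."""
    if u.version != 1:
        raise ValueError("expected a version 1 (time-based) UUID")
    # Standard hex order: time_low(8) time_mid(4) time_hi_version(4)
    #                     clock_seq(4) node(12)
    h = u.hex
    return h[16:] + h[:16]


def unflip(flipped: str) -> uuid.UUID:
    """Inverse of flip_uuid1: restore the standard field order."""
    return uuid.UUID(hex=flipped[16:] + flipped[:16])


# Sample _id taken from the UserIdentity slide.
sample = "baebc8bcc8e14f6e9bf70221d81711e2"
print(unflip(sample))  # 9bf70221-d817-11e2-baeb-c8bcc8e14f6e
```

Leading with the node rather than the timestamp presumably keeps new ids from clustering at the high end of the key range, so inserts spread across shards while still being range-partitioned.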

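The fail-fast rule from slide 45 (drop a request once its time in queue exceeds the client's timeout) can be sketched as follows. This is a minimal illustration, not the AOL implementation: the `Request` class, the `drain` function, and the injectable `now` clock are all hypothetical names introduced here.

```python
import time
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Request:
    payload: str
    client_timeout_s: float  # how long the caller is willing to wait
    enqueued_at: float = field(default_factory=time.monotonic)


def drain(queue, handler, now=time.monotonic):
    """Process queued requests, dropping any whose client has already given up."""
    processed = dropped = 0
    while queue:
        req = queue.popleft()
        waited = now() - req.enqueued_at
        if waited > req.client_timeout_s:
            dropped += 1  # the client timed out; doing the work would be wasted
            continue
        handler(req)
        processed += 1
    return processed, dropped
```

After a failover, a backed-up queue is mostly requests nobody is waiting for anymore, so skipping them lets the application catch up to live traffic sooner.
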
Editor's Notes

  1. ----- Meeting Notes (9/9/14 14:55) ----- brands.aol.com, fingerprint brand logo, stuck on bottom of master page, invert colors
  2. ----- Meeting Notes (9/9/14 14:55) ----- after challenges, how we overcame, what is our strategy to overcome -- approach -- why chose mongo, document based
  3. ----- Meeting Notes (9/9/14 14:55) ----- what did we learn from this? What did we change? 10gen audit
  4. ----- Meeting Notes (9/9/14 14:55) ----- what did we learn from this? What did we change? 10gen audit
  5. ----- Meeting Notes (9/9/14 14:55) ----- 12 TB/700M
  6. 120 GB/140M
  7. 120 GB/140M
  8. 300GB/400M
  9. 300GB/400M
  10. 400GB
  11. 400GB