SlideShare a Scribd company logo
Running MongoDB in the Cloud

          Tony Tam
          @fehguy
What this Talk is About

Wordnik left the cloud and came back
 •   What?!?
 •   Why we left
 •   Decisions
 •   Why we came back (and what we did differently)
Who is Wordnik?

• World’s fastest updating English dictionary
 •   Based on input of text at ~8k words/second
 •   Word Graph as basis to our analysis
     •   Synchronous & asynchronous processing
• 10’s of Billions of documents in NR
  storage
• Concept & Meaning Discovery Engine
• > 20M daily REST API calls, billions
  served
So Why the Detour?

• Architectural Choices
• Business Choices
• Feedback, tooling, infrastructure
• Learning
• Changes in use case
• Progress!
Architecture History

• EC2-based LAMP Stack
 •   POC (and seed funding)
 •   A manageable corpus < 1M records
• REST API
 •   Web + public
 •   MySQL in master/slave
 •   ~1B documents
 •   Operational nightmare
Architecture History
• MongoDB
 •   First-order MySQL issues solved
 •   But it got slow…
• Real Servers to the rescue!
 •   Faster, bigger disks
• MongoDB for Corpus, Structured Data
 •   Faster Reads + Writes!
 •   More metal (72GB RAM)
 •   More cores
 •   “cold” query from 400ms to < 100
Why Change?

Easy!
• Can’t beat metal…except
 •   Quick expansion
 •   Batch jobs/experiments
 •   Add a datacenter
 •   Full cluster migration
 •   The bill for unused capacity
Architectural Mindshift

1. Anything can die, anytime
2. Centralized, redundant state (see point 1)
3. Server performance is *different*
  •   CPU, I/O, Memory—choose one
  •   Smart design makes it work!
Architectural Mindshift

• Your software will need to change!
 •   So will the components you rely on
Your Infrastructure   Cloud
                                   Hero
• Deploying Servers
 •   Going to need a lot!
• Configuration
• Updates to your software

                    What about
                     Data?
Let’s make this Work!

• MySQL Master Slave
  •   Take a snapshot (yes, this will block)
  •   Keep your binlogs!

change master to MASTER_HOST='app1',
MASTER_USER='XXXX', MASTER_PASSWORD='XXXX',
MASTER_LOG_FILE='app1-relay.0038774',
MASTER_LOG_POS=6754205951;
Let’s make this Work!

But…
• Your master is down!
 •   Quick, promote a slave!
 •   Point the other slaves to the new master
• As for the clients…
                                 “Well, we
                               never really
                               tried that…”
Better with Mongo

• Easy up, easy down!
 •   Startup: Sync your data, and announce to clients
     when ready for business
 •   Shutdown: Announce your departure and leave
• Replica sets
 rs.add("db4.wordnik.com:27017");
 rs.remove("db1.wordnik.com:27017");
Better with Mongo
But what about Performance?

• Software Design
    •   It’s slow! (What is *it*?)
    •   Profile everything
import com.wordnik.util.perf._
...
def findUser(id:Long): User = {
    Profile("UserDao::findUserById", dao.findUserById(id))
}

         http://github.com/wordnik/wordnik-oss
But what about Performance?
But what about Performance?

• “It’s the database!”
  •   What is it?
• Mapping layer
  •   Mysql (12+ joins) => 50 records/sec
  •   Mongo JSON  POJO => 1000 records/sec
  •   Mongo DBO  POJO => 35,000 records/sec
• How do you know?
                                      Profile
                                        it!
It’s Still Slow!

• It’s the index!
  •   How do you know?
  •   AHHHHH
It’s Still Slow!

• Balance your B-Tree
 •   Can't always keep index in ram. MMF "does it's
     thing"
 •   Right-balanced b-tree keeps necessary index hot
 •   If you hit indexes on disk, mute your pager
                                                   1
                                                   7




                                                       1   2
                                                       5   7
But it’s Still Slow!

• Look at your Schema design
  •   Design to limit index size/number
  •   _id is your friend—make it meaningful
  •   Record size consistency
      •   Hierarchal Data beware!
      •   Split documents even in same collection!
db.posts.find({_id:/^tony_posts_/})
{_id:"tony_posts_1”, posts:[...]}
{_id:"tony_posts_2”, posts:[...]}       YOUR
{_id:"tony_posts_3”, posts:[...]}     app knows
                                         best
Really, it’s STILL slow!

• Your monolithic app/DB won’t scale same
  on VMs
• Specialize!
 •   Wordnik uses SOA

                 Powered API
                               swagger.wordnik.com

 •   Data tiers follow service types
 •   Smaller *everything*
Really, it’s STILL slow!

• Your monolithic app/DB won’t scale same
  on VMs
• Specialize!
 •   Wordnik uses SOA                   A contract
                                         for your
                               swagger.wordnik.com
                 Powered API
                                          clients
 •   Data tiers follow service types
 •   Smaller *everything*
Be the Boss of your Data

• Your app *should* be smarter than your
  DB
 •   Lots of users?
 •   Lots of blog posts?
 •   Lots of images?
 •   Shard? On what?
• Data dimensionality
 •   Keep active data hot
 •   Don’t try to boil the ocean
Cloud Computing + Mongo

• It can work extremely well
  •   No “Save as Cloud!” menu item
• Shifting constraints
  •   Optimize for RAM on VM
  •   Virtual disk => virtual performance
• Be “Deployable”
  •   Mongo Replica Sets are made for this
Cloud Computing + Mongo

• System Durability
 •   Design your software for abuse
 •   Your old design doesn’t apply
 •   Add APM hooks, now!
• Dissect your app
 •   Build to micro services with dedicated MongoDB
     clusters
• Deployment Infrastructure
 •   Don’t wait until it’s too late
See More
•   See more about Wordnik APIs
                    http://developer.wordnik.com

•   Migrating from MySQL to MongoDB
http://www.slideshare.net/fehguy/migrating-from-mysql-to-mongodb-at-wordnik

•   Maintaining your MongoDB Installation
              http://www.slideshare.net/fehguy/mongo-sv-tony-tam

•   Swagger API Framework
                          http://swagger.wordnik.com

•   Mapping Benchmark
               https://github.com/fehguy/mongodb-benchmark-tools

•   Wordnik OSS Tools
                   https://github.com/wordnik/wordnik-oss
Questions?

More Related Content

What's hot

Scaling with swagger
Scaling with swaggerScaling with swagger
Scaling with swagger
Tony Tam
 
Know thy cost (or where performance problems lurk)
Know thy cost (or where performance problems lurk)Know thy cost (or where performance problems lurk)
Know thy cost (or where performance problems lurk)
Oren Eini
 
Living with SQL and NoSQL at craigslist, a Pragmatic Approach
Living with SQL and NoSQL at craigslist, a Pragmatic ApproachLiving with SQL and NoSQL at craigslist, a Pragmatic Approach
Living with SQL and NoSQL at craigslist, a Pragmatic Approach
Jeremy Zawodny
 
GoSF Summerfest - Why Go at Apcera
GoSF Summerfest - Why Go at ApceraGoSF Summerfest - Why Go at Apcera
GoSF Summerfest - Why Go at Apcera
Derek Collison
 
Sphinx at Craigslist in 2012
Sphinx at Craigslist in 2012Sphinx at Craigslist in 2012
Sphinx at Craigslist in 2012
Jeremy Zawodny
 
Realtime Search Infrastructure at Craigslist (OpenWest 2014)
Realtime Search Infrastructure at Craigslist (OpenWest 2014)Realtime Search Infrastructure at Craigslist (OpenWest 2014)
Realtime Search Infrastructure at Craigslist (OpenWest 2014)
Jeremy Zawodny
 
MySQL And Search At Craigslist
MySQL And Search At CraigslistMySQL And Search At Craigslist
MySQL And Search At Craigslist
Jeremy Zawodny
 
RavenDB embedded at massive scales
RavenDB embedded at massive scalesRavenDB embedded at massive scales
RavenDB embedded at massive scales
Oren Eini
 
SenchaCon 2016 - How to Auto Generate a Back-end in Minutes
SenchaCon 2016 - How to Auto Generate a Back-end in MinutesSenchaCon 2016 - How to Auto Generate a Back-end in Minutes
SenchaCon 2016 - How to Auto Generate a Back-end in Minutes
Malin Weiss
 
Internet scaleservice
Internet scaleserviceInternet scaleservice
Internet scaleservice
DaeMyung Kang
 
Engineering an Encrypted Storage Engine
Engineering an Encrypted Storage EngineEngineering an Encrypted Storage Engine
Engineering an Encrypted Storage Engine
MongoDB
 
RavenDB Presentation
RavenDB PresentationRavenDB Presentation
RavenDB PresentationMark Rodseth
 
GlobalsDB: Its significance for Node.js Developers
GlobalsDB: Its significance for Node.js DevelopersGlobalsDB: Its significance for Node.js Developers
GlobalsDB: Its significance for Node.js Developers
Rob Tweed
 
NATS - A new nervous system for distributed cloud platforms
NATS - A new nervous system for distributed cloud platformsNATS - A new nervous system for distributed cloud platforms
NATS - A new nervous system for distributed cloud platforms
Derek Collison
 
Fusion-io and MySQL at Craigslist
Fusion-io and MySQL at CraigslistFusion-io and MySQL at Craigslist
Fusion-io and MySQL at Craigslist
Jeremy Zawodny
 
From MySQL to MongoDB at Wordnik (Tony Tam)
From MySQL to MongoDB at Wordnik (Tony Tam)From MySQL to MongoDB at Wordnik (Tony Tam)
From MySQL to MongoDB at Wordnik (Tony Tam)MongoSF
 
Document Databases & RavenDB
Document Databases & RavenDBDocument Databases & RavenDB
Document Databases & RavenDB
Brian Ritchie
 
How Different are MongoDB Drivers
How Different are MongoDB DriversHow Different are MongoDB Drivers
How Different are MongoDB Drivers
Norberto Leite
 
Lessons Learned Migrating 2+ Billion Documents at Craigslist
Lessons Learned Migrating 2+ Billion Documents at CraigslistLessons Learned Migrating 2+ Billion Documents at Craigslist
Lessons Learned Migrating 2+ Billion Documents at Craigslist
Jeremy Zawodny
 
What's new in MongoDB 2.6 at India event by company
What's new in MongoDB 2.6 at India event by companyWhat's new in MongoDB 2.6 at India event by company
What's new in MongoDB 2.6 at India event by company
MongoDB APAC
 

What's hot (20)

Scaling with swagger
Scaling with swaggerScaling with swagger
Scaling with swagger
 
Know thy cost (or where performance problems lurk)
Know thy cost (or where performance problems lurk)Know thy cost (or where performance problems lurk)
Know thy cost (or where performance problems lurk)
 
Living with SQL and NoSQL at craigslist, a Pragmatic Approach
Living with SQL and NoSQL at craigslist, a Pragmatic ApproachLiving with SQL and NoSQL at craigslist, a Pragmatic Approach
Living with SQL and NoSQL at craigslist, a Pragmatic Approach
 
GoSF Summerfest - Why Go at Apcera
GoSF Summerfest - Why Go at ApceraGoSF Summerfest - Why Go at Apcera
GoSF Summerfest - Why Go at Apcera
 
Sphinx at Craigslist in 2012
Sphinx at Craigslist in 2012Sphinx at Craigslist in 2012
Sphinx at Craigslist in 2012
 
Realtime Search Infrastructure at Craigslist (OpenWest 2014)
Realtime Search Infrastructure at Craigslist (OpenWest 2014)Realtime Search Infrastructure at Craigslist (OpenWest 2014)
Realtime Search Infrastructure at Craigslist (OpenWest 2014)
 
MySQL And Search At Craigslist
MySQL And Search At CraigslistMySQL And Search At Craigslist
MySQL And Search At Craigslist
 
RavenDB embedded at massive scales
RavenDB embedded at massive scalesRavenDB embedded at massive scales
RavenDB embedded at massive scales
 
SenchaCon 2016 - How to Auto Generate a Back-end in Minutes
SenchaCon 2016 - How to Auto Generate a Back-end in MinutesSenchaCon 2016 - How to Auto Generate a Back-end in Minutes
SenchaCon 2016 - How to Auto Generate a Back-end in Minutes
 
Internet scaleservice
Internet scaleserviceInternet scaleservice
Internet scaleservice
 
Engineering an Encrypted Storage Engine
Engineering an Encrypted Storage EngineEngineering an Encrypted Storage Engine
Engineering an Encrypted Storage Engine
 
RavenDB Presentation
RavenDB PresentationRavenDB Presentation
RavenDB Presentation
 
GlobalsDB: Its significance for Node.js Developers
GlobalsDB: Its significance for Node.js DevelopersGlobalsDB: Its significance for Node.js Developers
GlobalsDB: Its significance for Node.js Developers
 
NATS - A new nervous system for distributed cloud platforms
NATS - A new nervous system for distributed cloud platformsNATS - A new nervous system for distributed cloud platforms
NATS - A new nervous system for distributed cloud platforms
 
Fusion-io and MySQL at Craigslist
Fusion-io and MySQL at CraigslistFusion-io and MySQL at Craigslist
Fusion-io and MySQL at Craigslist
 
From MySQL to MongoDB at Wordnik (Tony Tam)
From MySQL to MongoDB at Wordnik (Tony Tam)From MySQL to MongoDB at Wordnik (Tony Tam)
From MySQL to MongoDB at Wordnik (Tony Tam)
 
Document Databases & RavenDB
Document Databases & RavenDBDocument Databases & RavenDB
Document Databases & RavenDB
 
How Different are MongoDB Drivers
How Different are MongoDB DriversHow Different are MongoDB Drivers
How Different are MongoDB Drivers
 
Lessons Learned Migrating 2+ Billion Documents at Craigslist
Lessons Learned Migrating 2+ Billion Documents at CraigslistLessons Learned Migrating 2+ Billion Documents at Craigslist
Lessons Learned Migrating 2+ Billion Documents at Craigslist
 
What's new in MongoDB 2.6 at India event by company
What's new in MongoDB 2.6 at India event by companyWhat's new in MongoDB 2.6 at India event by company
What's new in MongoDB 2.6 at India event by company
 

Similar to Running MongoDB in the Cloud

A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?
A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?
A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?DATAVERSITY
 
Inside Wordnik's Architecture
Inside Wordnik's ArchitectureInside Wordnik's Architecture
Inside Wordnik's Architecture
Tony Tam
 
What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?
DATAVERSITY
 
Why ruby and rails
Why ruby and railsWhy ruby and rails
Why ruby and rails
Reuven Lerner
 
Data Modeling for NoSQL
Data Modeling for NoSQLData Modeling for NoSQL
Data Modeling for NoSQL
Tony Tam
 
DevNation Atlanta
DevNation AtlantaDevNation Atlanta
DevNation Atlanta
boorad
 
2013 CPM Conference, Nov 6th, NoSQL Capacity Planning
2013 CPM Conference, Nov 6th, NoSQL Capacity Planning2013 CPM Conference, Nov 6th, NoSQL Capacity Planning
2013 CPM Conference, Nov 6th, NoSQL Capacity Planningasya999
 
NOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the CloudNOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the Cloud
boorad
 
Discover MongoDB - Israel
Discover MongoDB - IsraelDiscover MongoDB - Israel
Discover MongoDB - IsraelMichael Fiedler
 
Mongo DB at Community Engine
Mongo DB at Community EngineMongo DB at Community Engine
Mongo DB at Community Engine
Community Engine
 
MongoDB at community engine
MongoDB at community engineMongoDB at community engine
MongoDB at community engine
mathraq
 
Five Years of EC2 Distilled
Five Years of EC2 DistilledFive Years of EC2 Distilled
Five Years of EC2 Distilled
Grig Gheorghiu
 
AWS to Bare Metal: Motivation, Pitfalls, and Results
AWS to Bare Metal: Motivation, Pitfalls, and ResultsAWS to Bare Metal: Motivation, Pitfalls, and Results
AWS to Bare Metal: Motivation, Pitfalls, and Results
MongoDB
 
MongoDB Days UK: Using MongoDB to Build a Fast and Scalable Content Repositor...
MongoDB Days UK: Using MongoDB to Build a Fast and Scalable Content Repositor...MongoDB Days UK: Using MongoDB to Build a Fast and Scalable Content Repositor...
MongoDB Days UK: Using MongoDB to Build a Fast and Scalable Content Repositor...
MongoDB
 
Designing your API Server for mobile apps
Designing your API Server for mobile appsDesigning your API Server for mobile apps
Designing your API Server for mobile appsMugunth Kumar
 
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Bob Pusateri
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling Twitter
John Adams
 
Fixing twitter
Fixing twitterFixing twitter
Fixing twitter
Roger Xia
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
smallerror
 

Similar to Running MongoDB in the Cloud (20)

A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?
A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?
A Case Study of NoSQL Adoption: What Drove Wordnik Non-Relational?
 
Inside Wordnik's Architecture
Inside Wordnik's ArchitectureInside Wordnik's Architecture
Inside Wordnik's Architecture
 
What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?
 
Why ruby and rails
Why ruby and railsWhy ruby and rails
Why ruby and rails
 
Data Modeling for NoSQL
Data Modeling for NoSQLData Modeling for NoSQL
Data Modeling for NoSQL
 
DevNation Atlanta
DevNation AtlantaDevNation Atlanta
DevNation Atlanta
 
2013 CPM Conference, Nov 6th, NoSQL Capacity Planning
2013 CPM Conference, Nov 6th, NoSQL Capacity Planning2013 CPM Conference, Nov 6th, NoSQL Capacity Planning
2013 CPM Conference, Nov 6th, NoSQL Capacity Planning
 
NOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the CloudNOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the Cloud
 
Discover MongoDB - Israel
Discover MongoDB - IsraelDiscover MongoDB - Israel
Discover MongoDB - Israel
 
Mongo DB at Community Engine
Mongo DB at Community EngineMongo DB at Community Engine
Mongo DB at Community Engine
 
MongoDB at community engine
MongoDB at community engineMongoDB at community engine
MongoDB at community engine
 
Five Years of EC2 Distilled
Five Years of EC2 DistilledFive Years of EC2 Distilled
Five Years of EC2 Distilled
 
AWS to Bare Metal: Motivation, Pitfalls, and Results
AWS to Bare Metal: Motivation, Pitfalls, and ResultsAWS to Bare Metal: Motivation, Pitfalls, and Results
AWS to Bare Metal: Motivation, Pitfalls, and Results
 
MongoDB Days UK: Using MongoDB to Build a Fast and Scalable Content Repositor...
MongoDB Days UK: Using MongoDB to Build a Fast and Scalable Content Repositor...MongoDB Days UK: Using MongoDB to Build a Fast and Scalable Content Repositor...
MongoDB Days UK: Using MongoDB to Build a Fast and Scalable Content Repositor...
 
Designing your API Server for mobile apps
Designing your API Server for mobile appsDesigning your API Server for mobile apps
Designing your API Server for mobile apps
 
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
Select Stars: A DBA's Guide to Azure Cosmos DB (SQL Saturday Oslo 2018)
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling Twitter
 
Fixing twitter
Fixing twitterFixing twitter
Fixing twitter
 
Fixing_Twitter
Fixing_TwitterFixing_Twitter
Fixing_Twitter
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
 

More from Tony Tam

A Tasty deep-dive into Open API Specification Links
A Tasty deep-dive into Open API Specification LinksA Tasty deep-dive into Open API Specification Links
A Tasty deep-dive into Open API Specification Links
Tony Tam
 
API Design first with Swagger
API Design first with SwaggerAPI Design first with Swagger
API Design first with Swagger
Tony Tam
 
Developing Faster with Swagger
Developing Faster with SwaggerDeveloping Faster with Swagger
Developing Faster with Swagger
Tony Tam
 
Writer APIs in Java faster with Swagger Inflector
Writer APIs in Java faster with Swagger InflectorWriter APIs in Java faster with Swagger Inflector
Writer APIs in Java faster with Swagger Inflector
Tony Tam
 
Fastest to Mobile with Scalatra + Swagger
Fastest to Mobile with Scalatra + SwaggerFastest to Mobile with Scalatra + Swagger
Fastest to Mobile with Scalatra + Swagger
Tony Tam
 
Swagger APIs for Humans and Robots (Gluecon)
Swagger APIs for Humans and Robots (Gluecon)Swagger APIs for Humans and Robots (Gluecon)
Swagger APIs for Humans and Robots (Gluecon)
Tony Tam
 
Love your API with Swagger (Gluecon lightning talk)
Love your API with Swagger (Gluecon lightning talk)Love your API with Swagger (Gluecon lightning talk)
Love your API with Swagger (Gluecon lightning talk)Tony Tam
 
Swagger for-your-api
Swagger for-your-apiSwagger for-your-api
Swagger for-your-api
Tony Tam
 
Swagger for startups
Swagger for startupsSwagger for startups
Swagger for startups
Tony Tam
 
System insight without Interference
System insight without InterferenceSystem insight without Interference
System insight without Interference
Tony Tam
 
Keeping MongoDB Data Safe
Keeping MongoDB Data SafeKeeping MongoDB Data Safe
Keeping MongoDB Data Safe
Tony Tam
 
Scala & Swagger at Wordnik
Scala & Swagger at WordnikScala & Swagger at Wordnik
Scala & Swagger at Wordnik
Tony Tam
 
Introducing Swagger
Introducing SwaggerIntroducing Swagger
Introducing Swagger
Tony Tam
 
Building a Directed Graph with MongoDB
Building a Directed Graph with MongoDBBuilding a Directed Graph with MongoDB
Building a Directed Graph with MongoDB
Tony Tam
 
Migrating from MySQL to MongoDB at Wordnik
Migrating from MySQL to MongoDB at WordnikMigrating from MySQL to MongoDB at Wordnik
Migrating from MySQL to MongoDB at Wordnik
Tony Tam
 

More from Tony Tam (15)

A Tasty deep-dive into Open API Specification Links
A Tasty deep-dive into Open API Specification LinksA Tasty deep-dive into Open API Specification Links
A Tasty deep-dive into Open API Specification Links
 
API Design first with Swagger
API Design first with SwaggerAPI Design first with Swagger
API Design first with Swagger
 
Developing Faster with Swagger
Developing Faster with SwaggerDeveloping Faster with Swagger
Developing Faster with Swagger
 
Writer APIs in Java faster with Swagger Inflector
Writer APIs in Java faster with Swagger InflectorWriter APIs in Java faster with Swagger Inflector
Writer APIs in Java faster with Swagger Inflector
 
Fastest to Mobile with Scalatra + Swagger
Fastest to Mobile with Scalatra + SwaggerFastest to Mobile with Scalatra + Swagger
Fastest to Mobile with Scalatra + Swagger
 
Swagger APIs for Humans and Robots (Gluecon)
Swagger APIs for Humans and Robots (Gluecon)Swagger APIs for Humans and Robots (Gluecon)
Swagger APIs for Humans and Robots (Gluecon)
 
Love your API with Swagger (Gluecon lightning talk)
Love your API with Swagger (Gluecon lightning talk)Love your API with Swagger (Gluecon lightning talk)
Love your API with Swagger (Gluecon lightning talk)
 
Swagger for-your-api
Swagger for-your-apiSwagger for-your-api
Swagger for-your-api
 
Swagger for startups
Swagger for startupsSwagger for startups
Swagger for startups
 
System insight without Interference
System insight without InterferenceSystem insight without Interference
System insight without Interference
 
Keeping MongoDB Data Safe
Keeping MongoDB Data SafeKeeping MongoDB Data Safe
Keeping MongoDB Data Safe
 
Scala & Swagger at Wordnik
Scala & Swagger at WordnikScala & Swagger at Wordnik
Scala & Swagger at Wordnik
 
Introducing Swagger
Introducing SwaggerIntroducing Swagger
Introducing Swagger
 
Building a Directed Graph with MongoDB
Building a Directed Graph with MongoDBBuilding a Directed Graph with MongoDB
Building a Directed Graph with MongoDB
 
Migrating from MySQL to MongoDB at Wordnik
Migrating from MySQL to MongoDB at WordnikMigrating from MySQL to MongoDB at Wordnik
Migrating from MySQL to MongoDB at Wordnik
 

Recently uploaded

Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
Abida Shariff
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 

Recently uploaded (20)

Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 

Running MongoDB in the Cloud

  • 1. Running MongoDB in the Cloud Tony Tam @fehguy
  • 2. What this Talk is About Wordnik left the cloud and came back • What?!? • Why we left • Decisions • Why we came back (and what we did differently)
  • 3. Who is Wordnik? • World’s fastest updating English dictionary • Based on input of text at ~8k words/second • Word Graph as basis to our analysis • Synchronous & asynchronous processing • 10’s of Billions of documents in NR storage • Concept & Meaning Discovery Engine • > 20M daily REST API calls, billions served
  • 4. So Why the Detour? • Architectural Choices • Business Choices • Feedback, tooling, infrastructure • Learning • Changes in use case • Progress!
  • 5. Architecture History • EC2-based LAMP Stack • POC (and seed funding) • A manageable corpus < 1M records • REST API • Web + public • MySQL in master/slave • ~1B documents • Operational nightmare
  • 6. Architecture History • MongoDB • First-order MySQL issues solved • But it got slow… • Real Servers to the rescue! • Faster, bigger disks • MongoDB for Corpus, Structured Data • Faster Reads + Writes! • More metal (72GB RAM) • More cores • “cold” query from 400ms to < 100
  • 7. Why Change? Easy! • Can’t beat metal…except • Quick expansion • Batch jobs/experiments • Add a datacenter • Full cluster migration • The bill for unused capacity
  • 8. Architectural Mindshift 1. Anything can die, anytime 2. Centralized, redundant state (see point 1) 3. Server performance is *different* • CPU, I/O, Memory—choose one • Smart design makes it work!
  • 9. Architectural Mindshift • Your software will need to change! • So will the components you rely on
  • 10. Your Infrastructure Cloud Hero • Deploying Servers • Going to need a lot! • Configuration • Updates to your software What about Data?
  • 11. Let’s make this Work! • MySQL Master Slave • Take a snapshot (yes, this will block) • Keep your binlogs! change master to MASTER_HOST='app1', MASTER_USER='XXXX', MASTER_PASSWORD='XXXX', MASTER_LOG_FILE='app1-relay.0038774', MASTER_LOG_POS=6754205951;
  • 12. Let’s make this Work! But… • Your master is down! • Quick, promote a slave! • Point the other slaves to the new master • As for the clients… “Well, we never really tried that…”
  • 13. Better with Mongo • Easy up, easy down! • Startup: Sync your data, and announce to clients when ready for business • Shutdown: Announce your departure and leave • Replica sets rs.add("db4.wordnik.com:27017"); rs.remove("db1.wordnik.com:27017");
  • 15. But what about Performance? • Software Design • It’s slow! (What is *it*?) • Profile everything import com.wordnik.util.perf._ ... def findUser(id:Long): User = { Profile("UserDao::findUserById", dao.findUserById(id)) } http://github.com/wordnik/wordnik-oss
  • 16. But what about Performance?
  • 17. But what about Performance? • “It’s the database!” • What is it? • Mapping layer • Mysql (12+ joins) => 50 records/sec • Mongo JSON  POJO => 1000 records/sec • Mongo DBO  POJO => 35,000 records/sec • How do you know? Profile it!
  • 18. It’s Still Slow! • It’s the index! • How do you know? • AHHHHH
  • 19. It’s Still Slow! • Balance your B-Tree • Can't always keep index in ram. MMF "does it's thing" • Right-balanced b-tree keeps necessary index hot • If you hit indexes on disk, mute your pager 1 7 1 2 5 7
  • 20. But it’s Still Slow! • Look at your Schema design • Design to limit index size/number • _id is your friend—make it meaningful • Record size consistency • Hierarchal Data beware! • Split documents even in same collection! db.posts.find({_id:/^tony_posts_/}) {_id:"tony_posts_1”, posts:[...]} {_id:"tony_posts_2”, posts:[...]} YOUR {_id:"tony_posts_3”, posts:[...]} app knows best
  • 21. Really, it’s STILL slow! • Your monolithic app/DB won’t scale same on VMs • Specialize! • Wordnik uses SOA Powered API swagger.wordnik.com • Data tiers follow service types • Smaller *everything*
  • 22. Really, it’s STILL slow! • Your monolithic app/DB won’t scale same on VMs • Specialize! • Wordnik uses SOA A contract for your swagger.wordnik.com Powered API clients • Data tiers follow service types • Smaller *everything*
  • 23. Be the Boss of your Data • Your app *should* be smarter than your DB • Lots of users? • Lots of blog posts? • Lots of images? • Shard? On what? • Data dimensionality • Keep active data hot • Don’t try to boil the ocean
  • 24. Cloud Computing + Mongo • It can work extremely well • No “Save as Cloud!” menu item • Shifting constraints • Optimize for RAM on VM • Virtual disk => virtual performance • Be “Deployable” • Mongo Replica Sets are made for this
  • 25. Cloud Computing + Mongo • System Durability • Design your software for abuse • Your old design doesn’t apply • Add APM hooks, now! • Dissect your app • Build to micro services with dedicated MongoDB clusters • Deployment Infrastructure • Don’t wait until it’s too late
  • 26. See More • See more about Wordnik APIs http://developer.wordnik.com • Migrating from MySQL to MongoDB http://www.slideshare.net/fehguy/migrating-from-mysql-to-mongodb-at-wordnik • Maintaining your MongoDB Installation http://www.slideshare.net/fehguy/mongo-sv-tony-tam • Swagger API Framework http://swagger.wordnik.com • Mapping Benchmark https://github.com/fehguy/mongodb-benchmark-tools • Wordnik OSS Tools https://github.com/wordnik/wordnik-oss