Introduction to WiredTiger and the Storage Engine API
Valeri Karpov
NodeJS Engineer, MongoDB
www.thecodebarbarian.com
www.slideshare.net/vkarpov15
github.com/vkarpov15
@code_barbarian
*
A Bit About Me
•CI/NodeJS Engineer at MongoDB
•Maintainer of mongoose ODM
•Recently working on rewriting mongodump, etc.
*
Talk Overview
•What is the Storage Engine API?
•What is WT and why you should care
•Basic WT internals and gotchas
•MMS Automation and WT
•Some very basic performance numbers
*
Introducing Storage Engines
•How MongoDB persists data
•<= MongoDB 2.6: “mmapv1” storage engine
•MongoDB 3.0 has a few options
• mmapv1
• in_memory
• wiredTiger
• devnull (now /dev/null does support sharding!)
•Internal hack: Twitter storage engine
*
Why Storage Engine API?
•Different performance characteristics
•mmapv1 doesn’t handle certain workloads well
•Consistent API on top of storage layer
• Can mix storage engines in a replset or sharded cluster!
*
Storage Engines Visually
•mmapv1 persists data to dbpath
*
Storage Engines Visually
•WT persists data differently
*
What is WiredTiger?
•Storage engine company founded by Berkeley DB alums
•Recently acquired by MongoDB
•Available as a storage engine option in MongoDB 3.0
*
Why is WiredTiger Awesome?
•Document-level locking
•Compression on disk
•Consistency without journaling
•Better performance on certain workloads
*
Document-level Locking
•The often-criticized global write lock was removed in 2.2
•2.2 through 2.6: database-level locking
•3.0 with mmapv1 has collection-level locking
•3.0 with WT locks only at the document level
•Writes no longer block all other writes
•Better CPU usage: more cores ~= more writes
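To make the granularity difference concrete, here is a minimal Python sketch. The `LockTable` class and its `granularity` values are hypothetical names invented for this illustration, and it only models who blocks whom. WiredTiger itself is generally described as using optimistic concurrency control rather than literal per-document mutexes, so read this as a model of the blocking behavior, not of the engine's internals.

```python
import threading


class LockTable:
    """Toy lock manager: one lock per database, collection, or document."""

    def __init__(self, granularity):
        # granularity is one of "database", "collection", "document"
        self.granularity = granularity
        self._locks = {}

    def _key(self, db, coll, doc_id):
        if self.granularity == "database":
            return (db,)
        if self.granularity == "collection":
            return (db, coll)
        return (db, coll, doc_id)

    def try_write(self, db, coll, doc_id):
        """Try to take the write lock without blocking; True on success."""
        key = self._key(db, coll, doc_id)
        lock = self._locks.setdefault(key, threading.Lock())
        return lock.acquire(blocking=False)


def second_writer_blocked(granularity):
    """Writer 1 holds doc 1; is writer 2 blocked on doc 2 in the same collection?"""
    table = LockTable(granularity)
    assert table.try_write("app", "events", 1)      # writer 1 gets its lock
    return not table.try_write("app", "events", 2)  # writer 2 targets another doc


for g in ("database", "collection", "document"):
    print(g, "-> second writer blocked:", second_writer_blocked(g))
```

Only the document-level table lets the second writer proceed, which is the "more cores ~= more writes" point in miniature.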
*
Compression
•WT uses snappy compression by default
•Data is compressed on disk
•Two supported compression algorithms:
• snappy: default. Good compression, relatively low overhead
• zlib: Better compression, but at the cost of more CPU overhead
*
Consistency without Journaling
•mmapv1 uses a write-ahead log to guarantee consistency as well as durability
•WT doesn’t have this problem: no in-place updates
•Potentially good for insert-heavy workloads
•Rely on replication for durability
•More on this in the next section
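The "no in-place updates" idea can be shown with a file-level analogy in Python: write the new version somewhere else, then swap it in atomically. This is only a sketch of the general copy-on-write principle. The `atomic_write` helper and its `fail_before_swap` switch are hypothetical, and WT's actual on-disk layout is far more involved.

```python
import os
import tempfile


def atomic_write(path, data, fail_before_swap=False):
    """Write data to a new file, then atomically swap it in over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())      # new version is on disk before the swap
        if fail_before_swap:
            raise RuntimeError("simulated crash before the swap")
        os.replace(tmp_path, path)      # atomic rename: old or new, never a mix
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)         # discard the half-finished new version
        raise


def demo():
    """Return the file contents after one good write and one 'crashed' write."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.bin")
        atomic_write(path, b"version 1")
        try:
            atomic_write(path, b"version 2", fail_before_swap=True)
        except RuntimeError:
            pass
        with open(path, "rb") as f:
            return f.read()


print(demo())
```

Because the old version is never overwritten, a failure mid-write leaves the previous consistent state intact. What is lost is the newest write, which is why the slide points to replication for durability.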
*
WiredTiger Internals: Running It
•mongod now has a --storageEngine option
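As a small illustration, the Python sketch below assembles a mongod command line that selects the engine. `--dbpath` and `--storageEngine` are real mongod options. The `mongod_command` helper and the `/data/wt` path are assumptions made up for this example.

```python
import shlex


def mongod_command(dbpath, storage_engine="wiredTiger"):
    """Build the argument list for starting mongod with a given engine."""
    return ["mongod", "--dbpath", dbpath, "--storageEngine", storage_engine]


cmd = mongod_command("/data/wt")
print(" ".join(shlex.quote(arg) for arg in cmd))
# To actually launch it: subprocess.Popen(cmd)  (needs MongoDB 3.0+ installed)
```

Passing `"mmapv1"` as the second argument gives the equivalent command for the older engine.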
*
WiredTiger and dbpath
•Don’t run WT when you have mmapv1 files in your dbpath (and vice versa)
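One way to guard against this is a pre-flight check on the dbpath, sketched here in Python. The file-name heuristics are assumptions based on typical layouts (WT writes a `WiredTiger` file and `*.wt` files, mmapv1 writes `<db>.ns` namespace files), and `guess_engine` and `safe_to_start` are hypothetical helpers, not part of any MongoDB tool.

```python
import os
import tempfile


def guess_engine(dbpath):
    """Guess which engine wrote the files in dbpath; None if nothing matches."""
    names = os.listdir(dbpath)
    if any(n == "WiredTiger" or n.endswith(".wt") for n in names):
        return "wiredTiger"
    if any(n.endswith(".ns") for n in names):
        return "mmapv1"
    return None


def safe_to_start(dbpath, engine):
    """True if dbpath has no recognizable data files or they match engine."""
    return guess_engine(dbpath) in (None, engine)


# Demo with fake, empty files in a temporary directory.
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "test.ns"), "w").close()
    open(os.path.join(d, "test.0"), "w").close()
    print(guess_engine(d), safe_to_start(d, "wiredTiger"))
```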
*
Upgrading to WT
•Can’t copy database files
•Can’t just restart with same dbpath
•Other methods for upgrading still work:
• Initial sync from replica set
• mongodump/mongorestore
•Can still do rolling upgrade of replica set to WT:
• Shut down secondary, delete dbpath, bring it back up with --storageEngine wiredTiger
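The rolling upgrade can be written out as an ordered checklist. The Python sketch below only generates the steps as text and does not touch any servers. The host names are placeholders, `rolling_upgrade_plan` is a hypothetical helper, and handling the primary last after `rs.stepDown()` is an assumption based on the usual rolling-maintenance pattern rather than something stated on the slide.

```python
def rolling_upgrade_plan(secondaries, primary, dbpath="/data/db"):
    """Return ordered, human-readable steps for moving a replica set to WT."""
    steps = []
    for host in secondaries:
        steps.append("%s: shut down mongod" % host)
        steps.append("%s: delete the contents of %s" % (host, dbpath))
        steps.append("%s: start mongod with its usual options plus "
                     "--storageEngine wiredTiger" % host)
        steps.append("%s: wait for initial sync to finish" % host)
    # Assumption: the primary goes last, after it has stepped down.
    steps.append("%s: run rs.stepDown(), then repeat the steps above" % primary)
    return steps


for step in rolling_upgrade_plan(["db2:27017", "db3:27017"], "db1:27017"):
    print(step)
```

Doing one member at a time keeps a majority of the set available while each node resyncs its data into the new format.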
*
Other Configuration Options
•Compression: --wiredTigerCollectionBlockCompressor
*
Other Configuration Options
•directoryperdb: doesn’t exist in WT
•Databases are a higher level abstraction with WT
•Following options also have no WT equivalent
• noprealloc
• syncdelay
• smallfiles
• journalCommitInterval
*
Configuration with YAML Files
•MongoDB 2.6 introduced YAML config files
•The storage.wiredTiger field lets you tweak WT options
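For illustration, the Python sketch below writes a small config file that selects WT and sets the collection block compressor under `storage.wiredTiger`. The option names follow the MongoDB 3.0 configuration file format as I understand it. The `/data/wt` path, the choice of zlib, and the `write_config` helper are assumptions for the example, not recommendations.

```python
import os
import tempfile

# YAML for `mongod --config <file>`; values are illustrative only.
MONGOD_CONF = """\
storage:
  dbPath: /data/wt
  engine: wiredTiger
  wiredTiger:
    collectionConfig:
      blockCompressor: zlib
"""


def write_config(path, text=MONGOD_CONF):
    """Write the config text to path and return the path."""
    with open(path, "w") as f:
        f.write(text)
    return path


with tempfile.TemporaryDirectory() as d:
    conf = write_config(os.path.join(d, "mongod.conf"))
    print("start with: mongod --config", conf)
```

The `blockCompressor` setting is the config-file counterpart of the `--wiredTigerCollectionBlockCompressor` command-line flag.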
*
WiredTiger Journaling
•Journaling in WT is a little different
•Write-ahead log committed to disk at checkpoints
•By default checkpoint every 60 seconds or 2GB written
•Data files are always consistent, but without journaling you lose any writes made since the last checkpoint
•No journal commit interval: writes are written to the journal as they come in
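The interaction between checkpoints and the journal can be modeled in a few lines of Python. `ToyStore` is a hypothetical in-memory model written for this example: it ignores the 60 second / 2GB triggers and everything else about real storage, and only shows which writes survive a crash.

```python
class ToyStore:
    """Toy model of checkpoints plus an optional write-ahead journal."""

    def __init__(self, journal_enabled):
        self.journal_enabled = journal_enabled
        self.memory = {}        # current in-memory state
        self.checkpoint = {}    # last consistent snapshot "on disk"
        self.journal = []       # writes logged since the last checkpoint

    def write(self, key, value):
        if self.journal_enabled:
            self.journal.append((key, value))   # logged as the write comes in
        self.memory[key] = value

    def take_checkpoint(self):
        self.checkpoint = dict(self.memory)     # new snapshot replaces the old
        self.journal = []                       # older log entries not needed

    def crash_and_recover(self):
        """Lose memory, reload the checkpoint, then replay the journal."""
        self.memory = dict(self.checkpoint)
        for key, value in self.journal:
            self.memory[key] = value
        return self.memory


def run(journal_enabled):
    store = ToyStore(journal_enabled)
    store.write("a", 1)
    store.take_checkpoint()
    store.write("b", 2)             # written after the checkpoint
    return store.crash_and_recover()


print("no journal:  ", run(False))
print("with journal:", run(True))
```

Without the journal the store comes back consistent but missing `b`. With it, the post-checkpoint write is replayed.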
*
Gotcha: system.indexes, system.namespaces deprecated
•Special collections in mmapv1
•Use explicit commands instead: db.collection.getIndexes(), db.getCollectionNames()
*
Gotcha: No 32-bit Support
•WT storage engine will not work on 32-bit platforms at all
*
Using MMS Automation with WT
•MMS automation allows you to manage and deploy MongoDB installations
•Demo of upgrading a standalone to WT
*
Some Basic Performance Numbers
•My desired use case: MongoDB for analytics data
•Write-heavy workloads aren’t mmapv1’s strong suit
•Don’t care about durability as much, but do care about high throughput
•Compression is a plus
•How does WT w/o journaling do on insert-only?
•Simple N=1 experiment
*
Some Basic Performance Numbers
•mongo shell’s benchRun function
*
Some Basic Performance Numbers
•mmapv1
•WT with --nojournal
*
Some Basic Performance Numbers
•How does the compression work in this example?
•After the bench run, WT’s /data/db sums to ~23 MB
•With --noprealloc, --nopreallocj, --smallfiles, mmapv1 has ~100 MB
•Not really a fair comparison since the data set is very small
*
Thanks for Listening!
•Slides on:
• Twitter: @code_barbarian
• Slideshare: slideshare.net/vkarpov15

MongoDB Miami Meetup 1/26/15: Introduction to WiredTiger
