How Sitecore depends on MongoDB
for scalability and performance, and
what it can teach you
Antonios Giannopoulos
Database Administrator – ObjectRocket
Grant Killian
Sitecore Architect - Rackspace
Percona Live 2017
Agenda
We are going to discuss:
Key terms
- Introduction to Sitecore
- Introduction to MongoDB
Best Practices for MongoDB with Sitecore
Scaling Sitecore
Benchmarks
Who We Are
Antonios Giannopoulos
Database Administrator w/ ObjectRocket
Grant Killian
Sitecore Architect w/ Rackspace
Sitecore MVP
Sitecore Architecture
Minimum necessary to understand this talk
Gartner Magic Quadrant for WCM (Web Content Management), Sept 2016
Sitecore is a framework for building websites...
Sitecore ♥ MongoDB because . . .
● Unstructured document model is a better fit for
Sitecore analytics vs traditional database rows
● ∞ scalability
● Introduces key flexibility to the system
○ HTTP Session state
○ Optional repository for other Sitecore modules
○ 100% replacement for SQL Server (experimental)
■ $$$
MongoDB replica-set
A group of mongod processes that maintain the same dataset
Replica sets provide:
- Redundancy
- High availability
- Scaling
MongoDB replica-set
Recommended minimum of 3 nodes
- Up to 50 nodes in 3.0 and higher
- 12 on previous versions
A replica-set node may be either:
- Primary
- Secondary
- Arbiter
MongoDB replica-set
Asynchronous replication
- Delay between PRI and SECs
- SECs pull and apply operations
Automatic failover
- If the PRI fails, a SEC is elected to take its place
MongoDB replica-set
Best Practices
- Odd number of members
- Use same server specs
- Reliable network connections
- Size the oplog according to the write workload
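The odd-number advice follows from majority voting: a primary can only be elected while a strict majority of voting members is reachable. A minimal sketch of that arithmetic, where faultTolerance is an illustrative helper name and not part of any MongoDB API:

```javascript
// Majority arithmetic behind the "odd number of members" advice.
// faultTolerance is a hypothetical helper, not a MongoDB API.
function faultTolerance(votingMembers) {
  // A primary needs a strict majority of voting members to be elected.
  const majority = Math.floor(votingMembers / 2) + 1;
  return { majority, tolerated: votingMembers - majority };
}

for (const n of [3, 4, 5]) {
  const { majority, tolerated } = faultTolerance(n);
  console.log(`${n} members: majority ${majority}, tolerates ${tolerated} failure(s)`);
}
```

A fourth member raises the majority to 3 but still tolerates only one failure, so it adds cost without adding availability.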
MongoDB Sharded Clusters
Consists of:
Mongos
- It’s a statement (query) router
- Connection interface for the driver - makes sharding transparent
Config Servers: Hold cluster metadata - location of the data
Shards: Contain a subset of the sharded data
MongoDB Sharded Clusters
Best Practices
- Deploy shards as replica-sets
- Reliable network connections
- But most important… pick the right shard key
Undoing a shard key choice might require downtime
MongoDB Sharded Clusters
What makes a good shard key:
- High Cardinality
- Non-null values
- Immutable field(s)
- Not monotonically increasing fields
- Even read/write distribution
- Even data distribution
- Read targeting/locality
Most important: choose a shard key according to your application requirements
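To see why monotonically increasing keys are discouraged, the following sketch simulates 1000 new inserts against four pre-existing chunks. It is a toy model under stated assumptions: the chunk boundaries are invented, and MD5 stands in for the hash function only to spread keys for the simulation.

```javascript
const crypto = require("crypto");

// Range sharding: chunks already exist for the key ranges
// [0,1000), [1000,2000), [2000,3000), [3000, +inf). Illustrative values.
function rangeChunk(key) {
  return Math.min(Math.floor(key / 1000), 3);
}

// Hashed sharding: split the hash space into 4 equal ranges.
// MD5 is only a stand-in hash for this simulation.
function hashedChunk(key) {
  const digest = crypto.createHash("md5").update(String(key)).digest();
  return digest[0] >> 6; // top two bits select one of 4 ranges
}

// Count how many keys land in each of the 4 chunks.
function distribute(chunkOf, keys) {
  const counts = [0, 0, 0, 0];
  for (const k of keys) counts[chunkOf(k)] += 1;
  return counts;
}

// 1000 new inserts with a monotonically increasing key.
const newKeys = Array.from({ length: 1000 }, (_, i) => 3000 + i);
const rangeCounts = distribute(rangeChunk, newKeys);
const hashedCounts = distribute(hashedChunk, newKeys);
console.log("range :", rangeCounts);
console.log("hashed:", hashedCounts);
```

With the range key every insert lands in the last chunk, so a single shard absorbs the whole write load. The hashed variant spreads the same inserts across all four chunks.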
MongoDB Storage Engines
MongoDB version 3.0 and higher supports:
- MMAPv1
- WiredTiger
- RocksDB (Percona Server)
- In Memory (Percona Server)
- Fractal Tree (Percona Server)
Sitecore MongoDB Databases
1. Analytics - customer visit metrics (IP address, browser, pages…)
2. Tracking_contact - contact processing
3. Tracking_history - history worker queue for full rebuilds
4. Tracking_live - task queue for real-time processing
5. Private_session - “classic” HTTP session state
6. Shared_session - meta HTTP session state for contacts (engagement state for the lifetime of interactions…)
For example . . .
Graphic courtesy of http://www.techphoria414.com
Scaling Sitecore – Separate Workloads
Move each Sitecore database to a separate instance
Sitecore uses a different connection string per database
connectionString="mongodb://_mongo_server_01_:_port_number_/_session_database_name_" />
connectionString="mongodb://_mongo_server_02_:_port_number_/_analytics_database_name_" />
Instances can be optimized according to their workload
Scaling Sitecore – Polyglot
Use a different storage engine per database:
- Different instances
- Sharded clusters, different storage engines per shard
Percona In-memory storage engine is a good fit for _sessions
- Based on the in-memory storage engine used in MongoDB Enterprise Edition
- _sessions data are not persistent
Scaling Sitecore - Sharding
What to shard:
- Large collections for capacity
- Busy collections for load distribution
How to pick a shard key:
- Collect a representative statement sample and identify statement patterns
- Pick a shard key that scales the workload/statements
- Meet sharding constraints
Scaling Sitecore - Sharding
From Sitecore documentation: “Sitecore calculates
diskspace sizing projections using 5KB per
interaction and 2.5KB per identified contact and
these two items make up 80% of the diskspace”
Shard the Interactions and Contacts collections for capacity.
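The quoted figures can be turned into a rough projection. This is a sketch only: estimateDiskGB and the input volumes are hypothetical, and the division by 0.8 simply scales the two quoted items up to 100% of the disk space.

```javascript
// Disk sizing sketch based on the quoted figures: 5 KB per interaction,
// 2.5 KB per identified contact, together about 80% of total disk space.
// estimateDiskGB and the example volumes are hypothetical.
function estimateDiskGB(interactions, contacts) {
  const KB_PER_INTERACTION = 5;
  const KB_PER_CONTACT = 2.5;
  const coreKB = interactions * KB_PER_INTERACTION + contacts * KB_PER_CONTACT;
  const totalKB = coreKB / 0.8; // scale up for the remaining ~20%
  return totalKB / (1024 * 1024); // KB -> GB
}

// Example: 100 million interactions and 20 million identified contacts.
console.log(estimateDiskGB(100e6, 20e6).toFixed(1), "GB");
```

Comparing the result against the usable disk of a single replica set gives a first indication of whether sharding for capacity is needed.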
Scaling Sitecore - Sharding
Collection Interaction
Receives: Inserts, Queries and Updates
Read/Write Ratio: 60-40
Updates are using the _id
Queries are using:
"_id, ContactId": 80%
"ContactId, _t": 5%
"ContactId, ContactVisitIndex": 15%
Scaling Sitecore - Sharding
Collection Interaction
Recommended shard key is _id:1 or _id:hashed
- Scale vast majority of statements
- But… a few scatter-gather queries (around 20%)
{ContactId:1} is also decent, but:
- Updates on sharded collections MUST use the shard key (or {multi:true}); _id is an exception to that rule
- _id is generated by the application, not the driver
- Potential for jumbo chunks
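The 20% scatter-gather figure can be reproduced from the query mix on the previous slide. The sketch below is a simplification that assumes single-field shard keys and equality filters; targetedShare is an illustrative name.

```javascript
// Query mix for the Interaction collection, taken from the slide (percent).
const queryMix = [
  { fields: ["_id", "ContactId"], share: 80 },
  { fields: ["ContactId", "_t"], share: 5 },
  { fields: ["ContactId", "ContactVisitIndex"], share: 15 },
];

// A query can be routed to a single shard only if it filters on the
// shard key; otherwise mongos must scatter-gather across all shards.
// Simplification: single-field shard keys with equality filters only.
function targetedShare(shardKeyField, mix) {
  return mix
    .filter((q) => q.fields.includes(shardKeyField))
    .reduce((sum, q) => sum + q.share, 0);
}

console.log("_id       :", targetedShare("_id", queryMix), "% targeted");
console.log("ContactId :", targetedShare("ContactId", queryMix), "% targeted");
```

On reads alone ContactId targets every query pattern. The caveats listed on the slide (updates keyed on _id, jumbo chunks) are what keep _id as the primary recommendation.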
Scaling Sitecore - Sharding
Collection Interaction
Choose your shard key according to your engine
- MMAPv1: _id:1 or _id:hashed
- WiredTiger: _id:1, _id:hashed or ContactId:1
Sitecore may optimize sharding by including ContactId on the updates
Scaling Sitecore - Sharding
Collection Contacts
Receives: Inserts, Queries and Updates
Read/Write Ratio: 80-20
Updates are using the _id
Queries are using the _id (with additional fields)
Recommended shard key is _id:1 or _id:hashed
Scaling Sitecore - Sharding
Collection Devices
Recommended shard key is _id:1 or _id:hashed
Collection ClassificationsMap
Recommended shard key is _id:1 or _id:hashed
Collection KeyBehaviorCache
Recommended shard key is _id:1 or _id:hashed
Scaling Sitecore - Sharding
Collection GeoIps
Recommended shard key is _id:1 or _id:hashed
Collection OperationStatuses
Recommended shard key is _id:1 or _id:hashed
Collection ReferringSites
Recommended shard key is _id:1 or _id:hashed
Scaling Sitecore - Sharding
{_id:1} vs {_id:hashed}
Client-generated _id values are usually monotonically increasing, so "hashed" is added for randomness
Sitecore's _id is a .NET UUID (Universally Unique Identifier) stored as the BinData datatype
Example: "_id" : BinData(3,"1eDJ1NXU8EeiD5a6WJtxbA==")
Scaling Sitecore - Sharding
{_id:1} vs {_id:hashed}
You may use the uuidhelpers.js utility to convert _id to UUID
Download from: https://github.com/mongodb/mongo-csharp-driver/blob/master/uuidhelpers.js
>doc = db.test.findOne()
{ "_id" : BinData(3,"1eDJ1NXU8EeiD5a6WJtxbA==") }
>doc._id.toCSUUID()
CSUUID("d4c9e0d5-d4d5-47f0-a20f-96ba589b716c")
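Outside the mongo shell the same conversion can be sketched in a few lines. This assumes Node.js (for Buffer) and the legacy C# byte layout, in which the first three UUID groups are stored little-endian; toCSharpUuid is a hypothetical name, not part of uuidhelpers.js.

```javascript
// Convert the base64 payload of a BinData subtype 3 value written by the
// C# driver into its textual UUID. The legacy C# layout stores the first
// three UUID groups little-endian, so those byte groups are reversed.
// toCSharpUuid is a hypothetical helper; it runs under Node.js.
function toCSharpUuid(base64) {
  const bytes = Buffer.from(base64, "base64");
  if (bytes.length !== 16) throw new Error("expected 16 bytes");
  const hex = (start, end, reverse) => {
    const part = Buffer.from(bytes.subarray(start, end)); // copy before reversing
    return (reverse ? part.reverse() : part).toString("hex");
  };
  return [
    hex(0, 4, true),
    hex(4, 6, true),
    hex(6, 8, true),
    hex(8, 10, false),
    hex(10, 16, false),
  ].join("-");
}

console.log(toCSharpUuid("1eDJ1NXU8EeiD5a6WJtxbA=="));
// d4c9e0d5-d4d5-47f0-a20f-96ba589b716c (same value as the shell output above)
```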
Scaling Sitecore - Sharding
Use {_id:"hashed"} when you have an empty collection
Using numInitialChunks allows you to pre-split and distribute empty chunks.
- Avoid chunk splits
- Avoid chunk moves
db.adminCommand( { shardCollection: "<database>.<collection>", key: {_id:"hashed"}, numInitialChunks: <number> } )
<number> must be less than 8192 per shard.
Scaling Sitecore - Sharding
Use {_id:"hashed"} when you have an empty collection
Define numInitialChunks
Size= Collection size (in MB)/32
Count= Number of documents/125000
Limit= Number of shards*8192
numInitialChunks = Min(Max(Size, Count), Limit)
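The formula above translates directly into a small helper. Rounding up with Math.ceil is an assumption, since the slide does not say how fractional results are handled, and the example volumes are hypothetical.

```javascript
// numInitialChunks = Min(Max(Size, Count), Limit), per the formula above.
// Rounding up with Math.ceil is an assumption; the slide does not say how
// fractional results are handled.
function numInitialChunks(collectionSizeMB, documentCount, shardCount) {
  const size = Math.ceil(collectionSizeMB / 32);
  const count = Math.ceil(documentCount / 125000);
  const limit = shardCount * 8192;
  return Math.min(Math.max(size, count), limit);
}

// Hypothetical projection: 100 GB of data, 500 million documents, 3 shards.
// Size = 3200, Count = 4000, Limit = 24576.
console.log(numInitialChunks(100 * 1024, 500e6, 3)); // 4000
```

In this example the document count drives the result. For collections with few, large documents the size term dominates instead.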
Scaling Sitecore - Sharding
Move Primary
Move each Sitecore database to a different shard:
(analytics, tracking_live …)
db.adminCommand( { movePrimary: <databaseName>, to: <newPrimaryShard> } )
Requires downtime for live databases
Scaling Sitecore – Secondary Reads
You can configure secondary reads from the driver (secondary or secondaryPreferred)
connectionString="mongodb://_mongo_server_01_:_port_number_/_session_database_name_?readPreference=secondary" />
In 3.4 maxStalenessSeconds was introduced to control stale reads
Specifies, in seconds, how stale a secondary can be before the client stops using
it for read operations
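A connection URI that combines a secondary read preference with a staleness cap might be assembled as follows. buildReadUri, the host name and the database name are placeholders; the lower bound of 90 reflects the smallest maxStalenessSeconds value that MongoDB accepts.

```javascript
// Build a MongoDB connection URI that prefers secondaries and caps staleness.
// buildReadUri and the host/database values are placeholders.
function buildReadUri(host, port, database, maxStalenessSeconds) {
  // Values below 90 seconds are rejected by MongoDB.
  if (maxStalenessSeconds < 90) {
    throw new RangeError("maxStalenessSeconds must be at least 90");
  }
  const options = new URLSearchParams({
    readPreference: "secondaryPreferred",
    maxStalenessSeconds: String(maxStalenessSeconds),
  });
  return `mongodb://${host}:${port}/${database}?${options}`;
}

console.log(buildReadUri("mongo01.example.com", 27017, "analytics", 120));
```

When the resulting URI is pasted into an XML connectionString attribute, the & between options has to be written as &amp;.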
Scaling Sitecore – Secondary Reads
Use ReplicaSet Tags to target reads:
- Direct reads to specific replica set nodes
- Reduces availability
conf = rs.conf();
conf.members[0].tags = {"db": "analytics"}
rs.reconfig(conf)
Set readPreferenceTags on the connection string (tags are key:value pairs and require a non-primary read preference)
connectionString="mongodb://_mongo_server_01_:_port_number_/_session_database_name_?readPreference=secondary&amp;readPreferenceTags=db:analytics" />
Order matters when setting multiple tags
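Why order matters can be shown with a simplified model of tag-set matching: tag sets are tried in the order given, and the first one that matches at least one member wins. The member list, host names and selectByTagSets are illustrative; real drivers additionally filter by member state and latency.

```javascript
// Illustrative replica set members with tags (hypothetical hosts and tags).
const members = [
  { host: "mongo01:27017", tags: { db: "analytics", dc: "east" } },
  { host: "mongo02:27017", tags: { db: "sessions", dc: "east" } },
  { host: "mongo03:27017", tags: { db: "sessions", dc: "west" } },
];

// Tag sets are tried in order; the first set matched by at least one
// member wins, and later sets are ignored.
function selectByTagSets(candidates, tagSets) {
  for (const tagSet of tagSets) {
    const matches = candidates.filter((m) =>
      Object.entries(tagSet).every(([key, value]) => m.tags[key] === value)
    );
    if (matches.length > 0) return matches.map((m) => m.host);
  }
  return []; // no tag set matched any member
}

console.log(selectByTagSets(members, [{ db: "analytics" }, { dc: "west" }]));
console.log(selectByTagSets(members, [{ dc: "west" }, { db: "analytics" }]));
```

The same two tag sets select mongo01 in the first call and mongo03 in the second, purely because of their order.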
Scaling Sitecore – Multi Region
Challenges:
- Direct reads to the closest node
- Direct writes to the closest node
- Single database entity for reporting
- Minimum complexity
Scaling Sitecore – Multi Region
Replica Set:
- Target reads using the nearest read preference
- Target reads using region based tags
- Writes must go to the Primary
- Requires at least one secondary per region
Scaling Sitecore – Multi Region
Sharded cluster:
- Target reads using the nearest read preference
- Target reads using region based tags
- Requires at least one secondary per region
- Writes must go to the Primaries
- Tags or Zones are based on shard key ranges
- Add location to shard key as prefix – change the source code
Scaling Sitecore – Multi Region
Mongo to Mongo connector:
- Creates a pipeline from a MongoDB cluster to another
MongoDB cluster
- Reads and replicates oplog operations
- Easy deployment
mongo-connector -m <name:port> -t <name:port> -d mongo_doc_manager
Scaling Sitecore – Connector
Source: db.foo.insert({a:1})
Oplog entry: { "ts" : Timestamp(), "h" : NumberLong(), "v" : 2, "op" : "i", "ns" : "foo.foo", "o" : { "_id" : 1, "a" : 1 } }
Target: db.foo.insert({_id:1, a:1})
Scaling Sitecore – Multi Region
Mongo to Mongo Connector
Benchmarks
Benchmark 1: Single/Replica set MMAP vs Single shard/Replica set
WiredTiger (3.2.8)
Results: WiredTiger is 9.5% faster
Benchmark 2: Sharded cluster MMAP vs Sharded cluster
WiredTiger (Analytics sharded on {_id:1})
Results: WiredTiger is 9.4% faster
So what?
- Evaluate your MongoDB architecture to determine if it
would benefit from scaling
- If scaling is in order, consider this talk as a
reference
- Recognize how MongoDB’s versatility makes it
relevant to a wide variety of applications
What's next?
- Test MongoRocks (Percona Server) against Sitecore
- Test In-Memory (Percona Server) for sessions or
cache(s)
- Expand sharding recommendations on add-ons
- Evaluate other Sitecore modules for suitability with
MongoDB
- Re-invent our benchmarks
We’re Hiring!
Looking to join a dynamic & innovative team?
Justine is here at Percona Live 2017,
Reach out directly to our Recruiter at justine.marmolejo@rackspace.com
Questions?
Thank you!!!
antonios.giannopoulos@rackspace.co.uk
@iamantonios
grant.killian@rackspace.com
@sitecoreagent

Call Girls In Aerocity 🤳 Call Us +919599264170Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170
 
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
 
Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸
 
Chizaram's Women Tech Makers Deck. .pptx
Chizaram's Women Tech Makers Deck.  .pptxChizaram's Women Tech Makers Deck.  .pptx
Chizaram's Women Tech Makers Deck. .pptx
 
Event 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxEvent 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptx
 
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comSaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
 
The Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism PresentationThe Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism Presentation
 
proposal kumeneger edited.docx A kumeeger
proposal kumeneger edited.docx A kumeegerproposal kumeneger edited.docx A kumeeger
proposal kumeneger edited.docx A kumeeger
 
Application of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptxApplication of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptx
 
Quality by design.. ppt for RA (1ST SEM
Quality by design.. ppt for  RA (1ST SEMQuality by design.. ppt for  RA (1ST SEM
Quality by design.. ppt for RA (1ST SEM
 
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRRINDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
 
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
 
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptxEngaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
 
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
 
Internship Presentation | PPT | CSE | SE
Internship Presentation | PPT | CSE | SEInternship Presentation | PPT | CSE | SE
Internship Presentation | PPT | CSE | SE
 
General Elections Final Press Noteas per M
General Elections Final Press Noteas per MGeneral Elections Final Press Noteas per M
General Elections Final Press Noteas per M
 
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
 
Early Modern Spain. All about this period
Early Modern Spain. All about this periodEarly Modern Spain. All about this period
Early Modern Spain. All about this period
 

How Sitecore depends on MongoDB for scalability and performance, and what it can teach you

  • 1. How Sitecore depends on MongoDB for scalability and performance, and what it can teach you Antonios Giannopoulos Database Administrator – ObjectRocket Grant Killian Sitecore Architect - Rackspace Percona Live 2017
  • 2. Agenda We are going to discuss: Key terms - Introduction to Sitecore - Introduction to MongoDB Best Practices for MongoDB with Sitecore Scaling Sitecore Benchmarks
  • 3. Who We Are Antonios Giannopoulos Database Administrator w/ ObjectRocket Grant Killian Sitecore Architect w/ Rackspace Sitecore MVP
  • 4. Sitecore Architecture Minimum necessary to understand this talk
  • 5. Gartner Magic Quadrant for WCM (Web Content Management) -Sept 2016
  • 6. Sitecore is a framework for building websites...
  • 9. Sitecore ♥ MongoDB because . . . ● Unstructured document model is a better fit for Sitecore analytics vs traditional database rows ● ∞ scalability ● Introduces key flexibility to the system ○ HTTP Session state ○ Optional repository for other Sitecore modules ○ 100% replacement for SQL Server (experimental) ■ $$$
  • 10. MongoDB replica-set A group of mongod processes that maintain the same dataset Replica sets provide: - Redundancy - High availability - Scaling
  • 11. MongoDB replica-set Consists of at least 3 nodes - Up to 50 nodes in 3.0 and higher - 12 on previous versions A replica-set node may be one of: - Primary - Secondary - Arbiter
  • 12. MongoDB replica-set Asynchronous replication - Delay between PRI and SECs - SECs pull and apply operations Automatic failover - If a PRI fails a SEC takes its place
  • 13. MongoDB replica-set Best Practices - Odd number of members - Use same server specs - Reliable network connections - Adjust the oplog accordingly
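The "odd number of members" advice follows from how elections work: a primary can only be elected by a strict majority of voting members. The sketch below is a minimal illustration of that arithmetic, assuming every member votes; the function names are hypothetical and not part of any MongoDB API.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """How many voting members can be lost while a majority remains."""
    return voting_members - majority(voting_members)


# Compare a few replica-set sizes (assumes all members are voting members).
for size in (3, 4, 5):
    print(size, majority(size), fault_tolerance(size))
```

Under this assumption a fourth member adds cost without tolerating any more failures than three members do, which is why odd sizes are usually preferred.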
  • 14. MongoDB Sharded Clusters Consists of: Mongos - It’s a statement (query) router - Connection interface for the driver - makes sharding transparent Config Servers: Hold cluster metadata - location of the data Shards: Contain a subset of the sharded data
  • 16. MongoDB Sharded Clusters Best Practices - Deploy shards as replica-sets - Reliable network connections - But most important… pick a shard key Undoing a shard key choice might require downtime
  • 17. MongoDB Sharded Clusters What makes a good shard key: - High cardinality - No null values - Immutable field(s) - Not monotonically increasing fields - Even read/write distribution - Even data distribution - Read targeting/locality Most important: choose a shard key according to your application requirements
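To see why monotonically increasing fields make poor range shard keys, here is a small simulation. It is only a sketch: the chunk boundaries and key values are made up, and the MD5-based `hashed_chunk` function is a stand-in for illustration, not MongoDB's actual hashed-index function.

```python
import bisect
import hashlib
from collections import Counter

# Hypothetical split points: four range chunks created from older data.
BOUNDARIES = [250, 500, 750]
NUM_CHUNKS = len(BOUNDARIES) + 1


def range_chunk(key: int) -> int:
    """Index of the range chunk whose interval contains the key."""
    return bisect.bisect_right(BOUNDARIES, key)


def hashed_chunk(key: int) -> int:
    """Illustrative hash placement (an assumption, not MongoDB's hash)."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_CHUNKS


# New inserts carry ever-increasing keys, all above the last boundary.
new_keys = range(1000, 2000)
range_counts = Counter(range_chunk(k) for k in new_keys)
hashed_counts = Counter(hashed_chunk(k) for k in new_keys)

print("range :", dict(range_counts))
print("hashed:", dict(hashed_counts))
```

With range placement every new insert lands in the last chunk, so one shard takes all the writes; with hashed placement the same keys spread across all chunks.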
  • 18. MongoDB Storage Engines MongoDB version 3.0 and higher supports: - MMAPv1 - WiredTiger - RocksDB (Percona Server) - In Memory (Percona Server) - Fractal Tree (Percona Server)
  • 19. Sitecore MongoDB Databases 1. Analytics - customer visit metrics (IP address, browser, pages…) 2. Tracking_contact - contact processing 3. Tracking_history - history worker queue for full rebuilds 4. Tracking_live - task queue for real-time processing 5. Private_session - “classic” http session state 6. Shared_session - meta http session state for contacts (engagement state for lifetime of interactions…)
  • 20. For example . . . Graphic courtesy of http://www.techphoria414.com
  • 21. Scaling Sitecore – Separate Workloads Move each Sitecore database to a separate instance Sitecore uses a different connection string per database connectionString="mongodb://_mongo_server_01_:_port_number_/_session_database_name_" /> connectionString="mongodb://_mongo_server_02_:_port_number_/_analytics_database_name_" /> Instances can be optimized according to their workload
  • 22. Scaling Sitecore – Polyglot Use a different storage engine per database: - Different instances - Sharded clusters, different storage engines per shard Percona In-memory storage engine is a good fit for _sessions - Based on the in-memory storage engine used in MongoDB Enterprise Edition - _sessions data are not persistent
  • 23. Scaling Sitecore - Sharding What to shard: - Large collections for capacity - Busy collections for load distribution How to pick a shard key: - Collect a representative statement sample and identify statement patterns - Pick a shard key that scales the workload/statements - Meet sharding constraints
  • 24. Scaling Sitecore - Sharding From Sitecore documentation: “Sitecore calculates diskspace sizing projections using 5KB per interaction and 2.5KB per identified contact and these two items make up 80% of the diskspace” Sharding interaction and contact for capacity.
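The sizing rule quoted above can be turned into a rough estimate. The sketch below assumes one reading of the quote: interactions and contacts together account for 80% of disk space, so the total is their combined size divided by 0.8. The function name and the example volumes are hypothetical.

```python
def estimate_disk_kb(
    interactions: int,
    contacts: int,
    kb_per_interaction: float = 5.0,   # figure quoted from the Sitecore documentation
    kb_per_contact: float = 2.5,       # figure quoted from the Sitecore documentation
    share_of_total: float = 0.8,       # interactions + contacts = 80% of disk space
) -> float:
    """Rough total disk estimate in KB, under the assumptions stated above."""
    core_kb = interactions * kb_per_interaction + contacts * kb_per_contact
    return core_kb / share_of_total


# Hypothetical volumes: 1,000,000 interactions and 200,000 identified contacts.
total_kb = estimate_disk_kb(1_000_000, 200_000)
print(f"{total_kb / 1024 / 1024:.2f} GB")
```

The result is only a planning figure; indexes, the oplog, and storage-engine compression are not modelled.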
  • 25. Scaling Sitecore - Sharding Collection Interaction Receives: Inserts, Queries and Updates Read/Write Ratio: 60-40 Updates are using the _id Queries are using: "_id, ContactId": 80% "ContactId, _t": 5% "ContactId, ContactVisitIndex": 15%
  • 26. Scaling Sitecore - Sharding Collection Interaction Recommended shard key is _id:1 or _id:hashed - Scales the vast majority of statements - But… a few scatter-gather queries (around 20%) {ContactId:1} is also decent, but: - Updates on sharded collections MUST use the shard key (or {multi:true}) - _id is an exception to that rule - _id is generated by the application, not the driver - Potential for jumbo chunks
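The "around 20%" scatter-gather figure can be reproduced from the query mix listed for the Interaction collection. The sketch below makes a simplifying assumption: a query is routed to a single shard only when its filter contains every shard key field as an equality match. The helper name is hypothetical.

```python
# Query patterns for the Interaction collection and their share of traffic (%).
QUERY_MIX = [
    ({"_id", "ContactId"}, 80),
    ({"ContactId", "_t"}, 5),
    ({"ContactId", "ContactVisitIndex"}, 15),
]


def targeted_percent(shard_key_fields, query_mix=QUERY_MIX) -> int:
    """Share of queries whose filter includes every shard key field."""
    key = set(shard_key_fields)
    return sum(pct for fields, pct in query_mix if key <= fields)


for candidate in (["_id"], ["ContactId"]):
    targeted = targeted_percent(candidate)
    print(candidate, "targeted:", targeted, "scatter-gather:", 100 - targeted)
```

Under this model {ContactId:1} targets every query, but the updates, which filter on _id alone, would then no longer carry the shard key. That is the trade-off the slide points out.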
  • 27. Scaling Sitecore - Sharding Collection Interaction Choose your shard key according to your engine - MMAP _id:1 or _id:hashed - WiredTiger _id:1 or _id:hashed or ContactId:1 Sitecore may optimize sharding by including ContactId on the updates
  • 28. Scaling Sitecore - Sharding Collection Contacts Receives: Inserts, Queries and Updates Read/Write Ratio: 80-20 Updates are using the _id Queries are using the _id (with additional fields) Recommended shard key is _id:1 or _id:hashed
  • 29. Scaling Sitecore - Sharding Collection Devices Recommended shard key is _id:1 or _id:hashed Collection ClassificationsMap Recommended shard key is _id:1 or _id:hashed Collection KeyBehaviorCache Recommended shard key is _id:1 or _id:hashed
  • 30. Scaling Sitecore - Sharding Collection GeoIps Recommended shard key is _id:1 or _id:hashed Collection OperationStatuses Recommended shard key is _id:1 or _id:hashed Collection ReferringSites Recommended shard key is _id:1 or _id:hashed
  • 31. Scaling Sitecore - Sharding {_id:1} vs {_id:hashed} Client-generated _id values are monotonically increasing, so “hashed” is added for randomness The Sitecore _id is a .NET UUID (Universally Unique Identifier) stored as the BinData datatype Example: "_id" : BinData(3,"1eDJ1NXU8EeiD5a6WJtxbA==")
  • 32. Scaling Sitecore - Sharding {_id:1} vs {_id:hashed} You may use the uuidhelpers.js utility to convert _id to UUID Download from: https://github.com/mongodb/mongo-csharp-driver/blob/master/uuidhelpers.js > doc = db.test.findOne() { "_id" : BinData(3,"1eDJ1NXU8EeiD5a6WJtxbA==") } > doc._id.toCSUUID() CSUUID("d4c9e0d5-d4d5-47f0-a20f-96ba589b716c")
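The same conversion can be checked outside the mongo shell. This is a minimal sketch using only the Python standard library; it assumes the BinData subtype 3 payload uses the legacy C# byte order, in which the first three UUID fields are stored little-endian. The function name is hypothetical.

```python
import base64
import uuid


def legacy_csharp_uuid(b64_payload: str) -> uuid.UUID:
    """Decode a base64 BinData(3, ...) payload written in legacy C# byte order."""
    raw = base64.b64decode(b64_payload)
    # bytes_le interprets the first three fields as little-endian, as .NET does.
    return uuid.UUID(bytes_le=raw)


print(legacy_csharp_uuid("1eDJ1NXU8EeiD5a6WJtxbA=="))
# d4c9e0d5-d4d5-47f0-a20f-96ba589b716c
```

The printed value matches the CSUUID shown in the shell example.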
  • 33. Scaling Sitecore - Sharding Use {_id:"hashed"} when you have an empty collection Using numInitialChunks allows you to pre-split and distribute empty chunks. - Avoid chunk splits - Avoid chunk moves db.adminCommand( { shardCollection: <collection>, key: {_id:"hashed"}, numInitialChunks: <number> } ), where number < 8192 per shard.
  • 34. Scaling Sitecore - Sharding Use {_id:"hashed"} when you have an empty collection Define numInitialChunks: Size = Collection size (in MB)/32 Count = Number of documents/125000 Limit = Number of shards*8192 numInitialChunks = Min(Max(Size, Count), Limit)
  • 35. Scaling Sitecore - Sharding Move Primary Move each Sitecore database to a different shard: (analytics, tracking_live …) db.runCommand( { movePrimary: <databaseName>, to: <newPrimaryShard> } ) Requires downtime for live databases
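The numInitialChunks formula can be written as a small helper. This is a sketch of the arithmetic on the slide; rounding up to a whole number of chunks is an assumption, as are the function name and the example inputs.

```python
import math


def num_initial_chunks(collection_size_mb: float, document_count: int, shard_count: int) -> int:
    """numInitialChunks = Min(Max(Size, Count), Limit), rounded up (assumption)."""
    size = collection_size_mb / 32        # Size: expected data size in MB / 32
    count = document_count / 125000       # Count: expected documents / 125000
    limit = shard_count * 8192            # Limit: 8192 chunks per shard
    return math.ceil(min(max(size, count), limit))


# Hypothetical projection: 64,000 MB and 10 million documents on 3 shards.
print(num_initial_chunks(64_000, 10_000_000, 3))  # 2000
```

For very large projections the per-shard limit takes over: with 10,000,000 MB on 3 shards the helper returns 24576.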
  • 36. Scaling Sitecore – Secondary Reads You can configure Secondary Reads from the driver (secondary or secondaryPreferred) connectionString="mongodb://_mongo_server_01_:_port_number_/_session_database_name_?readPreference=secondary" /> In 3.4 maxStalenessSeconds was introduced to control stale reads Specifies, in seconds, how stale a secondary can be before the client stops using it for read operations
  • 37. Scaling Sitecore – Secondary Reads Use ReplicaSet Tags to target reads: - Direct reads to specific replica set nodes - Reduces availability conf = rs.conf(); conf.members[0].tags = {"db": "analytics"}; rs.reconfig(conf) Set readPreferenceTags on the connection string connectionString="mongodb://_mongo_server_01_:_port_number_/_session_database_name_?readPreferenceTags=db:analytics" /> Order matters when setting multiple tags
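A connection string with both options might be assembled as sketched below. The host, port and database names are placeholders, and the helper is hypothetical; the only MongoDB-specific facts relied on are the `readPreference` and `maxStalenessSeconds` URI options and the 90-second minimum that MongoDB enforces for `maxStalenessSeconds`.

```python
from urllib.parse import urlencode


def build_mongo_uri(host, port, database, read_preference="secondaryPreferred",
                    max_staleness_seconds=None):
    """Build a mongodb:// URI with read preference options (illustrative helper)."""
    options = {"readPreference": read_preference}
    if max_staleness_seconds is not None:
        # MongoDB rejects values below 90 seconds.
        if max_staleness_seconds < 90:
            raise ValueError("maxStalenessSeconds must be at least 90")
        options["maxStalenessSeconds"] = max_staleness_seconds
    return f"mongodb://{host}:{port}/{database}?{urlencode(options)}"


# Placeholder host and database names.
print(build_mongo_uri("mongo01", 27017, "session", "secondary", 120))
```

The resulting string is what would go into the `connectionString` attribute of the corresponding Sitecore entry.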
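Why order matters with multiple tag sets: drivers try each tag set in turn and use the first one that matches at least one eligible member. The sketch below models only that rule; the member names and tags are made up, and real drivers also apply read preference mode and latency filtering.

```python
def select_by_tag_sets(members, tag_sets):
    """Return members matching the first tag set that matches anything."""
    for tag_set in tag_sets:
        matches = [
            member for member in members
            if all(member["tags"].get(key) == value for key, value in tag_set.items())
        ]
        if matches:
            return matches
    return []


# Hypothetical replica-set members.
MEMBERS = [
    {"name": "node-a", "tags": {"db": "analytics"}},
    {"name": "node-b", "tags": {"db": "session"}},
    {"name": "node-c", "tags": {}},
]

# Analytics first, then "any member" as a fallback.
print([m["name"] for m in select_by_tag_sets(MEMBERS, [{"db": "analytics"}, {}])])
# Same tag sets in reverse order: the empty tag set matches every member first.
print([m["name"] for m in select_by_tag_sets(MEMBERS, [{}, {"db": "analytics"}])])
```

Placing a broad tag set before a specific one defeats the targeting, since the specific set is never reached.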
  • 38. Scaling Sitecore – Multi Region Challenges: - Direct reads to the closest node - Direct writes to the closest node - Single database entity for reporting - Minimum complexity
  • 39. Scaling Sitecore – Multi Region Replica Set: - Target reads using nearest read preference - Target reads using region based tags - Writes must go to the Primary - Requires at least one secondary per region
  • 40. Scaling Sitecore – Multi Region Sharded cluster: - Target reads using nearest read preference - Target reads using region based tags - Requires at least one secondary per region - Writes must go to the Primaries - Tags or Zones are based on shard key ranges - Add location to shard key as prefix – change the source code
  • 41. Scaling Sitecore – Multi Region Mongo to Mongo connector: - Creates a pipeline from a MongoDB cluster to another MongoDB cluster - Reads and replicates oplog operations - Easy deployment mongo-connector -m <name:port> -t <name:port> -d <database>
  • 42. Scaling Sitecore – Connector oplog oplog db.foo.insert({a:1}) db.foo.insert({_id:1, a:1}) { "ts" : Timestamp(), "h" : NumberLong(), "v" : 2, "op" : "i", "ns" : "foo.foo", "o" : { "_id" : 1, "a" : 1 } }
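Replaying an oplog insert on a target can be pictured with a toy in-memory model. This is not mongo-connector's implementation, only a sketch of the idea: the target is a nested dictionary, the entry mirrors the fields shown above, and only the insert operation ("op": "i") is handled.

```python
def apply_insert(target: dict, entry: dict) -> None:
    """Apply an oplog-style insert entry to an in-memory target (toy model)."""
    if entry["op"] != "i":
        raise NotImplementedError("only inserts are modelled in this sketch")
    # "ns" is "<database>.<collection>"; split on the first dot only.
    database, collection = entry["ns"].split(".", 1)
    document = entry["o"]
    # Keying by _id makes a replayed insert idempotent.
    target.setdefault(database, {}).setdefault(collection, {})[document["_id"]] = document


# Simplified entry: the "ts" and "h" fields are omitted here.
entry = {"v": 2, "op": "i", "ns": "foo.foo", "o": {"_id": 1, "a": 1}}
target_cluster = {}
apply_insert(target_cluster, entry)
apply_insert(target_cluster, entry)  # replaying the same entry changes nothing
print(target_cluster)
```

Because the oplog entry already carries the generated _id, the target ends up with the same document identity as the source.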
  • 43. Scaling Sitecore – Multi Region Mongo to Mongo Connector
  • 44. Scaling Sitecore – Multi Region Mongo to Mongo Connector
  • 45. Scaling Sitecore – Multi Region Mongo to Mongo Connector
  • 46. Benchmarks Benchmark 1: Single/Replica set MMAP vs Single shard/Replica set WiredTiger (3.2.8) Results: WiredTiger is 9.5% faster Benchmark 2: Sharded cluster MMAP vs Sharded cluster WiredTiger (Analytics sharded on {_id:1}) Results: WiredTiger is 9.4% faster
  • 47. So what? - Evaluate your MongoDB architecture to determine if it would benefit from scaling - If scaling is in order, consider this talk as a reference - Recognize how MongoDB’s versatility makes it relevant to a wide variety of applications
  • 48. What's next? - Test MongoRocks (Percona Server) against Sitecore - Test In-Memory (Percona Server) for sessions or cache(s) - Expand sharding recommendations on add-ons - Evaluate other Sitecore modules for suitability with MongoDB - Re-invent our benchmarks
  • 49. We’re Hiring! Looking to join a dynamic & innovative team? Justine is here at Percona Live 2017. Reach out directly to our recruiter at justine.marmolejo@rackspace.com