
MongoDB and Solidfire: Tuning MongoDB for Next Generation Storage Systems



Storage architecture can have a direct impact on MongoDB performance. Traditional relational databases were designed around legacy SAN devices and required that the storage system be dedicated to the database: if you wanted more performance, you purchased a larger array. With NoSQL databases, the model has been flipped upside down. These databases are designed from the ground up to be distributed; more hosts equals more performance. By leveraging solid-state drive technology with concepts like storage virtualization, quality of service, and horizontal scaling, next generation storage systems like SolidFire are able to combine the comforts of traditional dedicated storage performance with the simplicity and scalability expected in a MongoDB environment.

The architecture of MongoDB makes it ideal for large-scale deployments. By tuning MongoDB to work with a next generation storage system, database administrators can achieve consistent, repeatable I/O performance with ultra-low latency in a highly scalable, extremely flexible database environment.

Join Chris Merz as he walks through a real-world example to show how to:

● Architect MongoDB with SolidFire storage for a large-scale production cloud environment
● Traverse the technology stack to identify performance bottlenecks
● Optimize I/O performance and latency
● Normalize performance under load
● Maintain performance at scale



  • 1. MongoDB on SolidFire Tuning MongoDB for Next Generation Storage Systems
  • 2. MongoDB Has Changed The Landscape
    ● Scale-out Databases Have Changed Storage Expectations
      ○ Expand infrastructure incrementally vs. forklift system upgrades
      ○ Path to start small and scale extremely large (e.g. Internet-sized applications)
      ○ Avoid vendor lock-in: not a factor due to open platform options
      ○ Proven performance in both dedicated and virtual environments
    ● Schemaless Design Offers Ultimate Flexibility
      ○ Developers are not forced to code to a relational schema
      ○ Applications can store and retrieve data in a natural format
      ○ Functional development is simplified and streamlined
      ○ MongoDB can help applications scale, regardless of the original DBMS
  • 3. Traditional Storage Can’t Keep Up
    ● Limited by their scale-up architectures
    ● Spinning disks have huge latency and throughput limitations
    ● HDD performance ceiling is lower than the capacity ceiling
    ● Legacy storage systems require a PhD to configure and maintain
    ● Severe lack of provisioning flexibility
    ● Immutable configurations do not allow in-line modification
  • 4. Next Gen Storage Systems Embrace the New Reality
    ● Scale-out storage architecture (clusters nodes together)
    ● Horizontal scaling approach allows for incremental, online expansion
    ● All-SSD storage systems provide superior query performance
    ● Inline compression and deduplication of data maximize usable capacity
    ● Storage design flexibility enables flexible/adaptive database designs
      ○ Virtualize storage resources (GBs, IOPS)
      ○ Provision elastically
      ○ Adjust resources independently and as needed
  • 5. Real World Example - Large Web 2.0 Enterprise
    ● Private cloud environment with a SolidFire storage layer
    ● MongoDB sharded cluster, ~42 nodes
    ● Backend for a global e-commerce platform
    ● High volume, extremely strict operational tolerances
    ● Low latency critical to performance at scale
    ● IOPS crucial for sharded cluster stability and balancing
  • 6. Architecture of a MongoDB Cluster on SolidFire (simplified)
    [Diagram: two query routers (mongos) and three config servers (mongod C1-C3) in front of three shards (s1, s2, s3), each shard composed of three mongod instances, all running in a cloud environment backed by SolidFire]
    ● Cloud environment backed by SolidFire supporting a MongoDB sharded cluster
    ● All mongod, mongos, and config nodes, as well as storage volumes, are virtualized
    ● No cookie-cutter limitations: the entire environment has been tailored to suit the application needs
  • 7. Performance Troubleshooting
    ● Query latency inconsistent and above tolerance requirements: avg 15-17ms
    ● Affecting all mongod instances in the cluster, not isolated to a single deployment, node, or shard
    ● What is happening during that 15-17ms?
    ● What configuration options can be adjusted that affect overall latency?
    ● Tools: mongostat, iostat, top, bash, plus SolidFire API calls
    ● Worked with customer DBA and Ops teams to gather metrics and implement configuration optimizations on the production system
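The kind of queue-depth evidence these tools surface can be pulled out with a short script. A minimal sketch that parses a captured `iostat -x` sample rather than live output; the device name and column layout here are simplified assumptions (older sysstat labels the queue column `avgqu-sz`, newer releases `aqu-sz`):

```shell
# Parse a captured `iostat -x` sample for the average request-queue size.
# The sample below uses a hypothetical, simplified column layout.
sample='Device   r/s   w/s   avgqu-sz  await
sdb      850   1200  112.4     17.2'

depth=$(printf '%s\n' "$sample" | awk 'NR==2 { print int($4) }')

# 32 is the per-session iSCSI queue limit cited later in the deck
if [ "$depth" -gt 32 ]; then
  echo "queue depth $depth exceeds per-session iSCSI limit of 32"
fi
```

Running a check like this across every mongod host quickly shows whether a latency problem is cluster-wide or confined to one node.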
  • 8. Latency Analysis and Contributor Isolation
    ● Identify any potential I/O queue depth bottlenecks
    ● Determine main contributors to end-to-end latency on the chain
    ● View the chain as an overall ‘latency loop’ and break down sections:
      MongoDB > OS > NIC > Network > Storage > Network > NIC > OS > MongoDB
    ● iostat showed I/O queue depths of 100+, while the iSCSI limit was 32
    ● Frames stalled at the OS layer while waiting to move down the chain
    ● Implemented MPIO to combine 4 iSCSI sessions, increasing queue capacity to 128
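A four-session MPIO setup of this kind is typically expressed in two standard config files. An illustrative sketch only, not a verified SolidFire template: the vendor/product strings below are placeholders that must be matched to the array's actual SCSI inquiry strings, and existing configs should be backed up first.

```shell
# Sketch: log in 4 iSCSI sessions per target and aggregate them with
# dm-multipath. Run as root on the database hosts.

# open-iscsi: raise sessions per target from the default of 1 to 4
sed -i 's/^#\?node.session.nr_sessions *= *.*/node.session.nr_sessions = 4/' \
    /etc/iscsi/iscsid.conf

# dm-multipath: spread I/O round-robin across all 4 sessions
cat >> /etc/multipath.conf <<'EOF'
devices {
    device {
        vendor               "SolidFir"      # placeholder vendor string
        product              "SSD SAN"       # placeholder product string
        path_grouping_policy multibus
        path_selector        "round-robin 0"
    }
}
EOF
```

With all four sessions grouped `multibus`, each session contributes its own command queue, which is how the effective depth grows from 32 to 128.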
  • 9. Latency Reduction After Implementing MPIO
  • 10. Optimizing the Stack to Reduce Remaining Latency
    ● Verified Linux best practices were being followed:
      ○ Adjust /etc/limits.conf
      ○ XFS agcount equal to number of CPU cores
      ○ Readahead value set very low (4k)
    ● Verified SolidFire best practices were being followed:
      ○ I/O scheduler set to [noop]
      ○ Volume mount options: nobarrier, noatime, nodiratime
      ○ MPIO in use for high-performance applications
      ○ IOPS values configured high enough for MongoDB needs
    ● If the OS and the storage system are optimal, and the network latency is static, what’s left?
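The checklist above maps to a handful of commands. A sketch that prints, rather than executes, the equivalents (device path and mount point are placeholders; `nobarrier` assumes an array with a protected write cache):

```shell
# Print (rather than run) the tuning commands implied by the checklist.
# /dev/sdb and /data/mongodb are placeholder names.
DEV=/dev/sdb

# Readahead set very low: 4 KiB = 8 x 512-byte sectors
SECTORS=$((4 * 1024 / 512))
echo "blockdev --setra $SECTORS $DEV"

# noop scheduler: on an SSD-backed volume, elevator reordering only adds latency
echo "echo noop > /sys/block/${DEV##*/}/queue/scheduler"

# XFS with one allocation group per CPU core
echo "mkfs.xfs -d agcount=$(nproc) $DEV"

# Mount options from the SolidFire checklist
echo "mount -o nobarrier,noatime,nodiratime $DEV /data/mongodb"
```

Dropping the `echo` wrappers applies the settings for real; the mkfs step is of course destructive and only belongs in initial provisioning.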
  • 11. Zeroing in on the Latency Source
    ● The latency loop from MongoDB, end-to-end, averaging 3.9ms:
      MongoDB > OS > NIC > Network > Storage > Network > NIC > OS > MongoDB
    ● The inner subset loop of the OS back to the OS averaging ~1.1ms:
      OS > NIC > Network > Storage > Network > NIC > OS
    ● The storage subset of the OS loop was averaging 0.7ms
    ● The application layer, MongoDB, was the main contributor at ~2.8ms
    ● Latency spikes at ~60s intervals
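The decomposition above is simple subtraction on nested loops; a quick sanity check of the numbers (all values in milliseconds, taken from this slide):

```shell
# Decompose the latency loop using the measured averages (milliseconds).
awk 'BEGIN {
    end_to_end = 3.9   # MongoDB -> ... -> MongoDB
    os_loop    = 1.1   # OS -> NIC -> network -> storage -> back to OS
    storage    = 0.7   # storage portion of the OS loop

    printf "application layer: %.1f ms\n", end_to_end - os_loop   # MongoDB itself
    printf "network + NIC:     %.1f ms\n", os_loop - storage
    printf "storage:           %.1f ms\n", storage
}'
```

Subtracting each inner loop from the one enclosing it attributes 2.8ms to MongoDB, 0.4ms to NIC and network transit, and 0.7ms to storage, which is what points the investigation at the application layer.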
  • 12. Fine-tuning to Reduce Latency
    ● I/O queue depth showing periodic elevations to 100+
    ● MongoDB flushes pages to disk, by default, every 60s
    ● Latency spikes lasting ~5-6 seconds
    ● Tuning syncdelay to 10s lowered the flushing queue depth, reducing latency
    ● Final end-to-end MongoDB response times consistently 1.2ms
    ● 4-5k IOPS per mongod, no measurable increase in extraneous I/O
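syncdelay can be set at startup or changed on a running instance. A sketch assuming the era's tooling (the dbpath is a placeholder, and as a later slide warns, this tuning should be evaluated per deployment):

```shell
# Start mongod with a 10-second flush interval instead of the 60s default
# (/data/mongodb is a placeholder dbpath):
mongod --syncdelay 10 --dbpath /data/mongodb

# Or adjust a running instance without a restart, via the mongo shell:
mongo --eval 'db.adminCommand({ setParameter: 1, syncdelay: 10 })'
```

Flushing six times as often writes roughly a sixth of the dirty pages each time, which is why the per-flush queue depth, and with it the latency spikes, shrinks.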
  • 13. DBA and Ops Teams VERY HAPPY
  • 14. Avoid Gotchas
    ● Be sure to use the optimal I/O scheduler for your storage subsystem
    ● Ensure that the correct mount options are used to avoid I/O bottlenecks
    ● For high-volume databases, align queue depth maximums (multipath)
    ● Provision adequate IOPS and burst capacity per volume for throughput
    ● Traverse the entire stack to mitigate latency; don’t over-focus on one area
    ● Consider syncdelay tuning, but evaluate this on a per-deployment basis
    ● Depending on storage characteristics, journal splitting may be necessary
    ● DO NOT RUN OUT OF IOPS. THIS KILLS THE DB CLUSTER.
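On SolidFire, per-volume IOPS and burst capacity are adjustable online through the cluster's JSON-RPC API. A hypothetical sketch only: the endpoint path, API version, credentials, volume ID, and QoS values below are all placeholders to verify against your Element API documentation.

```shell
# Raise a volume's QoS limits via the SolidFire (Element) JSON-RPC API.
# All identifiers and values here are illustrative placeholders.
curl -sk -u admin:password \
  -H 'Content-Type: application/json' \
  -d '{"method": "ModifyVolume",
       "params": {"volumeID": 42,
                  "qos": {"minIOPS": 5000,
                          "maxIOPS": 15000,
                          "burstIOPS": 20000}},
       "id": 1}' \
  https://cluster-mvip/json-rpc/7.0
```

Being able to raise a volume's ceiling without downtime is the practical answer to the final gotcha: headroom can be added before the cluster starves.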
  • 15. Takeaways
    New Approach = Amazing Results
    ● MongoDB has changed the database / storage relationship
    ● Great results can happen with database / storage architecture alignment
    The Path There
    ● Horizontal scaling in storage is critical for optimal throughput and latency
    ● Understand your options for controlling performance before designing
    ● Have a plan for scaling both capacity and performance
  • 16. 1620 Pearl Street, Boulder, Colorado 80302 Phone: 720.523.3278 Email: