Migrating from Relational Databases to MongoDB


The relational database has been the foundation of enterprise data management for over thirty years. But the way we build and run applications today, coupled with growth in data sources and user loads, is pushing relational databases beyond their limits. To address these challenges, companies like Salesforce.com, MTV and Cisco have migrated successfully from relational databases to MongoDB.

This presentation provides a step-by-step guide for project teams that want to know how to migrate from an RDBMS to MongoDB. It also explains the considerations for teams that come from relational database backgrounds and want to build new applications on MongoDB.

Download the whitepaper to learn more best practices:

Published in: Technology

  • Welcome to the Relational DB to MongoDB Migration webinar. My name is Mat Keep, and I will take you through the considerations and best practices for database migration. To set the context: because of challenges in the way we build and run applications today, a number of users are starting to look beyond relational databases to NoSQL as they evolve their applications, and some are beyond looking and have actually done it. In this session we discuss the considerations you need to make as you explore a potential migration from an RDBMS to MongoDB. Bryan Reinero is also on the line to answer your questions; place them in the webinar tool.
  • Start by talking about developing a roadmap for migration, then get into each stage of the migration. Not going to go massively deep on each topic, but rather look at the key considerations you need to make as you evaluate a migration.
  • Before that, look at some of the drivers for migration. Looking across the enterprise, we see four core strategic initiatives. Organizations are concerned with enabling new apps and enhancing existing apps, specifically to drive new revenue streams, enhance operational processes and deliver better customer experiences. These updated applications involve handling rapid growth in data from new channels: mobile and social apps, sensor networks and so on. At the same time, they need to deliver application enhancements with faster time to market and lower ongoing TCO.
  • Many of these requirements are causing RDBMSs to hit their design limits. Much of the data being generated from new web, social or mobile apps is highly complex, very often semi-structured or unstructured, and difficult to map into 2D structures of rows and columns; we will see what that data looks like later. Data volumes and the diversity of data are increasing. Development methodologies are changing, from traditional waterfall approaches, where requirements are defined at the very start of the project, to agile approaches supporting iterative development and shorter release cycles. New architectures are emerging for scaling out in the cloud, on on-prem or public infrastructure. All of these requirements place massive strain on existing relational infrastructure.
  • Because of these challenges, migration is a well-trodden path. See the range of examples from web, media, telecoms and mobile, across a range of apps: ecommerce, social networking, real-time analytics, billing. Working with many of these organizations has given us insight into the considerations we are discussing here.
  • So, how do we get started?
  • High-level roadmap: divided into 5 stages, with key steps in each stage. It starts with formation of the project team, ensuring we have all key stakeholders on board; successful projects bring together LOB, devs, DBAs and operational staff. Then we get to the most important area, schema design: capture app requirements, model our data, define indexes. App integration: the drivers and API we are using, and models for consistency and durability; also at this stage, start load testing. Physical migration of data: multiple options for doing that. Operational considerations: have strategies in place for all the things listed. To support you on your way, we have free online MongoDB training for devs and DBAs, which has already had over 100k registrants. Consulting and support packages are also available.
  • The first and most critical phase of the migration is schema design, so spend a little bit of time on this. Also point you to a very recent webinar, available on demand, which explores this topic in much more depth.
  • Before getting into schema design, let's start by standardising terminology between the relational and MongoDB worlds.
  • Let's cover conceptually the differences between the relational model and the document model. This requires a change in perspective for data architects: from the relational data model, which flattens data into rigid 2-dimensional tabular structures of rows and columns, to a document data model that allows embedding of sub-documents and arrays. In this example of a car ownership database with 2 tables, 1 for persons and another for their cars, the relational model uses the Pers_ID field to JOIN the “Person” table with the “Car” table so that the application can report the owners of each car. Modeling the same data in MongoDB enables us to create a schema where we embed each of the cars as an array of sub-documents directly within the Person document.
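As a rough illustration of the car ownership example, the sketch below folds relational-style rows into one document per person with an embedded `cars` array. It is plain Python over in-memory rows, not MongoDB code; the helper name `embed_cars`, the `car_id` column and the sample rows are assumptions made for illustration (the field values borrow from the slide).

```python
# Relational-style rows: cars point at their owner through pers_id.
persons = [
    {"pers_id": 1, "first_name": "Paul", "surname": "Miller", "city": "London"},
]
cars = [
    {"car_id": 101, "model": "Bentley", "year": 1973, "pers_id": 1},
    {"car_id": 102, "model": "Rolls Royce", "year": 1965, "pers_id": 1},
]


def embed_cars(persons, cars):
    """Return one document per person with that person's cars embedded."""
    docs = []
    for person in persons:
        doc = {k: v for k, v in person.items() if k != "pers_id"}
        doc["_id"] = person["pers_id"]
        # The JOIN on pers_id is resolved once, at modeling time.
        doc["cars"] = [
            {"model": c["model"], "year": c["year"]}
            for c in cars
            if c["pers_id"] == person["pers_id"]
        ]
        docs.append(doc)
    return docs


person_docs = embed_cars(persons, cars)
```

Each resulting document can then be read back as a single unit, with no join needed to list a person's cars.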
  • So what are the advantages of this approach?
  • A significant advantage of NoSQL databases is that the schema is dynamic. Collections can be created without first defining the structure of documents. Documents in a collection need not have an identical set of fields. The structure of documents can be changed by adding new fields or deleting existing ones. This enables the schema to be evolved much more rapidly. You have a lot of flexibility during the migration phase to quickly iterate over schema design, and that flexibility carries on to production.
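To make the dynamic schema point concrete, here is a minimal sketch in which a Python list of dicts stands in for a collection. Nothing here is a MongoDB API; the variable names and the second sample record are assumptions for illustration.

```python
# A "collection" modelled as a plain list of dicts (not a MongoDB API).
employees = [
    {"_id": 1, "employee_name": "Dunham, Justin", "department": "Marketing"},
    # A later document carries a different set of fields.
    {"_id": 2, "employee_name": "Neray, Graham",
     "benefits": [{"type": "Health", "plan": "PPO Plus"}]},
]

# Evolve one document in place: add a field, remove another.
employees[0]["title"] = "Product Manager, Web"
employees[0].pop("department")

# Readers tolerate missing fields instead of relying on fixed columns.
benefit_counts = {e["_id"]: len(e.get("benefits", [])) for e in employees}
```

The trade-off is that the application, rather than the database, has to cope with documents that lack a field.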
  • Let's continue the comparison between the relational and document models, and consider the example of a blogging platform. We've got 5 tables (category, article, user, comments and tags), and the application relies on the RDBMS to join five separate tables in order to build the blog entry.
  • In the case of MongoDB, all of the blog data is aggregated within a single document, linked with a single reference to a user document containing the authors of both the blog and the comments. From a performance and scalability perspective, the aggregated document can be accessed in a single call to the database, rather than having to JOIN multiple tables to respond to a query.
  • How do we go about defining the data model? The golden rule of schema design is that the data access patterns of the application should govern the design process. Look specifically at: the read/write ratio of database operations; the types of queries and updates performed by the database; the growth rate of documents. As a first step, capture the operations the application needs to perform on its data, using the example of a product catalog. For illustration we show the process of creating the product record, retrieving the record and adding a review. We show the steps the relational database takes for each of those tasks and look at how that same functionality is handled by MongoDB, which helps inform the schema design for our new database. Another important reference point is the logs of the current database, which identify the most common queries and what data is accessed together. These steps will help to identify the ideal document schema for the application’s data, based on the queries and operations to be performed against it.
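One way to mine an existing query log for access patterns is to collapse each statement to its "shape" and count the shapes. The sketch below assumes the log is already available as a list of SQL strings; the sample statements and the `query_shape` helper are hypothetical, and a real log would need a proper parser.

```python
import re
from collections import Counter

# Hypothetical sample of statements pulled from an RDBMS query log.
log = [
    "SELECT * FROM product WHERE id = 42",
    "SELECT * FROM product WHERE id = 7",
    "INSERT INTO review (product_id, stars) VALUES (42, 5)",
    "SELECT * FROM product WHERE id = 19",
]


def query_shape(sql):
    """Replace literals with '?' so identical query shapes group together."""
    sql = re.sub(r"'[^']*'", "?", sql)   # quoted strings
    return re.sub(r"\b\d+\b", "?", sql)  # bare numbers


shapes = Counter(query_shape(q) for q in log)
reads = sum(n for s, n in shapes.items() if s.startswith("SELECT"))
writes = sum(shapes.values()) - reads
most_common_shape, hits = shapes.most_common(1)[0]
```

The read/write split and the dominant shapes then feed directly into the embedding and indexing decisions.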
  • An important design decision is when to embed related data in a document and when to create a reference between separate documents. This is an application-specific consideration, but there are some general considerations that can guide the decision during schema design. Embedding data via sub-documents or arrays makes sense for 1:1 or 1:Many relationships (where the “many” are viewed with the parent). Consider the concept of ownership and containment. Using the product data example from earlier, product pricing should be embedded within the product document, as it is owned by and contained within the product; if the product is deleted, the pricing becomes redundant. Also consider document growth: MongoDB's current maximum document size is 16MB. Referencing gives a greater degree of data normalization and can give more flexibility than embedding; however, to resolve the reference the application will issue follow-up queries, requiring additional round trips to the server. References are generally implemented by saving the _id field of one document in the related document. The application then runs a second query to return the referenced data. Referencing is generally used: when embedding would not provide sufficient read performance advantages to outweigh the implications of data duplication (an example follows); to represent more complex many-to-many relationships; to model large hierarchical data sets.
  • A good example of referencing is where we have a 1:Many relationship, in this case 1 publisher to many books, as we show on this chart. Embedding books inside the publisher creates rapidly growing docs, because new books are continually being added. Embedding a publisher sub-document inside each book causes lots of data duplication. So instead we create a reference using the publisher ID, in this case on O’Reilly.
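The publisher and book reference can be sketched as two lookups. Plain dicts stand in for the two collections here, so this shows only the shape of the application logic; the function name `book_with_publisher` and the integer book `_id` are assumptions for illustration.

```python
# Two "collections" keyed by _id; plain dicts stand in for MongoDB here.
publishers = {
    "oreilly": {"_id": "oreilly", "name": "O'Reilly Media", "location": "CA"},
}
books = {
    1: {"_id": 1, "title": "MongoDB: The Definitive Guide",
        "publisher_id": "oreilly"},
}


def book_with_publisher(book_id):
    """Resolve the reference with a second lookup, as the application would."""
    book = books[book_id]                          # first query
    publisher = publishers[book["publisher_id"]]   # follow-up query
    return {**book, "publisher": publisher}


resolved = book_with_publisher(1)
```

The publisher details live in exactly one place, at the cost of the extra round trip noted above.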
  • During schema design, think about how we query our data. Some NoSQL databases are little more than key/value stores, so you may be able to ingest data quickly but you can’t do anything other than primary key lookups, a huge backward step coming from the relational world. MongoDB, on the other hand, has a rich query model enabled by extensive indexing. Secondary indexes can be defined for any key or array within the document. MongoDB indexing will be familiar to DBAs: B-Tree indexes, secondary indexes. As with a relational DB, indexes are the single biggest tunable performance factor: define indexes by identifying common queries; use MongoDB explain to ensure index coverage; use the MongoDB profiler to log all slow queries. The index types are listed on the slide and include text search and geospatial. Array indexes allow you to index each element of an embedded array. For example, in a document describing a product, each of the categories that the product can be classified under can be included in an array and indexed, so you get a major performance boost when users are searching by those classifications. This sort of flexibility gives MongoDB the ability to run complex queries quickly.
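The idea behind an array index can be shown with a small inverted index: every element of the array becomes its own index entry pointing back at the document. This is a conceptual sketch only. It uses a Python dict rather than the B-tree structure mentioned above, and the product data and the `build_array_index` helper are hypothetical.

```python
from collections import defaultdict

products = [
    {"_id": 1, "name": "Trail Shoe", "categories": ["footwear", "outdoor"]},
    {"_id": 2, "name": "Tent", "categories": ["outdoor", "camping"]},
]


def build_array_index(docs, field):
    """Map each array element to the _ids of the documents containing it."""
    index = defaultdict(set)
    for doc in docs:
        for element in doc.get(field, []):
            index[element].add(doc["_id"])
    return index


category_index = build_array_index(products, "categories")
outdoor_ids = sorted(category_index["outdoor"])
```

A search by category becomes a single index lookup instead of a scan over every product's array.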
  • Move on to app integration. Discuss drivers, application query requirements, and how MongoDB can be configured to meet the data integrity and consistency requirements of the app.
  • MongoDB has idiomatic drivers for the most popular languages, with over a dozen developed and supported by MongoDB and 30+ community-supported drivers. The MongoDB API is implemented as methods within the API of a specific programming language, as opposed to a completely separate language like SQL. If we couple this with MongoDB’s document model and its affinity with the data structures used in object-oriented programming, integration with applications becomes very simple. The MongoDB API is becoming a standard: IBM recently announced its collaboration in developing standards for mobile app development with the API in their own DB2 database.
  • For developers familiar with SQL, it is useful to understand how core SQL statements map to the MongoDB API. There is an example on screen. The documentation includes a mapping chart to help in that transition.
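To give a feel for the mapping, the sketch below pairs one SQL statement with the equivalent query document and evaluates that query document against in-memory data. The `matches` function covers only equality, `$gt` and `$lt`; it is a toy evaluator written for this illustration, not how the server or any driver works, and the `users` data is made up.

```python
# SQL:      SELECT * FROM users WHERE status = 'A' AND age > 25
# MongoDB:  db.users.find({ status: "A", age: { $gt: 25 } })
query = {"status": "A", "age": {"$gt": 25}}

OPERATORS = {
    "$gt": lambda value, bound: value > bound,
    "$lt": lambda value, bound: value < bound,
}


def matches(doc, query):
    """Evaluate a small subset of the query-document syntax against one doc."""
    for field, condition in query.items():
        if field not in doc:
            return False
        if isinstance(condition, dict):
            if not all(OPERATORS[op](doc[field], arg)
                       for op, arg in condition.items()):
                return False
        elif doc[field] != condition:
            return False
    return True


users = [
    {"name": "ann", "status": "A", "age": 31},
    {"name": "bob", "status": "A", "age": 22},
    {"name": "cat", "status": "D", "age": 40},
]
selected = [u["name"] for u in users if matches(u, query)]
```

The point of the comparison is that the WHERE clause becomes a data structure in the host language rather than a string in a separate language.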
  • It is important to understand you’re not losing query functionality when migrating to MongoDB. Beyond the powerful API and secondary indexes, MongoDB also offers extensive capabilities for analysis of data, both within the database and through integration with other tooling and frameworks. MongoDB provides the Aggregation Framework, which delivers similar functionality to the SQL GROUP BY clause. It gives you ad-hoc reporting, grouping and aggregations (max, min, averages, sum) without the complexity of MapReduce. It supports aggregations on single servers and across shards.
  • The SQL to Aggregation Mapping Chart in our docs shows a number of examples of how queries in SQL are handled in MongoDB’s aggregation framework. It is very simple to get started with this functionality.
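As one example of the kind of mapping the chart describes, the sketch below shows a GROUP BY query, the corresponding single-stage pipeline written as a Python list, and a plain-Python function that computes the same result over in-memory data. The `orders` data and the `group_sum` helper are assumptions for illustration; the pipeline itself is not executed here.

```python
from collections import defaultdict

# SQL:      SELECT cust_id, SUM(price) AS total FROM orders GROUP BY cust_id
# MongoDB:  db.orders.aggregate([ { $group: { _id: "$cust_id",
#                                             total: { $sum: "$price" } } } ])
pipeline = [{"$group": {"_id": "$cust_id", "total": {"$sum": "$price"}}}]

orders = [
    {"cust_id": "A", "price": 25},
    {"cust_id": "B", "price": 10},
    {"cust_id": "A", "price": 5},
]


def group_sum(docs, key, value):
    """Plain-Python equivalent of the single $group/$sum stage above."""
    totals = defaultdict(int)
    for doc in docs:
        totals[doc[key]] += doc[value]
    return [{"_id": k, "total": v} for k, v in sorted(totals.items())]


result = group_sum(orders, "cust_id", "price")
```

In the pipeline, the grouping key moves into `_id` and each aggregate becomes a named field with an accumulator operator.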
  • Move on to even more advanced analytics. MongoDB provides native support for MapReduce operations over both sharded and unsharded collections. MapReduce can give developers finer-grained control over grouping keys and a variety of output options. MongoDB Connector for Hadoop: read slide.
  • In terms of reporting, a number of Business Intelligence (BI) vendors have developed connectors to integrate MongoDB as a data source with their suites, alongside traditional relational databases. This integration provides reporting, visualisations and dashboarding of MongoDB data.
  • RDBMSs offer strong data integrity and consistency controls, so look at how you maintain those things in MongoDB; this is a key part of application integration. Start with data integrity. All write operations are atomic at a document level within MongoDB, including the updating of embedded arrays and sub-documents. By embedding related fields within a single document, users get the same integrity guarantees as a traditional RDBMS without the performance overhead of having to synchronize ACID operations and maintain referential integrity across separate tables. If you need multi-document updates, you can get transaction-like semantics using findAndModify or, in some cases, two-phase commit (2PC). The findAndModify command allows a document to be updated atomically and returned in the same round trip. This allows users to build job queues and state machines, which can be used to implement basic transaction semantics.
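The job queue pattern can be sketched without a database: the essential property is that "find a queued job" and "mark it running" happen as one indivisible step, so two workers can never claim the same job. In the sketch below a `threading.Lock` stands in for the server's per-document atomicity; the `jobs` list, the field names and `claim_next_job` are assumptions for illustration.

```python
import threading

jobs = [
    {"_id": 1, "state": "queued", "owner": None},
    {"_id": 2, "state": "queued", "owner": None},
]
_lock = threading.Lock()  # stands in for the server's per-document atomicity


def claim_next_job(worker):
    """Find one queued job, mark it running, and return it in a single step."""
    with _lock:
        for job in jobs:
            if job["state"] == "queued":
                job["state"] = "running"
                job["owner"] = worker
                return dict(job)  # the modified document
    return None


first = claim_next_job("w1")
second = claim_next_job("w2")
third = claim_next_job("w3")
```

With findAndModify, the query would select on the queued state and the update would set the running state, with the modified document returned in the same call.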
  • MongoDB is strongly consistent by default, with all reads directed to primary servers. Any reads from secondary servers within a MongoDB replica set will be eventually consistent, much like legacy master/slave replication in a relational database. Administrators can configure the secondary replicas to handle read traffic if needed, e.g. for long-running reports or geographic distribution of replicas across time zones.
  • MongoDB ensures durability through write concerns, which configure the level of guarantee the server provides in response to a write operation. By default, MongoDB drivers wait for the primary to acknowledge the write before returning to the application. This is configurable on a per-operation basis: you might wait for the write to be acknowledged by the primary plus 1 replica, by a majority of replicas or by all replicas, with a performance impact in doing that. Alternatively, with a relaxed write concern, the application can send a write operation to MongoDB and then continue without waiting for a response from the database, which is useful for logging apps.
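The levels described above can be written as write-concern documents, where `w` is the number of replica set members that must acknowledge the write (or the string `"majority"`). The `is_satisfied` function below is a simplified model of when a write would be reported as acknowledged; it is an assumption made for illustration and ignores timeouts and journaling options.

```python
# Write-concern documents; "w" is the number of replica set members
# (or "majority") that must acknowledge the write.
UNACKNOWLEDGED = {"w": 0}
PRIMARY_ONLY = {"w": 1}
PRIMARY_PLUS_ONE = {"w": 2}
MAJORITY = {"w": "majority"}


def is_satisfied(write_concern, acks, members):
    """Return True once enough members have acknowledged the write."""
    w = write_concern["w"]
    if w == "majority":
        return acks >= members // 2 + 1
    return acks >= w


# Three-member replica set; the primary and one secondary have the write.
status = {
    name: is_satisfied(wc, acks=2, members=3)
    for name, wc in [("w0", UNACKNOWLEDGED), ("w1", PRIMARY_ONLY),
                     ("w2", PRIMARY_PLUS_ONE), ("majority", MAJORITY)]
}
all_three = is_satisfied({"w": 3}, acks=2, members=3)
```

Stricter settings wait on more members, which is where the performance impact comes from.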
  • In addition, MongoDB uses write-ahead logging (WAL) to a journal to guarantee write durability and crash resistance.
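Write-ahead logging in general can be illustrated with a toy key-value store that records each write in a journal before applying it, then rebuilds its state by replaying that journal. This is a generic sketch of the technique, not MongoDB's journal format or recovery procedure; the `JournaledStore` class is hypothetical and keeps its "journal" in memory.

```python
class JournaledStore:
    """Toy key-value store that logs each write before applying it."""

    def __init__(self, journal=None):
        self.journal = list(journal or [])  # stands in for the on-disk journal
        self.data = {}

    def write(self, key, value):
        self.journal.append((key, value))  # 1. record the intent first
        self.data[key] = value             # 2. then apply it

    @classmethod
    def recover(cls, journal):
        """Rebuild state after a crash by replaying the journal in order."""
        store = cls(journal)
        for key, value in store.journal:
            store.data[key] = value
        return store


store = JournaledStore()
store.write("a", 1)
store.write("a", 2)
store.write("b", 3)

# Simulate a crash: in-memory data is lost, the journal survives.
recovered = JournaledStore.recover(store.journal)
```

Because the journal entry is written first, any write that was acknowledged can be reconstructed after a restart.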
  • Start with how to migrate data from the relational database to MongoDB. There are multiple options, and we have tried to show them on this one chart. One of the most common is users writing scripts that extract data in a JSON format, matching the desired schema with embedding and arrays, and then using mongoimport to read it directly into the database. ETL tools are also commonly used. A number of migrations have involved running the existing RDBMS in parallel with the new MongoDB database, incrementally transferring production data: as records are retrieved from the RDBMS, the application writes them back out to MongoDB in the required document schema; consistency checkers, for example using MD5 checksums, can be used to validate the migrated data; all newly created or updated data is written to MongoDB only. Shutterfly used this incremental approach to migrate the metadata of 6 billion images and 20TB of data from Oracle to MongoDB.
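A minimal sketch of the script-based option follows. An in-memory SQLite database stands in for the source RDBMS, and the table names, columns and helper functions are assumptions for illustration. The script emits one JSON document per line, which is a format mongoimport can read, and computes an MD5 checksum per document that a consistency checker could compare against the migrated copy.

```python
import hashlib
import json
import sqlite3

# In-memory SQLite stands in for the source RDBMS.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE person (pers_id INTEGER PRIMARY KEY, surname TEXT);
    CREATE TABLE car (car_id INTEGER PRIMARY KEY, model TEXT, pers_id INTEGER);
    INSERT INTO person VALUES (1, 'Miller');
    INSERT INTO car VALUES (101, 'Bentley', 1), (102, 'Rolls Royce', 1);
""")


def export_documents(conn):
    """Yield one JSON-ready document per person, with cars embedded."""
    for pers_id, surname in conn.execute(
            "SELECT pers_id, surname FROM person ORDER BY pers_id"):
        cars = [{"model": model} for (model,) in conn.execute(
            "SELECT model FROM car WHERE pers_id = ? ORDER BY car_id",
            (pers_id,))]
        yield {"_id": pers_id, "surname": surname, "cars": cars}


def checksum(doc):
    """MD5 over a canonical JSON form, for source/target comparison."""
    canonical = json.dumps(doc, sort_keys=True).encode("utf-8")
    return hashlib.md5(canonical).hexdigest()


docs = list(export_documents(db))
# One JSON document per line, ready to be written to a file for import.
export_lines = [json.dumps(d) for d in docs]
checksums = {d["_id"]: checksum(d) for d in docs}
```

Sorting the keys before hashing keeps the checksum stable even if the target returns fields in a different order.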
  • The considerations discussed so far fall into the domain of the data architects, developers and DBAs. We also need to ensure the database performs reliably at scale and can be proactively monitored and managed, so the final set of considerations in migration planning should focus on operational issues. The MongoDB Operations Best Practices guide is the definitive reference to learn more on this key area. Will touch on a couple of areas.
  • MongoDB has a range of management and monitoring tools, such as mongostat, mongotop and the database profiler, which can be used to profile key operational metrics. One I want to highlight is MMS, a cloud-based suite of services for managing MongoDB deployments: monitoring, with charts, dashboards and alerts on 100+ metrics; backup and restore, with point-in-time recovery and support for sharded clusters. MMS On-Prem is included with MongoDB Enterprise (backup coming soon).
  • Look at HA with replica sets, touched on briefly earlier. MongoDB maintains multiple copies of data in what we call replica sets. Failover between replicas is fully automated, eliminating the need for administrators to intervene manually. There is multi-data center support. As well as handling failures, replica sets can also be used to run maintenance operations using rolling restarts to avoid downtime.
  • MongoDB provides horizontal scale-out for databases using a technique called sharding, which is transparent to applications. Sharding distributes data across multiple physical partitions, which allows us to scale beyond the hardware limitations of a single server without adding complexity to the application. Data is automatically balanced across the shards to ensure even distribution. You need to ensure you pick the right shard key.
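Why the shard key matters can be seen by comparing two simple routing schemes. The sketch below is a toy model, not MongoDB's chunk management or hash function: the shard names, the range boundaries and both routing functions are assumptions for illustration.

```python
import bisect
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]


def hash_shard(key):
    """Hash-based: spread keys evenly, at the cost of range locality."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]


# Range-based: each shard owns a contiguous slice of the key space.
# Upper bounds (exclusive) for shard0 and shard1; shard2 takes the rest.
RANGE_BOUNDS = ["h", "q"]


def range_shard(key):
    """Range-based: nearby keys land together, which suits range scans."""
    return SHARDS[bisect.bisect_right(RANGE_BOUNDS, key)]


placement = {name: range_shard(name) for name in ["adams", "miller", "young"]}
```

A key whose values cluster, such as a monotonically increasing one under range routing, would send most writes to a single shard, which is the kind of outcome a good shard key avoids.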
  • That was a very quick view of operations; download the Ops Guide to learn more. Summary.
  • Also look at how we enable our migration teams to be successful. Training is a key way to do this, via MongoDB University. Specific courses are available for both developers and DBAs, with options in how the courses are consumed: free, web-based classes, delivered over a 7-week period, supported by lectures, homework and forums to engage with instructors and other students (over 100k enrollments); 2-day public training events held at MongoDB facilities; private training customised to an organization’s specific requirements, delivered at their site.
  • SLAs – as low as 30 minutes
  • MongoDB-enabled applications include: Dealer Analytics. This mobile and wired app built on MongoDB provides dealers with real-time insight into how users are interacting with inventory on Edmunds.com. Ad Sales Management. A MongoDB-based web app provides a consolidated view on inventory of Edmunds ad placements and helps ensure clients are accurately billed for ads served. Ratings / Reviews. MongoDB’s document database was a good choice for replacing Edmunds’ CMS as a means of data storage for this domain as the domain itself fits very naturally into a document model. User Registration. MongoDB replaced a proprietary system to help Edmunds.com manage user data.
  • Core features of MongoDB are well aligned to organisations looking to evolve their applications to support new workloads; we have been exploring some of these as part of the migration process.
  • Migrating from Relational Databases to MongoDB

    1. RDBMS to MongoDB Migration: Considerations and Best Practices. Mat Keep, MongoDB Product Marketing, mat.keep@mongodb.com, @matkeep
    2. Agenda • Migration Roadmap • Schema Design • Application Integration • Data Migration • Operational Considerations • Resources to Get Started
    3. Strategic Priorities • Enabling New & Enhancing Existing Apps • Faster Time to Market • Better Customer Experience • Lower TCO
    4. Hitting RDBMS Limits. Data Types: • Unstructured data • Semi-structured data • Polymorphic data. Agile Development: • Iterative • Short development cycles • New workloads. Volume of Data: • Petabytes of data • Trillions of records • Millions of queries per second. New Architectures: • Horizontal scaling • Commodity servers • Cloud computing
    5. Migration: Proven Benefits (Organization | Migrated From | Application): edmunds.com | Oracle | Billing, online advertising, user data; Cisco | Multiple RDBMS | Analytics, social networking; Craigslist | MySQL | Content management; Salesforce Marketing Cloud | RDBMS | Social marketing, analytics; Foursquare | PostgreSQL | Social, mobile networking platforms; MTV Networks | Multiple RDBMS | Centralized content management; Orange Digital | MySQL | Content management. http://www.mongodb.com/customers
    6. Migration Steps
    7. Migration Roadmap • Backed by Free, Online MongoDB Training • 100k+ registrations to date • Consulting and Support also available
    8. Schema Design. On-Demand Webinar: From Relational to MongoDB – What you Need to Know. http://www.mongodb.com/presentations/webinar-relationaldatabases-mongodb-what-you-need-know-0
    9. Definitions (RDBMS → MongoDB): Database → Database; Table → Collection; Row → Document; Index → Index; JOIN → Embedded Document or Reference
    10. Data Models: Relational to Document. Relational: [tables not captured in transcript]. MongoDB Document: { first_name: "Paul", surname: "Miller", city: "London", location: [45.123,47.232], cars: [ { model: "Bentley", year: 1973, value: 100000, … }, { model: "Rolls Royce", year: 1965, value: 330000, … } ] }
    11. Document Model Benefits • Rich data model, natural data representation: embed related data in sub-documents & arrays; support indexes and rich queries against any element • Data aggregated to a single structure (pre-JOINed): programming becomes simple; performance can be delivered at scale • Dynamic schema: data models can evolve easily; adapt to changes quickly (agile methodology)
    12. The Power of Dynamic Schema. RDBMS: [table not captured in transcript]. MongoDB: { _id : ObjectId("4c4ba5e5e8aabf3"), employee_name: "Dunham, Justin", department : "Marketing", title : "Product Manager, Web", report_up: "Neray, Graham", pay_band: "C", benefits : [ { type : "Health", plan : "PPO Plus" }, { type : "Dental", plan : "Standard" } ] }
    13. RDBMS: Blogging Platform. JOIN 5 tables
    14. MongoDB: Denormalized to 2 BSON Documents. Higher Performance: Data Locality
    15. Defining the Data Model (Application | RDBMS Action | MongoDB Action): Create Product Record | INSERT to (n) tables (product description, price, manufacturer, etc.) | insert() to 1 document with sub-documents, arrays; Display Product Record | SELECT and JOIN (n) product tables | find() aggregated document; Add Product Review | INSERT to “review” table, foreign key to product record | insert() to “review” collection, reference to product document; More actions… | … | … • Analyze data access patterns of the application: identify data that is accessed together, model within a document • Identify most common queries from logs
    16. Modeling Relationships: Embedding and Referencing • Embedding: for 1:1 or 1:Many (where “many” viewed with the parent); ownership and containment; document limit of 16MB, consider document growth • Referencing: _id field is referenced in the related document; application runs 2nd query to retrieve the data; data duplication vs performance gain; object referenced by many different sources; models complex Many:Many & hierarchical structures
    17. Referencing Publisher ID in Book. publisher = { _id: "oreilly", name: "O'Reilly Media", founded: "1980", location: "CA" } book = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", publisher_id: "oreilly" }
    18. Indexing in MongoDB • MongoDB indexing will be familiar to DBAs: B-Tree indexes, secondary indexes • Single biggest tunable performance factor: define indexes by identifying common queries; use MongoDB explain to ensure index coverage; MongoDB profiler logs all slow queries • Index types: Compound, Geospatial, Unique, Hash, Array, Sparse, TTL, Text Search
    19. Application Integration
    20. MongoDB Drivers and API • Drivers for most popular programming languages and frameworks: Java, Ruby, JavaScript, Perl, Python, Haskell • Implemented as methods within API of the language, not a separate language like SQL • IBM: MongoDB API selected as standard for mobile app development
    21. Mapping the MongoDB API to SQL. Mapping Chart: http://docs.mongodb.org/manual/reference/sql-comparison/
    22. Application Integration: MongoDB Aggregation Framework • Ad-hoc reporting, grouping and aggregations, without the complexity of MapReduce: Max, Min, Averages, Sum • Similar functionality to SQL GROUP BY • Processes a stream of documents: original input is a collection; final output is a result document • Series of operators: filter or transform data; input/output chain • Supports single servers & shards
    23. SQL to Aggregation Mapping. Mapping Chart: http://docs.mongodb.org/manual/reference/sql-aggregation-comparison/
    24. Application Integration: Advanced Analytics • Native MapReduce in MongoDB: enables more complex analysis than Aggregation Framework • MongoDB Connector for Hadoop: integrates real time data from MongoDB with Hadoop; reads and writes directly from MongoDB, avoiding copying TBs of data across the network; support for SQL-like queries from Apache Hive; support for MapReduce, Pig, Hadoop Streaming, Flume
    25. BI Integration: Dashboards, Reports, Reporting, Visualizations
    26. Document Level Atomicity. { first_name: "Paul", surname: "Miller", city: "London", location: [45.123,47.232], cars: [ { model: "Bentley", year: 1973, value: 100000, … }, { model: "Rolls Royce", year: 1965, value: 330000, … } ] } • “All or Nothing” updates • Extends to embedded documents and arrays • Consistent view to application • Transaction-like semantics for multi-doc updates with findAndModify() or 2PC
    27. Maintaining Strong Consistency • By default, all reads and writes sent to Primary • Reads to secondary replicas will be eventually consistent • Scale by sharding • Read Preferences control how reads are routed
    28. Data Durability: Write Concerns • Configurable per operation • Default is ACK by primary
    29. Data Durability: Journaling • Guarantees write durability & crash resistance: all operations written to journal before being applied to the database (WAL); configure writes to wait until committed to journal before ACK to application; replay journal after a server crash • Operations committed in groups, at configurable intervals: 2ms to 300ms
    30. Migration and Operations
    31. Data Migration
    32. Operations • Monitoring, Management and Backup • High Availability • Scalability • Hardware selection: commodity servers; prioritize RAM, fast CPUs & SSD • Security: access control, authentication, encryption. Download the Whitepaper: MongoDB Operations Best Practices
    33. MongoDB Management Service: cloud-based suite of services for managing MongoDB deployments • Monitoring, with charts, dashboards and alerts on 100+ metrics • Backup and restore, with point-in-time recovery, support for sharded clusters • MMS On-Prem included with MongoDB Enterprise (backup coming soon)
    34. High Availability: Replica Sets • Automated replication and failover • Multi-data center support • Improved operational simplicity (e.g., HW swaps) • Maintenance & Disaster Recovery
    35. Scalability: Auto-Sharding • Three types of sharding: hash-based, range-based, tag-aware • Application transparent • Increase or decrease capacity as you go • Automatic balancing
    36. Summary and Getting Started
    37. Summary • Benefits of migration are well understood • Many successful projects • Largest differences in data model and query language: MongoDB is much more suited to the way applications are built and run today • Many principles of RDBMS apply to MongoDB. Download the Whitepaper: http://www.mongodb.com/dl/migrate-rdbms-nosql
    38. For More Information (Resource | Location): MongoDB Downloads | mongodb.com/download; Free Online Training | education.mongodb.com; Webinars and Events | mongodb.com/events; White Papers | mongodb.com/white-papers; Case Studies | mongodb.com/customers; Presentations | mongodb.com/presentations; Documentation | docs.mongodb.org; Additional Info | info@mongodb.com
    39. BACKUP
    40. Enable Success: MongoDB University. Public: • 2-3 day courses • Dev, Admin and Essentials courses • Worldwide. Private: • Customized to your needs • On-Site. Online: • Free, runs over 7 weeks • Lectures, homework, final exam • 100k+ enrollments • Private online for Enterprise users
    41. Enable Success: MongoDB Support & Consulting. Community Resource: • Google Groups & StackOverflow Forums • MUGs, Office Hours • IRC Channels • Docs. Commercial Support: • Access to MongoDB engineers • Up to 24 x 7, 30 minute response • Unlimited incidents & hot fixes. Consulting: • Lightning consults • Healthchecks • Custom consults • Dedicated TAM
    42. Case Study: Serves variety of content and user services on multiple platforms to 7M web and mobile users. Problem: • MySQL reached scale ceiling, could not cope with performance and scalability demands • Metadata management too challenging with relational model • Hard to integrate external data sources. Why MongoDB: • Unrivaled performance • Intuitive mapping • Simple scalability and high availability • Eliminated 6B+ rows of attributes, instead creates single document per user / piece of content. Results: • Supports 115,000+ queries per second • Saved £2M+ over 3 yrs. • “Lead time for new implementations is cut massively” • MongoDB is default choice for all new projects
    43. Case Study: Runs social marketing suite with real-time analytics on MongoDB. Problem: • RDBMS could not meet speed and scale requirements of measuring massive online activity • Inability to provide real-time analytics and aggregations • Unpredictable peak loads. Why MongoDB: • Ease of use, developer ramp-up • Solution maturity: depth of functionality, failover • High-performance with write-heavy system • Queuing and logging for easy search at app layer. Results: • Decreased app development from months to weeks • 30M social events per day stored in MongoDB • 6x increase in customers supported over one year
    44. Case Study: Uses MongoDB to safeguard over 6 billion images served to millions of customers. Problem: • 6B images, 20TB of data • Brittle code base on top of Oracle database, hard to scale, add features • High SW and HW costs. Why MongoDB: • JSON-based data model • Agile, high performance, scalable • Alignment with Shutterfly's services-based architecture. Results: • 80% cost reduction • 900% performance improvement • Faster time-to-market • Dev. cycles in weeks vs. tens of months
    45. Case Study: Uses MongoDB to power enterprise social networking platform. Problem: • Complex SQL queries, highly normalized schema not aligned with new data types • Poor performance • Lack of horizontal scalability. Why MongoDB: • Dynamic schemas using JSON • Ability to handle complex data while maintaining high performance • Social network analytics with lightweight MapReduce. Results: • Flexibility to roll out new social features quickly • Sped up reads from 30 seconds to tens of milliseconds • Dramatically increased write performance
    46. Case Study: Stores billions of posts in myriad formats with MongoDB. Problem: • 1.5M posts per day, different structures • Inflexible MySQL, lengthy delays for making changes • Data piling up in production database • Poor performance. Why MongoDB: • Flexible document-based model • Horizontal scalability built in • Easy to use • Interface in familiar language. Results: • Initial deployment held over 5B documents and 10TB of data • Automated failover provides high availability • Schema changes are quick and easy
    47. Case Study: Uses MongoDB as go-to database for all new projects. Problem: • RDBMS had poor performance and could not scale • Too much operational overhead • Needed more developer control. Why MongoDB: • Ease of use and integration with systems • Small operational footprint • Document model supports continuous development • Flexible licensing model. Results: • Time from release to production reduced to <30 minutes • Easy to add new features • Developers can focus on apps instead of ops
    48. Case Study: Stores user and location-based data in MongoDB for social networking mobile app. Problem: • Relational architecture could not scale • Check-in data growth hit single-node capacity ceiling • Significant work to build custom sharding layer. Why MongoDB: • Auto-sharding to scale high-traffic and fast-growing application • Geo-indexing for easy querying of location-based data • Simple data model. Results: • Focus engineering on building mobile app vs. back-end • Scale efficiently with limited resources • Increased developer productivity
    49. Case Study: MongoDB enables Gilt to roll out new revenue-generating features faster and cheaper. Problem: • Monolithic Postgres architecture expensive to scale • Limited ability to add new features for different business silos • Spiky server loads. Why MongoDB: • Dynamic schema makes it easy to build new features • Alignment with SOA • Cost-effective, horizontal scaling • Easy to use and maintain. Results: • Developers can launch new services faster, e.g., customized upsell emails • Stable, sub-ms performance on commodity hardware • Reduced complexity yields lower overhead
    50. Case Study: Built custom ecommerce platform on MongoDB in 8 Months. Problem: • Dated e-commerce site with limited capabilities • Usability issues • SQL database did not scale. Why MongoDB: • Multi-data center replication and sharding for DR and scalability • Dynamic schema • Fast performance (reads and writes). Results: • Developers, users are empowered • Fast time to market • Database can meet evolving business needs • Superior user experience
    51. MongoDB Features • JSON Document Model with Dynamic Schemas • Full, Flexible Index Support and Rich Queries • Auto-Sharding for Horizontal Scalability • Built-In Replication for High Availability • Text Search, Geospatial queries • Advanced Security • Aggregation Framework and MapReduce • Large Media Storage with GridFS