
Common MongoDB Use Cases



Published in: Technology


  1. Common MongoDB Use-Cases
     Kevin Hanson, Solutions Architect, 10gen (@hungarianhc)
  2. Intro to NoSQL and MongoDB (completed)
     Follow-up: How to Get Started with your MongoDB Pilot Project (August 7th)
  3. Emerging NoSQL Space
     The beginning: RDBMS
     Last 10 years: RDBMS, Data Warehouse
     Today: RDBMS, Data Warehouse, NoSQL
  4. Qualities of NoSQL Workloads
     Flexible data models
       • Lists, Nested Objects
       • Sparse schemas
       • Semi-structured data
       • Agile Development
     High Throughput
       • Lots of reads
       • Lots of writes
     Large Data Sizes
       • Aggregate data size
       • Number of objects
     Low Latency
       • Both reads and writes
       • Millisecond latency
     Cloud Computing
       • Run anywhere
       • No assumptions about hardware
       • No / Few Knobs
     Commodity Hardware
       • Ethernet
       • Local disks
  5. MongoDB was designed for this
     Flexible data models
       • Lists, Nested Objects
       • Sparse schemas
       • Semi-structured data
       • Agile Development
       MongoDB: JSON based object model; dynamic schemas
     High Throughput
       • Lots of reads
       • Lots of writes
       MongoDB: Replica Sets to scale reads; Sharding to scale writes
     Large Data Sizes
       • Aggregate data size
       • Number of objects
       MongoDB: 1000's of shards in a single DB; partitioning of data
     Low Latency
       • Both reads and writes
       • Millisecond latency
       MongoDB: In-memory cache; scale-out working set
     Cloud Computing
       • Run anywhere
       • No assumptions about hardware
       • No / Few Knobs
       MongoDB: Scale-out to overcome hardware limitations
     Commodity Hardware
       • Ethernet
       • Local disks
       MongoDB: Designed for "typical" OS and local file system
  6. Example customers
       • Content Management
       • Operational Intelligence
       • Product Data Management
       • User Data Management
       • High Volume Data Feeds
  8. High Volume Data Feeds
     Machine Generated Data
       • More machines, more sensors, more data
       • Variably structured
     Stock Market Data
       • High frequency trading
     Social Media Firehose
       • Multiple sources of data
       • Each changes their format constantly
  9. High Volume Data Feed
       • Flexible document model can adapt to changes in sensor format
       • Asynchronous writes
       • Write to memory with periodic disk flush
       • Scale writes over multiple shards
     Diagram labels: Data Sources (x4)
  10. Operational Intelligence
      Ad Targeting
        • Large volume of state about users
        • Very strict latency requirements
      Customer Facing Dashboards
        • Expose report data to millions of customers
        • Report on large volumes of data
        • Reports that update in real time
      Social Media Monitoring
        • Need to join the conversation _now_
  11. Operational Intelligence
        • Parallelize queries across replicas and shards
        • Low latency reads
        • In database aggregation
        • Flexible schema adapts to changing input data
        • Can use same cluster to collect, store, and report on data
      Diagram labels: API, Dashboards
  12. Behavioral Profiles
        • Rich profiles collecting multiple complex actions
        • Scale out to support high throughput of activities tracked
        • Dynamic schemas make it easy to track vendor specific attributes
        • Indexing and querying to support matching, frequency capping
      Steps: 1 See Ad, 2 See Ad, 3 Click, 4 Convert

      { cookie_id: "1234512413243",
        advertiser: {
          apple: {
            actions: [
              { impression: 'ad1', time: 123 },
              { impression: 'ad2', time: 232 },
              { click: 'ad2', time: 235 },
              { add_to_cart: 'laptop', sku: 'asdf23f', time: 254 },
              { purchase: 'laptop', time: 354 }
            ]
          }
        }
      }
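Frequency capping against a profile like the one on the Behavioral Profiles slide could look roughly like the sketch below. It is plain JavaScript working on an in-memory copy of the document, not a database query; the function names `impressionCount` and `isCapped` and the capping rule itself are assumptions made for illustration.

```javascript
// Profile document in the shape shown on the slide.
const profile = {
  cookie_id: "1234512413243",
  advertiser: {
    apple: {
      actions: [
        { impression: "ad1", time: 123 },
        { impression: "ad2", time: 232 },
        { click: "ad2", time: 235 },
        { add_to_cart: "laptop", sku: "asdf23f", time: 254 },
        { purchase: "laptop", time: 354 },
      ],
    },
  },
};

// Count how many times `adId` was shown for `advertiser` at or after `since`.
// An advertiser with no recorded actions counts as zero impressions.
function impressionCount(profile, advertiser, adId, since = 0) {
  const entry = profile.advertiser[advertiser];
  const actions = entry ? entry.actions : [];
  return actions.filter((a) => a.impression === adId && a.time >= since).length;
}

// Hypothetical capping rule: stop serving an ad once it has been shown `cap` times.
function isCapped(profile, advertiser, adId, cap, since = 0) {
  return impressionCount(profile, advertiser, adId, since) >= cap;
}
```

Because every action for a cookie lives in one document, the check needs a single read of that document rather than a join across tables.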
  13. Product Data
      E-Commerce Product Catalog
        • Diverse product portfolio
        • Complex querying and filtering
      Flash Sales
        • Scale for short bursts of high volume traffic
        • Scalable, but consistent view of inventory
  14. Product Data
        • Indexing and rich query API for easy searching and sorting
        • Flexible data model for similar, but different objects

      db.products.find({ "": "David Eggers" }).sort({ "title": -1 });

      { sku: "00a9f3a",
        type: "Book",
        details: {
          author: "David Eggers",
          title: "You shall know our velocity",
          isbn: "0-9703355-5-5"
        }
      }

      { sku: "00e8da9b",
        type: "MP3",
        details: {
          artist: "John Coltrane",
          title: "A love supreme",
          length: 123
        }
      }
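To make the "similar, but different objects" point concrete, here is a small in-memory sketch of filtering and sorting the two product documents by nested fields. It stands in for the database rather than calling it; the helpers `getPath`, `find`, and `sortBy` are hypothetical, and the dotted field names `details.author` and `details.title` are assumptions based on the document shapes on the slide.

```javascript
// The two product documents from the slide.
const products = [
  {
    sku: "00a9f3a",
    type: "Book",
    details: { author: "David Eggers", title: "You shall know our velocity", isbn: "0-9703355-5-5" },
  },
  {
    sku: "00e8da9b",
    type: "MP3",
    details: { artist: "John Coltrane", title: "A love supreme", length: 123 },
  },
];

// Resolve a dotted path such as "details.author"; yields undefined if a step is missing.
function getPath(doc, path) {
  return path.split(".").reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Keep documents whose fields strictly equal every value in `filter`.
function find(docs, filter) {
  return docs.filter((doc) =>
    Object.entries(filter).every(([path, expected]) => getPath(doc, path) === expected)
  );
}

// Return a sorted copy; direction 1 is ascending, -1 is descending.
function sortBy(docs, path, direction) {
  return [...docs].sort((a, b) => {
    const x = getPath(a, path);
    const y = getPath(b, path);
    if (x === y) return 0;
    return (x < y ? -1 : 1) * direction;
  });
}

const byAuthor = find(products, { "details.author": "David Eggers" });
const byTitleDesc = sortBy(products, "details.title", -1);
```

The MP3 document has no `details.author` field, so it simply fails the filter; no placeholder column or schema change is needed for the two types to share a collection.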
  15. Content Management
      News Site
        • Comments and user generated content
        • Personalization of content, layout
      Multi-Device rendering
        • Generate layout on the fly for each device that connects
        • No need to cache static pages
      Sharing
        • Store large objects
        • Simple modeling of metadata
  16. Content Management
        • Geo spatial indexing for location based searches
        • Flexible data model for similar, but different objects
        • GridFS for large object storage
        • Horizontal scalability for large data sets

      { camera: "Nikon d4",
        location: [ -122.418333, 37.775 ]
      }

      { camera: "Canon 5d mkII",
        people: [ "Jim", "Carol" ],
        taken_on: ISODate("2012-03-07T18:32:35.002Z")
      }

      { origin: "",
        license: "Creative Commons CC0",
        size: {
          dimensions: [ 124, 52 ],
          units: "pixels"
        }
      }
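A location based search over documents like the first one above might be expressed with MongoDB's `$near` operator. The sketch below only builds the index specification and the query filter as plain objects; the `photos` collection name and the `nearQuery` helper are hypothetical, and it uses the `2dsphere` index syntax from current MongoDB releases, which may differ from what the deck's author had available.

```javascript
// Index specification; in the shell: db.photos.createIndex(locationIndex)
const locationIndex = { location: "2dsphere" };

// Build a $near filter matching documents within `maxMeters` of a point.
// Coordinates are ordered [longitude, latitude], as in the slide's document.
function nearQuery(longitude, latitude, maxMeters) {
  return {
    location: {
      $near: {
        $geometry: { type: "Point", coordinates: [longitude, latitude] },
        $maxDistance: maxMeters,
      },
    },
  };
}

// Photos taken within 1 km of the location stored in the first document.
// In the shell: db.photos.find(query)
const query = nearQuery(-122.418333, 37.775, 1000);
```

Documents without a `location` field, such as the other two on the slide, are simply not returned by this filter.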
  17. User Data Management
      Video Games
        • User state and session management
      Social Graphs
        • Scale out to large graphs
        • Easy to search and process
      Identity Management
        • Authentication, Authorization and Accounting
  18. User Game State
        • Flexible documents support new game features without schema migration
        • Easy to store entire player state in a single document
        • Sharding enables whole data set to be in memory, ensuring low latency
        • JSON data model maps well to HTML5/JS & Flash based clients
  19. Social Graphs
        • Native support for Arrays makes it easy to store connections inside user profile
        • Sharding partitions user profiles across available servers
        • Documents enable disk locality of all profile data for a user
      Diagram label: Social Graph
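Storing connections as an array inside each profile can be sketched as follows. The profiles, the `friends` field name, and both helper functions are hypothetical; `addConnection` mimics in memory the add-if-absent behavior that MongoDB's `$addToSet` update operator provides on the server.

```javascript
// Hypothetical user profiles; `friends` holds the ids of connected users.
const alice = { _id: "alice", friends: ["bob", "carol"] };
const bob = { _id: "bob", friends: ["alice", "carol", "dave"] };

// Add a connection only if it is not already present (the effect of $addToSet).
function addConnection(profile, friendId) {
  if (!profile.friends.includes(friendId)) {
    profile.friends.push(friendId);
  }
  return profile;
}

// Ids that appear in both users' connection arrays, in the first user's order.
function mutualConnections(a, b) {
  const other = new Set(b.friends);
  return a.friends.filter((id) => other.has(id));
}
```

With the whole connection list in one document, reading a user's graph neighborhood is a single document fetch, which is the disk locality the slide refers to.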
  21. Good fits for MongoDB
      Application Characteristic: Why MongoDB might be a good fit
        • Large number of objects to store: Sharding lets you split objects across multiple servers
        • High write or read throughput: Sharding + Replication lets you scale read and write traffic across multiple servers
        • Low Latency Access: Memory Mapped storage engine caches documents in RAM, enabling in-memory performance. Data locality of documents can significantly improve latency over join based approaches
        • Variable data in objects: Dynamic schema and JSON data model enable flexible data storage without sparse tables or complex joins
        • Cloud based deployment: Sharding and replication let you work around hardware limitations in clouds
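The idea of splitting objects across servers by a key can be illustrated with a deliberately simplified router. This is not how MongoDB places data (it assigns ranges of shard key values, or of hashed key values, to chunks that it balances across shards); the hash-then-modulo rule, the shard names, and the function names below are all assumptions chosen to keep the sketch short.

```javascript
// Hypothetical shard names.
const shards = ["shard0", "shard1", "shard2"];

// Small deterministic string hash (32-bit FNV-1a over UTF-16 code units).
function fnv1a(key) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

// Route a key to one shard; the same key always maps to the same shard.
function shardFor(key, shardList) {
  return shardList[fnv1a(key) % shardList.length];
}

// Group keys by the shard they route to.
function partition(keys, shardList) {
  const placement = new Map(shardList.map((name) => [name, []]));
  for (const key of keys) {
    placement.get(shardFor(key, shardList)).push(key);
  }
  return placement;
}
```

Each server then holds and serves only its share of the keys, which is what lets both storage and write traffic grow by adding servers.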
  22. Thanks!