MongoDB: What, why, when


Massimo Brignoli @ MEAN Conference - 9 June 2014

Published in: Internet, Technology


  1. 1. MongoDB: What, why, when. Massimo Brignoli, Solutions Architect, MongoDB Inc. #mongodb
  2. 2. Who Am I? • Solutions Architect/Evangelist at MongoDB Inc. • 24 years of experience in databases and software development • Former MySQL employee • Previous life: web, web, web
  3. 3. Innovation
  4. 4. Understanding Big Data – It’s Not Very “Big” (from Big Data Executive Summary – 50+ top executives from Government and F500 firms) • 64% - Ingest diverse, new data in real-time • 15% - More than 100TB of data • 20% - Less than 100TB (average of all? <20TB)
  5. 5. “I have not failed. I've just found 10,000 ways that won't work.” ― Thomas A. Edison
  6. 6. Back in 1970…Cars Were Great!
  7. 7. Lots of Great Innovations Since 1970
  8. 8. Would you use these technologies for your business today?
  9. 9. Including the Relational Database
  10. 10. Which computers was the relational model designed for?
  11. 11. So Were Computers!
  12. 12. And Storage!
  13. 13. RDBMS Makes Development Hard Relational Database Object Relational Mapping Application Code XML Config DB Schema
  14. 14. And Even Harder To Iterate New Table New Table New Column Name Pet Phone Email New Column 3 months later…
  15. 15. From Complexity to Simplicity (RDBMS vs. MongoDB) { _id : ObjectId("4c4ba5e5e8aabf3"), employee_name: "Dunham, Justin", department : "Marketing", title : "Product Manager, Web", report_up: "Neray, Graham", pay_band: "C", benefits : [ { type : "Health", plan : "PPO Plus" }, { type : "Dental", plan : "Standard" } ] }
  16. 16. MongoDB: the leading NoSQL database • Document Database • Open-Source • General Purpose
  17. 17. Global Community • 7,000,000+ MongoDB Downloads • 150,000+ Online Education Registrants • 25,000+ MongoDB User Group Members • 25,000+ MongoDB Days Attendees • 20,000+ MongoDB Management Service (MMS) Users
  18. 18. MongoDB Vision: to provide the best database for how we build and run apps today • Build – New and complex data – Flexible – New languages – Faster development • Run – Big Data scalability – Real-time – Commodity hardware – Cloud
  19. 19. Enterprise Big Data Stack (diagram) • Applications: CRM, ERP, Collaboration, Mobile, BI • Data Management: Online Data (RDBMS), Offline Data (EDW, Hadoop) • Infrastructure: OS & Virtualization, Compute, Storage, Network • Management & Monitoring • Security & Auditing
  20. 20. MongoDB Overview: Agile, Scalable
  21. 21. Operational Database Landscape
  22. 22. Key → Value • One-dimensional storage • Single value is a blob • Query on key only • No schema • Value cannot be updated, only replaced (diagram: Key, Blob)
  23. 23. Relational/Wide Column • Two-dimensional storage (tuples) • Each field contains a single value • Query on any field • Very structured schema (table) • In-place updates • Normalization process requires many tables, joins, indexes, and poor data locality (diagram: Primary Key)
  24. 24. Document • N-dimensional storage • Each field can contain 0, 1, many, or embedded values • Query on any field & level • Flexible schema • Inline updates * • Embedding related data has optimal data locality, requires fewer indexes, has better performance (diagram: _id)
  25. 25. Document Data Model (Relational vs. MongoDB) { first_name: "Paul", surname: "Miller", city: "London", location: [45.123,47.232], cars: [ { model: "Bentley", year: 1973, value: 100000, … }, { model: "Rolls Royce", year: 1965, value: 330000, … } ] }
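The document on slide 25 can be explored without a database. The following is a minimal sketch in plain JavaScript, assuming the fields elided by "…" on the slide are simply left out; the helper names `carsBefore` and `totalCarValue` are hypothetical and are not part of any MongoDB API. It only illustrates that embedded data is reachable from the parent document with no join.

```javascript
// Illustrative only: a plain object standing in for the slide's document.
// Fields elided on the slide ("…") are omitted here.
const person = {
  first_name: "Paul",
  surname: "Miller",
  city: "London",
  location: [45.123, 47.232],
  cars: [
    { model: "Bentley", year: 1973, value: 100000 },
    { model: "Rolls Royce", year: 1965, value: 330000 },
  ],
};

// Hypothetical helper: embedded cars built before a given year.
function carsBefore(doc, year) {
  return doc.cars.filter((car) => car.year < year);
}

// Hypothetical helper: total value of the embedded cars, no join needed.
function totalCarValue(doc) {
  return doc.cars.reduce((sum, car) => sum + car.value, 0);
}

console.log(carsBefore(person, 1970).map((car) => car.model)); // [ 'Rolls Royce' ]
console.log(totalCarValue(person)); // 430000
```

In a relational layout the same questions would need a second table for cars and a join on the owner's key.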
  26. 26. Document Model Benefits • Agility and flexibility – Data models can evolve easily – Companies can adapt to changes quickly • Intuitive, natural data representation – Developers are more productive – Many types of applications are a good fit • Reduces the need for joins, disk seeks – Programming is simpler – Performance can be delivered at scale
  27. 27. Developers are more productive
  28. 28. Developers are more productive
  29. 29. Automatic Sharding • Three types of sharding: hash-based, range-based, tag-aware • Increase or decrease capacity as you go • Automatic balancing
  30. 30. Query Routing • Multiple query optimization models • Each sharding option appropriate for different apps
  31. 31. Availability Considerations • High Availability – Ensure application availability during many types of failures • Disaster Recovery – Address the RTO and RPO goals for business continuity • Maintenance – Perform upgrades and other maintenance operations with no application downtime
  32. 32. Replica Sets • Replica Set – two or more copies • “Self-healing” shard • Addresses many concerns: - High Availability - Disaster Recovery - Maintenance
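To make the difference between hash-based and range-based sharding on slide 29 concrete, here is a toy sketch in plain JavaScript. Everything in it is an assumption for illustration: the shard names, the range boundaries, and the `toyHash` function, which is a simple FNV-1a style hash and not the hash MongoDB uses. Tag-aware sharding is not modeled.

```javascript
// Hypothetical shard names, for illustration only.
const SHARDS = ["shard0", "shard1", "shard2"];

// Toy 32-bit hash (FNV-1a style). NOT MongoDB's hashing function.
function toyHash(key) {
  let h = 0x811c9dc5;
  for (const ch of String(key)) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Hash-based: the shard is derived from the hash of the key,
// which spreads neighbouring keys across shards.
function hashShard(key) {
  return SHARDS[toyHash(key) % SHARDS.length];
}

// Range-based: each shard owns a contiguous key range (assumed boundaries),
// so neighbouring keys tend to land on the same shard.
const RANGES = [
  { max: "h", shard: "shard0" }, // keys < "h"
  { max: "q", shard: "shard1" }, // "h" <= keys < "q"
  { max: null, shard: "shard2" }, // everything else
];

function rangeShard(key) {
  return RANGES.find((r) => r.max === null || key < r.max).shard;
}

console.log(rangeShard("alice"), rangeShard("mike"), rangeShard("zoe"));
// shard0 shard1 shard2
console.log(hashShard("alice"), hashShard("mike"), hashShard("zoe"));
```

The trade-off this sketch hints at: range-based placement keeps range scans on few shards, while hash-based placement tends to spread write load more evenly.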
  33. 33. Strong Consistency
  34. 34. Delayed Consistency
  35. 35. Write Concern • Network acknowledgement • Wait for error • Wait for journal sync • Wait for replication
  36. 36. Unacknowledged
  37. 37. MongoDB Acknowledged (wait for error)
  38. 38. Wait for Journal Sync
  39. 39. Wait for Replication
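The four write concern levels on slides 36 to 39 correspond to write concern options in MongoDB. The sketch below lists one plausible mapping as plain JavaScript objects; the key names of `WRITE_CONCERNS` and the helper `describe` are hypothetical, and `w: 2` is just one example of waiting for replication (a primary plus one secondary).

```javascript
// One possible mapping from the slide titles to write concern options.
// The property names on the left are made up for this sketch.
const WRITE_CONCERNS = {
  unacknowledged: { w: 0 }, // fire and forget
  acknowledged: { w: 1 }, // wait for the primary to report errors
  journaled: { w: 1, j: true }, // also wait for the journal sync
  replicated: { w: 2 }, // also wait for one secondary (example value)
};

// Hypothetical helper: a readable summary of a write concern object.
function describe(wc) {
  if (wc.w === 0) return "no acknowledgement";
  const parts = [`acknowledged by ${wc.w} member(s)`];
  if (wc.j === true) parts.push("after journal sync");
  return parts.join(", ");
}

for (const [name, wc] of Object.entries(WRITE_CONCERNS)) {
  console.log(`${name}: ${describe(wc)}`);
}
```

Each step down the list trades some write latency for a stronger durability guarantee.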
  40. 40. Tagging • Control where data is written to, and read from • Each member can have one or more tags – tags: {dc:"ny"} – tags: {dc:"ny", subnet:"192.168", rack:"row3rk7"} • Replica set defines rules for write concerns • Rules can change without changing app code
  41. 41. Tagging Example { _id : "mySet", members : [ {_id : 0, host : "A", tags : {"dc": "ny"}}, {_id : 1, host : "B", tags : {"dc": "ny"}}, {_id : 2, host : "C", tags : {"dc": "sf"}}, {_id : 3, host : "D", tags : {"dc": "sf"}}, {_id : 4, host : "E", tags : {"dc": "cloud"}}], settings : { getLastErrorModes : { allDCs : {"dc" : 3}, someDCs : {"dc" : 2}} } } > db.blogs.insert({...}) > db.runCommand({getLastError : 1, w : "someDCs"})
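In the tagging example on slide 41, a rule such as `someDCs : {"dc" : 2}` is satisfied once the write has reached members carrying at least two distinct values of the `dc` tag. The sketch below models that check in plain JavaScript using the member list from the slide; the function `modeSatisfied` and the idea of passing in a list of acknowledging hosts are assumptions made for illustration, not how the server exposes this.

```javascript
// Members and modes copied from the slide's replica set configuration.
const members = [
  { host: "A", tags: { dc: "ny" } },
  { host: "B", tags: { dc: "ny" } },
  { host: "C", tags: { dc: "sf" } },
  { host: "D", tags: { dc: "sf" } },
  { host: "E", tags: { dc: "cloud" } },
];
const modes = { allDCs: { dc: 3 }, someDCs: { dc: 2 } };

// Hypothetical check: for every tag in the mode, count the distinct tag
// values among the members that acknowledged the write.
function modeSatisfied(mode, allMembers, ackedHosts) {
  return Object.entries(mode).every(([tag, needed]) => {
    const values = new Set(
      allMembers
        .filter((m) => ackedHosts.includes(m.host) && tag in m.tags)
        .map((m) => m.tags[tag])
    );
    return values.size >= needed;
  });
}

console.log(modeSatisfied(modes.someDCs, members, ["A", "B"])); // false: only "ny"
console.log(modeSatisfied(modes.someDCs, members, ["A", "C"])); // true: "ny" and "sf"
console.log(modeSatisfied(modes.allDCs, members, ["A", "C", "E"])); // true: three data centers
```

Note that two acknowledgements from the same data center do not satisfy `someDCs`, which is the point of counting distinct tag values rather than members.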
  42. 42. Wait for Replication (Tagging)
  43. 43. Read Preference Modes • 5 modes – primary (only), the default – primaryPreferred – secondary – secondaryPreferred – nearest • When more than one node is possible, the closest node is used for reads (all modes but primary)
  44. 44. Single Data Center • Automated failover • Tolerates server failures • Tolerates rack failures • Number of replicas defines failure tolerance (diagram: Primary – A, Primary – B, Primary – C, Secondary – A, Secondary – A, Secondary – B, Secondary – B, Secondary – C, Secondary – C)
  45. 45. Active/Standby Data Center • Tolerates server and rack failure • Standby data center (diagram: Data Center - West: Primary – A, Primary – B, Primary – C, Secondary – A, Secondary – B, Secondary – C; Data Center - East: Secondary – A, Secondary – B, Secondary – C)
  46. 46. Active/Active Data Center • Tolerates server, rack, data center failures, network partitions (diagram: Data Center - West: Primary – A, Primary – B, Primary – C, Secondary – A, Secondary – B, Secondary – C; Data Center - East: Secondary – A, Secondary – B, Secondary – C, Secondary – B, Secondary – C, Secondary – A; Data Center - Central: Arbiter – A, Arbiter – B, Arbiter – C)
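The five read preference modes on slide 43 can be sketched as a selection function over the replica set members. This is a simplified model in plain JavaScript: the node list, the `pingMs` values, and the function `pickNode` are assumptions for illustration, and "closest" is reduced to "lowest ping", which is cruder than what a real driver does.

```javascript
// Hypothetical replica set state, for illustration only.
const nodes = [
  { host: "A", role: "primary", pingMs: 40, up: true },
  { host: "B", role: "secondary", pingMs: 5, up: true },
  { host: "C", role: "secondary", pingMs: 20, up: true },
];

// Simplified stand-in for driver logic: pick a node for a read.
function pickNode(mode, allNodes) {
  const up = allNodes.filter((n) => n.up);
  const primary = up.find((n) => n.role === "primary") ?? null;
  const secondaries = up.filter((n) => n.role === "secondary");
  // "Closest" is modeled as lowest ping; returns null for an empty list.
  const closest = (list) =>
    list.length === 0 ? null : list.reduce((a, b) => (b.pingMs < a.pingMs ? b : a));

  switch (mode) {
    case "primary":
      return primary;
    case "primaryPreferred":
      return primary ?? closest(secondaries);
    case "secondary":
      return closest(secondaries);
    case "secondaryPreferred":
      return closest(secondaries) ?? primary;
    case "nearest":
      return closest(up);
    default:
      throw new Error(`unknown read preference mode: ${mode}`);
  }
}

console.log(pickNode("primary", nodes).host); // A
console.log(pickNode("secondary", nodes).host); // B
console.log(pickNode("nearest", nodes).host); // B
```

With the primary marked as down, `primaryPreferred` in this sketch falls back to the closest secondary, while `primary` returns `null`.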
  47. 47. Global Data Distribution (diagram: one Primary replicating in real time to seven Secondaries)
  48. 48. Read Global/Write Local (diagram: Primary: NYC, Secondary: NYC, Primary: LON, Primary: SYD, Secondary: LON, Secondary: NYC, Secondary: SYD, Secondary: LON, Secondary: SYD)
  49. 49. Common Use Cases
  50. 50. High Volume Data Feeds • Machine Generated Data: More machine forms, sensors & data; Variably structured • Securities Data: High frequency trading; Daily closing price • Social Media / General Public: Multiple data sources; Each changes their format consistently; Student Scores, ISP logs
  51. 51. Operational Intelligence • Ad Targeting: Large volume of users; Very strict latency requirements; Sentiment Analysis • Real time dashboards: Expose data to millions of customers; Reports on large volumes of data; Reports that update in real time • Social Media Monitoring: Join the conversation; Catered Games; Customized Surveys
  52. 52. Metadata • Product Catalogue: Diverse product portfolio; Complex querying and filtering; Multi-faceted product attributes • Data analysis: Data mining; Call records; Insurance Claims • Biometric: Retina Scans; Fingerprints
  53. 53. Content Management • News Site: Comments and user generated content; Personalization of content and layout • Multi-device rendering: Generate layout on the fly; No need to cache static pages • Sharing: Store large objects; Simpler modeling of metadata
  54. 54. Questions?
  55. 55. Thanks! Massimo Brignoli, Solutions Architect, MongoDB Inc. @massimobrignoli #MongoDB massimo@mongodb.com
