tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
Massimo Brignoli, MongoDB Inc
The presentation will illustrate what MongoDB is, the advantages of the document-based approach and some of the use cases where MongoDB is a perfect fit.

Published in: Health & Medicine, Technology
  • The dotted line is the natural boundary of what is possible today. For example, Oracle lives far out on the right and does things NoSQL vendors will never do. These things come at the expense of some degree of scale and performance. NoSQL was born out of wanting greater scalability and performance, but we think they overreacted by giving up some things. For example, caching layers give up many things; key-value stores are super fast but give up a rich data model and a rich query model. MongoDB gives up some features of a relational database (joins, complex transactions) to enable greater scalability and performance. You get most of the functionality, about 80%, with much better scalability and performance. Start with an RDBMS and ask what could be done to scale: take out complex transactions and joins. How? Change the data model. (Segue to the data model section.) May need to revise the graphic: either remove the line or put all points on the line. To enable horizontal scalability, reduce coordination between nodes (joins and transactions). Traditionally in an RDBMS you would denormalize the data or tell the system more about how the data relate to one another. Another, more intuitive way is to use a document data model. It is more intuitive because it is closer to the way we develop applications today with object-oriented languages and platforms such as Java, .NET, Ruby and Node.js. The document data model is a good segue to the next section: Data Model.
  • Here we have greatly reduced the relational data model for this application to two tables. In reality no database has only two tables; it is much more common to have hundreds or thousands. As a developer, where do you begin when you have a complex data model? If you're building an app you're really thinking about just a handful of common things, like products, and these can be represented in a document much more easily than in a complex relational model, where the data is broken up in a way that doesn't reflect how you think about the data or write an application.
  • Segue: rich queries, text search, geospatial, aggregation and MapReduce are the types of things you can build on the richness of the query model. More on that in just a moment.
  • Replicas in DC East configured as secondary-only.
  • Transcript

    • 1. @mongodb What, When and Why of MongoDB Massimo Brignoli Solution Architect, MongoDB Inc.
    • 2. Agenda: About MongoDB Inc.; Data and Query Model; Scalability; Availability; Deployment Architectures; Schema Design Challenges; Use Cases
    • 3. About MongoDB
    • 4. MongoDB Inc. Overview: 300+ employees; offices in New York, Palo Alto, Washington DC, London, Dublin, Barcelona and Sydney; 600+ customers; over $231 million in funding
    • 5. Global Community: 6,000,000+ MongoDB downloads; 100,000+ online education registrants; 20,000+ MongoDB User Group members; 20,000+ MongoDB Days attendees; 15,000+ MongoDB Management Service (MMS) users
    • 6. MongoDB Inc. Products and Services
        • Subscriptions: MongoDB Enterprise, On-Prem Monitoring, Professional Support and Commercial License
        • Consulting: expert resources for all phases of MongoDB implementations
        • Training: online and in-person for developers and administrators
        • MongoDB Monitoring Service: cloud-based service for monitoring, alerts, backup and restore
    • 7. Data & Query Model
    • 8. Operational Database Landscape
    • 9. Document Data Model (Relational vs. MongoDB)
        {
          first_name: 'Paul',
          surname: 'Miller',
          city: 'London',
          location: [45.123, 47.232],
          cars: [
            { model: 'Bentley', year: 1973, value: 100000, … },
            { model: 'Rolls Royce', year: 1965, value: 330000, … }
          ]
        }
    • 10. Document Model Benefits
        • Agility and flexibility
            - Data models can evolve easily
            - Companies can adapt to changes quickly
        • Intuitive, natural data representation
            - Developers are more productive
            - Many types of applications are a good fit
        • Reduces the need for joins and disk seeks
            - Programming is simpler
            - Performance can be delivered at scale
    • 11. Developers are more productive
    • 12. Developers are more productive
    • 13. Developers are more productive
    • 14. MongoDB is full featured
        • Rich Queries: find Paul's cars; find everybody in London with a car built between 1970 and 1980
        • Geospatial: find all of the car owners within 5km of Trafalgar Sq.
        • Text Search: find all the cars described as having leather seats
        • Aggregation: calculate the average value of Paul's car collection
        • Map Reduce: what is the ownership pattern of colors by geography over time? (Is purple trending up in China?)
        Example document (MongoDB):
        {
          first_name: 'Paul',
          surname: 'Miller',
          city: 'London',
          location: [45.123, 47.232],
          cars: [
            { model: 'Bentley', year: 1973, value: 100000, … },
            { model: 'Rolls Royce', year: 1965, value: 330000, … }
          ]
        }
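The rich-query and aggregation bullets on slide 14 can be tried out without a database. What follows is only a minimal sketch in plain JavaScript over an in-memory array: it mimics what such queries would return and is not the MongoDB query API. The `people` array, the second person in it, and both helper function names are assumptions made up for illustration.

```javascript
// Illustrative in-memory data modeled on the slide's document.
// The second person is invented for this example.
const people = [
  {
    first_name: 'Paul', surname: 'Miller', city: 'London',
    cars: [
      { model: 'Bentley', year: 1973, value: 100000 },
      { model: 'Rolls Royce', year: 1965, value: 330000 },
    ],
  },
  {
    first_name: 'Anna', surname: 'Smith', city: 'Paris',
    cars: [{ model: 'Citroen', year: 1975, value: 20000 }],
  },
];

// "Find everybody in London with a car built between 1970 and 1980."
function londonersWithSeventiesCar(docs) {
  return docs.filter(
    (p) => p.city === 'London' &&
      p.cars.some((c) => c.year >= 1970 && c.year <= 1980)
  );
}

// "Calculate the average value of Paul's car collection."
// Returns null when no matching cars exist.
function averageCarValue(docs, firstName) {
  const cars = docs
    .filter((p) => p.first_name === firstName)
    .flatMap((p) => p.cars);
  if (cars.length === 0) return null;
  return cars.reduce((sum, c) => sum + c.value, 0) / cars.length;
}

console.log(londonersWithSeventiesCar(people).map((p) => p.first_name));
console.log(averageCarValue(people, 'Paul'));
```

In the mongo shell, and assuming a collection named `people`, the first query could plausibly be expressed with the `$elemMatch` operator: `db.people.find({ city: "London", cars: { $elemMatch: { year: { $gte: 1970, $lte: 1980 } } } })`.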
    • 15. Shell and Drivers
        • Drivers: drivers for most popular programming languages and frameworks (Java, Ruby, JavaScript, Python, Perl, Haskell)
        • Shell: command-line shell for interacting directly with the database
            > db.collection.insert({company: "10gen", product: "MongoDB"})
            >
            > db.collection.findOne()
            {
              "_id" : ObjectId("5106c1c2fc629bfe52792e86"),
              "company" : "10gen",
              "product" : "MongoDB"
            }
    • 16. Scalability
    • 17. Automatic Sharding
        • Three types of sharding: hash-based, range-based, tag-aware
        • Increase or decrease capacity as you go
        • Automatic balancing
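To make the difference between range-based and hash-based sharding on slide 17 concrete, here is a rough sketch in plain JavaScript. Everything in it is an assumption for illustration: the shard names, the split points, and especially `toyHash`, which is not the hash function MongoDB uses. Taking a modulo of the hash is also a simplification; real systems typically assign ranges of hash values to shards.

```javascript
// Range-based: each shard owns a contiguous range of shard-key values.
// The split points are arbitrary; max === null means "unbounded".
const ranges = [
  { shard: 'shard0', min: '', max: 'h' },
  { shard: 'shard1', min: 'h', max: 'q' },
  { shard: 'shard2', min: 'q', max: null },
];

function rangeShard(key) {
  const hit = ranges.find(
    (r) => key >= r.min && (r.max === null || key < r.max)
  );
  return hit.shard;
}

// Hash-based: a hash of the key picks the shard, so adjacent keys
// tend to land on different shards.
// Toy 32-bit string hash, for illustration only.
function toyHash(key) {
  let h = 0;
  for (const ch of key) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep within unsigned 32 bits
  }
  return h;
}

function hashShard(key, shardCount) {
  return 'shard' + (toyHash(key) % shardCount);
}

for (const key of ['alice', 'alicia', 'mongo', 'zed']) {
  console.log(key, rangeShard(key), hashShard(key, 3));
}
```

Range-based placement keeps neighboring keys together, which suits range scans; hash-based placement spreads them out, which suits evenly distributed writes. This is the kind of trade-off behind the statement that each sharding option is appropriate for different apps.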
    • 18. Query Routing • Multiple query optimization models • Each sharding option appropriate for different apps
    • 19. Availability
    • 20. Availability Considerations
        • High Availability: ensure application availability during many types of failures
        • Disaster Recovery: address the RTO and RPO goals for business continuity
        • Maintenance: perform upgrades and other maintenance operations with no application downtime
    • 21. Replica Sets
        • Replica set: two or more copies
        • "Self-healing" shard
        • Addresses many concerns: high availability, disaster recovery, maintenance
    • 22. Replica Set Benefits
        Business Needs        Replica Set Benefits
        High Availability     Automated failover
        Disaster Recovery     Hot backups offsite
        Maintenance           Rolling upgrades
        Low Latency           Locate data near users
        Workload Isolation    Read from non-primary replicas
        Data Privacy          Restrict data to physical location
        Data Consistency      Tunable consistency
    • 23. Deployment Architectures
    • 24. Single Data Center
        [Diagram: shards A, B and C, each with one primary and two secondaries]
        • Automated failover
        • Tolerates server failures
        • Tolerates rack failures
        • Number of replicas defines failure tolerance
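The statement that the number of replicas defines failure tolerance can be worked through with a little arithmetic. The sketch below assumes majority-based primary election, where a replica set can elect a primary only while a strict majority of its voting members is reachable. The function names are made up for this example.

```javascript
// Electing a primary requires a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Number of members that can be lost while a majority still remains.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [3, 4, 5, 7]) {
  console.log(
    `${n} members: majority ${majority(n)}, tolerates ${faultTolerance(n)} failure(s)`
  );
}
```

Under this assumption a fourth member adds no tolerance over three, which is one reason odd member counts are a common choice.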
    • 25. Active/Standby Data Center
        [Diagram: shards A, B and C, each with one primary and two secondaries, spread across Data Center - West and Data Center - East]
        • Tolerates server and rack failure
        • Standby data center
    • 26. Active/Active Data Center
        [Diagram: shards A, B and C, each with one primary, three secondaries and one arbiter, spread across Data Center - West, Data Center - Central and Data Center - East]
        • Tolerates server, rack, data center failures, network partitions
    • 27. Global Data Distribution
        [Diagram: one primary replicating in real time to seven secondaries]
    • 28. Read Global/Write Local
        [Diagram: three replica sets: Primary:LON with Secondary:NYC and Secondary:SYD; Primary:NYC with Secondary:LON and Secondary:SYD; Primary:SYD with Secondary:LON and Secondary:NYC]
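The "Read Global/Write Local" layout on slide 28 can be sketched as a small routing table. This is a hypothetical illustration in plain JavaScript, not driver code: the `replicaSets` object, the function names and the fallback rule are all assumptions. It shows writes going to the primary of the data's home region, while reads are served by whichever member sits in the client's own region.

```javascript
// One replica set per home region; each has a member in every city.
const replicaSets = {
  LON: { primary: 'LON', secondaries: ['NYC', 'SYD'] },
  NYC: { primary: 'NYC', secondaries: ['LON', 'SYD'] },
  SYD: { primary: 'SYD', secondaries: ['LON', 'NYC'] },
};

// Writes for data homed in a region always go to that region's primary.
function writeTarget(homeRegion) {
  return replicaSets[homeRegion].primary;
}

// Reads prefer the member located in the client's region and fall
// back to the primary when no member is located there.
function readTarget(homeRegion, clientRegion) {
  const set = replicaSets[homeRegion];
  const members = [set.primary, ...set.secondaries];
  return members.includes(clientRegion) ? clientRegion : set.primary;
}

console.log(writeTarget('LON'));
console.log(readTarget('LON', 'SYD'));
```

A read served by a secondary may lag behind the primary, so this pattern fits data where slightly stale reads are acceptable.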
    • 29. Schema Design Challenges
    • 30. First a story: Once upon a time there was a medical records company…
    • 31. Schema Design Challenge • Flexibility – Easily adapt to new requirements • Agility – Rapid application development • Scalability – Support large data and query volumes
    • 32. Schema Design: MongoDB vs. Relational
    • 33. MongoDB versus Relational
        MongoDB                     Relational
        Collections                 Tables
        Documents                   Rows
        Data Use                    Data Storage
        What questions do I have?   What answers do I have?
    • 34.
        Attribute      MongoDB                 Relational
        Storage        N-dimensional           Two-dimensional
        Field Values   0, 1, many, or embed    Single value
        Query          Any field or level      Any field
        Schema         Flexible                Very structured
        Updates        In line                 In place
    • 35. With relational, this is hard: long development times, inflexible, doesn't scale
    • 36. Document model is much easier: shorter development times, flexible, scalable
        {
          "patient_id": "1177099",
          "first_name": "John",
          "last_name": "Doe",
          "middle_initial": "A",
          "dob": "2000-01-25",
          "gender": "Male",
          "blood_type": "B+",
          "address": "123 Elm St., Chicago, IL 59923",
          "height": "66",
          "weight": "110",
          "allergies": ["Nuts", "Penicillin", "Pet Dander"],
          "current_medications": [
            {"name": "Zoloft", "dosage": "2mg", "frequency": "daily", "route": "orally"}
          ],
          "complaint": [
            {"entered": "2000-11-03", "onset": "2000-11-03", "prob_desc": "", "icd": 250.00, "status": "Active"},
            {"entered": "2000-02-04", "onset": "2000-02-04", "prob_desc": "in spite of regular exercise, ...", "icd": 401.9, "status": "Active"}
          ],
          "diagnosis": [
            {"visit": "2005-07-22", "narrative": "Fractured femur", "icd": "9999", "priority": "Primary"},
            {"visit": "2005-07-22", "narrative": "Type II Diabetes", "icd": "250.00", "priority": "Secondary"}
          ]
        }
    • 37. Let’s model something together How about a business card?
    • 38. Business Card
    • 39. Address Book Entity-Relationship
        • Contacts: name, company, title
        • Groups (N:N with Contacts): name
        • Twitters (1:1 with Contacts): name, location, web, bio
        • Thumbnails (1:1 with Contacts): mime_type, data
        • Portraits (1:1 with Contacts): mime_type, data
        • Addresses (N per Contact): type, street, city, state, zip_code
        • Phones (N per Contact): type, number
        • Emails (N per Contact): type, address
    • 40. Referencing
        • Contact: name, company, title, phone
        • Address: street, city, state, zip_code
        Use two collections with a reference. Similar to relational.
    • 41. Embedding (Document Schema)
        • Contact: name, company, address (street, city, state, zip_code), title, phone
    • 42. Referencing
        Contacts:
        {
          "_id": 2,
          "name": "Steven Jobs",
          "title": "VP, New Product Development",
          "company": "Apple Computer",
          "phone": "408-996-1010",
          "address_id": 1
        }
        Addresses:
        {
          "_id": 1,
          "street": "10260 Bandley Dr",
          "city": "Cupertino",
          "state": "CA",
          "zip_code": "95014",
          "country": "USA"
        }
    • 43. Embedding
        Contacts:
        {
          "_id": 2,
          "name": "Steven Jobs",
          "title": "VP, New Product Development",
          "company": "Apple Computer",
          "address": {"street": "10260 Bandley Dr", "city": "Cupertino", "state": "CA", "zip_code": "95014", "country": "USA"},
          "phone": "408-996-1010"
        }
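The practical difference between the referencing and embedding designs on slides 42 and 43 shows up when the application reads a contact's city. The following is a minimal sketch in plain JavaScript with arrays standing in for collections; the variable and function names are assumptions, and the documents are trimmed versions of the ones on the slides.

```javascript
// Referencing: two collections, joined by the application.
const addresses = [
  { _id: 1, street: '10260 Bandley Dr', city: 'Cupertino', state: 'CA',
    zip_code: '95014', country: 'USA' },
];
const contactsByRef = [
  { _id: 2, name: 'Steven Jobs', company: 'Apple Computer', address_id: 1 },
];

// Embedding: the address lives inside the contact document.
const contactsEmbedded = [
  { _id: 2, name: 'Steven Jobs', company: 'Apple Computer',
    address: { street: '10260 Bandley Dr', city: 'Cupertino', state: 'CA',
      zip_code: '95014', country: 'USA' } },
];

// Two lookups: first the contact, then the address it points to.
function cityByReference(contactId) {
  const contact = contactsByRef.find((c) => c._id === contactId);
  const address = addresses.find((a) => a._id === contact.address_id);
  return address.city;
}

// One lookup: the address arrives together with the contact.
function cityEmbedded(contactId) {
  const contact = contactsEmbedded.find((c) => c._id === contactId);
  return contact.address.city;
}

console.log(cityByReference(2), cityEmbedded(2));
```

Against a real database the referencing version would cost two round trips where the embedded version costs one, at the price of duplicating the address if several contacts share it.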
    • 44. How are they different? Why?
        • Referencing: Contact (name, company, title, phone) and Address (street, city, state, zip_code)
        • Embedding: Contact (name, company, address (street, city, state, zip_code), title, phone)
    • 45. Schema Flexibility
        {
          "name": "Larry Page",
          "url": "http://google.com",
          "title": "CEO",
          "company": "Google!",
          "address": {"street": "555 Bryant, #106", "city": "Palo Alto", "state": "CA", "zip_code": "94301"},
          "phone": "650-330-0100",
          "fax": "650-330-1499"
        }
        {
          "name": "Steven Jobs",
          "title": "VP, New Product Development",
          "company": "Apple Computer",
          "address": {"street": "10260 Bandley Dr", "city": "Cupertino", "state": "CA", "zip_code": "95014"},
          "phone": "408-996-1010"
        }
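Schema flexibility means two documents in the same collection can carry different fields, as the two contacts on slide 45 do (`url` and `fax` appear in only one of them). Here is a rough sketch, in plain JavaScript, of how application code might discover and tolerate such optional fields. The array name, both helper names and the `'n/a'` default are assumptions for illustration.

```javascript
// Trimmed versions of the two contacts from the slide.
const flexibleContacts = [
  { name: 'Larry Page', url: 'http://google.com', title: 'CEO',
    company: 'Google!', phone: '650-330-0100', fax: '650-330-1499' },
  { name: 'Steven Jobs', title: 'VP, New Product Development',
    company: 'Apple Computer', phone: '408-996-1010' },
];

// Fields present in at least one document but not in all of them.
function optionalFields(docs) {
  const counts = new Map();
  for (const doc of docs) {
    for (const key of Object.keys(doc)) {
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return [...counts]
    .filter(([, n]) => n < docs.length)
    .map(([key]) => key)
    .sort();
}

// Readers must handle a missing field explicitly.
function faxOrDefault(doc) {
  return doc.fax !== undefined ? doc.fax : 'n/a';
}

console.log(optionalFields(flexibleContacts));
console.log(flexibleContacts.map(faxOrDefault));
```

The flexibility moves some responsibility into the application: nothing in the data store guarantees that a given field exists, so reads need a default or a check.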
    • 46. One-to-many: embedding vs. referencing
        Embedding:
        {
          "name": "Larry Page",
          "url": "http://google.com/",
          "title": "CEO",
          "company": "Google!",
          "email": "larry@google.com",
          "address": [{"street": "555 Bryant, #106", "city": "Palo Alto", "state": "CA", "zip_code": "94301"}],
          "phones": [{"type": "Office", "number": "650-618-1499"}, {"type": "fax", "number": "650-330-0100"}]
        }
        Referencing:
        {
          "name": "Larry Page",
          "url": "http://google.com/",
          "title": "CEO",
          "company": "Google!",
          "email": "larry@google.com",
          "address": ["addr99"],
          "phones": ["ph23", "ph49"]
        }
        {"_id": "addr99", "street": "555 Bryant, #106", "city": "Palo Alto", "state": "CA", "zip_code": "94301"}
        {"_id": "ph23", "type": "Office", "number": "650-618-1499"}
        {"_id": "ph49", "type": "fax", "number": "650-330-0100"}
    • 47. Many to Many
        • Traditional relational association uses a join table: Groups (name); GroupContacts (group_id, contact_id), marked X; Contacts (name, company, title, phone)
        • Use arrays instead
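"Use arrays instead" can be sketched as follows: each contact carries an array of group identifiers, so no separate join table is needed. This is an illustrative sketch in plain JavaScript, not the deck's own schema; the `group_ids` field name, the sample groups and the helper names are all assumptions.

```javascript
const addressBookGroups = [
  { _id: 'g1', name: 'Friends' },
  { _id: 'g2', name: 'Work' },
];

// Each contact lists the groups it belongs to in an array field.
const groupedContacts = [
  { _id: 'c1', name: 'Larry Page', group_ids: ['g2'] },
  { _id: 'c2', name: 'Steven Jobs', group_ids: ['g1', 'g2'] },
];

// Contact -> groups: follow the ids stored on the contact.
function groupsOf(contactId) {
  const contact = groupedContacts.find((c) => c._id === contactId);
  return addressBookGroups
    .filter((g) => contact.group_ids.includes(g._id))
    .map((g) => g.name);
}

// Group -> contacts: find contacts whose array contains the group id.
function membersOf(groupId) {
  return groupedContacts
    .filter((c) => c.group_ids.includes(groupId))
    .map((c) => c.name);
}

console.log(groupsOf('c2'));
console.log(membersOf('g2'));
```

Both directions of the relationship stay queryable, while the association itself is stored only once, on the contact.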
    • 48. Address Book Entity-Relationship
        • Contacts: name, company, title
        • Groups (N:N with Contacts): name
        • Twitters (1:1 with Contacts): name, location, web, bio
        • Thumbnails (1:1 with Contacts): mime_type, data
        • Portraits (1:1 with Contacts): mime_type, data
        • Addresses (N per Contact): type, street, city, state, zip_code
        • Phones (N per Contact): type, number
        • Emails (N per Contact): type, address
    • 49. Document model: holistic and efficient representation
        • Groups (N:N with Contacts): name
        • Contacts: name, company, title, with embedded:
            - twitter (1): name, location, web, bio
            - thumbnail (1): mime_type, data
            - addresses (N): type, street, city, state, zip_code
            - phones (N): type, number
            - emails (N): type, address
        • Portraits (1:1 with Contacts): mime_type, data
    • 50. Contact document example
        {
          "name" : "Gary J. Murakami, Ph.D.",
          "company" : "MongoDB, Inc",
          "title" : "Lead Engineer and Ruby Evangelist",
          "twitter" : {
            "name" : "GaryMurakami",
            "location" : "New Providence, NJ",
            "web" : "http://www.nobell.org"
          },
          "portrait_id" : 1,
          "addresses" : [
            { "type" : "work", "street" : "229 W 43rd St.", "city" : "New York", "zip_code" : "10036" }
          ],
          "phones" : [
            { "type" : "work", "number" : "1-866-237-8815 x8015" }
          ],
          "emails" : [
            { "type" : "work", "address" : "gary.murakami@mongodb.com" },
            { "type" : "home", "address" : "gjm@nobell.org" }
          ]
        }
    • 51. Health Care Use Cases
    • 52. 360-Degree Patient View
        • Healthcare provider networks have massive amounts of patient data
            - Both structured and unstructured
            - Basic patient information
            - Lab results
            - MRI images
        • Centralization of data needed
            - Aggregation of all the data in one repository
        • Analytics
    • 53. Population Management for At-Risk Demographics
        • Certain populations are known to be prone to certain diseases.
        • By analyzing data, insurers can help people take preventative measures, for example by reminding them to get regularly scheduled colonoscopies.
        • This helps insurers reduce costs and expand margins.
    • 54. Lab Data Management and Analytics
        • Strain on traditional technological systems:
            - Rise in the number of tests conducted
            - Rise in the variety of data collected
            - Lack of flexibility
        • With MongoDB's flexible data model, providers of lab testing, genomics and clinical pathology can:
            - Ingest, store and analyze a variety of data types
            - coming from numerous sources, all in a single data store
        • This enables these companies to generate new insights and revenue streams
    • 55. Other use cases for MongoDB in healthcare include:
        • Fraud Detection
        • Remote Monitoring and Body Area Networks
        • Mobile Apps for Doctors and Nurses
        • Pandemic Detection with Real-Time Geospatial Analytics
        • Electronic Healthcare Records (EHR)
        • Advanced Auditing Systems for Compliance
        • Hospital Equipment Management and Optimization
    • 56. #MongoDB Thank You Massimo Brignoli massimo@mongodb.com @massimobrignoli Solutions Architect, MongoDB
