mongoDB at VisibizWhy and how we’re using mongoDB in our application                  Mike Brocious                    Tec...
Why We’re Here•   Discuss why we chose mongoDB at Visibiz•   Show how we’re using it•   Highlight what our experience has ...
About Us•   Startup•   Founded April, 2010•   8 employees•   Located just outside of Philadelphia, PA•   Social CRM      •...
Physical                    ApacheComponents                     Tomcat                        -                Grails app...
Why MongoDB?•   Application requirements for extensibility      •   customer extensible objects      •   customer definable...
Extensible Objects•   We provide core objects      •   person, company, prospect, relationship, etc.•   Customer can add t...
CUSTOMER #1                                CUSTOMER #2•   Person                                          •   Person    • ...
Customer Definable Objects•   Give customers ability to define their own objects      •   attribute name, type and value    ...
Scalability•   Will eventually have large amount of data    •   social net works, blogs, articles, etc.    •   email•   Sc...
What We Liked About MongoDB
What We Liked About MongoDB•   Dynamic schemas (“schema-free”)      •   fit well with our extensibility requirements
What We Liked About MongoDB•   Dynamic schemas (“schema-free”)      •   fit well with our extensibility requirements•   Doc...
Schema Flexibility Comparison•   Relational vs. document-based•   Support extensible ‘person’ object
Relational Database Example                                                                               Date of        P...
Document Datastore Example{    "_id" : ObjectId("4d87a4d32739a23b3c834b67"),    "name" : "Mike Brocious",    "address" : "...
Document Datastore Example{     "_id" : ObjectId("4d87a4d32739a23b3c834b67"),     "name" : "Mike Brocious",     "address" ...
More MongoDB Goodness
More MongoDB Goodness•   Scalability    •   replication - super easy to setup    •   sharding
More MongoDB Goodness•   Scalability    •   replication - super easy to setup    •   sharding•   Uses JSON    •   “the new...
JSON•   JSON * (MongoDB + Groovy + Grails + JavaScript) == LOVE•   MongoDB == BSON/JSON•   Groovy == Map•   Grails == JSON...
JSON-eze•   Every layer of the architecture understands common format                                                 Pres...
What Else We Liked About MongoDB•   Active product development•   Community support    •   plugins, drivers, viewers•   10...
Be Aware Of...•   No transactions    •   mitigation: schema design + atomic document updates•   No really complex queries ...
•   OK, so that’s WHY•   Let’s get into the HOW
Application    Layers                      Browser                               Controller                      Applicati...
Things•   Our main collection: “things”•   Domain objects (person, company, note, event, etc.)•   Customer-defined objects•...
‘Person’ Thing{   "_id" : ObjectId("4d10f60f39fe153be3316478"),    "thingType" : "person",    "name" : {         "firstNam...
‘Company’ Thing{    "_id" : ObjectId("4d10ffe939fe153b193b6478"),    "thingType" : "company",    "name" : "We Be Coders, I...
‘Company’ Thing    {        "_id" : ObjectId("4d10ffe939fe153b193b6478"),        "thingType" : "company",        "name" : ...
Relationships - Act One•   Connections bet ween things    •   employment history    •   co-workers, friends
Relationships - Act One•   Connections bet ween things    •   employment history    •   co-workers, friends•   First cut: ...
Relationships - Act OneA             rel             B
Relationships - Act One           A                rel              B•   Single atomic insert (good! no transaction concer...
Relationships - Act OneFind all things I’m related tome = ObjectId(“4d10f60e39fe153b20316478”)// Get all the relationships...
Relationships - Act Two•   Moved relationships to nested document inside thing    •   natural approach for document-based ...
Relationships - Act Two•   Queries easy and fast - awesome!    // Get all the things I’m related to    relatedThings = db....
Relationships - Act Two•   Queries easy and fast - awesome!    // Get all the things I’m related to    relatedThings = db....
Relationships - Act Three (final?)•   Best of both worlds
Relationships - Act Three (final?)•   Best of both worlds•   Separate “rels” collection                                    ...
Relationships - Act Three (final?)•       Best of both worlds•       Separate “rels” collection                            ...
Relationships - Act Three (final?)•       Best of both worlds                                 insert / delete•       Separa...
Relationships - Act Three (final?)•       Best of both worlds                                 insert / delete•       Separa...
REST•   RESTful interface on top of our    services•   Expose services for internal and    external development (API)
REST    •       RESTful interface on top of our            services    •       Expose services for internal and           ...
person = [                          Person Schema    name: "person",    type: "object",    extends: "contact",   propertie...
Schemas•   Separate “schemas” MongoDB collection•   Every “thing” passes through validation before being stored•   Uses:  ...
Other MongoDB Collections•   workflow status log•   query log•   staging area for har vested data•   workgroups•   duplicat...
Summary•   Why mongoDB works for us    •   Schema-free    •   Document-oriented    •   JSON    •   Scalable    •   Active ...
Questions?
mongoDB at Visibiz
mongoDB at Visibiz
mongoDB at Visibiz
mongoDB at Visibiz
mongoDB at Visibiz
Upcoming SlideShare
Loading in …5
×

mongoDB at Visibiz

3,206 views

Published on

Presentation describing why and how mongoDB is used at Visibiz, Inc.

Published in: Technology
0 Comments
8 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
3,206
On SlideShare
0
From Embeds
0
Number of Embeds
1,122
Actions
Shares
0
Downloads
75
Comments
0
Likes
8
Embeds 0
No embeds

No notes for slide
  • thanks for joining us today, i am mike brocious, tech lead here at visibiz, and \ni’m gonna take you through a brief discussion of why we chose mongodb and how it fits into our application\n\nTurn off all message notifications\n- tweet deck\n- gmail\n- skype\n\n
  • why we chose mongo as primary datastore\n\nNot a sales pitch - Can’t tell you whether it’s right for you\n\nHow we’ve fit it into our architecture\n\n
  • startup\nbeen around about a year\n\nour application is in the social CRM market space - take traditional CRM features, throw in social networks, email and some intelligence...we help you understand your network of contacts, and prospects to help you during the sales cycle\n\n\n
  • a little more context about the application we’re developing it in the Grails framework from SpringSource\nso grails application hosted inside of tomcat, front-ended by apache with mongo, recently added elastic search as search engine\nall hosted in the amazon ec2 cloud\n
  • a lot factors went into the decision, but there are two worth noting here\n-we want our customers to be able to extend the core domain objects\n we also want them to be able to define their own domain objects and plug them into our application\n-the other point is scalability....important to us that our datastore be prepared to scale up and scale out as necessary\n\n
  • so lets talk a little more about extensibility\nwe provide core objects with a set of attributes\nto help customers to fit our application into their own workflow, we want customers to be able to extend those objects with their own attributes\nso what this means is that a person object ........\n
  • C#1 is interested in tracking the employment history of people that are in the network\nC#2 need to know are they male or female and what they do in their spare time, what are their interests, what are their hobbies\n\nthose attributes add meaning for those customers based on their workflow, their marketspace, their targets\nobviously a trivial example, but it’s meant to give you a feeling for what we want to want to be able to offer in our application\n
  • no real business logic, just the ability to store, associate and retrieve their own objects\n\n
  • Second concern that i mentioned earlier\nWanted to make sure that the datastore that we chose was designed with scalability in mind\nI mentioned that our application is dealing with social networks. as most of you are probably a member of one or more networks, be it twitter, facebook, linkedin, there’s a ton of data out there related to social networks. but even beyond the social networks, we’re going to be getting into email, blogs, articles, and whatever else is on the web that can provide information about your network....so we’re going to be housing a lot of data...\nwe want scaling to be easy, shouldn’t be rocket science....want to focus on developing the application, and not administering/supporting db clusters\n
  • So...given that criteria a backdrop, we looked at our options: relational, couple of NoSQL / schema-free datastores\nnot exhaustive evaluation, but enough to feel comfortable....obviously we chose mongo, or we wouldn’t be having this conversation\n\nmakes it that much easier to support extensible objects...lots of other challenges, but at least the database is not one of them\nas a developer, you deal with your domain data on a daily basis....it’s so nice to see the data presented in a way that easy for humans to understand\n
  • So...given that criteria a backdrop, we looked at our options: relational, couple of NoSQL / schema-free datastores\nnot exhaustive evaluation, but enough to feel comfortable....obviously we chose mongo, or we wouldn’t be having this conversation\n\nmakes it that much easier to support extensible objects...lots of other challenges, but at least the database is not one of them\nas a developer, you deal with your domain data on a daily basis....it’s so nice to see the data presented in a way that easy for humans to understand\n
  • \n
  • Had a design roughed out in relational model\n- team member had implemented something similar in previous life\n\n\n
  • \n
  • of course the name ‘mongo’ comes from humongous so that’s a pretty good indication that scalability is built-in\n\nfor us, JSON is new XML, it’s everywhere\n\nso that’s one thing that’s good about JSON, the other thing is....\n\n(drill into this a little)\n\n
  • of course the name ‘mongo’ comes from humongous so that’s a pretty good indication that scalability is built-in\n\nfor us, JSON is new XML, it’s everywhere\n\nso that’s one thing that’s good about JSON, the other thing is....\n\n(drill into this a little)\n\n
  • Take a look at the major components of our application\n\nHad already chosen Grails as app framework\n\nwhat this really means....\n
  • +Can move really fast, fewer moving parts to develop/maintain\n+Good to get application going, plan on refactoring \n\n
  • 10gen JIRA list is active\n\nClear from mongo forums, etc. that mongo is very active product.\n- not quite as sure with some of the other projects\n\n- not alone\n\n
  • of course, nothing is perfect, life is a bunch of trade-offs\n\nyou don’t have transactions or really complex queries....JOINs, nested SELECTs\n\nlack of transactions was the thing I was most nervous about. have to go into with your eyes wide open, have to understand how to minimize the risk, and reduce the impact.\n
  • \n
  • \n
  • the name 'things' represents that this collection will hold different object types\nhaving all Things in one collection allows us to query across all Things\ngood use of Mongo's 'schema free' document store\n show example thing (JSON)\n\n
  • \n
  • \n
  • \n
  • first cut: id of thing on left, thing on right and label\nwe were still thinking in relational terms\n\nwill be reading relationships much more often than creating new ones, so slow queries is definitely a bad thing\n\n
  • \n
  • \n
  • \n
  • \n
  • hybrid solution - gives us best of both worlds\n- insertion integrity\n- query performance\ndownside:\nnow inserting/updating a rel is a two-step process: update “rels” collection, then push the changes to the related things\nif that second update fails, we can always rebuild those relationships from the rels collection.\n\n
  • hybrid solution - gives us best of both worlds\n- insertion integrity\n- query performance\ndownside:\nnow inserting/updating a rel is a two-step process: update “rels” collection, then push the changes to the related things\nif that second update fails, we can always rebuild those relationships from the rels collection.\n\n
  • hybrid solution - gives us best of both worlds\n- insertion integrity\n- query performance\ndownside:\nnow inserting/updating a rel is a two-step process: update “rels” collection, then push the changes to the related things\nif that second update fails, we can always rebuild those relationships from the rels collection.\n\n
  • hybrid solution - gives us best of both worlds\n- insertion integrity\n- query performance\ndownside:\nnow inserting/updating a rel is a two-step process: update “rels” collection, then push the changes to the related things\nif that second update fails, we can always rebuild those relationships from the rels collection.\n\n
  • hybrid solution - gives us best of both worlds\n- insertion integrity\n- query performance\ndownside:\nnow inserting/updating a rel is a two-step process: update “rels” collection, then push the changes to the related things\nif that second update fails, we can always rebuild those relationships from the rels collection.\n\n
  • hybrid solution - gives us best of both worlds\n- insertion integrity\n- query performance\ndownside:\nnow inserting/updating a rel is a two-step process: update “rels” collection, then push the changes to the related things\nif that second update fails, we can always rebuild those relationships from the rels collection.\n\n
  • the last major thing i want to talk about came about because we’re supporting a RESTful interface and mongo doesn’t enforce datatypes.\n\nmongo is not going to stop you from storing any kind of data in any field - it’s ‘schema-free’.\nenter JSON schema - provides a way to describe an object using JSON syntax\n\n
  • analogous to DTD or XML schema\n
  • \n
  • one last point....the ease with which you can create collections, and the fact that you can store different looking objects in the same collection opens up possibilities\n\n\n
  • \n
  • \n
  • mongoDB at Visibiz

    1. 1. mongoDB at VisibizWhy and how we’re using mongoDB in our application Mike Brocious Tech Lead Visibiz, Inc.
    2. 2. Why We’re Here• Discuss why we chose mongoDB at Visibiz• Show how we’re using it• Highlight what our experience has been so far• Made mistakes Learned along the way• Not a sales pitch
    3. 3. About Us• Startup• Founded April, 2010• 8 employees• Located just outside of Philadelphia, PA• Social CRM • ‘know your net work...sell better’• Currently in private beta
    4. 4. Physical ApacheComponents Tomcat - Grails application Elastic mongoDB Search Unix Amazon EC2
    5. 5. Why MongoDB?• Application requirements for extensibility • customer extensible objects • customer definable objects• Scalability
    6. 6. Extensible Objects• We provide core objects • person, company, prospect, relationship, etc.• Customer can add their own attributes • including relationships with other objects• A ‘person’ object for customer #1 may not look like a person object for customer #2
    7. 7. CUSTOMER #1 CUSTOMER #2• Person • Person • name • name • address <= Core Attributes => • address • date of birth • date of birth • employment history • gender • list: <= Customer Defined => • hobbies • company Attributes • list: • begin date • name • end date • job title
    8. 8. Customer Definable Objects• Give customers ability to define their own objects • attribute name, type and value • relationships with other objects
    9. 9. Scalability• Will eventually have large amount of data • social net works, blogs, articles, etc. • email• Scaling should be (relatively) easy
    10. 10. What We Liked About MongoDB
    11. 11. What We Liked About MongoDB• Dynamic schemas (“schema-free”) • fit well with our extensibility requirements
    12. 12. What We Liked About MongoDB• Dynamic schemas (“schema-free”) • fit well with our extensibility requirements• Document datastore • easy to understand, visualize objects • cmd shell, log files, debuggers, viewers
    13. 13. Schema Flexibility Comparison• Relational vs. document-based• Support extensible ‘person’ object
    14. 14. Relational Database Example Date of Person Id Name Address Birth 1 Mike Brocious Malvern, PA 6/10/1984 2 Doug Smith Philadelphia, PA 2/24/1980Person Id User Defined 1 User Defined 2 User Defined 3 User Defined 4 ... 1 Good Burger 3/20/1998 3/31/2010 Flipper 1 Visibiz 4/1/2010 <null> Tech Lead 2 10gen 7/9/2006 3/18/2009 Engineer Column Name TypeUser Defined 1 Company StringUser Defined 2 Begin Date DateUser Defined 3 End Date DateUser Defined 4 Job Title String
    15. 15. Document Datastore Example{ "_id" : ObjectId("4d87a4d32739a23b3c834b67"), "name" : "Mike Brocious", "address" : "Malvern, PA", "dateOfBirth" : "Sun Jun 10 1984 00:00:00 GMT-0400 (EDT)", "employmentHistory" : [ { "company" : "Good Burger", "beginDate" : "Sun Mar 20 1998 00:00:00 GMT-0400 (EDT)", "endDate" : "Thu Mar 31 2010 00:00:00 GMT-0400 (EDT)", "jobTitle" : "Flipper" }, { "company" : "Visibiz", "beginDate" : "Fri Apr 01 2010 00:00:00 GMT-0400 (EDT)", "jobTitle" : "Engineer" } ]}
    16. 16. Document Datastore Example{ "_id" : ObjectId("4d87a4d32739a23b3c834b67"), "name" : "Mike Brocious", "address" : "Malvern, PA", "dateOfBirth" : "Sun Jun 10 1984 00:00:00 GMT-0400 (EDT)", "employmentHistory" : [ { "company" : "Good Burger", "beginDate" : "Sun Mar 20 1998 00:00:00 GMT-0400 (EDT)", "endDate" : "Thu Mar 31 2010 00:00:00 GMT-0400 (EDT)", "jobTitle" : "Flipper" }, { "company" : "Visibiz", "beginDate" : "Fri Apr 01 2010 00:00:00 GMT-0400 (EDT)", "jobTitle" : "Engineer" } ]}{ "_id": ObjectId("4d87a9732739a23b3c834b6d"), "name": "Julie Harper", "address": "Ocean City, NJ", "dateOfBirth": "Thu Sep 22 1994 00:00:00 GMT-0400 (EDT)", "gender": "F", "hobbies": [ "painting", "skateboarding", "reading", "cooking" ]}
    17. 17. More MongoDB Goodness
    18. 18. More MongoDB Goodness• Scalability • replication - super easy to setup • sharding
    19. 19. More MongoDB Goodness• Scalability • replication - super easy to setup • sharding• Uses JSON • “the new XML” • easy to build, parse, read
    20. 20. JSON• JSON * (MongoDB + Groovy + Grails + JavaScript) == LOVE• MongoDB == BSON/JSON• Groovy == Map• Grails == JSON converter/builder• JavaScript = Well...it’s the JS in JSON
    21. 21. JSON-eze• Every layer of the architecture understands common format Presentation• Not forced into transforming data View • DB result sets <--> DO/DTO J S Controller • DO/DTO <--> view O Services • view <--> presentation N Data Access• Enables rapid development DB
    22. 22. What Else We Liked About MongoDB• Active product development• Community support • plugins, drivers, viewers• 10gen support • developers, CEO very active on forum, JIRA • MongoDB conferences, webcasts• Deep list of sites already using MongoDB
    23. 23. Be Aware Of...• No transactions • mitigation: schema design + atomic document updates• No really complex queries (JOINs, nested SELECTs) • mitigation: schema design• Felt we could minimize the impact
    24. 24. • OK, so that’s WHY• Let’s get into the HOW
    25. 25. Application Layers Browser Controller Application Svcs (bus. logic)Convenience wrapper Grails ORM plugin around Java driver mongoSvc GORM from SpringSource mongoDB Java driver MongoDB
    26. 26. Things• Our main collection: “things”• Domain objects (person, company, note, event, etc.)• Customer-defined objects• Takes advantage of ‘s chema-free’ nature of mongoDB • able to easily query across all thing type
    27. 27. ‘Person’ Thing{ "_id" : ObjectId("4d10f60f39fe153be3316478"), "thingType" : "person", "name" : { "firstName" : "Gail", "lastName" : "Staudt" }, "owner" : ObjectId("4d10f47e39fe153b552e6478"), "createdDate" : "Tue Dec 21 2010 13:46:39 GMT-0500 (EST)", "tags" : [ { "tag" : "java", "score" : 2 }, { "tag" : "clojure", "score" : 2 } ], "addresses" : [ { "addr1" : "40 Lloyd Ave", "city" : "Malvern", "state" : "PA" } ], "emailAddresses" : [ { "type" : "work", "value" : "staudt@foo.com" } ]}
    28. 28. ‘Company’ Thing{ "_id" : ObjectId("4d10ffe939fe153b193b6478"), "thingType" : "company", "name" : "We Be Coders, Inc." "owner" : ObjectId("4d10f60e39fe153b20316478"), "createdDate" : "Tue Dec 21 2010 14:28:41 GMT-0500 (EST)", "primaryBusiness" : "Software development consulting" "tags" : [ { "tag" : "software", "score" : 2 }, { "tag" : "development", "score" : 7 }, { "tag" : "clojure", "score" : 3 }, { "tag" : "java", "score" : 5 } ]}
    29. 29. ‘Company’ Thing { "_id" : ObjectId("4d10ffe939fe153b193b6478"), "thingType" : "company", "name" : "We Be Coders, Inc." "owner" : ObjectId("4d10f60e39fe153b20316478"), "createdDate" : "Tue Dec 21 2010 14:28:41 GMT-0500 (EST)", "primaryBusiness" : "Software development consulting" "tags" : [ { "tag" : "software", "score" : 2 }, { "tag" : "development", "score" : 7 }, { "tag" : "clojure", "score" : 3 }, { "tag" : "java", "score" : 5 } ] }• Find all things I own tagged with ‘java’> db.things.find({owner: ObjectId(“4d10f60e39fe153b20316478”), tags.tag: “java”})
    30. 30. Relationships - Act One• Connections bet ween things • employment history • co-workers, friends
    31. 31. Relationships - Act One• Connections bet ween things • employment history • co-workers, friends• First cut: separate documents in the things collection • essentially a mapping/join table
    32. 32. Relationships - Act OneA rel B
    33. 33. Relationships - Act One A rel B• Single atomic insert (good! no transaction concerns)• Made retrieval inefficient (no JOINs, nested SELECTs)
    34. 34. Relationships - Act OneFind all things I’m related tome = ObjectId(“4d10f60e39fe153b20316478”)// Get all the relationships I’m involved inrelated = db.things.find({$or: [left.id: me, right.id:me]})// Build up a list of ids I’m related toi = 0relatedId = new Array()for(relationship in related) if (relationship.left.id == me) { relatedIds[i++] = relationship.right.id } else { relatedIds[i++] = relationship.left.id })// Now get the thingsrelatedThings = db.things.find({_id: { $in : relatedIds } })
    35. 35. Relationships - Act Two• Moved relationships to nested document inside thing • natural approach for document-based datastores A B
    36. 36. Relationships - Act Two• Queries easy and fast - awesome! // Get all the things I’m related to relatedThings = db.things.find({$or: [rels.left.id: me, rels.right.id:me]})
    37. 37. Relationships - Act Two• Queries easy and fast - awesome! // Get all the things I’m related to relatedThings = db.things.find({$or: [rels.left.id: me, rels.right.id:me]})• Two updates required to insert new relationship • relationship stored in both things • bad! - transaction concerns
    38. 38. Relationships - Act Three (final?)• Best of both worlds
    39. 39. Relationships - Act Three (final?)• Best of both worlds• Separate “rels” collection A rel B • master source of relationship details • atomic insert (good!) • unique index (good!)
    40. 40. Relationships - Act Three (final?)• Best of both worlds• Separate “rels” collection A rel B • master source of relationship details • atomic insert (good!) • unique index (good!) • Nested “rels” documents in things A B • easy, fast queries (good!)
    41. 41. Relationships - Act Three (final?)• Best of both worlds insert / delete• Separate “rels” collection A rel B • master source of relationship details • atomic insert (good!) • unique index (good!) • Nested “rels” documents in things A B • easy, fast queries (good!)
    42. 42. Relationships - Act Three (final?)• Best of both worlds insert / delete• Separate “rels” collection A rel B • master source of relationship details • atomic insert (good!) • unique index (good!) • Nested “rels” documents in things A B • easy, fast queries (good!)
    43. 43. REST• RESTful interface on top of our services• Expose services for internal and external development (API)
    44. 44. REST • RESTful interface on top of our services • Expose services for internal and external development (API)• Need a way to validate data to prevent “garbage in”• JSON schema • http://json-schema.org/
    45. 45. person = [ Person Schema name: "person", type: "object", extends: "contact", properties: [ name: [type: "object", title: "Name", properties: [ firstName: [type: "string", title: "First Name", optional:true], middleName: [type: "string", title: "Middle Name", optional: true], lastName: [type: "string", title: "Last Name", optional: true], suffix: [type: "string", title: "Suffix", optional: true]]]]]
    46. 46. Schemas• Separate “schemas” MongoDB collection• Every “thing” passes through validation before being stored• Uses: • validate incoming data • track customer-specific schema extensions • generate UI to display things
    47. 47. Other MongoDB Collections• workflow status log• query log• staging area for har vested data• workgroups• duplicates scan log
    48. 48. Summary• Why mongoDB works for us • Schema-free • Document-oriented • JSON • Scalable • Active product
    49. 49. Questions?

    ×