mongoDB at Visibiz
Upcoming SlideShare
Loading in...5
×
 

mongoDB at Visibiz

on

  • 2,541 views

Presentation describing why and how mongoDB is used at Visibiz, Inc.

Presentation describing why and how mongoDB is used at Visibiz, Inc.

Statistics

Views

Total Views
2,541
Views on SlideShare
1,728
Embed Views
813

Actions

Likes
7
Downloads
59
Comments
0

16 Embeds 813

http://www.10gen.com 645
http://www.mongodb.com 96
http://pelnodara.wordpress.com 33
http://www.visibiz.com 12
http://visibiz.com 11
http://faslil.mongodb.org 4
http://drupal1.10gen.cc 3
http://apy.mongodb.org 1
http://archive.10gen.com 1
http://w.mongodb.org 1
http://localhost:8080 1
https://freesslproxy.com 1
http://bristle.com 1
https://www.10gen.com 1
http://dowloads.mongodb.org 1
https://live.mongodb.com 1
More...

Accessibility

Categories

Upload Details

Uploaded via as Apple Keynote

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment
  • thanks for joining us today, i am mike brocious, tech lead here at visibiz, and \ni’m gonna take you through a brief discussion of why we chose mongodb and how it fits into our application\n\nTurn off all message notifications\n- tweet deck\n- gmail\n- skype\n\n
  • why we chose mongo as primary datastore\n\nNot a sales pitch - Can’t tell you whether it’s right for you\n\nHow we’ve fit it into our architecture\n\n
  • startup\nbeen around about a year\n\nour application is in the social CRM market space - take traditional CRM features, throw in social networks, email and some intelligence...we help you understand your network of contacts, and prospects to help you during the sales cycle\n\n\n
  • a little more context about the application we’re developing it in the Grails framework from SpringSource\nso grails application hosted inside of tomcat, front-ended by apache with mongo, recently added elastic search as search engine\nall hosted in the amazon ec2 cloud\n
  • a lot factors went into the decision, but there are two worth noting here\n-we want our customers to be able to extend the core domain objects\n we also want them to be able to define their own domain objects and plug them into our application\n-the other point is scalability....important to us that our datastore be prepared to scale up and scale out as necessary\n\n
  • so lets talk a little more about extensibility\nwe provide core objects with a set of attributes\nto help customers to fit our application into their own workflow, we want customers to be able to extend those objects with their own attributes\nso what this means is that a person object ........\n
  • C#1 is interested in tracking the employment history of people that are in the network\nC#2 need to know are they male or female and what they do in their spare time, what are their interests, what are their hobbies\n\nthose attributes add meaning for those customers based on their workflow, their marketspace, their targets\nobviously a trivial example, but it’s meant to give you a feeling for what we want to want to be able to offer in our application\n
  • no real business logic, just the ability to store, associate and retrieve their own objects\n\n
  • Second concern that i mentioned earlier\nWanted to make sure that the datastore that we chose was designed with scalability in mind\nI mentioned that our application is dealing with social networks. as most of you are probably a member of one or more networks, be it twitter, facebook, linkedin, there’s a ton of data out there related to social networks. but even beyond the social networks, we’re going to be getting into email, blogs, articles, and whatever else is on the web that can provide information about your network....so we’re going to be housing a lot of data...\nwe want scaling to be easy, shouldn’t be rocket science....want to focus on developing the application, and not administering/supporting db clusters\n
  • So...given that criteria a backdrop, we looked at our options: relational, couple of NoSQL / schema-free datastores\nnot exhaustive evaluation, but enough to feel comfortable....obviously we chose mongo, or we wouldn’t be having this conversation\n\nmakes it that much easier to support extensible objects...lots of other challenges, but at least the database is not one of them\nas a developer, you deal with your domain data on a daily basis....it’s so nice to see the data presented in a way that easy for humans to understand\n
  • So...given that criteria a backdrop, we looked at our options: relational, couple of NoSQL / schema-free datastores\nnot exhaustive evaluation, but enough to feel comfortable....obviously we chose mongo, or we wouldn’t be having this conversation\n\nmakes it that much easier to support extensible objects...lots of other challenges, but at least the database is not one of them\nas a developer, you deal with your domain data on a daily basis....it’s so nice to see the data presented in a way that easy for humans to understand\n
  • \n
  • Had a design roughed out in relational model\n- team member had implemented something similar in previous life\n\n\n
  • \n
  • of course the name ‘mongo’ comes from humongous so that’s a pretty good indication that scalability is built-in\n\nfor us, JSON is new XML, it’s everywhere\n\nso that’s one thing that’s good about JSON, the other thing is....\n\n(drill into this a little)\n\n
  • of course the name ‘mongo’ comes from humongous so that’s a pretty good indication that scalability is built-in\n\nfor us, JSON is new XML, it’s everywhere\n\nso that’s one thing that’s good about JSON, the other thing is....\n\n(drill into this a little)\n\n
  • Take a look at the major components of our application\n\nHad already chosen Grails as app framework\n\nwhat this really means....\n
  • +Can move really fast, fewer moving parts to develop/maintain\n+Good to get application going, plan on refactoring \n\n
  • 10gen JIRA list is active\n\nClear from mongo forums, etc. that mongo is very active product.\n- not quite as sure with some of the other projects\n\n- not alone\n\n
  • of course, nothing is perfect, life is a bunch of trade-offs\n\nyou don’t have transactions or really complex queries....JOINs, nested SELECTs\n\nlack of transactions was the thing I was most nervous about. have to go into with your eyes wide open, have to understand how to minimize the risk, and reduce the impact.\n
  • \n
  • \n
  • the name 'things' represents that this collection will hold different object types\nhaving all Things in one collection allows us to query across all Things\ngood use of Mongo's 'schema free' document store\n show example thing (JSON)\n\n
  • \n
  • \n
  • \n
  • first cut: id of thing on left, thing on right and label\nwe were still thinking in relational terms\n\nwill be reading relationships much more often than creating new ones, so slow queries is definitely a bad thing\n\n
  • \n
  • \n
  • \n
  • \n
  • hybrid solution - gives us best of both worlds\n- insertion integrity\n- query performance\ndownside:\nnow inserting/updating a rel is a two-step process: update “rels” collection, then push the changes to the related things\nif that second update fails, we can always rebuild those relationships from the rels collection.\n\n
  • hybrid solution - gives us best of both worlds\n- insertion integrity\n- query performance\ndownside:\nnow inserting/updating a rel is a two-step process: update “rels” collection, then push the changes to the related things\nif that second update fails, we can always rebuild those relationships from the rels collection.\n\n
  • hybrid solution - gives us best of both worlds\n- insertion integrity\n- query performance\ndownside:\nnow inserting/updating a rel is a two-step process: update “rels” collection, then push the changes to the related things\nif that second update fails, we can always rebuild those relationships from the rels collection.\n\n
  • hybrid solution - gives us best of both worlds\n- insertion integrity\n- query performance\ndownside:\nnow inserting/updating a rel is a two-step process: update “rels” collection, then push the changes to the related things\nif that second update fails, we can always rebuild those relationships from the rels collection.\n\n
  • hybrid solution - gives us best of both worlds\n- insertion integrity\n- query performance\ndownside:\nnow inserting/updating a rel is a two-step process: update “rels” collection, then push the changes to the related things\nif that second update fails, we can always rebuild those relationships from the rels collection.\n\n
  • hybrid solution - gives us best of both worlds\n- insertion integrity\n- query performance\ndownside:\nnow inserting/updating a rel is a two-step process: update “rels” collection, then push the changes to the related things\nif that second update fails, we can always rebuild those relationships from the rels collection.\n\n
  • the last major thing i want to talk about came about because we’re supporting a RESTful interface and mongo doesn’t enforce datatypes.\n\nmongo is not going to stop you from storing any kind of data in any field - it’s ‘schema-free’.\nenter JSON schema - provides a way to describe an object using JSON syntax\n\n
  • analogous to DTD or XML schema\n
  • \n
  • one last point....the ease with which you can create collections, and the fact that you can store different looking objects in the same collection opens up possibilities\n\n\n
  • \n
  • \n

mongoDB at Visibiz mongoDB at Visibiz Presentation Transcript

  • mongoDB at VisibizWhy and how we’re using mongoDB in our application Mike Brocious Tech Lead Visibiz, Inc.
  • Why We’re Here• Discuss why we chose mongoDB at Visibiz• Show how we’re using it• Highlight what our experience has been so far• Made mistakes Learned along the way• Not a sales pitch
  • About Us• Startup• Founded April, 2010• 8 employees• Located just outside of Philadelphia, PA• Social CRM • ‘know your net work...sell better’• Currently in private beta
  • Physical ApacheComponents Tomcat - Grails application Elastic mongoDB Search Unix Amazon EC2
  • Why MongoDB?• Application requirements for extensibility • customer extensible objects • customer definable objects• Scalability
  • Extensible Objects• We provide core objects • person, company, prospect, relationship, etc.• Customer can add their own attributes • including relationships with other objects• A ‘person’ object for customer #1 may not look like a person object for customer #2
  • CUSTOMER #1 CUSTOMER #2• Person • Person • name • name • address <= Core Attributes => • address • date of birth • date of birth • employment history • gender • list: <= Customer Defined => • hobbies • company Attributes • list: • begin date • name • end date • job title
  • Customer Definable Objects• Give customers ability to define their own objects • attribute name, type and value • relationships with other objects
  • Scalability• Will eventually have large amount of data • social net works, blogs, articles, etc. • email• Scaling should be (relatively) easy
  • What We Liked About MongoDB
  • What We Liked About MongoDB• Dynamic schemas (“schema-free”) • fit well with our extensibility requirements
  • What We Liked About MongoDB• Dynamic schemas (“schema-free”) • fit well with our extensibility requirements• Document datastore • easy to understand, visualize objects • cmd shell, log files, debuggers, viewers
  • Schema Flexibility Comparison• Relational vs. document-based• Support extensible ‘person’ object
  • Relational Database Example Date of Person Id Name Address Birth 1 Mike Brocious Malvern, PA 6/10/1984 2 Doug Smith Philadelphia, PA 2/24/1980Person Id User Defined 1 User Defined 2 User Defined 3 User Defined 4 ... 1 Good Burger 3/20/1998 3/31/2010 Flipper 1 Visibiz 4/1/2010 <null> Tech Lead 2 10gen 7/9/2006 3/18/2009 Engineer Column Name TypeUser Defined 1 Company StringUser Defined 2 Begin Date DateUser Defined 3 End Date DateUser Defined 4 Job Title String
  • Document Datastore Example{ "_id" : ObjectId("4d87a4d32739a23b3c834b67"), "name" : "Mike Brocious", "address" : "Malvern, PA", "dateOfBirth" : "Sun Jun 10 1984 00:00:00 GMT-0400 (EDT)", "employmentHistory" : [ { "company" : "Good Burger", "beginDate" : "Sun Mar 20 1998 00:00:00 GMT-0400 (EDT)", "endDate" : "Thu Mar 31 2010 00:00:00 GMT-0400 (EDT)", "jobTitle" : "Flipper" }, { "company" : "Visibiz", "beginDate" : "Fri Apr 01 2010 00:00:00 GMT-0400 (EDT)", "jobTitle" : "Engineer" } ]}
  • Document Datastore Example{ "_id" : ObjectId("4d87a4d32739a23b3c834b67"), "name" : "Mike Brocious", "address" : "Malvern, PA", "dateOfBirth" : "Sun Jun 10 1984 00:00:00 GMT-0400 (EDT)", "employmentHistory" : [ { "company" : "Good Burger", "beginDate" : "Sun Mar 20 1998 00:00:00 GMT-0400 (EDT)", "endDate" : "Thu Mar 31 2010 00:00:00 GMT-0400 (EDT)", "jobTitle" : "Flipper" }, { "company" : "Visibiz", "beginDate" : "Fri Apr 01 2010 00:00:00 GMT-0400 (EDT)", "jobTitle" : "Engineer" } ]}{ "_id": ObjectId("4d87a9732739a23b3c834b6d"), "name": "Julie Harper", "address": "Ocean City, NJ", "dateOfBirth": "Thu Sep 22 1994 00:00:00 GMT-0400 (EDT)", "gender": "F", "hobbies": [ "painting", "skateboarding", "reading", "cooking" ]}
  • More MongoDB Goodness
  • More MongoDB Goodness• Scalability • replication - super easy to setup • sharding
  • More MongoDB Goodness• Scalability • replication - super easy to setup • sharding• Uses JSON • “the new XML” • easy to build, parse, read
  • JSON• JSON * (MongoDB + Groovy + Grails + JavaScript) == LOVE• MongoDB == BSON/JSON• Groovy == Map• Grails == JSON converter/builder• JavaScript = Well...it’s the JS in JSON
  • JSON-eze• Every layer of the architecture understands common format Presentation• Not forced into transforming data View • DB result sets <--> DO/DTO J S Controller • DO/DTO <--> view O Services • view <--> presentation N Data Access• Enables rapid development DB
  • What Else We Liked About MongoDB• Active product development• Community support • plugins, drivers, viewers• 10gen support • developers, CEO very active on forum, JIRA • MongoDB conferences, webcasts• Deep list of sites already using MongoDB
  • Be Aware Of...• No transactions • mitigation: schema design + atomic document updates• No really complex queries (JOINs, nested SELECTs) • mitigation: schema design• Felt we could minimize the impact
  • • OK, so that’s WHY• Let’s get into the HOW
  • Application Layers Browser Controller Application Svcs (bus. logic)Convenience wrapper Grails ORM plugin around Java driver mongoSvc GORM from SpringSource mongoDB Java driver MongoDB
  • Things• Our main collection: “things”• Domain objects (person, company, note, event, etc.)• Customer-defined objects• Takes advantage of ‘s chema-free’ nature of mongoDB • able to easily query across all thing type
  • ‘Person’ Thing{ "_id" : ObjectId("4d10f60f39fe153be3316478"), "thingType" : "person", "name" : { "firstName" : "Gail", "lastName" : "Staudt" }, "owner" : ObjectId("4d10f47e39fe153b552e6478"), "createdDate" : "Tue Dec 21 2010 13:46:39 GMT-0500 (EST)", "tags" : [ { "tag" : "java", "score" : 2 }, { "tag" : "clojure", "score" : 2 } ], "addresses" : [ { "addr1" : "40 Lloyd Ave", "city" : "Malvern", "state" : "PA" } ], "emailAddresses" : [ { "type" : "work", "value" : "staudt@foo.com" } ]}
  • ‘Company’ Thing{ "_id" : ObjectId("4d10ffe939fe153b193b6478"), "thingType" : "company", "name" : "We Be Coders, Inc." "owner" : ObjectId("4d10f60e39fe153b20316478"), "createdDate" : "Tue Dec 21 2010 14:28:41 GMT-0500 (EST)", "primaryBusiness" : "Software development consulting" "tags" : [ { "tag" : "software", "score" : 2 }, { "tag" : "development", "score" : 7 }, { "tag" : "clojure", "score" : 3 }, { "tag" : "java", "score" : 5 } ]}
  • ‘Company’ Thing { "_id" : ObjectId("4d10ffe939fe153b193b6478"), "thingType" : "company", "name" : "We Be Coders, Inc." "owner" : ObjectId("4d10f60e39fe153b20316478"), "createdDate" : "Tue Dec 21 2010 14:28:41 GMT-0500 (EST)", "primaryBusiness" : "Software development consulting" "tags" : [ { "tag" : "software", "score" : 2 }, { "tag" : "development", "score" : 7 }, { "tag" : "clojure", "score" : 3 }, { "tag" : "java", "score" : 5 } ] }• Find all things I own tagged with ‘java’> db.things.find({owner: ObjectId(“4d10f60e39fe153b20316478”), tags.tag: “java”})
  • Relationships - Act One• Connections bet ween things • employment history • co-workers, friends
  • Relationships - Act One• Connections bet ween things • employment history • co-workers, friends• First cut: separate documents in the things collection • essentially a mapping/join table
  • Relationships - Act OneA rel B
  • Relationships - Act One A rel B• Single atomic insert (good! no transaction concerns)• Made retrieval inefficient (no JOINs, nested SELECTs)
  • Relationships - Act OneFind all things I’m related tome = ObjectId(“4d10f60e39fe153b20316478”)// Get all the relationships I’m involved inrelated = db.things.find({$or: [left.id: me, right.id:me]})// Build up a list of ids I’m related toi = 0relatedId = new Array()for(relationship in related) if (relationship.left.id == me) { relatedIds[i++] = relationship.right.id } else { relatedIds[i++] = relationship.left.id })// Now get the thingsrelatedThings = db.things.find({_id: { $in : relatedIds } })
  • Relationships - Act Two• Moved relationships to nested document inside thing • natural approach for document-based datastores A B
  • Relationships - Act Two• Queries easy and fast - awesome! // Get all the things I’m related to relatedThings = db.things.find({$or: [rels.left.id: me, rels.right.id:me]})
  • Relationships - Act Two• Queries easy and fast - awesome! // Get all the things I’m related to relatedThings = db.things.find({$or: [rels.left.id: me, rels.right.id:me]})• Two updates required to insert new relationship • relationship stored in both things • bad! - transaction concerns
  • Relationships - Act Three (final?)• Best of both worlds
  • Relationships - Act Three (final?)• Best of both worlds• Separate “rels” collection A rel B • master source of relationship details • atomic insert (good!) • unique index (good!)
  • Relationships - Act Three (final?)• Best of both worlds• Separate “rels” collection A rel B • master source of relationship details • atomic insert (good!) • unique index (good!) • Nested “rels” documents in things A B • easy, fast queries (good!)
  • Relationships - Act Three (final?)• Best of both worlds insert / delete• Separate “rels” collection A rel B • master source of relationship details • atomic insert (good!) • unique index (good!) • Nested “rels” documents in things A B • easy, fast queries (good!)
  • Relationships - Act Three (final?)• Best of both worlds insert / delete• Separate “rels” collection A rel B • master source of relationship details • atomic insert (good!) • unique index (good!) • Nested “rels” documents in things A B • easy, fast queries (good!)
  • REST• RESTful interface on top of our services• Expose services for internal and external development (API)
  • REST • RESTful interface on top of our services • Expose services for internal and external development (API)• Need a way to validate data to prevent “garbage in”• JSON schema • http://json-schema.org/
  • person = [ Person Schema name: "person", type: "object", extends: "contact", properties: [ name: [type: "object", title: "Name", properties: [ firstName: [type: "string", title: "First Name", optional:true], middleName: [type: "string", title: "Middle Name", optional: true], lastName: [type: "string", title: "Last Name", optional: true], suffix: [type: "string", title: "Suffix", optional: true]]]]]
  • Schemas• Separate “schemas” MongoDB collection• Every “thing” passes through validation before being stored• Uses: • validate incoming data • track customer-specific schema extensions • generate UI to display things
  • Other MongoDB Collections• workflow status log• query log• staging area for har vested data• workgroups• duplicates scan log
  • Summary• Why mongoDB works for us • Schema-free • Document-oriented • JSON • Scalable • Active product
  • Questions?