MongoDB at Community Engine
About me

Lead platform engineer
mathieu.kempe@communityengine.com
@mathieukempe
Agenda

Brain dump on our experience with MongoDB
• Why NoSQL
• Why we chose MongoDB
• Moving away from a hybrid solution
• MongoDB and Amazon
• SOLR and MongoDB
• Ease of development
• Zero downtime database deployment
About Community Engine
• Social network based on locality and
  small business
• Launching in April
• Built on:
    • ASP.NET MVC 3
    • Amazon Web Services
    • MongoDB
    • SOLR
    • Mahout
    • …
NoSQL?

• How to store the large amounts of data required
  in social networking applications
• Data complexity: NoSQL stores handle hierarchical
  and graph data structures better
• Change management is always difficult with
  RDBMS
• Scaling
Why we chose MongoDB?

• Reviewed different products: RavenDB, CouchDB...
• Selected MongoDB because we had the best
  experience
   – Very easy to install and get started
   – Great developer experience
   – Replication very easy to set up
   – Good documentation
   – Much of the convenience of SQL: dynamic
     queries, indexing
Why Hybrid?

• Team had a lot of experience with SQL Server
  and Entity Framework
• Reporting
• Transactions
No more SQL Server
• Simplify our infrastructure
• Easier to Backup
• Better performance: no longer slowed down by SQL
  Server and by too many queries joined in the
  application
• Development speed
• Lower cost
Transaction?

We could work around transactions using atomic
  document updates and a good schema design

MongoDB supports atomic operations on a single
 document.

When transactions across documents are needed:
two-phase commits
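
A minimal sketch of the two-phase commit idea, assuming a transfer between two documents. Plain Python dicts stand in for MongoDB documents here, and every name (accounts, transactions, pendingTransactions, the state values) is hypothetical rather than taken from the talk. In a real deployment each step would be its own single-document atomic update, and the transaction document's state is what lets a recovery job resume or roll back after a crash.

```python
# In-memory sketch of a two-phase commit across two documents.
# Plain dicts stand in for MongoDB documents; all names are hypothetical.

accounts = {
    "A": {"_id": "A", "balance": 100, "pendingTransactions": []},
    "B": {"_id": "B", "balance": 20, "pendingTransactions": []},
}
transactions = {}


def apply_to_account(account_id, txn_id, delta):
    """Apply one side of the transfer at most once (idempotent)."""
    account = accounts[account_id]
    if txn_id not in account["pendingTransactions"]:
        account["balance"] += delta
        account["pendingTransactions"].append(txn_id)


def transfer(txn_id, source, destination, amount):
    # Record the intent in its own document first.
    txn = {"_id": txn_id, "source": source, "destination": destination,
           "amount": amount, "state": "initial"}
    transactions[txn_id] = txn

    # Phase 1: mark pending, then update each document separately.
    txn["state"] = "pending"
    apply_to_account(source, txn_id, -amount)
    apply_to_account(destination, txn_id, amount)

    # Phase 2: mark applied, clear the pending markers, then finish.
    txn["state"] = "applied"
    for account_id in (source, destination):
        accounts[account_id]["pendingTransactions"].remove(txn_id)
    txn["state"] = "done"
    return txn


result = transfer("t1", "A", "B", 30)
```

Because each account remembers which transactions it has already applied, re-running a half-finished transfer does not apply the same amount twice.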
Hosting with Amazon Web Services

• Elasticity and scalability
• MongoDB configured on an Amazon EC2 instance
  bundled into an AMI
• 64-bit EC2 instances
• RAID 10 + EBS volumes
• Multi-datacenter 3-node replica set across different
  availability zones
• Use secondaries for zero downtime backup
• We are not yet using sharded replica sets
What about durability?


Use journaling
Use replica sets
Critical writes

Verify that replication is working at write time

mongodb://host1,host2,host3/?safe=true;w=2;

•safe=true : use safe mode, so the driver waits for the server to acknowledge each write
•w=2 : write concern; wait for the write to replicate to 2 nodes,
which is a majority of a 3-node replica set
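
To make the relationship between w=2 and "majority" concrete, here is a small sketch that splits the options out of the connection string above and checks w against the number of hosts listed. parse_options and majority are hypothetical helpers written for this illustration, not functions of any MongoDB driver.

```python
# Sketch: read the options from the slide's connection string and check
# whether w reaches a majority of the listed hosts.
# parse_options and majority are hypothetical helpers, not driver APIs.
import re


def parse_options(uri):
    """Return (hosts, options) from a mongodb:// URI."""
    rest = uri[len("mongodb://"):]
    host_part, _, query = rest.partition("/?")
    hosts = host_part.split(",")
    options = {}
    # The slide's string separates options with ';'; '&' is handled too.
    for pair in re.split(r"[;&]", query):
        if pair:
            key, _, value = pair.partition("=")
            options[key] = value
    return hosts, options


def majority(node_count):
    """Smallest number of nodes that forms a majority."""
    return node_count // 2 + 1


hosts, options = parse_options("mongodb://host1,host2,host3/?safe=true;w=2;")
w = int(options["w"])
print(hosts, options, w >= majority(len(hosts)))
```

Note that w=2 is only a majority because the set has three nodes; with five nodes a majority would need three acknowledgements.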
Why we kept SOLR?

•   Right tool for the right job
•   Proven technology
•   SOLR: best solution for full-text indexing
•   Faceted search, Spelling suggestions…
•   Team already skilled with SOLR
•   SOLR scales well
MongoDB/SOLR How we do it
Ease of development
Hierarchical data in SQL Server
Single table
Mongo Database Schema

Using Type discriminator

{
     "_id" : ObjectId("4f504e7acd3e1c190ce04198"),
     "_t" : "PhotoSpark",
     "Photo" : "MyMotorcycle.png",
     "DateCreated" : ISODate("2012-02-24T09:23:12.246Z")
}
{
     "_id" : ObjectId("4f504e7ccd3e1c190ce04199"),
     "_t" : "PostSpark",
     "Body" : "Hello World",
     "DateCreated" : ISODate("2012-02-28T10:44:12.858Z")
}
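
One way to picture how the "_t" discriminator is used when reading such documents: a loader looks at "_t" and builds the matching class. The sketch below is in Python for brevity and is not the project's actual mapping code; the class layout and the from_document helper are assumptions, and dates are kept as plain strings.

```python
# Sketch: choose a class from the "_t" type discriminator.
# from_document and the dataclass fields are hypothetical.
from dataclasses import dataclass


@dataclass
class PhotoSpark:
    photo: str
    date_created: str


@dataclass
class PostSpark:
    body: str
    date_created: str


def from_document(doc):
    """Pick the class from the "_t" discriminator and build an instance."""
    kind = doc["_t"]
    if kind == "PhotoSpark":
        return PhotoSpark(photo=doc["Photo"], date_created=doc["DateCreated"])
    if kind == "PostSpark":
        return PostSpark(body=doc["Body"], date_created=doc["DateCreated"])
    raise ValueError(f"unknown discriminator: {kind!r}")


docs = [
    {"_t": "PhotoSpark", "Photo": "MyMotorcycle.png",
     "DateCreated": "2012-02-24T09:23:12.246Z"},
    {"_t": "PostSpark", "Body": "Hello World",
     "DateCreated": "2012-02-28T10:44:12.858Z"},
]
sparks = [from_document(d) for d in docs]
```

Both kinds of document can live in the same collection, and a new subtype only needs a new "_t" value and a matching class.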
Views
Deployment of database with zero downtime

• We release every week
• We aim for zero downtime
• Our domain model changes often
Deployment of database with zero downtime

Make sure that our code can handle both
 "versions" of the data structure
When saving, we update to the new structure
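
A minimal sketch of this read-both, write-new approach, assuming a hypothetical change where a single "Name" field is split into "FirstName" and "LastName". The field names, helpers, and the in-memory store are all invented for illustration.

```python
# Sketch: code that reads both versions of a document and always saves
# the new one. Field names and helpers are hypothetical.

def read_profile(doc):
    """Accept either shape and return the new one."""
    if "FirstName" in doc:  # already the new structure
        return {"FirstName": doc["FirstName"], "LastName": doc["LastName"]}
    first, _, last = doc["Name"].partition(" ")  # old structure
    return {"FirstName": first, "LastName": last}


def save_profile(store, doc_id, profile):
    """Always write the new structure, so documents upgrade when saved."""
    store[doc_id] = dict(profile)


store = {
    1: {"Name": "Ada Lovelace"},                     # old version
    2: {"FirstName": "Alan", "LastName": "Turing"},  # new version
}
for doc_id in list(store):
    save_profile(store, doc_id, read_profile(store[doc_id]))
```

With this in place, documents migrate gradually as they are touched, and no big-bang conversion is needed before the release.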
Deployment of database with zero downtime

• Use a migration script
Deployment of database with zero downtime

  Timeline (t): Expansion Script -> Deploy new version -> Compression Script
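
The timeline can be read as follows, under the assumption that the expansion script adds what the new version needs without removing anything the old version still reads, and the compression script removes what is no longer used once the new version is live. The sketch below works on in-memory dicts with hypothetical field names; it is one interpretation of the slide, not the team's actual scripts.

```python
# Sketch of the expansion / compression steps around a deploy.
# Documents are plain dicts; field names are hypothetical.

def expansion(docs):
    """Before the deploy: add the new fields, keep the old one."""
    for doc in docs:
        if "Name" in doc and "FirstName" not in doc:
            first, _, last = doc["Name"].partition(" ")
            doc["FirstName"], doc["LastName"] = first, last


def compression(docs):
    """After the deploy: drop the field only the old code used."""
    for doc in docs:
        doc.pop("Name", None)


docs = [{"Name": "Ada Lovelace"}, {"Name": "Alan Turing"}]
expansion(docs)
after_expansion = [dict(d) for d in docs]  # both old and new code can read this
compression(docs)
```

Between the two scripts the data satisfies both versions of the code, which is what allows the deploy itself to happen without downtime.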
Questions?
Thank you!

mathieu.kempe@communityengine.com
@mathieukempe

Editor's Notes

  • #4 Brain dump on why we started to use MongoDB, and why we moved away from a solution using both MongoDB and SQL Server.
  • #10 Moved to this infrastructure.
  • #19 If the primary crashes, we have the data on at least one of the secondaries. The second node has a priority greater than or equal to that of the other eligible nodes in the set.
  • #25 Updates are different, as we sometimes update in place. An update in place is a query and an update in one command.