Sharded By Business Line: Migrating to a Core Database using MongoDB and Solr
 

  • Maintained by different teams in different physical locations – NY, Texas, China
  • Idempotency and a parallel persister: GUIDs are generated off the source system, a natural key, or a hash key. Time-based: the last update for a given GUID wins
  • Business Entity vs. a technical entity
  • We let the Mongo Driver serialize the C# object
  • Images in a separate collection; list of Ids in main document; MongoView screenshots; MongoHacker extension
  • Very small amount of traffic, so IOPS are not yet a concern. We haven't seen the sporadic I/O latency in AWS reported by other companies
  • Mongo holding data for all logging and event information. Label: Queries per Second
  • Here they are on the same graph
  • Click through fast
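
The idempotency note above (stable GUIDs per source record, last update wins) can be illustrated with a minimal Python sketch. It is not the speaker's persister: the `UpdatedAt` field, the namespace value, the in-memory `store` dict, and both function names are assumptions made for illustration only.

```python
import uuid

# Hypothetical namespace used to derive stable ids; not taken from the talk.
NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def entity_id(source_system, natural_key):
    """Derive the same GUID every time for the same source record."""
    return uuid.uuid5(NAMESPACE, f"{source_system}:{natural_key}")


def persist(store, message):
    """Write the message unless an equal or newer update is already stored.

    `store` stands in for a database collection keyed by EntityId.
    `UpdatedAt` is an assumed, comparable timestamp field.
    Returns True if the message was written, False if it was skipped.
    """
    current = store.get(message["EntityId"])
    if current is not None and current["UpdatedAt"] >= message["UpdatedAt"]:
        return False  # duplicate or out-of-order delivery: keep what we have
    store[message["EntityId"]] = message
    return True
```

Because the id is derived from the source record and stale writes are skipped, replaying a message or delivering messages out of order across parallel persisters leaves the stored document unchanged.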

Sharded By Business Line: Migrating to a Core Database using MongoDB and Solr Presentation Transcript

  • 1. The Knot Search Platform, MongoLA 2013
  • 2. About Us • Weddings, Newlyweds, Babies, Style • NYSE-traded under "XOXO" • Founded in 1996 as AOL Channel • 11 million uniques / month • Articles / Blogs CMS • Photo Galleries • Membership / Favorites • Community Forums • Planning Tools • Local Directory • Gift Registry • Ecommerce. If you haven't heard of us… watch out, your girlfriend probably has!
  • 3. About Me Jason Sirota Director of Software Architecture XO Group Inc. (The Knot) jsirota@xogrp.com http://jasonsirota.com/ @jasonsirota
  • 4. Current Arch: Sharded By Business Line [Architecture diagram. Nine consumer business lines (Content, National, Tools, Community, Membership, eComm, Registry, Local, My Account), each with its own UX or UX / API layer, its own business logic, and its own database: Sitecore, National, Tools, Community, Membership, Ecom, Registry, Local Profiles, CES/ATS (eight SQL, one Oracle). Additional databases: UW, UGC, ODB, GR360, Photos (SQL and MySQL). Third-party and enterprise systems: Pluck, Responsys, CRM, GP, FatTail, GP Man, Business Intelligence.]
  • 5. Migrating to OSS and Cloud
    • Consumer Solutions
    • Web APIs: AWS Elastic Beanstalk, federated via Apigee
    • Services: Caching (Couchbase), Document Data (MongoDB), Relationships (Neo4j), Free-Text Search (Solr), Data Analysis (Hadoop), Key/Value Data (DynamoDB), Relational Data (SQL Server)
    • Enterprise Service Bus: MSMQ (On-Premise), SQS (Cloud)
  • 6. Why MongoDB? Document DB decided first: schemaless design
    • RavenDB (C#): worked well with C# LINQ; cross-collection joins (but slow); very new to the NoSQL landscape; limited to C# and REST interface
    • Couchbase 2.0: already familiar with Ops from caching; masterless horizontal scaling; still in beta during choice; Map/Reduce-based queries only
    • MongoDB: mature document data store; enterprise-level support; high user-base; LINQ and JSON-based querying; locking issues resolved; GeoSearching; AWS I/O issues N/A
  • 7. Started this Migration with our Search Application
  • 8. Many Data Stores Gowns Products Photos Local Directory Articles/Blogs ECommerce User Photos "Gown"
  • 9. Message-oriented Realtime Publishing
  • 10. I hope you guys can read JSON
  • 11. Message Format
      Message: {
        EntityId: "6765aec7-370d-4f1d-82d2-97647ccea94e",
        SearchType: "Product",
        Title: "Sloan by Sottero and Midgley",
        Url:"http://www.theknot.com/wedding-dress/sottero-midgley/sloansottero",
        //Images appear in the Image Search
        Images:[{
          Id: "04ed3a07-fcb5-41da-aa74-11214dcc8e27",
          Url: "http://xoedge.com/objects/0031/0107148/main_image.jpg",
        }],
        //Used for Solr Indexing
        Categories:["Gowns","Fashion"],
        Tags: ["modern","romantic"],
        Facets:[ "Color":["White","Ivory"] ],
        Attributes: { "FeaturedVendor": true },
      }
  • 12. Message-oriented Realtime Publishing
  • 13. Persister
    Configure…
      "EntityMappings": [{
        "SearchType": "LocalProfile",
        "PersistenceType": "XO.Vendors.Core.Domain.Profile, XO.Vendors.Core",
        "MongoDatabase": "search",
        "MongoCollection": "profiles"
      }
    …Define…
      namespace XO.Vendors.Core.Domain
      {
          public class Profile : Entity, IReviewAggreate
          {
              public Address Address { get; set; }
              public string Headline { get; set; }
              public List<Guid> ImageIds { get; set; }
              public string ImageId { get; set; }
    …Save.
      var server = MongoServer.Create(ConfigurationManager.ConnectionStrings["MongoDB"].ToString());
      var db = server.GetDatabase(config.MongoDatabase);
      var collection = db.GetCollection(config.MongoCollection);
      collection.Save(entity);
  • 14. Document Structure
  • 15. Message-oriented Realtime Publishing
  • 16. Search API
  • 17. MongoDB Challenges
    • UUID endianness: write a C# GUID to Mongo, retrieve the UUID from Python, and the endianness is reversed, so a different value comes out. Workaround:
        import uuid

        def upendUUID(orig):
            return uuid.UUID(bytes=orig.bytes_le)
    • The C# driver logged phantom errors at first: "Could not Find MongoDB", with no other indicators of an outage
  • 18. Demo?
  • 19. MongoDB Instances • 1 Replica Set • 5 MongoDB Instances • 3 Availability Zones • 20 EBS Volumes (R10) • 250 IOPS per Volume • EBS Snapshot Backups • S3 Data Dump
  • 20. Tested Traffic: Queries per Minute
  • 21. Actual Traffic: Queries per Minute
  • 22. How to lie with statistics… what we tested… …what we got
  • 23. Jason Sirota Director of Software Architecture XO Group Inc. (The Knot) jsirota@xogrp.com http://jasonsirota.com/ @jasonsirota
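
The UUID endianness issue from slide 17 can be reproduced without a database. The sketch below relies on one known fact: .NET's `Guid.ToByteArray()` writes the first three GUID fields in little-endian order, which is the same layout as Python's `UUID.bytes_le`. That the stored Mongo value used exactly this layout is taken from the slide, not verified here. `upend_uuid` is the slide's helper under a snake_case name, and the sample GUID is the `EntityId` from slide 11.

```python
import uuid


def upend_uuid(orig):
    """Swap between the .NET byte layout and the standard big-endian layout."""
    return uuid.UUID(bytes=orig.bytes_le)


# The GUID as the C# application sees it.
csharp_guid = uuid.UUID("6765aec7-370d-4f1d-82d2-97647ccea94e")

# Bytes as .NET lays them out (first three fields little-endian).
stored_bytes = csharp_guid.bytes_le

# A reader that assumes big-endian bytes gets a different value out.
naive = uuid.UUID(bytes=stored_bytes)

# Applying the slide's fix recovers the original GUID.
fixed = upend_uuid(naive)
```

Here `naive` differs from `csharp_guid` only in its first three fields, and `fixed` equals `csharp_guid` again.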