The Knot Search Platform (MongoLA 2013)
About Us           •   Weddings, Newlyweds, Babies, Style           •   NYSE-traded under "XOXO"           •   Founded in ...
About Me           Jason Sirota Director of Software Architecture    XO Group Inc. (The Knot)                             ...
Current Arch: Sharded By Business Line  Consumer                                                                          ...
Migrating to OSS and Cloud   Consumer   Solutions   Web APIs   AWS Elastic Beanstalk   Federated via Apigee Services   Cac...
Why MongoDB?Document DB Decided First: Schemaless Design•   RavenDB (C#)                                 •   MongoDB    – ...
Started this Migration with our      Search Application
Many Data Stores Gowns   Products   Photos   Local Directory   Articles/Blogs   ECommerce   User Photos                   ...
Message-oriented Realtime Publishing
I hope you guys can read JSON
Message Format Message: {      EntityId: "6765aec7-370d-4f1d-82d2-97647ccea94e",      SearchType: "Product",      Title: "...
Message-oriented Realtime Publishing
Persister  "EntityMappings": [{                                                            Configure…       "SearchType": ...
Document Structure
Message-oriented Realtime Publishing
Search API
MongoDB Challenges• UUID Endianness  – Write C# GUID to Mongo  – Retrieve UUID from Python, reverses    Endianness, differ...
Demo?
MongoDB Instances                    • 1 Replica Set                    • 5 MongoDB Instances                    • 3 Avail...
Tested Traffic: Queries per Minute
Actual Traffic: Queries per Minute
How to lie with statistics…           what we tested…                             …what we got
Jason Sirota Director of Software Architecture    XO Group Inc. (The Knot)                                       jsirota@x...
Sharded By Business Line: Migrating to a Core Database using MongoDB and Solr
  • Maintained by different teams in different physical locations – NY, Texas, China
  • Idempotency and a Parallel Persistor – GUIDs generated off the source system, natural key or hash key. Time-based – last update for that GUID wins
  • Business Entity vs. a technical entity
  • We let the Mongo Driver serialize the C# object
  • Images in a separate collection. List of Ids in main document. MongoView Screenshots. MongoHacker Extension
  • Very small amount of traffic, IOPS not yet a concern. We haven't seen the sporadic I/O latency in AWS reported by other companies
  • Mongo holding data for all logging and event information. Label: Queries per Second
  • Here they are on the same graph
  • Click through fast
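The idempotency note above (IDs derived from the source system's natural key, last update for a GUID wins) could be sketched roughly as follows. This is a minimal in-memory illustration, not the deck's actual persister: the `entity_id_for` helper, the `LastUpdateWinsStore` class, the `updated_at` field, and the use of `uuid.uuid5` with `NAMESPACE_URL` are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


def entity_id_for(source_system: str, natural_key: str) -> uuid.UUID:
    """Derive a stable ID from the natural key (hypothetical scheme).

    uuid5 is hash-based, so the same input always yields the same ID,
    which is what makes replayed messages harmless.
    """
    return uuid.uuid5(uuid.NAMESPACE_URL, f"{source_system}/{natural_key}")


@dataclass(frozen=True)
class Message:
    entity_id: uuid.UUID
    updated_at: datetime  # assumed timestamp field, not in the slide's message
    title: str


class LastUpdateWinsStore:
    """In-memory stand-in for the document store."""

    def __init__(self) -> None:
        self._docs: Dict[uuid.UUID, Message] = {}

    def save(self, message: Message) -> bool:
        """Store the message unless an equal or newer update is already held."""
        current = self._docs.get(message.entity_id)
        if current is not None and current.updated_at >= message.updated_at:
            return False  # stale or duplicate delivery: ignore it
        self._docs[message.entity_id] = message
        return True

    def get(self, entity_id: uuid.UUID) -> Optional[Message]:
        return self._docs.get(entity_id)


gown_id = entity_id_for("products", "sloan-sottero")
older = Message(gown_id, datetime(2013, 1, 1, tzinfo=timezone.utc), "Sloan (draft)")
newer = Message(gown_id, datetime(2013, 1, 2, tzinfo=timezone.utc),
                "Sloan by Sottero and Midgley")

store = LastUpdateWinsStore()
# Out-of-order and repeated delivery, as parallel persisters might produce.
for message in (newer, older, newer):
    store.save(message)
```

Because the ID is deterministic and stale writes are dropped, several persisters can process the same queue in any order and still converge on the newest version of each entity.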

    1. The Knot Search Platform (MongoLA 2013)
    2. About Us • Weddings, Newlyweds, Babies, Style • NYSE-traded under "XOXO" • Founded in 1996 as AOL Channel • 11 million uniques / month • Articles / Blogs CMS • Photo Galleries • Membership / Favorites • Community Forums • Planning Tools • Local Directory • Gift Registry • Ecommerce. If you haven't heard of us… watch out, your girlfriend probably has!
    3. About Me: Jason Sirota, Director of Software Architecture, XO Group Inc. (The Knot) jsirota@xogrp.com http://jasonsirota.com/ @jasonsirota
    4. Current Arch: Sharded By Business Line [Architecture diagram. Consumer-facing business lines: Content, National, Tools, Community, Membership, eComm, Registry, Local, My Account. Each line has its own UX or UX / API layer, its own Business Logic layer, and its own database: Sitecore, National, Tools, Community, Membership, Ecom, Registry, Local Profiles, CES/ATS (eight SQL, one Oracle). Further databases: UW, UGC Photos, ODB, GR360 (SQL and MySQL). Also shown: Pluck, Responsys, Enterprise CRM, GP, FatTail, GP Man, Business Intelligence.]
    5. Migrating to OSS and Cloud • Consumer Solutions • Web APIs: AWS Elastic Beanstalk, federated via Apigee • Services: Caching (Couchbase), Document Data (MongoDB), Relationships (Neo4j), Free-Text Search (Solr), Data Analysis (Hadoop), Key/Value Data (DynamoDB), Relational Data (SQL Server) • Enterprise Service Bus: MSMQ (On-Premise), SQS (Cloud)
    6. Why MongoDB? Document DB Decided First: Schemaless Design • RavenDB (C#) – Worked well with C# LINQ – Cross-collection Joins (but slow..) – Very new to NoSQL Landscape – Limited to C# and REST interface • MongoDB – Mature Document Data Store – Enterprise-level Support – High user-base – LINQ and JSON-based querying – Locking issues resolved – GeoSearching – AWS I/O issues N/A • Couchbase 2.0 – Already familiar with Ops from Caching – Masterless horizontal scaling – Still in Beta during choice – Map/Reduce-based queries only
    7. Started this Migration with our Search Application
    8. Many Data Stores: Gowns, Products, Photos, Local Directory, Articles/Blogs, ECommerce, User Photos. "Gown"
    9. Message-oriented Realtime Publishing
    10. I hope you guys can read JSON
    11. Message Format
        Message: {
          EntityId: "6765aec7-370d-4f1d-82d2-97647ccea94e",
          SearchType: "Product",
          Title: "Sloan by Sottero and Midgley",
          Url: "http://www.theknot.com/wedding-dress/sottero-midgley/sloansottero",
          //Images appear in the Image Search
          Images: [{
            Id: "04ed3a07-fcb5-41da-aa74-11214dcc8e27",
            Url: "http://xoedge.com/objects/0031/0107148/main_image.jpg",
          }],
          //Used for Solr Indexing
          Categories: ["Gowns","Fashion"],
          Tags: ["modern","romantic"],
          Facets: [ "Color":["White","Ivory"] ],
          Attributes: { "FeaturedVendor": true },
        }
    12. Message-oriented Realtime Publishing
    13. Persister
        Configure…
          "EntityMappings": [{
            "SearchType": "LocalProfile",
            "PersistenceType": "XO.Vendors.Core.Domain.Profile, XO.Vendors.Core",
            "MongoDatabase": "search",
            "MongoCollection": "profiles"
          }
        …Define…
          namespace XO.Vendors.Core.Domain {
            public class Profile : Entity, IReviewAggreate {
              public Address Address { get; set; }
              public string Headline { get; set; }
              public List<Guid> ImageIds { get; set; }
              public string ImageId { get; set; }
        …Save.
          var server = MongoServer.Create(ConfigurationManager.ConnectionStrings["MongoDB"].ToString());
          var db = server.GetDatabase(config.MongoDatabase);
          var collection = db.GetCollection(config.MongoCollection);
          collection.Save(entity);
    14. Document Structure
    15. Message-oriented Realtime Publishing
    16. Search API
    17. MongoDB Challenges
        • UUID Endianness
          – Write C# GUID to Mongo
          – Retrieve UUID from Python, reverses Endianness, different value out
              def upendUUID(orig):
                  return uuid.UUID(bytes=orig.bytes_le)
        • C# Driver logged phantom errors at first:
          – "Could not Find MongoDB"
          – No other indicators of outage
    18. Demo?
    19. MongoDB Instances • 1 Replica Set • 5 MongoDB Instances • 3 Availability Zones • 20 EBS Volumes (R10) • 250 IOPS per Volume • EBS Snapshot Backups • S3 Data Dump
    20. Tested Traffic: Queries per Minute
    21. Actual Traffic: Queries per Minute
    22. How to lie with statistics… what we tested… …what we got
    23. Jason Sirota, Director of Software Architecture, XO Group Inc. (The Knot) jsirota@xogrp.com http://jasonsirota.com/ @jasonsirota
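The UUID endianness problem from the MongoDB Challenges slide can be reproduced with the Python standard library alone. The sketch below assumes the stored bytes follow the .NET `Guid.ToByteArray()` layout, where the first three fields are little-endian; the variable names are illustrative, and `upend_uuid` is the slide's `upendUUID` helper renamed to snake case.

```python
import uuid


def upend_uuid(orig: uuid.UUID) -> uuid.UUID:
    """Swap the byte order of the first three UUID fields.

    Same operation as the slide's upendUUID; applying it twice
    returns the original value.
    """
    return uuid.UUID(bytes=orig.bytes_le)


# The EntityId from the Message Format slide, as the C# side sees it.
original = uuid.UUID("6765aec7-370d-4f1d-82d2-97647ccea94e")

# Assumption: the writer stores the .NET byte layout, in which the
# first three fields are little-endian. bytes_le reproduces that layout.
stored_bytes = original.bytes_le

# A reader that treats the same 16 bytes as big-endian gets another value.
misread = uuid.UUID(bytes=stored_bytes)
print(misread)              # c7ae6567-0d37-1d4f-82d2-97647ccea94e

# Upending the misread value recovers what was written.
print(upend_uuid(misread))  # 6765aec7-370d-4f1d-82d2-97647ccea94e
```

Only the first three fields change; the last two groups (`82d2-97647ccea94e`) are byte arrays in both layouts and come through untouched. Drivers may also offer a UUID-representation setting that does this conversion on read, which would remove the need for a helper.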