Retail Reference Architecture
with MongoDB
Antoine Girbal
Principal Solutions Engineer, MongoDB Inc.
@antoinegirbal
Retail Reference Architecture Part 1: Flexible, Searchable, Low-Latency Product Catalog


Transcript of "Retail Reference Architecture Part 1: Flexible, Searchable, Low-Latency Product Catalog"

  1. Retail Reference Architecture with MongoDB. Antoine Girbal, Principal Solutions Engineer, MongoDB Inc. (@antoinegirbal)
  2. Introduction
  3. Retail solution
     • Retail is far too broad to tackle with a single solution
     • Its data maps well to the document model
     • It needs agility, performance and scaling
     • Many (e-)retailers are already using MongoDB
     • Let's define the best ways and places to use it!
  4. MongoDB is a great fit
     • Holds complex JSON structures
     • Dynamic schema for agility
     • Complex querying and in-place updating
     • Secondary, compound and geo indexing
     • Full consistency, durability, atomic operations
     • Near-linear scaling via sharding
     • Overall, MongoDB is a unique fit!
  5. MongoDB Strategic Advantages
     • Horizontally scalable: sharding
     • Agile
     • Flexible
     • High performance and strong consistency
     • Highly available: replica sets
     • Application, with a sample document:
       { customer: "roger", date: new Date(), comment: "Spirited Away", tags: ["Tezuka", "Manga"] }
  6. Build your data to fit your application

     Relational:

     CustomerID  First Name  Last Name  City
     0           John        Doe        New York
     1           Mark        Smith      San Francisco
     2           Jay         Black      Newark
     3           Meagan      White      London
     4           Edward      Danields   Boston

     Order Number  Store ID  Product       Customer ID
     10            100       Tablet        0
     11            101       Smartphone    0
     12            101       Dishwasher    0
     13            200       Sofa          1
     14            200       Coffee table  1
     15            201       Suit          2

     MongoDB:

     { customer_id: 1,
       name: "Mark Smith",
       city: "San Francisco",
       orders: [
         { order_number: 13,
           store_id: 10,
           date: "2014-01-03",
           products: [
             { SKU: 24578234, Qty: 3, Unit_price: 350 },
             { SKU: 98762345, Qty: 1, Unit_price: 110 }
           ]
         },
         { <...> }
       ]
     }
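The contrast on this slide can be sketched in plain JavaScript: relational order rows are folded into the customer document they belong to. This is only an illustration, not code from the talk. The helper name `embedOrders` and the camelCase row fields are assumptions made for the example, and the sample rows are taken from the two tables above.

```javascript
// Hypothetical helper (not from the talk): fold order rows into customer documents.
function embedOrders(customers, orders) {
  return customers.map((c) => ({
    customer_id: c.customerId,
    name: `${c.firstName} ${c.lastName}`,
    city: c.city,
    // Each customer document carries its own orders, so no join is needed at read time.
    orders: orders
      .filter((o) => o.customerId === c.customerId)
      .map((o) => ({
        order_number: o.orderNumber,
        store_id: o.storeId,
        product: o.product,
      })),
  }));
}

// Sample rows copied from the relational tables on the slide.
const customers = [
  { customerId: 0, firstName: "John", lastName: "Doe", city: "New York" },
  { customerId: 1, firstName: "Mark", lastName: "Smith", city: "San Francisco" },
];
const orders = [
  { orderNumber: 10, storeId: 100, product: "Tablet", customerId: 0 },
  { orderNumber: 13, storeId: 200, product: "Sofa", customerId: 1 },
  { orderNumber: 14, storeId: 200, product: "Coffee table", customerId: 1 },
];

const customerDocs = embedOrders(customers, orders);
console.log(JSON.stringify(customerDocs[1], null, 2));
```

Each resulting document could then be inserted into a collection as-is, which is the "fit your application" idea the slide illustrates.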
  7. Notions

     RDBMS     MongoDB
     Database  Database
     Table     Collection
     Row       Document
     Column    Field
  8. Retail Components Overview
  9. Architecture Overview (diagram)
     • Customer channels: Amazon, Ebay, …; Stores (POS, Kiosk, …); Mobile (Smartphone, Tablet); Website; Contact Center; API
     • Data and Service Integration
     • Social: Facebook, Twitter, …
     • Data Warehouse, Analytics
     • Supply Chain Management System; Suppliers (3rd Party, In Network)
     • Web Servers, Application Servers
     • Information Management: Merchandising, Content, Inventory, Customer, Channel, Sales & Fulfillment, Insight, Social
  10. Commerce Functional Components (diagram)
      • Information Layer: Look & Feel, Navigation, Customization, Personalization, Branding, Promotions, Chat, Ads
      • Customer's Perspective: Research, Browse, Search, Select, Shopping Cart, Purchase, Checkout, Receive, Track, Use, Feedback, Maintain, Dialog, Assist
      • Market / Offer: Guide, Offer, Semantic Search, Recommend, Rule-based Decisions, Pricing, Coupons
      • Sell / Fulfill: Orders, Payments, Fraud Detection, Fulfillment, Business Rules
      • Insight: Session Capture, Activity Monitoring
      • Customer, Enterprise
      • Information Management: Merchandising, Content, Inventory, Customer, Channel, Sales & Fulfillment, Insight, Social
  11. Merchandising
  12. Merchandising (diagram)
      Merchandising components backed by MongoDB: Variant, Hierarchy, Pricing, Promotions, Ratings & Reviews, Calendar, Semantic Search, Item, Localization
  13. Merchandising - principles
      • Single view of a product, one central catalog service
      • Read volume is high and sustained: 100k reads/s
      • Write volume spikes during catalog updates
      • Advanced indexing and querying
      • Geographical distribution and low latency
      • No need for a cache layer; a CDN serves the assets
  14. Merchandising - requirements
      • Requirement: Single view of product
        Example challenge: Blended description and hierarchy of product to ensure availability on all channels
        MongoDB: Flexible document-oriented storage
      • Requirement: High sustained read volume with low latency
        Example challenge: Constant querying from online users and sales associates, requiring immediate response
        MongoDB: Fast indexed querying, replication allows a local copy of the catalog, sharding for scaling
      • Requirement: Spiky and real-time write volume
        Example challenge: Bulk update of the full catalog without impacting production, real-time touch update
        MongoDB: Fast in-place updating, real-time indexing, sharding for scaling
      • Requirement: Advanced querying
        Example challenge: Find product based on color, size, description
        MongoDB: Ad-hoc querying on any field, advanced secondary and compound indexing
  15. Merchandising - Product Page (diagram)
      Page areas: Product images, General Information, List of Variants, External Information, Localized Description
  16. Merchandising - Item Model
      > db.item.findOne()
      {
        _id: "301671",                        // main item id
        department: "Shoes",
        category: "Shoes/Women/Pumps",
        brand: "Guess",
        thumbnail: "http://cdn…/pump.jpg",
        image: "http://cdn…/pump1.jpg",       // larger version of thumbnail
        title: "Evening Platform Pumps",
        description: "Those evening platform pumps put the perfect finishing touches on your most glamourous night-on-the-town outfit",
        shortDescription: "Evening Platform Pumps",
        style: "Designer",
        type: "Platform",
        rating: 4.5,                          // user rating
        lastUpdated: ISODate("2014-04-01"),   // last update time
        …
      }
  17. Merchandising - Item Definition
      • Get item by id
        db.item.findOne( { _id: "301671" } )
      • Get items from product ids
        db.item.find( { _id: { $in: [ "301671", "301672" ] } } )
      • Get items by department
        db.item.find( { department: "Shoes" } )
      • Get items by category prefix
        db.item.find( { category: /^Shoes\/Women/ } )
      • Indices: item id (_id, indexed by default), department, category, lastUpdated
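The category-prefix query depends on a regular expression anchored at the start of the string. A minimal sketch of building such a filter from a category path supplied at runtime follows. The helper names `escapeRegExp` and `categoryPrefixFilter` are assumptions introduced for the example, not part of the talk.

```javascript
// Escape regex metacharacters so the category path is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build the filter document for a category-prefix query.
function categoryPrefixFilter(path) {
  // The leading ^ anchors the match at the start of the category string.
  return { category: new RegExp("^" + escapeRegExp(path)) };
}

const filter = categoryPrefixFilter("Shoes/Women");
console.log(filter.category.test("Shoes/Women/Pumps")); // true
console.log(filter.category.test("Shoes/Men/Boots"));   // false
```

A filter built this way could be passed to `find()`. In MongoDB, a case-sensitive regular expression anchored with `^` can make use of an index on the field, which is presumably why the category is stored as a path.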
  18. Merchandising – Variant Model
      > db.variant.findOne()
      {
        _id: "730223104376",                      // the sku
        itemId: "301671",                         // references item id
        thumbnail: "http://cdn…/pump-red.jpg",    // variant specific
        image: "http://cdn…/pump-red.jpg",
        size: 6.0,
        color: "Red",
        width: "B",
        heelHeight: 5.0,
        lastUpdated: ISODate("2014-04-01"),       // last update time
        …
      }
  19. Merchandising – Variant Model
      • Get variant from SKU
        db.variant.find( { _id: "730223104376" } )
      • Get all variants for a product, sorted by SKU
        db.variant.find( { itemId: "301671" } ).sort( { _id: 1 } )
      • Indices: itemId, lastUpdated
  20. Merchandising – per store Pricing
      Per-store pricing could result in billions of documents, unless you build it in a modular way.
      Price:
      {
        _id: "sku730223104376_store123",
        currency: "USD",
        price: 89.95,
        lastUpdated: ISODate("2014-04-01"),   // last update time
        …
      }
      • _id: concatenation of item and store
      • Item: can be an item id or a sku
      • Store: can be a store group or a store id
      • Indices: lastUpdated
  21. Merchandising – per store Pricing
      • Get all prices for a given item
        db.prices.find( { _id: /^p301671_/ } )
      • Get all prices for a given sku (price could be at item level)
        db.prices.find( { _id: { $in: [ /^sku730223104376_/, /^p301671_/ ] } } )
      • Get minimum and maximum prices for a sku
        db.prices.aggregate( [ { $match: { … } }, { $group: { _id: 1, min: { $min: "$price" }, max: { $max: "$price" } } } ] )
      • Get price for a sku and store id (returns up to 4 prices)
        db.prices.find( { _id: { $in: [ "sku730223104376_store1234", "sku730223104376_sgroup0", "p301671_store1234", "p301671_sgroup0" ] } }, { price: 1 } )
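The last query can return up to four price documents, so the application still has to choose one. The sketch below builds the four candidate `_id` values and picks the most specific match. The precedence order (sku before item, store before store group) is an assumption, since the slides do not state one. The helper names and the 99.95 price are likewise made up for the example.

```javascript
// Hypothetical helper: the four _id values a price may be stored under,
// listed from most specific to least specific (assumed precedence).
function priceLookupIds({ sku, itemId, storeId, storeGroup }) {
  return [
    `sku${sku}_store${storeId}`,
    `sku${sku}_sgroup${storeGroup}`,
    `p${itemId}_store${storeId}`,
    `p${itemId}_sgroup${storeGroup}`,
  ];
}

// Pick the first candidate id that has a matching price document.
function pickPrice(candidateIds, priceDocs) {
  const byId = new Map(priceDocs.map((doc) => [doc._id, doc]));
  for (const id of candidateIds) {
    if (byId.has(id)) return byId.get(id);
  }
  return null; // no price defined at any level
}

const ids = priceLookupIds({
  sku: "730223104376",
  itemId: "301671",
  storeId: "1234",
  storeGroup: "0",
});

// Stand-in for the result of db.prices.find({ _id: { $in: ids } }); values are illustrative.
const found = [
  { _id: "p301671_sgroup0", price: 99.95 },
  { _id: "sku730223104376_store1234", price: 89.95 },
];

const chosen = pickPrice(ids, found);
console.log(chosen);
```

Because the candidates are plain `_id` values, the lookup is a single `$in` query on the primary key, whatever level the price was defined at.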
  22. Merchandising – Browse and Search products (diagram)
      • Browse by category
      • Special lists
      • Filter by attributes
      • Lists hundreds of item summaries
      Ideally a single query is issued to the database to obtain all items and metadata to display.
  23. Merchandising – Browse and Search products
      The previous page presents many challenges:
      • Response within milliseconds for hundreds of items
      • Faceted search on many attributes: category, brand, …
      • Some attributes live at the variant level (color, size, etc.), and the matching variant's image should be shown
      • An item may have thousands of variants, so results need de-duplication
      • Efficient sorting on several attributes: price, popularity
      • Pagination, which requires deterministic ordering
  24. Merchandising – Browse and Search products (diagram)
      One item can come in hundreds of sizes and dozens of colors: a single item may have thousands of variants.
  25. Merchandising – Browse and Search products (diagram)
      Page areas: Hierarchy, Sort parameter, Faceted Search. Images of the matching variants are displayed.
  26. Merchandising – Traditional Architecture (diagram)
      • Relational DB: system of records
      • Full-text search engine, fed by indexing
      • Cache: data pre-joined into objects
      • Application: #1 obtain search result IDs, #2 obtain objects by ID
  27. Merchandising – Traditional Architecture
      Issues with the traditional architecture:
      • 3 different systems to maintain: RDBMS, search engine, caching layer
      • Search returns a list of IDs that must then be looked up in the cache, which increases response latency
      • The RDBMS schema is complex and static
      • The search index is expensive to update
      • The setup does not allow efficient pagination
  28. Merchandising - Architecture (diagram)
      MongoDB data store: Summaries, Items, Pricing, Promotions, Variants, Ratings & Reviews
      #1 Obtain results
  29. Merchandising – Summary Model
      The summary relies on the following parameters:
      • Department, e.g. "Shoes"
      • An indexed attribute
        – Category path, e.g. "Shoes/Women/Pumps"
        – Price range
        – List of item attributes, e.g. Brand = Guess
        – List of variant attributes, e.g. Color = red
      • A non-indexed attribute
        – List of item secondary attributes, e.g. Style = Designer
        – List of variant secondary attributes, e.g. heel height = 4.0
      • Sorting, e.g. Price Low to High
  30. Merchandising – Summary Model
      > db.summaries.findOne()
      {
        "_id": "p39",
        "title": "Evening Platform Pumps 39",
        "department": "Shoes",
        "category": "Shoes/Women/Pumps",
        "thumbnail": "http://cdn…/pump-small-39.jpg",
        "image": "http://cdn…/pump-39.jpg",
        "price": 145.99,
        "rating": 0.95,
        "attrs": [ { "brand": "Guess" }, … ],
        "sattrs": [ { "style": "Designer" }, { "type": "Platform" }, … ],
        "vars": [
          {
            "sku": "sku2441",
            "thumbnail": "http://cdn…/pump-small-39.jpg.Blue",
            "image": "http://cdn…/pump-39.jpg.Blue",
            "attrs": [ { "size": 6.0 }, { "color": "Blue" }, … ],
            "sattrs": [ { "width": "B" }, { "heelHeight": 5.0 }, … ]
          },
          … Many more skus …
        ]
      }
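One way to read the summary document is as a denormalized join of an item with its variants and a display price. The sketch below assembles such a document in plain JavaScript. The helper names `toAttrList` and `buildSummary` are assumptions for illustration, and the split between primary (`attrs`) and secondary (`sattrs`) fields only mirrors the sample above. The talk does not show how summaries are actually generated.

```javascript
// Turn selected fields of a document into the [{ name: value }, ...] list
// shape used by "attrs" and "sattrs" in the summary sample.
function toAttrList(doc, names) {
  return names
    .filter((name) => doc[name] !== undefined)
    .map((name) => ({ [name]: doc[name] }));
}

// Hypothetical helper: combine one item, its variants and a price into a summary.
function buildSummary(item, variants, price) {
  return {
    _id: "p" + item._id,
    title: item.title,
    department: item.department,
    category: item.category,
    price: price,
    rating: item.rating,
    attrs: toAttrList(item, ["brand"]),          // indexed item attributes
    sattrs: toAttrList(item, ["style", "type"]), // secondary item attributes
    vars: variants.map((v) => ({
      sku: v._id,
      attrs: toAttrList(v, ["size", "color"]),
      sattrs: toAttrList(v, ["width", "heelHeight"]),
    })),
  };
}

// Sample values copied from the item and variant slides; the price is from the pricing slide.
const item = {
  _id: "301671", title: "Evening Platform Pumps", department: "Shoes",
  category: "Shoes/Women/Pumps", brand: "Guess", style: "Designer",
  type: "Platform", rating: 4.5,
};
const variants = [
  { _id: "730223104376", itemId: "301671", size: 6.0, color: "Red", width: "B", heelHeight: 5.0 },
];

const summary = buildSummary(item, variants, 89.95);
console.log(JSON.stringify(summary, null, 2));
```

Keeping all variants inside one summary document is what lets a browse page list each item once while still filtering on variant-level attributes.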
  31. Merchandising - Summary Model
      • Get summary from item id
        db.summaries.find( { _id: "p301671" } )
      • Get summary's specific variation from SKU
        db.summaries.find( { "vars.sku": "730223104376" }, { "vars.$": 1 } )
      • Get summaries by department, sorted by rating
        db.summaries.find( { department: "Shoes" } ).sort( { rating: 1 } )
      • Get summaries with a mix of parameters
        db.summaries.find( { department: "Shoes", "vars.attrs": { "color": "Gray" }, "category": /^Shoes\/Women/, "price": { "$gte": 65.99, "$lte": 180.99 } } )
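The "mix of parameters" query is typically assembled from whatever filters the shopper has selected. A minimal sketch of that assembly follows. The function name `buildSummaryFilter` and its option names are assumptions for the example. The resulting document has the same shape as the last query above.

```javascript
// Hypothetical helper: build a summaries filter from optional browse parameters.
function buildSummaryFilter({ department, categoryPrefix, variantAttr, minPrice, maxPrice }) {
  const filter = { department }; // department is always present in the slide's queries

  if (variantAttr) {
    // Matches summaries where some variant has exactly this attribute entry.
    filter["vars.attrs"] = variantAttr;
  }
  if (categoryPrefix) {
    // Escape metacharacters, then anchor the match at the start of the path.
    const escaped = categoryPrefix.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    filter.category = new RegExp("^" + escaped);
  }
  if (minPrice !== undefined || maxPrice !== undefined) {
    filter.price = {};
    if (minPrice !== undefined) filter.price.$gte = minPrice;
    if (maxPrice !== undefined) filter.price.$lte = maxPrice;
  }
  return filter;
}

const mixed = buildSummaryFilter({
  department: "Shoes",
  categoryPrefix: "Shoes/Women",
  variantAttr: { color: "Gray" },
  minPrice: 65.99,
  maxPrice: 180.99,
});
console.log(mixed);

const departmentOnly = buildSummaryFilter({ department: "Shoes" });
console.log(departmentOnly);
```

Leaving unset parameters out of the filter keeps each query close to one of the compound indices, which all start with `department`.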
  32. Merchandising – Summary Model
      • The following indices are used:
        – department + attrs + category + _id
        – department + vars.attrs + category + _id
        – department + category + _id
        – department + price + _id
        – department + rating + _id
      • _id is used for pagination
      • Queries can take advantage of index intersection
      • With several attributes specified (e.g. color=red and size=6), which one is looked up?
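The slide says only that `_id` is used for pagination. One common way to get deterministic pages is keyset pagination: sort on the chosen field with `_id` as a tie-breaker, and ask for documents that come after the last one already shown. The sketch below assumes that approach and a descending sort on rating, neither of which the talk confirms. The name `nextPageFilter` is hypothetical.

```javascript
// Assumed sort: best rating first, _id as a deterministic tie-breaker.
const sortSpec = { rating: -1, _id: 1 };

// Hypothetical helper: extend a base filter so it selects only documents that
// come after `last` (the final document of the previous page) in sortSpec order.
function nextPageFilter(baseFilter, last) {
  if (!last) return { ...baseFilter }; // first page: no extra condition
  return {
    ...baseFilter,
    $or: [
      { rating: { $lt: last.rating } },                // strictly lower rating
      { rating: last.rating, _id: { $gt: last._id } }, // same rating, later _id
    ],
  };
}

const firstPage = nextPageFilter({ department: "Shoes" }, null);
const secondPage = nextPageFilter({ department: "Shoes" }, { _id: "p39", rating: 0.95 });
console.log(JSON.stringify(secondPage, null, 2));
```

With the mongo shell this would be used roughly as `db.summaries.find(secondPage).sort(sortSpec).limit(pageSize)`, where `pageSize` is another assumed name. The `department + rating + _id` index listed above matches that sort order.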
  33. Merchandising – Facet
      Facet samples:
      { "_id": "Accessory Type=Hosiery", "count": 14 }
      { "_id": "Ladder Material=Steel", "count": 2 }
      { "_id": "Gold Karat=14k", "count": 10138 }
      { "_id": "Stone Color=Clear", "count": 1648 }
      { "_id": "Metal=White gold", "count": 10852 }
      A single operation inserts or updates a facet (third argument: upsert, fourth: multi):
      db.facet.update( { _id: "Accessory Type=Hosiery" }, { $inc: { count: 1 } }, true, false )
      The facet with the lowest count is the most restrictive, so it should come first in the query!
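The advice to put the most restrictive facet first can be sketched as a sort on the stored counts. The counts below are the samples from this slide. The function name `orderMostRestrictiveFirst` and the decision to put unknown facets last are assumptions made for the example.

```javascript
// Facet counts copied from the samples on the slide.
const facetCounts = new Map([
  ["Accessory Type=Hosiery", 14],
  ["Ladder Material=Steel", 2],
  ["Gold Karat=14k", 10138],
  ["Stone Color=Clear", 1648],
  ["Metal=White gold", 10852],
]);

// Hypothetical helper: order the selected facets so the one matching the
// fewest documents comes first. Facets with no recorded count go last.
function orderMostRestrictiveFirst(selected, counts) {
  const countOf = (facet) =>
    counts.has(facet) ? counts.get(facet) : Number.MAX_SAFE_INTEGER;
  return [...selected].sort((a, b) => countOf(a) - countOf(b));
}

const ordered = orderMostRestrictiveFirst(
  ["Metal=White gold", "Stone Color=Clear", "Gold Karat=14k"],
  facetCounts
);
console.log(ordered);
// [ 'Stone Color=Clear', 'Gold Karat=14k', 'Metal=White gold' ]
```

The first entry would then be the attribute used for the indexed lookup, which is one possible answer to the "which one is looked up?" question on the previous slide.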
  34. Merchandising – Query stats

      Department  Category  Price  Primary attribute  Average time (ms)  90th (ms)  95th (ms)
      1           0         0      0                  2                  3          3
      1           1         0      0                  1                  2          2
      1           0         1      0                  1                  2          3
      1           1         1      0                  1                  2          2
      1           0         0      1                  0                  1          2
      1           1         0      1                  0                  1          1
      1           0         1      1                  1                  2          2
      1           1         1      1                  0                  1          1
      1           0         0      2                  1                  3          3
      1           1         0      2                  0                  2          2
      1           0         1      2                  10                 20         35
      1           1         1      2                  0                  1          1