Augmenting RDBMS
  with MongoDB
    for e-commerce
My name is
Steve Francia

    @spf13
• 15+ years building e-commerce
• Long time open source contributor
• Entrepreneur
• Hacker, father, husband, skate punk
My name is
Justin Hileman

    @bobthecow
• 10+ years making the Internet
  awesomer

• Open Source contributor
• Vespa rider, swing dancer,
  coder, standardista
We work for OpenSky

    http://shopopensky.com
OpenSky is
  a new way to shop

OpenSky connects you with innovators,
trendsetters and tastemakers. You choose
the ones you like and each week they invite
you to their private online sales.
OpenSky Loves
             Open Source
•   PHP 5.3
•   Apache2
•   Symfony2
•   Doctrine2
•   jQuery
•   Mule
•   HornetQ
•   MongoDB
•   nginx
•   varnish
We contribute to many
open source projects and
   pioneer innovative
  solutions using them
OpenSky was the first
e-commerce site built
    on MongoDB
... also the first e-commerce site built on Symfony2
Why NoSQL for
  e-commerce?


Using the right solution for each situation
Data dilemma of
      e-commerce
                 Pick One


• Stick to one vertical (Sane schema)
• Flexibility (Insane schema)
Sane schema

• Works ... for a while
• Fine for a few types of products
• Not possible once more product types
  are introduced
Let’s Use an Example
   How about we start with books
Book Product Schema
Product {

  General Product attributes:
    id:
    sku:
    product dimensions:
    shipping weight:
    MSRP:
    price:
    description:
    ...

  Book Specific attributes:
    author:           Orson Scott Card
    title:            Ender's Game
    binding:          Hardcover
    publication date: July 15, 1994
    publisher name:   Tor Science Fiction
    number of pages:  352
    ISBN:             0812550706
    language:         English
    ...
}
Seems simple enough

What happens when we add another vertical...
            say music albums
Album Product Schema
Product {

  General Product attributes stay the same:
    id:
    sku:
    product dimensions:
    shipping weight:
    MSRP:
    price:
    description:
    ...

  Album Specific attributes are different:
    artist:         MxPx
    title:          Panic
    release date:   June 7, 2005
    label:          Side One Dummy
    track listing:  [ The Darkest ...
    language:       English
    format:         CD
    ...
}
Okay, it’s getting hairy but
is still manageable, right?

    Now the business wants to sell jeans
Jeans Product Schema
Product {

  General Product attributes stay the same:
    id:
    sku:
    product dimensions:
    shipping weight:
    MSRP:
    price:
    description:
    ...

  Jeans specific attributes are totally different
  ... and not consistent across brands & make:
    brand:         Lucky
    gender:        Mens
    make:          Vintage
    style:         Straight Cut
    length:        34
    width:         34
    color:         Hipster
    material:      Cotton Blend
    ...
}
Now we’re screwed
We need a flexible
schema in RDBMS


    We got this ... right?
Many approaches to
dealing with unknown
unknowns in RDBMS


      None work well
EAV
             as popularized by Magento
“For purposes of flexibility, the Magento database heavily utilizes
an Entity-Attribute-Value (EAV) data model.

As is often the case, the cost of flexibility is complexity -
Magento is no exception.

The process of manipulating data in Magento is often more
“involved” than that typically experienced using traditional
relational tables.”
                         - Varien
EAV

•   Crazy SQL queries

•   Hundreds of joins in a query...
    or

•   Hundreds of queries joined in
    the application

•   No database enforced integrity
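
To make the cost concrete, here is a minimal sketch in plain JavaScript (no database involved) of what an application has to do to rebuild one product from EAV rows. The row layout, the single string-valued `value` column, and the names `eavRows` and `hydrateProduct` are assumptions for illustration; this is not Magento's actual schema.

```javascript
// Hypothetical EAV rows: one row per (entity, attribute, value) triple.
const eavRows = [
  { entityId: 1, attribute: "title", value: "Ender's Game" },
  { entityId: 1, attribute: "author", value: "Orson Scott Card" },
  { entityId: 1, attribute: "pages", value: "352" },
  { entityId: 2, attribute: "title", value: "Panic" },
  { entityId: 2, attribute: "artist", value: "MxPx" },
];

// Rebuild a single product by collecting its rows and pivoting them
// into one object. In SQL the same pivot usually costs a join (or a
// separate query) per attribute.
function hydrateProduct(rows, entityId) {
  const product = { id: entityId };
  for (const row of rows) {
    if (row.entityId === entityId) {
      // With one generic value column, every value comes back as a string.
      product[row.attribute] = row.value;
    }
  }
  return product;
}

const book = hydrateProduct(eavRows, 1);
console.log(book);
```

Note that `pages` comes back as the string "352": under the assumed layout nothing in the storage layer knows it was meant to be a number.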
Did I say crazy SQL
(this is a single query)




You may have trouble reading this in the back
Selecting a single product
Single Table Inheritance
            (insanely wide tables)

•   No data integrity enforcement

•   Can only use FKs for common
    elements

•   Very wasteful (but disk is cheap!)

•   Can’t effectively index
Generic Columns
•   No data integrity enforcement

•   No data type enforcement

•   Can only use FKs for common
    elements

•   Wasteful (but disk is cheap!)

•   Can’t index
Serialized in Blob
•   Not searchable

•   No integrity

•   All the disadvantages of a document
    store, but none of the advantages

•   Should never be used

•   One exception is Oracle XML,
    which operates similarly to a
    document store
Concrete Table Inheritance
    (a table for each product attribute set)

•   Allows for data integrity

•   Querying across attribute
    sets quite hard to do (lots of
    joins, OR statements and full
    table scanning)

•   New table needs to be
    created for each new
    attribute set
Class table inheritance
                  (single product table,
             each attribute set in own table)
•   Likely best solution within the
    constraint of SQL

•   Supports data type enforcement

•   No data integrity enforcement

•   Easy querying across categories (for
    browse pages) since common data
    in single table

•   Every set needs a new table

•   Requires a ton of foresight, as
    changes are very complicated
MongoDB to the
        Rescue
• Flexible (and sane) Schema
• Easily searchable
• Easily accessible
• Fast
Flexible schema
{
    sku: "00e8da9c",
    type: "Audio Album",
    title: "Hoss",
    description: "by Lagwagon",
    asin: "B0000007QG",

    shipping: {
       weight: 6,
       dimensions: {
          width: 10,
          height: 10,
          depth: 1
       },
    },

    pricing: {
       list: 1000,
       retail: 800,
       savings: 200,
       pct_savings: 20
    },

    details: {
      title: "Hoss",
      artist: "Lagwagon",
      genre: [ "Punk", "Hardcore", "Indie Rock" ],
      label: "Fat Wreck Chords",
      number_of_discs: 1,
      issue_date: "November 21, 1995",
      format: "CD",
      alternate_formats: [ 'Vinyl', 'MP3' ],
      tracks: [
         "Kids Don't Like To Share",
         "Violins",
         "Name Dropping",
         "Bombs Away",
         "Move The Car",
         "Sleep",
         "Sick",
         "Rifle",
         "Weak",
         "Black Eye",
         "Bro Dependent",
         "Razor Burn",
         "Shaving Your Head",
         "Ride The Snake",
      ],
    },
}

{
    sku: "00e8da9d",
    type: "Film",
    title: "The Matrix",
    description: "Set in the 22nd century, The Matrix...",
    asin: "B000P0J0AQ",

    shipping: {
       weight: 6,
       dimensions: {
          width: 10,
          height: 10,
          depth: 1
       },
    },

    pricing: {
       list: 1200,
       retail: 1100,
       savings: 100,
       pct_savings: 8.5
    },

    details: {
      title: "The Matrix",
      director: [ "Andy Wachowski", "Larry Wa…
      writer: [ "Andy Wachowski", "Larry Wach…
      actor: [ "Keanu Reeves" , "Lawrence Fis…
      genre: [ "Science Fiction", "Action" ],
      number_of_discs: 1,
      issue_date: "May 15 2007",
      original_release_date: "1999",
      disc_format: "DVD",
      rating: "R",
      alternate_formats: [ 'VHS', 'Bluray' ],
      run_time: "136",
      studio: "Warner Bros",
      language: "English",
      format: [ "AC-3", "Closed-captioned", "…
      aspect_ratio: "1.66:1"
    },
}
Queries
db.products.find( { 'name': "The Matrix" } );


 {
     "_id": ObjectId("4d8ad78b46b731a22943d3d3"),
     "sku": "00e8da9d",
     "type": "Film",
     "name": "The Matrix",
     "description": "Set in the 22nd century, The Matrix...",
     "asin": "B000P0J0AQ",
     "shipping": {
         "weight": 6,
         "dimensions": {
             "width": 10,
             "height": 10,
             "depth": 1
         }
     },
     "pricing": {
db.products.find( { 'details.actor': "Groucho Marx" } );


 },
 "pricing": {
     "list": 1000,
     "retail": 800,
     "savings": 200,
     "pct_savings": 20
 },
 "details": {
     "title": "A Night at the Opera",
     "director": "Sam Wood",
     "actor": ["Groucho Marx", "Chico Marx", "Harpo Marx"],
     "genre": "Comedy",
     "number_of_discs": 1,
     "issue_date": "May 4 2004",
     "original_release_date": "1935",
     "disc_format": "DVD",
db.products.find( {
     'details.genre': "Jazz", 'details.format': "CD"
} );


     "list": 1200,
     "retail": 1100,
     "savings": 100,
     "pct_savings": 8
 },
 "details": {
     "title": "A Love Supreme [Original Recording Reissued]",
     "artist": "John Coltrane",
     "genre": ["Jazz", "General"],
     "format": "CD",
     "label": "Impulse Records",
     "number_of_discs": 1,
     "issue_date": "December 9, 1964",
     "alternate_formats": ["Vinyl", "MP3"],
     "tracks": [
     "A Love Supreme Part I: Acknowledgement",
db.products.find( { 'details.actor':
     { $all: ['James Stewart', 'Donna Reed'] }
} );


 },
 "details": {
     "title": "It's a Wonderful Life",
     "director": "Frank Capra",
     "actor": ["James Stewart", "Donna Reed", "Lionel Barrymore"],
     "writer": [
     "Frank Capra",
     "Albert Hackett",
     "Frances Goodrich",
     "Jo Swerling",
     "Michael Wilson"
     ],
     "genre": "Drama",
     "number_of_discs": 1,
     "issue_date": "Oct 31 2006",
     "original_release_date": "1947",
Wanna Play?

•   grab products.js from
    http://github.com/spf13/mongoProducts
•   mongo --shell products.js

•   > use mongoProducts
Embedded documents
 are great for orders
• Ordered items need to be fixed at the time
  of purchase
• Embed them right in the order
db.order.find( { 'items.sku': '00e8da9f' } );
db.order.find( {
    'items.details.actor': 'James Stewart'
} ).count();
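
As a rough illustration of why embedding fixes the ordered items at purchase time, the sketch below copies a product into an order and then changes the catalog. It runs in plain JavaScript against in-memory objects rather than MongoDB; the `catalog` contents, the sku "demo-001", and the helper `buildOrder` are hypothetical.

```javascript
// Hypothetical catalog entry; field names follow the product documents above.
const catalog = {
  "demo-001": {
    sku: "demo-001",
    type: "Film",
    pricing: { retail: 1100 },
    details: { title: "Example Film", actor: ["Example Actor"] },
  },
};

// Copy each product into the order at purchase time, so later catalog
// edits (price changes, renamed titles) cannot alter past orders.
function buildOrder(orderId, skus) {
  const items = skus.map((sku) => JSON.parse(JSON.stringify(catalog[sku])));
  return { _id: orderId, items };
}

const order = buildOrder(1, ["demo-001"]);

// A later price change in the catalog...
catalog["demo-001"].pricing.retail = 900;

// ...does not touch the copy embedded in the order.
console.log(order.items[0].pricing.retail); // 1100

// In-memory equivalent of db.order.find({ 'items.sku': 'demo-001' })
const matches = [order].filter((o) =>
  o.items.some((item) => item.sku === "demo-001")
);
console.log(matches.length); // 1
```

Because the whole product is embedded, the order can still be searched by any product attribute, which is what the `items.details.actor` query above relies on.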
Why not NoSQL?


Using the right solution for each situation
Data (like people) are
really sensitive when it
   comes to money
Stricter data
   requirements for $$

• For financial systems any data inconsistency
  is unacceptable
• Perhaps you’ve heard of ACID?
What about ACID?


Q: Is MongoDB ACID?
A: Kinda
Atomicity

• MongoDB does atomic writes
    ... for single document changesets


•   $set, $unset, $inc, $push,
    $pushAll, $pull, $pullAll, $bit
Consistency

• MongoDB can enforce unique keys
  ... but only on keys shared by every
  document in the collection
• MongoDB can't enforce referential integrity
Isolation
•   // Pseudo-isolated updates
    db.foo.update( { x : 1 } , { $inc : { y : 1 } } , false , true );

•   // Isolated updates
    db.foo.update( { x : 1 , $atomic : 1 } , { $inc : { y : 1 } } , false , true );

•   But there are caveats...

     •    Despite the $atomic keyword, this is not an atomic update,
          since atomicity implies “all or nothing”

     •    An isolated update can only act on a single collection.
          Multi-collection updates are not transactional, thus not isolatable.
Durability


• Mongo has this one covered
What does
MongoDB Support?
•   Atomic single document writes
    •   If you need atomic writes across multi-document
        transactions, don't use Mongo
    •   Many e-commerce transactions could be
        accomplished within a single document write
•   Unique indexes
    •   This only works on keys used by the entire collection
•   Isolated (not atomic) single collection updates.
    •   Mongo does not support locking
    •   There are ways to work around this
•   It’s durable
There are ways to
guarantee ACID properties
 in inconsistent databases
 (or, as we call them, consistency impaired databases)
Optimistic concurrency
• Read the current state of a product
• Make your changes with the assertion that
  your product has the same state as it did
  when you last read it
Optimistic concurrency
    in MongoDB
    We’ll use an update-if-current strategy.
    This example is straight from the documentation:

>   t = db.inventory
>   p = t.findOne({sku:'abc'})
>   t.update({_id:p._id, qty:p.qty}, {'$inc': {qty: -1}});
>   db.$cmd.findOne({getlasterror:1});

{"err" : null , "updatedExisting" : true , "n" : 1 , "ok" : 1}
// it worked



    ... If that didn't work, try again until it does.
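
The "try again until it does" step can be sketched as a bounded retry loop. This is a plain JavaScript simulation, not driver code: the `inventory` map stands in for the collection, `decrementIfCurrent` mimics the conditional update shown above, and `purchase` and its `interfere` hook are hypothetical names used only to simulate a competing writer.

```javascript
// In-memory stand-in for the inventory collection (an assumption for
// illustration; real code would issue the conditional update shown above).
const inventory = new Map([["abc", { sku: "abc", qty: 3 }]]);

// Mimics update({sku, qty: expectedQty}, {$inc: {qty: -1}}): applies the
// change only if qty still equals the value we read earlier.
// Returns the number of documents updated (0 or 1).
function decrementIfCurrent(sku, expectedQty) {
  const doc = inventory.get(sku);
  if (!doc || doc.qty !== expectedQty) return 0;
  doc.qty -= 1;
  return 1;
}

// Read, attempt the conditional update, and retry on conflict.
// interfere() is a hypothetical hook that lets the demo change the
// document between our read and our write.
function purchase(sku, maxRetries, interfere = () => {}) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const seen = inventory.get(sku);
    if (!seen || seen.qty <= 0) return { ok: false, attempts: attempt };
    const expectedQty = seen.qty; // the state we are asserting against
    interfere(attempt);
    if (decrementIfCurrent(sku, expectedQty) === 1) {
      return { ok: true, attempts: attempt };
    }
  }
  return { ok: false, attempts: maxRetries };
}

// Another buyer sneaks in during our first attempt only.
const result = purchase("abc", 5, (attempt) => {
  if (attempt === 1) inventory.get("abc").qty -= 1;
});
console.log(result, inventory.get("abc").qty);
```

The retry cap is a design choice of this sketch: with many competing writers, each failed attempt costs another read and write, so an unbounded loop can spin for a long time.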
Optimistic concurrency
• Read the current state of a product.
• Make your changes with the assertion that
    your product has the same state as it did
    when you last read it.
•   It's possible to use OCC to bootstrap
    pessimistic concurrency and fake row level
    locking
    ... ask me about this some time
Optimistic concurrency
 control assumes an
environment with low
   data contention
OCC works great for
companies like Amazon

• Amazon has a long-tail catalog
• A long-tail catalog lends itself well to
  optimistic concurrency, because it has low
  data contention
OCC fails miserably for
• eBay
• Gilt
• Groupon
• OpenSky
• Living Social
• InsertFlashSaleSiteOfTheMinute
Flash sales and auctions
are defined by high data
       contention

• The model doesn't work otherwise
• They can't afford to be optimistic
Can we use pessimistic
 concurrency with a
 distributed NoSQL
       database?
Yep.
Blending
NoSQL & RDBMS


Using the right solution for each situation
Our goal is to put as much
  in Mongo as possible

• What makes more sense in RDBMS?
 • Inventory
 • Orders
Inventory requires


• Row level locking (or table level locking)
Orders require

• Row level locking (or table level locking)
• Atomic writes (inventory decremented)
• Transactions (3rd party processing)
Inventory & checkout
     transactions
Commerce is ACID
   In Real Life
1. I go to Barneys and see a pair of shoes I just have to
   buy.
2. I call “dibs” (by grabbing them off the shelf).
3. I take them up to the cash register and purchase them:
    •   Store inventory has been manually decremented.
    •   I pay for them with my trusty AmEx.
4. If all goes according to plan, I walk out of the store.
5. If my card was declined, the shoes are “rolled back”
   ... out onto the shelves and sold to the next customer
   who wants them.
We follow the same
model for e-commerce
1. Select a product.

2. Lock the row or table and confirm inventory.

3. Purchase the product:

  •   Decrement product inventory

  •   Process payment

4. Commit the transaction.

5. Roll back if anything went wrong.
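
The five steps above can be sketched as a single function. This is an in-memory JavaScript illustration only: `stock` and `locks` stand in for the RDBMS table and its row locks, and `chargeCard` is a hypothetical payment callback. A real implementation would rely on the database's own locking and transactions; JavaScript here is single-threaded, so the lock is purely symbolic.

```javascript
// In-memory stand-ins (assumptions): a stock table and a set of held row locks.
const stock = { shoes: 1 };
const locks = new Set();

// chargeCard is a hypothetical payment callback that returns true on success.
function checkout(productId, chargeCard) {
  if (locks.has(productId)) throw new Error("row is locked by another buyer");
  locks.add(productId);              // step 2: lock the row
  const before = stock[productId];   // remember the state for rollback
  try {
    if (!(before >= 1)) throw new Error("out of stock");    // step 2: confirm inventory
    stock[productId] = before - 1;                          // step 3: decrement inventory
    if (!chargeCard()) throw new Error("payment declined"); // step 3: process payment
    return { committed: true };                             // step 4: commit
  } catch (err) {
    stock[productId] = before;                              // step 5: roll back
    return { committed: false, reason: err.message };
  } finally {
    locks.delete(productId);         // always release the lock
  }
}

const declined = checkout("shoes", () => false);
const paid = checkout("shoes", () => true);
const soldOut = checkout("shoes", () => true);
console.log(declined, paid, soldOut, stock.shoes);
```

The declined purchase puts the shoes back on the shelf, so the next buyer can still get them; the third buyer is refused because the confirmed inventory is zero.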
Doctrine (ORM/ODM)
    to the rescue
   It would be possible without them,
      but we're not that masochistic
Data we store in SQL

• Order
• Order/Shipment
• Order/Transaction
• Inventory
Data we store in
  MongoDB
•   User
•   Product
•   Product/Sellable
•   Address
•   Cart
•   CreditCard
•   Event
•   TaxRate
•   ... and then I got tired of typing them in
•   Just imagine this list has 40 more classes
•   ...
We have the
most boring SQL
  schema ever
CREATE TABLE `product_inventory` (
   `product_id` char(32) NOT NULL,
   `inventory` int(11) NOT NULL DEFAULT '0',
   PRIMARY KEY (`product_id`)
);

CREATE TABLE `sellable_inventory` (
   `sellable_id` char(32) NOT NULL,
   `inventory` int(11) NOT NULL DEFAULT '0',
   PRIMARY KEY (`sellable_id`)
);

CREATE TABLE `orders` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `userId` char(32) NOT NULL,
  `shippingName` varchar(255) DEFAULT NULL,
  `shippingAddress1` varchar(255) DEFAULT NULL,
  `shippingAddress2` varchar(255) DEFAULT NULL,
  `shippingCity` varchar(255) DEFAULT NULL,
  `shippingState` varchar(2) DEFAULT NULL,
  `shippingZip` varchar(255) DEFAULT NULL,
  `billingName` varchar(255) DEFAULT NULL,
  `billingAddress1` varchar(255) DEFAULT NULL,
  `billingAddress2` varchar(255) DEFAULT NULL,
  `billingCity` varchar(255) DEFAULT NULL,
Wait. How does
 inventory live in SQL?
Isn’t that a property in one of your Mongo collections?
I thought you’d
   never ask!
CREATE TABLE `product_inventory` (
   `product_id` char(32) NOT NULL,
   `inventory` int(11) NOT NULL DEFAULT '0',
   PRIMARY KEY (`product_id`)
);

CREATE TABLE `sellable_inventory` (
   `sellable_id` char(32) NOT NULL,
   `inventory` int(11) NOT NULL DEFAULT '0',
   PRIMARY KEY (`sellable_id`)
);

Inventory is transient
•   Product::$inventory is effectively a
    transient property
• Note how I said “effectively”? ... we cheat
    and persist our transient property to
    MongoDB as well
• We can do this because we never really
    trust the value stored in Mongo
Accuracy is only important
 when there’s contention
• For display, sorting and alerts, we can use
  the value stashed in MongoDB
  • It’s faster
  • It’s accurate enough
• For financial transactions, we want the
  security and comfort of our RDBMS.
We keep inventory in
 sync with listeners
• Every time a new product is created, its
  inventory is inserted in SQL
• Every time an order is placed, inventory is
  verified and decremented
• Whenever the SQL inventory changes, it is
  saved to MongoDB as well
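
One way to picture the listener arrangement described above is the following in-memory JavaScript sketch. It is not OpenSky's Doctrine code: `sqlInventory` plays the RDBMS table, `mongoProducts` plays the products collection, and `setSqlInventory`, `createProduct` and `placeOrder` are hypothetical names.

```javascript
// In-memory stand-ins (assumptions): sqlInventory plays the RDBMS table,
// mongoProducts plays the products collection.
const sqlInventory = new Map();
const mongoProducts = new Map();
const inventoryListeners = [];

// Single write path for the authoritative (SQL) value; notifies listeners.
function setSqlInventory(productId, qty) {
  sqlInventory.set(productId, qty);
  for (const listener of inventoryListeners) listener(productId, qty);
}

// Listener: whenever SQL inventory changes, copy it to the Mongo document.
inventoryListeners.push((productId, qty) => {
  const doc = mongoProducts.get(productId);
  if (doc) doc.inventory = qty;
});

// Creating a product also inserts its inventory row in SQL.
function createProduct(productId, title, qty) {
  mongoProducts.set(productId, { _id: productId, title, inventory: 0 });
  setSqlInventory(productId, qty);
}

// Placing an order verifies and decrements against SQL, never against Mongo.
function placeOrder(productId) {
  const qty = sqlInventory.get(productId) ?? 0;
  if (qty < 1) return false;
  setSqlInventory(productId, qty - 1);
  return true;
}

createProduct("p1", "Test Product", 2);
placeOrder("p1");
console.log(sqlInventory.get("p1"), mongoProducts.get("p1").inventory);
```

Funnelling every change through one write path is what keeps the copy in MongoDB from drifting: the copy is only ever written as a side effect of a SQL change.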
Be careful what you lock
1. Acquire inventory row lock and begin transaction
2. Check current product inventory
3. Decrement product inventory
4. Write the Order to SQL
5. Update affected MongoDB documents
6. Commit the transaction
7. Release product inventory lock
Making MongoDB
and RDBMS relations
      play nice
Products are
documents stored
  in MongoDB
/** @mongodb:Document(collection="products") */
class Product
{
    /** @mongodb:Id */
    private $id;

    /** @mongodb:String */
    private $title;

    public function getId()
    {
        return $this->id;
    }

    public function getTitle()
    {
        return $this->title;
    }

    public function setTitle($title)
    {
        $this->title = $title;
    }
}
Orders are entities
stored in an RDBMS
/**
 * @orm:Entity
 * @orm:Table(name="orders")
 * @orm:HasLifecycleCallbacks
 */
class Order
{
    /**
     * @orm:Id @orm:Column(type="integer")
     * @orm:GeneratedValue(strategy="AUTO")
     */
    private $id;

    /**
     * @orm:Column(type="string")
     */
    private $productId;

    /**
     * @var Documents\Product
     */
    private $product;

    // ...
}
So how does an
     RDBMS have a
reference to something
 outside the database?
Setting the Product
class Order {

    // ...

    public function setProduct(Product $product)
    {
        $this->productId = $product->getId();
        $this->product = $product;
    }
}
•   $productId is mapped and persisted

•   $product, which stores the Product
    instance, is not a persistent entity property
Retrieving our
product later
OrderPostLoadListener
use Doctrine\ODM\MongoDB\DocumentManager;
use Doctrine\ORM\Event\LifecycleEventArgs;

class OrderPostLoadListener
{
    /** @var DocumentManager */
    private $dm;

    // The listener needs the ODM document manager to build references
    public function __construct(DocumentManager $dm)
    {
        $this->dm = $dm;
    }

    public function postLoad(LifecycleEventArgs $eventArgs)
    {
        // get the order entity
        $order = $eventArgs->getEntity();

        // get odm reference to order.product_id
        $productId = $order->getProductId();
        $product = $this->dm->getReference('MyBundle:Document\Product', $productId);

        // set the product on the order
        $em = $eventArgs->getEntityManager();
        $productReflProp = $em->getClassMetadata('MyBundle:Entity\Order')
            ->reflClass->getProperty('product');
        $productReflProp->setAccessible(true);
        $productReflProp->setValue($order, $product);
    }
}
All Together Now
// Create a new product and order
$product = new Product();
$product->setTitle('Test Product');
$dm->persist($product);
$dm->flush();

$order = new Order();
$order->setProduct($product);
$em->persist($order);
$em->flush();

// Find the order later
$order = $em->find('Order', $order->getId());

// Instance of an uninitialized product proxy
$product = $order->getProduct();

// Initializes proxy and queries the MongoDB database
echo "Order Title: " . $product->getTitle();
print_r($order);
Read more about
       this technique
Jon Wage, one of OpenSky’s engineers, first
wrote about this technique on his personal
blog: http://jwage.com

You can read the full article here:
http://jwage.com/2010/08/25/blending-the-doctrine-orm-and-mongodb-odm/
Questions?
   http://spf13.com
      @spf13

   http://justinhileman.com
      @bobthecow

   http://shopopensky.com


PS: We’re hiring!! Contact us at
    jobs@shopopensky.com
Augmenting RDBMS with MongoDB for ecommerce

Augmenting RDBMS with MongoDB for ecommerce

  • 1.
    Augmenting RDBMS with MongoDB for e-commerce
  • 2.
    My name is SteveFrancia @spf13
  • 3.
    • 15+ yearsbuilding e-commerce • Long time open source contributor • Entrepreneur • Hacker, father, husband, skate punk
  • 4.
    My name is JustinHileman @bobthecow
  • 5.
    • 10+ yearsmaking the Internet awesomer • Open Source contributor • Vespa rider, swing dancer, coder, standardista
  • 6.
    We work forOpenSky http://shopopensky.com
  • 7.
    OpenSky is a new way to shop OpenSky connects you with innovators, trendsetters and tastemakers.You choose the ones you like and each week they invite you to their private online sales.
  • 8.
    OpenSky Loves Open Source • PHP 5.3 • Apache2 • Symfony2 • Doctrine2 • jQuery • Mule • HornetQ • MongoDB • nginx • varnish
  • 9.
    We contribute tomany open source projects and pioneer innovative solutions using them
  • 10.
    OpenSky was thefirst e-commerce site built on MongoDB ... also the first e-commerce site built on Symfony2
  • 11.
    Why NoSQL for e-commerce? Using the right solution for each situation
  • 12.
    Data dilemma of e-commerce Pick One
  • 13.
    Data dilemma of e-commerce Pick One • Stick to one vertical (Sane schema)
  • 14.
    Data dilemma of e-commerce Pick One • Stick to one vertical (Sane schema) • Flexibility (Insane schema)
  • 15.
  • 16.
    Sane schema • Works... for a while
  • 17.
    Sane schema • Works... for a while • Fine for a few types of products
  • 18.
    Sane schema • Works... for a while • Fine for a few types of products • Not possible when more product types introduced
  • 19.
  • 20.
    Let’s Use anExample How about we start with books
  • 21.
    Book Product Schema Product{ id: sku: General Product product dimensions: shipping weight: attributes MSRP: price: description: ... author: Orson Scott Card title: Enders Game binding: Hardcover publication date: July 15, 1994 Book Specific publisher name: Tor Science Fiction attributes number of pages: 352 ISBN: 0812550706 language: English ...
  • 22.
  • 23.
    Seems simple enough Whathappens when we add another vertical... say music albums
  • 24.
    Album Product Schema Product{ id: sku: General Product product dimensions: attributes stay the shipping weight: MSRP: same price: description: ... artist: MxPx title: Panic Album Specific release date: June 7, 2005 attributes are label: Side One Dummy track listing: [ The Darkest ... different language: English format: CD ...
  • 25.
    Okay, it’s gettinghairy but is still manageable, right?
  • 26.
    Okay, it’s gettinghairy but is still manageable, right? Now the business want to sell jeans
  • 27.
    Jeans Product Schema
      Product {
        General Product attributes stay the same:
          id:
          sku:
          product dimensions:
          shipping weight:
          MSRP:
          price:
          description:
          ...
        Jeans specific attributes are totally different ... and not consistent across brands & make:
          brand: Lucky
          gender: Mens
          make: Vintage
          style: Straight Cut
          length: 34
          width: 34
          color: Hipster
          material: Cotton Blend
          ...
      }
  • 28.
  • 30.
    We need a flexible schema in RDBMS. We got this ... right?
  • 32.
    Many approaches to dealing with unknown unknowns in RDBMS. None work well.
  • 33.
    EAV as popularized by Magento: “For purposes of flexibility, the Magento database heavily utilizes an Entity-Attribute-Value (EAV) data model. As is often the case, the cost of flexibility is complexity - Magento is no exception. The process of manipulating data in Magento is often more “involved” than that typically experienced using traditional relational tables.” - Varien
  • 34.
    EAV • Crazy SQL queries • Hundreds of joins in a query... or • Hundreds of queries joined in the application • No database enforced integrity
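To make the "hundreds of queries joined in the application" bullet concrete, here is a minimal sketch of what the application ends up doing with EAV data: pivoting one row per attribute back into one object per product. The row shape (`entityId`, `attribute`, `value`) and the sample values are assumptions for illustration, not Magento's actual schema.

```javascript
// Hypothetical EAV rows: one row per (entity, attribute) pair.
const eavRows = [
  { entityId: 1, attribute: "title", value: "Ender's Game" },
  { entityId: 1, attribute: "binding", value: "Hardcover" },
  { entityId: 2, attribute: "title", value: "Panic" },
  { entityId: 2, attribute: "format", value: "CD" },
];

// Pivot the rows back into one object per entity. In SQL this is the
// work that needs one join (or one extra query) per attribute.
function pivotEav(rows) {
  const entities = new Map();
  for (const { entityId, attribute, value } of rows) {
    if (!entities.has(entityId)) {
      entities.set(entityId, { id: entityId });
    }
    entities.get(entityId)[attribute] = value;
  }
  return [...entities.values()];
}

const pivoted = pivotEav(eavRows);
console.log(pivoted);
```

Note that nothing here checks types or required attributes, which is the "no database enforced integrity" problem in miniature.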
  • 36.
    Did I say crazy SQL? (This is a single query.) You may have trouble reading this in the back
  • 37.
  • 38.
    Single Table Inheritance (insanely wide tables) • No data integrity enforcement • Can only use FK for common elements • Very wasteful (but disk is cheap!) • Can’t effectively index
  • 39.
    Generic Columns • No data integrity enforcement • No data type enforcement • Can only use FK for common elements • Wasteful (but disk is cheap!) • Can’t index
  • 40.
    Serialized in Blob • Not searchable • No integrity • All the disadvantages of a document store, but none of the advantages • Should never be used • One exception is Oracle XML, which operates similarly to a document store
  • 41.
    Concrete Table Inheritance (a table for each product attribute set) • Allows for data integrity • Querying across attribute sets is quite hard to do (lots of joins, OR statements and full table scanning) • New table needs to be created for each new attribute set
  • 42.
    Class table inheritance (single product table, each attribute set in its own table) • Likely best solution within the constraint of SQL • Supports data type enforcement • No data integrity enforcement • Easy querying across categories (for browse pages) since common data is in a single table • Every set needs a new table • Requires a ton of foresight, as changes are very complicated
  • 43.
  • 47.
    MongoDB to the Rescue • Flexible (and sane) Schema • Easily searchable • Easily accessible • Fast
  • 48.
  • 49.
    {
      sku: "00e8da9c",
      type: "Audio Album",
      title: "Hoss",
      description: "by Lagwagon",
      asin: "B0000007QG",
      shipping: {
        weight: 6,
        dimensions: { width: 10, height: 10, depth: 1 }
      },
      pricing: { list: 1000, retail: 800, savings: 200, pct_savings: 20 },
      details: {
        title: "Hoss",
        ...
    {
      sku: "00e8da9d",
      type: "Film",
      title: "The Matrix",
      description: "Set in the 22nd century, Th...
      asin: "B000P0J0AQ",
      shipping: {
        weight: 6,
        dimensions: { width: 10, height: 10, depth: 1 }
      },
      pricing: { list: 1200, retail: 1100, savings: 100, pct_savings: 8.5 },
      details: {
        title: "The Matrix",
        ...
  • 50.
      details: {
        title: "Hoss",
        artist: "Lagwagon",
        genre: [ "Punk", "Hardcore", "Indie Rock" ],
        label: "Fat Wreck Chords",
        number_of_discs: 1,
        issue_date: "November 21, 1995",
        format: "CD",
        alternate_formats: [ 'Vinyl', 'MP3' ],
        tracks: [
          "Kids Don't Like To Share", "Violins", "Name Dropping", "Bombs Away",
          "Move The Car", "Sleep", "Sick", "Rifle", "Weak", "Black Eye",
          "Bro Dependent", "Razor Burn", "Shaving Your Head", "Ride The Snake",
        ],
        ...
      details: {
        title: "The Matrix",
        director: [ "Andy Wachowski", "Larry Wa...
        writer: [ "Andy Wachowski", "Larry Wach...
        actor: [ "Keanu Reeves" , "Lawrence Fis...
        genre: [ "Science Fiction", "Action" ],
        number_of_discs: 1,
        issue_date: "May 15 2007",
        original_release_date: "1999",
        disc_format: "DVD",
        rating: "R",
        alternate_formats: [ 'VHS', 'Bluray' ],
        run_time: "136",
        studio: "Warner Bros",
        language: "English",
        format: [ "AC-3", "Closed-captioned", "...
        aspect_ratio: "1.66:1"
      },
    }
  • 51.
  • 53.
    db.products.find( { 'name': "The Matrix" } );
      {
        "_id": ObjectId("4d8ad78b46b731a22943d3d3"),
        "sku": "00e8da9d",
        "type": "Film",
        "name": "The Matrix",
        "description": "Set in the 22nd century, The Matrix...",
        "asin": "B000P0J0AQ",
        "shipping": {
          "weight": 6,
          "dimensions": { "width": 10, "height": 10, "depth": 1 }
        },
        "pricing": {
        ...
  • 54.
  • 55.
    db.products.find( { 'details.actor': "Groucho Marx" } );
        ...
        },
        "pricing": { "list": 1000, "retail": 800, "savings": 200, "pct_savings": 20 },
        "details": {
          "title": "A Night at the Opera",
          "director": "Sam Wood",
          "actor": ["Groucho Marx", "Chico Marx", "Harpo Marx"],
          "genre": "Comedy",
          "number_of_discs": 1,
          "issue_date": "May 4 2004",
          "original_release_date": "1935",
          "disc_format": "DVD",
          ...
  • 57.
    db.products.find( { 'details.genre': "Jazz", 'details.format': "CD" } );
        ...
          "list": 1200, "retail": 1100, "savings": 100, "pct_savings": 8
        },
        "details": {
          "title": "A Love Supreme [Original Recording Reissued]",
          "artist": "John Coltrane",
          "genre": ["Jazz", "General"],
          "format": "CD",
          "label": "Impulse Records",
          "number_of_discs": 1,
          "issue_date": "December 9, 1964",
          "alternate_formats": ["Vinyl", "MP3"],
          "tracks": [
            "A Love Supreme Part I: Acknowledgement",
            ...
  • 59.
    db.products.find( { 'details.actor': { $all: ['James Stewart', 'Donna Reed'] } } );
        ...
        },
        "details": {
          "title": "It's a Wonderful Life",
          "director": "Frank Capra",
          "actor": ["James Stewart", "Donna Reed", "Lionel Barrymore"],
          "writer": [ "Frank Capra", "Albert Hackett", "Frances Goodrich", "Jo Swerling", "Michael Wilson" ],
          "genre": "Drama",
          "number_of_discs": 1,
          "issue_date": "Oct 31 2006",
          "original_release_date": "1947",
          ...
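The queries above lean on two matching rules: a dotted path reaches into embedded documents, and a plain value matches an array field if the array contains it, while `$all` requires every listed value. The following is a simplified in-memory model of those rules, not MongoDB's implementation; `getPath`, `matchesValue` and `matchesAll` are names invented for this sketch.

```javascript
// Resolve a dotted path such as "details.actor" against a plain object.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Equality match: a scalar field must equal the value,
// an array field must contain it.
function matchesValue(doc, path, expected) {
  const actual = getPath(doc, path);
  return Array.isArray(actual) ? actual.includes(expected) : actual === expected;
}

// $all-style match: the field must contain every listed value.
function matchesAll(doc, path, expectedValues) {
  return expectedValues.every((value) => matchesValue(doc, path, value));
}

const film = {
  details: {
    title: "It's a Wonderful Life",
    genre: "Drama",
    actor: ["James Stewart", "Donna Reed", "Lionel Barrymore"],
  },
};

console.log(matchesValue(film, "details.actor", "Donna Reed"));
console.log(matchesAll(film, "details.actor", ["James Stewart", "Donna Reed"]));
console.log(matchesAll(film, "details.actor", ["James Stewart", "Groucho Marx"]));
```

This is why `genre` can be a string on one product and an array on another without the query changing.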
  • 60.
    Wanna Play? • grab products.js from http://github.com/spf13/mongoProducts • mongo --shell products.js • > use mongoProducts
  • 61.
    Embedded documents are great for orders
    • Ordered items need to be fixed at the time of purchase
    • Embed them right in the order
      db.order.find( { 'items.sku': '00e8da9f' } );
      db.order.find( { 'items.details.actor': 'James Stewart' } ).count();
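A minimal sketch of why embedding fixes the ordered items at purchase time: the order carries its own copy of each product, so later catalog edits do not touch it. The `createOrder` helper, the sku and the field values are hypothetical; only the general document shape follows the product examples in this deck.

```javascript
// Hypothetical catalog entry, shaped like the product documents above.
const catalogProduct = {
  sku: "example-sku",
  pricing: { retail: 800 },
  details: { title: "Example Film", actor: ["James Stewart"] },
};

// Copy each product into the order so that later catalog edits
// cannot change what was actually purchased.
function createOrder(userId, products) {
  return {
    userId,
    // A JSON round-trip is a simple deep copy for plain data like this.
    items: products.map((p) => JSON.parse(JSON.stringify(p))),
  };
}

const placedOrder = createOrder("user-1", [catalogProduct]);

// The catalog price changes after the purchase...
catalogProduct.pricing.retail = 900;

// ...but the order still holds the purchase-time price.
console.log(placedOrder.items[0].pricing.retail);
```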
  • 62.
    Why not NoSQL? Using the right solution for each situation
  • 63.
    Data (like people) are really sensitive when it comes to money
  • 64.
  • 66.
    Stricter data requirements for $$ • For financial systems any data inconsistency is unacceptable • Perhaps you’ve heard of ACID?
  • 67.
  • 69.
    What about ACID? Q: Is MongoDB ACID? A: Kinda
  • 70.
  • 71.
  • 73.
    Atomicity • MongoDB does atomic writes ... for single document changesets • $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit
  • 74.
  • 77.
    Consistency • MongoDB can enforce unique keys ... but only on keys shared by every document in the collection • MongoDB can't enforce referential integrity
  • 78.
  • 83.
    Isolation
    • // Pseudo-isolated updates
      db.foo.update( { x : 1 }, { $inc : { y : 1 } }, false, true );
    • // Isolated updates
      db.foo.update( { x : 1, $atomic : 1 }, { $inc : { y : 1 } }, false, true );
    • But there are caveats...
      • Despite the $atomic keyword, this is not an atomic update, since atomicity implies “all or nothing”
      • An isolated update can only act on a single collection. Multi-collection updates are not transactional, thus not isolatable.
  • 84.
  • 85.
    Durability • Mongo has this one covered
  • 86.
  • 91.
    • Atomic single document writes
      • If you need atomic writes across multi-document transactions, don't use Mongo
      • Many e-commerce transactions could be accomplished within a single document write
    • Unique indexes
      • This only works on keys used by the entire collection
    • Isolated (not atomic) single collection updates
      • Mongo does not support locking
      • There are ways to work around this
    • It’s durable
  • 93.
    There are ways to guarantee ACID properties in inconsistent databases (or, as we call them, consistency impaired databases)
  • 94.
  • 96.
    Optimistic concurrency • Read the current state of a product • Make your changes with the assertion that your product has the same state as it did when you last read it
  • 97.
  • 101.
    Optimistic concurrency in MongoDB
    We’ll use an update-if-current strategy. This example is straight from the documentation:
      > t = db.inventory
      > p = t.findOne({sku:'abc'})
      > t.update({_id:p._id, qty:p.qty}, {'$inc': {qty: -1}});
      > db.$cmd.findOne({getlasterror:1});
      {"err" : null, "updatedExisting" : true , "n" : 1 , "ok" : 1} // it worked
    ... If that didn't work, try again until it does.
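The "try again until it does" part can be sketched as a retry loop around the conditional write. This is an in-memory stand-in, not driver code: the `inventory` map, `decrementIfCurrent`, `purchaseOne`, the attempt limit and the out-of-stock check are all assumptions added for illustration.

```javascript
// In-memory stand-in for the inventory collection.
const inventory = new Map([["abc", { sku: "abc", qty: 2 }]]);

// Mimics update({sku, qty: expectedQty}, {$inc: {qty: -1}}):
// the write only applies if qty still has the value we read.
function decrementIfCurrent(sku, expectedQty) {
  const doc = inventory.get(sku);
  if (!doc || doc.qty !== expectedQty) {
    return false; // someone else changed it first
  }
  doc.qty -= 1;
  return true;
}

// Read, attempt the conditional write, and retry on conflict.
function purchaseOne(sku, maxAttempts = 5, beforeWrite = () => {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = inventory.get(sku);
    if (!current || current.qty <= 0) {
      return { ok: false, reason: "out of stock", attempts: attempt };
    }
    const seenQty = current.qty;
    beforeWrite(attempt); // hook used below to simulate a competing writer
    if (decrementIfCurrent(sku, seenQty)) {
      return { ok: true, attempts: attempt };
    }
  }
  return { ok: false, reason: "too much contention", attempts: maxAttempts };
}

// Simulate another buyer sneaking in between our read and our first write.
const occResult = purchaseOne("abc", 5, (attempt) => {
  if (attempt === 1) inventory.get("abc").qty -= 1;
});
console.log(occResult, inventory.get("abc"));
```

With one competing writer the purchase succeeds on the second attempt. Every extra competitor costs another trip around the loop, which is where the low-contention assumption comes from.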
  • 104.
    Optimistic concurrency • Read the current state of a product. • Make your changes with the assertion that your product has the same state as it did when you last read it. • It's possible to use OCC to bootstrap pessimistic concurrency and fake row level locking ... ask me about this some time
  • 105.
    Optimistic concurrency control assumes an environment with low data contention
  • 106.
    OCC works great for companies like Amazon • Amazon has a long-tail catalog • A long-tail catalog lends itself well to optimistic concurrency, because it has low data contention
  • 107.
  • 113.
    OCC fails miserably for • eBay • Gilt • Groupon • OpenSky • Living Social • InsertFlashSaleSiteOfTheMinute
  • 116.
    Flash sales and auctions are defined by high data contention • The model doesn't work otherwise • They can't afford to be optimistic
  • 117.
    Can we use pessimistic concurrency with a distributed NoSQL database?
  • 118.
  • 119.
    Blending NoSQL & RDBMS: Using the right solution for each situation
  • 120.
    Our goal is to put as much in Mongo as possible • What makes more sense in RDBMS? • Inventory • Orders
  • 121.
    Inventory requires • Row level locking (or table level locking)
  • 122.
    Orders require • Row level locking (or table level locking) • Atomic writes (inventory decremented) • Transactions (3rd party processing)
  • 123.
  • 124.
    Commerce is ACID In Real Life
  • 133.
    1. I go to Barneys and see a pair of shoes I just have to buy.
    2. I call “dibs” (by grabbing them off the shelf).
    3. I take them up to the cash register and purchase them:
       • Store inventory has been manually decremented.
       • I pay for them with my trusty AmEx.
    4. If all goes according to plan, I walk out of the store.
    5. If my card was declined, the shoes are “rolled back” ... out onto the shelves and sold to the next customer who wants them.
  • 134.
    We follow the same model for e-commerce
  • 142.
    1. Select a product.
    2. Lock the row or table and confirm inventory.
    3. Purchase the product:
       • Decrement product inventory
       • Process payment
    4. Commit the transaction.
    5. Roll back if anything went wrong.
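The five steps above can be sketched as one function that either completes or restores the previous inventory. This is an in-memory illustration, not OpenSky's code: `stock`, `chargeCard` and its decline rule are invented, and the row lock is implicit because the sketch is synchronous and single-threaded. A real RDBMS supplies the lock and the rollback.

```javascript
// In-memory stand-in for the SQL inventory table.
const stock = new Map([["shoes", 1]]);

// Hypothetical payment gateway: declines any card number starting with "0".
function chargeCard(cardNumber, amount) {
  if (cardNumber.startsWith("0")) {
    throw new Error("card declined");
  }
  return { charged: amount };
}

function purchase(productId, cardNumber, amount) {
  const before = stock.get(productId); // step 2: confirm inventory
  if (!before || before < 1) {
    return { ok: false, reason: "out of stock" };
  }
  stock.set(productId, before - 1); // step 3: decrement inventory
  try {
    chargeCard(cardNumber, amount); // step 3: process payment
    return { ok: true }; // step 4: commit
  } catch (err) {
    stock.set(productId, before); // step 5: roll back
    return { ok: false, reason: err.message };
  }
}

const declined = purchase("shoes", "0000", 100);
const accepted = purchase("shoes", "4111", 100);
const soldOut = purchase("shoes", "4111", 100);
console.log(declined, accepted, soldOut);
```

The declined purchase puts the shoes back on the shelf, so the next customer can still buy them.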
  • 144.
    Doctrine (ORM/ODM) to the rescue. It would be possible without them, but we're not that masochistic
  • 145.
    Data we store in SQL • Order • Order/Shipment • Order/Transaction • Inventory
  • 147.
    Data we store in MongoDB • User • Product • Product/Sellable • Address • Cart • CreditCard • Event • TaxRate • ... and then I got tired of typing them in • Just imagine this list has 40 more classes
  • 148.
    We have the most boring SQL schema ever
  • 149.
    CREATE TABLE `product_inventory` (
      `product_id` char(32) NOT NULL,
      `inventory` int(11) NOT NULL DEFAULT '0',
      PRIMARY KEY (`product_id`)
    );
    CREATE TABLE `sellable_inventory` (
      `sellable_id` char(32) NOT NULL,
      `inventory` int(11) NOT NULL DEFAULT '0',
      PRIMARY KEY (`sellable_id`)
    );
    CREATE TABLE `orders` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `userId` char(32) NOT NULL,
      `shippingName` varchar(255) DEFAULT NULL,
      `shippingAddress1` varchar(255) DEFAULT NULL,
      `shippingAddress2` varchar(255) DEFAULT NULL,
      `shippingCity` varchar(255) DEFAULT NULL,
      `shippingState` varchar(2) DEFAULT NULL,
      `shippingZip` varchar(255) DEFAULT NULL,
      `billingName` varchar(255) DEFAULT NULL,
      `billingAddress1` varchar(255) DEFAULT NULL,
      `billingAddress2` varchar(255) DEFAULT NULL,
      `billingCity` varchar(255) DEFAULT NULL,
      ...
  • 150.
    Wait. How does inventory live in SQL? Isn’t that a property in one of your Mongo collections?
  • 151.
  • 153.
    Inventory is transient • Product::$inventory is effectively a transient property • Note how I said “effectively”? ... we cheat and persist our transient property to MongoDB as well • We can do this because we never really trust the value stored in Mongo
  • 156.
    Accuracy is only important when there’s contention • For display, sorting and alerts, we can use the value stashed in MongoDB • It’s faster • It’s accurate enough • For financial transactions, we want the security and comfort of our RDBMS.
  • 160.
    We keep inventory in sync with listeners • Every time a new product is created, its inventory is inserted in SQL • Every time an order is placed, inventory is verified and decremented • Whenever the SQL inventory changes, it is saved to MongoDB as well
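A rough sketch of the listener idea, with plain maps standing in for the two databases. The real system uses Doctrine event listeners; the function names and the tiny listener registry here are assumptions made for illustration.

```javascript
// Stand-ins for the two stores.
const sqlInventory = new Map(); // authoritative counts
const mongoProducts = new Map(); // product documents with a cached count

// Minimal listener registry for "SQL inventory changed" events.
const inventoryListeners = [];
function onInventoryChanged(listener) {
  inventoryListeners.push(listener);
}

// Listener: whenever the SQL count changes, copy it onto the document.
onInventoryChanged((productId, count) => {
  const doc = mongoProducts.get(productId);
  if (doc) doc.inventory = count;
});

function setSqlInventory(productId, count) {
  sqlInventory.set(productId, count);
  for (const listener of inventoryListeners) listener(productId, count);
}

// New product: the document is stored, then its inventory is inserted in SQL.
function createProduct(productId, title, initialInventory) {
  mongoProducts.set(productId, { title, inventory: 0 });
  setSqlInventory(productId, initialInventory);
}

// Order: verified against SQL, never against the cached copy.
function placeOrder(productId) {
  const count = sqlInventory.get(productId) || 0;
  if (count < 1) return false;
  setSqlInventory(productId, count - 1);
  return true;
}

createProduct("p1", "Test Product", 2);
const firstOrderPlaced = placeOrder("p1");
console.log(firstOrderPlaced, sqlInventory.get("p1"), mongoProducts.get("p1"));
```

The cached count on the document is only ever written by the listener, which is what makes it safe to treat as "accurate enough" for display.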
  • 161.
  • 162.
    Be careful what you lock
    1. Acquire inventory row lock and begin transaction
    2. Check current product inventory
    3. Decrement product inventory
    4. Write the Order to SQL
    5. Update affected MongoDB documents
    6. Commit the transaction
    7. Release product inventory lock
  • 163.
    Making MongoDB and RDBMS relations play nice
  • 164.
  • 165.
    /** @mongodb:Document(collection="products") */
    class Product
    {
        /** @mongodb:Id */
        private $id;

        /** @mongodb:String */
        private $title;

        public function getId()
        {
            return $this->id;
        }

        public function getTitle()
        {
            return $this->title;
        }

        public function setTitle($title)
        {
            $this->title = $title;
        }
    }
  • 166.
  • 167.
    /**
     * @orm:Entity
     * @orm:Table(name="orders")
     * @orm:HasLifecycleCallbacks
     */
    class Order
    {
        /**
         * @orm:Id @orm:Column(type="integer")
         * @orm:GeneratedValue(strategy="AUTO")
         */
        private $id;

        /**
         * @orm:Column(type="string")
         */
        private $productId;

        /**
         * @var Documents\Product
         */
        private $product;

        // ...
    }
  • 168.
    So how does an RDBMS have a reference to something outside the database?
  • 169.
    Setting the Product
    class Order
    {
        // ...

        public function setProduct(Product $product)
        {
            $this->productId = $product->getId();
            $this->product = $product;
        }
    }
  • 170.
    • $productId is mapped and persisted • $product, which stores the Product instance, is not a persistent entity property
  • 171.
  • 172.
    OrderPostLoadListener
    use Doctrine\ODM\MongoDB\DocumentManager;
    use Doctrine\ORM\Event\LifecycleEventArgs;

    class OrderPostLoadListener
    {
        private $dm;

        // Assumption: the document manager is injected here; the slide
        // uses $this->dm without showing where it is set.
        public function __construct(DocumentManager $dm)
        {
            $this->dm = $dm;
        }

        public function postLoad(LifecycleEventArgs $eventArgs)
        {
            // get the order entity
            $order = $eventArgs->getEntity();

            // get odm reference to order.product_id
            $productId = $order->getProductId();
            $product = $this->dm->getReference('MyBundle:Document\Product', $productId);

            // set the product on the order
            $em = $eventArgs->getEntityManager();
            $productReflProp = $em->getClassMetadata('MyBundle:Entity\Order')
                ->reflClass->getProperty('product');
            $productReflProp->setAccessible(true);
            $productReflProp->setValue($order, $product);
        }
    }
  • 173.
    All Together Now
    // Create a new product and order
    $product = new Product();
    $product->setTitle('Test Product');
    $dm->persist($product);
    $dm->flush();

    $order = new Order();
    $order->setProduct($product);
    $em->persist($order);
    $em->flush();

    // Find the order later
    $order = $em->find('Order', $order->getId());

    // Instance of an uninitialized product proxy
    $product = $order->getProduct();

    // Initializes proxy and queries the mongodb database
    echo "Order Title: " . $product->getTitle();
    print_r($order);
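The lazy-loading behaviour of the product proxy can be imitated in a few lines. This sketch is not how Doctrine builds its proxies; it only shows the idea that the reference knows its id up front and hits the document store on first access to any other property. `productCollection`, `findProduct` and `productReference` are invented names.

```javascript
// Stand-in for the MongoDB products collection, with a counter
// that shows when it is actually queried.
const productCollection = new Map([["p1", { id: "p1", title: "Test Product" }]]);
let queryCount = 0;

function findProduct(id) {
  queryCount += 1;
  return productCollection.get(id);
}

// Returns an object that knows its id but loads everything else
// on first property access.
function productReference(id) {
  let loaded = null;
  return new Proxy({}, {
    get(_target, prop) {
      if (prop === "id") return id; // the id never needs a query
      if (loaded === null) loaded = findProduct(id);
      return loaded[prop];
    },
  });
}

// An order row from SQL only carries productId; the "post load" step
// attaches the lazy reference.
const loadedOrder = { id: 1, productId: "p1" };
loadedOrder.product = productReference(loadedOrder.productId);

const queriesBeforeAccess = queryCount;
const loadedTitle = loadedOrder.product.title; // first real access triggers the lookup
console.log(queriesBeforeAccess, loadedTitle, queryCount);
```

Loading many orders therefore costs nothing on the MongoDB side until some code actually reads a product property.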
  • 174.
    Read more about this technique: Jon Wage, one of OpenSky’s engineers, first wrote about this technique on his personal blog: http://jwage.com You can read the full article here: http://jwage.com/2010/08/25/blending-the-doctrine-orm-and-mongodb-odm/
  • 175.
    Questions? http://spf13.com @spf13 http://justinhileman.com @bobthecow http://shopopensky.com PS: We’re hiring!! Contact us at jobs@shopopensky.com

Editor's Notes

  • #30 actually, just the first 1/3 of it.
  • #32 Ironically this is how magento solves the performance problems associated with EAV, by caching the data into insanely wide tables.
  • #36 Can’t create a FK as each set references a different table. “Key” really made of attribute table name id and attribute table name
  • #57–#58 Whenever you use a inter system coordination you need to implement your own atomic checks in the application... But SOAP does have transactions.. so not quite accurate. kyle idea... but we are fairly atomic with authorize.net
  • #59–#60 atomicity, consistency, isolation, durability.
  • #61–#63 Mongo has a grip of atomic operations: set, unset, etc.
  • #67–#71 update( { where }, { values }, upsert?, multiple? )
  • #79–#80 lemme show you an example
  • #87 Imagine what would happen if everyone tried to access the same record at the same time. Just think of all those spinning while loops :)
  • #98 And I’ll show you how OpenSky does it.
  • #100 Since we really like MongoDB, we want to keep as much of our data in Mongo as possible.
  • #104 Mind if I tell you a story?
  • #114–#120 But this sounds like it could get COMPLICATED...
  • #134 Given that split, we just happen to have the most boring SQL schema ever
  • #135 This is pretty much it. It goes on for a few more lines, with a few other properties flattened onto the order table.
  • #138 Back to the schema for a second. - Product ID here is a fake foreign key. - Inventory is a real integer. That’s all there is to this table.
  • #142–#144 And here’s why we like Doctrine so much.
  • #145–#151 This will look a bit like when I bought those shoes.
  • #154 The interesting parts here are the annotations. If you don’t speak PHP annotation, this stores a document with two properties (ID and title) in the `products` collection of a Mongo database.
  • #157 Did you notice the property named `product`? That’s not just a reference to another document, that’s a reference to an entirely different database paradigm. Check out the setter:
  • #158 This is key: we set both the product id and a reference to the product itself.
  • #159 When this document is saved in Mongo, the productId will end up in the database, but the product reference will disappear.
  • #161 This is one of those listeners I was telling you about. At a high level: 1. Every time an Order is loaded from the database, this listener is called. 2. The listener gets the Order’s product id, and creates a Doctrine proxy object. 3. It uses magick (e.g. reflection) to set the product property of the order to this new proxy.
  • #162 Here’s our inter-db relationship in action. Note that the product is lazily loaded from MongoDB. Because $product is a proxy, we don’t actually query Mongo until we try to access a property of $product (in this case the title).