Hybrid MongoDB Applications with Relational Databases
  • Given that split, we just happen to have the most boring SQL schema ever.
  • This is pretty much it. It goes on for a few more lines, with a few other properties flattened onto the order table.
  • Back to the schema for a second.
      - Product ID here is a fake foreign key.
      - Inventory is a real integer.
    That’s all there is to this table.
  • And here’s why we like Doctrine so much.
  • This will look a bit like when I bought those shoes.
  • The interesting parts here are the annotations. If you don’t speak PHP annotation, this stores a document with two properties, ID and title, in the `products` collection of a Mongo database.
  • Did you notice the property named `product`? That’s not just a reference to another document, that’s a reference to an entirely different database paradigm. Check out the setter:
  • This is key: we set both the product id and a reference to the product itself.
  • When the order is saved to the RDBMS, the productId will end up in the database, but the product reference will disappear.
  • This is one of those listeners I was telling you about. At a high level:
      1. Every time an Order is loaded from the database, this listener is called.
      2. The listener gets the Order’s product id, and creates a Doctrine proxy object.
      3. It uses magic (i.e. reflection) to set the product property of the order to this new proxy.
  • Here’s our inter-db relationship in action. Note that the product is lazily loaded from MongoDB. Because $product is a proxy, we don’t actually query Mongo until we try to access a property of $product (in this case the title).
  • Hybrid MongoDB and RDBMS Applications

    1. Hybrid MongoDB Applications with Relational Databases
    2. Today’s Agenda
       • Who I am
       • Why MongoDB (with intro)
       • Why Hybrid
       • Hybrid Case Studies
       • How OpenSky implemented Hybrid MySQL / MongoDB
    3. My name is Steve Francia (@spf13)
    4. • 15+ years building the internet (13 years using SQL)
       • Father, husband, skateboarder
       • Chief Solutions Architect @ 10gen, responsible for drivers, integrations, web & docs
    5. • Company behind MongoDB
           • AGPL license, own copyrights, engineering team
           • Support, consulting, commercial license revenue
       • Management
           • Google/DoubleClick, Oracle, Apple, NetApp
       • Funding: Sequoia, Union Square, Flybridge
       • Offices in NYC, Palo Alto, London & Dublin
       • 90+ employees
    6. Before 10gen I worked for http://opensky.com
    7. OpenSky was the first e-commerce site built on MongoDB
    8. Why MongoDB
    9-10. Why MongoDB: My Top 10 Reasons
       10. Great developer experience
        9. Speaks your language
        8. Scale horizontally
        7. Fully consistent data with atomic operations
        6. Memory caching integrated
        5. Open source
        4. Flexible, rich & structured data format, not just K:V
        3. Ludicrously fast (without going plaid)
        2. Simplify infrastructure & application
        1. It’s web scale
    11. MongoDB is
        • Document oriented
        • High performance
        • Fully consistent
        • Horizontally scalable
        Example document, as the application sees it:
            { author: "steve",
              date: new Date(),
              text: "About MongoDB...",
              tags: ["tech", "database"] }
    12. Under the hood
        • Written in C++
        • Runs on nearly anything
        • Data serialized to BSON
        • Extensive use of memory-mapped files, i.e. read-through, write-through memory caching
    13. Database Landscape (chart: scalability & performance plotted against depth of functionality, positioning Memcache, MongoDB and RDBMS)
    14. This has led some to say: “MongoDB has the best features of key/value stores, document databases and relational databases in one.” (John Nunemaker)
    15. Why Hybrid?
    16. Reasons to build a hybrid application
        • Friction in existing application caused by RDBMS
        • Transitioning an existing application to MongoDB
        • Using the right tool for the right job
        • Need some features not present in MongoDB
    17. Reasons not to build a hybrid application
        • Aggregation (at least not very soon)
        • Lack of clear understanding of needs
        • Backups
        • MongoDB as cache in front of SQL
        • Loads more...
    18. Hybrid applications... but I don’t want to complicate things
    19. Most RDBMS applications are already hybrid
    20. Typical RDBMS application (diagram: App talks to Memcache and to the RDBMS)
    21. Typical hybrid RDBMS application (diagram: App talks to MongoDB and to the RDBMS)
    22. Most of the same rules apply
        • Application partitions data between two (or more) systems.
        • Model layer tracks what content resides where.
    23. Hybrid is easier than RDBMS + Memcache
        • Always know where to find a piece of data.
        • Data never needs expiring.
        • Data not duplicated (for the most part) across systems.
        • Always handle a record the same way.
        • Developer freedom to choose the right tool for the right reasons.
    24. Typical RDBMS retrieval operation
        1. App asks Memcache: does the object exist, and is it up to date? If yes, then done.
        2. If no, App queries the RDBMS for it and retrieves the record(s).
        3. App replaces the object in Memcache.
        4. Repeat.
    25. Typical hybrid retrieval operation
        • App to MongoDB: find, return
        • OR App to RDBMS: query, return
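The single-lookup retrieval on slide 25 can be sketched roughly as follows. This is a minimal illustration in plain PHP 8, not OpenSky's code: `Store`, `InMemoryStore` and `HybridRepository` are hypothetical names, and arrays stand in for MongoDB and the RDBMS. The point is only that the model layer maps each type to exactly one store, so there is no cache check, expiry or refill step.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a data store (MongoDB or an RDBMS in the talk).
interface Store
{
    public function find(string $type, string $id): ?array;
}

// Array-backed store used only to keep this sketch self-contained.
final class InMemoryStore implements Store
{
    /** @param array<string, array<string, array>> $rows records keyed by type, then id */
    public function __construct(private array $rows) {}

    public function find(string $type, string $id): ?array
    {
        return $this->rows[$type][$id] ?? null;
    }
}

// The "model layer tracks what content resides where" rule as a lookup table.
final class HybridRepository
{
    /** @param array<string, Store> $storeByType each model type is owned by one store */
    public function __construct(private array $storeByType) {}

    public function find(string $type, string $id): ?array
    {
        if (!isset($this->storeByType[$type])) {
            throw new InvalidArgumentException("No store registered for type '$type'");
        }
        // One query against one system; nothing to expire or repopulate.
        return $this->storeByType[$type]->find($type, $id);
    }
}

$mongo = new InMemoryStore(['Product' => ['p1' => ['title' => 'Test Product']]]);
$sql   = new InMemoryStore(['Order' => ['1' => ['productId' => 'p1']]]);

$repo = new HybridRepository(['Product' => $mongo, 'Order' => $sql]);

$order   = $repo->find('Order', '1');
$product = $repo->find('Product', $order['productId']);
echo $product['title'], PHP_EOL; // Test Product
```

Compare this with slide 24: the Memcache variant needs a freshness check and a write-back on every miss, while here each type has a single home.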
    26-28. Typical RDBMS write operation
        1. App to RDBMS: insert or update row, confirm written.
        2. App assembles the row into object(s).
        3. App to Memcache: write object, write object, write object, write object...
        This goes on for a while, doesn’t it? One row can be in many objects, so there’s a lot of complication in updating everything.
    29-30. Typical hybrid write operation
        • App to MongoDB: save document, return
        • OR App to RDBMS: insert or update, return
    31. Hybrid Use Cases
    32. Archiving
        Why hybrid:
        • Existing application built on MySQL
        • Lots of friction with RDBMS-based archive storage
        • Needed a more scalable archive storage backend
        Solution:
        • Keep MySQL for active data (100 million), MongoDB for archive (2+ billion)
        Results:
        • No more ALTER TABLE statements taking over 2 months to run
        • Sharding fixed vertical scale problem
        • Very happily looking at other places to use MongoDB
    33. Reporting
        Why hybrid:
        • Most of the functionality written in MongoDB
        • Reporting team doesn’t want to learn MongoDB
        Solution:
        • Use MongoDB for active database, replicate to MySQL for reporting
        Results:
        • Developers happy
        • Business analysts happy
    34. E-commerce
        Why hybrid:
        • Multi-vertical product catalogue impossible to model in RDBMS
        • Needed the transaction support an RDBMS provides
        Solution:
        • MySQL for orders, MongoDB for everything else
        Results:
        • Massive simplification of code base
        • Rapid builds, halving time to market (and cost)
        • Eliminated need for external caching system
        • 50x+ improvement over MySQL alone
    35. How OpenSky implemented a hybrid MongoDB / MySQL solution (http://opensky.com)
    36. Doctrine (ORM/ODM) makes it easy
    37. Data to store in SQL
        • Order
        • Order/Shipment
        • Order/Transaction
        • Inventory
    38-39. Data to store in MongoDB
        • User
        • Product
        • Product/Sellable
        • Address
        • Cart
        • CreditCard
        • Event
        • TaxRate
        • ... and then I got tired of typing them in. Just imagine this list has 40 more classes.
    40. The most boring SQL schema ever
    41.
            CREATE TABLE `product_inventory` (
              `product_id` char(32) NOT NULL,
              `inventory` int(11) NOT NULL DEFAULT 0,
              PRIMARY KEY (`product_id`)
            );

            CREATE TABLE `sellable_inventory` (
              `sellable_id` char(32) NOT NULL,
              `inventory` int(11) NOT NULL DEFAULT 0,
              PRIMARY KEY (`sellable_id`)
            );

            CREATE TABLE `orders` (
              `id` int(11) NOT NULL AUTO_INCREMENT,
              `userId` char(32) NOT NULL,
              `shippingName` varchar(255) DEFAULT NULL,
              `shippingAddress1` varchar(255) DEFAULT NULL,
              `shippingAddress2` varchar(255) DEFAULT NULL,
              `shippingCity` varchar(255) DEFAULT NULL,
              `shippingState` varchar(2) DEFAULT NULL,
              `shippingZip` varchar(255) DEFAULT NULL,
              `billingName` varchar(255) DEFAULT NULL,
              `billingAddress1` varchar(255) DEFAULT NULL,
              `billingAddress2` varchar(255) DEFAULT NULL,
              `billingCity` varchar(255) DEFAULT NULL,
              ...
    42. Did you notice? Inventory is in SQL, but it’s also a property in your Mongo collections.
    43. The same schema as slide 41, shown again.
    44-45. Inventory is transient
        • Product::$inventory is effectively a transient property.
        • Note how I said “effectively”? ... We cheat and persist our transient property to MongoDB as well.
        • We can do this because we never really trust the value stored in Mongo.
    46-48. Accuracy is only important when there’s contention
        • For display, sorting and alerts, we can use the value stashed in MongoDB.
            • It’s faster.
            • It’s accurate enough.
        • For financial transactions, we want the multi-table transactions from our RDBMS.
    49-52. Inventory kept in sync with listeners
        • Every time a new product is created, its inventory is inserted in SQL.
        • Every time an order is placed, inventory is verified and decremented.
        • Whenever the SQL inventory changes, it is saved to MongoDB as well.
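One way to picture the three listener rules above is the following sketch. It is plain PHP 8 with no Doctrine involved: `SqlInventory` is a hypothetical class standing in for the `product_inventory` table, and the `$productDocuments` array stands in for the MongoDB `products` collection. The authoritative count lives on the SQL side, and every change is pushed to the document copy through a registered listener.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the authoritative SQL inventory table.
final class SqlInventory
{
    /** @var array<string, int> inventory per product id */
    private array $rows = [];

    /** @var list<callable(string, int): void> */
    private array $listeners = [];

    public function onChange(callable $listener): void
    {
        $this->listeners[] = $listener;
    }

    // Rule 1: a new product gets an inventory row.
    public function insert(string $productId, int $inventory): void
    {
        $this->rows[$productId] = $inventory;
        $this->notify($productId);
    }

    // Rule 2: placing an order verifies, then decrements.
    public function decrement(string $productId, int $quantity): void
    {
        $current = $this->rows[$productId] ?? 0;
        if ($current < $quantity) {
            throw new RuntimeException("Insufficient inventory for $productId");
        }
        $this->rows[$productId] = $current - $quantity;
        $this->notify($productId);
    }

    // Rule 3: every SQL change is reported to the listeners.
    private function notify(string $productId): void
    {
        foreach ($this->listeners as $listener) {
            $listener($productId, $this->rows[$productId]);
        }
    }
}

// Stand-in for the MongoDB products collection; its inventory is only a convenience copy.
$productDocuments = ['p1' => ['title' => 'Test Product', 'inventory' => null]];

$sql = new SqlInventory();
$sql->onChange(function (string $productId, int $inventory) use (&$productDocuments): void {
    $productDocuments[$productId]['inventory'] = $inventory;
});

$sql->insert('p1', 5);
$sql->decrement('p1', 2);
echo $productDocuments['p1']['inventory'], PHP_EOL; // 3
```

Display and sorting code would read the document copy; anything involving money would go back to the SQL side, as slides 46-48 describe.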
    53-54. Be careful what you lock
        1. Acquire inventory row lock and begin transaction
        2. Check current product inventory
        3. Decrement product inventory
        4. Write the Order to SQL
        5. Update affected MongoDB documents
        6. Commit the transaction
        7. Release product inventory lock
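The seven steps above might look something like this with PDO. This is a sketch under several assumptions: it runs against an in-memory SQLite database so that it is self-contained, whereas on MySQL/InnoDB the row lock in step 1 would typically come from adding `FOR UPDATE` to the `SELECT`; the `placeOrder` function, the simplified `orders` table with a `productId` column, and the `$productDocuments` array standing in for MongoDB are all hypothetical.

```php
<?php
declare(strict_types=1);

function placeOrder(PDO $pdo, array &$productDocuments, string $userId, string $productId): int
{
    // 1. Begin the transaction. On MySQL the SELECT below would carry FOR UPDATE
    //    to take the row lock; SQLite has no such clause, so it is omitted here.
    $pdo->beginTransaction();
    try {
        // 2. Check current product inventory.
        $select = $pdo->prepare('SELECT inventory FROM product_inventory WHERE product_id = ?');
        $select->execute([$productId]);
        $inventory = $select->fetchColumn();
        if ($inventory === false || (int) $inventory < 1) {
            throw new RuntimeException("Product $productId is out of stock");
        }

        // 3. Decrement product inventory.
        $pdo->prepare('UPDATE product_inventory SET inventory = inventory - 1 WHERE product_id = ?')
            ->execute([$productId]);

        // 4. Write the order to SQL.
        $pdo->prepare('INSERT INTO orders (userId, productId) VALUES (?, ?)')
            ->execute([$userId, $productId]);
        $orderId = (int) $pdo->lastInsertId();

        // 5. Update the affected MongoDB document (an array in this sketch).
        $productDocuments[$productId]['inventory'] = (int) $inventory - 1;

        // 6 and 7. Commit; the database releases its locks at this point.
        $pdo->commit();
        return $orderId;
    } catch (Throwable $e) {
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE product_inventory (product_id TEXT PRIMARY KEY, inventory INTEGER NOT NULL DEFAULT 0)');
$pdo->exec('CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, userId TEXT NOT NULL, productId TEXT NOT NULL)');
$pdo->exec("INSERT INTO product_inventory (product_id, inventory) VALUES ('p1', 1)");

$productDocuments = ['p1' => ['title' => 'Test Product', 'inventory' => 1]];

$orderId = placeOrder($pdo, $productDocuments, 'u1', 'p1');
echo $orderId, PHP_EOL;
```

If the SQL side rolls back after step 5, the document copy can be briefly wrong. That is tolerable only because, as slide 45 says, the value stored in Mongo is never really trusted.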
    55. Making MongoDB and RDBMS relations play nice
    56. Products are documents stored in MongoDB
    57.
            /** @mongodb:Document(collection="products") */
            class Product
            {
                /** @mongodb:Id */
                private $id;

                /** @mongodb:String */
                private $title;

                public function getId()
                {
                    return $this->id;
                }

                public function getTitle()
                {
                    return $this->title;
                }

                public function setTitle($title)
                {
                    $this->title = $title;
                }
            }
    58. Orders are entities stored in an RDBMS
    59.
            /**
             * @orm:Entity
             * @orm:Table(name="orders")
             * @orm:HasLifecycleCallbacks
             */
            class Order
            {
                /**
                 * @orm:Id @orm:Column(type="integer")
                 * @orm:GeneratedValue(strategy="AUTO")
                 */
                private $id;

                /**
                 * @orm:Column(type="string")
                 */
                private $productId;

                /**
                 * @var Documents\Product
                 */
                private $product;

                // ...
            }
    60. So how does an RDBMS have a reference to something outside the database?
    61. Setting the Product
            class Order
            {
                // ...

                public function setProduct(Product $product)
                {
                    // Keep the id (persisted) and the instance (not persisted) together.
                    $this->productId = $product->getId();
                    $this->product = $product;
                }
            }
    62. • $productId is mapped and persisted.
        • $product, which stores the Product instance, is not a persistent entity property.
    63. Retrieving our product later
    64. OrderPostLoadListener
            use Doctrine\ODM\MongoDB\DocumentManager;
            use Doctrine\ORM\Event\LifecycleEventArgs;

            class OrderPostLoadListener
            {
                /** @var DocumentManager */
                private $dm;

                // The slide uses $this->dm without showing where it comes from;
                // constructor injection is assumed here.
                public function __construct(DocumentManager $dm)
                {
                    $this->dm = $dm;
                }

                public function postLoad(LifecycleEventArgs $eventArgs)
                {
                    // get the order entity
                    $order = $eventArgs->getEntity();

                    // get odm reference to order.product_id
                    $productId = $order->getProductId();
                    $product = $this->dm->getReference('MyBundle:Document\Product', $productId);

                    // set the product on the order
                    $em = $eventArgs->getEntityManager();
                    $productReflProp = $em->getClassMetadata('MyBundle:Entity\Order')
                        ->reflClass->getProperty('product');
                    $productReflProp->setAccessible(true);
                    $productReflProp->setValue($order, $product);
                }
            }
    65. All Together Now
            // Create a new product and order
            $product = new Product();
            $product->setTitle('Test Product');
            $dm->persist($product);
            $dm->flush();

            $order = new Order();
            $order->setProduct($product);
            $em->persist($order);
            $em->flush();

            // Find the order later
            $order = $em->find('Order', $order->getId());

            // Instance of an uninitialized product proxy
            $product = $order->getProduct();

            // Initializes the proxy and queries the MongoDB database
            echo "Order Title: " . $product->getTitle();
            print_r($order);
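To see why the product is only fetched on first access, here is a stripped-down imitation of the mechanism in plain PHP (8.1 or later assumed, where reflection can write private properties without `setAccessible`). It is not Doctrine's proxy implementation: `LazyProductProxy`, the `$loader` closure and the `$queries` counter are hypothetical, and the `Product` and `Order` classes are simplified versions of the ones on the slides.

```php
<?php
declare(strict_types=1);

final class Product
{
    public function __construct(private string $id, private string $title) {}

    public function getTitle(): string
    {
        return $this->title;
    }
}

// Hypothetical lazy proxy: knows only the id until a real property is needed.
final class LazyProductProxy
{
    private ?Product $loaded = null;

    public function __construct(private string $id, private Closure $loader) {}

    public function getTitle(): string
    {
        // The first call runs the loader (the "MongoDB query"); later calls reuse the result.
        $this->loaded ??= ($this->loader)($this->id);
        return $this->loaded->getTitle();
    }
}

final class Order
{
    private $product = null; // not persisted, filled in after loading

    public function __construct(private string $productId) {}

    public function getProductId(): string
    {
        return $this->productId;
    }

    public function getProduct()
    {
        return $this->product;
    }
}

// Counts how often the document store is actually hit.
$queries = 0;
$loader = function (string $id) use (&$queries): Product {
    $queries++;
    return new Product($id, 'Test Product');
};

// An order as it comes back from SQL: only the product id is set.
$order = new Order('p1');

// What the postLoad listener does: inject a proxy into the private property via reflection.
$property = new ReflectionProperty(Order::class, 'product');
$property->setValue($order, new LazyProductProxy($order->getProductId(), $loader));

$product = $order->getProduct();
echo $queries, PHP_EOL;             // 0
echo $product->getTitle(), PHP_EOL; // Test Product
echo $queries, PHP_EOL;             // 1
```

Loading a thousand orders for a report would therefore cost no MongoDB queries unless the code actually touches a product.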
    66. Read more about this technique
        Jon Wage, one of OpenSky’s engineers, first wrote about this technique on his personal blog: http://jwage.com
        You can read the full article here: http://jwage.com/2010/08/25/blending-the-doctrine-orm-and-mongodb-odm/
    67. Questions?
        • http://spf13.com
        • http://github.com/spf13
        • @spf13
        Download at mongodb.org
        PS: We’re hiring!! Contact us at jobs@10gen.com