Hybrid MongoDB and RDBMS Applications
Usage Rights

© All Rights Reserved

  • Given that split, we just happen to have the most boring SQL schema ever.
  • This is pretty much it.
    It goes on for a few more lines, with a few other properties flattened onto the order table.
  • Back to the schema for a second.
    - Product ID here is a fake foreign key.
    - Inventory is a real integer.
    That’s all there is to this table.
  • And here’s why we like Doctrine so much.
  • This will look a bit like when I bought those shoes.
  • The interesting parts here are the annotations.
    If you don’t speak PHP annotation, this stores a document with two properties (ID and title) in the `products` collection of a Mongo database.
  • Did you notice the property named `product`? That’s not just a reference to another document, that’s a reference to an entirely different database paradigm.
    Check out the setter:
  • This is key: we set both the product id and a reference to the product itself.
  • When this order is saved to the RDBMS, the productId will end up in the database, but the product reference will disappear.
  • This is one of those listeners I was telling you about. At a high level:
    1. Every time an Order is loaded from the database, this listener is called.
    2. The listener gets the Order’s product id, and creates a Doctrine proxy object.
    3. It uses magick (e.g. reflection) to set the product property of the order to this new proxy.
  • Here’s our inter-db relationship in action.
    Note that the product is lazily loaded from MongoDB. Because $product is a proxy, we don’t actually query Mongo until we try to access a property of $product (in this case the title).

Hybrid MongoDB and RDBMS Applications: Presentation Transcript

  • Hybrid MongoDB Applications with Relational Databases
  • Today’s Agenda
    - Who I am
    - Why MongoDB (with intro)
    - Why Hybrid
    - Hybrid Case Studies
    - How OpenSky implemented Hybrid MySQL / MongoDB
  • My name is Steve Francia (@spf13)
  • - 15+ years building the internet (13 years using SQL)
    - Father, husband, skateboarder
    - Chief Solutions Architect @ 10gen, responsible for drivers, integrations, web & docs
  • - Company behind MongoDB
    - AGPL license, own copyrights, engineering team
    - Support, consulting, commercial license revenue
    - Management: Google/DoubleClick, Oracle, Apple, NetApp
    - Funding: Sequoia, Union Square, Flybridge
    - Offices in NYC, Palo Alto, London & Dublin
    - 90+ employees
  • Before 10gen I worked for http://opensky.com
  • OpenSky was the first e-commerce site built on MongoDB
  • Why MongoDB
  • Why MongoDB: My Top 10 Reasons
    10. Great developer experience
    9. Speaks your language
    8. Scale horizontally
    7. Fully consistent data w/atomic operations
    6. Memory caching integrated
    5. Open source
    4. Flexible, rich & structured data format, not just K:V
    3. Ludicrously fast (without going plaid)
    2. Simplify infrastructure & application
    1. It’s web scale
  • MongoDB is
    [Diagram labels: Application, Document Oriented, High Performance, Fully Consistent, Horizontally Scalable]
    Example document:
        { author: "steve", date: new Date(), text: "About MongoDB...", tags: ["tech", "database"] }
  • Under the hood
    - Written in C++
    - Runs on nearly anything
    - Data serialized to BSON
    - Extensive use of memory-mapped files, i.e. read-through write-through memory caching
  • Database Landscape
    [Chart with axes "Scalability & Performance" and "Depth of Functionality", plotting MemCache, MongoDB and RDBMS]
  • This has led some to say: “MongoDB has the best features of key/value stores, document databases and relational databases in one.” (John Nunemaker)
  • Why Hybrid?
  • Reasons to build a hybrid application
    - Friction in existing application caused by RDBMS
    - Transitioning an existing application to MongoDB
    - Using the right tool for the right job
    - Need some features not present in MongoDB
  • Reasons not to build a hybrid application
    - Aggregation (at least not very soon)
    - Lack of clear understanding of needs
    - Backups
    - MongoDB as cache in front of SQL
    - Loads more...
  • Hybrid applications... but I don’t want to complicate things
  • Most RDBMS applications are already hybrid
  • Typical RDBMS Application
    [Diagram: App, Memcache, RDBMS]
  • Typical Hybrid RDBMS Application
    [Diagram: App, MongoDB, RDBMS]
  • Most of the same rules apply
    - Application partitions data between two (or more) systems.
    - Model layer tracks what content resides where.
  • Hybrid is easier than RDBMS + MemCache
    - Always know where to find a piece of data.
    - Data never needs expiring.
    - Data not duplicated (for the most part) across systems.
    - Always handle a record the same way.
    - Developer freedom to choose the right tool for the right reasons.
  • Typical RDBMS retrieval operation
    1. App asks Memcache: exists & up to date? If yes, then done.
    2. If no, query the RDBMS for it and retrieve the record(s).
    3. Replace in cache (Memcache).
    4. Repeat.
  • Typical Hybrid Retrieval Operation
    - App to MongoDB: find, return
    - OR App to RDBMS: query, return
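The two retrieval paths above can be sketched side by side. This is only an illustration under stated assumptions: `ArrayStore`, `cacheAsideFind` and `hybridFind` are hypothetical names, and plain in-memory arrays stand in for Memcache, MySQL and MongoDB. Nothing here comes from the OpenSky code base.

```php
<?php
// Hypothetical in-memory stand-in for Memcache, the RDBMS, or MongoDB.
final class ArrayStore
{
    /** @var array<string, array> */
    private array $rows = [];
    public int $reads = 0;

    public function put(string $key, array $value): void
    {
        $this->rows[$key] = $value;
    }

    public function get(string $key): ?array
    {
        $this->reads++;
        return $this->rows[$key] ?? null;
    }
}

// RDBMS + cache: check the cache, fall back to the database, repopulate the cache.
function cacheAsideFind(ArrayStore $cache, ArrayStore $rdbms, string $key): ?array
{
    $hit = $cache->get($key);
    if ($hit !== null) {
        return $hit;
    }
    $row = $rdbms->get($key);
    if ($row !== null) {
        $cache->put($key, $row);
    }
    return $row;
}

// Hybrid: the model layer knows which store owns each type, so there is one lookup.
function hybridFind(array $owners, string $type, string $id): ?array
{
    return $owners[$type]->get($id);
}

$mongo = new ArrayStore();
$mysql = new ArrayStore();
$cache = new ArrayStore();
$mongo->put('p1', ['title' => 'Shoes']);
$mysql->put('o1', ['productId' => 'p1']);

$owners = ['product' => $mongo, 'order' => $mysql];
$product = hybridFind($owners, 'product', 'p1');
$order = cacheAsideFind($cache, $mysql, 'o1');
```

In the hybrid path each record has exactly one home, so there is no second copy to expire or repopulate.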
  • Typical RDBMS write operation
    - App to RDBMS: insert or update row, confirm written
    - App: assemble into object(s)
    - App to Memcache: write object, write object, write object, write object... This goes on for a while, doesn’t it?
    - One row can be in many objects, so there’s a lot of complication in updating everything
  • Typical Hybrid Write Operation
    - App to MongoDB: save document, return
    - OR App to RDBMS: insert or update, return
  • Hybrid Use Cases
  • Archiving
    Why Hybrid:
    - Existing application built on MySQL
    - Lots of friction with RDBMS-based archive storage
    - Needed more scalable archive storage backend
    Solution:
    - Keep MySQL for active data (100 million), MongoDB for archive (2+ billion)
    Results:
    - No more alter table statements taking over 2 months to run
    - Sharding fixed vertical scale problem
    - Very happily looking at other places to use MongoDB
  • Reporting
    Why Hybrid:
    - Most of the functionality written in MongoDB
    - Reporting team doesn’t want to learn MongoDB
    Solution:
    - Use MongoDB for active database, replicate to MySQL for reporting
    Results:
    - Developers happy
    - Business analysts happy
  • E-commerce
    Why Hybrid:
    - Multi-vertical product catalogue impossible to model in RDBMS
    - Needed the transaction support an RDBMS provides
    Solution:
    - MySQL for orders, MongoDB for everything else
    Results:
    - Massive simplification of code base
    - Rapidly build, halving time to market (and cost)
    - Eliminated need for external caching system
    - 50x+ improvement over MySQL alone
  • How OpenSky implemented a hybrid MongoDB / MySQL solution
    http://opensky.com
  • Doctrine (ORM/ODM) makes it easy
  • Data to store in SQL
    - Order
    - Order/Shipment
    - Order/Transaction
    - Inventory
  • Data to store in MongoDB
    - User
    - Product
    - Product/Sellable
    - Address
    - Cart
    - CreditCard
    - Event
    - TaxRate
    - ... and then I got tired of typing them in
    - Just imagine this list has 40 more classes
    - ...
  • The most boring SQL schema ever
  •
        CREATE TABLE `product_inventory` (
          `product_id` char(32) NOT NULL,
          `inventory` int(11) NOT NULL DEFAULT 0,
          PRIMARY KEY (`product_id`)
        );

        CREATE TABLE `sellable_inventory` (
          `sellable_id` char(32) NOT NULL,
          `inventory` int(11) NOT NULL DEFAULT 0,
          PRIMARY KEY (`sellable_id`)
        );

        CREATE TABLE `orders` (
          `id` int(11) NOT NULL AUTO_INCREMENT,
          `userId` char(32) NOT NULL,
          `shippingName` varchar(255) DEFAULT NULL,
          `shippingAddress1` varchar(255) DEFAULT NULL,
          `shippingAddress2` varchar(255) DEFAULT NULL,
          `shippingCity` varchar(255) DEFAULT NULL,
          `shippingState` varchar(2) DEFAULT NULL,
          `shippingZip` varchar(255) DEFAULT NULL,
          `billingName` varchar(255) DEFAULT NULL,
          `billingAddress1` varchar(255) DEFAULT NULL,
          `billingAddress2` varchar(255) DEFAULT NULL,
          `billingCity` varchar(255) DEFAULT NULL,
  • Did you notice? Inventory is in SQL, but it’s also a property in your Mongo collections.
  • Inventory is transient
    - Product::$inventory is effectively a transient property
    - Note how I said “effectively”? ... we cheat and persist our transient property to MongoDB as well
    - We can do this because we never really trust the value stored in Mongo
  • Accuracy is only important when there’s contention
    - For display, sorting and alerts, we can use the value stashed in MongoDB
        - It’s faster
        - It’s accurate enough
    - For financial transactions, we want the multi-table transactions from our RDBMS.
  • Inventory kept in sync with listeners
    - Every time a new product is created, its inventory is inserted in SQL
    - Every time an order is placed, inventory is verified and decremented
    - Whenever the SQL inventory changes, it is saved to MongoDB as well
  • Be careful what you lock
    1. Acquire inventory row lock and begin transaction
    2. Check current product inventory
    3. Decrement product inventory
    4. Write the Order to SQL
    5. Update affected MongoDB documents
    6. Commit the transaction
    7. Release product inventory lock
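The steps above can be sketched roughly as follows. This is not OpenSky's code. It assumes the `pdo_sqlite` extension, uses an in-memory SQLite database in place of MySQL, and uses a plain array (`$mongoInventory`, hypothetical) in place of the MongoDB documents. It also swaps the explicit row lock for a single conditional `UPDATE` that checks and decrements in one statement; on MySQL/InnoDB such an update should hold a row lock until the transaction ends, but treat that as an assumption to verify for your engine.

```php
<?php
// Assumption: pdo_sqlite is available; in-memory SQLite stands in for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE product_inventory (product_id TEXT PRIMARY KEY, inventory INTEGER NOT NULL DEFAULT 0)');
$pdo->exec('CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, productId TEXT NOT NULL)');
$pdo->exec("INSERT INTO product_inventory (product_id, inventory) VALUES ('p1', 1)");

// Hypothetical stand-in for the inventory value stashed on the MongoDB product documents.
$mongoInventory = ['p1' => 1];

function placeOrder(PDO $pdo, array &$mongoInventory, string $productId): bool
{
    $pdo->beginTransaction();
    try {
        // Steps 1-3 in one statement: decrement only if stock remains.
        $stmt = $pdo->prepare(
            'UPDATE product_inventory SET inventory = inventory - 1
             WHERE product_id = :id AND inventory > 0'
        );
        $stmt->execute([':id' => $productId]);
        if ($stmt->rowCount() === 0) {
            $pdo->rollBack();
            return false; // sold out, or unknown product
        }

        // Step 4: write the order to SQL.
        $pdo->prepare('INSERT INTO orders (productId) VALUES (:id)')
            ->execute([':id' => $productId]);

        // Step 5: copy the authoritative SQL value to the document store.
        $read = $pdo->prepare('SELECT inventory FROM product_inventory WHERE product_id = :id');
        $read->execute([':id' => $productId]);
        $mongoInventory[$productId] = (int) $read->fetchColumn();

        // Steps 6-7: commit, which also ends the transaction's locks.
        $pdo->commit();
        return true;
    } catch (Throwable $e) {
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    }
}

$first = placeOrder($pdo, $mongoInventory, 'p1');
$second = placeOrder($pdo, $mongoInventory, 'p1');
```

If the commit fails after step 5, the MongoDB copy can be briefly wrong. That fits the earlier slides: the Mongo value is never trusted for financial decisions, only for display, sorting and alerts.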
  • Making MongoDB and RDBMS relations play nice
  • Products are documents stored in MongoDB
  •
        /** @mongodb:Document(collection="products") */
        class Product
        {
            /** @mongodb:Id */
            private $id;

            /** @mongodb:String */
            private $title;

            public function getId()
            {
                return $this->id;
            }

            public function getTitle()
            {
                return $this->title;
            }

            public function setTitle($title)
            {
                $this->title = $title;
            }
        }
  • Orders are entities stored in an RDBMS
  •
        /**
         * @orm:Entity
         * @orm:Table(name="orders")
         * @orm:HasLifecycleCallbacks
         */
        class Order
        {
            /**
             * @orm:Id @orm:Column(type="integer")
             * @orm:GeneratedValue(strategy="AUTO")
             */
            private $id;

            /**
             * @orm:Column(type="string")
             */
            private $productId;

            /**
             * @var Documents\Product
             */
            private $product;

            // ...
        }
  • So how does an RDBMS have a reference to something outside the database?
  • Setting the Product
        class Order
        {
            // ...

            public function setProduct(Product $product)
            {
                $this->productId = $product->getId();
                $this->product = $product;
            }
        }
  • - $productId is mapped and persisted
    - $product, which stores the Product instance, is not a persistent entity property
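To make the mapped/unmapped split concrete, here is a small sketch in plain PHP without Doctrine. `toRow()` and `fromRow()` are hypothetical helpers that imitate what an ORM would write and read; they are not Doctrine APIs. Only `productId` survives a round trip through storage.

```php
<?php
final class Product
{
    private string $id;

    public function __construct(string $id)
    {
        $this->id = $id;
    }

    public function getId(): string
    {
        return $this->id;
    }
}

final class Order
{
    private ?string $productId = null; // mapped column: persisted
    private ?Product $product = null;  // unmapped: lives only in memory

    public function setProduct(Product $product): void
    {
        $this->productId = $product->getId();
        $this->product = $product;
    }

    public function getProduct(): ?Product
    {
        return $this->product;
    }

    // Hypothetical helper: the row an ORM would write (mapped fields only).
    public function toRow(): array
    {
        return ['productId' => $this->productId];
    }

    // Hypothetical helper: rebuild an Order from a stored row.
    public static function fromRow(array $row): self
    {
        $order = new self();
        $order->productId = $row['productId'];
        return $order;
    }
}

$order = new Order();
$order->setProduct(new Product('abc123'));
$row = $order->toRow();
$reloaded = Order::fromRow($row);
```

After the reload `getProduct()` returns null, which is the gap the post-load listener fills.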
  • Retrieving our product later
  • OrderPostLoadListener
        use Doctrine\ODM\MongoDB\DocumentManager;
        use Doctrine\ORM\Event\LifecycleEventArgs;

        class OrderPostLoadListener
        {
            private $dm;

            // Assumed: the ODM DocumentManager is injected; the slide omits how $this->dm is set.
            public function __construct(DocumentManager $dm)
            {
                $this->dm = $dm;
            }

            public function postLoad(LifecycleEventArgs $eventArgs)
            {
                // get the order entity
                $order = $eventArgs->getEntity();

                // get odm reference to order.product_id
                $productId = $order->getProductId();
                $product = $this->dm->getReference('MyBundle:Document\Product', $productId);

                // set the product on the order
                $em = $eventArgs->getEntityManager();
                $productReflProp = $em->getClassMetadata('MyBundle:Entity\Order')
                    ->reflClass->getProperty('product');
                $productReflProp->setAccessible(true);
                $productReflProp->setValue($order, $product);
            }
        }
  • All Together Now
        // Create a new product and order
        $product = new Product();
        $product->setTitle('Test Product');
        $dm->persist($product);
        $dm->flush();

        $order = new Order();
        $order->setProduct($product);
        $em->persist($order);
        $em->flush();

        // Find the order later
        $order = $em->find('Order', $order->getId());

        // Instance of an uninitialized product proxy
        $product = $order->getProduct();

        // Initializes proxy and queries the mongodb database
        echo "Order Title: " . $product->getTitle();
        print_r($order);
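The lazy-loading behaviour of the proxy can be imitated in a few lines of plain PHP. This is a sketch of the idea only, not Doctrine's proxy implementation: `LazyProduct`, the `$loader` closure and the `$queries` counter are all hypothetical, and an array stands in for the MongoDB collection.

```php
<?php
// Hypothetical stand-in for a proxy: holds an ID and a loader,
// and runs the loader only on first access to real data.
final class LazyProduct
{
    private string $id;
    /** @var callable */
    private $loader;
    private ?array $data = null;

    public function __construct(string $id, callable $loader)
    {
        $this->id = $id;
        $this->loader = $loader;
    }

    public function getId(): string
    {
        return $this->id; // known up front, no query needed
    }

    public function getTitle(): string
    {
        if ($this->data === null) {
            // First access to a real property triggers the query.
            $this->data = ($this->loader)($this->id);
        }
        return $this->data['title'];
    }
}

$queries = 0;
$documents = ['abc123' => ['title' => 'Test Product']];
$loader = function (string $id) use (&$queries, $documents): array {
    $queries++; // counts simulated round trips to MongoDB
    return $documents[$id];
};

$product = new LazyProduct('abc123', $loader);
$queriesBeforeAccess = $queries;
$title = $product->getTitle();
$product->getTitle(); // served from the already loaded data
```

Loading many orders therefore costs no MongoDB round trips until a product's data is actually read.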
  • Read more about this technique
    Jon Wage, one of OpenSky’s engineers, first wrote about this technique on his personal blog: http://jwage.com
    You can read the full article here: http://jwage.com/2010/08/25/blending-the-doctrine-orm-and-mongodb-odm/
  • Questions?
    http://spf13.com
    http://github.com/spf13
    @spf13
    Download at mongodb.org
    PS: We’re hiring!! Contact us at jobs@10gen.com