4. • 15+ years building the internet (13 years using SQL)
• Father, husband, skateboarder
• Chief Solutions Architect @ 10gen responsible for drivers, integrations, web & docs
5. • Company behind MongoDB
• AGPL license, own copyrights, engineering team
• support, consulting, commercial license revenue
• Management
• Google/DoubleClick, Oracle, Apple, NetApp
• Funding: Sequoia, Union Square, Flybridge
• Offices in NYC, Palo Alto, London & Dublin
• 90+ employees
9. Why MongoDB
My Top 10 Reasons
10. Great developer experience
9. Speaks your language
8. Scale horizontally
7. Fully consistent data w/atomic operations
6. Memory Caching integrated
5. Open source
4. Flexible, rich & structured data format not just K:V
3. Ludicrously fast (without going plaid)
2. Simplify infrastructure & application
1. It’s web scale
11. MongoDB is
• Document Oriented
• High Performance
• Fully Consistent
• Horizontally Scalable
Application:
{ author: "steve",
  date: new Date(),
  text: "About MongoDB...",
  tags: ["tech", "database"] }
12. Under the hood
• Written in C++
• Runs on nearly anything
• Data serialized to BSON
• Extensive use of memory-mapped files
i.e. read-through write-through memory
caching.
16. Reasons to build a
hybrid application
•Friction in existing application caused
by RDBMS
•Transitioning an existing application to
MongoDB
•Using the right tool for the right job
•Need some features not present in
MongoDB
17. Reasons Not to build
a hybrid application
•Aggregation (at least not very soon)
•Lack of clear understanding of needs
•Backups
•MongoDB as cache in front of SQL
•Loads more...
22. Most of the same
rules apply
•Application partitions data between
two (or more) systems.
•Model layer tracks what content
resides where.
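A minimal sketch of what "the model layer tracks what content resides where" could look like. Everything here is an assumption for illustration: the `STORE_FOR_MODEL` registry, the helper names, and the two `Map` objects standing in for the SQL and MongoDB systems.

```javascript
// Hypothetical registry: which system owns which model (names are illustrative).
const STORE_FOR_MODEL = {
  Order: "sql",
  Product: "mongodb",
  User: "mongodb",
};

// In-memory stand-ins for the two systems (assumption: real code would wrap
// an ORM entity manager and an ODM document manager instead).
const stores = {
  sql: new Map(),
  mongodb: new Map(),
};

// Look up the one store that owns a model; fail loudly for unknown models.
function storeFor(modelName) {
  const name = STORE_FOR_MODEL[modelName];
  if (!name) throw new Error(`No store registered for ${modelName}`);
  return stores[name];
}

function save(modelName, record) {
  storeFor(modelName).set(`${modelName}:${record.id}`, record);
}

function find(modelName, id) {
  return storeFor(modelName).get(`${modelName}:${id}`);
}

save("Product", { id: 1, title: "Test Product" });
save("Order", { id: 10, productId: 1 });
```

Because each model has exactly one home, callers never have to guess where a record lives.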
23. Hybrid is easier than
RDBMS + Memcache
• Always know where to find a piece of data.
• Data never needs expiring.
• Data not duplicated (for the most part)
across systems.
• Always handle a record same way.
• Developer freedom to choose the right tool
for the right reasons.
24. Typical RDBMS
retrieval operation
• App → Memcache: exists & up to date?
• if yes... then done
• if no, query DB for it
• App → RDBMS: retrieve record(s)
• App → Memcache: replace in cache
• Repeat
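The retrieval flow above can be sketched roughly as follows. This is an illustration only: the `cache` and `db` maps stand in for Memcache and the RDBMS, and passing a `currentVersion` to decide "up to date" is an assumption made to keep the example small.

```javascript
// In-memory stand-ins (assumptions): `cache` plays Memcache, `db` plays the RDBMS.
const cache = new Map();
const db = new Map([[42, { id: 42, title: "Test Product", version: 3 }]]);
let dbQueries = 0;

function queryDb(id) {
  dbQueries += 1;
  return db.get(id);
}

// Cache-aside read: check the cache, verify freshness, fall back to the DB,
// then replace the cached copy.
function fetchRecord(id, currentVersion) {
  const cached = cache.get(id);
  if (cached && cached.version === currentVersion) {
    return cached; // exists & up to date: done
  }
  const row = queryDb(id); // missing or stale: query the DB for it
  if (row !== undefined) {
    cache.set(id, row); // replace in cache
  }
  return row;
}

fetchRecord(42, 3); // cache miss: one DB query
fetchRecord(42, 3); // cache hit: no further DB query
```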
28. Typical RDBMS
write operation
• App → RDBMS: insert or update row
• RDBMS → App: confirm written
• App: assemble into object(s)
• App → Memcache: write object
• App → Memcache: write object
• App → Memcache: write object
• App → Memcache: write object
This goes on for a while doesn’t it?
One row can be in many objects, so there’s
a lot of complication in updating everything.
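To make the fan-out concrete, here is a small sketch in which one row appears in several cached objects. The cache key names and the `cachedObjectsFor` helper are hypothetical, and both stores are plain in-memory maps.

```javascript
const db = new Map();    // stand-in for the RDBMS
const cache = new Map(); // stand-in for Memcache

// Hypothetical: every cached object that embeds a given product row.
const cachedObjectsFor = (row) => [
  `product:${row.id}`,
  `category:${row.categoryId}:listing`,
  `seller:${row.sellerId}:catalog`,
  "homepage:featured",
];

function writeProduct(row) {
  db.set(row.id, row); // insert or update row
  const keys = cachedObjectsFor(row);
  for (const key of keys) {
    cache.set(key, { key, includes: row }); // write object, write object, ...
  }
  return keys.length; // number of cache writes caused by one row write
}

const cacheWrites = writeProduct({ id: 1, categoryId: 7, sellerId: 3, title: "Shoes" });
```

One row write turned into four cache writes here, and missing any one of them leaves stale data behind.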
32. Archiving
Why Hybrid:
• Existing application built on MySQL
• Lots of friction with RDBMS based archive storage
• Needed more scalable archive storage backend
Solution:
• Keep MySQL for active data (100mil), MongoDB for archive (2+
bil)
Results:
• No more alter table statements taking over 2 months to run
• Sharding fixed vertical scale problem
• Very happily looking at other places to use MongoDB
33. Reporting
Why Hybrid:
• Most of the functionality written in MongoDB
• Reporting team doesn’t want to learn MongoDB
Solution:
• Use MongoDB for active database, replicate to MySQL for
reporting
Results:
• Developers happy
• Business Analysts happy
34. E-commerce
Why Hybrid:
• Multi-vertical product catalogue impossible to model in RDBMS
• Needed transaction support RDBMS provides
Solution:
• MySQL for orders, MongoDB for everything else
Results:
• Massive simplification of code base
• Rapidly build, halving time to market (and cost)
• Eliminated need for external caching system
• 50x+ improvement over MySQL alone
35. How OpenSky
implemented a
hybrid MongoDB /
MySQL solution
http://opensky.com
39. Data to store in
MongoDB
• User
• Product
• Product/Sellable
• Address
• Cart
• CreditCard
• Event
• TaxRate
• ... and then I got tired of typing them in
• Just imagine this list has 40 more classes
• ...
45. Inventory is
transient
• Product::$inventory is effectively a
transient property
•Note how I said “effectively”? ... we
cheat and persist our transient
property to MongoDB as well
•We can do this because we never really
trust the value stored in Mongo
48. Accuracy is only important
when there’s contention
•For display, sorting and alerts, we can
use the value stashed in MongoDB
•It’s faster
•It’s accurate enough
•For financial transactions, we want the
multi table transactions from our
RDBMS.
52. Inventory kept in
sync with listeners
•Every time a new product is created,
its inventory is inserted in SQL
•Every time an order is placed,
inventory is verified and decremented
•Whenever the SQL inventory changes,
it is saved to MongoDB as well
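A rough sketch of the listener idea, assuming SQL holds the authoritative count and a change listener copies it to the MongoDB document. The function names and the in-memory maps are illustrative and not the deck's actual implementation.

```javascript
const sqlInventory = new Map();  // authoritative counts (stand-in for SQL)
const mongoProducts = new Map(); // product documents with a stashed inventory value
const listeners = [];

function onInventoryChange(fn) {
  listeners.push(fn);
}

// Every SQL inventory change notifies the listeners.
function setSqlInventory(productId, qty) {
  sqlInventory.set(productId, qty);
  for (const fn of listeners) fn(productId, qty);
}

// Listener: whenever the SQL inventory changes, save it to the document as well.
onInventoryChange((productId, qty) => {
  const doc = mongoProducts.get(productId) ?? { _id: productId };
  mongoProducts.set(productId, { ...doc, inventory: qty });
});

// New product: its inventory is inserted in SQL.
function createProduct(id, title, qty) {
  mongoProducts.set(id, { _id: id, title });
  setSqlInventory(id, qty);
}

// Order placed: inventory is verified and decremented in SQL.
function placeOrder(productId, qty) {
  const current = sqlInventory.get(productId) ?? 0;
  if (current < qty) throw new Error("Insufficient inventory");
  setSqlInventory(productId, current - qty);
}

createProduct(1, "Test Product", 5);
placeOrder(1, 2);
```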
54. Be careful what you
lock
1. Acquire inventory row lock and begin
transaction
2. Check current product inventory
3. Decrement product inventory
4. Write the Order to SQL
5. Update affected MongoDB documents
6. Commit the transaction
7. Release product inventory lock
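The seven steps above might be sketched like this. It is a simplified, single-process illustration: the `locks` set, the snapshot-based rollback, and the in-memory `sql` and `mongo` objects are all assumptions standing in for real row locks and transactions.

```javascript
const sql = { inventory: new Map([[1, 2]]), orders: [] };
const mongo = {
  products: new Map([[1, { _id: 1, title: "Test Product", inventory: 2 }]]),
};
const locks = new Set();

function placeOrder(productId, qty) {
  if (locks.has(productId)) throw new Error("Row is locked");
  locks.add(productId); // 1. acquire the inventory row lock and begin transaction
  const snapshot = {
    qty: sql.inventory.get(productId),
    orderCount: sql.orders.length,
  };
  try {
    const current = sql.inventory.get(productId); // 2. check current inventory
    if (current < qty) throw new Error("Insufficient inventory");
    sql.inventory.set(productId, current - qty); // 3. decrement inventory
    sql.orders.push({ productId, qty }); // 4. write the order to SQL
    const doc = mongo.products.get(productId); // 5. update affected documents
    mongo.products.set(productId, { ...doc, inventory: current - qty });
    // 6. commit: nothing further to do in this in-memory sketch
  } catch (err) {
    // Roll back the SQL-side changes if any step failed.
    sql.inventory.set(productId, snapshot.qty);
    sql.orders.length = snapshot.orderCount;
    throw err;
  } finally {
    locks.delete(productId); // 7. release the inventory lock
  }
}

placeOrder(1, 1);
```

Only the inventory row is locked, and the lock is released whether the order succeeds or fails.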
60. So how does an
RDBMS have a
reference to something
outside the database?
61. Setting the Product
class Order {
    // ...
    public function setProduct(Product $product)
    {
        $this->productId = $product->getId();
        $this->product = $product;
    }
}
62. • $productId is mapped and persisted
• $product which stores the Product
instance is not a persistent entity
property
64. OrderPostLoadListener
use Doctrine\ODM\MongoDB\DocumentManager;
use Doctrine\ORM\Event\LifecycleEventArgs;

class OrderPostLoadListener
{
    private $dm;

    // the document manager is injected so the listener can build ODM references
    public function __construct(DocumentManager $dm)
    {
        $this->dm = $dm;
    }

    public function postLoad(LifecycleEventArgs $eventArgs)
    {
        // get the order entity
        $order = $eventArgs->getEntity();
        // get odm reference to order.product_id
        $productId = $order->getProductId();
        $product = $this->dm->getReference('MyBundle:Document\Product', $productId);
        // set the product on the order
        $em = $eventArgs->getEntityManager();
        $productReflProp = $em->getClassMetadata('MyBundle:Entity\Order')
            ->reflClass->getProperty('product');
        $productReflProp->setAccessible(true);
        $productReflProp->setValue($order, $product);
    }
}
65. All Together Now
// Create a new product and order
$product = new Product();
$product->setTitle('Test Product');
$dm->persist($product);
$dm->flush();
$order = new Order();
$order->setProduct($product);
$em->persist($order);
$em->flush();
// Find the order later
$order = $em->find('Order', $order->getId());
// Instance of an uninitialized product proxy
$product = $order->getProduct();
// Initializes proxy and queries the MongoDB database
echo "Order Title: " . $product->getTitle();
print_r($order);
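The lazy-loading behaviour of the product reference can be illustrated with a small stand-alone sketch. This is not Doctrine: it uses JavaScript's built-in `Proxy`, an in-memory map in place of MongoDB, and hypothetical `getReference` and `postLoad` helpers that only mimic the idea.

```javascript
let mongoQueries = 0;
const mongoProducts = new Map([["p1", { id: "p1", title: "Test Product" }]]);

// Returns an uninitialized stand-in; the lookup only happens on first
// property access (in this sketch, reading `id` never triggers a lookup).
function getReference(id) {
  let loaded = null;
  return new Proxy({}, {
    get(_target, prop) {
      if (prop === "id") return id;
      if (loaded === null) {
        mongoQueries += 1;
        loaded = mongoProducts.get(id);
        if (loaded === undefined) throw new Error(`Product ${id} not found`);
      }
      return loaded[prop];
    },
  });
}

// postLoad-style hook: attach a lazy product reference to a loaded order.
function postLoad(order) {
  order.product = getReference(order.productId);
  return order;
}

const order = postLoad({ id: 10, productId: "p1" });
const queriesBeforeAccess = mongoQueries; // no lookup has happened yet
const title = order.product.title;        // first access triggers the lookup
```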
66. Read more about
this technique
Jon Wage, one of OpenSky’s engineers, first
wrote about this technique on his personal
blog: http://jwage.com
You can read the full article here:
http://jwage.com/2010/08/25/blending-the-doctrine-orm-and-mongodb-odm/
67. http://spf13.com
http://github.com/spf13
@spf13
Questions?
download at mongodb.org
PS: We’re hiring!! Contact us at jobs@10gen.com
Editor's Notes
Given that split, we just happen to have the most boring SQL schema ever.
This is pretty much it. It goes on for a few more lines, with a few other properties flattened onto the order table.
Back to the schema for a second.
- Product ID here is a fake foreign key.
- Inventory is a real integer.
That’s all there is to this table.
And here’s why we like Doctrine so much.
This will look a bit like when I bought those shoes.
The interesting parts here are the annotations.
If you don’t speak PHP annotation, this stores a document with two properties (ID and title) in the `products` collection of a Mongo database.
Did you notice the property named `product`? That’s not just a reference to another document, that’s a reference to an entirely different database paradigm.
Check out the setter:
This is key: we set both the product id and a reference to the product itself.
When this Order is saved (to MySQL, via the ORM), the productId will end up in the database, but the product reference will disappear.
This is one of those listeners I was telling you about. At a high level:
1. Every time an Order is loaded from the database, this listener is called.
2. The listener gets the Order’s product id, and creates a Doctrine proxy object.
3. It uses magic (i.e. reflection) to set the product property of the order to this new proxy.
Here’s our inter-db relationship in action.
Note that the product is lazily loaded from MongoDB. Because $product is a proxy, we don’t actually query Mongo until we try to access a property of $product (in this case the title).