MongoDB at eBay

Yuri Finkelstein
eBay Platform Architect
yfinkelstein@ebay.com

May 2012

DB Scalability @ eBay
 eBay is one of the first and largest BASE
environments based on Oracle DB
• Basically available
• Soft state
• Eventual consistency
 Every database we use is shared and partitioned
• N logical host names are defined for each use case ahead of time
• These logical hosts are mapped to physical hosts based on static mapping tables, which are controlled by DBAs
• A common ORM framework called DAL provides powerful and consistent patterns for data scalability

 If the client provides a hint along with every DB query:
• DAL maps the hint to a logical host using one of N mapping schemes (e.g. modulus, lookup table, range)
• The logical host is then mapped to a physical host using the L-to-Ph map
• The query is sent to just one shard

 If the client does not have a hint, the query is sent to all shards and the results are joined on the client with the help of the DAL framework
 Side-effects:
• The hint is not part of the query; the client has to manage it
• The logical-to-physical mapping scheme becomes an extra piece of client configuration
• Shard rebalancing is “DBA magic”

[Diagram: App1 and App2 each stack Business Logic on the DAL Framework and pass a Hint (shard key); F1(Hint) and F2(Hint) select logical DB hosts (shards), which Config maps to physical master DB hosts and physical standby DB hosts.]
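
The two-step routing described on this slide can be sketched roughly as follows. This is an illustration only, not the actual DAL code: the host names, the choice of the modulus scheme with four logical hosts, and the function names are all assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical static mapping tables (per the slide, DBAs control these).
N_LOGICAL_HOSTS = 4
LOGICAL_TO_PHYSICAL: Dict[str, str] = {
    "logical-0": "dbhost-a",
    "logical-1": "dbhost-a",
    "logical-2": "dbhost-b",
    "logical-3": "dbhost-b",
}


def logical_host_for(hint: int) -> str:
    """Map a hint (shard key) to a logical host using the modulus scheme."""
    return f"logical-{hint % N_LOGICAL_HOSTS}"


def physical_hosts_for(hint: Optional[int]) -> List[str]:
    """Return the physical hosts a query has to be sent to."""
    if hint is None:
        # No hint: the query goes to every shard and the client joins results.
        return sorted(set(LOGICAL_TO_PHYSICAL.values()))
    # Hint present: logical host first, then the L-to-Ph map.
    return [LOGICAL_TO_PHYSICAL[logical_host_for(hint)]]


if __name__ == "__main__":
    print(physical_hosts_for(42))    # a single physical host
    print(physical_hosts_for(None))  # every physical host
```

Because the logical host count stays fixed, moving data only requires changing the logical-to-physical table, which is the part the slide calls “DBA magic”.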

Key desired improvements

 All eBay site-facing applications use the scheme outlined above
 It’s proven to scale to tens of thousands of developers, petabytes of data, hundreds
of millions of SQL queries per day
 But there is always room for improvement and new ideas
• ORM is not the fastest way to develop; how do we achieve faster development cycles and reduce schema mapping friction?
• How do we add new attributes to tables faster and without DBA involvement? A schema-free approach sounds interesting.
• Can we make the hint transparent, e.g. auto-extract it from queries?
• Can we rebalance the data seamlessly and automatically?
• Can we add shards faster in order to scale out on demand and transparently to applications?
• How do we deploy new DBs to the cloud on demand?

 And what about performance? Can we use RAM more aggressively and
seamlessly to speed up queries?

Enter MongoDB

 We have been playing with MongoDB since 2010. Why?
 Its scalability scheme is very similar to how we shard RDBMS
• Single master for writes, eventually consistent slaves for reads
• Horizontal partitioning of data sets is the norm at eBay
• MongoS performs the familiar scatter-gather and client-side merge-sorts
 We have not used distributed transactions since day 1; the transactional updates of multiple tables that we do use can be simulated by atomic updates of a single Mongo document
 MongoDB offers a number of features that help address our goals mentioned earlier:
• Developers love the document model and schema-free persistence
• Hints are embedded into the queries
• MongoDB has automatic shard rebalancing
• Shards can be added on demand without application restart, and data will be auto-rebalanced
• We can easily bring it up in the cloud since cloud machines have storage

[Diagram: Business Logic works with a Document through the Morphia/Mongo Driver; MongoS, with Dynamic Config, applies F(Shard Key) to route to the shards, each of which has replicas.]
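
To make "hints are embedded into the queries" and "scatter-gather with merge-sort" concrete, here is a small in-memory sketch. It does not use MongoDB at all: the two toy shards, the `item_id` shard key, the `price` sort field and the modulus placement are assumptions for illustration (MongoDB itself places data by shard key ranges rather than by modulus).

```python
import heapq
from typing import Dict, List

# Hypothetical in-memory shards; each list is kept sorted by "price".
SHARDS: List[List[Dict]] = [
    [{"item_id": 0, "price": 5}, {"item_id": 2, "price": 9}],
    [{"item_id": 1, "price": 3}, {"item_id": 3, "price": 7}],
]
SHARD_KEY = "item_id"


def shard_for(key: int) -> int:
    """Illustrative F(shard key): modulus over the number of shards."""
    return key % len(SHARDS)


def find(query: Dict) -> List[Dict]:
    """Route a query to one shard if it carries the shard key, else to all."""

    def matches(doc: Dict) -> bool:
        return all(doc.get(k) == v for k, v in query.items())

    if SHARD_KEY in query:
        # The hint is part of the query itself: target a single shard.
        targets = [SHARDS[shard_for(query[SHARD_KEY])]]
    else:
        # No shard key in the query: scatter to every shard.
        targets = SHARDS
    partials = [[d for d in shard if matches(d)] for shard in targets]
    # Gather: merge the already-sorted per-shard results into one sorted list.
    return list(heapq.merge(*partials, key=lambda d: d["price"]))
```

Compared with the DAL approach, the client no longer passes a separate hint; the router reads it out of the query document.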

Case study #1: eBay Search Suggestions

 Each search suggestion list is a MongoDB document indexed by word prefix as well as by some metadata: product category, search domain, etc.
 Must have < 60-70 msec round trip end to end
 MongoDB query < 1.4 msec
 Data set fits in RAM; hundreds of millions of documents
 Data is bulk loaded once a day from Hadoop,
but can be tweaked on demand during sale
promotions, etc
 Single replica set, no shards in this case
 MongoDB benefits:
• Multiple indexes allow flexible lookups
• In-memory data placement ensures lookup speed
• Large data set is durable and replicated
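
A rough way to picture a lookup "indexed by word prefix as well as by some metadata" is a map keyed by (prefix, category). The sketch below is an in-memory stand-in, not the eBay schema: the key layout, the function names and the sample terms are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (prefix, category) -> suggestion list. This stands in for a collection
# with an index over those two fields; the field choice is illustrative.
index: Dict[Tuple[str, str], List[str]] = defaultdict(list)


def load(term: str, category: str) -> None:
    """Bulk-load step: register a term under every prefix of itself."""
    for end in range(1, len(term) + 1):
        index[(term[:end], category)].append(term)


def suggest(prefix: str, category: str, limit: int = 5) -> List[str]:
    """One keyed lookup per request, analogous to a single indexed query."""
    return index.get((prefix, category), [])[:limit]
```

For example, after `load("ipad", "electronics")` and `load("iphone", "electronics")`, calling `suggest("ip", "electronics")` returns both terms in load order.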

Case Study #2: Cloud Manager “State Hub”

Query  State Hub powers eBay Cloud
Provision
Resources
Resources  Every resource provisioned by the cloud is
and Topology
represented by a single Mongo document
 Documents contain highly structured
metadata reflecting roles and grouping of
the resources
 Lookup by both primary and secondary
State Hub indexes

Mongo  Several GB data sets, easily fit in RAM
Update
 Documents are not uniform
resource
state  All resources have “State” field which is
updated periodically to reflect health state
of the underlying resource
 Mixed workload: lots of in-place writes, but
also lots of read queries

Case Study #3: eBay Merchandizing Info Cache

 Merchandizing backend powers eBay product/item classification and categorization
 Each MongoDB document represents a cluster of similar products
 Numerous relationships between clusters are modeled as document attributes
 Relationship hierarchy traversal is achieved by issuing a number of queries on “edge” attributes
 Each instance of such a hierarchy is called a model; there are lots of models
 Again, data set fits in RAM, single replica set
 Replica set members are located in 3 different data centers (3+2+2), with all members in a single data center having higher weight to avoid moving the master away

[Diagram: Cluster1, Cluster2 and Cluster3 connected by relationships R1, R2 and R3.]
 MongoDB benefits:
• Schema-free design and declarative indexes are perfect for this use
case where new attributes and new queries are constantly being
added
• Async replication across multiple data centers
• MongoDB Java Driver automatically detects the proximity of clients to replica set members; reads with slaveOK=true are served from local data center nodes, which ensures low response latency
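
The hierarchy traversal "by issuing a number of queries on edge attributes" might look like the breadth-first walk below. Everything here is hypothetical: the `edges` attribute, which relationship points to which cluster, and `find_by_ids`, which merely stands in for a database query.

```python
from typing import Dict, List

# Hypothetical cluster documents; "edges" holds the relationship attributes.
# The direction of R1, R2 and R3 is assumed for the example.
CLUSTERS: Dict[str, Dict] = {
    "Cluster1": {"_id": "Cluster1", "edges": {"R1": "Cluster2", "R2": "Cluster3"}},
    "Cluster2": {"_id": "Cluster2", "edges": {"R3": "Cluster3"}},
    "Cluster3": {"_id": "Cluster3", "edges": {}},
}


def find_by_ids(ids: List[str]) -> List[Dict]:
    """Stand-in for one query that fetches several documents by _id."""
    return [CLUSTERS[i] for i in ids if i in CLUSTERS]


def traverse(root_id: str) -> List[str]:
    """Breadth-first walk of a model, issuing one query per level."""
    seen = [root_id]
    frontier = [root_id]
    while frontier:
        next_ids: List[str] = []
        for doc in find_by_ids(frontier):
            for target in doc["edges"].values():
                if target not in seen:
                    seen.append(target)
                    next_ids.append(target)
        frontier = next_ids
    return seen
```

Batching one query per level keeps the number of round trips proportional to the depth of the hierarchy rather than to the number of clusters.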

Case Study #4: Zoom – Media Metadata Store

 This is a new mega project which is a work in progress
 MongoDB is being evaluated as a storage backend for all media-related
metadata on the site (example: picture IDs with lots of attributes)
 Requirements:
• Tens of TBs of data, millions of documents: the data set must be partitioned; this is our
first use case where MongoDB sharding is used
• System of record for picture info; data cannot be lost!
• Replication/DR across 2 data centers; local DC reads are required
• Queries are from site-facing flows; <10msec response time SLA
• Mixed workload: both inserts and reads are happening concurrently all the time

 Can MongoDB do it?

Zoom: Data Model

 2 main collections: Item and Image
• Item references multiple Images

 Item represents eBay Item:
• _id in Item is the external ID of the item in the eBay site DB
• These IDs are already sharded in a balanced way across N logical DB hosts using ID ranges
• We use MongoDB pre-split points for the initial mapping of our N site DB shards to M MongoDB shards
• This ensures good balance between the shards

 Image represents a picture attached to an
Item
• _id in Image is the MD5 hash of the image content
• This ensures good distribution across any number of shards
• MD5 is also used to find duplicate images

 Our choice of document IDs in both
collections ensures good balance across
Mongo shards
 We never query both collections in a single
service request to ensure data consistency
and to have only one index lookup
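
The two ID choices can be sketched with the standard library alone. `image_id` follows the slide directly; `pre_split_points` is only one plausible way to derive M-1 split points from N existing ID-range boundaries, and its input format is an assumption.

```python
import hashlib
from typing import List


def image_id(content: bytes) -> str:
    """_id for an Image document: MD5 hex digest of the image bytes."""
    return hashlib.md5(content).hexdigest()


def pre_split_points(range_starts: List[int], m_shards: int) -> List[int]:
    """Pick M-1 split points from the N existing ID-range boundaries.

    range_starts[i] is the first item ID served by site DB shard i
    (hypothetical input). Consecutive site shards are grouped so that
    each MongoDB shard receives roughly N/M of them.
    """
    n = len(range_starts)
    return [range_starts[(k * n) // m_shards] for k in range(1, m_shards)]
```

Two uploads of identical bytes produce the same `_id`, which is how the duplicate check described above falls out of the key choice.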

Zoom: Service Topology and Configuration

 MongoS is deployed on app servers
• Ensures network IO on MongoS won’t become a bottleneck
• This is a very familiar pattern at eBay, as was explained at the beginning of this presentation

 M shards; each replica set has 6 members
• 3 + 3 in 2 data centers
• Master can be in only one DC during automatic failover; manual failover may activate another DC
• One slave in the secondary DC is invisible for reads and is dedicated to periodic backups/snapshots (more on this later)

 For reads, the client first sets SlaveOK=true and, if the required document is not found, flips to SlaveOK=false to read from the Master
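
The read path above (try a slave first, fall back to the master on a miss) can be sketched with a toy replica set. `FakeReplicaSet` and its `slave_ok` parameter are invented for this example and do not correspond to any driver API.

```python
from typing import Dict, Optional


class FakeReplicaSet:
    """Toy replica set in which the slave may lag behind the master."""

    def __init__(self) -> None:
        self.master: Dict[str, Dict] = {}
        self.slave: Dict[str, Dict] = {}

    def insert(self, doc: Dict) -> None:
        # Writes land on the master only; the slave sees them later.
        self.master[doc["_id"]] = doc

    def replicate(self) -> None:
        # Simulate asynchronous replication catching up.
        self.slave = dict(self.master)

    def find_one(self, _id: str, slave_ok: bool) -> Optional[Dict]:
        source = self.slave if slave_ok else self.master
        return source.get(_id)


def read(rs: FakeReplicaSet, _id: str) -> Optional[Dict]:
    """Read from a slave first; on a miss, retry against the master."""
    doc = rs.find_one(_id, slave_ok=True)
    if doc is None:
        doc = rs.find_one(_id, slave_ok=False)
    return doc
```

The fallback only covers documents that have not yet been replicated; a stale copy of an already replicated document would still be served from the slave.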

 A home-grown MongoDB configuration and monitoring agent runs on every node
• Fetches MongoD configuration from a central configuration store and saves it to a local config file
• Manages the lifecycle of MongoD
• Monitors state and metrics

[Diagram: M shards, each with replicas flowing from DC1 (Primary) to DC2 (Secondary); one row of nodes is labeled M and another row is labeled B.]

Zoom: Data Backup and Restore strategy

 Goals:
• Take periodic backups of the entire data set
• Be able to recover from backup
• Do not lose any writes that have happened after the last snapshot
• Brief service unavailability during recovery is better than data loss
 Dual writes on the client
• Regular write to the main cluster
• Second write to another Mongo cluster: single replica set, capped collection; the data written is similar to a REDO log record
 Hidden slave in each shard has a volume mounted on a remote storage appliance capable of instant file system snapshots; it captures both DB files and journal files
 If DB recovery is activated:
• All MongoD on the primary cluster are shut down
• The NFS slave is remounted to the snapshot volume
• MongoD on this machine is started as a master
• MongoD on the other replica set members are started cold
• Full sync-up from the master
• The master is switched to a regular member
• Writes that occurred since the time the backup was taken are replayed from the REDO log capped collection in the secondary cluster

[Diagram labels: Application; dual-write to capped collection; M; C; B; Recovery Agent; instant-snapshot-capable device.]
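
The dual-write and replay idea can be sketched in memory as follows. This is a simulation under stated assumptions, not the production mechanism: a bounded `deque` stands in for the capped collection, a counter stands in for timestamps, and both "clusters" are plain Python objects.

```python
from collections import deque
from typing import Deque, Dict, Tuple

main_cluster: Dict[str, Dict] = {}
# A bounded deque mimics a capped collection: the oldest entries fall off.
redo_log: Deque[Tuple[int, Dict]] = deque(maxlen=1000)
_clock = 0


def dual_write(doc: Dict) -> int:
    """Write to the main cluster, then append a REDO-like record.

    Returns the logical timestamp assigned to the write.
    """
    global _clock
    _clock += 1
    main_cluster[doc["_id"]] = doc
    redo_log.append((_clock, dict(doc)))
    return _clock


def recover(snapshot: Dict[str, Dict], snapshot_time: int) -> Dict[str, Dict]:
    """Restore the snapshot, then replay writes made after it was taken."""
    restored = dict(snapshot)
    for ts, doc in redo_log:
        if ts > snapshot_time:
            restored[doc["_id"]] = doc
    return restored
```

In this model the log must be large enough to hold every write made between two snapshots; once older entries are overwritten they can no longer be replayed.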

Key Learning

 MongoDB can be a very powerful tool but use it wisely
 Deletes can be slow; automatic balancer is dangerous; use it only when you
must (example: be careful when adding new shards)
 Use explain for every query; disable full scans to discover inefficiencies
early
 Query profiler is great
 Retry every failed query at least once; long tail in response times is possible
when data set > RAM size
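
"Retry every failed query at least once" can be wrapped in a small helper. This is a generic sketch: `with_retry` is a hypothetical name, and real code would catch the driver's specific error types rather than `Exception`.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry(query: Callable[[], T], retries: int = 1) -> T:
    """Run query(); on failure, retry up to `retries` more times."""
    attempt = 0
    while True:
        try:
            return query()
        except Exception:
            if attempt >= retries:
                # Out of retries: surface the last error to the caller.
                raise
            attempt += 1
```

A caller would pass the query as a zero-argument callable, for example `with_retry(lambda: run_query(q))`, where `run_query` is whatever function issues the request.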
