Aesop Change Data Propagation :
Bridging SQL and NoSQL Systems
Regunath B, Principal Architect, Flipkart
What data store?
• In interviews: I need to scale and therefore will use a
• Avoids the overheads of RDBMS!?
• XX product brochure:
• Y million ops/sec (Lies, Damn Lies, and Benchmarks)
• In-memory, Flash optimised
• In architecture reviews:
• Durability of data, disk-to-memory ratios
• How many nodes in a single cluster?
• CAP tradeoffs: Consistency vs. Availability
Caching/Serving Layer challenges
There are only two hard things in Computer Science: cache
invalidation and naming things.
-- Phil Karlton
• Cache TTLs, High Request concurrency, Lazy caching
• Thundering herds
• Availability of primary data store
• Cache size, distribution, no. of replicas
• Feasibility of write-through
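One common way to soften the thundering-herd problem with lazy caching is to let only one caller per key reload an expired entry while the others wait for its result. The sketch below is a minimal, in-process illustration of that idea and is not taken from the talk; `SingleFlightCache` and the `loader` callback (standing in for a read from the primary data store) are hypothetical names.

```python
import threading
import time


class SingleFlightCache:
    """Lazy TTL cache where only one caller per key reloads a missing entry."""

    def __init__(self, loader, ttl_seconds):
        self._loader = loader            # hypothetical: key -> value from the primary store
        self._ttl = ttl_seconds
        self._entries = {}               # key -> (value, expiry time)
        self._key_locks = {}             # key -> lock guarding reloads of that key
        self._guard = threading.Lock()   # protects the two dicts above

    def _fresh(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry
        return None

    def get(self, key):
        with self._guard:
            entry = self._fresh(key)
            if entry is not None:
                return entry[0]
            key_lock = self._key_locks.setdefault(key, threading.Lock())
        with key_lock:
            # Re-check: another caller may have reloaded the key while we waited.
            with self._guard:
                entry = self._fresh(key)
            if entry is not None:
                return entry[0]
            value = self._loader(key)    # only one caller per key reaches this line at a time
            with self._guard:
                self._entries[key] = (value, time.monotonic() + self._ttl)
            return value
```

With this shape, a burst of concurrent requests for the same expired key results in a single call to the primary store. The sketch never evicts per-key locks and does nothing for a cache spread over several processes, which would need a shared lock or request coalescing at the serving layer.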
• Serving layer is Eventually Consistent, at best
• Replicas converge over time
• Scale reads through multiple replicas
• Higher overall data availability
• Reads return live data before convergence
• Need to implement Strong Eventual Consistency
when timeline-consistent view of data is needed
• Achieving Eventual Consistency is not easy
• At a minimum, requires an At-Least-Once delivery guarantee for
updates to all replicas
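At-least-once delivery implies that a replica may see the same update more than once, so applying an update has to be idempotent. The following is a minimal sketch of that pairing, assuming each update carries a per-key version number; `Replica`, `deliver_at_least_once` and the `send` callback are hypothetical names used only for illustration.

```python
class Replica:
    """Key-value replica that applies versioned updates idempotently."""

    def __init__(self):
        self.data = {}   # key -> (version, value)

    def apply(self, update):
        key, version, value = update
        current = self.data.get(key)
        # Duplicates and stale redeliveries carry a version that is not newer; skip them.
        if current is None or version > current[0]:
            self.data[key] = (version, value)


def deliver_at_least_once(updates, replica, send, max_attempts=5):
    """Resend each update until `send` reports an acknowledgement.

    `send(replica, update)` is an assumed transport hook returning True when
    an acknowledgement was received. A lost acknowledgement causes a resend,
    so the replica may receive the same update several times.
    """
    for update in updates:
        for _ in range(max_attempts):
            if send(replica, update):
                break
        else:
            raise RuntimeError(f"update {update!r} was never acknowledged")
```

If the transport applies an update but loses the acknowledgement, the sender retries and the replica simply ignores the repeat, so the final state is the same as with exactly one delivery.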
• A keen observer of changes that can also relay change events reliably
to interested parties. Provides useful infrastructure for building
Eventually Consistent data sources and systems.
• Open Source : https://github.com/Flipkart/aesop
• Support : firstname.lastname@example.org
• Production Deployments at Flipkart :
• Payments : Multi-tiered datastore spanning MySQL, HBase
• ETL : Move changes on User accounts to data analysis platform/
• Data Serving : Capture Wishlist data updates on MySQL and index
in Elasticsearch
• WIP : Accounting, Pricing, Order management etc.
• Producer : Uses Log Mining (Old wine in new bottle?)
• "Durability is typically implemented via logging and
recovery." - Architecture of a Database System
• "The contents of the DB are a cache of the latest records in
the log. The truth is the log. The database is a cache of a
subset of the log." - Jay Kreps (creator of Kafka)
• WAL (write ahead log) ensures:
• Each modification is flushed to disk
• Log records are in order
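The two WAL properties above, and the idea that the database is a cache of the log, can be illustrated with a toy log. This is a sketch under simplifying assumptions (one JSON record per line, a single writer); it is not how MySQL or HBase implement their logs, and `WriteAheadLog` and `rebuild_table` are hypothetical names.

```python
import json
import os


class WriteAheadLog:
    """Append-only log: every record is flushed to disk, in order."""

    def __init__(self, path):
        self._path = path
        self._file = open(path, "a", encoding="utf-8")

    def append(self, record):
        self._file.write(json.dumps(record) + "\n")
        self._file.flush()                 # push Python's buffer to the OS
        os.fsync(self._file.fileno())      # ask the OS to persist it to disk

    def replay(self):
        """Yield records in the order they were appended."""
        with open(self._path, encoding="utf-8") as f:
            for line in f:
                yield json.loads(line)

    def close(self):
        self._file.close()


def rebuild_table(log):
    """Reconstruct the current table state purely from the log."""
    table = {}
    for record in log.replay():
        if record["op"] == "put":
            table[record["key"]] = record["value"]
        elif record["op"] == "delete":
            table.pop(record["key"], None)
    return table
```

A log-mining producer reads the same ordered records that recovery would replay, which is why it can observe every committed change without querying the tables.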
• Databus Relay : Ring-Buffer holding Avro
serialised change events
• Memory mapped
• Similar to a Broker in a pub-sub system
• Enhanced in Aesop for configurability, metrics
collection and admin console
• Databus Consumer(s) : Sinks for change events
• Enhanced in Aesop for bootstrapping,
configurability, data transformation
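To make the relay and consumer roles concrete, here is a much simplified, in-memory ring buffer addressed by sequence number. It is only a sketch of the concept: the real Databus relay is memory mapped and stores Avro-serialised events, and `RingBufferRelay` and its methods are hypothetical names, not the Databus or Aesop API.

```python
class RingBufferRelay:
    """Fixed-capacity buffer of change events addressed by sequence number."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._slots = [None] * capacity
        self._next_seq = 0   # sequence number the next published event will get

    def publish(self, event):
        seq = self._next_seq
        self._slots[seq % self._capacity] = event   # overwrites the oldest slot when full
        self._next_seq += 1
        return seq

    def oldest_seq(self):
        return max(0, self._next_seq - self._capacity)

    def read_from(self, seq, limit=100):
        """Return (events, next_seq) starting at `seq`.

        Raises LookupError when `seq` has already been overwritten, which is
        the point where a consumer would need to bootstrap from elsewhere.
        """
        if seq > self._next_seq:
            raise ValueError("sequence %d has not been published yet" % seq)
        if seq < self.oldest_seq():
            raise LookupError("events before sequence %d were overwritten" % self.oldest_seq())
        end = min(self._next_seq, seq + limit)
        events = [self._slots[s % self._capacity] for s in range(seq, end)]
        return events, end
```

Each consumer keeps its own checkpoint (the `next_seq` it last received) and polls with it, so the relay holds no per-consumer state. A consumer that falls further behind than the buffer capacity gets a `LookupError` here, which corresponds to the case bootstrapping is meant to cover.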
Performance (Lies, Damn Lies, and Benchmarks)
• MySQL -> HBase
• Relay : 1 XL VM (8 core, 32GB)
• Consumers : 4 XL, 200 partitions
• Throughput : 30K Inserts per sec.
• Data size : 800 GB
• Time : 60 hrs
• Busy Relay - 95% CPU (serving data to 200 partitions)
• High producer throughput - Log read operates at disk transfer
• High consumer throughput - Append-only writes of HBase
• Better scale possible with larger machine for Relay
• Partitioning Relay might be tricky - to preserve WAL edits ordering
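As a rough sanity check on the figures above, the arithmetic below relates throughput, data size and duration. It assumes the 30K inserts per second was sustained on average over the full 60 hours and that GB means 10^9 bytes; neither assumption is stated in the slides, so the derived values are only indicative.

```python
INSERTS_PER_SEC = 30_000
DATA_SIZE_BYTES = 800 * 10**9      # assumption: decimal gigabytes
DURATION_SEC = 60 * 3600           # 60 hours
PARTITIONS = 200

# Assumption: the quoted throughput is an average over the whole run.
total_inserts = INSERTS_PER_SEC * DURATION_SEC
avg_bytes_per_insert = DATA_SIZE_BYTES / total_inserts
avg_mb_per_sec = DATA_SIZE_BYTES / DURATION_SEC / 10**6
inserts_per_sec_per_partition = INSERTS_PER_SEC / PARTITIONS

print(f"total inserts:           {total_inserts:,}")
print(f"avg bytes per insert:    {avg_bytes_per_insert:.0f}")
print(f"avg data rate (MB/s):    {avg_mb_per_sec:.2f}")
print(f"inserts/sec/partition:   {inserts_per_sec_per_partition:.0f}")
```

Under these assumptions the run amounts to about 6.5 billion inserts averaging roughly 120 bytes each, with each of the 200 consumer partitions handling about 150 inserts per second.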
• Enhance, Implement:
• HBase, MongoDB, etc.
• Data Layers
• Redis, Aerospike, etc.
• Document Operational best practices
• e.g. MySQL mastership transfer
• Infra component for building tiered data stores
• Sharded, Secondary indices, Low Latency, HW
optimized (high Disk-Memory ratios)