4.
Prior Data Ingestion
● Daily snapshotting of tables in RDS
● Dedicated replicas to isolate snapshot queries
● Bottlenecked by replica I/O
● Read & write amplification
5.
Faster Ingestion Pipeline
Unlock the data lake for business-critical applications
6.
Change Data Capture
● Table state as a sequence of state changes
● Capture the stream of changes from the DB
● Replay and merge the changes into the data lake
● Efficient & fast -> capture and apply only deltas (sketched below)
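To make "capture and apply only deltas" concrete, here is a minimal sketch of a single change event and how it is merged, assuming a Debezium-style envelope with before/after images and an op code; all field names and values are illustrative:

# A simplified change event for one row update (values are illustrative).
change_event = {
    "op": "u",                      # c = create, u = update, d = delete, r = snapshot read
    "ts_ms": 1623456789000,         # when the change was captured
    "before": {"id": 42, "status": "pending"},
    "after":  {"id": 42, "status": "filled"},
}

def apply_delta(table: dict, event: dict) -> None:
    """Merge one delta into an in-memory copy of the table keyed by primary key."""
    if event["op"] == "d":
        table.pop(event["before"]["id"], None)   # delete the row
    else:
        row = event["after"]
        table[row["id"]] = row                   # insert or overwrite with the latest image

orders = {42: {"id": 42, "status": "pending"}}
apply_delta(orders, change_event)
print(orders[42]["status"])   # -> "filled"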
7.
High Level Architecture
[Architecture diagram] The master RDS and a replica RDS feed per-table Kafka topics; a DeltaStreamer Bootstrap job and an incremental DeltaStreamer job write to the DATA LAKE (s3://xxx/…), updating schema and partitions and writing incremental data while checkpointing Kafka offsets.
9.
Debezium - Zooming In
1. Enable logical replication on the master relational database (RDS). All updates to the Postgres RDS database are logged into binary files called WriteAheadLogs (WALs).
2. Debezium consumes the WALs using Postgres logical replication plugins.
3. Debezium updates and validates Avro schemas for all tables using the Kafka Schema Registry.
4. Debezium writes Avro-serialized updates into a separate Kafka topic per table (Table_1 Topic, Table_2 Topic, … Table_n Topic).
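As a concrete reference for steps 1-4, here is a minimal sketch of registering a Debezium Postgres connector with Kafka Connect, assuming the built-in pgoutput plugin and Confluent Avro converters; hostnames, credentials, and table names are illustrative:

import requests

connector = {
    "name": "app-db-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "postgres-master.example.com",   # only the master publishes WALs
        "database.port": "5432",
        "database.user": "debezium",
        "database.password": "...",
        "database.dbname": "app",
        "plugin.name": "pgoutput",                # built-in logical decoding plugin (Postgres 10+)
        "slot.name": "debezium_app",              # logical replication slot used to consume WALs
        "database.server.name": "app",            # topic prefix: app.<schema>.<table>
        "table.include.list": "public.table_1,public.table_2",
        # Serialize change events as Avro and register/validate schemas in the Schema Registry
        "key.converter": "io.confluent.connect.avro.AvroConverter",
        "key.converter.schema.registry.url": "http://schema-registry.example.com:8081",
        "value.converter": "io.confluent.connect.avro.AvroConverter",
        "value.converter.schema.registry.url": "http://schema-registry.example.com:8081",
    },
}

# Register the connector through the Kafka Connect REST API.
requests.post("http://kafka-connect.example.com:8083/connectors", json=connector).raise_for_status()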
10.
Why did we choose Debezium over alternatives?
Debezium vs. AWS Database Migration Service (DMS):
● Operational overhead: Debezium is high; DMS is low.
● Cost: Debezium is free, with an engineering-time cost; DMS is relatively expensive, with negligible engineering-time cost.
● Speed: Debezium is fast; DMS is not fast enough.
● Customizations: Debezium yes; DMS no.
● Community support: Debezium has a very active and helpful Gitter community; DMS is limited to AWS support.
12.
Debezium: Lessons Learned
1. Postgres Master Dependency
ONLY the Postgres master publishes WriteAheadLogs (WALs), and it records a LogSequenceNumber (LSN) for each logical replication consumer (e.g. Consumer-1 at LSN A3CF/BC, Consumer-n at LSN A3CF/41).
Disk space:
- If a consumer dies, Postgres keeps accumulating WALs to ensure data consistency
- Can eat up all the disk space
- Needs proper monitoring and alerting (see the sketch below)
CPU:
- Each logical replication consumer uses a small amount of CPU
- Postgres 10+ uses pgoutput (built-in): lightweight; Postgres 9 uses wal2json (3rd party): heavier
- Need to upgrade to Postgres 10+
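A minimal monitoring sketch for the disk-space risk above, assuming Postgres 10+ and illustrative hostnames and thresholds: it measures how much WAL each replication slot is retaining and alerts when a consumer has fallen too far behind.

import psycopg2

conn = psycopg2.connect(host="postgres-master.example.com", dbname="app",
                        user="monitor", password="...")
with conn.cursor() as cur:
    # How much WAL is each logical replication slot forcing Postgres to retain?
    cur.execute("""
        SELECT slot_name,
               active,
               pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_wal_bytes
        FROM pg_replication_slots
    """)
    for slot_name, active, retained_bytes in cur.fetchall():
        # Alert well before retained WALs can fill the disk (threshold is illustrative).
        if retained_bytes is not None and retained_bytes > 50 * 1024**3:
            print(f"ALERT: slot {slot_name} (active={active}) retains {int(retained_bytes)} bytes of WAL")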
13.
Debezium: Lessons Learned
2. Initial Table Snapshot (Bootstrapping)
Need for bootstrapping:
- Each table to replicate requires an initial snapshot, on top of which ongoing logical updates are applied
Problem with Debezium:
- Slow: Debezium's snapshot mode reads, encodes (Avro), and writes all the rows to Kafka
- Large tables put too much pressure on the Kafka infrastructure and the Postgres master
Solution using DeltaStreamer:
- Custom DeltaStreamer bootstrapping framework using partitioned and distributed Spark reads (sketched below)
- Can use read replicas instead of the master
[Diagram] Master RDS and Replica RDS feed the table topic; DeltaStreamer Bootstrap and DeltaStreamer write the table.
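A minimal sketch of the partitioned, distributed Spark read against a read replica, assuming an integer primary key and illustrative connection details and bounds:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bootstrap-table").getOrCreate()

# Range-partitioned JDBC read: Spark issues numPartitions parallel queries against
# the replica instead of streaming the whole snapshot through Kafka.
snapshot_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://replica.example.com:5432/app")
    .option("dbtable", "public.orders")
    .option("user", "reader")
    .option("password", "...")
    .option("partitionColumn", "id")       # numeric primary key to split on
    .option("lowerBound", "1")
    .option("upperBound", "500000000")
    .option("numPartitions", "200")        # 200 parallel range queries
    .load()
)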
14.
Debezium: Lessons Learned
3. AVRO vs JSON
Throughput (benchmarked using a db.r5.24xlarge Postgres RDS instance):
- AVRO: up to 40K mps
- JSON: up to 11K mps (JSON records are larger than AVRO)
- JSON + Schema: up to 3K mps (the schema adds considerable size to JSON records)
Data types:
- AVRO: supports a considerably high number of primitive and complex data types out of the box; great for type safety
- JSON: values must be one of 6 data types: string, number, JSON object, array, boolean, null
- JSON + Schema: same as JSON
Schema Registry:
- AVRO: required by clients to deserialize the data
- JSON: optional
- JSON + Schema: optional
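The "required by clients to deserialize" row is the key operational difference: an Avro message on the wire carries only a schema id, so a consumer must fetch the writer schema from the registry before it can decode anything. A minimal sketch, assuming the Confluent wire format and an illustrative registry address:

import io
import json
import struct

import requests
from fastavro import parse_schema, schemaless_reader

REGISTRY_URL = "http://schema-registry.example.com:8081"   # illustrative address

def decode_confluent_avro(raw: bytes) -> dict:
    # Confluent wire format: 1 magic byte (0) + 4-byte big-endian schema id + Avro body.
    magic, schema_id = struct.unpack(">bI", raw[:5])
    assert magic == 0, "not a schema-registry encoded message"
    # Without the registry the payload is opaque: fetch the writer schema by id.
    schema_json = requests.get(f"{REGISTRY_URL}/schemas/ids/{schema_id}").json()["schema"]
    writer_schema = parse_schema(json.loads(schema_json))
    return schemaless_reader(io.BytesIO(raw[5:]), writer_schema)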
15.
Debezium: Lessons Learned
4. Multiple logical replication streams for horizontal scaling
Databases with multiple large tables can generate enough logs to overwhelm a single Debezium connector.
The solution is to split the tables across multiple Debezium connectors (e.g. Table-1 … Table-n assigned across Consumer-1 … Consumer-n), as sketched below.
Total throughput = max_throughput_per_connector * num_connectors
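A minimal sketch of the split, assuming illustrative table names and a Kafka Connect endpoint: each connector gets its own replication slot and a disjoint subset of tables, so throughput scales with the number of connectors.

import requests

TABLES = ["public.table_1", "public.table_2", "public.table_3",
          "public.table_4", "public.table_5", "public.table_n"]
NUM_CONNECTORS = 3

for i in range(NUM_CONNECTORS):
    config = {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "postgres-master.example.com",
        "database.dbname": "app",
        "database.server.name": "app",
        "slot.name": f"debezium_app_{i}",                            # one replication slot per connector
        "table.include.list": ",".join(TABLES[i::NUM_CONNECTORS]),  # disjoint table subsets
    }
    requests.post("http://kafka-connect.example.com:8083/connectors",
                  json={"name": f"app-db-connector-{i}", "config": config})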
16.
Debezium: Lessons Learned
5. Schema evolution and the value of freezing schemas
Failed assumption: schema changes are infrequent and always backwards compatible.
- Examples:
1. Adding non-nullable columns (most common, 99/100)
2. Deleting columns
3. Changing data types
How to handle the non-backwards-compatible changes?
- Re-bootstrap the table
- Can happen anytime during the day #always_on_call
Alternatives? Freeze the schema
- Debezium allows specifying the list of columns to be consumed per table (see the sketch below)
- Pros:
- #not_always_on_call
- Monitoring scripts identify and batch the changes for a management window
- Cons:
- Schema is temporarily out of sync
[Diagram] Table-X undergoes a backwards-incompatible schema change; Consumer-1 consumes all columns, Consumer-2 consumes only specific (frozen) columns.
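A minimal sketch of freezing the consumed schema via Debezium's per-table column list (table and column names are illustrative): columns added upstream are simply not emitted until the list is updated during a management window.

frozen_config = {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "table.include.list": "public.orders",
    # Only these columns are captured; a new non-nullable column upstream does not
    # change the emitted schema until this list is explicitly updated.
    "column.include.list": "public.orders.id,public.orders.user_id,public.orders.amount,public.orders.updated_at",
}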
17.
Deep Dive
- CDC Ingestion
- Bootstrap Ingestion
18.
Data Lake Ingestion - CDC Path
1. Get the Kafka checkpoint from the Hudi metadata.
2. Spark Kafka batch read and union from the most recently committed Kafka offsets, across the per-shard topics for each table (Shard 1 Table 1 … Shard N Table 1, Shard 1 Table 2 … Shard N Table 2).
3. Deserialize using the Avro schema from the Schema Registry.
4. Apply Copy-On-Write updates to the Hudi table and update the Kafka checkpoint.
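A minimal sketch of one CDC ingestion round, assuming illustrative topic names, checkpoint handling, and Hudi options; it approximates what a DeltaStreamer-style job does rather than Robinhood's exact code:

import requests
from pyspark.sql import SparkSession
from pyspark.sql.avro.functions import from_avro
from pyspark.sql.functions import expr

spark = SparkSession.builder.appName("cdc-ingest-table_1").getOrCreate()

# 1-2. Batch-read the table's shard topics from the offsets committed with the
#      previous Hudi commit (hard-coded here for illustration).
last_checkpoint = '{"shard_1.public.table_1": {"0": 1200345}, "shard_2.public.table_1": {"0": 987654}}'
raw = (
    spark.read.format("kafka")
    .option("kafka.bootstrap.servers", "kafka.example.com:9092")
    .option("subscribe", "shard_1.public.table_1,shard_2.public.table_1")
    .option("startingOffsets", last_checkpoint)
    .option("endingOffsets", "latest")
    .load()
)

# 3. Deserialize: skip the 5-byte schema-registry header, then decode with the latest value schema.
value_schema = requests.get(
    "http://schema-registry.example.com:8081/subjects/shard_1.public.table_1-value/versions/latest"
).json()["schema"]
updates = raw.select(
    from_avro(expr("substring(value, 6, length(value) - 5)"), value_schema).alias("row")
).select("row.after.*")

# 4. Copy-On-Write upsert into the Hudi table; the new checkpoint is persisted alongside the commit.
(updates.write.format("hudi")
    .option("hoodie.table.name", "table_1")
    .option("hoodie.datasource.write.operation", "upsert")
    .option("hoodie.datasource.write.recordkey.field", "id")
    .option("hoodie.datasource.write.precombine.field", "updated_at")
    .mode("append")
    .save("s3://data-lake/table_1"))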
19.
Data Lake Ingestion - Bootstrap Path
1. Get the latest offsets for the table's shard topics (Shard 1 … Shard n).
2. Store the offsets in the Hudi metadata.
3. Wait for the AWS RDS replica to catch up to the latest checkpoint.
4. Spark JDBC read from the replica.
5. Bulk insert into the Hudi table.
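A minimal sketch of this ordering, reusing snapshot_df from the earlier partitioned JDBC read and hypothetical helper functions for the Kafka and replica steps: recording the topic offsets before the snapshot, and snapshotting only once the replica has replayed past them, lets the CDC job resume from that checkpoint without missing updates.

# 1. Capture the latest offsets of the table's shard topics (hypothetical helper).
offsets = get_latest_topic_offsets(["shard_1.public.orders", "shard_2.public.orders"])

# 2. Store the offsets in the Hudi metadata as the initial CDC checkpoint (hypothetical helper).
store_checkpoint_in_hudi_metadata("s3://data-lake/orders", offsets)

# 3. Block until the read replica has replayed the master's WAL past that checkpoint (hypothetical helper).
wait_for_replica_to_catch_up(offsets)

# 4-5. Partitioned Spark JDBC read from the replica (snapshot_df, as sketched earlier),
#      then bulk insert into the Hudi table.
(snapshot_df.write.format("hudi")
    .option("hoodie.table.name", "orders")
    .option("hoodie.datasource.write.operation", "bulk_insert")
    .option("hoodie.datasource.write.recordkey.field", "id")
    .option("hoodie.datasource.write.precombine.field", "updated_at")
    .mode("append")
    .save("s3://data-lake/orders"))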
21.
Common Challenges
- How to ensure multiple Hudi jobs are not running on the same
table?
22.
Common Challenges
- How to partition Postgres queries for bootstrapping skewed
tables?
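One possible approach (a sketch, not necessarily what Robinhood does): instead of uniform lowerBound/upperBound ranges, pass Spark explicit predicates sized to the actual data distribution so each partition reads a similar number of rows; bounds and names are illustrative.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bootstrap-skewed-table").getOrCreate()

# Non-uniform slices: dense id ranges get narrower predicates than sparse ones.
predicates = [
    "id >= 1 AND id < 10000000",
    "id >= 10000000 AND id < 12000000",   # hot range, narrower slice
    "id >= 12000000 AND id < 12500000",   # hottest range, narrowest slice
    "id >= 12500000",
]

df = spark.read.jdbc(
    url="jdbc:postgresql://replica.example.com:5432/app",
    table="public.orders",
    predicates=predicates,                # one partition (and one query) per predicate
    properties={"user": "reader", "password": "..."},
)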
23.
Common Challenges
- How to monitor job statuses and end-to-end latencies?
- Commit durations
- Number of inserts, updates, and deletes
- Files added and deleted
- Partitions updated
24.
Future Work
- Thousands of pipelines and replication slots to be managed; this requires an orchestration framework
- Support Hudi's continuous mode to achieve lower latencies
- Leverage Hudi's incremental processing to build efficient downstream pipelines
- Deploy and improve Hudi's Flink integration for Flink-based pipelines
25.
Robinhood is Growing
Robinhood is hiring engineers across its teams to keep up with our growth!
We have recently expanded engineering teams in Seattle and NYC, and continue to grow in Menlo Park, CA (HQ).
Interested? Reach out to us directly or apply online at careers.robinhood.com