Data Product at Airbnb
LIYIN TANG & JINGWEI LU
Data Infrastructure at Airbnb
Batch Infrastructure
Diagram labels: Event Logs, MySQL Dumps, Kafka, Sqoop, Gold Cluster, Silver Cluster, Spark Cluster, Presto Cluster, HDFS, Hive, Yarn, Spark, ReAir, Airflow Scheduling, S3, AirPal, SuperSet, Tableau
Streaming at Airbnb
Diagram labels: Cluster, Spark Streaming, Airflow Scheduling, HBase
Sources: Kafka, S3, HDFS, …
Sinks: Datadog, Kafka, DynamoDB, Elasticsearch, …
Lambda Architecture
Diagram labels: Batch (Hive, Spark SQL), AirStream, Streaming (Kafka, Spark Streaming, HBase)
Our Foundations
• Combine stream with batch
• Shared global state store
Unified API through AirStream
• Declarative job configuration
• A computation operator or sink can be shared by stream and batch jobs
• Stream source vs. static source
• A single driver executes a job in either stream or batch mode
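The bullets above can be illustrated with a minimal sketch. This is not AirStream's actual API: the config keys, the operator registry, and `run_job` are hypothetical names chosen for illustration. It only shows how one declarative config can reuse the same operators and sink while the source differs between stream and batch mode.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical operator registry; the names are illustrative only.
OPERATORS: Dict[str, Callable[[dict], dict]] = {
    "uppercase_city": lambda row: {**row, "city": row["city"].upper()},
    "add_one": lambda row: {**row, "count": row["count"] + 1},
}

def static_source(rows: List[dict]) -> Iterator[List[dict]]:
    """Batch mode: the whole data set arrives as one chunk."""
    yield list(rows)

def stream_source(rows: List[dict], batch_size: int) -> Iterator[List[dict]]:
    """Stream mode: the same rows arrive as a sequence of micro-batches."""
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]

def run_job(config: dict, rows: List[dict]) -> List[dict]:
    """Single driver: choose the source from the config, then apply the
    shared operators and collect results in a shared (in-memory) sink."""
    if config["mode"] == "stream":
        chunks = stream_source(rows, config.get("batch_size", 1))
    else:
        chunks = static_source(rows)
    sink: List[dict] = []
    for chunk in chunks:
        for row in chunk:
            for name in config["operators"]:
                row = OPERATORS[name](row)
            sink.append(row)
    return sink

rows = [
    {"city": "sf", "count": 1},
    {"city": "nyc", "count": 2},
    {"city": "la", "count": 3},
]
job = {"operators": ["uppercase_city", "add_one"]}
batch_out = run_job({**job, "mode": "batch"}, rows)
stream_out = run_job({**job, "mode": "stream", "batch_size": 2}, rows)
```

In this sketch only the `mode` key changes between the two runs, so both produce the same rows.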
AirStream
Diagram: multiple Spark Streaming jobs and multiple Spark Batch jobs, all connected to HBase Tables (Shared Global State Store)
Shared Global State Store
Diagram: a DataFrame is re-partitioned by HBase region (Region 1, Region 2, … Region N) into <Region 1, [RowKey, Value]>, <Region 2, [RowKey, Value]>, … <Region N, [RowKey, Value]>, then written to HBase via Puts or via HFile BulkLoad
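The re-partition step on the slide above can be sketched in plain Python. This is an illustration under assumptions, not the connector's code: `region_for`, `repartition`, and the example start keys are hypothetical, and a real job would obtain region boundaries from HBase and operate on a Spark DataFrame.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, List, Tuple

def region_for(row_key: str, start_keys: List[str]) -> int:
    """Index of the region whose key range contains row_key.

    start_keys are the sorted region start keys; the first region
    starts at the empty string, and start keys are inclusive."""
    return bisect_right(start_keys, row_key) - 1

def repartition(
    rows: List[Tuple[str, int]], start_keys: List[str]
) -> Dict[int, List[Tuple[str, int]]]:
    """Group (row_key, value) pairs by target region."""
    parts: Dict[int, List[Tuple[str, int]]] = defaultdict(list)
    for key, value in rows:
        parts[region_for(key, start_keys)].append((key, value))
    # Sort within each partition, since HFiles store keys in sorted order.
    return {region: sorted(kvs) for region, kvs in parts.items()}

# Hypothetical boundaries for three regions: ["", "g"), ["g", "p"), ["p", end)
start_keys = ["", "g", "p"]
rows = [("zebra", 1), ("apple", 2), ("mango", 3), ("grape", 4), ("pear", 5)]
partitions = repartition(rows, start_keys)
```

Each partition then holds only keys for one region, so it could be written either as a batch of puts or as one sorted file for that region.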
Shared Global State Store
Diagram: Spark Streaming/Batch jobs read from HBase Tables through Multi-Gets, Prefix Scan, and Time Range Scan
Why HBase?
• Rich API for point lookups and sequential scans (TimeRange, TTL, Prefix Scan …)
• Merged view based on version
• Unified API for streaming writes and bulk uploads
• Unified API for reading from a live table and from a snapshot table
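To make the access patterns in the first bullet concrete, here is a toy in-memory table. It is a stand-in for illustration, not the HBase client API: `ToyTable` and its method names are hypothetical. The time range is treated as half-open (`min_ts` inclusive, `max_ts` exclusive), matching how HBase defines a TimeRange.

```python
from typing import Dict, List, Optional, Tuple

class ToyTable:
    """In-memory stand-in for a sorted, timestamped key-value table."""

    def __init__(self) -> None:
        # row key -> {timestamp: value}
        self.cells: Dict[str, Dict[int, str]] = {}

    def put(self, row_key: str, ts: int, value: str) -> None:
        self.cells.setdefault(row_key, {})[ts] = value

    def _latest(self, row_key: str, min_ts: float, max_ts: float) -> Optional[str]:
        """Newest value with min_ts <= timestamp < max_ts, or None."""
        versions = [
            (ts, v)
            for ts, v in self.cells.get(row_key, {}).items()
            if min_ts <= ts < max_ts
        ]
        return max(versions)[1] if versions else None

    def multi_get(self, row_keys: List[str]) -> List[Optional[str]]:
        """Point lookups for several row keys at once."""
        return [self._latest(k, 0, float("inf")) for k in row_keys]

    def prefix_scan(
        self, prefix: str, min_ts: float = 0, max_ts: float = float("inf")
    ) -> List[Tuple[str, str]]:
        """Sequential scan over keys sharing a prefix, limited to a time range."""
        out = []
        for key in sorted(self.cells):
            if key.startswith(prefix):
                value = self._latest(key, min_ts, max_ts)
                if value is not None:
                    out.append((key, value))
        return out

table = ToyTable()
table.put("user_1", 10, "a")
table.put("user_1", 20, "b")
table.put("user_2", 15, "c")
table.put("host_1", 5, "d")

gets = table.multi_get(["user_1", "host_1", "missing"])
scan_all = table.prefix_scan("user_")
scan_old = table.prefix_scan("user_", 0, 20)
```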
Streaming Computation
Merged Storage
Diagram: streaming writes over time add versions under row key R1: (R1, V100, V100), (R1, V99, V99), (R1, V01, V01), …
Merged Storage
Diagram: the same row R1 with streaming writes (R1, V100, V100), (R1, V99, V99), (R1, V01, V01), plus a batch bulk upload that writes (R1, V100, 100)
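One reading of the slide above is that a batch bulk upload writes to the same row and version as an earlier streaming write, and readers then see the batch value. The sketch below models that reading only; `MergedStore` and its methods are hypothetical and the store is a plain dictionary, not HBase.

```python
from typing import Dict, Optional, Tuple

class MergedStore:
    """Cells keyed by (row key, version); a later write to the same
    (row key, version) replaces the earlier value."""

    def __init__(self) -> None:
        self.cells: Dict[Tuple[str, int], str] = {}

    def write(self, row_key: str, version: int, value: str) -> None:
        self.cells[(row_key, version)] = value

    def read_version(self, row_key: str, version: int) -> Optional[str]:
        return self.cells.get((row_key, version))

    def read_latest(self, row_key: str) -> Optional[str]:
        """Merged view: the value stored under the highest version."""
        versions = [v for (k, v) in self.cells if k == row_key]
        return self.cells[(row_key, max(versions))] if versions else None

store = MergedStore()
# Streaming writes, one version per write.
store.write("R1", 1, "V01")
store.write("R1", 99, "V99")
store.write("R1", 100, "V100")
latest_before = store.read_latest("R1")

# Batch bulk upload for an existing version.
store.write("R1", 100, "100")
latest_after = store.read_latest("R1")

# A late write with an older version does not change the latest view.
store.write("R1", 50, "V50")
latest_final = store.read_latest("R1")
```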
Distinct Count
Diagram: rows keyed Prefix_R1 (V102, V102), Prefix_R2 (V101, V101), Prefix_R4 (V01, V01), Prefix_R3 (V100, 100) along a time axis; the distinct count is obtained with a prefix scan with TimeRange
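The distinct count on the slide above can be sketched as counting the distinct row keys that share a prefix and fall inside a time range. The key layout (`listing42_userA`) and the function name are hypothetical examples, and the cells are a plain list rather than an HBase scan.

```python
from typing import Iterable, Tuple

def distinct_count(
    cells: Iterable[Tuple[str, int]], prefix: str, min_ts: int, max_ts: int
) -> int:
    """Count distinct row keys with the given prefix whose timestamp
    satisfies min_ts <= ts < max_ts."""
    return len(
        {
            key
            for key, ts in cells
            if key.startswith(prefix) and min_ts <= ts < max_ts
        }
    )

# (row key, timestamp) pairs; the same key can be written more than once.
cells = [
    ("listing42_userA", 100),
    ("listing42_userB", 101),
    ("listing42_userA", 102),
    ("listing42_userC", 1),
    ("listing7_userA", 101),
]
recent = distinct_count(cells, "listing42_", 100, 103)
all_time = distinct_count(cells, "listing42_", 0, 103)
```

Because the row key itself carries the identity being counted, repeated writes for the same key collapse into one entry.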
Moving Average
Diagram: row R1 with versions (V102, 102), (V101, 101), (V01, V01), (V100, 100), … along a time axis; for each window (Window 1, Window 2) the moving average is count difference / time elapsed
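The "count difference / time elapsed" computation on the slide above can be sketched as follows, assuming each version of the row stores a cumulative count at a timestamp. The function name and sample data are hypothetical.

```python
from typing import List, Optional, Tuple

def window_rate(
    samples: List[Tuple[int, int]], start_ts: int, end_ts: int
) -> Optional[float]:
    """Average rate over a window.

    samples are (timestamp, cumulative_count) pairs sorted by timestamp,
    with distinct timestamps. Returns None when the window contains
    fewer than two samples."""
    in_window = [(ts, c) for ts, c in samples if start_ts <= ts <= end_ts]
    if len(in_window) < 2:
        return None
    (t0, c0), (t1, c1) = in_window[0], in_window[-1]
    return (c1 - c0) / (t1 - t0)

samples = [(0, 0), (10, 50), (20, 80), (30, 140)]
window_1 = window_rate(samples, 0, 20)   # (80 - 0) / (20 - 0)
window_2 = window_rate(samples, 10, 30)  # (140 - 50) / (30 - 10)
```

Only the first and last samples in each window are needed, which is why a time-range read of the stored versions is enough.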
Long Window Computation
Spark HBase Connector
Diagram labels: Zeppelin, Spark HBase Connector, HBase
Presto - HBase Connector
Diagram labels: Presto, Presto HBase Connector (Schema Mapping; Split -> RS Mapping), HBase
Hive - HBase Connector
Diagram labels: Hive, Hive HBase Connector (Table InputFormat; Snapshot InputFormat), HBase
Use Cases
MySQL DB Snapshot Using Binlog Replay
Database Snapshot
• Large amount of data: multiple large MySQL DBs
• Realtime-ness: minutes of delay / hours of delay
• Transactions: need to preserve transactions that span different tables
• Schema change: table schemas evolve
Move Elephant
Binlog Replay on Spark
Diagram labels: 20+ hr, 4+ hr, AirStream Job, 5 mins, 15, 1 hr
• Streaming and batch share logic: binlog file reader, DDL processor, transaction processor, DML processor.
• Idempotent: the log can be replayed multiple times.
• Schema changes: full schema change history.
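The idempotency bullet above can be illustrated with a small sketch. It is not the talk's implementation: the event fields (`pos`, `table`, `pk`, `op`, `row`) and `apply_events` are hypothetical. The idea shown is that each applied change records its log position, so replaying the same events again leaves the state unchanged.

```python
from typing import Dict, List, Tuple

def apply_events(state: Dict[Tuple[str, int], dict], events: List[dict]) -> dict:
    """Apply DML events keyed by (table, primary key).

    Each stored entry remembers the log position that produced it; an
    event at the same or an older position is skipped, which makes
    replay safe. Deletes are kept as tombstones (row is None)."""
    for ev in events:
        key = (ev["table"], ev["pk"])
        current = state.get(key)
        if current is not None and current["pos"] >= ev["pos"]:
            continue  # this change (or a newer one) was already applied
        row = None if ev["op"] == "delete" else ev["row"]
        state[key] = {"pos": ev["pos"], "row": row}
    return state

events = [
    {"pos": 1, "table": "users", "pk": 1, "op": "insert", "row": {"name": "ann"}},
    {"pos": 2, "table": "users", "pk": 1, "op": "update", "row": {"name": "anna"}},
    {"pos": 3, "table": "users", "pk": 2, "op": "insert", "row": {"name": "bob"}},
    {"pos": 4, "table": "users", "pk": 2, "op": "delete", "row": None},
]

once = apply_events({}, events)
twice = apply_events(apply_events({}, events), events)
```

Storing the log position with each row plays the same role as a cell version: a replayed write can never overwrite a newer one.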
Architecture
Diagram labels: Binlog, Log Parser, xvid, DML, DDL, HBase
Realtime Ingestion and Interactive Query
Diagram labels: Kafka, AirStream, Spark Streaming, HBase, Query Engine, Data Portal, Spark SQL, Hive SQL, Presto SQL
Interactive Query in Superset
Summary
Unify Batch and Streaming Computation
We are hiring

HBaseCon 2017: Data Product at Airbnb
