Apache HBase at Airbnb

Aug. 2, 2016

Editor's Notes

  1. Disaster recovery; job isolation by SLA (high vs. slow).
  2. Slide: why a stateful process rather than a stateless one.
  3. Use a diagram to show the operators.
  4. Realtime ingestion provides a fast feedback loop. Advanced monitoring infrastructure. Track changes instead of taking a full snapshot for the RDS dump.
  5. Goals of realtime ingestion: (a) a fast feedback loop for experiments, to shorten the testing cycle; (b) a realtime view of the production database for many offline workloads (for example, machine learning).
  6. Table mapping provides a unified view for accessing realtime-ingested data.
  7. Taking a snapshot with a scan takes 10-30 minutes per table, which does not scale. The link and restore take 10 minutes, after which all tables can be accessed.
  8. A backup-based DB export and restore takes 9-12 hours and depends on AWS network conditions, so it has long latency and is fragile. We only need to track changes and apply them to a snapshot. This provides a near-realtime snapshot of the database and unifies the approach across MySQL and DynamoDB.
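
The idea in note 8 of tracking changes and applying them to a snapshot can be illustrated with a minimal sketch. This is not the pipeline from the talk: the `Change` record, its `seq` ordering field, and the `apply_changes` helper are all hypothetical names, and the snapshot is modeled as a plain in-memory dictionary rather than an HBase table.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class Change:
    """One tracked change to a row (hypothetical record shape)."""
    key: str                        # primary key of the affected row
    seq: int                        # assumed monotonically increasing change order
    row: Optional[Dict[str, Any]]   # new row contents, or None for a delete


def apply_changes(
    snapshot: Dict[str, Dict[str, Any]],
    changes: Iterable[Change],
) -> Dict[str, Dict[str, Any]]:
    """Return a new snapshot with the changes applied in seq order.

    The input snapshot is left untouched so the older view stays usable.
    """
    result = dict(snapshot)
    for change in sorted(changes, key=lambda c: c.seq):
        if change.row is None:
            # A delete: drop the row if it is present.
            result.pop(change.key, None)
        else:
            # An insert or update: the latest change for a key wins.
            result[change.key] = change.row
    return result


base = {"1": {"name": "a"}, "2": {"name": "b"}}
log = [
    Change("2", 2, None),
    Change("1", 1, {"name": "z"}),
    Change("3", 3, {"name": "c"}),
]
fresh = apply_changes(base, log)
```

Because only the change log has to be replayed, the cost scales with the number of changes rather than with the size of the full database export.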