Serving queries at low latency using HBase
Biju Nair
HBase application design for low latency
1.
Serving Billions of Queries in Millisecond Latency
HBaseConAsia 2018, August 17, 2018
Biju Nair, bnair10@bloomberg.net
© 2018 Bloomberg Finance L.P. All rights reserved.
2.
Agenda
• Need for low latency
• HBase principles
• Modeling
• Implementation
• Monitoring and Tuning
3.
Bloomberg by the numbers
• Founded in 1981
• 325,000 subscribers in 170 countries
• Over 19,000 employees in 192 locations
• More news reporters than The New York Times + Washington Post + Chicago Tribune
4.
Bloomberg Tech
• More than 5,000 software engineers (and growing)
• 100+ engineers and data scientists devoted to machine learning
• One of the largest private networks in the world
• 100B+ tick messages per day, with a peak of more than 10 million messages/second
• 2M news stories ingested / published each day (that's >500 news stories ingested/second)
• News content from 125K+ sources
• More than a billion messages (emails and IB chats) processed each day
5.
Bloomberg in a nutshell
6.
Data Storage and Retrieval
• Files
• VSAM
• Network
• Hierarchical
• Relational
• MPP
7.
Using RDBMS
• Use Case
• Entities and Relations
• Logical data model
• Physical data model
• Implementation and tuning
8.
HBase Principles
• Ordered Key Value Store
• Distributed
9.
Key Value
Key-9999   Value-a
Key-9998   Value-b
Key-9997   Value-c
Key-9996   Value-d
…          …
Key-9995   Value-e
Key-9994   Value-f
10.
Ordered Key Value (lexicographic order)
Key-9999   Value-a
Key-9998   Value-b
Key-9997   Value-c
Key-9996   Value-d
…          …
Key-9995   Value-e
Key-9994   Value-a
Key-9993   Value-g
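To make the ordering rule concrete, here is a minimal Python sketch (not HBase code). It assumes keys compare byte by byte, which is what lexicographic order implies. The key names and the zero-padding width are illustrative assumptions, not values from the talk.

```python
# Keys in an ordered key-value store compare byte by byte, not numerically.
keys = [b"Key-9", b"Key-10", b"Key-100", b"Key-2"]

# Plain byte-wise sort: "Key-10" and "Key-100" land before "Key-2" and "Key-9".
lexicographic = sorted(keys)


def padded(n: int, width: int = 4) -> bytes:
    """Zero-pad the numeric part so byte order matches numeric order.

    The width of 4 is an assumption made for this example.
    """
    return f"Key-{n:0{width}d}".encode()


padded_keys = sorted(padded(n) for n in (9, 10, 100, 2))

print(lexicographic)
print(padded_keys)
```

The practical point for key design is that numeric or date components usually need a fixed width, otherwise a range scan may return rows in an order the application does not expect.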
11.
Distributed Order
[Diagram: key-value pairs grouped into separate ranges, each range kept in order: Key-199/198/197, Key-499/498/497, Key-299/298/297, Key-599/598/597, Key-399/398/397, Key-999/998/997, …]
12.
Abstraction
• Table row view
• Versioning
• ACIDity
13.
Table Row View
Key: Row Id | Column Id | Timestamp
Value: Value
14.
Table Row View
Key11|col1|1234567   Value-A
Key11|col2|1234567   Value-B
Key11|col3|1234567   Value-C
Key11|col4|1234567   Value-D
Row view:
Row Id   Col1      Col2      Col3      Col4
Key11    Value-A   Value-B   Value-C   Value-D
15.
Versioning (descending order)
Key11|col1|1234567   Value-A1
Key11|col1|1234566   Value-A
Key11|col2|1234567   Value-B
Key11|col3|1234567   Value-CC
Key11|col3|1234563   Value-C
Key11|col4|1234567   Value-DD
Key11|col4|1234560   Value-D1
Key11|col4|1234557   Value-D
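The way a row view and versioning fall out of a sorted key of the form row id, column id, timestamp can be sketched in a few lines of Python. This is a simplified model, not the HBase implementation. The tuple layout and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Cell = Tuple[str, str, int, str]  # (row id, column id, timestamp, value)

cells: List[Cell] = [
    ("Key11", "col1", 1234566, "Value-A"),
    ("Key11", "col1", 1234567, "Value-A1"),
    ("Key11", "col3", 1234563, "Value-C"),
    ("Key11", "col3", 1234567, "Value-CC"),
    ("Key11", "col2", 1234567, "Value-B"),
]


def sort_cells(items: List[Cell]) -> List[Cell]:
    """Order by row and column ascending, then timestamp descending (newest first)."""
    return sorted(items, key=lambda c: (c[0], c[1], -c[2]))


def row_view(items: List[Cell], row: str) -> Dict[str, str]:
    """Build the table-row view: for each column keep only the newest value."""
    view: Dict[str, str] = {}
    for r, col, _ts, value in sort_cells(items):
        if r == row and col not in view:  # first hit per column is the newest
            view[col] = value
    return view


print(row_view(cells, "Key11"))
```

Because newer timestamps sort first, a reader that wants the current value can stop at the first cell it meets for each column.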
16.
ACIDity
• Atomic at row level
• Consistent to a point in time before the request
• Isolation through MVCC (reads) and row locks (mutations)
• Durability is guaranteed for all successful mutations
17.
Data Modeling
• Fitness for key value store
  - Can't build relations
  - No secondary indexes
  - De-normalization
• Understand queries to design key
  - Data Skew
  - Query Skew
18.
Data Skew
[Diagram: key-value pairs spread across several groups; Key-e appears far more often than the other keys (Key-a, Key-b, Key-d, Key-f, Key-h, Key-x, Key-y, Key-z)]
19.
Data Skew
[Diagram: as slide 18, with a "Hotspot" label]
20.
Query Skew
[Diagram: key-value pairs grouped into ranges: Key-199/198/197, Key-499/498/497, Key-299/298/297, Key-599/598/597, Key-399/398/397, Key-999/998/997, …]
21.
Query Skew
[Diagram: as slide 20, with three "Queries" labels]
22.
Query Skew
[Diagram: as slide 21, with a "Hotspot" label]
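A quick way to check a candidate key design for data skew or query skew is to count how many stored or queried keys land in each key range. The Python sketch below does that with made-up split points and a made-up threshold. None of the names or numbers come from the talk, and it is not an HBase API.

```python
from bisect import bisect_right
from collections import Counter
from typing import Iterable, List


def region_of(key: str, split_points: List[str]) -> int:
    """Index of the key range holding `key`; split_points must be sorted."""
    return bisect_right(split_points, key)


def load_per_region(keys: Iterable[str], split_points: List[str]) -> Counter:
    """Count how many keys (stored rows or queried keys) fall in each range."""
    return Counter(region_of(k, split_points) for k in keys)


def hotspots(load: Counter, factor: float = 2.0) -> List[int]:
    """Ranges whose load exceeds `factor` times the mean load of the ranges seen.

    The threshold factor is an arbitrary assumption for this sketch.
    """
    mean = sum(load.values()) / len(load)
    return sorted(r for r, n in load.items() if n > factor * mean)


splits = ["Key-c", "Key-g", "Key-p"]  # hypothetical range boundaries
stored = ["Key-a"] * 4 + ["Key-b"] * 3 + ["Key-e"] * 13 + ["Key-h"] * 3 + ["Key-x"] * 2
load = load_per_region(stored, splits)
print(dict(load), hotspots(load, factor=1.5))
```

Feeding the same functions a log of queried keys instead of stored keys gives a rough view of query skew.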
23.
HBase Data Write
[Diagram: a write passing through numbered steps 1 to 3, involving memory, the WAL, and the store files of table t1 on the file system; store files consist of blocks (Data, Idx, Blm)]
24.
HBase Data Read
[Diagram: a read passing through numbered steps 1 and 2, involving the Memstore, the Block Cache, and blocks from the store files of table t1 on the file system; the WAL is also shown]
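The components named on the write and read slides can be tied together with a toy model. The Python sketch below is not HBase code. It assumes the commonly described order (log the mutation, apply it in memory, flush to an immutable file when memory fills) and caches individual values where HBase caches whole blocks. The class name, method names, and flush threshold are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class TinyStore:
    """Toy model of the write and read paths; a simplification, not HBase."""

    def __init__(self, flush_threshold: int = 2) -> None:
        self.wal: List[Tuple[str, str]] = []          # append-only log for durability
        self.memstore: Dict[str, str] = {}            # recent writes held in memory
        self.store_files: List[Dict[str, str]] = []   # immutable flushed files, oldest first
        self.block_cache: Dict[str, str] = {}         # values cached after a file read
        self.flush_threshold = flush_threshold        # assumed flush trigger (entry count)

    def put(self, key: str, value: str) -> None:
        self.wal.append((key, value))                 # log the mutation first
        self.memstore[key] = value                    # then apply it in memory
        self.block_cache.pop(key, None)               # drop any stale cached value
        if len(self.memstore) >= self.flush_threshold:
            self.store_files.append(dict(sorted(self.memstore.items())))
            self.memstore.clear()

    def get(self, key: str) -> Optional[str]:
        if key in self.memstore:                      # newest data lives in memory
            return self.memstore[key]
        if key in self.block_cache:                   # cache hit avoids a file read
            return self.block_cache[key]
        for store_file in reversed(self.store_files): # newest file first
            if key in store_file:
                self.block_cache[key] = store_file[key]
                return store_file[key]
        return None


store = TinyStore(flush_threshold=2)
store.put("k1", "a")
store.put("k2", "b")      # second put reaches the threshold and triggers a flush
print(store.get("k1"), len(store.store_files), "k1" in store.block_cache)
```

The model makes the latency argument visible: a read served from memory or cache never touches the file system, which is why the later slides focus on getting more useful data into the cache.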
25.
Cache
• Pack more blocks into cache
  - Block size
  - Column Family
• Large cache
26.
Block Size and Cache Use
[Diagram: BlockCache filled with 64 K blocks; each block is split into a small "use" part and a larger "unused" part]
27.
Block Size and Cache Use
[Diagram: the 64 K BlockCache from slide 26 next to a BlockCache filled with 16 K blocks; the 16 K cache holds more blocks, each marked "use"]
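The effect of block size on cache use comes down to simple arithmetic, sketched here in Python. The cache size and the number of bytes a single Get needs are assumptions chosen for round numbers. Only the 64 K and 16 K block sizes come from the slides.

```python
def blocks_in_cache(cache_bytes: int, block_bytes: int) -> int:
    """How many blocks of a given size fit in the cache."""
    return cache_bytes // block_bytes


def useful_fraction(block_bytes: int, row_bytes: int) -> float:
    """Share of a cached block that a single-row Get actually uses."""
    return min(1.0, row_bytes / block_bytes)


KIB = 1024
cache = 1024 * 1024 * KIB   # 1 GiB of cache, an assumed size
row = 4 * KIB               # assumed bytes needed per Get

for block in (64 * KIB, 16 * KIB):
    print(block // KIB, blocks_in_cache(cache, block), useful_fraction(block, row))
```

Under these assumptions the same cache holds four times as many 16 K blocks, and each block wastes less space on data the query did not ask for.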
28.
Block Size vs. Read Latency
Get Performance (ms) – 64 K Block (20 samples per metric)
BAvg     16.731 16.728 16.761 16.763 16.418 16.371 16.37 16.431 16.152 16.14 16.169 16.158 16.308 16.29 16.325 16.307 16.34 16.381 16.391 16.352
BMedian  14 14 14 14 13 13 13 13 15 15 15 15 13 13 13 13 13 13 13 13
B95%     41 41 41 41 41 41 41 41 43 43 43 43 40 40 40 40 41 41 41 41
B99%     55 55 55 55 54 54 54 54 55 55 55 55 54 54 54 54 54 54 55 54
B99.9%   71 71 71 71 70 70 70 70 67 67 67 67 71 70 70 71 71 71 71 70
BMax     545 1062 559 567 1075 1027 561 567 564 541 558 1062 1062 561 1075 1072 1067 563 1035 1032
Get Performance (ms) – 16 K Block (20 samples per metric)
Avg      3.002 5.362 5.361 5.357 6.419 6.369 6.405 6.383 6.188 6.196 6.182 6.174 6.246 6.264 6.268 6.253 5.194 5.207 5.219 3.031
Median   2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
95%      10 15 15 15 18 18 18 18 18 18 17 17 18 18 18 18 15 15 15 10
99%      15 26 26 26 30 30 30 30 28 28 28 28 29 29 29 29 25 24 25 15
99.90%   26 41 41 41 45 45 45 45 43 43 43 43 44 44 44 44 41 41 41 26
Max      2261 127 185 102 90 106 92 102 93 106 119 114 89 140 132 82 81 150 93 1910
Note: Smaller block size increases index block size
29.
Block Size vs. Index Size
16 K Blocks
Idx Sz K   Bloom K
266346     2368
247895     2240
225561     2096
253633     2368
224862     2016
225685     2096
8 K Blocks
Idx Sz K   Bloom K
472058     2432
574239     2944
331899     1792
471362     2304
517272     2560
469543     2432
30.
Column Family
Key11|col1|1234567   Value-A1
Key11|col1|1234566   Value-A
Key11|col2|1234567   Value-B
Key11|col3|1234567   Value-CC
Key11|col3|1234563   Value-C
Key11|col4|1234567   Value-DD
Key11|col4|1234560   Value-D1
Key11|col4|1234557   Value-D
[Diagram: BlockCache filled with blocks, each marked "use"]
31.
Column Family
[Diagram: on the file system, table t1 has separate store files per column family (t1:cf1 and t1:cf2); the cell layouts shown are Row Id | cf1:col1 | Timestamp with a Value, and Row Id | cf2:col1 | Timestamp with a Value]
32.
Column Family
[Diagram: BlockCache in which every block belongs to cf1]
33.
Column Family
[Diagram: the all-cf1 BlockCache from slide 32 next to a BlockCache in which cf2 blocks take some of the slots among the cf1 blocks]
34.
Compaction
[Diagram: store files on the file system holding several entries for key K-x plus an entry labeled D-1]
35.
Compaction
[Diagram: the store files from slide 34 before compaction, and the file system after compaction with a single K-x entry in the store files]
36.
Compaction
• Part of regular HBase operations
• Minor Compaction
• Major Compaction
• Utilizes server and HBase resources
• Major compaction can be scheduled
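The core of compaction, merging several sorted store files into one and discarding superseded entries, can be sketched as follows. This Python model is a simplification under stated assumptions: it keeps only the newest version of each key and drops keys whose newest entry is a delete marker. The entry layout and the use of `None` as a delete marker are illustrative choices, not HBase's file format.

```python
import heapq
from typing import Iterable, List, Optional, Tuple

Entry = Tuple[str, int, Optional[str]]  # (key, timestamp, value); None marks a delete


def compact(store_files: Iterable[List[Entry]]) -> List[Entry]:
    """Merge sorted store files into one, keeping the newest entry per key.

    Each input file must be sorted by key ascending, then timestamp descending.
    Keys whose newest entry is a delete marker are dropped entirely.
    """
    merged = heapq.merge(*store_files, key=lambda e: (e[0], -e[1]))
    result: List[Entry] = []
    last_key = None
    for key, ts, value in merged:
        if key == last_key:
            continue                  # an older version of a key already handled
        last_key = key
        if value is not None:         # skip keys whose newest entry is a delete
            result.append((key, ts, value))
    return result


files = [
    [("K-x", 1, "v1"), ("K-y", 1, "old")],
    [("K-x", 2, "v2")],
    [("K-x", 3, "v3"), ("K-y", 4, None)],
]
print(compact(files))
```

After compaction a read for K-x touches one file instead of three, which is the latency benefit. The merge itself costs I/O and CPU, which is why the slide points out that major compaction can be scheduled.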
37.
Short Circuit Read
[Diagram: two panels, each showing the HBase/DFS Client, HDFS, the File System, Data, and TCP; the second panel adds the steps "Open File" and "Pass FD"]
38.
Garbage Collection
[Diagram: HBase read path with numbered steps 1 and 2, involving the Memstore, the Block Cache, and blocks from the store files of table t1 on the file system; the WAL is also shown]
39.
Garbage Collection
[Diagram: as slide 38, with an Off-heap Cache added and the read path numbered 1 to 3]
40.
Large Cache
61 GB of Cache
Avg          2.693  2.814  2.836  2.842  2.812
Median       1      1      1      1      1
95%          8      8      8      8      8
99%          14     14     14     14     15
99.90%       20     20     20     20     20
99.99%       32     31     32     32     33
100.00%      313    319    315    376    341
Max latency  1049   1046   1048   1044   1235
93 GB of Cache
Avg          3.872  3.995  3.936  4.007  4.052
Median       1      1      1      1      1
95%          14     14     14     15     15
99%          20     20     20     20     20
99.90%       27     27     27     28     28
99.99%       36     36     36     37     37
100.00%      208    310    332    207    232
Max latency  1360   1906   1736   1359   1363
41.
Garbage Collection
• Fine-tune the garbage collector
• For example, some CMS GC options to look at
  - ExplicitGCInvokesConcurrent
  - CMSInitiatingOccupancyFraction
  - UseCMSInitiatingOccupancyOnly
  - ParallelGCThreads
  - UseParNewGC
• Log GC info, which will help with tuning
  - PrintGCDetails
  - Xloggc
  - PrintTenuringDistribution
  - …
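As a sketch of how the options listed above look when spelled out as JVM flags, the Python helper below assembles them into one string. The flag spellings are those of JDK 8-era HotSpot; CMS and these logging flags were removed or replaced in later JDKs. `-XX:+UseConcMarkSweepGC` is added here because the other CMS options assume it. The numeric values and the log path are placeholders, not recommendations from the talk.

```python
def cms_gc_opts(occupancy_fraction: int = 70,
                parallel_gc_threads: int = 8,
                gc_log_path: str = "/var/log/hbase/gc.log") -> str:
    """Assemble JVM options for a CMS-based setup.

    All default values are placeholders to be tuned per cluster.
    """
    flags = [
        "-XX:+UseConcMarkSweepGC",
        "-XX:+UseParNewGC",
        "-XX:+ExplicitGCInvokesConcurrent",
        f"-XX:CMSInitiatingOccupancyFraction={occupancy_fraction}",
        "-XX:+UseCMSInitiatingOccupancyOnly",
        f"-XX:ParallelGCThreads={parallel_gc_threads}",
        "-XX:+PrintGCDetails",
        "-XX:+PrintTenuringDistribution",
        f"-Xloggc:{gc_log_path}",
    ]
    return " ".join(flags)


print(cms_gc_opts())
```

A string like this would typically be appended to the region server JVM options, for example through `HBASE_REGIONSERVER_OPTS` in `hbase-env.sh`.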
42.
Region Replication
[Diagram: HMaster, ZK, and region servers RS1 to RS6; labels "system" and "T1 r1"]
43.
Region Replication
[Diagram: same components as slide 42: HMaster, ZK, RS1 to RS6, "system", "T1 r1"]
44.
Region Replication
[Diagram: same components as slide 42, with "T1 r1" now shown twice]
45.
Region Replication
• Requires changes to cluster configuration (not the complete list)
  - hbase.region.replica.replication.enabled
  - hbase.regionserver.storefile.refresh.period
• Need to specify region replication in table definition
  - create 't1', 'f1', {REGION_REPLICATION => 2}
• Client needs to specify when to read secondary
  - get1.setConsistency(Consistency.TIMELINE);
  - hbase.client.primaryCallTimeout.get
  - hbase.client.primaryCallTimeout.multiget
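How the primary call timeout interacts with secondary reads can be shown with a small deterministic model. The Python sketch below is not the HBase client. It assumes that a TIMELINE read goes to the primary first, that a secondary is contacted only after the primary call timeout elapses, and that the first answer to arrive wins. The function and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReadResult:
    source: str        # "primary" or "secondary"
    stale: bool        # True when served by a secondary replica
    latency_ms: float  # time until the client has an answer


def timeline_read(primary_ms: float, secondary_ms: float,
                  primary_timeout_ms: float) -> ReadResult:
    """Model of a TIMELINE-consistency Get with fixed replica latencies.

    The client asks the primary first. If the primary has not answered within
    primary_timeout_ms, the client also asks a secondary and takes whichever
    answer arrives first.
    """
    if primary_ms <= primary_timeout_ms:
        return ReadResult("primary", False, primary_ms)
    secondary_done = primary_timeout_ms + secondary_ms
    if primary_ms <= secondary_done:
        return ReadResult("primary", False, primary_ms)
    return ReadResult("secondary", True, secondary_done)


print(timeline_read(primary_ms=5, secondary_ms=3, primary_timeout_ms=10))
print(timeline_read(primary_ms=500, secondary_ms=3, primary_timeout_ms=10))
```

In this model a shorter timeout caps tail latency sooner but sends more reads to secondaries, which may return stale data.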
46.
Region Replication
PrimaryCall Timeout Vs Stale Calls
Time (ms)   readers   totalQuery   totalStale   %age
3,000       512       1,520,207    0            0.00%
3,000       512       1,520,207    0            0.00%
3,000       512       1,520,207    0            0.00%
1,000       512       1,520,207    0            0.00%
1,000       512       1,520,207    0            0.00%
1,000       512       1,520,207    0            0.00%
100         512       1,520,207    5,101        0.34%
100         512       1,520,207    1,476        0.10%
100         512       1,520,207    74           0.00%
50          512       1,520,207    6,173        0.41%
50          512       1,520,207    4,785        0.31%
50          512       1,520,207    5,263        0.35%
10          512       1,520,207    22,518       1.48%
10          512       1,520,207    16,818       1.11%
10          512       1,520,207    19,050       1.25%
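The percentage column in the timeout table is the number of stale calls divided by the total number of queries. A short Python check, using three rows from the table, reproduces the reported figures. The function name is an illustrative choice.

```python
def stale_percent(total_stale: int, total_query: int) -> str:
    """Stale reads as a percentage of all queries, rounded to two decimals."""
    return f"{100.0 * total_stale / total_query:.2f}%"


TOTAL_QUERY = 1_520_207
for timeout_ms, stale in [(100, 5_101), (50, 6_173), (10, 22_518)]:
    print(timeout_ms, stale_percent(stale, TOTAL_QUERY))
```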
47.
Application
• Table pre-split
• Connection reuse
• No read cache/scan cache/batch requests
• Bulk load instead of Put/Batch Mutate
• Column Identifiers
• Code on server
  - Co-processor
  - Filters
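For the table pre-split item, split points are usually computed up front from the expected key distribution. The Python sketch below generates evenly spaced boundaries. It assumes row keys begin with a uniformly distributed fixed-width hex prefix, such as the start of a hash. That assumption, the function name, and the prefix length are illustrative and do not come from the talk.

```python
from typing import List


def hex_split_points(regions: int, prefix_len: int = 2) -> List[str]:
    """Evenly spaced split keys over a fixed-width hex prefix space.

    Returns regions - 1 boundaries. Assumes row keys start with a uniformly
    distributed hex prefix of prefix_len characters.
    """
    space = 16 ** prefix_len
    if not 1 <= regions <= space:
        raise ValueError("regions must be between 1 and the size of the prefix space")
    return [f"{(i * space) // regions:0{prefix_len}x}" for i in range(1, regions)]


print(hex_split_points(4))
```

The resulting list can be supplied when the table is created, for example in the HBase shell with `create 't1', 'f1', SPLITS => ['40', '80', 'c0']`, so that load is spread across region servers from the first write.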
48.
Monitoring
• Cache hit ratio
• Data locality
• GC pause
• Compactions
• Call queue
• Read latencies
• Server metrics
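Two of the monitored quantities, cache hit ratio and read latency percentiles, are simple to derive from raw counters and samples. The Python sketch below uses the nearest-rank percentile definition, which is one of several in common use. The sample latencies and counter values are made up.

```python
import math
from typing import Sequence


def cache_hit_ratio(hits: int, misses: int) -> float:
    """Fraction of reads served from cache; 0.0 when there were no reads."""
    total = hits + misses
    return hits / total if total else 0.0


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


latencies_ms = [1, 2, 2, 2, 3, 3, 4, 10, 15, 26]  # made-up sample
print(cache_hit_ratio(950, 50), percentile(latencies_ms, 50), percentile(latencies_ms, 99))
```

Tracking the median together with the 99th and 99.9th percentiles, as the latency tables in this deck do, shows tail behavior that an average would hide.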
49.
Summary
• Familiarize yourself with system principles
• Use the DB development lifecycle
• Ascertain natural fitness for HBase
50.
Thank You!
Reference: http://hbase.apache.org
Connect with Bloomberg's Hadoop Team: hadoop@bloomberg.net