The Current State of Apache HBase: What Has Become of the HBase Once Called a Volcano
2. Speaker introduction (Toshihiro Suzuki)
• Apache HBase Committer
• Cloudera
• Sr. Software Engineer, Breakfix
• ( HBase/Phoenix)
• HBase
• Twitter: @brfrn169
6. • HBase 0.98
• HBase 1.4.9
• HBase 1.5.0
• HBase 1
• HBase 2 HBase 2.1.x
HBase 2.2.0
• HBase 2
7. • CDH
• CDH 5.8+: HBase 1.2.0 (+ bugfixes and backports)
• CDH 6.0: HBase 2.0.1 (+ bugfixes and backports)
• CDH 6.1: HBase 2.1.1 (+ bugfixes and backports)
• HDP
• HDP 2.x: HBase 1.1.2 (+ bugfixes and backports)
• HDP 3.x: HBase 2.0.2 (+ bugfixes and backports)
9. HBase
• HBase 2.x
• Procedure version 2
• Assignment Manager version 2
• Backup/Restore
• Compacting Memstore
• Serial Replication
11. Procedure version 2
• Example: CreateTableProcedure
Start → PRE_OPERATION → WRITE_FS_LAYOUT → ADD_TO_META → ASSIGN_REGIONS → UPDATE_DESC_CACHE → POST_OPERATION → End
14. Procedure version 2
• Example: CreateTableProcedure
Start → PRE_OPERATION → WRITE_FS_LAYOUT → ADD_TO_META → ASSIGN_REGIONS → UPDATE_DESC_CACHE → POST_OPERATION → End
Procedure
ASSIGN_REGIONS
Region
Procedure
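A minimal sketch of the idea behind a state-machine procedure such as CreateTableProcedure, assuming that progress is persisted after every step so that a restarted Master can resume where the previous one stopped. The state names come from the slide; `ProcedureStore`, `run`, and the simulated crash are hypothetical names for illustration and are not HBase APIs.

```python
from enum import IntEnum


class CreateTableState(IntEnum):
    # State names taken from the slide; the numbering is illustrative.
    PRE_OPERATION = 1
    WRITE_FS_LAYOUT = 2
    ADD_TO_META = 3
    ASSIGN_REGIONS = 4
    UPDATE_DESC_CACHE = 5
    POST_OPERATION = 6


class ProcedureStore:
    """Hypothetical durable store: remembers the next state of each procedure."""

    def __init__(self):
        self._next_state = {}

    def save(self, proc_id, state):
        self._next_state[proc_id] = state

    def load(self, proc_id):
        # A procedure that was never stored starts at the first state.
        return self._next_state.get(proc_id, CreateTableState.PRE_OPERATION)


def run(proc_id, store, executed, crash_before=None):
    """Execute states in order, persisting progress after each step.

    Returns True when the procedure reaches End, False if a crash is simulated.
    """
    state = store.load(proc_id)
    while state is not None:
        if state == crash_before:
            return False  # simulated Master crash; progress is already stored
        executed.append(state.name)  # stand-in for the real work of this step
        nxt = state + 1
        state = CreateTableState(nxt) if nxt <= CreateTableState.POST_OPERATION else None
        store.save(proc_id, state)
    return True


store = ProcedureStore()
executed = []
finished = run(1, store, executed, crash_before=CreateTableState.ASSIGN_REGIONS)
resumed = run(1, store, executed)  # a new Master picks up at ASSIGN_REGIONS
```

In this sketch the second call does not repeat the first three steps, because the store already records ASSIGN_REGIONS as the next state to execute.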
15. Assignment Manager version 2
• Region
• Region
• HBCK
• Region Assignment Manager version 2
• Procedure version 2
• Region Zookeeper
• Region
• Region
• Master
16. Backup/Restore
• hbase backup create <type> <backup_path> [options]
• hbase restore <backup_path> <backup_id> [options]
• HDFS, S3, ADLS, WASB
• hbase snapshot
• Write Ahead Log (WAL)
33. Serial Replication
• HBase Replication
[Figure, animated across slides 33-45: In Cluster 1, a RegionServer holds WAL1 and WAL2, and its queue contains entries 1, 2, 3, 4. The ReplicationSource tails the WALs and pushes the entries asynchronously to the ReplicationSink on a RegionServer in Cluster 2, which applies them through HTable to the RegionServers of Cluster 2. Entries 1, 2, and 3 arrive one after another, in order.]
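The flow on these slides can be sketched as a source that tails a queue of WAL entries and ships them in batches, removing entries from its queue only after the sink acknowledges them. This is a simplified model under stated assumptions: the class names mirror the labels in the diagram, but the methods (`append_wal_entry`, `ship_once`, `replicate`) and the `batch_size` parameter are hypothetical and do not reflect HBase's actual interfaces.

```python
from collections import deque


class ReplicationSink:
    """Stand-in for the sink in Cluster 2: applies shipped entries in arrival order."""

    def __init__(self):
        self.applied = []

    def replicate(self, batch):
        self.applied.extend(batch)
        return True  # acknowledge the batch


class ReplicationSource:
    """Stand-in for the source in Cluster 1: tails a queue of WAL entries."""

    def __init__(self, sink, batch_size=2):
        self.queue = deque()
        self.sink = sink
        self.batch_size = batch_size

    def append_wal_entry(self, entry):
        # The write path only enqueues; it never waits for Cluster 2.
        self.queue.append(entry)

    def ship_once(self):
        """Ship up to batch_size entries and drop them only after the sink acks."""
        batch = list(self.queue)[: self.batch_size]
        if batch and self.sink.replicate(batch):
            for _ in batch:
                self.queue.popleft()
        return len(batch)


sink = ReplicationSink()
source = ReplicationSource(sink)
for entry in (1, 2, 3, 4):
    source.append_wal_entry(entry)
while source.ship_once():
    pass
```

With a single source, entries reach the sink in the order they were written, which is the situation shown in the figure.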
47. Serial Replication
• HBase Replication
[Figure, animated across slides 47-65: In Cluster 1, RegionServer 1 and RegionServer 2 each have their own queue and ReplicationSource. Entries 1, 2, 3 are written to RegionServer 1's queue. The Region is then moved to RegionServer 2, and entry 4 is written to RegionServer 2's queue. Both ReplicationSources push independently to the ReplicationSink in Cluster 2. Entries 1 and 2 arrive from RegionServer 1, then entry 4 arrives from RegionServer 2 while entry 3 is still in RegionServer 1's queue: Inconsistent State!]
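The race on these slides can be reproduced with a small simulation. The `serial` flag sketches the idea behind Serial Replication under a simplifying assumption: an entry is held back until every entry of the same Region with a smaller sequence ID has been delivered. The `replicate` function, its parameters, and the fixed `schedule` are hypothetical; HBase's own bookkeeping for this is more involved.

```python
def replicate(queues, schedule, serial):
    """Simulate several sources pushing entries of one Region to a single sink.

    queues:   dict of server name -> list of sequence IDs waiting on that server.
    schedule: order in which the servers get a chance to push one entry.
    serial:   if True, hold back an entry while a smaller sequence ID is pending.
    Returns the order in which the sink received the entries.
    """
    pending = {server: list(entries) for server, entries in queues.items()}
    delivered = []
    for server in schedule:
        entries = pending[server]
        if not entries:
            continue
        seq_id = entries[0]
        if serial and any(q and q[0] < seq_id for q in pending.values()):
            continue  # an older entry is still queued elsewhere; wait
        delivered.append(entries.pop(0))
    return delivered


# Entries 1-3 were written on rs1; after the Region moved, entry 4 was written on rs2.
queues = {"rs1": [1, 2, 3], "rs2": [4]}
schedule = ["rs1", "rs1", "rs2", "rs1", "rs2"]

unordered = replicate(queues, schedule, serial=False)
ordered = replicate(queues, schedule, serial=True)
```

Without the check, entry 4 overtakes entry 3, which is the inconsistent state in the figure. With the check, RegionServer 2 skips its turn until entry 3 has been delivered.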
106. • Procedure version 2 / Assignment Manager version 2
• Backup/Restore
• Compacting Memstore
• Serial Replication Replication
108. HBase
• Evolving HBase in the Cloud
• HBase
• HBase on Persistent Memory
• HBase Persistent Memory
• Synchronous Replication
110. Evolving HBase in the Cloud
• HBASE-20951 Ratis LogService backed WALs
• IaaS (Amazon EC2, Google Compute Engine, Microsoft Azure Compute) HBase
• IaaS HBase
112. Evolving HBase in the Cloud
• Amazon EBS (Elastic Block Store) Google Persistent Storage
• Amazon EBS (Elastic Block Store) Google Persistent Storage
• Amazon S3 Google Cloud Storage
• Amazon EBS Google Persistent Storage
113. Evolving HBase in the Cloud
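• HBase stores its HFiles and its WAL on HDFS
• HFile
• WAL: short-lived, with sub-second durability requirements
[Figure: Puts reach the RegionServer, which writes them to the WAL and to the Memstore. A Flush writes the Memstore out as HFiles. Both the HFiles and the WAL reside on HDFS.]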
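A toy model of the write path on this slide, assuming the usual order of operations: a Put is appended to the WAL first, then applied to the Memstore, and a Flush turns the Memstore into an immutable HFile, after which the flushed edits no longer need the WAL. The class, its method names, and `flush_threshold` are hypothetical simplifications, not HBase code.

```python
class RegionServer:
    """Toy write path: WAL append, Memstore update, Flush to an HFile."""

    def __init__(self, flush_threshold=2):
        self.wal = []       # append-only log of edits not yet flushed
        self.memstore = {}  # in-memory map of recent writes
        self.hfiles = []    # immutable, sorted files produced by flushes
        self.flush_threshold = flush_threshold

    def put(self, row, value):
        self.wal.append((row, value))  # 1. make the edit durable in the log
        self.memstore[row] = value     # 2. apply it in memory
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the Memstore out as a sorted, immutable file.
        self.hfiles.append(dict(sorted(self.memstore.items())))
        self.memstore.clear()
        self.wal.clear()  # flushed edits are safe in an HFile, so the log can go

    def get(self, row):
        if row in self.memstore:
            return self.memstore[row]
        for hfile in reversed(self.hfiles):  # newest file first
            if row in hfile:
                return hfile[row]
        return None


rs = RegionServer(flush_threshold=2)
rs.put("r1", "a")
rs.put("r2", "b")  # reaches the threshold and triggers a flush
rs.put("r3", "c")  # stays in the Memstore and the WAL
```

The sketch also shows why the WAL is short-lived: its contents matter only until the corresponding Memstore data has been flushed.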
114. Evolving HBase in the Cloud
• HFile (S3 with S3Guard)
• WAL
• WAL
• sub-second durability requirements
• WAL
• traversable queue (FIFO)
• constant-time append complexity
• linear-time traversal
• sub-linear seek to an arbitrary offset
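The four properties listed above can be illustrated with a small in-memory log. This is only a sketch of the data-structure requirements, not of any HBase or Ratis interface: `SimpleLog`, `append`, `read_from`, and `index_every` are hypothetical names. Records have variable length, so seeking relies on a sparse index (binary search plus a short bounded scan) rather than on direct addressing.

```python
import struct
from bisect import bisect_right


class SimpleLog:
    """Append-only FIFO log of variable-length records in one byte buffer."""

    def __init__(self, index_every=4):
        self._buf = bytearray()
        self._count = 0
        self._index_every = index_every
        self._index_ids = []  # record IDs that have an index entry
        self._index_pos = []  # byte position of each indexed record

    def append(self, payload: bytes) -> int:
        """Amortized constant time: write a length prefix and payload at the tail."""
        record_id = self._count
        if record_id % self._index_every == 0:
            self._index_ids.append(record_id)
            self._index_pos.append(len(self._buf))
        self._buf += struct.pack(">I", len(payload)) + payload
        self._count += 1
        return record_id

    def read_from(self, record_id: int):
        """Yield payloads in FIFO order, starting at record_id.

        The seek is a binary search over the sparse index followed by a scan of
        at most index_every - 1 records; the traversal after that is linear.
        """
        if not 0 <= record_id < self._count:
            return
        slot = bisect_right(self._index_ids, record_id) - 1
        current, pos = self._index_ids[slot], self._index_pos[slot]
        while pos < len(self._buf):
            (length,) = struct.unpack_from(">I", self._buf, pos)
            start = pos + 4
            if current >= record_id:
                yield bytes(self._buf[start:start + length])
            pos = start + length
            current += 1


log = SimpleLog(index_every=4)
for i in range(10):
    log.append(f"edit-{i}".encode())

tail = list(log.read_from(7))
everything = list(log.read_from(0))
```

A smaller `index_every` makes seeks cheaper at the cost of a larger index; the value 4 here is arbitrary.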
115. Evolving HBase in the Cloud
• Apache Ratis
• Apache Software Foundation
• A Java implementation of RAFT
• Apache Hadoop Ozone
• Ratis Kafka DistributedLog
• HBase WAL
• Ratis
• WAL Ratis
116. Evolving HBase in the Cloud
• Ratis WAL Ratis LogService Ratis
• WAL HBase
• 2
1. Ratis LogService (RATIS-271)
2. HBase WAL (HBASE-20952)
• HDFS HDFS WAL 1
• Ratis LogService Kafka DistributedLog
117. Evolving HBase in the Cloud
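[Figure: Puts arrive at RegionServer1, RegionServer2, and RegionServer3. Each RegionServer writes its WAL through the New WAL API to the Ratis LogService, which replicates it with RAFT across the WAL storage of the three servers. Each RegionServer flushes its Memstore as HFiles to Amazon S3/Google Cloud Storage.]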