The document discusses Apache HBase replication, which asynchronously copies data between HBase clusters. It uses a push-based architecture that ships write-ahead log (WAL) entries, similar to MySQL replication. Replication provides eventual consistency and preserves the atomicity of individual updates. Administrators configure replication by setting parameters and by managing peer clusters and queues, which are stored in ZooKeeper. Replicated edits flow from the replication source on a region server to the remote replication sink, where they are applied.
Replication State
Persistently stored in ZooKeeper
Status
Master kill switch
Peers
List of remote target clusters
Queues
List of remaining HLogs to replicate and current position in each log
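The three kinds of state listed above can be pictured as a small tree of nodes. The following is a minimal sketch in Python, not HBase code: the class `ReplicationState`, the function `to_znodes`, the root path, and the sample host names are all hypothetical and chosen only for illustration. It flattens the status flag, the peer list, and the per-log positions into path/value pairs of the kind a ZooKeeper tree could hold.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReplicationState:
    """Hypothetical in-memory model of the replication state on the slide."""
    enabled: bool = True  # Status: the master kill switch
    # Peers: peer id -> address of the remote target cluster
    peers: Dict[str, str] = field(default_factory=dict)
    # Queues: region server -> peer id -> HLog name -> current position
    queues: Dict[str, Dict[str, Dict[str, int]]] = field(default_factory=dict)


def to_znodes(state: ReplicationState, root: str = "/replication") -> Dict[str, str]:
    """Flatten the state into path -> value pairs (illustrative layout only)."""
    nodes = {f"{root}/state": "true" if state.enabled else "false"}
    for peer_id, cluster in state.peers.items():
        nodes[f"{root}/peers/{peer_id}"] = cluster
    for server, by_peer in state.queues.items():
        for peer_id, logs in by_peer.items():
            for hlog, position in logs.items():
                nodes[f"{root}/rs/{server}/{peer_id}/{hlog}"] = str(position)
    return nodes


state = ReplicationState(
    peers={"1": "zk.remote.example:2181:/hbase"},
    queues={"rs1": {"1": {"hlog.0001": 4096, "hlog.0002": 0}}},
)
nodes = to_znodes(state)
```

Because the position of each log is stored alongside the log name, a reader of this state can tell both which logs remain and where to resume in each one.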
Replication Source
End-point for shipping WAL entries
One instance for each queue
Runs as a separate thread on region server
Uses AdminProtocol RPC to synchronously ship entries
Filters edits based on replication scope
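The scope filtering done by the replication source can be sketched as follows. This is an illustrative Python model, not the HBase implementation: the `Edit` type, the `filter_by_scope` function, and the `scopes` mapping are hypothetical, and the sketch assumes scope is configured per column family with a local value (do not replicate) and a global value (replicate).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed scope values: local edits stay on this cluster, global edits are shipped.
REPLICATION_SCOPE_LOCAL = 0
REPLICATION_SCOPE_GLOBAL = 1


@dataclass(frozen=True)
class Edit:
    """Hypothetical stand-in for one edit read from a WAL entry."""
    table: str
    row: str
    family: str
    value: str


def filter_by_scope(edits: List[Edit],
                    scopes: Dict[Tuple[str, str], int]) -> List[Edit]:
    """Keep only edits whose (table, family) is marked global.

    Families with no configured scope are treated as local in this sketch.
    """
    return [
        e for e in edits
        if scopes.get((e.table, e.family), REPLICATION_SCOPE_LOCAL)
        == REPLICATION_SCOPE_GLOBAL
    ]


scopes = {("users", "profile"): REPLICATION_SCOPE_GLOBAL,
          ("users", "cache"): REPLICATION_SCOPE_LOCAL}
edits = [Edit("users", "r1", "profile", "a"),
         Edit("users", "r1", "cache", "b"),
         Edit("logs", "r9", "raw", "c")]
shipped = filter_by_scope(edits, scopes)
```

Filtering on the source side means edits that are not meant to leave the cluster are never sent over the network.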
Replication Sink
End-point for receiving shipped WAL entries
One instance per region server
Synchronously receives entries and applies them using HTable
Batches rows in the same table
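The batching step on the sink side can be illustrated with a short Python sketch. It is an assumption-laden model rather than HBase code: `batch_by_table`, `apply_entries`, and the `put_batch` callback (standing in for a per-table client write) are hypothetical names introduced here.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Entry = Tuple[str, str, str]  # (table, row, value), a simplified shipped edit


def batch_by_table(entries: List[Entry]) -> Dict[str, List[Tuple[str, str]]]:
    """Group shipped entries by table, preserving arrival order within a table."""
    batches: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for table, row, value in entries:
        batches[table].append((row, value))
    return dict(batches)


def apply_entries(entries: List[Entry],
                  put_batch: Callable[[str, List[Tuple[str, str]]], None]) -> None:
    """Apply one batch per table through the supplied write callback."""
    for table, rows in batch_by_table(entries).items():
        put_batch(table, rows)


# Record the calls instead of writing to a real table.
calls: List[Tuple[str, List[Tuple[str, str]]]] = []
apply_entries(
    [("users", "r1", "a"), ("logs", "r9", "c"), ("users", "r2", "b")],
    lambda table, rows: calls.append((table, rows)),
)
```

In this example three shipped entries result in two write calls, one per table, which is the point of batching rows that belong to the same table.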