
HBaseConAsia2018 Track1-7: HDFS optimizations for HBase at Xiaomi

Gang Xie, Yingchao Zhou, and Chen Zhang of Xiaomi

  1. HDFS optimizations for HBase at Xiaomi
  2. Agenda: 01, 02, 03
  3. Section 01
  4. Availability
     - Multiple replicas
     - Namenode & Datanode
     With r = replication factor, k = faulty datanodes, N = total datanodes:
     readSLAAvailability = 1 - k/N
     writeSLAAvailability = 1 - \sum_{i=1}^{r} C(k,i) * C(N-k, r-i) / C(N,r)
     availability = Min(namenodeAvailability, writeSLAAvailability, readSLAAvailability)
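To make the model concrete, here is a minimal, self-contained sketch of the three formulas; the class and variable names are mine, not from the talk, and the namenode availability is an assumed constant:

```java
import java.math.BigInteger;

/** Illustrative sketch of the slide's availability model. */
public class AvailabilityModel {

    /** Binomial coefficient C(n, k), computed exactly. */
    static BigInteger c(int n, int k) {
        if (k < 0 || k > n) return BigInteger.ZERO;
        BigInteger result = BigInteger.ONE;
        for (int i = 0; i < k; i++) {
            result = result.multiply(BigInteger.valueOf(n - i))
                           .divide(BigInteger.valueOf(i + 1));
        }
        return result;
    }

    /** readSLAAvailability = 1 - k/N: a read fails only if the replica
     *  it hits sits on one of the k faulty datanodes. */
    static double readSla(int k, int n) {
        return 1.0 - (double) k / n;
    }

    /** writeSLAAvailability = 1 - sum_{i=1..r} C(k,i)*C(N-k,r-i)/C(N,r):
     *  a write fails if any of the r pipeline nodes is faulty. */
    static double writeSla(int r, int k, int n) {
        double failure = 0.0;
        for (int i = 1; i <= r; i++) {
            failure += c(k, i).multiply(c(n - k, r - i)).doubleValue()
                     / c(n, r).doubleValue();
        }
        return 1.0 - failure;
    }

    public static void main(String[] args) {
        int r = 3, k = 2, n = 100;          // 3 replicas, 2 faulty of 100 datanodes
        double namenodeAvailability = 0.9999; // assumed, not from the slide
        double availability = Math.min(namenodeAvailability,
                Math.min(writeSla(r, k, n), readSla(k, n)));
        System.out.printf("read=%.6f write=%.6f overall=%.6f%n",
                readSla(k, n), writeSla(r, k, n), availability);
    }
}
```

With these numbers the write term dominates (about 0.9406 versus 0.98 for reads), since a write touches all r pipeline nodes while a read touches only one.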
  5. HDFSMonitorCluster (diagram): a Master distributes cluster configs to Workers; the Workers probe the Namenode and DataNodes and report metrics to Falcon.
  6. Section 02
  7. Short-circuit read architecture (diagram): the DFSClient holds DfsClientShm segments and the Datanode holds the matching RegisteredShm segments, each divided into slots that track blocks open for short-circuit reads. Over a Unix domain socket the client allocates a Shm, requests a slot & FDs, releases the slot, and releases the Shm.
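For context, enabling this path on the client side takes only two settings; a hedged sketch, in which the socket path, namenode URI, and file path are placeholders:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShortCircuitReadExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Turn on short-circuit local reads; the domain socket path must
        // match the one the Datanode listens on (placeholder value here).
        conf.setBoolean("dfs.client.read.shortcircuit", true);
        conf.set("dfs.domain.socket.path", "/var/lib/hadoop-hdfs/dn_socket");
        // Reads of a local replica now use the slot/FD exchange over the
        // domain socket instead of a TCP stream to the Datanode.
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://nameservice1"), conf)) {
            fs.open(new Path("/hbase/example-hfile")).close();
        }
    }
}
```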
  8. Problem: Datanode full GC caused by RegisteredShm; slots are allocated at 3000+ QPS but released at only 1000+ QPS.
  10. Result: a 20% QPS increase on YCSB gets.
  11. Fix: preallocate the Shm. Test setup: 1000 files, 60 threads, seek reads.
  13. Problem: listen queue drops on the SSD cluster cause 3s delays (SYN retransmits):
      15:56:14.506610 IP x.x.x.x.62393 > y.y.y.y.29402: Flags [S], seq 167786998, win 14600, options [mss 1460,sackOK,TS val 1590620938 ecr 0,nop,wscale 7], length 0  <-- timeout on first try
      15:56:17.506172 IP x.x.x.x.62393 > y.y.y.y.29402: Flags [S], seq 167786998, win 14600, options [mss 1460,sackOK,TS val 1590623938 ecr 0,nop,wscale 7], length 0  <-- retry
      15:56:17.506211 IP y.y.y.y.29402 > x.x.x.x.62393: Flags [S.], seq 4109047318, ack 167786999, win 14480, options [mss 1460,sackOK,TS val 1589839920 ecr 1590623938,nop,wscale 7], length 0
      Cause: the kernel's somaxconn is 128, but the default Datanode backlog is only 50.
      Fix: HDFS-9669. After raising the backlog to 128, 3s delays dropped to ~1/10 on the HBase SSD cluster.
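A sketch of the corresponding setting, assuming the fix from HDFS-9669 (which lets the Datanode's data transfer server honor ipc.server.listen.queue.size rather than a hard-coded backlog) is in place; in production this would normally live in the Datanode's hdfs-site.xml:

```java
import org.apache.hadoop.conf.Configuration;

public class DatanodeBacklogTuning {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Raise the accept backlog from the hard-coded 50 to the kernel's
        // net.core.somaxconn (128), so SYNs are not dropped under bursts.
        conf.setInt("ipc.server.listen.queue.size", 128);
        System.out.println(conf.getInt("ipc.server.listen.queue.size", 50));
    }
}
```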
  14. Peer cache bucket adjustment.
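The slide gives no detail; assuming "bucket adjustment" refers to sizing the DFSClient's peer (socket) cache, the relevant client keys would be the following, with illustrative values (the default capacity is 16):

```java
import org.apache.hadoop.conf.Configuration;

public class PeerCacheTuning {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // How many idle sockets to Datanodes the DFSClient keeps cached,
        // and how long an idle entry may live before being closed.
        conf.setInt("dfs.client.socketcache.capacity", 32);
        conf.setLong("dfs.client.socketcache.expiryMsec", 3000L);
    }
}
```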
  15. Connection/socket timeouts of the DFSClient & Datanode: dfs.client.socket-timeout and dfs.datanode.socket.write.timeout (sketched below).
      - Reduce the timeouts to 15s.
      - To avoid pipeline timeouts, upgrade the DFSClient first.
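Both keys take milliseconds; a minimal sketch of the 15s setting (the defaults are 60s for the client read timeout and 8 minutes for the Datanode write timeout):

```java
import org.apache.hadoop.conf.Configuration;

public class TimeoutTuning {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // 15s instead of the defaults, so a stuck pipeline fails fast
        // enough for HBase to retry elsewhere.
        conf.setInt("dfs.client.socket-timeout", 15_000);
        conf.setInt("dfs.datanode.socket.write.timeout", 15_000);
    }
}
```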
  16. Section 03
  17. DeadNodeDetector. Motivation:
      - A 60-second timeout is paid whenever a datanode dies.
      - Dead nodes are not shared: each DFSInputStream keeps only its own local dead nodes.
      - A "dead" node may not actually be dead.
      (Diagram: a DeadNodeDetector in the DFSClient maintains global dead, suspicious, and live node sets shared by all DFSInputStreams, alongside each stream's local dead nodes, and probes the Namenode/DataNodes.)
  18. Node state machine (diagram): states Init, Live, Suspicious, Dead; transitions labeled Open, Close, RPC Success, RPC Failure, Read Success, Read Failure, Removed.
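Reading the diagram's edge labels as probe events gives one plausible encoding of the transitions; this is an inference from the slide, not the talk's actual code:

```java
/** Illustrative sketch of the node state machine; the transition rules
 *  are inferred from the edge labels on the slide's diagram. */
public class DeadNodeStateMachine {

    enum State { INIT, LIVE, SUSPICIOUS, DEAD, REMOVED }
    enum Event { OPEN, CLOSE, RPC_SUCCESS, RPC_FAILURE, READ_SUCCESS, READ_FAILURE }

    static State next(State s, Event e) {
        switch (s) {
            case INIT:       // a node enters the machine when a stream opens it
                return e == Event.OPEN ? State.LIVE : s;
            case LIVE:       // a failed read only raises suspicion
                return e == Event.READ_FAILURE ? State.SUSPICIOUS : s;
            case SUSPICIOUS: // a background probe RPC settles the question
                if (e == Event.RPC_SUCCESS) return State.LIVE;
                if (e == Event.RPC_FAILURE) return State.DEAD;
                return s;
            case DEAD:       // a node that answers again is revived;
                             // otherwise it is removed on close
                if (e == Event.RPC_SUCCESS) return State.LIVE;
                if (e == Event.CLOSE) return State.REMOVED;
                return s;
            default:
                return s;
        }
    }
}
```

The point of the Suspicious state is that a read failure alone never condemns a node; only a failed probe moves it onto the global dead list, which addresses the "'dead' node not actually dead" problem from slide 17.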
  19. Takeaways:
      - Keep data on the local host as much as possible and reduce the overhead of the local read path.
      - Make sure a response is returned to HBase as soon as possible, even a failed one.
      - Minor GC in both HBase and HDFS affects latency; try to ease the HDFS side on the client.
  20. Thanks
