HDFS Erasure Coding Explained with Diagrams
Kai Sasaki
Treasure Data Inc.
Who am I
Kai Sasaki (佐々木 海)
Software Engineer at Treasure Data Inc.

http://www.treasuredata.com
Hadoop, Spark, DL4J
Agenda
• Erasure Coding
• Under the Namespace
• Writing Side
• Reading Side
Erasure Coding
Replication
[Figure: a Block stored in HDFS as several replicated copies]
Capacity Overhead: x3
Redundancy: 2
Erasure Coding
[Figure: the Blocks of one group stored in HDFS under the RS-6-3 policy]
RS-6-3
6 out of 9: any 6 of the 9 Blocks are enough to recover the data
Redundancy: 3
Capacity Overhead: x1.5
BlockGroup
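
The figures quoted for replication and RS-6-3 can be checked with a little arithmetic. The following is a minimal sketch rather than HDFS code: the function names are hypothetical, and it assumes that "Redundancy" means the number of lost blocks that can be tolerated and that "Capacity Overhead" means raw blocks stored per block of data.

```python
from fractions import Fraction


def replication_profile(copies):
    """Return (capacity multiplier, tolerated losses) when each block is stored `copies` times."""
    return Fraction(copies), copies - 1


def reed_solomon_profile(data_blocks, parity_blocks):
    """Return (capacity multiplier, tolerated losses) for a group of data plus parity blocks."""
    total = data_blocks + parity_blocks
    return Fraction(total, data_blocks), parity_blocks


# Compare 3-copy replication with the RS-6-3 layout from the slides.
for name, (capacity, losses) in [
    ("3 copies", replication_profile(3)),
    ("RS-6-3", reed_solomon_profile(6, 3)),
]:
    print(f"{name}: capacity x{float(capacity)}, tolerates {losses} lost blocks")
```

Under these assumptions, RS-6-3 stores 9 blocks for every 6 blocks of data, which is where the x1.5 comes from, while tolerating one more lost block than 3-copy replication.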
Under the Namespace
INode and BlockInfo
[Figure: an INode points to a list of BlockInfo entries (BlockInfo, BlockInfo, …); each BlockInfo covers the Blocks of one BlockGroup]
long BlockId (bits 0 to 64)
index: 4bit | GroupId: 60bit
[Figure: the Blocks in a BlockGroup are numbered index 0, index 1, index 2, …]
Saving memory
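
The BlockId layout can be sketched with plain bit operations. This is an illustration under stated assumptions, not the HDFS implementation: it assumes the 4-bit index sits in the lowest bits of the 64-bit ID, it uses small non-negative IDs for readability, and all function names are hypothetical. The point it tries to show is that every Block ID in a group can be derived from the GroupId plus an index, so only one entry per BlockGroup needs to be kept.

```python
INDEX_BITS = 4                      # per the slide: the index takes 4 bits
INDEX_MASK = (1 << INDEX_BITS) - 1  # 0b1111


def internal_block_id(group_id, index):
    """Combine a group ID (low 4 bits clear) with the index of a Block in the group."""
    if group_id & INDEX_MASK:
        raise ValueError("group ID must leave the low 4 bits clear")
    if not 0 <= index <= INDEX_MASK:
        raise ValueError("index must fit in 4 bits")
    return group_id | index


def group_id_of(block_id):
    """Clear the index bits to get back the ID of the whole group."""
    return block_id & ~INDEX_MASK


def index_of(block_id):
    """Extract the position of the Block inside its group."""
    return block_id & INDEX_MASK


# Derive the IDs of all 9 Blocks of an RS-6-3 group from one (made-up) group ID.
group = 0x1230
ids = [internal_block_id(group, i) for i in range(9)]
print([hex(b) for b in ids])
```

With 4 bits for the index, a group can address at most 16 Blocks, which is enough for the 9 Blocks of RS-6-3.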
Writing Side
[Figure: Data, with its start marked 0, is cut into 64KB pieces that are placed into a BlockGroup]
[Figure: the BlockGroup consists of Data Blocks and Parity Blocks; one row across all of them is a Stripe]
[Figure: the remaining Data fills the BlockGroup Stripe by Stripe]
Saving disk space
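
The striped layout on the writing side can be sketched as follows. This is a simplified illustration, not the HDFS client code: it assumes 6 Data Blocks (as in RS-6-3), distributes 64KB pieces round-robin, and leaves out the computation of the Parity Blocks entirely. The names `stripe`, `CELL_SIZE`, and `DATA_BLOCKS` are hypothetical.

```python
CELL_SIZE = 64 * 1024  # 64KB piece, as on the slide
DATA_BLOCKS = 6        # assumption: the RS-6-3 layout


def stripe(data, cell_size=CELL_SIZE, data_blocks=DATA_BLOCKS):
    """Distribute `data` piece by piece, round-robin, over the Data Blocks."""
    blocks = [bytearray() for _ in range(data_blocks)]
    for cell_no, start in enumerate(range(0, len(data), cell_size)):
        # Piece 0 goes to Block 0, piece 1 to Block 1, ..., piece 6 wraps to Block 0.
        blocks[cell_no % data_blocks] += data[start:start + cell_size]
    return blocks


# Seven full pieces plus 100 trailing bytes: the second Stripe is only partly filled.
payload = bytes(7 * CELL_SIZE + 100)
for index, block in enumerate(stripe(payload)):
    print(f"Data Block {index}: {len(block)} bytes")
```

In this sketch a Stripe is one round of 6 pieces, one per Data Block; a real writer would also compute the Parity Block contents for each Stripe.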
Reading Side
[Figure: a BlockGroup with the positions 200kb and 500kb marked on its Blocks]
Saving reading time
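
To see why a striped layout can shorten reads, it helps to map a byte range onto the Blocks it touches. The sketch below is an assumption-laden illustration, not DFSStripedInputStream itself: it assumes the 200kb and 500kb marks are the start and end of a read, counts 1kb as 1024 bytes, and uses 64KB pieces over 6 Data Blocks. The function names are hypothetical.

```python
CELL_SIZE = 64 * 1024  # 64KB piece
DATA_BLOCKS = 6        # assumption: the RS-6-3 layout


def locate(offset, cell_size=CELL_SIZE, data_blocks=DATA_BLOCKS):
    """Map a logical file offset to (Block index, offset inside that Block)."""
    cell_no, within_cell = divmod(offset, cell_size)
    stripe_no, block_index = divmod(cell_no, data_blocks)
    return block_index, stripe_no * cell_size + within_cell


def blocks_for_range(start, end, cell_size=CELL_SIZE, data_blocks=DATA_BLOCKS):
    """Block indexes touched by the byte range [start, end), in first-touch order."""
    first_cell = start // cell_size
    last_cell = (end - 1) // cell_size
    touched = []
    for cell_no in range(first_cell, last_cell + 1):
        index = cell_no % data_blocks
        if index not in touched:
            touched.append(index)
    return touched


print(locate(200 * 1024))
print(blocks_for_range(200 * 1024, 500 * 1024))
```

Under these assumptions the range spans five different Blocks, so a reader could fetch from them in parallel instead of streaming 300kb from a single Block.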
Summary
• Namespace -> Saving memory: BlockInfoStriped, BlockIdManager
• Writing Side -> Saving disk space: INodeFile
• Reading Side -> Saving reading time: DFSStripedInputStream
Thank you
