Document: Hadoop Baseline Selection
Implementation (recommended by Facebook and Cloudera)
Notes:
HADOOP-1700 was committed on 2008-07-24 and has since gone through two years of bug fixes.
The design document for HDFS-265 was submitted on 2010-05-21, and the work was merged into 0.21.0.
Since 0.21.0 is an unstable release that has not been tested in real use, introduces a large number of incompatible changes, and does not meet HBase's requirements any better than the other candidates, we do not choose Hadoop 0.21.0 for now.
We chose cdh3b2 for functional testing, focusing on the state of its append/sync implementation; the specific test cases are listed in the appendix. Based on those test cases we conclude that the HADOOP-1700-based append/sync implementation meets our requirements.
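As an illustration of what the append/sync tests exercise, here is a minimal sketch against the 0.20-era FileSystem API (HADOOP-1700 style sync()). It assumes a reachable HDFS configured via the default Configuration; the path and class name are placeholders, and this is not one of the appendix test cases.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendSyncCheck {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();       // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/tmp/append-sync-check"); // placeholder path

        // Write a record and push it to the DataNode pipeline with sync()
        // (renamed hflush() in later branches).
        FSDataOutputStream out = fs.create(file, true);
        out.writeBytes("record-1\n");
        out.sync();

        // A reader opened after sync() should already see the flushed bytes,
        // even though the writer still holds the file open.
        FSDataInputStream in = fs.open(file);
        byte[] buf = new byte[64];
        int n = in.read(buf);
        System.out.println("bytes visible after sync: " + n);
        in.close();

        out.close();
        fs.close();
    }
}

The appendix test cases additionally cover failure scenarios (writer crash, DataNode restart); the sketch above only shows the basic visibility guarantee that sync() is expected to provide.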
Below we compare cdh3b2 and 0.20-append.
In terms of community support, 0.20-append is driven mainly by Facebook, while CDH3b2 is driven by Cloudera; in this respect the two are evenly matched.
Next we compare the functional differences between cdh3b2 and 0.20-append. The two versions are largely alike in their major features and in the patches they carry. The table below lists the patches that are unique to each version.
Stability
  Only in CDH3b2:
  - HDFS-1056: Multi-node RPC deadlocks during block recovery
  - HDFS-1122: Client block verification may result in blocks in DataBlockScanner prematurely
  - HDFS-1197: Blocks are considered "complete" prematurely after commitBlockSynchronization or DN restart
  - HDFS-1186: 0.20: DNs should interrupt writers at start of recovery
  - HDFS-1218: 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization
  - HDFS-1260: 0.20: Block lost when multiple DNs trying to recover it to different genstamps
  - HDFS-127: DFSClient block read failures cause open DFSInputStream to become unusable
  - HDFS-686: NullPointerException is thrown while merging edit log and image
  - HDFS-915: Hung DN stalls write pipeline for far longer than its timeout
  - HADOOP-6269: Missing synchronization for defaultResources in Configuration.addResource
  - HADOOP-6460: Namenode runs out of memory due to memory leak in ipc Server
  - HADOOP-6667: RPC.waitForProxy should retry through NoRouteToHostException
  - HADOOP-6722: NetUtils.connect should check that it hasn't connected a socket to itself
  - HADOOP-6723: Unchecked exceptions thrown in IPC Connection orphan clients
  - HADOOP-6724: IPC doesn't properly handle IOEs thrown by socket factory
  - HADOOP-6762: Exception while doing RPC I/O closes channel
  - HADOOP-2366: Space in the value for dfs.data.dir can cause great problems
  - HADOOP-4885: Try to restore failed replicas of Name Node storage (at checkpoint time)
  Only in 0.20-append:
  - HDFS-1258: Clearing namespace quota on "/" corrupts fs image
  - HDFS-955: New implementation of saveNamespace() to avoid loss of edits

Functionality
  Only in CDH3b2:
  - HDFS-142: In 0.20, move blocks being written into a blocksBeingWritten directory
  - HDFS-611: Heartbeat times from Datanodes increase when there are plenty of blocks to delete
  - HDFS-895: Allow hflush/sync to occur in parallel with new writes to the file
  - HDFS-877: Client-driven block verification not functioning
  - HDFS-894: DatanodeID.ipcPort is not updated when existing node re-registers
  Only in 0.20-append:
  - (none)

Performance
  Only in CDH3b2:
  - HADOOP-4655: FileSystem.CACHE should be ref-counted
  Only in 0.20-append:
  - HDFS-1041: DFSClient.getFileChecksum(..) should retry if connection to ...
  - HDFS-927: DFSInputStream retries too many times for new block locations

Misc
  Only in CDH3b2:
  - HDFS-1161: Make DN minimum valid volumes configurable
  - HDFS-1209: Add conf dfs.client.block.recovery.retries to configure number of block recovery attempts (see the configuration sketch after this table)
  - HDFS-455: Make NN and DN handle in an intuitive way comma-separated configuration strings
  - HDFS-528: Add ability for safemode to wait for a minimum number of live datanodes
  - HADOOP-1849: IPC server max queue size should be configurable
  - HADOOP-4675: Current Ganglia metrics implementation is incompatible with Ganglia 3.1
  - HADOOP-4829: Allow FileSystem shutdown hook to be disabled
  - HADOOP-5257: Export namenode/datanode functionality through a pluggable RPC layer
  Only in 0.20-append:
  - HADOOP-6637: Benchmark for establishing RPC session (shv)
  - HADOOP-6760: WebServer shouldn't increase port number in case of negative ...
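For concreteness, this is a minimal sketch of how one of the CDH3b2-only knobs listed above might be used from client code. The key dfs.client.block.recovery.retries comes from the HDFS-1209 entry; the value 5 and the class name are purely illustrative, not recommendations.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class ClientRecoveryRetriesExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Number of block recovery attempts the client makes (HDFS-1209, CDH3b2 only);
        // 5 is an illustrative value.
        conf.setInt("dfs.client.block.recovery.retries", 5);

        FileSystem fs = FileSystem.get(conf);
        System.out.println(fs.getUri() + " dfs.client.block.recovery.retries="
                + conf.getInt("dfs.client.block.recovery.retries", -1));
        fs.close();
    }
}

The same property could equally be set in hdfs-site.xml; setting it in code here just keeps the sketch self-contained.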