These slides introduce the concept of a distributed filesystem, the CAP theorem, and the consistent hashing algorithm.
They then go deep into the internals of the Google File System (GFS).
Finally, the slides introduce the Taobao File System (TFS), MogileFS, and MooseFS.
54. CAP Theorem (Brewer's Conjecture)
2000 Prof. Eric Brewer, PODC Conference Keynote
2002 Seth Gilbert and Nancy Lynch, ACM SIGACT News 33(2)
“Of three properties of shared-data systems -
data Consistency, system Availability and
tolerance to network Partitions - only two can
be achieved at any given moment in time.”
http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
http://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf
Monday, September 16, 2013
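59. • Consistency
  • Any read operation always returns the result of previously completed write operations
• Availability
  • Every operation always returns within a bounded time
• Partition tolerance (Tolerance of network Partition)
  • When a network partition occurs, the system can still satisfy consistency and availability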
CAP: pick two (C, A, P)
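The "pick two" trade-off above can be illustrated with a toy model. This is only a minimal sketch under stated assumptions: two in-memory replicas of a single value, a manually toggled `partitioned` flag, and hypothetical mode labels `"CP"` and `"AP"`. None of these names come from the slides or from any real system.

```python
class Replica:
    """One copy of a single stored value."""
    def __init__(self):
        self.value = None


class TwoNodeStore:
    """Toy store with two replicas. `mode` is "CP" or "AP" (hypothetical labels)."""

    def __init__(self, mode):
        self.mode = mode
        self.nodes = [Replica(), Replica()]
        self.partitioned = False  # set to True to simulate a network partition

    def write(self, node_id, value):
        if self.partitioned:
            if self.mode == "CP":
                # Refuse the request so replicas cannot diverge:
                # consistency is kept, availability is given up.
                raise RuntimeError("unavailable during partition")
            # AP: answer the request locally; the other replica is unreachable,
            # so the two copies may now disagree.
            self.nodes[node_id].value = value
            return
        # No partition: replicate to every node.
        for node in self.nodes:
            node.value = value

    def read(self, node_id):
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable during partition")
        return self.nodes[node_id].value


def demo(mode):
    """Write before and during a partition, then read from both replicas."""
    store = TwoNodeStore(mode)
    store.write(0, "v1")
    store.partitioned = True
    try:
        store.write(0, "v2")
        return store.read(0), store.read(1)
    except RuntimeError:
        return "unavailable"
```

Under these assumptions, `demo("AP")` keeps answering but the replicas return different values, while `demo("CP")` refuses the request during the partition.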
61. Partition Tolerance - Availability
“The network will be allowed to lose arbitrarily many messages sent
from one node to another” [...]
“For a distributed system to be continuously available, every request
received by a non-failing node in the system must result in a response”
- Gilbert and Lynch, SIGACT 2002
http://codahale.com/you-cant-sacrifice-partition-tolerance/
http://pl.atyp.us/wordpress/?p=2521
64. Partition Tolerance - Availability
HIGH LATENCY ≈ NETWORK PARTITION
http://dbmsmusings.blogspot.com/2010/04/problems-with-cap-and-yahoos-little.html
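One way to read "high latency ≈ network partition": a node only observes whether a reply arrived before its timeout, so a lost message and a very slow reply look the same to it. The sketch below is illustrative only; the function name `classify_peer`, the return labels, and the 1-second default timeout are assumptions, not something taken from the slides.

```python
def classify_peer(reply_delay_s, timeout_s=1.0):
    """Decide how a node would treat a peer after sending it one request.

    reply_delay_s: seconds until the reply arrived, or None if the message was lost.
    timeout_s: how long the node is willing to wait (assumed value).
    """
    if reply_delay_s is None or reply_delay_s > timeout_s:
        # The node cannot tell "lost" from "late": both exceed the timeout.
        return "treat as partitioned"
    return "reachable"
```

For example, `classify_peer(None)` and `classify_peer(5.0)` give the same answer, while `classify_peer(0.1)` reports the peer as reachable.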
175. GFS Relaxed Consistency Model
                      Write                     Record Append
Serial success        defined                   defined interspersed with inconsistent
Concurrent successes  consistent but undefined  defined interspersed with inconsistent
Failure               inconsistent              inconsistent
177. GFS Relaxed Consistency Model
• “Consistent” = all replicas hold the same value
• “Defined” = “Consistent” + clients can see the mutation's data in its entirety
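The two definitions above can be turned into a small classifier. This is a simplified sketch, not GFS code: `region_state`, its arguments, and the idea of comparing a region against the full data each writer intended are assumptions made for illustration.

```python
def region_state(replicas, mutations):
    """Classify one file region using the two definitions above.

    replicas:  list of bytes, the region's content on each replica.
    mutations: list of bytes, the full data each writer intended for this region.
    """
    # "Consistent": every replica holds the same bytes.
    if len(set(replicas)) != 1:
        return "inconsistent"
    # "Defined": consistent, and the region shows one mutation in its entirety.
    if replicas[0] in mutations:
        return "defined"
    # Same bytes everywhere, but mingled fragments from several writers.
    return "consistent but undefined"


# Serial write: one writer, all replicas updated.
serial = region_state([b"AAAA"] * 3, [b"AAAA"])

# Concurrent writes: replicas agree, but the region mixes both writers' data.
concurrent = region_state([b"AABB"] * 3, [b"AAAA", b"BBBB"])

# Failed write: one replica was only partly updated.
failed = region_state([b"AAAA", b"AA\x00\x00", b"AAAA"], [b"AAAA"])
```

Under these assumptions the three cases evaluate to "defined", "consistent but undefined", and "inconsistent", matching the Write column of the table.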