DISTRIBUTED OPERATING
SYSTEM
TOPIC : FILE REPLICATION
Prepared By:
Dhaval Chodavadiya
Index
Introduction
Difference between Replication and Caching
Advantages of File Replication
Replication Transparency
Multi Copy Update Protocols
Conclusion
Reference
3/23/2019 10:35 AM Dhaval Chodavadiya 2
Introduction
• File Replication:
• High availability is a desirable feature of a good distributed
file system, and file replication is the primary mechanism for
improving file availability.
• Replication is the key strategy for improving reliability, fault
tolerance and availability.
• Therefore, duplicating files on multiple machines improves
availability and performance.
Introduction (Cont…)
Replicated File:
A replicated file is a file that has multiple copies, with each
copy located on a separate file server.
Each copy in the set of copies that comprises a replicated file
is referred to as a replica of the replicated file.
Replication and Caching
Replication is often confused with caching, probably because
they both deal with multiple copies of data. The two concepts
have the following basic differences:
1. A replica is associated with a server, whereas a cached copy is
associated with a client.
2. The existence of a cached copy depends primarily on the
locality in file access patterns, whereas the existence of a
replica normally depends on availability and performance
requirements.
3. Satyanarayanan [1992] distinguishes a replicated copy from a
cached copy by calling them first-class replicas and second-class
replicas, respectively.
Advantages of file replication
1. Increased availability: one of the most important advantages of
replication is that it masks and tolerates failures in the network
gracefully. In particular, the system remains operational and
available to the users despite failures.
2. Improved response time
3. Increased reliability
4. Reduced network traffic
Replication Transparency
Naming Of Replicas:
It’s the responsibility of the system to name the various copies of
a resource and to map a user-supplied name of the resource to an
appropriate replica of the resource.
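The name-to-replica mapping described above can be illustrated with a minimal sketch. The `ReplicaDirectory` class, the server names, and the "first reachable server" selection rule are all hypothetical choices made for illustration; a real system may pick a replica by load or network distance instead.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ReplicaDirectory:
    """Maps a user-supplied file name to the servers holding its replicas."""
    replicas: dict[str, list[str]] = field(default_factory=dict)
    down: set[str] = field(default_factory=set)  # servers currently unreachable

    def register(self, name: str, server: str) -> None:
        """Record that `server` holds a replica of `name`."""
        self.replicas.setdefault(name, []).append(server)

    def resolve(self, name: str) -> str:
        """Return the first reachable server that holds a replica of `name`."""
        for server in self.replicas.get(name, []):
            if server not in self.down:
                return server
        raise LookupError(f"no reachable replica of {name!r}")


directory = ReplicaDirectory()
directory.register("/docs/report.txt", "server-a")
directory.register("/docs/report.txt", "server-b")
directory.down.add("server-a")                 # simulate a failed server
print(directory.resolve("/docs/report.txt"))   # server-b
```

The user only ever supplies the file name; which server answers is decided inside `resolve`, which is the sense in which replication stays transparent.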
Replication Control:
Replication control determines how many copies of the resource
should be created, where each copy should be placed, and when
a copy should be created or deleted. All of this should be handled
automatically by the system in a user-transparent manner.
Drawback: The main problem of file replication is consistency. That is, when
one replica changes, how do the other copies reflect that change?
Multi Copy Update Protocols
• Maintaining consistency among copies when a replicated file
is updated is the major issue for a file system that supports
replication of files. Some commonly used approaches to
handle this issue are described below:
1. Read-Only Replication
2. Read-Any-Write-All Protocol
3. Available-Copies Protocol
4. Primary-Copy Protocol
5. Quorum-Based Protocol
Multi Copy Update Protocols (Contd…)
1. Read-Only Replication:
• This approach allows the replication of only immutable files.
Because immutable files are used only in read-only mode, their
copies can never become inconsistent; mutable files cannot be
replicated under this approach.
• Its main limitation is this restriction: any file that may
change cannot be replicated.
Multi Copy Update Protocols (Contd…)
2. Read-Any-Write-All Protocol:
• This approach allows the replication of mutable files. In this
method, a read operation on a replicated file is performed by
reading any copy of the file, and a write operation by writing to
all copies of the file.
• Some form of locking has to be used to carry out a write operation.
That is, before updating any copy, all copies are locked, then
they are updated, and finally the locks are released to
complete the write operation. The protocol is used for
implementing UNIX-like semantics.
• The main problem with this approach is that a write operation
cannot be performed if any of the servers having a copy of the
replicated file is down at the time of the write operation.
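The lock, update, release sequence can be sketched as follows. This is a single-process illustration, not a networked implementation: the `Replica` class, its `up` flag, and the fixed lock ordering by name are assumptions introduced here.

```python
import random
import threading


class Replica:
    """One copy of the replicated file (hypothetical in-memory stand-in)."""

    def __init__(self, name):
        self.name = name
        self.data = ""
        self.up = True
        self.lock = threading.Lock()


def read_any(replicas):
    """Read from any one reachable copy; all copies hold the same data."""
    return random.choice([r for r in replicas if r.up]).data


def write_all(replicas, value):
    """Lock every copy, update them all, then release the locks."""
    if not all(r.up for r in replicas):
        raise RuntimeError("write refused: a replica server is down")
    # Assumption: acquiring locks in a fixed order avoids deadlock
    # between two concurrent writers.
    ordered = sorted(replicas, key=lambda r: r.name)
    for r in ordered:
        r.lock.acquire()
    try:
        for r in ordered:
            r.data = value
    finally:
        for r in ordered:
            r.lock.release()


copies = [Replica("a"), Replica("b"), Replica("c")]
write_all(copies, "v1")
print(read_any(copies))  # v1

copies[1].up = False     # one server fails
try:
    write_all(copies, "v2")
except RuntimeError as err:
    print(err)           # the write is refused while any copy is down
```

The second call shows the weakness named above: a single failed server blocks every write.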
Multi Copy Update Protocols (Contd…)
3. Available-Copies Protocol:
• This approach allows a write operation to be carried out
even when some of the servers having a copy of the
replicated file are down.
• In this method, a read operation is performed by reading
any available copy, but a write operation is performed by
writing to all available copies.
• When a server recovers after a failure, it brings itself up to
date by copying from the other servers before accepting any
user request.
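A minimal sketch of these three rules, under the same in-memory assumptions as before: the `Replica` class and the function names are hypothetical, and a server failure is simulated with an `up` flag.

```python
class Replica:
    """One copy of the replicated file (hypothetical in-memory stand-in)."""

    def __init__(self, name):
        self.name = name
        self.data = ""
        self.up = True

    def recover(self, peers):
        """Copy current contents from a live peer before serving requests again."""
        source = next((p for p in peers if p.up and p is not self), None)
        if source is None:
            raise RuntimeError("no live peer to copy from")
        self.data = source.data
        self.up = True


def read_available(replicas):
    """Read from any copy whose server is up (here: the first one found)."""
    live = [r for r in replicas if r.up]
    if not live:
        raise RuntimeError("no available copy")
    return live[0].data


def write_available(replicas, value):
    """Write to every copy whose server is up; return how many were updated."""
    live = [r for r in replicas if r.up]
    if not live:
        raise RuntimeError("no available copy")
    for r in live:
        r.data = value
    return len(live)


copies = [Replica("a"), Replica("b"), Replica("c")]
copies[2].up = False                   # server c fails
print(write_available(copies, "v1"))   # 2
copies[2].recover(copies)              # c catches up before serving requests
print(copies[2].data)                  # v1
```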
Multi Copy Update Protocols (Contd…)
4. Primary-Copy Protocol:
• Another simple method to solve the multi-copy update
problem is the primary-copy protocol.
• In this protocol, for each replicated file, one copy is designated
as the primary copy and all others are secondary copies.
• A read operation can be performed using any copy, primary or
secondary.
• A write operation is performed only on the primary copy.
• Each server having a secondary copy updates it either by receiving
notification of changes from the server having the primary
copy or by requesting the updated copy from it.
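The two ways a secondary can catch up (being notified by the primary, or requesting the current copy) can be sketched as below. The class names, the `notify` flag, and the version counter are hypothetical; the sketch also assumes that writes are applied at the primary first.

```python
class Copy:
    """One copy of the file, tagged with a version counter (an assumption)."""

    def __init__(self):
        self.data = ""
        self.version = 0


class PrimaryCopyFile:
    def __init__(self, n_secondaries):
        self.primary = Copy()
        self.secondaries = [Copy() for _ in range(n_secondaries)]

    def write(self, value, notify=True):
        """Apply the write at the primary; optionally push it to secondaries."""
        self.primary.data = value
        self.primary.version += 1
        if notify:
            for secondary in self.secondaries:
                self.pull(secondary)

    def pull(self, secondary):
        """A secondary requests the current copy from the primary."""
        secondary.data = self.primary.data
        secondary.version = self.primary.version

    def read(self, index=None):
        """Read the primary (index=None) or the secondary at `index`."""
        copy = self.primary if index is None else self.secondaries[index]
        return copy.data


f = PrimaryCopyFile(n_secondaries=2)
f.write("v1")                  # primary notifies both secondaries
print(f.read(0))               # v1
f.write("v2", notify=False)    # secondaries are not told yet
print(f.read(0))               # v1 (stale until the secondary pulls)
f.pull(f.secondaries[0])
print(f.read(0))               # v2
```

The middle read shows the trade-off of lazy propagation: a secondary may serve stale data until it is notified or pulls.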
Multi Copy Update Protocols (Contd…)
• Drawbacks:
• The read-any-write-all and available-copies protocols cannot
handle the network partition problem, in which the copies of
a replicated file are partitioned into two or more active
groups.
• Moreover, the primary-copy protocol is too restrictive in
the sense that a write operation cannot be performed if the
server having the primary copy is down.
5. Quorum-Based Protocol:
• This protocol is capable of handling the network partition
problem and can increase the availability of write
operations at the expense of read operations.
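The slide does not spell out the quorum rule, so the sketch below assumes the usual voting formulation: with n copies, a read needs r copies and a write needs w copies, where r + w > n and 2w > n, and each copy carries a version number. All class and parameter names are hypothetical.

```python
class Copy:
    """One copy of the file with a version number (assumed bookkeeping)."""

    def __init__(self):
        self.data = ""
        self.version = 0
        self.up = True  # False simulates an unreachable server


class QuorumFile:
    def __init__(self, n, r, w):
        # Assumed constraints: every read quorum overlaps every write quorum,
        # and any two write quorums overlap each other.
        if r + w <= n or 2 * w <= n:
            raise ValueError("need r + w > n and 2w > n")
        self.copies = [Copy() for _ in range(n)]
        self.r, self.w = r, w

    def _quorum(self, size):
        """Return `size` reachable copies, or fail if too few are reachable."""
        live = [c for c in self.copies if c.up]
        if len(live) < size:
            raise RuntimeError("quorum not reachable")
        return live[:size]

    def read(self):
        """Collect r copies and return the data with the highest version."""
        return max(self._quorum(self.r), key=lambda c: c.version).data

    def write(self, value):
        """Collect w copies and install the value under a new highest version."""
        quorum = self._quorum(self.w)
        new_version = max(c.version for c in quorum) + 1
        for c in quorum:
            c.data, c.version = value, new_version


f = QuorumFile(n=5, r=2, w=4)
f.copies[4].up = False
f.write("v1")                  # reaches copies 0..3
f.copies[4].up = True
for i in (0, 1, 2):
    f.copies[i].up = False     # partition: only copies 3 and 4 are reachable
print(f.read())                # v1 (copy 3 holds the newest version)
try:
    f.write("v2")
except RuntimeError as err:
    print(err)                 # quorum not reachable
```

In the partitioned state the two reachable copies can still serve reads, but they cannot form a write quorum, so the two sides of a partition can never both accept writes. Choosing a smaller w makes writes easier to complete but forces a larger r, which is the read/write availability trade-off mentioned above.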
Conclusion
• A replicated file is a file that has multiple copies, with each
copy located on a separate file server.
• Each copy in the set of copies that comprises a replicated file
is referred to as a replica.
• Maintaining consistency among copies when a replicated file
is updated is the major issue for a file system that supports
replication of files.
• Some of the commonly used approaches to handle this issue are
read-only replication, the read-any-write-all protocol, the
available-copies protocol, the primary-copy protocol and the
quorum-based protocol.