This document discusses key aspects of distributed file systems including file caching schemes, file replication, and fault tolerance. It describes different cache locations, modification propagation techniques, and methods for replica creation. File caching schemes aim to reduce network traffic by retaining recently accessed files in memory. File replication provides increased reliability and availability through independent backups. Distributed file systems must also choose between stateful and stateless service, which determines whether the server keeps information about file access and operations.
Overview of distributed file systems including file caching, replication, and fault tolerance.
Introduction to file caching schemes focusing on key decisions such as location and modification propagation.
Explains the purpose of file caching: reducing network traffic by retaining accessed files. Discusses key decisions in caching, including data granularity and cache size.
Identifies three possible cache locations: server's main memory, client's disk, and client's main memory for cached data storage.
Describes two write strategies: write-through for immediate updates and delayed-write for batched updates.
Highlights the advantages of file replication: increased reliability, performance improvements, and higher availability through distributed workloads.
Discusses replication transparency as a key issue: replication is carried out behind the scenes, so users are not aware of the multiple copies.
Explains three replication methods: explicit, lazy, and group file replication for managing file copies.
Distinguishes between stateful and stateless services, detailing their mechanisms for maintaining file operation information.
Introduction
Retain recently accessed files in main memory.
Repeated accesses to the same information can then be handled locally.
This reduces network traffic.
Key decisions
In implementing a file caching scheme, one has to make several key decisions:
Granularity of cached data
Cache size
Replacement policy
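The three decisions above can be made concrete with a small sketch. It assumes whole-file granularity, a cache size counted in files, and a least-recently-used (LRU) replacement policy. These choices, along with the names `LRUFileCache` and `fetch_from_server`, are illustrative assumptions and do not come from the slides.

```python
from collections import OrderedDict


class LRUFileCache:
    """Client-side cache holding whole files, evicting the least recently used."""

    def __init__(self, capacity, fetch_from_server):
        self.capacity = capacity                    # cache size, in number of files
        self.fetch_from_server = fetch_from_server  # hypothetical remote read
        self.entries = OrderedDict()                # file name -> contents
        self.server_reads = 0                       # counts simulated network trips

    def read(self, name):
        if name in self.entries:
            self.entries.move_to_end(name)          # mark as most recently used
            return self.entries[name]
        data = self.fetch_from_server(name)         # cache miss: go to the server
        self.server_reads += 1
        self.entries[name] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)        # evict least recently used
        return data


server_disk = {"a.txt": b"alpha", "b.txt": b"beta", "c.txt": b"gamma"}
cache = LRUFileCache(capacity=2, fetch_from_server=server_disk.__getitem__)
for name in ["a.txt", "a.txt", "b.txt", "c.txt", "a.txt"]:
    cache.read(name)
```

The second read of `a.txt` is served locally. The final read misses because `a.txt` was evicted when `c.txt` arrived, which shows how cache size and replacement policy interact.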
Cache location
Cache location refers to the place where the cached data is stored.
Assume the original location of a file is on its server’s disk.
Three possible cache locations:
1) Server’s main memory
2) Client’s disk
3) Client’s main memory
Modification Propagation
Write-through:
When a user modifies a cache entry, the change is immediately written to the server.
Delayed-write:
To reduce network traffic, updates are written to the server periodically or batched together.
A single batched write is more efficient than multiple separate writes.
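The difference between the two propagation schemes can be seen by counting the messages that reach the server. The sketch below is only an illustration. `Server`, `WriteThroughCache`, `DelayedWriteCache`, and `flush` are hypothetical names, and the network is simulated by a method call.

```python
class Server:
    """Stand-in for the file server; counts how many write messages arrive."""

    def __init__(self):
        self.files = {}
        self.write_messages = 0

    def write_batch(self, updates):
        self.files.update(updates)
        self.write_messages += 1            # one network message per batch


class WriteThroughCache:
    def __init__(self, server):
        self.server = server
        self.entries = {}

    def write(self, name, data):
        self.entries[name] = data
        self.server.write_batch({name: data})   # propagate immediately


class DelayedWriteCache:
    def __init__(self, server):
        self.server = server
        self.entries = {}
        self.dirty = {}                     # updates not yet sent to the server

    def write(self, name, data):
        self.entries[name] = data
        self.dirty[name] = data             # a later write replaces an earlier one

    def flush(self):
        """Send all pending updates in one message; meant to be called periodically."""
        if self.dirty:
            self.server.write_batch(self.dirty)
            self.dirty = {}


wt_server, dw_server = Server(), Server()
wt, dw = WriteThroughCache(wt_server), DelayedWriteCache(dw_server)
for i in range(3):
    wt.write("log.txt", f"v{i}")
    dw.write("log.txt", f"v{i}")
dw.flush()
```

Both servers end up with the same contents, but the delayed-write server received one message instead of three. The cost is that the server copy is stale until `flush` runs, and updates still in `dirty` are lost if the client crashes first.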
File replication
Provides increased reliability by keeping independent backups of each file.
Enables file access to continue even if one file server is down.
Advantages of replication
Improved performance
By distributing the workload among multiple servers
Increased reliability
A copy can be taken from another server if one crashes
Availability
By keeping multiple copies of each file on separate servers
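The reliability point above, taking a copy from another server when one crashes, might look like the following on the client side. This is a minimal sketch under simple assumptions. `FileServer` and `read_with_failover` are hypothetical names, and a crash is simulated with a flag.

```python
class FileServer:
    def __init__(self, name, files):
        self.name = name
        self.files = files
        self.up = True

    def read(self, path):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return self.files[path]


def read_with_failover(replicas, path):
    """Try each replica in turn and return the first successful read."""
    for server in replicas:
        try:
            return server.read(path)
        except ConnectionError:
            continue                        # this server is down; try the next copy
    raise ConnectionError(f"no replica of {path} is reachable")


replicas = [FileServer(f"s{i}", {"/doc": "contents"}) for i in range(3)]
replicas[0].up = False                      # simulate a crash of the first server
data = read_with_failover(replicas, "/doc")
```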
Key issue in replication
Replication transparency
The existence of multiple copies should be hidden from the user.
The entire process is carried out behind the programmer’s back.
Replica creation methods
Three methods
a) Explicit file replication
b) Lazy file replication
c) File replication using a group
Explicit file replication
The entire process is controlled by the programmer.
A process creates the file on one server and then makes multiple copies on other servers.
The directory server maintains a list of all replicas.
When the file is requested, any one of these copies can be opened.
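A rough sketch of explicit replication follows, with dictionaries standing in for file servers. `DirectoryServer`, `create_replicated`, and `open_any` are hypothetical names chosen for illustration. The point is that the calling program itself decides where every copy goes.

```python
import random


class DirectoryServer:
    """Maps a file name to the list of servers known to hold a copy."""

    def __init__(self):
        self.replicas = {}

    def register(self, name, server_id):
        self.replicas.setdefault(name, []).append(server_id)

    def locate(self, name):
        return list(self.replicas[name])


def create_replicated(name, data, servers, directory, targets):
    """Explicit replication: the caller names every server that gets a copy."""
    for server_id in targets:
        servers[server_id][name] = data     # the program itself writes each copy
        directory.register(name, server_id)


def open_any(name, servers, directory):
    server_id = random.choice(directory.locate(name))   # any copy will do
    return servers[server_id][name]


servers = {"s1": {}, "s2": {}, "s3": {}}
directory = DirectoryServer()
create_replicated("report.txt", "q3 numbers", servers, directory, ["s1", "s3"])
contents = open_any("report.txt", servers, directory)
```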
Lazy file replication
Only one copy is created on a server, and that server later makes replicas on other servers by itself.
The system can track all replicas and retrieve one copy as required.
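Lazy replication can be sketched as a primary server with a queue of copies still to be made. The class `LazyReplicatingServer`, the `replicate_step` method, and the use of a simple queue are assumptions made for this example, not details from the slides.

```python
from collections import deque


class LazyReplicatingServer:
    """Primary that accepts a file, then copies it to peers in the background."""

    def __init__(self, peers):
        self.files = {}
        self.peers = peers                  # dicts standing in for other servers
        self.pending = deque()              # (peer index, file name) still to copy
        self.locations = {}                 # file name -> servers holding a copy

    def create(self, name, data):
        self.files[name] = data             # the client's request ends here
        self.locations[name] = ["primary"]
        for i in range(len(self.peers)):
            self.pending.append((i, name))

    def replicate_step(self):
        """Copy one pending file to one peer; returns False when nothing is left."""
        if not self.pending:
            return False
        i, name = self.pending.popleft()
        self.peers[i][name] = self.files[name]
        self.locations[name].append(f"peer{i}")
        return True


primary = LazyReplicatingServer(peers=[{}, {}])
primary.create("notes.txt", "draft")
copies_right_after_create = len(primary.locations["notes.txt"])
while primary.replicate_step():
    pass
```

Right after `create` returns there is only one copy. The others appear once the server works through its queue, and `locations` is how the system tracks where they are.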
File replication using a group
A system call is sent to all servers.
Replicas are created when the original is made.
All copies are made at the same time.
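In contrast to the lazy scheme, group replication delivers the create call to every server at once. The sketch below uses a plain loop as a stand-in for a group (multicast) send. `ServerGroup` is a hypothetical name.

```python
class ServerGroup:
    """A group of servers; a call sent to the group reaches every member."""

    def __init__(self, members):
        self.members = members              # dicts standing in for file servers

    def create(self, name, data):
        # Stand-in for a group send: every member applies the same call,
        # so all replicas exist as soon as the original does.
        for store in self.members:
            store[name] = data


group = ServerGroup([{}, {}, {}])
group.create("config.ini", "mode=fast")
```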
Fault Tolerance
Stateful
– Server maintains information about a file opened by a client (e.g., file pointer, mode)
– Mechanism: on open, the server provides a “handle” to the client to use on subsequent operations.
Stateless
– Server maintains no information about client access to files
– Mechanism: each client operation must provide context information for that operation
Stateful service
Information about file operations is kept on the server for the whole file session.
A communication channel is established between the client and the server when the client explicitly requests that the file be opened.
A number (identifier) defines the communication channel; this identifier is then used to perform file operations.
To serve its clients, the server copies data from the storage devices into memory and keeps it there until the file is closed.
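A stateful service along these lines could be sketched as follows. The integer handle plays the role of the channel identifier. `StatefulFileServer` and its method names are hypothetical, and the "disk" is a dictionary.

```python
import itertools


class StatefulFileServer:
    def __init__(self, disk):
        self.disk = disk                          # file name -> bytes on "disk"
        self.sessions = {}                        # handle -> per-client open-file state
        self._next_handle = itertools.count(1)

    def open(self, name, mode="r"):
        handle = next(self._next_handle)
        # The server loads the file into memory and remembers mode and position.
        self.sessions[handle] = {"data": self.disk[name], "mode": mode, "pos": 0}
        return handle

    def read(self, handle, nbytes):
        s = self.sessions[handle]
        chunk = s["data"][s["pos"]:s["pos"] + nbytes]
        s["pos"] += len(chunk)                    # the file pointer lives on the server
        return chunk

    def close(self, handle):
        del self.sessions[handle]                 # state is discarded at close


server = StatefulFileServer({"f.txt": b"hello world"})
h = server.open("f.txt")
first = server.read(h, 5)
second = server.read(h, 6)
server.close(h)
```

The client sends only the handle and a byte count. If the server lost `sessions` in a crash, the handle would become meaningless, which is the fault-tolerance concern with this design.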
Stateless service
The stateless service does not establish a communication channel. There is no need for explicit file opening and closing.
For each file operation, the server automatically opens the file before executing it and closes the file afterwards.
Each request sent to the server must identify the desired file. Likewise, if a read or write operation is requested, the request must contain the position in the file to which the operation refers.
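For comparison, here is a minimal stateless sketch in which every request carries the file name and the offset, and the client tracks its own position. `StatelessFileServer` and `Client` are hypothetical names. The server restart is simulated by constructing a fresh server object over the same "disk".

```python
class StatelessFileServer:
    def __init__(self, disk):
        self.disk = disk                    # file name -> bytes on "disk"

    def read(self, name, offset, nbytes):
        # Each request names the file and the position; nothing is remembered.
        return self.disk[name][offset:offset + nbytes]


class Client:
    """The client, not the server, tracks the current file position."""

    def __init__(self, server, name):
        self.server = server
        self.name = name
        self.pos = 0

    def read(self, nbytes):
        chunk = self.server.read(self.name, self.pos, nbytes)
        self.pos += len(chunk)
        return chunk


disk = {"f.txt": b"hello world"}
client = Client(StatelessFileServer(disk), "f.txt")
first = client.read(5)
client.server = StatelessFileServer(disk)   # simulate a server restart
second = client.read(6)
```

Because the server holds no per-client state, the read after the simulated restart continues from the right position without any recovery step.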