2. A computing system is a collection of processes
operating on data objects.
Persistent data objects should be named and saved
on a nonvolatile storage device.
Named data objects are files.
A file system is a major component of an OS.
A Distributed File System (DFS) is an
implementation of a file system.
3. Important concepts in distributed system
design
• DFSs employ many aspects of the notion of
transparency.
• The directory service in DFS is a key
component in all distributed systems.
• Performance and availability requirements call for the
use of caching and replication.
• Access control and protection for DFSs open
many problems in distributed system security.
4. Characteristics of a DFS:
dispersion and multiplicity of users and files.
A transparent DFS should exhibit the following properties:
• Dispersed clients
• Dispersed files
• Multiplicity of users
• Multiplicity of files
5. DFS DESIGN AND IMPLEMENTATION
Basic concepts of files and file systems:
A file consists of three logical components:
• File name
• File attributes
• Data units
File accesses are generally in one of three modes:
sequential access, direct access, and indexed sequential
access.
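The three access modes can be illustrated with a minimal sketch on an in-memory file. The fixed record size of 8 bytes, the sample data, and the helper names are assumptions made for illustration only.

```python
import io

RECORD_SIZE = 8  # assumed fixed record length, for illustration only

def read_sequential(f):
    """Sequential access: read records one after another from the start."""
    f.seek(0)
    records = []
    while True:
        chunk = f.read(RECORD_SIZE)
        if not chunk:
            break
        records.append(chunk)
    return records

def read_direct(f, n):
    """Direct access: jump straight to record n without reading earlier ones."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE)

def read_indexed(f, index, key):
    """Indexed sequential access: look the key up in an index, then read directly."""
    return read_direct(f, index[key])

# Hypothetical file of three 8-byte records and an index from key to record number.
data = io.BytesIO(b"record00record01record02")
index = {"first": 0, "second": 1, "third": 2}
```

Here read_direct(data, 2) and read_indexed(data, index, "third") fetch the same record; the only difference is how the record number is obtained.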
6. A file system consists of four major components:
I. Directory service
II. Authorization service
III. File service
IV. System services
Directory service: name resolution, addition and
deletion of files
Authorization service: capability and/or access
control list
File service:
• Transaction: concurrency and replication
management
• Basic: read/write files and get/set
attributes
System services: device, cache, and block
management
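As a rough illustration of the two mechanisms named for the authorization service, the sketch below checks a request against an access control list and, alternatively, against a capability. The users, file names, and function names are hypothetical, and protection of capabilities against forgery is left out.

```python
# Access control list: kept with the file, lists what each user may do.
acl = {"report": {"chow": {"read", "write"}, "johnson": {"read"}}}

def acl_allows(user, filename, operation):
    """Grant the request only if the file's list names this user and operation."""
    return operation in acl.get(filename, {}).get(user, set())

# Capability: a ticket held by the client that names a file and the permitted
# operations. A real system must make it unforgeable; this sketch does not.
capability = ("report", frozenset({"read"}))

def capability_allows(cap, filename, operation):
    """Grant the request only if the presented capability covers it."""
    cap_file, rights = cap
    return cap_file == filename and operation in rights
```

With an access control list the server looks up who is asking; with a capability it only inspects what the client presents.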
7. The organization of data files can be either flat or
hierarchical.
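Files are named and accessed using a hierarchical pathname.
[Figure: example directory tree with root at the top, the directories chow and johnson beneath it, and the entries report, book, and paper at lower levels]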
Directories are files that contain names and
addresses of other files and subdirectories.
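Name resolution over such a hierarchy can be sketched as a walk through nested directories. The tree layout below (which entries sit under chow and which under johnson) and the address strings are assumptions for illustration, not taken from the figure.

```python
# A directory maps names to entries; a nested dict stands for a subdirectory,
# and any other value stands in for a file's address.
root = {
    "chow": {"book": "addr-1"},       # assumed placement
    "johnson": {"paper": "addr-2"},   # assumed placement
}

def resolve(tree, pathname):
    """Walk the tree one pathname component at a time and return the entry found."""
    node = tree
    for name in [part for part in pathname.split("/") if part]:
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError(pathname)
        node = node[name]
    return node
```

For example, resolve(root, "/chow/book") follows root, then chow, then book, and returns the stored address.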
8. File access must be regulated to ensure security.
The directory, authorization, and file services are the user
interfaces to a file system.
System services are the file system's interface to the hardware and are
transparent to users.
The major functions of system services include:
• Mapping of logical to physical block addresses
• Interfacing to services at the device level for file
space allocation/deallocation
• Actual read/write file operations
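The first of these functions, mapping logical to physical block addresses, can be sketched as a table lookup. The block size and the contents of the block table are assumed values chosen for illustration.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes

# Hypothetical per-file block table: index = logical block, value = physical block.
block_table = [17, 4, 92]

def locate(offset):
    """Translate a byte offset in the file to (physical block, offset within block)."""
    logical_block, within = divmod(offset, BLOCK_SIZE)
    return block_table[logical_block], within
```

A byte at offset 4106 lies in logical block 1, which this table places in physical block 4, at position 10 within the block.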
9. Services and Servers:
Servers are processes that implement services.
A service may be implemented by a single server or by a number of servers.
A server may also provide multiple services.
Interaction among services in a DFS:
[Figure: clients and servers, with the directory service, authorization service, file service, and system service]
10. File mounting and Server Registration
File mounting constructs a large file system from various file servers and
storage devices.
A mount point is usually a leaf of the directory tree that
contains only an empty subdirectory.
Once files are mounted, they are accessed using the
concatenated logical path names.
File system mounting can be done in three different instances:
o Explicit mounting
o Boot mounting
o Auto mounting
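How a concatenated path name leads to the right server can be sketched with a mount table and a longest-prefix match. The server names, exported directories, and function names are hypothetical; calling mount() by hand corresponds loosely to explicit mounting.

```python
# Hypothetical mount table: local mount point -> (server name, exported directory).
mount_table = {
    "/": ("local", "/"),
    "/home/chow": ("serverA", "/export/chow"),
}

def mount(mount_point, server, exported_dir):
    """Attach a remote directory at a local mount point."""
    mount_table[mount_point.rstrip("/") or "/"] = (server, exported_dir)

def route(path):
    """Return (server, path on that server) using the longest matching mount point."""
    best = "/"
    for point in mount_table:
        if point != "/" and (path == point or path.startswith(point + "/")):
            if len(point) > len(best):
                best = point
    server, exported = mount_table[best]
    rest = path[len(best):].lstrip("/")
    if not rest:
        return server, exported
    return server, exported.rstrip("/") + "/" + rest
```

After mount("/home/johnson", "serverB", "/export/j"), the local name /home/johnson/report is routed to serverB as /export/j/report, while paths outside any mount point stay local.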
11. Stateful and Stateless File Servers
A connection requires the establishment and termination
of a communication session. There is state information
associated with each session.
Examples of state information:
Opened files and their clients
File descriptors and File handles
Current file position pointers
Mounting information
Lock status
Session keys
Cache/Buffer
12. A file server is stateful if it internally maintains some of the
state information and stateless if it maintains none at all.
The implementation of a stateless server must address the following
issues:
Idempotency Requirement
File locking mechanism
Session key management
Cache consistency
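The idempotency requirement can be illustrated with a minimal sketch of a stateless server: every request carries the file handle and the offset, so the server keeps no position pointer and a repeated request has the same effect as a single one. The class and method names are hypothetical.

```python
class StatelessFileServer:
    """Keeps file contents only; no open-file table and no position pointers."""

    def __init__(self):
        self.files = {}  # file handle -> bytearray of contents

    def read(self, handle, offset, count):
        """The client supplies the offset, so the same request returns the same data."""
        return bytes(self.files[handle][offset:offset + count])

    def write(self, handle, offset, data):
        """Writing at an explicit offset leaves the same contents if repeated."""
        buf = self.files.setdefault(handle, bytearray())
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # pad any gap with zero bytes
        buf[offset:offset + len(data)] = data
        return len(data)
```

If a reply is lost and the client resends write("f", 6, b"WORLD"), the file ends up in the same state as after the first attempt. An append-style request without an offset would not have this property.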
13. File Access and Semantics of sharing
File sharing: multiple clients access the same file at the same time.
Sharing may result from either
overlapping or interleaving accesses.
Coherency control: managing access to the replicas to
provide a coherent view of the shared file.
Concurrency control: concurrency is achieved by time
multiplexing of the files; the issues here are how to
prevent one execution sequence from interfering with others
when they are interleaved and how to avoid inconsistent results.
14. In the space domain, read and write accesses to a remote file can be implemented in
one of the following ways:
1. Remote access
2. Cache access
3. Download/upload access
Coherency of replicated data may be interpreted in many different ways:
1. All replicas are identical at all times.
2. Replicas are perceived as identical only at some points in time.
3. Users always read the "most recent" data in the replicas.
4. Write operations are always performed immediately and their results
are propagated in a best-effort fashion.
In the time domain, interleaved read and write operations result in concurrent file accesses:
1. Simple RW
2. Transaction
3. Session
15. Semantics of sharing
Solutions to the coherency and concurrency control problems
depend on the semantics of sharing.
Three popular semantic models:
• Unix semantics
• Transaction semantics
• Session semantics
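As one illustration, session semantics is commonly described as making a client's changes visible to others only when the file is closed, whereas under Unix semantics every write is visible immediately. The sketch below models the session case under that reading; the class name and the dictionary standing in for the file server are hypothetical.

```python
class SessionFile:
    """Client-side session on a shared file under session semantics."""

    def __init__(self, store, name):
        self.store, self.name = store, name
        self.copy = store.get(name, "")  # private working copy taken at open

    def read(self):
        return self.copy

    def write(self, text):
        self.copy = text  # visible only within this session

    def close(self):
        self.store[self.name] = self.copy  # changes are published at close
```

Two clients that open the same file each work on their own copy; a client that opens the file after another has closed it sees the closed session's result.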
16. Version control:
All problems associated with file sharing and replication
disappear if the file is read-only. If every update instead creates
a new read-only file, then to achieve write sharing
clients must know the names of newly created files.
A simple solution is to use the same file name but with a
version number for each revision of the file.
The burden of enforcing file sharing semantics is
separated from the file service into a higher-level service
called version control.
The file with the highest version number is considered to be the current
version.
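The versioning scheme can be sketched as a store in which every write adds a new numbered revision under the same file name and a read returns the highest-numbered one by default. The class and method names are hypothetical.

```python
class VersionedStore:
    """Keeps every revision of a file; earlier revisions are never modified."""

    def __init__(self):
        self.versions = {}  # file name -> list of contents; index + 1 is the version number

    def write(self, name, data):
        """Store data as a new version and return its version number."""
        self.versions.setdefault(name, []).append(data)
        return len(self.versions[name])

    def read(self, name, version=None):
        """Read a given version, or the highest-numbered (current) one by default."""
        history = self.versions[name]
        return history[-1] if version is None else history[version - 1]
```

A client that needs a stable view can keep reading the version number it started with, while other clients pick up the current version by omitting it.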