DISTRIBUTED FILE SYSTEMS
Presented by HariKrishnan S7CSE
A computing system is a collection of processes operating on data objects. Persistent data objects should be named and saved on a nonvolatile storage device. Named data objects are files. A file system is a major component of an operating system. A Distributed File System (DFS) is an implementation of a file system in a distributed environment.
Important concepts in distributed system design
DFSs employ many aspects of the notion of transparency.
The directory service in a DFS is a key component in all distributed systems.
Performance and availability requirements call for the use of caching and replication.
Access control and protection for DFSs raise many problems in distributed system security.
Characteristics of a DFS: dispersion and multiplicity of users and files.
A transparent DFS should exhibit the following properties:
Multiplicity of users
Multiplicity of files
DFS DESIGN AND IMPLEMENTATION
Basic concepts of files and file systems
A file consists of three logical components:
File attributes
Data units
File name
File access is generally in one of three modes:
Sequential access
Direct access
Indexed sequential access
A file system consists of four major components:
Directory service
Authorization service
File service
System services
The organization of data files can be either flat or hierarchical. Files are named and accessed using a hierarchical pathname.
[Figure: directory tree with nodes root, chow, johnson, report, book, paper]
Directories are files that contain the names and addresses of other files and subdirectories.
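To make hierarchical naming concrete, here is a minimal Python sketch of pathname resolution. It assumes a directory can be modelled as a dictionary that maps names to either a subdirectory or a file address. The tree layout and the numeric addresses are hypothetical: the node names come from the slide's figure, but their arrangement is a guess.

```python
# Hypothetical in-memory model: a directory is a dict mapping a name to
# either a subdirectory (another dict) or a file address (an int here).
# The arrangement of names below is assumed for illustration only.
ROOT = {
    "chow": {"book": 101, "paper": 102},
    "johnson": {"report": 201},
}


def resolve(root, pathname):
    """Walk a hierarchical pathname one component at a time."""
    node = root
    # Ignore empty components so "/" and "/chow/" behave sensibly.
    components = [c for c in pathname.split("/") if c]
    for name in components:
        if not isinstance(node, dict):
            # A file was reached before the path was fully consumed.
            raise NotADirectoryError(pathname)
        if name not in node:
            raise FileNotFoundError(pathname)
        node = node[name]
    return node
```

For example, `resolve(ROOT, "/chow/book")` looks up `chow` in the root directory and then `book` in the `chow` directory, returning the address stored there.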
File access must be regulated to ensure security. The directory, authorization, and file services are the user interfaces to a file system. System services are the file system's interface to the hardware and are transparent to users. Major functions of system services include:
Mapping of logical to physical block addresses
Interfacing to services at the device level for file space allocation/deallocation
Actual read/write file operations.
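The system service functions listed above can be sketched as a small per-file block map. This is an illustration only: the block size, the class name `BlockMap`, and the simple first-free allocation policy are all assumptions, not something the slides specify.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes


class BlockMap:
    """Per-file table mapping logical block numbers to physical blocks."""

    def __init__(self, free_blocks):
        self.free = list(free_blocks)  # physical blocks available for allocation
        self.table = []                # index = logical block, value = physical block

    def allocate(self):
        """Extend the file by one block taken from the free list."""
        physical = self.free.pop(0)
        self.table.append(physical)
        return physical

    def deallocate_last(self):
        """Shrink the file by one block and return it to the free list."""
        self.free.append(self.table.pop())

    def translate(self, byte_offset):
        """Map a byte offset in the file to (physical block, offset in block)."""
        logical, within = divmod(byte_offset, BLOCK_SIZE)
        return self.table[logical], within
```

A read or write at a given file offset would first call `translate` and then issue the device-level operation on the returned physical block; users of the file service never see these addresses.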
Services and Servers:
Servers are processes that implement services. A service may be implemented by a single server or by a number of servers. A server may also provide multiple services.
[Figure: interaction among services in a DFS, showing clients, servers, directory service, authorization service, file service, and system service]
File mounting and Server Registration
Mounting constructs a large file system from various file servers and storage devices. A mount point is usually a leaf of the directory tree that contains only an empty subdirectory. Once files are mounted, they are accessed using the concatenated logical path names. File system mounting can be done in three different instances.
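As a rough illustration of mounting and concatenated path names, the following Python sketch keeps a table of mount points and translates a logical path into a server name plus a path on that server. The class `MountTable`, the longest-prefix matching rule, and the server names and export paths in the usage note are hypothetical.

```python
class MountTable:
    """Maps mount points in the logical name space to remote file systems."""

    def __init__(self):
        self.mounts = {}  # mount point -> (server name, root path on that server)

    def mount(self, mount_point, server, remote_root="/"):
        # Normalise "/home/chow/" to "/home/chow"; keep "/" as it is.
        self.mounts[mount_point.rstrip("/") or "/"] = (server, remote_root)

    def resolve(self, path):
        """Return (server, remote path) for a logical path."""
        best = None
        for point in self.mounts:
            prefix = point if point.endswith("/") else point + "/"
            if path == point or path.startswith(prefix):
                # The longest matching mount point wins.
                if best is None or len(point) > len(best):
                    best = point
        if best is None:
            raise FileNotFoundError(path)
        server, remote_root = self.mounts[best]
        rest = path[len(best):].lstrip("/")
        remote = remote_root.rstrip("/") + "/" + rest if rest else remote_root
        return server, remote
```

With `/` mounted from a server called `local` and `/home/chow` mounted from `serverA` at `/export/chow`, the logical path `/home/chow/book/ch1` would be served by `serverA` as `/export/chow/book/ch1`, while `/etc/passwd` stays on `local`.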
Stateful and Stateless File Servers
A connection requires the establishment and termination of a communication session. There is state information associated with each session. Examples:
Opened files and their clients
File descriptors and File handles
Current file position pointers
A file server is stateful if it internally maintains some of the state information, and stateless if it maintains none at all. The implementation of a stateless server must address the following issues:
File locking mechanism
Session key management
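The difference between the two kinds of server can be sketched in Python as follows. This is a simplified, single-process illustration under stated assumptions: the `FILES` store, the class names, and the request signatures are hypothetical, and locking and session keys are left out.

```python
FILES = {"report": b"distributed file systems"}  # hypothetical file store


class StatefulServer:
    """Keeps an open-file table with a position pointer per descriptor."""

    def __init__(self):
        self.open_files = {}  # file descriptor -> [file name, current position]
        self.next_fd = 0

    def open(self, name):
        fd = self.next_fd
        self.next_fd += 1
        self.open_files[fd] = [name, 0]
        return fd

    def read(self, fd, count):
        name, pos = self.open_files[fd]
        data = FILES[name][pos:pos + count]
        self.open_files[fd][1] = pos + len(data)  # server advances the pointer
        return data

    def close(self, fd):
        del self.open_files[fd]


class StatelessServer:
    """Keeps nothing between requests; every request is self-contained."""

    def read(self, name, offset, count):
        # The client supplies the file name and position on every call.
        return FILES[name][offset:offset + count]
```

In this sketch, repeating a stateless `read` with the same arguments returns the same bytes, whereas repeating a stateful `read` returns the next bytes because the server has moved its position pointer. The client of the stateless server has to track the offset itself.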
File Access and Semantics of Sharing
File sharing: multiple clients access the same file at the same time. The sharing may result from either overlapping or interleaving accesses.
Coherency control: managing access to the replicas to provide a coherent view of the shared file.
Concurrency control: concurrency is achieved by time multiplexing of the files. The issues are how to prevent one execution sequence from interfering with others when they are interleaved, and how to avoid inconsistent results.
In the space domain, read and write accesses to a remote file can be implemented in one of the following ways:
1. Remote access
2. Cache access
3. Download/upload access
Coherency of replicated data may be interpreted in many different ways:
1. All replicas are identical at all times.
2. Replicas are perceived as identical only at some points in time.
3. Users always read the "most recent" data in the replicas.
4. Write operations are always performed immediately and their results are propagated in a best-effort fashion.
In the time domain, interleaved reads and writes result in concurrent file accesses:
1. Simple RW
2. Transaction
3. Session
Semantics of sharing
Solutions to the coherency and concurrency control problems depend on the semantics of sharing. Three popular semantic models:
Version control: All problems associated with file sharing and replication disappear if the file is read-only. With read-only files, an update creates a new file, so to achieve write sharing clients must know the names of newly created files. A simple solution is to use the same file name but with a version number for each revision of the file. The burden of enforcing file sharing semantics is moved out of the file service into a higher-level service called version control. The file with the highest version number is considered the current version.
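A minimal Python sketch of this idea, assuming each write creates a new immutable revision and version numbers start at 1. The class `VersionedStore` and its method names are hypothetical.

```python
class VersionedStore:
    """Keeps every revision of a file; revisions are never modified."""

    def __init__(self):
        # file name -> list of contents; list index i holds version i + 1
        self.versions = {}

    def write(self, name, data):
        """Store data as a new revision and return its version number."""
        history = self.versions.setdefault(name, [])
        history.append(bytes(data))
        return len(history)

    def current_version(self, name):
        """The highest version number is the current version."""
        return len(self.versions[name])

    def read(self, name, version=None):
        """Read a given revision, or the current one if none is given."""
        history = self.versions[name]
        if version is None:
            version = len(history)
        if not 1 <= version <= len(history):
            raise KeyError((name, version))
        return history[version - 1]
```

Readers that ask for a specific version always see the same contents, so no coherency problem arises for that revision. Only the choice of which version counts as current needs coordination, and that is the job of the version control service.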