AFS case study
1. AFS Case Study
Manfred at zeropiu.it
EuroBSDCon, November 2006
2. Agenda
• Overview
• Basic Concepts
• AFS Server Types
• Arla Overview
• Best Practices
• Planning
• AFS Convention
• Case Study
• Solution
• Architecture
• Result
3. Overview
The Andrew File System (AFS) is a distributed file system
designed to:
• handle terabytes of data
• handle thousands of users
• work in WAN environments
4. Brief history of AFS
• 1983 Andrew Project started at Carnegie Mellon University (CMU)
• 1987 Coda research work began (based on AFS)
• 1988 First use of AFS version 3; first use of AFS outside Carnegie
Mellon University
• 1988 Institutional File System project at University of Michigan
• 1989 Transarc Corporation founded to commercialize AFS
• 1993 Arla project started at Kungliga Tekniska Högskolan
• 1998 Transarc Corporation becomes a wholly owned subsidiary of IBM
• 2000 IBM releases OpenAFS as open source (IBM Public License)
• 2000 OpenAFS releases version 1.0, based on Transarc AFS 3.6
• 2001 OpenAFS releases version 1.2, the first release with better support
for new operating systems and fixes for several memory leaks
• 2005 OpenAFS releases version 1.4 with many new features
• 2005 IBM discontinues AFS
5. Basic Concepts
• Transparent Access and Uniform Namespace
• Cell
• Partitions and Volumes
• Mount Points
• Scalability
• Client Caching
• Replication
• Security
• Authentication and secure communication
• Authorization and flexible access control
• System Management
• Single system interface
• Delegation
• Backup
6. Transparent Access and Uniform Namespace
• Cell
• A cell is a collection of file servers and workstations
• The directories under /afs are cells, forming a single unique tree
• A fileserver contains volumes
• Volumes
• Volumes are "containers" or sets of related files
and directories
• Have a size limit
• 3 types: rw, ro, backup
• Mount Point
• Access to a volume is provided through a mount
point
• A mount point looks and behaves just like a static
directory
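To make the uniform namespace concrete, here is a minimal sketch that splits a path under /afs into its cell name and the path inside that cell. The function name and the cell example.org are hypothetical; nothing here talks to a real AFS client.

```python
from pathlib import PurePosixPath


def split_afs_path(path: str) -> tuple[str, str]:
    """Split '/afs/<cell>/<rest>' into (cell, rest).

    Illustration only: it inspects the path string and does not
    contact any AFS client or server.
    """
    parts = PurePosixPath(path).parts
    if len(parts) < 3 or parts[0] != "/" or parts[1] != "afs":
        raise ValueError(f"not a path inside an AFS cell: {path!r}")
    cell = parts[2]
    # Everything after the cell name lives in volumes reached via mount points.
    rest = str(PurePosixPath(*parts[3:])) if len(parts) > 3 else ""
    return cell, rest


print(split_afs_path("/afs/example.org/usr/m/manfred"))
```

Every client sees the same tree, so the same call gives the same answer on any workstation in any cell.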
7. Scalability
• Cache Manager (Client Side)
• Maintains information about user identities
• Retrieves data from the fileserver
• Keeps chunks of retrieved files on local disk (cache)
• Replication
• Frequently accessed data can be replicated (read-only) on
several servers
• The Cache Manager uses replicated volumes first
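The "replicas first" rule can be sketched as a small selection function. This is a toy model, not the Cache Manager's actual logic: the VolumeSite type and server names are hypothetical, and a real client weighs more than the site type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeSite:
    server: str  # hypothetical fileserver name
    kind: str    # "rw" (read-write) or "ro" (read-only replica)


def pick_site(sites: list[VolumeSite]) -> VolumeSite:
    """Prefer a read-only replica; fall back to the read-write site."""
    if not sites:
        raise ValueError("volume has no sites")
    # sorted() is stable, so sites of the same kind keep their listed order.
    return sorted(sites, key=lambda site: site.kind != "ro")[0]


sites = [VolumeSite("fs1", "rw"), VolumeSite("fs2", "ro"), VolumeSite("fs3", "ro")]
print(pick_site(sites))
```

Spreading reads over several read-only sites is what takes load off the single read-write site.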
8. Security
• Authentication
• Native Kerberos IV (kaserver)
• External Kerberos V
• Unique identity
• Encrypted communication for data transfer (crypt option)
• Authorization
• Access control lists with 7 permission types
• User-defined groups
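The seven permissions are the letters r, l, i, d, w, k and a (read, lookup, insert, delete, write, lock, administer), and AFS also accepts the shorthand words read, write, all and none. A minimal sketch of expanding a rights string follows; the function name is hypothetical and the code only manipulates strings.

```python
AFS_RIGHTS = {
    "r": "read",
    "l": "lookup",
    "i": "insert",
    "d": "delete",
    "w": "write",
    "k": "lock",
    "a": "administer",
}

# Shorthand words and the letters they stand for.
SHORTHAND = {"read": "rl", "write": "rlidwk", "all": "rlidwka", "none": ""}


def expand_rights(spec: str) -> list[str]:
    """Turn 'rl', 'write', 'all', ... into a list of permission names."""
    letters = SHORTHAND.get(spec, spec)
    unknown = set(letters) - set(AFS_RIGHTS)
    if unknown:
        raise ValueError(f"unknown ACL letters: {sorted(unknown)}")
    # Report the rights in the conventional r-l-i-d-w-k-a order.
    return [name for letter, name in AFS_RIGHTS.items() if letter in letters]


print(expand_rights("write"))
```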
9. System Management
• Single system interface
• Configuration changes can be made from any client
• Volumes can be moved transparently
• Online upgrade and extension of the system
• Delegation
• Group delegation
• Admin delegation
• Backup
• Backup of volumes and files
• Built-in backup function
• Direct user access (backup volume mounted)
10. Example write operation client side
1 create file RPC
2 write chunks into cache (interrupted by store_data RPC)
3 read from cache
4 transfer over network
5 write to /vicepXX
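The five client-side steps can be followed in a toy simulation. This is a sketch under simplifying assumptions: both classes are hypothetical, everything stays in memory, and the store_data call happens only on close, whereas the slide notes that a real client may also send data mid-write.

```python
class ToyFileServer:
    """Stand-in for a fileserver; `stored` plays the role of /vicepXX."""

    def __init__(self) -> None:
        self.stored: dict[str, bytes] = {}

    def create_file(self, name: str) -> None:
        # Step 1: the create-file RPC registers an empty file.
        self.stored[name] = b""

    def store_data(self, name: str, data: bytes) -> None:
        # Steps 4-5: data arrives over the "network" and is written out.
        self.stored[name] = data


class ToyCachedFile:
    """Client-side view of one open file, buffered in a local chunk cache."""

    def __init__(self, server: ToyFileServer, name: str, chunk_size: int = 4) -> None:
        self.server = server
        self.name = name
        self.chunk_size = chunk_size
        self.chunks: list[bytes] = []
        server.create_file(name)

    def write(self, data: bytes) -> None:
        # Step 2: writes land in the cache only, split into chunks.
        for start in range(0, len(data), self.chunk_size):
            self.chunks.append(data[start:start + self.chunk_size])

    def close(self) -> None:
        # Steps 3-5: read the chunks back from the cache and ship them.
        self.server.store_data(self.name, b"".join(self.chunks))


server = ToyFileServer()
f = ToyCachedFile(server, "report.txt")
f.write(b"hello world")
print(server.stored["report.txt"])  # still empty before close
f.close()
print(server.stored["report.txt"])
```

Until close() runs, other clients reading from the server would not see the new data.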
11. Example write operation server side
1 Create file
2 Check metadata, permissions and quota, and return file path
3 Write file into /vicepXX
4 Update metadata on server
5 Update db
12. AFS Server Types
• Fileserver machine
• File storage
• Database server machine
• File and volume location
• ACL and groups administration
• Authentication provider
• Binary distribution
• Master server for AFS binaries
(architecture-specific)
• System control machine
• Time server
• AFS configuration master
13. AFS Server Processes
• Bosserver, system monitor
• Fileserver, serves files
• Volserver, serves volume data
• Vlserver, volume location server
• Kaserver, Kerberos IV server
• Ptserver, protection server (groups, ACLs)
• Buserver, backup server
• Upserver, update server
• Updates configuration
• Updates binaries
14. Weaknesses
• File restrictions
• Pipes
• Device files
• Sockets
• Unicode names
• AFS Lock
• Only advisory locks (byte-range locking underway)
• ACL
• Only on directory
• Volume
• Read only
• Manual sync
• Write on close
• Date time operation
15. Arla
• AFS client alternative
• *BSD support
• Disconnected operation
16. Where is it used?
• Universities
• CMU, Stanford, MIT, KTH (Sweden),
Chemnitz (Germany), Roma3 (Italy), …
• Research labs
• SLAC, DESY, CERN (Europe), INFN (Italy), …
• Companies
• Intel, Morgan Stanley, Pictage, …
17. Conventions and Best Practices
• AFS file space layout
• Server planning
• Volume naming and schemas
• Volume replication
• Username schemas
• Partition Filesystem
• Backup planning
• Security considerations
• Client Cache tuning
• AFS limitations
18. Cell Name
• Convention
• Company Domain name
• Company Kerberos Realm
• Cell name
• Short name (maximum cell name length is 64 characters)
• Cell names can contain only lowercase characters
• Suitable for different operating systems (do not
include command shell metacharacters)
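The cell name rules above can be checked mechanically. The sketch below is one possible reading: the function name is hypothetical, and the allowed character set (lowercase letters, digits, dots and hyphens, as in a DNS domain name) is an assumption that goes beyond what the slide states.

```python
import re

MAX_CELL_NAME_LENGTH = 64

# Assumption: a cell name looks like a lowercase DNS domain name.
_CELL_NAME_RE = re.compile(r"[a-z0-9](?:[a-z0-9.-]*[a-z0-9])?")


def cell_name_problems(name: str) -> list[str]:
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if len(name) > MAX_CELL_NAME_LENGTH:
        problems.append(f"longer than {MAX_CELL_NAME_LENGTH} characters")
    if name != name.lower():
        problems.append("contains uppercase characters")
    # Check the character set on the lowercased name so that case
    # problems are reported once, by the rule above.
    if not _CELL_NAME_RE.fullmatch(name.lower()):
        problems.append("contains characters outside a-z, 0-9, '.', '-'")
    return problems


for candidate in ("example.org", "Example.ORG", "bad$cell"):
    print(candidate, cell_name_problems(candidate))
```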
19. Server planning
• Fileserver
• Client-to-server ratio of 200:1 (many sites today have 1000:1)
• Replica server location
• Big machines vs small machines
• Database server
• 3 machines for the election algorithm (Ubik)
• Separate from the fileservers
• Update server
• One system
• Binary distribution
• One system per architecture
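The client-to-server ratio turns directly into a sizing estimate. A minimal sketch, with a hypothetical function name, applied to the 400 Windows and 150 Linux PCs listed in the case study:

```python
import math


def fileservers_needed(clients: int, clients_per_server: int = 200) -> int:
    """Round up: a partly loaded server still has to exist."""
    if clients < 0 or clients_per_server <= 0:
        raise ValueError("clients must be >= 0 and the ratio must be positive")
    return math.ceil(clients / clients_per_server)


total_clients = 400 + 150
print(fileservers_needed(total_clients))        # at 200:1
print(fileservers_needed(total_clients, 1000))  # at 1000:1
```

The estimate covers load only; it says nothing about where replicas should sit or how much redundancy is wanted.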
20. Volume naming and schemas
• Volume name limit
• Read/write volume names can be up to 22
characters in length
• The .readonly and .backup extensions are
reserved
• The names root.afs and root.cell are used by default
• Volume naming
• Prefix = mount point name (user.manfred)
• Suffix = function name
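The naming limits above are easy to enforce before a volume is created. A minimal sketch with a hypothetical function name; it checks only the length and reserved-extension rules from this slide.

```python
MAX_RW_VOLUME_NAME_LENGTH = 22
RESERVED_EXTENSIONS = (".readonly", ".backup")


def volume_name_problems(name: str) -> list[str]:
    """Return rule violations for a read/write volume name (empty list = OK)."""
    problems = []
    if not name:
        problems.append("name is empty")
    if len(name) > MAX_RW_VOLUME_NAME_LENGTH:
        problems.append(f"longer than {MAX_RW_VOLUME_NAME_LENGTH} characters")
    # str.endswith accepts a tuple and matches any of its entries.
    if name.endswith(RESERVED_EXTENSIONS):
        problems.append("ends with a reserved extension")
    return problems


for candidate in ("user.manfred", "user.manfred.backup", "x" * 23):
    print(candidate, volume_name_problems(candidate))
```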
21. Volume layout and replication
• Volume
• Each user has their own volume to simplify load-balancing
operations (move, backup)
• One volume per group of files (binaries, documents, ...)
• Replication is not appropriate for volumes that change
frequently
• Replicate root.afs and root.cell as widely as possible
• A backup volume uses the same partition as its source (it is a copy
of the source volume's vnode index)
22. Username
• Username: avoid
• Characters that have special meaning to the command shell
• The colon ( : ), because AFS reserves it as a field separator in protection
group names
• The period ( . ), which is conventionally used to identify special usernames
that have administrator capability (e.g. manfred.admin)
• AFS UID 32766 is reserved for the user anonymous
• UID matching, Unix UID / AFS UID
• Unix LDAP
• NIS
• Kerberos LDAP backend
• SMB
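A sketch of how the username conventions could be checked when accounts are created. The function name is hypothetical, and the exact set of shell metacharacters is an assumption, since the slide does not list them.

```python
# Assumption: a conservative set of characters that common shells treat specially.
SHELL_METACHARACTERS = set("!$&*?;|<>()[]{}~#` '\"\\")

ANONYMOUS_AFS_UID = 32766  # reserved for the user anonymous


def username_problems(name: str, afs_uid: int) -> list[str]:
    """Return convention violations for a regular (non-admin) account."""
    problems = []
    if ":" in name:
        problems.append("colon is the field separator in protection group names")
    if "." in name:
        problems.append("period is conventionally kept for admin-style names")
    if set(name) & SHELL_METACHARACTERS:
        problems.append("contains shell metacharacters")
    if afs_uid == ANONYMOUS_AFS_UID:
        problems.append("AFS UID 32766 is reserved for anonymous")
    return problems


print(username_problems("manfred", 1001))
print(username_problems("manfred.admin", 1001))
```

The period check is a convention, not a hard rule, so a site would skip it when creating administrative accounts.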
23. Partition Filesystem (inode vs namei)
Inode: faster
• Dedicated partition
• Special fsck for the system partition
• No journaling filesystem
• Restore must use the same filesystem layout (same inode structure)
Namei: slower
• Uses the OS fsck
• Filesystem independent, with the advantage of journaling
• There are no special requirements for /vicepXX; it can be any mounted
filesystem
• Simple restore operation
24. Backup
• Native backup and recovery system
AFS can be configured to create full or incremental backups
• Volume dump
This operation creates a binary file with all the
information in the backup volume
• Backup systems with AFS support
• Amanda
• Bacula
• Other commercial products
25. Security considerations
• User accounts
• Kerberos integration with a modified login utility
• Replace kaserver with a Unix Kerberos solution or Windows AD (OpenAFS
supports the basic Kerberos 5 2b protocol)
• Include the unlog command in every user's .logout file or equivalent
• Server machines
• Change the AFS server encryption key on a frequent and regular schedule
• In particular, limit access to the local superuser root account on a server
machine
• System administrators
• Create an administrative account for each administrator, separate from the
personal account
• Assign AFS privileges only to the administrative account
• Set the token lifetime for administrative accounts to a fairly short amount
of time
26. Client Cache
• Cache Size
• Single-user machine: 128 MB
• Multi-user machine: 1 GB / 4 GB
• Cache partition
• Directory: the partition must guarantee enough space
• Disk partition: better performance (terminal server)
• Login Integration
27. AFS limitations
General limits
• OpenAFS can support a maximum of 104,000 clients per server
• tmpfs does not work as an AFS cache (a ramdisk does)
• Max 256 partitions per server (/vicepa to /vicepiv), no limit on partition size
• Max 4,294,967,295 volumes per partition (this is a limit of the VLDB)
• Max volume size is 2 TB
• Max 64,000 files per directory (with file names shorter than 16 characters)
Windows limits
• Write-on-close: changes are synchronized only on the close operation
• No integration with Microsoft DFS
• No support for files larger than 2 GB on the Windows platform (work in
progress)
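The partition range /vicepa to /vicepiv follows a simple pattern: single letters a to z first, then two-letter suffixes aa, ab, and so on. A minimal sketch mapping a zero-based index to a partition name, with a hypothetical function name:

```python
import string

LAST_PARTITION_INDEX = 255  # counting from 0, this index maps to /vicepiv


def partition_name(index: int) -> str:
    """Map a zero-based partition index to its /vicepXX name."""
    if not 0 <= index <= LAST_PARTITION_INDEX:
        raise ValueError(f"index must be between 0 and {LAST_PARTITION_INDEX}")
    letters = string.ascii_lowercase
    if index < 26:
        suffix = letters[index]
    else:
        # Two-letter suffixes start at "aa" right after "z".
        first, second = divmod(index - 26, 26)
        suffix = letters[first] + letters[second]
    return "/vicep" + suffix


for i in (0, 25, 26, 255):
    print(i, partition_name(i))
```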
28. Case Study
Italsempione
is today the largest fully independent Italian forwarding
company, covering every service related to transport and
logistics, with a worldwide agency network.
Company:
• Headquarters in Italy
• 16 branch offices in Italy
• 7 branches outside Italy
• 400 PCs, Windows XX
• 150 PCs, Linux
• 8 Windows NT domains
• Wide Area Network
• No IT staff at the branch offices
29. Solution
• Primary goals
• Reduce software license costs
• Simplify system administration tasks
• Solution
• Thin client replacement: terminal server
• Server virtualization: VMware
• Storage virtualization: OpenAFS
31. Architecture
Headquarters
3 Fileserver Machines
• User-to-server ratio of 200:1
• The read-write data volumes are replicated in a circular scheme
• The volumes holding binaries and programs are replicated on all fileservers
• The fileservers run OpenBSD 3.9
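The slide does not spell out the circular replication scheme. One plausible reading, shown here purely as an assumption, is a ring in which each fileserver's read-write volumes get their read-only copy on the next server. The function and server names are hypothetical.

```python
def circular_replica_plan(servers: list[str]) -> dict[str, str]:
    """Map each server to the next one in the ring, which holds its replicas."""
    if len(servers) < 2:
        raise ValueError("a ring needs at least two servers")
    return {
        server: servers[(position + 1) % len(servers)]
        for position, server in enumerate(servers)
    }


print(circular_replica_plan(["fs1", "fs2", "fs3"]))
```

With three servers, losing any single machine still leaves a read-only copy of its volumes on a neighbour.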
3 Database Servers
• Installed on the same machines as the fileservers
2 Authentication Servers
• Heimdal Kerberos V
• LDAP backend (Samba, Heimdal, Unix, profile info)
8 VM machines
• Windows terminal server image
• Linux terminal server image
• OpenBSD network service image
32. Architecture
• Cell name = domain name
• Main directory tree = country/city/function
• User directory tree = usr/m/manfred
• User volume
• Volume name (prefix = mount point, suffix = function)
Directory usage     Volume name
User home           user.username
User home backup    user.username.backup
Application         apps.applicationname
OS software         software.soname
Groups              groups.groupname
VMware image        image.osname
• Volume replication
• Binary data
• Root volume (afs, cell)
33. Architecture
• Partition
• Inode based
• Small partitions for quick checks
• Odd /vicepX partitions for rw volumes, even ones for ro volumes
• Backup
• Bacula for incremental and full dumps
• User backup volume mounted in the home directory
• Monitoring
• Zabbix
• AFS monitoring and performance
34. Hardware
Fileserver / DB server:
• 1 GB of RAM
• 3 GHz Xeon single processor
• 2x 36 GB SCSI RAID 1 for operating system partition
• 4x 143 GB SCSI RAID 5 storage (/vicepXX)
Authentication server:
• 1 GB of RAM
• 3 GHz Xeon single processor
• 2x 36 GB SCSI RAID 1 for operating system and DB backend
VM machine:
• 4 GB of RAM
• 3 GHz Xeon dual processor
• 2x 36 GB SCSI RAID 1 for operating system and local VMware image
35. Why OpenBSD
• OpenAFS support
• Ports of both server side and client side
• Security level
• Heimdal integration
• AFS emulation
• LDAP backend
• 2ab protocol (large Kerberos tickets)
• Small and fast
• Stable
36. Considerations
• Big-iron server vs small servers
• A small number of inexpensive fileservers provides equivalent
performance
• Inexpensive incremental increase in capacity
• Better manageability and redundancy
• NFS file sharing vs AFS
• AFS resulted in a 60% decrease in network traffic
• The server's load decreased by 80%
• Task execution time was reduced by 30%
37. Benefits
• Reduced cost
• Software costs reduced by 150,000 euros
• Increased performance (server and desktop)
• Reduced downtime
• Reduced helpdesk load
• Simplified system administration tasks
• Improved manageability
• Full disaster recovery protection
• Data accessible from Spain to Singapore with a
• High security level
• Single sign-on
38. Next
OpenAFS
• Lock subsystem
• Windows support
• Kerberos V support
External project (www.beolink.org)
• Ptserver with LDAP backend
• Web interface