An Introduction to OpenAFS

This chapter introduces basic AFS concepts and terms. It assumes that you are already familiar with standard UNIX commands, file protection, and pathname conventions.

AFS Concepts

AFS makes it easy for people to work together on the same files, no matter where the files are located. AFS users do not have to know which machine is storing a file, and administrators can move files from machine to machine without interrupting user access. Users always identify a file by the same pathname and AFS finds the correct file automatically, just as happens in the local file system on a single machine. While AFS makes file sharing easy, it does not compromise the security of the shared files. It provides a sophisticated protection scheme.

Client/Server Computing

AFS uses a client/server computing model. In client/server computing, there are two types of machines. Server machines store data and perform services for client machines. Client machines perform computations for users and access data and services provided by server machines. Some machines act as both clients and servers. In most cases, you work on a client machine, accessing files stored on a file server machine.

Distributed File Systems

AFS is a distributed file system which joins together the file systems of multiple file server machines, making it as easy to access files stored on a remote file server machine as files stored on the local disk. A distributed file system has two main advantages over a conventional centralized file system:

Increased availability: A copy of a popular file, such as the binary for an application program, can be stored on many file server machines. An outage on a single machine or even multiple machines does not necessarily make the file unavailable.
Instead, user requests for the program are routed to accessible machines. With a centralized file system, the loss of the central file storage machine effectively shuts down the entire system.

Increased efficiency: In a distributed file system, the work load is distributed over many smaller file server machines that tend to be more fully utilized than the larger (and usually more expensive) file storage machine of a centralized file system.

AFS hides its distributed nature, so working with AFS files looks and feels like working with files stored on your local machine, except that you can access many more files. And because AFS relies on the power of users' client machines for computation, increasing the number of AFS users does not slow AFS performance appreciably, making it a very efficient computing environment.

AFS Filespace and Local Filespace

AFS acts as an extension of your machine's local UNIX file system. Your system administrator creates a directory on the local disk of each AFS client machine to act as a gateway to AFS. By convention, this directory is called /afs, and it functions as the root of the AFS filespace.

Just like the UNIX file system, AFS uses a hierarchical file structure (a tree). Under the /afs root directory are subdirectories created by your system administrator, including your home directory. Other directories that are at the same level of the local file system as /afs, such as /usr, /etc, or /bin, can either be located on your local disk or be links to AFS directories. Files relevant only to the local machine are usually stored on the local machine. All other files can be stored in AFS, enabling many users to share them and freeing the local machine's disk space for other uses.

You can use AFS commands only on files in the AFS filespace or the local directories that are links to the AFS filespace.
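
The rule above (AFS commands apply only to files under /afs, or to local directories that are links into it) can be approximated with a simple path check. The following is a minimal sketch in Python, not part of AFS itself: the helper name is_in_afs is hypothetical, and it assumes the conventional /afs root described above. It only inspects pathnames after resolving symbolic links; it does not verify that a Cache Manager is actually running.

```python
import os

# Conventional root of the AFS filespace (an assumption taken from the text;
# a site could in principle use a different directory).
AFS_ROOT = "/afs"


def is_in_afs(path, afs_root=AFS_ROOT):
    """Return True if path, after resolving symbolic links, lies under afs_root.

    Hypothetical helper for illustration only.
    """
    resolved = os.path.realpath(path)
    root = os.path.realpath(afs_root)
    # Compare on a directory boundary so that /afsdata is not mistaken for /afs.
    return resolved == root or resolved.startswith(root.rstrip("/") + "/")
```

For example, is_in_afs("/afs/example.com/usr/smith") is true, while a path such as /afsdata/file is rejected because it merely shares a prefix with /afs. A local directory that is a symbolic link into /afs is accepted because the link is resolved before the comparison.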
Cells and Sites

The cell is the administrative domain in AFS. Each cell's administrators determine how client machines are configured and how much storage space is available to each user. The organization corresponding to a cell can be a company, a university department, or any defined group of users. From a hardware perspective, a cell is a grouping of client machines and server machines defined to belong to the same cell.

An AFS site is a grouping of one or more related cells. For example, the cells at the Example Corporation form a single site.

By convention, the subdirectories of the /afs directory are cellular filespaces, each of which contains subdirectories and files that belong to a single cell. For example, directories and files relevant to the Example Corporation cell are stored in the subdirectory /afs/example.com.

While each cell organizes and maintains its own filespace, it can also connect with the filespace of other AFS cells. The result is a huge filespace that enables file sharing within and across cells.

The cell to which your client machine belongs is called your local cell. All other cells in the AFS filespace are termed foreign cells.

Volumes and Mount Points

The storage disks in a computer are divided into sections called partitions. AFS further divides partitions into units called volumes, each of which houses a subtree of related files and directories. The volume provides a convenient container for storing related files and directories. Your system administrators can move volumes from one file server machine to another without your noticing, because AFS automatically tracks a volume's location.

You access the contents of a volume by accessing its mount point in the AFS filespace.
A mount point is a special file system element that looks and acts like a regular UNIX directory, but tells AFS the volume's name. When you change to a different directory (by using the cd command, for example) you sometimes cross a mount point and start accessing the contents of a different volume than before. You normally do not notice the crossing, however, because AFS automatically interprets mount points and retrieves the contents of the new directory from the appropriate volume. You do not need to track which volume, partition, or file server machine is housing a directory's contents. If you are interested, though, you can learn a volume's location; for instructions, see Locating Files and Directories.

If your system administrator has followed the conventional practice, your home directory corresponds to one volume, which keeps its contents together on one partition of a file server machine. User volumes are typically named user.username. For example, the volume for a user named smith in the cell example.com is called user.smith and is mounted at the directory /afs/example.com/usr/smith.

Because AFS volumes are stored on different file server machines, when a machine becomes unavailable only the volumes on that machine are inaccessible. Volumes stored on other machines are still accessible. However, if a volume's mount point resides in a volume that is stored on an unavailable machine, the former volume is also inaccessible. For that reason, volumes containing frequently used directories (for example, /afs and /afs/cellname) are often copied and distributed to many file server machines.

Volume Quotas

Each volume has a size limit, or quota, assigned by the system administrator. A volume's quota determines the maximum amount of disk space the volume can consume.
If you attempt to exceed a volume's quota, you receive an error message. For instructions on checking volume quota, see Displaying Volume Quota.

Volumes have completely independent quotas. For example, say that the current working directory is /afs/example.com/usr/smith, which is the mount point for the user.smith volume with 1000 free blocks. You try to copy a 500-block file from the current working directory to the /afs/example.com/usr/pat directory, the mount point for the volume user.pat. However, you get an error message saying there is not enough space. You check the volume quota for user.pat, and find that the volume has only 50 free blocks. The 1000 free blocks in user.smith do not help, because the copy consumes space in user.pat.

Using Files in AFS

The Cache Manager

You can access the AFS filespace only when working on an AFS client machine. The Cache Manager on that machine is your agent in accessing information stored in the AFS filespace. When you access a file, the Cache Manager on your client machine requests the file from the appropriate file server machine and stores (caches) a copy of it on your client machine's local disk. Application programs on your client machine use the local, cached copy of the file. This improves performance because it is much faster to use a local file than to send requests for file data across the network to the file server machine.

Because application programs use the cached copy of a file, any changes you make are not necessarily stored permanently to the central version stored on the file server machine until the file closes. At that point, the Cache Manager writes your changes back to the file server machine, where they replace the corresponding parts of the existing file. Some application programs close a file in this way each time you issue their save command (and then immediately reopen the file so that you can continue working). With other programs, issuing the save command writes the changes only to the local cached copy.
If you use the latter type of program, you need to close the file periodically to make sure your changes are stored permanently.

If a file server machine becomes inaccessible, you can continue working with the local, cached copy of a file fetched from that machine, but you cannot save your changes permanently until the server machine is again accessible.

Updating Copies of Cached Files

When the central version of a file changes on the file server machine, the AFS File Server process running on that machine advises all other Cache Managers with copies of that file that their version is no longer valid.

AFS has a special mechanism for performing these notifications efficiently. When the File Server sends the Cache Manager a copy of a modifiable file, it also sends a callback. A callback functions as a promise from the File Server to contact the Cache Manager if the centrally stored copy of the file is changed while it is being used. If that happens, the File Server breaks the callback. If you run a program that requests data from the changed file, the Cache Manager notices the broken callback and gets an updated copy of the file from the File Server. Callbacks ensure that you are working with the most recent copy of a file.

Note, however, that the callback mechanism does not guarantee that you immediately see the changes someone else makes to a file you are using. Your Cache Manager does not notice the broken callback until your application program asks it for more data from the file.

Multiple Users Modifying Files

Like a standard UNIX file system, AFS preserves only the changes to a file that are saved last, regardless of who made the changes. When collaborating with someone on the same files, you must coordinate your work to avoid overwriting each other's changes.
You can use AFS access control lists (ACLs) to limit the ability of other users to access or change your files, and so prevent them from accidentally overwriting your files. See Protecting Your Directories and Files.

AFS Security

AFS makes it easy for many users to access the same files, but also uses several mechanisms to ensure that only authorized users access the AFS filespace. The mechanisms include the following:

Passwords and mutual authentication ensure that only authorized users access AFS filespace.

Access control lists enable users to restrict or permit access to their own directories.

Passwords and Mutual Authentication

AFS uses two related mechanisms to ensure that only authorized users access the filespace: passwords and mutual authentication. Both mechanisms require that a user prove his or her identity.

When you first identify yourself to AFS, you must provide the password associated with your username, to prove that you are who you say you are. When you provide the correct password, you become authenticated and your Cache Manager receives a token. A token is a package of information that is scrambled by an AFS authentication program using your AFS password as a key. Your Cache Manager can unscramble the token because it knows your password and AFS's method of scrambling.

The token acts as proof to AFS server programs that you are authenticated as a valid AFS user. It serves as the basis for the second means through which AFS creates security, called mutual authentication. Under mutual authentication, both parties communicating across the network prove their identities to one another. AFS requires mutual authentication whenever a server and client (most often, a Cache Manager) communicate with each other.
The mutual authentication protocol that AFS uses is designed to make it very difficult for people to authenticate fraudulently. When your Cache Manager contacts a File Server on your behalf, it sends the token you obtained when you authenticated. The token is encrypted with a key that only an AFS File Server can know. If the File Server can decrypt your token, it can communicate with your Cache Manager. In turn, the Cache Manager accepts the File Server as genuine because the File Server can decrypt and use the information in the token.

Access Control Lists

AFS uses access control lists (ACLs) to determine who can access the information in the AFS filespace. Each AFS directory has an ACL to specify what actions different users can perform on that directory and its files. An ACL can contain up to about 20 entries for users, groups, or both; each entry lists a user or group and the permissions it possesses.

The owner of a directory and system administrators can always administer an ACL. Users automatically own their home directories and subdirectories. Other non-owner users can define a directory's ACL only if specifically granted that permission on the ACL. For more information on ACLs, see Protecting Your Directories and Files.

A group is composed of one or more users and client machines. If a user belongs to a group that appears on an ACL, the user gets all of the permissions granted to that group, just as if the user were listed directly on the ACL. Similarly, if a user is logged into a client machine that belongs to a group, the user has all of the permissions granted to that group. For instructions on defining and using groups, see Using Groups.

All users who can access your cell's filespace, authenticated or not, are automatically assigned to a group called system:anyuser. For a discussion of placing the system:anyuser group on ACLs, see Extending Access to Users from Foreign Cells.
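
The way a user accumulates permissions from direct ACL entries, group entries, and system:anyuser can be illustrated with a small model. This is a simplified sketch in Python, not the algorithm the File Server implements and not a model of every ACL feature: the function name effective_permissions, the dictionary representation of an ACL, and the group name smith:friends are all hypothetical, chosen only for illustration. The permission letters are the seven AFS ACL permissions (r, l, i, d, w, k, a).

```python
def effective_permissions(acl, user, groups_of_user):
    """Return the set of permission letters a user holds on a directory.

    acl maps a user or group name to a string of permission letters.
    Hypothetical model: permissions are the union of the user's own entry,
    the entries of every group the user belongs to, and system:anyuser
    (to which every user is automatically assigned).
    """
    principals = {user, "system:anyuser"} | set(groups_of_user)
    rights = set()
    for principal, perms in acl.items():
        if principal in principals:
            rights |= set(perms)
    return rights


# Illustrative ACL for a hypothetical directory owned by smith.
example_acl = {
    "smith": "rlidwka",     # the owner holds all seven permissions
    "smith:friends": "rl",  # hypothetical group: read and lookup
    "system:anyuser": "l",  # everyone: lookup only
}

print(sorted(effective_permissions(example_acl, "pat", ["smith:friends"])))  # ['l', 'r']
print(sorted(effective_permissions(example_acl, "terry", [])))               # ['l']
```

In this model, pat gains r and l through membership in the group, exactly as if pat were listed directly on the ACL, while terry, who matches no entry other than system:anyuser, receives only l.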
You can use the UNIX mode bits to control access on specific files within an AFS directory; however, the effect of these mode bits is different under AFS than in the standard UNIX file system. See File and Directory Protection.

Differences Between UNIX and AFS

AFS is designed to be similar to the UNIX file system. For instance, many of the basic UNIX file manipulation commands (cp for copy, rm for remove, and so on) are the same in AFS as they are in UNIX. All of your application programs work as they did before. The following sections describe some of the differences between a standard UNIX file system and AFS.

File Sharing

AFS enables users to share remote files as easily as local files. To access a file on a remote machine in AFS, you simply specify the file's pathname. In contrast, to access a file in a remote machine's UNIX file system, you must log into the remote machine or create a mount point on the local machine that points to a directory in the remote machine's UNIX file system.

AFS users can see and share all the files under the /afs root directory, given the appropriate privileges. An AFS user who has the necessary privileges can access a file in any AFS cell, simply by specifying the file's pathname. File sharing in AFS is not restricted by geographical distances or operating system differences.

Login and Authentication

To become an authenticated AFS user, you need to provide a password to AFS.

On machines that use an AFS-modified login utility, logging in is a one-step process; your initial login automatically authenticates you with AFS.

On machines that do not use an AFS-modified login utility, you must perform three steps.

Log in to your local machine.

Issue the kinit command to obtain a Kerberos ticket-granting ticket (TGT). If the kinit is compiled with AFS support, it may automatically get a token for you.
However, to ensure that you get an AFS token, you need to run a second command.

OpenAFS provides the aklog command, which obtains a token (an AFS service ticket) using your Kerberos TGT. A kinit with AFS support runs aklog as part of its execution, but issuing the aklog command yourself ensures that you have an AFS token.

Your system administrator can tell you whether your machine uses an AFS-modified login utility or not. Then see the login instructions in Logging in and Authenticating with AFS.

AFS uses the Kerberos authentication protocol, rather than storing passwords in the local password file (/etc/passwd or equivalent). If your machine uses an AFS-modified login utility, you can change your password with a single command. If your machine does not use an AFS-modified login utility, you must issue separate commands to change your AFS and local passwords. See Changing Your Password.

File and Directory Protection

AFS does not rely on the mode bit protections of a standard UNIX system (though its protection system does interact with these mode bits). Instead, AFS uses an access control list (ACL) to control access to each directory and its contents. The following list summarizes the differences between the two methods:

UNIX mode bits specify three types of access permissions: r (read), w (write), and x (execute). An AFS ACL uses seven types of permissions: r (read), l (lookup), i (insert), d (delete), w (write), k (lock), and a (administer). For more information, see The AFS ACL Permissions and How AFS Uses the UNIX Mode Bits.

The three sets of mode bits on each UNIX file or directory enable you to grant permissions to three users or groups of users: the file or directory's owner, the group that owns the file or directory, and all other users.
An ACL can accommodate up to about 20 entries, each of which extends certain permissions to a user or group. Unlike standard UNIX, a user can belong to an unlimited number of groups, and groups can be defined by both users and system administrators. See Using Groups.

UNIX mode bits are set individually on each file and directory. An ACL applies to all of the files in a directory. While at first glance the AFS method may seem less precise, in practice (given a proper directory structure) there are no major disadvantages to directory-level protections, and they are easier to establish and maintain.

Machine Outages

The kinds of failures you experience when a standard UNIX file system goes down are different from those you experience when one or more individual AFS file server machines become unavailable. When a standard UNIX file system is inaccessible, the system simply locks up and you can lose changes to any files with which you were working.

When an AFS file server machine becomes inaccessible, you cannot access the files on that machine. If a copy of the file is available from another file server machine, however, you do not necessarily even notice the server outage. This is because AFS gives your cell's system administrators the ability to store copies of popular programs on multiple file servers. The Cache Manager chooses between the copies automatically; when one copy becomes unavailable, the Cache Manager simply chooses another.

If there are no other copies of a file that is stored on an inaccessible server machine, you can usually continue to use the copy stored in your client machine's local AFS cache. However, you cannot save changes to files stored on an inaccessible file server machine until it is accessible again.

Remote Commands

The ssh and scp commands enable you to run programs on a remote machine or copy files to or from a remote machine.
The ssh commands can work seamlessly with AFS, depending on how your administrators have configured them.

With recent versions of OpenSSH, you need a Kerberos ticket on the machine you are connecting from, and support in the ssh client for forwarding that ticket to the remote machine. The remote machine must be configured to use the forwarded ticket to obtain a token.

Most current UNIX operating systems come with a version of OpenSSH that understands the GSSAPI protocol, which can use Kerberos to forward TGTs, but this ability is generally not enabled by default. To configure your ssh client to use it, add the following lines to your ~/.ssh/config file.

    GSSAPIAuthentication yes
    GSSAPIDelegateCredentials yes
    GSSAPITrustDNS yes

See the ssh_config man page on your system for more details about these configuration options. In particular, you may want to limit them to specific hosts or domains.

If you do not have an ssh client that can do TGT forwarding, you still have access to the native UNIX file system when you log in to a remote machine. However, because you are not authenticated to AFS, you can access only the AFS directories that grant access to the system:anyuser group; you cannot access protected AFS directories. You can enable this access by following the kinit/aklog procedure described above.

Differences in the Semantics of Standard UNIX Commands

This section summarizes differences in the functionality of some commonly issued UNIX commands.

chmod
Only members of the system:administrators group can use this command to turn on the setuid, setgid or sticky mode bits on AFS files. (For more information about this group, see Using the System Groups on ACLs.)

chown
Only members of the system:administrators group can issue this command on AFS files.
chgrp
Only members of the system:administrators group can issue this command on AFS files and directories.

groups
If the user's AFS tokens are identified by a process authentication group (PAG), the output of this command includes two large numbers. For a description of PAGs, see Authenticating with AFS.

login utilities
In general, most systems use a combination of PAM modules to provide both Kerberos-enabled logins and automatic AFS tokens on login. Often these PAM modules are also used with screen lockers and graphical logins at the console.

ln
You cannot use this command to create a hard link between files that reside in different AFS directories. You must add the -s option to create a symbolic link instead.

Using OpenAFS with NFS

Some cells use the Network File System (NFS) in addition to AFS. If you work on an NFS client machine, your system administrator can configure it to access the AFS filespace through a program called the NFS/AFS Translator™. See Appendix A, Using the NFS/AFS Translator.