<?xml version="1.0" encoding="UTF-8"?>
<chapter id="HDRWQ5">
<title>An Overview of OpenAFS Administration</title>
<para>This chapter provides a broad overview of the concepts and
organization of AFS. It is strongly recommended that anyone involved in
administering an AFS cell read this chapter before beginning to issue
commands.</para>
<sect1 id="HDRWQ6">
<title>A Broad Overview of AFS</title>
<para>This section introduces most of the key terms and concepts
necessary for a basic understanding of AFS. For a more detailed
discussion, see <link linkend="HDRWQ7">More Detailed Discussions of
Some Basic Concepts</link>.</para>
<sect2 renderas="sect3">
<title>AFS: A Distributed File System</title>
<para>AFS is a distributed file system that enables users to share
and access all of the files stored in a network of computers as
easily as they access the files stored on their local machines. The
file system is called distributed for this exact reason: files can
reside on many different machines (be distributed across them), but
are available to users on every machine.</para>
</sect2>
<sect2 renderas="sect3">
<title>Servers and Clients</title>
<para>AFS stores files on file server machines. File server machines
provide file storage and delivery service, along with other
specialized services, to the other class of machines in the
network, the client machines. These machines are called clients
because they make use of the servers' services while doing their own
work. In a standard AFS configuration, clients provide computational
power, access to the files in AFS, and other "general-purpose" tools
to the users seated at their consoles. There are generally many more
client workstations than file server machines.</para>
<para>AFS file server machines run a number of server processes, so
called because each provides a distinct specialized service: one
handles file requests, another tracks file location, a third manages
security, and so on. To avoid confusion, AFS documentation always
refers to server machines and server processes, not simply to
servers. For a more detailed description of the server processes,
see <link linkend="HDRWQ17">AFS Server Processes and the Cache
Manager</link>.</para>
</sect2>
<sect2 renderas="sect3">
<title>Cells</title>
<para>A cell is an administratively independent site running AFS. As
a cell's system administrator, you make many decisions about
configuring and maintaining your cell in the way that best serves
its users, without having to consult the administrators in other
cells. For example, you determine how many clients and servers to
have, where to put files, and how to allocate client machines to
users.</para>
</sect2>
<sect2 renderas="sect3">
<title>Transparent Access and the Uniform Namespace</title>
<para>Although your AFS cell is administratively independent, you
probably want to organize the local collection of files (your
filespace or tree) so that users from other cells can also access
the information in it. AFS enables cells to combine their local
filespaces into a global filespace, and does so in such a way that
file access is transparent--users do not need to know anything about
a file's location in order to access it. All they need to know is
the pathname of the file, which looks the same in every cell. Thus
every user at every machine sees the collection of files in the same
way, meaning that AFS provides a uniform namespace to its
users.</para>
</sect2>
<sect2 renderas="sect3">
<title>Volumes</title>
<para>AFS groups files into volumes, making it possible to
distribute files across many machines and yet maintain a uniform
namespace. A volume is a unit of disk space that functions like a
container for a set of related files, keeping them all together on
one partition. Volumes can vary in size, but are (by definition)
smaller than a partition.</para>
<para>Volumes are important to system administrators and users for
several reasons. Their small size makes them easy to move from one
partition to another, or even between machines. The system
administrator can maintain maximum efficiency by moving volumes to
keep the load balanced evenly. In addition, volumes correspond to
directories in the filespace--most cells store the contents of each
user home directory in a separate volume. Thus the complete contents
of the directory move together when the volume moves, making it easy
for AFS to keep track of where a file is at a certain time.</para>
<para>Volume moves are recorded automatically, so users do not have
to keep track of file locations. Volumes can be moved from server to
server by a cell administrator without notifying clients, even while
the volume is in active use by a client machine. Volume moves are
transparent to client machines apart from a brief interruption in
file service for files in that volume.</para>
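<para>As an illustrative sketch (the volume, machine, and partition
names here are hypothetical placeholders), an administrator
relocates a volume with a single <emphasis role="bold">vos
move</emphasis> command:</para>
<programlisting>
   % vos move user.smith fs1.example.com /vicepa fs2.example.com /vicepb
</programlisting>
<para>Users reading or writing files in the volume notice at most a
brief pause while the move completes.</para>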
</sect2>
<sect2 renderas="sect3">
<title>Efficiency Boosters: Replication and Caching</title>
<para>AFS incorporates special features on server machines and
client machines that help make it efficient and reliable.</para>
<para>On server machines, AFS enables administrators to replicate
commonly-used volumes, such as those containing binaries for popular
programs. Replication means putting an identical read-only copy
(sometimes called a clone) of a volume on more than one file server
machine. The failure of one file server machine housing the volume
does not interrupt users' work, because the volume's contents are
still available from other machines. Replication also means that one
machine does not become overburdened with requests for files from a
popular volume.</para>
<para>On client machines, AFS uses caching to improve efficiency.
When a user on a client machine requests a file, the Cache Manager
on the client sends a request for the data to the File Server
process running on the proper file server machine. The user does not
need to know which machine this is; the Cache Manager determines
file location automatically. The Cache Manager receives the file
from the File Server process and puts it into the cache, an area of
the client machine's local disk or memory dedicated to temporary
file storage. Caching improves efficiency because the client does
not need to send a request across the network every time the user
wants the same file. Network traffic is minimized, and subsequent
access to the file is especially fast because the file is stored
locally. AFS has a way of ensuring that the cached file stays
up-to-date, called a callback.</para>
</sect2>
<sect2 renderas="sect3">
<title>Security: Mutual Authentication and Access Control
Lists</title>
<para>Even in a cell where file sharing is especially frequent and
widespread, it is not desirable that every user have equal access to
every file. One way AFS provides adequate security is by requiring
that servers and clients prove their identities to one another
before they exchange information. This procedure, called mutual
authentication, requires that both server and client demonstrate
knowledge of a "shared secret" (like a password) known only to the
two of them. Mutual authentication guarantees that servers provide
information only to authorized clients and that clients receive
information only from legitimate servers.</para>
<para>Users themselves control another aspect of AFS security, by
determining who has access to the directories they own. For any
directory a user owns, he or she can build an access control list
(ACL) that grants or denies access to the contents of the
directory. An access control list pairs specific users with specific
types of access privileges. There are seven separate permissions and
up to twenty different people or groups of people can appear on an
access control list.</para>
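<para>As a brief sketch (the user names and directory are
hypothetical, and the output shown is illustrative), a user grants
read access with the <emphasis role="bold">fs setacl</emphasis>
command and reviews the resulting list with <emphasis
role="bold">fs listacl</emphasis>:</para>
<programlisting>
   % fs setacl -dir /afs/example.com/usr/smith/public -acl pat rl
   % fs listacl /afs/example.com/usr/smith/public
   Access list for /afs/example.com/usr/smith/public is
   Normal rights:
     smith rlidwka
     pat rl
</programlisting>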
<para>For a more detailed description of AFS's mutual authentication
procedure, see <link linkend="HDRWQ75">A More Detailed Look at
Mutual Authentication</link>. For further discussion of ACLs, see
<link linkend="HDRWQ562">Managing Access Control
Lists</link>.</para>
</sect2>
</sect1>
<sect1 id="HDRWQ7">
<title>More Detailed Discussions of Some Basic Concepts</title>
<para>The previous section offered a brief overview of the many
concepts that an AFS system administrator needs to understand. The
following sections examine some important concepts in more
detail. Although not all concepts are new to an experienced
administrator, reading this section helps ensure a common
understanding of terms and concepts.</para>
<sect2 id="HDRWQ8">
<title>Networks</title>
<indexterm>
<primary>network</primary>
<secondary>defined</secondary>
</indexterm>
<para>A <emphasis>network</emphasis> is a collection of
interconnected computers able to communicate with each other and
transfer information back and forth.</para>
<para>A network can connect computers of any kind, but the typical
network running AFS connects servers or high-function personal
workstations with AFS file server machines. For more about the
classes of machines used in an AFS environment, see <link
linkend="HDRWQ10">Servers and Clients</link>.</para>
</sect2>
<sect2 id="HDRWQ9">
<title>Distributed File Systems</title>
<indexterm>
<primary>file system</primary>
<secondary>defined</secondary>
</indexterm>
<indexterm>
<primary>distributed file system</primary>
</indexterm>
<para>A <emphasis>file system</emphasis> is a collection of files
and the facilities (programs and commands) that enable users to
access the information in the files. All computing environments have
file systems.</para>
<para>Networked computing environments often use
<emphasis>distributed file systems</emphasis> like AFS. A
distributed file system takes advantage of the interconnected nature
of the network by storing files on more than one computer in the
network and making them accessible from all of those computers. In other words,
the responsibility for file storage and delivery is "distributed"
among multiple machines instead of relying on only one. Despite the
distribution of responsibility, a distributed file system like AFS
creates the illusion that there is a single filespace.</para>
</sect2>
<sect2 id="HDRWQ10">
<title>Servers and Clients</title>
<indexterm>
<primary>server/client model</primary>
</indexterm>
<indexterm>
<primary>server</primary>
<secondary>definition</secondary>
</indexterm>
<indexterm>
<primary>client</primary>
<secondary>definition</secondary>
</indexterm>
<para>AFS uses a server/client model. In general, a server is a
machine, or a process running on a machine, that provides
specialized services to other machines. A client is a machine or
process that makes use of a server's specialized service during the
course of its own work, which is often of a more general nature than
the server's. The functional distinction between clients and servers
is not always strict, however--a server can be considered the client
of another server whose service it is using.</para>
<para>AFS divides the machines on a network into two basic classes,
<emphasis>file server machines</emphasis> and <emphasis>client
machines</emphasis>, and assigns different tasks and
responsibilities to each.</para>
<formalpara>
<title>File Server Machines</title>
<indexterm>
<primary>file server machine</primary>
</indexterm>
<indexterm>
<primary>server</primary>
<secondary>process</secondary>
<tertiary>definition</tertiary>
</indexterm>
<para><emphasis>File server machines</emphasis> store the files in
the distributed file system, and a <emphasis>server
process</emphasis> running on the file server machine delivers and
receives files. AFS file server machines run a number of
<emphasis>server processes</emphasis>. Each process has a special
function, such as maintaining databases important to AFS
administration, managing security or handling volumes. This
modular design enables each server process to specialize in one
area, and thus perform more efficiently. For a description of the
function of each AFS server process, see <link
linkend="HDRWQ17">AFS Server Processes and the Cache
Manager</link>.</para>
</formalpara>
<para>Not all AFS server machines must run all of the server
processes. Some processes run on only a few machines because the
demand for their services is low. Other processes run on only one
machine in order to act as a synchronization site. See <link
linkend="HDRWQ90">The Four Roles for File Server
Machines</link>.</para>
<formalpara>
<title>Client Machines</title>
<indexterm>
<primary>client</primary>
<secondary>machine</secondary>
<tertiary>definition</tertiary>
</indexterm>
<para>The other class of machines is the <emphasis>client
machines</emphasis>, which generally work directly for users,
providing computational power and other general-purpose tools;
a client can also be another server that uses data stored in AFS
to provide its own services. Clients also provide users with access to the
files stored on the file server machines. Clients run a Cache
Manager, which is normally a combination of a kernel module and a
running process that enables them to communicate with the AFS
server processes running on the file server machines and to cache
files. See <link linkend="HDRWQ28">The Cache Manager</link> for
more information. There are usually many more client machines in a
cell than file server machines.</para>
</formalpara>
</sect2>
<sect2 id="HDRWQ11">
<title>Cells</title>
<indexterm>
<primary>cell</primary>
</indexterm>
<para>A <emphasis>cell</emphasis> is an independently administered
site running AFS. In terms of hardware, it consists of a collection
of file server machines defined as belonging to the cell. To say
that a cell is administratively independent means that its
administrators determine many details of its configuration without
having to consult administrators in other cells or a central
authority. For example, a cell administrator determines how many
machines of different types to run, where to put files in the local
tree, how to associate volumes and directories, and how much space
to allocate to each user.</para>
<para>The terms <emphasis>local cell</emphasis> and <emphasis>home
cell</emphasis> are equivalent, and refer to the cell in which a
user has initially authenticated during a session, by logging onto a
machine that belongs to that cell. All other cells are referred to
as <emphasis>foreign</emphasis> from the user's perspective. In
other words, throughout a login session, a user is accessing the
filespace through a single Cache Manager--the one on the machine to
which he or she initially logged in--and that Cache Manager is
normally configured to have a default local cell. All other cells
are considered foreign during that login session, even if the user
authenticates in additional cells or uses the <emphasis
role="bold">cd</emphasis> command to change directories into their
file trees. This distinction is mostly invisible and irrelevant to
users. For most purposes, users will see no difference between local
and foreign cells.</para>
<indexterm>
<primary>local cell</primary>
</indexterm>
<indexterm>
<primary>cell</primary>
<secondary>local</secondary>
</indexterm>
<indexterm>
<primary>foreign cell</primary>
</indexterm>
<indexterm>
<primary>cell</primary>
<secondary>foreign</secondary>
</indexterm>
<para>It is possible to maintain more than one cell at a single
geographical location. For instance, separate departments on a
university campus or in a corporation can choose to administer their
own cells. It is also possible to have machines at geographically
distant sites belong to the same cell; only limits on the speed of
network communication determine how practical this is.</para>
<para>Despite their independence, AFS cells generally agree to make
their local filespace visible to other AFS cells, so that users in
different cells can share files if they choose. If your cell is to
participate in the "global" AFS namespace, it must comply with a few
basic conventions governing how the local filespace is configured
and how the addresses of certain file server machines are advertised
to the outside world.</para>
</sect2>
<sect2 id="HDRWQ12">
<title>The Uniform Namespace and Transparent Access</title>
<indexterm>
<primary>transparent access as AFS feature</primary>
</indexterm>
<indexterm>
<primary>access</primary>
<secondary>transparent (AFS feature)</secondary>
</indexterm>
<para>One of the features that makes AFS easy to use is that it
provides transparent access to the files in a cell's
filespace. Users do not have to know which file server machine
stores a file in order to access it; they simply provide the file's
pathname, which AFS automatically translates into a machine
location.</para>
<para>In addition to transparent access, AFS also creates a
<emphasis>uniform namespace</emphasis>--a file's pathname is
identical regardless of which client machine the user is working
on. The cell's file tree looks the same when viewed from any client
because the cell's file server machines store all the files
centrally and present them in an identical manner to all
clients.</para>
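<para>For example, in a cell registered under the (hypothetical)
name <emphasis role="bold">example.com</emphasis>, a file in user
<emphasis role="bold">smith</emphasis>'s home directory has the same
pathname on every AFS client machine:</para>
<programlisting>
   /afs/example.com/usr/smith/doc/design.txt
</programlisting>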
<para>To enable the transparent access and the uniform namespace
features, the system administrator must follow a few simple
conventions in configuring client machines and file trees. For
details, see <link linkend="HDRWQ39">Making Other Cells Visible in
Your Cell</link>.</para>
</sect2>
<sect2 id="HDRWQ13">
<title>Volumes</title>
<indexterm>
<primary>volume</primary>
<secondary>definition</secondary>
</indexterm>
<para>A <emphasis>volume</emphasis> is a conceptual container for a
set of related files that keeps them all together on one file server
machine partition. Volumes can vary in size, but are (by definition)
smaller than a partition. Volumes are the main administrative unit
in AFS, and have several characteristics that make administrative
tasks easier and help improve overall system
performance. <itemizedlist>
<listitem>
<para>The relatively small size of volumes makes them easy to
move from one partition to another, or even between
machines.</para>
</listitem>
<listitem>
<para>You can maintain maximum system efficiency by moving
volumes to keep the load balanced evenly among the different
machines. If a partition becomes full, the small size of
individual volumes makes it easy to find enough room on other
machines for them.</para>
<indexterm>
<primary>volume</primary>
<secondary>in load balancing</secondary>
</indexterm>
</listitem>
<listitem>
<para>Each volume corresponds logically to a directory in the
file tree and keeps together, on a single partition, all the
data that makes up the files in the directory (including
possible subdirectories). By maintaining (for example) a
separate volume for each user's home directory, you keep all
of the user's files together, but separate from those of other
users. This is an administrative convenience that is
impossible if the partition is the smallest unit of
storage.</para>
<indexterm>
<primary>volume</primary>
<secondary>correspondence with directory</secondary>
</indexterm>
<indexterm>
<primary>directory</primary>
<secondary>correspondence with volume</secondary>
</indexterm>
<indexterm>
<primary>correspondence</primary>
<secondary>of volumes and directories</secondary>
</indexterm>
</listitem>
<listitem>
<para>The directory/volume correspondence also makes
transparent file access possible, because it simplifies the
process of file location. All files in a directory reside
together in one volume and in order to find a file, a file
server process need only know the name of the file's parent
directory, information which is included in the file's
pathname. AFS knows how to translate the directory name into
a volume name, and automatically tracks every volume's
location, even when a volume is moved from machine to
machine. For more about the directory/volume correspondence,
see <link linkend="HDRWQ14">Mount Points</link>.</para>
</listitem>
<listitem>
<para>Volumes increase file availability through replication
and backup.</para>
<indexterm>
<primary>volume</primary>
<secondary>as unit of</secondary>
<tertiary>replication</tertiary>
</indexterm>
<indexterm>
<primary>volume</primary>
<secondary>as unit of</secondary>
<tertiary>backup</tertiary>
</indexterm>
</listitem>
<listitem>
<para>Replication (placing copies of a volume on more than one
file server machine) makes the contents more reliably
available; for details, see <link
linkend="HDRWQ15">Replication</link>. Entire sets of volumes
can be backed up as dump files (possibly to tape) and restored
to the file system; see <link linkend="HDRWQ248">Configuring
the AFS Backup System</link> and <link
linkend="HDRWQ283">Backing Up and Restoring AFS
Data</link>. In AFS, backup also refers to recording the state
of a volume at a certain time and then storing it (either on
tape or elsewhere in the file system) for recovery in the
event files in it are accidentally deleted or changed. See
<link linkend="HDRWQ201">Creating Backup
Volumes</link>.</para>
</listitem>
<listitem>
<para>Volumes are the unit of resource management. A space
quota associated with each volume sets a limit on the maximum
volume size, as shown in the example following this list. See
<link linkend="HDRWQ234">Setting and
Displaying Volume Quota and Current Size</link>.</para>
<indexterm>
<primary>volume</primary>
<secondary>as unit of</secondary>
<tertiary>resource management</tertiary>
</indexterm>
</listitem>
</itemizedlist>
</para>
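<para>A minimal sketch of quota management (the directory and sizes
are hypothetical, and the output shown is illustrative): the
<emphasis role="bold">fs setquota</emphasis> command sets a volume's
quota in kilobyte blocks, and the <emphasis role="bold">fs
listquota</emphasis> command displays the quota and current
usage:</para>
<programlisting>
   % fs setquota -path /afs/example.com/usr/smith -max 50000
   % fs listquota /afs/example.com/usr/smith
   Volume Name                    Quota       Used %Used   Partition
   user.smith                     50000       9064   18%         5%
</programlisting>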
</sect2>
<sect2 id="HDRWQ14">
<title>Mount Points</title>
<indexterm>
<primary>mount point</primary>
<secondary>definition</secondary>
</indexterm>
<para>The previous section discussed how each volume corresponds
logically to a directory in the file system: the volume keeps
together on one partition all the data in the files residing in the
directory. The directory that corresponds to a volume is called its
<emphasis>root directory</emphasis>, and the mechanism that
associates the directory and volume is called a <emphasis>mount
point</emphasis>. A mount point is similar to a symbolic link in the
file tree that specifies which volume contains the files kept in a
directory. A mount point is not an actual symbolic link; its
internal structure is different.</para>
<note>
<para>You must not create, in AFS, a symbolic link to a file whose
name begins with the number sign (#) or the percent sign (%),
because the Cache Manager interprets such a link as a mount point
to a regular or read/write volume, respectively.</para>
</note>
<indexterm>
<primary>root directory</primary>
</indexterm>
<indexterm>
<primary>directory</primary>
<secondary>root</secondary>
</indexterm>
<indexterm>
<primary>volume</primary>
<secondary>root directory of</secondary>
</indexterm>
<indexterm>
<primary>volume</primary>
<secondary>mounting</secondary>
</indexterm>
<para>The use of mount points means that many of the elements in an
AFS file tree that look and function just like standard UNIX file
system directories are actually mount points. In form, a mount point
is a symbolic link in a special format that names the volume
containing the data for files in the directory. When the Cache
Manager (see <link linkend="HDRWQ28">The Cache Manager</link>)
encounters a mount point--for example, in the course of interpreting
a pathname--it looks in the volume named in the mount point. In the
volume the Cache Manager finds an actual UNIX-style directory
element--the volume's root directory--that lists the files contained
in the directory/volume. The next element in the pathname appears in
that list.</para>
<para>A volume is said to be <emphasis>mounted</emphasis> at the
point in the file tree where there is a mount point pointing to the
volume. A volume's contents are not visible or accessible unless it
is mounted. Unlike some other file systems, AFS volumes can be
mounted at multiple locations in the file system at the same
time.</para>
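<para>As a sketch (the directory and volume names are hypothetical),
the <emphasis role="bold">fs mkmount</emphasis> command creates a
mount point for a volume, and <emphasis role="bold">fs
lsmount</emphasis> reports which volume a directory is a mount point
for; the number sign in the output marks a regular mount
point:</para>
<programlisting>
   % fs mkmount -dir /afs/example.com/usr/smith -vol user.smith
   % fs lsmount -dir /afs/example.com/usr/smith
   '/afs/example.com/usr/smith' is a mount point for volume '#user.smith'
</programlisting>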
</sect2>
<sect2 id="HDRWQ15">
<title>Replication</title>
<indexterm>
<primary>replication</primary>
<secondary>definition</secondary>
</indexterm>
<indexterm>
<primary>clone</primary>
</indexterm>
<para><emphasis>Replication</emphasis> refers to making a copy, or
<emphasis>clone</emphasis>, of a source read/write volume and then
placing the copy on one or more additional file server machines in a
cell. One benefit of replicating a volume is that it increases the
availability of the contents. If one file server machine housing the
volume fails, users can still access the volume on a different
machine. No one machine need become overburdened with requests for a
popular file, either, because the file is available from several
machines.</para>
<para>Replication is not necessarily appropriate for cells with
limited disk space, nor are all types of volumes equally suitable
for replication (replication is most appropriate for volumes that
contain popular files that do not change very often). For more
details, see <link linkend="HDRWQ50">When to Replicate
Volumes</link>.</para>
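<para>A sketch of the usual replication workflow (the server,
partition, and volume names are hypothetical): <emphasis
role="bold">vos addsite</emphasis> defines each read-only site, and
<emphasis role="bold">vos release</emphasis> distributes a new clone
of the read/write source volume to all defined sites:</para>
<programlisting>
   % vos addsite -server fs1.example.com -partition /vicepa -id sw.tools
   % vos addsite -server fs2.example.com -partition /vicepb -id sw.tools
   % vos release -id sw.tools
</programlisting>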
</sect2>
<sect2 id="HDRWQ16">
<title>Caching and Callbacks</title>
<indexterm>
<primary>caching</primary>
</indexterm>
<para>Just as replication increases system availability,
<emphasis>caching</emphasis> increases the speed and efficiency of
file access in AFS. Each AFS client machine dedicates a portion of
its local disk or memory to a cache where it stores data
temporarily. Whenever an application program (such as a text editor)
running on a client machine requests data from an AFS file, the
request passes through the Cache Manager. The Cache Manager is a
portion of the client machine's kernel that translates file requests
from local application programs into cross-network requests to the
<emphasis>File Server process</emphasis> running on the file server
machine storing the file. When the Cache Manager receives the
requested data from the File Server, it stores it in the cache and
then passes it on to the application program.</para>
<para>Caching improves the speed of data delivery to application
programs in the following ways:</para>
<itemizedlist>
<listitem>
<para>When the application program repeatedly asks for data from
the same file, the data is already on the local disk. The application
does not have to wait for the Cache Manager to request and
receive the data from the File Server.</para>
</listitem>
<listitem>
<para>Caching data eliminates the need for repeated request and
transfer of the same data, so network traffic is reduced. Thus,
initial requests and other traffic can get through more
quickly.</para>
<indexterm>
<primary>AFS</primary>
<secondary>reducing traffic in</secondary>
</indexterm>
<indexterm>
<primary>network</primary>
<secondary>reducing traffic through caching</secondary>
</indexterm>
<indexterm>
<primary>slowed performance</primary>
<secondary>preventing in AFS</secondary>
</indexterm>
</listitem>
</itemizedlist>
<indexterm>
<primary>callback</primary>
</indexterm>
<indexterm>
<primary>consistency guarantees</primary>
<secondary>cached data</secondary>
</indexterm>
<para>While caching provides many advantages, it also creates the
problem of maintaining consistency among the many cached copies of a
file and the source version of a file. This problem is solved using
a mechanism referred to as a <emphasis>callback</emphasis>.</para>
<para>A callback is a promise by a File Server to a Cache Manager to
inform the latter when a change is made to any of the data delivered
by the File Server. Callbacks are used differently based on the type
of file delivered by the File Server: <itemizedlist>
<listitem>
<para>When a File Server delivers a writable copy of a file
(from a read/write volume) to the Cache Manager, the File
Server sends along a callback with that file. If the source
version of the file is changed by another user, the File
Server breaks the callback associated with the cached version
of that file--indicating to the Cache Manager that it needs to
update the cached copy.</para>
</listitem>
<listitem>
<para>When a File Server delivers a file from a read-only
volume to the Cache Manager, the File Server sends along a
callback associated with the entire volume (so it does not
need to send any more callbacks when it delivers additional
files from the volume). Only a single callback is required per
accessed read-only volume because files in a read-only volume
can change only when a new version of the complete volume is
released. All callbacks associated with the old version of the
volume are broken at release time.</para>
</listitem>
</itemizedlist>
</para>
<para>The callback mechanism ensures that the Cache Manager always
requests the most up-to-date version of a file. However, it does not
ensure that the user necessarily notices the most current version as
soon as the Cache Manager has it. That depends on how often the
application program requests additional data from the File System or
how often it checks with the Cache Manager.</para>
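<para>Callbacks normally make manual intervention unnecessary, but
the Cache Manager can be told to discard cached data explicitly. As
an illustrative sketch (the pathname is hypothetical, and the output
shown is illustrative), the <emphasis role="bold">fs
flush</emphasis> command discards a cached file so that the next
reference fetches a fresh copy, and <emphasis role="bold">fs
checkvolumes</emphasis> forces the Cache Manager to refresh its
table of volume mappings:</para>
<programlisting>
   % fs flush /afs/example.com/usr/smith/notes.txt
   % fs checkvolumes
   All volumeID/name mappings checked.
</programlisting>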
</sect2>
</sect1>
<sect1 id="HDRWQ17">
<title>AFS Server Processes and the Cache Manager</title>
<indexterm>
<primary>AFS</primary>
<secondary>server processes used in</secondary>
</indexterm>
<indexterm>
<primary>server</primary>
<secondary>process</secondary>
<tertiary>list of AFS</tertiary>
</indexterm>
<para>As mentioned in <link linkend="HDRWQ10">Servers and
Clients</link>, AFS file server machines run a number of processes,
each with a specialized function. One of the main responsibilities of
a system administrator is to make sure that processes are running
correctly as much of the time as possible, using the administrative
services that the server processes provide.</para>
<para>The following list briefly describes the function of each server
process and the Cache Manager; the following sections then discuss the
important features in more detail.</para>
<para>The <emphasis>File Server</emphasis>, the most fundamental of
the servers, delivers data files from the file server machine to local
workstations as requested, and stores the files again when the user
saves any changes to the files.</para>
<para>The <emphasis>Basic OverSeer Server (BOS Server)</emphasis>
ensures that the other server processes on its server machine are
running correctly as much of the time as possible, since a server is
useful only if it is available. The BOS Server relieves system
administrators of much of the responsibility for overseeing system
operations.</para>
<para>The <emphasis>Protection Server</emphasis> helps users control who has access to
their files and directories. It is responsible for mapping Kerberos
principals to AFS identities. Users can also grant access to several
other users at once by putting them all in a group entry in the
Protection Database maintained by the Protection Server.</para>
<para>The <emphasis>Volume Server</emphasis> performs all types of
volume manipulation. It helps the administrator move volumes from one
server machine to another to balance the workload among the various
machines.</para>
<para>The <emphasis>Volume Location Server (VL Server)</emphasis>
maintains the Volume Location Database (VLDB), in which it records the
location of volumes as they move from file server machine to file
server machine. This service is the key to transparent file access for
users.</para>
<para>The <emphasis>Salvager</emphasis> is not a server in the sense
that others are. It runs only after the File Server or Volume Server
fails; it repairs any inconsistencies caused by the failure. The
system administrator can invoke it directly if necessary.</para>
<para>The <emphasis>Update Server</emphasis> distributes new versions
of AFS server process software and configuration information to all
file server machines. It is crucial to stable system performance that
all server machines run the same software.</para>
<para>The <emphasis>Backup Server</emphasis> maintains the Backup
Database, in which it stores information related to the Backup
System. It enables the administrator to back up data from volumes to
tape. The data can then be restored from tape in the event that it is
lost from the file system. The Backup Server is optional and is only
one of several ways that the data in an AFS cell can be backed
up.</para>
<para>The <emphasis>Cache Manager</emphasis> is the one component in
this list that resides on AFS client machines rather than file server
machines. It is not a process per se, but rather a part of the kernel on
AFS client machines that communicates with AFS server processes. Its
main responsibilities are to retrieve files for application programs
running on the client and to maintain the files in the cache.</para>
<para>AFS also relies on two other services that are not part of AFS
and need to be installed separately:</para>
<para>AFS requires a <emphasis>Kerberos KDC</emphasis> to use for user
authentication. It verifies user identities at login and provides the
facilities through which participants in transactions prove their
identities to one another (mutually authenticate). AFS uses Kerberos
for all of its authentication. The Kerberos KDC replaces the old
<emphasis>Authentication Server</emphasis> included in OpenAFS. The
Authentication Server is still available for sites that need it, but
is now deprecated and should not be used for any new
installations.</para>
<para>The <emphasis>Network Time Protocol Daemon (NTPD)</emphasis> is
not an AFS server process, but plays a vital role nonetheless. It
synchronizes the internal clock on a file server machine with those on
other machines. Synchronized clocks are particularly important for
correct functioning of the AFS distributed database technology (known
as Ubik); see <link linkend="HDRWQ103">Configuring the Cell for Proper
Ubik Operation</link>. The NTPD is usually provided with the operating
system.</para>
<sect2 id="HDRWQ18">
<title>The File Server</title>
<indexterm>
<primary>File Server</primary>
<secondary>description</secondary>
</indexterm>
<para>The <emphasis>File Server</emphasis> is the most fundamental
of the AFS server processes and runs on each file server machine. It
provides the same services across the network that the UNIX file
system provides on the local disk: <itemizedlist>
<listitem>
<para>Delivering programs and data files to client
workstations as requested and storing them again when the
client workstation finishes with them.</para>
</listitem>
<listitem>
<para>Maintaining the hierarchical directory structure that
users create to organize their files.</para>
</listitem>
<listitem>
<para>Handling requests for copying, moving, creating, and
deleting files and directories.</para>
</listitem>
<listitem>
<para>Keeping track of status information about each file and
directory (including its size and latest modification
time).</para>
</listitem>
<listitem>
<para>Making sure that users are authorized to perform the
actions they request on particular files or
directories.</para>
</listitem>
<listitem>
<para>Creating symbolic and hard links between files.</para>
</listitem>
<listitem>
<para>Granting advisory locks (corresponding to UNIX locks) on
request.</para>
</listitem>
</itemizedlist>
</para>
</sect2>
<sect2 id="HDRWQ19">
<title>The Basic OverSeer Server</title>
<indexterm>
<primary>BOS Server</primary>
<secondary>description</secondary>
</indexterm>
<para>The <emphasis>Basic OverSeer Server (BOS Server)</emphasis>
reduces the demands on system administrators by constantly
monitoring the processes running on its file server machine. It can
restart failed processes automatically and provides a convenient
interface for administrative tasks.</para>
<para>The BOS Server runs on every file server machine. Its primary
function is to minimize system outages. It also</para>
<itemizedlist>
<listitem>
<para>Constantly monitors the other server processes (on the
local machine) to make sure they are running correctly.</para>
</listitem>
<listitem>
<para>Automatically restarts failed processes, without
contacting a human operator. When restarting multiple server
processes simultaneously, the BOS server takes interdependencies
into account and initiates restarts in the correct order.</para>
<indexterm>
<primary>system outages</primary>
<secondary>reducing</secondary>
</indexterm>
<indexterm>
<primary>outages</primary>
<secondary>BOS Server role in,</secondary>
</indexterm>
</listitem>
<listitem>
<para>Accepts requests from the system administrator. Common
reasons to contact the BOS Server are to verify the status of
server processes on file server machines (as shown in the example
following this list), install and start new processes, stop
processes either temporarily or permanently, and restart dead
processes manually.</para>
</listitem>
<listitem>
<para>Helps system administrators to manage system configuration
information. The BOS Server provides a simple interface for
modifying two files that contain information about privileged
users and certain special file server machines. It also
automates the process of adding and changing <emphasis>server
encryption keys</emphasis>, which are important in mutual
authentication, if the Authentication Server is still in use,
but this function of the BOS Server is deprecated. For more
details about these configuration files, see <link
linkend="HDRWQ85">Common Configuration Files in the /usr/afs/etc
Directory</link>.</para>
</listitem>
</itemizedlist>
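<para>As an illustrative sketch (the machine name is hypothetical,
and the output shown is illustrative), the <emphasis role="bold">bos
status</emphasis> command queries the BOS Server on a file server
machine about the processes it monitors:</para>
<programlisting>
   % bos status -server fs1.example.com
   Instance fs, currently running normally.
       Auxiliary status is: file server running.
   Instance vlserver, currently running normally.
   Instance ptserver, currently running normally.
</programlisting>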
</sect2>
<sect2 id="HDRWQ21">
<title>The Protection Server</title>
<indexterm>
<primary>protection</primary>
<secondary>in AFS</secondary>
</indexterm>
<indexterm>
<primary>Protection Server</primary>
<secondary>description</secondary>
</indexterm>
<indexterm>
<primary>protection</primary>
<secondary>in UNIX</secondary>
</indexterm>
<para>The <emphasis>Protection Server</emphasis> is the key to AFS's
refinement of the normal UNIX methods for protecting files and
directories from unauthorized use. The refinements include the
following: <itemizedlist>
<listitem>
<para>Defining associations between Kerberos principals and
AFS identities. Normally, this is a simple mapping between
principal names in the Kerberos realm associated with an AFS
cell to AFS identities in that cell, but the Protection Server
also manages mappings for users using cross-realm
authentication from a different Kerberos realm.</para>
</listitem>
<listitem>
<para>Defining seven access permissions rather than the
standard UNIX file system's three. In conjunction with the
UNIX mode bits associated with each file and directory
element, AFS associates an <emphasis>access control list
(ACL)</emphasis> with each directory. The ACL specifies which
users have which of the seven specific permissions for the
directory and all the files it contains. For a definition of
AFS's seven access permissions and how users can set them on
access control lists, see <link linkend="HDRWQ562">Managing
Access Control Lists</link>.</para>
<indexterm>
<primary>access</primary>
<secondary></secondary>
<see>ACL</see>
</indexterm>
</listitem>
<listitem>
<para>Enabling users to grant permissions to numerous
individual users--a different combination to each individual
if desired. UNIX protection distinguishes among only three
users or groups: the owner of the file, members of a single
specified group, and everyone who can access the local file
system.</para>
</listitem>
<listitem>
<para>Enabling users to define their own groups of users,
recorded in the <emphasis>Protection Database</emphasis>
maintained by the Protection Server. The groups then appear on
directories' access control lists as though they were
individuals, which enables the granting of permissions to many
users simultaneously.</para>
</listitem>
<listitem>
<para>Enabling system administrators to create groups
containing client machine IP addresses to permit access when
it originates from the specified client machines. These types
of groups are useful when it is necessary to adhere to
machine-based licensing restrictions or where it is difficult
for some reason to obtain Kerberos credentials for processes
running on those systems that need access to AFS.</para>
</listitem>
</itemizedlist>
</para>
<indexterm>
<primary>group</primary>
<secondary>definition</secondary>
</indexterm>
<indexterm>
<primary>Protection Database</primary>
</indexterm>
<para>The Protection Server's main duty is to help the File Server
determine if a user is authorized to access a file in the requested
manner. The Protection Server creates a list of all the groups to
which the user belongs. The File Server then compares this list to
the ACL associated with the file's parent directory. A user thus
acquires access both as an individual and as a member of any
groups.</para>
<para>The Protection Server also maps Kerberos principals to
<emphasis>AFS user ID</emphasis> numbers (<emphasis>AFS
UIDs</emphasis>). These UIDs are functionally equivalent to UNIX
UIDs, but operate in the domain of AFS rather than in the UNIX file
system on a machine's local disk. This conversion service is
essential because the tickets that the Kerberos KDC gives to
authenticated users are stamped with principal names (to comply with
Kerberos standards). The AFS server processes identify users by AFS
UID, not by username. Before they can understand whom the token
represents, they need the Protection Server to translate the
username into an AFS UID. For further discussion of the
authentication process, see <link linkend="HDRWQ75">A More Detailed
Look at Mutual Authentication</link>.</para>
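<para>A brief sketch of group management with the <emphasis
role="bold">pts</emphasis> command suite (the user and group names
are hypothetical, and the output shown is illustrative): <emphasis
role="bold">pts creategroup</emphasis> creates a group in the
Protection Database, <emphasis role="bold">pts adduser</emphasis>
adds members, and <emphasis role="bold">pts membership</emphasis>
lists them:</para>
<programlisting>
   % pts creategroup -name smith:colleagues
   % pts adduser -user pat terry -group smith:colleagues
   % pts membership smith:colleagues
   Members of smith:colleagues (id: -286) are:
     pat
     terry
</programlisting>
<para>The group can then be placed on any directory's access control
list as though it were an individual user.</para>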
</sect2>
<sect2 id="HDRWQ22">
<title>The Volume Server</title>
<indexterm>
<primary>Volume Server</primary>
<secondary>description</secondary>
</indexterm>
<para>The <emphasis>Volume Server</emphasis> provides the interface
through which you create, delete, move, and replicate volumes, as
well as prepare them for archiving to disk, tape, or other media
(backing up). <link linkend="HDRWQ13">Volumes</link> explained the
advantages gained by storing files in volumes. Creating and deleting
volumes are necessary when adding and removing users from the
system; volume moves are done for load balancing; and replication
enables volume placement on multiple file server machines (for more
on replication, see <link
linkend="HDRWQ15">Replication</link>).</para>
</sect2>
<sect2 id="HDRWQ23">
<title>The Volume Location (VL) Server</title>
<indexterm>
<primary>VL Server</primary>
<secondary>description</secondary>
</indexterm>
<indexterm>
<primary>VLDB</primary>
</indexterm>
<para>The <emphasis>VL Server</emphasis> maintains a complete list
of volume locations in the <emphasis>Volume Location Database
(VLDB)</emphasis>. When the Cache Manager (see <link
linkend="HDRWQ28">The Cache Manager</link>) begins to fill a file
request from an application program, it first contacts the VL Server
in order to learn which file server machine currently houses the
volume containing the file. The Cache Manager then requests the file
from the File Server process running on that file server
machine.</para>
<para>The VLDB and VL Server make it possible for AFS to take
advantage of the increased system availability gained by using
multiple file server machines, because the Cache Manager knows where
to find a particular file. Indeed, in a certain sense the VL Server
is the keystone of the entire file system--when the information in
the VLDB is inaccessible, the Cache Manager cannot retrieve files,
even if the File Server processes are working properly. A list of
the information stored in the VLDB about each volume is provided in
<link linkend="HDRWQ180">Volume Information in the
VLDB</link>.</para>
<indexterm>
<primary>VL Server</primary>
<secondary>importance to transparent access</secondary>
</indexterm>
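<para>As a sketch of a VLDB query (the volume and server names are
hypothetical, and the output shown is illustrative), the <emphasis
role="bold">vos listvldb</emphasis> command displays the VLDB entry
for a volume, including the sites at which its versions
reside:</para>
<programlisting>
   % vos listvldb -name user.smith
   user.smith
       RWrite: 536870915     Backup: 536870917
       number of sites -> 1
          server fs2.example.com partition /vicepb RW Site
</programlisting>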
</sect2>
<sect2 id="HDRWQ26">
<title>The Salvager</title>
<indexterm>
<primary>Salvager</primary>
<secondary>description</secondary>
</indexterm>
<para>The <emphasis>Salvager</emphasis> differs from other AFS
Servers in that it runs only at selected times. The BOS Server
invokes the Salvager when the File Server, Volume Server, or both
fail. The Salvager attempts to repair disk corruption that can
result from a failure.</para>
<para>As a system administrator, you can also invoke the Salvager as
necessary, even if the File Server or Volume Server has not
failed. See <link linkend="HDRWQ232">Salvaging
Volumes</link>.</para>
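<para>As a sketch (the machine and partition names are
hypothetical), an administrator can salvage all of the volumes on a
single partition through the BOS Server with the <emphasis
role="bold">bos salvage</emphasis> command:</para>
<programlisting>
   % bos salvage -server fs1.example.com -partition /vicepa
</programlisting>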
</sect2>
<sect2 id="HDRWQ24">
<title>The Update Server</title>
<indexterm>
<primary>Update Server</primary>
<secondary>description</secondary>
</indexterm>
<para>The <emphasis>Update Server</emphasis> is an optional process
that helps guarantee that all file server machines are running the
same version of a server process. System performance can be
inconsistent if some machines are running one version of the File
Server (for example) and other machines are running another
version.</para>
<para>To ensure that all machines run the same version of a process,
install new software on a single file server machine of each system
type, called the <emphasis>binary distribution machine</emphasis>
for that type. The binary distribution machine runs the server
portion of the Update Server, whereas all the other machines of that
type run the client portion of the Update Server. The client
portions check frequently with the <emphasis>server
portion</emphasis> to see if they are running the right version of
every process; if not, the <emphasis>client portion</emphasis>
retrieves the right version from the binary distribution machine and
installs it locally. The system administrator does not need to
remember to install new software individually on all the file server
machines: the Update Server does it automatically. For more on
binary distribution machines, see <link linkend="HDRWQ93">Binary
Distribution Machines</link>.</para>
<indexterm>
<primary>Update Server</primary>
<secondary>server portion</secondary>
</indexterm>
<indexterm>
<primary>Update Server</primary>
<secondary>client portion</secondary>
</indexterm>
<para>The Update Server also distributes configuration files that
all file server machines need to store on their local disks (for a
description of the contents and purpose of these files, see <link
linkend="HDRWQ85">Common Configuration Files in the /usr/afs/etc
Directory</link>). As with server process software, the need for
consistent system performance demands that all the machines have the
same version of these files. The system administrator needs to make
changes to these files on one machine only, the cell's
<emphasis>system control machine</emphasis>, which runs a server
portion of the Update Server. All other machines in the cell run a
client portion that accesses the correct versions of these
configuration files from the system control machine. For more
information, see <link linkend="HDRWQ94">The System Control
Machine</link>.</para>
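<para>A sketch of how the two portions are typically started under
BOS Server control (the machine names are hypothetical, the paths
shown are the conventional OpenAFS locations, and the exact argument
strings vary by site). The binary distribution machine runs an
<emphasis role="bold">upserver</emphasis> process, and every other
machine of that system type runs an <emphasis
role="bold">upclient</emphasis> process pointed at it:</para>
<programlisting>
   % bos create -server fs1.example.com -instance upserver -type simple \
         -cmd "/usr/afs/bin/upserver -clear /usr/afs/bin"
   % bos create -server fs2.example.com -instance upclientbin -type simple \
         -cmd "/usr/afs/bin/upclient fs1.example.com -clear /usr/afs/bin"
</programlisting>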
</sect2>
<sect2 id="HDRWQ25">
<title>The Backup Server</title>
<indexterm>
<primary>Backup System</primary>
<secondary>Backup Server described</secondary>
</indexterm>
<indexterm>
<primary>Backup Server</primary>
<secondary>description</secondary>
</indexterm>
<para>The <emphasis>Backup Server</emphasis> is an optional process
that maintains the information in the <emphasis>Backup
Database</emphasis>. The Backup Server and the Backup Database
enable administrators to back up data from AFS volumes to tape and
restore it from tape to the file system if necessary. The server and
database together are referred to as the Backup System. This Backup
System is only one way to back up AFS, and many AFS cells use
different methods.</para>
<para>Administrators who wish to use the Backup System initially
configure it by defining sets of volumes to be dumped together and
the schedule by which the sets are to be dumped. They also install
the system's tape drives and define the drives' <emphasis>Tape
Coordinators</emphasis>, which are the processes that control the
tape drives.</para>
<para>Once the Backup System is configured, user and system data can
be dumped from volumes to tape or disk. In the event that data is
ever lost from the system (for example, if a system or disk failure
causes data to be lost), administrators can restore the data from
tape. If tapes are periodically archived, or saved, data can also be
restored to its state at a specific time. Additionally, because
Backup System data is difficult to reproduce, the Backup Database
itself can be backed up to tape and restored if it ever becomes
corrupted. For more information on configuring and using the Backup
System, and on other AFS backup options, see <link
linkend="HDRWQ248">Configuring the AFS Backup System</link> and
<link linkend="HDRWQ283">Backing Up and Restoring AFS
Data</link>.</para>
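<para>A minimal sketch of defining a volume set (the set, server,
partition, and volume names are hypothetical): the <emphasis
role="bold">backup addvolset</emphasis> command creates a volume set
in the Backup Database, and <emphasis role="bold">backup
addvolentry</emphasis> adds a rule describing which volumes it
includes:</para>
<programlisting>
   % backup addvolset -name user
   % backup addvolentry -name user -server fs1.example.com \
         -partition /vicepa -volumes "user.*"
</programlisting>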
</sect2>
<sect2 id="HDRWQ28">
<title>The Cache Manager</title>
<indexterm>
<primary>Cache Manager</primary>
<secondary>functions of</secondary>
</indexterm>
<para>As already mentioned in <link linkend="HDRWQ16">Caching and
Callbacks</link>, the <emphasis>Cache Manager</emphasis> is the one
component in this section that resides on client machines rather
than on file server machines. It is a combination of a daemon
process and a set of extensions or modifications in the client
machine's kernel, usually implemented as a loadable kernel module,
that enable communication with the server processes running on
server machines. Its main duty is to translate file requests (made
by application programs on client machines) into <emphasis>remote
procedure calls (RPCs)</emphasis> to the File Server. (The Cache
Manager first contacts the VL Server to find out which File Server
currently houses the volume that contains a requested file, as
mentioned in <link linkend="HDRWQ23">The Volume Location (VL)
Server</link>). When the Cache Manager receives the requested file,
it caches it before passing data on to the application
program.</para>
<para>The Cache Manager also tracks the state of files in its cache
compared to the version at the File Server by storing the callbacks
sent by the File Server. When the File Server breaks a callback,
indicating that a file or volume changed, the Cache Manager requests
a copy of the new version before providing more data to application
programs.</para>
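<para>As an illustrative sketch (the sizes and output shown are
hypothetical), the <emphasis role="bold">fs getcacheparms</emphasis>
command reports current cache usage on a client machine, and the
<emphasis role="bold">fs setcachesize</emphasis> command (on a
machine using a disk cache) resizes the cache without a
reboot:</para>
<programlisting>
   % fs getcacheparms
   AFS using 4461 of the cache's available 100000 1K byte blocks.
   % fs setcachesize -blocks 120000
</programlisting>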
</sect2>
<sect2 id="HDRWQ20">
<title>The Kerberos KDC</title>
<indexterm>
<primary>Kerberos KDC</primary>
<secondary>description</secondary>
</indexterm>
<indexterm>
<primary>Authentication Server</primary>
<secondary>description</secondary>
<seealso>Kerberos KDC</seealso>
</indexterm>
<indexterm>
<primary>Active Directory</primary>
<secondary>Kerberos KDC</secondary>
</indexterm>
<indexterm>
<primary>MIT Kerberos</primary>
<secondary>Kerberos KDC</secondary>
</indexterm>
<indexterm>
<primary>Heimdal</primary>
<secondary>Kerberos KDC</secondary>
</indexterm>
<para>The <emphasis>Kerberos KDC</emphasis> (Key Distribution
Center) performs two main functions related to network security:
<itemizedlist>
<listitem>
<para>Verifying the identity of users as they log into the
system by requiring that they provide a password or some other
form of authentication credentials. The Kerberos KDC grants
the user a ticket, which is converted into a token to prove to
AFS server processes that the user has authenticated. For more
on tokens, see <link linkend="HDRWQ76">Complex Mutual
Authentication</link>.</para>
</listitem>
<listitem>
<para>Providing the means through which server and client
processes prove their identities to each other (mutually
authenticate). This helps to create a secure environment in
which to send cross-network messages.</para>
</listitem>
</itemizedlist>
</para>
<para>The Kerberos KDC is a required service, but does not come with
OpenAFS. One Kerberos KDC may provide authentication services for
multiple AFS cells. Each AFS cell must be associated with a Kerberos
realm with one or more Kerberos KDCs supporting version 4 or 5 of
the Kerberos protocol. Kerberos version 4 is not secure and is
supported only for backwards compatibility; Kerberos 5 should be
used for any new installation.</para>
<para>A Kerberos KDC maintains a database in which it stores
encryption keys for users and for services, including the AFS server
encryption key. For users, these encryption keys are normally formed
by converting a user password to a key, but Kerberos KDCs also
support other authentication mechanisms. To learn more about the
procedures AFS uses to verify user identity and during mutual
authentication, see <link linkend="HDRWQ75">A More Detailed Look at
Mutual Authentication</link>.</para>
<para>Kerberos KDC software is included with some operating systems
or may be acquired separately. MIT Kerberos, Heimdal, and Microsoft
Active Directory are known to work with OpenAFS as a Kerberos
server. The Kerberos technology was originally developed by the
Massachusetts Institute of Technology's Project Athena.</para>
<note>
<para>The <emphasis>Authentication Server</emphasis>, or kaserver,
was a Kerberos version 4 KDC. It is obsolete and should no longer
be used. A third-party Kerberos version 5 KDC should be used
instead. The Authentication Server is still provided with OpenAFS,
but only for backward compatibility and legacy support for sites
that have not yet migrated to a Kerberos version 5 KDC. All
references to the <emphasis>Kerberos KDC</emphasis> in this guide
refer to a Kerberos version 5 server.</para>
</note>
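<para>A sketch of the typical login sequence from a client machine
(the principal, cell name, and output shown are hypothetical): the
user obtains a Kerberos ticket with the Kerberos <emphasis
role="bold">kinit</emphasis> command, converts it into an AFS token
with the OpenAFS <emphasis role="bold">aklog</emphasis> command, and
verifies the result with <emphasis role="bold">tokens</emphasis>:</para>
<programlisting>
   % kinit smith@EXAMPLE.COM
   Password for smith@EXAMPLE.COM:
   % aklog -cell example.com
   % tokens
   Tokens held by the Cache Manager:

   User's (AFS ID 1056) tokens for afs@example.com [Expires Oct  5 23:14]
      --End of list--
</programlisting>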
<indexterm>
<primary>AFS</primary>
<secondary></secondary>
<see>AFS UID</see>
</indexterm>
<indexterm>
<primary>username</primary>
<secondary>use by Kerberos</secondary>
</indexterm>
<indexterm>
<primary>UNIX</primary>
<secondary>UID</secondary>
<tertiary>functional difference from AFS UID</tertiary>
</indexterm>
<indexterm>
<primary>Kerberos</primary>
<secondary>use of usernames</secondary>
</indexterm>
</sect2>
<sect2 id="HDRWQ27">
<title>The Network Time Protocol Daemon</title>
<indexterm>
<primary>ntpd</primary>
<secondary>description</secondary>
</indexterm>
<para>The <emphasis>Network Time Protocol Daemon (NTPD)</emphasis>
is not an AFS server process, but plays an important role. It helps
guarantee that all of the file server machines and client machines
agree on the time. The NTPD on all file server machines learns the
correct time from a parent NTPD source, which may be located inside
or outside the cell.</para>
<para>Keeping clocks synchronized is particularly important to the
correct operation of AFS's distributed database technology, which
coordinates the copies of the Backup, Protection, and Volume
Location Databases; see <link linkend="HDRWQ52">Replicating the
OpenAFS Administrative Databases</link>. Client machines may also
refer to these clocks for the correct time; therefore, it is less
confusing if all file server machines have the same time. For more
technical detail about the NTPD, see <ulink
url="http://www.ntp.org/">The NTP web site</ulink> or the
documentation for your operating system.</para>
<important><title>Clock Skew Impact</title> <para>Client machines
attempting to authenticate to an OpenAFS cell with valid credentials
may still fail to do so when the clocks of the client machine, the
Kerberos KDC, and the file server machines are not in sync.</para></important>
</sect2>
</sect1>
</chapter>