## Introduction

This version works on Linux 2.6, and provides the following features:

- Basic AFS/NFS translator functionality, similar to other platforms
- Ability to distinguish PAGs assigned within each NFS client
- A new 'afspag' kernel module, which provides PAG management on
  NFS client systems, and forwards AFS system calls to the translator
  system via the remote AFS system call (rmtsys) protocol.
- Support for transparent migration of an NFS client from one translator
  server to another, without loss of credentials or sysnames.
- The ability to force the translator to discard all credentials
  belonging to a specified NFS client host.

The patch applies to OpenAFS 1.4.1, and has been tested against the
kernel-2.6.9-22.0.2.EL kernel binaries as provided by the CentOS project
(essentially these are rebuilds from source of Red Hat Enterprise Linux).
This patch is not expected to apply cleanly to newer versions of OpenAFS,
due to conflicting changes in parts of the kernel module source. To apply
this patch, use 'patch -p0'.

It has been integrated into OpenAFS 1.5.x.

## New in Version 1.4

- There was no version 1.3
- Define a "sysname generation number" which changes any time the sysname
  list is changed for the translator or any client. This number is used
  as the nanoseconds part of the mtime of directories, which forces NFS
  clients to reevaluate directory lookups any time the sysname changes.
- Fixed several bugs related to sysname handling
- Fixed a bug preventing 'fs exportafs' from changing the flag which
  controls whether callbacks are made to NFS clients to obtain tokens
  and sysname lists.
- Starting in this version, when the PAG manager starts up, it makes a
  call to the translator to discard any tokens belonging to that client.
  This fixes a problem where newly-created PAGs on the client would
  inherit tokens owned by an unrelated process from an earlier boot.
- Enabled the PAG manager to forward non-V-series pioctls.
- Forward ported to OpenAFS 1.4.1 final
- Added a file, /proc/fs/openafs/unixusers, which reports information
  about "unixuser" structures, which are used to record tokens and to
  bind translator-side PAGs to NFS client data and sysname lists.

## Finding the RPC server authtab

In order to correctly detect NFS clients and distinguish between them,
the translator must insert itself into the RPC authentication process.
This requires knowing the address of the RPC server authentication dispatch
table, which is not exported from standard kernels. To address this, the
kernel must be patched such that net/sunrpc/svcauth.c exports the 'authtab'
symbol, or this symbol's address must be provided when the OpenAFS kernel
module is loaded, using the option "authtab_addr=0xXXXXXXXX" where XXXXXXXX
is the address of the authtab symbol as obtained from /proc/kallsyms. The
latter may be accomplished by adding the following three lines to the
openafs-client init script in place of 'modprobe openafs':

    modprobe sunrpc
    authtab=`awk '/[ \t]authtab[ \t]/ { print $1 }' < /proc/kallsyms`
    modprobe openafs ${authtab:+authtab_addr=0x$authtab}

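The awk pattern in the init script lines requires whitespace after the
symbol name, which holds when sunrpc is loaded as a module (the kallsyms
line then ends with a tab and "[sunrpc]"). As a hedged alternative, the
sketch below matches on the third field instead, so it should also find the
symbol if sunrpc is built into the kernel. The function name authtab_addr
and its optional file argument are made up for illustration and are not
part of OpenAFS.

```shell
# Print the address of the 'authtab' symbol from a kallsyms-format file.
# The file argument is optional and exists only so the function can be
# tried against sample data; it defaults to /proc/kallsyms.
authtab_addr() {
    awk '$3 == "authtab" { print $1; exit }' "${1:-/proc/kallsyms}"
}

# Intended use in the init script (requires root):
#   modprobe sunrpc
#   authtab=$(authtab_addr)
#   modprobe openafs ${authtab:+authtab_addr=0x$authtab}
```

If the symbol is not found, the function prints nothing, and the
${authtab:+...} expansion then omits the module option entirely, as in the
original lines.
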
## Exporting the NFS filesystem

In order for the translator to work correctly, /afs must be exported with
specific options. Specifically, the 'no_subtree_check' option is needed
in order to prevent the common NFS server code from performing unwanted
access checks, and an fsid option must be provided to set the filesystem
identifier to be used in NFS filehandles. Note that for live migration
to work, a consistent filesystem id must be used on all translator systems.
The export may be accomplished with a line in /etc/exports:

    /afs (rw,no_subtree_check,fsid=42)

Or with a command:

    exportfs -o rw,no_subtree_check,fsid=42 :/afs

The AFS/NFS translator code is enabled by default; no additional command
is required to activate it. However, the 'fs exportafs nfs' command can
be used to disable or re-enable the translator and to set options. Note
that support for client-assigned PAGs is not enabled by default, and
must be enabled with the following command:

    fs exportafs nfs -clipags on

Support for making callbacks to obtain credentials and sysnames from
newly-discovered NFS clients is also not enabled by default, because this
would result in long timeouts on requests from NFS clients which do not
support this feature. To enable this feature, use the following command:

    fs exportafs nfs -pagcb on

## Client-Side PAG Management

Management of PAGs on individual NFS clients is provided by the kernel
module afspag.ko, which is automatically built alongside the libafs.ko
module on Linux 2.6 systems. This component is not currently supported
on any other platform.

To activate the client PAG manager, simply load the module; no additional
parameters or commands are required. Once the module is loaded, PAGs
may be acquired using the setpag() call, exactly as on systems running the
full cache manager. Both the traditional system call and new-style ioctl
entry points are supported.

In addition, the PAG manager can forward pioctl() calls to an AFS/NFS
translator system via the remote AFS system call service (rmtsys). To
enable this feature, the kernel module must be loaded with a parameter
specifying the location of the translator system:

    insmod afspag.ko nfs_server_addr=0xAABBCCDD

In this example, 0xAABBCCDD is the IP address of the translator system,
in network byte order. For example, if the translator has the IP address
192.168.42.100, the nfs_server_addr parameter should be set to 0xc0a82a64.

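The conversion from a dotted-quad address to the 0xAABBCCDD form can be
done by hand, one octet per pair of hex digits. The helper below is a small
sketch of that arithmetic; the name ip_to_hex is invented for this example
and is not an OpenAFS tool.

```shell
# Convert a dotted-quad IPv4 address to the 0xAABBCCDD form.
# The body runs in a subshell so that changing IFS does not leak out.
ip_to_hex() (
    IFS=.
    set -- $1     # split the address on dots into $1..$4
    printf '0x%02x%02x%02x%02x\n' "$1" "$2" "$3" "$4"
)

ip_to_hex 192.168.42.100    # prints 0xc0a82a64
```

The result could then be passed directly when loading the module, for
example 'insmod afspag.ko nfs_server_addr=$(ip_to_hex 192.168.42.100)'.
Octets should be written without leading zeros, since printf would
otherwise read them as octal.
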
The PAG manager can be shut down using 'afsd -shutdown' (ironically, this
is the only circumstance in which that command is useful). Once the
shutdown is complete, the kernel module can be removed using rmmod.

## Remote System Calls

The NFS translator supports the ability of NFS clients to perform various
AFS-specific operations via the remote system call interface (rmtsys).
To enable this feature, afsd must be run with the -rmtsys switch. OpenAFS
client utilities will use this feature automatically if the AFSSERVER
environment variable is set to the address or hostname of the translator
system, or if the file ~/.AFSSERVER or /.AFSSERVER exists and contains the
translator's address or hostname.

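To check which translator a client utility would be pointed at, the three
sources can be inspected in turn. The sketch below assumes they are
consulted in the order listed above (environment variable, then
~/.AFSSERVER, then /.AFSSERVER); that precedence is an assumption of this
example, and find_afsserver is a made-up name, not an OpenAFS command.

```shell
# Print the configured translator address or hostname, or return
# non-zero if none of the three sources is present.
# ASSUMPTION: sources are tried in the order the text lists them.
find_afsserver() {
    if [ -n "${AFSSERVER:-}" ]; then
        printf '%s\n' "$AFSSERVER"
    elif [ -r "$HOME/.AFSSERVER" ]; then
        head -n 1 "$HOME/.AFSSERVER"
    elif [ -r /.AFSSERVER ]; then
        head -n 1 /.AFSSERVER
    else
        return 1
    fi
}
```

For example, after 'echo 192.168.42.100 > ~/.AFSSERVER' and with AFSSERVER
unset, find_afsserver prints 192.168.42.100.
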
On systems running the client PAG manager (afspag.ko), AFS system calls
made via the traditional methods will be automatically forwarded to the
NFS translator system, if the PAG manager is configured to do so. This
feature must be enabled, as described above.

## Credential Caching

The client PAG manager maintains a cache of credentials belonging to each
PAG. When an application makes a system call to set or remove AFS tokens,
the PAG manager updates its cache in addition to forwarding the request
to the NFS server.

When the translator hears from a previously-unknown client, it makes a
callback to the client to retrieve a copy of any cached credentials.
This means that credentials belonging to an NFS client are not lost if
the translator is rebooted, or if the client's location on the network
changes such that it is talking to a different translator.

This feature is automatically supported by the PAG manager if it has
been configured to forward system calls to an NFS translator. However,
requests will be honored only if they come from port 7001 on the NFS
translator host. In addition, this feature must be enabled on the NFS
translator system as described above.

## System Name List

When the NFS translator hears from a new NFS client whose system name
list it does not know, it can make a callback to the client to discover
the correct system name list. This ability is enabled automatically
when credential caching and retrieval is enabled as described above.

The PAG manager maintains a system-wide sysname list, which is used to
satisfy callback requests from the NFS translator. This list is set
initially to contain only the compiled-in default sysname, but can be
changed by the superuser using the VIOC_AFS_SYSNAME pioctl or the
'fs sysname' command. Any changes are automatically propagated to the
NFS translator.

## Dynamic Mount Points

This patch introduces a special directory ".:mount", which can be found
directly below the AFS root directory. This directory always appears to
be empty, but any name of the form "cell:volume" will resolve to a mount
point for the specified volume. The resulting mount points are always
RW-path mount points, and so will resolve to an RW volume even if the
specified name refers to a replicated volume. However, the ".readonly"
and ".backup" suffixes can be used to refer to volumes of those types,
and a numeric volume ID will always be used as-is.

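As an illustration of the naming scheme, the sketch below builds paths
under the dynamic mount directory. It assumes AFS is mounted at /afs; the
function dynmount_path, the AFS_ROOT override variable, and the cell,
volume, and volume ID shown are all hypothetical.

```shell
# Build a path under the ".:mount" directory for a given cell and volume.
# ASSUMPTION: AFS is mounted at /afs; AFS_ROOT is a variable made up for
# this sketch to override that location.
dynmount_path() {
    printf '%s/.:mount/%s:%s\n' "${AFS_ROOT:-/afs}" "$1" "$2"
}

dynmount_path example.org root.cell            # RW-path mount point
dynmount_path example.org root.cell.readonly   # read-only volume
dynmount_path example.org 536870912            # numeric volume ID, used as-is
```

Because the directory always appears empty, 'ls /afs/.:mount' shows
nothing; a path such as the ones printed above has to be named explicitly.
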
This feature is required to enable the NFS translator to reconstruct a
reachable path for any valid filehandle presented by an NFS client.
Specifically, when the path reconstruction algorithm is walking upward
from a client-provided filehandle and encounters the root directory of
a volume which is no longer in the cache (and thus has no known mount
point), it will complete the path to the AFS root using the dynamic
mount directory.

On non-Linux cache managers, this feature is available when dynamic
root and fake stat modes are enabled.

On Linux systems, it is also available even when dynroot is not enabled,
to support the NFS translator. It is presently not possible to disable
this feature, though that ability may be added in the future. It would
be difficult to make this feature unavailable to users and still make the
Linux NFS translator work, since the point of the check being performed
by the NFS server is to ensure the requested file would be reachable by
the client.

## Security

The security of the NFS translator depends heavily on the underlying
network. Proper configuration is required to prevent unauthorized
access to files, theft of credentials, or other forms of attack.

NFS, remote syscall, and PAG callback traffic between an NFS client host
and translator may contain sensitive file data and/or credentials, and
should be protected from snooping by unprivileged users or other hosts.

Both the NFS translator and remote system call service authorize requests
in part based on the IP address of the requesting client. To prevent an
attacker from making requests on behalf of another host, the network must
be configured such that it is impossible for one client to spoof the IP
address of another.

In addition, both the NFS translator and remote system call service
associate requests with specific users based on user and group ID data
contained within the request. In order to prevent users on the same client
from making filesystem access requests as each other, the NFS server must
be configured to accept requests only from privileged ports. In order to
prevent users from making AFS system calls on each other's behalf, possibly
including retrieving credentials, the network must be configured such that
requests to the remote system call service (port 7009) are accepted only
from port 7001 on NFS clients.

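One possible way to express the port 7009 restriction is with packet filter
rules on the translator. The sketch below uses iptables purely as an
example; it assumes rmtsys traffic is UDP (as Rx traffic is) and that all
NFS clients live on a single network, given here as a hypothetical
argument. The function rmtsys_rules is invented for this example and only
prints the rules so they can be reviewed before being applied.

```shell
# Print iptables commands that accept rmtsys requests (UDP port 7009) only
# when they come from source port 7001 on the given client network, and
# drop all other traffic to that port. Nothing is applied by this function.
rmtsys_rules() {
    clients=$1   # e.g. 192.168.42.0/24 (hypothetical client network)
    echo "iptables -A INPUT -p udp -s $clients --sport 7001 --dport 7009 -j ACCEPT"
    echo "iptables -A INPUT -p udp --dport 7009 -j DROP"
}

rmtsys_rules 192.168.42.0/24
```

Rules like these filter on port numbers only. They do nothing against IP
address spoofing, which still has to be prevented at the network level as
described above.
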
When a client is migrated away from a translator, any credentials held
on behalf of that client must be discarded before that client's IP address
can safely be reused. The VIOC_NFS_NUKE_CREDS pioctl and 'fs nukenfscreds'
command are provided for this purpose. Both take a single argument, which
is the IP address of the NFS client whose credentials should be discarded.

## Known Issues

+ Because NFS clients do not maintain active references on every inode
  they are using, it is possible that portions of the directory tree
  in use by an NFS client will expire from the translator's AFS and
  Linux dentry caches. When this happens, the NFS server attempts to
  reconstruct the missing portion of the directory tree, but may fail
  if the client does not have sufficient access (for example, if his
  tokens have expired). In these cases, a "stale NFS filehandle" error
  will be generated. This behavior is similar to that found on other
  translator platforms, but is triggered under a slightly different set
  of circumstances due to differences in the architecture of the Linux
  NFS server.

+ Due to limitations of the rmtsys protocol, some pioctl calls require
  large (several KB) transfers between the client and rmtsys server.
  Correcting this issue would require extensions to the rmtsys protocol
  outside the scope of this project.

+ The rmtsys interface requires that AFS be mounted in the same place
  on both the NFS client and translator system, or at least that the
  translator be able to correctly resolve absolute paths provided by
  the client.

+ If a client is migrated or an NFS translator host is unexpectedly
  rebooted while AFS filesystem access is in progress, there may be
  a short delay before the client recovers. This is because the NFS
  client must time out any request it made to the old server before
  it can retransmit the request, which will then be handled by the
  new server. The same applies to remote system call requests.