mirror of
https://git.openafs.org/openafs.git
synced 2025-01-19 07:20:11 +00:00
0a8e7b1548
Make ihandle file descriptor cache parameters tunable, and accommodate platforms where max open files is large. Expand the fd cache hash table to 2048 entries. Raise fd cache size automatically to match configured number of lwps. NOTE: This code has been tested on Centos 5.3 x86_64, on VMWare, 2 physical, 2 logical CPUs (in tandem with viced_more_threads). LICENSE BSD Change-Id: If68eda6e1c955e026b250ca52bddf0b8383959c9 Change-Id: I5fbbec95523ea9cd9ff42dcf43f17db94c7bb161 Reviewed-on: http://gerrit.openafs.org/584 Reviewed-by: Derrick Brashear <shadow@dementia.org> Tested-by: Derrick Brashear <shadow@dementia.org>
802 lines
31 KiB
Plaintext
802 lines
31 KiB
Plaintext
=head1 NAME
|
|
|
|
fileserver - Initializes the File Server component of the fs process
|
|
|
|
=head1 SYNOPSIS
|
|
|
|
=for html
|
|
<div class="synopsis">
|
|
|
|
B<fileserver> S<<< [B<-auditlog> <I<path to log file>>] >>>
|
|
S<<< [B<-audit-interface> (file | sysvmq)] >>>
|
|
S<<< [B<-d> <I<debug level>>] >>>
|
|
S<<< [B<-p> <I<number of processes>>] >>>
|
|
S<<< [B<-spare> <I<number of spare blocks>>] >>>
|
|
S<<< [B<-pctspare> <I<percentage spare>>] >>>
|
|
S<<< [B<-b> <I<buffers>>] >>>
|
|
S<<< [B<-l> <I<large vnodes>>] >>>
|
|
S<<< [B<-s> <I<small vnodes>>] >>>
|
|
S<<< [B<-vc> <I<volume cachesize>>] >>>
|
|
S<<< [B<-w> <I<call back wait interval>>] >>>
|
|
S<<< [B<-cb> <I<number of call backs>>] >>>
|
|
S<<< [B<-banner>] >>>
|
|
S<<< [B<-novbc>] >>>
|
|
S<<< [B<-implicit> <I<admin mode bits: rlidwka>>] >>>
|
|
S<<< [B<-readonly>] >>>
|
|
S<<< [B<-hr> <I<number of hours between refreshing the host cps>>] >>>
|
|
S<<< [B<-busyat> <I<< redirect clients when queue > n >>>] >>>
|
|
S<<< [B<-nobusy>] >>>
|
|
S<<< [B<-rxpck> <I<number of rx extra packets>>] >>>
|
|
S<<< [B<-rxdbg>] >>>
|
|
S<<< [B<-rxdbge>] >>>
|
|
S<<< [B<-rxmaxmtu> <I<bytes>>] >>>
|
|
S<<< [B<-nojumbo> >>>
|
|
S<<< [B<-jumbo> >>>
|
|
S<<< [B<-rxbind> >>>
|
|
S<<< [B<-allow-dotted-principals>] >>>
|
|
S<<< [B<-L>] >>>
|
|
S<<< [B<-S>] >>>
|
|
S<<< [B<-k> <I<stack size>>] >>>
|
|
S<<< [B<-realm> <I<Kerberos realm name>>] >>>
|
|
S<<< [B<-udpsize> <I<size of socket buffer in bytes>>] >>>
|
|
S<<< [B<-sendsize> <I<size of send buffer in bytes>>] >>>
|
|
S<<< [B<-abortthreshold> <I<abort threshold>>] >>>
|
|
S<<< [B<-enable_peer_stats>] >>>
|
|
S<<< [B<-enable_process_stats>] >>>
|
|
S<<< [B<-syslog> [<I< loglevel >>]] >>>
|
|
S<<< [B<-mrafslogs>] >>>
|
|
S<<< [B<-saneacls>] >>>
|
|
S<<< [B<-help>] >>>
|
|
S<<< [B<-fs-state-dont-save>] >>>
|
|
S<<< [B<-fs-state-dont-restore>] >>>
|
|
S<<< [B<-fs-state-verify>] (none | save | restore | both)] >>>
|
|
S<<< [B<-vhandle-setaside> <I<fds reserved for non-cache io>>] >>>
|
|
S<<< [B<-vhandle-max-cachesize> <I<max open files>>] >>>
|
|
S<<< [B<-vhandle-initial-cachesize> <I<fds reserved for non-cache io>>] >>>
|
|
S<<< [B<-vhashsize> <I<log(2) of number of volume hash buckets>>] >>>
|
|
S<<< [B<-vlrudisable>] >>>
|
|
S<<< [B<-vlruthresh> <I<minutes before eligibility for soft detach>>] >>>
|
|
S<<< [B<-vlruinterval> <I<seconds between VLRU scans>>] >>>
|
|
S<<< [B<-vlrumax> <I<max volumes to soft detach in one VLRU scan>>] >>>
|
|
S<<< [B<-vattachpar> <I<number of volume attach threads>>] >>>
|
|
S<<< [B<-m> <I<min percentage spare in partition>>] >>>
|
|
S<<< [B<-lock>] >>>
|
|
|
|
=for html
|
|
</div>
|
|
|
|
=head1 DESCRIPTION
|
|
|
|
The B<fileserver> command initializes the File Server component of the
|
|
C<fs> process. In the conventional configuration, its binary file is
|
|
located in the F</usr/afs/bin> directory on a file server machine.
|
|
|
|
The B<fileserver> command is not normally issued at the command shell
|
|
prompt, but rather placed into a database server machine's
|
|
F</usr/afs/local/BosConfig> file with the B<bos create> command. If it is
|
|
ever issued at the command shell prompt, the issuer must be logged onto a
|
|
file server machine as the local superuser C<root>.
|
|
|
|
The File Server creates the F</usr/afs/logs/FileLog> log file as it
|
|
initializes, if the file does not already exist. It does not write a
|
|
detailed trace by default, but the B<-d> option may be used to
|
|
increase the amount of detail. Use the B<bos getlog> command to
|
|
display the contents of the log file.
|
|
|
|
The command's arguments enable the administrator to control many aspects
|
|
of the File Server's performance, as detailed in L<OPTIONS>. By default
|
|
the B<fileserver> command sets values for many arguments that are suitable
|
|
for a medium-sized file server machine. To set values suitable for a small
|
|
or large file server machine, use the B<-S> or B<-L> flag
|
|
respectively. The following list describes the parameters and
|
|
corresponding argument for which the B<fileserver> command sets default
|
|
values, and the table below summarizes the setting for each of the three
|
|
machine sizes.
|
|
|
|
=over 4
|
|
|
|
=item *
|
|
|
|
The maximum number of lightweight processes (LWPs) or pthreads
|
|
the File Server uses to handle requests for data; corresponds to the
|
|
B<-p> argument. The File Server always uses a minimum of 32 KB of
|
|
memory for these processes.
|
|
|
|
=item *
|
|
|
|
The maximum number of directory blocks the File Server caches in memory;
|
|
corresponds to the B<-b> argument. Each cached directory block (buffer)
|
|
consumes 2,092 bytes of memory.
|
|
|
|
=item *
|
|
|
|
The maximum number of large vnodes the File Server caches in memory for
|
|
tracking directory elements; corresponds to the B<-l> argument. Each large
|
|
vnode consumes 292 bytes of memory.
|
|
|
|
=item *
|
|
|
|
The maximum number of small vnodes the File Server caches in memory for
|
|
tracking file elements; corresponds to the B<-s> argument. Each small
|
|
vnode consumes 100 bytes of memory.
|
|
|
|
=item *
|
|
|
|
The maximum volume cache size, which determines how many volumes the File
|
|
Server can cache in memory before having to retrieve data from disk;
|
|
corresponds to the B<-vc> argument.
|
|
|
|
=item *
|
|
|
|
The maximum number of callback structures the File Server caches in
|
|
memory; corresponds to the B<-cb> argument. Each callback structure
|
|
consumes 16 bytes of memory.
|
|
|
|
=item *
|
|
|
|
The maximum number of Rx packets the File Server uses; corresponds to the
|
|
B<-rxpck> argument. Each packet consumes 1544 bytes of memory.
|
|
|
|
=back
|
|
|
|
The default values are:
|
|
|
|
Parameter (Argument) Small (-S) Medium Large (-L)
|
|
---------------------------------------------------------------------
|
|
Number of LWPs (-p) 6 9 128
|
|
Number of cached dir blocks (-b) 70 90 120
|
|
Number of cached large vnodes (-l) 200 400 600
|
|
Number of cached small vnodes (-s) 200 400 600
|
|
Maximum volume cache size (-vc) 200 400 600
|
|
Number of callbacks (-cb) 20,000 60,000 64,000
|
|
Number of Rx packets (-rxpck) 100 150 200
|
|
|
|
To override any of the values, provide the indicated argument (which can
|
|
be combined with the B<-S> or B<-L> flag).
|
|
|
|
The amount of memory required for the File Server varies. The approximate
|
|
default memory usage is 751 KB when the B<-S> flag is used (small
|
|
configuration), 1.1 MB when all defaults are used (medium configuration),
|
|
and 1.4 MB when the B<-L> flag is used (large configuration). If
|
|
additional memory is available, increasing the value of the B<-cb> and
|
|
B<-vc> arguments can improve File Server performance most directly.
|
|
|
|
By default, the File Server allows a volume to exceed its quota by 1 MB
|
|
when an application is writing data to an existing file in a volume that
|
|
is full. The File Server still does not allow users to create new files in
|
|
a full volume. To change the default, use one of the following arguments:
|
|
|
|
=over 4
|
|
|
|
=item *
|
|
|
|
Set the B<-spare> argument to the number of extra kilobytes that the File
|
|
Server allows as overage. A value of C<0> allows no overage.
|
|
|
|
=item *
|
|
|
|
Set the B<-pctspare> argument to the percentage of the volume's quota the
|
|
File Server allows as overage.
|
|
|
|
=back
|
|
|
|
By default, the File Server implicitly grants the C<a> (administer) and
|
|
C<l> (lookup) permissions to system:administrators on the access control
|
|
list (ACL) of every directory in the volumes stored on its file server
|
|
machine. In other words, the group's members can exercise those two
|
|
permissions even when an entry for the group does not appear on an ACL. To
|
|
change the set of default permissions, use the B<-implicit> argument.
|
|
|
|
The File Server maintains a I<host current protection subgroup> (I<host
|
|
CPS>) for each client machine from which it has received a data access
|
|
request. Like the CPS for a user, a host CPS lists all of the Protection
|
|
Database groups to which the machine belongs, and the File Server compares
|
|
the host CPS to a directory's ACL to determine in what manner users on the
|
|
machine are authorized to access the directory's contents. When the B<pts
|
|
adduser> or B<pts removeuser> command is used to change the groups to
|
|
which a machine belongs, the File Server must recompute the machine's host
|
|
CPS in order to notice the change. By default, the File Server contacts
|
|
the Protection Server every two hours to recompute host CPSs, implying
|
|
that it can take that long for changed group memberships to become
|
|
effective. To change this frequency, use the B<-hr> argument.
|
|
|
|
The File Server stores volumes in partitions. A partition is a
|
|
filesystem or directory on the server machine that is named C</vicepX>
|
|
or C</vicepXX> where XX is "a" through "z" or "aa" though "iv". Up to
|
|
255 partitions are allowed. The File Server expects that the /vicepXX
|
|
directories are each on a dedicated filesystem. The File Server will
|
|
only use a /vicepXX if it's a mountpoint for another filesystem,
|
|
unless the file C</vicepXX/AlwaysAttach> exists. The data in the
|
|
partition is a special format that can only be access using OpenAFS
|
|
commands or an OpenAFS client.
|
|
|
|
The File Server generates the following message when a partition is nearly
|
|
full:
|
|
|
|
No space left on device
|
|
|
|
This command does not use the syntax conventions of the AFS command
|
|
suites. Provide the command name and all option names in full.
|
|
|
|
=head1 CAUTIONS
|
|
|
|
Do not use the B<-k> and B<-w> arguments, which are intended for use
|
|
by the OpenAFS developers only. Changing them from their default
|
|
values can result in unpredictable File Server behavior. In any case,
|
|
on many operating systems the File Server uses native threads rather
|
|
than the LWP threads, so using the B<-k> argument to set the number of
|
|
LWP threads has no effect.
|
|
|
|
Do not specify both the B<-spare> and B<-pctspare> arguments. Doing so
|
|
causes the File Server to exit, leaving an error message in the
|
|
F</usr/afs/logs/FileLog> file.
|
|
|
|
Options that are available only on some system types, such as the B<-m>
|
|
and B<-lock> options, appear in the output generated by the B<-help>
|
|
option only on the relevant system type.
|
|
|
|
Currently, the maximum size of a volume is 2 terabytes (2^31 bytes)
|
|
and the maximum size of a /vicepX partition on a fileserver is 2^64
|
|
kilobytes. The maximum partition size in releases 1.4.7 and earlier is
|
|
2 terabytes (2^31 bytes). The maximum partition size for 1.5.x
|
|
releases 1.5.34 and earlier is 2 terabytes as well.
|
|
|
|
The maximum number of directory entries is 64,000 if all of the entries
|
|
have names that are 15 octets or less in length. A name that is 15 octets
|
|
long requires the use of only one block in the directory. Additional
|
|
sequential blocks are required to store entries with names that are longer
|
|
than 15 octets. Each additional block provides an additional length of 32
|
|
octets for the name of the entry. Note that if file names use an encoding
|
|
like UTF-8, a single character may be encoded into multiple octets.
|
|
|
|
In real world use, the maximum number of objects in an AFS directory
|
|
is usually between 16,000 and 25,000, depending on the average name
|
|
length.
|
|
|
|
=head1 OPTIONS
|
|
|
|
=over 4
|
|
|
|
=item B<-auditlog> <I<log path>>
|
|
|
|
Turns on audit logging, and sets the path for the audit log. The audit
|
|
log records information about RPC calls, including the name of the RPC
|
|
call, the host that submitted the call, the authenticated entity (user)
|
|
that issued the call, the parameters for the call, and if the call
|
|
succeeded or failed.
|
|
|
|
=item B<-audit-interface> (file | sysvmq)
|
|
|
|
Specifies what audit interface to use. The C<file> interface writes audit
|
|
messages to the file passed to B<-auditlog>. The C<sysvmq> interface
|
|
writes audit messages to a SYSV message (see L<msgget(2)> and
|
|
L<msgrcv(2)>). The message queue the C<sysvmq> interface writes to has the
|
|
key C<ftok(path, 1)>, where C<path> is the path specified in the
|
|
B<-auditlog> option.
|
|
|
|
Defaults to C<file>.
|
|
|
|
=item B<-d> <I<debug level>>
|
|
|
|
Sets the detail level for the debugging trace written to the
|
|
F</usr/afs/logs/FileLog> file. Provide one of the following values, each
|
|
of which produces an increasingly detailed trace: C<0>, C<1>, C<5>, C<25>,
|
|
and C<125>. The default value of C<0> produces only a few messages.
|
|
|
|
=item B<-p> <I<number of processes>>
|
|
|
|
Sets the number of threads (or LWPs) to run. Provide a positive integer.
|
|
The File Server creates and uses five threads for special purposes,
|
|
in addition to the number specified (but if this argument specifies
|
|
the maximum possible number, the File Server automatically uses five
|
|
of the threads for its own purposes).
|
|
|
|
The maximum number of threads can differ in each release of OpenAFS.
|
|
Consult the I<OpenAFS Release Notes> for the current release.
|
|
|
|
=item B<-spare> <I<number of spare blocks>>
|
|
|
|
Specifies the number of additional kilobytes an application can store in a
|
|
volume after the quota is exceeded. Provide a positive integer; a value of
|
|
C<0> prevents the volume from ever exceeding its quota. Do not combine
|
|
this argument with the B<-pctspare> argument.
|
|
|
|
=item B<-pctspare> <I<percentage spare>>
|
|
|
|
Specifies the amount by which the File Server allows a volume to exceed
|
|
its quota, as a percentage of the quota. Provide an integer between C<0>
|
|
and C<99>. A value of C<0> prevents the volume from ever exceeding its
|
|
quota. Do not combine this argument with the B<-spare> argument.
|
|
|
|
=item B<-b> <I<buffers>>
|
|
|
|
Sets the number of directory buffers. Provide a positive integer.
|
|
|
|
=item B<-l> <I<large vnodes>>
|
|
|
|
Sets the number of large vnodes available in memory for caching directory
|
|
elements. Provide a positive integer.
|
|
|
|
=item B<-s> <I<small nodes>>
|
|
|
|
Sets the number of small vnodes available in memory for caching file
|
|
elements. Provide a positive integer.
|
|
|
|
=item B<-vc> <I<volume cachesize>>
|
|
|
|
Sets the number of volumes the File Server can cache in memory. Provide a
|
|
positive integer.
|
|
|
|
=item B<-w> <I<call back wait interval>>
|
|
|
|
Sets the interval at which the daemon spawned by the File Server performs
|
|
its maintenance tasks. Do not use this argument; changing the default
|
|
value can cause unpredictable behavior.
|
|
|
|
=item B<-cb> <I<number of callbacks>>
|
|
|
|
Sets the number of callbacks the File Server can track. Provide a positive
|
|
integer.
|
|
|
|
=item B<-banner>
|
|
|
|
Prints the following banner to F</dev/console> about every 10 minutes.
|
|
|
|
File Server is running at I<time>.
|
|
|
|
=item B<-novbc>
|
|
|
|
Prevents the File Server from breaking the callbacks that Cache Managers
|
|
hold on a volume that the File Server is reattaching after the volume was
|
|
offline (as a result of the B<vos restore> command, for example). Use of
|
|
this flag is strongly discouraged.
|
|
|
|
=item B<-implicit> <I<admin mode bits>>
|
|
|
|
Defines the set of permissions granted by default to the
|
|
system:administrators group on the ACL of every directory in a volume
|
|
stored on the file server machine. Provide one or more of the standard
|
|
permission letters (C<rlidwka>) and auxiliary permission letters
|
|
(C<ABCDEFGH>), or one of the shorthand notations for groups of permissions
|
|
(C<all>, C<none>, C<read>, and C<write>). To review the meaning of the
|
|
permissions, see the B<fs setacl> reference page.
|
|
|
|
=item B<-readonly>
|
|
|
|
Don't allow writes to this fileserver.
|
|
|
|
=item B<-hr> <I<number of hours between refreshing the host cps>>
|
|
|
|
Specifies how often the File Server refreshes its knowledge of the
|
|
machines that belong to protection groups (refreshes the host CPSs for
|
|
machines). The File Server must update this information to enable users
|
|
from machines recently added to protection groups to access data for which
|
|
those machines now have the necessary ACL permissions.
|
|
|
|
=item B<-busyat> <I<< redirect clients when queue > n >>>
|
|
|
|
Defines the number of incoming RPCs that can be waiting for a response
|
|
from the File Server before the File Server returns the error code
|
|
C<VBUSY> to the Cache Manager that sent the latest RPC. In response, the
|
|
Cache Manager retransmits the RPC after a delay. This argument prevents
|
|
the accumulation of so many waiting RPCs that the File Server can never
|
|
process them all. Provide a positive integer. The default value is
|
|
C<600>.
|
|
|
|
=item B<-rxpck> <I<number of rx extra packets>>
|
|
|
|
Controls the number of Rx packets the File Server uses to store data for
|
|
incoming RPCs that it is currently handling, that are waiting for a
|
|
response, and for replies that are not yet complete. Provide a positive
|
|
integer.
|
|
|
|
=item B<-rxdbg>
|
|
|
|
Writes a trace of the File Server's operations on Rx packets to the file
|
|
F</usr/afs/logs/rx_dbg>.
|
|
|
|
=item B<-rxdbge>
|
|
|
|
Writes a trace of the File Server's operations on Rx events (such as
|
|
retransmissions) to the file F</usr/afs/logs/rx_dbg>.
|
|
|
|
=item B<-rxmaxmtu> <I<bytes>>
|
|
|
|
Defines the maximum size of an MTU. The value must be between the
|
|
minimum and maximum packet data sizes for Rx.
|
|
|
|
=item B<-jumbo>
|
|
|
|
Allows the server to send and receive jumbograms. A jumbogram is
|
|
a large-size packet composed of 2 to 4 normal Rx data packets that share
|
|
the same header. The fileserver does not use jumbograms by default, as some
|
|
routers are not capable of properly breaking the jumbogram into smaller
|
|
packets and reassembling them.
|
|
|
|
=item B<-nojumbo>
|
|
|
|
Deprecated; jumbograms are disabled by default.
|
|
|
|
=item B<-rxbind>
|
|
|
|
Force the fileserver to only bind to one IP address.
|
|
|
|
=item B<-allow-dotted-principals>
|
|
|
|
By default, the RXKAD security layer will disallow access by Kerberos
|
|
principals with a dot in the first component of their name. This is to avoid
|
|
the confusion where principals user/admin and user.admin are both mapped to the
|
|
user.admin PTS entry. Sites whose Kerberos realms don't have these collisions
|
|
between principal names may disable this check by starting the server
|
|
with this option.
|
|
|
|
=item B<-L>
|
|
|
|
Sets values for many arguments in a manner suitable for a large file
|
|
server machine. Combine this flag with any option except the B<-S> flag;
|
|
omit both flags to set values suitable for a medium-sized file server
|
|
machine.
|
|
|
|
=item B<-S>
|
|
|
|
Sets values for many arguments in a manner suitable for a small file
|
|
server machine. Combine this flag with any option except the B<-L> flag;
|
|
omit both flags to set values suitable for a medium-sized file server
|
|
machine.
|
|
|
|
=item B<-k> <I<stack size>>
|
|
|
|
Sets the LWP stack size in units of 1 kilobyte. Do not use this argument,
|
|
and in particular do not specify a value less than the default of C<24>.
|
|
|
|
=item B<-realm> <I<Kerberos realm name>>
|
|
|
|
Defines the Kerberos realm name for the File Server to use. If this
|
|
argument is not provided, it uses the realm name corresponding to the cell
|
|
listed in the local F</usr/afs/etc/ThisCell> file.
|
|
|
|
=item B<-udpsize> <I<size of socket buffer in bytes>>
|
|
|
|
Sets the size of the UDP buffer, which is 64 KB by default. Provide a
|
|
positive integer, preferably larger than the default.
|
|
|
|
=item B<-sendsize> <I<size of send buffer in bytes>>
|
|
|
|
Sets the size of the send buffer, which is 16384 bytes by default.
|
|
|
|
=item B<-abortthreshold> <I<abort threshold>>
|
|
|
|
Sets the abort threshold, which is triggered when an AFS client sends
|
|
a number of FetchStatus requests in a row and all of them fail due to
|
|
access control or some other error. When the abort threshold is
|
|
reached, the file server starts to slow down the responses to the
|
|
problem client in order to reduce the load on the file server.
|
|
|
|
The throttling behaviour can cause issues especially for some versions
|
|
of the Windows OpenAFS client. When using Windows Explorer to navigate
|
|
the AFS directory tree, directories with only "look" access for the
|
|
current user may load more slowly because of the throttling. This is
|
|
because the Windows OpenAFS client sends FetchStatus calls one at a
|
|
time instead of in bulk like the Unix Open AFS client.
|
|
|
|
Setting the threshold to 0 disables the throttling behavior. This
|
|
option is available in OpenAFS versions 1.4.1 and later.
|
|
|
|
=item B<-enable_peer_stats>
|
|
|
|
Activates the collection of Rx statistics and allocates memory for their
|
|
storage. For each connection with a specific UDP port on another machine,
|
|
a separate record is kept for each type of RPC (FetchFile, GetStatus, and
|
|
so on) sent or received. To display or otherwise access the records, use
|
|
the Rx Monitoring API.
|
|
|
|
=item B<-enable_process_stats>
|
|
|
|
Activates the collection of Rx statistics and allocates memory for their
|
|
storage. A separate record is kept for each type of RPC (FetchFile,
|
|
GetStatus, and so on) sent or received, aggregated over all connections to
|
|
other machines. To display or otherwise access the records, use the Rx
|
|
Monitoring API.
|
|
|
|
=item B<-syslog [<loglevel>]
|
|
|
|
Use syslog instead of the normal logging location for the fileserver
|
|
process. If provided, log messages are at <loglevel> instead of the
|
|
default LOG_USER.
|
|
|
|
=item B<-mrafslogs>
|
|
|
|
Use MR-AFS (Multi-Resident) style logging. This option is deprecated.
|
|
|
|
=item B<-saneacls>
|
|
|
|
Offer the SANEACLS capability for the fileserver. This option is
|
|
currently unimplemented.
|
|
|
|
=item B<-help>
|
|
|
|
Prints the online help for this command. All other valid options are
|
|
ignored.
|
|
|
|
=item B<-fs-state-dont-save>
|
|
|
|
When present, fileserver state will not be saved during shutdown. Default
|
|
is to save state.
|
|
|
|
This option is only supported by the demand-attach file server.
|
|
|
|
=item B<-fs-state-dont-restore>
|
|
|
|
When present, fileserver state will not be restored during startup.
|
|
Default is to restore state on startup.
|
|
|
|
This option is only supported by the demand-attach file server.
|
|
|
|
=item B<-fs-state-verify> (none | save | restore | both)
|
|
|
|
This argument controls the behavior of the state verification mechanism.
|
|
A value of C<none> turns off all verification. A value of C<save> only
|
|
performs the verification steps prior to saving state to disk. A value
|
|
of C<restore> only performs the verification steps after restoring state
|
|
from disk. A value of C<both> performs all verifications steps both
|
|
prior to save and following a restore.
|
|
|
|
The default is C<both>.
|
|
|
|
This option is only supported by the demand-attach file server.
|
|
|
|
=item B<-vhandle-setaside> <I<fds reserved for non-cache io>>
|
|
|
|
Number of file handles set aside for I/O not in the cache. Defaults to 128.
|
|
|
|
=item B<-vhandle-max-cachesize> <I<max open files>>
|
|
|
|
Maximum number of available file handles.
|
|
|
|
=item B<-vhandle-initial-cachesize> <I<initial open file cache>>
|
|
|
|
Number of file handles set aside for I/O in the cache. Defaults to 128.
|
|
|
|
=item B<-vhashsize <I<size>>
|
|
|
|
The log(2) number of of volume hash buckets. Default is 8 (i.e., by
|
|
default, there are 2^8 = 256 volume hash buckets).
|
|
|
|
This option is only supported by the demand-attach file server.
|
|
|
|
=item B<-vlruthresh <I<minutes>>
|
|
|
|
The number of minutes of inactivity before a volume is eligible for soft
|
|
detachment. Default is 120 minutes.
|
|
|
|
This option is only supported by the demand-attach file server.
|
|
|
|
=item B<-vlruinterval <I<seconds>>
|
|
|
|
The number of seconds between VLRU candidate queue scan. The default is
|
|
120 seconds.
|
|
|
|
This option is only supported by the demand-attach file server.
|
|
|
|
=item B<-vlrumax <I<positive integer>>
|
|
|
|
The maximum number of volumes which can be soft detached in a single pass
|
|
of the scanner. Default is 8 volumes.
|
|
|
|
This option is only supported by the demand-attach file server.
|
|
|
|
=item B<-vattachpar> <I<number of volume attach threads>>
|
|
|
|
The number of threads assigned to attach and detach volumes. The default
|
|
is 1. Warning: many of the I/O parallism features of Demand-Attach
|
|
Fileserver are turned off when the number of volume attach threads is only
|
|
1.
|
|
|
|
This option is only meaningful for a file server built with pthreads
|
|
support.
|
|
|
|
=item B<-m> <I<min percentage spare in partition>>
|
|
|
|
Specifies the percentage of each AFS server partition that the AIX version
|
|
of the File Server creates as a reserve. Specify an integer value between
|
|
C<0> and C<30>; the default is 8%. A value of C<0> means that the
|
|
partition can become completely full, which can have serious negative
|
|
consequences. This option is not supported on platforms other than AIX.
|
|
|
|
=item B<-lock>
|
|
|
|
Prevents any portion of the fileserver binary from being paged (swapped)
|
|
out of memory on a file server machine running the IRIX operating system.
|
|
This option is not supported on platforms other than IRIX.
|
|
|
|
=back
|
|
|
|
=head1 EXAMPLES
|
|
|
|
The following B<bos create> command creates an fs process on the file
|
|
server machine C<fs2.abc.com> that uses the large configuration size, and
|
|
allows volumes to exceed their quota by 10%. Type the command on a single
|
|
line:
|
|
|
|
% bos create -server fs2.abc.com -instance fs -type fs \
|
|
-cmd "/usr/afs/bin/fileserver -pctspare 10 \
|
|
-L" /usr/afs/bin/volserver /usr/afs/bin/salvager
|
|
|
|
|
|
=head1 TROUBLESHOOTING
|
|
|
|
Sending process signals to the File Server Process can change its
|
|
behavior in the following ways:
|
|
|
|
Process Signal OS Result
|
|
---------------------------------------------------------------------
|
|
|
|
File Server XCPU Unix Prints a list of client IP
|
|
Addresses.
|
|
|
|
File Server USR2 Windows Prints a list of client IP
|
|
Addresses.
|
|
|
|
File Server POLL HPUX Prints a list of client IP
|
|
Addresses.
|
|
|
|
Any server TSTP Any Increases Debug level by a power
|
|
of 5 -- 1,5,25,125, etc.
|
|
This has the same effect as the
|
|
-d XXX command-line option.
|
|
|
|
Any Server HUP Any Resets Debug level to 0
|
|
|
|
File Server TERM Any Run minor instrumentation over
|
|
the list of descriptors.
|
|
|
|
Other Servers TERM Any Causes the process to quit.
|
|
|
|
File Server QUIT Any Causes the File Server to Quit.
|
|
Bos Server knows this.
|
|
|
|
The basic metric of whether an AFS file server is doing well is the number
|
|
of connections waiting for a thread,
|
|
which can be found by running the following command:
|
|
|
|
% rxdebug <server> | grep waiting_for | wc -l
|
|
|
|
Each line returned by C<rxdebug> that contains the text "waiting_for"
|
|
represents a connection that's waiting for a file server thread.
|
|
|
|
If the blocked connection count is ever above 0, the server is having
|
|
problems replying to clients in a timely fashion. If it gets above 10,
|
|
roughly, there will be noticable slowness by the user. The total number of
|
|
connections is a mostly irrelevant number that goes essentially
|
|
monotonically for as long as the server has been running and then goes back
|
|
down to zero when it's restarted.
|
|
|
|
The most common cause of blocked connections rising on a server is some
|
|
process somewhere performing an abnormal number of accesses to that server
|
|
and its volumes. If multiple servers have a blocked connection count, the
|
|
most likely explanation is that there is a volume replicated between those
|
|
servers that is absorbing an abnormally high access rate.
|
|
|
|
To get an access count on all the volumes on a server, run:
|
|
|
|
% vos listvol <server> -long
|
|
|
|
and save the output in a file. The results will look like a bunch of B<vos
|
|
examine> output for each volume on the server. Look for lines like:
|
|
|
|
40065 accesses in the past day (i.e., vnode references)
|
|
|
|
and look for volumes with an abnormally high number of accesses. Anything
|
|
over 10,000 is fairly high, but some volumes like root.cell and other
|
|
volumes close to the root of the cell will have that many hits routinely.
|
|
Anything over 100,000 is generally abnormally high. The count resets about
|
|
once a day.
|
|
|
|
Another approach that can be used to narrow the possibilities for a
|
|
replicated volume, when multiple servers are having trouble, is to find all
|
|
replicated volumes for that server. Run:
|
|
|
|
% vos listvldb -server <server>
|
|
|
|
where <server> is one of the servers having problems to refresh the VLDB
|
|
cache, and then run:
|
|
|
|
% vos listvldb -server <server> -part <partition>
|
|
|
|
to get a list of all volumes on that server and partition, including every
|
|
other server with replicas.
|
|
|
|
Once the volume causing the problem has been identified, the best way to
|
|
deal with the problem is to move that volume to another server with a low
|
|
load or to stop any runaway programs that are accessing that volume
|
|
unnecessarily. Often the volume will be enough information to tell what's
|
|
going on.
|
|
|
|
If you still need additional information about who's hitting that server,
|
|
sometimes you can guess at that information from the failed callbacks in the
|
|
F<FileLog> log in F</var/log/afs> on the server, or from the output of:
|
|
|
|
% /usr/afsws/etc/rxdebug <server> -rxstats
|
|
|
|
but the best way is to turn on debugging output from the file server.
|
|
(Warning: This generates a lot of output into FileLog on the AFS server.)
|
|
To do this, log on to the AFS server, find the PID of the fileserver
|
|
process, and do:
|
|
|
|
kill -TSTP <pid>
|
|
|
|
where <pid> is the PID of the file server process. This will raise the
|
|
debugging level so that you'll start seeing what people are actually doing
|
|
on the server. You can do this up to three more times to get even more
|
|
output if needed. To reset the debugging level back to normal, use (The
|
|
following command will NOT terminate the file server):
|
|
|
|
kill -HUP <pid>
|
|
|
|
The debugging setting on the File Server should be reset back to normal when
|
|
debugging is no longer needed. Otherwise, the AFS server may well fill its
|
|
disks with debugging output.
|
|
|
|
The lines of the debugging output that are most useful for debugging load
|
|
problems are:
|
|
|
|
SAFS_FetchStatus, Fid = 2003828163.77154.82248, Host 171.64.15.76
|
|
SRXAFS_FetchData, Fid = 2003828163.77154.82248
|
|
|
|
(The example above is partly truncated to highlight the interesting
|
|
information). The Fid identifies the volume and inode within the volume;
|
|
the volume is the first long number. So, for example, this was:
|
|
|
|
% vos examine 2003828163
|
|
pubsw.matlab61 2003828163 RW 1040060 K On-line
|
|
afssvr5.Stanford.EDU /vicepa
|
|
RWrite 2003828163 ROnly 2003828164 Backup 2003828165
|
|
MaxQuota 3000000 K
|
|
Creation Mon Aug 6 16:40:55 2001
|
|
Last Update Tue Jul 30 19:00:25 2002
|
|
86181 accesses in the past day (i.e., vnode references)
|
|
|
|
RWrite: 2003828163 ROnly: 2003828164 Backup: 2003828165
|
|
number of sites -> 3
|
|
server afssvr5.Stanford.EDU partition /vicepa RW Site
|
|
server afssvr11.Stanford.EDU partition /vicepd RO Site
|
|
server afssvr5.Stanford.EDU partition /vicepa RO Site
|
|
|
|
and from the Host information one can tell what system is accessing that
|
|
volume.
|
|
|
|
Note that the output of L<vos_examine(1)> also includes the access count, so
|
|
once the problem has been identified, vos examine can be used to see if the
|
|
access count is still increasing. Also remember that you can run vos
|
|
examine on the read-only replica (e.g., pubsw.matlab61.readonly) to see the
|
|
access counts on the read-only replica on all of the servers that it's
|
|
located on.
|
|
|
|
=head1 PRIVILEGE REQUIRED
|
|
|
|
The issuer must be logged in as the superuser C<root> on a file server
|
|
machine to issue the command at a command shell prompt. It is conventional
|
|
instead to create and start the process by issuing the B<bos create>
|
|
command.
|
|
|
|
=head1 SEE ALSO
|
|
|
|
L<BosConfig(5)>,
|
|
L<FileLog(5)>,
|
|
L<bos_create(8)>,
|
|
L<bos_getlog(8)>,
|
|
L<fs_setacl(1)>,
|
|
L<msgget(2)>,
|
|
L<msgrcv(2)>,
|
|
L<salvager(8)>,
|
|
L<volserver(8)>,
|
|
L<vos_examine(1)>
|
|
|
|
=head1 COPYRIGHT
|
|
|
|
IBM Corporation 2000. <http://www.ibm.com/> All Rights Reserved.
|
|
|
|
This documentation is covered by the IBM Public License Version 1.0. It was
|
|
converted from HTML to POD by software written by Chas Williams and Russ
|
|
Allbery, based on work by Alf Wachsmann and Elizabeth Cassell.
|