mirror of
https://git.openafs.org/openafs.git
synced 2025-01-19 07:20:11 +00:00
9243beff2b
FIXES 26463 update pod files to deal with some section number changes
415 lines
15 KiB
Plaintext
415 lines
15 KiB
Plaintext
=head1 NAME
|
|
|
|
afsmonitor - Monitors File Servers and Cache Managers
|
|
|
|
=head1 SYNOPSIS
|
|
|
|
B<afsmonitor> [B<initcmd>] [-config <I<configuration file>>]
|
|
[B<-frequency> <I<poll frequency, in seconds>>]
|
|
[B<-output> <I<storage file name>>] [B<-detailed>]
|
|
[B<-debug> <I<debug output file>>]
|
|
[B<-fshosts> <I<list of file servers to monitor>>+]
|
|
[B<-cmhosts> <I<list of cache managers to monitor>>+]
|
|
[B<-buffers> <I<number of buffer slots>>] [B<-help>]
|
|
|
|
B<afsmonitor> [B<i>] [-co <I<configuration file>>]
|
|
[B<-fr> <I<poll frequency, in seconds>>]
|
|
[B<-o> <I<storage file name>>] [B<-det>]
|
|
[B<-deb> <I<debug output file>>]
|
|
[B<-fs> <I<list of file servers to monitor>>+]
|
|
[B<-cm> <I<list of cache managers to monitor>>+]
|
|
[B<-b> <I<number of buffer slots>>] [B<-h>]
|
|
|
|
=head1 DESCRIPTION
|
|
|
|
The afsmonitor command initializes a program that gathers and displays
|
|
statistics about specified File Server and Cache Manager operations. It
|
|
allows the issuer to monitor, from a single location, a wide range of File
|
|
Server and Cache Manager operations on any number of machines in both
|
|
local and foreign cells.
|
|
|
|
There are 271 available File Server statistics and 571 available Cache
|
|
Manager statistics, listed in the appendix about B<afsmonitor> statistics
|
|
in the I<IBM AFS Administration Guide>. By default, the command displays
|
|
all of the relevant statistics for the file server machines named by the
|
|
B<-fshosts> argument and the client machines named by the B<-cmhosts>
|
|
argument. To limit the display to only the statistics of interest, list
|
|
them in the configuration file specified by the B<-config> argument. In
|
|
addition, use the configuration file for the following purposes:
|
|
|
|
=over 4
|
|
|
|
=item *
|
|
|
|
To set threshold values for any monitored statistic. When the value of a
|
|
statistic exceeds the threshold, the B<afsmonitor> command displays it in
|
|
reverse video. There are no default threshold values.
|
|
|
|
=item *
|
|
|
|
To invoke a program or script automatically when a statistic exceeds its
|
|
threshold. The AFS distribution does not include any such scripts.
|
|
|
|
=item *
|
|
|
|
To list the file server and client machines to monitor, instead of using
|
|
the B<-fshosts> and B<-cmhosts> arguments.
|
|
|
|
=back
|
|
|
|
For a description of the configuration file, see L<afsmonitor(5)>.
|
|
|
|
=head1 CAUTIONS
|
|
|
|
The following software must be accessible to a machine where the
|
|
B<afsmonitor> program is running:
|
|
|
|
=over 4
|
|
|
|
=item *
|
|
|
|
The AFS xstat libraries, which the afsmonitor program uses to gather data.
|
|
|
|
=item *
|
|
|
|
The curses graphics package, which most UNIX distributions provide as a
|
|
standard utility.
|
|
|
|
=back
|
|
|
|
The B<afsmonitor> screens format successfully both on so-called dumb
|
|
terminals and in windowing systems that emulate terminals. For the output
|
|
to looks its best, the display environment needs to support reverse video
|
|
and cursor addressing. Set the TERM environment variable to the correct
|
|
terminal type, or to a value that has characteristics similar to the
|
|
actual terminal type. The display window or terminal must be at least 80
|
|
columns wide and 12 lines long.
|
|
|
|
The B<afsmonitor> program must run in the foreground, and in its own
|
|
separate, dedicated window or terminal. The window or terminal is
|
|
unavailable for any other activity as long as the B<afsmonitor> program is
|
|
running. Any number of instances of the B<afsmonitor> program can run on a
|
|
single machine, as long as each instance runs in its own dedicated window
|
|
or terminal. Note that it can take up to three minutes to start an
|
|
additional instance.
|
|
|
|
=head1 OPTIONS
|
|
|
|
=over 4
|
|
|
|
=item B<initcmd>
|
|
|
|
Accommodates the command's use of the AFS command parser, and is optional.
|
|
|
|
=item B<-config> <I<file>>
|
|
|
|
Names the configuration file which lists the machines to monitor,
|
|
statistics to display, and threshold values, if any. A partial pathname is
|
|
interpreted relative to the current working directory. Provide this
|
|
argument if not providing the B<-fshosts> argument, B<-cmhosts> argument,
|
|
or neither. For instructions on creating this file, see the preceding
|
|
B<DESCRIPTION> section, and the section on the B<afsmonitor> program in
|
|
the I<IBM AFS Administration Guide>.
|
|
|
|
=item B<-frequency> <I<poll frequency>>
|
|
|
|
Specifies in seconds how often the afsmonitor program probes the File
|
|
Servers and Cache Managers. Valid values range from C<1> to C<86400>
|
|
(which is 24 hours); the default value is C<60>. This frequency applies to
|
|
both File Servers and Cache Managers, but the B<afsmonitor> program
|
|
initiates the two types of probes, and processes their results,
|
|
separately. The actual interval between probes to a host is the probe
|
|
frequency plus the time required for all hosts to respond.
|
|
|
|
=item B<-output> <I<file>>
|
|
|
|
Names the file to which the afsmonitor program writes all of the
|
|
statistics that it collects. By default, no output file is created. See
|
|
the section on the B<afsmonitor> command in the I<IBM AFS Administration
|
|
Guide> for information on this file.
|
|
|
|
=item B<-detailed>
|
|
|
|
Formats the information in the output file named by B<-output> argument in
|
|
a maximally readable format. Provide the B<-output> argument along with
|
|
this one.
|
|
|
|
=item B<-fshosts> <I<host>>+
|
|
|
|
Names one or more machines from which to gather File Server
|
|
statistics. For each machine, provide either a fully qualified host name,
|
|
or an unambiguous abbreviation (the ability to resolve an abbreviation
|
|
depends on the state of the cell's name service at the time the command is
|
|
issued). This argument can be combined with the B<-cmhosts> argument, but
|
|
not with the B<-config> argument.
|
|
|
|
=item B<-cmhosts> <I<host>>+
|
|
|
|
Names one or more machines from which to gather Cache Manager
|
|
statistics. For each machine, provide either a fully qualified host name,
|
|
or an unambiguous abbreviation (the ability to resolve an abbreviation
|
|
depends on the state of the cell's name service at the time the command is
|
|
issued). This argument can be combined with the B<-fshosts> argument, but
|
|
not with the B<-config> argument.
|
|
|
|
=item B<-buffers> <I<slots>>
|
|
|
|
Is nonoperational and provided to accommodate potential future
|
|
enhancements to the program.
|
|
|
|
=item B<-help>
|
|
|
|
Prints the online help for this command. All other valid options are
|
|
ignored.
|
|
|
|
=back
|
|
|
|
=head1 OUTPUT
|
|
|
|
The afsmonitor program displays its data on three screens:
|
|
|
|
=over 4
|
|
|
|
=item System Overview
|
|
|
|
This screen appears automatically when the B<afsmonitor> program
|
|
initializes. It summarizes separately for File Servers and Cache Managers
|
|
the number of machines being monitored and how many of them have I<alerts>
|
|
(statistics that have exceeded their thresholds). It then lists the
|
|
hostname and number of alerts for each machine being monitored, indicating
|
|
if appropriate that a process failed to respond to the last probe.
|
|
|
|
=item File Server
|
|
|
|
This screen displays File Server statistics for each file server machine
|
|
being monitored. It highlights statistics that have exceeded their
|
|
thresholds, and identifies machines that failed to respond to the last
|
|
probe.
|
|
|
|
=item Cache Managers
|
|
|
|
This screen displays Cache Manager statistics for each client machine
|
|
being monitored. It highlights statistics that have exceeded their
|
|
thresholds, and identifies machines that failed to respond to the last
|
|
probe.
|
|
|
|
=back
|
|
|
|
Fields at the corners of every screen display the following information:
|
|
|
|
=over 4
|
|
|
|
=item *
|
|
|
|
In the top left corner, the program name and version number.
|
|
|
|
=item *
|
|
|
|
In the top right corner, the screen name, current and total page numbers,
|
|
and current and total column numbers. The page number (for example, C<p. 1
|
|
of 3>) indicates the index of the current page and the total number of
|
|
(vertical) pages over which data is displayed. The column number (for
|
|
example, C<c. 1 of 235>) indicates the index of the current leftmost
|
|
column and the total number of columns in which data appears. (The symbol
|
|
C<<<< >>> >>>> indicates that there is additional data to the right; the
|
|
symbol C<<<< <<< >>>> indicates that there is additional data to the
|
|
left.)
|
|
|
|
=item *
|
|
|
|
In the bottom left corner, a list of the available commands. Enter the
|
|
first letter in the command name to run that command. Only the currently
|
|
possible options appear; for example, if there is only one page of data,
|
|
the C<next> and C<prev> commands, which scroll the screen up and down
|
|
respectively, do not appear. For descriptions of the commands, see the
|
|
following section about navigating the display screens.
|
|
|
|
=item *
|
|
|
|
In the bottom right corner, the C<probes> field reports how many times the
|
|
program has probed File Servers (C<fs>), Cache Managers (C<cm>), or
|
|
both. The counts for File Servers and Cache Managers can differ. The
|
|
C<freq> field reports how often the program sends probes.
|
|
|
|
=back
|
|
|
|
=head2 Navigating the afsmonitor Display Screens
|
|
|
|
As noted, the lower left hand corner of every display screen displays the
|
|
names of the commands currently available for moving to alternate screens,
|
|
which can either be a different type or display more statistics or
|
|
machines of the current type. To execute a command, press the lowercase
|
|
version of the first letter in its name. Some commands also have an
|
|
uppercase version that has a somewhat different effect, as indicated in
|
|
the following list.
|
|
|
|
=over 4
|
|
|
|
=item C<cm>
|
|
|
|
Switches to the C<Cache Managers> screen. Available only on the C<System
|
|
Overview> and C<File Servers> screens.
|
|
|
|
=item C<fs>
|
|
|
|
Switches to the C<File Servers> screen. Available only on the C<System
|
|
Overview> and the C<Cache Managers> screens.
|
|
|
|
=item C<left>
|
|
|
|
Scrolls horizontally to the left, to access the data columns situated to
|
|
the left of the current set. Available when the C<<<< <<< >>>> symbol
|
|
appears at the top left of the screen. Press uppercase C<L> to scroll
|
|
horizontally all the way to the left (to display the first set of data
|
|
columns).
|
|
|
|
=item C<next>
|
|
|
|
Scrolls down vertically to the next page of machine names. Available when
|
|
there are two or more pages of machines and the final page is not
|
|
currently displayed. Press uppercase C<N> to scroll to the final page.
|
|
|
|
=item C<oview>
|
|
|
|
Switches to the C<System Overview> screen. Available only on the C<Cache
|
|
Managers> and C<File Servers> screens.
|
|
|
|
=item C<prev>
|
|
|
|
Scrolls up vertically to the previous page of machine names. Available
|
|
when there are two or more pages of machines and the first page is not
|
|
currently displayed. Press uppercase C<N> to scroll to the first page.
|
|
|
|
=item C<right>
|
|
|
|
Scrolls horizontally to the right, to access the data columns situated to
|
|
the right of the current set. This command is available when the C<<<< >>>
|
|
>>>> symbol appears at the upper right of the screen. Press uppercase C<R>
|
|
to scroll horizontally all the way to the right (to display the final set
|
|
of data columns).
|
|
|
|
=back
|
|
|
|
=head2 The System Overview Screen
|
|
|
|
The C<System Overview> screen appears automatically as the B<afsmonitor>
|
|
program initializes. This screen displays the status of as many File
|
|
Server and Cache Manager processes as can fit in the current window;
|
|
scroll down to access additional information.
|
|
|
|
The information on this screen is split into File Server information on
|
|
the left and Cache Manager information on the right. The header for each
|
|
grouping reports two pieces of information:
|
|
|
|
=over 4
|
|
|
|
=item *
|
|
|
|
The number of machines on which the program is monitoring the indicated
|
|
process.
|
|
|
|
=item *
|
|
|
|
The number of alerts and the number of machines affected by them (an
|
|
I<alert> means that a statistic has exceeded its threshold or a process
|
|
failed to respond to the last probe).
|
|
|
|
=back
|
|
|
|
A list of the machines being monitored follows. If there are any alerts on
|
|
a machine, the number of them appears in square brackets to the left of
|
|
the hostname. If a process failed to respond to the last probe, the
|
|
letters C<PF> (probe failure) appear in square brackets to the left of the
|
|
hostname.
|
|
|
|
=head2 The File Servers Screen
|
|
|
|
The C<File Servers> screen displays the values collected at the most
|
|
recent probe for File Server statistics.
|
|
|
|
A summary line at the top of the screen (just below the standard program
|
|
version and screen title blocks) specifies the number of monitored File
|
|
Servers, the number of alerts, and the number of machines affected by the
|
|
alerts.
|
|
|
|
The first column always displays the hostnames of the machines running the
|
|
monitored File Servers.
|
|
|
|
To the right of the hostname column appear as many columns of statistics
|
|
as can fit within the current width of the display screen or window; each
|
|
column requires space for 10 characters. The name of the statistic appears
|
|
at the top of each column. If the File Server on a machine did not respond
|
|
to the most recent probe, a pair of dashes (C<-->) appears in each
|
|
column. If a value exceeds its configured threshold, it is highlighted in
|
|
reverse video. If a value is too large to fit into the allotted column
|
|
width, it overflows into the next row in the same column.
|
|
|
|
=head2 The Cache Managers Screen
|
|
|
|
The C<Cache Managers> screen displays the values collected at the most
|
|
recent probe for Cache Manager statistics.
|
|
|
|
A summary line at the top of the screen (just below the standard program
|
|
version and screen title blocks) specifies the number of monitored Cache
|
|
Managers, the number of alerts, and the number of machines affected by the
|
|
alerts.
|
|
|
|
The first column always displays the hostnames of the machines running the
|
|
monitored Cache Managers.
|
|
|
|
To the right of the hostname column appear as many columns of statistics
|
|
as can fit within the current width of the display screen or window; each
|
|
column requires space for 10 characters. The name of the statistic appears
|
|
at the top of each column. If the Cache Manager on a machine did not
|
|
respond to the most recent probe, a pair of dashes (C<-->) appears in each
|
|
column. If a value exceeds its configured threshold, it is highlighted in
|
|
reverse video. If a value is too large to fit into the allotted column
|
|
width, it overflows into the next row in the same column.
|
|
|
|
=head2 Writing to an Output File
|
|
|
|
Include the B<-output> argument to name the file into which the
|
|
B<afsmonitor> program writes all of the statistics it collects. The
|
|
output file can be useful for tracking performance over long periods of
|
|
time, and enables the administrator to apply post-processing techniques
|
|
that reveal system trends. The AFS distribution does not include any
|
|
post-processing programs.
|
|
|
|
The output file is in ASCII format and records the same information as the
|
|
C<File Server> and C<Cache Manager> display screens. Each line in the
|
|
file uses the following format to record the time at which the
|
|
B<afsmonitor> program gathered the indicated statistic from the Cache
|
|
Manager (C<CM>) or File Server (C<FS>) running on the machine called
|
|
I<host_name>. If a probe failed, the error code C<-1> appears in the
|
|
I<statistic> field.
|
|
|
|
<time> <host_name> CM|FS <statistic>
|
|
|
|
If the administrator usually reviews the output file manually, rather than
|
|
using it as input to an automated analysis program or script, including
|
|
the B<-detail> flag formats the data in a more easily readable form.
|
|
|
|
=head1 EXAMPLES
|
|
|
|
For examples of commands, display screens, and configuration files, see
|
|
the section about the B<afsmonitor> program in the I<IBM AFS
|
|
Administration Guide>.
|
|
|
|
=head1 PRIVILEGE REQUIRED
|
|
|
|
None
|
|
|
|
=head1 SEE ALSO
|
|
|
|
L<afsmonitor(5)>
|
|
L<fstrace(8)>,
|
|
L<scout(1)>
|
|
|
|
=head1 COPYRIGHT
|
|
|
|
IBM Corporation 2000. <http://www.ibm.com/> All Rights Reserved.
|
|
|
|
This documentation is covered by the IBM Public License Version 1.0. It was
|
|
converted from HTML to POD by software written by Chas Williams and Russ
|
|
Allbery, based on work by Alf Wachsmann and Elizabeth Cassell.
|