Chapter 4. Monitoring and Controlling Server Processes

One of your most important responsibilities as a system administrator is ensuring that the processes on file server machines are running correctly. The BOS Server, which runs on every file server machine, relieves you of much of the responsibility by constantly monitoring the other AFS server processes on its machine. It can automatically restart processes that have failed, ordering the restarts to take interdependencies into account.

Because different file server machines run different combinations of processes, you must define which processes the BOS Server on each file server machine is to monitor (to learn how, see Controlling and Checking Process Status).

It is sometimes necessary to take direct control of server process status before performing routine maintenance or correcting problems that the BOS Server cannot correct (such as problems with database replication or mutual authentication). At those times, you control process status through the BOS Server by issuing bos commands.

Summary of Instructions

This chapter explains how to perform the following tasks by using the indicated commands:

Examine process status                         bos status
Examine information from the BosConfig file    bos status with -long flag
Create a process instance                      bos create
Stop a process                                 bos stop
Start a stopped process                        bos start
Stop a process temporarily                     bos shutdown
Start a temporarily stopped process            bos startup
Stop and immediately restart a process         bos restart
Stop and immediately restart all processes     bos restart with -bosserver flag
Examine BOS Server's restart times             bos getrestart
Set BOS Server's restart times                 bos setrestart
Examine a log file                             bos getlog
Execute a command remotely                     bos exec

Brief Descriptions of the AFS Server Processes

This section briefly describes the different server processes that can run on an AFS server machine. In cells with multiple server machines, not all processes necessarily run on all machines.

An AFS server process is referred to in one of three ways, depending on the context:

The following sections specify each name for the process as well as some of the administrative tasks in which you use the process. For a more general description of the servers, see AFS Server Processes and the Cache Manager.

The bosserver Process: the Basic OverSeer Server

The bosserver process, which runs on every AFS server machine, is the Basic OverSeer (BOS) Server responsible for monitoring the other AFS server processes running on its machine. If a process fails, the BOS Server can restart it automatically, without human intervention. It takes interdependencies into account when restarting a process that has multiple component processes (such as the fs process described in The fs Collection of Processes: the File Server, Volume Server and Salvager).

Because the BOS Server does not monitor or restart itself, it does not appear in the output from the bos status command. It appears in the ps command's output as /usr/afs/bin/bosserver.

As a system administrator, you contact the BOS Server when you issue bos commands to perform the following kinds of tasks.

The buserver Process: the Backup Server

The buserver process, which runs on database server machines, is the Backup Server. It maintains information about Backup System configuration and operations in the Backup Database.

The process appears as buserver in the bos status command's output, if the conventional name is assigned. It appears in the ps command's output as /usr/afs/bin/buserver.

As a system administrator, you contact the Backup Server when you issue any backup command that manipulates information in the Backup Database, including those that change Backup System configuration information, that dump data from volumes to permanent storage, or that restore data to AFS. See Configuring the AFS Backup System and Backing Up and Restoring AFS Data.

The fs Collection of Processes: the File Server, Volume Server and Salvager

The fs process, which runs on every file server machine, combines three component processes: File Server, Volume Server and Salvager. The three components perform independent functions, but are controlled as a single process for the following reasons.

  • They all operate on the same data, namely files and directories stored in AFS volumes. Combining them as a single process enables them to coordinate their actions, never attempting simultaneous operations on the same data that can possibly corrupt it.

  • It enables the BOS Server to stop and restart the processes in the required order. When the File Server fails, the BOS Server stops the Volume Server and runs the Salvager to correct any corruption that resulted from the failure. (The Salvager runs only in this special circumstance or when you invoke it yourself by issuing the bos salvage command as instructed in Salvaging Volumes.) If only the Volume Server fails, the BOS Server can restart it without affecting the File Server or Salvager.

The File Server component handles AFS data at the level of files and directories, manipulating file system elements as requested by application programs and the standard operating system commands. Its main duty is to deliver requested files to client machines and store them again on the server machine when the client is finished. It also maintains status and protection information about each file and directory. It runs continuously during normal operation.

The Volume Server component handles AFS data at the level of complete volumes rather than files and directories. In response to vos commands, it creates, removes, moves, dumps and restores entire volumes, among other actions. It runs continuously during normal operation.

The Salvager component does not run during normal operation. The BOS Server runs it after the File Server fails, and you can invoke it yourself by issuing the bos salvage command. It checks the file system for internal consistency and repairs any errors it finds.

The process appears as fs in the bos status command's output, if the conventional name is assigned. An auxiliary message reports the status of the File Server or Salvager component. See Displaying Process Status and Information from the BosConfig File.

The component processes of the fs process appear individually in the ps command's output, as follows. There is no entry for the fs process itself.

  • /usr/afs/bin/fileserver

  • /usr/afs/bin/volserver

  • /usr/afs/bin/salvager

The Cache Manager contacts the File Server component on your behalf whenever you access data or status information in an AFS file or directory or issue file manipulation commands such as the UNIX cp and ls commands. You can contact the File Server directly by issuing fs commands that perform the following functions.

You contact the Volume Server component when you issue vos commands that manipulate volumes in any way--creating, removing, replicating, moving, renaming, converting to different formats, and salvaging. For instructions, see Managing Volumes.

The Salvager normally runs automatically in case of a failure. You can also start it with the bos salvage command as described in Salvaging Volumes.

The kaserver Process: the Authentication Server

The kaserver process, which runs on database server machines, is the Authentication Server responsible for several aspects of AFS security. It verifies AFS user identity by requiring a password. It maintains all AFS server encryption keys and user passwords in the Authentication Database. The Authentication Server's Ticket Granting Service (TGS) module creates the shared secrets that AFS client and server processes use when establishing secure connections.

The process appears as kaserver in the bos status command's output, if the conventional name is assigned. The ka string stands for Kerberos Authentication, reflecting the fact that AFS's authentication protocols are based on Kerberos, which was originally developed at the Massachusetts Institute of Technology's Project Athena.

It appears in the ps command's output as /usr/afs/bin/kaserver.

As a system administrator, you contact the Authentication Server when you issue kas commands to perform the following kinds of tasks.

The ptserver Process: the Protection Server

The ptserver process, which runs on database server machines, is the Protection Server. Its main responsibility is maintaining the Protection Database which contains user, machine, and group entries. The Protection Server allocates AFS IDs and maintains the mapping between them and names. The File Server consults the Protection Server when verifying that a user is authorized to perform a requested action.

The process appears as ptserver in the bos status command's output, if the conventional name is assigned. It appears in the ps command's output as /usr/afs/bin/ptserver.

As a system administrator, you contact the Protection Server when you issue pts commands to perform the following kinds of tasks.

The runntp Process

The runntp process, which runs on every server machine, is a controller program for the Network Time Protocol Daemon (NTPD), which synchronizes the hardware clocks on server machines. You need to run the runntp process if you are not already running NTP or another time synchronization protocol on your server machines.

The clocks on database server machines need to be synchronized because AFS's distributed database technology (Ubik) works properly only when the clocks agree within a narrow range of variation (see Configuring the Cell for Proper Ubik Operation). The clocks on file server machines need to be correct not only because the File Server sets modification time stamps on files, but because in the conventional configuration they serve as the time source for AFS client machines.

The process appears as runntp in the bos status command's output, if the conventional name is assigned. It appears in the output from the ps command as /usr/afs/bin/runntp. The ps command's output also includes an entry called ntpd; its exact form depends on the arguments you provide to the runntp command.

As a system administrator, you do not contact the NTPD directly once you have installed it according to the instructions in the IBM AFS Quick Beginnings.

The upserver and upclient Processes: the Update Server

The Update Server has two separate parts, each of which runs on a different type of server machine. The upserver process is the server portion of the Update Server. Its function depends on which edition of AFS you use:

  • With both the United States and international editions, it runs on the binary distribution machine of each system type you use as a server machine, distributing the contents of each one's /usr/afs/bin directory to the other server machines of that type. This guarantees that all machines have the same version of AFS binaries. (For a list of the binaries, see Binaries in the /usr/afs/bin Directory.)

  • If you use the United States edition of AFS, it also runs on the cell's system control machine, distributing the contents of its /usr/afs/etc directory to all the other server machines in order to synchronize the configuration files stored in that directory. (For a list of the configuration files, see Common Configuration Files in the /usr/afs/etc Directory.)

The upclient process is the client portion of the Update Server, and like the server portion its function depends on the AFS edition in use.

  • It runs on every server machine that is not a binary distribution machine, referencing the binary distribution machine of its system type as the source for updates to the binaries in the /usr/afs/bin directory. The conventional process name to assign is upclientbin.

  • If you use the United States edition of AFS, another instance of the process runs on every server machine except the system control machine. It references the system control machine as the source for updates to the common configuration files in the /usr/afs/etc directory. The conventional process name to assign is upclientetc.

In output from the bos status command, the server portion appears as upserver and the client portions as upclientbin and upclientetc, if the conventional names are assigned. In the output from the ps command, the server portion appears as /usr/afs/bin/upserver and the client portions as /usr/afs/bin/upclient.

You do not contact the Update Server directly once you have installed it. It operates automatically whenever you use bos commands to change the files that it distributes.

The vlserver Process: the Volume Location Server

The vlserver process, which runs on database server machines, is the Volume Location (VL) Server that automatically tracks which file server machines house each volume, making its location transparent to client applications.

The process appears as vlserver in the bos status command's output, if the conventional name is assigned. It appears in the ps command's output as /usr/afs/bin/vlserver.

As a system administrator, you contact the VL Server when you issue any vos command that changes the status of a volume (it records the status changes in the VLDB).

Controlling and Checking Process Status

To define the AFS server processes that run on a server machine, use the bos create command to create entries for them in the local /usr/afs/local/BosConfig file. The BOS Server monitors the processes listed in the BosConfig file that are marked with the Run status flag, and automatically attempts to restart them if they fail. After creating process entries, you use other commands from the bos suite to stop and start processes or change the status flag as desired.

Never edit the BosConfig file directly; always use bos commands to change it. Similarly, it is not a good practice to run server processes without listing them in the BosConfig file, or to stop them using process termination commands such as the UNIX kill command.

The Information in the BosConfig File

A process's entry in the BosConfig file includes the following information:

  • The process's name. The recommended conventional names are defined in both the IBM AFS Quick Beginnings and Creating and Removing Processes. The name of a simple process usually matches the name of its binary file (for example, ptserver for the Protection Server).

  • Its type, which is one of the following:

    simple

    A process that runs independently of any other on the server machine. If several simple processes fail at the same time, the BOS Server can restart them in any order. All standard AFS processes except the fs process are simple.

    fs

    A process type reserved for the server process for which the conventional name is also fs. This process combines three components: the File Server, the Volume Server, and the Salvager.

    cron

    A process that runs at a defined time rather than continuously. There are no standard processes of this type.

  • Its status flag, which tells the BOS Server whether it performs the following two actions with respect to the process:

    • Start the process during BOS Server initialization

    • Restart the process if it (the process) fails

    The two possible values are Run (which directs the BOS Server to perform these actions) and NotRun (which directs the BOS Server to ignore the process). The BOS Server itself never changes the setting of this flag, even if the process fails repeatedly. Also, this flag is for internal use only; it does not appear in the bos status command's output.

  • Its command parameters, which are the commands that the BOS Server runs to start the process.

    • A simple process has one: the complete pathname to its binary file

    • The fs process has three: the complete pathnames to each of the three component processes (/usr/afs/bin/fileserver, /usr/afs/bin/volserver, and /usr/afs/bin/salvager)

    • A cron process has two: the first the complete pathname to its binary file, the second the time at which the BOS Server runs it

In addition to process definitions, the BosConfig file also records automatic restart times for processes that have new binaries, and for all server processes including the BOS Server. See Setting the BOS Server's Restart Times.

How the BOS Server Uses the Information in the BosConfig File

Whenever the BOS Server starts or restarts, it reads the BosConfig file to learn which processes it is to start and monitor. It transfers the information into its own memory and does not read the BosConfig file again until it next restarts. This implies that the BOS Server's memory state can change independently of the BosConfig file. You can, for example, stop a process but leave its status flag in the BosConfig file as Run, or start a process even though its status flag in the BosConfig file is NotRun.
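
The separation between the flag stored in the BosConfig file and the BOS Server's memory state can be pictured with a small model. The following Python sketch is illustrative only: the ProcessState class and the apply function are hypothetical names, not part of AFS, and the model assumes that bos start and bos stop change both places while bos startup and bos shutdown change only the memory state, as the command summaries in this chapter describe.

```python
from dataclasses import dataclass


@dataclass
class ProcessState:
    """Hypothetical model of one process entry as the BOS Server sees it."""
    file_flag: str    # status flag recorded in the BosConfig file: "Run" or "NotRun"
    memory_flag: str  # status flag held in the BOS Server's memory


def apply(state: ProcessState, command: str) -> ProcessState:
    """Return the state that results from one bos operation (a sketch)."""
    if command == "start":      # start a stopped process: both places become Run
        return ProcessState("Run", "Run")
    if command == "stop":       # stop a process: both places become NotRun
        return ProcessState("NotRun", "NotRun")
    if command == "startup":    # temporary start: only the memory state changes
        return ProcessState(state.file_flag, "Run")
    if command == "shutdown":   # temporary stop: only the memory state changes
        return ProcessState(state.file_flag, "NotRun")
    raise ValueError(f"unsupported command in this sketch: {command}")


# A process stopped with bos shutdown keeps Run in the BosConfig file,
# so it starts again the next time the BOS Server restarts and rereads the file.
state = apply(ProcessState("Run", "Run"), "shutdown")
print(state)
```

In this model, a mismatch between the two fields is exactly the situation that the bos status command reports as temporarily enabled or temporarily disabled.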

About Starting and Stopping the Database Server Processes

When you start or stop a database server process (Authentication Server, Backup Server, Protection Server, or Volume Location Server) for more than a short time, you must follow the instructions in the IBM AFS Quick Beginnings for installing or removing a database server machine. Here is a summary of the tasks you must perform to preserve correct AFS functioning.

  • Start or stop all four database server processes on that machine. All AFS server processes and the Cache Manager processes expect all four database server processes to be running on each machine listed in the CellServDB file. There is no way to indicate in the file that a machine is running only some of the database server processes.

  • Add or remove the machine in the /usr/afs/etc/CellServDB file on all server machines and the /usr/vice/etc/CellServDB file on all client machines.

  • Restart the database server processes on the other database server machines to force an election of a new Ubik coordinator for each one.
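
The all-or-nothing rule in the first item above lends itself to a quick sanity check. The following Python sketch assumes you have already collected the process names defined on a machine (for example, from bos status output) and that the conventional process names are in use; the function name missing_db_processes is hypothetical.

```python
# Conventional names of the four database server processes.
DB_PROCESSES = {"kaserver", "buserver", "ptserver", "vlserver"}


def missing_db_processes(instances):
    """Return the database server processes that are missing on a machine
    that runs at least one of them; an empty list means the rule is met."""
    present = DB_PROCESSES & set(instances)
    if present and present != DB_PROCESSES:
        return sorted(DB_PROCESSES - present)
    return []


# A machine that runs only some of the database server processes is flagged.
print(missing_db_processes(["fs", "ptserver", "vlserver", "upclientbin"]))
# A plain file server machine (no database server processes) passes.
print(missing_db_processes(["fs", "upclientbin", "runntp"]))
```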

About Starting and Stopping the Update Server

In the conventional cell configuration, one server machine of each system type acts as a binary distribution machine, running the server portion of the Update Server (upserver process) to distribute the contents of its /usr/afs/bin directory. The other server machines of its system type run an instance of the Update Server client portion (by convention called upclientbin) that references the binary distribution machine.

If you run the United States edition of AFS, it is conventional for the first server machine you install to act as the system control machine, running the server portion of the Update Server (upserver process) to distribute the contents of its /usr/afs/etc directory. All other server machines run an instance of the Update Server client portion (by convention called upclientetc) that references the system control machine.

Note: If you are using the international edition of AFS, do not use the Update Server to distribute the contents of the /usr/afs/etc directory (you do not run a system control machine). Ignore all references to the process in this chapter.

It is simplest not to move binary distribution or system control responsibilities to a different machine unless you completely decommission a machine that is currently serving in one of those roles. Running the Update Server usually imposes very little processing load. If you must move the functionality, perform the following related tasks.

  • If you replace the system control machine, you must stop the upclientetc process on every other server machine and define a new one that references the new system control machine.

  • If you replace a binary distribution machine, you must stop the upclientbin process on every other server machine of its system type and define a new one that references the new binary distribution machine (unless you are no longer running any server machines of that system type).

Displaying Process Status and Information from the BosConfig File

To display the status of the AFS server processes on a server machine, issue the bos status command. Adding the -long flag displays most of the information from each process's entry in the BosConfig file, including its type and command parameters. It also displays a warning message if the mode bits on files and subdirectories in the /usr/afs directory do not match the expected values.

To display the status of server processes and their BosConfig entries

  1. Issue the bos status command.

    
   % bos status <machine name>  [<server process name>+]  [-long]
    

    where

    stat

    Is the shortest acceptable abbreviation of status.

    machine name

    Specifies the file server machine for which to display process status.

    server process name

    Names each process for which to display status, using the name assigned when its entry was defined with the bos create command. Omit this argument to display the status of all server processes.

    -long

    Displays, in addition to status, information from the process's entry in the BosConfig file: its type, its status flag, its command parameters, the associated notifier program, and so on.

The output includes an entry for each process and uses one of the following strings to indicate the process's status:

  • currently running normally indicates that the process is running and its status flag in the BosConfig file is Run. For cron entries, this message indicates that the command is still scheduled to run, not necessarily that it was actually running when the bos status command was issued.

  • temporarily enabled indicates that the process is running but that its status flag in the BosConfig file is NotRun. The most common reason is that a system administrator has used the bos startup command to start the process.

  • temporarily disabled indicates that the process is not running even though its status flag in the BosConfig file is Run. The most common reasons are either that a system administrator has used the bos shutdown command to stop the process or that the BOS Server ceased trying to restart the process after numerous failed attempts. In the latter case, a supplementary message appears: stopped for too many errors.

  • disabled indicates that the process is not running and that its status flag in the BosConfig file is NotRun. The BOS Server is not monitoring the process. Only a system administrator can set the flag this way; the BOS Server never does.
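
The four status strings above are determined by two facts: whether the process is running, and the value of its status flag in the BosConfig file. The following Python sketch restates that mapping as a lookup table; the function name status_string is hypothetical and the code is only a summary of the list above, not of how the BOS Server is implemented.

```python
def status_string(running: bool, flag_is_run: bool) -> str:
    """Map (process is running, BosConfig flag is Run) to the status string."""
    table = {
        (True, True): "currently running normally",
        (True, False): "temporarily enabled",
        (False, True): "temporarily disabled",
        (False, False): "disabled",
    }
    return table[(running, flag_is_run)]


# A process started with bos startup while its flag is NotRun:
print(status_string(running=True, flag_is_run=False))
```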

The output for the fs process always includes a message marked Auxiliary status, which can be one of the following:

  • file server running indicates that the File Server and Volume Server components of the File Server process are running normally.

  • salvaging file system indicates that the Salvager is running, which usually implies that the File Server and Volume Server are temporarily disabled. The BOS Server restarts them as soon as the Salvager is finished.

The output for a cron process also includes an Auxiliary status message to report when the command is scheduled to run next; see the example that follows.

The output for any process can include the supplementary message has core file to indicate that at some point the process failed and generated a core file in the /usr/afs/logs directory. In most cases, the BOS Server is able to restart the process and it is running.

The following example includes a user-defined cron entry called backupusers:


   % bos status fs3.abc.com
   Instance kaserver, currently running normally.
   Instance ptserver, currently running normally.
   Instance vlserver, has core file, currently running normally.
   Instance buserver, currently running normally.
   Instance fs, currently running normally.
       Auxiliary status is: file server running.
   Instance upserver, currently running normally.
   Instance runntp, currently running normally.
   Instance backupusers, currently running normally.
       Auxiliary status is: run next at Mon Jun 7 02:00:00 1999.
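
If you collect this output from several machines, a short script can pick out the entries that deserve a closer look. The following Python sketch assumes the line layout shown in the example above (an Instance line with comma-separated phrases, ending in a period). The sample text is made up for illustration, and the function names parse_bos_status and needs_attention are hypothetical.

```python
import re

# Hypothetical sample, modeled on the example output above.
SAMPLE = """\
Instance kaserver, currently running normally.
Instance vlserver, has core file, currently running normally.
Instance fs, currently running normally.
    Auxiliary status is: file server running.
Instance backupusers, temporarily disabled.
"""

# Assumed layout: "Instance <name>, <phrase>[, <phrase>...]."
INSTANCE_RE = re.compile(r"^Instance (?P<name>[^,]+), (?P<rest>.*?)\.?$")


def parse_bos_status(text):
    """Return a dict that maps each instance name to its list of phrases."""
    result = {}
    for line in text.splitlines():
        match = INSTANCE_RE.match(line.strip())
        if match:  # Auxiliary status lines do not match and are skipped
            phrases = [p.strip() for p in match.group("rest").split(",")]
            result[match.group("name")] = phrases
    return result


def needs_attention(text):
    """List instances that are not running normally or that left a core file."""
    flagged = []
    for name, phrases in parse_bos_status(text).items():
        if "currently running normally" not in phrases or "has core file" in phrases:
            flagged.append(name)
    return flagged


print(needs_attention(SAMPLE))
```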

If you include the -long flag to the bos status command, a process's entry in the output includes the following additional information from the BosConfig file:

  • The process's type (simple, fs, or cron).

  • The day and time the process last started or restarted.

  • The number of proc starts, which is how many times the BOS Server has started or restarted the process since it started itself.

  • The Last exit time when the process (or one of the component processes in the fs process) last terminated. This line does not appear if the process has not terminated since the BOS Server started.

  • The Last error exit time when the process (or one of the component processes in the fs process) last failed due to an error. A further explanation such as due to shutdown request sometimes appears. This line does not appear if the process has not failed since the BOS Server started.

  • Each command that the BOS Server invokes to start the process, as specified by the -cmd argument to the bos create command.

  • The pathname of the notifier program that the BOS Server invokes when the process terminates (if any), as specified by the -notifier argument to the bos create command.

In addition, if the BOS Server has found that the mode bits on certain files and directories under /usr/afs deviate from what it expects, it prints the following warning message:


   Bosserver process reports inappropriate access on server directories

The expected protections for the directories and files in the /usr/afs directory are as follows. A question mark indicates that the BOS Server does not check the mode bit. See the IBM AFS Quick Beginnings for more information about setting the protections on these files and directories.

/usr/afs                 drwxr?xr-x
/usr/afs/backup          drwx???---
/usr/afs/bin             drwxr?xr-x
/usr/afs/db              drwx???---
/usr/afs/etc             drwxr?xr-x
/usr/afs/etc/KeyFile     -rw????---
/usr/afs/etc/UserList    -rw?????--
/usr/afs/local           drwx???---
/usr/afs/logs            drwxr?xr-x
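
Because a question mark in the expected protections stands for a mode bit that is not checked, each entry can be treated as a wildcard pattern. The following Python sketch compares a mode string of the kind printed by ls -l against the patterns listed above. It is only an approximation of the check, written for illustration: the names EXPECTED_MODES, mode_matches, and current_mode are hypothetical, and the BOS Server's own logic may differ.

```python
import os
import stat
from fnmatch import fnmatchcase

# Expected protections from the list above; "?" marks an unchecked mode bit.
EXPECTED_MODES = {
    "/usr/afs": "drwxr?xr-x",
    "/usr/afs/backup": "drwx???---",
    "/usr/afs/bin": "drwxr?xr-x",
    "/usr/afs/db": "drwx???---",
    "/usr/afs/etc": "drwxr?xr-x",
    "/usr/afs/etc/KeyFile": "-rw????---",
    "/usr/afs/etc/UserList": "-rw?????--",
    "/usr/afs/local": "drwx???---",
    "/usr/afs/logs": "drwxr?xr-x",
}


def mode_matches(path, mode_string):
    """Return True if mode_string (for example 'drwxr-xr-x') fits the
    expected pattern for path; '?' in the pattern matches any character."""
    return fnmatchcase(mode_string, EXPECTED_MODES[path])


def current_mode(path):
    """Return the mode string of an existing file, in ls -l style."""
    return stat.filemode(os.stat(path).st_mode)


print(mode_matches("/usr/afs/etc/KeyFile", "-rw-------"))
print(mode_matches("/usr/afs/etc/KeyFile", "-rw----r--"))
```

On a server machine, mode_matches(path, current_mode(path)) applies the same comparison to the live file system.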

The following illustrates the extended output for the fs process running on the machine fs3.abc.com:


   % bos status fs3.abc.com fs -long
   Instance fs, (type is fs), currently running normally.
       Auxiliary status is file server running
   Process last started at Mon May 3 8:29:19 1999 (3 proc starts)
   Last exit at Mon May 3 8:29:19 1999
   Last error exit at Mon May 3 8:29:19 1999, due to shutdown request
   Command 1 is '/usr/afs/bin/fileserver'
   Command 2 is '/usr/afs/bin/volserver'
   Command 3 is '/usr/afs/bin/salvager'
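
The extended output is regular enough to extract individual fields with a script. The following Python sketch pulls the process type, the number of proc starts, and the command list out of text laid out like the example above. It assumes that exact layout, and the function name parse_long_entry is hypothetical.

```python
import re

# The extended output from the example above.
SAMPLE_LONG = """\
Instance fs, (type is fs), currently running normally.
    Auxiliary status is file server running
Process last started at Mon May 3 8:29:19 1999 (3 proc starts)
Last exit at Mon May 3 8:29:19 1999
Last error exit at Mon May 3 8:29:19 1999, due to shutdown request
Command 1 is '/usr/afs/bin/fileserver'
Command 2 is '/usr/afs/bin/volserver'
Command 3 is '/usr/afs/bin/salvager'
"""


def parse_long_entry(text):
    """Extract name, type, proc starts, and commands from one -long entry."""
    info = {"commands": []}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = re.match(r"Instance ([^,]+), \(type is (\w+)\)", line)
        if match:
            info["name"], info["type"] = match.groups()
            continue
        match = re.search(r"\((\d+) proc starts?\)", line)
        if match:
            info["proc_starts"] = int(match.group(1))
            continue
        match = re.match(r"Command (\d+) is '(.*)'$", line)
        if match:
            info["commands"].append(match.group(2))
    return info


entry = parse_long_entry(SAMPLE_LONG)
print(entry["type"], entry["proc_starts"], entry["commands"])
```

A proc starts count that keeps climbing between checks suggests that the BOS Server is restarting the process repeatedly.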

Creating and Removing Processes

To start a new AFS server process on a server machine, issue the bos create command, which creates an entry in the /usr/afs/local/BosConfig file, sets the process's status flag to Run both in the file and in the BOS Server's memory, and starts it running immediately. The binary file for the new process must already be installed, by convention in the /usr/afs/bin directory (see Installing New Binaries).

To stop a process permanently, first issue the bos stop command, which changes the process's status flag to NotRun in both the BosConfig file and the BOS Server's memory; it is marked as disabled in the output from the bos status command. If desired, issue the bos delete command to remove the process's entry from the BosConfig file; the process no longer appears in the bos status command's output.

Note: If you are starting or stopping a database server process in the manner described in this section, follow the complete instructions in the IBM AFS Quick Beginnings for creating or removing a database server machine. If you run one database server process on a given machine, you must run them all; for more information, see About Starting and Stopping the Database Server Processes. Similarly, if you are stopping the upserver process on the system control machine or a binary distribution machine, you must complete the additional tasks described in About Starting and Stopping the Update Server.

To create and start a new process

  1. Verify that you are authenticated as a user listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

    
   % bos listusers <machine name>
    

  2. (Optional) Verify that the process's binaries are installed in the /usr/afs/bin directory on this machine. If necessary, log in at the console or telnet to the machine and list the contents of the /usr/afs/bin directory.

    If the binaries are not present, install them on the binary distribution machine of the appropriate system type, and wait for the Update Server to copy them to this machine. For instructions, see Installing New Binaries.

    
   % ls /usr/afs/bin
    
  3. Issue the bos create command to create an entry in the BosConfig file and start the process.

    
   % bos create <machine name> <server process name>   \
                 <server type> <command lines>+ [ -notifier <Notifier program>]
    

    where

    cr

    Is the shortest acceptable abbreviation of create.

    machine name

    Specifies the file server machine on which to create the process.

    server process name

    Names the process to create and start. For simple processes, the conventional value is the name of the process's binary file. It is best to use the same name on every server machine that runs the process. The following is a list of the conventional names for simple and fs-type processes (there are no standard cron processes).

    • buserver for the Backup Server

    • fs for the process that combines the File Server, Volume Server, and Salvager

    • kaserver for the Authentication Server

    • ptserver for the Protection Server

    • runntp for the controller process for the Network Time Protocol Daemon

    • upclientbin for the client portion of the Update Server that references the binary distribution machine of this machine's system type

    • upclientetc for the client portion of the Update Server that references the system control machine

    • vlserver for the Volume Location (VL) Server

    server type

    Defines the process's type. Choose one of the following values:

    • cron for a cron process

    • fs for the process named fs

    • simple for all other processes listed as acceptable values for the server process name argument

    command lines

    Specifies each command the BOS Server runs to start the process. Specify no more than six commands (which can include the command's options, in which case the entire string is surrounded by double quotes); any additional commands are ignored.

    For a simple process, provide the complete pathname of the process's binary file on the local disk (for example, /usr/afs/bin/ptserver for the Protection Server). If including any of the initialization command's options, surround the entire command in double quotes (" "). The upclient process has a required argument, and the commands for all other processes take optional arguments.

    For the fs process, provide the complete pathname of the local disk binary file for each of the component processes: fileserver, volserver, and salvager, in that order. The standard binary directory is /usr/afs/bin. If including any of an initialization command's options, surround the entire command in double quotes (" ").

    For a cron process, provide two parameters:

    • The complete local disk pathname of either an executable file or a command from one of the AFS suites (complete with all of the necessary arguments). Surround this parameter with double quotes (" ") if it contains spaces.

    • A specification of when the BOS Server executes the file or command indicated by the first parameter. There are three acceptable values:

      • The string now, which directs the BOS Server to execute the file or command immediately and only once. It is usually simpler to issue the command directly or issue the bos exec command.

      • A time of day. The BOS Server executes the file or command daily at the indicated time. Separate the hours and minutes with a colon (hh:MM), and use either 24-hour format, or a value in the range from 1:00 through 12:59 with the addition of am or pm. For example, both 14:30 and "2:30 pm" indicate 2:30 in the afternoon. Surround this parameter with double quotes (" ") if it contains a space.

      • A day of the week and time of day, separated by a space and surrounded with double quotes (" "). The BOS Server executes the file or command weekly at the indicated day and time. For the day, provide either the whole name or the first three letters, all in lowercase letters (sunday or sun, thursday or thu, and so on). For the time, use the same format as when specifying the time alone.

    -notifier

    Specifies the pathname of a program that the BOS Server runs when the process terminates. For more information on notifier programs, see the bos create command reference page in the IBM AFS Administration Reference.

The following example defines and starts the Protection Server on the machine db2.abc.com:


   % bos create db2.abc.com ptserver simple /usr/afs/bin/ptserver

The following example defines and starts the fs process on the machine fs6.abc.com:


   % bos create fs6.abc.com fs fs /usr/afs/bin/fileserver   \
        /usr/afs/bin/volserver /usr/afs/bin/salvager

The following example defines and starts a cron process called backupuser on the machine fs3.abc.com, scheduling it to run each day at 3:00 a.m.:


   % bos create fs3.abc.com backupuser cron  "/usr/afs/bin/vos backupsys -prefix user -local" 3:00
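
The quoting rules for cron parameters are easy to get wrong when the command is issued from a script. The following is a minimal sketch, assuming a POSIX shell. The function name create_weekly_cron is hypothetical and not part of AFS. It passes the command string and the day-and-time string as single arguments, which is the effect the double quotes have at the prompt.

```shell
# Hypothetical helper (not part of AFS): define a cron process that runs weekly.
# Usage: create_weekly_cron <machine> <process name> <command string> <day> <time>
create_weekly_cron() {
    machine=$1 name=$2 cmdline=$3 day=$4 when=$5
    # Quoting "$cmdline" and "$day $when" keeps each one a single argument,
    # equivalent to surrounding it with double quotes on the command line.
    bos create "$machine" "$name" cron "$cmdline" "$day $when"
}

# Example, reusing the machine and process names from the example above:
#   create_weekly_cron fs3.abc.com backupuser \
#       "/usr/afs/bin/vos backupsys -prefix user -local" sun 3:00
```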

To stop a process and remove it from the BosConfig file

  1. Verify that you are authenticated as a user listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

    
   % bos listusers <machine name>
    

  2. Issue the bos stop command to change each process's status flag in the BosConfig file to NotRun and to stop it. You must issue this command even for cron processes that you wish to remove from the BosConfig file, even though they do not run continuously. For a detailed description of this command, see To stop a process by changing its status to NotRun.

    
   % bos stop <machine name> <server process name>+ [-wait]
    

  3. Issue the bos delete command to remove each process from the BosConfig file.

    
   % bos delete <machine name> <server process name>+
    

    where

    d

    Is the shortest acceptable abbreviation of delete.

    machine name

    Specifies the server machine on which to remove processes from the BosConfig file.

    server process name

    Names each process entry to remove from the BosConfig file. Provide the same names as in Step 2.
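
Steps 2 and 3 can be combined in a small script. This is only a sketch: the function name remove_processes is hypothetical, and it assumes that the bos command returns a nonzero exit status when it fails.

```shell
# Hypothetical helper (not part of AFS): stop processes, then remove their entries.
# Usage: remove_processes <machine> <process name>...
remove_processes() {
    machine=$1
    shift
    # -wait delays the return until the processes have stopped;
    # the && skips the delete step if the stop step reports failure.
    bos stop "$machine" "$@" -wait && bos delete "$machine" "$@"
}

# Example:
#   remove_processes fs3.abc.com backupuser
```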

Stopping and Starting Processes Permanently

To stop a process so that the BOS Server no longer attempts to monitor it, issue the bos stop command. The process's status flag is set to NotRun both in the BOS Server's memory and in the BosConfig file. The process does not run again until you issue the bos start command, which sets its status flag back to Run in both places. (You can also use the bos startup command to start the process again without changing its status flag in the BosConfig file; see Stopping and Starting Processes Temporarily.)

There is no entry for the BOS Server in the BosConfig file, so the bos stop and bos start commands do not control it. To stop and immediately restart the BOS Server along with all other processes, use the -bosserver flag to the bos restart command as described in Stopping and Immediately Restarting Processes.

Note: If you are starting or stopping a database server process in the manner described in this section, follow the complete instructions in the IBM AFS Quick Beginnings for creating or removing a database server machine. If you run one database server process on a given machine, you must run them all; for more information, see About Starting and Stopping the Database Server Processes. Similarly, if you are stopping the upserver process on the system control machine or a binary distribution machine, you must complete the additional tasks described in About Starting and Stopping the Update Server.

To stop a process by changing its status to NotRun

  1. Verify that you are authenticated as a user listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

    
   % bos listusers <machine name>
    

  2. Issue the bos stop command to stop each process and set its status flag to NotRun in the BosConfig file and the BOS Server's memory.

    
   % bos stop <machine name> <server process name>+ [-wait]
    

    where

    sto

    Is the shortest acceptable abbreviation of stop.

    machine name

    Specifies the server machine on which to stop the process.

    server process name

    Names each process to stop, using the name assigned when its entry was defined with the bos create command.

    -wait

    Delays the return of the command shell prompt until all specified processes have stopped. If you omit the flag, the prompt returns almost immediately, even if some processes have not yet stopped.
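
To confirm the result of a stop, the bos status command can be issued immediately afterward. The sketch below is illustrative only; the function name stop_and_report is hypothetical.

```shell
# Hypothetical helper (not part of AFS): stop one process and report its status.
# Usage: stop_and_report <machine> <process name>
stop_and_report() {
    # With -wait, bos stop returns only after the process has stopped,
    # so the status query that follows reflects the final state.
    bos stop "$1" "$2" -wait || return 1
    bos status "$1" "$2"
}

# Example:
#   stop_and_report fs3.abc.com backupuser
```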

To start processes by changing their status flags to Run

  1. Verify that you are listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

    
   % bos listusers <machine name>
    

  2. Issue the bos start command to change each process's status flag to Run in both the BosConfig file and the BOS Server's memory and to start it.

    
   % bos start <machine name> <server process name>+
    

    where

    start

    Must be typed in full.

    machine name

    Specifies the server machine on which to start running each process.

    server process name

    Specifies each process to start on machine name. Use the name assigned to the process at creation.
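
When the same process must be started on several machines, the command can be repeated in a loop. The following sketch assumes a POSIX shell; the function name start_everywhere is hypothetical. It is not intended for database server processes, which have the additional requirements described in the preceding Note.

```shell
# Hypothetical helper (not part of AFS): start one process on several machines.
# Usage: start_everywhere <process name> <machine>...
start_everywhere() {
    instance=$1
    shift
    failed=0
    for machine in "$@"; do
        # Continue with the remaining machines even if one command fails,
        # but remember the failure for the return status.
        bos start "$machine" "$instance" || failed=1
    done
    return "$failed"
}

# Example:
#   start_everywhere fs fs3.abc.com fs6.abc.com
```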

Stopping and Starting Processes Temporarily

It is sometimes necessary to halt a process temporarily (for example, to make slight configuration changes or to perform maintenance). The commands described in this section change a process's status in the BOS Server's memory only; the effect is immediate and lasts until you change the memory state again (or until the BOS Server restarts, at which time it starts the process according to its entry in the BosConfig file).

To stop a process temporarily by changing its status flag in BOS Server memory to NotRun, use the bos shutdown command. To restart a stopped process by changing its status flag in the BOS Server's memory to Run, use the bos startup command. The process starts regardless of its status flag in the BosConfig file. You can also use the bos startup command to start all processes marked with status flag Run in the BosConfig file, as described in the following instructions.

Because the bos startup command starts a process without changing its status flag in the BosConfig file, it is useful for testing a server process without enabling it permanently. To stop and start processes by changing their status flags in the BosConfig file, see Stopping and Starting Processes Permanently; to stop and immediately restart a process, see Stopping and Immediately Restarting Processes.

Note: Do not temporarily stop a database server process on all machines at once. Doing so makes the database completely unavailable.
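
The difference between the permanent and temporary commands can be traced with a toy model of the two status flags. The sketch below does not call AFS at all; every name in it is hypothetical, and it models a single process only.

```shell
# Toy model (not AFS code): the two status flags for one process.
mem_flag=Run      # flag in the BOS Server's memory
cfg_flag=Run      # flag in the BosConfig file

model_stop()     { mem_flag=NotRun; cfg_flag=NotRun; }  # like bos stop
model_start()    { mem_flag=Run;    cfg_flag=Run;    }  # like bos start
model_shutdown() { mem_flag=NotRun; }                   # like bos shutdown
model_startup()  { mem_flag=Run; }                      # like bos startup

# A BOS Server restart reloads the memory flag from the BosConfig file.
model_bosserver_restart() { mem_flag=$cfg_flag; }

show_flags() { echo "memory=$mem_flag BosConfig=$cfg_flag"; }
```

Starting from both flags set to Run, calling model_shutdown and then show_flags prints memory=NotRun BosConfig=Run; a subsequent model_bosserver_restart brings the memory flag back to Run, which mirrors the way a temporary stop lasts only until the BOS Server restarts.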

To stop processes temporarily

  1. Verify that you are listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

    
   % bos listusers <machine name>
    

  2. Issue the bos shutdown command to stop each process by changing its status flag in the BOS Server's memory to NotRun.

    
   % bos shutdown <machine name> [<instances>+] [-wait]
    

    where

    sh

    Is the shortest acceptable abbreviation of shutdown.

    machine name

    Specifies the server machine on which to stop processes temporarily.

    instances

    Specifies each process to stop temporarily. Use the name assigned to the process at creation.

    -wait

    Delays the return of the command shell prompt until all specified processes have actually stopped. If you omit the flag, the prompt returns almost immediately, even if some processes have not yet stopped.

To start all stopped processes that have status flag Run in the BosConfig file

  1. Verify that you are listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

    
   % bos listusers <machine name>
    

  2. Issue the bos startup command to start every process on the machine that has status flag Run in the BosConfig file. The command changes each such process's status flag in the BOS Server's memory from NotRun to Run.

    
   % bos startup <machine name>
    

    where

    startup

    Must be typed in full.

    machine name

    Specifies the server machine on which you wish to start all processes that have status flag Run in the BosConfig file.

To start specific processes

  1. Verify that you are listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

    
   % bos listusers <machine name>
    

  2. Issue the bos startup command to start specific processes by changing their status flags in the BOS Server's memory to Run without changing their status flags in the BosConfig file.

    
   % bos startup <machine name> <instances>+
    

    where

    startup

    Must be typed in full.

    machine name

    Names the server machine on which to start processes.

    instances

    Specifies each process to start. Use the name assigned to the process at creation.
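
A temporary stop is typically paired with a later start around some maintenance task. The following is a sketch of that pattern, assuming a POSIX shell; the function name with_process_down is hypothetical, and the maintenance command is whatever the caller supplies.

```shell
# Hypothetical helper (not part of AFS): run a command while a process is
# temporarily stopped, then start the process again.
# Usage: with_process_down <machine> <process name> <command> [<argument>...]
with_process_down() {
    [ "$#" -ge 3 ] || { echo "usage: with_process_down machine process command..." >&2; return 2; }
    machine=$1 instance=$2
    shift 2
    bos shutdown "$machine" "$instance" -wait || return 1
    # Run the caller's maintenance command and remember its exit status.
    if "$@"; then status=0; else status=$?; fi
    # Start the process again whether or not the maintenance command succeeded.
    bos startup "$machine" "$instance" || return 1
    return "$status"
}
```

Because only the flag in the BOS Server's memory changes, the process's entry in the BosConfig file is the same before and after the call.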

Stopping and Immediately Restarting Processes

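Although by default the BOS Server checks each day for newly installed binary files and restarts the associated processes, it is sometimes desirable to stop and restart processes immediately. The bos restart command provides this functionality, starting a completely new instance of each affected process. Depending on its arguments, it restarts all processes including the BOS Server, all processes except the BOS Server, or only the processes you name; the instructions in this section cover each case.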

Restarting processes causes a service outage. It is usually best to schedule restarts for periods of low usage. The BOS Server automatically restarts all processes once a week, to reduce the potential for the core leaks that can develop as any process runs for an extended time; see Setting the BOS Server's Restart Times.

To stop and restart all processes including the BOS Server

  1. Verify that you are listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

    
   % bos listusers <machine name>
    

  2. Issue the bos restart command with the -bosserver flag to stop and restart the BOS Server, which restarts every process marked with status flag Run in the BosConfig file.

    
   % bos restart <machine name> -bosserver
    

    where

    res

    Is the shortest acceptable abbreviation of restart.

    machine name

    Specifies the server machine on which to restart all processes.

    -bosserver

    Stops the BOS Server and all processes running on the machine. A new BOS Server instance starts; it then starts new instances of all processes marked with status flag Run in the BosConfig file.

To stop and immediately restart all processes except the BOS Server

  1. Verify that you are listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

    
   % bos listusers <machine name>
    

  2. Issue the bos restart command with the -all flag to stop and immediately restart every process marked with status flag Run in the BosConfig file. The BOS Server does not restart.

    
   % bos restart <machine name> -all
    

    where

    res

    Is the shortest acceptable abbreviation of restart.

    machine name

    Specifies the server machine on which to stop and restart processes.

    -all

    Stops and immediately restarts all processes marked with status flag Run in the BosConfig file.

To stop and immediately restart specific processes

  1. Verify that you are listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

    
   % bos listusers <machine name>
    

  2. Issue the bos restart command to stop and immediately restart one or more specified processes, regardless of their status flag setting in the BosConfig file.

    
   % bos restart <machine name> <instances>+
    

    where

    res

    Is the shortest acceptable abbreviation of restart.

    machine name

    Names the server machine on which to restart the specified processes.

    instances

    Specifies each process to stop and immediately restart. Use the name assigned to the process at creation.
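
The three forms of the bos restart command described in this section can be summarized in one wrapper. This is a sketch under assumptions: the function name restart_scope and its scope keywords (bosserver, all, only) are hypothetical; only the bos restart arguments themselves come from the instructions above.

```shell
# Hypothetical helper (not part of AFS): choose the form of bos restart.
# Usage: restart_scope <machine> bosserver
#        restart_scope <machine> all
#        restart_scope <machine> only <process name>...
restart_scope() {
    [ "$#" -ge 2 ] || { echo "usage: restart_scope machine scope [process...]" >&2; return 2; }
    machine=$1 scope=$2
    shift 2
    case $scope in
        bosserver)
            # Restarts the BOS Server, which then starts all processes marked Run.
            bos restart "$machine" -bosserver ;;
        all)
            # Restarts all processes marked Run; the BOS Server keeps running.
            bos restart "$machine" -all ;;
        only)
            # Restarts the named processes regardless of their BosConfig flag.
            [ "$#" -gt 0 ] || { echo "no process names given" >&2; return 2; }
            bos restart "$machine" "$@" ;;
        *)
            echo "unknown scope: $scope" >&2
            return 2 ;;
    esac
}
```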

Setting the BOS Server's Restart Times

The BOS Server by default restarts once a week, and the new instance restarts all processes marked with status flag Run in the local /usr/afs/local/BosConfig file (this is equivalent to issuing the bos restart command with the -bosserver flag). The default restart time is Sunday at 4:00 a.m. The weekly restart is designed to minimize core leaks, which can develop as a process continues to allocate virtual memory but does not free it again. When the memory is completely exhausted, the machine can no longer function correctly.

The BOS Server also by default checks once a day for any newly installed binary files. If it finds that the modification time stamp on a process's binary file in the /usr/afs/bin directory is more recent than the time at which the process last started, it restarts the process so that a new instance starts using the new binary file. The default binary-checking time is 5:00 a.m.

Because restarts can cause outages during which the file system is inaccessible, the default times for restarts are in the early morning when usage is likely to be lowest. Restarting a database server process on any database server machine usually makes the entire system unavailable to everyone for a brief time, whereas restarting other types of processes inconveniences only users interacting with that process on that machine. The longest outages typically result from restarting the fs process, because the File Server must reattach all volumes.

The BosConfig file on each file server machine records the two restart times. To display the current setting, issue the bos getrestart command. To reset a time, use the bos setrestart command.

To display the BOS Server restart times

  1. Issue the bos getrestart command to display the automatic restart times.

    
   % bos getrestart <machine name>
    

    where

    getr

    Is the shortest acceptable abbreviation of getrestart.

    machine name

    Specifies the server machine for which to display the restart times.

To set the general or binary restart time

  1. Verify that you are listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

    
   % bos listusers <machine name>
    

  2. Issue the bos setrestart command with the -general flag to set the general restart time or the -newbinary flag to set the binary restart time. The command accepts only one of the flags at a time.

    
   % bos setrestart <machine name> "<time to restart server>" [-general] [-newbinary]
    

    where

    setr

    Is the shortest acceptable abbreviation of setrestart.

    machine name

    Specifies the server machine.

    time to restart server

    Sets when the BOS Server restarts itself (if combined with the -general flag) or any process with a new binary file (if combined with the -newbinary flag). Provide one of the following types of values:

    • The string never, which directs the BOS Server never to perform the indicated type of restart.

    • A time of day (the conventional type of value for the binary restart time). Separate the hours and minutes with a colon (hh:MM), and use either 24-hour format, or a value in the range from 1:00 through 12:59 with the addition of am or pm. For example, both 14:30 and "2:30 pm" indicate 2:30 in the afternoon. Surround this parameter with double quotes (" ") if it contains a space.

    • A day of the week and time of day, separated by a space and surrounded with double quotes (" "). This is the conventional type of value for the general restart. For the day, provide either the whole name or the first three letters, all in lowercase letters (sunday or sun, thursday or thu, and so on). For the time, use the same format as when specifying the time alone.

    If desired, precede a time or day and time definition with the string every or at. These words do not change the meaning, but possibly make the output of the bos getrestart command easier to understand.

    Note: If the specified time is within one hour of the current time, the BOS Server does not perform the restart until the next eligible time (the next day for a time or next week for a day and time).

    -general

    Sets the general restart time when the BOS Server restarts itself.

    -newbinary

    Sets the restart time for processes with new binary files.
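
A script that sets restart times can check the time value against the formats listed above before issuing the command. The following is a minimal sketch, assuming a POSIX shell with grep -E; the function names is_restart_time and set_general_restart are hypothetical. The pattern covers only the forms described in this section (it also tolerates am or pm written without the preceding space, which is an assumption), so the BOS Server itself remains the final authority on what it accepts.

```shell
# Hypothetical helper (not part of AFS): succeed if the argument looks like
# one of the time values described above.
is_restart_time() {
    day='(sun(day)?|mon(day)?|tue(sday)?|wed(nesday)?|thu(rsday)?|fri(day)?|sat(urday)?)'
    t24='([01]?[0-9]|2[0-3]):[0-5][0-9]'           # 24-hour format
    t12='(0?[1-9]|1[0-2]):[0-5][0-9] ?(am|pm)'     # 1:00 through 12:59 with am or pm
    # -x requires the whole line to match; -q suppresses output.
    printf '%s\n' "$1" |
        grep -Eqx "never|((every|at) )?($day )?($t24|$t12)"
}

# Hypothetical helper: set the general restart time after a format check.
# Usage: set_general_restart <machine> <time>
set_general_restart() {
    is_restart_time "$2" || { echo "unrecognized time: $2" >&2; return 2; }
    bos setrestart "$1" "$2" -general
}

# Example:
#   set_general_restart fs3.abc.com "sun 4:00"
```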

Displaying Server Process Log Files

The /usr/afs/logs directory on each file server machine contains log files that detail interesting events that occur during normal operation of some AFS server processes. The self-explanatory information in the log files can help you evaluate process failures and other problems. To display a log file remotely, issue the bos getlog command. You can also establish a connection to the server machine and use a text editor or other file display program (such as the cat command).

Note: Log files can grow unmanageably large if you do not periodically shut down and restart the database server processes (for example, if you disable the general restart time). In this case it is good policy to issue the UNIX rm command periodically to delete the current log file. The server process automatically creates a new one as needed.

To examine a server process log file

  1. Verify that you are listed in the /usr/afs/etc/UserList file. If necessary, issue the bos listusers command, which is fully described in To display the users in the UserList file.

    
   % bos listusers <machine name>
    

  2. Issue the bos getlog command to display a log file.

    
   % bos getlog <machine name> <log file to examine>
    

    where

    getl

    Is the shortest acceptable abbreviation of getlog.

    machine name

    Specifies the server machine from which to display the log file.

    log file to examine

    Names the log file to be displayed. Provide one of the following file names to display the indicated log file from the /usr/afs/logs directory.

    • AuthLog for the Authentication Server log file

    • BackupLog for the Backup Server log file

    • BosLog for the BOS Server log file

    • FileLog for the File Server log file

    • SalvageLog for the Salvager log file

    • VLLog for the Volume Location (VL) Server log file

    • VolserLog for the Volume Server log file

    You can provide a full or relative pathname to display a file from another directory. Relative pathnames are interpreted relative to the /usr/afs/logs directory.
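
Because log files can be long, it is often convenient to view only the most recent entries. The sketch below pipes the output of the bos getlog command through the standard tail command; the function name log_tail and the default of 20 lines are hypothetical choices.

```shell
# Hypothetical helper (not part of AFS): show the last lines of a server log.
# Usage: log_tail <machine> <log file> [<line count>]
log_tail() {
    # ${3:-20} defaults to 20 lines when no count is given.
    bos getlog "$1" "$2" | tail -n "${3:-20}"
}

# Example:
#   log_tail fs3.abc.com FileLog 50
```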