watchdog(8) BSD System Manager's Manual watchdog(8)
NAME
watchdog - Mac OS X Server service monitoring daemon
SYNOPSIS
watchdog [-d | -n | -x] [-r] [-f file]
watchdog [-h | -v]
DESCRIPTION
watchdog is an (AT&T) init-like process that launches, monitors, and
relaunches critical services when they terminate. watchdog improves
reliability of the system by maintaining these critical services in a
uniform manner without resorting to service-specific monitors (which
would impact system performance).
In typical usage, watchdog is launched during the boot process by the
Watchdog startup item. Upon launch, watchdog moves to the background,
reads its configuration file, spawns children, and waits for their
termination. When a child terminates, watchdog relaunches it unless the
child quit shortly after it was launched, which is currently defined as
running for ten seconds or less.
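The relaunch rule above can be sketched in a few lines of Python. This is
an illustration only, not watchdog's actual code: the name should_relaunch
and the use of plain timestamps in seconds are assumptions. Only the
ten-second threshold comes from the text.

```python
# Illustrative sketch only; not watchdog's implementation.
MIN_RUNTIME_SECONDS = 10.0  # "ten seconds or less" per the description above


def should_relaunch(launched_at: float, terminated_at: float) -> bool:
    """Return True if a terminated child ran long enough to be relaunched."""
    runtime = terminated_at - launched_at
    # A child that quit after ten seconds or less is not relaunched.
    return runtime > MIN_RUNTIME_SECONDS
```

Under these assumptions, a child that exits three seconds after launch is
left alone, while one that exits after an hour is started again.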
Child processes that appear to quit too quickly may actually be
daemonizing themselves: the launched process forks and exits at once,
leaving the real service running where watchdog cannot see it. To allow
watchdog to monitor such a process properly, the configuration file
should invoke it with its "no-daemonize" argument, if one exists.
See the Examples section for a detailed description of the configuration
file.
watchdog is also indirectly responsible for rebooting the server hardware
if the machine hangs. On machines that support automatic reboot, this
feature is controlled by the System Preferences' Energy Saver panel,
which modifies the WATCHDOGTIMER field in /etc/hostconfig. If automatic
reboot is enabled, watchdog periodically resets the power management
timer. If the timer ever expires, the power management unit forces a hard
reboot. Automatic reboot is disabled when watchdog quits cleanly, so it
is imperative that watchdog be terminated by sending a termination signal
(SIGTERM), NOT a kill signal (SIGKILL)!
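As a minimal sketch of a clean shutdown request, the helper below sends
SIGTERM and nothing else. The function name stop_watchdog and the
injectable send parameter are hypothetical. How the administrator obtains
the process ID is not covered by this page and is left to the caller.

```python
import os
import signal


def stop_watchdog(pid: int, send=os.kill) -> None:
    """Ask watchdog to shut down cleanly so the reboot timer is disabled."""
    # SIGTERM, never SIGKILL: a kill leaves the automatic reboot timer armed.
    send(pid, signal.SIGTERM)
```

The send parameter exists only so the helper can be exercised without a
running watchdog; in normal use it defaults to os.kill.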
watchdog logs events that may be of interest to the administrator. The
log file is named watchdog.event.log and is stored in /Library/Logs.
watchdog performs some special operations in response to different sig-
nals:
SIGHUP
watchdog rereads its configuration file when it receives the hangup
signal. Services may be added, deleted, or modified when the
configuration file is reread. The process associated with any entry
that has been deleted, commented out, or had its action changed to
``off'' or ``boot'' will be terminated; ``respawn'' and ``now''
entries with a modified command line will be terminated (if
necessary) and relaunched; new ``respawn'' and ``now'' entries will be
launched; unchanged entries and ``bootonce'' and ``bootwait'' entries
will not be touched.
SIGINT
watchdog effects a complete restart when it receives the interrupt
signal. All executing children will be terminated, forcibly (with
SIGKILL) if necessary. After all children have terminated, watchdog
reads the configuration file and spawns children.
SIGTERM
watchdog forces a complete shutdown when it receives the terminate
signal. The automatic reboot timer will be disabled and all exe-
cuting children will be terminated, forcibly (with SIGKILL) if nec-
essary. After all children have terminated, watchdog itself exits.
watchdog should always be terminated with this signal instead of
the kill signal (SIGKILL) to properly disable the automatic reboot
timer.
SIGKILL
Issuing the kill signal to watchdog immediately terminates the pro-
cess without disabling the automatic reboot timer, which will force
a hard reboot after five minutes on supported machines.
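The SIGHUP rules above amount to comparing the previous set of entries
with the newly read one. The sketch below is one possible reading of those
rules, not watchdog's code. The function plan_sighup, the (action, command)
tuples, and the plan labels are all hypothetical. Two points are
inferences rather than statements from the SIGHUP text: an inactive entry
switched to ``respawn'' or ``now'' is launched (from the ``respawn''
description), and a removed ``bootonce'' entry is left running (from KNOWN
ISSUES).

```python
def plan_sighup(old, new):
    """Map each entry id to 'terminate', 'restart', 'launch', or 'leave'.

    old and new map id -> (action, command) before and after the reread.
    """
    plan = {}
    for ident, (old_action, _cmd) in old.items():
        if ident not in new:
            # Deleted or commented out. A removed bootonce entry is not
            # terminated (see KNOWN ISSUES).
            plan[ident] = "leave" if old_action == "bootonce" else "terminate"
    for ident, (action, cmd) in new.items():
        if action in ("bootonce", "bootwait"):
            plan[ident] = "leave"
        elif action in ("off", "boot"):
            changed = ident in old and old[ident][0] != action
            plan[ident] = "terminate" if changed else "leave"
        elif action in ("respawn", "now"):
            if ident not in old:
                plan[ident] = "launch"
            elif old[ident][0] not in ("respawn", "now"):
                plan[ident] = "launch"   # inferred: entry was inactive before
            elif old[ident][1] != cmd:
                plan[ident] = "restart"  # modified command line
            else:
                plan[ident] = "leave"    # unchanged
        # Unknown actions get no plan entry; invalid entries are ignored.
    return plan
```

Keeping the comparison in a pure function makes it easy to check the
outcome for a given pair of configurations before any signal is sent.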
OPTIONS
The following options are available:
-d Do not move to background; print log strings to the terminal.
-f Read the specified configuration file instead of the default,
/etc/watchdog.conf.
-h Print usage summary and exit.
-n Do not move to background, print log information to the terminal,
and quit after reading the configuration file but before spawning
child processes. In effect, this validates the configuration file.
-r Ignore the WATCHDOGTIMER field in /etc/hostconfig and enable the
automatic reboot timer.
-v Print build version and exit.
-x Do not move to background.
EXAMPLES
The configuration file is a plain ASCII text file with one process entry
per line. Whitespace is ignored, as is all text following a pound
character (#). Each valid entry consists of three colon-delimited fields:
id:action:path args
Invalid entries are ignored (and logged).
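To make the entry format concrete, here is a small parser sketch for a
single line. It is an illustration under stated assumptions, not the
parser watchdog uses: the name parse_entry is hypothetical, whitespace is
assumed to be trimmed around each field, and the list of valid actions is
taken from the action names documented in this section.

```python
VALID_ACTIONS = {"off", "boot", "bootonce", "bootwait", "respawn", "now"}


def parse_entry(line):
    """Parse one configuration line into (id, action, command), or None."""
    # Drop everything after a pound character, then outer whitespace.
    text = line.split("#", 1)[0].strip()
    if not text:
        return None  # blank or comment-only line
    # Split into at most three fields; later colons stay in the command.
    fields = text.split(":", 2)
    if len(fields) != 3:
        return None  # invalid entry
    ident, action, command = (field.strip() for field in fields)
    if not ident or action not in VALID_ACTIONS or not command:
        return None  # invalid entry
    return ident, action, command
```

For example, the line "ssd:respawn:/usr/sbin/serversettings -x # Server
Settings daemon" yields the id "ssd", the action "respawn", and the
command "/usr/sbin/serversettings -x".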
The id field is a unique identifying key for the service. Any short
string is valid. This is the key used to identify entries across hangup
signals.
The action field can be ``off'', ``boot'', ``bootonce'', ``bootwait'',
``respawn'' or ``now'', depending on how the corresponding process should
be monitored. The monitored process list is changed only during
configuration file parsing; parsing occurs once at launch and then only in
response to the hangup (SIGHUP) or interrupt (SIGINT) signals, as noted
above. Each action is interpreted as follows:
off If the process associated with id is not running, this entry is
ignored. If the process currently is running, it will be sent the
terminate signal (SIGTERM), followed four seconds later by the
kill signal (SIGKILL).
boot During first launch, the configuration file will be rewritten to
change ``boot'' to ``respawn'', and the process will be launched
and monitored as if the entry were ``respawn''. If the process is
running during subsequent parsing in response to a hangup signal
(SIGHUP), it will be terminated (as ``off'').
bootonce
The command associated with this entry will only be launched when
watchdog is first launched. If the process terminates, it will
not be restarted. The process is ignored during subsequent pars-
ing in response to a hangup signal (SIGHUP), but will be termi-
nated (as ``off'') in response to an interrupt (SIGINT) or termi-
nate (SIGTERM) signal.
bootwait
This action is currently equivalent to ``boot''; a future revi-
sion will pause configuration file parsing during launch until
this entry terminates.
respawn
If the process is not running, start it. If the process dies,
restart it. If the path args field changes between parses, the
process will be terminated (as ``off'') and restarted with the new
executable and arguments.
now During first launch (when watchdog launches ``boot'' entries),
this entry is skipped and the configuration file will be rewritten
to change ``now'' to ``off''. If an entry is ``now'' when
reparsing the configuration file, it will behave like
``respawn''.
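The first-launch behavior of the actions described above can be
summarized as a lookup table. The sketch below is a reading of that text,
not watchdog's code; the names FirstLaunch, FIRST_LAUNCH, and first_launch
are hypothetical. ``bootwait'' is deliberately left out, because the text
says only that it is currently equivalent to ``boot'' and does not state
whether the configuration file is rewritten for it.

```python
from typing import NamedTuple


class FirstLaunch(NamedTuple):
    rewritten_action: str   # action written back to the configuration file
    launch: bool            # start the process during first launch?
    relaunch_on_exit: bool  # restart it if it later terminates?


FIRST_LAUNCH = {
    "off":      FirstLaunch("off", False, False),
    "boot":     FirstLaunch("respawn", True, True),
    "bootonce": FirstLaunch("bootonce", True, False),
    "respawn":  FirstLaunch("respawn", True, True),
    "now":      FirstLaunch("off", False, False),
}


def first_launch(action: str) -> FirstLaunch:
    """Look up how an action is handled when watchdog first launches."""
    return FIRST_LAUNCH[action]
```

Read this way, ``boot'' and ``now'' are the only two actions that change
the configuration file at first launch.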
The path args field is the command line to execute. path should be fully
specified; args are any additional arguments. If path does not exist, or
is not an executable file, the entry will be ignored (and logged).
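The path check described above might look like the following sketch. It
is an assumption-laden illustration rather than watchdog's own test: the
name command_is_launchable is hypothetical, and the command is assumed to
split into path and args at the first run of whitespace. The sketch does
not reject relative paths, since the text says only that path "should" be
fully specified.

```python
import os


def command_is_launchable(command: str) -> bool:
    """Return True if the path part names an existing executable file."""
    parts = command.split(None, 1)  # path, then any additional arguments
    if not parts:
        return False
    path = parts[0]
    # Must exist as a regular file and carry execute permission.
    return os.path.isfile(path) and os.access(path, os.X_OK)
```

An entry that fails this check would be skipped and logged rather than
launched.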
The following is a simple config file:
#
# /etc/watchdog.conf
#
sambadmin:respawn:/usr/sbin/sambadmind -d # SMB Admin daemon
PSM:respawn:/usr/sbin/PrintServiceMonitor -x # Server Printing
ssd:respawn:/usr/sbin/serversettings -x # Server Settings daemon
mail:off:/usr/sbin/MailService -n # Mail service
mm:off:/usr/sbin/MacintoshManagementServer -x # Macintosh Manager
KNOWN ISSUES
Who watches the watchmen?
If an entry originally defined with action ``bootonce'' is commented out
(instead of being changed to ``off''), it will not be terminated.
FILES & FOLDERS
/usr/sbin/watchdog
/etc/watchdog.conf
/etc/hostconfig
/Library/Logs/watchdog.event.log
/System/Library/StartupItems/Watchdog
SEE ALSO
serversettingsd(8)
Mac OS X Server 30 August 2001 Mac OS X Server
Mac OS X 10.3 Server - Generated Sat Jun 14 10:27:06 CDT 2008
