watchdog(8) BSD System Manager's Manual watchdog(8)
NAME
watchdog - Mac OS X Server service monitoring daemon
SYNOPSIS
watchdog [-d | -n | -x] [-r] [-f file]
watchdog [-h | -v]
DESCRIPTION
watchdog is an (AT&T) init-like process that launches, monitors, and
relaunches critical services when they terminate. watchdog improves the
reliability of the system by maintaining these critical services in a
uniform manner without resorting to service-specific monitors (which would
impact system performance).

In typical usage, watchdog is launched during the boot process by the
Watchdog startup item. Upon launch, watchdog moves to the background, reads
its configuration file, spawns children, and waits for their termination.
When a child terminates, watchdog relaunches it unless the child quit
shortly after it was launched, currently defined as ten seconds or less.

Child processes that appear to be spawning too quickly may actually be
daemonizing themselves. To allow watchdog to properly monitor such a
process, the configuration file should invoke it with its "no-daemonize"
argument if one exists. See the EXAMPLES section for a detailed description
of the configuration file.

watchdog is also indirectly responsible for rebooting the server hardware
if the machine hangs. On machines that support automatic reboot, this
feature is controlled by the Energy Saver panel in System Preferences,
which modifies the WATCHDOGTIMER field in /etc/hostconfig. If automatic
reboot is enabled, watchdog periodically resets the power management timer.
If the timer ever expires, the power management unit forces a hard reboot.
Automatic reboot is disabled when watchdog quits cleanly, so it is
imperative that watchdog be terminated by sending a termination signal
(SIGTERM), NOT a kill signal (SIGKILL)!

The server logs events that may be of interest to the administrator. The
log file is named watchdog.event.log and is stored in /Library/Logs.

watchdog performs some special operations in response to different signals:

SIGHUP   watchdog rereads its configuration file when it receives the
         hangup signal. Services may be added, deleted, or modified when
         the configuration file is reread. The process associated with any
         entry that has been deleted, commented out, or had its action
         changed to ``off'' or ``boot'' will be terminated; ``respawn''
         and ``now'' entries with a modified command line will be
         terminated (if necessary) and relaunched; new ``respawn'' and
         ``now'' entries will be launched; unchanged entries, ``bootonce''
         entries, and ``bootwait'' entries will not be touched.

SIGINT   watchdog effects a complete restart when it receives the
         interrupt signal. All executing children will be terminated,
         forcibly (with SIGKILL) if necessary. After all children have
         terminated, watchdog reads the configuration file and spawns
         children.

SIGTERM  watchdog forces a complete shutdown when it receives the
         terminate signal. The automatic reboot timer will be disabled and
         all executing children will be terminated, forcibly (with
         SIGKILL) if necessary. After all children have terminated,
         watchdog itself exits. watchdog should always be terminated with
         this signal instead of the kill signal (SIGKILL) to properly
         disable the automatic reboot timer.

SIGKILL  Issuing the kill signal to watchdog immediately terminates the
         process without disabling the automatic reboot timer, which will
         force a hard reboot after five minutes on supported machines.
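The SIGHUP rules above can be read as a comparison between the previously parsed entries and the newly parsed ones. The following Python sketch illustrates that reading only; it is not watchdog's implementation. The function name plan_reload and the dictionary layout are hypothetical. It also assumes that an existing ``respawn'' or ``now'' entry whose action changed but whose command line did not is simply started if it is not running, and it does not model the ``bootonce'' caveat described under KNOWN ISSUES.

```python
def plan_reload(old, new):
    """Sketch of the documented SIGHUP behavior (hypothetical helper).

    old and new map an entry id to an (action, command) tuple.
    Returns a dict of sets: terminate, relaunch, launch, untouched.
    """
    plan = {"terminate": set(), "relaunch": set(),
            "launch": set(), "untouched": set()}

    # Entries that were deleted or commented out are terminated.
    for ident in old.keys() - new.keys():
        plan["terminate"].add(ident)

    for ident, (action, command) in new.items():
        if action in ("bootonce", "bootwait"):
            # Never touched on a hangup signal.
            plan["untouched"].add(ident)
        elif old.get(ident) == (action, command):
            # Unchanged entries are left alone.
            plan["untouched"].add(ident)
        elif action in ("off", "boot"):
            # Action changed to off or boot: terminate the process.
            plan["terminate"].add(ident)
        elif ident not in old:
            # New respawn/now entry.
            plan["launch"].add(ident)
        elif old[ident][1] != command:
            # Modified command line: terminate if necessary, then relaunch.
            plan["relaunch"].add(ident)
        else:
            # Assumption: only the action changed; start it if not running.
            plan["launch"].add(ident)
    return plan
```

Computing the plan separately from acting on it keeps the rules easy to check against the text before any process is signalled.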
OPTIONS
The following options are available:

-d      Do not move to background; print log strings to the terminal.

-f      Read the specified configuration file instead of the default,
        /etc/watchdog.conf.

-h      Print usage summary and exit.

-n      Do not move to background, print log information to the terminal,
        and quit after reading the configuration file but before spawning
        child processes. This basically validates the configuration file.

-r      Ignore the WATCHDOGTIMER field in /etc/hostconfig and enable the
        automatic reboot timer.

-v      Print build version and exit.

-x      Do not move to background.
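As an illustration of how these options combine, the Python sketch below assembles a command line that respects the SYNOPSIS, where -d, -n, and -x are mutually exclusive. The helper name watchdog_argv and its parameters are hypothetical; the binary path is the one listed under FILES & FOLDERS. The sketch only builds the argument list and does not run anything.

```python
def watchdog_argv(mode=None, force_timer=False, config=None,
                  binary="/usr/sbin/watchdog"):
    """Build an argument list for watchdog (hypothetical helper).

    mode is None or one of "-d", "-n", "-x"; the SYNOPSIS allows at
    most one of these. force_timer adds -r; config adds -f <file>.
    """
    if mode not in (None, "-d", "-n", "-x"):
        raise ValueError("mode must be None, '-d', '-n', or '-x'")
    argv = [binary]
    if mode is not None:
        argv.append(mode)
    if force_timer:
        argv.append("-r")
    if config is not None:
        argv += ["-f", config]
    return argv
```

For example, watchdog_argv("-n", config="/tmp/candidate.conf") yields an invocation that reads a candidate file and quits before spawning children. The path /tmp/candidate.conf is a made-up example, and how watchdog reports a bad entry beyond logging it is not specified here.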
EXAMPLES
The configuration file is a plain-text ASCII file with one process entry
per line. Whitespace is ignored, as is all text following a pound
character (#). Each valid entry consists of three colon-delimited fields:

        id:action:path args

Invalid entries are ignored (and logged).

The id entry is a unique identifying key for the service. Any short string
is valid. This is the key used to identify entries across hangup signals.

The action entry can be ``off'', ``boot'', ``bootonce'', ``bootwait'',
``respawn'', or ``now'', depending on how the corresponding process should
be monitored. The monitored process list is changed only during
configuration file parsing; parsing occurs once at launch and then only in
response to the hangup (SIGHUP) or interrupt (SIGINT) signals, as noted
above. The interpretation of the action fields follows:

off       If the process associated with id is not running, this entry is
          ignored. If the process is currently running, it will be sent
          the terminate signal (SIGTERM), followed four seconds later by
          the kill signal (SIGKILL).

boot      During first launch, the configuration file will be rewritten to
          change ``boot'' to ``respawn'', and the process will be launched
          and monitored as if the entry were ``respawn''. If the process
          is running during subsequent parsing in response to a hangup
          signal (SIGHUP), it will be terminated (as ``off'').

bootonce  The command associated with this entry will only be launched
          when watchdog is first launched. If the process terminates, it
          will not be restarted. The process is ignored during subsequent
          parsing in response to a hangup signal (SIGHUP), but will be
          terminated (as ``off'') in response to an interrupt (SIGINT) or
          terminate (SIGTERM) signal.

bootwait  This action is currently equivalent to ``boot''; a future
          revision will pause configuration file parsing during launch
          until this entry terminates.

respawn   If the process is not running, start it. If the process dies,
          restart it. If the path field changes between parsing, the
          process will be terminated (as ``off'') and restarted with the
          new executable and arguments.

now       During first launch (when watchdog launches ``boot'' entries),
          this entry is skipped and the configuration file will be
          rewritten to change ``now'' to ``off''. If an entry is ``now''
          when reparsing the configuration file, it will behave like
          ``respawn''.

The path args field is the command line to execute. path should be fully
specified; args are any additional arguments. If path does not exist, or
is not an executable file, the entry will be ignored (and logged).

The following is a simple config file:

        #
        # /etc/watchdog.conf
        #
        sambadmin:respawn:/usr/sbin/sambadmind -d         # SMB Admin daemon
        PSM:respawn:/usr/sbin/PrintServiceMonitor -x      # Server Printing
        ssd:respawn:/usr/sbin/serversettings -x           # Server Settings daemon
        mail:off:/usr/sbin/MailService -n                 # Mail service
        mm:off:/usr/sbin/MacintoshManagementServer -x     # Macintosh Manager
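To make the entry format concrete, here is a minimal Python sketch of a parser for the id:action:path args syntax described in this section. It is not watchdog's own parser. The function name parse_watchdog_conf is hypothetical, and several details are assumptions: only leading and trailing whitespace is discarded, the line is split on the first two colons so that the command may itself contain colons, and a repeated id is treated as invalid. Unlike watchdog, the sketch does not check that path exists and is executable.

```python
VALID_ACTIONS = {"off", "boot", "bootonce", "bootwait", "respawn", "now"}

def parse_watchdog_conf(text):
    """Parse configuration text (hypothetical helper).

    Returns (entries, rejected): entries maps id -> (action, command);
    rejected lists (line_number, raw_line) for invalid entries.
    """
    entries = {}
    rejected = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        # Drop everything after a pound character, then surrounding space.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        # Assumption: split on the first two colons only.
        fields = [f.strip() for f in line.split(":", 2)]
        if len(fields) != 3:
            rejected.append((lineno, raw))
            continue
        ident, action, command = fields
        # Assumption: a duplicate id counts as an invalid entry.
        if (not ident or not command
                or action not in VALID_ACTIONS or ident in entries):
            rejected.append((lineno, raw))
            continue
        entries[ident] = (action, command)
    return entries, rejected
```

Applied to the sample file above, this sketch returns five entries and no rejected lines; for instance, the mail entry maps to the action off and the command /usr/sbin/MailService -n.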
KNOWN ISSUES
Who watches the watchmen?

If an entry originally defined with action ``bootonce'' is commented out
(instead of being changed to ``off''), it will not be terminated.
FILES & FOLDERS
/usr/sbin/watchdog
/etc/watchdog.conf
/etc/hostconfig
/Library/Logs/watchdog.event.log
/System/Library/StartupItems/Watchdog
SEE ALSO
serversettingsd(8)

Mac OS X Server                 30 August 2001                 Mac OS X Server
Mac OS X 10.3 Server - Generated Sat Jun 14 10:27:06 CDT 2008