Dump and Reboot Nodes Automatically

The SMW daemon dumpd initiates automatic dump and reboot of nodes when requested by the Node Health Checker (NHC).

CAUTION: The dumpd daemon is invoked automatically by xtbootsys on system (or partition) boot. In most cases, system administrators do not need to use this daemon directly.

A system administrator can set global variables in the /etc/opt/cray/nodehealth/nodehealth.conf configuration file to control the interaction of NHC and dumpd. For more information about NHC and the nodehealth.conf configuration file, see Configure the Node Health Checker (NHC).

Variables can also be set in the /etc/opt/cray-xt-dumpd/dumpd.conf configuration file on the SMW to control how dumpd behaves on the system.

Each CLE release package also includes an example dumpd configuration file, /etc/opt/cray-xt-dumpd/dumpd.conf.example. The dumpd.conf.example file is a copy of the /etc/opt/cray-xt-dumpd/dumpd.conf file that is provided for an initial installation.

Important: The /etc/opt/cray-xt-dumpd/dumpd.conf file is not overwritten during a CLE upgrade if the file already exists. This preserves any site-specific modifications previously made to the file. Cray recommends comparing the site's /etc/opt/cray-xt-dumpd/dumpd.conf file with the /etc/opt/cray-xt-dumpd/dumpd.conf.example file provided with each release to identify any changes, and then updating the site's /etc/opt/cray-xt-dumpd/dumpd.conf file accordingly.
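
The comparison recommended above can be done with any diff tool. The following is a minimal Python sketch, not a Cray-provided utility: it prints a unified diff from the site's file to the example file. The two paths come from this section; the function name conf_diff is a hypothetical name chosen for illustration.

```python
import difflib
from pathlib import Path

# Paths taken from this section.
SITE_CONF = Path("/etc/opt/cray-xt-dumpd/dumpd.conf")
EXAMPLE_CONF = Path("/etc/opt/cray-xt-dumpd/dumpd.conf.example")


def conf_diff(site_path, example_path):
    """Return a unified diff (as a list of lines) from the site file to the example file.

    conf_diff is a hypothetical helper for illustration only.
    """
    site_lines = Path(site_path).read_text().splitlines(keepends=True)
    example_lines = Path(example_path).read_text().splitlines(keepends=True)
    return list(
        difflib.unified_diff(
            site_lines,
            example_lines,
            fromfile=str(site_path),
            tofile=str(example_path),
        )
    )


if __name__ == "__main__":
    # Only attempt the comparison when both files are present (that is, on an SMW).
    if SITE_CONF.exists() and EXAMPLE_CONF.exists():
        diff_lines = conf_diff(SITE_CONF, EXAMPLE_CONF)
        print("".join(diff_lines) if diff_lines else "No differences found.")
```

Lines prefixed with + appear only in the example file and are candidates for merging into the site's file. Lines prefixed with - appear only in the site's file and are typically site-specific modifications.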

If the /etc/opt/cray-xt-dumpd/dumpd.conf file does not exist, then the /etc/opt/cray-xt-dumpd/dumpd.conf.example file is copied to the /etc/opt/cray-xt-dumpd/dumpd.conf file.
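
The copy-only-if-missing behavior described above can be illustrated with a short Python sketch. This is not the installer's actual code, which this section does not show. The function name ensure_conf is hypothetical, and the paths are passed in as arguments so the sketch can be tried on scratch files.

```python
import shutil
from pathlib import Path


def ensure_conf(conf_path, example_path):
    """Copy example_path to conf_path only when conf_path does not already exist.

    Returns True if a copy was made and False if the existing file was kept.
    ensure_conf is a hypothetical helper for illustration only.
    """
    conf_path = Path(conf_path)
    if conf_path.exists():
        # An existing file is never overwritten, so site changes are preserved.
        return False
    shutil.copy2(example_path, conf_path)
    return True
```

Under this assumption, calling ensure_conf("/etc/opt/cray-xt-dumpd/dumpd.conf", "/etc/opt/cray-xt-dumpd/dumpd.conf.example") would leave an existing dumpd.conf untouched and would create one from the example file only when none is present.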

The CLE installation and upgrade processes automatically install the dumpd software, but dumpd must be explicitly enabled.