Perform Parallel Operations on Compute Nodes
The parallel command tool (pcmd) executes the same command on a group of compute nodes in parallel, similar to pdsh. Although pcmd is launched from a service node, it acts on compute nodes. It lets administrators and, if the site permits, other users securely execute programs in parallel on compute nodes. The user can either specify the nodes on which to execute the command or specify an application ID (apid) to execute the command on all nodes associated with that apid.
An unprivileged user can target only nodes on which that user currently has an application running under aprun. The root user can target any compute node, whether or not jobs are running on it. In either case, if the aprun exits and the associated applications are killed, any commands launched by pcmd also exit.
By default, pcmd is installed as a root-only tool. Unprivileged users can run it only if it is installed as setuid root.
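To see which of the two installation modes is in effect on a given service node, the ownership and setuid bit of the pcmd binary can be inspected. The following is a minimal sketch, not a procedure from the pcmd documentation: the helper name is_setuid_root is hypothetical, and the script assumes that pcmd is already on the PATH (for example, after the nodehealth module is loaded). It relies only on the standard test, ls, and awk utilities.

```shell
#!/bin/sh
# is_setuid_root FILE
# Hypothetical helper: succeeds (exit status 0) only if FILE exists,
# has the setuid bit set, and is owned by UID 0 (root).
is_setuid_root() {
    file=$1
    # -u is true when the file exists and its setuid bit is set.
    [ -u "$file" ] || return 1
    # The third field of a numeric long listing is the owner's UID.
    # -L follows symbolic links so the real binary is examined.
    owner=$(ls -lnLd -- "$file" | awk '{print $3}')
    [ "$owner" = "0" ]
}

# Assumption: pcmd is found through PATH. If it is not, nothing is checked.
if pcmd_path=$(command -v pcmd); then
    if is_setuid_root "$pcmd_path"; then
        echo "$pcmd_path is setuid root: unprivileged users can run it"
    else
        echo "$pcmd_path is not setuid root: root-only"
    fi
else
    echo "pcmd not found in PATH"
fi
```

On a typical Linux system, root sets the setuid bit on a root-owned binary with chmod u+s. Whether that is the appropriate way to change the pcmd installation at a particular site is not stated here and should be confirmed against the site's installation procedure.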
The pcmd command is located in the nodehealth module. If the nodehealth module is not part of the default profile, load it by specifying:
module load nodehealth
For additional information, see the pcmd(1) man page.