Configure Service Node MAMU
Introduces the service node MAMU feature and its limitations.
The service node MAMU feature enables administrators to set aside a small number of repurposed compute nodes for serial workloads. These nodes serve as workload management execution nodes specifically designated for serial workloads and intra-node message passing interface (MPI) jobs. The workload manager (WLM) manages them as standard Linux nodes and supports core-level placement. Users place jobs directly on these nodes by submitting them to a designated WLM queue. These serial workload nodes do not run ALPS, so they cannot be used to place jobs on the compute nodes. Service Node MAMU also provides cgroup-based out-of-memory (OOM) protection for the serial workload nodes.
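As an illustration of submitting a job to a designated WLM queue, the following minimal Python sketch assembles a PBS Pro qsub command line for a single-node serial job. The queue name "serial", the script name "job.sh", and the helper build_serial_qsub are hypothetical and chosen only for this example; the actual queue name and resource limits depend on how the site configures the WLM. The sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_serial_qsub(script_path, queue="serial", ncpus=1, mem="1gb"):
    """Return a qsub argument list for a single-node serial job.

    The queue name is a site-specific assumption; "serial" is a placeholder.
    """
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    return [
        "qsub",
        "-q", queue,                                 # designated MAMU queue (assumed name)
        "-l", f"select=1:ncpus={ncpus}:mem={mem}",   # one chunk, so the job stays on one node
        script_path,
    ]


# Build the command and print it in shell-quoted form for review.
cmd = build_serial_qsub("job.sh", queue="serial", ncpus=2)
print(shlex.join(cmd))
```

Requesting a single chunk with select=1 keeps the job on one node, which matches the serial and intra-node MPI use case described above.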
MAMU nodes can be set up and configured during an initial SMW/CLE installation or update, or later on an installed system. After the MAMU nodes are configured on the XC system, a WLM must be set up to use them.
Limitations
- This feature is supported only with the following WLMs: PBS Pro (version 12.1 or later).
- A system can have up to 100 serial workload nodes (having more may be possible but has not been tested).
- The only way to increase or decrease the pool of MAMU nodes is to reconfigure it and reboot the system.
- Cray-specific options, such as node health checking, are supported only to the extent that the workload manager vendor supports these features.
- To guarantee out-of-memory protection, the serial workload nodes do not support ssh login, except for root and crayadm administrative users.
- Sites using power capping may need to take additional action when repurposing nodes. For more information, see XC™ Series Power Management and SEDC Administration Guide (S-0043).