About Node Groups
Provides an overview of node groups and lists some characteristics.
The Cray Node Groups service (cray_node_groups) enables administrators to define and manage logical groupings of system nodes. Nodes can be grouped arbitrarily, though typically they are grouped by software functionality or hardware characteristics, such as login, compute, service, DVS servers, and RSIP servers.
Node groups that have been defined in a config set can be referenced by name within all CLE services in that config set, thereby eliminating the need to specify groups of nodes (often the same ones) for each service individually and greatly streamlining service configuration. Node groups are used in many Cray-provided Ansible configuration playbooks and roles and can also be used in site-local Ansible plays. Node groups are similar to but more powerful than the class specialization feature of releases prior to CLE 6.0. For example, a node can be a member of more than one node group but could belong to only one class.

Node groups can be viewed and modified in any of the following ways:
- Edit and upload the node groups configuration worksheet (cray_node_groups_worksheet.yaml); a sketch of a worksheet entry follows this list.
- Use the cfgset command to view and modify node groups interactively with the configurator.
- Use the cfgset get and cfgset modify CLI commands to view and modify node groups at the command line. Note that CLI modifications must be followed by a config set update.
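For the worksheet method, a node group definition might look like the following. This is a minimal sketch: the flattened key layout, the site_dvs_nodes group name, and the member cnames are illustrative assumptions, not the exact contents of cray_node_groups_worksheet.yaml on any particular release.

```yaml
# Hypothetical worksheet excerpt defining a custom node group.
# Key layout, group name, and cnames are illustrative assumptions.
cray_node_groups.settings.groups.data.site_dvs_nodes.members:
  - c0-0c0s1n1    # cname of a CLE service node acting as a DVS server
  - c0-0c0s1n2
```
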
Characteristics of Node Groups
- Node group membership is not exclusive; that is, a node may be a member of more than one node group.
- Node group membership is specified as a list of nodes:
  - use cname for a CLE node
  - use host ID (the output of the hostid command) for the SMW
  - use host name for an eLogin node
- All compute nodes, all service nodes, or both can be added as node group members by including the keyword “platform:compute”, “platform:service”, or both in a node group.
- Any CLE configuration service is able to reference any defined node group by name.
- The Configuration Management Framework (CMF) exposes node group membership of the current node through the local system "facts" provided by the Ansible runtime environment. This means that each node knows what node groups it belongs to, and that knowledge can be used in Cray and site-local Ansible playbooks.
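As a sketch of how that knowledge might be used, the following site-local play runs a task only on nodes that belong to the login_nodes_x86_64 group. The node_groups fact name is an assumption about what the CMF exposes; verify the actual variable name in the facts available on a running node before relying on it.

```yaml
# Hypothetical site-local play. 'node_groups' is an assumed fact name for the
# list of groups the current node belongs to; confirm it on a running node.
- hosts: all
  tasks:
    - name: Apply a site-specific tweak on x86-64 login nodes only
      command: /bin/true
      when: "'login_nodes_x86_64' in node_groups"
```
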
Pre-populated Node Groups
Cray provides several pre-populated node groups that:
- are likely to be customized and used by many sites
- support useful default values for many of the configuration services
Several of the pre-populated node groups require customization by a site to provide the appropriate node membership information. This table lists the pre-populated groups and indicates which ones require site customization.
Note that beginning with CLE 6.0.UP06, Cray no longer supports a single node group for all login nodes. Instead, there are two architecture-specific login node groups: one for all login nodes with the x86-64 architecture and one for all login nodes with the AArch64 architecture. To specify all login nodes in the system, use both of those node groups.
| Pre-populated Node Group | Requires Customization? | Notes |
|---|---|---|
| compute_nodes | No | Contains all compute nodes in the given partition. The list of nodes is determined at runtime. |
| compute_nodes_x86_64 | No | Contains all x86-64 compute nodes in the given partition. The list of nodes is determined at runtime. |
| compute_nodes_aarch64 | No | Contains all AArch64 compute nodes in the given partition. The list of nodes is determined at runtime. |
| service_nodes | No | Contains all service nodes in the given partition. The list of nodes is determined at runtime. |
| service_nodes_x86_64 | No | Contains all x86-64 service nodes in the given partition. The list of nodes is determined at runtime. |
| service_nodes_aarch64 | No | Contains all AArch64 service nodes in the given partition. The list of nodes is determined at runtime. |
| smw_nodes | Yes | Add the host ID (output of the hostid command) of the SMW. For an SMW HA system, add the host ID of the second SMW also. |
| boot_nodes | Yes | Add the cname of the boot node. If there is a failover boot node, add its cname also. |
| sdb_nodes | Yes | Add the cname of the SDB node. If there is a failover SDB node, add its cname also. |
| login_nodes_x86_64 | Yes | Add the cnames of all x86-64 internal login nodes on the system. |
| login_nodes_aarch64 | Yes | Add the cnames of all AArch64 internal login nodes on the system. Leave empty (set to []) if there are none. |
| elogin_nodes | Yes | Add the host names of external login nodes on the system. Leave empty (set to []) if there are no eLogin nodes. |
| all_nodes | Maybe | Contains all compute nodes and service nodes on the system. Add external nodes (e.g., eLogin nodes), if needed. |
| all_nodes_x86_64 | No | Contains all x86-64 nodes in the given partition. The list of nodes is determined at runtime. |
| all_nodes_aarch64 | No | Contains all AArch64 nodes in the given partition. The list of nodes is determined at runtime. |
| tier2_nodes | Yes | Add the cnames of nodes that will be used as tier2 servers in the cray_scalable_services configuration. |
Why is there no "tier1_nodes" pre-populated node group? Cray provides a pre-populated tier2_nodes node group to support defaults in the cray_simple_shares service. Cray does not provide a tier1_nodes node group because no default data in any service requires it. Because it is likely that tier1 nodes will consist of only the boot node and the SDB node, for which node groups already exist, Cray recommends using those groups to populate the cray_scalable_services tier1_groups setting rather than defining a tier1_nodes group.
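For example, the existing groups can be reused to populate tier1_groups directly. The following is only a sketch: the flattened key path is an assumption patterned after other worksheet settings and may not match the actual cray_scalable_services worksheet on a given release.

```yaml
# Hypothetical worksheet excerpt: reuse the boot_nodes and sdb_nodes groups
# for tier1 instead of defining a new tier1_nodes group.
# The key path shown is an assumption; check the worksheet shipped with the release.
cray_scalable_services.settings.scalable_service.data.tier1_groups:
  - boot_nodes
  - sdb_nodes
```
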
About eLogin nodes. To add eLogin nodes to a node group, use their host names instead of cnames, because unlike CLE nodes, eLogin nodes do not have cname identifiers. If eLogin nodes are intended to receive configuration settings associated with the all_nodes group, add them to that group, or change the relevant settings in other configuration services to include both all_nodes and elogin_nodes.
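For instance, appending eLogin nodes to all_nodes might look like the following sketch, in which the elogin1 and elogin2 host names and the key layout are illustrative only:

```yaml
# Hypothetical all_nodes membership: platform keywords cover the CLE nodes,
# and eLogin host names (illustrative) are appended explicitly.
cray_node_groups.settings.groups.data.all_nodes.members:
  - platform:compute
  - platform:service
  - elogin1
  - elogin2
```
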
Additional Platform Keywords
Basic keywords select all compute nodes or all service nodes in the partition:
- platform:compute
- platform:service

Architecture-specific keywords select only the nodes of a given processor architecture (X86 for x86-64, ARM for AArch64):
- platform:compute-X86
- platform:service-X86
- platform:compute-ARM
- platform:service-ARM
Disabled nodes. All platform keywords, such as platform:compute, platform:service-ARM, and platform:compute-HW12, include nodes that have been disabled. To identify disabled nodes, use this keyword: platform:disabled
Negated keywords. To exclude the nodes matched by a keyword, prefix it with ~ (the tilde symbol). For example, a custom node group that contains all enabled compute and service nodes would have the following list as its members. The ordering of the list does not matter: all non-negated keywords are resolved first, then negated ones are removed.
- platform:compute
- platform:service
- ~platform:disabled
Processor-specific keywords. To select all compute or service nodes with a particular processor type and core count, use keywords of the following forms, where XX is the processor code and ## is the number of cores (for example, platform:compute-HW12):
- platform:compute-XX##
- platform:service-XX##

A sketch that combines these keywords appears after the table below.
The processor and core codes of the nodes in a system appear in the Core column of xtcli status output on the SMW:

```
smw# xtcli status p0
Network topology: class 0
Network type: Aries
          Nodeid: Service  Core Arch|  Comp state   [Flags]
-----------------------------------------------------
      c0-0c0s0n0:  service  BW18 X86|       ready   [noflags|]
      c0-0c0s0n1:  service  BW18 X86|       ready   [noflags|]
      c0-0c0s0n2:  service  BW18 X86|       ready   [noflags|]
      c0-0c0s0n3:  service  BW18 X86|       ready   [noflags|]
      c0-0c0s1n0:  service  BW18 X86|       ready   [noflags|]
      c0-0c0s1n1:  service  BW18 X86|       ready   [noflags|]
      c0-0c0s1n2:  service  BW18 X86|       ready   [noflags|]
      c0-0c0s1n3:  service  BW18 X86|       ready   [noflags|]
      c0-0c0s2n0:        -  HW12 X86|       ready   [noflags|]
      c0-0c0s2n1:        -  HW12 X86|       ready   [noflags|]
      c0-0c0s2n2:        -  HW12 X86|       ready   [noflags|]
      c0-0c0s2n3:        -  HW12 X86|       ready   [noflags|]
```

The following table lists some of the common processor/core codes supported by Cray.
| Processor (XX) | Core (##) | Intel Code Name |
|---|---|---|
| BW | 12, 14, 16, 18, 20, 22, 24, 28, 32, 36, 40, 44 | "Broadwell" |
| HW | 04, 06, 08, 10, 12, 14, 16, 18, 20, 24, 28, 32, 36 | "Haswell" |
| IV | 02, 04, 06, 08, 10, 12, 16, 20, 24 | "Ivy Bridge" |
| KL | 60, 64, 66, 68, 72 | "Knights Landing" |
| SB | 04, 06, 08, 12, 16 | "Sandy Bridge" |
| SK | 04, 08, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56 | "Skylake" |
| CL | 18, 20, 24 | "Cascade Lake" |
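Combining the codes above with the platform keywords, a custom group of all enabled 12-core Haswell compute nodes might be defined as in the following sketch (group name and key layout are illustrative assumptions):

```yaml
# Hypothetical custom group: every HW12 (12-core "Haswell") compute node
# that has not been disabled. Group name and key layout are illustrative.
cray_node_groups.settings.groups.data.hw12_compute_nodes.members:
  - platform:compute-HW12
  - "~platform:disabled"   # the ~ excludes disabled nodes; quoted defensively
```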