About Boot Automation Files

Lists the boot automation files provided by Cray and discusses when customization is needed.

The default boot behavior for Cray systems without direct-attached Lustre (DAL) nodes is to boot the boot and SDB nodes first, then boot all other service nodes and all compute nodes at the same time, thereby decreasing overall boot time. Systems with DAL must boot the compute nodes after the service nodes.
  • Default for systems without DAL:
    1. Boot + SDB (if the SDB image is small enough to PXE boot)
    2. SDB (if the SDB image is too large to PXE boot)
    3. Service + Compute
  • Default for systems with DAL:
    1. Boot + SDB (if the SDB image is small enough to PXE boot)
    2. SDB (if the SDB image is too large to PXE boot)
    3. Service
    4. Compute
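The default ordering above can be summarized in a short sketch. This is illustrative only and is not part of any Cray tool: the function name `boot_phases` and its two parameters are hypothetical, and the sketch assumes that when the SDB image is too large to PXE boot, the boot node boots alone in the first phase and the SDB node follows in its own phase.

```python
def boot_phases(has_dal: bool, sdb_fits_pxe: bool) -> list:
    """Return the default boot phases in order.

    Each phase is a list of node types that boot at the same time.
    Hypothetical helper for illustration; not a Cray API.
    """
    # Boot and SDB nodes always come first. They share a phase only when
    # the SDB image is small enough to PXE boot (assumed reading of the list).
    if sdb_fits_pxe:
        phases = [["boot", "sdb"]]
    else:
        phases = [["boot"], ["sdb"]]

    if has_dal:
        # With DAL, compute nodes must wait until the service nodes are up.
        phases.append(["service"])
        phases.append(["compute"])
    else:
        # Without DAL, service and compute nodes boot together to save time.
        phases.append(["service", "compute"])
    return phases


if __name__ == "__main__":
    for number, phase in enumerate(boot_phases(has_dal=True, sdb_fits_pxe=False), 1):
        print(f"{number}. {' + '.join(phase)}")
```
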
Cray provides the following boot automation files with this release.
auto.generic
Used to boot the entire XC system.
auto.xtshutdown
Used to shut down the entire XC system.
auto.bootnode
Used to boot only the boot node(s).
auto.bootnode+sdb
Used to boot only the boot node(s) and SDB node(s).

During a fresh install, sites typically copy auto.generic, rename it with the host name of the system for which it will be used (auto.hostname.start), and customize it for that site and system. Likewise, sites typically copy auto.xtshutdown, rename it with the host name of the system for which it will be used (auto.hostname.stop), and customize it, as needed. The host name is included because different systems may have different software installed, resulting in different boot or shutdown requirements. For example, on a system with a workload manager (WLM) installed, extra commands may be needed in the auto.hostname.stop file to cleanly stop the WLM queues on SDB or MOM nodes before shutting down the nodes.
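The copy-and-rename step described above might be scripted as in the following sketch. It assumes the Cray-provided files sit in a single directory that the caller passes in; that directory, the function name `make_site_files`, and the host name in the usage example are hypothetical. The sketch only copies the files; site-specific customization is still done by hand afterward.

```python
import shutil
from pathlib import Path


def make_site_files(auto_dir: Path, hostname: str) -> dict:
    """Copy the generic automation files to host-specific names.

    auto.generic    -> auto.<hostname>.start
    auto.xtshutdown -> auto.<hostname>.stop

    Hypothetical helper; auto_dir is wherever the site keeps these files.
    """
    mapping = {
        "auto.generic": f"auto.{hostname}.start",
        "auto.xtshutdown": f"auto.{hostname}.stop",
    }

    # Check everything first so a failure does not leave a partial result.
    for template, site_name in mapping.items():
        if not (auto_dir / template).is_file():
            raise FileNotFoundError(f"missing template: {auto_dir / template}")
        if (auto_dir / site_name).exists():
            # Never overwrite a file that a site may already have customized.
            raise FileExistsError(f"already exists: {auto_dir / site_name}")

    created = {}
    for template, site_name in mapping.items():
        destination = auto_dir / site_name
        shutil.copy2(auto_dir / template, destination)  # keeps timestamps and mode
        created[template] = destination
    return created
```

For example, `make_site_files(Path("/path/to/automation/files"), "panda")` would produce auto.panda.start and auto.panda.stop in that directory, where both the path and the host name are placeholders.
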

When is customization of an automation file needed?
  • For systems booting tmpfs images (instead of netroot) with no SDB node failover, changes may not be necessary.
  • For systems with boot or SDB node failover, instructions for adding or enabling commands are provided at the appropriate place in the fresh install and update processes.
  • For systems booting netroot images, instructions for making netroot-related changes after the first boot with tmpfs are provided at the appropriate place in the fresh install process.
  • For systems booting direct-attached Lustre (DAL) images, instructions for making DAL-related changes are provided at the appropriate place in the fresh install process.
  • For systems with added content in the recipe used for SDB nodes, if the resulting custom recipe produces a boot image too large for a PXE boot, changes to the boot automation file are necessary. If based on auto.generic, the system boot automation file will have an option (commented out by default) to boot the boot node via PXE boot and then boot the SDB node via the HSN.
  • For systems with a workload manager (WLM) installed, WLM-related changes may be needed. Specific commands to add will vary based on the WLM.
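As a quick way to apply the checklist above, the conditions can be encoded as a small sketch. The class `SystemTraits`, its field names, and the function `customization_reasons` are hypothetical and exist only for illustration; the actual commands to add or enable come from the fresh install and update procedures, not from this code.

```python
from dataclasses import dataclass


@dataclass
class SystemTraits:
    """Hypothetical description of a system; defaults model a tmpfs system."""
    netroot: bool = False               # False means tmpfs images
    boot_or_sdb_failover: bool = False
    dal: bool = False
    sdb_image_fits_pxe: bool = True
    wlm: bool = False


def customization_reasons(traits: SystemTraits) -> list:
    """Return the reasons the automation file is likely to need changes.

    An empty list corresponds to the case where no changes may be needed.
    """
    reasons = []
    if traits.boot_or_sdb_failover:
        reasons.append("boot or SDB node failover: add or enable failover commands")
    if traits.netroot:
        reasons.append("netroot images: make netroot changes after first tmpfs boot")
    if traits.dal:
        reasons.append("DAL: make DAL-related changes")
    if not traits.sdb_image_fits_pxe:
        reasons.append("SDB image too large for PXE: boot SDB node via the HSN")
    if traits.wlm:
        reasons.append("WLM installed: WLM-specific commands may be needed")
    return reasons


if __name__ == "__main__":
    for reason in customization_reasons(SystemTraits(dal=True, wlm=True)):
        print("-", reason)
```
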