About Simple Sync
Provides an overview of Simple Sync, describes how it works, lists its characteristics, and provides use cases.
The Cray Simple Sync service (cray_simple_sync) provides a simple, generic mechanism for copying user-defined content to internal and external nodes in a Cray XC system. When executed, the service automatically copies files found in source directories in the config set to one or more target nodes. The Simple Sync service is enabled by default and has no additional configuration options. It can be enabled or disabled during the initial installation using worksheets or with the cfgset command at any time. For more information, see man cfgset(8).
With regard to external nodes like eLogin nodes, the exclusions specified in the cray_cfgset_exclude configuration service are applied when the CLE config set is transferred to the node, and some portions of the Simple Sync directory in the config set are excluded. The "Files Excluded from eLogin Nodes" section contains more details.
Simple Sync is a simple tool and not intended as the sole solution for making configuration changes to the system. Writing custom Ansible plays might provide better maintainability, flexibility, and scalability in the long term.
How Simple Sync Works
When enabled, the Simple Sync service is executed on all internal CLE nodes and eLogin nodes at boot time and whenever the administrator executes /etc/init.d/cray-ansible start on a CLE node or eLogin node. When Simple Sync is executed, files placed in the following directory structure are copied to the root file system (/) on the target nodes.
Run cfgset update p0 before /etc/init.d/cray-ansible start. Simple Sync requires this order so that files are passed on to the boot node before the final changes are moved to the service and compute nodes.
The Simple Sync directory structure has this root: smw:/var/opt/cray/imps/config/sets/<config_set>/files/simple_sync/
- ./common/files/
- Targets all nodes, both internal CLE nodes and eLogin nodes.
- ./hardwareid/<hardwareid>/files/
- Targets a specific node with that hardware ID, which is the cname of a CLE node or the output of the hostid command (e.g., 1eac0b0c) on other nodes. An admin must create both the <hardwareid> directory and the files directory.
- ./hostname/<hostname>/files/
- Used ONLY for eLogin nodes. Targets a node with the specified host name. An admin must create both the <hostname> directory and the files directory.
- ./nodegroups/<node_group_name>/files/
- Targets all nodes in the specified node group. The subdirectories under nodegroups are automatically stubbed out when the config set is updated after node groups are defined and configured in the cray_node_groups service.
- ./platform/[compute|service]/files/
- Targets all compute nodes or all service nodes, depending on whether the files are placed in platform/compute/files or platform/service/files. Each time the config set is updated, the HSS data store is queried to determine which nodes are service nodes and which are compute nodes.
- ./README
- Provides brief guidance on using Simple Sync and a list of existing node groups in the order in which files will be copied. This ordering enables an administrator to predict behavior in cases where a file may be duplicated within the Simple Sync directory structure.
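The layout described above can be sketched with a few shell commands. This is only an illustration that builds the tree under a scratch directory, not a real config set. The variable name SS_ROOT, the eLogin host name elogin1, and the node group my_nodes are assumptions made for the example. On an SMW, the real root is the simple_sync path given above, and the nodegroups directories are stubbed out by the config set update rather than created by hand.

```shell
# Scratch stand-in for .../sets/<config_set>/files/simple_sync/
# (assumption: a temporary directory, so the sketch can run anywhere).
SS_ROOT="$(mktemp -d)"

# Target for every node, internal CLE nodes and eLogin nodes alike.
mkdir -p "$SS_ROOT/common/files"

# Admin-created targets: one CLE node by cname, one eLogin node by host name.
# "elogin1" is a hypothetical host name used only for this example.
mkdir -p "$SS_ROOT/hardwareid/c0-0c0s0n0/files"
mkdir -p "$SS_ROOT/hostname/elogin1/files"

# Created here only for illustration; "my_nodes" is a hypothetical node group.
mkdir -p "$SS_ROOT/nodegroups/my_nodes/files"
mkdir -p "$SS_ROOT/platform/compute/files" "$SS_ROOT/platform/service/files"

# List every "files" directory, i.e. every place content can be dropped.
layout="$(cd "$SS_ROOT" && find . -type d -name files | sort)"
echo "$layout"

# Remove the scratch tree when finished: rm -rf "$SS_ROOT"
```

Anything placed below one of the listed files directories keeps its relative path when it is copied to / on the matching nodes.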
Simple Sync copies content into place prior to the standard Linux startup (systemd) and before cray-ansible runs any other services.
The ownership and permissions of copied directories and files are preserved when they are copied to root on the target nodes. An administrator can run cray-ansible multiple times, as needed, and only the files that have changed will be copied to the target nodes.
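Because permissions are mirrored, the mode wanted on the node has to be set on the file in the config set. The sketch below illustrates that idea with two scratch directories. It uses cp -p purely as an analogy for a permission-preserving copy and makes no claim about how Simple Sync performs the copy internally. The variable names are assumptions, and stat -c assumes GNU coreutils.

```shell
src="$(mktemp -d)"   # stands in for .../simple_sync/common/files
dst="$(mktemp -d)"   # stands in for / on a target node

# Create the file in the "config set" and give it the mode wanted on the node.
mkdir -p "$src/etc"
echo "setting=value" > "$src/etc/myfile"
chmod 600 "$src/etc/myfile"

# Analogy only: cp -p preserves the mode, as Simple Sync is described to do.
mkdir -p "$dst/etc"
cp -p "$src/etc/myfile" "$dst/etc/myfile"

# Compare the octal modes on both sides (GNU stat syntax).
src_mode="$(stat -c '%a' "$src/etc/myfile")"
dst_mode="$(stat -c '%a' "$dst/etc/myfile")"
echo "config set mode: $src_mode, on-node mode: $dst_mode"
```

Checking modes in the config set before updating it avoids distributing a file that is more readable than intended.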
Because of the way it works, Simple Sync can be used to configure services that have configuration parameters not currently supported by configuration templates and worksheets. An administrator can create a configuration file with the necessary settings and values, place it in the Simple Sync directory structure, and it will be distributed and applied to the target nodes.
Files Excluded from eLogin Nodes
The following portions of the Simple Sync directory in the config set are excluded when the CLE config set is transferred to an eLogin node:
- files/simple_sync/common/files/etc/ssh
- files/simple_sync/common/files/root/.ssh
Simple Sync and Configuration File Management
Configuration files on a node fall into one of three categories, depending on who manages them:
- Managed entirely by a site system administrator
Such config files are considered non-conflicting because there is no potential conflict between administrator-provided content and Cray-managed content.
- Managed entirely by Cray configuration services
Where possible, such config files have a comment at the top indicating that the file is completely under the management of the Cray service. Files that have been changed by Cray services can be identified by checking the change logs on the running node in /var/opt/cray/log/ansible. Simple Sync does not provide a mechanism to override changes made by Cray services. To override changes made by Cray services, refer to the documentation for the specific service.
- Jointly managed by a system administrator and by Cray config services
These config files can contain both administrator-managed content and Cray-managed content, so there is potential for conflict. Administrator changes to Cray-managed content can be overridden. Content that is not managed by Cray is considered non-conflicting because any admin changes to it will not conflict with changes made by Cray services.
Because Simple Sync copies administrator-provided files into place before cray-ansible runs, any Cray services that make small changes to jointly managed files will operate on the administrator-provided files. Afterwards, that file will contain both non-conflicting administrator-provided content as well as the changes made by the Cray service. Because these changes happen prior to Linux startup, the changes will be in place when the services start up.
Characteristics of Simple Sync
| Simple Sync is: | Simple Sync is NOT: |
|---|---|
| for simple and straightforward use cases | a comprehensive system management solution |
| for copying a moderate number of moderately sized files* | intended to transfer large objects or a large volume of files |
| | an interface to configure Cray "turnkey" services such as ALPS, Node Health, or Lightweight Log Manager (LLM) |
Simple Sync:
- runs as early in the Ansible execution sequence as possible (it runs BEFORE other cray-ansible plays, so it can be used to make changes to files that Cray updates, like sshd_config)
- runs during the netroot setup sequence, so it can be used to change LNet and DVS settings, if needed
- supports node groups for targeting which system nodes to copy files to (see About Node Groups)
Simple Sync does NOT support:
- removing files
- appending to files
- changing file ownership and permissions (the permissions of the file in the config set are mirrored on-node)
- backing up files
- overriding Cray-set values (it cannot be used to change files that Cray completely overwrites, such as alps.conf, or change values in files that Cray modifies such as PermitRootLogin in /etc/ssh/sshd_config)
Cautions about the Use of Simple Sync
- Simple Sync copies files from the config set. On nodes without a persistent root file system, the config set is cached locally in memory in compressed form, so each file stored in the config set uses some memory on the node. Therefore, using Simple Sync to copy binary files or large numbers of files is inadvisable.
- Be aware of differences in node environments when using Simple Sync. For example, systems configured with direct-attached Lustre (DAL) have nodes running CentOS instead of SLES. Administrators would have to be very careful to avoid putting an inappropriate configuration file into place when using the Simple Sync platform/service target in such a situation.
- Storage and distribution of verbatim config files through Simple Sync creates the potential for unintentional impact to the system when config files evolve due to software changes. Making minimal necessary changes through a site-local Ansible playbook provides more flexibility and minimizes the potential for unintended consequences.
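Given the memory caution above, it can help to check how much content sits in the Simple Sync tree before updating the config set. The following is a minimal sketch that runs against a scratch directory. SS_ROOT, the sample files, and the 1 MiB threshold are assumptions chosen for the example, not Cray-documented limits.

```shell
# Scratch stand-in for the simple_sync directory of a config set.
SS_ROOT="$(mktemp -d)"
mkdir -p "$SS_ROOT/common/files/etc"
printf 'a=1\n' > "$SS_ROOT/common/files/etc/myfile"
printf 'b=2\n' > "$SS_ROOT/common/files/etc/otherfile"

# Count the files and total their size in KB.
file_count="$(find "$SS_ROOT" -type f | wc -l | tr -d ' ')"
total_kb="$(du -sk "$SS_ROOT" | cut -f1)"
echo "files: $file_count, total size: ${total_kb} KB"

# Flag anything larger than 1 MiB (hypothetical threshold for this example).
large_files="$(find "$SS_ROOT" -type f -size +1024k)"
if [ -n "$large_files" ]; then
    echo "large files that may not belong in Simple Sync:"
    echo "$large_files"
fi
```

On an SMW, pointing SS_ROOT at the real simple_sync directory gives the same summary for the actual config set.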
Use Cases
Copy a non-conflicting file to all nodes
- Place etc/myfile under ./common/files/ in the Simple Sync directory structure.
- Simple Sync copies it to /etc/myfile on all nodes.
Copy a non-conflicting file to a service node
- Place etc/servicefile under ./platform/service/files/ in the Simple Sync directory structure.
- Simple Sync copies it to /etc/servicefile on all service nodes.
Copy a non-conflicting file to a compute node
- Place etc/computefile under ./platform/compute/files/ in the Simple Sync directory structure.
- Simple Sync copies it to /etc/computefile on all compute nodes.
Copy a non-conflicting file to a specific node
- Place etc/mynode under ./hardwareid/c0-0c0s0n0/files/ in the Simple Sync directory structure.
- Simple Sync copies it to /etc/mynode on c0-0c0s0n0.
Copy a non-conflicting file to a user-defined collection of nodes
- Create a node group called "my_nodes" containing a list of nodes.
- Update the config set.
smw# cfgset update p0
- Place etc/mynodes under ./nodegroups/my_nodes/files/ in the Simple Sync directory structure.
- Simple Sync copies it to /etc/mynodes on all nodes listed in node group my_nodes.
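The placement step in this use case can be sketched as follows. The sketch uses a scratch directory in place of the real Simple Sync root and creates the my_nodes directory itself, whereas on a real system that directory is stubbed out by cfgset update. SS_ROOT and the file content are assumptions for illustration.

```shell
# Scratch stand-in for the simple_sync directory of the config set.
SS_ROOT="$(mktemp -d)"
group_files="$SS_ROOT/nodegroups/my_nodes/files"
mkdir -p "$group_files"   # stubbed out by the config set update on a real system

# Recreate the on-node path (etc/mynodes) below the node group's files directory.
mkdir -p "$group_files/etc"
echo "setting=value" > "$group_files/etc/mynodes"

# Stripping the Simple Sync prefix shows where the file lands on each node.
src="$group_files/etc/mynodes"
dest="/${src#"$group_files/"}"
echo "$src -> $dest"
```

The same prefix-stripping rule applies to the common, hardwareid, hostname, and platform targets.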
Copy to a node a file that has Cray-maintained content
- Place a version of sshd_config (entire file) that includes “MaxAuthTries 3” under ./nodegroups/login_nodes_x86_64/files/etc/ssh/ and ./nodegroups/login_nodes_aarch64/files/etc/ssh/ in the Simple Sync directory structure.
- The booted system will contain both:
- “MaxAuthTries 3” (from the files copied by Simple Sync)
- “PasswordAuthentication yes” (from modification of file by Cray)
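The outcome above can be simulated in a scratch directory. In this sketch the administrator's file deliberately sets PasswordAuthentication to no (an assumption made for illustration) to show that the Cray-managed value wins while the administrator's other setting survives. The sed command is a hypothetical stand-in for the change made by the Cray service; it is not the actual cray-ansible play.

```shell
node_root="$(mktemp -d)"   # stands in for / on a login node
conf="$node_root/etc/ssh/sshd_config"
mkdir -p "$node_root/etc/ssh"

# Step 1: Simple Sync puts the administrator's complete file in place.
cat > "$conf" <<'EOF'
MaxAuthTries 3
PasswordAuthentication no
EOF

# Step 2: the Cray service then edits the setting it manages
# (hypothetical stand-in for the real play).
sed 's/^PasswordAuthentication .*/PasswordAuthentication yes/' "$conf" > "$conf.new"
mv "$conf.new" "$conf"

# The result holds the admin's MaxAuthTries and the Cray-set value.
cat "$conf"
```

This is also why Simple Sync cannot be used to override a value that a Cray service sets: the service's change is applied after the file is copied.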
Copy to a node a file that is exclusively maintained by Cray
Files exclusively maintained by Cray such as alps.conf cannot be updated using Simple Sync. Please refer to the owning service (such as ALPS) for information on how to update the contents.
Copy to a node a file that resides on a file system that will be mounted during Linux boot
No special operational changes are necessary. However, Simple Sync puts the file in place early in the boot sequence, and the file is then hidden when the file system is mounted over it. Because Simple Sync runs again later, it copies the file into the mounted file system. Due to this ordering of operations, the file is not present between the time the file system is mounted and the late execution of Ansible.
On netroot login nodes, modify an LNet modprobe parameter
- Generate a file my_lnet.conf containing the following line.
options lnet router_ping_timeout=100
- Place my_lnet.conf under ./nodegroups/login/files/etc/modprobe.d/ in the Simple Sync directory structure.
- The lnet router_ping_timeout value will be 100.
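The steps above can be sketched in shell as follows. The sketch writes the file into a scratch directory that stands in for the Simple Sync root. SS_ROOT is an assumption, and on a real system the login node group directory already exists in the config set.

```shell
# Scratch stand-in for the simple_sync directory of the config set.
SS_ROOT="$(mktemp -d)"
target_dir="$SS_ROOT/nodegroups/login/files/etc/modprobe.d"
mkdir -p "$target_dir"

# Generate my_lnet.conf with the single modprobe option from the use case.
printf 'options lnet router_ping_timeout=100\n' > "$target_dir/my_lnet.conf"

# Confirm the content that would be copied to /etc/modprobe.d/ on login nodes.
cat "$target_dir/my_lnet.conf"
```

Because Simple Sync runs during the netroot setup sequence, the file is in place before it is needed for LNet configuration.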
Copy a file with incompatible content to a node file that has Cray-maintained content
While Simple Sync allows an administrator to make changes to configuration files that are modified by Cray, be very careful to avoid introducing syntax errors or incompatible values that may cause the system to fail to operate correctly.