Create New Alarm Definition

Procedure to create a new alarm definition with site local alarm fields.

Administrators can create new alarm definitions. For descriptions and examples of the types of changes that can be made to alarms, see the following files in the /etc/sma-data/etc directory on the View for ClusterStor™ server:

  • site-override.alarms.yaml.example
  • site-local-alarms.yml.j2.example

Creating site-specific local alarms involves copying the site-local-alarms.yml.j2.example file to a new file and then editing the copy. The fields used to define these new alarms are described in detail in Site Local Alarms Fields.

The following procedure explains how to create new alarms.

  1. Log in to View for ClusterStor as root.
  2. Copy the site-local-alarms.yml.j2.example file to a new file named site-local-alarms.yml.j2 in the same directory.
    hostname# cp /etc/sma-data/etc/site-local-alarms.yml.j2.example \
    /etc/sma-data/etc/site-local-alarms.yml.j2
  3. Edit the new file to define the desired new alarm.
    1. Open the file in an editor.
      hostname# vi /etc/sma-data/etc/site-local-alarms.yml.j2
      
    2. Remove or comment out any examples that are not needed.
    3. Modify the selected example to define the new alarm. Use the fields described in Site Local Alarms Fields.
    4. Save the modified file.
      The following example defines a new alarm named Sim_SMA_metric_health:
      - name: "Sim_SMA_metric_health"
        description: "Check for Incoming simulated metrics"
        expression: "last(cray_test.metric_health{device=MDT0000}) < 0"
        match_by:
          - "system_name"
        severity: "CRITICAL"
        undetermined_actions: *notification_list
        ok_actions: *notification_list
        cause: "n/a"
        solution: "n/a"
        undcause: "There is an interruption in the flow of metric data from the ClusterStor storage system."
        undsolution: "The management console of the ClusterStor storage cluster should be checked to see if it is operational and if there are alerts on its service console."
        ignoredund: False
        undalarm: True
  4. Create the new alarm(s).
    CAUTION: Removing alarms deletes all alarm history.
    hostname# cd /etc/sma-data/etc
    hostname# docker-compose run --rm alarms remove_all_alarms.sh
    hostname# docker-compose run --rm alarms /start.sh
    This step executes two Docker Compose commands:
    • The first removes all currently defined alarms.
    • The second sets up all Cray-defined alarms and all of the site local overrides.
New alarm definitions can be viewed in the Alarms Definitions Table, which is described in Cray Defined Alarms. It may take some time for new alarms to appear there.
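
Because step 4 removes every existing alarm before recreating them, it can be worth checking the edited file for obviously missing keys first. The following is a minimal sketch of such a check, not a tool shipped with View for ClusterStor. The function name missing_alarm_fields and the choice of name, description, expression, and severity as the keys to look for are assumptions taken from the example alarm above; Site Local Alarms Fields remains the authority on which fields are actually required.

```shell
#!/bin/sh
# Hypothetical helper, not part of View for ClusterStor.
# Prints each key from REQUIRED_FIELDS that never appears in the given file.
# The key list is an assumption based on the example alarm definition.
REQUIRED_FIELDS="name description expression severity"

missing_alarm_fields() {
    file=$1
    for field in $REQUIRED_FIELDS; do
        # Match "field:" at the start of a line, after optional indentation
        # and an optional YAML list dash (as in "- name:").
        # Commented-out lines (starting with "#") do not match.
        if ! grep -Eq "^[[:space:]]*(-[[:space:]]+)?${field}:" "$file"; then
            printf '%s\n' "$field"
        fi
    done
}
```

For example, after sourcing the function, running missing_alarm_fields /etc/sma-data/etc/site-local-alarms.yml.j2 prints nothing when all four keys are present somewhere in the file. The check is file-wide and purely textual: it does not parse the YAML or the template syntax, and it does not confirm that each individual alarm carries every key.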