Set the Migration Threshold for a Resource

This procedure lists resource names and status, sets the migration threshold for a resource, and verifies that the threshold was set as intended.

The set_migration_threshold command sets the migration threshold for a resource in an SMW HA cluster. A migration threshold is defined as the maximum number of failures (the failcount) allowed for the resource. If the failcount exceeds this threshold, a failover occurs and management of all cluster resources migrates to the other SMW, making it the active SMW. By default, the migration threshold is 1,000,000.
Important: Cray recommends that you either leave migration thresholds at the default values or set them to a very high value until you have experience with SMW HA operation. Migration threshold settings that are too low could cause the resource to be ineligible to run if the failcount exceeds that value on both SMWs. If lower settings are used, Cray recommends that you monitor failcounts regularly for trends and clear the failcount values as appropriate. Otherwise, transient errors over time could push failcount values beyond the migration threshold, which could lead to one of the following scenarios:
  • Failovers could be triggered by a transient error condition that might otherwise have been handled by a less disruptive mechanism.
  • Failovers might not be possible because both SMWs have exceeded the migration threshold.
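
The two scenarios above can be illustrated with a small model. This is only a sketch of the rule as stated in this section (a resource becomes ineligible on an SMW once its failcount there exceeds the migration threshold). It does not query a cluster, the function names can_run and placement are hypothetical, and real placement decisions in the cluster software depend on additional factors.

```shell
# Illustrative model only; it does not contact the cluster.
# can_run FAILCOUNT THRESHOLD
# Succeeds if the failcount has not exceeded the threshold.
can_run() {
  [ "$1" -le "$2" ]
}

# placement ACTIVE_FAILCOUNT PASSIVE_FAILCOUNT THRESHOLD
# Prints where the resource could run under the rule above.
placement() {
  if can_run "$1" "$3"; then
    echo "stays on active SMW"
  elif can_run "$2" "$3"; then
    echo "fails over to passive SMW"
  else
    echo "cannot run on either SMW"
  fi
}

placement 2 0 3   # prints: stays on active SMW
placement 4 0 3   # prints: fails over to passive SMW
placement 4 5 3   # prints: cannot run on either SMW
```

With a threshold of 3, a handful of transient errors is enough to reach the second or third case, which is why a very high threshold is the safer starting point.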

Execute these commands as root on either SMW.

  1. Determine the resource name.
    To display a list of resource names and the status of those resources, use the crm_resource command.
    smw1# crm_resource -l
    
    
  2. Use the set_migration_threshold command to change the migration threshold for a resource.

    Replace resource with the name of the resource and value with an integer from 0 through 1000000.

    smw1# set_migration_threshold resource value
    
    
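  3. Verify the change.
    Use the show_migration_threshold command to confirm that the threshold for the resource matches the value you set.
    smw1# show_migration_threshold resource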
    
    

    For more information, see the set_migration_threshold(8) man page.
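
Steps 2 and 3 could be combined in a small wrapper that rejects out-of-range values before anything is changed. The following is a minimal sketch, not part of the SMW HA software: the function names valid_threshold and set_threshold_checked are hypothetical, and the sketch assumes that set_migration_threshold and show_migration_threshold take their arguments exactly as shown in the steps above.

```shell
# Hypothetical helper: succeeds only if $1 is an integer from 0 through 1000000.
valid_threshold() {
  case "$1" in
    '' | *[!0-9]*) return 1 ;;   # reject empty, negative, or non-numeric input
  esac
  # Check the length first so very long digit strings never reach the numeric test.
  [ "${#1}" -le 7 ] && [ "$1" -le 1000000 ]
}

# Hypothetical wrapper: validate the value, set the threshold, then display it.
# Run as root on either SMW.
set_threshold_checked() {
  resource=$1
  value=$2
  if ! valid_threshold "$value"; then
    echo "error: value must be an integer from 0 through 1000000" >&2
    return 2
  fi
  set_migration_threshold "$resource" "$value" || return 1
  show_migration_threshold "$resource"
}
```

After the functions are defined in a root shell, the wrapper would be called as set_threshold_checked resource value. It returns 2 without touching the cluster if the value is out of range, and 1 if set_migration_threshold itself fails.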