Addressing Data Alarms

Lists the data alarms and the steps to mitigate each one.

When a disk fails, data on that disk becomes unavailable. As a result, you will probably see one of two data alarms, Data Unavailable or Data Under Replicated, along with a Disk Failure alarm.

If you see a Data Unavailable volume alarm in the cluster, run the /opt/mapr/server/fsck utility on all the offline storage pools. Perform the following steps on each node in the cluster that has raised a Disk Failure alarm:
  1. Run the following command to identify which storage pools are offline:

    [user@host] /opt/mapr/server/mrconfig sp list | grep Offline
  2. For each storage pool reported by the previous command, run the following command, where <sp> specifies the name of an offline storage pool:

    [user@host] /opt/mapr/server/fsck -n <sp> -r 
    When you run fsck with the -r option, it identifies corrupt blocks and removes them. If there are no corrupt blocks, fsck clears the error condition so you can bring the storage pool back online.
    Note: Using the /opt/mapr/server/fsck utility with the -r flag to repair a filesystem risks data loss. Call support before using /opt/mapr/server/fsck -r.
  3. Verify that all Data Unavailable volume alarms are cleared. If Data Unavailable volume alarms persist, contact support or post on answers.mapr.com.
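Steps 1 and 2 can be sketched as a small shell helper. This is a minimal illustration, not a supported tool. It assumes that each line of mrconfig sp list output contains a "name <sp>," field (for example "SP 1: name SP2, Offline, ..."); check the real output on your node before relying on it. The function names offline_sp_names and print_fsck_checks are hypothetical. The sketch only prints fsck commands in verification mode (without -r) and does not run them.

```shell
#!/bin/sh
# Extract the names of offline storage pools from `mrconfig sp list`
# output supplied on stdin.
# ASSUMPTION: each pool line has the form
#   SP 1: name SP2, Offline, size 476939 MB, free 0 MB, path /dev/sdc
offline_sp_names() {
    grep Offline | sed -n 's/.*name \([^,]*\),.*/\1/p'
}

# Read pool names on stdin and print (not run) the verification-mode
# fsck command for each one. The -r flag is deliberately left out.
print_fsck_checks() {
    while IFS= read -r sp; do
        if [ -n "$sp" ]; then
            printf '/opt/mapr/server/fsck -n %s\n' "$sp"
        fi
    done
}

# Example usage on a node (uncomment to use):
# /opt/mapr/server/mrconfig sp list | offline_sp_names | print_fsck_checks
```

Printing the commands rather than executing them keeps the decision to run fsck, and especially fsck -r, with the operator.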

If there are any Data Under Replicated volume alarms in the cluster, the cluster can repair the problem by automatically re-replicating the data onto another disk. Allow a reasonable amount of time for re-replication, then verify that the under-replication alarms are cleared.

Using the /opt/mapr/server/fsck utility with the -r option produces different results depending on the scenario. The fsck utility neither interprets the scenario nor offers a safe mode.

The most conservative approach is to first run fsck without the -r option (verification mode) and check the output. If the output is acceptable, run fsck again with the -r option.
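The verify-then-repair sequence might be wrapped as follows. This is a sketch under stated assumptions: guarded_repair is a hypothetical helper name, the FSCK variable is introduced here only so the path can be overridden for a dry run, and the script does not try to judge the verification output itself, because what counts as acceptable output is not specified on this page. The repair step runs only after the operator types yes.

```shell
#!/bin/sh
# Path to fsck as given in this document; overridable for a dry run.
FSCK="${FSCK:-/opt/mapr/server/fsck}"

# guarded_repair <sp>
# Runs fsck in verification mode first, then asks for explicit
# confirmation before running it again with -r.
guarded_repair() {
    sp="$1"

    # Step 1: verification mode (no -r), so nothing is removed.
    "$FSCK" -n "$sp"

    # Step 2: a person reviews the output printed above.
    printf 'Run repair (-r) on %s? Type yes to continue: ' "$sp" >&2
    read -r answer

    # Step 3: repair only on an explicit "yes".
    if [ "$answer" = "yes" ]; then
        "$FSCK" -n "$sp" -r
    else
        echo "Skipping repair of $sp" >&2
    fi
}

# Example usage (uncomment to use):
# guarded_repair SP2
```

Because fsck -r risks data loss, the confirmation prompt is a natural point to pause and call support before continuing.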
Note: Disk Failure node alarms that persist require disk replacement. If Data Under Replicated volume alarms persist, contact support or post on answers.mapr.com.