HPE Ezmeral Data Fabric Monitoring (part
of the Spyglass initiative) provides the ability to collect, store, and view metrics and logs
for nodes, services, and jobs/applications.
Metric Monitoring
Administrators can monitor the current status of the cluster and anticipate future cluster
requirements with dashboards. For example, you can use metrics dashboards to visualize the following:
- Storage Utilization
- Use metrics dashboards to monitor storage trends. For example, you can compare the
volume of filesystem usage at different times to the filesystem capacity and then allocate
resources to the filesystem accordingly.
- Node Utilization
- Use metrics dashboards to check for node overload. For example, if the CPU usage is
high on a few nodes, you may want to distribute the load across more nodes for better
performance and efficiency.
- HPE Ezmeral Data Fabric Database Operational Trends
- Use metrics dashboards to display historical trends for HPE Ezmeral Data Fabric Database operations. For
example, if a user reports HPE Ezmeral Data Fabric Database slowness, the historical trends associated with row
scans, get, and put operations can be used to identify the node(s) on which the
performance degradation occurs.
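The checks described above (comparing filesystem usage to capacity, and spotting nodes with high CPU usage) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values, the node names, the 85% threshold, and the function names `storage_utilization` and `overloaded_nodes` are all hypothetical and are not part of any HPE Ezmeral Data Fabric API. In practice the numbers would come from whatever metrics store backs your dashboards.

```python
from statistics import mean

# Hypothetical samples: node name -> recent CPU usage percentages.
# Real values would be read from the metrics behind the dashboards.
cpu_samples = {
    "node-a": [91.0, 88.5, 95.2],
    "node-b": [34.1, 40.3, 37.8],
    "node-c": [87.9, 92.4, 90.1],
    "node-d": [22.0, 25.5, 19.7],
}


def storage_utilization(used_bytes, capacity_bytes):
    """Return used space as a percentage of total capacity."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return 100.0 * used_bytes / capacity_bytes


def overloaded_nodes(samples, threshold_pct=85.0):
    """Return the sorted names of nodes whose mean CPU usage exceeds the threshold.

    The default threshold is an arbitrary illustrative value, not a
    recommendation. Nodes with no samples are skipped.
    """
    return sorted(
        node
        for node, values in samples.items()
        if values and mean(values) > threshold_pct
    )


if __name__ == "__main__":
    print(storage_utilization(750, 1000))
    print(overloaded_nodes(cpu_samples))
```

Averaging over several samples rather than reacting to a single reading is a deliberate choice here: a brief spike on one node is usually less interesting than sustained load on a few nodes.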
Log Monitoring
Administrators can use dashboards to visualize, search, and review logs when
troubleshooting issues. For example, you can use log dashboards to troubleshoot the
following issues:
- Service Failures
- When metrics indicate that one or more services are down, use log dashboards to
check the logs for each failed service and drill down to each associated node.
- Application Failures
- When an application or job fails, use log dashboards to identify possible
bottlenecks. For example, you can search the logs for a given application ID across
all the nodes in the cluster.
- Filesystem Performance
- When users experience slowness with the filesystem or with NFS for the HPE Ezmeral Data Fabric,
use log dashboards to search the HPE Ezmeral Data Fabric filesystem logs for service errors or application issues.
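Searching the logs for a given application ID across all nodes, as described above, amounts to filtering log records by ID and grouping the matches by node. The following Python sketch illustrates that idea only. The record fields (`node`, `service`, `level`, `message`), the sample messages, the application ID format, and the function name `find_application_logs` are hypothetical assumptions and do not reflect the actual log schema or any HPE Ezmeral Data Fabric API.

```python
# Hypothetical log records; a real search would run against the
# indexed logs behind the log dashboards.
log_records = [
    {"node": "node-a", "service": "nodemanager", "level": "ERROR",
     "message": "Container failed for application_1700000000000_0042"},
    {"node": "node-b", "service": "nodemanager", "level": "INFO",
     "message": "Heartbeat sent"},
    {"node": "node-c", "service": "resourcemanager", "level": "WARN",
     "message": "application_1700000000000_0042 retried after timeout"},
    {"node": "node-a", "service": "nodemanager", "level": "INFO",
     "message": "Cleaning up application_1700000000000_0042"},
]


def find_application_logs(records, application_id):
    """Group the log messages that mention application_id by node name."""
    matches = {}
    for record in records:
        if application_id in record["message"]:
            matches.setdefault(record["node"], []).append(record["message"])
    return matches


if __name__ == "__main__":
    hits = find_application_logs(log_records, "application_1700000000000_0042")
    for node in sorted(hits):
        print(node, len(hits[node]))
```

Grouping by node makes it easier to see whether a failure is confined to one node or spread across the cluster.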