NXD Cache State on Lustre Filesystem
About NXD caching state on a mounted Lustre file system
When the Lustre file system is mounted, the NXD cache state may be enabled or disabled on all the Object Storage Targets (OSTs). There is no option to change the NXD cache state on selected OSTs only.
Use the cscli nxd list command to determine whether NXD caching on the OSTs is enabled or disabled. In the following example, the caching state is enabled on all targets:

root@cls12345n000$ cscli nxd list
--------------------------------------------------------------------------------------------------
Host          Cache        Caching  Total       Cache       Cache     Cache        Bypass
              Group        State    Cache       Size        Block     Window       IO Size
                                    Size        In Use      Size      Size
--------------------------------------------------------------------------------------------------
cls12345n004  nxd_cache_0  enabled  1.406 TB    699.875 MB  8(4 KiB)  128(64 KiB)  2048(1 MiB)
cls12345n005  nxd_cache_1  enabled  1.406 TB    745.500 MB  8(4 KiB)  128(64 KiB)  2048(1 MiB)
--------------------------------------------------------------------------------------------------
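Because the listing has a fixed column layout, the state can be checked mechanically. The sketch below parses captured cscli nxd list output (here, the sample rows shown above are embedded directly) and reports any target whose caching state is not enabled; the field positions are an assumption based on that sample layout:

```shell
# Check every target row for a caching state of "enabled".
# In practice, capture live output instead: sample=$(cscli nxd list)
sample='cls12345n004  nxd_cache_0  enabled  1.406 TB  699.875 MB
cls12345n005  nxd_cache_1  enabled  1.406 TB  745.500 MB'

# In the sample layout above, field 3 is the caching state.
not_enabled=$(printf '%s\n' "$sample" | awk '$3 != "enabled" {print $1}')

if [ -z "$not_enabled" ]; then
    echo "NXD caching is enabled on all targets"
else
    echo "NXD caching is not enabled on: $not_enabled"
fi
```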
Use the cscli nxd enable and cscli nxd disable commands to change the NXD cache state; the Lustre file system must be mounted when either command is run. Note also that NXD caching may be enabled or disabled even if:
- An OSS node is down
- One or more OSTs of an OSS node are down
If an OST goes down while NXD caching is enabled, dirty data may remain in the NXD cache.
Considerations When Disabling NXD Caching
After running the cscli nxd disable command, the output of the cscli nxd list command may continue to show the caching state of one or more OSTs as disabling for an extended period. This is because background processes need time to flush outstanding dirty data from the NXD cache device (SSD) to GridRAID.
On a healthy system, it typically takes less than 40 minutes for a 1.4 TB cache device (90% of it dirty data) to flush completely to GridRAID. If the process takes longer, monitor the flushing progress for the OST by running the cscli nxd list -a command repeatedly until the process completes. The -a option displays detailed NXD configuration information and statistics, including the Dirty CWs and Cache Blocks Flushed fields.
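To wait for the disable operation to complete, a simple polling loop over cscli nxd list works; the one-minute interval and the grep pattern on the disabling state are assumptions to adjust for your site:

```shell
# Poll until no target reports a caching state of "disabling".
while cscli nxd list 2>/dev/null | grep -q disabling; do
    sleep 60    # re-check every minute
done
echo "No targets remain in the disabling state"
```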
| Dirty CWs Field Value | Cache Blocks Flushed Field Value | Interpretation and Recommended Action |
|---|---|---|
| Not changing | Not changing | The NXD cache device is up and running, but the NXD virtual drive (GridRAID) is faulty or down. Run cat /proc/mdstat on the affected OSS node to check the GridRAID state. |
| Decreasing | Increasing | Flushing is progressing; wait until the caching state on the affected OSTs is completely disabled. The flushing speed might be reduced for several reasons: 1) GridRAID is busy serving large bypassed IOs, 2) a RAID check is running in the background, or 3) the GridRAID array is degraded due to one or two rotational drive failures. |
| Increasing | Increasing | Continuous large overlapping IO may be arriving at the affected OSTs. We recommend stopping client IO until flushing is complete. New IOs that do not overlap bypass the NXD cache, but for overlapping IOs the only solution is to stop client IO so the NXD disable process can complete. |
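The decision rules in the table above can be sketched as a small shell helper that compares two successive snapshots of the Dirty CWs and Cache Blocks Flushed counters. The classify function name is hypothetical, and extracting the counters from cscli nxd list -a output is left out, since the exact field layout is not shown here:

```shell
# classify: interpret two snapshots of the flush counters.
# $1 $2: Dirty CWs (previous, current); $3 $4: Cache Blocks Flushed (previous, current)
classify() {
    if [ "$2" -lt "$1" ] && [ "$4" -gt "$3" ]; then
        echo "flushing: wait until the caching state shows disabled"
    elif [ "$2" -gt "$1" ] && [ "$4" -gt "$3" ]; then
        echo "overlapping IO: stop client IO until flushing completes"
    elif [ "$2" -eq "$1" ] && [ "$4" -eq "$3" ]; then
        echo "stalled: check GridRAID with cat /proc/mdstat on the OSS node"
    else
        echo "unclassified: recheck the counters"
    fi
}

# Example: Dirty CWs fell 1000 -> 900 while Cache Blocks Flushed rose 500 -> 600.
classify 1000 900 500 600
```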