The memory Data Plugin

How much memory is being used?

The memory plugin collects information from /proc and /sys that is useful when assessing the memory performance of an application or job. The data is written in JSON dictionary format. The type of data reported is determined by the value of arg set for the memory plugin within the cray_rur service settings.

Important: The memory plugin does not consolidate information across the nodes of an application; it reports memory statistics separately for each node. Enabling it can therefore produce a large amount of RUR output data, even on systems of modest size.
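Because records arrive per node, a consumer that wants an application-wide view has to combine them itself. The following is a minimal sketch of one way to do that, not part of RUR: it assumes the records have already been decoded into (apid, payload) pairs, and the function name consolidate_by_apid is hypothetical.

```python
from collections import defaultdict


def consolidate_by_apid(records):
    """Sum per-node free-memory figures for each application.

    `records` is an iterable of (apid, payload) pairs, where payload is the
    decoded JSON dictionary from one memory plugin record (assumed input shape).
    """
    totals = defaultdict(
        lambda: {"nodes": 0, "current_freemem": 0, "boot_freemem": 0}
    )
    for apid, payload in records:
        entry = totals[apid]
        entry["nodes"] += 1
        # Missing keys are treated as zero rather than raising an error.
        entry["current_freemem"] += payload.get("current_freemem", 0)
        entry["boot_freemem"] += payload.get("boot_freemem", 0)
    return dict(totals)
```

Only the two free-memory counters are summed here; the other fields would need their own combination rule.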

If arg is not set (default), the plugin reports the following data:

%_of_boot_mem
Percentage of boot memory for each order chunk in /proc/buddyinfo, summed across all memory zones
Active(anon)
Total amount of memory in active use by the application
Active(file)
Total amount of memory in active use by cache and buffers
boot_freemem
Contents of /proc/boot_freemem
current_freemem
Contents of /proc/current_freemem
free
Number of hugepages that are not yet allocated
hugepages-sizekB
The hugepage size for the selected entries from /sys/kernel/mm/hugepages/hugepages-*kB/*
Inactive(anon)
Total amount of memory that is a candidate to be swapped out
Inactive(file)
Total amount of memory that is a candidate to be dropped from cache
nr
Number of hugepages that exist at this point
resv
Number of hugepages committed for allocation, but no allocation has occurred
Slab
Total amount of memory used by the kernel
surplus
Number of hugepages above nr
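The %_of_boot_mem entry is described above as a per-order figure summed across all memory zones in /proc/buddyinfo. The sketch below illustrates only that summation step on text in the standard /proc/buddyinfo layout. How the plugin turns the sums into percentages is not described here, so no conversion is attempted. The function names and the sample text are illustrative assumptions.

```python
def sum_buddyinfo_orders(text):
    """Return free-block counts per order, summed across all nodes and zones."""
    totals = []
    for line in text.splitlines():
        # Expected shape: "Node 0, zone   DMA   1   1   0 ..."
        parts = line.split()
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        counts = [int(p) for p in parts[4:]]
        if len(counts) > len(totals):
            totals.extend([0] * (len(counts) - len(totals)))
        for order, count in enumerate(counts):
            totals[order] += count
    return totals


def free_pages_per_order(totals):
    """Convert block counts to pages: a block of order k spans 2**k pages."""
    return [count << order for order, count in enumerate(totals)]


# Made-up sample in /proc/buddyinfo layout (three orders only, for brevity).
SAMPLE_BUDDYINFO = (
    "Node 0, zone      DMA      1      2      0\n"
    "Node 0, zone   Normal      3      0      1\n"
)
```

On a live node, the same function could be fed the contents of /proc/buddyinfo read as text.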

RUR default memory output

This example shows the default memory data as written to /var/opt/cray/log/partition-current/messages-date on the SMW:
2017-02-03T11:37:24.480982-05:00 c0-0c0s0n2 RUR 23710 p0-20140321t091957 [RUR@34] uid: 12345, apid: 33079, jobid: 0, cmdname: /bin/hostname, plugin: memory {"current_freemem": 21858372, "meminfo": {"Active(anon)": 35952, "Slab": 105824, "Inactive(anon)": 1104}, "hugepages-2048kB": {"nr": 5120, "surplus": 5120}, "%_of_boot_mem": ["67.23", "67.23", "67.23", "67.22", "67.21", "67.18", "67.11", "67.04", "66.94", "66.83", "66.77", "66.66", "66.53", "66.38", "65.87", "65.07", "63.05", "61.43"], "nid": "8", "cname": "c0-0c0s2n0", "boot_freemem": 32432628}
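A record like the one above can be split into its header fields and its JSON dictionary with a few lines of Python. This is a minimal sketch that assumes the layout shown in the example (a "[RUR@34]" tag, comma-separated "key: value" header fields with no commas in their values, then "plugin: memory" followed by the JSON). The function name is hypothetical, and the sample line is the example above with the %_of_boot_mem list left out for brevity.

```python
import json


def parse_rur_memory_line(line):
    """Split one RUR memory record into header fields and the JSON payload."""
    marker = "plugin: memory "
    head, sep, payload = line.partition(marker)
    if not sep:
        raise ValueError("not a memory plugin record")
    fields = {}
    # Header fields follow the "[RUR@34]" tag as "key: value" pairs.
    _, _, kv_text = head.partition("] ")
    for item in kv_text.split(","):
        key, colon, value = item.partition(":")
        if colon:
            fields[key.strip()] = value.strip()
    return fields, json.loads(payload)


# Abbreviated copy of the example record (the %_of_boot_mem list is omitted).
SAMPLE_LINE = (
    "2017-02-03T11:37:24.480982-05:00 c0-0c0s0n2 RUR 23710 "
    "p0-20140321t091957 [RUR@34] uid: 12345, apid: 33079, jobid: 0, "
    "cmdname: /bin/hostname, plugin: memory "
    '{"current_freemem": 21858372, "meminfo": {"Active(anon)": 35952, '
    '"Slab": 105824, "Inactive(anon)": 1104}, "hugepages-2048kB": '
    '{"nr": 5120, "surplus": 5120}, "nid": "8", "cname": "c0-0c0s2n0", '
    '"boot_freemem": 32432628}'
)

fields, payload = parse_rur_memory_line(SAMPLE_LINE)
# Fraction of boot-time free memory that is still free on this node.
free_fraction = payload["current_freemem"] / payload["boot_freemem"]
```

Header values are returned as strings; the JSON payload keeps the types it was written with.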

If arg is set to extended_buddy, the output relating to /proc/buddyinfo includes NUMA-node-granularity information in addition to the existing node-granularity information. This information is useful when troubleshooting certain fragmentation-related issues.

RUR extended memory output

This example shows extended memory data as written to /var/opt/cray/log/partition-current/messages-date on the SMW:
2017-02-03T11:37:24.480982-05:00 c0-0c0s0n2 RUR 23710 p0-20140321t091957 [RUR@34] uid: 12345, apid: 33079, jobid: 0, cmdname: /bin/hostname, plugin: memory {"current_freemem": 21858372, "meminfo": {"Active(anon)": 35952, "Slab": 105824, "Inactive(anon)": 1104}, "hugepages-2048kB": {"nr": 5120, "surplus": 5120}, "Node_0_zone_DMA": ["0.05", "0.05", "0.05", "0.05", "0.05", "0.05", "0.05", "0.05", "0.05", "0.04", "0.04", "0.03", "0.00", "0.00", "0.00", "0.00", "0.00", "0.00"],"%_of_boot_mem": ["67.23", "67.23", "67.23", "67.22", "67.21", "67.18", "67.11", "67.04", "66.94", "66.83", "66.77", "66.66", "66.53", "66.38", "65.87", "65.07", "63.05", "61.43"], "nid": "8", "cname": "c0-0c0s2n0", "boot_freemem": 32432628, "Node_0_zone_DMA32": ["6.07", "6.07", "6.07", "6.07", "6.07", "6.07", "6.07", "6.06", "6.05", "6.04", "6.01", "5.94", "5.86", "5.76", "5.46", "4.85", "3.23", "3.23"], "Node_0_zone_Normal": ["61.11", "61.11", "61.11", "61.11", "61.09", "61.07", "60.99", "60.93", "60.84", "60.75", "60.72", "60.70", "60.67", "60.62", "60.42", "60.22", "59.81", "58.20"]}
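The extended record adds one key per NUMA node and zone, named Node_N_zone_NAME in the example above. The sketch below groups those keys out of an already-decoded JSON dictionary. The key pattern is inferred from this single example, the function name is hypothetical, and the sample keeps only the first three values of each list.

```python
import re

# Key pattern inferred from the example record, e.g. "Node_0_zone_DMA32".
ZONE_KEY = re.compile(r"^Node_(\d+)_zone_(\w+)$")


def zone_breakdown(payload):
    """Group extended_buddy entries as {numa_node: {zone: [float, ...]}}."""
    result = {}
    for key, values in payload.items():
        match = ZONE_KEY.match(key)
        if match:
            node, zone = int(match.group(1)), match.group(2)
            result.setdefault(node, {})[zone] = [float(v) for v in values]
    return result


# Abbreviated payload: first three values of each list from the example.
SAMPLE_PAYLOAD = {
    "Node_0_zone_DMA": ["0.05", "0.05", "0.05"],
    "Node_0_zone_DMA32": ["6.07", "6.07", "6.07"],
    "Node_0_zone_Normal": ["61.11", "61.11", "61.11"],
    "%_of_boot_mem": ["67.23", "67.23", "67.23"],
}

zones = zone_breakdown(SAMPLE_PAYLOAD)
```

In the example record, the per-zone values at a given index add up to the matching %_of_boot_mem entry (0.05 + 6.07 + 61.11 = 67.23 at index 0), which makes a useful sanity check when reading extended output.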