Enabling YARN Local-Node Log Aggregation

This procedure configures log aggregation for NodeManager processes so that logs are stored on the nodes (in node-local volumes) where the YARN containers are launched.

To enable YARN local-node log aggregation, add or edit the following properties in the yarn-site.xml file:

  1. Set the value of yarn.node-local-log-aggregation.enable to true.
    Note: Remove the standard YARN log aggregation property (yarn.log-aggregation-enable) from the yarn-site.xml file, or set it to false.
  2. Optional: Set the value of yarn.node-local-log-aggregation.metadata-path to a location in the system. By default the location is maprfs:///NM_REMOTE_APP_LOG_DIR/<user>/logsMeta. NM_REMOTE_APP_LOG_DIR should match the yarn.nodemanager.remote-app-log-dir property.
    Note: The location should not be an absolute path (that is, a path that begins with /). In the filesystem (maprfs), the default setting for NM_REMOTE_APP_LOG_DIR is /tmp/logs.
  3. Optional: Set the value of node-local-log-aggregation.metadata-filename to the name of the metafile that contains the information about containers for each node. By default, the file name is containers.seq, so if you use the default paths, the file is stored at /tmp/logs/<user>/logsMeta/<appId>/<nodeName>/containers.seq.
  4. Restart the NodeManager and HistoryServer services.
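
As a minimal sketch of step 1 and its note, the entries in yarn-site.xml might look like the fragment below. It assumes the properties sit inside the file's existing <configuration> element and that the optional properties from steps 2 and 3 are left at their defaults. Treat it as an illustration, not a complete configuration.

```xml
<!-- Fragment for yarn-site.xml; place inside the existing <configuration> element. -->

<!-- Step 1: turn on node-local log aggregation. -->
<property>
  <name>yarn.node-local-log-aggregation.enable</name>
  <value>true</value>
</property>

<!-- Note in step 1: standard YARN log aggregation is removed or set to false. -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>false</value>
</property>
```

The change takes effect only after the restart described in step 4.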

Aggregated logs are owned by the user who runs the job.

Different users cannot see each other's logs. For example, when the data-fabric user admin runs a job, the logs are stored in maprfs:///var/mapr/local/<nodeNames>/mapred/nodeManager/logs/admin/<appId>. If a user analyst runs a job, the logs are stored in maprfs:///var/mapr/local/<nodeNames>/mapred/nodeManager/logs/analyst/<appId>.