To set up node labels so that YARN applications (including MapReduce applications)
can be scheduled on a specific node or group of nodes:
-
Create a text file and specify the labels you want to use for the nodes in your
cluster. In this example, the file is named node.labels.
-
Copy the file to a location on the data-fabric filesystem where it will not be modified or deleted, such as /var/mapr:
hadoop fs -put ~/node.labels /var/mapr
-
Edit yarn-site.xml on all ResourceManager nodes and set the node.labels.file parameter
and the optional node.labels.monitor.interval parameter as shown:
<property>
<name>node.labels.file</name>
<value>/var/mapr/node.labels</value>
<description>The path to the node labels file.</description>
</property>
<property>
<name>node.labels.monitor.interval</name>
<value>120000</value>
<description>Interval for checking the labels file for updates (default is 120000 ms)</description>
</property>
-
For this and subsequent changes to take effect, issue either of the following commands to manually tell the ResourceManager to reload the node labels file:
-
For any YARN application, including MapReduce applications, enter
yarn rmadmin -refreshLabels
-
For MapReduce applications, enter
mapred job -refreshLabels
-
Verify that labels are implemented correctly by running either of the following
commands:
yarn rmadmin -showLabels
mapred job -showlabels
The following flowchart summarizes these steps. In addition, the flowchart introduces
the concept of queue labels for the Fair Scheduler and the Capacity Scheduler.
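The command-line part of the procedure above (copy the file, refresh the labels, verify them) can be collected into one small shell function. This is only a sketch under stated assumptions: the function names apply_node_labels and run and the DRY_RUN variable are hypothetical helpers introduced here for illustration, the destination /var/mapr is the example location used in the steps, and yarn-site.xml is assumed to have been edited already on all ResourceManager nodes.

```shell
# Hypothetical helper: with DRY_RUN=1 it prints the command instead of
# executing it, so the sequence can be reviewed first.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the commands shown in the steps above.
# $1: local path to the node labels file (for example ~/node.labels)
# $2: destination directory on the filesystem (defaults to /var/mapr)
apply_node_labels() {
  labels_file="${1:?usage: apply_node_labels <labels-file> [dest-dir]}"
  dest_dir="${2:-/var/mapr}"

  # Copy the labels file to the cluster filesystem, ask the
  # ResourceManager to reload it, then list the labels now in effect.
  # The chain stops at the first command that fails.
  run hadoop fs -put "$labels_file" "$dest_dir" &&
    run yarn rmadmin -refreshLabels &&
    run yarn rmadmin -showLabels
}
```

Running DRY_RUN=1 apply_node_labels ~/node.labels prints the three commands without executing them; omitting DRY_RUN runs them against the cluster. The sketch uses the yarn rmadmin forms because, as noted in the steps, they apply to any YARN application, including MapReduce applications.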
