
Allocating more memory to YARN?

Question asked by reedv on Jan 22, 2018
Latest reply on Jan 23, 2018 by deborah

I was reading this blog post to find which parameter to change in order to dedicate more of each node's memory to YARN. It says to edit the parameter:

yarn.scheduler.minimum-allocation-mb

Looking at the configuration page in the YARN web UI, the config parameter for setting the total memory allocated to YARN from each node is listed as

....
<property>
   <name>yarn.nodemanager.resource.memory-mb</name>
   <value>${nodemanager.resource.memory-mb}</value>
   <source>yarn-default.xml</source>
</property>
....

So I take this to mean I would need to edit the yarn-default.xml file in $HADOOP_HOME (and restart YARN or the whole cluster?), but where is this file? Can I just add this property to the $HADOOP_HOME/etc/hadoop/yarn-site.xml file on every node?
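To be concrete, this is roughly the entry I have in mind adding to yarn-site.xml on each node. It is only a sketch of what I am asking about: the value 10817 is just the example figure I use further down, not a recommendation, and I am assuming that an entry in yarn-site.xml would take precedence over the default shown in the web UI.

```xml
<!-- Sketch only: would go inside the existing <configuration> element of yarn-site.xml -->
<property>
   <name>yarn.nodemanager.resource.memory-mb</name>
   <!-- Example value in MB; assumed, not taken from any MapR recommendation -->
   <value>10817</value>
</property>
```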

Looking in the MapR $HADOOP_HOME/etc/hadoop, I could only find

[root@mapr002 hadoop-2.7.0]# ls etc/hadoop/yarn*
etc/hadoop/yarn-env.sh etc/hadoop/yarn-site.xml.template
etc/hadoop/yarn-site-2017-12-14.17-02.xml etc/hadoop/yarn-timelineserver-properties.xml
etc/hadoop/yarn-site-2017-12-14.17-56.xml etc/hadoop/yarn-timelineserver-security-properties.xml
etc/hadoop/yarn-site.xml

and none of these files contained any mention of the resource.memory-mb property. Further searching (after reading this blog post) led me to look in /opt/mapr/conf/conf.d (should I have used /opt/mapr/conf/conf.d.new?), which has the file warden.resourcemanager.conf, where I'd expect to be able to set

YARN_NODEMANAGER_OPTS= -Dnodemanager.resource.memory-mb=10817

but there seems to be no such variable there.

Looking in /opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/yarn-env.sh, there are the lines

....

YARN_RESOURCEMANAGER_OPTS="$YARN_RESOURCEMANAGER_OPTS -Dfs.cache.lru.enable=true"
export YARN_RESOURCEMANAGER_OPTS="${YARN_RESOURCEMANAGER_OPTS} ${MAPR_LOGIN_OPTS}"
export YARN_NODEMANAGER_OPTS="${YARN_NODEMANAGER_OPTS} ${MAPR_LOGIN_OPTS}"
export YARN_HISTORYSERVER_OPTS="${YARN_HISTORYSERVER_OPTS} ${MAPR_LOGIN_OPTS}"
export YARN_TIMELINESERVER_OPTS="${YARN_TIMELINESERVER_OPTS} ${MAPR_LOGIN_OPTS}"

....

Yet the command "echo ${YARN_RESOURCEMANAGER_OPTS} " returns nothing. So, what should be done here? Should I just edit that line in yarn-env.sh to look like

YARN_RESOURCEMANAGER_OPTS="$YARN_RESOURCEMANAGER_OPTS -Dfs.cache.lru.enable=true -Dnodemanager.resource.memory-mb=<desired per-node mem. allocation amount>"

 

As an aside, is there a way to have certain nodes contribute more memory than others? I ask because, in our cluster, some nodes have more spare memory than others, so it would be very convenient if we could draw resources from those nodes specifically.

Version: MapR 6.0

Installed via the installer script's web GUI.
