
Using Node Labels with Spark

Question asked by apuranik on Apr 5, 2018
Latest reply on Apr 24, 2018 by apuranik

I am trying to use node-based labeling with Spark applications. I have added the label entry in the yarn.node.labels.file. However, when I use the "spark.yarn.executor.nodeLabelExpression=&lt;label&gt;" property with spark-submit, executors are not restricted to the labelled node; they are spun up on all available nodes in the cluster. In verbose mode, the submission does show the property among the system properties for the application (spark.yarn.executor.nodeLabelExpression -> &lt;label&gt;), but for some reason it is being ignored.
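For reference, this is roughly how I am submitting the job (the jar path, class, and label name here are placeholders, not my exact values):

```shell
# Hypothetical spark-submit invocation showing where the node label
# expression is passed; <label> is substituted with our actual label name.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.executor.nodeLabelExpression=mylabel \
  --class org.apache.spark.examples.SparkPi \
  /opt/spark/examples/jars/spark-examples.jar 100
```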

When I submit a sample MapReduce Pi job with the "-Dmapreduce.job.label=&lt;label&gt;" property set, all the task attempts do run on the one node carrying that label. This indicates that the label configuration itself is correct, but somehow the Spark submission is ignoring it.
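The working MapReduce submission looks roughly like this (again, the jar path and label name are placeholders):

```shell
# Hypothetical hadoop jar invocation; with this -D flag set, every task
# attempt lands on the node carrying the label, as expected.
hadoop jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples.jar pi \
  -Dmapreduce.job.label=mylabel 10 100
```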

Our cluster is running YARN (Hadoop) 2.7, which is above the minimum version required for node-based labeling.

Any help with this is appreciated!