
What is the recommended value for spark.port.maxRetries? 

Question asked by davidehle on Apr 13, 2017
Latest reply on Apr 24, 2017 by maprcommunity

What is the MapR-recommended value for spark.port.maxRetries in a MapR 5.2.1 / YARN / Spark 2.1 environment?


The default value is 16, which appears to create an (unintentional?) cap of 16 simultaneously submitted Spark applications when they are all submitted from the same edge node.

The Spark documentation defines this value as:

Maximum number of retries when binding to a port before giving up. When a port is given a specific value (non 0), each subsequent retry will increment the port used in the previous attempt by 1 before retrying. This essentially allows it to try a range of ports from the start port specified to port + maxRetries.

What overhead or issues might be incurred by raising this value? What would be the upper limit (other than port collisions with other applications)?

Or is there some other underlying issue I should be addressing that causes spark.port.maxRetries to act as the upper limit on concurrent Spark jobs in a YARN cluster?

Thanks!

David. 
