Spark Streaming YARN Log Management

Question asked by john.humphreys on Jul 14, 2017
Latest reply on Jul 24, 2017 by cathy

We'll be deploying a Spark Streaming app on YARN soon.

During testing I've noticed that the YARN logs grow very large, to the point where checking something from yesterday takes a long time to load in the browser. If this keeps up, the logs will eventually not be loadable at all.

How do people manage YARN logs for Spark Streaming?

  • Can I set up log rolling?
  • Can I set a retention time?
  • Anything else useful to set?
  • Are there external tools people tend to use? (I'd probably like to avoid that.)
