Spark Troubleshooting Guide: Tuning Spark: How do I tune the Spark History Server for large event logs?

Document created by hdevanath on Jun 19, 2017; last modified Jun 19, 2017.

The Spark History Server displays information about completed Spark applications, so you can review a job's statistics and event timeline after it finishes. When event logs are large, the history server daemon typically needs more memory than its default allocation to load and render them.
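The history server reads completed application event logs from the directory configured by spark.history.fs.logDirectory (in the process listing in Step 2 below it points at maprfs:///apps/spark). As a minimal, illustrative sketch of the related settings in spark-defaults.conf — the event log directory shown here is an assumption and should match your own cluster:

# spark-defaults.conf (illustrative values)
spark.eventLog.enabled           true
spark.eventLog.dir               maprfs:///apps/spark
spark.history.fs.logDirectory    maprfs:///apps/spark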

 

 

Step 1) By default, SPARK_DAEMON_MEMORY is set to 1 GB. You can raise this limit in /opt/mapr/spark/spark-2.1.0/conf/spark-env.sh. Remember to restart the history server for the updated setting to take effect.

export SPARK_DAEMON_MEMORY=5g
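
If you do not have a cluster-manager restart command handy, a minimal sketch using Spark's bundled scripts is shown below; the sbin path is an assumption and should match your actual install directory:

# Illustrative restart of the history server so the new SPARK_DAEMON_MEMORY is picked up
/opt/mapr/spark/spark-2.1.0/sbin/stop-history-server.sh
/opt/mapr/spark/spark-2.1.0/sbin/start-history-server.sh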

 
Step 2) Validate the changes

ps -ef | grep org.apache.spark.deploy.history.HistoryServer 
mapr     14373     1  0 May23 ?        00:36:40 /usr/lib/jvm/java-1.8.0/bin/java -cp /opt/mapr/spark/spark-1.6.1/conf/:/opt/mapr/spark/spark-1.6.1/lib/spark-assembly-1.6.1-mapr-1611-hadoop2.7.0-mapr-1602.jar -Dspark.history.fs.logDirectory=maprfs:///apps/spark -Dspark.ui.acls.enable=true -Xms5g -Xmx5g org.apache.spark.deploy.history.HistoryServer
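
The -Xms5g -Xmx5g flags in the process command line confirm that the new SPARK_DAEMON_MEMORY value has taken effect. As an illustrative shortcut, you can extract just the heap flags:

# Show only the -Xmx setting of the running history server (optional sketch)
ps -ef | grep org.apache.spark.deploy.history.HistoryServer | grep -o -- '-Xmx[0-9]*[gm]'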


Step 3) Open the Spark History Server web UI to browse the application logs:
 

http://<hostname>:18080/
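
As a quick, optional check that the history server is serving data, its REST API can be queried as well (<hostname> is a placeholder):

# List completed applications known to the history server
curl http://<hostname>:18080/api/v1/applications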


 
