Spark Troubleshooting Guide: Debugging Spark Applications: How to enable MapR file client debug from a Spark application

Document created by hdevanath (Employee) on Jun 19, 2017. Last modified by hdevanath (Employee) on Jun 19, 2017.
Version 3

To troubleshoot Spark jobs that interact with the MapR File System (MFS), you can use the options below. If a problem needs assistance from MapR Tech Support, we recommend running these diagnostics and attaching the resulting logs to the support case. Having the debug logs up front typically lets Support give a more useful initial response.
Scenario 1) Enable file client debug for a Spark Java program

package test;

import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.log4j.Level;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class Word {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("Eclipse").setMaster("yarn");
        JavaSparkContext jsc = new JavaSparkContext(conf);
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);
        JavaRDD<Integer> distData = jsc.parallelize(data);
        // Test 1 - Set debug for all executors.
        // Test 2 - Set file client debug. This must be set before the file is opened.
        jsc.hadoopConfiguration().set("fs.mapr.trace", "debug");
        JavaRDD<String> file = jsc.textFile("/avro/install.log");
        // count() returns the number of lines in the file, not the number of words.
        long lineCount = file.count();
        System.out.println("Number of lines in file : " + lineCount);
        jsc.stop();
    }
}
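
If changing the application code is not convenient, the same Hadoop property can in principle be passed at submit time. Spark copies any property prefixed with spark.hadoop. into the Hadoop Configuration, so spark.hadoop.fs.mapr.trace=debug should be equivalent to the hadoopConfiguration().set(...) call in the program above. The following is only a sketch that assembles such a spark-submit command line. The class SubmitWithTrace, the method buildCommand, and the jar name word.jar are hypothetical, and this route has not been verified against a MapR cluster here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles a spark-submit command that passes the
// MapR file client trace level as a Hadoop property via the spark.hadoop. prefix.
public class SubmitWithTrace {

    // mainClass and appJar identify the application; traceLevel is the value
    // for fs.mapr.trace (the article uses "debug").
    static List<String> buildCommand(String mainClass, String appJar, String traceLevel) {
        List<String> cmd = new ArrayList<>();
        cmd.add("spark-submit");
        cmd.add("--master");
        cmd.add("yarn");
        cmd.add("--class");
        cmd.add(mainClass);
        // Spark strips the "spark.hadoop." prefix and sets the remainder
        // on the Hadoop Configuration used by the job.
        cmd.add("--conf");
        cmd.add("spark.hadoop.fs.mapr.trace=" + traceLevel);
        cmd.add(appJar);
        return cmd;
    }

    public static void main(String[] args) {
        // "word.jar" is an assumed jar name, used for illustration only.
        List<String> cmd = buildCommand("test.Word", "word.jar", "debug");
        System.out.println(String.join(" ", cmd));
    }
}
```

Running main prints the command line "spark-submit --master yarn --class test.Word --conf spark.hadoop.fs.mapr.trace=debug word.jar", which can be run from a shell or handed to a ProcessBuilder.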

Scenario 2) Enable file client debug for a Spark Scala program

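spark-shell>
import org.apache.spark.{SparkConf, SparkContext}
// Set file client debug before reading the file, as in the Java example in Scenario 1.
sc.hadoopConfiguration.set("fs.mapr.trace", "debug")
val df = sc.textFile("/avro/install.log")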