Tasktracker Jobs Fail With Permission Denied Executing taskjvm.sh

Question asked by mapr_newbie on Jun 12, 2014
Latest reply on Jun 16, 2014 by nabeel
Hi All,

We have just installed MapR M7 on an 8-node cluster and loaded data into MapR-FS.  In Hive, I created an external table on top of that data, and a simple select * on the entire table (which spawns no MapReduce jobs) returns results.

But when I run a query that needs to spawn MapReduce jobs (select count(*)), every task fails on every TaskTracker node with the same permission error.  The actual error in the log is:

    2014-06-12 15:15:01,742 INFO mapred.TaskController [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: Couldn't execute the task jvm file /tmp/mapr-hadoop/mapred/local/taskTracker/vsalvi/jobcache/job_201406110957_0007/attempt_201406110957_0007_m_000000_4/work/taskjvm.sh - Permission denied

I'm running the jobs under my own userid (vsalvi) and the cluster admin account is hdp9adm.

The ownership and permissions along the path to the local public cache working directory are:

    namei -mo /tmp/mapr-hadoop/mapred/local/taskTracker/vsalvi/jobcache
    f: /tmp/mapr-hadoop/mapred/local/taskTracker/vsalvi/jobcache
    dr-xr-xr-x root    root     /
    drwxrwxrwt root    root     /tmp
    drwxr-xr-x root    root     /tmp/mapr-hadoop
    drwxr-xr-x root    root     /tmp/mapr-hadoop/mapred
    drwxr-xr-x hdp9adm hdp9adms /tmp/mapr-hadoop/mapred/local
    drwxr-xr-x hdp9adm hdp9adms /tmp/mapr-hadoop/mapred/local/taskTracker
    drwxr-s--- vsalvi  hdp9adms /tmp/mapr-hadoop/mapred/local/taskTracker/vsalvi
    drwxr-s--- vsalvi  hdp9adms /tmp/mapr-hadoop/mapred/local/taskTracker/vsalvi/jobcache

And for the private cache working directory:

    namei -mo /tmp/mapr-hadoop/mapred/local/ttprivate/taskTracker/vsalvi/jobcache
    f: /tmp/mapr-hadoop/mapred/local/ttprivate/taskTracker/vsalvi/jobcache
    dr-xr-xr-x root    root     /
    drwxrwxrwt root    root     /tmp
    drwxr-xr-x root    root     /tmp/mapr-hadoop
    drwxr-xr-x root    root     /tmp/mapr-hadoop/mapred
    drwxr-xr-x hdp9adm hdp9adms /tmp/mapr-hadoop/mapred/local
    drwx------ hdp9adm hdp9adms /tmp/mapr-hadoop/mapred/local/ttprivate
    drwxr-xr-x hdp9adm hdp9adms /tmp/mapr-hadoop/mapred/local/ttprivate/taskTracker
    drwxr-xr-x hdp9adm hdp9adms /tmp/mapr-hadoop/mapred/local/ttprivate/taskTracker/vsalvi
    drwxr-xr-x hdp9adm hdp9adms /tmp/mapr-hadoop/mapred/local/ttprivate/taskTracker/vsalvi/jobcache
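
The same walk can be reproduced without namei. The following is a minimal sketch, assuming GNU coreutils stat (for the -c format option) and an absolute path; the function name walk_path is made up for illustration. Running it as the job user (vsalvi here) rather than root also shows whether that user can reach each component, since stat fails on anything beneath a directory the user cannot traverse.

```shell
#!/usr/bin/env bash
# Print mode, owner, group and name for every component of an absolute
# path, from / down to the target. Stops at the first component that
# cannot be stat'ed by the calling user.
walk_path() {
    local target=$1 current="" part
    stat -c '%A %U %G %n' / || return 1
    local IFS=/                 # split the path on slashes
    for part in $target; do
        [ -z "$part" ] && continue   # skip the empty field before the leading /
        current="$current/$part"
        stat -c '%A %U %G %n' "$current" || return 1
    done
}
```

It would be called as, for example, walk_path /tmp/mapr-hadoop/mapred/local/taskTracker/vsalvi/jobcache.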

I've stopped the warden, deleted the directory /tmp/mapr-hadoop, and restarted the warden, but I still get the same permission error.

This was installed on Red Hat Linux machines.

Here is the TaskTracker log:

    2014-06-12 15:15:00,250 INFO mapred.TaskTracker [main]: LaunchTaskAction (registerTask): attempt_201406110957_0007_m_000000_4 task's state:UNASSIGNED
    2014-06-12 15:15:00,250 INFO mapred.TaskTracker [TaskLauncher for MAP tasks]: Trying to launch : attempt_201406110957_0007_m_000000_4 which needs 1 slots
    2014-06-12 15:15:00,250 INFO mapred.TaskTracker [TaskLauncher for MAP tasks]: In TaskLauncher, current free slots : 28 and trying to launch attempt_201406110957_0007_m_000000_4 which needs 1 slots
    2014-06-12 15:15:00,260 INFO fs.MapRFileSystem [Thread-269]: User hdp9adm is impersonating user vsalvi
    2014-06-12 15:15:01,726 INFO mapred.JvmManager [Thread-271]: In JvmRunner constructed JVM ID: jvm_201406110957_0007_m_-1341195267
    2014-06-12 15:15:01,727 INFO mapred.JvmManager [Thread-271]: JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.
    2014-06-12 15:15:01,729 INFO mapred.TaskLog [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: MapR: Setup Cmds: export JVM_PID=`echo $$`
    export HADOOP_CLIENT_OPTS="-Dhadoop.tasklog.taskid=attempt_201406110957_0007_m_000000_4 -Dhadoop.tasklog.iscleanup=false -Dhadoop.tasklog.totalLogFileSize=0"
    export HADOOP_WORK_DIR="/tmp/mapr-hadoop/mapred/local/taskTracker/vsalvi/jobcache/job_201406110957_0007/attempt_201406110957_0007_m_000000_4/work"
    export JAVA_LIBRARY_PATH="/opt/mapr/conf:/opt/mapr/hadoop/hadoop-0.20.2/conf:/opt/mapr/hive/hive-0.12/conf:/opt/mapr/pig/pig-0.11.2/conf:/opt/mapr/lib:/opt/mapr/hadoop/hadoop-0.20.2/c++/lib:/opt/mapr/hadoop/hadoop-0.20.2/c++/Linux-amd64-4/lib:/opt/mapr/hadoop/hadoop-0.20.2/lib/native/Linux-amd64-64:/opt/java/jdk1.7.0_55/jre/lib/amd64/server:/usr/lib:/usr/lib64"
    export HADOOP_TOKEN_FILE_LOCATION="/tmp/mapr-hadoop/mapred/local/taskTracker/vsalvi/jobcache/job_201406110957_0007/jobToken"
    export PATH="/opt/java/jdk1.7.0_55/bin:/sbin:/usr/sbin:/bin:/usr/bin:/sbin:/usr/sbin:/bin:/usr/bin:/opt/mapr/warden/:/opt/mapr/lib:/opt/mapr/server"
    export HADOOP_ROOT_LOGGER="INFO,maprfsTLA"
    export LD_LIBRARY_PATH=":/opt/mapr/conf:/opt/mapr/hadoop/hadoop-0.20.2/conf:/opt/mapr/hive/hive-0.12/conf:/opt/mapr/pig/pig-0.11.2/conf:/opt/mapr/lib:/opt/mapr/hadoop/hadoop-0.20.2/c++/lib:/opt/mapr/hadoop/hadoop-0.20.2/c++/Linux-amd64-4/lib:/opt/mapr/hadoop/hadoop-0.20.2/lib/native/Linux-amd64-64:/opt/java/jdk1.7.0_55/jre/lib/amd64/server:/usr/lib:/usr/lib64"
    'ulimit' '-v' '4294967296'
    
    echo 10 > /proc/self/oom_score_adj;taskset -p -c 2-31 $$;renice -n 10 -p $$ 1>/dev/null;
    2014-06-12 15:15:01,729 INFO mapred.TaskController [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: Writing commands to /tmp/mapr-hadoop/mapred/local/ttprivate/taskTracker/vsalvi/jobcache/job_201406110957_0007/attempt_201406110957_0007_m_000000_4/taskjvm.sh
    2014-06-12 15:15:01,741 WARN mapred.LinuxTaskController [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: Exit code from task is : 7
    2014-06-12 15:15:01,741 WARN mapred.LinuxTaskController [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: Exception thrown while launching task JVM : org.apache.hadoop.util.Shell$ExitCodeException:
            at org.apache.hadoop.util.Shell.runCommand(Shell.java:322)
            at org.apache.hadoop.util.Shell.run(Shell.java:249)
            at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:442)
            at org.apache.hadoop.mapred.LinuxTaskController.launchTask(LinuxTaskController.java:250)
            at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.runChild(JvmManager.java:567)
            at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.run(JvmManager.java:537)
    
    2014-06-12 15:15:01,741 INFO mapred.LinuxTaskController [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: Output from LinuxTaskController's launchTaskJVM follows:
    2014-06-12 15:15:01,741 INFO mapred.TaskController [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: Reading task controller config from /opt/mapr/hadoop/hadoop-0.20.2/conf/taskcontroller.cfg
    2014-06-12 15:15:01,741 INFO mapred.TaskController [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: number of groups = 2
    2014-06-12 15:15:01,741 INFO mapred.TaskController [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: main : command provided 1
    2014-06-12 15:15:01,741 INFO mapred.TaskController [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: main : user is vsalvi
    2014-06-12 15:15:01,741 INFO mapred.TaskController [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: number of groups = 18
    2014-06-12 15:15:01,742 INFO mapred.TaskController [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: Couldn't execute the task jvm file /tmp/mapr-hadoop/mapred/local/taskTracker/vsalvi/jobcache/job_201406110957_0007/attempt_201406110957_0007_m_000000_4/work/taskjvm.sh - Permission denied
    2014-06-12 15:15:01,743 INFO mapred.JvmManager [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: JVM Not killed jvm_201406110957_0007_m_-1341195267 but just removed
    2014-06-12 15:15:01,743 INFO mapred.JvmManager [JVM Runner jvm_201406110957_0007_m_-1341195267 spawned.]: JVM : jvm_201406110957_0007_m_-1341195267 exited with exit code 7. Number of tasks it ran: 0
    2014-06-12 15:15:01,743 WARN mapred.TaskRunner [Thread-271]: attempt_201406110957_0007_m_000000_4 : Child Error
    java.io.IOException: Task process exit with nonzero status of 7.
            at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:262)
    2014-06-12 15:15:04,747 INFO mapred.TaskTracker [Thread-271]: Task attempt_201406110957_0007_m_000000_4 exiting, status: FAILED_UNCLEAN
    2014-06-12 15:15:04,747 INFO mapred.TaskTracker [Thread-271]: addFreeSlot : current free slots : 28
    2014-06-12 15:15:04,776 INFO mapred.TaskTracker [main]: LaunchTaskAction (registerTask): attempt_201406110957_0007_m_000000_4 task's state:FAILED_UNCLEAN
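
To pull just the failure lines out of a log like the one above, a filter along these lines may help. It is a sketch only: the function name task_errors is hypothetical, and the log file is passed as an argument because the log's location is not shown in this post.

```shell
#!/usr/bin/env bash
# Print WARN-level lines and any line mentioning "Permission denied"
# from the log file given as the first argument.
task_errors() {
    grep -E ' WARN |Permission denied' "$1"
}
```

Against the log above, this would keep the two LinuxTaskController warnings, the TaskRunner "Child Error" warning, and the "Couldn't execute the task jvm file" line, and drop the routine INFO entries.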



Any help would be greatly appreciated.  Please let me know if any other information is needed.

Thanks!


