
Spark Jobs in MapR Data Science Refinery Stall Forever

Question asked by PETER.EDIKE on May 25, 2018
Latest reply on May 30, 2018 by maprcommunity

Hello everyone,

 

I am having an issue running Spark scripts from Zeppelin with both the pyspark and spark interpreters. When I hit the play button on anything as simple as an import, I get the following exception:

java.lang.NullPointerException
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_2(SparkInterpreter.java:391)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:380)
at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:828)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
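For context, the paragraph being run is as trivial as the following (a hypothetical example, not my exact code; any statement at all fails the same way, because the SparkContext never gets created):

```python
# Hypothetical minimal Zeppelin paragraph that reproduces the failure.
# In Zeppelin the interpreter directive is the first line of the
# paragraph; it is shown here as a comment:
# %pyspark
import datetime  # even a bare stdlib import never returns, since the
                 # interpreter dies creating the SparkContext first

print(datetime.datetime.now())
```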

 

I checked the ResourceManager logs and see the following errors:

 

Container exited with a non-zero exit code 10
Failing this attempt. Failing the application. APPID=application_1527232866021_0008
2018-05-25 09:55:20,070 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1527232866021_0008,name=Zeppelin,user=mapr,queue=root.mapr,state=FAILED,trackingUrl=http://BGDTEST4.INTERSWITCH.COM:8088/cluster/app/application_1527232866021_0008,appMasterHost=N/A,startTime=1527238308241,finishTime=1527238520039,finalStatus=FAILED,memorySeconds=216252,vcoreSeconds=211,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=<memory:0\, vCores:0\, disks:0.0>,applicationType=SPARK
2018-05-25 09:59:55,737 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 9
2018-05-25 10:00:02,679 WARN org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The specific max attempts: 0 for application: 9 is invalid, because it is out of the range [1, 2]. Use the global max attempts instead.
2018-05-25 10:00:02,680 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 9 submitted by user mapr
2018-05-25 10:00:02,680 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=mapr IP=172.25.10.230 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1527232866021_0009
2018-05-25 10:00:02,680 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1527232866021_0009
2018-05-25 10:00:02,680 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1527232866021_0009 State change from NEW to NEW_SAVING
2018-05-25 10:00:02,680 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1527232866021_0009
2018-05-25 10:00:02,685 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Storing info for app: application_1527232866021_0009 at: /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMAppRoot/application_1527232866021_0009/application_1527232866021_0009
2018-05-25 10:00:02,699 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1527232866021_0009 State change from NEW_SAVING to SUBMITTED
2018-05-25 10:00:02,700 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Accepted application application_1527232866021_0009 from user: mapr, in queue: default, currently num of applications: 2
2018-05-25 10:00:02,700 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1527232866021_0009 State change from SUBMITTED to ACCEPTED
2018-05-25 10:00:02,700 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1527232866021_0009_000001
2018-05-25 10:00:02,700 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000001 State change from NEW to SUBMITTED
2018-05-25 10:00:02,701 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added Application Attempt appattempt_1527232866021_0009_000001 to scheduler from user: mapr
2018-05-25 10:00:02,701 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000001 State change from SUBMITTED to SCHEDULED
2018-05-25 10:00:02,781 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e22_1527232866021_0009_01_000001 Container Transitioned from NEW to ALLOCATED
2018-05-25 10:00:02,781 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=mapr OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1527232866021_0009 CONTAINERID=container_e22_1527232866021_0009_01_000001
2018-05-25 10:00:02,781 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_e22_1527232866021_0009_01_000001 of capacity <memory:1024, vCores:1, disks:0.0> on host BGDTEST1.INTERSWITCH.COM:8099, which has 3 containers, <memory:6144, vCores:3, disks:0.0> used and <memory:42414, vCores:3, disks:1.33> available after allocation
2018-05-25 10:00:02,782 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Sending NMToken for nodeId : BGDTEST1.INTERSWITCH.COM:8099 for container : container_e22_1527232866021_0009_01_000001
2018-05-25 10:00:02,783 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e22_1527232866021_0009_01_000001 Container Transitioned from ALLOCATED to ACQUIRED
2018-05-25 10:00:02,783 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Clear node set for appattempt_1527232866021_0009_000001
2018-05-25 10:00:02,783 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1527232866021_0009 AttemptId: appattempt_1527232866021_0009_000001 MasterContainer: Container: [ContainerId: container_e22_1527232866021_0009_01_000001, NodeId: BGDTEST1.INTERSWITCH.COM:8099, NodeHttpAddress: BGDTEST1.INTERSWITCH.COM:8042, Resource: <memory:1024, vCores:1, disks:0.0>, Priority: 0, Token: Token { kind: ContainerToken, service: 172.25.10.151:8099 }, ]
2018-05-25 10:00:02,783 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000001 State change from SCHEDULED to ALLOCATED_SAVING
2018-05-25 10:00:02,784 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Storing info for attempt: appattempt_1527232866021_0009_000001 at: /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMAppRoot/application_1527232866021_0009/appattempt_1527232866021_0009_000001
2018-05-25 10:00:02,790 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000001 State change from ALLOCATED_SAVING to ALLOCATED
2018-05-25 10:00:02,791 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1527232866021_0009_000001
2018-05-25 10:00:02,793 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_e22_1527232866021_0009_01_000001, NodeId: BGDTEST1.INTERSWITCH.COM:8099, NodeHttpAddress: BGDTEST1.INTERSWITCH.COM:8042, Resource: <memory:1024, vCores:1, disks:0.0>, Priority: 0, Token: Token { kind: ContainerToken, service: 172.25.10.151:8099 }, ] for AM appattempt_1527232866021_0009_000001
2018-05-25 10:00:02,793 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_e22_1527232866021_0009_01_000001 : {{JAVA_HOME}}/bin/java,-server,-Xmx512m,-Djava.io.tmpdir={{PWD}}/tmp,-Dspark.yarn.app.container.log.dir=<LOG_DIR>,org.apache.spark.deploy.yarn.ExecutorLauncher,--arg,'172.35.15.230:11000',--properties-file,{{PWD}}/__spark_conf__/__spark_conf__.properties,1>,<LOG_DIR>/stdout,2>,<LOG_DIR>/stderr
2018-05-25 10:00:02,793 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1527232866021_0009_000001
2018-05-25 10:00:02,793 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1527232866021_0009_000001
2018-05-25 10:00:02,808 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_e22_1527232866021_0009_01_000001, NodeId: BGDTEST1.INTERSWITCH.COM:8099, NodeHttpAddress: BGDTEST1.INTERSWITCH.COM:8042, Resource: <memory:1024, vCores:1, disks:0.0>, Priority: 0, Token: Token { kind: ContainerToken, service: 172.25.10.151:8099 }, ] for AM appattempt_1527232866021_0009_000001
2018-05-25 10:00:02,808 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000001 State change from ALLOCATED to LAUNCHED
2018-05-25 10:00:03,781 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e22_1527232866021_0009_01_000001 Container Transitioned from ACQUIRED to RUNNING
2018-05-25 10:01:47,882 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e22_1527232866021_0009_01_000001 Container Transitioned from RUNNING to COMPLETED
2018-05-25 10:01:47,882 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: Completed container: container_e22_1527232866021_0009_01_000001 in state: COMPLETED event:FINISHED
2018-05-25 10:01:47,882 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=mapr OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1527232866021_0009 CONTAINERID=container_e22_1527232866021_0009_01_000001
2018-05-25 10:01:47,882 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_e22_1527232866021_0009_01_000001 of capacity <memory:1024, vCores:1, disks:0.0> on host BGDTEST1.INTERSWITCH.COM:8099, which currently has 2 containers, <memory:5120, vCores:2, disks:0.0> used and <memory:43438, vCores:4, disks:1.33> available, release resources=true
2018-05-25 10:01:47,882 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application attempt appattempt_1527232866021_0009_000001 released container container_e22_1527232866021_0009_01_000001 on node: host: BGDTEST1.INTERSWITCH.COM:8099 #containers=2 available=<memory:43438, vCores:4, disks:1.33> used=<memory:5120, vCores:2, disks:0.0> with event: FINISHED
2018-05-25 10:01:47,882 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1527232866021_0009_000001 with final state: FAILED, and exit status: 10
2018-05-25 10:01:47,882 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000001 State change from LAUNCHED to FINAL_SAVING
2018-05-25 10:01:47,883 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Updating info for attempt: appattempt_1527232866021_0009_000001 at: /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMAppRoot/application_1527232866021_0009/appattempt_1527232866021_0009_000001
2018-05-25 10:01:47,901 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1527232866021_0009_000001
2018-05-25 10:01:47,901 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1527232866021_0009_000001
2018-05-25 10:01:47,901 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000001 State change from FINAL_SAVING to FAILED
2018-05-25 10:01:47,901 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 1. The max attempts is 2
2018-05-25 10:01:47,901 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1527232866021_0009_000002
2018-05-25 10:01:47,901 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application appattempt_1527232866021_0009_000001 is done. finalState=FAILED
2018-05-25 10:01:47,901 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000002 State change from NEW to SUBMITTED
2018-05-25 10:01:47,902 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1527232866021_0009 requests cleared
2018-05-25 10:01:47,902 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added Application Attempt appattempt_1527232866021_0009_000002 to scheduler from user: mapr
2018-05-25 10:01:47,902 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000002 State change from SUBMITTED to SCHEDULED
2018-05-25 10:01:47,960 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e22_1527232866021_0009_02_000001 Container Transitioned from NEW to ALLOCATED
2018-05-25 10:01:47,960 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=mapr OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1527232866021_0009 CONTAINERID=container_e22_1527232866021_0009_02_000001
2018-05-25 10:01:47,960 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_e22_1527232866021_0009_02_000001 of capacity <memory:1024, vCores:1, disks:0.0> on host BGDTEST2.INTERSWITCH.COM:8099, which has 2 containers, <memory:4096, vCores:2, disks:0.0> used and <memory:44462, vCores:4, disks:1.33> available after allocation
2018-05-25 10:01:47,961 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Sending NMToken for nodeId : BGDTEST2.INTERSWITCH.COM:8099 for container : container_e22_1527232866021_0009_02_000001
2018-05-25 10:01:47,961 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e22_1527232866021_0009_02_000001 Container Transitioned from ALLOCATED to ACQUIRED
2018-05-25 10:01:47,961 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Clear node set for appattempt_1527232866021_0009_000002
2018-05-25 10:01:47,961 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1527232866021_0009 AttemptId: appattempt_1527232866021_0009_000002 MasterContainer: Container: [ContainerId: container_e22_1527232866021_0009_02_000001, NodeId: BGDTEST2.INTERSWITCH.COM:8099, NodeHttpAddress: BGDTEST2.INTERSWITCH.COM:8042, Resource: <memory:1024, vCores:1, disks:0.0>, Priority: 0, Token: Token { kind: ContainerToken, service: 172.25.10.152:8099 }, ]
2018-05-25 10:01:47,961 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000002 State change from SCHEDULED to ALLOCATED_SAVING
2018-05-25 10:01:47,961 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Storing info for attempt: appattempt_1527232866021_0009_000002 at: /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMAppRoot/application_1527232866021_0009/appattempt_1527232866021_0009_000002
2018-05-25 10:01:47,966 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000002 State change from ALLOCATED_SAVING to ALLOCATED
2018-05-25 10:01:47,966 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1527232866021_0009_000002
2018-05-25 10:01:47,967 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_e22_1527232866021_0009_02_000001, NodeId: BGDTEST2.INTERSWITCH.COM:8099, NodeHttpAddress: BGDTEST2.INTERSWITCH.COM:8042, Resource: <memory:1024, vCores:1, disks:0.0>, Priority: 0, Token: Token { kind: ContainerToken, service: 172.25.10.152:8099 }, ] for AM appattempt_1527232866021_0009_000002
2018-05-25 10:01:47,967 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to launch container container_e22_1527232866021_0009_02_000001 : {{JAVA_HOME}}/bin/java,-server,-Xmx512m,-Djava.io.tmpdir={{PWD}}/tmp,-Dspark.yarn.app.container.log.dir=<LOG_DIR>,org.apache.spark.deploy.yarn.ExecutorLauncher,--arg,'172.35.15.230:11000',--properties-file,{{PWD}}/__spark_conf__/__spark_conf__.properties,1>,<LOG_DIR>/stdout,2>,<LOG_DIR>/stderr
2018-05-25 10:01:47,967 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1527232866021_0009_000002
2018-05-25 10:01:47,967 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1527232866021_0009_000002
2018-05-25 10:01:47,980 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_e22_1527232866021_0009_02_000001, NodeId: BGDTEST2.INTERSWITCH.COM:8099, NodeHttpAddress: BGDTEST2.INTERSWITCH.COM:8042, Resource: <memory:1024, vCores:1, disks:0.0>, Priority: 0, Token: Token { kind: ContainerToken, service: 172.25.10.152:8099 }, ] for AM appattempt_1527232866021_0009_000002
2018-05-25 10:01:47,980 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000002 State change from ALLOCATED to LAUNCHED
2018-05-25 10:01:48,961 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e22_1527232866021_0009_02_000001 Container Transitioned from ACQUIRED to RUNNING
2018-05-25 10:03:34,075 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e22_1527232866021_0009_02_000001 Container Transitioned from RUNNING to COMPLETED
2018-05-25 10:03:34,075 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: Completed container: container_e22_1527232866021_0009_02_000001 in state: COMPLETED event:FINISHED
2018-05-25 10:03:34,076 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=mapr OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1527232866021_0009 CONTAINERID=container_e22_1527232866021_0009_02_000001
2018-05-25 10:03:34,076 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_e22_1527232866021_0009_02_000001 of capacity <memory:1024, vCores:1, disks:0.0> on host BGDTEST2.INTERSWITCH.COM:8099, which currently has 1 containers, <memory:3072, vCores:1, disks:0.0> used and <memory:45486, vCores:5, disks:1.33> available, release resources=true
2018-05-25 10:03:34,076 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application attempt appattempt_1527232866021_0009_000002 released container container_e22_1527232866021_0009_02_000001 on node: host: BGDTEST2.INTERSWITCH.COM:8099 #containers=1 available=<memory:45486, vCores:5, disks:1.33> used=<memory:3072, vCores:1, disks:0.0> with event: FINISHED
2018-05-25 10:03:34,076 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1527232866021_0009_000002 with final state: FAILED, and exit status: 10
2018-05-25 10:03:34,076 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000002 State change from LAUNCHED to FINAL_SAVING
2018-05-25 10:03:34,076 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Updating info for attempt: appattempt_1527232866021_0009_000002 at: /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMAppRoot/application_1527232866021_0009/appattempt_1527232866021_0009_000002
2018-05-25 10:03:34,088 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1527232866021_0009_000002
2018-05-25 10:03:34,088 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1527232866021_0009_000002
2018-05-25 10:03:34,088 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1527232866021_0009_000002 State change from FINAL_SAVING to FAILED
2018-05-25 10:03:34,088 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 2. The max attempts is 2
2018-05-25 10:03:34,088 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1527232866021_0009 with final state: FAILED
2018-05-25 10:03:34,088 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1527232866021_0009 State change from ACCEPTED to FINAL_SAVING
2018-05-25 10:03:34,088 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1527232866021_0009
2018-05-25 10:03:34,088 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application appattempt_1527232866021_0009_000002 is done. finalState=FAILED
2018-05-25 10:03:34,089 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: Updating info for app: application_1527232866021_0009 at: /var/mapr/cluster/yarn/rm/system/FSRMStateRoot/RMAppRoot/application_1527232866021_0009/application_1527232866021_0009
2018-05-25 10:03:34,089 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1527232866021_0009 requests cleared
2018-05-25 10:03:34,103 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1527232866021_0009 failed 2 times due to AM Container for appattempt_1527232866021_0009_000002 exited with exitCode: 10
For more detailed output, check application tracking page:http://BGDTEST4.INTERSWITCH.COM:8088/cluster/app/application_1527232866021_0009Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e22_1527232866021_0009_02_000001
Exit code: 10
Stack trace: ExitCodeException exitCode=10:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:563)
at org.apache.hadoop.util.Shell.run(Shell.java:460)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:748)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:305)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:356)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:88)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Shell output: main : command provided 1
main : user is mapr
main : requested yarn user is mapr
Container exited with a non-zero exit code 10
Failing this attempt. Failing the application.
2018-05-25 10:03:34,103 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1527232866021_0009 State change from FINAL_SAVING to FAILED
2018-05-25 10:03:34,103 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=mapr OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1527232866021_0009 failed 2 times due to AM Container for appattempt_1527232866021_0009_000002 exited with exitCode: 10
For more detailed output, check application tracking page:http://BGDTEST4.INTERSWITCH.COM:8088/cluster/app/application_1527232866021_0009Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e22_1527232866021_0009_02_000001
Exit code: 10
Stack trace: ExitCodeException exitCode=10:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:563)
at org.apache.hadoop.util.Shell.run(Shell.java:460)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:748)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:305)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:356)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:88)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Shell output: main : command provided 1
main : user is mapr
main : requested yarn user is mapr
Container exited with a non-zero exit code 10
Failing this attempt. Failing the application. APPID=application_1527232866021_0009
2018-05-25 10:03:34,104 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1527232866021_0009,name=Zeppelin,user=mapr,queue=root.mapr,state=FAILED,trackingUrl=http://BGDTEST4.INTERSWITCH.COM:8088/cluster/app/application_1527232866021_0009,appMasterHost=N/A,startTime=1527238802679,finishTime=1527239014088,finalStatus=FAILED,memorySeconds=216285,vcoreSeconds=211,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=<memory:0\, vCores:0\, disks:0.0>,applicationType=SPARK

 


Even more interesting: if I submit regular jobs via the spark-submit script, they run fine.
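For example, a direct submission along these lines succeeds on the same cluster (the Spark version path and the SparkPi example jar are illustrative, not the exact command I ran):

```shell
# Hedged sketch of a direct YARN submission that works, bypassing Zeppelin.
# Paths below assume a typical MapR Spark install layout and are illustrative.
/opt/mapr/spark/spark-2.1.0/bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  /opt/mapr/spark/spark-2.1.0/examples/jars/spark-examples_2.11-2.1.0.jar 10
```

This suggests the cluster-side Spark/YARN setup is healthy and the problem is specific to how the Zeppelin container's interpreter launches its application master.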
