
Running Talend Spark job on MapR 5.2 always hangs/errors

Question asked by demarton on Sep 19, 2017
Latest reply on Oct 9, 2017 by cathy

hi,


I have a 4-node MapR 5.2 cluster with MEP 3.0.2.


I have been trying to run a Spark batch job created from Talend, but it is never successful; it always hangs. When I check the YARN history logs, I always see the same error: the job starts OK, with Spark requesting 2 containers, but whenever YARN grants the containers and tries to execute them, it errors out.

I tried the suggestion here (MapR ticket exception: "Initial job has not accept... - Talend Community), but it didn't seem to help at all.

I also tried tuning yarn-site.xml and mapred-site.xml, which didn't seem to have any effect. Checking the Resource Manager web UI, I have 4 nodes (dot1-dot4), and the cluster doesn't seem to be short of CPU/RAM. Anything else I can check?
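
For reference, the kind of tuning I tried in yarn-site.xml was along these lines (the property names are the standard YARN ones; the values shown are illustrative, not my exact settings):

    <!-- yarn-site.xml: per-node resources and scheduler limits (illustrative values) -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>8192</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>4</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>4096</value>
    </property>

Since each executor only asks for 1408 MB and 1 vCore (see the log below), limits like these should be more than enough, which is why I don't think it's a resource problem.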


Error message:

[root@dot4 ~]# cat /opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1505724345419_0024/container_e21_1505724345419_0024_01_000001/stderr
17/09/19 15:27:57 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
17/09/19 15:27:58 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1505724345419_0024_000001
17/09/19 15:27:58 INFO spark.SecurityManager: Changing view acls to: root
17/09/19 15:27:58 INFO spark.SecurityManager: Changing modify acls to: root
17/09/19 15:27:58 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
17/09/19 15:27:59 INFO yarn.ApplicationMaster: Waiting for Spark driver to be reachable.
17/09/19 15:27:59 INFO yarn.ApplicationMaster: Driver now available: 192.168.3.104:50545
17/09/19 15:28:04 INFO yarn.ApplicationMaster$AMEndpoint: Add WebUI Filter. AddWebUIFilter(org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter,Map(PROXY_HOSTS -> dot2.lsa.asia, PROXY_URI_BASES -> http://dot2.lsa.asia:8088/proxy/application_1505724345419_0024),/proxy/application_1505724345419_0024)
17/09/19 15:28:04 INFO client.MapRZKBasedRMFailoverProxyProvider: Updated RM address to dot2.lsa.asia/192.168.8.22:8030
17/09/19 15:28:04 INFO yarn.YarnRMClient: Registering the ApplicationMaster
17/09/19 15:28:04 INFO yarn.YarnAllocator: Will request 2 executor containers, each with 1 cores and 1408 MB memory including 384 MB overhead
17/09/19 15:28:04 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1, disks:0.0>)
17/09/19 15:28:04 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1, disks:0.0>)
17/09/19 15:28:04 INFO yarn.ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
17/09/19 15:28:05 INFO impl.AMRMClientImpl: Received new token for : dot4.lsa.asia:38153
17/09/19 15:28:05 INFO impl.AMRMClientImpl: Received new token for : dot3.lsa.asia:35958
17/09/19 15:28:05 INFO yarn.YarnAllocator: Launching container container_e21_1505724345419_0024_01_000002 for on host dot4.lsa.asia
17/09/19 15:28:05 INFO yarn.YarnAllocator: Launching ExecutorRunnable. driverUrl: spark://CoarseGrainedScheduler@192.168.3.104:50545, executorHostname: dot4.lsa.asia
17/09/19 15:28:05 INFO yarn.YarnAllocator: Launching container container_e21_1505724345419_0024_01_000003 for on host dot3.lsa.asia
17/09/19 15:28:05 INFO yarn.YarnAllocator: Launching ExecutorRunnable. driverUrl: spark://CoarseGrainedScheduler@192.168.3.104:50545, executorHostname: dot3.lsa.asia
17/09/19 15:28:05 INFO yarn.ExecutorRunnable: Starting Executor Container
17/09/19 15:28:05 INFO yarn.ExecutorRunnable: Starting Executor Container
17/09/19 15:28:05 INFO yarn.YarnAllocator: Received 2 containers from YARN, launching executors on 2 of them.
17/09/19 15:28:05 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
17/09/19 15:28:05 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
17/09/19 15:28:05 INFO yarn.ExecutorRunnable: Setting up ContainerLaunchContext
17/09/19 15:28:05 INFO yarn.ExecutorRunnable: Setting up ContainerLaunchContext
17/09/19 15:28:05 INFO yarn.ExecutorRunnable: Preparing Local resources
17/09/19 15:28:05 INFO yarn.ExecutorRunnable: Preparing Local resources
17/09/19 15:28:05 INFO yarn.ExecutorRunnable: Prepared Local resources Map(__spark__.jar -> resource { scheme: "maprfs" port: -1 file: "/user/root/.sparkStaging/application_1505724345419_0024/talend-spark-assembly-1.6.1-mapr-1608-hadoop2.7.0-mapr-1607.jar" } size: 154048867 timestamp: 1505809672346 type: FILE visibility: PRIVATE)
17/09/19 15:28:05 INFO yarn.ExecutorRunnable: Prepared Local resources Map(__spark__.jar -> resource { scheme: "maprfs" port: -1 file: "/user/root/.sparkStaging/application_1505724345419_0024/talend-spark-assembly-1.6.1-mapr-1608-hadoop2.7.0-mapr-1607.jar" } size: 154048867 timestamp: 1505809672346 type: FILE visibility: PRIVATE)
Exception in thread "ContainerLauncher-0" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/MRJobConfig
at org.apache.spark.deploy.yarn.Client$$anonfun$20.apply(Client.scala:1204)
at org.apache.spark.deploy.yarn.Client$$anonfun$20.apply(Client.scala:1203)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.deploy.yarn.Client$.getDefaultMRApplicationClasspath(Client.scala:1203)
at org.apache.spark.deploy.yarn.Client$.getMRAppClasspath(Client.scala:1180)
at org.apache.spark.deploy.yarn.Client$.populateHadoopClasspath(Client.scala:1165)
at org.apache.spark.deploy.yarn.Client$.populateClasspath(Client.scala:1271)
at org.apache.spark.deploy.yarn.ExecutorRunnable.prepareEnvironment(ExecutorRunnable.scala:297)
at org.apache.spark.deploy.yarn.ExecutorRunnable.env$lzycompute(ExecutorRunnable.scala:61)
at org.apache.spark.deploy.yarn.ExecutorRunnable.env(ExecutorRunnable.scala:61)
at org.apache.spark.deploy.yarn.ExecutorRunnable.startContainer(ExecutorRunnable.scala:80)
at org.apache.spark.deploy.yarn.ExecutorRunnable.run(ExecutorRunnable.scala:68)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.MRJobConfig
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 15 more
Exception in thread "ContainerLauncher-1" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/MRJobConfig
at org.apache.spark.deploy.yarn.Client$$anonfun$20.apply(Client.scala:1204)
at org.apache.spark.deploy.yarn.Client$$anonfun$20.apply(Client.scala:1203)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.deploy.yarn.Client$.getDefaultMRApplicationClasspath(Client.scala:1203)
at org.apache.spark.deploy.yarn.Client$.getMRAppClasspath(Client.scala:1180)
at org.apache.spark.deploy.yarn.Client$.populateHadoopClasspath(Client.scala:1165)
at org.apache.spark.deploy.yarn.Client$.populateClasspath(Client.scala:1271)
at org.apache.spark.deploy.yarn.ExecutorRunnable.prepareEnvironment(ExecutorRunnable.scala:297)
at org.apache.spark.deploy.yarn.ExecutorRunnable.env$lzycompute(ExecutorRunnable.scala:61)
at org.apache.spark.deploy.yarn.ExecutorRunnable.env(ExecutorRunnable.scala:61)
at org.apache.spark.deploy.yarn.ExecutorRunnable.startContainer(ExecutorRunnable.scala:80)
at org.apache.spark.deploy.yarn.ExecutorRunnable.run(ExecutorRunnable.scala:68)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
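
From the stack trace, the failure happens in Spark's Client.populateHadoopClasspath, which loads org.apache.hadoop.mapreduce.MRJobConfig to build the default MapReduce application classpath, so my guess is that the ApplicationMaster's classpath is missing the MapReduce client jars. In case it's useful, here is how I checked whether the jar that normally provides that class (hadoop-mapreduce-client-core) is present on the node and visible on the YARN classpath (paths assume the default MapR 5.2 layout):

    # Locate the jar that should contain org.apache.hadoop.mapreduce.MRJobConfig
    find /opt/mapr/hadoop/hadoop-2.7.0 -name 'hadoop-mapreduce-client-core*.jar'

    # Verify the class is actually inside the jar found above (substitute the real path)
    unzip -l <path-to-jar> | grep 'mapreduce/MRJobConfig.class'

    # Print the classpath YARN hands to containers and look for the MapReduce jars
    yarn classpath | tr ':' '\n' | grep mapreduce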
