
Application gets stuck at 95%

Question asked by ihijazi on Jul 29, 2016
Latest reply on Oct 25, 2016 by asukhenko

I have a problem with an Oozie launcher stuck at 95%: I fire a job and nothing happens. The only way to move forward is to kill the stuck application (the one at 95%, which is part of a submitted workflow). The logs keep reporting progress 1.0 for the launcher's task attempt:

Console

2016-07-29 12:30:45,952 INFO [communication thread] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1469819778648_0002_m_000000_0 is : 1.0
2016-07-29 12:31:15,992 INFO [communication thread] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1469819778648_0002_m_000000_0 is : 1.0
2016-07-29 12:31:46,040 INFO [communication thread] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1469819778648_0002_m_000000_0 is : 1.0
2016-07-29 12:32:16,083 INFO [communication thread] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1469819778648_0002_m_000000_0 is : 1.0
2016-07-29 12:32:46,108 INFO [communication thread] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1469819778648_0002_m_000000_0 is : 1.0

Here are more details:

Terminal

[mapr@maprdemo ~]$ yarn application -list
16/07/29 13:12:20 INFO client.RMProxy: Connecting to ResourceManager at maprdemo/10.0.2.15:8032
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):2
                Application-Id        Application-Name        Application-Type          User         Queue                 State           Final-State           Progress                           Tracking-URL
application_1469819778648_0009    oozie:launcher:T=java:W=Hive=>Hive:A=Hive__Hive:ID=0000001-160729130628025-oozie-mapr-W               MAPREDUCE          mapr     root.mapr               RUNNING             UNDEFINED                95%                  http://maprdemo:43288
application_1469819778648_0010    INSERT OVERWRITE TABLE default.clic...CLICKS(Stage-1)               MAPREDUCE          mapr     root.mapr              ACCEPTED             UNDEFINED                 0%                                N/A
[mapr@maprdemo ~]$ yarn application -kill application_1469819778648_0009
16/07/29 13:12:32 INFO client.RMProxy: Connecting to ResourceManager at maprdemo/10.0.2.15:8032
Killing application application_1469819778648_0009
16/07/29 13:12:33 INFO impl.YarnClientImpl: Killed application application_1469819778648_0009
[mapr@maprdemo ~]$ yarn application -status application_1469819778648_0010
16/07/29 13:15:16 INFO client.RMProxy: Connecting to ResourceManager at maprdemo/10.0.2.15:8032
Application Report :
    Application-Id : application_1469819778648_0010
    Application-Name : INSERT OVERWRITE TABLE default.clic...CLICKS(Stage-1)
    Application-Type : MAPREDUCE
    User : mapr
    Queue : root.mapr
    Start-Time : 1469823130050
    Finish-Time : 1469823158880
    Progress : 100%
    State : FINISHED
    Final-State : SUCCEEDED
    Tracking-URL : http://maprdemo:19888/jobhistory/job/job_1469819778648_0010
    RPC Port : 57068
    AM Host : maprdemo
    Aggregate Resource Allocation : 77739 MB-seconds, 39 vcore-seconds
    Diagnostics :

My VM has about 12 GB of memory and 2 vcores assigned. I'm using the FairScheduler, and the configuration is:

yarn-site.xml

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>8192</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>1</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>2</value>
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>2.1</value>
</property>

 

 

The rest is MapR Sandbox defaults (running MapR 5.1).
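My working theory, as back-of-envelope math: the Oozie launcher is itself a one-map MapReduce job, so while it waits for the Hive query it keeps holding its own AM container plus its map container, and the Hive application then needs its own AM before anything can run. The container sizes below are assumed Hadoop defaults, not values from my config, so treat them as guesses to verify against the sandbox's mapred-site.xml; whether vcores are actually enforced also depends on the FairScheduler policy (the default fair policy schedules by memory only, DRF considers vcores too).

```python
# Back-of-envelope math for why the launcher could starve its child job.
# am_memory_mb and map_memory_mb are ASSUMED Hadoop defaults, not values
# taken from the yarn-site.xml above -- verify them on the sandbox.
nm_memory_mb = 8192   # yarn.nodemanager.resource.memory-mb (from yarn-site.xml)
nm_vcores = 2         # the VM's 2 vcores
am_memory_mb = 1536   # yarn.app.mapreduce.am.resource.mb (assumed default)
map_memory_mb = 1024  # mapreduce.map.memory.mb (assumed default)

# While the launcher waits for the Hive query, it holds its AM + its map task.
held_mb = am_memory_mb + map_memory_mb
held_vcores = 1 + 1   # 1 vcore each for AM and map task (assumed defaults)

remaining_mb = nm_memory_mb - held_mb
remaining_vcores = nm_vcores - held_vcores
print(remaining_mb)      # plenty of memory left for the Hive AM
print(remaining_vcores)  # no vcore left if the scheduler enforces vcores
```

If that math holds, memory is not the bottleneck on this VM; the two vcores are, since the launcher's two containers would consume both before the Hive AM is placed.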

 

 

Can someone point me to what's wrong? I know that if I increase resources it would finish, right? But given the preceding information, how can I optimize my VM? The job being submitted is very simple, just Hive => Hive. For testing/learning purposes I don't mind applications executing sequentially rather than in parallel (if that helps!)
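On the sequential-execution idea, one thing I've been considering: the FairScheduler allocations file can cap concurrent applications per queue, but since the Oozie launcher and the Hive query it spawns are two separate YARN applications, the cap has to leave room for both, otherwise the launcher waits forever on a child that can never start. A sketch of what I mean (the file path comes from yarn.scheduler.fair.allocation.file; the queue name and value are assumptions to verify on the sandbox):

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml (the file named by yarn.scheduler.fair.allocation.file) -->
<allocations>
  <queue name="mapr">
    <!-- Allow 2 concurrent apps: the Oozie launcher plus the job it spawns.
         Setting this to 1 would deadlock, because the launcher is itself an
         application and blocks while waiting for its child to finish. -->
    <maxRunningApps>2</maxRunningApps>
  </queue>
</allocations>
```

Would a cap like this be the right way to keep the sandbox from over-committing, or is tuning container sizes the better approach?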

 

 

Thanks,
Issam

 
