
Oozie: Workflow job getting killed

Question asked by asingh on Jan 3, 2013
When the coordinator kicks off the workflow, the workflow gets killed with the following error:

    2013-01-03 09:04:27,234 DEBUG MaprJobClient:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Making jobClient call
    2013-01-03 09:04:27,335 DEBUG MaprJobClient:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] jobClient call is successful
    2013-01-03 09:04:27,336 DEBUG HiveActionExecutor:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Action check is done after submission
    2013-01-03 09:04:27,337  WARN ActionStartXCommand:539 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] [***0000227-121127101924844-oozie-root-W@hive-add-partition***]Action status=RUNNING
    2013-01-03 09:04:27,342  WARN ActionStartXCommand:539 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] [***0000227-121127101924844-oozie-root-W@hive-add-partition***]Action updated in DB!
    2013-01-03 09:04:27,342 DEBUG ActionStartXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] ENDED ActionStartXCommand for wf actionId=0000227-121127101924844-oozie-root-W@hive-add-partition, jobId=0000227-121127101924844-oozie-root-W
    2013-01-03 09:04:27,342 DEBUG ActionStartXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Queuing [1] commands with delay [0]ms
    2013-01-03 09:04:27,342 DEBUG ActionStartXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Released lock for [0000227-121127101924844-oozie-root-W] in [action.start]
    2013-01-03 09:04:56,312 DEBUG CallbackServlet:542 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Received a CallbackServlet.doGet() with query string id=0000227-121127101924844-oozie-root-W@hive-add-partition&status=SUCCEEDED&
    2013-01-03 09:04:56,313  INFO CallbackServlet:536 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] callback for action [0000227-121127101924844-oozie-root-W@hive-add-partition]
    2013-01-03 09:04:56,315 DEBUG CompletedActionXCommand:542 - USER[-] GROUP[-] TOKEN[] APP[-] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Load state for [null]
    2013-01-03 09:04:56,316 DEBUG CompletedActionXCommand:542 - USER[-] GROUP[-] TOKEN[] APP[-] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Precondition check for command [callback] key [null]
    2013-01-03 09:04:56,316 DEBUG CompletedActionXCommand:542 - USER[-] GROUP[-] TOKEN[] APP[-] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Execute command [callback] key [null]
    2013-01-03 09:04:56,316 DEBUG CompletedActionXCommand:542 - USER[-] GROUP[-] TOKEN[] APP[-] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Queuing [1] commands with delay [0]ms
    2013-01-03 09:04:56,320 DEBUG ActionCheckXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Acquired lock for [0000227-121127101924844-oozie-root-W] in [action.check]
    2013-01-03 09:04:56,321 DEBUG ActionCheckXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Load state for [0000227-121127101924844-oozie-root-W]
    2013-01-03 09:04:56,321 DEBUG ActionCheckXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Precondition check for command [action.check] key [0000227-121127101924844-oozie-root-W]
    2013-01-03 09:04:56,321 DEBUG ActionCheckXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Execute command [action.check] key [0000227-121127101924844-oozie-root-W]
    2013-01-03 09:04:56,321 DEBUG ActionCheckXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] STARTED ActionCheckXCommand for wf actionId=0000227-121127101924844-oozie-root-W@hive-add-partition priority =2
    2013-01-03 09:04:56,347 DEBUG MaprJobClient:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Making jobClient call
    2013-01-03 09:04:56,353 DEBUG MaprJobClient:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] jobClient call is successful
    2013-01-03 09:04:56,369  INFO HiveActionExecutor:536 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] action completed, external ID [job_201212211214_0236]
    2013-01-03 09:04:56,384  WARN HiveActionExecutor:539 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [11]
    2013-01-03 09:04:56,384 DEBUG MaprJobClient:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Making jobClient call
    2013-01-03 09:04:56,485 DEBUG MaprJobClient:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] jobClient call is successful
    2013-01-03 09:04:56,505 DEBUG ActionCheckXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] ENDED ActionCheckXCommand for wf actionId=0000227-121127101924844-oozie-root-W@hive-add-partition, jobId=0000227-121127101924844-oozie-root-W
    2013-01-03 09:04:56,506 DEBUG ActionCheckXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Queuing [1] commands with delay [0]ms
    2013-01-03 09:04:56,506 DEBUG ActionCheckXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Released lock for [0000227-121127101924844-oozie-root-W] in [action.check]
    2013-01-03 09:04:56,509 DEBUG ActionEndXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Precondition check for command [action.end] key [0000227-121127101924844-oozie-root-W]
    2013-01-03 09:04:56,509 DEBUG ActionEndXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Execute command [action.end] key [0000227-121127101924844-oozie-root-W]
    2013-01-03 09:04:56,509 DEBUG ActionEndXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] STARTED ActionEndXCommand for action 0000227-121127101924844-oozie-root-W@hive-add-partition
    2013-01-03 09:04:56,510 DEBUG ActionEndXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] End, name [hive-add-partition] type [hive] status[DONE] external status [FAILED/KILLED] signal value [null]
    2013-01-03 09:04:56,539  INFO ActionEndXCommand:536 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] ERROR is considered as FAILED for SLA
    2013-01-03 09:04:56,540 DEBUG ActionEndXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Queuing commands for action=0000227-121127101924844-oozie-root-W@hive-add-partition, status=ERROR, Set pending=true
    2013-01-03 09:04:56,545 DEBUG ActionEndXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] ENDED ActionEndXCommand for action 0000227-121127101924844-oozie-root-W@hive-add-partition
    2013-01-03 09:04:56,545 DEBUG ActionEndXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Queuing [2] commands with delay [0]ms
    2013-01-03 09:04:56,545 DEBUG ActionEndXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Released lock for [0000227-121127101924844-oozie-root-W] in [action.end]
    2013-01-03 09:04:56,546 DEBUG SignalXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Acquired lock for [0000227-121127101924844-oozie-root-W] in [signal]
    2013-01-03 09:04:56,546 DEBUG SignalXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Load state for [0000227-121127101924844-oozie-root-W]
    2013-01-03 09:04:56,548 DEBUG SignalXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Precondition check for command [signal] key [0000227-121127101924844-oozie-root-W]
    2013-01-03 09:04:56,548 DEBUG SignalXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Execute command [signal] key [0000227-121127101924844-oozie-root-W]
    2013-01-03 09:04:56,548 DEBUG SignalXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] STARTED SignalCommand for jobid=0000227-121127101924844-oozie-root-W, actionId=0000227-121127101924844-oozie-root-W@hive-add-partition
    2013-01-03 09:04:56,549 DEBUG LiteWorkflowInstance:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Signaling job execution path [/] signal value [ERROR]
    2013-01-03 09:04:56,549 DEBUG LiteWorkflowInstance:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Exiting node [hive-add-partition] with transition[/#fail]
    2013-01-03 09:04:56,549 DEBUG LiteWorkflowInstance:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Signaling job execution path [/] signal value [::synch::]
    2013-01-03 09:04:56,549 DEBUG LiteWorkflowInstance:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Completing job, kill node [fail]
    2013-01-03 09:04:56,563 DEBUG SignalXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Try to resolve KillNode message for jobid [0000227-121127101924844-oozie-root-W], actionId [0000227-121127101924844-oozie-root-W@hive-add-partition], before resolve [Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]], after resolve [Hive failed, error message[Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [11]]]
    2013-01-03 09:04:56,571 DEBUG SignalXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Updated the workflow status to 0000227-121127101924844-oozie-root-W  status =KILLED
    2013-01-03 09:04:56,607 DEBUG SignalXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] ENDED SignalCommand for jobid=0000227-121127101924844-oozie-root-W, actionId=0000227-121127101924844-oozie-root-W@hive-add-partition
    2013-01-03 09:04:56,607 DEBUG SignalXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Queuing [1] commands with delay [0]ms
    2013-01-03 09:04:56,607 DEBUG SignalXCommand:542 - USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] JOB[0000227-121127101924844-oozie-root-W] ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] Released lock for [0000227-121127101924844-oozie-root-W] in [signal]
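
The excerpt above is long, but only a few lines carry the diagnosis: the external Hadoop job ID of the launcher, the `Launcher ERROR` reason, and the final workflow status. As a rough sketch (the function name `summarize` and the regular expressions are my own, written against the line shapes in this excerpt, not part of Oozie), a short Python script can pull those fields out of a saved copy of the log:

```python
import re

# Patterns are assumptions based on the log lines in this post,
# not an official Oozie log format specification.
PATTERNS = {
    "external_id": re.compile(r"external ID \[([^\]]+)\]"),
    "launcher_error": re.compile(r"Launcher ERROR, reason: (.*)$"),
    "workflow_status": re.compile(r"Updated the workflow status to \S+\s+status =(\w+)"),
}


def summarize(lines):
    """Return the last value seen for each field, or None if it never appears."""
    summary = {key: None for key in PATTERNS}
    for line in lines:
        for key, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                summary[key] = match.group(1).strip()
    return summary


# Three lines rebuilt verbatim from the excerpt; CTX is the shared context prefix.
CTX = (
    "USER[asingh2] GROUP[users] TOKEN[] APP[hive-add-partition-wf] "
    "JOB[0000227-121127101924844-oozie-root-W] "
    "ACTION[0000227-121127101924844-oozie-root-W@hive-add-partition] "
)
SAMPLE = [
    "2013-01-03 09:04:56,369  INFO HiveActionExecutor:536 - " + CTX
    + "action completed, external ID [job_201212211214_0236]",
    "2013-01-03 09:04:56,384  WARN HiveActionExecutor:539 - " + CTX
    + "Launcher ERROR, reason: Main class "
    + "[org.apache.oozie.action.hadoop.HiveMain], exit code [11]",
    "2013-01-03 09:04:56,571 DEBUG SignalXCommand:542 - " + CTX
    + "Updated the workflow status to 0000227-121127101924844-oozie-root-W"
    + "  status =KILLED",
]

if __name__ == "__main__":
    for field, value in summarize(SAMPLE).items():
        print(f"{field}: {value}")
```

For this log the script reports launcher job `job_201212211214_0236`, the `HiveMain` exit code `[11]` reason, and status `KILLED`. The Oozie server log only records that `HiveMain` exited with a non-zero code; the underlying Hive error message is most likely in the task logs (stdout/stderr) of that launcher job, so that job ID is the next place to look.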
