
Is it possible to run an Oozie Spark Action without specifying inputDir & outputDir


    According to https://oozie.apache.org/docs/3.3.1/WorkflowFunctionalSpec.html#a4.1_Workflow_Job_Properties_or_Parameters , the documentation says:

    When submitting a workflow job for the workflow definition above, 3 workflow job properties must be specified:

        jobTracker:
        inputDir:
        outputDir:
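
    For context, those three properties are only required because the spec's example workflow definition references them as EL parameters. An abbreviated sketch of that pattern (reconstructed from memory, not quoted verbatim from the spec) looks roughly like this:

        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.input.dir</name>
                    <value>${inputDir}</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>${outputDir}</value>
                </property>
            </configuration>
        </map-reduce>

    In other words, the requirement comes from that particular definition using ${inputDir} and ${outputDir}, not from Oozie demanding those names in every workflow.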

    I have a PySpark script that specifies its input and output locations in the script itself. I don't need or want an inputDir and outputDir in my workflow XML. When I run the PySpark script via Oozie, I get these warnings:

        WARN ParameterVerifier:523 - SERVER[] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] The application does not define formal parameters in its XML definition
        WARN JobResourceUploader:64 - SERVER[] Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
        2018-05-24 11:52:29,844 WARN JobResourceUploader:171 - SERVER[] No job jar file set. User classes may not be found. See Job or Job#setJar(String).
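
    For illustration, a Spark action of this kind, where the script handles its own paths and no ${inputDir}/${outputDir} appear anywhere in the XML, might look roughly like the sketch below (the workflow name, script name and schema versions are made up, not taken from my actual setup):

        <workflow-app name="pyspark-wf" xmlns="uri:oozie:workflow:0.5">
            <start to="spark-node"/>
            <action name="spark-node">
                <spark xmlns="uri:oozie:spark-action:0.1">
                    <job-tracker>${jobTracker}</job-tracker>
                    <name-node>${nameNode}</name-node>
                    <master>yarn-cluster</master>
                    <name>my-pyspark-job</name>
                    <jar>my_script.py</jar>
                </spark>
                <ok to="end"/>
                <error to="fail"/>
            </action>
            <kill name="fail"><message>Spark action failed</message></kill>
            <end name="end"/>
        </workflow-app>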

    Based on https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/util/ParameterVerifier.java , my first warning is caused by the fact that I don't have an "inputDir" (or any other formal parameter declared):

        else {
            // Log a warning when the <parameters> section is missing
            XLog.getLog(ParameterVerifier.class).warn("The application does not define formal parameters in its XML "
                    + "definition");
        }
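
    If that is all the warning is about, one direction I could try (untested; the parameter name and default value below are arbitrary) is declaring a throwaway formal parameter with a default, so the <parameters> section exists without ever mentioning inputDir or outputDir:

        <workflow-app name="pyspark-wf" xmlns="uri:oozie:workflow:0.5">
            <parameters>
                <property>
                    <name>dummyParam</name>
                    <value>unused</value>
                </property>
            </parameters>
            <!-- start node, spark action, kill and end nodes as before -->
        </workflow-app>

    That said, it is only a WARN-level message, so perhaps it can simply be ignored.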

    Can I get around this at all?

