
Issue running a Hive UDF written in Python

Question asked by Karthee on Nov 29, 2017
Latest reply on Dec 11, 2017 by madumoulin

Hi there,


Our team has written a Hive UDF in Python for tables with several million rows, but when we run it in the Hive shell it throws the error below (please find the Hive log and YARN application log attached):




Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20001]: An error occurred while reading or writing to your custom script. It may have crashed with an error.
    at org.apache.hadoop.hive.ql.exec.ScriptOperator.process(...)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(...)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(...)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(...)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(...)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(...)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(...)
    ... 9 more
Caused by: Broken pipe
    at ... (Native Method)
    at org.apache.hadoop.hive.ql.exec.TextRecordWriter.write(...)
    at org.apache.hadoop.hive.ql.exec.ScriptOperator.process(...)
    ... 15 more


1. However, if we limit the query result in the Hive script to 20K or 30K rows (e.g. LIMIT 30000), it executes successfully.

2. I have also tried both the MapReduce and Tez 0.8 execution engines, but the result is the same.


My Hive script:


add FILE ///home/mapr/;

set hive.vectorized.execution.enabled=false;
set hive.vectorized.execution.reduce.enabled=false;

SELECT TRANSFORM(account_internal_seq_no, fst_trans_mnth, current_trans_mnth, trans_profile_map, mnths_in_areers_map)
USING 'python'
AS (isn, str_history_date_index, str_pp_mia, str_pp_hist, str_pp_amnesty)
FROM test.pp_strings;


The Python script is attached as well.
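For anyone who cannot open the attachment: the script follows the usual Hive TRANSFORM streaming pattern, roughly along the lines of the sketch below (the real per-column logic on the map columns is more involved, and the column handling here is only a placeholder):

import sys

# Hive streams each input row to the script as tab-separated columns on stdin
for line in sys.stdin:
    cols = line.rstrip('\n').split('\t')
    isn, fst_mnth, cur_mnth, profile_map, mia_map = cols

    # Placeholder for the actual derivation of the five output strings
    out_cols = [isn, cur_mnth, mia_map, profile_map, '']

    # Hive reads each output row back as tab-separated columns from stdout
    print('\t'.join(out_cols))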



I would really appreciate your assistance in sorting out this issue!