
Error on writing Hive Table - Text to ORC (CTAS)

Question asked by Karthee on Sep 22, 2017
Latest reply on Oct 10, 2017 by Murshid Chalaev

Hi Team,

 

My Environment:
          MapR 5.2
          Hive 2.1
          Tez 0.8
          Ab Initio GDE 3.2.7

 

We have a problem writing a text-format Hive table into an ORC table via CTAS. The data set is around 370 GB. We can't write directly into the Hive table (ORC format) from Ab Initio, so we first write into a text-format Hive table and then run a CTAS into the ORC table, as shown below.
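For reference, the CTAS looks roughly like this (the table names below are placeholders, not our real ones):

CREATE TABLE my_table_orc
STORED AS ORC
AS SELECT * FROM my_table_text;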

 

When I run the CTAS, I get this error:

 

Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
Vertex failed, vertexName=Map 1, vertexId=vertex_1505976130946_0012_2_00, diagnostics=[Task failed, taskId=task_1505976130946_0012_2_00_000006, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1505976130946_0012_2_00_000006_0:org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/attempt_1505976130946_0012_2_00_000006_0_10003_0/file.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:403)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
    at

 

This is node1's disk space usage:

 

Filesystem                                Size  Used  Avail  Use%  Mounted on
/dev/mapper/vg_local-lv_root              9.8G  5.4G   3.9G   59%  /
tmpfs                                     127G     0   127G    0%  /dev/shm
/dev/md0                                  477M  100M   352M   22%  /boot
/dev/sda1                                 500M  280K   500M    1%  /boot/efi
/dev/mapper/vg_local-lv_home              9.8G  2.3G   7.0G   25%  /home
/dev/mapper/vg_local-lv_opt                35G   26G   7.0G   79%  /opt
/dev/mapper/vg_local-lv_zkdata            2.0G   18M   1.8G    1%  /opt/mapr/zkdata
/dev/mapper/vg_local-lv_tmp                20G   18G   888M   96%  /tmp
/dev/mapper/vg_local-lv_var               9.8G  3.9G   5.4G   42%  /var
/dev/mapper/vg_local-lv_opt_abinitio       99G   24G    70G   26%  /opt/abinitio
/dev/mapper/vg_local-lv_opt_work           99G  8.1G    86G    9%  /opt/work
/dev/mapper/vg_local-lv_opt_transient     2.0T  214G   1.7T   12%  /opt/transient
/dev/mapper/vg_local-lv_opt_persistence    20G   70M    19G    1%  /opt/persistence
/dev/mapper/vg_local-lv_oracle            3.9G  1.4G   2.3G   39%  /oracle/product/11.2.0/client
localhost:/mapr                           100G     0   100G    0%  /mapr

 

When I run this job, the /tmp mount fills to 96%, and the jobs fail.

And I still have to load a few more tables with around 2 TB of data!

Is there any way to use the HDFS space for the /tmp directory, or is there another workaround to avoid this issue?
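From what I can tell, the DiskChecker error comes from Tez writing its intermediate output to the NodeManager local directories, which on this cluster sit under /tmp. If that's right, I imagine the workaround is something like the following in yarn-site.xml; yarn.nodemanager.local-dirs is the standard YARN property, but /opt/transient/nm-local-dir is only my guess at a sensible target, since /opt/transient is our largest local mount:

<property>
  <!-- point container spill/shuffle space at a larger mount instead of /tmp -->
  <name>yarn.nodemanager.local-dirs</name>
  <value>/opt/transient/nm-local-dir</value>
</property>

Would that be the right knob on MapR, or is there a MapR-specific setting?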

 

Please find the hiveserver2 log attached.

 

Please advise me on this.

 

Thanks,

Karthi
