
How does 'SKEWED BY [..] STORED AS DIRECTORIES' function in Hive?

Question asked by jeet23 on Feb 23, 2016
Latest reply on May 16, 2016 by Hao Zhu

I was practicing the skewed storage functionality in Hive and created the table below with STORED AS DIRECTORIES:

 

CREATE EXTERNAL TABLE SKE_TRY (ID INT, NAME STRING, DPTNAME STRING, LOC STRING)
SKEWED BY (LOC) ON ('AUS', 'US') STORED AS DIRECTORIES
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
LOCATION '/try/SSKW_DIR';

 

I then loaded the data using: INSERT OVERWRITE TABLE SKE_TRY SELECT ID, NAME, DPTNAME, LOCATION FROM PAR_BUCK_SKW;

 

Initially the load failed with "Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask". I then did some research and found that it could be caused by a restriction on creating subdirectories. After I set the property below, the INSERT ran without errors:
SET hive.mapred.supports.subdirectories=true;

 

But when I queried the table, no rows were returned. Also, when I browsed the file system at the location given when creating the table, I could see a directory 'HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME' under '/try/SSKW_DIR'.
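
For anyone trying to reproduce this, the inspection described above can be done from the Hive CLI. This is only a minimal sketch, assuming the table name and location from the CREATE statement; it shows what is on disk and what Hive has recorded for the table, and it does not by itself explain the empty result.

```sql
-- Run inside the Hive CLI session.
-- Recursively list everything under the table location
-- (path taken from the CREATE EXTERNAL TABLE statement).
dfs -ls -R /try/SSKW_DIR;

-- Print the table metadata, including the skewed column (LOC)
-- and the skewed values ('AUS', 'US') recorded for the table.
DESCRIBE FORMATTED SKE_TRY;
```

Comparing the subdirectories in the listing with the skewed values in the metadata shows whether rows were written into per-value directories or only into the default list-bucketing directory.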

 

Could you please explain what the issue could be here, and what the expected output should be?
