
finding-specific-part-file-which-gives-drops

Question asked by simanchal.maharana on Jan 23, 2018
Latest reply on Jan 23, 2018 by cathy

Hi,

I am uploading a huge amount of XML data in part-file format (the output of MapReduce/Hive jobs) to a MarkLogic database using a MapR MapReduce job. Due to some cluster or network issue, a few records (5, 10, 50, or 100 out of 20 million) fail to upload. Because of this, I have to upload all 20 million records again, which is very time-consuming: we lose another 2 to 3 hours each time.

 

I want to find the particular split files/part files from which records were missed, so that I can re-ingest only those part files instead of the whole 20 to 30 million records. How can I find those specific part files?
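
To make the goal concrete, here is a minimal sketch of the comparison I have in mind. It assumes each record carries a unique ID that can be read both from the part files and from the database after the load; the function name `find_incomplete_parts`, the part-file names, and the IDs are hypothetical, and how the list of loaded IDs is exported from MarkLogic is not shown.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def find_incomplete_parts(
    part_records: Dict[str, Iterable[str]],
    loaded_ids: Set[str],
) -> Dict[str, List[str]]:
    """Return {part_file: [missing record IDs]} for part files with gaps.

    part_records maps each part-file name to the record IDs it contains
    (assumed to be extracted beforehand, e.g. while parsing the XML).
    loaded_ids is the set of record IDs confirmed present in the database.
    """
    missing = defaultdict(list)
    for part_name, record_ids in part_records.items():
        for record_id in record_ids:
            if record_id not in loaded_ids:
                missing[part_name].append(record_id)
    return dict(missing)


# Hypothetical sample data, for illustration only.
part_records = {
    "part-00000": ["doc-1", "doc-2", "doc-3"],
    "part-00001": ["doc-4", "doc-5"],
    "part-00002": ["doc-6"],
}
loaded_ids = {"doc-1", "doc-2", "doc-3", "doc-4", "doc-6"}

incomplete = find_incomplete_parts(part_records, loaded_ids)
print(incomplete)  # only part files that need re-ingesting appear here
```

With a result like this, only the part files listed as keys would need to be re-ingested.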

 

Could you please help me with this?

 

Thanks a lot for your help.
