SEEKING tips & best practices for validating data while using ImportTsv...

Question asked by markvogt on Jul 27, 2017

WHAT are some current tips & best practices for validating data on the fly (and for handling rows deemed "invalid") when using ImportTsv to bulk-load data into tables?


MULTIPLE passes of ImportTsv would be expected, but does that mean the table has to stay in "import" mode until all of the incoming data has been deemed "valid", allowed through the map/reduce step, and ultimately "put" into the HBase (MapR-DB) table?
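
To make the kind of validation being asked about concrete, here is a minimal sketch of a pre-filter that splits TSV lines into "valid" and "rejected" sets before any ImportTsv pass. It is only an illustration, not an ImportTsv feature: the three-column layout, the rule that the last column must be an integer, and the names `EXPECTED_COLUMNS`, `is_valid` and `split_rows` are all assumptions made up for this example.

```python
EXPECTED_COLUMNS = 3  # hypothetical layout: row key plus two mapped columns


def is_valid(fields):
    """Hypothetical rules: correct column count, non-empty row key,
    and an integer in the last column."""
    if len(fields) != EXPECTED_COLUMNS:
        return False
    if not fields[0].strip():
        return False
    try:
        int(fields[2])
    except ValueError:
        return False
    return True


def split_rows(lines):
    """Return (valid, rejected) lists of TSV lines with newlines stripped."""
    valid, rejected = [], []
    for line in lines:
        row = line.rstrip("\n")
        fields = row.split("\t")
        (valid if is_valid(fields) else rejected).append(row)
    return valid, rejected


if __name__ == "__main__":
    sample = ["r1\ta\t10\n", "\tb\t5\n", "r3\tc\tx\n", "r4\td\n"]
    good, bad = split_rows(sample)
    print("valid:", good)
    print("rejected:", bad)
```

With something like this, only the "valid" lines would be handed to ImportTsv, and the "rejected" lines could be corrected and re-submitted in a later pass. Whether that kind of staging is actually the recommended practice is exactly what is being asked here.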