I am running a StreamSets pipeline that is set to impersonate a MapR user, "myuser", and uses cluster batch mode to move data to a MapR volume with auditing enabled. The pipeline validates and runs successfully, but when I use Drill Explorer to check the output of mapr expandaudit for the volumes that were just written to, I see:
Notice that user="mapr", not "myuser", yet uid=5001, which is the uid of "myuser". Furthermore, when I run a direct query like:
SELECT * FROM `dfs`.`root`.`./expandaudit_dir/some_audited_vol/38230597` where uid='5001'
no row has user="myuser"; every row shows the mapr user. This seems very odd to me. Does anyone have an explanation for why this is happening and how to fix it?
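For anyone who wants to reproduce the check outside Drill, here is a minimal sketch of the same tally. It assumes the expandaudit output is newline-delimited JSON with "user" and "uid" fields named as they appear in Drill Explorer. The function name tally_user_uid and the sample records are made up for illustration and only mirror the values described above; they are not copied from the real audit output.

```python
import json
from collections import Counter


def tally_user_uid(lines):
    """Count (user, uid) pairs across JSON audit records, one record per line."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # uid is normalised to a string so numeric and quoted uids compare equal
        counts[(record.get("user"), str(record.get("uid")))] += 1
    return counts


# Hypothetical records mirroring the symptom: user is "mapr", uid is 5001.
sample = [
    '{"operation": "CREATE", "user": "mapr", "uid": 5001}',
    '{"operation": "WRITE", "user": "mapr", "uid": 5001}',
]

for (user, uid), n in tally_user_uid(sample).items():
    print(f"user={user} uid={uid} count={n}")
```

If the real files show only the ("mapr", "5001") pair, that matches what the Drill query returns, so the mismatch would be in the audit records themselves and not in how Drill reads them.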
Note: this happens whether I run the StreamSets pipeline directly from the StreamSets web UI or from a script (run as user mapr) that uses the StreamSets CLI tool to start the pipeline.