
Drill not "detecting"/being able to query on MapR expandaudit fields (user for streamsets hadoop impersonation)

Question asked by reedv on Jan 24, 2018
Latest reply on Feb 14, 2018 by maprcommunity

I'm trying to query audit log JSON files generated by MapR's expandaudit program, but when I run something like

SELECT *
FROM `dfs`.`root`.`./my_audit_logs/some_audited_vol/242165955`
where user='myuser'
limit 1000

in Drill Explorer, I get no results, even though I can see events logged with user=myuser in Drill Explorer's browse mode. The query works fine when filtering on user=mapr, the MapR admin. The same problem occurs when running the query in a drillbit web UI.

 

However, querying on uid does give some results. myuser's uid is 5001 (mapr's uid is the default 5000), and when I query on uid=5001, I see a few results where the user field says "mapr" but the uid is 5001, which seems strange. It may be related to the fact that I am using the mapr user to start a script that activates a StreamSets pipeline (using the StreamSets pipeline manager CLI tool), and that pipeline uses Hadoop impersonation as the user myuser. Still, I would expect to see many write operations in the audit logs attributed to the impersonated user myuser because of the running pipeline, and certainly not this uid mixup.
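One way to check the user/uid pairing outside of Drill is to tally the pairs directly from the expanded JSON. The following is only a minimal sketch: it assumes one JSON record per line with "user" and "uid" fields as described above, and the sample records (including the "operation" values) are hypothetical, not real expandaudit output.

```python
import json
from collections import Counter


def tally_user_uid(lines):
    """Count (user, uid) pairs across JSON audit records, one record per line."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        counts[(record.get("user"), record.get("uid"))] += 1
    return counts


# Hypothetical sample records; real expanded audit records carry more fields.
sample = [
    '{"operation": "WRITE", "uid": 5000, "user": "mapr"}',
    '{"operation": "WRITE", "uid": 5001, "user": "mapr"}',
    '{"operation": "CREATE", "uid": 5001, "user": "myuser"}',
]

counts = tally_user_uid(sample)
for (user, uid), n in sorted(counts.items(), key=lambda kv: str(kv[0])):
    print(user, uid, n)
```

Any pair where the name and the uid disagree (such as "mapr" with 5001 in the sample) would show up as its own row, which makes it easier to see how often the mixup occurs.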

 

Furthermore, expandaudit creates a directory structure of the form audited_vol/<timestamp>/<nodename>/<some other timestamp?>/<all of the audit files>. When I browse audited_vol/<timestamp> from Drill Explorer, I can see entries that have user=myuser, but when I browse through the individual audited_vol/<timestamp>/<nodename>/<some other timestamp?>/<all of the audit files> audit logs for all of the nodes, I can't find those same entries.
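To rule out a browsing mistake, the per-node files could be searched programmatically rather than by hand. This is a rough sketch under a few assumptions: the expanded audit directory is reachable as a local path (for example through an NFS mount), each file holds one JSON record per line, and records carry a "user" field. The function name find_user_records is made up for illustration.

```python
import json
import os


def find_user_records(root, user):
    """Yield (path, record) for each JSON record under root whose "user" matches."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for line in fh:
                    line = line.strip()
                    if not line:
                        continue
                    try:
                        record = json.loads(line)
                    except json.JSONDecodeError:
                        continue  # skip lines that are not valid JSON
                    if isinstance(record, dict) and record.get("user") == user:
                        yield path, record


# Usage (the path below is a placeholder, not a real location):
# for path, record in find_user_records("/path/to/audited_vol/<timestamp>", "myuser"):
#     print(path, record.get("uid"))
```

Printing the file path next to each hit shows which node directory, if any, actually contains the user=myuser entries seen at the audited_vol/<timestamp> level.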

 

So I'm confused about what's going on here. Am I missing something (like how mapr's auditing handles user and uid fields for FS operations)? Does anyone know what could be going on here? Thanks.
