Can I set up an hourly cron job that deletes the contents of /tmp/hadoop-mapr,
i.e. these two directories: "nm-local-dir" and "yarn-nm-recovery"?
Will this affect the functionality of the cluster or of running jobs?
Any running process should hold a lock on the files it is using, so as long as you don't use some of the more advanced options of 'rm', you should be fine deleting the data in there. There are some pretty good scripts out there too. See: Should i need to cleaning up of tmp space in hadoop cluster on weekly basis ? if yes how can i do it? please suggest - H…
To be safe, you can set up a bash script that removes only files older than a certain age, and then call that script from the cron job.
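Here is a minimal sketch of such a script, not a tested production tool. The function name clean_old_files, the 1440-minute (24-hour) threshold, and the idea of passing the directories as arguments are all assumptions made for illustration. The two paths in the commented-out call come from the question.

```shell
#!/usr/bin/env bash
# Sketch only: delete regular files older than a given age under the given directories.
# The function name, the age threshold, and the example paths are illustrative assumptions.

clean_old_files() {
  local max_age_min="$1"   # age threshold in minutes
  shift
  local dir
  for dir in "$@"; do
    # Skip anything that is not an existing directory.
    [ -d "$dir" ] || continue
    # -type f: regular files only; -mmin +N: last modified more than N minutes ago.
    find "$dir" -type f -mmin +"$max_age_min" -delete
  done
}

# Example invocation (uncomment to use; paths taken from the question):
# clean_old_files 1440 /tmp/hadoop-mapr/nm-local-dir /tmp/hadoop-mapr/yarn-nm-recovery
```

If this were saved as, say, /usr/local/bin/clean_hadoop_tmp.sh (a hypothetical path) with the last line uncommented, a crontab entry such as "0 * * * * /usr/local/bin/clean_hadoop_tmp.sh" would run it at the top of every hour. Filtering by modification time means files written recently, which are the ones most likely to belong to running jobs, are left alone.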
What is the main motivation for removing files under /tmp/nm-local-dir and /tmp/yarn-nm-recovery?
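If you are worried about storage space in /tmp, check the property yarn.nodemanager.delete.debug-delay-sec. It sets how many seconds the NodeManager waits after an application finishes before deleting that application's localized files and logs, so it might help you with "/tmp/nm-local-dir" without a cron job.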