We have a use case to check the daily data-load size in a Hadoop cluster. Is there any option to check this in Grafana, or any other option?
I found this discussion thread on Quora: https://www.quora.com/How-do-you-load-data-into-a-Hadoop-cluster
Hope it is helpful.
If you're loading batch files, or even streaming, for a known set of processes, you could keep a record count for each file loaded. Then store that metadata in, let's say, MapRDB, as the load process's stats, so the information is captured at ingest time. Streaming would work the same way: you track a daily count of ingested data.
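As a rough sketch of that idea: the snippet below keeps a per-process, per-day record count as each file lands. MapRDB isn't available here, so an in-memory SQLite table stands in for the metadata store, and the process name and counts are made up for illustration; the same upsert pattern would apply to a MapRDB/HBase put.

```python
import sqlite3
from datetime import date

# SQLite stands in for MapRDB here so the sketch is self-contained.
# One row per (day, load process), holding the running record count.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE load_stats ("
    "  load_date TEXT, process TEXT, record_count INTEGER,"
    "  PRIMARY KEY (load_date, process))"
)

def record_load(process, record_count, load_date=None):
    """Add one file's record count to the day's total for a load process."""
    d = load_date or date.today().isoformat()
    conn.execute(
        "INSERT INTO load_stats (load_date, process, record_count) "
        "VALUES (?, ?, ?) "
        "ON CONFLICT(load_date, process) DO UPDATE "
        "SET record_count = record_count + excluded.record_count",
        (d, process, record_count),
    )

# Two batch files land for the same (hypothetical) process on the same day:
record_load("daily_sales_ingest", 1200, "2024-01-15")
record_load("daily_sales_ingest", 800, "2024-01-15")

total = conn.execute(
    "SELECT record_count FROM load_stats "
    "WHERE load_date = ? AND process = ?",
    ("2024-01-15", "daily_sales_ingest"),
).fetchone()[0]
print(total)  # 2000
```

Once the counts live in a table like this, a dashboard tool such as Grafana can chart the daily totals directly from the store.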
If you know the name(s) of the load processes, you could go back to the job logs, where you should be able to fetch the job counters, then build up the counts from those and store them in a store like MapRDB.
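A minimal sketch of pulling counters back out of a job log: the excerpt below imitates the counter lines a MapReduce job client prints ("Map input records=...", "Bytes Read=..."), but the job ID and values are invented for illustration. In practice you could also query counters for a known job with the `mapred job -counter` CLI instead of parsing logs.

```python
import re

# Hypothetical excerpt of a MapReduce job-client log; counter lines follow
# the "name=value" shape the client prints, the values are made up.
log_excerpt = """\
INFO mapreduce.Job: Job job_1700000000000_0042 completed successfully
INFO mapreduce.Job: Counters: 49
\tMap-Reduce Framework
\t\tMap input records=125000
\t\tMap output records=125000
\t\tReduce output records=124800
\tFile Input Format Counters
\t\tBytes Read=734003200
"""

def extract_counters(text):
    """Collect indented 'name=value' counter lines from a job log."""
    counters = {}
    for m in re.finditer(r"^\s+([A-Za-z][A-Za-z ]*?)=(\d+)$", text, re.MULTILINE):
        counters[m.group(1)] = int(m.group(2))
    return counters

counters = extract_counters(log_excerpt)
print(counters["Map input records"])  # 125000
print(counters["Bytes Read"])         # 734003200
```

Summing a counter like "Bytes Read" across the day's jobs for a given process gives you the daily load size to store alongside the record counts.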
Retrieving data ...