Physically remove data from M7 Table (e.g. TTL)

Question asked by oldsql on Aug 1, 2014
Latest reply on Aug 5, 2014 by snelson
This use case involves TTL, but it also applies to normal row deletion. For example, assume that for one M7 table the TTL for a column family is set to 30 days, and 30 days have passed since data was last ingested into HBase. At this point, `get` or `scan` should return nothing, and indeed they do. However, the data is not physically removed, which is a big issue for both performance and storage, considering that this is several TB of data per day.
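To make the scenario concrete, here is a toy sketch of the behavior described above: expired cells are hidden from reads but still occupy storage until something rewrites the data. `ToyStore` and all of its methods are hypothetical and exist only for illustration; this is not the M7 or HBase API, and it assumes nothing about how M7 actually implements TTL.

```python
TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days, matching the column family TTL in the example


class ToyStore:
    """Hypothetical in-memory stand-in for a table; not the M7 or HBase API."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.cells = {}  # row key -> (value, write timestamp in seconds)

    def put(self, key, value, ts):
        self.cells[key] = (value, ts)

    def scan(self, now):
        # Expired cells are filtered out at read time,
        # but every stored cell is still visited to decide that.
        return {k: v for k, (v, ts) in self.cells.items() if now - ts < self.ttl}

    def stored_cells(self):
        return len(self.cells)

    def compact(self, now):
        # Only a rewrite like this drops expired cells and frees the space.
        self.cells = {k: c for k, c in self.cells.items() if now - c[1] < self.ttl}


store = ToyStore(TTL_SECONDS)
store.put("row1", "a", ts=0)
store.put("row2", "b", ts=100)

now = 100 + TTL_SECONDS            # 30 days after the last write
visible_before = store.scan(now)   # nothing is visible any more
stored_before = store.stored_cells()  # yet both cells are still stored
store.compact(now)
stored_after = store.stored_cells()   # space is reclaimed only after the rewrite
```

In this model a scan returns nothing but still has to walk every expired cell, which is the same shape as the slow empty scans described in this question.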

Several observations on data not being deleted:

 - issuing a scan takes exceptionally long, even though no data is returned
 - counting takes exceptionally long
 - the MapR dashboard shows the table still has 5000 regions, with none of the data physically removed

I wish there were some daemon on the M7 Table side that monitored this and reclaimed the space. But so far, all the HBase documentation indicates that space is reclaimed only when compaction is triggered. Even with compaction, I still do not see the data being physically removed.