
MapR-DB silently duplicating rows?

Question asked by john.humphreys on Jun 21, 2017
Latest reply on Jun 23, 2017 by john.humphreys

I'm writing the same row to MapR-DB repeatedly, and while a select * from <table> yields just one row, the MCS user interface keeps showing both the table size and the row count increasing (both overall and in the regions tab).

The column family is set to max-versions = 1 (I can see it, and I think it's the default anyway).

This sounds like an error to me, but I'm guessing I just don't understand what's going on. Can someone please explain this to me?

 

----- Extra Details -----

 

For a little more context... I have some Spark code which basically makes a data frame of:

Host    timestamp  Metric    Value
host-1  1          metric-1  123
host-1  1          metric-2  456
host-1  1          metric-3  789

 

It saves this data frame to MapR-DB using saveAsHadoopDataset(), converting each row to a put. The row key is host+timestamp, and each metric is a column in the same column family. I have limited this to one host and one time period, so there is just one row with 325 columns in the same column family. That is what Drill shows when I do a select *, though MCS reports that more data is there.
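To make the mapping above concrete, here is a minimal sketch in plain Python (not the actual Spark job). The names `row_key`, `to_put`, `apply_puts` and the column family `cf` are hypothetical, and joining host and timestamp with no separator is an assumption. A dict stands in for the table and models only the latest visible value per cell, which is what a select * returns; it says nothing about how MapR-DB stores or counts the data internally.

```python
COLUMN_FAMILY = "cf"  # hypothetical column family name


def row_key(host, timestamp):
    # Assumption: the key is host and timestamp joined with no separator.
    return f"{host}{timestamp}"


def to_put(host, timestamp, metric, value):
    # One put per data frame row: (row key, {"family:qualifier": value}).
    return row_key(host, timestamp), {f"{COLUMN_FAMILY}:{metric}": value}


def apply_puts(table, puts):
    # Toy stand-in for the table: a later write to the same cell
    # replaces the visible value, so only one logical row remains.
    for key, cells in puts:
        table.setdefault(key, {}).update(cells)
    return table


rows = [
    ("host-1", 1, "metric-1", 123),
    ("host-1", 1, "metric-2", 456),
    ("host-1", 1, "metric-3", 789),
]

table = {}
apply_puts(table, [to_put(*r) for r in rows])
# Writing the same data again leaves the visible contents unchanged.
apply_puts(table, [to_put(*r) for r in rows])
```

In this model the second write changes nothing a query can see, which matches the single row that Drill returns.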
