alankala

Tall vs Wide Table Design

Discussion created by alankala on Jul 14, 2016
Latest reply on Jul 26, 2016 by MichaelSegel


We have the use case below and would like general suggestions on table design for MapR-DB/HBase.

 

Our data consists of users (each identified by a unique username) and counts (one integer value per user per hour).

We receive 10K-100K update requests against the database every 10 seconds. We have to store 8000 hours of data for every user.
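To put those numbers in perspective, here is a rough back-of-the-envelope sketch. The class and method names are made up for illustration, and the user count passed in `main` is purely hypothetical since the actual number of users is not fixed here.

```java
// Back-of-the-envelope sizing for the workload described above.
// CounterSizing and its methods are hypothetical names, not part of any library.
final class CounterSizing {
    static final int WINDOW_SECONDS = 10;   // updates arrive in 10-second batches
    static final int RETAINED_HOURS = 8000; // hours of history kept per user

    // Average update rate per second for a given number of updates per window.
    static long updatesPerSecond(long updatesPerWindow) {
        return updatesPerWindow / WINDOW_SECONDS;
    }

    // Retention period expressed in days.
    static double retainedDays() {
        return RETAINED_HOURS / 24.0;
    }

    // Total hourly counters stored: one per user per retained hour,
    // whether they end up as rows (tall) or columns (wide).
    static long retainedCounters(long userCount) {
        return userCount * RETAINED_HOURS;
    }

    public static void main(String[] args) {
        System.out.println(updatesPerSecond(10_000));     // low end of the range
        System.out.println(updatesPerSecond(100_000));    // high end of the range
        System.out.println(retainedDays());               // retention in days
        System.out.println(retainedCounters(1_000_000L)); // hypothetical user count
    }
}
```

So the write load is roughly 1,000 to 10,000 increments per second, and 8000 hours is about 333 days of history.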

I tried creating a single row per user, with the username as the row key and one "count" column holding 8000 versions. But I could not increment a particular version based on the hour of the incoming data (the HBase "Increment" API doesn't accept a "timestamp" parameter, unlike "Put"). I also tried creating a KeyValue with a specific timestamp for this row and column, but it only updates the latest version.

 

So now I have to redesign this, and this is where I need suggestions. Is it better to create a tall table, with "username + hour" as the row key and a single count column that is updated with Increment, or a wide table, with the username as the row key and 8000 columns (one per hour)? Please note that I will query this table about once every 30 minutes and cache the data. This is a write-heavy table.
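To make the two options concrete, here is a minimal sketch of how the keys might be built in each design. Everything in it is an assumption for illustration: the `HourlyCounterKeys` class, the 0x00 separator, and the 8-byte big-endian hour encoding are my own choices, not anything prescribed by HBase or MapR-DB. It only produces byte arrays and does not talk to a cluster.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: builds byte[] keys for the tall and wide layouts.
final class HourlyCounterKeys {
    private static final long MILLIS_PER_HOUR = 3_600_000L;

    // Hours since the Unix epoch for an event timestamp in milliseconds.
    static long hourBucket(long epochMillis) {
        return Math.floorDiv(epochMillis, MILLIS_PER_HOUR);
    }

    // Tall design: row key = username bytes + 0x00 separator + 8-byte big-endian hour.
    // Assumes usernames never contain a 0x00 byte. For non-negative hours the
    // big-endian encoding keeps one user's rows sorted chronologically.
    static byte[] tallRowKey(String username, long epochMillis) {
        byte[] user = username.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(user.length + 1 + Long.BYTES)
                .put(user)
                .put((byte) 0)
                .putLong(hourBucket(epochMillis))
                .array();
    }

    // Wide design: row key = username only.
    static byte[] wideRowKey(String username) {
        return username.getBytes(StandardCharsets.UTF_8);
    }

    // Wide design: the hour bucket becomes the column qualifier.
    static byte[] wideQualifier(long epochMillis) {
        return ByteBuffer.allocate(Long.BYTES).putLong(hourBucket(epochMillis)).array();
    }

    public static void main(String[] args) {
        long first = 1_468_500_000_000L; // an arbitrary timestamp in July 2016
        long later = first + 60_000L;    // one minute later, same hour bucket
        // Both events map to the same tall row, so they increment the same counter.
        System.out.println(Arrays.equals(tallRowKey("alice", first), tallRowKey("alice", later)));
        System.out.println(tallRowKey("alice", first).length);
    }
}
```

In either layout the hour is carried in the key or the qualifier rather than in the cell timestamp, so the update can be a plain Increment on a single cell: with the tall key the qualifier is fixed and the row changes every hour, while with the wide key the row is fixed and the qualifier changes every hour.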

 

Any suggestions are greatly appreciated.
