I have a dump table with a large number of records, and I want to summarize the data into a summary table so that I can avoid recomputing it every time I need it. The grouping and aggregation can vary. Below is my row key design for putting all records in a single table.
Query patterns with these row keys can look like:
processor:*_currency:*_date:20160504 --- get me data for all processors, for all currencies, for a particular date
processor:*_currency:*_date:* --- get me data for all processors, for all currencies, for all days
program:*_currency:*_date:* --- get me data for all programs, for all currencies, for all days
currency:USD_channel:*_date:* --- get me data for the USD currency, for all channels, for all days
currency:INR_date:* --- get me data for the INR currency, for all days
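
To make the key layout concrete, here is a minimal sketch in plain Python (no HBase client involved) of how such keys could be assembled and how a pattern could be matched against them. The helper names `build_row_key` and `pattern_to_regex` are hypothetical, and the sketch assumes that `*` stands for exactly one value that contains neither `_` nor `:`.

```python
import re


def build_row_key(parts):
    """Join ordered (dimension, value) pairs into a key such as
    'processor:FIS_currency:USD_date:20160504'."""
    return "_".join(f"{dim}:{value}" for dim, value in parts)


def pattern_to_regex(pattern):
    """Compile a query pattern into a regex.

    Assumption: '*' matches one value made of characters other than '_' and ':'.
    """
    escaped = re.escape(pattern).replace(r"\*", "[^_:]+")
    return re.compile("^" + escaped + "$")


# Illustrative keys only; the values are made up for the example.
keys = [
    build_row_key([("processor", "FIS"), ("currency", "USD"), ("date", "20160504")]),
    build_row_key([("processor", "FIS"), ("currency", "INR"), ("date", "20160505")]),
    build_row_key([("currency", "INR"), ("date", "20160504")]),
]

regex = pattern_to_regex("processor:*_currency:*_date:20160504")
matching = [k for k in keys if regex.match(k)]
```

Under these assumptions, only the first key in `keys` matches the pattern, because the second has a different date and the third starts with a different dimension.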
Instead of splitting this across multiple tables, I chose a single table so that I can easily change the order of the parameters in my row key at any time. My question is: will this row key design lead to hotspotting?
Previously I thought of using a row key like FIS_USD_20160504, without embedding any metadata. That would distribute the data, but it leads to semantic issues. To avoid those, I came up with embedding the metadata in the key, but now I fear it will lead to hotspotting, since many of my row keys start with the same word, such as program, processor, or currency.
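
One commonly described mitigation for keys that share a leading word is salting: prepending a small bucket number derived from a hash of the key. The following is only a sketch in plain Python, not HBase API code. The names `N_BUCKETS`, `salt_row_key`, and `salted_prefixes` are hypothetical, and the bucket count of 8 is an arbitrary assumption.

```python
import hashlib

N_BUCKETS = 8  # hypothetical value; would need tuning for a real cluster


def salt_row_key(row_key, n_buckets=N_BUCKETS):
    """Prefix the key with a two-digit bucket derived from a hash of the key.

    The same key always maps to the same bucket, so point lookups still work.
    """
    digest = hashlib.sha256(row_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % n_buckets
    return f"{bucket:02d}_{row_key}"


def salted_prefixes(prefix, n_buckets=N_BUCKETS):
    """List the prefixes a reader would have to scan, one per bucket."""
    return [f"{b:02d}_{prefix}" for b in range(n_buckets)]


key = "processor:FIS_currency:USD_date:20160504"
salted = salt_row_key(key)
scan_prefixes = salted_prefixes("processor:")
```

The trade-off in this sketch is that a query such as "all processors" no longer maps to one contiguous range: it has to be issued once per entry in `scan_prefixes` and the results merged.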