M7 Table Batch Put takes 6-8 hours to complete.

Question asked by himanshu on Jul 22, 2014
We have a 7-node M7 cluster and are doing a batch Put into an M7 table, which takes around 6-8 hours.

Sometimes we also get RPC timeouts during insertion.

We are inserting a total of 30-32 million records (1.5-1.8 TB of data in total).

Each row contains 2329 columns (a total of 24 KB per row).

The batch size is 40.

Just wanted your expert advice so that we can bring it down to 1-2 hours.
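
To put those numbers in perspective, here is a rough back-of-the-envelope sketch of the insert rate involved. The class and method names are made up for illustration and are not part of the actual job; only the row counts and durations come from the figures above. It works out to roughly 1,000-1,500 rows per second today, against roughly 4,200-8,900 rows per second needed for a 1-2 hour run.

```java
public class IngestRate {
    // Rows per second needed to load `rows` rows in `hours` hours.
    public static double rowsPerSecond(long rows, double hours) {
        return rows / (hours * 3600.0);
    }

    public static void main(String[] args) {
        // Figures from the question: 30-32 million rows,
        // 6-8 hours today, 1-2 hours desired.
        long minRows = 30_000_000L;
        long maxRows = 32_000_000L;

        // Slowest case: fewest rows over the longest time; fastest case: the opposite.
        System.out.printf("current: %.0f - %.0f rows/s%n",
                rowsPerSecond(minRows, 8), rowsPerSecond(maxRows, 6));
        System.out.printf("target:  %.0f - %.0f rows/s%n",
                rowsPerSecond(minRows, 2), rowsPerSecond(maxRows, 1));
    }
}
```

So the goal amounts to roughly a 4-6x increase in sustained write throughput across the cluster.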

A snippet of sample code showing how we are inserting:

HTable zipTable = new HTable(conf, "test");


Put put = new Put(Bytes.toBytes(st[0]));

for (String h : headerIndex.keySet()) {
   put.add("z".getBytes(),h.getBytes(), obj.toString().getBytes());