
M7 Table Batch Put takes 6-8 hours to complete.

Question asked by himanshu on Jul 22, 2014
Latest reply on Jul 23, 2014 by snelson
We have a 7-node M7 cluster and are doing a batch Put into an M7 table, which takes around 6-8 hours.

Sometimes we also get RPC timeouts during insertion.

We are inserting a total of 30-32 million records (1.5-1.8 TB of data in total).

Each row contains 2329 columns (24 KB per row in total).

The batch size is 40.

We would appreciate your expert advice on how to bring this down to 1-2 hours.
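
To put that target in numbers, here is a rough back-of-the-envelope sketch of the sustained write rate involved. It assumes 31 million rows (the midpoint of our range), the 24 KB per-row figure rather than the total data size, and 1 MB = 1024 KB; the class and method names are only for illustration.

```java
public class ThroughputEstimate {
    // Rows per second needed to load `rows` rows in `hours` hours.
    static double rowsPerSecond(long rows, double hours) {
        return rows / (hours * 3600.0);
    }

    // Payload rate in MB/s (1 MB = 1024 KB) for a row rate and a row size in KB.
    static double mbPerSecond(double rowsPerSec, double rowKb) {
        return rowsPerSec * rowKb / 1024.0;
    }

    public static void main(String[] args) {
        long rows = 31_000_000L;   // assumed midpoint of 30-32 million
        double rowKb = 24.0;       // per-row size from the post
        double[] durations = {7.0, 2.0, 1.0};  // current run vs. target, in hours
        for (double h : durations) {
            double rps = rowsPerSecond(rows, h);
            System.out.printf("%.0f h: %.0f rows/s, %.1f MB/s%n",
                    h, rps, mbPerSecond(rps, rowKb));
        }
    }
}
```

Under these assumptions a 7-hour run works out to about 1,230 rows/s, while 1-2 hours needs roughly 4,300 to 8,600 rows/s across the cluster.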

A snippet of sample code showing how we are inserting:

HTable table = new HTable(conf, "test");

for(...){
    // st[0] is the row key
    Put put = new Put(Bytes.toBytes(st[0]));

    // one cell per header column, all in column family "z"
    for (String h : headerIndex.keySet()) {
        put.add("z".getBytes(), h.getBytes(), obj.toString().getBytes());
    }
    table.put(put);
    i++;
    // flush after every 40 puts
    if (i % 40 == 0) {
        table.flushCommits();
    }
}
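
For reference, a small sketch of what a batch size of 40 means for this load: how many times the loop above calls flushCommits() and roughly how much payload each call carries. It assumes 31 million rows at 24 KB each, that i starts at 0, and that the puts are actually buffered on the client until flushCommits() is called; the class and method names are illustrative only.

```java
public class BatchEstimate {
    // Number of times (i % batchSize == 0) is true for i = 1..rows.
    static long flushCount(long rows, int batchSize) {
        return rows / batchSize;
    }

    // Approximate payload per flush in KB, assuming every row is rowKb in size.
    static double kbPerFlush(int batchSize, double rowKb) {
        return batchSize * rowKb;
    }

    public static void main(String[] args) {
        long rows = 31_000_000L;  // assumed midpoint of 30-32 million
        int batchSize = 40;       // batch size from the post
        System.out.println("flushCommits() calls: " + flushCount(rows, batchSize));
        System.out.println("KB per flush: " + kbPerFlush(batchSize, 24.0));
    }
}
```

With these numbers that is 775,000 flushes of about 960 KB each.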
