
HBase import from CSV throws a Java heap space error

Question asked by gary on Apr 21, 2013
Latest reply on Apr 22, 2013 by yufeldman
I'm trying to import data from a CSV file into HBase using the "hadoop jar hbase-0.92.2-mapr.jar importtsv" option (note: the CSV has only 15 rows, for trial purposes), and the MapReduce job fails with 'Error: Java heap space'. The master has 4 cores and 8 GB of RAM, so I'm not sure which component is running out of memory (the Hadoop config, the HBase config, or something in warden.conf?):

<pre><code>
13/04/22 06:37:37 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 28864@vm-2
13/04/22 06:37:37 INFO mapreduce.HFileOutputFormat: Looking up current regions for table org.apache.hadoop.hbase.client.HTable@663b1f38
13/04/22 06:37:37 INFO mapreduce.HFileOutputFormat: Configuring 1 reduce partitions to match current region count
13/04/22 06:37:37 INFO mapreduce.HFileOutputFormat: Writing partition information to /user/root/partitions_1366630657853
13/04/22 06:37:37 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
13/04/22 06:37:37 INFO compress.CodecPool: Got brand-new compressor
13/04/22 06:37:37 INFO mapreduce.HFileOutputFormat: Incremental table output configured.
13/04/22 06:37:38 INFO fs.JobTrackerWatcher: Current running JobTracker is: vm-2/173.36.55.179:9001
13/04/22 06:37:38 INFO input.FileInputFormat: Total input paths to process : 1
13/04/22 06:37:38 WARN snappy.LoadSnappy: Snappy native library not loaded
13/04/22 06:37:38 INFO mapred.JobClient: Creating job's output directory at /root/csv_dir5
13/04/22 06:37:38 INFO mapred.JobClient: Creating job's user history location directory at /root/csv_dir5/_logs
13/04/22 06:37:38 INFO mapred.JobClient: Running job: job_201304181357_0016
13/04/22 06:37:39 INFO mapred.JobClient:  map 0% reduce 0%
13/04/22 06:37:48 INFO mapred.JobClient: Task Id : attempt_201304181357_0016_m_000000_0, Status : FAILED on node vm-2
Error: Java heap space
13/04/22 06:37:51 INFO mapred.JobClient: Task Id : attempt_201304181357_0016_m_000000_1, Status : FAILED on node vm-1
Error: Java heap space
13/04/22 06:37:52 INFO mapred.JobClient: Task Id : attempt_201304181357_0016_m_000000_2, Status : FAILED on node vm-1
</code></pre>

Any pointers on which config file needs its Xmx setting updated?
