
MapR-DB Java API - Fast Loading

Question asked by john.humphreys on Jun 8, 2017
Latest reply on Jun 19, 2017 by john.humphreys



I'm exploring the MapR-DB Java API to see how efficiently I can use it to populate JSON tables (or binary tables). So far it seems pretty slow, so I assume I'm doing something wrong.


A few related questions:

  1. Is there a faster way to populate MapR-DB JSON documents than using table.insert() (say, some equivalent to JDBC batch inserts)?
  2. How often should I call table.flush(), assuming I'm writing a very large number of documents from an unbounded stream?
  3. Would running table.insert() from, say, 30 threads be safe/helpful?
  4. Would binary tables be a lot faster for writing?


When we use OpenTSDB on top of MapR-DB (via its REST API, heavily multi-threaded), we can write 230,000 data points a second with one time-series daemon. With this MapR-DB API, I'm seeing more like 1,000 data points a second. I understand how OpenTSDB works and that it has an advantage, but I still feel like I'm doing something wrong here: even if I threw 30 threads at it and throughput scaled linearly, that would be only 30,000 data points a second.
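
To make questions 2 and 3 concrete, here is a minimal sketch of the pattern they describe: several writer threads sharing one handle, each flushing after a fixed number of inserts. It deliberately does not use the MapR-DB API. `DocSink`, `ParallelLoadSketch`, and the `flushEvery` parameter are hypothetical names introduced only for illustration, and the sketch assumes the real table handle is safe to share across threads, which is exactly what question 3 asks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a table handle; this is NOT the MapR-DB API.
interface DocSink {
    void insert(String id, String json);
    void flush();
}

class ParallelLoadSketch {

    // Starts `threads` writers. Each inserts `docsPerThread` documents and
    // flushes after every `flushEvery` inserts. Returns the total inserted.
    static long load(DocSink sink, int threads, int docsPerThread, int flushEvery) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong inserted = new AtomicLong();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int t = 0; t < threads; t++) {
                final int writer = t;
                pending.add(pool.submit(() -> {
                    for (int i = 1; i <= docsPerThread; i++) {
                        // Ids are prefixed per writer so threads never collide.
                        sink.insert(writer + "-" + i, "{\"value\":" + i + "}");
                        inserted.incrementAndGet();
                        if (i % flushEvery == 0) {
                            sink.flush();
                        }
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the writer and surface any failure
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("writer failed", e);
        } finally {
            pool.shutdown();
        }
        sink.flush(); // push out whatever is still buffered
        return inserted.get();
    }

    public static void main(String[] args) {
        AtomicLong flushes = new AtomicLong();
        // Counting sink: measures only the harness overhead, not a real table.
        DocSink counting = new DocSink() {
            public void insert(String id, String json) { }
            public void flush() { flushes.incrementAndGet(); }
        };
        long start = System.nanoTime();
        long n = load(counting, 30, 10_000, 1_000);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d docs, %d flushes, %.0f docs/s%n",
                n, flushes.get(), n / seconds);
    }
}
```

Swapping the counting sink for a thin adapter around a real table would show whether throughput grows with the thread count and how sensitive it is to the flush interval. The numbers from the counting sink alone say nothing about MapR-DB itself.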