I'm exploring the MapR-DB Java API to see how efficiently I can use it to populate JSON tables (or binary tables). Thus far it seems pretty slow, so I'm assuming I'm starting off wrong.
A few related questions:
- Is there a faster way to populate MapR-DB JSON documents than using table.insert() (say, some equivalent to JDBC batch inserts)?
- How often should I call table.flush(), assuming I'm writing a crazily large number of documents in an infinite stream?
- Would running table.insert() from, say, 30 threads be safe/helpful?
- Would binary tables be a lot faster for writing?
When we use OpenTSDB on top of MapR-DB (via its REST API, heavily multi-threaded), we can write 230,000 data points a second with one time-series daemon. With this MapR-DB API, I'm seeing more like 1,000 data points a second. I understand how OpenTSDB works and that it has an advantage, but I still feel like I'm doing something wrong here (even if I threw 30 threads at it and throughput scaled linearly, that would still be only 30,000 data points a second).
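To make the threading and flush questions concrete, a minimal sketch of such a write loop might look like the following. DocumentSink and CountingSink are hypothetical stand-ins invented for illustration, not the MapR-DB API; the real table would sit behind insert() and flush(). Sharing one sink across all threads is itself an assumption, since whether that is safe is exactly what is being asked. The thread count and flush interval are placeholder values.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class WriteThroughputSketch {

    /** Hypothetical stand-in for the table; NOT the MapR-DB API. */
    interface DocumentSink {
        void insert(String id, String json);
        void flush();
    }

    /** In-memory sink that only counts calls, so the sketch runs without a cluster. */
    static final class CountingSink implements DocumentSink {
        final AtomicLong inserts = new AtomicLong();
        final AtomicLong flushes = new AtomicLong();

        @Override public void insert(String id, String json) { inserts.incrementAndGet(); }
        @Override public void flush() { flushes.incrementAndGet(); }
    }

    /**
     * Writes docsPerThread documents from each worker thread, flushing after
     * every flushEvery inserts and once more when the worker finishes.
     * Returns the total number of documents written.
     * Assumes the sink may be shared by several threads.
     */
    static long run(DocumentSink sink, int threads, int docsPerThread, int flushEvery)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong written = new AtomicLong();
        for (int t = 0; t < threads; t++) {
            final int worker = t;
            pool.submit(() -> {
                for (int i = 1; i <= docsPerThread; i++) {
                    // Unique id per worker and sequence number; tiny JSON payload.
                    sink.insert(worker + "-" + i, "{\"value\":" + i + "}");
                    written.incrementAndGet();
                    if (i % flushEvery == 0) {
                        sink.flush();
                    }
                }
                sink.flush(); // final flush so nothing stays buffered
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("writers did not finish in time");
        }
        return written.get();
    }

    public static void main(String[] args) throws InterruptedException {
        CountingSink sink = new CountingSink();
        long start = System.nanoTime();
        long total = run(sink, 30, 10_000, 1_000);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d docs in %.3f s (%.0f docs/s), %d flushes%n",
                total, seconds, total / seconds, sink.flushes.get());
    }
}
```

With the counting sink this measures only the harness overhead. Swapping in a sink backed by the real table would show whether throughput scales with the thread count and how sensitive it is to the flush interval.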