High Performance C APIs on MapR-DB
C vs. Java APIs
Native languages like C and C++ give tighter control over an application's memory and performance characteristics than languages with automatic memory management. A well-written C++ program that has intimate knowledge of its memory access patterns and of the machine architecture can run several times faster than a Java program that depends on garbage collection. For these reasons, many enterprise developers with massive scalability and performance requirements prefer C/C++ to Java for their server applications. Hence the need for C APIs for MapR-DB.
A C language API for HBase, known as libHBase, was released in March last year (https://github.com/mapr/libhbase). This implementation uses the AsyncHBase Java library to interact with the HBase cluster. Since MapR-DB supports AsyncHBase as well as the synchronous HBase APIs, anyone can use libHBase to talk to MapR-DB as well as to HBase. The libHBase APIs are much faster than the HBase Thrift APIs, but they still incur a serious penalty from embedding Java code in a C program, because the embedding forces data to be copied from C data structures into Java. Even worse, since the MapR-DB client performs RPCs in native code, applications that use MapR-DB through libHBase incur this penalty twice: the data crosses from C into Java and then back from Java into native code. The figure below shows how this happens.
The motivation for this project was to bypass the Java layer completely and encode the user application's data directly into RPC buffers by calling the MapR native database client from C. The following figure shows how this eliminates the need to cross the JNI barrier twice. Bypassing Java has several benefits:
- No JVM is spawned
- No JNI (Java Native Interface) overhead imposed on the application
- No duplication of data buffers needed to transition between Java and C land
- No garbage collection uncertainty
- Tighter control on memory and CPU usage
The MapR-DB C APIs are asynchronous, which means that calls return immediately, before any results are received. The alternative is to make every call wait for completion. Our experience, and that of many others, is that RPC calls that block until completion are a serious impediment to high performance at scale. This was the original reason for the introduction of the AsyncHBase API library. If an application requires a synchronous API, it is easy to write synchronous wrappers around the asynchronous methods (just invoke the method and wait for the callback). It is much more difficult to convert a synchronous API into a performant asynchronous one.
A practical consequence of the asynchronous design is that every method that can result in an RPC must accept a callback argument.
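To make the "invoke the method and wait for the callback" idea concrete, here is a minimal sketch of a synchronous wrapper built with POSIX threads. It does not use the MapR-DB API: `demo_send_async`, `demo_cb` and the other `demo_` names are hypothetical stand-ins, and the stand-in completes each request on a throwaway thread purely so the example is self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical callback type: err is 0 on success, extra is caller state. */
typedef void (*demo_cb)(int32_t err, void *extra);

struct demo_request {
    demo_cb cb;
    void *extra;
};

/* Stand-in for a client thread that finishes the RPC and fires the callback. */
static void *demo_worker(void *arg)
{
    struct demo_request *req = arg;
    req->cb(0, req->extra);
    free(req);
    return NULL;
}

/* Stand-in asynchronous call: returns at once, completes on another thread. */
int32_t demo_send_async(demo_cb cb, void *extra)
{
    assert(cb != NULL);
    struct demo_request *req = malloc(sizeof *req);
    if (req == NULL)
        return ENOMEM;
    req->cb = cb;
    req->extra = extra;

    pthread_t tid;
    int rc = pthread_create(&tid, NULL, demo_worker, req);
    if (rc != 0) {
        free(req);
        return rc;
    }
    pthread_detach(tid);
    return 0;
}

/* State shared between the blocked caller and the callback. */
struct demo_waiter {
    pthread_mutex_t lock;
    pthread_cond_t done_cv;
    int done;
    int32_t err;
};

/* Callback: record the outcome and wake the waiting caller. */
static void demo_on_done(int32_t err, void *extra)
{
    struct demo_waiter *w = extra;
    pthread_mutex_lock(&w->lock);
    w->err = err;
    w->done = 1;
    pthread_cond_signal(&w->done_cv);
    pthread_mutex_unlock(&w->lock);
}

/* Synchronous wrapper: invoke the async call, then block until it completes. */
int32_t demo_send_sync(void)
{
    struct demo_waiter w;
    pthread_mutex_init(&w.lock, NULL);
    pthread_cond_init(&w.done_cv, NULL);
    w.done = 0;
    w.err = 0;

    int32_t rc = demo_send_async(demo_on_done, &w);
    if (rc == 0) {
        pthread_mutex_lock(&w.lock);
        while (!w.done) /* loop guards against spurious wakeups */
            pthread_cond_wait(&w.done_cv, &w.lock);
        pthread_mutex_unlock(&w.lock);
        rc = w.err;
    }
    pthread_cond_destroy(&w.done_cv);
    pthread_mutex_destroy(&w.lock);
    return rc;
}
```

The wrapper only waits when the asynchronous call was accepted; if submission itself fails, no callback will ever arrive, so waiting would hang.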
As an example, here is the core API entry point for any operation that mutates data. The cb argument is the callback, and the mutation argument is where the actual operation is specified.
The following figure shows how the APIs work internally.
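The shape of such an entry point can be modelled roughly as follows. This is only a sketch: every `demo_` name is a hypothetical stand-in rather than the library's real declaration (take that from the MapR-DB C API headers), and the stub completes inline instead of queuing the work, just to keep the example runnable without a cluster.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the library's handle types. */
typedef struct demo_client demo_client_t;
typedef struct demo_mutation {
    const char *row;
    const char *value;
} demo_mutation_t;

/* Callback: gets the outcome, the mutation that was sent, and the caller's
 * own "extra" pointer passed through untouched. */
typedef void (*demo_mutation_cb)(int32_t err, demo_client_t *client,
                                 demo_mutation_t *mutation, void *extra);

/* Stand-in send call. A real asynchronous library would queue the work and
 * return; this stub invokes the callback inline before returning. */
int32_t demo_mutation_send(demo_client_t *client, demo_mutation_t *mutation,
                           demo_mutation_cb cb, void *extra)
{
    if (mutation == NULL || cb == NULL)
        return EINVAL;
    cb(0, client, mutation, extra);
    return 0;
}

/* Example callback: counts successful mutations through the extra pointer. */
void demo_count_done(int32_t err, demo_client_t *client,
                     demo_mutation_t *mutation, void *extra)
{
    (void)client;
    (void)mutation;
    assert(extra != NULL);
    if (err == 0)
        ++*(int *)extra;
}
```

The `extra` pointer is what lets one callback function serve many requests: each call can carry its own state without any global variables.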
When these asynchronous methods are invoked, a work item is created and queued for processing on the client side. This work item will be picked up as soon as possible by one of the threads in a thread pool. When responses to RPC calls are received the callback will be invoked by the thread pool.
Client applications can usually issue requests faster than RPCs complete, so we need to make sure that the queue of work items does not grow without bound. For this reason, the config parameter fs.mapr.pool.queue.max_size (default 10000) controls the maximum size of the work item queue. This parameter can be changed by editing the /opt/mapr/conf/dbclient.conf file.
Whenever the work item queue reaches this limit, the library returns ENOBUFS from asynchronous calls. The client application is expected to handle this error, and can retry the call after some time. Another option is to pass a shared global condition variable to all callbacks via the extra argument so that callbacks can signal it as they complete. The completion of any pending callback is a likely indication that the ENOBUFS condition has cleared and that the operation should be retried.
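The condition-variable approach can be sketched as below. Nothing here is the real MapR-DB API: the `demo_` functions are a hypothetical stand-in for a library with a bounded queue (four slots instead of 10000, and one throwaway thread per request instead of a thread pool), and the `app_` functions show the application-side pattern. The application snapshots a completion counter before each send, so a callback that fires between the ENOBUFS return and the wait is not missed.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* ---- Hypothetical stand-in for the library side ---- */
#define DEMO_QUEUE_MAX 4 /* tiny stand-in for fs.mapr.pool.queue.max_size */
typedef void (*demo_cb)(int32_t err, void *extra);
struct demo_item { demo_cb cb; void *extra; };
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_pending = 0; /* accepted work items not yet completed */

static void *demo_worker(void *arg)
{
    struct demo_item *item = arg;
    pthread_mutex_lock(&demo_lock);
    demo_pending--; /* the slot is freed before the callback runs */
    pthread_mutex_unlock(&demo_lock);
    item->cb(0, item->extra);
    free(item);
    return NULL;
}

/* Accepts the work item, or returns ENOBUFS when the queue is full. */
int32_t demo_send_async(demo_cb cb, void *extra)
{
    assert(cb != NULL);
    struct demo_item *item = malloc(sizeof *item);
    if (item == NULL)
        return ENOMEM;
    item->cb = cb;
    item->extra = extra;
    pthread_t tid;
    pthread_mutex_lock(&demo_lock);
    int32_t rc = (demo_pending >= DEMO_QUEUE_MAX) ? ENOBUFS : 0;
    if (rc == 0)
        rc = pthread_create(&tid, NULL, demo_worker, item);
    if (rc == 0) {
        pthread_detach(tid);
        demo_pending++;
    }
    pthread_mutex_unlock(&demo_lock);
    if (rc != 0)
        free(item);
    return rc;
}

/* ---- Application side ---- */
struct app_progress {
    pthread_mutex_t lock;
    pthread_cond_t changed; /* signalled by every callback */
    unsigned long completions;
};
static struct app_progress app_state = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 };

static void app_on_done(int32_t err, void *extra)
{
    struct app_progress *p = extra;
    (void)err; /* a real callback would inspect the error */
    pthread_mutex_lock(&p->lock);
    p->completions++;
    pthread_cond_broadcast(&p->changed);
    pthread_mutex_unlock(&p->lock);
}

/* Sends one operation; on ENOBUFS, sleeps until some callback completes. */
int32_t app_send_with_retry(struct app_progress *p)
{
    for (;;) {
        pthread_mutex_lock(&p->lock);
        unsigned long seen = p->completions;
        pthread_mutex_unlock(&p->lock);
        int32_t rc = demo_send_async(app_on_done, p);
        if (rc != ENOBUFS)
            return rc;
        pthread_mutex_lock(&p->lock);
        while (p->completions == seen) /* nothing finished since snapshot */
            pthread_cond_wait(&p->changed, &p->lock);
        pthread_mutex_unlock(&p->lock);
    }
}

/* Sends n operations, waits for them, returns how many callbacks ran. */
unsigned long app_run(unsigned long n)
{
    unsigned long sent = 0;
    while (sent < n && app_send_with_retry(&app_state) == 0)
        sent++;
    pthread_mutex_lock(&app_state.lock);
    while (app_state.completions < sent)
        pthread_cond_wait(&app_state.changed, &app_state.lock);
    unsigned long done = app_state.completions;
    pthread_mutex_unlock(&app_state.lock);
    return done;
}
```

Calling `app_run(20)` pushes twenty operations through the four-slot queue. Waiting on the condition variable, rather than sleeping for a fixed interval, means the retry happens as soon as a slot is likely to be free.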
Performance was our foremost goal when we started working on this project. On that front, our implementation has the following characteristics:
- The library does not copy any buffers allocated by the user application; it only keeps references to them. These buffers are then encoded directly into RPC buffers. The library therefore expects the application to give up ownership of these buffers until the callback is invoked. Once the callback is invoked, ownership returns to the application, which can then destroy or reuse the buffers as appropriate.
- The library exposes config parameters that the client application can tune to trade throughput against resource usage.
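The buffer ownership rule in the first point can be illustrated with a small sketch. All names are hypothetical and the "library" is a single-slot stub that runs without threads: `demo_put_async` keeps only the caller's pointer, and `demo_finish_rpc` plays the role of the moment when the library encodes the buffer and then invokes the callback.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef void (*demo_cb)(int32_t err, void *extra);

/* Hypothetical library state: it keeps the caller's pointer, not a copy. */
static struct {
    const char *buf;
    size_t len;
    demo_cb cb;
    void *extra;
} demo_slot; /* a single pending request keeps the sketch short */

static char demo_rpc_buf[64]; /* stand-in for the outgoing RPC buffer */
static size_t demo_rpc_len;

/* Stand-in async put: remembers the buffer and returns immediately. */
int32_t demo_put_async(const char *buf, size_t len, demo_cb cb, void *extra)
{
    if (buf == NULL || cb == NULL || len > sizeof demo_rpc_buf)
        return EINVAL;
    demo_slot.buf = buf;
    demo_slot.len = len;
    demo_slot.cb = cb;
    demo_slot.extra = extra;
    return 0;
}

/* Stand-in for the RPC going out: the caller's buffer is read only now,
 * which is why it must stay alive and unmodified until the callback. */
void demo_finish_rpc(void)
{
    assert(demo_slot.cb != NULL);
    memcpy(demo_rpc_buf, demo_slot.buf, demo_slot.len);
    demo_rpc_len = demo_slot.len;
    demo_slot.cb(0, demo_slot.extra);
}

/* Application side: the value is owned by the library while in flight. */
struct app_value {
    char *data;
    int released;
};

/* Callback: ownership is back with the application, so freeing is safe. */
static void app_release(int32_t err, void *extra)
{
    struct app_value *v = extra;
    (void)err;
    free(v->data);
    v->data = NULL;
    v->released = 1;
}

/* Allocates the value and hands it over; it is not freed here. */
int32_t app_put(struct app_value *v, const char *text)
{
    size_t len = strlen(text);
    v->data = malloc(len + 1);
    if (v->data == NULL)
        return ENOMEM;
    memcpy(v->data, text, len + 1);
    v->released = 0;
    int32_t rc = demo_put_async(v->data, len, app_release, v);
    if (rc != 0) { /* never accepted, so the application still owns it */
        free(v->data);
        v->data = NULL;
    }
    return rc;
}
```

Freeing in the callback, and only there, is the simplest way to respect the rule: between `app_put` returning and `app_release` running, the application never touches `v->data`.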
Our new API has a number of important features that are not available in libHBase:
- Secure user impersonation when creating a connection
- Adding or modifying the column families of a table
- Setting timestamps or a time range for get and scan operations
- Increment and append mutations
- Filtering on column family and column name in scan operations
- For MapR versions >= v5.0: support in get/scan operations for filters written in the HBase Thrift filter language (http://hbase.apache.org/0.94/book/thrift.html)
Learn more about creating native applications for MapR-DB here: http://doc.mapr.com/display/MapR/Creating+MapR-DB+Applications+with+C
To help you get started quickly, we have added two sample applications to the installation package. You can find them under the /opt/mapr/examples directory when you install mapr-client. These applications are also available in a GitHub repository here: https://github.com/mapr-demos/c-api-sample-applications
In this blog post, you’ve learned about high performance C APIs on MapR-DB. If you have any questions, please add your comments in the section below.