Today, we are delighted to announce the next step forward for the MapR Converged Data Platform: the release of MapR-DB 6.0, the modern database for global data-intensive applications.
At MapR, our goal has been to build a complete data platform with a built-in, modern, scalable database for creating a broad variety of operational, analytic, and real-time applications across on-premises, edge, and multi-cloud environments, without complex trade-offs or compromises. MapR-DB supports this broad variety of applications by bringing critical database capabilities into one system.
MapR has systematically built MapR-DB to be a converged and complete database over the past 3 years, and the latest MapR-DB 6.0 release delivers on this broader vision.
MapR-DB 6.0 is a significant milestone. With this release, we are introducing several new capabilities and performance improvements to expand the usage of the database in organizations.
Here is a summary of the key features in this release.
Powerful and Efficient Data Access with Native Secondary Indexes
Prior to 6.0, MapR-DB was optimized for access by rowkey only. The new built-in secondary indexes extend this by supporting flexible and efficient queries on any column in the DB tables at scale. This capability lets application developers build rich, new types of applications that support complex user interaction patterns, and lets business users run optimized, high-performance SQL queries from familiar BI/analytics tools.
The key features of the secondary indexing functionality include:
- Native secondary indexes for MapR-DB JSON tables, with no external indexing system (such as Elasticsearch or Solr) necessary
- Scalable and enterprise-grade indexing with auto-propagation, auto-scale, and auto-management
- Extreme index scalability and performance with SSD optimizations
- Rich indexing functionality: unlimited indexes, composite indexes with a large number of columns, comprehensive data type support, hashed indexes, covering/non-covering query support, security, and more
- Highly functional and seamless queries across primary and secondary index tables
- Optimized index-based access for application development and BI/Analytics
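To make the covering/non-covering distinction and index auto-propagation concrete, here is a minimal in-memory sketch. It is a conceptual toy, not the MapR-DB API: the `Table` class, its methods, and the sample fields are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


class Table:
    """Toy document table keyed by rowkey, with optional secondary indexes."""

    def __init__(self):
        self.rows = {}       # rowkey -> document (dict)
        self.indexes = {}    # indexed field -> {value -> {rowkey -> included fields}}
        self.included = {}   # indexed field -> tuple of included field names

    def create_index(self, field, included=()):
        """Build an index on `field`, storing copies of the `included` fields."""
        self.included[field] = tuple(included)
        entries = defaultdict(dict)
        for rowkey, doc in self.rows.items():
            if field in doc:
                entries[doc[field]][rowkey] = {f: doc.get(f) for f in included}
        self.indexes[field] = entries

    def put(self, rowkey, doc):
        """Write a document and keep every index in step (auto-propagation)."""
        old = self.rows.get(rowkey)
        for field, entries in self.indexes.items():
            if old is not None and field in old:
                entries[old[field]].pop(rowkey, None)  # drop the stale entry
            if field in doc:
                entries[doc[field]][rowkey] = {
                    f: doc.get(f) for f in self.included[field]
                }
        self.rows[rowkey] = doc

    def find(self, field, value, select):
        """Return (results, number of primary-table lookups performed)."""
        entries = self.indexes[field].get(value, {})
        # The query is "covered" when the index alone holds every selected field.
        covered = set(select) <= set(self.included[field]) | {field}
        results, lookups = [], 0
        for rowkey, inc in sorted(entries.items()):
            if covered:
                src = {**inc, field: value}
            else:
                src = self.rows[rowkey]  # extra read against the primary table
                lookups += 1
            results.append({f: src.get(f) for f in select})
        return results, lookups


t = Table()
t.put("u1", {"name": "Ann", "city": "Oslo", "age": 31})
t.put("u2", {"name": "Bob", "city": "Rome", "age": 45})
t.create_index("city", included=["name"])
t.put("u3", {"name": "Cy", "city": "Oslo", "age": 28})

covered_rows, covered_lookups = t.find("city", "Oslo", ["name"])
full_rows, full_lookups = t.find("city", "Oslo", ["name", "age"])
```

In this sketch the first query is answered from the index alone, while the second has to go back to the primary table once per matching row because `age` is not stored in the index.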
Rich and Expanded Application Development with MapR-DB OJAI 2.0 APIs
OJAI (Open JSON Application Interface) is the API for developing applications with the MapR-DB document data model. In 6.0, we are extending the API with more functionality and better performance.
The new capabilities include:
- New and intuitive OJAI query interface
- JSON grammar and fluent API semantics
- Rich expressive language support, including conditional filtering, sorting, and pagination support
- Efficient queries with seamless index-based access
- Smart query execution to support operational and operational analytic applications at any data scale and with any query complexity
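As a rough illustration of what a JSON-described query with conditional filtering, sorting, and pagination involves, the sketch below evaluates such a query over a list of documents in plain Python. The query keys (`select`, `where`, `orderby`, `offset`, `limit`) and the `run_query` helper are hypothetical and do not reproduce the actual OJAI grammar or API.

```python
import json


def get_path(doc, path):
    """Resolve a dotted field path such as 'address.city' in a nested dict."""
    cur = doc
    for part in path.split("."):
        if not isinstance(cur, dict) or part not in cur:
            return None
        cur = cur[part]
    return cur


# Hypothetical comparison operators for this toy query format.
OPS = {
    "eq": lambda a, b: a == b,
    "gt": lambda a, b: a is not None and a > b,
    "lt": lambda a, b: a is not None and a < b,
}


def run_query(docs, query_json):
    """Filter, sort, paginate, then project, in that order."""
    q = json.loads(query_json)
    rows = docs
    for cond in q.get("where", []):  # all conditions are ANDed together
        test = OPS[cond["op"]]
        rows = [d for d in rows if test(get_path(d, cond["field"]), cond["value"])]
    if "orderby" in q:
        # Assumes every remaining document has the sort field.
        rows = sorted(rows, key=lambda d: get_path(d, q["orderby"]))
    offset = q.get("offset", 0)
    limit = q.get("limit")
    rows = rows[offset:offset + limit] if limit is not None else rows[offset:]
    if "select" in q:
        rows = [{f: get_path(d, f) for f in q["select"]} for d in rows]
    return rows


docs = [
    {"_id": "1", "name": "Ann", "age": 31, "address": {"city": "Oslo"}},
    {"_id": "2", "name": "Bob", "age": 45, "address": {"city": "Rome"}},
    {"_id": "3", "name": "Cy", "age": 28, "address": {"city": "Oslo"}},
    {"_id": "4", "name": "Dee", "age": 52, "address": {"city": "Oslo"}},
]

query = json.dumps({
    "select": ["name", "age"],
    "where": [
        {"field": "address.city", "op": "eq", "value": "Oslo"},
        {"field": "age", "op": "gt", "value": 25},
    ],
    "orderby": "age",
    "offset": 1,   # skip the first match: this is the second "page"
    "limit": 2,
})

page = run_query(docs, query)
```

A real engine would push the filter and sort down to an index where one exists instead of scanning every document, which is what the index-based access in the list above refers to.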
Optimized Drill/DB Integration for In-Place SQL Data Exploration and Operational BI
Apache Drill provides flexible SQL analytics on the data in MapR-DB JSON tables. Drill is a distributed SQL query engine and serves as a unified interactive access layer for the MapR Platform, bringing together data from MapR-FS and MapR-DB.
The new MapR-DB and Drill integration optimizes SQL data access on MapR-DB, speeding up ad-hoc queries. The new capabilities include:
- Ability for a variety of Drill SQL queries to seamlessly leverage MapR-DB secondary indexes, significantly speeding up query performance and avoiding large scans
- Statistics, selectivity, and cost-based index selection
- Index support for Filter/Sort/Offset/Limit operators
- Comprehensive index functionality support, including single, composite, covering/non-covering indexes, and index intersection
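The idea behind statistics, selectivity, and cost-based index selection can be sketched as follows. This is a deliberately simplified model under stated assumptions, not Drill's planner: the `Index` type, the `choose_plan` function, the uniform-distribution selectivity estimate, and the cost constants are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Index:
    name: str
    key: str              # indexed column
    included: frozenset   # extra columns stored alongside the key


def choose_plan(row_count, distinct_counts, indexes, filter_col, projected):
    """Pick the cheapest access path for a `filter_col = <constant>` query.

    Selectivity is estimated as 1 / (number of distinct values), which
    assumes values are uniformly distributed.
    """
    selectivity = 1.0 / distinct_counts[filter_col]
    matching = row_count * selectivity
    plans = [("full_scan", float(row_count))]
    for idx in indexes:
        if idx.key != filter_col:
            continue
        covering = set(projected) <= idx.included | {idx.key}
        # Assumed cost units: 1 per index entry read, plus 4 more per
        # random primary-table lookup when the index is non-covering.
        cost = matching * (1.0 if covering else 5.0)
        plans.append((idx.name, cost))
    return min(plans, key=lambda p: p[1])


stats = {"country": 2, "email": 900_000}  # distinct values per column
indexes = [
    Index("idx_country", "country", frozenset()),
    Index("idx_email", "email", frozenset()),
]

low_selectivity = choose_plan(1_000_000, stats, indexes, "country", ["name"])
high_selectivity = choose_plan(1_000_000, stats, indexes, "email", ["name"])
```

Under these assumed costs, the filter on `country` matches about half the table, so a full scan is cheaper than half a million random lookups, while the highly selective `email` filter goes through its index.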
In-Place Advanced Analytics/ML on MapR-DB JSON with Native Spark Connectivity
MapR-DB 6.0 deeply integrates Apache Spark with MapR-DB JSON tables. Customers can use this integration to perform real-time data processing, as well as build and serve machine learning models, directly on MapR-DB tables without creating analytic silos.
The new capabilities of this integration include:
- Batch and real-time data processing support with native Spark connectivity
- Support for all key Spark constructs: RDDs, DataFrames, and Datasets
- Optimized Spark performance with projection and filter pushdown
In-Place ETL/Data Processing on MapR-DB JSON with Native Hive Support
MapR-DB 6.0 deeply integrates Apache Hive with MapR-DB JSON tables. Customers can use this integration to perform ETL/batch processing directly on the data in MapR-DB tables.
The new capabilities of this integration include:
- New Hive storage handler for MapR-DB JSON tables
- Support for extensive Hive SQL functionality and data types on MapR-DB tables
Real-Time Data Integration and Micro-Services with MapR-DB Change Data Capture API
Built on the foundations of global table replication and MapR Event Streaming, the MapR-DB Change Data Capture API provides a powerful and easy-to-use interface for integrating changes arriving at a MapR-DB table with arbitrary external systems in real time. Users can now build applications that consume and process MapR-DB table data changes, published as ‘change log’ streams, in real time and in a highly scalable way. Change data propagation can be scoped to selected columns/fields and supports ordered, at-least-once delivery.
This capability enables use cases such as:
- Track changes to MapR-DB tables (inserts, updates, deletes) and perform real-time processing on the data
- Synchronize data in MapR-DB with a downstream search index (such as Elasticsearch or Solr), materialized views, or in-memory caches
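Because delivery is ordered and at-least-once, a consumer may see the same change record more than once and should apply changes idempotently. The sketch below shows one way to keep a downstream in-memory cache in sync under that assumption. It does not use the MapR-DB Change Data Capture API: the record layout `(seq, op, rowkey, fields)` and the `apply_changes` function are hypothetical, and `seq` stands in for whatever position the change log exposes.

```python
def apply_changes(cache, last_seq, changes):
    """Apply change records to `cache`, skipping redelivered ones.

    Each record is (seq, op, rowkey, fields). `seq` is assumed to increase
    monotonically within one ordered change log, so any record at or below
    `last_seq` has already been applied.
    Returns the sequence number of the last record applied.
    """
    for seq, op, rowkey, fields in changes:
        if seq <= last_seq:
            continue  # duplicate caused by an at-least-once redelivery
        if op == "delete":
            cache.pop(rowkey, None)
        elif op == "insert":
            cache[rowkey] = dict(fields)
        elif op == "update":
            cache.setdefault(rowkey, {}).update(fields)
        else:
            raise ValueError(f"unknown op: {op}")
        last_seq = seq
    return last_seq


batch1 = [
    (1, "insert", "u1", {"name": "Ann", "city": "Oslo"}),
    (2, "update", "u1", {"city": "Rome"}),
]
# The second batch redelivers record 2 before continuing.
batch2 = [
    (2, "update", "u1", {"city": "Rome"}),
    (3, "insert", "u2", {"name": "Bob"}),
    (4, "delete", "u1", None),
]

cache = {}
seq = apply_changes(cache, 0, batch1)
snapshot = {k: dict(v) for k, v in cache.items()}  # state after the first batch
seq = apply_changes(cache, seq, batch2)
```

Tracking a single high-water mark only works because the log is ordered. A consumer reading several independent partitions would need one such mark per partition.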
All of this new functionality expands the data access capabilities of MapR-DB and helps apply it to a variety of use cases, such as customer 360, personalization, real-time analytics, IoT, and scalable, high-performance enterprise apps. General availability of MapR-DB 6.0 is in Q4 2017.
For more information on MapR-DB, refer to the following: