As of today, May 22nd, 2017, there are over 70 Kafka Connect connectors for streaming data into and out of Apache Kafka!
The connectors for the various applications and data systems are not maintained in the main Apache Kafka code base. An easy way to discover Kafka Connect resources, including connectors, is to search GitHub for 'kafka-connect' or to open https://github.com/search?q=kafka-connect directly.
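If you would rather script that discovery step, here is a minimal sketch using only the Python standard library and GitHub's public repository search endpoint. The helper names (`build_search_url`, `top_repositories`) and the choice to sort by stars are my own assumptions for illustration; unauthenticated requests are rate limited, so treat this as a starting point rather than a robust client.

```python
import json
import urllib.parse
import urllib.request

# Public GitHub REST endpoint for repository search.
GITHUB_SEARCH_API = "https://api.github.com/search/repositories"


def build_search_url(query="kafka-connect", per_page=10):
    """Build the search URL; sorting by stars is an illustrative choice."""
    params = urllib.parse.urlencode(
        {"q": query, "sort": "stars", "order": "desc", "per_page": per_page}
    )
    return f"{GITHUB_SEARCH_API}?{params}"


def top_repositories(query="kafka-connect", per_page=10):
    """Return (full_name, stars) pairs for the top matching repositories."""
    request = urllib.request.Request(
        build_search_url(query, per_page),
        headers={
            "Accept": "application/vnd.github+json",
            # GitHub rejects requests without a User-Agent header.
            "User-Agent": "kafka-connect-discovery-sketch",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return [(item["full_name"], item["stargazers_count"]) for item in payload["items"]]


if __name__ == "__main__":
    # Performs a live HTTP call, so it needs network access.
    for name, stars in top_repositories():
        print(f"{stars:>6}  {name}")
```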
Kafka Connect is included in MapR Streams; please see Kafka Connect for MapR Streams. Now, through simple configuration and with no code necessary, we can leverage these Kafka Connect connectors for large-scale streaming of data into and out of Kafka/MapR Streams for a variety of data systems!
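To give a feel for what "simple configuration" means, here is a hedged sketch that registers the FileStreamSource connector shipped with Apache Kafka through the Kafka Connect REST interface. It assumes a Connect worker running in distributed mode and reachable at `http://localhost:8083` (the default REST port); the connector name, file path, and topic are made-up values, and the helper functions are hypothetical. A MapR Streams setup may need different topic naming and worker settings.

```python
import json
import urllib.request


def file_source_connector(name, path, topic):
    """Describe a FileStreamSource connector as a REST payload.

    The connector itself is pure configuration: a class name plus a few
    settings. No connector code has to be written.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
            "tasks.max": "1",
            "file": path,    # file to tail (illustrative path)
            "topic": topic,  # topic that receives one record per line
        },
    }


def build_create_request(connect_url, connector):
    """Build the POST /connectors request without sending it."""
    body = json.dumps(connector).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url.rstrip('/')}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Assumed worker address; adjust to your environment.
    connector = file_source_connector("local-file-source", "/tmp/input.txt", "file-lines")
    request = build_create_request("http://localhost:8083", connector)
    with urllib.request.urlopen(request, timeout=10) as response:
        print(response.status, response.read().decode("utf-8"))
```

Swapping in another connector is mostly a matter of changing `connector.class` and the connector-specific keys.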
I grouped the available connectors into several categories, specifying their type as either Source, for getting data from another data system into Apache Kafka, or Sink, for getting data from Kafka into another data system:
- Attunity Replicate (Source), Dbvisit Replicate Connector for Oracle (Source), Oracle GoldenGate (Source), IBM Data Replication (Source), Debezium [MySQL, PostgreSQL, MongoDB]
- JDBC (Source, Sink), MySQL, Blockchain, Edge Intelligence InfluxDB (Sink), KineticaDB (Sink), KLP-PostgreSQL (Sink) from InfoBright, SAP HANA (Source, Sink), Vertica (Source, Sink), VoltDB (Sink), RethinkDB (Sink), OpenTSDB (Sink)
- Azure DocumentDB (Sink), Aerospike (Sink), Cassandra (Source, Sink), Couchbase (Source), Druid (Sink), DynamoDB (Source, Sink), HBase (Source, Sink), MongoDB (Source, Sink), Redis (Sink), MarkLogic (Sink)
- FTP (Source), HTTP (Source), File (Source, Sink), FileSystem (Source), HDFS (Sink), Apache Kudu (Sink), spooldir (Source)
- Splunk (Sink, Source), Syslog (Source)
- Elasticsearch (Sink), Solr (Sink, Source)
- Amazon S3, Google Cloud Storage, Azure Blob Store (on the roadmap)
- Syncsort DMX (Source, Sink)
- Azure IoT Hub (Source), CoAP [Constrained Application Protocol] (Source, Sink), MQTT (Source), Flogo (Source)
- BigQuery (Sink), Hive (Sink)
- Apache Ignite (Source, Sink), Hazelcast (Sink)
- AMQP, Google PubSub (Source, Sink), JMS (Source, Sink), Amazon SQS (Source), MQTT (Source), Slack via webhooks (Sink), RabbitMQ, AWS Kinesis
- Bloomberg Feeds (Source), Jenkins (Source), Salesforce (Source), IRC [Internet Relay Chat] (Source), PubNub, Mobile Apps, Twitter (Source, Sink), Yahoo Finance (Source), GitHub (Source)
- Mixpanel (Source)
- JMX (Source)
- DocumentSource
A few examples of use cases would be:
- Publishing SQL Tables (or an entire SQL database) into Apache Kafka
- Consuming streams from Apache Kafka into HDFS for batch processing
- Consuming streams from Apache Kafka into Elasticsearch for secondary indexing
- Integrating legacy systems, such as mainframes, with Apache Kafka
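As a rough illustration of the first and third use cases, the sketch below generates standalone-mode `.properties` text for a JDBC source and an Elasticsearch sink. It assumes the Confluent JDBC and Elasticsearch connectors; the class names and keys are written as I recall them from Confluent's documentation, so please verify them against the connector version you install. The database URL, the table name, the `id` column, and the helper functions are all hypothetical.

```python
def jdbc_source_config(connection_url, table, topic_prefix):
    """Publish one SQL table into Kafka (assumes the Confluent JDBC source connector)."""
    return {
        "name": f"jdbc-source-{table}",
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "tasks.max": "1",
        "connection.url": connection_url,
        "table.whitelist": table,
        # Pick up new rows by watching a strictly increasing column.
        "mode": "incrementing",
        "incrementing.column.name": "id",  # assumed column name
        # Rows land in the topic named <topic_prefix><table>.
        "topic.prefix": topic_prefix,
    }


def elasticsearch_sink_config(connection_url, topic):
    """Index a Kafka topic into Elasticsearch (assumes the Confluent Elasticsearch sink)."""
    return {
        "name": f"elasticsearch-sink-{topic}",
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "tasks.max": "1",
        "topics": topic,
        "connection.url": connection_url,
        "type.name": "kafka-connect",
        "key.ignore": "true",
    }


def to_properties(config):
    """Render a config dict as key=value lines for a standalone worker."""
    return "\n".join(f"{key}={value}" for key, value in config.items()) + "\n"


if __name__ == "__main__":
    source = jdbc_source_config("jdbc:mysql://localhost:3306/shop", "orders", "mysql-")
    # The sink reads the topic the source writes to: prefix + table name.
    sink = elasticsearch_sink_config("http://localhost:9200", "mysql-orders")
    print(to_properties(source))
    print(to_properties(sink))
```

Chaining the two gives a small pipeline in which rows from a SQL table flow through Kafka and end up searchable in Elasticsearch, with no custom code beyond configuration.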
Please share your experience using Kafka Connect connectors with MapR Streams in the comments section!