Apache Kafka options for edge applications
This document describes the Apache Kafka options available to users developing edge applications.
Apache Kafka allows users to publish and subscribe to streams of records, similar to a message queue or enterprise messaging system. This makes it an effective tool for edge solutions to pass data from sensors and systems close to the edge to a large data processing system in a private or public cloud.
The rest of this document covers how to install and deploy several flavors of Kafka and how to connect to those deployments in IBM Streams applications using the streamsx.kafka or streamsx.messagehub toolkits.
Before you begin: Streams application development
- Python developers: use the streamsx.kafka Python package regardless of Kafka deployment. Usage of streamsx.eventstreams is not recommended because it is no longer updated.
- SPL developers: use the KafkaConsumer and KafkaProducer operators in the streamsx.kafka toolkit for all Kafka deployments other than IBM Event Streams. If you plan to use IBM Event Streams, use the streamsx.messagehub toolkit instead.
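For Python development, the package is available from PyPI; the command below is a minimal install sketch.

# Install (or upgrade) the streamsx.kafka Python package; it pulls in the base streamsx package
pip install --upgrade streamsx.kafka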
Using an existing Kafka deployment
If you already have a Kafka environment, configure the KafkaConsumer and KafkaProducer operators and the property file in your Streams application to access that existing environment; a minimal property file sketch follows the links below.
- Python applications: see Connection examples.
- SPL applications: see streamsx.kafka samples.
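As an illustration, a minimal property file for an existing, unsecured deployment may only need the bootstrap address; the host names below are placeholders, and SSL or SASL properties must be added if your cluster requires them.

# Write a minimal kafka.properties (placeholder broker addresses; adjust to your deployment)
cat > kafka.properties <<EOF
bootstrap.servers=my-kafka-host-1:9092,my-kafka-host-2:9092
EOF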
Red Hat AMQ Streams
AMQ Streams is Red Hat’s distribution of Apache Kafka, based on the Strimzi project, which simplifies the process of running Apache Kafka in an OpenShift cluster.
For a full overview of AMQ Streams and Kafka concepts and architecture, see the AMQ Streams overview documentation.
Installing and deploying AMQ Streams
AMQ Streams can be installed and deployed on OpenShift Container Platform or on Red Hat Enterprise Linux. See the respective AMQ Streams documentation for instructions on how to download, install, and deploy it.
Quick Start for OpenShift 4.3
- Go to the OperatorHub in the OCP console and select the AMQ Streams operator.
- Select the installation mode to be a specific namespace.
- Select a namespace where the AMQ Streams deployment will be (e.g. amq-streams).
  - If you need to create a new namespace, create a new project via the OCP console, or via the CLI by running oc new-project amq-streams.
- Click ‘Subscribe’ and wait for the AMQ Streams operator to be installed.
- Once installed, click on the operator and click ‘Create Instance’ for a ‘Kafka’ resource.
- Edit the YAML to add a route, change the name, or set the storage type of the Kafka or Zookeeper deployments (a CLI sketch follows this list).
  - Under .spec.kafka.listeners, add external: type: route
  - The default YAML will create an ephemeral Kafka cluster named ‘my-cluster’.
  - If you need a persistent cluster, see the AMQ Streams documentation for more information.
- Once done editing the Kafka YAML, click ‘Create’.
- Return to the AMQ Streams operator, click ‘Kafka Topic’, and click ‘Create KafkaTopic’.
- Set the name, partitions, and config for the topic as desired.
- Click ‘Create’.
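As an alternative to editing the YAML in the console, the same Kafka and KafkaTopic resources can be created from the CLI. The snippet below is a minimal sketch, assuming the amq-streams namespace and the my-cluster name used above; the apiVersion and listener layout vary between AMQ Streams releases, so compare against the YAML generated by your operator.

# Create a minimal ephemeral Kafka cluster with an external route listener
oc apply -n amq-streams -f - <<EOF
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 3
    listeners:
      plain: {}
      tls: {}
      external:
        type: route
    storage:
      type: ephemeral
  zookeeper:
    replicas: 3
    storage:
      type: ephemeral
  entityOperator:
    topicOperator: {}
    userOperator: {}
EOF

# Create a topic managed by the topic operator
oc apply -n amq-streams -f - <<EOF
apiVersion: kafka.strimzi.io/v1beta1
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 3
  replicas: 3
EOF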
Connecting to AMQ Streams
- In a terminal, run the commands below to get the certificates and keys necessary to connect:

    # Get Kafka bootstrap route; value will be referred to as <RouteURL> later
    oc get routes my-cluster-kafka-bootstrap -n amq-streams -o=jsonpath='{.status.ingress[0].host}{"\n"}'
    # Extract server public cert, client public cert, and client private key
    oc extract secret/my-cluster-cluster-ca-cert -n amq-streams --keys=ca.crt --to=- > ca.crt
    oc extract secret/my-cluster-client-ca-cert -n amq-streams --keys=ca.crt --to=- > user.crt
    oc extract secret/my-cluster-client-ca -n amq-streams --keys=ca.key --to=- > user.key
- Use one of the following methods, depending on application type:
  - Python applications: use the streamsx-kafka-make-properties command to create a properties file.
  - SPL applications:
    - Create the truststore and keystore manually:

        keytool -import -trustcacerts -alias root -file ca.crt -keystore truststore.jks -storepass trustpassword -noprompt
        openssl pkcs12 -export -in user.crt -inkey user.key -name client-alias -out ./keystore.pkcs12 -noiter -nomaciter -passout pass:keypassword
        keytool -importkeystore -deststorepass keypassword -destkeystore ./keystore.jks -srckeystore ./keystore.pkcs12 -srcstoretype pkcs12 -srcstorepass keypassword

    - Populate a kafka.properties file with the following values:

        bootstrap.servers=<RouteURL>
        security.protocol=SSL
        ssl.keystore.type=JKS
        ssl.keystore.password=keypassword
        ssl.key.password=keypassword
        ssl.keystore.location={applicationDir}/etc/keystore.jks
        ssl.endpoint.identification.algorithm=https
        ssl.truststore.type=JKS
        ssl.truststore.password=trustpassword
        ssl.truststore.location={applicationDir}/etc/truststore.jks

    - Copy the JKS files and kafka.properties to etc/ in the SPL application workspace.
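Optionally, before building the keystores into a Streams application, you can sanity-check the route and the extracted cluster CA certificate with a direct TLS handshake. This is a sketch; it assumes the external route terminates TLS on port 443, the default for routes created by AMQ Streams.

# Verify that the bootstrap route presents a certificate that chains to ca.crt
# (<RouteURL> is the host name captured earlier; look for "Verify return code: 0 (ok)")
openssl s_client -connect <RouteURL>:443 -servername <RouteURL> -CAfile ca.crt </dev/null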
For more information, see Using streamsx.kafka with Red Hat AMQ Streams documentation.
Event Streams in IBM Cloud
IBM Event Streams builds on top of open source Apache Kafka to offer enterprise-grade event streaming capabilities.
Provisioning Event Streams
- Log in to IBM Cloud or create an account if you do not have one.
- Visit Event Streams in the catalog.
- Select a region (e.g. Dallas, Frankfurt).
- Select a plan (e.g. Lite).
  - Important: The Lite plan only allows one topic, which may not be enough for some samples to work.
- Enter a service name (e.g. Event Streams for Edge).
- Click ‘Create’.
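The instance can also be provisioned from the IBM Cloud CLI instead of the console. The commands below are a sketch; the catalog service name messagehub, the lite plan, and the us-south region are assumptions to adjust for your account.

# Log in and target a resource group first, e.g.:
#   ibmcloud login
#   ibmcloud target -g default
# Provision an Event Streams instance (catalog service name: messagehub) on the Lite plan
ibmcloud resource service-instance-create "Event Streams for Edge" messagehub lite us-south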
Creating credentials and a topic
- Go to ‘Service credentials’ in the navigation pane.
- Click ‘New credential’.
- Give the credential a name so you can identify its purpose later. You can accept the default value.
- Give the credential the Manager role so that it can access the topics, and create them if necessary.
- Click ‘Add’. The new credential is listed in the table in Service credentials.
- For the newly created credentials, click the ‘Copy to clipboard’ icon.
- Go to the Topics tab.
- Go to ‘Manage’ in the navigation pane.
- Click ‘Create a topic’.
- Name your topic.
- Keep the defaults for the rest of the topic creation, then click ‘Next’ and ‘Create topic’.
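These credential and topic steps can also be scripted with the IBM Cloud CLI and its Event Streams plugin. The commands below are a sketch; the key name, topic name, and plugin command syntax are assumptions to check against ibmcloud es --help for your CLI version.

# Create a Manager-role service key for the instance and print it
ibmcloud resource service-key-create es-edge-credentials Manager --instance-name "Event Streams for Edge"
ibmcloud resource service-key es-edge-credentials --output json

# Create a topic with the Event Streams plugin
ibmcloud plugin install event-streams
ibmcloud es init                                  # select the instance when prompted
ibmcloud es topic-create my-topic --partitions 1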
For more details about Event Streams, see the Event Streams documentation.
Connecting to Event Streams
Save the copied credentials to a file, or use them to create a Streams application config:
- Python applications: see Connecting with the IBM Event Streams cloud service.
- SPL applications: see streamsx.messagehub samples for connection configuration options.
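For example, the copied credentials JSON can be pasted into a local file that the linked connection docs then reference; the file name below is only an example.

# Paste the copied Event Streams service credentials JSON between the EOF markers
cat > eventstreams-credentials.json <<'EOF'
<paste the copied service credentials JSON here>
EOF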
Vanilla Apache Kafka
Apache Kafka can be deployed on bare-metal or VM systems as well as in Kubernetes or OpenShift environments. Edge applications can leverage these Kafka installations; however, the edge systems where the edge application will be running must be able to connect to the system or environment where Kafka is running. Additionally, any cloud services or applications that consume Kafka topics must be able to access that system or environment.
Because users should already have access to a Kubernetes or OpenShift environment, the following install section covers using Helm charts and operators to deploy Kafka servers.
Installing and deploying Kafka
Helm
- Download the latest Helm 3 release.
- Follow the instructions for the Bitnami Kafka Helm charts.
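For example, with Helm 3 installed, a development Kafka cluster can be stood up roughly as follows; the release name and namespace are illustrative, and the chart's values should be reviewed for anything beyond experimentation.

# Add the Bitnami chart repository and install a Kafka release into its own namespace
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install my-kafka bitnami/kafka --namespace kafka --create-namespace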
Kubernetes Operator
To deploy Kafka using a Kubernetes operator, use the Strimzi Kafka Operator.
Setup and configuration are nearly identical to AMQ Streams. For full information, visit the Strimzi documentation.
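The Strimzi quick start amounts to installing the operator into a namespace and applying one of the example Kafka custom resources. The commands below are a sketch based on that quick start; URLs and example file names may change between Strimzi releases.

# Install the Strimzi operator into the kafka namespace
kubectl create namespace kafka
kubectl create -f 'https://strimzi.io/install/latest?namespace=kafka' -n kafka
# Create an example single-broker, persistent Kafka cluster
kubectl apply -f https://strimzi.io/examples/latest/kafka/kafka-persistent-single.yaml -n kafka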
Connecting to Kafka
- Python applications: see Connection examples.
- SPL applications: see streamsx.kafka samples.
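Before pointing a Streams application at the cluster, connectivity can be smoke-tested from inside the cluster with the Kafka console tools. The sketch below assumes the Bitnami chart's default bootstrap service name (<release>-kafka, here my-kafka) in the kafka namespace.

# Start a throwaway client pod with the Kafka CLI tools
kubectl run kafka-client --rm -ti --image=bitnami/kafka:latest -n kafka -- bash
# Inside the pod: produce and consume a test message
kafka-console-producer.sh --bootstrap-server my-kafka.kafka.svc.cluster.local:9092 --topic test
kafka-console-consumer.sh --bootstrap-server my-kafka.kafka.svc.cluster.local:9092 --topic test --from-beginning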
What to do next?
Build and test your application using your Streams service instance in Cloud Pak for Data. Once your application is ready to be built as an edge application, see Building an edge application for more information.