Using Kubernetes Secrets in Kafka Connect Configurations

Kafka Connect is an integration framework that is part of the Apache Kafka project. On Kubernetes and Red Hat OpenShift, you can deploy it using the Strimzi and Red Hat AMQ Streams operators. Kafka Connect has two kinds of connectors: source connectors import data from external systems into Kafka, and sink connectors export data from Kafka to external systems. Connecting to an external system usually requires authentication, so you have to provide credentials when configuring a connector. In this post, we show how to use Kubernetes secrets to store those credentials when configuring connectors.

Here we will use the S3 source connector, which is part of the Apache Camel Kafka Connector project, and use it as an example to show how to configure a connector so that it reads credentials from a secret.

The setup procedure described here is universal and works for any type of connector. We will use the S3 connector to connect to Amazon AWS S3 storage and load files from an S3 bucket into an Apache Kafka topic. To connect to S3, we need the following AWS credentials: an access key and a secret key. So, let's start by preparing a secret with the credentials.

Create a secret with credentials

First, we create a simple properties file, aws-credentials.properties, and put the credentials into it, like this:

aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

The credentials specified here must provide access to the S3 bucket that the data will be read from. Now generate a secret from this file with the following command:

$ kubectl create secret generic aws-credentials --from-file=./aws-credentials.properties
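
If you want to verify that the secret was created, you can inspect it with kubectl; the properties file will appear base64-encoded under the data field:

$ kubectl get secret aws-credentials -o yaml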

Build a new container image with the connector

Next, you need to prepare a new container image that includes our connector. If you are using Strimzi, the Dockerfile for adding the connector looks like this:

FROM strimzi/kafka:0.16.1-kafka-2.4.0
USER root:root
COPY ./my-plugins/ /opt/kafka/plugins/
USER 1001

If you are using Red Hat AMQ Streams, the Dockerfile looks like this:

FROM registry.redhat.io/amq7/amq-streams-kafka-23:1.3.0
USER root:root
COPY ./my-plugins/ /opt/kafka/plugins/
USER jboss:jboss

Then use the Dockerfile to build a container image containing the connectors we need and push it to a registry. If you do not have your own private registry, you can use a public one, for example Quay or Docker Hub.
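
As a minimal sketch, assuming the connector plugin has already been unpacked into the ./my-plugins/ directory and that quay.io/my-account is only a placeholder for your own repository, the build and push could look like this:

# build the image with the connector plugins and push it to the registry
$ docker build -t quay.io/my-account/kafka:camel-kafka-2.4.0 .
$ docker push quay.io/my-account/kafka:camel-kafka-2.4.0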

Deploy Apache Kafka Connect

Now that we have the container image, we can deploy Apache Kafka Connect by creating the following resource in Kubernetes:

apiVersion: kafka.strimzi.io/v1alpha1
kind: KafkaConnect
metadata:
  name: my-connect-cluster
spec:
  image: docker.io/scholzj/kafka:camel-kafka-2.4.0
  replicas: 3
  bootstrapServers: my-cluster-kafka-bootstrap:9092
  externalConfiguration:
    volumes:
      - name: aws-credentials
        secret:
          secretName: aws-credentials
  config:
    config.providers: file
    config.providers.file.class: org.apache.kafka.common.config.provider.FileConfigProvider
    key.converter: org.apache.kafka.connect.json.JsonConverter
    value.converter: org.apache.kafka.connect.json.JsonConverter
    key.converter.schemas.enable: false
    value.converter.schemas.enable: false

What should you pay attention to in this resource? First, the image field, which tells the operator which image to use for the Apache Kafka Connect deployment, that is, the image that includes the connectors we need. In our example, the container image built in the previous step was pushed to the Docker Hub registry as scholzj/kafka:camel-kafka-2.4.0, so the image field looks like this:

image: docker.io/scholzj/kafka:camel-kafka-2.4.0

Second, the externalConfiguration section:

externalConfiguration:
  volumes:
    - name: aws-credentials
      secret:
        secretName: aws-credentials

Here we tell the operator to mount the Kubernetes secret aws-credentials (created above) into the Apache Kafka Connect pods. The secrets listed here are mounted under the path /opt/kafka/external-configuration/<secret-name>, where <secret-name> is the name of the secret.
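
To double-check that the secret is really mounted, you can read the file from one of the Connect pods; <connect-pod-name> below is a placeholder for an actual pod name from your cluster:

$ kubectl exec <connect-pod-name> -- cat /opt/kafka/external-configuration/aws-credentials/aws-credentials.properties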

Finally, note the config section, where we tell Apache Kafka Connect to use FileConfigProvider as a configuration provider:

config:
  config.providers: file
  config.providers.file.class: org.apache.kafka.common.config.provider.FileConfigProvider

A configuration provider is a way to avoid writing parameter values directly in the configuration and to take them from another source instead. In our case, we define a configuration provider named file that uses the FileConfigProvider class, which is part of Apache Kafka. FileConfigProvider can read property files and extract values from them, and we will use it to load the API keys for our Amazon AWS account.
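
In general, a value resolved by this provider follows the pattern ${<provider-name>:<path-to-properties-file>:<property-key>}. With the provider named file as above, a hypothetical reference (the file and key names here are made up for illustration) would look like this:

${file:/opt/kafka/external-configuration/my-secret/my.properties:my_key}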

Create a connector using the Apache Kafka Connect REST API

Usually, you have to wait a minute or two for the Apache Kafka Connect deployment to become ready before creating a connector instance. In older versions of Strimzi and Red Hat AMQ Streams, you have to use the REST API for this: the connector is created by POSTing the following JSON:

{
 "name": "s3-connector",
 "config": {
   "connector.class": "org.apache.camel.kafkaconnector.CamelSourceConnector",
   "tasks.max": "1",
   "camel.source.kafka.topic": "s3-topic",
   "camel.source.maxPollDuration": "10000",
   "camel.source.url": "aws-s3://camel-connector-test?autocloseBody=false",
   "key.converter": "org.apache.kafka.connect.storage.StringConverter",
   "value.converter": "org.apache.camel.kafkaconnector.converters.S3ObjectConverter",
   "camel.component.aws-s3.configuration.access-key": "${file:/opt/kafka/external-configuration/aws-credentials/aws-credentials.properties:aws_access_key_id}",
   "camel.component.aws-s3.configuration.secret-key": "${file:/opt/kafka/external-configuration/aws-credentials/aws-credentials.properties:aws_secret_access_key}",
   "camel.component.aws-s3.configuration.region": "US_EAST_1"
   }
}

The connector configuration contains the AWS API keys in the camel.component.aws-s3.configuration.access-key and camel.component.aws-s3.configuration.secret-key fields. Instead of writing the values directly, we simply reference the file configuration provider so that it loads the aws_access_key_id and aws_secret_access_key fields from our aws-credentials.properties file.

Note exactly how the configuration provider is referenced: we specify the path to the file to use and, after a colon, the name of the key to extract, like this:

"camel.component.aws-s3.configuration.access-key": "${file:/opt/kafka/external-configuration/aws-credentials/aws-credentials.properties:aws_access_key_id}"

And like this:

"camel.component.aws-s3.configuration.secret-key": "${file:/opt/kafka/external-configuration/aws-credentials/aws-credentials.properties:aws_secret_access_key}"

POST this JSON to the Apache Kafka Connect REST API, for example, using curl:

$ curl -X POST -H "Content-Type: application/json" -d @connector-config.json http://my-connect-cluster-connect-api:8083/connectors
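
Once the connector is created, you can check that it started successfully through the same REST API, for example:

$ curl http://my-connect-cluster-connect-api:8083/connectors/s3-connector/status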

One of the advantages of configuration providers is that they do not expose the configuration parameters that need to stay secret, for example when the connector configuration is read back through the REST API:

$ curl http://my-connect-cluster-connect-api:8083/connectors/s3-connector
{
  "name": "s3-connector",
  "config": {
    "connector.class": "org.apache.camel.kafkaconnector.CamelSourceConnector",
    "camel.source.maxPollDuration": "10000",
    "camel.source.url": "aws-s3://camel-connector-test?autocloseBody=false",
    "camel.component.aws-s3.configuration.region": "US_EAST_1",
    "camel.component.aws-s3.configuration.secret-key": "${file:/opt/kafka/external-configuration/aws-credentials/aws-credentials.properties:aws_secret_access_key}",
    "tasks.max": "1",
    "name": "s3-connector",
    "value.converter": "org.apache.camel.kafkaconnector.converters.S3ObjectConverter",
    "camel.component.aws-s3.configuration.access-key": "${file:/opt/kafka/external-configuration/aws-credentials/aws-credentials.properties:aws_access_key_id}",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "camel.source.kafka.topic": "s3-topic"
  },
  "tasks": [
    {
      "connector": "s3-connector",
      "task": 0
    }
  ],
  "type": "source"
}

Create a connector using the Strimzi operator

Starting with version 0.16.0, Strimzi also includes an operator for creating connectors from a KafkaConnector custom resource, and the configuration provider can be used in it directly as well:

apiVersion: kafka.strimzi.io/v1alpha1
kind: KafkaConnector
metadata:
  name: s3-connector
  labels:
    strimzi.io/cluster: my-connect-cluster
spec:
  class: org.apache.camel.kafkaconnector.CamelSourceConnector
  tasksMax: 1
  config:
    key.converter: org.apache.kafka.connect.storage.StringConverter
    value.converter: org.apache.camel.kafkaconnector.converters.S3ObjectConverter
    camel.source.kafka.topic: s3-topic
    camel.source.url: aws-s3://camel-connector-test?autocloseBody=false
    camel.source.maxPollDuration: 10000
    camel.component.aws-s3.configuration.access-key: ${file:/opt/kafka/external-configuration/aws-credentials/aws-credentials.properties:aws_access_key_id}
    camel.component.aws-s3.configuration.secret-key: ${file:/opt/kafka/external-configuration/aws-credentials/aws-credentials.properties:aws_secret_access_key}
    camel.component.aws-s3.configuration.region: US_EAST_1
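
Assuming the YAML above is saved as s3-connector.yaml (the file name is arbitrary), the connector is created with kubectl, and the operator then takes care of calling the Connect REST API for you:

$ kubectl apply -f s3-connector.yaml
$ kubectl get kafkaconnector s3-connector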

To summarize

Kubernetes secrets have their security limitations: any user with the rights to exec into the container can still read the mounted secrets. But the approach described in this article at least ensures that credentials, API keys, and other secret information are not exposed when the connector configuration is accessed through the REST API or the KafkaConnector custom resource.
