Kafka

Kafka is a distributed event streaming platform for data pipelines, streaming analytics, data integration, and mission-critical applications. Configure a Kafka catalog source to extract metadata from Apache Kafka, Confluent Platform, and Confluent Cloud source systems.

Objects extracted

The metadata extraction service extracts the following objects from a Kafka source system:

Supported file types

You can extract the following file types from Apache Kafka and Confluent Cloud source systems:

Prerequisites for configuring the Kafka catalog source

Use the Kafka connector to connect to a Kafka source system. For information about configuring a connection in Administrator, see Connections in the Cloud Common Services help.

Connection properties

When you configure a connection to Kafka in Administrator, you can view the connection properties for that connection on the Registration page in Metadata Command Center.
On the Registration page, select the connection and then choose one of the following Kafka distributions to view the connection properties:
The following table describes the connection properties for the Apache Kafka and Confluent Platform distributions:
Property
Description
Runtime Environment
Name of the runtime environment where you want to run tasks.
Specify a Secure Agent or a serverless runtime environment for a mapping that runs on the advanced cluster.
Kafka Broker List
Comma-separated list of the Kafka brokers.
To list a Kafka broker, use the following format:
<HostName>:<PortNumber>
Note: When you connect to a Kafka broker over SSL, you must specify the fully qualified domain name for the host name. Otherwise, the test connection fails with SSL handshake error.
Retry Timeout
Optional. Number of seconds after which the Secure Agent attempts to reconnect to the Kafka broker to read or write data.
Default is 180 seconds.
Kafka Broker Version
Kafka message broker version. The only valid value is Apache 0.10.1.1 and above.
Additional Connection Properties
Optional. Comma-separated list of additional configuration properties of the Kafka producer or consumer.
For a streaming ingestion and replication task, if you configure the security protocol as SASL_PLAINTEXT or SASL_SSL, ensure that you also set the Kerberos name property.
For a database ingestion and replication task, if you want to specify a security protocol and related properties, specify them here instead of in the Additional Security Properties property.
For example: security.protocol=SSL,ssl.truststore.location=/opt/kafka/config/kafka.truststore.jks,ssl.truststore.password=<truststore_password>
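To illustrate how the Additional Connection Properties value is structured, the following sketch splits the comma-separated key=value list into a dictionary. The parse_kafka_props helper is hypothetical and not part of the product; it also does not handle commas embedded inside property values.

```python
def parse_kafka_props(prop_string: str) -> dict:
    """Split a comma-separated list of key=value Kafka properties
    into a dict. Illustrative helper only."""
    props = {}
    for pair in prop_string.split(","):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first '=' only, so values may contain '='.
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"Malformed property (missing '='): {pair!r}")
        props[key.strip()] = value.strip()
    return props

example = ("security.protocol=SSL,"
           "ssl.truststore.location=/opt/kafka/config/kafka.truststore.jks,"
           "ssl.truststore.password=changeit")
print(parse_kafka_props(example)["security.protocol"])  # SSL
```

Each property name, such as security.protocol or ssl.truststore.location, is passed through to the underlying Kafka producer or consumer configuration.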
If you choose the Confluent Cloud distribution, specify values for the following properties on the Registration page in Metadata Command Center:
Property
Description
Schema Registry URL
Specify a URL to access the schema registry. The URL syntax is http://host1:port1.
Confluent Cloud API Key
An API key to manage access and authentication to Confluent Cloud.
Confluent Cloud API Secret
An API secret for the Confluent Cloud API key.
Confluent Cloud Schema Registry API Key
An API key to interact with schema registry in Confluent Cloud.
Confluent Cloud Schema Registry API Secret
An API secret for the Confluent Cloud schema registry API key.
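As context for the API key and secret properties above: Confluent Cloud authenticates schema registry calls with the key and secret as HTTP basic-auth credentials. The sketch below builds, but does not send, a request against the standard Schema Registry /subjects endpoint; the host name and credentials are placeholders, and the helper name is hypothetical.

```python
import base64
from urllib.request import Request

def schema_registry_request(url: str, api_key: str, api_secret: str) -> Request:
    """Build (but do not send) an HTTP request that lists subjects in a
    schema registry, authenticating with basic auth. Sketch only."""
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    return Request(
        url.rstrip("/") + "/subjects",  # standard Schema Registry REST endpoint
        headers={"Authorization": f"Basic {token}"},
    )

req = schema_registry_request("https://psrc-example.confluent.cloud",
                              "MYKEY", "MYSECRET")
print(req.full_url)  # https://psrc-example.confluent.cloud/subjects
```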

Configuration parameters for metadata extraction

Expand the Catalog Source Configuration Options section on the Metadata Extraction tab of the Configuration page. Configure the following parameters to extract metadata from a Kafka source system:
Parameter
Description
Polling Strategy
Select one of the following polling strategies for sampling messages:
  • From Beginning
  • From End
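In Apache Kafka consumer terms, these strategies correspond to reading from the earliest retained offset versus only messages produced after the consumer joins, which is the behavior controlled by the consumer's auto.offset.reset setting. The mapping below is an illustrative assumption, not product documentation:

```python
# Map the catalog source polling strategy names from the table above
# to the Kafka consumer's auto.offset.reset values (assumed mapping).
OFFSET_RESET = {
    "From Beginning": "earliest",  # sample from the oldest retained message
    "From End": "latest",          # sample only newly produced messages
}

def offset_reset_for(strategy: str) -> str:
    try:
        return OFFSET_RESET[strategy]
    except KeyError:
        raise ValueError(f"Unknown polling strategy: {strategy!r}") from None

print(offset_reset_for("From Beginning"))  # earliest
```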