Kafka is a distributed event streaming platform for data pipelines, streaming analytics, data integration, and mission-critical applications. Configure a Kafka catalog source to extract metadata from Apache Kafka, Confluent Platform, and Confluent Cloud source systems.
Objects extracted
The metadata extraction service extracts the following objects from a Kafka source system:
•Cluster
•Topic
•Field
Supported file types
You can extract the following file types from Confluent Cloud and Apache Kafka source systems:
•Confluent Cloud. JSON, XML, CSV, and AVRO
•Apache Kafka. JSON, XML, and CSV
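The list above can be read as a simple lookup from distribution to file types. The following Python sketch is illustrative only: the names `SUPPORTED_FILE_TYPES` and `is_supported` are hypothetical and are not part of any product API.

```python
# File types listed above, keyed by Kafka distribution.
# The dictionary and function names are hypothetical, for illustration only.
SUPPORTED_FILE_TYPES = {
    "Confluent Cloud": {"JSON", "XML", "CSV", "AVRO"},
    "Apache Kafka": {"JSON", "XML", "CSV"},
}


def is_supported(distribution: str, file_type: str) -> bool:
    """Return True if the file type is listed for the given distribution."""
    # A distribution that the list does not mention falls back to an
    # empty set, so the function returns False for it.
    return file_type.upper() in SUPPORTED_FILE_TYPES.get(distribution, set())
```

For example, `is_supported("Apache Kafka", "avro")` returns `False` because AVRO is listed only for Confluent Cloud.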
Prerequisites for configuring the Kafka catalog source
Use the Kafka connector to connect to the Kafka source system. For information about configuring a connection in Administrator, see Connections in the Cloud Common Services help.
Connection properties
When you configure a connection to Kafka in Administrator, you can view the connection properties for that connection on the Registration page in Metadata Command Center.
On the Registration page in Metadata Command Center, choose the connection and choose one of the following Kafka distributions to view the connection properties:
•Apache Kafka. An on-premises distributed event streaming platform for data pipelines, streaming analytics, and data integration.
•Confluent Platform. An enhanced on-premises distributed event streaming platform based on Apache Kafka.
•Confluent Cloud. A fully managed cloud streaming data service based on Apache Kafka.
The following table describes the connection properties for the Apache Kafka and Confluent Platform distributions:
| Property | Description |
| --- | --- |
| Runtime Environment | The execution platform that runs tasks. The runtime environment is either a Secure Agent or a serverless runtime environment. |
| Kafka Broker List | A list of brokers in the following format: <hostname>:<port number>. |
| Retry Timeout | Time in seconds to connect to the Kafka broker. |
| Kafka Broker Version | The version of the Kafka broker that runs the catalog source. |
| Additional Connection Properties | Optional. Comma-separated list of additional configuration properties of the Kafka producer or consumer. |
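Before you enter a value for Kafka Broker List, it can help to check that each entry follows the <hostname>:<port number> format. The following Python sketch shows one way to do that. It assumes that multiple brokers are separated by commas, which the property description does not state, and the function name `parse_broker_list` is hypothetical.

```python
def parse_broker_list(value: str) -> list[tuple[str, int]]:
    """Split a broker list into (hostname, port) pairs and validate each entry.

    Assumption: entries are comma-separated. The property description only
    defines the format of a single entry, <hostname>:<port number>.
    """
    brokers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing comma
        # Split on the last colon so the port is always the final component.
        host, separator, port = entry.rpartition(":")
        if not separator or not host or not port.isdigit():
            raise ValueError(f"expected <hostname>:<port number>, got {entry!r}")
        port_number = int(port)
        if not 1 <= port_number <= 65535:
            raise ValueError(f"port out of range in {entry!r}")
        brokers.append((host, port_number))
    if not brokers:
        raise ValueError("broker list is empty")
    return brokers
```

The hostnames in a call such as `parse_broker_list("broker1.example.com:9092, broker2.example.com:9093")` are placeholders. Replace them with the brokers of your own cluster.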
If you choose the Confluent Cloud distribution, specify values for the following properties on the Registration page in Metadata Command Center:
| Property | Description |
| --- | --- |
| Schema Registry URL | Specify a URL to access the schema registry. The URL syntax is http://host1:port1. |
| Confluent Cloud API Key | An API key to manage access and authentication to Confluent Cloud. |
| Confluent Cloud API Secret | An API secret for the Confluent Cloud API key. |
| Confluent Cloud Schema Registry API Key | An API key to interact with the schema registry in Confluent Cloud. |
| Confluent Cloud Schema Registry API Secret | An API secret for the Confluent Cloud schema registry API key. |
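To check the Confluent Cloud values before you register the catalog source, you can validate the URL syntax and see how a key and secret pair is typically presented to a schema registry. The following Python sketch uses only the standard library and makes no network calls. It rests on two assumptions that the property descriptions do not confirm: that an https URL is acceptable in addition to the documented http form, and that the schema registry API key and secret are sent as HTTP Basic credentials. The function names are hypothetical.

```python
import base64
from urllib.parse import urlsplit


def check_schema_registry_url(url: str) -> str:
    """Validate the documented syntax http://host1:port1 and drop a trailing slash."""
    parts = urlsplit(url)
    # Assumption: https is accepted as well as the documented http scheme.
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"expected a URL such as http://host1:port1, got {url!r}")
    return url.rstrip("/")


def basic_auth_header(api_key: str, api_secret: str) -> str:
    """Build an HTTP Basic Authorization header value from a key and secret.

    Assumption: the schema registry accepts the API key as the user name
    and the API secret as the password.
    """
    token = base64.b64encode(f"{api_key}:{api_secret}".encode("utf-8"))
    return "Basic " + token.decode("ascii")
```

Treat the header value like the secret itself and do not write it to logs, because Base64 is an encoding and not encryption.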
Configuration parameters for metadata extraction
Expand Catalog Source Configuration Options on the Metadata Extraction tab of the Configuration page. Configure the following parameters for extracting metadata from a Kafka source system:
Parameter
Description
Polling Strategy
Select one of the following polling strategies for sampling messages: