
Kafka Data Objects

A Kafka data object is a physical data object that represents data in a Kafka stream. After you configure a Messaging connection, create a Kafka data object to write to Apache Kafka brokers.
Kafka runs as a cluster of one or more servers, each of which is called a broker. Kafka brokers stream data in the form of messages, which are published to a topic. When you write data to a Kafka messaging stream, specify the name of the topic that you publish to. You can also write to a Kerberos-enabled Kafka cluster.
Kafka topics are divided into partitions. Spark Streaming can read the partitions of a topic in parallel, which improves throughput and scales the number of messages processed. Message ordering is guaranteed only within a partition. For optimal performance, use multiple partitions.
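The per-partition ordering guarantee follows from how the producer routes keyed messages: every message with the same key hashes to the same partition. The sketch below illustrates that idea; note that the real Kafka producer uses a murmur2 hash of the key bytes, while this simplified version uses CRC32 only to demonstrate the behavior.

```python
import zlib

def assign_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition.

    Illustrative only: Kafka's default partitioner hashes the key
    bytes with murmur2; CRC32 is used here to show the same idea.
    """
    return zlib.crc32(key) % num_partitions

# Messages with the same key always land in the same partition,
# which is why ordering is guaranteed only within a partition.
p1 = assign_partition(b"sensor-42", num_partitions=6)
p2 = assign_partition(b"sensor-42", num_partitions=6)
assert p1 == p2
```

Because different keys can hash to different partitions, consumers reading several partitions in parallel may observe messages with different keys out of their original publish order.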
When you write to Kafka brokers, you can use the partitionId, Key, and TopicName output ports. You can override these ports when you create the mapping. You can create or import a Kafka data object.
After you create a Kafka data object, create a write operation. You can use the Kafka data object write operation as a target in Streaming mappings. If you want to configure high availability for the mapping, ensure that the Kafka cluster is highly available.
When you configure the data operation properties, specify the format in which the Kafka data object writes data. You can specify XML, JSON, Avro, or Flat as the format. When you specify the XML format, you must provide an XSD file. When you specify the Avro format, provide a sample Avro schema in an .avsc file. When you specify the JSON or Flat format, you must provide a sample file.
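For example, a minimal Avro schema in an .avsc file might look like the following; the record and field names here are purely illustrative:

```json
{
  "type": "record",
  "name": "ClickEvent",
  "fields": [
    {"name": "user_id", "type": "string"},
    {"name": "url", "type": "string"},
    {"name": "timestamp", "type": "long"}
  ]
}
```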
You can pass any payload format directly from source to target in Streaming mappings. Project columns in binary format to pass a payload from source to target in its original form, or to pass a payload format that is not supported.
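The binary projection works because the payload is treated as opaque bytes: nothing parses or converts it, so any format survives the trip unchanged. A minimal sketch of that pass-through behavior:

```python
import json

# A payload in any format (JSON here, but it could equally be XML,
# Avro, or a format the mapping does not support) is projected as
# binary and passed through unchanged.
original_payload = json.dumps({"sensor": "42", "temp": 21.5}).encode("utf-8")

def pass_through(payload: bytes) -> bytes:
    """Treat the payload as opaque bytes; no parsing, no conversion."""
    return payload

delivered = pass_through(original_payload)
assert delivered == original_payload  # byte-for-byte identical
```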
Streaming mappings can read, process, and write hierarchical data. You can use array, struct, and map complex data types to process the hierarchical data. You assign complex data types to ports in a mapping to flow hierarchical data. Ports that flow hierarchical data are called complex ports.
For more information about processing hierarchical data, see the Informatica Big Data Management User Guide.
For more information about Kafka clusters, Kafka brokers, and partitions, see http://kafka.apache.org/082/documentation.html.

Kafka Data Object Overview Properties

The Data Integration Service uses overview properties when it reads data from or writes data to a Kafka broker.
Overview properties include general properties that apply to the Kafka data object. They also include object properties that apply to the resources in the Kafka data object. The Developer tool displays overview properties for Kafka messages in the Overview view.

General Properties

You configure the following general properties for Kafka data objects:
- Name. The name of the Kafka data object.
- Description. The description of the Kafka data object.
- Connection. The name of the Kafka connection.

Objects Properties

You configure the following objects properties for Kafka data objects:
- Name. The name of the topic or topic pattern of the Kafka data object.
- Description. The description of the Kafka data object.
- Native Name. The native name of the Kafka data object.
- Path Information. The type and name of the topic or topic pattern of the Kafka data object.

Column Properties

You configure the following column properties for Kafka data objects:
- Name. The name of the Kafka data object.
- Native Name. The native name of the Kafka data object.
- Type. The native data type of the Kafka data object.
- Precision. The maximum number of significant digits for numeric data types, or the maximum number of characters for string data types.
- Scale. The scale of the data type.
- Description. The description of the Kafka data object.
- Access Type. The type of access that the port or column has.

Kafka Data Object Write Operation Properties

The Data Integration Service uses write operation properties when it writes data to a Kafka broker.

General Properties

The Developer tool displays general properties for Kafka targets in the Write view.
You can view the following general properties for Kafka targets:
- Name. The name of the Kafka broker. This property is read-only.
- Description. The description of the Kafka broker.

Ports Properties

Ports properties for a physical data object include port names and port attributes such as data type and precision.
You configure the following ports properties for Kafka targets:
- Name. The name of the resource.
- Type. The native data type of the resource.
- Precision. The maximum number of significant digits for numeric data types, or the maximum number of characters for string data types.
- Scale. The scale of the data type.
- Description. The description of the resource.

Run-time Properties

The run-time properties display the name of the connection.
You configure the following run-time property for Kafka targets:
- Connection. The name of the Kafka connection.

Target Properties

The target properties list the targets of the Kafka data object.
You can configure the following target property for Kafka targets:
- Target. The target that the Kafka data object writes to. You can add or remove targets.

Advanced Properties

The Developer tool displays the advanced properties for Kafka targets in the Input transformation in the Write view.
You configure the following advanced properties for Kafka targets:
- Operation Type. Specifies the type of data object operation. This property is read-only.
- Metadata Fetch Timeout in milliseconds. The time after which the attempt to fetch metadata times out.
- Batch Flush Time in milliseconds. The interval after which the data is published to the target.
- Batch Flush Size in bytes. The batch size after which the data is written to the target.
- Producer Configuration Properties. The configuration properties for the producer. If the Kafka data object writes data to a Kafka cluster that is configured for Kerberos authentication, include the following properties:
  security.protocol=SASL_PLAINTEXT
  sasl.kerberos.service.name=kafka
  sasl.mechanism=GSSAPI
For more information about Kafka broker properties, see http://kafka.apache.org/082/documentation.html.

Column Projections Properties

The Developer tool displays the column projection properties in the Properties view of the write operation.
To specify column projection properties, double-click the write operation and select the data object. You configure the following column projection properties for Kafka targets:
- Column Name. The field in the target that the data object writes to. This property is read-only.
- Type. The native data type of the target. This property is read-only.
- Enable Column Projection. Indicates that you use a schema to publish the data to the target. By default, the data is streamed in binary format. To change the streaming format, select this option and specify the schema format.
- Schema Format. The format in which you stream data to the target. You can select XML, JSON, Flat, or Avro.
- Schema. Specify an XSD file for the XML format, a sample file for the JSON format, or an .avsc file for the Avro format. For the Flat file format, configure the schema to associate a flat file with the Kafka target. When you provide a sample file, the Data Integration Service uses the UTF-8 code page when writing the data.
- Column Mapping. The mapping of the data object to the target. Click View to see the mapping.
- Project Column as Complex Data Type. Project columns as complex data types for hierarchical data. For more information, see the Informatica Big Data Management User Guide.