AmazonKinesis Data Objects
An AmazonKinesis data object is a physical data object that represents data in an Amazon Kinesis Data Firehose Delivery Stream. After you create an AmazonKinesis connection, create an AmazonKinesis data object to write to Amazon Kinesis Data Firehose.
Kinesis Data Firehose is a real-time data stream processing option that Amazon Kinesis offers within the AWS ecosystem. Kinesis Data Firehose allows batching, encrypting, and compressing of data. Kinesis Data Firehose can automatically scale to meet system needs.
When you configure the AmazonKinesis data object, specify the name of the Data Firehose Delivery Stream that you write to. You can specify the Kinesis Data Firehose Delivery Stream name or use a regular expression for the stream name pattern. If the input has multiple partitions, you can create multiple Kinesis Data Firehose Delivery Streams to the same target and send the data from these partitions to the individual delivery streams based on the pattern you specify in the stream name.
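One way to picture the stream name pattern is as a regular expression that selects a set of delivery streams, with each input partition then sent to one of the selected streams. The following Python sketch only illustrates that idea. The stream names, the `matching_streams` and `stream_for_partition` helpers, and the modulo assignment policy are all assumptions made for the example; the matching and routing rules that the data object applies may differ.

```python
import re

# Hypothetical delivery stream names; none of these come from the product.
STREAMS = ["orders-part-0", "orders-part-1", "audit-log", "orders-part-2"]


def matching_streams(pattern, names):
    """Return the names that the regular expression matches in full."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


def stream_for_partition(partition, streams):
    """Assign a partition to one stream by modulo (an assumed policy)."""
    return streams[partition % len(streams)]


selected = matching_streams(r"orders-part-\d+", STREAMS)
print(selected)
print(stream_for_partition(4, selected))
```

In this sketch, the pattern `orders-part-\d+` selects the three `orders-part` streams and skips `audit-log`.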
After you create the data object, create a data object write operation to write data to an Amazon Kinesis Data Firehose Delivery Stream. You can then add the data object write operation as a target in Streaming mappings.
When you configure the data operation properties, specify the format in which the data object writes data. When you write to Amazon Kinesis Data Firehose targets, you can specify JSON or binary as the format.
When you specify JSON format, you must provide a sample file.
You can pass any payload format directly from source to target in Streaming mappings. You can project columns in binary format to pass a payload from source to target in its original form or to pass a payload format that is not supported.
Streaming mappings can read, process, and write hierarchical data. You can use array, struct, and map complex data types to process the hierarchical data. You assign complex data types to ports in a mapping to flow hierarchical data. Ports that flow hierarchical data are called complex ports.
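As a rough illustration of what hierarchical data looks like, the following Python sketch builds one JSON record that contains an array, a struct, and a map. The field names and values are hypothetical and are not taken from any product sample file. JSON itself does not distinguish a struct from a map, so the distinction in the comments reflects how a schema would typically treat the fields.

```python
import json

# Hypothetical record: field names and values are illustrative only.
record = {
    "device_id": "router1",                       # primitive (string)
    "readings": [21.5, 22.0, 21.8],               # array: ordered values of one type
    "location": {"site": "lab", "rack": 4},       # struct: fixed set of named fields
    "tags": {"owner": "netops", "tier": "edge"},  # map: arbitrary string keys
}

# Serialize to a JSON payload and read it back.
payload = json.dumps(record)
decoded = json.loads(payload)
print(decoded["location"]["site"])
```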
When you run a mapping to write data to an Amazon Kinesis Data Firehose Delivery Stream, the data object uses the AWS Firehose SDK to write data.
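For readers unfamiliar with the Firehose API, the following Python sketch shows the general shape of a batched write. It is not the Data Integration Service implementation. It assumes a client object that exposes a `put_record_batch` call with the same parameters as the AWS SDK for Python (`boto3.client("firehose")`), and it uses a stub client so that it runs without AWS credentials. The `write_batches` helper and the `StubFirehoseClient` class are hypothetical names. The stream name `router1` is borrowed from the path example in this guide.

```python
def write_batches(client, stream_name, records, batch_size=500):
    """Send records in batches and return how many records were rejected.

    Firehose accepts at most 500 records per PutRecordBatch request,
    so the records are split into chunks of batch_size.
    """
    failed = 0
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        response = client.put_record_batch(
            DeliveryStreamName=stream_name,
            Records=[{"Data": data} for data in batch],
        )
        failed += response.get("FailedPutCount", 0)
    return failed


class StubFirehoseClient:
    """Stand-in for boto3.client("firehose") so the sketch runs offline."""

    def __init__(self):
        self.calls = []

    def put_record_batch(self, DeliveryStreamName, Records):
        # Record the call and report that every record was accepted.
        self.calls.append((DeliveryStreamName, len(Records)))
        return {"FailedPutCount": 0, "RequestResponses": [{} for _ in Records]}


client = StubFirehoseClient()
records = [f'{{"id": {i}}}\n'.encode("utf-8") for i in range(1200)]
failed = write_batches(client, "router1", records)
print(client.calls, failed)
```

With a real client, a nonzero `FailedPutCount` means that some records in the batch were not accepted and need to be sent again.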
Note: You cannot run a mapping with an AmazonKinesis data object on a MapR distribution.
For more information about processing hierarchical data, see the Informatica Big Data Management User Guide.
For more information about Kinesis Data Firehose, see the Amazon Web Services documentation.
AmazonKinesis Data Object Overview Properties
Overview properties include general properties that apply to the AmazonKinesis data object. The Developer tool displays overview properties of the data object in the Overview view.
You can configure the following overview properties for AmazonKinesis data objects:
- General. You can configure the following general properties for the AmazonKinesis data object:
  - Name. Name of the AmazonKinesis data object.
  - Description. Description of the AmazonKinesis data object.
  - Native Name. Name of the AmazonKinesis data object.
  - Path Information. The path of the data object in AmazonKinesis. For example, /DeliveryStreams/router1
- Column. You can configure the name, native name, data type, precision, scale, and description of the columns in the AmazonKinesis resource.
- Advanced. The following are the advanced properties for the AmazonKinesis data object:
  - Amazon Resource Name. The Kinesis resource that the AmazonKinesis data object is reading from or writing to.
  - Type. The type of delivery stream that the AmazonKinesis data object is reading from or writing to. The delivery stream is either Kinesis Stream or Firehose DeliveryStream.
  - Number of Shards. Specify the number of shards that the Kinesis Stream is composed of. This property is not applicable for Firehose DeliveryStream.
AmazonKinesis Data Object Write Operation Properties
The Data Integration Service uses write operation properties when it writes data to Amazon Kinesis Firehose.
General Properties
The Developer tool displays general properties for AmazonKinesis targets in the Write view.
The following table describes the general properties that you view for AmazonKinesis targets:
Property | Description |
---|---|
Name | The name of the Amazon Simple Storage Service (Amazon S3), Amazon Redshift tables, or Amazon Elasticsearch Service (Amazon ES). This property is read-only. |
Description | The description of the target. |
Ports Properties
Ports properties for a physical data object include port names and port attributes such as data type and precision.
The following table describes the ports properties that you configure for AmazonKinesis targets:
Property | Description |
---|---|
Name | The name of the target. |
Type | The native data type of the target. |
Precision | The maximum number of significant digits for numeric data types, or the maximum number of characters for string data types. |
Detail | The detail of the data type. |
Scale | The scale of the data type. |
Description | The description of the target. |
Target Properties
The target properties list the targets of the AmazonKinesis data object.
The following table describes the target property that you can configure for Kinesis Firehose targets:
Property | Description |
---|---|
Target | The target which the Amazon Kinesis data object writes to. You can add or remove targets. |
Run-time Properties
The run-time properties include properties that the Data Integration Service uses when writing data to the target at run time, such as reject file names and directories.
The run-time property for AmazonKinesis targets includes the name of the AmazonKinesis connection.
Advanced Properties
Advanced properties include tracing level, row order, and retry attempt properties.
The following table describes the advanced properties for AmazonKinesis targets:
Property | Description |
---|---|
Operation Type | Specifies the type of data object operation. This is a read-only property. |
Record Delimiter | The record delimiter that is inserted into the Kinesis Firehose delivery stream. |
Maximum Error Retry Attempts | The number of times that the Data Integration Service attempts to reconnect to the target. |
Response Wait Time (milliseconds) | The number of milliseconds that the Data Integration Service waits for a response after it sends a batch request. |
Retry Attempt Delay Time (milliseconds) | The number of milliseconds that the Data Integration Service waits before it retries to send data to the Kinesis Firehose delivery stream. |
Runtime Properties | The runtime properties for the connection pool and AWS client configuration. Specify the properties as key-value pairs. For example: key1=value1,key2=value2 |
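
The following Python sketch shows how the record delimiter, retry, and runtime properties in the table above could fit together. It is an illustration under stated assumptions, not the behavior of the Data Integration Service. The helper names `parse_runtime_properties`, `frame`, and `send_with_retries` are hypothetical. The sketch also assumes that the key-value string has no escaping rules and that a failed send surfaces as a `ConnectionError`.

```python
import time


def parse_runtime_properties(text):
    """Parse 'key1=value1,key2=value2' into a dict (assumed syntax, no escaping)."""
    pairs = {}
    for item in text.split(","):
        if not item.strip():
            continue
        key, _, value = item.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


def frame(records, delimiter="\n"):
    """Append the record delimiter to each record before sending."""
    return "".join(record + delimiter for record in records)


def send_with_retries(send, payload, max_attempts, delay_ms, sleep=time.sleep):
    """Call send(payload); on failure, wait delay_ms and retry up to max_attempts times."""
    attempt = 0
    while True:
        try:
            return send(payload)
        except ConnectionError:
            if attempt >= max_attempts:
                raise
            attempt += 1
            sleep(delay_ms / 1000.0)


# Simulated target that fails twice and then accepts the payload.
attempts = []


def flaky_send(payload):
    attempts.append(payload)
    if len(attempts) < 3:
        raise ConnectionError("simulated failure")
    return "ok"


waits = []  # collects the delays instead of actually sleeping
result = send_with_retries(
    flaky_send, frame(["a", "b"]), max_attempts=3, delay_ms=500, sleep=waits.append
)
props = parse_runtime_properties("key1=value1,key2=value2")
print(result, waits, props)
```

Passing `sleep` as a parameter keeps the example fast to run; the default `time.sleep` would pause for the configured delay between attempts.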
Column Projection Properties
The following table describes the column projection properties that you configure for Amazon Kinesis Firehose targets:
Property | Description |
---|---|
Column Name | The name of the field that contains data. This property is read-only. |
Type | The native data type of the resource. This property is read-only. |
Enable Column Projection | Indicates that you use a schema to write data to the target. By default, the data is streamed in binary format. To change the format in which the data is streamed, select this option and specify the schema format. |
Schema Format | The format in which data is written to the target. You can select one of the following formats: |
Schema | Specify the XSD schema for the XML format, a sample file for JSON, or .avsc file for Avro format. For the Flat file format, configure the schema to associate a flat file to the source. |
Column Mapping | The mapping of source data to the data object. Click View to see the mapping. |
Project Column as Complex Data Type | Project columns as complex data type for hierarchical data. For more information on hierarchical data, see the Informatica Big Data Management User Guide. |