
AmazonKinesis Data Objects

An AmazonKinesis data object is a physical data object that represents data in an Amazon Kinesis Data Stream. After you create an AmazonKinesis connection, create an AmazonKinesis data object to read from Amazon Kinesis Data Streams.
Kinesis Data Streams is a real-time data stream processing option that Amazon Kinesis offers within the AWS ecosystem. It is a customizable option for users who want to build custom applications to process and analyze streaming data. You must manually provision enough capacity to meet system needs.
When you configure the AmazonKinesis data object, specify the name of the Amazon Kinesis Data Stream that you read from. After you create the data object, create a read operation to read data from an Amazon Kinesis Data Stream. You can then add the data object read operation as a source in streaming mappings.
When you configure the data operation properties, specify the format in which the data object reads data. When you read from Amazon Kinesis Data Stream sources, you can read data in JSON, XML, Avro, Flat, or binary format. When you specify XML format, you must provide an XSD file. When you specify Avro format, you must provide a sample Avro schema in a .avsc file. When you specify JSON or Flat format, you must provide a sample file.
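As an illustration of the sample Avro schema mentioned above, the following Python sketch builds a small schema and writes it to a .avsc file. The record name `SensorReading`, its fields, and the file name `sensor_reading.avsc` are hypothetical and not taken from the product. The sketch relies only on the fact that a .avsc file is a JSON document that follows the Avro schema specification.

```python
import json
import os
import tempfile

# Hypothetical Avro record schema; record and field names are illustrative only.
SENSOR_SCHEMA = {
    "type": "record",
    "name": "SensorReading",
    "namespace": "example.streaming",
    "fields": [
        {"name": "device_id", "type": "string"},
        {"name": "temperature", "type": "double"},
        {"name": "recorded_at", "type": "long"},
        # A union with "null" makes the field optional.
        {"name": "location", "type": ["null", "string"], "default": None},
    ],
}


def write_avsc(schema: dict, path: str) -> None:
    """Write the schema as JSON, which is the on-disk form of a .avsc file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(schema, handle, indent=2)


def field_names(path: str) -> list:
    """Read a .avsc file back and return the declared field names."""
    with open(path, encoding="utf-8") as handle:
        return [entry["name"] for entry in json.load(handle)["fields"]]


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "sensor_reading.avsc")
    write_avsc(SENSOR_SCHEMA, target)
    print(field_names(target))
```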
You can pass any payload format directly from source to target in streaming mappings. You can project columns in binary format to pass a payload from source to target in its original form or to pass a payload format that is not supported.
Streaming mappings can read, process, and write hierarchical data. You can use array, struct, and map complex data types to process the hierarchical data. You assign complex data types to ports in a mapping to flow hierarchical data. Ports that flow hierarchical data are called complex ports.
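To make the three complex data types concrete, the following Python sketch parses one hierarchical JSON message and touches a struct-like, an array-like, and a map-like value. The message layout and field names are hypothetical, and the correspondence between Python types and complex ports is an analogy for illustration, not the product's internal representation.

```python
import json

# Hypothetical hierarchical message, as it might arrive in JSON format.
RAW_MESSAGE = """
{
  "order_id": "A-1001",
  "customer": {"name": "Dana", "tier": "gold"},
  "items": [{"sku": "X1", "qty": 2}, {"sku": "Y7", "qty": 1}],
  "attributes": {"channel": "web", "region": "emea"}
}
"""


def describe(value) -> str:
    """Classify a parsed JSON value using the complex-type vocabulary.

    Analogy assumed here: a list is an array, and a dict is a struct or a map.
    JSON alone cannot tell a struct from a map, so the schema author decides.
    """
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "struct or map"
    return "primitive"


record = json.loads(RAW_MESSAGE)

# struct: a fixed set of named fields, accessed by field name.
customer_name = record["customer"]["name"]
# array: an ordered collection of elements of the same type.
total_quantity = sum(item["qty"] for item in record["items"])
# map: a collection of key-value pairs whose keys are not fixed in advance.
attribute_keys = sorted(record["attributes"])

if __name__ == "__main__":
    for port, value in record.items():
        print(port, "->", describe(value))
    print(customer_name, total_quantity, attribute_keys)
```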
Note: You cannot run a mapping with an AmazonKinesis data object on a MapR distribution.
For more information about processing hierarchical data, see the Informatica Big Data Management User Guide.
For more information about Kinesis Data Streams, see the Amazon Web Services documentation.

AmazonKinesis Data Object Overview Properties

Overview properties include general properties that apply to the AmazonKinesis data object. The Developer tool displays overview properties of the data object in the Overview view.
You can configure the following overview properties for AmazonKinesis data objects:
General
You can configure the following general properties for the AmazonKinesis data object:
Column
You can configure the name, native name, data type, precision, scale, and description of the columns in the AmazonKinesis resource.
Advanced
The following are the advanced properties for the AmazonKinesis data object:

AmazonKinesis Data Object Read Operation Properties

The Data Integration Service uses read operation properties when it reads data from AmazonKinesis Streams.

General Properties

The Developer tool displays general properties for AmazonKinesis sources in the Read view.
The following table describes the general properties for the AmazonKinesis data object read operation:
| Property | Description |
| --- | --- |
| Name | The name of the AmazonKinesis data object. This property is read-only. You can edit the name in the Overview view. When you use the AmazonKinesis stream as a source in a mapping, you can edit the name in the mapping. |
| Description | The description of the AmazonKinesis data object operation. |

Ports Properties

Ports properties for a physical data object include port names and port attributes such as data type and precision.
The following table describes the ports properties that you configure for AmazonKinesis stream sources:
| Property | Description |
| --- | --- |
| Name | The name of the source. |
| Type | The native data type of the source. |
| Precision | The maximum number of significant digits for numeric data types, or the maximum number of characters for string data types. |
| Detail | The detail of the data type. |
| Scale | The scale of the data type. |
| Description | The description of the resource. |

Sources Properties

The sources properties list the resources of the Amazon Kinesis data object.
The following table describes the sources property that you can configure for Amazon Kinesis Streams sources:
| Property | Description |
| --- | --- |
| Sources | The sources from which the Amazon Kinesis data object reads. You can add or remove sources. |

Run-time Properties

The run-time properties include properties that the Data Integration Service uses when reading data from the source at run time.
The run-time properties for an AmazonKinesis Stream source include the name of the AmazonKinesis connection.

Advanced Properties

The following table describes the advanced properties for AmazonKinesis Stream sources:
| Property | Description |
| --- | --- |
| Operation Type | Specifies the type of data object operation. This is a read-only property. |
| Guaranteed Processing | Ensures that the mapping processes messages published by the sources and delivers them to the targets at least once. If a failure occurs, some messages might be delivered more than once, but every message is processed. If the external source or the target is not available, the mapping stops running to avoid data loss. Select this option for guaranteed delivery of data streamed from the AmazonKinesis Stream. |
| Degree of Parallelism | The number of processes that run in parallel within a shard. Specify a value that is less than or equal to the number of shards. |
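
Because guaranteed processing delivers each message at least once, a target may receive the same message again after a failure and retry. The following Python sketch shows one common way a target-side consumer can tolerate such duplicates by tracking message identifiers. It is not part of the product: the `Message` type, its `message_id` field, and the `IdempotentSink` class are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Message:
    message_id: str  # hypothetical unique identifier carried by each message
    payload: str


@dataclass
class IdempotentSink:
    """Accepts redelivered messages but applies each message_id only once."""

    seen_ids: Set[str] = field(default_factory=set)
    applied: List[str] = field(default_factory=list)

    def write(self, message: Message) -> bool:
        """Apply the message; return False if it was a duplicate."""
        if message.message_id in self.seen_ids:
            return False
        self.seen_ids.add(message.message_id)
        self.applied.append(message.payload)
        return True


def replay_after_failure(batch: List[Message], fail_at: int) -> List[Message]:
    """Simulate at-least-once delivery.

    The first attempt delivers the first fail_at messages and then fails,
    so the retry delivers the whole batch again from the start.
    """
    return batch[:fail_at] + batch


if __name__ == "__main__":
    batch = [Message("m1", "a"), Message("m2", "b"), Message("m3", "c")]
    sink = IdempotentSink()
    results = [sink.write(m) for m in replay_after_failure(batch, fail_at=2)]
    print(results)
    print(sink.applied)
```

In this sketch the set of seen identifiers lives in memory and grows without bound. A real consumer would normally persist it or expire old entries.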

Column Projections Properties

The following table describes the column projection properties that you configure for Amazon Kinesis Stream sources:
| Property | Description |
| --- | --- |
| Column Name | The name of the field that contains data. This property is read-only. |
| Type | The native data type of the resource. This property is read-only. |
| Enable Column Projection | Indicates that you use a schema to read the data that the source streams. By default, the data is streamed in binary format. To change the format in which the data is processed, select this option and specify the schema format. |
| Schema Format | The format in which the source processes data. You can select one of the following formats: XML, JSON, Avro, or Flat. |
| Schema | Specify the XSD schema for the XML format, a sample JSON file for the JSON format, a .avsc file for the Avro format, or a sample file for the Flat format. |
| Column Mapping | The mapping of source data to the data object. Click View to see the mapping. |
| Project Column as Complex Data Type | Projects columns as complex data types for sources with hierarchical data. Select this option if the source has hierarchical data. For more information about hierarchical data, see the Informatica Big Data Management User Guide. |
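
The difference between the default binary handling and column projection can be pictured with a short Python sketch. Without projection, the payload stays as opaque bytes in a single column. With a JSON schema format, the same bytes are parsed into named columns. The column names `data`, `device_id`, and `temperature` and the `project_json` helper are hypothetical, and the sketch only illustrates the idea, not how the Data Integration Service is implemented.

```python
import json
from typing import Any, Dict, List

# Hypothetical record payload as raw bytes, as a stream would carry it.
PAYLOAD = b'{"device_id": "d-42", "temperature": 21.5, "extra": "ignored"}'


def as_binary_column(payload: bytes) -> Dict[str, bytes]:
    """No projection: the whole payload stays in one binary column."""
    return {"data": payload}


def project_json(payload: bytes, columns: List[str]) -> Dict[str, Any]:
    """Projection with a JSON format: parse the payload and keep only the
    columns named in the schema. Fields missing from the payload become None."""
    parsed = json.loads(payload.decode("utf-8"))
    return {name: parsed.get(name) for name in columns}


if __name__ == "__main__":
    print(as_binary_column(PAYLOAD))
    print(project_json(PAYLOAD, ["device_id", "temperature"]))
```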

Configuring Schema for Flat Files

Configure the schema for flat files when you configure column projection properties.
    1. On the Column Projection tab, enable column projection and select the flat schema format.
    The column projection properties page appears.
    2. On the column projection properties page, configure the following properties:
    3. Click Next.
    4. On the delimited format properties page, configure the following properties:
    | Property | Description |
    | --- | --- |
    | Delimiters | Specify the character that separates entries in the file. Default is a comma (,). You can specify only one delimiter at a time. If you select Other and specify a custom delimiter, it must be a single character. |
    | Text Qualifier | Specify the character used to enclose text that should be treated as one entry. Use a text qualifier to disregard the delimiter character within text. Default is No quotes. The escape character must be a single character. |
    | Preview Options | Specify the escape character. The row delimiter is not applicable because only one row is created at a time. |
    | Maximum rows to preview | Specify the number of rows of data that you want to preview. |
    5. Click Next to preview the flat file data object.
    If required, you can change the column attributes. The timestampWithTZ data type is not supported.
    6. Click Finish.
    The data object opens in the editor.
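
The effect of the delimiter and text qualifier settings can be previewed outside the Developer tool with Python's standard `csv` module. This is a sketch under the assumption that a double quote is chosen as the text qualifier and a backslash as the escape character. The sample rows and the `parse_row` helper are invented for illustration, and the product's own parser may behave differently in edge cases.

```python
import csv
from typing import List


def parse_row(line: str, delimiter: str = ",", qualifier: str = '"',
              escape: str = "\\") -> List[str]:
    """Split one flat-file row into fields.

    delimiter: single character that separates entries.
    qualifier: single character that encloses text so that a delimiter
               inside it is treated as data.
    escape:    single character that escapes the character after it.
    """
    for name, value in (("delimiter", delimiter), ("qualifier", qualifier),
                        ("escape", escape)):
        if len(value) != 1:
            raise ValueError(f"{name} must be exactly one character")
    reader = csv.reader([line], delimiter=delimiter, quotechar=qualifier,
                        escapechar=escape)
    return next(reader)


if __name__ == "__main__":
    # The comma inside the qualified text is kept as data.
    print(parse_row('101,"Smith, Jane",Boston'))
    # With a pipe delimiter, the comma needs no qualifier.
    print(parse_row("101|Smith, Jane|Boston", delimiter="|"))
```

Each call parses a single row, which mirrors the note above that the row delimiter does not apply because only one row is created at a time.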