MapRStreams Data Objects
A MapRStreams data object is a physical data object that represents data in a MapR Stream. After you create a MapRStreams connection, create a MapRStreams data object to read from MapR Streams.
Before you create and use MapR Stream data objects in Streaming mappings, complete the required prerequisites.
For more information about the prerequisite tasks, see the Informatica Big Data Management Cluster Integration Guide.
When you configure the MapRStreams data object, specify the stream name that you read from in the following format:
/pathname:topic name
You can specify the stream name or, only when you read from MapR Streams, use a regular expression for the stream name pattern. Use a regular expression to subscribe to multiple topics that match a pattern. The regular expression applies to the topic name and not to the path name. When you run the application on the cluster, the pattern is matched against the existing topics before the application starts. If you add a topic that matches the pattern while the application is already running, the application does not read from the new topic.
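The naming and matching rules above can be illustrated with a small sketch. It does not show how the Data Integration Service is implemented. The helpers `split_stream_name` and `resolve_topics`, the path `/apps/sales`, and the topic names are hypothetical, and Python's `re` module stands in for whichever regular expression dialect the product accepts.

```python
import re


def split_stream_name(stream_name):
    """Split '/pathname:topic name' into (path, topic part)."""
    # Split on the first colon; this sketch assumes the path contains no colon.
    path, sep, topic = stream_name.partition(":")
    if not sep or not path.startswith("/") or not topic:
        raise ValueError(f"expected /pathname:topic, got {stream_name!r}")
    return path, topic


def resolve_topics(stream_name, existing_topics):
    """Return the topics that the pattern selects at start-up.

    The pattern is applied to the topic name only, never to the path.
    """
    path, topic_pattern = split_stream_name(stream_name)
    regex = re.compile(topic_pattern)
    return [f"{path}:{t}" for t in existing_topics if regex.fullmatch(t)]


# Topics that exist before the application starts (hypothetical names).
topics_at_startup = ["orders_eu", "orders_us", "payments"]
subscribed = resolve_topics("/apps/sales:orders_.*", topics_at_startup)
print(subscribed)

# A topic created later matches the pattern, but the subscription list was
# fixed at start-up, so it is not read until the application is restarted.
print("/apps/sales:orders_apac" in subscribed)
```

In this sketch the subscription list is computed once, which mirrors the behavior described above: a topic added after start-up is ignored even though its name matches the pattern.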
After you create a MapRStreams data object, create a read data object operation. You can then add the data object read operation as a source in streaming mappings.
When you configure the data operation properties, specify the format in which the MapRStreams data object reads data. You can specify XML, JSON, or Avro as the format. When you specify XML format, you must provide an XSD file. When you specify Avro format, provide a sample Avro schema in a .avsc file. When you specify JSON or Flat format, you must provide a sample file.
You can pass any payload format directly from source to target in Streaming mappings. You can project columns in binary format to pass a payload from source to target in its original form or to pass a payload format that is not supported.
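As an illustration of what a sample Avro schema file can look like, the following sketch writes a minimal .avsc file with Python's standard `json` module. The record name `SensorReading`, the namespace, the field names, and the file name are all hypothetical; substitute the layout of your own messages.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical record layout; replace the fields with those of your messages.
sensor_schema = {
    "type": "record",
    "name": "SensorReading",
    "namespace": "example.streaming",
    "fields": [
        {"name": "device_id", "type": "string"},
        {"name": "temperature", "type": "double"},
        {"name": "recorded_at", "type": "long"},
    ],
}

# An .avsc file is plain JSON text, so the json module can write it.
avsc_path = Path(tempfile.gettempdir()) / "sensor_reading.avsc"
avsc_path.write_text(json.dumps(sensor_schema, indent=2), encoding="utf-8")

# Read the file back to confirm that it is well-formed JSON.
loaded = json.loads(avsc_path.read_text(encoding="utf-8"))
print([field["name"] for field in loaded["fields"]])
```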
Streaming mappings can read, process, and write hierarchical data. You can use array, struct, and map complex data types to process the hierarchical data. You assign complex data types to ports in a mapping to flow hierarchical data. Ports that flow hierarchical data are called complex ports.
For more information about processing hierarchical data, see the Informatica Big Data Management User Guide.
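To make the three complex data types concrete, the following sketch models one hierarchical message in plain Python. It is only an analogy: the `Order` and `Customer` classes, the field names, and the sample payload are hypothetical, and Python dataclasses, lists, and dictionaries stand in for struct, array, and map ports.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Customer:
    """Struct: a fixed set of named fields, each with its own type."""
    name: str
    city: str


@dataclass
class Order:
    order_id: int
    customer: Customer            # struct
    items: List[str]              # array: ordered elements of one type
    attributes: Dict[str, str]    # map: arbitrary keys, values of one type


def parse_order(payload: str) -> Order:
    """Turn a JSON message into a typed, hierarchical record."""
    raw = json.loads(payload)
    return Order(
        order_id=raw["order_id"],
        customer=Customer(**raw["customer"]),
        items=list(raw["items"]),
        attributes=dict(raw["attributes"]),
    )


sample = json.dumps({
    "order_id": 1001,
    "customer": {"name": "Ana", "city": "Lisbon"},
    "items": ["keyboard", "mouse"],
    "attributes": {"priority": "high", "gift": "no"},
})

order = parse_order(sample)
print(order.customer.city, order.items[0], order.attributes["priority"])
```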
MapRStreams Object Overview Properties
Overview properties include general properties that apply to the MapRStreams data object. The Developer tool displays overview properties in the Overview view.
General
Property | Description |
---|---|
Name | Name of the MapRStreams data object. |
Description | Description of the MapRStreams data object. |
Native Name | Native name of the MapR Stream. |
Path Information | The path of the MapR Stream. |
Column Properties
The following table describes the column properties that you configure for MapRStreams data objects:
Property | Description |
---|---|
Name | The name of the MapRStreams data object. |
Native Name | The native name of the MapRStreams data object. |
Type | The native data type of the MapRStreams data object. |
Precision | The maximum number of significant digits for numeric data types, or the maximum number of characters for string data types. |
Scale | The scale of the data type. |
Description | The description of the MapRStreams data object. |
Access Type | The access type of the port or column. |
MapRStreams Data Object Read Operation Properties
The Data Integration Service uses read operation properties when it reads data from MapR Streams.
General Properties
The Developer tool displays general properties for MapR Streams sources in the Read view.
The following table describes the general properties for the MapRStreams data object read operation:
Property | Description |
---|---|
Name | The name of the MapRStreams data object. This property is read-only. You can edit the name in the Overview view. When you use MapR Streams as a source in a mapping, you can edit the name in the mapping. |
Description | The description of the MapRStreams data object operation. |
Ports Properties
Ports properties for a physical data object include port names and port attributes such as data type and precision.
The following table describes the ports properties that you configure for MapR Stream sources:
Property | Description |
---|---|
Name | The name of the resource. |
Type | The native data type of the resource. |
Precision | The maximum number of significant digits for numeric data types, or the maximum number of characters for string data types. |
Scale | The scale of the data type. |
Description | The description of the resource. |
Sources Properties
The sources properties list the resources of the MapRStreams data object.
The following table describes the sources property that you can configure for MapR Stream sources:
Property | Description |
---|---|
Sources | The sources from which the MapRStreams data object reads. You can add or remove sources. |
Run-time Properties
The run-time properties for a MapR Stream source include the name of the MapRStreams connection.
Advanced Properties
The Developer tool displays the advanced properties for MapR Stream sources in the Output transformation in the Read view.
The following table describes the advanced properties for MapR Stream sources:
Property | Description |
---|---|
Operation Type | Specifies the type of data object operation. This is a read-only property. |
Guaranteed Processing | Guaranteed processing ensures that the mapping processes messages published by the sources and delivers them to the targets at least once. If a failure occurs, some messages might be delivered more than once, but all messages are processed successfully. If the external source or the target is not available, the mapping execution stops to avoid any data loss. Select this option to avoid data loss. |
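The at-least-once behavior described for Guaranteed Processing can be sketched as follows. This is a simplified model, not the product's mechanism: the function `run_mapping`, the batch size of two, the offset bookkeeping, and the message names are assumptions made for illustration.

```python
def run_mapping(messages, fail_once_at):
    """Deliver every message at least once, replaying after a failure."""
    delivered = []      # everything the target receives, in order
    committed = 0       # index of the first message not yet acknowledged
    failed = False
    while committed < len(messages):
        batch = messages[committed:committed + 2]   # two messages per batch
        for msg in batch:
            delivered.append(msg)                   # write to the target
            if msg == fail_once_at and not failed:
                failed = True
                break       # simulated crash before the batch is acknowledged
        else:
            committed += len(batch)   # acknowledge only a fully written batch
    return delivered


delivered = run_mapping(["m1", "m2", "m3", "m4"], fail_once_at="m4")
print(delivered)

# No message is lost, but the replayed batch produces duplicates. A target
# that needs each message once can drop repeats, for example by message key.
unique = list(dict.fromkeys(delivered))
print(unique)
```

Because the failed batch is replayed from the last acknowledged position, `m3` and `m4` reach the target twice in this run. Downstream logic that cannot tolerate duplicates needs its own deduplication step, such as the key-based filter shown at the end.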
Column Projections Properties
The following table describes the column projection properties that you configure for MapR Stream sources:
Property | Description |
---|---|
Column Name | The name of the field that contains data. This property is read-only. |
Type | The native data type of the resource. This property is read-only. |
Enable Column Projection | Indicates that you use a schema to read the data that the source streams. By default, the data is streamed in binary format. To change the format in which the data is streamed, select this option and specify the schema format. |
Schema Format | The format in which the source streams data. You can select a format such as XML, JSON, or Avro. |
Schema | Specify the XSD schema for the XML format, a sample JSON file for the JSON format, or an .avsc file for the Avro format. |
Column Mapping | The mapping of source data to the data object. Click View to see the mapping. |
Project Column as Complex Data Type | Project columns as complex data type for hierarchical data. For more information, see the Informatica Big Data Management User Guide. |
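The difference between the default binary stream and a projected schema can be pictured with the following sketch. It is an analogy only: the function `read_message`, the column name `data`, and the sample payload are hypothetical, and JSON is used as the example schema format.

```python
import json


def read_message(payload: bytes, enable_column_projection: bool = False):
    """Return the columns that a read operation would expose for one message."""
    if not enable_column_projection:
        # Default: the payload passes through untouched as one binary column.
        return {"data": payload}
    # With projection enabled and JSON selected, each field becomes a column.
    record = json.loads(payload.decode("utf-8"))
    return dict(record)


payload = b'{"device_id": "d-17", "temperature": 21.5}'

binary_row = read_message(payload)
projected_row = read_message(payload, enable_column_projection=True)

print(list(binary_row))       # a single binary column
print(list(projected_row))    # one column per field in the message
```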