
MapRStreams Data Objects

A MapRStreams data object is a physical data object that represents data in a MapR Stream. After you create a MapRStreams connection, create a MapRStreams data object to read from MapR Streams.
Before you create and use MapRStreams data objects in Streaming mappings, complete the required prerequisites.
For more information about the prerequisite tasks, see the Informatica Big Data Management Cluster Integration Guide.
When you configure the MapRStreams data object, specify the stream name that you read from in the following format:
/pathname:topic name
You can specify a stream name or, only when you read from MapR Streams, a regular expression as the stream name pattern. To subscribe to multiple topics, specify a regular expression that matches them. The regular expression applies to the topic name and not to the path name. When you run the application on the cluster, the pattern is matched against the existing topics before the application starts. If you add a topic that matches the pattern while the application is running, the application does not read from that topic.
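The naming rule above can be sketched in Python. This is only an illustration of the described behavior, not the product's implementation: the helper names, the stream path /apps/sales, and the topic names are hypothetical, and whole-name matching with re.fullmatch is an assumption about how the pattern is applied.

```python
import re


def split_stream_name(stream_name):
    """Split '/pathname:topic name' into (path, topic_or_pattern)."""
    path, sep, topic = stream_name.partition(":")
    if not sep or not path.startswith("/") or not topic:
        raise ValueError(f"expected '/pathname:topic name', got {stream_name!r}")
    return path, topic


def resolve_topics(stream_name, existing_topics):
    """Match the pattern against topic names only, never against the path."""
    path, pattern = split_stream_name(stream_name)
    regex = re.compile(pattern)
    # Assumption: the whole topic name must match the pattern.
    return [f"{path}:{topic}" for topic in existing_topics if regex.fullmatch(topic)]


# Hypothetical topics that exist before the application starts.
topics_at_startup = ["orders_eu", "orders_us", "returns_eu"]
subscribed = resolve_topics("/apps/sales:orders_.*", topics_at_startup)
print(subscribed)  # prints ['/apps/sales:orders_eu', '/apps/sales:orders_us']
```

Because the list is resolved once, a topic such as orders_apac that is created later would not appear in subscribed, which mirrors the behavior described for running applications.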
After you create a MapRStreams data object, create a data object read operation. You can then add the read operation as a source in streaming mappings.
When you configure the data object operation properties, specify the format in which the MapRStreams data object reads data. You can specify XML, JSON, or Avro as the format. When you specify XML format, you must provide an XSD file. When you specify Avro format, provide a sample Avro schema in a .avsc file. When you specify JSON or Flat format, you must provide a sample file.
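For the Avro case, a .avsc file is a JSON document that declares the record and its fields. The following Python sketch builds a minimal schema and serializes it to .avsc text. The record name, namespace, and fields are hypothetical and serve only to show the shape of such a file.

```python
import json

# Hypothetical record; the name, namespace, and fields are illustrative only.
sample_schema = {
    "type": "record",
    "name": "SensorReading",
    "namespace": "example.streams",
    "fields": [
        {"name": "device_id", "type": "string"},
        {"name": "reading", "type": "double"},
        {"name": "recorded_at", "type": "long"},
    ],
}


def to_avsc(schema):
    """Serialize an Avro schema dictionary to .avsc (JSON) text."""
    return json.dumps(schema, indent=2)


avsc_text = to_avsc(sample_schema)
print(avsc_text)
```

Saving avsc_text to a file with the .avsc extension would give a schema file of the kind the Avro format asks for.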
You can pass any payload format directly from source to target in Streaming mappings. You can project columns in binary format to pass a payload from source to target in its original form or to pass a payload format that is not supported.
Streaming mappings can read, process, and write hierarchical data. You can use array, struct, and map complex data types to process the hierarchical data. You assign complex data types to ports in a mapping to flow hierarchical data. Ports that flow hierarchical data are called complex ports.
For more information about processing hierarchical data, see the Informatica Big Data Management User Guide.
For more information about how to use topic patterns in MapRStreams data objects, see https://kb.informatica.com/h2l/HowTo%20Library/1/1149-HowtoUseTopicPatternsinMapRStreamsDataObjects-H2L.pdf.

MapRStreams Object Overview Properties

Overview properties include general properties that apply to the MapRStreams data object. The Developer tool displays overview properties in the Overview view.

General

Property | Description
Name | Name of the MapRStreams data object.
Description | Description of the MapRStreams data object.
Native Name | Native name of the MapR Stream.
Path Information | The path of the MapR Stream.

Column Properties

The following table describes the column properties that you configure for MapRStreams data objects:
Property | Description
Name | The name of the MapRStreams data object.
Native Name | The native name of the MapRStreams data object.
Type | The native data type of the MapRStreams data object.
Precision | The maximum number of significant digits for numeric data types, or the maximum number of characters for string data types.
Scale | The scale of the data type.
Description | The description of the MapRStreams data object.
Access Type | The access type of the port or column.
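As a rough illustration of the Precision and Scale properties for numeric types, the following Python sketch derives both values from a decimal literal. It assumes the common convention that scale is the number of digits to the right of the decimal point, which this guide does not state explicitly, and the helper name is hypothetical. It handles finite values written in plain decimal notation only.

```python
from decimal import Decimal


def precision_and_scale(value):
    """Return (precision, scale) for a finite decimal literal given as a string."""
    _sign, digits, exponent = Decimal(value).as_tuple()
    # Assumption: scale counts the digits after the decimal point.
    scale = max(-exponent, 0)
    # Precision counts significant digits, and is never smaller than the scale.
    precision = max(len(digits), scale)
    return precision, scale


print(precision_and_scale("12345.678"))  # prints (8, 3)
print(precision_and_scale("0.05"))       # prints (2, 2)
```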

MapRStreams Data Object Read Operation Properties

The Data Integration Service uses read operation properties when it reads data from MapR Streams.

General Properties

The Developer tool displays general properties for MapR Streams sources in the Read view.
The following table describes the general properties for the MapRStreams data object read operation:
Property | Description
Name | The name of the MapRStreams data object. This property is read-only. You can edit the name in the Overview view. When you use the MapR Streams as a source in a mapping, you can edit the name in the mapping.
Description | The description of the MapRStreams data object operation.

Ports Properties

Ports properties for a physical data object include port names and port attributes such as data type and precision.
The following table describes the ports properties that you configure for MapR Stream sources:
Property | Description
Name | The name of the resource.
Type | The native data type of the resource.
Precision | The maximum number of significant digits for numeric data types, or the maximum number of characters for string data types.
Scale | The scale of the data type.
Description | The description of the resource.

Sources Properties

The sources properties list the resources of the MapRStreams data object.
The following table describes the sources property that you can configure for MapR Stream sources:
Property | Description
Sources | The sources that the MapRStreams data object reads from. You can add or remove sources.

Run-time Properties

The run-time properties for MapR Stream sources include the name of the MapRStreams connection.

Advanced Properties

The Developer tool displays the advanced properties for MapR Stream sources in the Output transformation in the Read view.
The following table describes the advanced properties for MapR Stream sources:
Property | Description
Operation Type | Specifies the type of data object operation. This is a read-only property.
Guaranteed Processing | Ensures that the mapping processes the messages that the sources publish and delivers them to the targets at least once. If a failure occurs, messages might be duplicated, but every message is processed. If the external source or the target is not available, the mapping stops running to avoid data loss. Select this option to avoid data loss.
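The at-least-once behavior described for Guaranteed Processing can be pictured with a small simulation. This Python sketch is not how the Data Integration Service is implemented. It only assumes a generic pattern in which the read position is committed after a message is delivered, so a failure between delivery and commit causes a redelivery. All names are hypothetical.

```python
class SimulatedCrash(Exception):
    """Stands in for a failure between delivery and commit."""


def run(messages, state, target, crash_before_commit_at=None):
    """Process messages starting from the last committed offset."""
    for offset in range(state["committed"], len(messages)):
        target.append(messages[offset])        # deliver to the target first
        if offset == crash_before_commit_at:
            raise SimulatedCrash(offset)       # fail before the commit
        state["committed"] = offset + 1        # commit only after delivery


messages = ["m0", "m1", "m2"]
state = {"committed": 0}
target = []

try:
    run(messages, state, target, crash_before_commit_at=1)
except SimulatedCrash:
    pass  # a restart resumes from the committed offset

run(messages, state, target)
print(target)  # prints ['m0', 'm1', 'm1', 'm2']
```

No message is lost, but m1 reaches the target twice, which is the kind of duplicate the property description warns about.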

Column Projections Properties

The following table describes the column projection properties that you configure for MapR Stream sources:
Property | Description
Column Name | The name of the field that contains data. This property is read-only.
Type | The native data type of the resource. This property is read-only.
Enable Column Projection | Indicates that you use a schema to read the data that the source streams. By default, the data is streamed in binary format. To change the format in which the data is streamed, select this option and specify the schema format.
Schema Format | The format in which the source streams data. You can select XML, JSON, or Avro.
Schema | Specify the XSD schema for the XML format, a sample JSON file for the JSON format, or a .avsc file for the Avro format.
Column Mapping | The mapping of source data to the data object. Click View to see the mapping.
Project Column as Complex Data Type | Project columns as complex data type for hierarchical data. For more information, see the Informatica Big Data Management User Guide.
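The difference between the default binary format and column projection can be sketched as follows. This Python example is a conceptual illustration only. The payload, the column name data, and both helper functions are hypothetical, and JSON is used as the schema format for simplicity.

```python
import json

# Hypothetical message payload as it might arrive from a topic.
payload = b'{"device_id": "d-17", "reading": 21.5}'


def read_binary(raw):
    """Without projection: expose the payload unchanged as one binary column."""
    return {"data": raw}


def read_projected(raw, columns):
    """With projection: parse the JSON payload and keep the named columns."""
    record = json.loads(raw.decode("utf-8"))
    return {name: record.get(name) for name in columns}


print(read_binary(payload))
print(read_projected(payload, ["device_id", "reading"]))
```

The binary form keeps the payload in its original form for pass-through, while the projected form exposes individual columns that downstream logic can work with.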