Data Synchronization Task Options
The Data Synchronization Task wizard provides many options for you to configure tasks. The options that display depend on the options that you select and the licenses enabled for the organization. For example, advanced Salesforce target options display on the Schedule page of the task wizard if you select a Salesforce target connection for the task on the Target page and your organization has the DSS Advanced Options license.
Data Synchronization Task Sources
You can use a single object as a source or multiple related objects as sources in a Data Synchronization task.
You can use multiple objects for database connection types. Define relationships based on key columns or create a user-defined join condition.
Rules and Guidelines for Multiple-Object Databases
Use the following rules and guidelines when you configure a multiple-object database:
- •All objects must be available through the same source connection. All database tables in a multiple-object source must have valid relationships defined by key columns or user-defined join conditions.
- •When you add multiple database tables as sources for a task, you can either create relationships or user-defined joins, but not both.
- •The Data Synchronization task wizard removes a user-defined join under the following conditions:
- - You remove one of two remaining database tables from the list of sources for the task.
- - You change the source connection from database to flat file or Salesforce.
Data Synchronization Task Targets
You can use a single object as a target for a Data Synchronization task.
The target connections that you can use depend on the task operation you select for the task. For example, if you select the upsert task operation, you cannot use a flat file target connection because you cannot upsert records into a flat file target.
Flat File Target Creation
If a task has a flat file target, create the flat file before you save the task. Or, you can create the flat file target with the Data Synchronization task wizard when all of the following are true:
- •The source connection type is Salesforce, database, or ODBC.
- •The source object is Single or Custom.
- •The target connection type is Flat File.
The Data Synchronization task wizard uses the source object name as the default name of the flat file target. It truncates the name of the flat file to the first 100 characters if the source name is too long. If the target name conflicts with the name of another target object, the following error appears:
Object named <object name> already exists in the target connection.
Database Target Truncation
You can configure a Data Synchronization task to truncate a database target table before writing new data to the table when you configure the task to use an Insert task operation. By default, Informatica Cloud inserts new rows without truncating the target table.
Salesforce Targets and IDs for Related Objects
Informatica Cloud identifies records of a Salesforce object based on one of the following types of IDs:
- •Salesforce ID
Salesforce generates an ID for each new record in a Salesforce object.
- • External ID
You can create a custom external ID field in the Salesforce object to identify records in the object. You might create an external ID to use the ID generated from a third-party application to identify records in the Salesforce object. You can use one or more external IDs to uniquely identify records in each Salesforce object.
If you create a Data Synchronization task that writes to a Salesforce target, the source must provide either the Salesforce IDs or the external IDs for the records in the Salesforce target object and applicable related objects. A related object is an object that is related to another object based on a relationship defined in Salesforce. The Data Synchronization task uses the Salesforce ID or external ID to update changes to related objects.
If the source in a task contains external IDs for Salesforce objects, you must specify the external IDs for all related objects when you create the Salesforce target for the task. If you do not specify the external ID, Informatica Cloud requires the Salesforce ID to identify records in each related object.
For more information about creating and using Salesforce external IDs, see the
Informatica Cloud Community article, "
Using External IDs and Related Objects in Informatica Cloud".
Column Names in Flat Files
If the column name in a flat file source contains nonalphanumeric characters, starts with a number, or contains more than 75 characters, the Data Synchronization task modifies the column name in the flat file target.
The Data Synchronization task truncates column names to 75 characters. For a flat file source, the Data Preview area and the Field Expression dialog box show modified column names. For a flat file target, the Data Synchronization task changes the column name in the flat file when it generates the file at run time.
Rules and Guidelines for Data Synchronization Task Sources and Targets
Use the following rules and guidelines for Data Synchronization sources and targets:
- •Field names must contain 65 characters or less.
- •Field names must contain only alphanumeric or underscore characters. Spaces are not allowed.
- •Field names cannot start with a number.
- •Each field name must be unique within each source and target object.
- •Informatica Cloud truncates data if the scale or precision of a numeric target column is less than the scale or precision of the corresponding source column.
Rules and Guidelines for Flat File Sources and Targets
Use the following rules and guidelines for flat file sources and targets:
- •All date columns in a flat file source must have the same date format. Rows that have dates in a different format than the one specified in the Data Synchronization task definition are written to the error rows file.
- •Each flat file target must contain all fields that will be populated by the Data Synchronization task.
- •The Data Synchronization task truncates a flat file target before writing target data to the file.
To avoid overwriting target data, you might use a post-session command to merge target data with a master target file in a different location.
- •The flat file cannot contain empty column names. If a file contains an empty column name, the following error appears:
Invalid header line: Empty column name found.
- •Do not map binary fields when you use a flat file source or target in a Data Synchronization task.
- •Column names in a flat file must contain printable tab or ASCII characters (ASCII code 32-126). If the file contains a character that is not valid, the following error appears:
Invalid header line: Non-printable character found. The file might be binary or might have invalid characters in the header line.
- •For flat file sources and targets with multibyte data on Linux, the default locale must be UTF-8.
Rules and Guidelines for Database Sources and Targets
Use the following rules and guidelines for database sources and targets:
- •You can use database tables as targets. You can use database tables, aliases, and views as sources.
- •Relational targets must meet the minimum system requirements.
- •The database user account for each database target connection must have DELETE, INSERT, SELECT, and UPDATE privileges.
Field Mappings
Map source columns to target columns on the Field Mapping page of the Data Synchronization task wizard. You must map at least one source column to a target column. Map columns with compatible datatypes or create field expressions to convert datatypes appropriately.
Depending on the task operation, the Data Synchronization task requires certain fields to be included in the field mapping. By default, the Data Synchronization task maps the required fields. If you configure the field mapping, ensure that the required fields remain mapped. If you do not map the required fields, the Data Synchronization task fails.
The following table shows the required fields for each applicable task operation for a database target:
Required Field | Task Operations | Description |
---|
Primary Keys | Delete Update Upsert | Map primary key columns to enable the Data Synchronization task to identify records to delete, update, or upsert in a database target. |
Non-null fields | Insert Update Upsert | Map all fields that cannot be null in database. |
When you configure field mappings, you can also perform the following tasks:
- •Edit field datatypes.
- •Add a mapplet to the field mapping.
- •Create lookups.
Field Datatypes
When you create a Data Synchronization task, Informatica Cloud assigns a datatype to each field in the source and target. You can edit the field datatypes on the Field Mapping page of the Data Synchronization task wizard. You can edit field datatypes for any source or target type except for Informatica Cloud Connector sources and targets, and mapplets.You can edit field datatypes for any source or target type except for Informatica Cloud Connector sources and targets, mapplets, and saved queries.
Mapplets in Field Mappings
You can add a mapplet to a field mapping. After you add a mapplet to a field mapping, you must map the source fields to the input fields of the mapplet and map the output fields of the mapplet to the target fields.
When a source field is mapped directly to a target field and you map an output field of a mapplet to the same target field, Informatica Cloud concatenates the values of the source and output fields in the target field. Verify that the expression is correct for the target field.
Note: The names of the output fields of a mapplet do not match the source field names. Informatica Cloud appends a number to the end of the source field name to determine the output field name. In addition, Informatica Cloud may not display the output fields in the same order as the source fields.
Lookup Conditions
A lookup returns values based on a lookup condition. You can create a lookup condition based on information in the source on the Field Mapping page of the Data Synchronization task wizard. For example, for a SALES source database table, you might set the ITEM_ID column equal to ITEM_ID column in a ITEMS flat file, and have the lookup return the item name for each matching item ID.
When you create a lookup condition, you define the following components:
- •Lookup connection and object. The connection and object to use to perform the lookup. When possible, use a native connection. For example, to perform a lookup on an Oracle table, use an Oracle connection instead of an ODBC connection.
- •Source and lookup fields. The fields used to define the lookup condition. The Data Synchronization task compares the value of the source field against the lookup field and then returns a value based on the match. You can define multiple conditions in a lookup. If you define more than one lookup condition, all lookup conditions must be true to find the match.
For example, you define the following conditions for a lookup:
SourceTable.Name = LookupTable.Name
SourceTable.ID = LookupTable.ID
The Data Synchronization task performs the following lookup:
Lookup (SourceTable.Name = LookupTable.Name, SourceTable.ID = LookupTableID)
Lookup Return Values
When you configure a lookup, you configure a lookup return value. The lookup return value depends on the return value properties that you define, such as multiplicity or a lookup expression.
A lookup return value is the value that Informatica Cloud returns when it finds a match based on the lookup condition. If the lookup returns an error, Informatica Cloud writes the row to the error rows file.
You can configure a lookup expression as part of the lookup return value. Configure a simple expression that uses the $OutputField variable to represent the lookup return value.
For example, the following expression adds 100 to each lookup return value:
$OutputField+100
As another example, you can use the concatenate operator ( || ) to append a string to a string lookup return value as follows:
'Mighty' || '$OutputField'
You can use parameters defined in a parameter file in a lookup expression.
The following table describes the lookup return value properties that you can configure:
Lookup Return Value Property | Description |
---|
Output Field | The field from the lookup table that you want to use. |
Multiplicity | How Informatica Cloud handles multiple return values. - - Error If More Than 1 Output Value. Select if the Data Synchronization task should display an error when the lookup condition returns multiple values. Informatica Cloud rejects rows when multiple matches are found, writing them to the error rows file. This is the default.
- - Randomly Pick 1 Output Value. Select if the Data Synchronization task should choose the first returned value when a lookup condition returns multiple values. Different systems might use different orders to return lookup values.
|
Expression | A simple expression that uses $OutputField to represent the selected output field. By default, Informatica Cloud passes the lookup return value without alteration with the following expression: $OutputField. |
Rules and Guidelines for Lookups
Use the following rules and guidelines when creating a lookup:
- •If the lookup is on a flat file, the file must use a comma delimiter. You cannot use any other type of delimiter.
- •When you configure a lookup, you can configure a simple lookup expression as part of the lookup return value. Use the $OutputField variable to represent the expression. If you use a lookup expression that does not include $OutputField, you negate the action of the lookup.
- •Tasks with a flat file lookup that run by a Secure Agent on Windows 7 (64 bit) might not complete. To resolve the issue, configure a network login for the Secure Agent service.
- •On the Field Mapping page, you can perform a lookup or create an expression for each source field. You cannot do both.
- •Each task can contain one or more lookups. To avoid impacting performance, include less than six lookups in a task.
- •When performing the lookup, the task performs an outerjoin and does not sort the input rows. The lookup performs a string comparison that is not case-sensitive to determine matching rows.
- • The source field and lookup field in the lookup condition must have compatible datatypes. If the datatypes are not compatible, the following error appears:
Source field [<source field name> (<source field datatype>)] and lookup field [<lookup field name> (<lookup field datatype>)] have incompatible datatypes.
If you create multiple lookup conditions on a lookup field and the lookup source is a flat file, all source fields must have the same datatype. The Data Synchronization task uses the larger precision and scale of the source field datatypes as the precision and scale for the target field. If the source fields do not have the same datatype, the following error appears:
Lookup field <field name> in <file name> has conflict data types inferenced: <datatype 1> and <datatype 2>.
- •You cannot include lookup fields of particular datatypes in a lookup condition. When the lookup field in a flat file has the Text or Ntext datatype or the target field of a lookup has the Text or Ntext datatype, the task fails.
- •If you run a task with a lookup and the source field, lookup field, or output field of the lookup no longer exist in the lookup object, an error appears.
Data Synchronization Task Schedule and Advanced Options
In the last page of the Data Synchronization Task wizard, you can configure scheduling and advanced options.
You can configure the following options:
- •Task scheduling and run options.
- •Email notification options.
- •Advanced source and target options.
- •Advanced options, such as preprocessing or postprocessing commands.
Email Notification Options
You can configure email notification for a Data Synchronization or Data Replication task. When you configure email notification for the task, Informatica Cloud uses the email notification options configured for the task instead of the email notification options configured for the organization. You can send email to different addresses based on the status of the task:
- •Success. The task completed successfully.
- •Warning. The task completed with errors.
- •Failure. The task did not complete.
Preprocessing and Postprocessing Commands
You can run preprocessing and postprocessing commands to perform additional jobs. The task runs preprocessing commands before it reads the source. It runs postprocessing commands after it writes to the target.
You can use the following types of commands:
- •SQL commands. Use SQL commands to perform database tasks.
- •Operating system commands. Use shell and DOS commands to perform operating system tasks.
If any command in the preprocessing or postprocessing scripts fail, the task fails.
Preprocessing and Postprocessing SQL Commands
You can run SQL commands before or after a task. For example, you can use SQL commands to drop indexes on the target before the task runs, and then recreate them when the task completes. Informatica Cloud does not validate the SQL.
Use the following rules and guidelines when creating the SQL commands:
- •Use any command that is valid for the database type. However, Informatica Cloud does not allow nested comments, even if the database allows them.
- •Use a semicolon (;) to separate multiple statements. Informatica Cloud issues a commit after each statement.
- •Informatica Cloud ignores semicolons within comments. If you need to use a semicolon outside of comments, you can escape it with a backslash (\).
Preprocessing and Postprocessing Operating System Commands
Informatica Cloud can perform operating system commands before or after the task runs. For example, use a preprocessing shell command to archive a copy of the target flat file before the task runs on a UNIX machine.
You can use the following types of operating system commands:
- •UNIX. Any valid UNIX command or shell script.
- •Windows. Any valid DOS or batch file.
Update Columns
Update columns are columns that uniquely identify rows in the target table. Add update columns when the database target table does not contain a primary key and the Data Synchronization task uses an update, upsert, or delete task operation.
When you run the Data Synchronization task, the Data Synchronization task uses the field mapping to match rows in the source to the database table. If the Data Synchronization task matches a source row to multiple target rows, it performs the specified task operation on all matched target rows.