Data Replication Task Configuration
Configure a Data Replication task to replicate data from a source to a target. When you configure a Data Replication task, you specify the source connection, target connection, and the objects to replicate.
A Data Replication task can replicate data from one or more Salesforce objects or database tables. When you configure the task, you can replicate all available objects through the selected connection, or you can select objects for replication by including or excluding a set of objects. You can also exclude rows and columns from the Data Replication task. Associate a schedule with a Data Replication task to specify when and how often the task runs.
You configure a Data Replication task to run full or incremental loads. Perform a full load to replicate all rows for each object. Perform an incremental load to replicate rows that are new or changed since the last time you ran the task.
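The difference between the two load types can be sketched as follows. This is a minimal illustration, not how Informatica Cloud implements it: the row shape, the `modified` field, and the `select_rows` helper are all hypothetical.

```python
from datetime import datetime
from typing import Optional


def select_rows(rows: list[dict], last_run: Optional[datetime]) -> list[dict]:
    """Return the rows that a load would replicate.

    last_run=None models a full load (every row). Otherwise only rows
    modified after last_run are returned, which models an incremental load.
    """
    if last_run is None:
        return list(rows)
    return [r for r in rows if r["modified"] > last_run]


# Hypothetical source rows with a last-modified timestamp.
rows = [
    {"id": 1, "modified": datetime(2015, 8, 1)},
    {"id": 2, "modified": datetime(2015, 8, 30)},
]

full = select_rows(rows, None)                          # both rows
incremental = select_rows(rows, datetime(2015, 8, 15))  # only row 2
```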
If you remove the Data Replication task from a schedule while the task is running, the current run completes. The Data Replication task cancels any additional task runs associated with the schedule.
When you configure a Data Replication task, you can save your work at any time. You can choose one of the following options:
- Save and continue
- Save and run
- Save and close
To configure a Data Replication task, use the Data Replication Task wizard to perform the following steps:
1. Configure the source.
2. Configure the target.
3. Optionally, exclude fields.
4. Optionally, configure data filters.
5. Optionally, configure a schedule.
Rules and Guidelines for Configuring Data Replication Tasks
Use the following rules and guidelines for configuring Data Replication tasks:
- The names of source tables and fields can contain at most 79 characters.
- Multiple Data Replication tasks cannot write to the same database table or flat file.
- You cannot configure a Data Replication task with the same source and target objects. If the source and target connections are the same, you must enter a target prefix to distinguish the source and target objects.
- You cannot replicate data to a Salesforce target.
- The maximum number of characters that a Data Replication task can write to each row in a Microsoft SQL Server 2000 target is 8060. If a Data Replication task tries to write more than the maximum number of characters to a row, the task fails with the following error:
WRT_8229 Database errors occurred: FnName: Execute -- [Microsoft][ODBC SQL Server Driver][SQL Server]Cannot create a row of size <row size> which is greater than the allowable maximum of 8060. FnName: Execute -- [Microsoft][ODBC SQL Server Driver][SQL Server]The statement has been terminated.
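The two numeric limits in the rules above can be checked before a task runs. The sketch below is only an illustration: the helper names are hypothetical, and approximating the row size as the total number of characters across all values is an assumption, since the exact way the row size is measured is not stated here.

```python
MAX_NAME_LENGTH = 79   # limit for source table and field names
MAX_ROW_CHARS = 8060   # per-row limit for a Microsoft SQL Server 2000 target


def name_too_long(name: str) -> bool:
    """True if a source table or field name exceeds the 79-character limit."""
    return len(name) > MAX_NAME_LENGTH


def row_too_large(row: dict) -> bool:
    """True if the row exceeds the 8060-character limit.

    Assumption: the row size is approximated as the sum of the string
    lengths of all values.
    """
    return sum(len(str(value)) for value in row.values()) > MAX_ROW_CHARS
```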
Step 1. Configure the Source
Configure the source on the Source page of the Data Replication Task wizard.
Note: Column names of a database source must not contain spaces or hyphens.
1. In the Task Details area, configure the following fields:
Field | Description |
---|---|
Task Name | Name of the Data Replication task. Task names must be unique within the organization. Task names can contain alphanumeric characters, spaces, and the following special characters: _ . + - (underscore, period, plus sign, and hyphen). Maximum length is 100 characters. Task names are not case sensitive. |
Description | Description of the task. Data Replication task descriptions can contain up to 255 characters. |
2. In the Source Details area, select a connection.
To create a connection, click New. To edit a connection, click View, and in the View Connection dialog box, click Edit.
3. To replicate all objects in the database or Salesforce account, select All Objects.
To select the objects to replicate, select one of the following options:
- Include Objects. To select the objects you want to include, click Select. In the Include Source Objects dialog box, select the objects to use and click Select.
- Exclude Objects. To select the objects you want to exclude, click Select. In the Exclude Source Objects dialog box, select the objects to exclude and click Select. The task replicates all available objects except for the selected objects.
The Available Objects area displays up to 200 objects. If the objects that you want to use do not display, enter a search string to reduce the number of objects that display.
When you select an object, it displays in a list. To remove a selected object, select the object and press Delete.
4. If you want the Data Replication task to stop processing when it encounters an error, click Cancel processing the remaining objects.
If you want the Data Replication task to continue to process a task after it encounters an error, click Continue processing of the remaining objects.
By default, the Data Replication task stops processing the task when it encounters an error.
5. To display technical names instead of business names for some source types, click Display technical names instead of labels.
6. Click Next.
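The task name rules and the object selection options on the Source page can be sketched in a few lines. This is a hedged illustration only: the function names are hypothetical, and treating "alphanumeric" as ASCII letters and digits is an assumption.

```python
import re

# Alphanumeric characters, spaces, and _ . + - ; between 1 and 100 characters.
# Assumption: "alphanumeric" means ASCII letters and digits.
TASK_NAME_RE = re.compile(r"[A-Za-z0-9 _.+\-]{1,100}")


def is_valid_task_name(name: str, existing: set[str]) -> bool:
    """Check the character set, the length, and case-insensitive uniqueness."""
    if not TASK_NAME_RE.fullmatch(name):
        return False
    return name.lower() not in {n.lower() for n in existing}


def objects_to_replicate(available, include=None, exclude=None):
    """Resolve All Objects, Include Objects, or Exclude Objects."""
    if include is not None:
        return [obj for obj in available if obj in include]
    if exclude is not None:
        return [obj for obj in available if obj not in exclude]
    return list(available)  # All Objects
```

For example, `objects_to_replicate(["Account", "Contact", "Lead"], exclude={"Lead"})` keeps every available object except `Lead`.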
Step 2. Configure the Target
Configure a target for the Data Replication task.
Target Prefixes
When you replicate data to a database table or flat file, the Data Replication task names each database table or flat file based on the corresponding source object name.
By default, the Data Replication task includes the target prefix SF_. For example, the default flat file name for the Account Salesforce object is SF_ACCOUNT.CSV. If you remove the default target prefix and do not specify another prefix, the Data Replication task creates a flat file or database table with the same name as the corresponding source object.
You can use target prefixes to prevent overwriting data. For example, you and another user share a database user account. The other user ran a Data Replication task on the Contact object from her Salesforce account. Her Data Replication task created a database table named Contact in the shared database. You use no target prefix and run a Data Replication task on the Contact object from your Salesforce account. The Data Replication task overwrites the data in the existing Contact table with your data. If you use the SF_ prefix, the Data Replication task creates a table named SF_CONTACT and does not overwrite the existing table named Contact.
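The overwrite scenario above can be modeled with a short sketch. The `target_table_name` helper is hypothetical, and the upper-casing of prefixed names is an assumption that mirrors the SF_ACCOUNT.CSV and SF_CONTACT examples; the exact casing rules are not stated here.

```python
def target_table_name(source_object: str, prefix: str = "SF_") -> str:
    """Derive a target table name from a source object name.

    Assumption: prefixed names are upper-cased, as in the SF_CONTACT example.
    Without a prefix, the name matches the source object name.
    """
    if not prefix:
        return source_object
    return f"{prefix}{source_object}".upper()


# The shared database already holds the other user's table.
existing_tables = {"Contact"}

unprefixed = target_table_name("Contact", prefix="")
prefixed = target_table_name("Contact")

overwrites_without_prefix = unprefixed in existing_tables  # name collides
overwrites_with_prefix = prefixed in existing_tables       # no collision
```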
Creating Target Tables
You can use Informatica Cloud to create the database table for a target before you run the Data Replication task. You might want to create the target table, and then modify the table properties before the Data Replication task loads the data into the table.
To create the target table for a Data Replication task:
1. Click Task Wizards > Data Replication.
2. Click the Create Target option next to the applicable Data Replication task.
Configuring a Target
1. On the Target page, enter the following information:
Field | Description |
---|---|
Connection | Connection to the target object. To create a connection, click New. To edit a connection, click View, and in the View Connection dialog box, click Edit. |
Target Prefix | Prefix that is added to Salesforce object names to create the flat file names or table names in a target database. By default, the prefix is SF_. |
Load Type | Type of load. Select one of the following options: - Incremental loads after initial full load. Loads all data the first time the task runs. In subsequent runs, loads changed data only. - Incremental loads after initial partial load. Loads data created or modified after a specified point in time. If you select this option, enter the date and time, for example, August 29, 2015 at 2:00 AM. The Data Replication task uses the time zone that is set for the user. If the server on which the data resides is located in a different time zone, adjust the date and time accordingly. For example, the time zone for the user is Pacific Time and the time zone for the server is Eastern Time, which is three hours ahead of Pacific Time. The user wants the initial load to replicate data modified on the server after August 29, 2015 at 2:00 AM. Because the user's time zone is Pacific Time, the user specifies August 28, 2015 at 11:00 PM. - Full Load each run. Loads all data every time the task runs. The Load Type option is enabled for tasks with a Salesforce source and a relational target. For all other tasks, the Data Replication task performs a full load. |
Delete Options | Select one of the following options: - Remove Deleted Columns and Rows. Deletes columns and rows from the target if they no longer exist in the source. - Retain Deleted Columns and Rows. Retains columns and rows in the target that were removed from the source. |
Commit Size | Number of rows to commit. Default for full load replication is 5,000 rows. Default for incremental load replication is 999,999,999. |
2. Click Next.
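The time zone adjustment for an initial partial load can be checked with a short calculation. This sketch reproduces the Pacific Time and Eastern Time example from the Load Type description. The fixed UTC offsets are an assumption that holds for late August, when both zones observe daylight time; a real conversion would use named time zones.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed daylight-time offsets, valid for the August example only.
EASTERN = timezone(timedelta(hours=-4), "EDT")
PACIFIC = timezone(timedelta(hours=-7), "PDT")


def to_user_time(server_time: datetime, user_tz: timezone) -> datetime:
    """Convert a cutoff expressed in server time to the user's time zone."""
    return server_time.astimezone(user_tz)


# The user wants data modified on the server after August 29, 2015 at 2:00 AM.
server_cutoff = datetime(2015, 8, 29, 2, 0, tzinfo=EASTERN)
user_cutoff = to_user_time(server_cutoff, PACIFIC)

print(user_cutoff.strftime("%B %d, %Y %I:%M %p"))  # August 28, 2015 11:00 PM
```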
Step 3. Configure the Field Exclusions
By default, the Data Replication task loads all fields into the target. Configure field exclusions for each source object to limit the fields loaded into a target.
1. On the Field Exclusion page, click Exclude Fields.
2. In the Field Exclusion dialog box, select the source object that you want to use.
3. In the Included Fields list, select and move the fields that you want to exclude to the Excluded Fields list.
4. Click OK.
The excluded field names display in a list. To remove an excluded field, click the Delete icon for the field.
5. Click Next.
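The effect of a field exclusion on the rows that reach the target can be pictured as a simple projection. The per-object `exclusions` mapping, the field names, and the `apply_field_exclusions` helper below are hypothetical and serve only as an illustration.

```python
def apply_field_exclusions(row: dict, excluded: set[str]) -> dict:
    """Return a copy of the row without the excluded fields."""
    return {field: value for field, value in row.items() if field not in excluded}


# Hypothetical configuration: excluded fields are defined per source object.
exclusions = {"Account": {"Fax", "Phone"}}

row = {"Id": "001", "Name": "Acme", "Fax": "555-0100", "Phone": "555-0101"}
loaded = apply_field_exclusions(row, exclusions.get("Account", set()))
```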
Step 4. Configure the Data Filters
By default, the Data Replication task replicates all source rows to the target. Configure data filters to filter source rows that are replicated. If you replicate multiple source objects, create a different set of data filters for each object.
1. On the Data Filters page, enter the following details:
Field | Description |
---|---|
Row Limit | Select one of the following options: - Process all Rows. Replicates all rows of the source. - Process Only the First... Rows. Replicates the first X rows, where X is the number of rows. You might choose to process the first set of rows to test the task. You cannot specify a row limit on Data Replication tasks with non-Salesforce sources. If you select a non-Salesforce source, the option is disabled. |
Data Filters | Click New to create a data filter on a Salesforce or database source. You can create simple or advanced data filters. |
2. Click the Delete icon next to the data filter to delete the filter.
3. Click Next.
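A data filter combined with a row limit can be sketched as a filter followed by a cap on the number of rows. The `replicate_rows` helper and the sample rows are hypothetical, and applying the filter before the row limit is an assumption; the order is not stated here.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, Optional


def replicate_rows(
    rows: Iterable[dict],
    row_filter: Optional[Callable[[dict], bool]] = None,
    row_limit: Optional[int] = None,
) -> Iterator[dict]:
    """Yield the rows that pass the filter, up to row_limit rows.

    row_limit=None models Process all Rows.
    Assumption: the filter is applied before the row limit.
    """
    selected = (r for r in rows if row_filter is None or row_filter(r))
    return islice(selected, row_limit)


# Hypothetical source rows: replicate only the first 3 active rows.
rows = [{"Id": i, "Active": i % 2 == 0} for i in range(10)]
sample = list(replicate_rows(rows, lambda r: r["Active"], row_limit=3))
```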
Step 5. Configure a Schedule
You can run a Data Replication task manually or schedule it to run at a specific time or on a time interval.
Email Notification Options
You can configure email notification for a Data Synchronization or Data Replication task. When you configure email notification for the task, Informatica Cloud uses the email notification options configured for the task instead of the email notification options configured for the organization. You can send email to different addresses based on the status of the task:
- Success. The task completed successfully.
- Warning. The task completed with errors.
- Failure. The task did not complete.
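The precedence between task-level and organization-level notification options can be sketched as a lookup. The dictionaries, addresses, and the `recipients_for` helper are hypothetical; the sketch only illustrates that options configured for the task replace the options configured for the organization.

```python
def recipients_for(status, task_options=None, org_options=None):
    """Return the addresses to notify for a task status.

    When task-level options are configured, they are used instead of the
    organization-level options rather than merged with them.
    """
    options = task_options if task_options is not None else (org_options or {})
    return options.get(status, [])


# Hypothetical configuration.
org = {"Failure": ["ops@example.com"]}
task = {"Success": ["owner@example.com"], "Failure": ["oncall@example.com"]}
```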
Preprocessing and Postprocessing Commands
You can run preprocessing and postprocessing commands to perform additional jobs. The task runs preprocessing commands before it reads the source. It runs postprocessing commands after it writes to the target.
You can use the following types of commands:
- SQL commands. Use SQL commands to perform database tasks.
- Operating system commands. Use shell and DOS commands to perform operating system tasks.
If any command in the preprocessing or postprocessing scripts fails, the task fails.
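The ordering and failure behavior described above can be sketched with the standard `subprocess` module. This models the behavior only and is not how Informatica Cloud runs commands: the `run_task_with_commands` helper is hypothetical, and each command is given as an argument list.

```python
import subprocess
import sys


def run_task_with_commands(pre, task, post):
    """Run preprocessing commands, then the task, then postprocessing commands.

    check=True raises CalledProcessError on a non-zero exit code, so a
    failing command stops the sequence and the whole run fails.
    """
    for command in pre:
        subprocess.run(command, check=True)
    task()  # hypothetical callable that reads the source and writes the target
    for command in post:
        subprocess.run(command, check=True)
```

A call such as `run_task_with_commands([[sys.executable, "-c", "pass"]], my_task, [])` runs one preprocessing command before the hypothetical `my_task` callable.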
Preprocessing and Postprocessing SQL Commands
You can run SQL commands before or after a task. For example, you can use SQL commands to drop indexes on the target before the task runs, and then recreate them when the task completes. Informatica Cloud does not validate the SQL.
Use the following rules and guidelines when creating the SQL commands:
- Use any command that is valid for the database type. However, Informatica Cloud does not allow nested comments, even if the database allows them.
- Use a semicolon (;) to separate multiple statements. Informatica Cloud issues a commit after each statement.
- Informatica Cloud ignores semicolons within comments. If you need to use a semicolon outside of comments, you can escape it with a backslash (\).
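The semicolon rules above can be illustrated with a small splitter. This is a sketch of the described behavior, not Informatica Cloud's parser. The `split_sql` helper is hypothetical, and it assumes that comments use the `/* ... */` and `-- ...` forms. It does not treat string literals specially, which is consistent with escaping a literal semicolon as `\;`.

```python
def split_sql(script: str) -> list[str]:
    """Split a script into statements on unescaped semicolons outside comments."""
    statements, current = [], []
    i, n = 0, len(script)
    while i < n:
        two = script[i:i + 2]
        if two == "/*":                  # block comment: copy through the closing */
            end = script.find("*/", i + 2)
            end = n if end == -1 else end + 2
            current.append(script[i:end])
            i = end
        elif two == "--":                # line comment: copy to the end of the line
            end = script.find("\n", i)
            end = n if end == -1 else end
            current.append(script[i:end])
            i = end
        elif two == "\\;":               # escaped semicolon: keep a literal ;
            current.append(";")
            i += 2
        elif script[i] == ";":           # unescaped semicolon: end of statement
            statements.append("".join(current).strip())
            current = []
            i += 1
        else:
            current.append(script[i])
            i += 1
    statements.append("".join(current).strip())
    return [s for s in statements if s]
```

Each returned statement would then be issued separately, with a commit after each one as described above.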
Preprocessing and Postprocessing Operating System Commands
Informatica Cloud can perform operating system commands before or after the task runs. For example, use a preprocessing shell command to archive a copy of the target flat file before the task runs on a UNIX machine.
You can use the following types of operating system commands:
- UNIX. Any valid UNIX command or shell script.
- Windows. Any valid DOS command or batch file.
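As an illustration of the archiving example, the sketch below shows a small script that a preprocessing command could invoke to copy the target flat file before the task overwrites it. The `archive_copy` helper, the paths, and the timestamp naming scheme are hypothetical.

```python
import shutil
import time
from pathlib import Path


def archive_copy(target_file: str, archive_dir: str) -> Path:
    """Copy the target flat file into archive_dir with a timestamp in its name."""
    source = Path(target_file)
    destination_dir = Path(archive_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d_%H%M%S")
    destination = destination_dir / f"{source.stem}_{stamp}{source.suffix}"
    shutil.copy2(source, destination)  # copy2 also preserves file timestamps
    return destination
```

Because the script raises an exception if the source file is missing, it exits with a non-zero status when run from the command line, which would cause the task to fail.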
Configuring a Schedule and Advanced Options
Configure a schedule and advanced options for a Data Replication task on the Schedule page of the task wizard.
1. On the Schedule page, choose whether to run the task on a schedule or without a schedule.
2. To run a task on a schedule, click Run this task on schedule and select the schedule you want to use.
To create a new schedule, click New. Enter schedule details and click OK.
To remove the task from a schedule, click Do not run this task on a schedule.
3. Configure email notification options for the task.
4. Optionally, enter the advanced options as required.
Advanced Option | Description |
---|---|
Preprocessing Commands | Commands to run before the task. |
Postprocessing Commands | Commands to run after the task completes. |
Maximum Number of Log Files | Number of session log files and import log files to retain. By default, Informatica Cloud stores each type of log file for 10 runs before it overwrites the log files for new runs. |
5. Choose whether to run the task in standard or verbose execution mode.
If you select verbose execution mode, the task generates additional data in the logs that you can use for troubleshooting. Select verbose execution mode only for troubleshooting purposes, because the amount of data it generates impacts performance.
6. Click Finish. Click Save.
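The Maximum Number of Log Files setting describes a retention policy. A rough sketch of such a policy follows. It is not how Informatica Cloud manages its logs: the `prune_logs` helper, the file name pattern, and the use of modification time to order the runs are assumptions.

```python
from pathlib import Path


def prune_logs(log_dir: str, pattern: str, keep: int = 10) -> list[Path]:
    """Delete all but the `keep` most recently modified log files.

    Returns the paths that were deleted.
    """
    logs = sorted(
        Path(log_dir).glob(pattern),
        key=lambda path: path.stat().st_mtime,
        reverse=True,  # newest first
    )
    removed = logs[keep:]
    for path in removed:
        path.unlink()
    return removed
```

Running the helper once per log type, for example once for session logs and once for import logs, would mirror the statement that each type of log file is retained separately.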