When you add a Data Task step, you set some properties.
The following sections describe the Data Task step properties:
General
In the general properties, you can specify a descriptive name for the Data Task step.
The name can contain only alphanumeric characters, underscores (_), spaces, and Unicode characters. The name can't contain curly brackets ({ }), the plus sign (+), dot (.), comma (,), full-width low line (＿), or hyphen (-).
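The naming rules can be sketched as a simple validation check. This Python sketch is illustrative only; the function name and the exact Unicode handling are assumptions, not part of the product:

```python
# Characters explicitly disallowed by the naming rules:
# curly brackets, plus sign, dot, comma, full-width low line, hyphen.
DISALLOWED = set("{}+.,\uff3f-")

def is_valid_step_name(name: str) -> bool:
    """Return True if the name contains only alphanumeric characters
    (including Unicode letters), underscores, and spaces, and none of
    the explicitly disallowed characters."""
    if not name:
        return False
    if any(ch in DISALLOWED for ch in name):
        return False
    return all(ch.isalnum() or ch in ("_", " ") for ch in name)
```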
Data Task
In the Data Task step properties, select the task from a list of existing tasks that you want to add to the taskflow.
Note: You must have an existing task to add to a taskflow. You cannot create a task during the taskflow creation process.
When you add a mapping task to a Data Task step, you see a description, input fields, and output fields. The input fields show the in-out parameters that the mapping task uses.
The output fields show the output fields that the mapping task returns after the taskflow runs.
When you click the Add icon on the Data Task step, you see one of the following views:
•If the Data Task step contains a task, a non-editable view of the task opens.
•If the Data Task step does not contain a task, you see a dialog box from which you can choose a task.
When you add a task to a Data Task step, a corresponding taskflow temporary field is created. The temporary field type is the name of the task. See Temporary fields for details.
Input Fields
The Input Fields section appears when you add a task to the taskflow.
In the Max Wait (Seconds) field, you can configure the maximum length of time in seconds that the Data Task step waits for the data integration task to complete. Specify a value between 1 and 604800 seconds. Default is 604800 seconds, which is 7 days. If the task is not completed within the maximum time specified in the field, the task stops running and the subsequent task in the taskflow starts running.
Note: If the specified value is less than 1 or greater than 604800, the maximum wait time is automatically set to 604800 seconds.
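The clamping rule in the note above can be sketched as follows. The function name is hypothetical; the 604800-second limit is the documented maximum:

```python
MIN_WAIT = 1
MAX_WAIT = 604_800  # 7 days in seconds; also the default

def effective_max_wait(configured: int) -> int:
    """Return the wait time the Data Task step actually uses.
    Values outside the 1..604800 range fall back to the
    604800-second maximum, per the note above."""
    if configured < MIN_WAIT or configured > MAX_WAIT:
        return MAX_WAIT
    return configured
```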
If the task contains parameters that you can override, you can add input fields. You can set properties for an input field to override Data Integration runtime parameters. For information about runtime parameters, see Overriding parameters or parameter files in a Data Task step.
If the Data Task step uses a mapping task, you can override the values of input parameters and in-out parameters of the task.
If the Data Task step uses a mapping task, you can perform the following override actions:
•If the mapping task contains a parameter file available on the Secure Agent machine, you can override the parameter file directory and parameter file name.
•If the mapping task contains a parameter file available in a cloud-hosted repository, you can override the parameter file connection and parameter file object. Data Integration supports only the Amazon S3 V2, Azure Data Lake Store Gen2, and Google Storage V2 connection types for mapping tasks.
•If the mapping task uses data formatting options, you can override the data formatting and default precision values of the source data. These options are available in the input fields only if the formatting file is uploaded to the mapping task and not to the mapping. The precision value set in the default precision field takes precedence over the precision set in the data format field or the mapping task. The default precision value is applied to all the columns in the formatting file.
•If the mapping task contains a Lookup transformation, you can override the values of the lookup object and lookup condition.
Note: You cannot override the value of an input parameter of type string or text from the parameter file. However, you can override the input parameter value from the taskflow. You can override the connection parameter values from the parameter file.
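The precision precedence described above can be sketched in a few lines. The Default Precision input field, when set, wins over the precision from the data format field or the mapping task. The function and parameter names here are hypothetical:

```python
def effective_precision(default_precision, other_precision):
    """Return the precision applied to all columns in the formatting file.
    The value in the Default Precision input field, when set, takes
    precedence over the precision set in the data format field or the
    mapping task (passed here as other_precision)."""
    return default_precision if default_precision is not None else other_precision
```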
Output Fields
The Output Fields section lists all the output data fields that the task returns when it runs.
The following image shows the Output fields you see:
If you use a data transfer task, you see the following fields:
To view the values of each output field, run the taskflow and go to the Taskflow Instance Detail page. For more information about the Taskflow Instance Detail page, see Monitor.
You can use output fields in a Data Decision or Assignment step.
For example, create a temporary field of type Formula and use the following expression to assign data to the field:
if( ($temp.DataTask1[1]/output[1]/Failed_Target_Rows = 0 or $temp.DataTask1[1]/output[1]/Task_Status = '1') and ($temp.DataTask2[1]/output[1]/Success_Target_Rows > 0 and $temp.DataTask2[1]/output[1]/Failed_Target_Rows = 0) and $temp.DataTask3[1]/output[1]/Success_Target_Rows > 0) then 'Pass' else 'Fail'
When you use the temporary field in a Decision step, the taskflow takes the Pass path if the following conditions are met:
•Data Task 1 has no failed target rows or Data Task 1 runs successfully.
•Data Task 2 has at least one successful target row.
•Data Task 2 has zero failed target rows.
•Data Task 3 has at least one successful target row.
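The same Pass/Fail logic can be mirrored in ordinary code. In this hypothetical Python sketch, dictionaries stand in for the output fields of each data task:

```python
def decide(task1: dict, task2: dict, task3: dict) -> str:
    """Mirror the Decision-step conditions: return 'Pass' when
    Data Task 1 has no failed target rows or reports success,
    Data Task 2 has successful rows and no failed rows, and
    Data Task 3 has at least one successful target row."""
    t1_ok = task1["Failed_Target_Rows"] == 0 or task1["Task_Status"] == "1"
    t2_ok = task2["Success_Target_Rows"] > 0 and task2["Failed_Target_Rows"] == 0
    t3_ok = task3["Success_Target_Rows"] > 0
    return "Pass" if (t1_ok and t2_ok and t3_ok) else "Fail"
```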
Timer Events
Use a Timer event to perform an action based on a schedule. The action can occur either at a specific time or after an interval. To add timers to a task, enter the Events properties.
When you add a timer to a Data Task step, a new branch appears. Add an event to this branch and specify whether you want the event to run At a specific time or After an interval.
In the following image, the event on the timer branch, a Data Decision step, occurs five minutes after the main data task:
When a timer fires, the taskflow runs through the entire timer branch. However, if Data Task 1 finishes before the timer fires, the timer branch is not executed.
Select Interrupting if you want the timer to interrupt the main data task. When an interrupting timer fires, the main data task is interrupted and the taskflow runs only the events on the timer branch.
The following image shows an interrupting timer set to occur five minutes after the main data task starts:
When the event on the timer branch, Data Task 2, executes, Data Task 1 is interrupted. The taskflow follows the timer branch. That is, the taskflow runs Data Task 2 and then ends.
If you delete the End step on the timer branch of an interrupting timer, the timer branch rejoins the main branch.
The following image shows an interrupting timer branch with the End step deleted:
The timer event, Data Task 2, executes after 5 minutes and interrupts Data Task 1. The timer branch rejoins the main branch. The taskflow executes Data Task 2, a Parallel Paths step, and then ends.
If you use an interrupting timer, the main data task has no output with respect to this taskflow instance. You see no output fields for the main data task in the job details for the taskflow.
If a Data Task step completes before a timer fires, whether the timer is interrupting or non-interrupting, no timer fires for that Data Task step.
Note: When you run a particular step in a timer branch of a taskflow instance, the steps in the alternate branch also get executed. To avoid this issue, add a dummy step after the step that you would like to run.
Error Handling
Use the Error Handling section to indicate how you want the taskflow to behave when a Data Task step encounters a warning or an error. You can also configure the taskflow behavior when the task associated with a Data Task step fails or does not run.
After you select a task, enter the following error handling properties:
Property
Description
On Warning
The path that a taskflow takes when it encounters a warning in a Data Task step.
A warning occurs when a Data Task step completes incorrectly or incompletely. For example, you see a warning if the Data Task step copies only 20 out of 25 rows from table A to table B.
You can choose from the following options:
- Select Ignore to ignore the warning and move to the next step.
Note: If you select Ignore for a Data Task step with a subsequent Notification Task step and the data task fails, the email notification that you receive does not contain the fault details. To get the fault details in the email, select Custom error handling.
- Select Suspend Taskflow to move the taskflow to the suspended state when it encounters a warning. You can resume the taskflow instance from the All Jobs, Running Jobs, or My Jobs page.
The taskflow resumes from the step at which it was suspended. If you know the reason for the warning, correct the issue and then resume the taskflow.
Default: Ignore
On Error
The path that a taskflow takes when it encounters an error in a Data Task step.
An error occurs when a Data Task step fails. For example, you see an error if the Data Task does not copy table A to table B.
You can choose from the following options:
- Select Ignore to ignore the error and move to the next step.
- Select Suspend Taskflow to move the taskflow to the suspended state when it encounters an error. You can resume the taskflow instance from the All Jobs, Running Jobs, or My Jobs page.
The taskflow resumes from the step at which it was suspended. If you know the reason for the error, correct the issue and then resume the taskflow.
- Select Custom error handling to handle the error in a manner you choose. If you select Custom error handling, two branches appear. The first branch is the path the taskflow follows if no error occurs. The second branch is the custom path the taskflow follows if an error occurs.
Default: Suspend Taskflow
Fail taskflow on completion
The taskflow behavior when the task associated with the Data Task step fails or does not run.
You can configure a taskflow to fail on its completion if the task associated with the Data Task step fails or does not run. If the task fails or does not run, the taskflow continues running the subsequent steps. However, after the taskflow completes, the taskflow status is set to failed.
Note: If you configure both the Suspend on Fault taskflow advanced property and the Fail taskflow on completion property, the Suspend on Fault property takes precedence. In this case, if the task associated with the Data Task step fails or does not run, the taskflow is suspended. The taskflow does not run the subsequent steps after the Data Task step.
The following image shows a Custom error handling path with an Assignment step and another Data Task step:
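A minimal sketch of how the Suspend on Fault and Fail taskflow on completion properties interact when the task fails or does not run, per the note above. The function name and return values are hypothetical:

```python
def taskflow_outcome(task_failed: bool, suspend_on_fault: bool,
                     fail_on_completion: bool) -> str:
    """Sketch of the documented precedence between the Suspend on Fault
    advanced property and the Fail taskflow on completion property."""
    if not task_failed:
        return "success"
    if suspend_on_fault:
        # Suspend on Fault takes precedence: the taskflow is suspended
        # and does not run the steps after the Data Task step.
        return "suspended"
    if fail_on_completion:
        # The taskflow continues running the subsequent steps, but its
        # status is set to failed after the taskflow completes.
        return "failed"
    # Neither property set: the outcome depends on the On Error setting,
    # which this sketch does not model.
    return "not modeled"
```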