Union Transformation example

You have demographic data about employees from two flat file sources and you want to merge that data.

You receive the following data in a .txt file:

employee_ID,first_name,last_name,location,email,phone
1211,John,Davis,Redwood City,jdavis@infa2.com,555-555-4444
0233,Miles,Simone,Barcelona,msimone@infa2.com,555-555-6666
1045,Billie,Coltrane,Philadelphia,bcoltrane@infa2.com,555-555-7777
0987,Django,Holiday,Paris,dholiday@infa3.com,444-444-4444
1199,Nina,Reinhardt,New York,nreinhardt@infa3.com,444-555-5555

A second file contains the following data:

ID,first,last,dept,e-mail,phone
0456,Joni,Smith,Marketing,j_smith@infa4.com,333-333-3333
1325,David,Mitchell,R&D,dmitchell@infa4.com,222-222-2222
1101,David,Harry,R&D,dharry@infa5.com,777-777-7777
0623,Debbie,Byrne,HR,dbyrne@infa5.com,888-888-8888
0777,Patti,Bowie,Sales,pbowie@infa5.com,999-999-9999

You want to merge those records into a single dataset in MySQL with the following columns:

•id

•last

•first

•email

•phone

Note: Remember that the data to be merged with a Union transformation must have the same data type, precision, and scale.

To merge the files with a Union transformation, complete the following steps:

1Ensure that the source files reside in a location accessible to your Secure Agent.

2Define a connection to access the .csv files.

3Create a mapping in the Mapping Designer.

4Add two Source transformations to the mapping to connect to data in the .csv files.

5Add a Union transformation and connect the Source transformations to it.

6In the Union transformation Properties, perform the following steps for each input group:

aIn the Field Rules section, click the group you want to configure.
b(Optional) For the incoming fields, select the fields you want to merge in the output.

The following image shows the selected fields in the first input group:

All the named fields are included from the first input group except for the "location" field.

The following image shows the selected fields in the second input group:

All the named fields are included from the second input group except for the "dept" field.

If you do not specify a rule to exclude fields, at run time, the task ignores any fields that you do not map to the output fields.

cEdit the Output field names in the Union transformation, to correspond to the field names that you want in the target:

The Output Fields tab of the Union transformation shows the output fields that will be included in the data flow. In this image, there are five output fields: email, first, ID, last, and phone.

Note: You can also select fields, change metadata, add other fields, or convert the field types, for example, from integer to number.

7In the Field Mapping of the Union transformation, ensure that the fields are correctly mapped for each input group:

On the Field Mapping tab of the Union transformation, input group "Input1" is selected, and the field map options are set to "Manual." Each output field has been mapped to a corresponding incoming field.

8Add a Target transformation to the mapping.

9Connect the Union transformation to the Target transformation.

10In the Target transformation field mapping, select automatic field mapping:

On the Field Mapping tab of the Target transformation, the field map opitions are set to "Automatic." Therefore, the incoming fields will be automatically mapped to the target fields.

When complete, the mapping appears similar to the following image:

The completed mapping contains two pipelines: In the first pipeline, source "s_ees_jz" is connected to input group "Input1" of the Union transformation, and the Union transformation is connected to the target. In the second pipeline, source "s_ees_rk" is connected to input group "Input2" of the Union transformation, and the Union transformation is connected to the target.