Transformation Override
Configure the transformation override to override the mass transformations applied to all source tables in the mass ingestion specification. When you configure the override, you can configure parameters to apply to specific tables that you ingest from the source database.
Before you configure the transformation override, configure mass transformations to apply parameters to all source tables. Then, navigate to each table and edit the applied parameter. Or, clear the mass transformations and set specific parameters for each table.
When you configure parameters for a table, you cannot configure different parameters for different columns in the table. Additionally, if you configure parameters for a column that does not exist in the table, the table fails to be ingested.
You configure different parameters depending on whether you ingest data to a Hive or HDFS target.
Transformation Override for Hive Targets
When you configure a transformation override for a Hive target, you can edit the mass transformations that are applied to specific source tables when they are ingested to the Hive target. You can additionally edit the Hive target table properties for each source table that you ingest.
The following image shows the transformation override that you can configure for a Hive target:
You can edit the following properties for each target table:
- •Filter By. Filter the columns in the source tables.
- •Drop Columns. Drop columns from the source tables.
- •Replace Columns. Replace values in the source table columns.
- •Trim. Trim spaces from the beginning and end of values in source table columns.
- •Convert to Uppercase. Convert values in source table columns to uppercase.
- •Convert to Lowercase. Convert values in source table columns to lowercase.
- •Hive. Edit the Hive properties for the target table.
The following image shows the dialog box that appears when you edit the properties for the Hive target table:
The following table describes the Hive target table properties:
Property | Description |
---|
Hive Options | Select this option to configure the Hive target location. |
DDL Query | Select this option to configure a custom DDL query that defines how data from the source tables is loaded to the target tables. |
Storage Format | Required. The storage format of the target tables. You can select Text, Avro, Parquet, or ORC. Default is Text. |
External Location | The external location of the Hive target. Enter an external location to specify a location other than the default Hive warehouse directory. A sub-directory is created under the specified external location for each source that is ingested. For example, you can enter /temp. A source table named PRODUCT is ingested to the external location /temp/PRODUCT/ |
Partition Key | The partition key for the target Hive table. |
Clustered By | The cluster key for the target Hive table. |
Number of buckets | Required if you specify a cluster key. The number of buckets to divide the target Hive table. |
The following image shows the dialog box that appears when you edit the DDL query in the properties for the Hive target table:
To edit the DDL query, you can use SQL statements and the following placeholders:
- INFA_TABLE_NAME
- Fetches the target table name at run time.
- INFA_COLUMN_LIST
- Fetches a list of columns in the target table at run time.
Transformation Override for HDFS Targets
When you configure a transformation override for an HDFS target, you can edit the mass transformations that are applied to specific source tables when they are ingested to the HDFS target.
The following image shows the transformation override that you can configure for an HDFS target:
You can edit the following properties for each target table:
- •Filter By. Filter the columns in the source tables.
- •Drop Columns. Drop columns from the source tables.
- •Replace Columns. Replace values in the source table columns.
- •Trim. Trim spaces from the beginning and end of values in source table columns.
- •Convert to Uppercase. Convert values in source table columns to uppercase.
- •Convert to Lowercase. Convert values in source table columns to lowercase.