Data Engineering Integration
Read this section to learn what's new for Data Engineering Integration in version 10.5.
EXTRACT_STRUCT Function
Effective in version 10.5, you can use the EXTRACT_STRUCT function in dynamic expressions to extract all elements from a dynamic struct port in an Expression transformation.
The EXTRACT_STRUCT function flattens dynamic struct ports. The expression for the output ports uses the dot operator to extract elements in the dynamic struct.
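For example, a minimal sketch of a dynamic expression, assuming a hypothetical dynamic struct port named Customer that contains the elements name and city:

```
EXTRACT_STRUCT(Customer)
```

The function generates an output port for each element, with expressions such as Customer.name and Customer.city that use the dot operator to reach into the struct.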
For more information, see the Informatica 10.5 Transformation Language Reference.
File Manager for Cloud File Preprocessing
Effective in version 10.5, you can perform file preprocessing tasks such as list, copy, rename, move, remove, and watch on the Microsoft Azure and Amazon AWS cloud ecosystems.
filemanager Commands
The following table describes the available commands for the filemanager utility:
| Command | Description |
| --- | --- |
| copy | Copies files on an Amazon AWS cloud ecosystem. |
| copyfromlocal | Copies files from a local system to a cloud ecosystem. |
| list | Lists files on a cloud ecosystem. |
| move | Moves files on a cloud ecosystem. |
| removefile | Deletes files from a cloud ecosystem. |
| rename | Renames files on a cloud ecosystem. |
| watch | Watches files that trigger a file processing event, mapping, or workflow on a cloud ecosystem. |
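Each command follows the same general invocation pattern; the options differ per command and are detailed in the Command Reference. A schematic sketch with placeholder options (no literal option names are shown here):

```
filemanager <command> <options>

filemanager list <options>    # list files on the cloud ecosystem
filemanager copy <options>    # copy files within Amazon AWS storage
```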
For more information, see the Informatica 10.5 Command Reference.
Mapping Audits
Effective in version 10.5, you can create an audit to validate the consistency and accuracy of data that is processed in a mapping.
An audit is composed of rules and conditions. Use a rule to compute an aggregated value for a single column of data. Use a condition to make comparisons between multiple rules or between a rule and constant values.
You can configure audits for the following mappings that run in the native environment or on the Spark engine:
- Read operations in Amazon S3, JDBC V2, Microsoft Azure SQL Data Warehouse, and Snowflake mappings.
- Read operations for complex files such as Avro, Parquet, and JSON in HDFS mappings.
- Read and write operations in Hive and Oracle mappings.
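To make the relationship between rules and conditions concrete, here is a conceptual sketch, assuming hypothetical rule names and an order_amount column; the actual audit is configured in the Developer tool rather than written as code:

```
rule src_total = SUM(order_amount)     // aggregate computed on the source data
rule tgt_total = SUM(order_amount)     // the same aggregate computed on the target data
condition       src_total = tgt_total  // the audit flags the run if the totals diverge
```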
For more information, see the Data Engineering Integration 10.5 User Guide.
Profile on the Databricks Cluster
Effective in version 10.5, you can run profiles on the Databricks cluster.
You can create and run profiles on the Databricks cluster in the Informatica Developer and Informatica Analyst tools. You can also perform data domain discovery and create scorecards on the Databricks cluster.
For more information about profiles on the Databricks cluster, see the Informatica 10.5 Data Discovery Guide.
Sensitive Data Recommendations and Insights by CLAIRE
Effective in version 10.5, CLAIRE artificial intelligence detects sensitive data in mapping sources when Enterprise Data Catalog is configured on the domain.
Recommendations list source columns that contain sensitive data based on data quality rules. You can also add custom types to the sensitive data that CLAIRE detects.
For more information about recommendations and insights, see the Data Engineering Integration User Guide.
Warm Pool Support for Ephemeral Clusters on Databricks
Effective in version 10.5, you can configure ephemeral Databricks clusters with warm pools. A warm pool is a pool of VM instances reserved for ephemeral cluster creation.
When you configure the warm pool instances in the Databricks environment, the instances wait on standby in a running state for ephemeral cluster creation. You can choose to have the instances remain on standby when the ephemeral clusters are terminated.
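As a rough illustration of the Databricks side, here is a minimal instance pool definition using fields from the Databricks Instance Pools API; the pool name, node type, and sizes are placeholders, and the Informatica-side settings are covered in the cluster workflows chapter:

```
{
  "instance_pool_name": "ephemeral-warm-pool",
  "node_type_id": "Standard_DS3_v2",
  "min_idle_instances": 4,
  "idle_instance_autotermination_minutes": 60
}
```

Here min_idle_instances keeps that many instances on standby in a running state for ephemeral cluster creation, and idle_instance_autotermination_minutes controls how long additional idle instances wait before they terminate.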
For more information, see the chapter on cluster workflows in the Data Engineering Integration User Guide.