Read about new features and enhancements in the April 2024 Mass Ingestion release.
Watch the What's New video to learn about the new features and enhancements in the April 2024 release.
Common
The April 2024 release of the Informatica Intelligent Cloud Services Data Ingestion and Replication service includes the following new features that are common to application ingestion and database ingestion tasks.
Amazon Aurora PostgreSQL targets with Oracle and Salesforce sources
You can use Amazon Aurora PostgreSQL targets in database ingestion jobs that have an Oracle source or in application ingestion jobs that have a Salesforce source. To connect to the Aurora PostgreSQL target, use the PostgreSQL connector.
Previously, you could use Amazon Aurora PostgreSQL as a target only in database ingestion jobs that had a Db2 for i source.
Microsoft Fabric OneLake targets in initial load jobs
You can now use Microsoft Fabric OneLake as a target for application ingestion and database ingestion initial load jobs.
To connect to a Microsoft Fabric OneLake target, use the Microsoft Fabric OneLake connector.
Support for Unity Catalog and personal staging locations in Databricks Delta target connections
Mass Ingestion now supports Databricks Delta Unity Catalog and personal staging locations for application ingestion and database ingestion tasks of any load type. When you define a Databricks Delta target connection for a task, you can specify a catalog in the Unity Catalog metastore and indicate whether to use a personal staging location.
In the connection properties, specify the name of the catalog in the Catalog Name field. The catalog name is appended to the SQL Warehouse JDBC URL value for the data warehouse. The catalog, which contains schemas, is the first layer in the Unity Catalog hierarchy for organizing data assets. Databricks recommends using Unity Catalog to administer data access policies and permissions, capture audit logs that record data access, capture lineage information on data assets, and query account data.
If you use Unity Catalog and want to stage data internally in Databricks Delta instead of using an Azure or AWS staging environment, you can select Personal Staging Location in the Staging Environment connection property. The Parquet data files for application ingestion or database ingestion jobs are then staged in a personal staging location, which has a data retention period of 7 days. If you use Unity Catalog, a personal staging location is automatically provisioned. Personal staging locations do not support Databricks Delta unmanaged tables, which are stored externally.
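To illustrate how a catalog name relates to the connection URL, the following sketch shows a catalog being appended to a Databricks SQL Warehouse JDBC URL as a connection property. This is not the connector's actual implementation; the ConnCatalog parameter name is an assumption based on the Databricks JDBC driver's documented options, and the host and warehouse path are made up.

```python
def append_catalog(jdbc_url: str, catalog: str) -> str:
    """Append a ConnCatalog property to a JDBC URL, preserving existing properties.

    Illustrative sketch only: the real connector builds the URL internally
    from the Catalog Name field in the connection properties.
    """
    separator = "" if jdbc_url.endswith(";") else ";"
    return f"{jdbc_url}{separator}ConnCatalog={catalog}"

# Hypothetical SQL Warehouse JDBC URL
url = ("jdbc:databricks://adb-12345.azuredatabricks.net:443/default;"
       "transportMode=http;httpPath=/sql/1.0/warehouses/abc123")
print(append_catalog(url, "sales_catalog"))
```

Because the catalog becomes part of the connection, unqualified schema and table names in the session resolve against that catalog in the Unity Catalog hierarchy.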
Mass Ingestion Applications
The April 2024 release of Mass Ingestion Applications includes the following new features and enhancements:
Audit apply mode for Google BigQuery and Oracle targets with SAP sources
For application ingestion incremental load and combined initial and incremental load jobs with Google BigQuery and Oracle targets, you can configure the Audit apply mode instead of the default Standard apply mode. In Audit mode, a task writes a row to the generated target table for each DML operation on a source table. You can optionally add columns that contain metadata about the changes to the target table.
This feature is useful when you need an audit trail of changes to perform downstream processing on the data before writing it to the target database or when you need to examine the metadata for the changes. The target tables with the audit information can't have constraints other than indexes.
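The audit apply pattern described above can be sketched as follows. Instead of applying each change in place, every source DML operation becomes its own row in the target table, optionally tagged with metadata columns. The column names OP_TYPE and OP_TIME are illustrative assumptions, not the product's actual metadata column names.

```python
from datetime import datetime, timezone

def to_audit_row(op_type: str, row: dict) -> dict:
    """Turn one source DML operation into one audit row with metadata columns."""
    return {**row, "OP_TYPE": op_type,
            "OP_TIME": datetime.now(timezone.utc).isoformat()}

# Three operations on the same source row produce three target rows:
# an audit trail of the changes, not just the final row state.
audit_table = [
    to_audit_row("INSERT", {"id": 1, "name": "Alice"}),
    to_audit_row("UPDATE", {"id": 1, "name": "Alicia"}),
    to_audit_row("DELETE", {"id": 1, "name": "Alicia"}),
]
print(len(audit_table))  # 3
```

Because rows are only ever appended, constraints such as primary or unique keys on the target would be violated by repeated operations on the same source row, which is why the target tables can't have constraints other than indexes.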
Ability to select Salesforce source fields for data replication
If you use rules to select Salesforce source objects when defining an application ingestion task, you can individually select or clear the fields in each of the selected objects from which to replicate data. Previously, all of the fields were selected and could not be cleared. This feature lets you replicate only the data you need, reducing the amount of data replicated as well as the replication cost and overhead.
New resync options for Salesforce sources
Mass Ingestion Applications provides two new resync options, Resync (refresh) and Resync (retain), that you can use instead of the existing Resync command to resynchronize the target with the Salesforce source. Resync (refresh) refreshes the target to match the current structure of the source, while Resync (retain) retains the existing source and table structure that has been used for CDC.
Mass Ingestion Databases
The April 2024 release of Mass Ingestion Databases includes the following new features and enhancements:
Query-based CDC support for database ingestion jobs with Db2 for LUW sources
You can now use the query-based CDC method for database ingestion incremental load and combined initial and incremental load tasks that have Db2 for LUW sources and Snowflake targets. This method captures inserts and updates from the source tables by querying a timestamp column that is updated when a change occurs. When you define an incremental load or combined load task on the Source page in the task wizard, the CDC Method field is automatically set to Query-based. You must enter the query column name and set the column type to Timestamp. The Include LOBs option is not supported.
Previously, for jobs with Db2 for LUW sources, only the initial load type was supported and the query-based CDC method was not available.
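The query-based CDC method described above can be sketched as a watermark query: each polling cycle captures the rows whose timestamp column has advanced past the last captured value. The table and column names below are hypothetical, and SQLite stands in for Db2 for LUW purely for illustration.

```python
import sqlite3

# Hypothetical source table with a timestamp column that is updated on change.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, last_updated TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, 10.0, "2024-04-01T08:00:00"),
    (2, 25.0, "2024-04-01T09:30:00"),
    (3, 40.0, "2024-04-01T11:00:00"),
])

def capture_changes(conn, watermark: str):
    """Return rows changed since the watermark, plus the new watermark."""
    rows = conn.execute(
        "SELECT id, amount, last_updated FROM orders "
        "WHERE last_updated > ? ORDER BY last_updated", (watermark,)
    ).fetchall()
    new_watermark = rows[-1][2] if rows else watermark
    return rows, new_watermark

changes, wm = capture_changes(conn, "2024-04-01T09:00:00")
# Captures the two rows updated after the watermark. Note that deletes are
# invisible to this method: a deleted row no longer appears in the query
# result, which is why only inserts and updates are captured.
```

This also shows why the timestamp column must be reliably updated on every change: a row modified without touching the watermark column would be silently missed.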
New Add Last Replicated Time metadata column for Microsoft Azure Synapse Analytics and SQL Server targets
For application ingestion and database ingestion jobs that have a Microsoft Azure Synapse Analytics or SQL Server target and use any load type, you can add a metadata column to the target tables that records the date and time at which the last DML operation was applied to the target table. To add the column, select the Add Last Replicated Time check box in the Advanced section on the Target page of the task wizard. You can optionally add a prefix to the name of the metadata column to easily identify it and to prevent conflicts with the names of existing columns.
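As a rough illustration of the resulting target table change, the sketch below generates the DDL for such a metadata column with an optional prefix. The column name, prefix, and DATETIME2 type are assumptions for illustration, not the column that the product actually generates.

```python
def metadata_column_name(prefix: str = "") -> str:
    # An optional prefix distinguishes the metadata column from existing
    # source columns with similar names.
    return f"{prefix}LAST_REPLICATED_TIME"

# Hypothetical SQL Server target table and prefix
ddl = (f"ALTER TABLE dbo.customers "
       f"ADD {metadata_column_name('INFA_')} DATETIME2")
print(ddl)
```

Each time a DML operation is applied to a target row, the replication job would also set this column to the apply timestamp, so the column answers "when was this row last touched by replication" without querying job logs.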
Mass Ingestion Streaming
The April 2024 release of Mass Ingestion Streaming includes the following new feature and enhancement:
Cloud-hosted Bitbucket repositories
You can use the cloud-hosted Atlassian Bitbucket repository for source-controlled Mass Ingestion Streaming assets.