The November 2024 release of Data Integration includes the following new features and enhancements.
Watch the What's New video to learn about new features and enhancements in the November 2024 release.
Comparing asset versions
When you commit a Data Integration mapping, mapping task, or mapplet to your Git repository, Data Integration generates a JSON file with a vc.json extension that you can use to compare versions.
This feature is not enabled by default in the November 2024 release. To enable the feature, contact Informatica Global Customer Support.
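As a rough illustration of how two committed versions could be compared once both JSON files are available locally, the following Python sketch walks two parsed JSON documents and reports the paths whose values differ. The sample content and the helper name `diff_json` are hypothetical; the actual structure of the generated vc.json file is not described here, so treat the field names as placeholders.

```python
import json

def diff_json(old, new, path=""):
    """Return a list of (path, old_value, new_value) for every difference.

    A key that exists on only one side is reported with None for the
    missing side.
    """
    changes = []
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(set(old) | set(new)):
            child = f"{path}/{key}"
            if key not in old:
                changes.append((child, None, new[key]))
            elif key not in new:
                changes.append((child, old[key], None))
            else:
                changes.extend(diff_json(old[key], new[key], child))
    elif isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        # Compare lists of equal length element by element.
        for index, (a, b) in enumerate(zip(old, new)):
            changes.extend(diff_json(a, b, f"{path}/{index}"))
    elif old != new:
        # Scalars, mismatched types, or lists of different length.
        changes.append((path or "/", old, new))
    return changes

# Hypothetical content; the real vc.json schema may look quite different.
v1 = json.loads('{"name": "m_orders", "fields": [{"name": "id", "type": "integer"}]}')
v2 = json.loads('{"name": "m_orders", "fields": [{"name": "id", "type": "string"}], "owner": "etl"}')

for change in diff_json(v1, v2):
    print(change)
```

In practice, a Git diff tool or a dedicated JSON comparison utility may be more convenient; the sketch only shows the idea of a path-by-path comparison.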
Creating a RAG ingestion pipeline in advanced mode
You can create a retrieval augmented generation (RAG) ingestion pipeline in advanced mode using the following new features and enhancements:
• In a Source transformation, you can read PDF files and extract the text to pass to downstream transformations.
• You can use a Chunking transformation to split large pieces of text into smaller segments, or chunks, to increase the content's relevance before storing it in a vector database.
• You can use a Vector Embedding transformation to generate vector embeddings for input text, capturing the semantic meaning of the text in a vector format.
• In a Target transformation, you can write vectors to a vector database to build a knowledge base for a large language model (LLM) application.
For more information, see Mappings and Transformations.
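To make the stages of such a pipeline concrete, the following Python sketch mimics them outside of Data Integration: it splits text into overlapping fixed-size chunks, turns each chunk into a vector, and collects the results in an in-memory list. Everything here is an assumption for illustration. The function names, the character-based chunking strategy, and the parameters `chunk_size` and `overlap` are hypothetical and do not reflect how the Chunking or Vector Embedding transformations are configured. The `toy_embedding` function is a word-hashing stand-in that does not capture semantic meaning; a real pipeline would call an embedding model and write to a vector database.

```python
import hashlib
import math

def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into fixed-size character windows that overlap."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

def toy_embedding(text, dims=8):
    """Hash each word into one of `dims` buckets and L2-normalize the counts.

    This is only a placeholder for a real embedding model.
    """
    vector = [0.0] * dims
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vector[digest[0] % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector

# Hypothetical text, standing in for content extracted from a PDF file.
document = (
    "Retrieval augmented generation looks up relevant chunks of text "
    "and passes them to a large language model as context."
)

# In-memory stand-in for a vector database.
vector_store = [
    {"chunk": chunk, "vector": toy_embedding(chunk)}
    for chunk in chunk_text(document)
]

for entry in vector_store:
    print(repr(entry["chunk"]), len(entry["vector"]))
```

The overlap keeps text that straddles a chunk boundary visible in both neighboring chunks, which is one common reason to chunk with overlap rather than with hard cuts.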
CurrentTaskId
You can use the system variable CurrentTaskId to return the task ID as a string value.
For more information about system variables, see Function Reference.
Intelligent structure models
You can use normalized hierarchy relationships to generate output groups for JSON files.
For more information, see Components.
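As a general illustration of what normalizing a hierarchy means, the following Python sketch flattens one nested JSON record into separate groups of flat rows, where each child row points back to its parent through a generated key. This is an assumption-laden sketch of the concept, not the output format of an intelligent structure model: the function name `to_output_groups`, the `pk` and `parent_pk` fields, and the sample record are all hypothetical.

```python
from collections import defaultdict

def to_output_groups(record, group="root", groups=None, parent_pk=None):
    """Flatten one nested record into groups of flat rows linked by keys."""
    if groups is None:
        groups = defaultdict(list)
    pk = len(groups[group]) + 1      # generated key, unique within the group
    row = {"pk": pk}
    if parent_pk is not None:
        row["parent_pk"] = parent_pk  # link back to the parent row
    groups[group].append(row)
    for key, value in record.items():
        if isinstance(value, list) and all(isinstance(v, dict) for v in value):
            # A repeating child element becomes its own group.
            for child in value:
                to_output_groups(child, key, groups, pk)
        elif isinstance(value, dict):
            # A nested object also becomes its own group with one row.
            to_output_groups(value, key, groups, pk)
        else:
            row[key] = value          # scalar values stay on the current row
    return groups

# Hypothetical input record.
order = {
    "order_id": "A1",
    "customer": {"name": "Ada"},
    "items": [{"sku": "X", "qty": 2}, {"sku": "Y", "qty": 1}],
}

for name, rows in to_output_groups(order).items():
    print(name, rows)
```

Each resulting group resembles a relational table, so repeating elements such as `items` no longer have to be duplicated alongside their parent's fields.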
Source partitioning
When you create a mapping task with a parameterized source connection or object, you can configure pass-through partitioning for the source in the task.
For more information about source partitioning, see Transformations. For more information about tasks, see Tasks.
Transformations
The following transformations are new or have enhancements in this release:
• The Chunking transformation is a new transformation that you can use to split large pieces of text into smaller segments, or chunks, to increase the content's relevance before storing it in a vector database.
• When you configure an Output transformation in a mapplet, you can generate transformation output fields based on incoming fields.
• The Vector Embedding transformation is a new transformation that you can use to generate vector embeddings for input text, capturing the semantic meaning of the text in a vector format.