Mappings in advanced mode

Create a mapping in advanced mode when you want to process multilevel hierarchical data, embed code snippets, and run workloads at any scale. The Mapping Designer updates the mapping canvas to include the transformations and functions that enable advanced functionality.

A mapping in advanced mode can perform the following types of complex processing:

• Read hierarchical or relational input and convert it to relational, hierarchical, or flattened denormalized output using the Hierarchy Processor transformation.
• Embed code snippets in a mapping using the Python and Java transformations.
• Pass data to a machine learning model using the Machine Learning transformation.
• Reprocess incrementally loaded source files to create a snapshot of the data from a specified time interval, to debug and discover the source of bad data in your target, and to restore deleted data.
• Use a code task API to run hand-coded Scala jobs.
• Use native NoSQL connectors, such as MongoDB, DynamoDB, and CosmosDB, to connect to databases without using third-party drivers.
• Process data using any connector.
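
To make the idea of flattened denormalized output concrete, the following Python sketch turns one hierarchical record into one flat row per child element, repeating the parent fields on each row. It only illustrates the concept and is not how the Hierarchy Processor transformation is implemented. The function name flatten_order and all field names are hypothetical.

```python
from typing import Any


def flatten_order(order: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one flat row per line item, repeating the parent fields.

    The record layout (order_id, customer, items) is a made-up example.
    """
    rows = []
    for item in order["items"]:
        rows.append(
            {
                "order_id": order["order_id"],
                "customer_name": order["customer"]["name"],
                "sku": item["sku"],
                "quantity": item["quantity"],
            }
        )
    return rows


# A hierarchical record: one order with a nested customer and two line items.
order = {
    "order_id": 1001,
    "customer": {"name": "Acme"},
    "items": [
        {"sku": "A-1", "quantity": 2},
        {"sku": "B-7", "quantity": 1},
    ],
}

rows = flatten_order(order)
for row in rows:
    print(row)
```

Each output row carries the order and customer fields together with one line item, which is the denormalized shape that a relational target typically expects.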

A mapping in advanced mode requires an advanced cluster to run mapping logic. When you start running mappings in advanced mode, Data Integration can automatically create a local advanced cluster for you to use.

Note: To enable local clusters, contact your system administrator to update the Secure Agent configuration. For more information, see Prepare for local clusters in the Administrator help.

Because an advanced cluster is required, the mapping must run in one of the following types of runtime environments:

• A Secure Agent group with one Secure Agent. The Secure Agent must be installed on Linux.
• A serverless runtime environment.

Using an advanced cluster

Mappings in advanced mode run on an advanced cluster, which is a Kubernetes cluster that runs mapping logic in a distributed processing environment.

Advanced clusters are available in the following types:

• A local cluster with a single node that you can use to quickly onboard projects.
• A fully-managed cluster that Informatica creates, manages, and deletes.
• A self-service Kubernetes cluster that your organization manages.
• An advanced cluster that a serverless runtime environment creates.

When you start running mappings in advanced mode, you can use a local cluster. As soon as you run a mapping in advanced mode, Data Integration automatically creates a temporary, local cluster to run the mapping so that you can access the mapping output immediately.

As you develop larger data integration projects, your administrator can configure your runtime environment to connect to a fully-managed or self-service cluster that adapts to the size of the workload. Or, your administrator can set up a serverless runtime environment that includes an advanced cluster for your organization to use.

For more information about advanced clusters, see the Administrator help.

Mapping configuration in advanced mode

Use the Mapping Designer to create a mapping in advanced mode and update advanced mode settings.

You can complete the following tasks:

• Create a mapping in advanced mode.
• Copy a mapping to advanced mode.
• Update advanced mode settings.

Note: After the Mapping Designer updates the mapping canvas to use advanced mode, you cannot revert the operation.

Creating a mapping in advanced mode

To create a mapping in advanced mode, create a mapping and then update the mapping canvas in the Mapping Designer.

1. In Data Integration, click New > Mappings > Mapping.

2. In the Mapping Designer, click Switch to Advanced.

The following image shows the Switch to Advanced button in the Mapping Designer:

In the Mapping Designer, the header includes the Switch to Advanced button.

3. In the Switch to Advanced dialog box, optionally choose to hide the advanced mode confirmation dialog box or choose to always create mappings in advanced mode.

4. Click Switch to Advanced.

The Mapping Designer updates the mapping canvas to advanced mode.

Copying a mapping to advanced mode

To copy an existing mapping to advanced mode, update the mapping canvas in the Mapping Designer.

1. Open a mapping.

2. In the Mapping Designer, click Switch to Advanced.

3. In the Switch to Advanced dialog box, optionally choose to hide the advanced mode confirmation dialog box or choose to always create mappings in advanced mode.

4. Click Switch to Advanced.

Data Integration retains the original mapping and creates a copy of the mapping in advanced mode. In the copy, the Mapping Designer updates the mapping canvas to advanced mode.

Updating advanced mode settings

Update the advanced mode settings to show or hide the Switch to Advanced dialog box or to create mappings in advanced mode by default.

1. Open a mapping in advanced mode.

2. In the Mapping Designer header, click the Settings icon.

3. In the Advanced Mode Settings dialog box, check or uncheck the following options:

- Ask for confirmation before copying a mapping to advanced mode.
- Always create mappings in advanced mode.

4. Click Save.

Runtime plans

To troubleshoot a mapping in advanced mode, you can view the runtime plan when you monitor the job. The runtime plan shows which mapping logic runs on the Data Integration Server and which runs on an advanced cluster.

When you run a mapping in advanced mode, Data Integration uses the mapping compilation log to create a visualization of the mapping logic at runtime. The visualization is available as a runtime plan that you can view in the job details.

The following image shows an example of a runtime plan:

The runtime plan includes two segments. The first segment includes the transformations that run on an advanced cluster. The transformations in the first segment link to another set of transformations in the second segment. The second segment runs on the Data Integration Server.

The runtime plan is a close approximation of how the data actually flows through the mapping at runtime. The runtime plan is generated after mapping compilation, so the transformations in the runtime plan might appear in a different order than in the designed mapping.

Note: If a mapping runs using SQL ELT optimization, the mapping logic runs on the database, so a runtime plan isn't available.

For information about monitoring a mapping in advanced mode, see Monitor.

Creating a RAG ingestion pipeline

In advanced mode, you can create a retrieval-augmented generation (RAG) ingestion pipeline to build a knowledge base for your large language model (LLM) application.

To create a RAG ingestion pipeline, you can use a mapping in advanced mode to upload documents such as articles, invoices, and reports. You can split the text into chunks and convert the chunked text into vector embeddings. Then, you can store both the chunked text and the vector embeddings in a vector database.

When you submit a query to your LLM application, you can provide assisting text by calculating the similarity between the query’s embedding and the embeddings stored in the vector database. The chunks of text with the most similar embeddings are the ones that best match the query semantically. The LLM incorporates both the query and the assisting text in the response that it generates and returns to the user.
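
The retrieval step described above can be sketched in Python as follows. This is a minimal illustration, assuming cosine similarity as the similarity measure and an in-memory list in place of a vector database. The source does not specify either, and real vector databases provide their own search APIs. The names cosine_similarity, top_k_chunks, and store are hypothetical, and the three-dimensional embeddings are made-up values.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_embedding: list[float], store: list[dict], k: int = 2) -> list[str]:
    """Return the text of the k stored chunks most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, entry["embedding"]), entry["text"])
        for entry in store
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Stand-in for a vector database: chunked text stored with its embedding.
store = [
    {"text": "Invoices are due in 30 days.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "The office is closed on holidays.", "embedding": [0.0, 0.2, 0.9]},
    {"text": "Late invoices incur a fee.", "embedding": [0.8, 0.3, 0.1]},
]

# Assumed embedding of a query about invoices.
query_embedding = [1.0, 0.2, 0.0]
best = top_k_chunks(query_embedding, store, k=2)
print(best)
```

The chunks returned by top_k_chunks play the role of the assisting text that is passed to the LLM together with the query.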

Create the mapping using the following transformations, in order:

1. Source transformation. Read PDFs to extract the text.
2. Chunking transformation. Split large pieces of text into smaller segments, or chunks, to increase the content's relevance.
3. Vector Embedding transformation. Generate vector embeddings for input text, capturing the semantic meaning of the text in a vector format.
4. Expression or Sequence Generator transformation. Create an identifier for each vector.

- If you use an Expression transformation, use the UUID_STRING function without passing any arguments. The function returns a globally unique ID that can be stored in a string field with a precision of 100.

Note: UUID_STRING is an internal function that you can use only in advanced mode. Using it to create identifiers for other use cases might produce unexpected results.

- If you use a Sequence Generator transformation, create a shared sequence to use across all mappings that load data to the same index in the vector database.

5. Target transformation. Write vectors to a vector database.
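
The data flow through these transformations can be approximated outside the product with a short Python sketch. It is a conceptual illustration only, not the behavior of the actual transformations. It assumes fixed-size chunks with overlap, uses a toy vowel-count vector where a real embedding model would be called, uses Python's uuid.uuid4 as a stand-in for the UUID_STRING function, and writes to an in-memory list instead of a vector database. The names chunk_text, toy_embedding, and ingest are hypothetical.

```python
import uuid


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into fixed-size chunks that overlap by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += chunk_size - overlap
    return chunks


def toy_embedding(text: str) -> list[float]:
    """Placeholder for an embedding model: counts of each vowel in the text."""
    lowered = text.lower()
    return [float(lowered.count(vowel)) for vowel in "aeiou"]


def ingest(text: str, vector_store: list[dict]) -> None:
    """Chunk the text, embed each chunk, assign an ID, and store the result."""
    for chunk in chunk_text(text):
        vector_store.append(
            {
                "id": str(uuid.uuid4()),  # stand-in for UUID_STRING
                "text": chunk,
                "embedding": toy_embedding(chunk),
            }
        )


# Text that a Source transformation might have extracted from a document.
document_text = (
    "Invoices are due within thirty days of receipt. "
    "Late invoices incur a fee of two percent per month."
)

vector_store: list[dict] = []
ingest(document_text, vector_store)
for record in vector_store:
    print(record["id"], repr(record["text"]))
```

Storing the chunk text next to its embedding, as the sketch does, is what lets a later query return readable text rather than only vectors.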

For more information about each transformation, see Transformations.