Data Preview and Provisioning
An effective data catalog enables business intelligence users and data scientists to locate, preview, and provision data for ad-hoc analysis. Data provisioning is the process of moving data from a source in the catalog to a target for further processing. You can preview data for table and file asset types, and create provisioning tasks in Enterprise Data Catalog. Preview a sample of the source data before you create a provisioning task.
Data Preview and Provisioning Prerequisites
Before you enable the data preview and provisioning capability, make sure that the following prerequisites are met:
1. Ensure that you have administrator access to an Informatica Intelligent Cloud Services account.
2. Create an Informatica Intelligent Cloud Services organization.
3. Create the necessary database connections in Informatica Intelligent Cloud Services for the sources and targets.
4. Ensure that you have created the EdcProvisionMapping mapping using the Copy data into an existing target integration template in Informatica Intelligent Cloud Services.
5. Configure the LdmCustomOptions.enableDataProvision and LdmCustomOptions.provision.ics.master.app.url custom properties of the Catalog Service for data provisioning in the Informatica Administrator console.
6. Add the Informatica Intelligent Cloud Services organizations to Catalog Administrator.
Data Preview
You can preview a sample of the source data in Enterprise Data Catalog to assess the data before you transfer it to the target in a provisioning task. For example, consider an Oracle source that contains customer information that you want to migrate to an Oracle target. Before you move the data from source to target, you might want to verify that the table contains relevant information. To do so, preview the table and confirm that it contains relevant columns, such as Address, Phone Number, and other information about the customers.
You can perform data preview for the following sources:
- Amazon Redshift
- Amazon S3
- Azure Microsoft SQL Data Warehouse
- Azure Microsoft SQL Server
- Hive
- JDBC
- Microsoft SQL Server
- Oracle
- Salesforce
- Teradata
You can preview only table and file assets in the catalog. You must configure data provisioning in Catalog Administrator before you can preview data in the catalog.
Alternatively, you can preview data for a data object using the REST APIs. For more information, see the Preview Data for a Data Object topic in the Enterprise Data Catalog REST API Reference guide.
Data Preview Process
The data preview process involves configuration steps in Informatica Intelligent Cloud Services, Informatica Administrator, and Catalog Administrator. The data preview configuration steps are the same as the data provisioning configuration steps.
The following image shows the data preview process:
1. Configure. Set up Informatica Intelligent Cloud Services, configure custom properties for data provisioning in Informatica Administrator, and add Informatica Intelligent Cloud Services Organizations (Orgs) to Catalog Administrator.
2. Enable Data Provisioning. Configure a resource for data provisioning.
3. Configure Data Preview. Configure data preview for table and file asset types in Enterprise Data Catalog.
The data preview process involves the following steps:
- You configure custom properties for data provisioning in Informatica Administrator
  You configure the LdmCustomOptions.enableDataProvision and LdmCustomOptions.provision.ics.master.app.url custom properties of the Catalog Service.
  Note: If you do not provide a valid Informatica Intelligent Cloud Services organization URL for the LdmCustomOptions.provision.ics.master.app.url custom property, the Informatica Administrator console uses the default Informatica Intelligent Cloud Services URL, https://dm-us.informaticacloud.com/.
- You add Informatica Intelligent Cloud Services Organizations to Catalog Administrator
  You add the organizations to get access to Informatica Intelligent Cloud Services. Catalog Administrator connects to Informatica Intelligent Cloud Services to import database connections for the source and target. For more information about adding Informatica Intelligent Cloud Services Organizations, see the Adding an Informatica Intelligent Cloud Services Organization topic.
- You enable data provisioning for a resource
  You create a resource, configure the parameters in the Data Provisioning > Enable Data Provisioning section, and run the resource. When you enable data provisioning for the resource, Enterprise Data Catalog fetches source connection information for the data that you want to preview.
  Note: To enable data provisioning for a resource, you must select the same schema name or directory name that you configured for a connection in Informatica Intelligent Cloud Services.
  For more information about enabling data provisioning for a resource, see the Enable Data Provisioning topic.
- You configure the connection properties for the data source in Enterprise Data Catalog
  You specify the database user credentials in the connection properties for the data source, and then preview data.
  For more information about configuring connection properties for the source data, see the Data Preview topic in the Informatica Enterprise Data Catalog User Guide.
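The fallback behavior described in the note about the LdmCustomOptions.provision.ics.master.app.url custom property can be illustrated with a short Python sketch. This is not product code: the function name resolve_ics_url and the validity rule (an http or https scheme plus a host name) are assumptions made for illustration only. The property name and the default URL are the ones given in this topic.

```python
from urllib.parse import urlparse

# Default URL and property name as stated in this topic.
DEFAULT_ICS_URL = "https://dm-us.informaticacloud.com/"
URL_PROPERTY = "LdmCustomOptions.provision.ics.master.app.url"


def resolve_ics_url(custom_properties: dict) -> str:
    """Return the configured URL, or the default when it is missing or malformed.

    Hypothetical helper: the product's real validation rules are not documented here.
    """
    value = (custom_properties.get(URL_PROPERTY) or "").strip()
    parsed = urlparse(value)
    # Assumed validity rule: an http(s) scheme and a non-empty host name.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return value
    return DEFAULT_ICS_URL
```

Under these assumptions, an empty or malformed property value resolves to the default URL, and a well-formed URL is returned unchanged.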
Data Provisioning
Data provisioning is the process of transferring data from a source to a target for further analysis and processing. You can provision tables and files to the target. Enterprise Data Catalog uses a mapping and database connections from Informatica Intelligent Cloud Services to provision data. A mapping is a set of inputs and outputs that represent the data flow between sources and targets. A mapping contains components such as source objects, target objects, and transformations. You enable data provisioning in Catalog Administrator. After you enable data provisioning for a resource, you can create provisioning tasks in Enterprise Data Catalog for table and file asset types.
You can perform data provisioning for the following sources:
- Amazon Redshift
- Amazon S3
- Azure Data Lake Store
- Azure Microsoft SQL Data Warehouse
- Azure Microsoft SQL Server
- Hive
- JDBC
- Microsoft SQL Server
- Microsoft Azure Blob Storage
- Oracle
- Salesforce
- Teradata
You can perform data provisioning for the following targets:
- Amazon Redshift
- Amazon S3
- Azure Data Lake Store
- Azure Microsoft SQL Data Warehouse
- Azure Microsoft SQL Server
- Google BigQuery
- Google Cloud Storage
- HDFS
- Hive
- JDBC
- Microsoft Azure Blob Storage
- Microsoft SQL Server
- Oracle
- QlikView
- Teradata
- Tableau Online
- Tableau Server
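The supported source and target lists can be encoded as sets to check a planned provisioning task before you create it. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that any listed source can be paired with any listed target, which this topic does not state explicitly. The preview set repeats the list from the Data Preview section.

```python
# Sources that support data preview, as listed in the Data Preview section.
PREVIEW_SOURCES = frozenset({
    "Amazon Redshift", "Amazon S3", "Azure Microsoft SQL Data Warehouse",
    "Azure Microsoft SQL Server", "Hive", "JDBC", "Microsoft SQL Server",
    "Oracle", "Salesforce", "Teradata",
})

# Sources that support data provisioning.
PROVISIONING_SOURCES = PREVIEW_SOURCES | {
    "Azure Data Lake Store", "Microsoft Azure Blob Storage",
}

# Targets that support data provisioning.
PROVISIONING_TARGETS = frozenset({
    "Amazon Redshift", "Amazon S3", "Azure Data Lake Store",
    "Azure Microsoft SQL Data Warehouse", "Azure Microsoft SQL Server",
    "Google BigQuery", "Google Cloud Storage", "HDFS", "Hive", "JDBC",
    "Microsoft Azure Blob Storage", "Microsoft SQL Server", "Oracle",
    "QlikView", "Teradata", "Tableau Online", "Tableau Server",
})


def can_provision(source: str, target: str) -> bool:
    """Hypothetical check: both ends appear in the supported lists."""
    return source in PROVISIONING_SOURCES and target in PROVISIONING_TARGETS


def provision_only_sources() -> list:
    """Sources that can be provisioned but are not listed for preview."""
    return sorted(PROVISIONING_SOURCES - PREVIEW_SOURCES)
```

For example, Salesforce appears only in the source list, so can_provision("Salesforce", "Oracle") is true while can_provision("Oracle", "Salesforce") is false.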
Data Provisioning Process
The data provisioning process involves configuration steps in Informatica Intelligent Cloud Services, Informatica Administrator, and Catalog Administrator.
The following image shows the data provisioning process:
1. Configure. Set up Informatica Intelligent Cloud Services, configure custom properties for data provisioning in Informatica Administrator, and add Informatica Intelligent Cloud Services Organizations (Orgs) to Catalog Administrator.
2. Enable Data Provisioning. Configure a resource for data provisioning.
3. Create Data Provisioning Tasks. In Enterprise Data Catalog, create data provisioning tasks for assets in the resource that is configured for data provisioning.
The data provisioning process involves the following steps:
- You configure custom properties for data provisioning in the Informatica Administrator console
  You configure the LdmCustomOptions.enableDataProvision and LdmCustomOptions.provision.ics.master.app.url custom properties of the Catalog Service.
  Note: If you do not provide a valid Informatica Intelligent Cloud Services organization URL for the LdmCustomOptions.provision.ics.master.app.url custom property, the Informatica Administrator console uses the default Informatica Intelligent Cloud Services URL, https://dm-us.informaticacloud.com/.
- You add Informatica Intelligent Cloud Services Organizations to Catalog Administrator
  You add the organizations to get access to Informatica Intelligent Cloud Services. Catalog Administrator connects to Informatica Intelligent Cloud Services to import database connections for the source and target. For more information about adding Informatica Intelligent Cloud Services Organizations, see the Adding an Informatica Intelligent Cloud Services Organization topic.
- You configure and run a resource in Catalog Administrator
  You create a resource, configure parameters in the Metadata Load Settings > Source Metadata and Data Provisioning > Enable Data Provisioning sections, and then run the resource.
  Note: To enable data provisioning for a resource, you must select the same schema name that you configured for a connection in Informatica Intelligent Cloud Services.
  For more information about enabling data provisioning for a resource, see the Enable Data Provisioning topic.
- You create data provisioning tasks in Enterprise Data Catalog
  You configure the source and target connection information of the database to provision data.
  For more information about creating data provisioning tasks, see the Creating Provisioning Task topic in the Informatica Enterprise Data Catalog User Guide.
  Alternatively, you can perform this task using the REST APIs. For more information, see the Data Provisioning REST APIs and List All the Provisioning Tasks topics in the Enterprise Data Catalog REST API Reference guide.
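The schema name requirement in the resource configuration step can be expressed as a simple pre-flight check. The Python sketch below is a hypothetical illustration, not part of any Informatica API: the IcsConnection class, its fields, and the exact, case-sensitive comparison are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IcsConnection:
    """Hypothetical record of a connection defined in Informatica Intelligent Cloud Services."""
    name: str
    schema: str


def usable_connections(connections, resource_schema: str) -> list:
    """Return the names of connections whose schema matches the resource schema.

    Assumption: names must match exactly, including letter case.
    """
    return [c.name for c in connections if c.schema == resource_schema]
```

If the returned list is empty, the schema selected for the resource does not match any connection in the sample data, which corresponds to the situation that the note about schema names warns against.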