Troubleshooting

In the Developer tool, I run a profile on the first three columns of a data source in the first run. Then, I run the profile on the next three columns in the data source. What will Enterprise Data Catalog display after I run the profiling warehouse resource?

By default, the catalog displays the results of the last profile run. To view all six columns in the catalog, choose the Cumulative option in the profiling warehouse resource and run the resource. Enterprise Data Catalog then displays the latest results for each column in the data source, based on the timestamp.

How does the Cumulative option work?

Assume that you have an Employee table with columns C1, C2, C3, and C4. In the Developer tool, you create and run a profile on the table. The profile runs include R1, R2, and R3.

The following list shows the columns that you choose when you run the profile:

1. R1. You choose C1 and C2.
2. R2. You choose C3 and C4.
3. R3. You choose C1 and C4.

In Catalog Administrator, you create and run the profiling warehouse resource with the Cumulative option. The Informatica Data Quality scanner extracts the following latest column results based on the timestamp:

1. C1. The scanner extracts the latest results from R3.
2. C2. The scanner extracts the latest results from R1.
3. C3. The scanner extracts the latest results from R2.
4. C4. The scanner extracts the latest results from R3.
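
The selection described above can be sketched in a few lines of Python. This is only an illustration of the "latest result per column" idea, not the scanner's actual implementation. The run timestamps and the function name latest_run_per_column are hypothetical; only the order of the runs matters.

```python
from datetime import datetime

# Hypothetical profile runs: (run name, run timestamp, columns profiled).
# The timestamps are invented for illustration; R1 is oldest, R3 is newest.
runs = [
    ("R1", datetime(2019, 5, 2, 10, 0), ["C1", "C2"]),
    ("R2", datetime(2019, 5, 3, 10, 0), ["C3", "C4"]),
    ("R3", datetime(2019, 5, 4, 10, 0), ["C1", "C4"]),
]

def latest_run_per_column(runs):
    """Return {column: run name}, keeping the most recent run for each column."""
    latest = {}
    for name, ts, columns in runs:
        for col in columns:
            # Keep this run if the column is new or this run is more recent.
            if col not in latest or ts > latest[col][1]:
                latest[col] = (name, ts)
    return {col: name for col, (name, _ts) in sorted(latest.items())}

print(latest_run_per_column(runs))
# {'C1': 'R3', 'C2': 'R1', 'C3': 'R2', 'C4': 'R3'}
```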

How does the Incremental option work?

1. In the Developer tool, you run a profile P1 on the Employee and Department tables at 10.00 AM on 05/02/19.
2. In the Catalog Administrator, you create a profiling warehouse resource PWH with the Incremental option, and run the resource at 10.15 AM on 05/02/19.

The scanner migrates the profile results of P1 to the catalog.

3. You run P1 again at 10.00 AM on 05/03/19.
4. You run PWH at 10.15 AM on 05/03/19.

The scanner does not migrate any results because there are no new results and no change in the timestamp.

5. You create and run profile P2 on the Employee and Department tables at 10.00 AM on 05/04/19.
6. You run PWH at 10.15 AM on 05/04/19.

The scanner migrates P2 profile results to the catalog.

7. You run the profile on the Payroll table at 10.00 AM on 05/06/19.
8. You run PWH at 10.15 AM on 05/06/19.

The scanner migrates only the profile results of the Payroll table because you chose the Incremental option. In this case, the profile results of the Payroll table are also called the delta.
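
As a rough illustration of the delta in the last step, the following Python sketch compares the results in the profiling warehouse with the results already migrated and keeps only what is new or newer. It assumes the scanner tracks one timestamp per table, which is a simplification made for this example; it does not reproduce how the scanner decides that a rerun produced no new results. The function name select_delta and the data structures are hypothetical.

```python
from datetime import datetime

def select_delta(results, last_migrated):
    """Return the results that are new or newer than what was already migrated.

    results: {table name: timestamp of its latest profile result}
    last_migrated: {table name: timestamp of the result already in the catalog}
    """
    return {
        table: ts
        for table, ts in results.items()
        if table not in last_migrated or ts > last_migrated[table]
    }

# State before step 8 (dates read as mm/dd/yy): the P2 results for
# Employee and Department are already in the catalog.
last_migrated = {
    "Employee": datetime(2019, 5, 4, 10, 0),
    "Department": datetime(2019, 5, 4, 10, 0),
}
# Profiling warehouse after step 7: the Payroll profile ran on 05/06/19.
results = dict(last_migrated, Payroll=datetime(2019, 5, 6, 10, 0))

delta = select_delta(results, last_migrated)
print(sorted(delta))
# ['Payroll']
```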

What results appear in the catalog if I run the profiling resource and profiling warehouse resource on the same table?

The catalog compares the timestamp of the profile results from the profiling resource run and the migrated profile results. It displays the results with the latest timestamp.

What job types are available for the profiling warehouse resource in the Monitoring tab?

The Metadata Load job type is available for the profiling warehouse resource.

What profiling statistics and progress operations appear in the Monitoring tab for the profiling warehouse resource?

In the Statistics tab, the following profiling statistic names and their values appear:

- Number of Profile Result Identifiers Fetched
- Number of Profile Results Fetched
- Value Frequency results for Tables extracted

In the Progress tab, the following progress operation names with their timestamp and outcomes appear:

- Profile Results Extraction
- Profile Results Identifiers Extraction
- Profile Results Ingestion
- VF Extraction

If I have multiple resource results in the catalog, which resource results are overwritten by the profiling warehouse scanner?

Assume that you have an Oracle schema S1. You create Oracle connections C1 and C2 in the Administrator tool.

In the Developer tool, you choose C1 to import the Oracle tables from S1, create a profile for the tables, and run the profile.

In Catalog Administrator, you choose C1 and S1 to create the Oracle resource R1. You choose C2 and S1 to create the Oracle resource R2. You run the resources. The catalog displays the resource results.

When you create a profiling warehouse resource, you choose the profiling warehouse connection C1. When you run the profiling warehouse resource, the catalog compares and overwrites the resource results of R1.

I see a mismatch in data domain names in the catalog. How do I resolve this issue?

This issue appears when the profiling warehouse and Enterprise Data Catalog are in different domains. To synchronize the data domains, export the data domains from the profiling warehouse domain and import them into the Enterprise Data Catalog domain.

What are the supported database types and their fully qualified JDBC connection strings that I can use for the profiling warehouse scanner?

The following table lists the sample JDBC connection strings for all the supported database types:

Database Type	Class Name Value	Connection String Value
Oracle	com.informatica.jdbc.oracle.OracleDriver	jdbc:informatica:oracle://<hostname>:<port>;SID=<sid>
DB2	com.informatica.jdbc.db2.DB2Driver	jdbc:informatica:db2://<hostname>:<port>;DatabaseName=<dbname>
SQL Server	com.informatica.jdbc.sqlserver.SQLServerDriver	jdbc:informatica:sqlserver://<host>:<port>;databaseName=<dbname>
Sybase	com.informatica.jdbc.sybase.SybaseDriver	jdbc:informatica:sybase://<host>:<port>;databaseName=<dbname>
MySQL	com.informatica.jdbc.mysql.MySQLDriver	jdbc:informatica:mysql://<host>:<port>;databaseName=<dbname>
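
If you assemble these strings in a script, a small helper can reduce typing mistakes. The following Python sketch copies the connection string patterns from the table above and fills in the placeholders. The helper name build_jdbc_url, the normalized placeholder names, and the host and SID in the usage line are hypothetical and not part of the product.

```python
# Patterns copied from the table above. Placeholder names are normalized
# to {host}, {port}, and {db} for this sketch only.
JDBC_TEMPLATES = {
    "oracle": "jdbc:informatica:oracle://{host}:{port};SID={db}",
    "db2": "jdbc:informatica:db2://{host}:{port};DatabaseName={db}",
    "sql server": "jdbc:informatica:sqlserver://{host}:{port};databaseName={db}",
    "sybase": "jdbc:informatica:sybase://{host}:{port};databaseName={db}",
    "mysql": "jdbc:informatica:mysql://{host}:{port};databaseName={db}",
}

def build_jdbc_url(db_type, host, port, db):
    """Fill in the connection string pattern for a supported database type."""
    key = db_type.strip().lower()  # accept "Oracle", "ORACLE", " oracle "
    if key not in JDBC_TEMPLATES:
        raise ValueError(f"Unsupported database type: {db_type!r}")
    return JDBC_TEMPLATES[key].format(host=host, port=port, db=db)

# Hypothetical host and SID, for illustration only.
print(build_jdbc_url("Oracle", "dbhost.example.com", 1521, "ORCL"))
# jdbc:informatica:oracle://dbhost.example.com:1521;SID=ORCL
```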

I run a profile on a source column that has a special character in its name. Why does the profile run fail?

If a source table or column has a special character in its name, or the name starts with a number, the profiling warehouse replaces each special character or the number with an underscore ( _ ) character. Therefore, the table is not stored in the profiling warehouse under its original name, and you can see two tables instead of the one source table in Enterprise Data Catalog. When you run a profile, the profiling warehouse cannot update the table with the profiling results, and the profile run fails. You can observe similar behavior when a reference resource is created on the Data Quality scanner.

Before you select columns on which you want to run the profile, remove special characters and numbers from the table and column names.
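
To find the names that need attention before you run a profile, you can screen them with a short script. The following Python sketch flags names that contain a special character or start with a number. The definition of a special character used here (anything other than an ASCII letter, digit, or underscore) is an assumption made for this example, and the column names are invented.

```python
import re

# Assumption: a name is safe when it starts with a letter or underscore and
# contains only ASCII letters, digits, and underscores.
SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def names_to_rename(names):
    """Return the names that contain a special character or start with a number."""
    return [name for name in names if not SAFE_NAME.fullmatch(name)]

# Hypothetical column names, for illustration only.
columns = ["EMP_ID", "EMP-NAME", "2019_SALARY", "DEPT#"]
print(names_to_rename(columns))
# ['EMP-NAME', '2019_SALARY', 'DEPT#']
```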

In the Developer tool, I run a profile to perform data domain discovery and migrate the profile results to Enterprise Data Catalog. Then I curate the profile results and run the Informatica Data Quality resource again to migrate the results to Enterprise Data Catalog. Will I see the updated profile results in Enterprise Data Catalog?

You will not see the updated results in Enterprise Data Catalog. To view the updated profile results, run the profile again after curation, and then migrate the profile results to Enterprise Data Catalog.

What are the resources that support Parquet file formats for profiling?

The following table lists resources and their supported Parquet file formats:

Resources	Parquet file formats supported
Amazon S3	Single and partitioned Parquet files
Azure Data Lake Store Gen2	Single and partitioned Parquet files
HDFS	Single Parquet file
Local File Systems	Single Parquet file