Troubleshooting

In the Developer tool, I run a profile on the first three columns of a data source in the first run. Then, I run the profile on the next three columns in the data source. What will Enterprise Data Catalog display after I run the profiling warehouse resource?
By default, the catalog displays the results of the last profile run only. To view all six columns in the catalog, choose the Cumulative option in the profiling warehouse resource and run the resource. Enterprise Data Catalog then displays the latest results for each column in the data source based on the timestamp.
How does the Cumulative option work?
Assume that you have an Employee table with columns C1, C2, C3, and C4. In the Developer tool, you create and run a profile on the table. The profile runs include R1, R2, and R3.
The following list shows the columns that you choose when you run the profile:
  1. R1. You chose C1 and C2.
  2. R2. You chose C3 and C4.
  3. R3. You chose C1 and C4.
In Catalog Administrator, you create and run the profiling warehouse resource with the Cumulative option. The Informatica Data Quality scanner extracts the following latest column results based on the timestamp:
  1. C1. The scanner extracts the latest results from R3.
  2. C2. The scanner extracts the latest results from R1.
  3. C3. The scanner extracts the latest results from R2.
  4. C4. The scanner extracts the latest results from R3.
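The selection described above can be sketched in a few lines of Python. This is an illustration of the "latest result per column" rule only, not the scanner's implementation. The run timestamps, the `runs` structure, and the function name `latest_run_per_column` are assumptions made for the example.

```python
from datetime import datetime

# Hypothetical profile runs from the example. The timestamps are made up;
# only their order (R1 before R2 before R3) matters.
runs = [
    {"run": "R1", "finished": datetime(2019, 5, 2, 10, 0), "columns": ["C1", "C2"]},
    {"run": "R2", "finished": datetime(2019, 5, 2, 11, 0), "columns": ["C3", "C4"]},
    {"run": "R3", "finished": datetime(2019, 5, 2, 12, 0), "columns": ["C1", "C4"]},
]

def latest_run_per_column(runs):
    """Return {column: run name}, keeping the most recent run for each column."""
    latest = {}
    for run in runs:
        for column in run["columns"]:
            current = latest.get(column)
            # Keep this run if the column is new or this run finished later.
            if current is None or run["finished"] > current["finished"]:
                latest[column] = run
    return {column: run["run"] for column, run in sorted(latest.items())}

print(latest_run_per_column(runs))
# {'C1': 'R3', 'C2': 'R1', 'C3': 'R2', 'C4': 'R3'}
```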
How does the Incremental option work?
  1. In the Developer tool, you run a profile P1 on the Employee and Department tables at 10:00 AM on 05/02/19.
  2. In Catalog Administrator, you create a profiling warehouse resource PWH with the Incremental option, and run the resource at 10:15 AM on 05/02/19.
     The scanner migrates the profile results of P1 to the catalog.
  3. You run P1 again at 10:00 AM on 05/03/19.
  4. You run PWH at 10:15 AM on 05/03/19.
     The scanner does not migrate any results because there are no new results or changes in timestamp.
  5. You create and run profile P2 on the Employee and Department tables at 10:00 AM on 05/04/19.
  6. You run PWH at 10:15 AM on 05/04/19.
     The scanner migrates the P2 profile results to the catalog.
  7. You run the profile on the Payroll table at 10:00 AM on 05/06/19.
  8. You run PWH at 10:15 AM on 05/06/19.
     The scanner migrates only the profile results of the Payroll table because you chose the Incremental option. In this case, the profile results of the Payroll table are also called the delta.
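The idea of a delta can be illustrated with a small Python sketch. It assumes, for illustration only, that each result is identified by a (profile, table) pair and carries one timestamp, and that a result is migrated when the catalog has no entry for it or holds a different timestamp. How the scanner actually detects changes is not described here. The function name `incremental_delta` and the profile name P3 for the Payroll run are hypothetical.

```python
def incremental_delta(catalog, warehouse):
    """Return the warehouse entries that are new or whose timestamp changed.

    Both arguments map a (profile, table) key to a result timestamp string.
    """
    return {
        key: stamp
        for key, stamp in warehouse.items()
        if catalog.get(key) != stamp
    }

catalog = {}
warehouse = {
    ("P1", "Employee"): "05/02/19 10:00",
    ("P1", "Department"): "05/02/19 10:00",
}

# First PWH run: both P1 results are new, so both are migrated.
first = incremental_delta(catalog, warehouse)
catalog.update(first)

# Second PWH run: modeled as "no new results or change in timestamp".
second = incremental_delta(catalog, warehouse)

# Third PWH run: profile P2 was added, so only P2 results are migrated.
warehouse[("P2", "Employee")] = "05/04/19 10:00"
warehouse[("P2", "Department")] = "05/04/19 10:00"
third = incremental_delta(catalog, warehouse)
catalog.update(third)

# Fourth PWH run: only the Payroll result (hypothetical profile P3) is the delta.
warehouse[("P3", "Payroll")] = "05/06/19 10:00"
fourth = incremental_delta(catalog, warehouse)

print(sorted(first), second, sorted(third), fourth, sep="\n")
```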
What results appear in the catalog if I run the profiling resource and profiling warehouse resource on the same table?
The catalog compares the timestamp of the profile results from the profiling resource run and the migrated profile results. It displays the results with the latest timestamp.
What job types are available for the profiling warehouse resource in the Monitoring tab?
The Metadata Load job type is available for the profiling warehouse resource.
What profiling statistics and progress operations appear in the Monitoring tab for the profiling warehouse resource?
In the Statistics tab, the following profiling statistic names and their values appear:
In the Progress tab, the following progress operation names with their timestamp and outcomes appear:
If I have multiple resource results in the catalog, which resource results are overwritten by the profiling warehouse scanner?
Assume that you have an Oracle schema S1. You create Oracle connections C1 and C2 in the Administrator tool.
In the Developer tool, you choose C1 to import the Oracle tables from S1, create a profile for the tables, and run the profile.
In Catalog Administrator, you choose C1 and S1 to create the Oracle resource R1. You choose C2 and S1 to create the Oracle resource R2. You run the resources. The catalog displays the resource results.
When you create a profiling warehouse resource, you choose the profiling warehouse connection C1. When you run the profiling warehouse resource, the catalog compares and overwrites the resource results of R1.
I see a mismatch in data domain names in the catalog. How do I resolve this issue?
This issue appears when the profiling warehouse and Enterprise Data Catalog are in different domains. To synchronize the data domains, export the data domains from the profiling warehouse domain and import them into the Enterprise Data Catalog domain.
What are the supported database types and their fully qualified JDBC connection strings that I can use for the profiling warehouse scanner?
The following table lists the sample JDBC connection strings for all the supported database types:
| Database Type | Class Name Value | Connection String Value |
|---|---|---|
| Oracle | com.informatica.jdbc.oracle.OracleDriver | jdbc:informatica:oracle://<hostname>:<port>;SID=<sid> |
| DB2 | com.informatica.jdbc.db2.DB2Driver | jdbc:informatica:db2://<hostname>:<port>;DatabaseName=<dbname> |
| SQL Server | com.informatica.jdbc.sqlserver.SQLServerDriver | jdbc:informatica:sqlserver://<host>:<port>;databaseName=<dbname> |
| Sybase | com.informatica.jdbc.sybase.SybaseDriver | jdbc:informatica:sybase://<host>:<port>;databaseName=<dbname> |
| MySQL | com.informatica.jdbc.mysql.MySQLDriver | jdbc:informatica:mysql://<host>:<port>;databaseName=<dbname> |
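If you generate resource settings from a script, a small helper such as the following can fill in the placeholders. This is a sketch only: the class names and connection string patterns are copied from the table above, while the dictionary `JDBC_TEMPLATES`, the function `build_jdbc_settings`, and the sample host name are hypothetical.

```python
# Class names and URL patterns are taken from the table above.
# {host}, {port}, and {name} stand in for the <...> placeholders.
JDBC_TEMPLATES = {
    "oracle": ("com.informatica.jdbc.oracle.OracleDriver",
               "jdbc:informatica:oracle://{host}:{port};SID={name}"),
    "db2": ("com.informatica.jdbc.db2.DB2Driver",
            "jdbc:informatica:db2://{host}:{port};DatabaseName={name}"),
    "sqlserver": ("com.informatica.jdbc.sqlserver.SQLServerDriver",
                  "jdbc:informatica:sqlserver://{host}:{port};databaseName={name}"),
    "sybase": ("com.informatica.jdbc.sybase.SybaseDriver",
               "jdbc:informatica:sybase://{host}:{port};databaseName={name}"),
    "mysql": ("com.informatica.jdbc.mysql.MySQLDriver",
              "jdbc:informatica:mysql://{host}:{port};databaseName={name}"),
}

def build_jdbc_settings(db_type, host, port, name):
    """Return (class name, connection string) for a supported database type."""
    key = db_type.lower().replace(" ", "")  # "SQL Server" -> "sqlserver"
    try:
        class_name, template = JDBC_TEMPLATES[key]
    except KeyError:
        raise ValueError(f"Unsupported database type: {db_type}") from None
    return class_name, template.format(host=host, port=port, name=name)

# Example with a made-up host; for Oracle, `name` is the SID.
print(build_jdbc_settings("Oracle", "dbhost.example.com", 1521, "ORCL"))
```

Rejecting unknown database types with an error, rather than guessing a pattern, keeps the helper limited to the types listed in the table.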
I run a profile on a source column that has a special character in its name. Why does the profile run fail?
If a source table or column has a special character in its name, or the name starts with a number, the profiling warehouse replaces each special character or the number with an underscore ( _ ) character. As a result, the table is not stored in the profiling warehouse under its original name, and you can see two tables instead of the source table in Enterprise Data Catalog. When you run a profile, the profiling warehouse cannot update the table with the profiling results, and the profile run fails. You can observe similar behavior when a reference resource is created on the Data Quality scanner.
Before you select columns on which you want to run the profile, remove special characters and numbers from the table and column names.
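To find affected names before you run a profile, you can screen a list of table or column names with a short script. The following Python sketch flags the two conditions described above. It assumes that a "special character" means anything other than an ASCII letter, digit, or underscore, which is a guess rather than the product's definition. The function name and sample column names are hypothetical.

```python
import re

# Assumption: anything other than A-Z, a-z, 0-9, or underscore is "special".
SPECIAL = re.compile(r"[^A-Za-z0-9_]")

def name_problems(name):
    """Return a list of reasons why a table or column name may cause trouble."""
    problems = []
    if re.match(r"[0-9]", name):
        problems.append("starts with a number")
    if SPECIAL.search(name):
        problems.append("contains a special character")
    return problems

# Made-up column names for illustration.
for column in ["EMP_ID", "2019_SALES", "ZIP-CODE", "1ST$"]:
    print(column, name_problems(column) or "ok")
```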
In the Developer tool, I run a profile to perform data domain discovery and migrate the profile results to Enterprise Data Catalog. Then I curate the profile results and run the Informatica Data Quality resource again to migrate the results to Enterprise Data Catalog. Will I see the updated profile results in Enterprise Data Catalog?
No, you will not see the updated results in Enterprise Data Catalog. To view the updated profile results, run the profile again after curation, and then migrate the profile results to Enterprise Data Catalog.
What are the resources that support Parquet file formats for profiling?
The following table lists resources and their supported Parquet file formats:
| Resource | Parquet file formats supported |
|---|---|
| Amazon S3 | Single and partitioned Parquet files |
| Azure Data Lake Store Gen2 | Single and partitioned Parquet files |
| HDFS | Single Parquet file |
| Local File Systems | Single Parquet file |