PowerCenter
You can configure a PowerCenter resource type to extract metadata from PowerCenter repository objects. Use PowerCenter to extract data from multiple sources, transform the data according to business logic you build in the client application, and load the transformed data into file and relational targets.
Objects Extracted
Enterprise Data Catalog extracts mapping metadata from the PowerCenter source.
Permissions to Configure the Resource
- Make sure that you configure the Access Repository Manager privilege for the user who accesses the PowerCenter repository.
- Configure read permission on the PowerCenter data source for the user account that you use to access the PowerCenter data source.
- Ensure that you run the pmrep ObjectExport command to export the mappings in PowerCenter.
Prerequisites
If the domain is SSL-enabled and the cluster is Kerberos-enabled, perform the following steps:
- Copy the SSL certificate to all the cluster machines. Ensure that the infa_truststore.jks file path is the same on all the cluster machines.
- To get the encrypted password, navigate to the <INFAHOME>/server/bin directory, and then run pmpasswd <Truststore Password> on the command line.
- Configure the following JVM parameters for the resource:
  - -DINFA_TRUSTSTORE=<trust store path>
  - -DINFA_TRUSTSTORE_PASSWORD=<trust store encrypted key>
Resource Connection Properties
The following table describes the properties for the PowerCenter resource:
| Property | Description |
|---|---|
| Gateway Host Name or Address | PowerCenter domain gateway host name or address. |
| Gateway Port Number | PowerCenter domain gateway port number. |
| Informatica Security Domain | LDAP security domain name if one exists. Otherwise, enter "Native." |
| Repository Name | Name of the PowerCenter repository. |
| Repository User Name | Username for the PowerCenter repository. |
| Repository User Password | Password for the PowerCenter repository. |
| PowerCenter Version | PowerCenter repository version. Note: Informatica does not provide support for PowerCenter versions earlier than 9.6.0. |
| PowerCenter Code Page | Code page for the PowerCenter repository. |
The following table describes the Additional and Advanced properties for source metadata settings in the Metadata Load Settings tab:
| Property | Description |
|---|---|
| Enable Source Metadata | Select to extract metadata from the data source. Note: You can run Data Flow Analytics on the resource even if you have not enabled the source metadata setting. |
| Parameter File | Specify the parameter file zip that you want to attach from a local system. |
| Auto assign Connections | Specifies whether Enterprise Data Catalog assigns the connection automatically. |
| Enable Reference Resources | Option to extract metadata about assets that are not included in this resource, but are referred to in the resource. Examples include source and target tables in PowerCenter mappings, and source tables and files from Tableau reports. |
| Retain Unresolved Reference Assets | Option to retain unresolved reference assets in the catalog after you assign connections. Retaining unresolved reference assets helps you view the complete lineage. The unresolved assets include deleted files, temporary tables, and other assets that are not present in the primary resource. |
| Repository subset | Enter the list of file paths, separated by semicolons, for the Informatica PowerCenter Repository object. Note: If you want to run Data Flow Analytics on specific objects, enter the file path for the Informatica PowerCenter Repository object. If you do not enter a file path, the catalog runs Data Flow Analytics on all the PowerCenter Repository objects by default. |
| Detailed Lineage | Select to extract and ingest metadata related to transformation logic for assets that include transformations. A transformation indicates generation, modification, or passage of data between source and target connections. Transformation logic displays the mappings or data flow relation types between source assets and target assets related to the asset you select in Enterprise Data Catalog. Note: To view the detailed lineage for assets that include web service transformations in the Catalog, you must select the Enable Reference Resources option. |
| Memory | Specify the memory value required to run a scanner job. Select one of the available memory values. Note: Informatica recommends that you use the High memory value if you enable Data Flow Analytics. For more information about the memory values, see the Tuning Enterprise Data Catalog Performance article on the How-To Library Articles tab in the Informatica Doc Portal. |
| JVM Options | JVM parameters that you can set to configure the scanner container. The supported arguments are listed after this table. |

Use the following arguments to configure the JVM Options parameters:
- -Dscannerloglevel=<DEBUG/INFO/ERROR>. Changes the log level of the scanner to a value such as DEBUG, ERROR, or INFO. The default value is INFO.
- -Dscanner.container.core=<No. of core>. Increases the number of cores for the scanner container. The value must be a number.
- -Dscanner.yarn.app.environment=<key=value>. Key-value pairs that you need to set in the YARN environment. Use a comma to separate the key-value pairs.
- -Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the scanner container memory when pmem is enabled. The default value is 1.
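As an illustration of how the JVM Options arguments fit together, the following Python sketch assembles a value for the field. The function name `build_scanner_jvm_options` is hypothetical, and joining the arguments with spaces is an assumption about how the field expects multiple arguments; only the argument names and the comma separator for YARN key-value pairs come from the table above.

```python
# Minimal sketch: compose a JVM Options string from the documented arguments.
# Assumption: multiple arguments are separated by spaces in the field.

VALID_LOG_LEVELS = {"DEBUG", "INFO", "ERROR"}


def build_scanner_jvm_options(log_level="INFO", cores=None, yarn_env=None):
    """Return a JVM Options string; build_scanner_jvm_options is a hypothetical helper."""
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"unsupported log level: {log_level!r}")
    options = [f"-Dscannerloglevel={log_level}"]

    if cores is not None:
        # The documentation states that the core value should be a number.
        if not isinstance(cores, int) or cores < 1:
            raise ValueError("cores must be a positive integer")
        options.append(f"-Dscanner.container.core={cores}")

    if yarn_env:
        # Key-value pairs are separated by commas, as described in the table.
        pairs = ",".join(f"{key}={value}" for key, value in yarn_env.items())
        options.append(f"-Dscanner.yarn.app.environment={pairs}")

    return " ".join(options)


if __name__ == "__main__":
    print(build_scanner_jvm_options("DEBUG", cores=2, yarn_env={"TZ": "UTC"}))
```

Validating the log level and core count before you paste the string into the resource configuration can catch typing mistakes early, because the scanner job only reads the options at run time.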
The following table describes the Repository details and Advanced properties for data flow analytics settings in the Metadata Load Settings tab:
| Property | Description |
|---|---|
| Enable Data Flow Analytics | Select to enable Data Flow Analytics and view analytical data about PowerCenter mappings in the catalog. Note: You must configure Data Asset Analytics for the Catalog Service to run Data Flow Analytics on the resource. |
| Select PowerCenter Repository Type | Type of database for the PowerCenter repository. Select from the following options: Oracle, Microsoft SQL Server, IBM DB2, Sybase, or PostgreSQL. |
| Username | Name for the PowerCenter repository database user account. |
| Password | Password for the PowerCenter repository database user account. |
| Database Connection String | JDBC connection string to connect to the secure database, including the host name and port number and the security parameters for the database. |
| Custom Options | Specify the custom options to run the scanner job. |
Note: Effective in version 10.5, Data Flow Analytics is available for technical preview. Technical preview functionality is supported but is unwarranted and is not production-ready. Informatica recommends that you use these features in non-production environments only. Informatica intends to include the preview functionality in an upcoming GA release for production use, but might choose not to in accordance with changing market or technical circumstances. For more information, contact Informatica Global Customer Support.
Connecting to an SSL-enabled PowerCenter Resource
Perform the following steps before you create the PowerCenter resource if you want to connect to an SSL-enabled PowerCenter data source:
- 1. Disable the Catalog Service.
- 2. Run the following command to export the certificates from the domain truststore of PowerCenter:
  $INFAINSTALL/java/bin/keytool -export -keystore $INFAINSTALL/services/shared/security/infa_truststore.jks -alias <alias_name> -file $INFAINSTALL/services/shared/security/certExportFromSSLDomainTruststore.cert
- 3. On the node where Enterprise Data Catalog runs, run the following command to import the certificate into the Informatica domain truststore:
  $LDMINSTALL/java/bin/keytool -import -file certExportFromSSLDomainTruststore.cert -alias <new_alias_name> -keystore $LDMINSTALL/services/shared/security/infa_truststore.jks -storepass pass2038@infaSSL
Note: Make sure that you use a new alias name when you import the certificate.
- 4. Copy the truststore and the keystore files from the PowerCenter domain to the Informatica domain truststore.
- 5. Run the following commands to configure the environment variables:
- a. Open the infaservice.sh script on the Informatica domain for editing. The script is at the following location: $INFAINSTALL/tomcat/bin/infaservice.sh
- b. In the script, add the following line to set the environment variable: INFA_TRUSTSTORE=<the location where you copied the truststore and keystore files.> Ensure that you add the line after the following command in the script: unset INFA_TRUSTSTORE INFA_TRUSTSTORE_PASSWORD INFA_KEYSTORE INFA_KEYSTORE_PASSWORD
- c. For a Hortonworks cluster, open the YARN application and set the INFA_TRUSTSTORE=<the location where you copied the truststore and keystore files.> variable in the advanced YARN environment variables section, and restart YARN.
- d. For a Cloudera cluster, edit the yarn-env.sh script and set the INFA_TRUSTSTORE=<the location where you copied the truststore and keystore files.> variable, and restart YARN.
- 6. Restart the Catalog Service.
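The keytool export and import commands in steps 2 and 3 can be sketched as argument lists in Python, which avoids quoting mistakes when the commands are scripted. The helper names `export_cert_command` and `import_cert_command` and the example installation paths are hypothetical; the keytool options and file names are the ones shown in the steps above. The sketch only builds the commands and does not run them.

```python
from pathlib import PurePosixPath


def export_cert_command(infa_install, alias):
    """Build the keytool export command from step 2 (hypothetical helper)."""
    base = PurePosixPath(infa_install)
    security = base / "services" / "shared" / "security"
    return [
        str(base / "java" / "bin" / "keytool"),
        "-export",
        "-keystore", str(security / "infa_truststore.jks"),
        "-alias", alias,
        "-file", str(security / "certExportFromSSLDomainTruststore.cert"),
    ]


def import_cert_command(ldm_install, cert_file, new_alias, storepass):
    """Build the keytool import command from step 3 (hypothetical helper)."""
    base = PurePosixPath(ldm_install)
    security = base / "services" / "shared" / "security"
    return [
        str(base / "java" / "bin" / "keytool"),
        "-import",
        "-file", cert_file,
        "-alias", new_alias,  # use a new alias name, as the note in step 3 requires
        "-keystore", str(security / "infa_truststore.jks"),
        "-storepass", storepass,
    ]


if __name__ == "__main__":
    # /opt/informatica and /opt/ldm are placeholder installation directories.
    print(" ".join(export_cert_command("/opt/informatica", "pc_domain")))
    print(" ".join(import_cert_command(
        "/opt/ldm", "certExportFromSSLDomainTruststore.cert",
        "pc_domain_edc", "<truststore password>")))
```

To execute a command, you could pass the returned list to `subprocess.run(command, check=True)` on the node where the corresponding installation resides.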
Creating a Parameter File
A parameter file is a zip file that contains all the parameters and variables, and their associated values, configured for workflows, worklets, or sessions in the PowerCenter repository. If a PowerCenter source repository uses parameter files in sessions and workflows, you can configure Catalog Administrator to read the parameter files when you create the PowerCenter resource. PowerCenter parameters can represent flat file sources, flat file lookups, flat file targets, relational connections, expressions at the transformation level, or objects in SQL overrides. The PowerCenter resource parses the parameter files and substitutes the parameter values to extract metadata for the flat file sources, flat file lookups, flat file targets, relational connections, and objects in SQL overrides.
You can create a parameter file in a directory similar to a PowerCenter folder in any of the following ways:
- Create parameter files within a directory
- 1. Create a directory with the same name as the folder in the PowerCenter repository.
- 2. Create parameter files named <workflow name>.prm.
- 3. Place all the parameter definitions that are needed for the workflow and the corresponding sessions in the directory you created.
- 4. Zip the folder. The zip file can contain multiple folders containing parameter files.
- Create multiple parameter files
  Create parameter files named <PowerCenter folder name>.<workflow name>.prm. Each parameter file applies to the workflow after which it is named.
- Create a single parameter file
  Create a single parameter file named parameters.prm, and then zip the parameter file. The parameter file should contain the parameters that apply to all the workflows within the PowerCenter folders.
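The packaging step common to these approaches can be sketched with the Python standard library. The function name `build_parameter_zip` and the directory layout in the usage example are hypothetical; the sketch assumes that the folder structure inside the zip file should mirror the directories that hold the .prm files.

```python
from pathlib import Path
import zipfile


def build_parameter_zip(source_root, zip_path):
    """Zip every .prm file under source_root, keeping folder-relative paths.

    build_parameter_zip is a hypothetical helper, not part of any Informatica tool.
    Returns the list of archive member names that were written.
    """
    source_root = Path(source_root)
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for prm_file in sorted(source_root.rglob("*.prm")):
            # Store Map_Param/wf_demo.prm rather than the absolute path.
            member = prm_file.relative_to(source_root).as_posix()
            archive.write(prm_file, member)
            added.append(member)
    return added


if __name__ == "__main__":
    # "params" is a placeholder directory that contains one subdirectory
    # per PowerCenter folder, each holding <workflow name>.prm files.
    for name in build_parameter_zip("params", "parameters.zip"):
        print(name)
```

Files without the .prm extension are skipped, which matches the requirement that the resource reads parameter values only from .prm files.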
Parameter File Requirements
Use a text editor to create the parameter file. To enable the PowerCenter resource to read parameter values from a parameter file, the file must have a .prm extension. You group parameters and variables in different sections of the parameter file. Each section is preceded by a heading that identifies the folder, workflow, worklet, and session to which you want to pass parameter or variable values. You define parameters and variables directly below the heading, entering each parameter or variable on a new line.
The following table describes the headings that define each section in the parameter file and the scope of the parameters and variables that you define in each section:
| Heading | Scope |
|---|---|
| [Global] | All folders, workflows, worklets, and sessions. |
| [folder name.WF:workflow name] | The named workflow and all sessions within the workflow. |
| [folder name.WF:workflow name.WT:worklet name] | The named worklet and all sessions within the worklet. |
| [folder name.WF:workflow name.WT:worklet name.WT:worklet name...] | The nested worklet and all sessions within the nested worklet. |
| [folder name.WF:workflow name.ST:session name] -or- [folder name.WF:workflow name.WT:worklet name.ST:session name] -or- [folder name.WF:workflow name.WT:worklet name.WT:worklet name.ST:session name] -or- [folder name.session name] -or- [session name] | The named session. |
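To make the heading patterns concrete, the following Python sketch generates a heading string from its parts. The function `section_heading` and its argument names are hypothetical; the sketch covers only the heading forms listed in the table above and rejects combinations that the table does not describe.

```python
def section_heading(folder=None, workflow=None, worklets=(), session=None):
    """Return a parameter file heading such as [folder.WF:workflow.ST:session].

    section_heading is a hypothetical helper that follows the heading table.
    """
    if workflow is None:
        if worklets:
            raise ValueError("worklets require a workflow")
        if session is None:
            if folder is not None:
                raise ValueError("a folder name alone is not a listed heading")
            return "[Global]"
        # [folder name.session name] or [session name]
        return f"[{folder}.{session}]" if folder else f"[{session}]"

    if folder is None:
        raise ValueError("a workflow heading requires a folder name")

    parts = [f"{folder}.WF:{workflow}"]
    # Nested worklets are appended in order, each with the WT: prefix.
    parts.extend(f"WT:{worklet}" for worklet in worklets)
    if session is not None:
        parts.append(f"ST:{session}")
    return "[" + ".".join(parts) + "]"


if __name__ == "__main__":
    print(section_heading())
    print(section_heading("Param_session", "wf_session_param",
                          session="s_session_param"))
```

Generating headings from parts rather than typing them by hand keeps the WF:, WT:, and ST: prefixes and the enclosing brackets consistent.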
Sample Parameter File
The following example shows a sample parameters.prm file:
[Map_Param.WF:WF_Src_Tgt_map_param_case.ST:s_src_tgt_tbl_override_default_map_param]
$$Src_OwnName=MM_PERF6
$$Src_TblName=TBL_SAME_COL
$$Tgt_Tbl_Prefix=MM_PERF6
$$Tgt_TblName=INVENTORY_Q4_2005
[Param_lookup.WF:wf_M_LKP_schema_tble]
$$LKP_SCHEMA=TEST_DATA
$$LKP_TBL=LKP_TBL_PARAM
[Param_lookup.WF:wf_M_LKP_schema_tbl_sess_param]
$Param_Lkp_Schema=TEST_DATA
$Param_Lkp_Tbl=LKP_TBL_PARAM
[Param_session.WF:wf_session_param.ST:s_session_param]
$Param_Schema_Name=CROSS_RESOURCE_LINKING_DUP
$Param_SrcTbl_Name=SRC_TBL_NAME_OVERRIDE_PARAM
$Param_TgtTbl_Name=TGT_TBL_NAME_OVERRIDE_PARAM
[Param_Sql_override.WF:wf_M_schema_table_map_parm_sql.ST:s_M_schema_table_map_parm_sql]
$$Map_Schema_Name=CROSS_RESOURCE_LINKING
$$Map_Tbl_Name=SRC_TBL_NAME_OVERRIDE_DUP
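Before you zip a parameter file, it can help to check that every heading is well formed and every parameter sits under a heading. The following Python sketch is one way to do that. The function `parse_prm` is hypothetical, and the sketch assumes a simple format of bracketed headings followed by name=value lines; it does not model any other syntax that PowerCenter might accept.

```python
def parse_prm(text):
    """Parse parameter file text into {heading: {name: value}}.

    parse_prm is a hypothetical checker, not the parser that the resource uses.
    Raises ValueError for a line that is neither a heading nor name=value,
    or for a parameter that appears before any heading.
    """
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("[") and line.endswith("]"):
            # Start (or reopen) the section named by the heading.
            current = sections.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            name, value = line.split("=", 1)
            current[name.strip()] = value.strip()
        else:
            raise ValueError(f"unexpected line: {raw_line!r}")
    return sections


if __name__ == "__main__":
    sample = (
        "[Param_session.WF:wf_session_param.ST:s_session_param]\n"
        "$Param_Schema_Name=CROSS_RESOURCE_LINKING_DUP\n"
        "$Param_SrcTbl_Name=SRC_TBL_NAME_OVERRIDE_PARAM\n"
    )
    for heading, params in parse_prm(sample).items():
        print(heading, params)
```

A heading with a missing or misplaced bracket fails the check instead of being silently treated as a parameter line.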