Linking catalog sources

Select source and target catalog sources and schemas to link and generate lineage.
Generate automated lineage with CLAIRE or define rules to use name-based matching or construct an inclusion rule with expressions. Save and run the configuration to start a lineage generation job.

Step 1. Register general information

Provide general information about the configuration on the Registration tab.
    1. In Metadata Command Center, go to the Configure page.
    2. Select the Lineage tab and then select the Link Catalog Sources (Preview) tab.
    3. Click the Add icon.
    The Link Catalog Sources page appears.
    The following image shows the Registration tab of the Link Catalog Sources page:
    The screenshot shows the Informatica Intelligent Cloud Services selection screen with the Metadata Command Center highlighted.
    4. In the General Information area, enter a name and an optional description for the configuration.
    5. Click Next.
    The Configuration tab appears.

Step 2. Configure source and target catalog sources

Select source and target catalog sources and schemas on the Configuration tab.
    1. In the Source Catalog Source area of the Configuration tab, select the source catalog source from which you want to link and generate lineage.
    The following image shows the Configuration tab of the Link Catalog Sources page:
    The Select Catalog Source dialog box appears.
    The following image shows the Select Catalog Source dialog box:
    2. Choose a source catalog source. The overview and related assets of the catalog source appear in the preview pane.
    You can filter the list based on the catalog source type and name.
    The following image shows a selected source catalog source on the Select Catalog Source dialog box:
    3. Click Select to select the source catalog source.
    4. Select a schema of the source catalog source.
    The following image shows a selected schema of the source catalog source on the Select Schema dialog box:
    5. Click Select to select the schema.
    6. In the Target Catalog Source area, select the target catalog source and schema to which you want to link and generate lineage.
    The following image shows selected source and target catalog sources and schemas on the Configuration tab:
    7. Click Next.
    The Rule Definition tab appears.

Step 3. Perform rule-based or automated linking, save, and run the configuration

Generate automated lineage with CLAIRE or define rules to use name-based matching or construct an inclusion rule with expressions on the Linking Method tab.
    1. On the Linking Method tab, choose either to generate automated lineage with CLAIRE or to define rules that generate catalog source links between assets of the source and target catalog sources.
    The following image shows the Linking Method tab of the Link Catalog Sources page:
    2. To refresh catalog source links whenever the source or target catalog source job is run, click Refresh Lineage.
    3. Choose the linking method.
    4. If you choose the Automated Linking option, you can either automatically accept CLAIRE-generated lineage recommendations or manually accept them.
    The following image shows the Linking Method tab with the Automated Linking option selected:
    The following table describes the properties that you can enter for automated linking:
    Property | Description
    Enable auto-acceptance | Select to automatically accept CLAIRE-generated lineage recommendations. If disabled, you must manually accept the lineage recommendations.
    Confidence Score Threshold for Auto-Acceptance | If you enable auto-acceptance, specify the threshold above which CLAIRE-generated lineage recommendations are automatically accepted. Specify a percentage from 80 to 100. If the confidence score of the catalog source links generated between a source and target asset is higher than the configured threshold, the recommended links are automatically accepted. Default is 95%.
    Stakeholders of the source and target catalog sources can reject the auto-accepted and manually accepted catalog source links generated by CLAIRE in Data Governance and Catalog.
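The auto-acceptance properties above can be read as a simple decision rule. The following Python sketch only illustrates that reading and is not product code: the `Recommendation` class and the `auto_accepted` function are hypothetical names, and the strict "higher than" comparison follows the wording of the table, which does not say how a score exactly equal to the threshold is treated.

```python
from dataclasses import dataclass

# Limits and default taken from the property description above.
MIN_THRESHOLD = 80
MAX_THRESHOLD = 100
DEFAULT_THRESHOLD = 95


@dataclass
class Recommendation:
    """Hypothetical stand-in for one CLAIRE-generated link recommendation."""
    source_asset: str
    target_asset: str
    confidence: float  # confidence score as a percentage, 0 to 100


def auto_accepted(rec, auto_accept_enabled, threshold=DEFAULT_THRESHOLD):
    """Return True if the recommendation would be accepted without review."""
    if not MIN_THRESHOLD <= threshold <= MAX_THRESHOLD:
        raise ValueError("threshold must be a percentage from 80 to 100")
    if not auto_accept_enabled:
        # With auto-acceptance disabled, every recommendation needs manual acceptance.
        return False
    # Assumption: "higher than" is read as a strict comparison.
    return rec.confidence > threshold
```

Under this reading, raising the threshold toward 100 sends more recommendations to manual review, while the default of 95 accepts only high-confidence links automatically.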
    5. If you choose the Rule-based Linking option, choose the rule type.
    6. If you choose the Name Matching rule type, select the asset types for which you want to specify prefix and suffix strings to ignore.
    The following image shows the Linking Method tab with the Name Matching rule type selected:
    The following table describes the properties that you can enter for name matching:
    Property | Description
    Source Data Set - Ignore Prefix | Specify the prefix of source data set names to ignore and match the rest of the source data set names with target data set names.
    Source Data Set - Ignore Suffix | Specify the suffix of source data set names to ignore and match the rest of the source data set names with target data set names.
    Target Data Set - Ignore Prefix | Specify the prefix of target data set names to ignore and match the rest of the target data set names with source data set names.
    Target Data Set - Ignore Suffix | Specify the suffix of target data set names to ignore and match the rest of the target data set names with source data set names.
    Source Data Element - Ignore Prefix | Specify the prefix of source data element names to ignore and match the rest of the source data element names with target data element names.
    Source Data Element - Ignore Suffix | Specify the suffix of source data element names to ignore and match the rest of the source data element names with target data element names.
    Target Data Element - Ignore Prefix | Specify the prefix of target data element names to ignore and match the rest of the target data element names with source data element names.
    Target Data Element - Ignore Suffix | Specify the suffix of target data element names to ignore and match the rest of the target data element names with source data element names.
    Prefixes and suffixes that you specify can contain alphanumeric characters, underscore (_), and hyphen (-).
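    Example, using illustrative names: if you enter STG_ in Source Data Set - Ignore Prefix, the source data set STG_CUSTOMER is matched against target data sets as CUSTOMER.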
    Note: If you don't select an asset type, you can't enter a prefix or suffix. In such cases, the lineage generation job searches for and matches exact source and target asset names.
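The name matching behavior described above can be sketched in a few lines of Python. This is an illustration of the described logic under stated assumptions, not the product's implementation: `strip_affixes` and `names_match` are hypothetical helpers, the asset names are made up, and the comparison is assumed to be case-sensitive because the source does not say otherwise.

```python
def strip_affixes(name, prefix="", suffix=""):
    """Remove the configured prefix and suffix from a name, if present."""
    if prefix and name.startswith(prefix):
        name = name[len(prefix):]
    if suffix and name.endswith(suffix):
        name = name[:-len(suffix)]
    return name


def names_match(source_name, target_name,
                src_prefix="", src_suffix="",
                tgt_prefix="", tgt_suffix=""):
    """Compare what is left of both names after ignoring the configured affixes.

    With no prefix or suffix configured, this reduces to an exact name match,
    which corresponds to the case where no asset type is selected.
    """
    source_core = strip_affixes(source_name, src_prefix, src_suffix)
    target_core = strip_affixes(target_name, tgt_prefix, tgt_suffix)
    return source_core == target_core


# Illustrative usage with made-up names.
print(names_match("STG_CUSTOMER", "CUSTOMER_V", src_prefix="STG_", tgt_suffix="_V"))
print(names_match("STG_CUSTOMER", "CUSTOMER"))
```

The first call compares CUSTOMER with CUSTOMER after both affixes are ignored. The second call has no affixes configured, so the full names are compared as they are.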
    7. If you choose the Expression rule type, construct an inclusion rule using expressions.
    The following image shows the Linking Method tab with the Expression rule type selected:
    You can use a combination of attributes, operators, functions, and comments to define an inclusion rule. You can type your expressions directly and view autocompleted suggestions as you type your expression in the editor. Expressions are created using a Spark SQL-based language. Expression values cannot exceed 5000 characters.
    You can use the following components to construct an inclusion rule:
    Example of a valid inclusion rule:
    srcDataElement.name == tgtDataElement.name and srcDataSet.name == tgtDataSet.name
    /* The source data element name must be the same as the target data element name, and the source data set name must be the same as the target data set name. */
    Important: Construct expressions with both data sets and data elements to avoid generating unnecessary catalog source links.
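To see why the inclusion rule should constrain both data sets and data elements, the example rule can be mirrored as a plain Python predicate over candidate source and target pairs. This is only a sketch: the product evaluates the rule in its own Spark SQL-based expression language, and the `Element` type, the rule functions, and the sample names here are hypothetical.

```python
from itertools import product
from typing import NamedTuple


class Element(NamedTuple):
    """Hypothetical data element, identified by its data set and its own name."""
    data_set: str
    name: str


def full_rule(src, tgt):
    # Mirrors: srcDataElement.name == tgtDataElement.name
    #          and srcDataSet.name == tgtDataSet.name
    return src.name == tgt.name and src.data_set == tgt.data_set


def element_only_rule(src, tgt):
    # Same rule with the data set condition left out.
    return src.name == tgt.name


def candidate_links(sources, targets, rule):
    """Return every (source, target) pair that the rule includes."""
    return [(s, t) for s, t in product(sources, targets) if rule(s, t)]


sources = [Element("CUSTOMER", "ID"), Element("ORDERS", "ID")]
targets = [Element("CUSTOMER", "ID"), Element("ORDERS", "ID")]

print(len(candidate_links(sources, targets, full_rule)))          # 2
print(len(candidate_links(sources, targets, element_only_rule)))  # 4
```

With only the data element condition, every ID column links to every other ID column, including pairs such as CUSTOMER.ID and ORDERS.ID. Adding the data set condition keeps only the two intended links.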
    8. Click Validate to validate your expression.
    If the validation is successful, a success message appears.
    9. To save and run the configuration, click Save, and then click Run.
    A Lineage Generation job is created to link catalog sources and to generate catalog source links. Check the status of the job on the Monitor page.