Linking catalog sources

Select source and target catalog sources and schemas to link and generate lineage.
Generate automated lineage with CLAIRE or define rules to use name-based matching or construct an inclusion rule with expressions. Save and run the configuration to start a lineage generation job.

Step 1. Register general information

Provide general information about the configuration on the Registration tab.
    1. In Metadata Command Center, go to the Configure page.
    2. Select the Lineage tab, and then select the Link Catalog Sources tab.
    3. Click the Add icon.
    The Registration tab of the Link Catalog Sources page appears.
    4. In the General Information area, enter a name and an optional description for the configuration.
    5. Click Next.
    The Configuration tab appears.

Step 2. Configure source and target catalog sources

Select source and target catalog sources on the Configuration tab.
    1. In the Source Catalog Source area of the Configuration tab, select the source catalog source from which you want to link and generate lineage.
    The Select Catalog Source dialog box appears.
    The following image shows the Select Catalog Source dialog box:
    The screenshot shows the Informatica Intelligent Cloud Services selection screen with the Metadata Command Center highlighted.
    2. Choose a source catalog source and click Select.
    The overview and related assets of the catalog source appear on the preview pane.
    You can filter the list based on the catalog source type and name.
    The following image shows a selected source catalog source on the Select Catalog Source dialog box:
    3. Choose one of the following options from which you want to link and generate lineage:
    The following image shows a selected schema of the source catalog source on the Select Schema dialog box:
    4. Optional. In the Filters area, define one or more filters to apply.
    If you selected a relational database-based catalog source, perform the following steps:
    a. From the Include or Exclude metadata list, choose to include or exclude metadata based on the filter parameters.
    b. From the Object type list, select All, Tables, or Views.
    c. Enter a value to specify the object location.
      Filters can contain the following wildcards:
      For object hierarchies, use a dot as a separator. Enclose filter values in double quotes if you use a space or a dot in a single segment.
      The following image shows the filter condition options:
      The image shows the filter conditions for linking catalog sources if you select a relational catalog source. It shows 'All', 'Tables', and 'Views' as the object types.
      For example:
    d. To define an additional filter with an OR condition, click the Add icon.
    If you selected a file system-based catalog source, perform the following steps:
    a. From the Include or Exclude metadata list, choose to include or exclude metadata based on the filter parameters.
    b. Enter a value to specify the object location.
      Filters are case-insensitive.
      Filters can contain an asterisk (*) as a wildcard to represent multiple characters.
      Use the following rules when you enter filter values:
      The following image shows the filter condition options:
      The image shows the filter conditions for linking catalog sources if you select a file system-based catalog source. It shows 'Path' as the object type.
      Path filters apply to the files and folders in the path that you filter. The path filter is non-recursive. If you provide only file or folder names, the path filters apply to the first-level files or directories.
      For example:
    c. To define an additional filter with an OR condition, click the Add icon.
    Note: If you add a filter that includes metadata from all objects, or if you don't add a filter, Metadata Command Center generates additional lineage for a few objects. These objects might include parameter containers, result sets, stages, and other objects that belong to the core.DataSet super class within the metadata model.
    5. In the Target Catalog Source area, select a target catalog source and schema or root directory to which you want to link and generate lineage. Optionally, you can add a filter.
    6. Click Next.
    The Rule Definition tab appears.
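
As a rough illustration of the separator and quoting rule for relational filter values in step 4, the following Python sketch splits a filter value into hierarchy segments. The function name split_filter_value and the sample values (SALES, "Order Items", "my.db") are hypothetical. The sketch only mirrors the rule stated above (a dot separates segments, and double quotes enclose a segment that contains a space or a dot); it is an assumption about how such a value could be read, not a description of how Metadata Command Center parses filters.

```python
def split_filter_value(value: str) -> list[str]:
    """Split a filter value on dots, keeping double-quoted segments whole.

    Hypothetical helper for illustration only.
    """
    segments = []
    current = []
    in_quotes = False
    for ch in value:
        if ch == '"':
            # Quotes delimit a segment; they are not part of the segment name.
            in_quotes = not in_quotes
        elif ch == "." and not in_quotes:
            # An unquoted dot ends the current segment.
            segments.append("".join(current))
            current = []
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError("unbalanced double quote in filter value")
    segments.append("".join(current))
    return segments


print(split_filter_value('SALES."Order Items"'))   # ['SALES', 'Order Items']
print(split_filter_value('"my.db".SALES.ORDERS'))  # ['my.db', 'SALES', 'ORDERS']
```

Without the quotes, the dot inside my.db would be read as a separator and the value would yield four segments instead of three.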

Step 3. Perform rule-based or automated linking, save, and run the configuration

Generate automated lineage with CLAIRE or define rules to use name-based matching or construct an inclusion rule with expressions on the Linking Method tab.
    1. On the Linking Method tab, choose to either generate automated lineage with CLAIRE or define rules to generate catalog source links between assets of the source and target catalog sources.
    2. To refresh catalog source links whenever the source or target catalog source job runs, click Refresh Lineage.
    3. Choose one of the following linking methods:
    4. If you choose the Automated Linking option, you can either automatically accept CLAIRE-generated lineage recommendations or manually accept them.
    The following table describes the properties that you can enter for automated linking:
    Property | Description
    Enable auto-acceptance | Select to automatically accept CLAIRE-generated lineage recommendations. If disabled, you must manually accept the lineage recommendations.
    Confidence Score Threshold for Auto-Acceptance | If you enable auto-acceptance, specify a threshold limit based on which the CLAIRE-generated lineage recommendations are automatically accepted. Specify a percentage from 80 to 100. If the confidence score of the catalog source links generated between a source and target asset is higher than the configured threshold limit, the recommended links are automatically accepted. Default is 95%.
    Stakeholders of the source and target catalog sources can reject the auto-accepted and manually accepted catalog source links generated by CLAIRE in Data Governance and Catalog.
    5. If you choose the Rule-based Linking option, choose the rule type.
    6. If you choose the Name Matching rule type, select the asset types for which you want to specify prefix and suffix strings to ignore.
    The following table describes the properties that you can enter for name matching:
    Property | Description
    Source Data Set - Ignore Prefix | Specify the prefix of source data set names to ignore and match the rest of the source data set names with target data set names.
    Source Data Set - Ignore Suffix | Specify the suffix of source data set names to ignore and match the rest of the source data set names with target data set names.
    Target Data Set - Ignore Prefix | Specify the prefix of target data set names to ignore and match the rest of the target data set names with source data set names.
    Target Data Set - Ignore Suffix | Specify the suffix of target data set names to ignore and match the rest of the target data set names with source data set names.
    Source Data Element - Ignore Prefix | Specify the prefix of source data element names to ignore and match the rest of the source data element names with target data element names.
    Source Data Element - Ignore Suffix | Specify the suffix of source data element names to ignore and match the rest of the source data element names with target data element names.
    Target Data Element - Ignore Prefix | Specify the prefix of target data element names to ignore and match the rest of the target data element names with source data element names.
    Target Data Element - Ignore Suffix | Specify the suffix of target data element names to ignore and match the rest of the target data element names with source data element names.
    Prefixes and suffixes that you specify can contain alphanumeric characters, underscore (_), and hyphen (-).
    For example:
    Note: If you don't select an asset type, you can't enter a prefix or suffix. In such cases, the lineage generation job searches for and matches exact source and target asset names.
    7. If you choose the Expression rule type, construct an inclusion rule using expressions.
    You can use a combination of attributes, operators, functions, and comments to define an inclusion rule. You can type your expressions directly and view autocompleted suggestions as you enter your expression in the editor. Expressions are created using a Spark SQL-based language. Expression values cannot exceed 5000 characters.
    You can use the following components to construct an inclusion rule:
    Example of a valid inclusion rule:
    srcDataElement.name == tgtDataElement.name and srcDataSet.name == tgtDataSet.name
    /* The source data element name must be the same as the target data element name, and the source data set name must be the same as the target data set name. */
    Important: Construct expressions with both data sets and data elements to avoid generating unnecessary catalog source links.
    8. Click Validate to validate your expression.
    If the validation is successful, a success message appears.
    9. To save and run the configuration, click Save and then Run.
    A Lineage Generation job is created to link catalog sources and to generate catalog source links. Check the status of the job on the Monitor page.
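
The two rule types in steps 6 and 7 can be sketched in Python to show what each one compares. This is a minimal sketch under stated assumptions, not the product's matching logic: the Asset class, the function names, and the sample names (STG_CUSTOMER, CUSTOMER_DIM, ORDERS, CUSTOMERS, ID) are hypothetical, and the comparison is assumed to be case-sensitive. The inclusion_rule function restates the example expression shown in step 7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    data_set: str      # hypothetical: for example, a table name
    data_element: str  # hypothetical: for example, a column name


def strip_affixes(name: str, prefix: str = "", suffix: str = "") -> str:
    """Drop an ignored prefix and suffix, if present, before comparing names."""
    if prefix and name.startswith(prefix):
        name = name[len(prefix):]
    if suffix and name.endswith(suffix):
        name = name[: len(name) - len(suffix)]
    return name


def name_match(src: str, tgt: str, src_prefix: str = "", src_suffix: str = "",
               tgt_prefix: str = "", tgt_suffix: str = "") -> bool:
    """Name Matching sketch: compare what remains after ignoring the affixes."""
    return (strip_affixes(src, src_prefix, src_suffix)
            == strip_affixes(tgt, tgt_prefix, tgt_suffix))


def inclusion_rule(src: Asset, tgt: Asset) -> bool:
    """Expression sketch: data element names and data set names must both match."""
    return src.data_element == tgt.data_element and src.data_set == tgt.data_set


# With no prefix or suffix configured, only exact names match.
print(name_match("STG_CUSTOMER", "CUSTOMER_DIM"))  # False
print(name_match("STG_CUSTOMER", "CUSTOMER_DIM",
                 src_prefix="STG_", tgt_suffix="_DIM"))  # True

# Matching on data elements alone links every ID column to every other ID column.
assets = [Asset("ORDERS", "ID"), Asset("CUSTOMERS", "ID")]
element_only = [(s, t) for s in assets for t in assets
                if s.data_element == t.data_element]
both = [(s, t) for s in assets for t in assets if inclusion_rule(s, t)]
print(len(element_only), len(both))  # 4 2
```

The last comparison illustrates the Important note in step 7: a rule that checks only data element names produces four links for two tables that each have an ID column, while a rule that also checks data set names produces two.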