Step 2. Configure capabilities

When you configure the AWS Glue catalog source, you define the settings for the metadata extraction capability.
The metadata extraction capability extracts source metadata from external source systems. You can also configure other capabilities that the catalog source includes.
You can save the catalog source configuration at any point after you enter the connection information. After you save the catalog source, you can choose to run the catalog source job. To run the job once, click Run. To run metadata extraction and other capabilities on a recurring schedule, configure schedules on the Schedule tab.

Configure metadata extraction

When you configure the AWS Glue catalog source, you choose a runtime environment, define filters, and enter configuration parameters for metadata extraction.
    1. In the Connection and Runtime area, choose a serverless runtime environment or the Secure Agent group where you want to run catalog source jobs.
    Note:
    Serverless runtime environment options are available if the catalog source works with a serverless runtime environment.
    2. In Metadata Change Option, choose whether to retain, delete, or deprecate catalog objects that are deleted from the source system.
    Note:
    You can also change the configured metadata change option when you run a catalog source.
    3. In the Filters area, define one or more filter conditions to apply for metadata extraction:
        a. From the Include or exclude metadata list, choose to include or exclude metadata based on the filter parameters.
        b. From the Object type list, select JobName to filter the metadata based on the job names of the AWS Glue jobs.
        c. Enter a value to specify the object location.
        Filter values can contain wildcards, such as a question mark to represent a single character and an asterisk to represent multiple characters.
        Image: The Filters area for the AWS Glue catalog source. You can choose to include or exclude metadata from AWS Glue jobs or from Amazon Athena source objects and enter a value to specify the object location.
        d. Optionally, to define an additional filter with an OR condition, click the Add icon.
        Image: The filter conditions for an AWS Glue catalog source that includes metadata from AWS Glue jobs.
        The filter includes metadata from AWS Glue jobs with names that start with CreateTableFromS followed by a single character, in addition to jobs with names that start with IntegrationJob.
    4. Optionally, in the Configuration Parameters area, enter properties to override default context values and job parameters.
    The following table describes the property that you enter for Catalog Source Configuration Options:

    | Property | Description |
    |---|---|
    | Runs Per Job | The number of job runs to fetch. Default is 1. If you don't specify a value, the catalog source job fetches the last job run. The catalog source job extracts only unique job instances that run successfully and differ in lineage. |

    The following table describes the properties that you enter for Amazon Athena Catalog Preload Filters:

    | Property | Description |
    |---|---|
    | Include filter | A list of filters to preload Amazon Athena catalog assets. Use the include filter to load a limited set of assets and optimize job time. A job processes an asset when it matches at least one include filter. |
    | Exclude filter | A list of filters to exclude Amazon Athena catalog assets. A job doesn't process an asset when it matches any of the exclude filters. |

    The following rules apply to both the include filter and the exclude filter:
    A filter value contains segments separated by periods. You can enter two wildcards in each segment. Use a question mark to represent a single character and an asterisk to represent multiple characters.
    The filter segments contain the Amazon Athena database name and the asset name, such as <Database name>.<Asset name>
    To configure a filter, click the Add icon and provide a value in the Value field.
    5. Optional. In the Configuration Parameters area, enter additional settings.
    Note:
    The Additional Settings section appears when you click Show Advanced.
    The following table describes the property that you enter for additional settings:

    | Property | Description |
    |---|---|
    | Expert Parameters | Enter additional configuration options to be passed at runtime. Required if you need to troubleshoot the catalog source job. |

    Caution:
    Use expert parameters when Informatica Global Customer Support recommends them.
    6. To configure additional capabilities for the catalog source, click the corresponding tabs.
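
The job name filters in step 3 can be illustrated with a short Python sketch. It does not show how Metadata Command Center evaluates filters internally. It assumes shell-style matching, where a question mark matches exactly one character and an asterisk matches any run of characters, and it combines the filters with OR. The patterns CreateTableFromS? and IntegrationJob* are hypothetical values chosen to mirror the filter described in step 3, and the job names are made up.

```python
from fnmatch import fnmatchcase

# Hypothetical include filters, combined with OR as in the Filters area.
INCLUDE_PATTERNS = ["CreateTableFromS?", "IntegrationJob*"]


def is_included(job_name, patterns=INCLUDE_PATTERNS):
    """Return True if the job name matches at least one include pattern."""
    return any(fnmatchcase(job_name, pattern) for pattern in patterns)


# Made-up AWS Glue job names, for illustration only.
jobs = [
    "CreateTableFromS3",
    "CreateTableFromS3Raw",
    "IntegrationJobDaily",
    "CleanupJob",
]

selected = [name for name in jobs if is_included(name)]
print(selected)
```

Under these assumptions, CreateTableFromS3Raw is not selected because the question mark stands for exactly one character. Note that fnmatchcase also treats square brackets as a character set, which the product filters might not do.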

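The Amazon Athena preload filters from step 4 follow a similar pattern, with one filter segment for the database name and one for the asset name. The following Python sketch shows one possible reading of the include and exclude rules. The filter values and asset names are hypothetical, and two behaviors are assumptions rather than documented facts: an empty include list is treated as "include everything", and a filter that does not have exactly two segments never matches.

```python
from fnmatch import fnmatchcase


def matches(filter_value, database, asset):
    """Match a '<Database name>.<Asset name>' filter segment by segment."""
    segments = filter_value.split(".")
    if len(segments) != 2:
        return False  # Assumption: only two-segment filters are valid.
    db_pattern, asset_pattern = segments
    return fnmatchcase(database, db_pattern) and fnmatchcase(asset, asset_pattern)


def should_process(database, asset, include=(), exclude=()):
    """Skip assets that match any exclude filter, then apply include filters."""
    if any(matches(f, database, asset) for f in exclude):
        return False
    if not include:
        return True  # Assumption: no include filters means no restriction.
    return any(matches(f, database, asset) for f in include)


# Hypothetical filter values.
include = ["sales_*.orders_20??", "finance.*"]
exclude = ["*.tmp_*"]

print(should_process("sales_eu", "orders_2024", include, exclude))
print(should_process("finance", "tmp_ledger", include, exclude))
print(should_process("hr", "employees", include, exclude))
```

In this sketch, the asset finance.tmp_ledger matches an include filter but is still skipped, because an asset that matches any exclude filter is not processed.
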
Configure lineage discovery

Enable the lineage discovery capability and use CLAIRE to build complete lineage by recommending endpoint catalog source objects to assign to reference catalog source connections.
    1. Click the Lineage Discovery tab.
    2. Select Enable Lineage Discovery.
    3. In the Filters area, define one or more filter conditions to apply for lineage discovery.
    To define filters, you can select catalog source types or asset groups, enter a catalog source name, or search a list of catalog sources.
        a. Select Yes to view filter options.
        The filter options appear. They include multiple filter conditions that you can choose.
        b. From the Include/Exclude list, choose to include or exclude catalog sources for lineage discovery based on the filter parameters.
        c. From the filter type list, select catalog source type, catalog source name, or asset group.
        d. In the filter value field, select the required catalog source types, or click the Search button and select catalog sources or asset groups.
        Filters can contain the asterisk wildcard to represent multiple characters or empty text.
        Note:
        You can't add more than one include or exclude filter for the same filter type.
        e. Optionally, to define an additional filter with an AND condition, click the Add icon.
    For more information about lineage discovery, see Lineage discovery.
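
The way lineage discovery filters combine can be sketched in Python. This is an illustration of the rules stated in this procedure, not the product's implementation: conditions are joined with AND, each condition either includes or excludes, and the asterisk stands for multiple characters or empty text. The catalog source names, types, and dictionary keys are hypothetical.

```python
import re


def wildcard_match(pattern, text):
    """Match text against a pattern where only '*' is a wildcard."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text) is not None


def is_candidate(source, conditions):
    """Return True if the source satisfies every condition (AND)."""
    for mode, field, patterns in conditions:
        hit = any(wildcard_match(p, source[field]) for p in patterns)
        # An include condition needs a hit; an exclude condition needs none.
        if (mode == "include") != hit:
            return False
    return True


# Hypothetical catalog sources.
sources = [
    {"name": "s3_prod", "type": "Amazon S3"},
    {"name": "s3_test", "type": "Amazon S3"},
    {"name": "oracle_prod", "type": "Oracle"},
]

# Hypothetical filter conditions: (mode, filter type, values).
conditions = [
    ("include", "type", ["Amazon S3"]),
    ("exclude", "name", ["*_test"]),
]

candidates = [s["name"] for s in sources if is_candidate(s, conditions)]
print(candidates)
```

With these made-up values, only s3_prod remains: s3_test is removed by the exclude condition, and oracle_prod fails the include condition on the catalog source type.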