When you configure the Talend Data Integration catalog source, you define the settings for the metadata extraction capability.
The metadata extraction capability extracts source metadata from external source systems. You can also configure other capabilities that the catalog source includes.
You can save the catalog source configuration at any point after you enter the connection information. After you save the catalog source, you can choose to run the catalog source job. To run the job once, click Run. To run metadata extraction and other capabilities on a recurring schedule, configure schedules on the Schedule tab.
Configure metadata extraction
When you configure the Talend Data Integration catalog source, you choose a runtime environment, define filters, and enter configuration parameters for metadata extraction.
Before you configure metadata extraction, configure runtime environments in the IDMC Administrator.
1. In the Connection and Runtime area, choose a serverless runtime environment or the Secure Agent group where you want to run catalog source jobs.
Note:
Serverless runtime environment options are available if the catalog source works with a serverless runtime environment.
2. Use the Metadata Change Option to choose whether the catalog retains, deletes, or deprecates objects that are deleted from the source system.
- Retain. Retains objects that are deleted from the source system in the catalog. If you update or add a filter, the catalog retains objects extracted from the previous job and extracts additional objects that match the current filter. Objects deleted from the source system are not deleted from the catalog. Enrichments added on deleted objects and relationships are retained.
- Delete. Deletes metadata from the catalog based on objects deleted from the source system and changes you make to the filter. Enrichments added on deleted objects and relationships are also permanently lost. Objects renamed in the source system are removed and recreated in the catalog.
- Deprecate. The lifecycle of objects imported into the catalog moves to Obsolete based on objects deleted from the source system and changes you make to the filter. This does not impact enrichments added on deprecated objects and relationships. Objects renamed in the source system are removed and recreated in the catalog. When you run the catalog source job again for other capabilities such as data classification, relationship discovery, or glossary association, the job doesn't consider obsolete objects. Obsolete objects remain in the catalog until they are purged when you run a Purge Obsolete Objects job on the Explore page.
Note:
You can also change the configured metadata change option when you run a catalog source.
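The three metadata change options can be compared with a small sketch. This is a simplified model of the behavior described above, not the product's implementation: the function name, the dictionary representation of the catalog, and the "Active" lifecycle label are assumptions made for illustration. Only "Obsolete" is a lifecycle state named in this topic.

```python
from typing import Dict, Set

ACTIVE = "Active"      # hypothetical label for objects still found in the source
OBSOLETE = "Obsolete"  # lifecycle state described for the Deprecate option


def apply_change_option(catalog: Dict[str, str], in_scope: Set[str], option: str) -> Dict[str, str]:
    """Model the catalog after one extraction run.

    catalog  maps object name -> lifecycle from the previous run.
    in_scope is the set of objects that exist in the source and match the current filter.
    option   is "retain", "delete", or "deprecate".
    """
    result = dict(catalog)
    # Every object found in this run is added or refreshed (assumed to be marked active).
    for name in in_scope:
        result[name] = ACTIVE

    # Objects known from the previous run that this run no longer finds.
    missing = set(catalog) - in_scope
    if option == "retain":
        pass  # missing objects stay in the catalog unchanged
    elif option == "delete":
        for name in missing:
            del result[name]  # removed from the catalog
    elif option == "deprecate":
        for name in missing:
            result[name] = OBSOLETE  # kept until a purge job removes them
    else:
        raise ValueError(f"unknown option: {option}")
    return result


previous = {"jobA": ACTIVE, "jobB": ACTIVE}
current = {"jobA", "jobC"}  # jobB was deleted from the source, jobC is new
for option in ("retain", "delete", "deprecate"):
    print(option, apply_change_option(previous, current, option))
```

In this model a renamed object appears as one missing name and one new name, which matches the "removed and recreated" behavior described for Delete and Deprecate.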
3. In the Filters area, define one or more filter conditions to apply for metadata extraction:
a. Select Yes to view filter options.
b. From the Include/Exclude list, choose to include or exclude metadata based on the filter parameters.
c. From the Object type list, select Path.
d. From the Filter criteria list, select Pattern.
e. Click Select.
f. In the Select values dialog box, enter the path and click OK.
Filters are not case sensitive. A single filter includes a path to a job or folder and can match multiple folders and jobs when you use wildcard characters. The root of the path starts in the Talend folder that contains the talend.project file.
Filters can contain the following wildcard characters:
▪ Question mark. Represents a single character.
▪ Asterisk. Represents multiple characters or empty text.
The following image shows the filter condition options:
g. To define an additional filter with an OR condition, click the plus icon.
The following image shows that the filter can include metadata from all jobs and folders within the folder 'myFolder' that start with the prefix 'data' or include metadata from all jobs within the folder 'myFolder' that start with the prefix 'Job' followed by a single character:
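The wildcard rules above can be approximated with a short sketch. It is an illustration only, not how the product evaluates filters. The function names, the use of '/' as the path separator, and the choice to let the asterisk match across folder separators are all assumptions. The two patterns mirror the 'myFolder' example described above.

```python
import re
from typing import Iterable


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a filter pattern into a case-insensitive regular expression.

    '?' matches exactly one character; '*' matches any run of characters,
    including none. All other characters are matched literally.
    """
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def matches_any(path: str, patterns: Iterable[str]) -> bool:
    """Return True if the path matches at least one pattern (OR condition)."""
    return any(wildcard_to_regex(p).fullmatch(path) for p in patterns)


# Hypothetical patterns for the two OR conditions in the example above.
filters = ["myFolder/data*", "myFolder/Job?"]

print(matches_any("myFolder/dataLoad", filters))  # True: starts with 'data'
print(matches_any("myfolder/JOB1", filters))      # True: filters ignore case
print(matches_any("myFolder/Job12", filters))     # False: '?' is one character only
```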
4. In the Configuration Parameters area, enter properties to override default context values and job parameters.
Note:
Click Show Advanced to view all configuration parameters.
The following table describes the properties that you can enter:
Property
Description
Default Values
Advanced parameter. Default values that stand in for the output of the tContextLoad and tFlowToIterate components and for the return values of unsupported Java methods when Java code is evaluated.
Specify default values in the following format:
- [Project.Job]
- tUniqueName.Column1=66
- tUniqueName.Column2="cards"
Examples for Java default values are:
- Math.function() = 0
- Routine.create(0) = 'text'
Job Context Override
Talend job and job context names to override the default context set in the *.item file.
Click plus to add Talend jobs and job context names.
Context Files Map
Map the Talend context file path to a local file path so Metadata Command Center can use local copies of context files.
Click plus to map context file paths to local file paths.
Talend job parameter names and values to override parameter values.
Specify a job parameter name using the job name and the parameter name separated by a colon. You can use wildcard characters '*' and '?' in the job name.
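The job parameter naming rule above can be illustrated with a sketch that looks up an override for a given job and parameter. The function names, the sample jobs and parameters, case-sensitive matching, and the first-match-wins rule are assumptions for illustration. The product may resolve overlapping entries differently.

```python
import re
from typing import Dict, Optional


def _job_regex(job_pattern: str) -> "re.Pattern[str]":
    """Translate a job name pattern: '?' is one character, '*' is any run of characters."""
    parts = []
    for ch in job_pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def resolve_override(overrides: Dict[str, str], job: str, param: str) -> Optional[str]:
    """Return the override value for job and param, or None if no entry applies.

    Each key has the form '<job name pattern>:<parameter name>'.
    Assumes that neither part contains a colon.
    """
    for key, value in overrides.items():
        job_pattern, sep, param_name = key.partition(":")
        if sep and param_name == param and _job_regex(job_pattern).fullmatch(job):
            return value  # first matching entry wins (assumption)
    return None


# Hypothetical override entries.
overrides = {"Load*:batchSize": "500", "Job?:region": "EU"}

print(resolve_override(overrides, "LoadCustomers", "batchSize"))  # 500
print(resolve_override(overrides, "Job1", "region"))              # EU
print(resolve_override(overrides, "Job12", "region"))             # None
```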
Configure lineage discovery
Enable the lineage discovery capability to let CLAIRE help build complete lineage. CLAIRE recommends endpoint catalog source objects that you can assign to reference catalog source connections.
1. Click the Lineage Discovery tab.
2. Select Enable Lineage Discovery.
3. In the Filters area, define one or more filter conditions to apply for lineage discovery.
To define filters, you can select catalog source types or asset groups, enter a catalog source name, or search a list of catalog sources.
a. Select Yes to view filter options.
The filter options appear.
b. From the Include/Exclude list, choose to include or exclude catalog sources for lineage discovery based on the filter parameters.
c. From the filter type list, select catalog source type, catalog source name, or asset group.
d. In the filter value field, select the required catalog source types, or click the Search button and select catalog sources or asset groups.
Filters can contain the asterisk wildcard to represent multiple characters or empty text.
Examples:
▪ To include or exclude all Oracle catalog sources, select Catalog Source Type as the filter type and select Oracle in the filter value field.
▪ To include or exclude the 'Oracle_Retail' catalog source, select Catalog Source Name as the filter type and search for the catalog source or enter Oracle_Retail in the filter value field.
▪ To include or exclude all catalog sources with names that start with 'Oracle', select Catalog Source Name as the filter type and search for the catalog source or enter Oracle* in the filter value field.
▪ To include or exclude all catalog sources with names that end with 'Retail', select Catalog Source Name as the filter type and search for the catalog source or enter *Retail in the filter value field.
▪ To include or exclude all catalog sources with names that contain 'Ret', select Catalog Source Name as the filter type and search for the catalog source or enter *Ret* in the filter value field.
▪ To include or exclude all catalog sources that are part of the 'Financial Group' asset group, select Asset Group as the filter type and search for Financial Group in the filter value field.
Note:
You can't add more than one include or exclude filter for the same filter type.
e. Optionally, to define an additional filter with an AND condition, click the Add icon.
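The lineage discovery filter rules above can be sketched as follows. This is an illustrative model, not the product's logic. The class names, field names, and filter type keys are hypothetical, matching is assumed to be case sensitive, and filters are combined with AND as described in the last step.

```python
import re
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class CatalogSource:
    name: str
    source_type: str
    asset_groups: Set[str] = field(default_factory=set)


@dataclass
class Filter:
    include: bool      # True for an include filter, False for an exclude filter
    filter_type: str   # "type", "name", or "asset_group" (hypothetical keys)
    values: List[str]  # filter values; each may contain the asterisk wildcard


def _star_match(pattern: str, text: str) -> bool:
    """Match text against a pattern where '*' is any run of characters or empty text."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text) is not None


def _matches(f: Filter, src: CatalogSource) -> bool:
    """Return True if any filter value matches the relevant attribute of the source."""
    if f.filter_type == "type":
        candidates = {src.source_type}
    elif f.filter_type == "name":
        candidates = {src.name}
    elif f.filter_type == "asset_group":
        candidates = src.asset_groups
    else:
        raise ValueError(f"unknown filter type: {f.filter_type}")
    return any(_star_match(v, c) for v in f.values for c in candidates)


def selected(src: CatalogSource, filters: List[Filter]) -> bool:
    """A source is selected only if it satisfies every filter (AND condition)."""
    return all(_matches(f, src) == f.include for f in filters)


retail = CatalogSource("Oracle_Retail", "Oracle", {"Financial Group"})
hr = CatalogSource("Oracle_HR", "Oracle")

# Include all Oracle catalog sources AND exclude names that end with 'Retail'.
filters = [Filter(True, "type", ["Oracle"]), Filter(False, "name", ["*Retail"])]
print(selected(retail, filters))  # False
print(selected(hr, filters))      # True
```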
For more information about lineage discovery, see Lineage discovery.