When you configure the SAP SuccessFactors catalog source, you define the settings for the metadata extraction capability.
The metadata extraction capability extracts source metadata from external source systems. You can also configure other capabilities that the catalog source includes.
You can save the catalog source configuration at any point after you enter the connection information. After you save the catalog source, you can choose to run the catalog source job. To run the job once, click Run. To run metadata extraction and other capabilities on a recurring schedule, configure schedules on the Schedule tab.
Configure metadata extraction
When you configure the SAP SuccessFactors catalog source, you choose a runtime environment, define filters, and enter configuration parameters for metadata extraction.
1. In the Connection and Runtime area, choose a serverless runtime environment or the Secure Agent group where you want to run catalog source jobs.
Note:
Serverless runtime environment options are available if the catalog source works with a serverless runtime environment.
2. From the Metadata Change Option list, choose whether to retain, delete, or deprecate catalog objects that are deleted from the source system.
- Retain. Retains objects that are deleted from the source system in the catalog. If you update or add a filter, the catalog retains objects extracted from the previous job and extracts additional objects that match the current filter. Objects deleted from the source system are not deleted from the catalog. Enrichments added on deleted objects and relationships are retained.
- Delete. Deletes metadata from the catalog based on objects deleted from the source system and changes you make to the filter. Enrichments added on deleted objects and relationships are also permanently lost. Objects renamed in the source system are removed and recreated in the catalog.
- Deprecate. Moves the lifecycle of objects in the catalog to Obsolete based on objects deleted from the source system and changes you make to the filter. Enrichments added on deprecated objects and relationships are not affected. Objects renamed in the source system are removed and recreated in the catalog. When you run the catalog source job again for other capabilities such as data classification, relationship discovery, or glossary association, the job doesn't consider obsolete objects. Obsolete objects remain in the catalog until you purge them by running a Purge Obsolete Objects job on the Explore page.
Note:
You can also change the configured metadata change option when you run a catalog source.
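The three options can be easier to compare side by side in code. The following Python sketch is only an illustrative model, not the product's implementation. The function name reconcile, the dictionary-based catalog, and the Active lifecycle label are assumptions made for the example. Only the Obsolete lifecycle and the Retain, Delete, and Deprecate behaviors come from the descriptions above. Enrichments are not modeled.

```python
from enum import Enum


class ChangeOption(Enum):
    RETAIN = "Retain"
    DELETE = "Delete"
    DEPRECATE = "Deprecate"


def reconcile(catalog, extracted, option):
    """Return the catalog state after one extraction run (illustrative model).

    catalog:   dict of object name -> lifecycle label ("Active" is an assumed
               label; "Obsolete" is the label named in the documentation)
    extracted: set of object names that the current run found in the source
    option:    the selected ChangeOption
    """
    result = dict(catalog)

    # Objects that are new to the catalog are added. Existing entries keep
    # their current lifecycle (an assumption; the text does not cover this).
    for name in extracted:
        result.setdefault(name, "Active")

    # Objects that the catalog knows but the run no longer finds.
    missing = set(catalog) - set(extracted)
    for name in missing:
        if option is ChangeOption.DELETE:
            del result[name]              # removed from the catalog
        elif option is ChangeOption.DEPRECATE:
            result[name] = "Obsolete"     # kept until purged
        # RETAIN: the object stays in the catalog unchanged
    return result


# Hypothetical data: JobInfo was renamed to JobInformation in the source.
catalog = {"Employee": "Active", "JobInfo": "Active"}
extracted = {"Employee", "JobInformation"}

for option in ChangeOption:
    print(option.value, reconcile(catalog, extracted, option))
```

In this model, a rename shows up as a missing old name plus a newly extracted name, which mirrors the statement that renamed objects are removed and recreated under the Delete and Deprecate options.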
3. In the Filters area, define one or more filter conditions to apply for metadata extraction:
a. From the Include/Exclude list, choose to include or exclude metadata based on the filter parameters.
b. From the Object type list, select Table, View, or All.
c. Enter the filter values.
Filter values can contain asterisks (*). An asterisk is a wildcard that represents multiple characters.
The following image shows the filter condition options:
d. To define an additional filter with an OR condition, click the Add icon.
The following image shows a sample filter added:
4. Optional. In the Configuration Parameters area, enter additional settings.
Note:
The Additional Settings section appears when you click Show Advanced.
The following table describes the property that you enter for additional settings:
Property
Description
Expert Parameters
Enter additional configuration options to be passed at runtime. Required if you need to troubleshoot the catalog source job.
Caution:
Use expert parameters only when Informatica Global Customer Support recommends them.
5. To configure additional capabilities for the catalog source, click the corresponding tabs.
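The filter conditions from step 3 can be reasoned about with a small model. The following Python sketch is a hedged illustration, not the product's matching logic. The names FilterCondition and is_extracted are hypothetical. The sketch also assumes that matching is case-sensitive, that an asterisk can match zero or more characters, and that an object is extracted when it matches at least one Include condition and no Exclude condition. The documentation states only that filters can contain asterisks and that additional filters are combined with OR.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterCondition:
    include: bool      # True for Include, False for Exclude
    object_type: str   # "Table", "View", or "All"
    value: str         # filter value, may contain asterisks


def _matches(cond, obj_type, name):
    """Check one condition against one object (assumed semantics)."""
    if cond.object_type != "All" and cond.object_type != obj_type:
        return False
    # Translate only the asterisk into a regex wildcard; every other
    # character is matched literally.
    pattern = ".*".join(re.escape(part) for part in cond.value.split("*"))
    return re.fullmatch(pattern, name) is not None


def is_extracted(conditions, obj_type, name):
    """Combine conditions with OR; Exclude wins over Include (assumption)."""
    includes = [c for c in conditions if c.include]
    excludes = [c for c in conditions if not c.include]
    if any(_matches(c, obj_type, name) for c in excludes):
        return False
    return not includes or any(_matches(c, obj_type, name) for c in includes)


# Hypothetical filter set: tables that start with "Benefit" OR views that
# end with "Detail".
conditions = [
    FilterCondition(True, "Table", "Benefit*"),
    FilterCondition(True, "View", "*Detail"),
]
print(is_extracted(conditions, "Table", "BenefitProgramEnrollmentDetail"))
print(is_extracted(conditions, "Table", "EmployeeDetail"))
```

With these assumptions, the first call reports that the table is extracted and the second reports that it is not, because the only condition that matches names ending in Detail applies to views.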
Configure data profiling and quality
Enable the data profiling capability to evaluate the quality of metadata extracted from the SAP SuccessFactors source system.
You can run data profiling and quality capabilities on SAP SuccessFactors using data integration.
1. Click the Data Profiling and Quality tab.
2. Expand Data Profiling and select Enable Data Profiling.
Note:
Ensure that you have permissions on all the staging connections that you use in your data profiling configuration. You can't run the job if you don't have permissions on the connections that you use. Select connections that you have access to, or ask the administrator to grant the necessary permissions on the connections that you want to use.
3. In the Connection and Runtime area, choose the Secure Agent group where you want to run catalog source jobs.
4. Optionally, specify data profiling filters to run the profile on a subset of the metadata that you extract.
a. Select Yes to view filter options.
b. From the Include or Exclude metadata list, choose to include or exclude metadata based on the filter parameters.
c. From the Object type list, select Table.
d. Enter the name of the object as the filter value.
Example:
You extracted metadata of all tables from a schema and now you want to profile a specific table. Select Table from the Object type list and then enter the table name in the input field. For example, the filter value BenefitProgramEnrollmentDetail includes or excludes the table named BenefitProgramEnrollmentDetail.
To include or exclude multiple objects, click the Add icon to add filters with the OR condition.
5. In the Parameters area, configure the parameters.
The following table describes the parameters that you can enter:
Parameter
Description
Modes of Run
Determine the type of data that you want the data profiling task to collect.
Choose one of the following options:
- Keep signatures only. Collects only aggregate information such as data types, average, standard deviation, and patterns.
- Keep signatures and values. Collects both signatures and data values.
Profiling Scope
Determine whether you want to run data profiling only on the changes made to the source system or on the entire source system.
Choose one of the following options:
- Incremental. Includes only source metadata that is changed or updated since the last profile run.
- Full. Includes the entire metadata that is extracted based on the filters applied for extraction.
Sampling Type
Determine the sample rows on which you want to run the data profiling task.
Choose one of the following options:
- All Rows. Runs data profiling on all rows in the metadata.
- Limit N Rows. Runs data profiling on a limited number of rows.
No of rows to limit
Required if you select Limit N Rows in Sampling Type. Specify the number of rows on which you want to run data profiling.
Maximum Precision of String Fields
The maximum precision value for profiles on string data type. You can set a maximum precision value of 255 characters. Default is 50.
Text Qualifier
The character that defines string boundaries. If you select a quote character, profiling ignores delimiters within the quotes. Select a qualifier from the list. Default is Double Quote.
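The constraints in the parameter table can be collected into a quick pre-check before you enter values in the user interface. The following Python sketch is an assumption-laden illustration: the class ProfilingParameters and its field names are hypothetical and do not correspond to any product API. The option names, the 255-character maximum, the default precision of 50, and the Double Quote default come from the table. The requirement that numeric values be positive is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

RUN_MODES = {"Keep signatures only", "Keep signatures and values"}
SCOPES = {"Incremental", "Full"}
SAMPLING_TYPES = {"All Rows", "Limit N Rows"}


@dataclass
class ProfilingParameters:
    # The table states no defaults for these three, so they are required here.
    run_mode: str
    scope: str
    sampling_type: str
    row_limit: Optional[int] = None       # "No of rows to limit"
    max_string_precision: int = 50        # documented default
    text_qualifier: str = "Double Quote"  # documented default

    def __post_init__(self):
        if self.run_mode not in RUN_MODES:
            raise ValueError(f"Unknown Modes of Run value: {self.run_mode}")
        if self.scope not in SCOPES:
            raise ValueError(f"Unknown Profiling Scope value: {self.scope}")
        if self.sampling_type not in SAMPLING_TYPES:
            raise ValueError(f"Unknown Sampling Type value: {self.sampling_type}")
        # The row limit is required only for Limit N Rows.
        if self.sampling_type == "Limit N Rows":
            if self.row_limit is None or self.row_limit < 1:
                raise ValueError("Limit N Rows requires a positive row limit")
        # Upper bound is documented; the lower bound of 1 is an assumption.
        if not 1 <= self.max_string_precision <= 255:
            raise ValueError("Maximum precision must be between 1 and 255")


params = ProfilingParameters(
    run_mode="Keep signatures only",
    scope="Incremental",
    sampling_type="Limit N Rows",
    row_limit=1000,
)
print(params)
```

A configuration that selects Limit N Rows without a row limit, or that sets a precision above 255, raises ValueError in this sketch.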
6. Expand Data Quality and select Enable Data Quality.
Note:
You can click Use Data Profiling Parameters to use the same parameters as in the Data Profiling section.
Note:
Ensure that you have permissions on all the staging and flat file connections that you use in your data quality configuration. You can't run the job if you don't have permissions on the connections that you use. Select connections that you have access to, or ask the administrator to grant the necessary permissions on the connections that you want to use.
7. In the Connection and Runtime area, choose the Secure Agent group where you want to run catalog source jobs.
8. In the Parameters area, configure the parameters.
The following table describes the parameters that you can enter:
Parameter
Description
Data Quality Rule Automation
Enable the option to automatically create or update rule occurrences for data elements in the catalog source.
Choose one of the following options:
- Apply on Data Elements linked with Business Dataset. Creates rule occurrences for all data elements that are linked with business data sets in the catalog source.
- Apply on all Data Elements. Creates rule occurrences for all data elements in the catalog source.
Cache Result
Specify how you want to preview rule occurrence results.
Choose one of the following options:
- Agent Cache. Generates a cache file in the runtime environment. You can preview the cached results faster in subsequent data preview runs. The results are cached for seven days by default after the first run in the runtime environment.
- No Cache. Doesn't cache the preview results. You can view the live results.
Run Rule Occurrence Frequency
Specify whether you want to run data quality rules based on the frequency defined for the rule occurrence in Data Governance and Catalog.
Sampling Type
Determine the sample rows on which you want to run the data quality task.
Choose one of the following options:
- All Rows. Runs data quality on all rows in the metadata.
- Limit N Rows. Runs data quality on a limited number of rows.
No of rows to limit
Required if you select Limit N Rows in Sampling Type. Specify the number of rows on which you want to run data quality.
Maximum Precision of String Fields
The maximum precision value for profiles on string data type. You can set a maximum precision value of 255 characters. Default is 50.
Text Qualifier
The character that defines string boundaries. If you select a quote character, data quality ignores delimiters within the quotes. Select a qualifier from the list. Default is Double Quote.
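The effect of a text qualifier can be shown with Python's standard csv module. This is a general illustration of how a quote character protects delimiters inside a string. It does not show how the profiling or data quality engine parses data. The function name split_fields, the comma delimiter, and the sample record are assumptions made for the example.

```python
import csv
import io


def split_fields(line, delimiter=",", text_qualifier='"'):
    """Split one delimited line into fields.

    When text_qualifier is set, delimiters inside qualified text are ignored.
    When text_qualifier is None, every delimiter splits the line.
    """
    if text_qualifier is None:
        reader = csv.reader(
            io.StringIO(line), delimiter=delimiter, quoting=csv.QUOTE_NONE
        )
    else:
        reader = csv.reader(
            io.StringIO(line), delimiter=delimiter, quotechar=text_qualifier
        )
    return next(reader)


# Hypothetical record: the name contains the delimiter character.
record = '1001,"Smith, Jane",Engineering'

print(split_fields(record))
# ['1001', 'Smith, Jane', 'Engineering']

print(split_fields(record, text_qualifier=None))
# ['1001', '"Smith', ' Jane"', 'Engineering']
```

With the double quote as the qualifier, the record yields three fields. Without a qualifier, the comma inside the name splits the value and the record yields four fields.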