Automate Object Onboarding from Enterprise Data Catalog
After you integrate Axon with Enterprise Data Catalog, you can choose to automate the process of onboarding the discovered objects from Enterprise Data Catalog. In the onboarding process, Informatica identifies the critical data elements and automatically creates objects, such as data sets and attributes, and brings them to Axon. This automation eliminates the need for you to manually create content and link Axon objects and Enterprise Data Catalog technical assets. The automation increases efficiency and productivity, and adds value to the business.
You define a glossary and system in Axon. Configure data onboarding rules and apply the onboarding rules to systems in which you want to onboard data sets and attributes. You must enable the automated onboarding process from the System objects. Enterprise Data Catalog is the master catalog of technical assets and contains data domains and physical fields. Based on the onboarding rule criteria and associations between a glossary and data domain or physical field linking, the onboarding job discovers objects that can be onboarded from Enterprise Data Catalog. You can choose to discover the attributes that are related to an Axon glossary term with the help of Enterprise Data Catalog instead of manually searching and linking them. Informatica automates the process of creating data sets and attributes and onboards them into Axon. Axon sends notifications about the discovered objects for a glossary to the glossary stakeholders. You can curate the attributes by accepting or rejecting them. Ensure that the onboarded data sets have assigned stakeholders to curate the attributes.
Note: If objects are assigned to segments in Axon, the automated onboarding of objects is based on the segments that the objects belong to.
Prerequisites
Verify the following prerequisites:
- •You have installed Enterprise Data Catalog that runs on Informatica version 10.2.2 or higher.
- •You have configured the Enterprise Data Catalog parameters from the Admin Panel in Axon. For more information, refer to the Configure Access to Enterprise Data Catalog topic in the Informatica Axon Data Governance 7.1 Administrator Guide.
- •You have created the required systems and glossaries in Axon.
- •You have linked an Axon system to at least one Enterprise Data Catalog resource. For more information, refer to Linking Resources with Systems Manually.
- •You have created a data onboarding rule from the Admin Panel, applied the rule to an Axon system, and enabled the automated onboarding process from the system. For more information, refer to Configure and Apply Data Onboarding Rules.
- •Each glossary has a stakeholder to receive onboarding notifications.
Automated Onboarding Process
In the automated onboarding process, the key elements are identified based on a data onboarding rule specified for a system. Configure a data onboarding rule and apply it to a system. After you enable automated onboarding from the system, the data sets and attributes are automatically created.
The following image shows the automated onboarding process flow from Enterprise Data Catalog to Axon:
The following process describes the series of events for automatic onboarding of objects into Axon:
- 1. Create a resource in Enterprise Data Catalog. Optionally, you can enable data domain and similarity discovery options. For more information, refer to the topics Creating a Resource and Enable Data Discovery in the Informatica 10.4.1 Catalog Administrator Guide.
- 2. Create an Axon resource type in Enterprise Data Catalog. Use the Axon scanner to scan the Axon glossaries. For more information, refer to the topics Creating a Resource and Axon Resource Type Properties in the Informatica 10.4.1 Catalog Administrator Guide.
- 3. Create glossaries and systems in Axon. Assign at least one stakeholder to a glossary to receive notifications about onboarding objects into Axon. For more information, see Creating an Object.
- 4. To automate the onboarding of data sets and attributes, you must first configure data onboarding rules in the Admin Panel. For more information on creating data onboarding rules, refer to the Configure Data Onboarding Rules topic in the Informatica Axon Data Governance 7.1 Administrator Guide.
- 5. Apply onboarding rules and enable automated onboarding from Axon systems. Based on the configured onboarding rules, Axon includes or excludes onboarding of the data sets and attributes from Enterprise Data Catalog. For more information, see Data Onboarding Rules for a System.
- 6. A background job is scheduled to run daily at a predefined interval. When the job runs, Informatica identifies critical data elements based on the configured onboarding rules and the data sets and attributes are automatically created in Axon systems. For more information, see Onboarding Scheduler.
The onboarding process is based on any of the following associations:
- - When you link a physical field to a glossary in Enterprise Data Catalog. For more information, refer to the Add Business Title topic in the Informatica 10.4.1 Enterprise Data Catalog User Guide.
- - When you link a physical field to a data domain and the data domain is linked to a glossary in Enterprise Data Catalog. For more information, refer to the Column and Field Assets and Data Domain Assets topics in the Informatica 10.4.1 Enterprise Data Catalog User Guide.
- - When a physical field displays recommended glossaries in Enterprise Data Catalog. For more information, refer to the Add Business Title topic in the Informatica 10.4.1 Enterprise Data Catalog User Guide.
- - When a glossary is linked to a data domain discovery rule in Axon. For more information, see Data Domain Discovery Rules.
- - When a default glossary is specified in the Admin Panel. If a glossary association is selected in the onboarding rule for the system, Axon searches for the physical fields that are associated to a glossary. If a glossary association is not selected in the onboarding rule for the system, Axon uses the default glossary to onboard the objects from Enterprise Data Catalog.
For more information, see Linking Glossaries.
- 7. Axon sends notifications to the glossary stakeholders about the discovered objects. For more information, see Notify Stakeholders.
- 8. View the discovered data sets and attributes. Curate the attributes by accepting or rejecting them. Ensure that the onboarded data sets have assigned stakeholders with Edit permissions to curate the attributes. For more information, see View Discovered Objects and Curate Discovered Attributes.
- 9. You can view newly created relationships between attributes in Axon if you choose to accept the lineage recommendations from Enterprise Data Catalog and create links automatically. You can view the attribute links in both inbound and outbound relationship grids. You can filter based on the created date to find the newly created attribute links.
Configure and Apply Data Onboarding Rules
To automatically onboard objects from Enterprise Data Catalog, you must configure data onboarding rules and enable automated onboarding from Axon systems.
After you configure the Enterprise Data Catalog parameters in Axon, you can create data onboarding rules from the Admin Panel. In a data onboarding rule, you specify the properties based on which you want to automatically onboard the objects. You can define multiple rules for different systems, as required. You can apply a single data onboarding rule to a system. For more information on creating data onboarding rules, refer to the Configure Data Onboarding Rules topic in the Informatica Axon Data Governance 7.1 Administrator Guide.
Navigate to a System object in which you want the data sets and attributes to be onboarded. Click the Enterprise Catalog > Data Onboarding Rules view. Click Edit, select the configured onboarding rule, and choose the Enable Automated Onboarding option to automatically onboard objects from Enterprise Data Catalog based on the onboarding rule. For more information, see Data Onboarding Rules for a System.
Onboarding Scheduler
Axon has a predefined scheduled job that runs every day at 2 a.m. After the job runs, the discovered objects are automatically onboarded from Enterprise Data Catalog to Axon.
When the onboarding script runs for the first time, Informatica onboards all the discovered data sets and attributes into Axon. When it runs the next time, the newly discovered data sets and attributes are onboarded.
When the onboarding job runs, the relationship between attributes is also onboarded if you choose to automatically create links from the Admin Panel.
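The difference between the first run and later runs can be pictured as a set difference. The following is a minimal sketch, not Axon's implementation: the function name `objects_to_onboard`, the plain-string object identifiers, and the sample names are assumptions made for illustration.

```python
from typing import Set


def objects_to_onboard(discovered: Set[str], already_onboarded: Set[str]) -> Set[str]:
    """Return the discovered objects that a job run still needs to onboard.

    Hypothetical helper: objects are identified by plain strings here.
    """
    # First run: already_onboarded is empty, so every discovered object qualifies.
    # Later runs: only objects that were not onboarded before qualify.
    return discovered - already_onboarded


# First run: nothing has been onboarded yet.
first_run = objects_to_onboard({"EMP_DATA", "EMP_DATA.Name"}, set())

# Next run: one more attribute was discovered since the first run.
second_run = objects_to_onboard(
    {"EMP_DATA", "EMP_DATA.Name", "EMP_DATA.ID"}, first_run
)
```

In this sketch, `first_run` contains both objects, and `second_run` contains only the newly discovered `EMP_DATA.ID`.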
Common Onboarding Conditions
The following conditions apply to an onboarding process:
- •The name of the onboarded attribute is the same as either the glossary name or the physical field name, based on the configuration in the Admin Panel.
- •The name of the onboarded data set is the name of the parent of the field.
- •If a parent is associated with a glossary in Enterprise Data Catalog, the glossary is used to automatically create a data set with the parent name. Otherwise, the glossary that is required to automatically create a data set is randomly selected. For example, consider that table EMP_DATA has columns EmpName and EmpID. The column EmpName is linked to glossary "Name" and column EmpID is linked to glossary "ID". If the table EMP_DATA is linked to a glossary, a data set with the name EMP_DATA is automatically created using the glossary. If the table is not linked to any glossary, a data set with the name EMP_DATA is automatically created using either glossary "Name" or "ID".
- •If a data set exists in Axon, only the attributes are onboarded within the data set.
- •If data sets and attributes exist in Axon, the data sets and attributes are not onboarded, but the link is created in Axon.
- •If you reject a discovered attribute in Axon, the attribute is not onboarded again.
- •If a field in a custom resource does not have a parent in Enterprise Data Catalog, then the attribute corresponding to the field is not onboarded to Axon. This happens because you need a data set to onboard attributes, and without a parent, the data set context is missing.
- •If you configure a mandatory custom field for data sets and attributes in Axon, the onboarded data sets and attributes display the mandatory field with the default value that you provide while creating the mandatory custom field. You can change the default value for the mandatory custom field from the Data Set and Attribute objects. The onboarding of objects does not happen for some of the Business Intelligence resources that do not display the parent information.
- •If you configure a non-mandatory custom field for data sets and attributes in Axon, the onboarded data sets and attributes display the non-mandatory field without any value. You can specify a value from the Data Set and Attribute objects.
- •If objects are part of different segments, the objects are not automatically onboarded. For example, consider that glossary G1 is part of segment Seg1 and system S1 is part of segment Seg2. If G1 is connected to physical field C1 of resource R1 and R1 is connected to S1, the attributes are not automatically onboarded because S1 and G1 belong to two different segments.
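The EMP_DATA condition above, which decides the glossary used to create a data set, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the product's code: the function `pick_data_set_glossary`, its parameters, and the table-level glossary name "Staff" are hypothetical.

```python
import random
from typing import Optional, Sequence


def pick_data_set_glossary(
    parent_glossary: Optional[str],
    field_glossaries: Sequence[str],
    rng: Optional[random.Random] = None,
) -> str:
    """Choose the glossary used to create the data set for a parent (table).

    parent_glossary: glossary linked to the parent, or None if there is none.
    field_glossaries: glossaries linked to the fields of that parent.
    """
    if parent_glossary is not None:
        # The parent is linked to a glossary, so that glossary is used.
        return parent_glossary
    if not field_glossaries:
        raise ValueError("no glossary is available to create the data set")
    # Otherwise one of the field glossaries is selected at random.
    return (rng or random.Random()).choice(list(field_glossaries))


# EMP_DATA linked to a (hypothetical) glossary "Staff".
linked = pick_data_set_glossary("Staff", ["Name", "ID"])

# EMP_DATA not linked to any glossary: either "Name" or "ID" is used.
unlinked = pick_data_set_glossary(None, ["Name", "ID"])
```

In both cases the data set itself is named EMP_DATA. Only the glossary used to create it differs.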
Data Onboarding Rule Conditions
The following onboarding rule conditions apply to an onboarding process:
- Associated Glossary Required
- If you select the Requires an Associated Glossary option and specify the Confidence Score value in an onboarding rule, the confidence score value is considered. The data is onboarded only if the calculated confidence score value is greater than or equal to the specified confidence score value.
- For example, consider that the Requires an Associated Glossary option is selected and confidence score is 50% in a data onboarding rule. In Enterprise Data Catalog, physical field "EMP" is linked to inferred data domain "ID" that has a confidence score of 66.67% and "ID" is linked to glossary "Ref". Based on the rule conditions, the attribute "Ref" is onboarded with a calculated confidence score of 63.3% and the onboarded attribute "Ref" is linked to glossary "Ref".
- Associated Glossary Not Required
- If the Requires an Associated Glossary option is not selected in an onboarding rule, the confidence score value specified in the rule is not considered for the following cases:
- - Consider that a physical field is associated with an inferred data domain and the physical field is linked to a glossary. If the inferred data domain is also linked to a glossary, the attributes are onboarded with a confidence score of 100% based on the physical field to glossary linking.
For example, consider that physical field "Length" is associated with inferred data domain "Measure" that has a confidence score of 60%. Physical field "Length" is associated with glossary "Unit" and data domain "Measure" is associated with glossary "Height". The attribute "Unit" is onboarded with a confidence score of 100% and glossary "Unit" is linked to onboarded attribute "Unit".
- - Consider that a physical field is associated with an inferred data domain and the inferred data domain is linked to a glossary. If the physical field is not linked to a glossary, the attribute is onboarded with a confidence score of 100% based on the inferred data domain to glossary linking without considering the confidence score value of the inferred data domain.
For example, consider that physical field "Length" is associated with inferred data domain "Measure" that has a confidence score of 60%. Data domain "Measure" is associated with glossary "Height" and the physical field is not linked to any glossary. The attribute "Height" is onboarded with a confidence score of 100% and glossary "Height" is linked to onboarded attribute "Height".
- - Consider that a physical field is associated with an inferred data domain and the physical field and inferred data domain are not linked to any glossary. If a default glossary is specified in the Admin Panel, the attribute is onboarded based on the default glossary.
For example, consider that physical field "EMP" is associated with inferred data domain "Record" that has a confidence score of 50%. Both the physical field and inferred data domain are not linked to any glossary. The default glossary "Cust" is configured in the Admin Panel. The attribute "Cust" is onboarded with a confidence score of 100% and glossary "Cust" is linked to onboarded attribute "Cust".
- - Consider that a physical field is associated with an accepted data domain and the physical field is linked to a glossary. If the accepted data domain is also linked to a glossary, the attributes are onboarded with a confidence score of 100% based on the physical field to glossary and data domain to glossary linking.
For example, consider that physical field "Age" is associated with accepted data domain "Number" that has a confidence score of 100%. Accepted data domain "Number" is linked to glossary "ID" and physical field "Age" is linked to glossary "Years". The attributes "ID" and "Years" are onboarded with a confidence score of 100%. The glossary "ID" is linked to the onboarded attribute "ID" and glossary "Years" is linked to the onboarded attribute "Years".
- - Consider that a physical field is associated with an inferred data domain and the physical field and inferred data domain are not linked to any glossary. If a default glossary is not specified in the Admin Panel, the attribute is not onboarded at all.
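The rule conditions above can be summarized as a small decision sketch. It is a reading of the cases listed in this topic, not Axon's implementation. The names `FieldLinks`, `passes_required_rule`, and `glossaries_when_not_required` are hypothetical, and the sketch assumes that attributes are named after glossaries. Combinations that this topic does not describe raise an error instead of guessing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FieldLinks:
    """Glossary links of one physical field (hypothetical structure)."""
    field_glossary: Optional[str] = None   # glossary linked directly to the field
    domain_glossary: Optional[str] = None  # glossary linked to the field's data domain
    domain_accepted: bool = False          # True: accepted domain, False: inferred


def passes_required_rule(calculated_score: float, rule_score: float) -> bool:
    """Requires an Associated Glossary selected: compare the two scores."""
    return calculated_score >= rule_score


def glossaries_when_not_required(
    links: FieldLinks, default_glossary: Optional[str] = None
) -> List[str]:
    """Requires an Associated Glossary not selected.

    Returns the glossaries whose attributes are onboarded at 100% confidence.
    """
    field, domain = links.field_glossary, links.domain_glossary
    if field and domain:
        # Accepted domain: both links count. Inferred domain: the field link is used.
        return [domain, field] if links.domain_accepted else [field]
    if domain and not links.domain_accepted:
        # Inferred domain linked to a glossary, field not linked.
        return [domain]
    if not field and not domain and not links.domain_accepted:
        # No links at all: fall back to the default glossary, if one is set.
        return [default_glossary] if default_glossary else []
    raise ValueError("combination not described in this topic")


required_ok = passes_required_rule(63.3, 50.0)
case_unit = glossaries_when_not_required(FieldLinks("Unit", "Height"))
case_height = glossaries_when_not_required(FieldLinks(None, "Height"))
case_default = glossaries_when_not_required(FieldLinks(), default_glossary="Cust")
case_accepted = glossaries_when_not_required(FieldLinks("Years", "ID", True))
case_none = glossaries_when_not_required(FieldLinks())
```

The sample calls mirror the examples in this section: "Unit", "Height", "Cust", and the pair "ID" and "Years" are onboarded, and the last call returns an empty list because no default glossary is specified.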
Objects Onboarding Scenarios
Consider the onboarding scenarios for resources with relational sources, such as tables and columns. The process of automated onboarding of objects varies based on the following types of links between a system and resource:
- •A single system is linked to a single resource.
- •A single system is linked to multiple resources.
- •Multiple systems are linked to a single resource.
In all the scenarios, consider that you choose the glossary names for the onboarded attributes.
Scenario 1. Single System to Single Resource
After you link an Axon system to a single Enterprise Data Catalog resource, the onboarding results can vary based on whether columns of the resource are linked to different glossaries or the same glossary.
Consider that system S1 is linked to resource R1. Resource R1 contains tables T1 and T2. Table T1 contains columns C1, C2, C3, and C5. Table T2 has a column C4.
Multiple Columns Linked to Different Glossaries
In this case, columns C1, C2, C3, and C5 of table T1 are linked to different glossaries G1, G2, G3, and G5 respectively. Column C4 of table T2 is linked to glossary G4.
The following image shows a sample linking of multiple columns from different tables of a single resource to different glossaries:
In this case, the objects are onboarded in the following way:
- •A data set is created and onboarded with the same name as table T1.
- •Attributes are created and onboarded with the same names as glossaries G1, G2, G3, and G5. Attributes G1, G2, G3, and G5 are created within data set T1.
- •Another data set is created and onboarded with the same name as table T2.
- •Another attribute is created and onboarded with the same name as glossary G4. Attribute G4 is created within the data set T2.
The following image shows a sample list of the onboarded objects:
Multiple Columns Linked to Same Glossary
In this case, resource R1 is linked to system S1. Consider that all the columns of tables T1 and T2 of the resource R1 are linked to the same glossary G1.
The following image shows a sample linking of multiple columns from different tables of a single resource to the same glossary:
In this case, the objects are onboarded in the following way:
- •A data set is created and onboarded with the same name as table T1.
- •An attribute is created and onboarded with the same name as glossary G1. Attribute G1 is created within data set T1.
- •Another data set is created and onboarded with the same name as table T2.
- •Another attribute is created and onboarded with the same name as glossary G1. Attribute G1 is created within the data set T2.
The following image shows a sample list of the onboarded objects:
Scenario 2. Single System to Multiple Resources
After you link an Axon system to multiple Enterprise Data Catalog resources, the onboarding results can vary based on whether columns of the same table from multiple resources are linked to different glossaries or the same glossary, or columns of different tables from multiple resources are linked to the same glossary.
Consider that system S1 is linked to resources R1 and R2.
Multiple Columns from Same Tables Linked to Different Glossaries
In this case, both the resources R1 and R2 contain the same table T1. Table T1 from resource R1 contains columns C1, C2, and C5. Table T1 from resource R2 contains columns C3 and C4. Columns C1, C2, and C5 are linked to different glossaries G1, G2, and G5 respectively. Columns C3 and C4 are linked to different glossaries G3 and G4 respectively.
The following image shows a sample linking of multiple columns of the same table from multiple resources to different glossaries:
In this case, the objects are onboarded in the following way:
- •A data set is created and onboarded with the same name as table T1 for resources R1 and R2.
- •Within data set T1, the attributes are created and onboarded with the same names as glossaries G1, G2, G3, G4, and G5.
The following image shows a sample list of the onboarded objects:
Multiple Columns from Same Tables Linked to Same Glossaries
In this case, both the resources R1 and R2 contain the same table T1. Table T1 from resources R1 and R2 contains columns C1 and C2. Column C1 from both the resources R1 and R2 is linked to the same glossary G1. Column C2 from both the resources R1 and R2 is linked to the same glossary G2.
The following image shows a sample linking of multiple columns in the same table from different resources to the same glossary:
In this case, the objects are onboarded in the following way:
- •A data set is created and onboarded with the same name as table T1 for resources R1 and R2.
- •Within data set T1, the attributes are created and onboarded with the same names as glossaries G1 and G2.
The following image shows a sample list of the onboarded objects:
Multiple Columns from Different Tables Linked to Same Glossary
In this case, both the resources R1 and R2 contain tables T1 and T2. Table T1 from resources R1 and R2 contains columns C1, C3, and C5. Table T2 from resources R1 and R2 contains columns C2 and C4. All the columns are linked to the same glossary G1.
The following image shows a sample linking of multiple columns in different tables from different resources to the same glossary:
In this case, the objects are onboarded in the following way:
- •A data set is created and onboarded with the same name as table T1 for resources R1 and R2.
- •An attribute is created and onboarded with the same name as glossary G1. Attribute G1 is created within data set T1.
- •Another data set is created and onboarded with the same name as table T2 for resources R1 and R2.
- •An attribute is created and onboarded with the same name as glossary G1. Attribute G1 is created within the data set T2.
The following image shows a sample list of the onboarded objects:
Scenario 3. Multiple Systems to Single Resource
After you link multiple Axon systems to a single Enterprise Data Catalog resource, the onboarding results can vary based on whether columns of the resource are linked to different glossaries or the same glossary.
Consider that systems S1 and S2 are linked to resource R1. Resource R1 contains tables T1 and T2. Table T1 contains columns C1, C2, and C3. Table T2 contains columns C4 and C5.
Multiple Columns Linked to Different Glossaries
In this case, columns C1, C2, and C3 of table T1 are linked to different glossaries G1, G2, and G3 respectively. Columns C4 and C5 of table T2 are linked to glossaries G4 and G5 respectively.
The following image shows a sample linking of multiple columns from different tables of a resource to different glossaries:
If multiple systems are linked to a single resource and multiple columns from different tables of the resource are linked to different glossaries, the data sets and attributes are created and onboarded to each system. In this case, the objects are onboarded in the following way:
- •A data set is created and onboarded with the same name as table T1 in systems S1 and S2.
- •Attributes are created and onboarded with the same names as glossaries G1, G2, and G3. Attributes G1, G2, and G3 are created within data set T1 in systems S1 and S2.
- •Another data set is created and onboarded with the same name as table T2 in systems S1 and S2.
- •Attributes are created and onboarded with the same names as glossaries G4 and G5. Attributes G4 and G5 are created within data set T2 in systems S1 and S2.
The following image shows a sample list of the onboarded objects:
Multiple Columns Linked to Same Glossary
In this case, all columns from tables T1 and T2 are linked to the same glossary G1.
The following image shows a sample linking of multiple columns from different tables to the same glossary:
If multiple systems are linked to a single resource and multiple columns from different tables of the resource are linked to the same glossary, data sets along with attributes are created and onboarded to each system. In this case, the objects are onboarded in the following way:
- •A data set is created and onboarded with the same name as table T1 in systems S1 and S2.
- •Another data set is created and onboarded with the same name as table T2 in systems S1 and S2.
- •An attribute is created and onboarded with the same name as glossary G1 in both data sets T1 and T2 for systems S1 and S2.
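All three scenarios follow the same pattern: one data set per table name in each linked system, and one attribute per distinct glossary within that data set. The following minimal sketch reproduces that pattern under stated assumptions. The function `plan_onboarding` and the tuple layout are hypothetical, attributes are named after glossaries as assumed for these scenarios, and the code does not reflect how Axon stores or processes links.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# One column-to-glossary link: (resource, table, column, glossary).
ColumnLink = Tuple[str, str, str, str]


def plan_onboarding(
    system_resources: Dict[str, List[str]],
    column_links: List[ColumnLink],
) -> Dict[str, Dict[str, List[str]]]:
    """Return {system: {data set name: [attribute names]}}."""
    plan: Dict[str, Dict[str, List[str]]] = {}
    for system, resources in system_resources.items():
        data_sets: Dict[str, Set[str]] = defaultdict(set)
        for resource, table, _column, glossary in column_links:
            if resource in resources:
                # Same table name from different resources maps to one data set.
                # Columns sharing a glossary yield one attribute per data set.
                data_sets[table].add(glossary)
        plan[system] = {t: sorted(g) for t, g in sorted(data_sets.items())}
    return plan


# Scenario 2: one system, two resources, same table, different glossaries.
scenario_2 = plan_onboarding(
    {"S1": ["R1", "R2"]},
    [
        ("R1", "T1", "C1", "G1"), ("R1", "T1", "C2", "G2"),
        ("R1", "T1", "C5", "G5"), ("R2", "T1", "C3", "G3"),
        ("R2", "T1", "C4", "G4"),
    ],
)

# Scenario 3: two systems, one resource, all columns linked to glossary G1.
scenario_3 = plan_onboarding(
    {"S1": ["R1"], "S2": ["R1"]},
    [
        ("R1", "T1", "C1", "G1"), ("R1", "T1", "C2", "G1"),
        ("R1", "T1", "C3", "G1"), ("R1", "T2", "C4", "G1"),
        ("R1", "T2", "C5", "G1"),
    ],
)
```

Here `scenario_2` yields a single data set T1 with attributes G1 through G5 in system S1, and `scenario_3` yields data sets T1 and T2, each with one attribute G1, in both S1 and S2.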
The following image shows a sample list of the onboarded objects:
Notify Stakeholders
Axon sends notifications about the onboarded objects to all the stakeholders of a glossary. You can click the bell icon and view the onboarding and relationship recommendation notifications in the Catalog tab.
The following image shows a sample list of discovered objects that you can view in the notifications:
If the notifications include links to the glossaries for which the data sets and attributes were discovered, you can click the glossary name to navigate to the glossary object and view the discovered data sets and attributes. For example, 11 new attributes and 8 data sets were discovered for Glossary FNAME. When you click the FNAME glossary, you can view the discovered data sets and attributes under the Data tab.
Axon displays a notification for each glossary. After the next onboarding scheduler run, the discovered objects are updated for each glossary.
View Discovered Objects
In Axon, you can view data sets and attributes that are discovered and onboarded from Enterprise Data Catalog.
You can view the discovered data sets from the Data > Data Sets view of a Glossary object. You can view the discovered attributes from the following areas in the Axon interface:
- •The Data > Data Attributes view of a Glossary object
- •The Attributes facet view from the Unison search
- •The Attributes tab of a Data Set object
- •The Data > Data Attributes view of a System object
The following image shows the Data > Data Attributes view of a Glossary object:
You can view that the attributes are populated from Enterprise Data Catalog with the following properties:
- •The onboarded name of the attribute is the same as the glossary name or physical field name.
- •The attribute definition displays the same definition of the glossary.
- •The Origin field has the "Enterprise Catalog" value.
- •The data type is the same as that of the Enterprise Data Catalog field.
- •The data length is the same as the data type length in Enterprise Data Catalog.
- •The data set name is the same as the name of the parent of the field in Enterprise Data Catalog.
- •The Physical Fields column shows the names in orange color. You can click the hyperlink on the linked physical field to view the details of the field, such as Last Updated, Field Type, Resource, Table, Schema, Data Domains, and Data Domain Groups.
- •The Review Status column of the discovered attributes shows the "Accepted" or "Discovered" status. If you choose to accept onboarded objects from the Admin Panel and the Confidence Score is 100%, the attributes from Enterprise Data Catalog appear with the "Accepted" review status in Axon. If you do not choose to accept onboarded objects, the review status appears as "Discovered." The "Accepted" review status appears only when a physical field or curated data domain is directly linked to a glossary. You cannot see the "Accepted" review status when you use recommended glossaries and data domain links.
- •The Confidence Score column shows the percentage of conformance of the discovered attributes.
- •The onboarded attributes metadata includes the Data Type, DataLength, and Key columns with values populated from Enterprise Data Catalog.
Note: If objects are part of segments, you might see some masked data if you do not have access to the segment.
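The Review Status behavior described in the list above can be sketched as a single predicate. This is an interpretation of that description, not the product's logic: the function `review_status` and its parameter names are hypothetical, and the sketch assumes that an attribute with a confidence score below 100% stays in the "Discovered" status.

```python
def review_status(
    accept_onboarded_objects: bool,
    confidence_score: float,
    direct_link: bool,
) -> str:
    """Return the review status of a newly onboarded attribute.

    accept_onboarded_objects: the Admin Panel choice to accept onboarded objects.
    confidence_score: confidence score of the attribute, in percent.
    direct_link: True when a physical field or curated data domain is directly
        linked to a glossary; False for recommended glossaries and other
        data domain links.
    """
    if accept_onboarded_objects and confidence_score >= 100.0 and direct_link:
        return "Accepted"
    # Assumption: every other combination remains "Discovered".
    return "Discovered"


auto_accepted = review_status(True, 100.0, True)
recommended_only = review_status(True, 100.0, False)
not_enabled = review_status(False, 100.0, True)
```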
Curate Discovered Attributes
To curate the discovered attributes, ensure that the onboarded data sets have assigned stakeholders with Edit permissions. You can curate the discovered attributes from the following areas in the Axon interface:
- Objects View
- If you want to curate the attributes within a data set, you must have Edit permissions on the data set. You can view and curate the discovered attributes from the following objects:
- - The Data Attributes sub-tab on the Data tab of a Glossary object
- - The Attributes tab of a Data Set object
- - The Data Attributes sub-tab on the Data tab of a System object
- You can choose whether you want to accept or reject the discovered attributes. Select the attributes with the "Discovered" review status, click the Actions menu and choose Accept or Reject. You cannot undo the changes after you accept or reject the discovered attributes.
- If you do not have the required permissions to curate the attributes, the Actions menu does not appear for the Attributes tab of a Data Set object. In System and Glossary objects, you can view the Actions menu even if you do not have the required permissions to curate the attributes. However, if you try to curate the discovered attributes, an error occurs.
- In a Glossary object, you cannot curate the attributes that are discovered for the child glossaries. To curate the child glossary attributes, navigate to the child glossary objects.
- Bulk Update
- Use the Bulk Update option from the Unison search to curate the discovered attributes.
- The following image shows the Attributes facet with a sample list of discovered objects that you can curate from the Unison search:
In the Attributes facet, select the attributes with the "Discovered" review status, and click Bulk Update. In the Bulk Update Items section, you can see the selected attributes. In the Definition section, you can view the fields that you can update for the attributes. To curate the attributes, select Accept or Reject in the Review Status field.
If you choose to reject a discovered attribute, Axon permanently deletes the discovered attribute without changing the review status to "Rejected". You can view the audit history to identify who curated the attributes and when the attributes were curated.
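The effect of the two curation actions can be illustrated with a small sketch. It only models the outcome described above: accepting changes the review status, and rejecting removes the attribute without ever setting a "Rejected" status. The function `curate`, the dictionary representation, and the attribute names are hypothetical.

```python
from typing import Dict


def curate(review_status: Dict[str, str], attribute: str, action: str) -> None:
    """Apply Accept or Reject to an attribute in the "Discovered" status.

    review_status maps attribute names to their review status
    (a hypothetical in-memory stand-in for the attribute list).
    """
    if review_status.get(attribute) != "Discovered":
        raise ValueError("only attributes in the Discovered status can be curated")
    if action == "accept":
        review_status[attribute] = "Accepted"
    elif action == "reject":
        # The attribute is deleted. No "Rejected" status is ever stored.
        del review_status[attribute]
    else:
        raise ValueError("action must be 'accept' or 'reject'")


statuses = {"FNAME": "Discovered", "LNAME": "Discovered"}
curate(statuses, "FNAME", "accept")
curate(statuses, "LNAME", "reject")
```

After these calls, `statuses` contains only the accepted attribute. Neither action can be undone.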