The November 2024 release of CLAIRE GPT includes the following new features and enhancements.
Metadata access control
This release of Data Governance and Catalog introduces metadata access control, which provides options for enhanced control over how users interact with the assets in the catalog. Organization administrators create access policies to control the level of access that users have on assets. When you discover or explore assets in CLAIRE GPT, you can view and perform actions on assets only if your organization administrator has explicitly granted access to you through access policies.
CLAIRE GPT responses
CLAIRE GPT responses have the following enhancements:
•Data discovery, metadata exploration, and data exploration responses are more relevant because they now include interactive visual and textual summaries.
•You can interact with the responses and switch between charts and the raw data in tabular format to meet your business requirements.
•You can compare and engage with data in the responses. For example, the responses now appear in a comparative tabular format, and you can copy the entire table or specific sections for further analysis.
•Data discovery responses include explanations for each item in the response. The explanation provides information such as why particular items appear in the response, as well as unique insights for different asset types.
•You can click Show more in the responses to see the next 10 records.
The revamped responses, with detailed summaries and the other improvements listed above, enhance your user experience.
For more information about data discovery, metadata exploration, and data exploration, see Using CLAIRE GPT.
Data exploration
This release includes the following enhancements related to data exploration:
•CLAIRE GPT implicitly understands key business metrics including KPIs such as CAGR, CLV, YoY growth, and QoQ growth.
•The responses provide business insights on the data even when you do not explicitly ask for them.
•You can ask analytical questions.
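For reference, the growth metrics named above follow standard formulas. The sketch below is illustrative only: it shows the usual definitions of period-over-period growth (YoY or QoQ) and CAGR, not how CLAIRE GPT computes them internally. The function names and the revenue figures are hypothetical.

```python
def growth_rate(current: float, previous: float) -> float:
    """Period-over-period growth as a fraction (the same formula serves YoY and QoQ)."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous


def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Hypothetical yearly revenue figures, for illustration only.
revenue = {2021: 100.0, 2022: 110.0, 2023: 121.0}

yoy_2023 = growth_rate(revenue[2023], revenue[2022])
cagr_2021_2023 = cagr(revenue[2021], revenue[2023], years=2)
print(f"YoY growth 2023: {yoy_2023:.1%}")
print(f"CAGR 2021-2023: {cagr_2021_2023:.1%}")
```

QoQ growth uses the same `growth_rate` function with quarterly values instead of yearly ones.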
Data quality
This release includes the following enhancements related to data quality:
Data quality for business assets
Ask questions to explore the data quality of business assets such as business data sets, glossary, and policies.
For example, you can enter the following prompts:
- Show all data quality rules for the critical data elements in the <data asset name> data set
- Show me the data quality rules without stakeholders
- Show me business data sets that do not have data quality rules
Assess data quality of data sets and data elements
Discover data sets and data elements based on their data quality score statuses. You can ask questions to discover assets with poor, good, low, and acceptable data quality scores that correspond to the Good, Acceptable, and Not Acceptable score statuses in Data Governance and Catalog. The response returns the score status from the last data quality rule run.
For example, you can enter the following prompts to assess the data quality score status:
- Show data sets with poor data quality status
- Show data sets having dq score < 50
- Show the data elements from <data set name> with acceptable DQ status
Summarize data and identify outliers
Ask questions to view the data summary of data sets and data elements in your organization.
The data summary includes the data profile of assets such as the data type, statistics, null values, distinct values, and information about the outliers detected in the profiled data. If the data profile of the requested asset is not available in the catalog, CLAIRE GPT performs on-demand data profiling for the critical data elements in the data set to return the data summary.
For example, you can enter the following prompts to view the data summary of assets:
- Show me the data summary of the <asset name>
- Summarize the data for the column <column name>
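To make the contents of a data summary concrete, the sketch below computes a small column profile: null count, distinct count, and outliers. It is an illustration under stated assumptions, not the profiling logic that CLAIRE GPT uses. The release note does not say how outliers are detected, so the 1.5 × IQR rule here is an assumption, and the function name and sample values are hypothetical.

```python
from statistics import quantiles


def summarize_column(values):
    """Return a small profile: row count, null count, distinct count, outliers."""
    non_null = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "outliers": [],
    }
    numeric = [v for v in non_null if isinstance(v, (int, float))]
    if len(numeric) >= 4:
        # Assumption: flag values outside 1.5 * IQR of the quartiles.
        q1, _, q3 = quantiles(numeric, n=4)
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        summary["outliers"] = [v for v in numeric if v < low or v > high]
    return summary


# Hypothetical column values, for illustration only.
order_totals = [10, 12, 11, None, 13, 12, 250, 11]
print(summarize_column(order_totals))
```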
Diagnose the data quality of assets
To gain additional insights into your assets, diagnose the data quality of data sets and data elements that have poor data quality scores. You can identify the root cause that might have led to the poor data quality scores for assets.
CLAIRE GPT identifies the root cause that led to the poor data quality of assets by analyzing the specific data quality rules, data quality dimensions, data elements in a data set with poor data quality scores, and upstream assets with poor data quality scores.
For example, you can enter the following prompts:
- What could have led to the low DQ score for <data set name>
- What is the reason for the decline of the average data quality score for <data set name> compared to the previous run?
- What are the key data attributes in <data set name> that are most impacting the data quality score?
Recommendation of data quality rules
Get recommendations for data quality rules for data elements and data sets that don't have any rules either manually created or associated with them through business terms. CLAIRE GPT can recommend rules only under the following conditions:
- There are business terms in your data catalog that match the data elements for which you want to generate rules.
- The matching business terms are associated with a data quality rule template.
CLAIRE GPT tries to match the data element to the closest business term in your catalog. After the business term is matched to the data element, the data quality rule associated with the business term is recommended as a rule for the data element.
You can accept one or more recommended rules using the suggested prompts to instantly create a data quality rule occurrence for the data elements in Data Governance and Catalog. You can also open the rule occurrence in Data Governance and Catalog to review or modify it.
For example, you can enter the following prompts:
- Recommend dq rules for <data set name>
- Recommend data quality rules for <data element name>
- Accept all rules
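The matching step behind rule recommendations can be pictured with the sketch below. It is a rough illustration only: the release note does not describe how CLAIRE GPT matches a data element to a business term, so the string-similarity approach (Python's `difflib`) is an assumption, and the catalog contents, the `recommend_rule` function, and the cutoff value are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical catalog: business term -> associated data quality rule template.
RULE_TEMPLATES = {
    "customer email": "Valid email format",
    "phone number": "Valid phone number",
    "postal code": "Valid postal code",
}


def recommend_rule(element_name, templates=RULE_TEMPLATES, cutoff=0.6):
    """Return (business_term, rule_template) for the closest term, or None."""
    normalized = element_name.lower().replace("_", " ")
    # Assumption: plain string similarity stands in for the real matching logic.
    matches = get_close_matches(normalized, list(templates), n=1, cutoff=cutoff)
    if not matches:
        return None  # no matching business term, so no rule can be recommended
    term = matches[0]
    return term, templates[term]


print(recommend_rule("CUSTOMER_EMAIL"))
print(recommend_rule("CUST_EMAIL"))
```

Returning `None` mirrors the conditions listed above: without a matching business term that has a rule template, there is nothing to recommend.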
Lineage
This release includes the following enhancements to lineage:
•You can run impact analysis queries on business and technical assets to get detailed information about assets that will be impacted if you update an asset. For large lineages, CLAIRE GPT now extracts the assets that are directly or indirectly connected to the asset in question by up to five hops in its lineage diagram. For assets that are connected to the asset in question by more than five hops, the response indicates that further connected assets exist.
•You can ask indirect prompts to show detailed lineage of an asset.
For example, you can enter the following prompts to view lineage:
- Explain the impact if I change <asset name>
- If the <field> in the <source system> changes, what systems/reports/dashboards could potentially be impacted?
- What is the eventual/ultimate target that gets impacted if <asset name> asset changes? Also, provide me with its stakeholder details
- How many systems will be impacted if the <asset name> asset changes?
•You can ask questions to explore the business lineage of business data sets. While exploring the business lineage, you can ask questions to explore the target and parent business assets in the lineage, assets contributing to the upstream and downstream lineage, stakeholders involved in the lineage, and more.
For example, you can enter the following prompts:
- Show the business lineage for <data set name>
- Trace the data flow for <data set name> and give a report on upstream/downstream assets that have missing policy and glossary associations
- Can you trace the lineage of <data set name> and identify the missing stakeholders?
For more information about lineage, see Using CLAIRE GPT.
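The five-hop limit on impact analysis can be illustrated with a depth-limited breadth-first traversal. This is a minimal sketch, assuming lineage is a simple directed graph and that the limit includes the fifth hop. It is not the product's implementation, and the graph, asset names, and `impacted_assets` function are hypothetical.

```python
from collections import deque

MAX_HOPS = 5  # hop limit described in the release note


def impacted_assets(graph, start, max_hops=MAX_HOPS):
    """Breadth-first walk downstream of `start`.

    Returns (hops_by_asset, truncated). `truncated` is True when at least
    one connected asset lies beyond the hop limit.
    """
    hops = {start: 0}
    queue = deque([start])
    truncated = False
    while queue:
        asset = queue.popleft()
        for downstream in graph.get(asset, []):
            if downstream in hops:
                continue  # already reached by a path that is no longer
            if hops[asset] == max_hops:
                truncated = True  # further assets exist past the limit
                continue
            hops[downstream] = hops[asset] + 1
            queue.append(downstream)
    del hops[start]  # report only the impacted assets, not the start asset
    return hops, truncated


# Hypothetical lineage: a chain of seven assets.
lineage = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"], "e": ["f"], "f": ["g"]}
assets, more = impacted_assets(lineage, "a")
print(assets, more)
```

Breadth-first order guarantees that each asset is recorded with its shortest hop count, so an asset is treated as beyond the limit only when no path of five hops or fewer reaches it.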
Recall previous prompts
To reuse a previous prompt, you can recall prompts within a conversation. Use the up and down arrow keys to access previously sent prompts. You can run the same prompt again, or press Tab to modify it before you send it.
For more information about recalling previous prompts, see Getting Started with CLAIRE GPT.
Creating an ELT pipeline
You can create an ELT pipeline based on any data exploration prompt. CLAIRE GPT has improved how it understands prompts, so it can generate a pipeline based on a wider range of data exploration questions.