Introduction to Apache HiveQL Script catalog sources
You can use Metadata Command Center to extract metadata from a source system.
A source system is any system that contains data or metadata. For example, Apache Hive is a source system from which you can extract metadata through an Apache HiveQL Script catalog source with Metadata Command Center. A catalog source is an object that represents and contains metadata from the source system.
Before you extract metadata from a source system, you first create and register a catalog source that represents the source system.
When Metadata Command Center extracts metadata, Data Governance and Catalog displays the extracted metadata and its attributes as technical assets. You can then perform tasks such as analyzing the assets, viewing lineage, and creating links between those assets and their business context.
The Apache HiveQL Script catalog source supports only the metadata extraction capability.
Extraction and view process
To extract metadata from a source system, configure the catalog source and run the extraction job in Metadata Command Center. Then view the results in Data Governance and Catalog.
The following image shows the process to extract metadata from a source system:
After you verify prerequisites, perform the following tasks to extract metadata from Apache HiveQL Script:
1. Register a catalog source. Create a catalog source object, select Apache HiveQL Script, and select the connection.
2. Configure the catalog source. Specify the runtime environment, configure the metadata extraction capability, and add filters for metadata extraction.
3. Optionally, associate stakeholders. Associate users with technical assets, giving the users permission to perform actions determined by their roles.
4. Run or schedule the catalog source job.
5. Optionally, if the catalog source job generates referenced asset objects, you can assign a connection to referenced source system assets.
You can view lineage that includes object references without assigning a connection. After you assign a connection, you can view the referenced objects.
After the catalog source job runs, view the results in Data Governance and Catalog.
About the Apache HiveQL Script catalog source
You can use the Apache HiveQL Script catalog source to extract metadata from Apache Hive scripts.
An Apache HiveQL script is a set of Apache HiveQL statements stored in a file. The statements in the file run in sequence.
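To make the idea of a script as an ordered set of statements concrete, the following Python sketch splits the text of a small HiveQL script into individual statements. It is only an illustration and does not show how Metadata Command Center parses scripts. The function name, the sample script, and its table and column names are all hypothetical, and the splitter handles only quoted strings and `--` line comments, not every HiveQL lexical rule.

```python
def split_hiveql_statements(script_text: str) -> list[str]:
    """Split script text into statements on semicolons outside quotes.

    Simplified on purpose: understands single- and double-quoted strings
    and `--` line comments, nothing more.
    """
    statements = []
    current = []
    quote = None  # quote character we are currently inside, if any
    i = 0
    while i < len(script_text):
        ch = script_text[i]
        if quote:
            current.append(ch)
            if ch == "\\" and i + 1 < len(script_text):
                # Keep a backslash-escaped character verbatim.
                current.append(script_text[i + 1])
                i += 1
            elif ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
            current.append(ch)
        elif script_text.startswith("--", i):
            # Skip a line comment up to (not including) the newline.
            newline = script_text.find("\n", i)
            i = len(script_text) if newline == -1 else newline
            continue
        elif ch == ";":
            statement = "".join(current).strip()
            if statement:
                statements.append(statement)
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


# Hypothetical script content; the table and column names are made up.
SAMPLE_SCRIPT = """
-- build a summary table
CREATE TABLE IF NOT EXISTS sales_summary (region STRING, total DOUBLE);
INSERT OVERWRITE TABLE sales_summary
SELECT region, SUM(amount) AS total FROM sales WHERE note <> 'a;b' GROUP BY region;
"""

for position, statement in enumerate(split_hiveql_statements(SAMPLE_SCRIPT), start=1):
    print(position, statement.splitlines()[0])
```

The sample yields two statements in file order. The semicolon inside the string literal `'a;b'` does not end a statement.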
Catalog source capabilities
You can configure capabilities for a catalog source.
You can configure metadata extraction for the Apache HiveQL Script catalog source. The metadata extraction capability extracts source metadata from external source systems.
Extracted metadata
You can use the Apache HiveQL Script catalog source to extract metadata from Apache Hive scripts.
Objects extracted
Metadata Command Center extracts the following metadata from Apache Hive scripts:
• Calculation
• Folder
• Script
• Statements
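As a rough mental model of how the object types listed above might relate, the following Python sketch scans a directory of script files and groups them into a folder, its scripts, and each script's statements. The nesting shown here is an assumption made for illustration, as are the class names, the `scan_folder` function, and the `.hql` file extension. None of it reflects the internal model of Metadata Command Center. Calculation objects are left out because the source does not describe how they are derived.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ScriptAsset:
    """Hypothetical stand-in for a Script object and its Statements."""
    name: str
    statements: list[str] = field(default_factory=list)


@dataclass
class FolderAsset:
    """Hypothetical stand-in for a Folder object that holds scripts."""
    name: str
    scripts: list[ScriptAsset] = field(default_factory=list)


def scan_folder(folder: Path, pattern: str = "*.hql") -> FolderAsset:
    """Build a FolderAsset from the script files directly inside `folder`."""
    asset = FolderAsset(name=folder.name)
    for path in sorted(folder.glob(pattern)):
        text = path.read_text(encoding="utf-8")
        # Naive split on ';' for illustration only. It breaks on
        # semicolons that appear inside string literals or comments.
        statements = [part.strip() for part in text.split(";") if part.strip()]
        asset.scripts.append(ScriptAsset(name=path.name, statements=statements))
    return asset
```

Calling `scan_folder(Path("scripts"))` on a hypothetical `scripts` directory would return one `FolderAsset` whose `scripts` list holds one entry per matching file, in sorted file name order.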
Compatible connectors
Before you configure an Apache HiveQL Script catalog source, you must connect to the Apache Hive source system.
Use the Hive connector to connect to the Apache Hive source system.
For information about configuring a connection, see Connections.