Google BigQuery

You can use the Google BigQuery resource to collect metadata from the assets in Google BigQuery.

Objects Extracted

The Google BigQuery resource extracts metadata from the following assets in a Google BigQuery data source:

Permissions to Configure the Resource

Make sure that you perform the following steps before you configure the Google BigQuery resource:

Connect to a Google BigQuery Data Source Enabled for SSL

To connect to a Google BigQuery data source enabled for SSL, perform the following steps:
  1. Download the Google BigQuery SSL certificates using a web browser.
     Note: Make sure that you import the Google Trust Services certificate in the certification path.
  2. Copy the certificates to the <INFA_HOME>/services/shared/security/ directory.
  3. Go to the <INFA_HOME>/source/java/jre/bin directory and run the following keytool command to import each copied certificate as a trusted certificate in the Informatica domain keystore:
     keytool -import -file <INFA_HOME>/services/shared/security/<certificate>.cer -alias <alias name> -keystore <INFA_HOME>/services/shared/security/infa_truststore.jks -storepass <Informatica domain keystore password>
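
Because the keytool command runs once for each copied certificate, it can help to generate the commands rather than type them by hand. The following Python sketch is illustrative only and is not part of the product. It builds the argument list for each certificate file without running anything. The function names, the choice to use the file name without its extension as the alias, and the sample values /opt/informatica and changeit are assumptions made for the example.

```python
from pathlib import PurePosixPath


def build_keytool_command(infa_home, cert_file, alias, storepass):
    """Return the keytool argument list that imports one certificate."""
    home = PurePosixPath(infa_home)
    security_dir = home / "services" / "shared" / "security"
    return [
        str(home / "source" / "java" / "jre" / "bin" / "keytool"),
        "-import",
        "-file", str(security_dir / cert_file),
        "-alias", alias,
        "-keystore", str(security_dir / "infa_truststore.jks"),
        "-storepass", storepass,
    ]


def commands_for_certificates(infa_home, cert_files, storepass):
    """Build one command per certificate file.

    Assumption: the alias is the file name without its extension.
    Any unique alias works equally well.
    """
    return [
        build_keytool_command(infa_home, name, PurePosixPath(name).stem, storepass)
        for name in cert_files
    ]


if __name__ == "__main__":
    # Hypothetical installation path, certificate name, and password.
    for cmd in commands_for_certificates("/opt/informatica", ["gts_root.cer"], "changeit"):
        print(" ".join(cmd))
```

Review the printed commands before you run them, and avoid storing the keystore password in a script that other users can read.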

Resource Connection Properties

The General tab includes the following properties:
Property
  Description
Project ID
  Name of the Google Cloud Platform project that you want to access.
Private Key
  The private key associated with the service account.
Client Email
  The client email address associated with the service account.
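
A JSON key file that Google Cloud generates for a service account contains the fields project_id, private_key, and client_email. Assuming that these fields correspond to the Project ID, Private Key, and Client Email properties (this guide does not state the mapping explicitly), the following Python sketch reads the three values from the key file content so that you can copy them into the General tab. The function name and the property-to-field map are hypothetical and serve only as an illustration.

```python
import json

# Assumed mapping from resource property names to service account key fields.
FIELD_MAP = {
    "Project ID": "project_id",
    "Private Key": "private_key",
    "Client Email": "client_email",
}


def connection_properties(key_json_text):
    """Extract the three General tab values from service account key JSON text."""
    key = json.loads(key_json_text)
    missing = [field for field in FIELD_MAP.values() if not key.get(field)]
    if missing:
        raise ValueError("Key file is missing fields: " + ", ".join(missing))
    return {prop: key[field] for prop, field in FIELD_MAP.items()}


if __name__ == "__main__":
    # Hypothetical key content; a real key file holds a full PEM private key.
    sample = json.dumps({
        "type": "service_account",
        "project_id": "my-project",
        "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
        "client_email": "scanner@my-project.iam.gserviceaccount.com",
    })
    for name, value in connection_properties(sample).items():
        print(name, "=", value[:30])
```

The check for missing fields fails early, before an incomplete key reaches the resource configuration.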
The Metadata Load Settings tab includes the following properties:
Property
  Description
Enable Source Metadata
  Extracts metadata from the data source.
Scan Hidden Datasets
  Extracts metadata from hidden and anonymous datasets.
Dataset
  The datasets from which you want to import metadata for the Google BigQuery tables in the project. Default is all datasets.
Source Metadata Filter
  Includes or excludes tables and views from the resource run. Use semicolons (;) to separate the table names and view names.
  For more information about the filter field, see Source Metadata and Data Profile Filter.
Case Sensitive
  Specifies whether the resource is case sensitive. Select one of the following values:
    • True. Select this check box to configure the resource as case sensitive.
    • False. Clear this check box to configure the resource as case insensitive.
  The default value is True.
Memory
  Specifies the memory required to run the scanner job. Select one of the following values based on the size of the data set that you plan to import into the catalog:
    • Low
    • Medium
    • High
  Note: For more information about the memory values, see the Tuning Enterprise Data Catalog Performance article on the How-To Library Articles tab in the Informatica Doc Portal.
JVM Options
  JVM parameters that you can set to configure the scanner container. Use the following arguments to configure the parameters:
    • -Dscannerloglevel=<DEBUG/INFO/ERROR>. Sets the log level of the scanner to DEBUG, INFO, or ERROR. Default value is INFO.
    • -Dscanner.container.core=<No. of core>. Increases the number of cores for the scanner container. The value must be a number.
    • -Dscanner.yarn.app.environment=<key=value>. The key-value pairs that you need to set in the Yarn environment. Use commas to separate the key-value pairs.
    • -Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=<1.0/2.0>. Increases the scanner container memory when pmem is enabled. The default value is 1.
Track Data Source Changes
  Displays metadata source change notifications in Enterprise Data Catalog.
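
The JVM Options arguments listed above can be assembled into a single value before you enter it. The following Python sketch is illustrative only. It validates the log level and core count, and joins the arguments with spaces. The function name, its parameters, and the use of a space as the separator between arguments are assumptions; this guide does not specify how multiple arguments are delimited.

```python
ALLOWED_LOG_LEVELS = ("DEBUG", "INFO", "ERROR")


def build_jvm_options(log_level="INFO", cores=None, yarn_env=None, pmem_ratio=None):
    """Assemble a JVM Options string from the documented scanner arguments."""
    if log_level not in ALLOWED_LOG_LEVELS:
        raise ValueError("log_level must be one of " + ", ".join(ALLOWED_LOG_LEVELS))
    options = ["-Dscannerloglevel=" + log_level]

    if cores is not None:
        # The core count must be a positive whole number.
        if isinstance(cores, bool) or not isinstance(cores, int) or cores < 1:
            raise ValueError("cores must be a positive integer")
        options.append("-Dscanner.container.core=" + str(cores))

    if yarn_env:
        # Key-value pairs are separated by commas, as the property description states.
        pairs = ",".join(f"{key}={value}" for key, value in yarn_env.items())
        options.append("-Dscanner.yarn.app.environment=" + pairs)

    if pmem_ratio is not None:
        options.append(
            "-Dscanner.pmem.enabled.container.memory.jvm.memory.ratio=" + str(pmem_ratio)
        )

    # Assumption: arguments are separated by spaces.
    return " ".join(options)


if __name__ == "__main__":
    # Hypothetical values for illustration.
    print(build_jvm_options("DEBUG", cores=2, yarn_env={"TZ": "UTC"}, pmem_ratio=2.0))
```

Validating the values before you paste the string into the JVM Options property catches a misspelled log level or a non-numeric core count before the scanner job starts.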
Note: The Google BigQuery resource does not support the following features: