Create a Google BigQuery V2 connection to securely read data from or write data to Google BigQuery.
Connect to Google BigQuery
Let's configure the Google BigQuery V2 connection properties to connect to Google BigQuery.
Before you begin
Before you configure a connection, ensure that you download the Google service account key file in JSON format. The service account key file is created when you create a Google service account.
You require the client email, private key, and project ID from the service account key JSON file to create a Google BigQuery connection.
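As a minimal sketch, the three values can be pulled out of the key file with a few lines of Python. The helper name `read_connection_values` is hypothetical. The field names `client_email`, `private_key`, and `project_id` are the ones that the connection properties refer to.

```python
import json
from pathlib import Path

# Fields that the Google BigQuery V2 connection asks for.
REQUIRED_FIELDS = ("client_email", "private_key", "project_id")


def read_connection_values(key_file: str) -> dict:
    """Return the three connection values from a service account key file.

    Raises KeyError if the file lacks any of the required fields.
    """
    key_data = json.loads(Path(key_file).read_text(encoding="utf-8"))
    missing = [name for name in REQUIRED_FIELDS if name not in key_data]
    if missing:
        raise KeyError(f"key file is missing: {', '.join(missing)}")
    # Return only the fields the connection needs, nothing else from the file.
    return {name: key_data[name] for name in REQUIRED_FIELDS}
```

Call the helper with the path to your downloaded key file, then copy the returned values into the connection properties. Avoid printing or logging the `private_key` value.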
Connection details
The following table describes the basic connection properties:
Property
Description
Connection Name
Name of the connection.
Each connection name must be unique within the organization. Connection names can contain alphanumeric characters, spaces, and the following special characters: _ . + -,
Maximum length is 255 characters.
Description
Description of the connection. Maximum length is 4000 characters.
Type
Google BigQuery V2
Runtime Environment
The name of the runtime environment where you want to run tasks.
You cannot run an application ingestion and replication task, database ingestion and replication task, or streaming ingestion and replication task on a Hosted Agent or serverless runtime environment.
Authentication type
Select the Service Account authentication type to access Google BigQuery and configure the authentication-specific parameters.
Service Account authentication
Service Account authentication requires at a minimum your Google BigQuery service account email, service account key, and project ID.
The following table describes the basic connection properties for Service Account authentication:
Property
Description
Service Account Email
The client_email value from the Google service account key JSON file.
Service Account Key
The private_key value from the Google service account key JSON file.
Project ID
The project_id value from the Google service account key JSON file.
If you have created multiple projects with the same service account, enter the ID of the project that contains the dataset that you want to connect to.
Note: If you want to validate the credentials for the Service Account Email, Service Account Key, and Project ID during a test connection, set the flag CredentialValidation:true in the Provide Optional Properties field in advanced settings.
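The value for the Provide Optional Properties field can be assembled programmatically, as in the following sketch. It assumes the `Key:value` form shown in the note and the comma separator that the field description mentions. The helper name is hypothetical, and whether the field tolerates spaces around the separators is not stated here, so the sketch emits none.

```python
def build_optional_properties(properties: dict) -> str:
    """Join properties as comma-separated Key:value pairs."""
    parts = []
    for key, value in properties.items():
        # Render Python booleans in lowercase to match the documented "true".
        text = str(value).lower() if isinstance(value, bool) else str(value)
        parts.append(f"{key}:{text}")
    return ",".join(parts)


print(build_optional_properties({"CredentialValidation": True}))
# prints: CredentialValidation:true
```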
Advanced settings
The following table describes the advanced connection properties for Service Account authentication:
Property
Description
Storage Path
Path in Google Cloud Storage where the agent creates a local stage file to store the data temporarily. The agent uses this storage when it reads data in staging mode or writes data in bulk mode.
Use one of the following formats:
- gs://<bucket_name>
- gs://<bucket_name>/<folder_name>
When you enable cross-region replication in Google BigQuery, enter a Google Cloud Storage path that supports dual region storage.
Connection Mode
The mode that you want to use to read data from or write data to Google BigQuery.
Select one of the following connection modes:
- Simple. Flattens each field within the Record data type field as a separate field in the mapping.
- Hybrid. Displays all the top-level fields in the Google BigQuery table including Record data type fields. Google BigQuery V2 Connector displays the top-level Record data type field as a single field of the String data type in the mapping.
- Complex. Displays all the columns in the Google BigQuery table as a single field of the String data type in the mapping.
Default is Simple.
Use Legacy SQL for Custom Query
Select this option to use legacy SQL to define a custom query. If you clear this option, use standard SQL to define a custom query.
This property doesn't apply if you configure the Google BigQuery V2 connection in hybrid or complex mode.
Dataset Name for Custom Query
When you define a custom query, specify a Google BigQuery dataset.
Schema Definition File Path
Directory on the Secure Agent machine where the Secure Agent creates a JSON file with the sample schema of the Google BigQuery table. The JSON file name is the same as the Google BigQuery table name.
Alternatively, you can specify a storage path in Google Cloud Storage where the Secure Agent creates a JSON file with the sample schema of the Google BigQuery table. You can download the JSON file from the specified storage path in Google Cloud Storage to a local machine.
The schema definition file is required if you configure complex connection mode in the following scenarios:
- You add a Hierarchy Builder transformation in a mapping to read data from relational sources and write data to a Google BigQuery target.
- You add a Hierarchy Parser transformation in a mapping to read data from a Google BigQuery source and write data to relational targets.
Region ID
The region name where the Google BigQuery dataset that you want to access resides.
Note: Ensure that the bucket, or the bucket and folder, that you specify in the Storage Path property resides in the specified region.
For more information about the regions supported by Google BigQuery, see Dataset locations.
Staging Dataset
The Google BigQuery dataset name where you want to create the staging table to stage the data. You can define a Google BigQuery dataset that is different from the source or target dataset.
Provide Optional Properties
Comma-separated key-value pairs of custom properties in the Google BigQuery V2 connection to configure certain source and target functionalities.
For more information about the custom properties that you can specify, see the Optional Properties configuration Knowledge Base article.
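Two of the properties in the table above can be illustrated with a short Python sketch: the Storage Path formats and the way each connection mode presents a Record field. This is a toy model, not the connector's implementation. The function names, the `_` separator used for flattened field names, the JSON rendering of Record values, and the `row` field name in complex mode are all assumptions made for illustration. Repeated (array) fields are not modeled.

```python
import json
import re

# gs://<bucket_name> or gs://<bucket_name>/<folder_name>; intentionally loose,
# it does not enforce Google Cloud Storage bucket naming rules.
STORAGE_PATH = re.compile(r"gs://[^/\s]+(/[^/\s]+)?")


def is_storage_path(path: str) -> bool:
    """Check that a path matches one of the two documented formats."""
    return STORAGE_PATH.fullmatch(path) is not None


def present_row(row: dict, mode: str) -> dict:
    """Toy model of how each connection mode exposes a row to a mapping."""
    if mode == "simple":
        # Each field inside a Record becomes its own field.
        flat = {}
        for name, value in row.items():
            if isinstance(value, dict):
                for child, child_value in present_row(value, "simple").items():
                    flat[f"{name}_{child}"] = child_value  # separator is assumed
            else:
                flat[name] = value
        return flat
    if mode == "hybrid":
        # Top-level fields are kept; a Record becomes one String field.
        return {
            name: json.dumps(value) if isinstance(value, dict) else value
            for name, value in row.items()
        }
    if mode == "complex":
        # The whole row becomes a single String field ("row" is a made-up name).
        return {"row": json.dumps(row)}
    raise ValueError(f"unknown connection mode: {mode}")
```

For a row such as `{"id": 1, "address": {"city": "Oslo"}}`, the simple mode yields the separate fields `id` and `address_city`, the hybrid mode yields `id` plus a single string-valued `address`, and the complex mode yields one string field that holds the whole row.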
Proxy server settings
If your organization uses an outgoing proxy server to connect to the Internet, the Secure Agent connects to Informatica Intelligent Cloud Services through the proxy server.
You can configure the Secure Agent to use the proxy server on Windows and Linux. You can use the unauthenticated or authenticated proxy server.
Use one of the following methods to configure the proxy settings:
- Configure the Secure Agent through the Secure Agent Manager on Windows or shell command on Linux.
- Configure the JVM options for the DTM in the Secure Agent properties. For instructions, see the Proxy server settings Knowledge Base article.
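For the JVM options method, the sketch below generates option strings from the standard Java networking system properties (`http.proxyHost`, `http.proxyPort`, `https.proxyHost`, `https.proxyPort`) for an unauthenticated proxy. That the DTM expects exactly these properties is an assumption, and the helper name is hypothetical. Treat the Knowledge Base article as the authoritative list.

```python
def proxy_jvm_options(host: str, port: int) -> list:
    """Return standard Java proxy system properties as JVM option strings."""
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    return [
        f"-Dhttp.proxyHost={host}",
        f"-Dhttp.proxyPort={port}",
        f"-Dhttps.proxyHost={host}",
        f"-Dhttps.proxyPort={port}",
    ]
```

Under this assumption, each returned string would be entered as the value of one JVMOption property for the DTM.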
Configure proxy settings for NTLM authentication
You can use a proxy server that uses NTLM authentication to connect to Google BigQuery. To configure the proxy settings for NTLM authentication, perform the following steps:
1. In Administrator, select Runtime Environments.
2. From the list of available Secure Agents, select the Secure Agent that you want to configure.
3. In the upper-right corner, click Edit.
4. In the System Configuration Details section, select the Type as DTM for the Data Integration Server.
5. Edit the JVMOption1 property and add the following value:
   -Dhttp.auth.ntlm.domain=<domain name>
6. Select the Type as Platform for the Data Integration Server.
7. Edit the INFA_DEBUG property and add the following value: