Note: If other Google BigQuery driver .jar files exist in that directory, remove them. They're now bundled with Database Ingestion and Replication.
3. Restart the Secure Agent.
•Ensure that you have a service account in your Google account to access Google BigQuery and Google Cloud Storage.
•Ensure that you have the client_email, project_id, private_key, and region ID values for the service account. Enter the values in the corresponding Service Account ID, Project ID, Service Account Key, and Region ID connection properties when you create a Google BigQuery connection.
•If you want to configure a timeout interval for a Google BigQuery connection, specify the timeout interval property in the Provide Optional Properties field of the connection properties. Use the following format:
"timeout": "<timeout_interval_in_seconds>"
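For example, to set a five-minute timeout, you might enter the following in the Provide Optional Properties field (the 300-second value is illustrative):

```
"timeout": "300"
```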
•Verify that you have read and write access to the following entities:
- Google BigQuery datasets that contain the target tables
- Google Cloud Storage path where Database Ingestion and Replication creates the staging file
•To use a Google BigQuery target, you must configure the required permissions. First, create an IAM & Admin service account in your Google Cloud project and assign a custom role to it. Then add the following permissions to that custom role:
- bigquery.datasets.get - To get metadata about a dataset.
- bigquery.jobs.create - To run jobs and queries.
- bigquery.models.create - To create new models.
- bigquery.models.delete - To delete models.
- bigquery.models.export - To export a model.
- bigquery.models.getData - To get model data. The bigquery.models.getMetadata permission must also be specified.
- bigquery.models.getMetadata - To get model metadata. The bigquery.models.getData permission must also be specified.
- bigquery.models.list - To list models and metadata for the models.
- bigquery.models.updateData - To update model data. The bigquery.models.updateMetadata permission must also be specified.
- bigquery.models.updateMetadata - To update model metadata. The bigquery.models.updateData permission must also be specified.
- bigquery.routines.create - To create new routines, including stored procedures.
- bigquery.routines.delete - To delete routines.
- bigquery.routines.get - To get routine definitions and metadata.
- bigquery.routines.list - To list routines and metadata for the routines.
- bigquery.routines.update - To update routine definitions and metadata for the routines.
- bigquery.routines.updateTag - To update tags for routines.
- bigquery.tables.create - To create new tables.
- bigquery.tables.delete - To delete tables.
- bigquery.tables.deleteIndex - To drop search indexes on tables.
- bigquery.tables.deleteSnapshot - To delete table snapshots.
- bigquery.tables.export - To export table data out of BigQuery.
- bigquery.tables.get - To get table metadata. The bigquery.tables.getData permission must also be specified.
- bigquery.tables.getData - To get and query table data. The bigquery.tables.get permission must also be specified.
- bigquery.tables.list - To list tables and metadata for the tables.
- bigquery.tables.update - To update table metadata. The bigquery.tables.updateData permission must also be specified.
- bigquery.tables.updateData - To update table data. The bigquery.tables.update permission must also be specified.
- bigquery.tables.updateTag - To update tags for tables.
- resourcemanager.projects.get - To get the name of the billing account associated with the project.
- storage.objects.create - To create objects.
- storage.objects.delete - To delete objects.
- storage.objects.get - To read object data and metadata.
- storage.objects.list - To list objects in a bucket.
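As a sketch, you can create such a custom role and bind it to the service account with the gcloud CLI. The role ID, project ID, and service account name below are placeholders, and the permission list is abbreviated for readability; include every permission from the list above:

```
# Create a custom role with the required permissions (abbreviated here).
gcloud iam roles create dbmiBigQueryTarget \
    --project=my-project \
    --title="DBMI BigQuery Target" \
    --permissions="bigquery.datasets.get,bigquery.jobs.create,bigquery.tables.create,bigquery.tables.get,bigquery.tables.getData,bigquery.tables.list,storage.objects.create,storage.objects.get,storage.objects.list"

# Bind the custom role to the service account used by the connection.
gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:dbmi-sa@my-project.iam.gserviceaccount.com" \
    --role="projects/my-project/roles/dbmiBigQueryTarget"
```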
Target usage
•Database Ingestion and Replication loads source data in bulk mode to Google BigQuery targets.
•For database ingestion and replication incremental load tasks, make sure that source database Change Data Capture (CDC) is enabled on all source columns.
•Database Ingestion and Replication supports special characters in table and column names in the source tables. The special characters are replaced with an underscore (_) in the Google BigQuery target table or column names.
The custom property key-value pair targetReplacementValue=toHex prevents Database Ingestion and Replication from replacing the special characters with an underscore in the generated target schema and converts the special characters to hexadecimal format.
To convert the special characters to their hexadecimal value, perform the following actions before deploying the database ingestion and replication task:
1. Create a properties file to be used at the metadata-manager layer. Add the targetReplacementValue=toHex key-value pair to the properties file.
2. Open the Runtime Environment page in Administrator and edit the Secure Agent. Create a custom property in the Custom Configuration Details area:
▪ Select the Database Ingestion service.
▪ Select the DBMI_AGENT_ENV type.
▪ Enter DBMI_TASK_OVERRIDE_PROPERTIES as the property name.
▪ Enter the properties file location as the property value.
3. Set the targetReplacementValue custom property to toHex on the Target page of the task wizard.
Before running the task, add the <jobname>. prefix to the targetReplacementValue key in the properties file:
<jobname>.targetReplacementValue=toHex
To apply the property to all jobs, add the alljobs. prefix to the targetReplacementValue key instead:
alljobs.targetReplacementValue=toHex
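The default underscore-replacement behavior described above can be sketched in Python. The character class is an assumption for illustration; the exact set of characters the product treats as special is not specified here:

```python
import re

# Sketch (assumed logic): by default, special characters in source table
# and column names are replaced with an underscore in the target names.
def default_name_replacement(name):
    # Assumption: "special" means anything outside letters, digits,
    # and underscores.
    return re.sub(r"[^A-Za-z0-9_]", "_", name)

print(default_name_replacement("order#total"))  # order_total
```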
•If you select Audit apply mode for a database ingestion and replication task, you can select the audit metadata columns to include in the target tables on the task wizard's Target page under Advanced. If you specify a value in the Prefix for Metadata Columns field, do not include special characters. Otherwise, task deployment fails.
•By default, the following types of source columns are mapped to Google BigQuery string columns with no length specification:
- Source columns that have a character data type
- SAP HANA source columns that have the longdate or timestamp data type
•When you deploy database ingestion and replication jobs, Database Ingestion and Replication generates Google BigQuery target tables that are clustered by primary key or unique key columns by default. Each key column must have one of the following data types that Google BigQuery supports for clustering:
- STRING
- INT64
- NUMERIC
- BIGNUMERIC
- DATE
- DATETIME
- TIMESTAMP
- BOOL
- GEOGRAPHY
If any column in the primary key or unique key has an unsupported data type, that column is skipped during clustering. For example, if the primary key contains the C1, C2, C3, C4, C5 columns and C2 has an unsupported data type, the target table is created with the C1, C3, C4, and C5 columns in the CLUSTER BY clause.
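The column-skipping behavior can be sketched in Python; the function name and logic below are an assumption for illustration, not the product's actual implementation:

```python
# Sketch (assumed logic): keep only key columns whose BigQuery data type
# supports clustering, preserving key order; other columns are skipped.
CLUSTERABLE_TYPES = {
    "STRING", "INT64", "NUMERIC", "BIGNUMERIC",
    "DATE", "DATETIME", "TIMESTAMP", "BOOL", "GEOGRAPHY",
}

def clustering_columns(key_columns):
    """key_columns: list of (name, bigquery_type) tuples."""
    return [name for name, bq_type in key_columns
            if bq_type in CLUSTERABLE_TYPES]

# Example from the text: C2 has an unsupported type (BYTES is assumed
# here) and is skipped from the CLUSTER BY column list.
key = [("C1", "STRING"), ("C2", "BYTES"), ("C3", "INT64"),
       ("C4", "DATE"), ("C5", "NUMERIC")]
print(clustering_columns(key))  # ['C1', 'C3', 'C4', 'C5']
```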