Google BigQuery Connector can write data to a Google BigQuery target in bulk mode or streaming mode. Before you choose a mode, see the Google documentation to understand the cost implications and trade-offs of each mode.
You can write data to a Google BigQuery target by using one of the following modes:
Bulk mode
Use bulk mode when you want to write large volumes of data in a cost-efficient manner.
In bulk mode, Google BigQuery Connector first writes the data to a staging file in Google Cloud Storage. When the staging file contains all the data, Google BigQuery Connector loads the data from the staging file into the BigQuery target.
When you enable staging file compression, Google BigQuery Connector compresses the staging file, writes the compressed file to Google Cloud Storage, and then submits a load job that loads the data into the BigQuery target.
Note: Enabling compression reduces the time that Google BigQuery Connector takes to write data to Google Cloud Storage. However, loading the compressed data from Google Cloud Storage into the BigQuery target takes longer than loading uncompressed data.
Google BigQuery Connector deletes the staging file unless you configure the task or mapping to persist the staging file. You can choose to persist the staging file if you want to archive the data for future reference.
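The bulk mode flow described above can be illustrated with a minimal Python sketch. It is not the connector's implementation. It assumes a newline-delimited JSON staging format and an append write disposition, and the function names, bucket, and table identifiers are hypothetical. The sketch writes a local staging file with or without gzip compression and builds the load section of a BigQuery load job request. The upload to Google Cloud Storage and the job submission are left out.

```python
import gzip
import json
import os
import tempfile


def write_staging_file(rows, path, compress=False):
    """Write rows as newline-delimited JSON and return the file size in bytes.

    Hypothetical helper: the staging format is an assumption for illustration.
    """
    opener = gzip.open if compress else open
    with opener(path, "wt", encoding="utf-8") as staging:
        for row in rows:
            staging.write(json.dumps(row) + "\n")
    return os.path.getsize(path)


def build_load_config(gcs_uri, project, dataset, table):
    """Build the load configuration for a BigQuery load job request body."""
    return {
        "load": {
            "sourceUris": [gcs_uri],
            "sourceFormat": "NEWLINE_DELIMITED_JSON",
            "destinationTable": {
                "projectId": project,
                "datasetId": dataset,
                "tableId": table,
            },
            # Assumption: append to the target instead of replacing it.
            "writeDisposition": "WRITE_APPEND",
        }
    }


if __name__ == "__main__":
    rows = [{"id": i, "name": "customer"} for i in range(1000)]
    with tempfile.TemporaryDirectory() as tmp:
        plain = write_staging_file(rows, os.path.join(tmp, "staging.json"))
        packed = write_staging_file(
            rows, os.path.join(tmp, "staging.json.gz"), compress=True
        )
    print("uncompressed bytes:", plain)
    print("compressed bytes:", packed)
    # Hypothetical bucket and table names.
    print(build_load_config(
        "gs://example-bucket/staging.json.gz", "my-project", "sales", "orders"
    ))
```

The smaller compressed file is what shortens the write to Google Cloud Storage. The load configuration is the same for both files because the load job reads whichever staging file the URI points to.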
Streaming mode
Use streaming mode when you want the Google BigQuery target data to be immediately available for querying and real-time analysis. Evaluate Google's streaming quota policies and billing policies before you use streaming mode.
In streaming mode, Google BigQuery Connector writes data directly to the BigQuery target and appends the data to the target.
You can configure the number of rows that you want Google BigQuery Connector to stream in one request. If you want to stream more rows than the maximum limit that Google permits, you can write the data to multiple smaller target tables instead of one large target table. To do so, create a template table that Google BigQuery uses as the basis for the target tables, and define a unique suffix for each table. Google BigQuery creates each table based on the template table and adds the suffix to the table name to uniquely identify the table.
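The batching and template table behavior described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the connector's implementation. The field names rows, insertId, json, and templateSuffix come from the request body of the BigQuery streaming insert (tabledata.insertAll) REST method. The function names, the batch size, and the suffix value are hypothetical. The sketch only builds request bodies and does not send them.

```python
import uuid
from itertools import islice


def chunk_rows(rows, rows_per_request):
    """Yield lists of at most rows_per_request rows."""
    if rows_per_request < 1:
        raise ValueError("rows_per_request must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, rows_per_request))
        if not batch:
            return
        yield batch


def build_insert_all_requests(rows, rows_per_request, template_suffix=None):
    """Yield one streaming insert request body per batch of rows.

    Hypothetical helper: when template_suffix is set, the request is meant
    to be sent against the template table, and BigQuery writes to a table
    named after the template table plus the suffix.
    """
    for batch in chunk_rows(rows, rows_per_request):
        body = {
            "rows": [
                # A unique insertId per row lets BigQuery de-duplicate retries.
                {"insertId": str(uuid.uuid4()), "json": row}
                for row in batch
            ]
        }
        if template_suffix is not None:
            body["templateSuffix"] = template_suffix
        yield body


if __name__ == "__main__":
    sample_rows = [{"id": i} for i in range(5)]
    # Hypothetical values: two rows per request, one table per day.
    for request in build_insert_all_requests(sample_rows, 2, "_20240101"):
        print(len(request["rows"]), request.get("templateSuffix"))
```

Keeping the batch size as a parameter mirrors the configurable number of rows per request, and choosing a different suffix per group of rows is what spreads the data across multiple smaller tables.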