
Troubleshooting a database ingestion and replication task

If you change an unsupported data type of a source column to a supported data type, the change might not be replicated to the target.
This problem occurs when the Modify column schema drift option is set to Replicate and the Add column option is set to Ignore.
Database Ingestion and Replication does not create target columns for source columns that have unsupported data types when you deploy a task. If you change the unsupported data type to a supported data type for the source column later, Database Ingestion and Replication processes the modify column operation on the source but does not replicate the change to the target. When Database Ingestion and Replication tries to add a column with the supported data type to the target, the operation is ignored because the schema drift option Add column is set to Ignore.
To handle this situation, perform the following steps:
  1. On the Schedule and Runtime Options page in the database ingestion and replication task wizard, under Schema Drift Options, set the Add column option to Replicate.
  2. Change the source column data type to a supported data type again so that the database ingestion and replication job can detect this schema change.
    The job processes the DDL operation and creates the new target column.
    Note: The job does not propagate the column values that were added prior to changing the source column data type.
  3. If you want to propagate all of the values from the source column to the target, resynchronize the target table with the source.
If you change a primary key constraint on the source, Database Ingestion and Replication stops processing the source table on which the DDL change occurred.
This problem occurs if you add or drop a primary key constraint, or if you add or drop a column from an existing primary key.
To resume processing the source table for combined initial and incremental jobs, resynchronize the target table with the source.
To resume processing the source table for incremental jobs, perform the following steps:
  1. On the Source tab in the database ingestion and replication task definition, add a table selection rule to exclude the source table.
  2. Redeploy the task.
    Database Ingestion and Replication deploys the edited task and deletes the information about the primary keys of the excluded table.
  3. Edit the task again to delete the table selection rule that excluded the source table.
  4. Redeploy the task.
If a DDL column-level change causes a source table subtask to stop or be in error and then you resume the database ingestion and replication job, the expected change in the table state is delayed.
If a DDL column-level change on a source table causes a table subtask to stop or be in error and you then resume the database ingestion and replication job, the state of the table subtask might remain unchanged until a DML operation occurs on the table. For example, suppose you set a schema drift option to Stop Table for an incremental or combined initial and incremental database ingestion and replication task and then deploy and run the job. When a DDL change occurs on a source table, the job monitoring details show the table subtask in the Error state. If you stop the job and then resume it with a schema drift override to replicate the DDL change, the table subtask temporarily remains in the Error state until the first DML operation occurs on the source table.
Database Ingestion and Replication failed to deploy a task that has a Snowflake target with the following error:
Information schema query returned too much data. Please repeat query with more selective predicates.
This error occurs because of a known Snowflake issue related to schema queries. For more information, see the Snowflake documentation.
In Database Ingestion and Replication, the error can cause the deployment of a database ingestion and replication task that has a Snowflake target to fail when a large number of source tables are selected.
To handle the deployment failure, drop the target tables, update the database ingestion and replication task to select fewer source tables for generating the target tables, and then try to deploy the task again.
A database ingestion and replication job that runs on Linux ends abnormally with the following out-of-memory error:
java.lang.OutOfMemoryError: unable to create new native thread
The maximum number of user processes that is set for the operating system might have been exceeded. If the Linux ulimit value for maximum user processes is not already set to unlimited, set it to unlimited or a higher value. Then resume the job.
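For example, assuming a bash shell and that the job runs under the user account whose process limit is being exceeded, you can display and raise the limit with the ulimit command; raising a hard limit might require administrator access:
ulimit -u
ulimit -u unlimited
To make the setting persistent across logins, a system administrator can also add a nproc entry for that user account in /etc/security/limits.conf.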
If you copy an asset to another location that already includes an asset of the same name, the operation might fail with one of the following errors:
Operation succeeded on 1 artifacts, failed on 1 artifacts.
Operation did not succeed on any of the artifacts.
If you try to copy an asset to another location that already has an asset of the same name, Database Ingestion and Replication displays a warning message that asks if you want to keep both assets, one with a suffix such as "- Copy 1". If you choose to keep both assets, Database Ingestion and Replication validates the name length to ensure that it does not exceed the maximum of 50 characters after the suffix is added. If the name would exceed 50 characters, the copy operation fails. In this case, copy the asset to another location, rename the copy, and then move the renamed asset back to the original location.
A Kafka consumer ends with one of the following errors:
org.apache.avro.AvroTypeException: Invalid default for field meta_data: null not a {"type":"array"...
org.apache.avro.AvroTypeException: Invalid default for field header: null not a {"type":"record"...
This error might occur because the consumer has been upgraded to a new Avro version but still uses the Avro schema files from the older version.
To resolve the problem, use the new Avro schema files that Database Ingestion and Replication provides.
A database ingestion and replication job that propagates incremental change data to a Kafka target that uses Confluent Schema Registry fails with the following error:
io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Register operation timed out; error code: 50002
This problem might occur when the job is processing many source tables, which requires Confluent Schema Registry to process many schemas. To resolve the problem, try increasing the value of the Confluent Schema Registry kafkastore.timeout.ms option. This option sets the timeout for an operation on the Kafka store. For more information, see the Confluent Schema Registry documentation.
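For example, you might raise the timeout in the Schema Registry configuration file, which is typically named schema-registry.properties; the value is in milliseconds, and the value shown here is only an illustration:
kafkastore.timeout.ms=10000
Restart the Schema Registry service after changing the property so that the new value takes effect.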
Subtasks of a database ingestion and replication job that has a Google BigQuery target fail to complete initial load processing of source tables with the following error:
The job has timed out on the server. Try increasing the timeout value.
This problem occurs when the job is configured to process many source tables and the Google BigQuery target connection times out before initial load processing of the source tables is complete. To resolve this problem, increase the timeout interval in the Google BigQuery V2 target connection properties.
  1. In Administrator, open the Google BigQuery V2 connection that is associated with the database ingestion job in Edit mode.
  2. In the Provide Optional Properties field, set the timeout property to the required timeout interval in seconds. Use the following format:
    "timeout": "<timeout_interval_in_seconds>"
  3. Save the connection.
  4. Redeploy the database ingestion and replication task.
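For example, to allow up to 10 minutes for initial load processing before the connection times out, you might enter the following value in the Provide Optional Properties field in step 2; the interval shown is only an example and should be adjusted to your environment:
"timeout": "600"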
A database ingestion and replication task with an Amazon Redshift target returns one of the following errors during deployment:
Database Ingestion and Replication could not find target table 'table_name' which is mapped to source table 'table_name' when deploying the database ingestion task.
com.amazon.redshift.util.RedshiftException: ERROR: Relation "table_name" already exists
This problem occurs because Amazon Redshift reads table and column names as lowercase by default.
To prevent this error, you can set the enable_case_sensitive_identifier parameter to "true" when configuring the database parameter group. For more information about this parameter, see the AWS Amazon Redshift documentation at https://docs.aws.amazon.com/redshift/latest/dg/r_enable_case_sensitive_identifier.html.
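For example, if you want to verify the behavior before changing the parameter group, you can enable the setting at the session level with a standard Amazon Redshift command:
SET enable_case_sensitive_identifier TO true;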
Deployment of a database ingestion and replication task fails if the source table or column names include multibyte or special characters and the target is Databricks Delta.
When a new Databricks target table is created during deployment, an entry is added to the Hive metastore that Databricks uses, which is typically a MySQL database. More specifically, the column names are inserted into the TABLE_PARAMS table of the metastore. The PARAM_VALUE column of TABLE_PARAMS uses the latin1 character set with the latin1_bin collation, which does not support Japanese characters. To resolve the problem, create an external metastore with UTF-8 as the charset and UTF-8_bin as the collation. For more information, see the Databricks documentation at https://docs.microsoft.com/en-us/azure/databricks/kb/metastore/jpn-char-external-metastore and https://kb.databricks.com/metastore/jpn-char-external-metastore.html.
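For example, if the external metastore is backed by MySQL, the database that hosts it could be created with a Unicode character set and a binary collation. This statement is only a sketch; the database name is a placeholder, and the exact character set and collation names depend on your MySQL version:
CREATE DATABASE metastore CHARACTER SET utf8 COLLATE utf8_bin;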
A database ingestion and replication initial load task with an Azure Synapse Analytics target and an Oracle, PostgreSQL, or SAP HANA source that includes tables with Japanese data might write the Japanese data as ??? characters to the target columns
To resolve this issue, edit the task to add custom data type mappings that map the affected source character data types to target data types that can store the Japanese data.
A database ingestion and replication initial load job that has a SQL Server source with XML data and a Microsoft Azure Synapse Analytics target fails
When you run a database ingestion and replication job that has a SQL Server source with XML column data of 500,000 or more single-byte characters and a Microsoft Azure Synapse Analytics target, subtasks might fail when Synapse Analytics tries to process SQL queries for creating the target tables. Before writing data to the target tables, Database Ingestion and Replication truncates XML data to 500,000 bytes by default and adds x bytes of auxiliary metadata. Synapse Analytics stores each source character as 2 bytes and has a maximum row size of 1000000 bytes. As a result, the number of bytes in a row to be written to a target table can be greater than the maximum target row size. In this case, the subtask fails and a trace message reports information such as:
Unexpected error encountered filling record reader buffer: HadoopExecutionException: The size of the schema/row at ordinal 1 is 1000050 bytes. It exceeds the maximum allowed row size of 1000000 bytes for Polybase.
To correct the problem, determine an appropriate lower truncation point and specify it in the unloadClobTruncationSize custom property on the Target page of the task wizard. If only one XML column occurs in a row, decrease the truncation point by the difference between the schema/row size reported in the message and the maximum allowed row size. For example, based on the preceding sample message, you would calculate the lower truncation point for a row with a single XML column as 500000 - 50, or 499950 bytes.
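For example, applying the calculation above, you might set the custom property to the lower truncation point. The name-value pair is shown here in a generic form; enter it in whatever format the Target page of the task wizard expects:
unloadClobTruncationSize=499950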
A database ingestion and replication incremental load job with a Db2 for z/OS source ends abnormally when processor resource limits are exceeded during Informatica stored procedure processing
If you use the Db2 DSNRLSTxx resource limit table to limit the amount of processor resources that are used by SQL operations, such as SELECT, INSERT, UPDATE, and DELETE, on a z/OS source system, database ingestion and replication incremental load jobs with Db2 for z/OS sources might end abnormally. The jobs end if the default or set resource limits are not large enough to accommodate long-running processing of the WLM stored procedure that database ingestion and replication jobs use to process captured change data. If abends related to resource limits occur, perform the following steps:
  1. Add a row to the resource limit table specifically for the database ingestion and replication packages in your runtime environment, with column values that are appropriate for your environment.
  2. For the changes to the resource limit table to take effect, issue the Db2 -START RLIMIT command.
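For example, a DBA might activate the updated limits with a command such as the following, where xx is the suffix of the DSNRLSTxx table that contains the new row; verify the exact syntax for your Db2 for z/OS version:
-START RLIMIT ID=xx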
If you do not have the authority to perform these steps, contact your Db2 DBA or z/OS system programmer.