Database Ingestion and Replication sources - preparation and usage

Before you configure database ingestion and replication tasks for initial load, incremental load, or combined initial and incremental operations, prepare the source database and review any usage considerations for your sources to avoid unexpected results.

Db2 for i sources

To use Db2 for i sources in database ingestion and replication tasks, first prepare the source database and review the usage considerations.

Source preparation

Usage considerations

Change capture mechanism for Db2 for i sources

Database Ingestion and Replication provides a single change capture mechanism and architecture for capturing change data from a Db2 source on IBM i and applying the data to a target.
The Secure Agent must run on a Linux or Windows box, separate from the IBM i source system. The network bandwidth between the IBM i system and the Secure Agent box must be robust. Informatica recommends network transfer rates that can handle 100s of gigabits or one or more gigabytes of log data. If the network transfer rate is not capable of delivering the log data to the Secure Agent at speeds equal to or greater than the rate at which Db2 produces log data of CDC interest, the database ingestion job will not be able to provide the data to the target in a timely manner. If the data throughput does not meet your SLA, consider changing the hardware to increase the Ethernet bandwidth between the IBM i system and the Secure Agent box.
The following image shows the Db2 for i change capture components and data flow:
  1. The log collector, which runs in the DBMI Agent service under the Secure Agent, sends a request to the Db2 DISPLAY LOG command for Db2 for i journaled data.
     Each request includes a starting RBA along with a list of tables of CDC interest for database ingestion.
  2. The command requests the data for the source tables from the Db2 journals and journal receiver.
  3. The command returns the journal entries containing the data to the UOW (Unit of Work) Manager.
  4. The UOW Manager sends the journaled data in committed transaction order to the log parser.
  5. The log parser parses the DML changes from the committed transactions. Before sending the data to the CDC writer, the log parser transforms it into a canonical form of the Db2 journal data that can be consumed by the DBMI Agent task and applied to the target.
     Note: Because this resource-intensive activity occurs on the Secure Agent box, CPU consumption on the IBM i system is minimized.
  6. The CDC writer applies the formatted data to the target.

Database Ingestion and Replication journal receiver exit

Database Ingestion and Replication provides an exit program to prevent the deletion of Db2 for i journal receivers that are in use for change data capture processing. Use the journal receiver exit to prevent potential errors related to journal receiver deletion during CDC.
The exit program locks the journal receivers while they're being read by database ingestion and replication incremental load or combined load jobs. If the exit program finds a lock record for a journal receiver during CDC processing, the exit program returns a response that prevents the system from deleting the journal receiver.
When the database ingestion and replication job switches to the next journal receiver on the chain, Database Ingestion and Replication removes the lock record for the journal receiver that was last read and adds a lock record for the next journal receiver to be read. The system will not remove any journal receivers that are created subsequent to the currently locked receiver.
The control information for the journal receiver exit is stored in a Db2 for i table on the source system. The table contains the following columns:
Each combination of JOURNAL_RECEIVER_LIBRARY and JOURNAL_RECEIVER_NAME entries identifies a journal receiver instance that will be locked so that it can't be deleted during change data capture.
To use the journal receiver exit, perform the following steps:
  1. Install the journal receiver exit. The installation process registers the Delete Journal Receiver exit program at the QIBM_QJO_DLT_JRNRCV exit point. For more information, see Installing the Database Ingestion and Replication journal receiver exit.
  2. Specify the following custom properties on the Source page of the task wizard, for each task with a Db2 for i CDC source:
    pwx.cdcreader.iseries.option.useJournalReceiverExit
    Set this property to true to enable the use of the journal receiver exit for job instances deployed from the task.
    pwx.cdcreader.iseries.option.JournalReceiverExitJobToken
    When pwx.cdcreader.iseries.option.useJournalReceiverExit is set to true, you must specify a token string up to 256 characters in length that is unique for each database ingestion and replication job instance.
    If the token string is not unique across all job instances, unpredictable results can occur, especially when multiple jobs are accessing the same journal receivers.
    Also, ensure that the token value remains the same after a job Resume or Redeploy operation.
    pwx.cdcreader.iseries.option.useJournalReceiverQueries
    Set this property to true to enable Database Ingestion and Replication to check the maintenance level on the Db2 machine to determine if it's capable of doing journal receiver queries.
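For example, the following custom property settings enable the journal receiver exit for the job instances deployed from a task and assign a job token. The property names are those listed above; the token value is only an illustration and must be unique across all of your job instances:
pwx.cdcreader.iseries.option.useJournalReceiverExit=true
pwx.cdcreader.iseries.option.JournalReceiverExitJobToken=ORDERS_CDC_JOB_01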

Installing the Database Ingestion and Replication journal receiver exit

Use the Database Ingestion and Replication journal receiver exit to prevent the deletion of Db2 for i journal receivers while they're in use for change data capture. To manually install the journal receiver exit, complete the following steps:
  1. Unzip the V01_Exit_Install file to a local folder on your computer.
  2. Run the SQL statements in the SQL_1.txt file to check if the system has enough space for the installation. A return value of 1 means the system has enough space.
  3. Run the SQL statements in the SQL_2.txt file to create the schema and the default journal and journal receiver for journal receiver exit point handling.
  4. Run the FTP commands in the FTP_3.txt file to install the journal receiver exit.
     Note: Before starting FTP, navigate to the directory that contains the IBMi_SaveFile_V01.savf file, which holds the journal receiver exit program.
  5. Run the SQL statements in the SQL_4.txt file to add the exit point to the IBM i system.
     Note: Before running the SQL, replace <userId> with a valid user ID for the system.
  6. Use the SQL statements in the SQL_5.txt file to create the Db2 objects used for locking journal receivers to prevent them from being deleted.

Db2 for LUW sources

To use Db2 for Linux, UNIX, and Windows (LUW) sources in database ingestion and replication tasks, first prepare the source database and review the usage considerations.

Source preparation

Usage considerations

Db2 for z/OS sources

To use Db2 for z/OS sources in database ingestion and replication tasks, first prepare the source database and review the usage considerations.

Source preparation

Usage considerations

Installing and configuring the stored procedure for Db2 for z/OS CDC

To perform Db2 for z/OS CDC processing for incremental load jobs, Database Ingestion and Replication provides a stored procedure that runs on the z/OS source system. The stored procedure calls the Db2 Instrumentation Facility Interface (IFI) to collect change data from the Db2 logs.

z/OS system requirements

Before you begin, verify that the Db2 for z/OS source system meets the following requirements:

Install the Stored Procedure Libraries and Customize the JCL

On your client machine, perform the following steps:
  1. Verify that the Db2 for zOS Database Ingestion connector is available.
     The Db2 for zOS Database Ingestion connector package contains the Db2 for z/OS stored procedure libraries. When the connector is enabled for the Secure Agent Group (runtime environment), the connector package .zip file is downloaded to the downloads folder in the installation location. The package name has the format package-DB2ZMI.nnnnn, where nnnnn is an incremented package version number. If multiple package versions exist, use the latest one.
  2. Unzip the package-DB2ZMI.nnnnn.zip file. The stored procedure files are added to the Db2WLMStoredProcedure folder under the package name.
  3. Use FTP to transfer the #STPINST file in the Db2WLMStoredProcedure folder to a sequential file, PDS, or PDSE on the z/OS system.
     Notes:
  4. Use FTP or another file transfer method to transfer each of the following files to a separate data set on the z/OS system:
     Notes:
On the z/OS system, use TSO to receive the transmit (XMI) data sets into an APF-authorized library and then edit the stored procedure JCL member. Also, set Db2 user privileges and add a row to the resource limit table if used.
  1. Receive the DBRMLIB transmit data set:
     RECEIVE INDATASET(DBMI.ZOS.DBRMLIB.XMI)
     Notes:
  2. Receive the LOADLIB transmit data set:
     RECEIVE INDATASET(DBMI.ZOS.LOADLIB.XMI)
     See the Notes in Step 1.
  3. Receive the USERJCL transmit data set:
     RECEIVE INDATASET(DBMI.ZOS.USERJCL.XMI)
     See the Notes in Step 1.
  4. Customize the #STPINST file that contains the stored procedure JCL for your environment.
     The JCL creates the stored procedure and a global temporary table that holds the results of the requests to the Db2 IFI for data. It also binds the stored procedure package.
     Notes:
  5. Customize the JCL in members of the received USERJCL data set. For more information, see Db2 for z/OS USERJCL Data Set.
  6. Ensure that the required Db2 privileges have been granted before you run the stored procedure JCL job. For more information, see Db2 for z/OS privileges.
  7. Add a row in the Db2 DSNRLSTxx resource limit table for database ingestion packages to ensure processor resources are adequate for stored procedure processing. Otherwise, incremental load jobs might abend. Add the row with the following columns:
     Then issue the Db2 -START RLIMIT command for the changes to the resource limit table to take effect.
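For example, if you added the row to a resource limit table named DSNRLST01, the following command puts the change into effect. The numeric suffix shown is an assumption; substitute the suffix of your resource limit table:
-START RLIMIT ID=01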

Db2 for z/OS USERJCL Data Set

The downloaded USERJCL data set is a partitioned data set (PDS) or extended partitioned data set (PDSE) that contains a JCL member for running a job that collects information that is not otherwise readily available when you use the Informatica-supplied WLM Db2 stored procedure.
You can fully customize the member JCL in the USERJCL PDS or PDSE. Informatica recommends that you create a copy of a delivered JCL member under another name and then customize the copy so that the original member remains intact for reference.
If you install multiple stored procedures on multiple Db2 subsystems, you can install only one USERJCL library and create members in it that are tailored for each Db2 subsystem. Alternatively, you can create a separate library for a specific Db2 subsystem. Make sure that the database ingestion and replication job contains the correct library and member information for the Db2 subsystem.
The USERJCL PDS or PDSE contains the following member:
LOGINV member
LOGINV contains JCL for a job that obtains a Db2 log inventory list. If you set the Initial Start Point for Incremental Load property to Specific Date and Time for an ingestion task in the task wizard, the inventory list is used to determine the start point in the logs when the database ingestion and replication job runs for the first time or restarts. The log inventory list provides Database Ingestion and Replication with a starting RBA or LSN and an ending RBA or LRSN, along with the starting and ending timestamps of all active and archive logs for the Db2 subsystem. Database Ingestion and Replication uses this information to select the appropriate log archive to search for the requested start point. If the Initial Start Point for Incremental Load property is set to any option other than Specific Date and Time, you do not need to install the USERJCL library.
Contents of the LOGINV member:
//<USERID>I JOB 'LOG INVENTORY',MSGLEVEL=(1,1),MSGCLASS=X,
// NOTIFY=&SYSUID,CLASS=A,REGION=0M
//* -------------------------------------------------------
//*
//* PLEASE CHANGE DSN? TO THE DB2 SUBSYSTEM OF YOUR CHOICE.
//*
//* THIS JCL MEMBER CAN BE USED TO RUN A LOG INVENTORY
//* LIST. DBMI WILL REQUEST THIS IF A DBMI JOB IS TRYING TO
//* RESTART USING TIME. THE LOG INVENTORY GIVES DBMI THE
//* ABILITY TO CORRELATE RBA/LRSN TO ACTUAL LOG RESTART
//* POSITIONS. DBMI PARSES THE SYSPRINT OUTPUT TO GET THE
//* REQUIRED INFORMATION.
//*
//*
//* SUBSTITUTION TAGS
//* _________________
//*
//* SUBSTITUTION TAGS ARE USED TO INSERT DATA INTO THE JOB
//* BY DBMI BEFORE IT IS SUBMITTED. YOU MAY COPY THIS JCL
//* INTO ANOTHER MEMBER AND MODIFY ITS CONTENTS. SUBSTITUTION
//* TAGS MAY ALSO BE REMOVED AND HARD CODED FOR INDIVIDUAL
//* JOB NEEDS.
//*
//*
//* <USERID> WILL BE REPLACED WITH THE USER ID.
//*
//* -------------------------------------------------------
//LOGINV EXEC PGM=DSNJU004
//STEPLIB DD DISP=SHR,DSN=DSN?10.SDSNLOAD
// DD DISP=SHR,DSN=DSN?10.DSNC.RUNLIB.LOAD
//SYSUT1 DD DISP=SHR,DSN=DSN?.BSDS01
//SYSPRINT DD SYSOUT=*
//SYSSUMRY DD SYSOUT=*
//
To submit USERJCL member jobs and retrieve output from submitted jobs, Database Ingestion and Replication executes Db2-supplied stored procedures in batch as required. Make sure that the Database Ingestion and Replication user has the required privileges to execute these stored procedures. For more information about Db2-supplied procedures, see https://www.ibm.com/docs/en/db2-for-zos/12?topic=sql-procedures-that-are-supplied-db2.
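For example, the following GRANT statements give a user named cmid_user (an assumed user name) the EXECUTE privilege on the Db2-supplied stored procedures that are used for USERJCL processing, assuming the procedures reside in the default SYSPROC schema:
GRANT EXECUTE ON PROCEDURE SYSPROC.ADMIN_DS_BROWSE TO cmid_user;
GRANT EXECUTE ON PROCEDURE SYSPROC.ADMIN_SUBMIT_JOB TO cmid_user;
GRANT EXECUTE ON PROCEDURE SYSPROC.ADMIN_JOB_QUERY TO cmid_user;
GRANT EXECUTE ON PROCEDURE SYSPROC.ADMIN_JOB_FETCH TO cmid_user;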
A USERJCL member is processed in the following manner:
  1. For the LOGINV member, if the Initial Start Point for Incremental Load advanced property is set to Specific Date and Time on the Source page of the database ingestion and replication task wizard, Database Ingestion and Replication checks whether the following custom properties are specified for the source:
  2. Database Ingestion and Replication reads the LOGINV member by using the Db2-supplied stored procedure ADMIN_DS_BROWSE.
  3. Database Ingestion and Replication substitutes any tags required for the job execution.
  4. Database Ingestion and Replication submits the job to the Db2 for z/OS database by using the Db2-supplied stored procedure ADMIN_SUBMIT_JOB.
  5. Database Ingestion and Replication queries the status of the submitted jobs by using the Db2-supplied stored procedure ADMIN_JOB_QUERY.
  6. After the job completes, Database Ingestion and Replication obtains the job output by using the Db2-supplied stored procedure ADMIN_JOB_FETCH.

Db2 for z/OS privileges

To deploy and run a database ingestion and replication task that has a Db2 for z/OS source, the source connection must specify a Database Ingestion and Replication user who has the privileges required for the ingestion load type.
Privileges for initial load processing
Privileges for incremental load processing
For incremental load jobs, grant the following privileges.
Note: If you do not have the authority required to issue these grants, ask a Db2 administrator who has SYSADM authority or a higher authority level to issue them.
Permissions Required for Executing the Stored Procedure on z/OS
Ensure that the following Db2 permissions are granted before you run the stored procedure JCL job:
Additionally, grant INSERT and DELETE authority on the global temporary table, whose name is formed from the schema and stored procedure name that are specified in the JCL:
GRANT INSERT, DELETE ON schema.stored_procedure_name_RS_TBL TO user

Change capture mechanism for Db2 for z/OS sources

Database Ingestion and Replication provides a single change capture mechanism and architecture for capturing change data from a Db2 source on z/OS and applying the data to a target. This architecture uses multithreaded processing to optimize the performance of collecting data and of parsing and transforming the data into a format the target accepts.
The Secure Agent must run on a Linux or Windows box, separate from the Db2 z/OS source system. The network bandwidth between the z/OS system and the Secure Agent box must be robust. Informatica recommends network transfer rates that can handle 100s of gigabytes or one or more gigabits of log data. If the network transfer rate is not capable of delivering the log data to the Secure Agent at speeds equal to or greater than the rate at which Db2 produces log data of CDC interest, the database ingestion and replication job will not be able to provide the data to the target in a timely manner. If the data throughput does not meet your SLA, consider changing the hardware to increase the Ethernet bandwidth between the z/OS system and the Secure Agent box.
The following image shows the Db2 for z/OS change capture components and data flow:
  1. The multithreaded log collector, which runs in the DBMI Agent service under the Secure Agent, issues multiple concurrent requests for Db2 log data to the Db2 stored procedure.
     Each request includes a starting RBA, or a starting LRSN if the logs are in a data-sharing environment, along with a list of tables of CDC interest for database ingestion.
     While processing a sequence of log data, the log collector can request the next sequence of log data.
  2. The Db2 Instrumentation Facility Interface (IFI) API extracts data for the selected source tables of CDC interest from active and archive logs. The IFI then transfers the data, in raw, native form, to the Informatica Db2 stored procedure in z/OS Workload Manager (WLM).
  3. The Db2 stored procedure returns the UOWs with captured data to the UOW Manager.
  4. The UOW Manager sends the UOWs in commit order to the log parser.
  5. The multithreaded log parser concurrently parses DML changes from the committed UOWs. The result is the raw, native Db2 log data transformed into the format that the target type expects.
     Note: Because this resource-intensive activity occurs on the Secure Agent box, CPU consumption on the z/OS system is minimized.
  6. The CDC writer applies the formatted data to the target.

Microsoft SQL Server, RDS for SQL Server, Azure SQL Database, and Azure Managed Instance sources

To use Microsoft SQL Server sources in database ingestion and replication tasks, first prepare the source database and review the usage considerations. SQL Server source types include on-premises SQL Server, Relational Database Service (RDS) for SQL Server, Azure SQL Database, and Azure SQL Managed Instance.

Source preparation

Usage considerations

Change capture mechanisms for SQL Server sources

Database Ingestion and Replication provides multiple change capture mechanisms for capturing change data from a SQL Server source and applying the data to a target.

Log-based change data capture with transaction log and CDC tables

Database ingestion and replication incremental load jobs that use log-based CDC capture DML and DDL changes by parsing records from the active transaction log and by reading change records directly from the CDC tables. Change data is read from the active transaction log if the required restart point (LSN) is available there. If the capture start point pre-dates records in the transaction log or in certain other situations, Database Ingestion and Replication automatically transitions to reading change data from the CDC tables. After reading changes from the CDC tables, Database Ingestion and Replication switches back to reading changes from the active transaction log in a transparent manner.
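For context, the cdc.<capture_instance>_CT tables exist only after SQL Server CDC has been enabled for the database and for each source table. The following sketch shows how a capture instance is typically created; the dbo.Orders table name is an assumption, and your DBA might choose different options:
-- Enable CDC at the database level (run once per database).
EXEC sys.sp_cdc_enable_db;
-- Enable CDC for a source table. SQL Server creates the cdc.dbo_Orders_CT change table for this capture instance.
EXEC sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'Orders', @role_name = NULL;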
The following components are involved in log-based change capture processing:
The following image shows the SQL Server log-based change data capture components and data flow:
  1. The Log Reader process reads log records to capture DML and DDL changes in committed transactions.
     Capture processing continues until the committed transactions are complete, the capture process is stopped or interrupted by a fatal error, or a switch to reading change records from a cdc.<capture_instance>_CT table is triggered.
  2. In certain situations, Database Ingestion and Replication automatically switches to reading changes from a cdc.<capture_instance>_CT table by using the CDC Capture-Instance Table Reader.
     Processing switches to the cdc.<capture_instance>_CT table under the following conditions:
  3. The Log Reader and CDC Capture-Instance Table Reader send change records to the CDC Writer.
  4. The CDC Writer formats the change records and applies them to the target.

Change capture with CDC tables only

Database ingestion and replication incremental load jobs can capture changes directly from the SQL Server CDC tables without using the transaction log.
The following components are involved in the change capture processing that uses CDC tables only:
The following image shows the change data capture components and data flow when changes are captured from the CDC tables only:
  1. The CDC Capture-Instance Table Reader reads changes from a cdc.<capture_instance>_CT table.
  2. The CDC Capture-Instance Table Reader sends change records to the CDC Writer.
  3. The CDC Writer formats the change records and applies them to the target.
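To illustrate the kind of data that the CDC Capture-Instance Table Reader consumes, the following hedged query reads all change rows that a capture instance recorded between two LSNs. The dbo_Orders capture instance name is an assumption, and the query is only an approximation of the reader's behavior, not the exact SQL that the product issues:
DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn(N'dbo_Orders');
DECLARE @to_lsn binary(10) = sys.fn_cdc_get_max_lsn();
SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');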

Query-based change capture

Database ingestion and replication jobs capture insert and update changes in the source by querying a timestamp column that is updated when a change occurs. Configuration of the source is limited to adding the common CDC query column to each source table. The query-based CDC method uses the query column to identify the rows that changed since the beginning of a specified CDC interval.
To implement the query-based change capture, set the following options on the Source page of the task wizard:
After the CDC interval elapses, Database Ingestion and Replication uses a SQL query with a WHERE clause that references the CDC query column to identify the rows that received changes during the CDC interval. The change data is captured and applied to the target.
If the source tables selected for the query-based CDC do not have the CDC query column, change data capture will ignore these tables and will continue with the rest of the tables. For the tables that are skipped, corresponding tables generated in the target database will be empty. If none of the source tables have a CDC query column, the job will fail at runtime.
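For illustration only, assume that the CDC query column is named LAST_UPDATE_TS and that the source table is dbo.Orders (both names are assumptions). Conceptually, the query that identifies changed rows for one CDC interval resembles the following; it is not the exact SQL that the product generates:
SELECT * FROM dbo.Orders
WHERE LAST_UPDATE_TS > @previous_interval_end
  AND LAST_UPDATE_TS <= @current_interval_end;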
The following image shows the SQL Server query-based change data capture components and data flow:
  1. After the CDC interval elapses, Database Ingestion and Replication executes a SQL query in the source database that uses the CDC query column to extract the change data.
  2. Change records are sent to the CDC Writer.
  3. The CDC Writer formats the change records and applies them to the target.

MongoDB sources

To use MongoDB sources in database ingestion and replication tasks, review the following considerations.

Source preparation

{ resource: { db: "", collection: "" }, actions: [ "find", "changeStream" ] }

Usage considerations

MySQL sources

To use MySQL sources in database ingestion and replication tasks, first prepare the source database and review the usage considerations.

Source preparation

Usage considerations

Netezza sources

To use Netezza sources in database ingestion and replication tasks, first prepare the source database and review the usage considerations.

Source preparation

Usage considerations

Oracle sources

To use Oracle sources in database ingestion and replication tasks, first prepare the source database and review the usage considerations.

Source preparation

Amazon Relational Database Service (RDS) for Oracle source preparation:
  1. Create the ONLINELOG_DIR and ARCHIVELOG_DIR directories that will hold the online and archive redo logs, respectively, on the RDS file system. Use the following exec statements:
     exec rdsadmin.rdsadmin_master_util.create_archivelog_dir;
     exec rdsadmin.rdsadmin_master_util.create_onlinelog_dir;
  2. Grant the Oracle privileges that are required for the Amazon RDS for Oracle source type to the Database Ingestion and Replication user.
     For more information about the privileges required for an Amazon RDS for Oracle source, see Oracle privileges for Amazon RDS for Oracle sources.
  3. Define an appropriate retention time for the archived redo logs. Use the following exec statement:
     exec rdsadmin.rdsadmin_util.set_configuration('archivelog retention days',number_of_days);
  4. In the Amazon RDS console, set the backup retention period for the source database to a value greater than zero to enable automated backups of the database instance.
     Note: This step enables ARCHIVELOG mode for the database.
  5. Ensure that supplemental logging is enabled at the database level. Use the following statement:
     exec rdsadmin.rdsadmin_util.alter_supplemental_logging('ADD');
     When you create a database ingestion and replication task, you can generate a script to enable supplemental logging for the selected source tables.
  6. Optionally, in the Amazon RDS console, create a parameter group and define the cache sizes of the default buffer pool. The default buffer pool holds buffers that use the primary database block size. Use the following DB_CACHE_SIZE parameter values:
     Then select the parameter group for the source database.

Usage considerations

Gathering Information About the Database Ingestion and Replication environment

Before you start creating database ingestion and replication tasks, gather the following information:

General Information

What Oracle version do you use?
Answer: ___________________________________________________
Do you run Oracle on premises or in a cloud-based Amazon RDS for Oracle environment?
Answer: ___________________________________________________
What is the target type?
Answer: ___________________________________________________
What type of load operation do you plan to perform: an initial load (point-in-time bulk load), incremental load (only the changes), or combined initial and incremental load (initial load followed by incremental load)?
Answer: ___________________________________________________
What are the number of cores, the amount of memory, and the disk space on the system where the Secure Agent will run?
Answer: ___________________________________________________

Oracle environment

What are the host name and port number of the Oracle source database server?
Answer: ___________________________________________________
What is the Oracle system identifier (SID) for the database?
Answer: ___________________________________________________
What are the Oracle database user name and password to use for connecting to the database?
Answer: ___________________________________________________
Is the Oracle Database Client or Instant Client installed on the system where the Secure Agent will run?
Answer: ___________________________________________________
Does the database run in an Oracle Real Application Cluster (RAC)? What's the maximum number of RAC members, including inactive nodes?
Answer: ___________________________________________________
Do you need to capture change data from an Oracle Data Guard logical or physical standby database?
Answer: ___________________________________________________
Do you need to capture change data from tables in a pluggable database (PDB) in an Oracle multitenant environment?
Answer: ___________________________________________________
Do you need to capture change data from tablespaces that use Oracle Transparent Data Encryption (TDE)? If yes, what are the TDE wallet directory and password?
Answer: ___________________________________________________
What is the typical size of units of work (UOWs) for the source tables?
Answer: ___________________________________________________

Oracle redo logs

Are the redo logs in an Oracle Automatic Storage Management (ASM) environment? If you plan to connect to an ASM instance to read redo logs, are you allowed to create a login user ID for ASM that has SYSDBA or SYSASM authority?
Answer: ___________________________________________________
Are ARCHIVELOG mode and minimal global supplemental logging enabled for the Oracle source database? If not, can they be enabled?
Answer: ___________________________________________________
What are the primary and secondary archive log destinations for the archived redo logs from which you want to read change data?
Answer: ___________________________________________________
What is the average amount of archived redo log that is created per hour during peak and non-peak periods for the Oracle database?
Answer: ___________________________________________________
Do you have read access to the redo logs in your environment?
Answer: ___________________________________________________
If you do not have the authority to read the redo logs directly, can the archived redo log files be copied to shared disk or to a file system from which you can access them?
Answer: ___________________________________________________
Do you want Database Ingestion and Replication to read change data from the online log as well as the archived logs?
Answer: ___________________________________________________
Can you make your archived redo logs available for diagnostic use by Informatica Global Customer Support, if necessary, to diagnose an error or anomaly during CDC processing?
Answer: ___________________________________________________

More details for configuring database ingestion

What is the schema name for the source tables from which to replicate data?
Answer: ___________________________________________________
Do you want to replicate data from all tables in the schema or a subset of those tables? If a subset, create a list of them.
Answer: ___________________________________________________
Do the source tables have primary keys? Can supplemental logging be enabled for all of the primary keys?
Answer: ___________________________________________________
Do you have any unkeyed source tables?
Answer: ___________________________________________________
Do the source tables contain columns that have unsupported data types? To determine which data types are not supported for your source types, see the source-specific topics under "Database Ingestion and Replication source considerations" in the Database Ingestion and Replication help.
Answer: ___________________________________________________
Is the default code page of UTF-8 acceptable? If not, which code page do you want to use?
Answer: ___________________________________________________
Do you want to use SSL to encrypt data exchanged between the Secure Agent and the database server? Which SSL or TLS protocol do you use?
Answer: ___________________________________________________
Are you allowed to create a new Oracle user and assign the privileges that Database Ingestion and Replication requires to that user? Determine the user name to use.
Answer: ___________________________________________________
Do the source tables contain any Oracle data types that Database Ingestion and Replication does not support?
Answer: ___________________________________________________
Do you want to capture schema drift changes on the source, including add, drop, modify, and rename column operations?
Answer: ___________________________________________________

Oracle privileges

To deploy and run a database ingestion and replication task that has an Oracle source, the source connection must specify a Database Ingestion and Replication user who has the privileges required for the ingestion load type.
Privileges for incremental load processing with log-based CDC
Note: If the Oracle logs are managed by ASM, the user must have SYSASM or SYSDBA authority.
For a database ingestion and replication task that performs an incremental load or combined initial and incremental load using the log-based CDC method, ensure that the Database Ingestion and Replication user (cmid_user) has been granted the following privileges:
GRANT CREATE SESSION TO <cmid_user>;

GRANT SELECT ON table TO <cmid_user>; -- For each source table created by user

-- The following grants are required for combined initial and incremental loads only. Do not
-- use ANY TABLE unless your security policy allows it.
GRANT EXECUTE ON DBMS_FLASHBACK TO <cmid_user>;
GRANT FLASHBACK ON table|ANY TABLE TO <cmid_user>;

-- Include the following grant only if you want to Execute the CDC script for enabling
-- supplemental logging from the user interface. If you manually enable supplemental
-- logging, this grant is not needed.
GRANT ALTER table|ANY TABLE TO <cmid_user>;

GRANT SELECT ON DBA_CONSTRAINTS TO <cmid_user>;
GRANT SELECT ON DBA_CONS_COLUMNS TO <cmid_user>;
GRANT SELECT ON DBA_INDEXES TO <cmid_user>;
GRANT SELECT ON DBA_LOG_GROUPS TO <cmid_user>;
GRANT SELECT ON DBA_LOG_GROUP_COLUMNS TO <cmid_user>;
GRANT SELECT ON DBA_OBJECTS TO <cmid_user>;
GRANT SELECT ON DBA_OBJECT_TABLES TO <cmid_user>;
GRANT SELECT ON DBA_TABLES TO <cmid_user>;
GRANT SELECT ON DBA_TABLESPACES TO <cmid_user>;
GRANT SELECT ON DBA_USERS TO <cmid_user>;

GRANT SELECT ON "PUBLIC".V$ARCHIVED_LOG TO <cmid_user>;
GRANT SELECT ON "PUBLIC".V$CONTAINERS TO <cmid_user>; -- For Oracle multitenant environments
GRANT SELECT ON "PUBLIC".V$DATABASE TO <cmid_user>;
GRANT SELECT ON "PUBLIC".V$DATABASE_INCARNATION TO <cmid_user>;
GRANT SELECT ON "PUBLIC".V$ENCRYPTION_WALLET TO <cmid_user>; -- For Oracle TDE access
GRANT SELECT ON "PUBLIC".V$LOG TO <cmid_user>;
GRANT SELECT ON "PUBLIC".V$LOGFILE TO <cmid_user>;
GRANT SELECT ON "PUBLIC".V$PARAMETER TO <cmid_user>;
GRANT SELECT ON "PUBLIC".V$PDBS TO <cmid_user>; -- For Oracle multitenant environments
GRANT SELECT ON "PUBLIC".V$SPPARAMETER TO <cmid_user>;
GRANT SELECT ON "PUBLIC".V$STANDBY_LOG TO <cmid_user>;
GRANT SELECT ON "PUBLIC".V$THREAD TO <cmid_user>;
GRANT SELECT ON "PUBLIC".V$TRANSACTION TO <cmid_user>;
GRANT SELECT ON "PUBLIC".V$TRANSPORTABLE_PLATFORM TO <cmid_user>;
GRANT SELECT ON "PUBLIC".V$VERSION TO <cmid_user>;

GRANT SELECT ON SYS.ATTRCOL$ TO <cmid_user>;
GRANT SELECT ON SYS.CCOL$ TO <cmid_user>;
GRANT SELECT ON SYS.CDEF$ TO <cmid_user>;
GRANT SELECT ON SYS.COL$ TO <cmid_user>;
GRANT SELECT ON SYS.COLTYPE$ TO <cmid_user>;
GRANT SELECT ON SYS.IDNSEQ$ TO <cmid_user>;
GRANT SELECT ON SYS.IND$ TO <cmid_user>;
GRANT SELECT ON SYS.INDPART$ TO <cmid_user>;
GRANT SELECT ON SYS.OBJ$ TO <cmid_user>;
GRANT SELECT ON SYS.PARTOBJ$ TO <cmid_user>;
GRANT SELECT ON SYS.RECYCLEBIN$ TO <cmid_user>;
GRANT SELECT ON SYS.TAB$ TO <cmid_user>;
GRANT SELECT ON SYS.TABCOMPART$ TO <cmid_user>;
GRANT SELECT ON SYS.TABPART$ TO <cmid_user>;
GRANT SELECT ON SYS.TABSUBPART$ TO <cmid_user>;

-- Also ensure that you have access to the following ALL_* views:
ALL_CONSTRAINTS
ALL_CONS_COLUMNS
ALL_ENCRYPTED_COLUMNS
ALL_INDEXES
ALL_IND_COLUMNS
ALL_OBJECTS
ALL_TABLES
ALL_TAB_COLS
ALL_TAB_PARTITIONS
ALL_USERS
Privileges for incremental load processing with query-based CDC
For a database ingestion and replication task that performs an incremental load or combined initial and incremental load using the query-based CDC method, ensure that the user has the following privileges at minimum:
GRANT CREATE SESSION TO <cmid_user>;

GRANT SELECT ON DBA_INDEXES TO <cmid_user>;
GRANT SELECT ON DBA_OBJECT_TABLES TO <cmid_user>;
GRANT SELECT ON DBA_OBJECTS TO <cmid_user>;
GRANT SELECT ON DBA_TABLES TO <cmid_user>;
GRANT SELECT ON DBA_USERS TO <cmid_user>;
GRANT SELECT ON DBA_VIEWS TO <cmid_user>; -- Only if you unload data from views

GRANT SELECT ANY TABLE TO <cmid_user>;
-or-
GRANT SELECT ON table TO <cmid_user>; -- For each source table created by user

GRANT SELECT ON ALL_CONSTRAINTS TO <cmid_user>;
GRANT SELECT ON ALL_CONS_COLUMNS TO <cmid_user>;
GRANT SELECT ON ALL_ENCRYPTED_COLUMNS TO <cmid_user>;
GRANT SELECT ON ALL_IND_COLUMNS TO <cmid_user>;
GRANT SELECT ON ALL_INDEXES TO <cmid_user>;
GRANT SELECT ON ALL_OBJECTS TO <cmid_user>;
GRANT SELECT ON ALL_TAB_COLS TO <cmid_user>;
GRANT SELECT ON ALL_USERS TO <cmid_user>;

GRANT SELECT ON "PUBLIC"."V$DATABASE" TO cmid_user;
GRANT SELECT ON "PUBLIC"."V$CONTAINERS" TO cmid_user;
GRANT SELECT ON SYS.ATTRCOL$ TO <cmid_user>;
GRANT SELECT ON SYS.CCOL$ TO <cmid_user>;
GRANT SELECT ON SYS.CDEF$ TO <cmid_user>;
GRANT SELECT ON SYS.COL$ TO <cmid_user>;
GRANT SELECT ON SYS.COLTYPE$ TO <cmid_user>;
GRANT SELECT ON SYS.IND$ TO <cmid_user>;
GRANT SELECT ON SYS.IDNSEQ$ TO <cmid_user>;
GRANT SELECT ON SYS.OBJ$ TO <cmid_user>;
GRANT SELECT ON SYS.RECYCLEBIN$ TO <cmid_user>;
GRANT SELECT ON SYS.TAB$ TO <cmid_user>;
Privileges for initial load processing
For a database ingestion and replication task that performs an initial load, ensure that the user has the following privileges at minimum:
GRANT CREATE SESSION TO <cmid_user>;

GRANT SELECT ON DBA_INDEXES TO <cmid_user>;
GRANT SELECT ON DBA_OBJECT_TABLES TO <cmid_user>;
GRANT SELECT ON DBA_OBJECTS TO <cmid_user>;
GRANT SELECT ON DBA_TABLES TO <cmid_user>;
GRANT SELECT ON DBA_USERS TO <cmid_user>;
GRANT SELECT ON DBA_VIEWS TO <cmid_user>; -- Only if you unload data from views

GRANT SELECT ANY TABLE TO <cmid_user>;
-or-
GRANT SELECT ON table TO <cmid_user>; -- For each source table created by user

GRANT SELECT ON ALL_CONSTRAINTS TO <cmid_user>;
GRANT SELECT ON ALL_CONS_COLUMNS TO <cmid_user>;
GRANT SELECT ON ALL_ENCRYPTED_COLUMNS TO <cmid_user>;
GRANT SELECT ON ALL_IND_COLUMNS TO <cmid_user>;
GRANT SELECT ON ALL_INDEXES TO <cmid_user>;
GRANT SELECT ON ALL_OBJECTS TO <cmid_user>;
GRANT SELECT ON ALL_TAB_COLS TO <cmid_user>;
GRANT SELECT ON ALL_USERS TO <cmid_user>;

GRANT SELECT ON "PUBLIC"."V$DATABASE" TO cmid_user;
GRANT SELECT ON "PUBLIC"."V$CONTAINERS" TO cmid_user;
GRANT SELECT ON SYS.ATTRCOL$ TO <cmid_user>;
GRANT SELECT ON SYS.CCOL$ TO <cmid_user>;
GRANT SELECT ON SYS.CDEF$ TO <cmid_user>;
GRANT SELECT ON SYS.COL$ TO <cmid_user>;
GRANT SELECT ON SYS.COLTYPE$ TO <cmid_user>;
GRANT SELECT ON SYS.IND$ TO <cmid_user>;
GRANT SELECT ON SYS.IDNSEQ$ TO <cmid_user>;
GRANT SELECT ON SYS.OBJ$ TO <cmid_user>;
GRANT SELECT ON SYS.RECYCLEBIN$ TO <cmid_user>;
GRANT SELECT ON SYS.TAB$ TO <cmid_user>;

Oracle privileges for Amazon RDS for Oracle sources

If you have an Amazon RDS for Oracle source, you must grant certain privileges to the Database Ingestion and Replication user.
Important: You must log in to Amazon RDS under the master username to run GRANT statements and procedures.
Grant the SELECT privilege, at minimum, on objects and system tables that are required for CDC processing to the Database Ingestion and Replication user (cmid_user). Some additional grants are required in certain situations.
Use the following GRANT statements:
GRANT SELECT ON "PUBLIC"."V$ARCHIVED_LOG" TO "cmid_user";

GRANT SELECT ON "PUBLIC"."V$DATABASE" TO "cmid_user";
GRANT SELECT ON "PUBLIC"."V$LOG" TO "cmid_user";
GRANT SELECT ON "PUBLIC"."V$LOGFILE" TO "cmid_user";
GRANT SELECT ON "PUBLIC"."V$TRANSPORTABLE_PLATFORM" TO "cmid_user";
GRANT SELECT ON "PUBLIC"."V$THREAD" TO "cmid_user";
GRANT SELECT ON "PUBLIC"."V$DATABASE_INCARNATION" TO "cmid_user";
GRANT SELECT ON "PUBLIC"."V$TRANSACTION" TO "cmid_user";

GRANT SELECT ON "SYS"."DBA_CONS_COLUMNS" TO "cmid_user";
GRANT SELECT ON "SYS"."DBA_CONSTRAINTS" TO "cmid_user";
GRANT SELECT ON DBA_INDEXES TO "cmid_user";
GRANT SELECT ON "SYS"."DBA_LOG_GROUP_COLUMNS" TO "cmid_user";
GRANT SELECT ON "SYS"."DBA_TABLESPACES" TO "cmid_user";

GRANT SELECT ON "SYS"."OBJ$" TO "cmid_user";
GRANT SELECT ON "SYS"."TAB$" TO "cmid_user";
GRANT SELECT ON "SYS"."IND$" TO "cmid_user";
GRANT SELECT ON "SYS"."COL$" TO "cmid_user";

GRANT SELECT ON "SYS"."PARTOBJ$" TO "cmid_user";
GRANT SELECT ON "SYS"."TABPART$" TO "cmid_user";
GRANT SELECT ON "SYS"."TABCOMPART$" TO "cmid_user";
GRANT SELECT ON "SYS"."TABSUBPART$" TO "cmid_user";
COMMIT;

/* For combined load jobs:*/
GRANT EXECUTE ON DBMS_FLASHBACK TO "cmid_user";

/*To provide read access to the Amazon RDS online and archived redo logs:*/
GRANT READ ON DIRECTORY ONLINELOG_DIR TO "cmid_user";
GRANT READ ON DIRECTORY ARCHIVELOG_DIR TO "cmid_user";
Additionally, log in as the master user and run the following Amazon RDS procedures to grant the SELECT privilege on some more objects:
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'DBA_TABLES',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'DBA_OBJECTS',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'DBA_OBJECT_TABLES',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'DBA_VIEWS',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'DBA_USERS',
p_grantee => 'cmid_user',
p_privilege => 'SELECT',
p_grant_option => false);
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'V_$CONTAINERS',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'V_$PARAMETER',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'V_$SPPARAMETER',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'V_$STANDBY_LOG',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'V_$VERSION',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'ALL_CONS_COLUMNS',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'ALL_CONSTRAINTS',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'ALL_OBJECTS',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'ALL_TABLES',
p_grantee => 'cmid_user',
p_privilege => 'SELECT',
p_grant_option => false);
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'ALL_TAB_PARTITIONS',
p_grantee => 'cmid_user',
p_privilege => 'SELECT',
p_grant_option => false);
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'ALL_USERS',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'ATTRCOL$',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'CCOL$',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'COLTYPE$',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'INDPART$',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'IDNSEQ$',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'CDEF$',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'RECYCLEBIN$',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/
/* Only required for RDS for Oracle 21c, which supports PDBs */
begin
rdsadmin.rdsadmin_util.grant_sys_object(
p_obj_name => 'V_$PDBS',
p_grantee => 'cmid_user',
p_privilege => 'SELECT');
end;
/

Oracle log access methods for CDC

Database ingestion and replication incremental load and combined initial and incremental load jobs can access the Oracle redo logs for CDC processing in alternative ways, depending on your environment and requirements.

Direct log access

Database ingestion and replication jobs can directly access the physical Oracle redo logs on the on-premises source system to read change data.
Note: If you store the logs on solid-state disk (SSD), this method can provide the best performance.
The following image shows the data flow:
  1. The Oracle database writes change records to the database log files on disk.
  2. The Database Ingestion and Replication CDC Reader reads the physical log files and extracts change records from the log files for the source tables of CDC interest.
  3. The Database Ingestion and Replication CDC Writer reads the change records.
  4. The CDC Writer applies the change records to the target.

NFS-mounted logs

Database ingestion and replication jobs can access Oracle database logs on shared disk by using a Network File System (NFS) mount or another method such as Network Attached Storage (NAS) or clustered storage.
The following image shows the data flow:
  1. The Oracle database writes change records to database log files. The log files are written to shared disk.
     The shared disk can be on any system that allows the files to appear as local to both the database and Secure Agent hosts. This sharing can be achieved by using NFS, as shown above, or by using Network Attached Storage (NAS) or clustered storage.
  2. The Database Ingestion and Replication CDC Reader reads the log files from the NFS server over the network and extracts the change records for the source tables of CDC interest.
  3. The Database Ingestion and Replication CDC Writer reads the change records.
  4. The CDC Writer applies the change records to the target.

ASM-managed logs

Database ingestion and replication jobs can access Oracle redo logs that are stored in an Oracle Automatic Storage Management (ASM) system. To read change data from the ASM-managed redo logs, the ASM user must have SYSASM or SYSDBA authority on the ASM instance.
When you configure an Oracle Database Ingestion connection, complete the properties that include "ASM" in their names.
Also, Informatica recommends that you set the sqlnet.recv_timeout parameter in the local sqlnet.ora file to less than 5 minutes when reading data from redo log files in Oracle ASM. This parameter specifies the duration of time that the Oracle client waits for a response from ASM before a query times out. Network interrupts and other factors can occasionally make Oracle connections unresponsive. Setting this value ensures that the reader can respond and recover from any such situation in a timely manner.
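For example, the following sqlnet.ora entry sets the timeout to 240 seconds (4 minutes), which satisfies the recommendation of less than 5 minutes; adjust the value for your environment:
SQLNET.RECV_TIMEOUT=240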
The following image shows the data flow:
  1. The Oracle database writes change records to the ASM-managed database log files.
  2. The Database Ingestion and Replication CDC Reader reads the ASM-managed log files and extracts the change records for the source tables of CDC interest.
  3. The Database Ingestion and Replication CDC Writer reads the change records.
  4. The CDC Writer applies the change records to the target.

ASM-managed logs with a staging directory

Database ingestion jobs can access ASM-managed redo logs from a staging directory in the ASM environment. In comparison to using ASM only, this method can provide faster access to the log files and reduce I/O on the ASM system. To read change data from the ASM-managed logs, the ASM user must have SYSASM or SYSDBA authority on the ASM instance.
The following image shows the data flow:
  1. The Oracle database writes change records to the ASM-managed log files.
  2. ASM copies the logs to a staging directory.
     The staging directory must be on shared disk, such as an NFS mount, so that ASM can write data to it and database ingestion and replication jobs can read data from it.
  3. The Database Ingestion and Replication CDC Reader reads the log files in the staging directory and extracts the change records for the source tables of CDC interest.
  4. The Database Ingestion and Replication CDC Writer reads the change records.
  5. The CDC Writer applies the change records to the target.

BFILE access to logs in the Oracle server file system by using directory objects

On an on-premises Oracle source system, you can configure Database Ingestion and Replication to read online and archived redo logs from the local Oracle server file system by using Oracle directory objects with BFILE locators. You must create Oracle directory objects named ARCHIVELOG_DIR and ONLINELOG_DIR that point to the locations of the Oracle redo log files. For information about configuring BFILE access, see Configuring BFILE access to Oracle redo logs in the Oracle file system.
The following image shows the data flow:
  1. The Oracle database writes change records to the redo log files in the local Oracle server file system. When a database ingestion and replication task needs to read log files, it connects to Oracle and issues a select request that references the ARCHIVELOG_DIR or ONLINELOG_DIR directory object to access the logs.
     Note: If you use BFILE access to Oracle data that is stored in multiple log locations, such as in an environment with a standby database, RDS database instance, or Fast Recovery Area, Database Ingestion and Replication can automatically find the logs by querying for all directory objects that begin with ARCHIVELOG_DIR and ONLINELOG_DIR and end with any suffix, for example, ARCHIVELOG_DIR_01, ARCHIVELOG_DIR_02, and so on. If database users who run database ingestion and replication jobs have the CREATE ANY DIRECTORY and DROP ANY DIRECTORY privileges, the directory objects can be created at runtime, as needed.
  2. The Database Ingestion and Replication CDC Reader reads the log file by using the OCI Client that is installed on the Secure Agent box and extracts the change records for the source tables of CDC interest.
  3. The CDC Reader sends the change records to the CDC Writer.
  4. The CDC Writer applies the change records to the target.

Configuring BFILE access to Oracle redo logs in the Oracle file system

If you store redo logs in the local Oracle server file system and want to access the logs by using Oracle directory objects with BFILEs, perform the following configuration tasks:
Complete the following usual Oracle source prerequisite tasks that are not specific to BFILE access:
For more information, see Oracle sources.
Additionally, for BFILE access, perform the following steps to configure an online log destination and archive log destination:
  1. Query the Oracle database for the online and archived redo log locations in the Oracle server file system. You can use the following example queries:
     To get the location of the online redo logs:
     select * from v$logfile;
     To get the archive log destination:
     select dest_id, dest_name, destination, status from V$ARCHIVE_DEST;
  2. Create the ONLINELOG_DIR and ARCHIVELOG_DIR directory objects that point to the locations of the log files from step 1. An Oracle directory object specifies a logical alias name for a physical directory on the Oracle server file system under which the log files to be accessed are located. For example:
     CREATE DIRECTORY ONLINELOG_DIR AS '/u01/oracle/data';
     CREATE DIRECTORY ARCHIVELOG_DIR AS '/u01/oracle/archivedata';
     If you plan to set the reader mode to ARCHIVEONLY in the Oracle Database Ingestion connection to read changes only from archive logs, you do not need to create an ONLINELOG_DIR directory or directory object.
     The Oracle database does not verify that the directories you specify exist. Make sure that you specify valid directories that exist in the Oracle file system.
  3. To verify that the directory objects were created with the correct file system paths for the redo logs, issue a select statement such as:
     select * from all_directories;
     OWNER    DIRECTORY_NAME     DIRECTORY_PATH
     -------- ------------------ ----------------------------------
     SYS      ARCHIVELOG_DIR     /u01/oracle/data/JO112DTL
     SYS      ONLINELOG_DIR      /u01/oracle/data/JO112DTL
  4. Grant read access on the ONLINELOG_DIR and ARCHIVELOG_DIR directory objects to the Database Ingestion and Replication user who is specified in the Oracle Database Ingestion connection properties. For example:
     grant read on directory "ARCHIVELOG_DIR" to "cmid_user";
     grant read on directory "ONLINELOG_DIR" to "cmid_user";
     Note: If the ONLINELOG_DIR path does not exist or does not match the path to the active redo logs, Database Ingestion and Replication tries to create the directory object. If you do not have sufficient privileges to create the directory object, an error message is issued. In this case, ask your DBA to create the directory object with the correct path.
  5. In the Oracle Database Ingestion connection properties, select the BFILE Access check box.
Important: If you use BFILE access to Oracle data stored in multiple log locations, Database Ingestion and Replication can automatically find the logs by querying for all directory objects that begin with ARCHIVELOG_DIR and ONLINELOG_DIR and end with any suffix, for example, ARCHIVELOG_DIR_01, ARCHIVELOG_DIR_02, and so on. This capability enables Database Ingestion and Replication to support Oracle standby databases that use log locations different from those on the primary database, replica logs on Amazon RDS database instances, and archive logs in the Fast Recovery Area to which the USE_DB_RECOVERY_FILE_DEST parameter points.
You can override the ARCHIVELOG_DIR and ONLINELOG_DIR names by using the custom properties pwx.cdcreader.oracle.database.additional BFILEARCHIVEDIR=<directory_object> and pwx.cdcreader.oracle.database.additional BFILEONLINEDIR=<directory_object> if necessary.
If you grant the CREATE ANY DIRECTORY and DROP ANY DIRECTORY privileges to database users who run database ingestion and replication jobs, the directory objects can be created at runtime, as needed. For example, if multiple archive and online log destinations exist in a database, the directory objects could be created with the following naming convention:
Database Ingestion and Replication performs no cleanup processing on these generated directory objects.
If you use the USE_DB_RECOVERY_FILE_DEST parameter and the CREATE ANY DIRECTORY and DROP ANY DIRECTORY privileges have not been granted, your DBA must create the directory objects daily or weekly, before your database ingestion and replication jobs run, by using a script such as:
create or replace directory ARCHIVELOG_DIR_2024_08_19 as '<DB_RECOVERY_FILE_DEST>/2024_08_19'
If you use the USE_DB_RECOVERY_FILE_DEST parameter and the database user has the CREATE ANY DIRECTORY and DROP ANY DIRECTORY privileges, the directory objects are created as needed at runtime and dropped after 14 days. These directory objects have the naming convention <ARCHIVEDIRNAME>_YYYY_MM_DD.

Oracle Data Guard databases or far sync instances as sources

Database Ingestion and Replication can capture change data from Oracle Data Guard primary databases, logical or physical standby databases, and far sync instances.
A far sync instance is a remote Oracle Data Guard destination that accepts redo from the primary database and then ships that redo to other members of the Oracle Data Guard configuration.
You can initially load a target with data either from the Oracle Data Guard primary database or from a standby database that is open in read mode.
Configuration
Oracle change capture configuration depends on the Oracle Data Guard database type.
Standby-to-primary role transitions
In an Oracle Data Guard environment, a physical standby database can transition to the primary role. Usually, the role transition occurs because of a failover or switchover. During the transition, all active connections to the physical standby database terminate.
To be able to resume CDC processing after the physical standby database transitions to the primary role, you might need to adjust some connection configuration properties on the original standby system for Database Ingestion and Replication to process past the transition point. After the transition, you can adjust the properties again for optimal performance in the new primary database environment.
The following table describes these connection properties by transition phase:
RAC Members
  Before Transition: Specify the number of active threads on the primary database.
  During Transition: Specify the total number of active threads with unique thread IDs on both the standby database and primary database.
    For example, if the primary database is a two-node RAC database that uses thread IDs 1 and 2 and the standby database is a 3-node RAC database that uses thread IDs 2, 3, and 4, specify a property value of 4.
  After Transition: After the restart point has progressed beyond the transition point, edit the property value, as needed, for optimal performance of change data capture from the new primary database.
    Informatica recommends that you use the lowest value that is suitable for your environment to minimize the overhead of CDC thread tracking.
Reader Standby Log Mask, Standby Connect String, Standby User Name, Standby Password
  Before Transition: Remove all standby properties. They're not applicable to physical standby databases open for read only access.
  During Transition: Properties remain removed.
  After Transition: Do not specify these properties. They're not used for a primary database.
Database Connect String
  Before Transition: If the standby database is not open, define the connection string for the primary database. If the standby database is open, define the connection string for the standby database.
  During Transition: Specify the connection string for the database that will have the primary role after the role transition.
  After Transition: Ensure that this connection property defines the connection string for the new primary database.

Oracle archive log retention considerations

Database ingestion and replication incremental load and combined initial and incremental load jobs must be able to access transaction data in Oracle online and archive redo logs. If the logs are not available, database ingestion and replication jobs end with an error.
Typically, the Oracle DBA sets the archive log retention period based on your organization's particular business needs and Oracle environment. Make sure that the source archive logs are retained for the longest period for which you expect change capture to be stopped or latent, plus about 1 hour, so that the logs will be available for restart processing.
To determine if the current log retention policy in your environment is sufficient to accommodate database ingestion and replication change capture processing, consider the following factors:
If the archive logs are not available when you need to restart capture processing in the logs, you can ask your DBA to restore them and to modify the retention period if necessary. Otherwise, perform another initial load to re-materialize the target and then start incremental change data processing again. However, in this case, you might lose some changes.
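One hedged way to check whether archived redo logs are still available on the source database is to query V$ARCHIVED_LOG, for example to see the time range that the retained logs cover; adapt the query to your environment:
SELECT MIN(FIRST_TIME) AS OLDEST_LOG_TIME, MAX(NEXT_TIME) AS NEWEST_LOG_TIME
FROM V$ARCHIVED_LOG
WHERE DELETED = 'NO';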

PostgreSQL sources

To use PostgreSQL sources in database ingestion and replication tasks, first prepare the source database and review the usage considerations.

Source preparation

On the Secure Agent system, install the ODBC driver that is appropriate for your operating system.
On the PostgreSQL database system, perform the following configuration steps:
  1. For incremental load and initial and incremental load jobs, ensure that the PostgreSQL postgresql.conf configuration file specifies the wal_level=logical parameter.
     This parameter determines how much information PostgreSQL writes to the Write-Ahead Log (WAL). The setting of logical adds information that is required to support logical decoding.
     To set wal_level to logical on Amazon Aurora PostgreSQL or Amazon RDS for PostgreSQL sources, set the rds.logical_replication parameter to 1 in the cluster parameter group. For Azure Database for PostgreSQL - Flexible Server, set the wal_level parameter to logical on the Server Parameters page in the Azure portal.
     For Cloud SQL for PostgreSQL sources, perform the following actions:
     a. Connect to the database by using the public IP.
        Note: Ensure that you add the required IPs under Authorized networks in the Google Cloud console.
     b. Create a Cloud SQL for PostgreSQL database instance replica.
     c. In the Cloud Shell, as a local admin user, run the following commands:
        alter database postgres set default_transaction_read_only = off;
        gcloud sql connect database_replica --user=postgres --quiet;
        ALTER USER postgres WITH REPLICATION;
        CREATE USER replication_user WITH REPLICATION IN ROLE cloudsqlsuperuser LOGIN PASSWORD 'password';
        ALTER USER postgres WITH REPLICATION;
     d. In the Google Cloud console, add the following database flags:
     e. Restart the database instance.
  2. If you use the DataDirect ODBC for PostgreSQL driver, ensure that the database does not use the SCRAM-SHA-256 authentication method. Use another authentication method, such as MD5.
     Note: The PostgreSQL ODBC driver supports the SCRAM-SHA-256 authentication method. In PostgreSQL 13, this authentication method became the default method.
  3. To deploy and run a database ingestion and replication task that includes a PostgreSQL source, the source connection must specify a database user who has the required privileges. Create the user and grant privileges to that user in the following ways (see the sketch after this list):
  4. If you plan to use the wal2json plugin for logical decoding output for incremental load or initial and incremental load jobs, install the plugin.
  5. If you plan to use the pgoutput plugin for incremental load or initial and incremental load jobs, use the following SQL statement to create publications for database ingestion jobs:
     CREATE PUBLICATION publication_name [FOR TABLE [ONLY] table_name [*] [,...] | FOR ALL TABLES ];
     Ensure that the publication includes all tables that you want to replicate to the target.
  6. For incremental load and initial and incremental load jobs, use the following function to create a PostgreSQL logical replication slot:
     SELECT pg_create_logical_replication_slot('slot_name', 'plugin_type');
     Where plugin_type is either the pgoutput plugin or the wal2json plugin.
  7. For incremental load and initial and incremental load jobs with PostgreSQL 9.6 sources, ensure that the max_replication_slots parameter in the postgresql.conf configuration file has a value greater than or equal to the number of concurrent database ingestion jobs that you plan to use.
     Important: All replication slots must be unique across all concurrent jobs.
  8. Ensure that the PostgreSQL sources use the UTF-8 encoding. If you use another encoding for the source database, your initial load, incremental load, and combined initial and incremental load jobs might fail.
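The following sketch shows one way to create the database user mentioned in step 3 and grant it basic privileges on a self-managed PostgreSQL source. The cmid_user and sales names are assumptions, and the exact grants depend on your load type; for Amazon RDS or Aurora PostgreSQL, replication access is typically granted through the rds_replication role instead of the REPLICATION attribute:
CREATE USER cmid_user WITH REPLICATION PASSWORD 'password';
GRANT USAGE ON SCHEMA sales TO cmid_user;
GRANT SELECT ON ALL TABLES IN SCHEMA sales TO cmid_user;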

Usage considerations

SAP HANA and SAP HANA Cloud sources

To use SAP HANA and SAP HANA Cloud sources in database ingestion and replication tasks, first prepare the source database and review the usage considerations.

Source preparation

Usage considerations

Teradata sources

To use Teradata sources in database ingestion and replication tasks, first prepare the source database and review the usage considerations.

Source preparation

Usage considerations