Skip to main content

Implement log based change data capture in ODI

Hello everyone!
Change data capture is mainly used to capture the data inserted, deleted or updated at the source side. The changes thus captured are then replicated to the target side. There are many ways to implement Change Data Capture in Oracle Data Integrator. Two commonly used methods are:
1. Log based change data capture
2. Trigger based change data capture
I had implemented trigger based change data capture in one of my earliertutorials. In this post, I will implement the log based change data capture method!
Pre-requisites: Oracle database 11gR2 Enterprise edition with *SQLPlus, Oracle data integrator 11g (build version 11.1.1.7.0)
Must read: Previous version of Oracle data integrator namely ODI 10gincluded the JKM Oracle 11g Consistent LOGMINER (REALTIME)module for change data capture. However, this knowledge module is not present from ODI 11g onwards. As Oracle is slowly moving towards streams, this KM was then replaced by JKM Oracle 11g Consistent Streams. Hence in order to implement log based CDC we either need to download the older LOGMINER KM or use the current Streams KM. For using streams in Oracle it is mandatory to have the Oracle database Enterprise Edition (EE) to be downloaded and installed. Other versions of Oracle database such as XE (Express edition), SE (Standard edition) do not support streams.
Note: I assume you have the data models, physical and logical schemas already set up for the connection to your Oracle database.
Step 1: Create source and target tables
source table for CDC
create table cdc_src(empid int primary key, empname varchar(20),empcity varchar(20));
target table for CDC
create table cdc_target(empid int, empname varchar(20), empcity varchar(20));
Insert a few dummy records in the source table only.
sample_data
Next, reverse engineer both source and target tables in a new Modelnamed cdc_model. Add the source data store cdc_src to CDC by right clicking–> Change data capture–>Add to CDC
data_models
Step 2: Apply log based method
Open the cdc_model and select the Journalizing option. You now need to select the Journalizing modeSimple is used for CDC implementation on a single table and Consistent on multiple tables or models. In this tutorial I will be implementing the Consistent Set option even though I have a single source table. You can always try it using multiple source tables described over here
Once you have selected the Consistent option, you need to choose the corresponding Knowledge module. I have selected JKM Oracle 11g Consistent Streams for my Project cdc_test_project.
Make sure you follow the Basic Configuration steps that appear in the description below before you start implementing your Interface!
using_consistent_option
Basic Configuration Steps
1. Connect SYS as SYSDBA with SQL*Plus and run the following statements
    -- Replace "SAS" with the name of your user
    -- This user is specified in the Data Server definition
    grant dba to SAS;
    BEGIN
    DBMS_STREAMS_AUTH.GRANT_ADMIN_PRIVILEGE
    (
    grantee => 'SAS',
    grant_privileges=> true);
    END;
    /
     
    SHUTDOWN IMMEDIATE;
    STARTUP MOUNT;
    ALTER DATABASE ARCHIVELOG;
    ALTER DATABASE OPEN;
 
2. Exit from SQL*Plus
3. Select the JKM on the model you want to journalize
4. Add the Datastores to the CDC Set (with this JKM, it is not mandatory to define an order for the datastores in the CDC Set)
5. Make sure that the AUTO_CONFIGURATION and VALIDATE options are set to Yes
6. Start the Journal on the Model
7. Create a Subscriber on the Model
Note: If you are logged in as SUPERVISOR the default subscriber chosen would be SUNOPSIS.
Once again, right click the source data store cdc_src and select Start Journal. You should see a green color icon on the data store. This indicates that the data store is now in journalizing mode ready to capture changes. Check the status of the operation in the Operator tab.
journalizing_1
journalizing_2
journalizing_2_1
Step 3: Make changes to source table
After you have started the journal successfully, you now need to update or insert any row in the source table cdc_src. In my case, I have added a new row as follows!
SQL> insert into cdc_src values(108,'salman','mumbai');
1 row created.
SQL> commit;
Commit complete.
Step 4: Create an interface
Create a new Interface named cdc_test. Drag and drop the journalized source table and the target table in Mapping. In Mapping, select source data store and tick the option Journalized data only in the Source properties window. Finally, select IKM as Oracle Incremental Update and run the interface.
Ensure that no errors occur. Check the status of the operation. If the operation was successful, right click the source data store –> Change data capture and select Journal data. You should now see the changed data getting reflected! :)

output_1
output_2

Comments

  1. Simply wish to say your article is as astonishing. The clarity in your post is simply great, and I could assume you are an expert on this subject. Same as your blog i found another one Oracle ODI 12c .Actually I was looking for the same information on internet for Oracle ODI 11g and came across your blog. I am impressed by the information that you have on this blog. Thanks a million and please keep up the gratifying work.

    ReplyDelete

Post a Comment

Popular posts from this blog

ODI KM Adding Order by Option

You can add Order by statement to queries by editing KM.I have edited IKM SQL Control Append to provide Order by.  1) Add an option to KM named USE_ORDER_BY, its type is Checkbox and default value is False. This option determines you want an order by statement at your query. 2)Add second option to KM named ORDER_BY, type is Text. You will get order by values to your query by this option. 3) Editing Insert New Rows detail of KM. Adding below three line code after having clause. That's it! <% if (odiRef.getOption("USE_ORDER_ BY").equals("1")) { %> ORDER BY <%=odiRef.getOption("ORDER_BY" )%> <%} %>  If USE_ORDER_BY option is not used, empty value of ORDER_BY option get error. And executions of KM appears as such below; At this execution, I checked the KM to not get errors if ORDER_BY option value is null. There is no prove of ORDER BY I'm glad.  Second execution to get  Ord...

Oracle Data Integrator tools: OdiFileDelete and OdiOutFile

Hello everyone! It’s time for another cool ODI tutorial. Last time, I spoke about the   OdiZip tool and how it can be used to create zip files from a directory. Through this post, I will talk about two more tools related to  Files  namely  OdiFileDelete and  OdiOutFile . 1. OdiFileDelete The  OdiFileDelete  is a tool used to delete files present in a directory or a complete directory on the machine running the agent. Usage OdiFileDelete -DIR=<dir> | -FILE=<file> [-RECURSE=<yes|no>] [-CASESENS=<yes|no>] [-NOFILE_ERROR=<yes|no>] [-FROMDATE=<fromdate>] [-TODATE=<todate>] If  -FROMDATE  is omitted, all files with a modification date earlier than the  -TODATE  date will be deleted. If  -TODATE  is omitted, all files with a modification date later than the  -FROMDATE  date will be deleted. If both parameters are omitted, all files matching the  -FILE...

Synchronous and Asynchronous execution in ODI

In data warehouse designing, an important step is to deciding which step is before/after. Newly added packages and required DW data must be analyzed carefully. Synchronous addings can lengthen ETL duration. Interfaces, procedures without generated scenario cannot be executed in parallel. Only scenario executions can be parallel in ODI. Default scenario execution is synch in ODI. If you want to set a scenario to executed in parallel then you will write “-SYNC_MODE=2″ on command tab or select Synchronous / Asynchronous option Asynchronous in General tab. I have created a package as interfaces executes as; INT_JOBS parallel  INT_REGIONS synch  INT_REGIONS synch  INT_COUNTRIES synch  INT_LOCATIONS parallel  INT_EMPLOYEES parallel (Interfaces are independent.) Selecting beginning and ending times and durations from repository tables as ODI 11g operator is not calculating these values. It is obvious in ODI 10g operator. SELECT    sess_no...