1) How do you implement logic in a procedure so that rows deleted on the source side are also removed from the target table?
ANS) Use a query like this in the Command on Target: DELETE FROM Target_table WHERE NOT EXISTS (SELECT 'X' FROM Source_table WHERE Source_table.ID = Target_table.ID).
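A minimal sketch of that synchronizing delete, assuming both tables share a primary key column named ID (adjust the table and column names to your own schema):
-- remove rows from the target that no longer exist in the source
DELETE FROM Target_table t
WHERE NOT EXISTS (
  SELECT 'X'
  FROM   Source_table s
  WHERE  s.ID = t.ID
);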
2) If the source has 15 records in total, of which 2 are updated and 3 are newly inserted, how do you load only the changed and new records into the target?
ANS) Use the IKM Incremental Update knowledge module, which handles both insert and update operations.
3) Can we implement a package inside a package?
ANS) Yes, we can call one package from another package.
4) How do you load data by joining one flat file and one RDBMS table?
ANS) Drag and drop both the file and the table into the source area and join them in the staging area.
5) If one interface in a package fails, how do you know which interface failed when you have no access to Operator?
ANS) Configure a mail alert, or query the SNP_SESS_LOG tables in the work repository for the session log details.
6) If the source and target are both Oracle, what interfaces, KMs and models do you use?
ANS) Use LKM SQL to SQL or LKM SQL to Oracle, and IKM Oracle Incremental Update or IKM Oracle Control Append.
7) How do you implement data validations?
ANS) Use filters and mappings for transformation-level checks; for data quality related to constraints, use a CKM with Flow Control enabled.
8) How do you handle exceptions?
ANS) In packages, use the Advanced tab of a step; in load plans, use the Exception tab. Both let us define exception handling.
9) What do we specify in the XML data server, and which parameters are needed to connect to an XML file?
ANS) Two parameters: the file name with its location (f) and the schema name (s).
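For reference, the JDBC URL of the ODI XML driver typically looks like the line below; the file path and schema name are placeholders to replace with your own values:
jdbc:snps:xml?f=C:\SOURCE_FILES\employees.xml&s=EMP_SCHEMA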
10) How do you implement CDC?
ANS) CDC is a step-by-step process; refer to http://oditraining.blogspot.in/2012/09/cdc-changed-data-capture-in-odi-step-by.html
11) How do you reverse engineer views (i.e., load data from views)?
ANS) In the model, go to the Reverse Engineer tab and select VIEW as the type of object to reverse engineer.
12) How do you run jobs in parallel in an ODI package?
ANS) In a package, only scenarios can run in parallel, not direct objects such as procedures, interfaces or packages; generate scenarios for those objects and run the scenarios in Asynchronous or Synchronous mode.
Asynchronous Mode: parallel execution
Synchronous Mode: serial execution
ODI Notes – ORACLE DATA INTEGRATOR 11G/12C
=====================================================
SECURITY: (Profiles, Users, Objects)
1. This component is used to maintain users and user privileges.
2. In Security, we can create new users and assign privileges through profiles or objects.
Objects: the list of ODI objects for which privileges can be granted, shown under SECURITY => Objects tab.
USERS: the SUPERVISOR user and any users we create (described below under USERS).
Profiles:
A profile is a set of object privileges. One profile may contain more than one object; we can create a new profile and assign objects to it, or we can use the existing profiles.
Profile Types:
There are two types of profiles.
1) Generic Profile
2) Non-Generic Profile
1) Generic Profile:
If the objects listed in a profile have the Generic Privileges option selected, it is called a generic profile. Oracle provides generic profiles per component:
1) SECURITY ADMIN - Security component
2) TOPOLOGY ADMIN - Topology component
3) OPERATOR - Operator component
4) DESIGNER - Designer component, except Models and Solutions
5) METADATA ADMIN - Models in the Designer component
6) VERSION ADMIN - Solutions in the Designer component
7) REPOSITORY EXPLORER - ODI Console
8) CONNECT - read-only (view) privileges for the Topology, Designer and Operator components
2) Non-Generic Profile:
A non-generic profile has objects but does not have privileges assigned by default. We use NG profiles for customized security. NG profiles are available only for the Designer component, including ODI Console:
1) NG DESIGNER
2) NG METADATA ADMIN
3) NG VERSION ADMIN
4) NG REPOSITORY EXPLORER
USERS:
A default installation admin user, SUPERVISOR, is created automatically; we can create new users and either make them supervisors or assign privileges through profiles.
=====================================================================
TOPOLOGY:
In Topology we maintain the repositories: the master repository and the work repositories.
We can create new work repositories here, but not another master repository, because only one master repository is maintained per project. Without the master repository we cannot log in to ODI Studio.
Available objects in Topology:
1) Physical Architecture
1) Technologies
2) Agents
The objects in Physical Architecture connect directly to the physical source and target databases, applications or flat files.
1) Technologies:
In Physical Architecture we can see the technologies available or supported in ODI, such as databases, applications, flat files, operating systems and middleware.
For each technology we create a Data Server, and under that data server we create a Physical Schema.
Data Server:
A data server holds, for a particular technology, the driver details and credentials. For Oracle, for example, that means the JDBC driver details and the login credentials (user name and password).
We need to create a data server for every source and target database or file.
Physical Schema:
A physical schema points to one part of the database; we can think of it as a database user or schema.
There are two schemas to set here:
1) Data Schema: the storage location of our main tables
2) Work Schema: the storage location of our staging tables
2) Agents:
An agent is a run-time component; it handles the interaction between the client and the servers, for example between ODI Studio and the source and target databases.
There are two types of agents:
1) Standalone agent
2) J2EE agent
1) Standalone agent:
The standalone agent is used with ODI standalone installations. Its job includes:
1) establishing the connection between client and servers and sending the requested source code to the source or target databases
2) creating the session when a program or interface is executed
3) creating the complete log for the program's session
4) closing the session after the program finishes
2) J2EE Agent:
This agent is available when middleware such as WebLogic or WebSphere is used. With a J2EE installation we can use this agent and get the features below:
1) load balancing
2) high availability
3) multiple agents
4) plus the same features as the standalone agent
In Physical Architecture we create the physical agents (standalone or J2EE) and the data servers for the source and target technologies; for each data server we need to create at least one physical schema.
2: Context:
A context is the mediator between the Physical Architecture and the Logical Architecture. We use contexts for particular environments or server locations, for example DEVELOPMENT, TESTING and PRODUCTION environments, or US, UK and ASIA server locations.
3: Logical Architecture:
In Logical Architecture we work with logical technologies and agents. A logical architecture object is just an alias (a synonym or alternative name) for a physical architecture object.
1) Technologies:
Here we create only logical schemas; there is no data server concept. A logical schema is mapped to a physical schema through a context.
2) Agents:
In Logical Architecture we create only logical agents; a logical agent is an alias for a physical agent, whether standalone or J2EE. The logical agent connects to its physical agent through a context.
4: Languages:
Languages lists the languages supported in ODI:
1) Groovy
2) Java
3) Jython
4) SAP
5) SQL
6) SQL File
We can use these languages in the Designer, for example to run code in these languages inside a Designer procedure.
5: Repositories:
Under Repositories we can see the master repository and the work repositories. Here we can create only work repositories; we cannot create another master repository.
1) Master Repository
2) Work Repository
6: Generic Actions:
Generic actions are used to generate common ODI code for database operations such as creating a table, altering a table, creating primary keys and indexes, and modifying columns. The available actions are used in the Designer.
Creating a new Data Server for the Oracle technology:
Right click on Oracle => New Data Server.
We are creating new data servers for the following schemas available in the Oracle database:
DEV
TEST
PROD
All these schemas are in the target database, and we can treat them as the DEV, TEST and PROD environments.
After creating the physical data server, physical schema, contexts and logical schemas, we can go to the Designer to create models.
MODEL:
A model contains metadata; we import database objects from the database into models using reverse engineering.
Reverse Engineering:
Reverse engineering is the process of importing database objects such as tables, columns and constraints from the database into models.
There are two types of reverse engineering:
1) Standard Reverse Engineering
2) Customized Reverse Engineering
1) Standard Reverse Engineering:
In standard reverse engineering, ODI identifies the database driver, connects to the database and imports objects from the database into the model.
2) Customized Reverse Engineering:
In customized reverse engineering, we provide our own code to connect to the database and import selected objects from the database into the model. That customized logic is packaged as a knowledge module: we use an RKM (Reverse-engineering Knowledge Module) for customized reverse engineering.
Knowledge Module:
A knowledge module is a step-by-step process template, and we can customize a KM for our own requirements.
Difference between Standard and Customized Reverse Engineering?
1) ODI identifies the standard database driver and imports the database objects into the model.
1) Customized reverse engineering works through an RKM, which contains the step-by-step import process.
2) In standard reverse engineering we do not get table partitions and indexes.
2) In customized reverse engineering we do get partitions and indexes.
3) Standard reverse engineering works without a KM.
3) Customized reverse engineering works only with an RKM.
4) In standard reverse engineering we can import selected objects or tables.
4) In customized reverse engineering we cannot import selected objects or tables; it imports all tables from the database into the model, and if we want to restrict some tables we have to use a filter condition in the model.
1) RKM (Reverse-engineering Knowledge Module):
The RKM is used for customized reverse engineering. With customized reverse engineering we get additional objects such as table partitions, sub-partitions and partitioned indexes.
2) LKM (Loading Knowledge Module):
-------------------------------
The LKM loads data from the source table into the staging table, i.e. C$_tablename.
Example:
Source table: EMP
Staging table: C$_EMP
LKM process steps:
Step 1: the LKM drops the C$_EMP table if it already exists.
Step 2: the LKM creates the C$_EMP table with the same structure as the source table EMP.
Step 3: the LKM selects data from the source table EMP and inserts it into the staging table C$_EMP.
Note: the LKM always loads data from the source table into the C$ table.
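A rough SQL sketch of those three LKM steps, assuming the source table EMP and a staging table named C$_EMP (the real KM generates equivalent code per interface, moving the rows through the agent):
-- step 1: drop the loading table if it already exists
DROP TABLE C$_EMP;
-- step 2: recreate it as an empty copy of the source structure
CREATE TABLE C$_EMP AS SELECT * FROM EMP WHERE 1 = 0;
-- step 3: copy the source rows into the loading table
INSERT INTO C$_EMP SELECT * FROM EMP;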
How do we identify the LKM for particular databases or files?
1) If the source is SQL and staging is SQL, we can select LKM SQL to SQL.
2) If the source is a file and staging is SQL, we can select LKM File to SQL.
3) If the source is Oracle and staging is Oracle, we can select LKM SQL to SQL or LKM Oracle to SQL.
"SQL" here stands for any RDBMS that supports native SQL.
3) IKM (Integration Knowledge Module):
The IKM is used in an interface for data integration from the staging C$ table to the I$ table, and from the I$ table to the target table.
IKM process steps:
Step 1: drop the I$ table if it already exists in staging.
Step 2: create the I$ table with the same structure as the target table, plus an additional UPDATE_FLAG column holding values such as U = update, I = insert, N = no change.
Step 3: select data from the C$ table and insert it into the I$ table with the flag set to 'N' for all rows.
Step 4: compare the I$ table with the target table based on the primary key value:
if all values match the target row, the flag stays 'N';
if there is no row with that PK in the target, the flag is set to 'I';
if the PK exists but some values differ, the flag is set to 'U'.
Step 5: select data from the I$ table where the flag is 'I' and insert it into the target table;
select data from the I$ table where the flag is 'U' and update the target table;
do not insert or update rows where the flag is 'N', because N means there is no change compared to the target.
Note: the steps above apply only to the IKM Incremental Update knowledge module. The Control Append knowledge module does not create the UPDATE_FLAG column in the I$ table.
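A simplified sketch of steps 4 and 5, assuming an I$_EMP table with the UPDATE_FLAG column described above, a target table TGT_EMP and a primary key EMPNO (these names are illustrative; the real IKM generates more generic code from the interface definition):
-- step 4: rows that exist in the target but differ are updates
UPDATE I$_EMP i
SET    i.UPDATE_FLAG = 'U'
WHERE  EXISTS (SELECT 1 FROM TGT_EMP t
               WHERE  t.EMPNO = i.EMPNO AND t.SAL <> i.SAL);
-- rows whose primary key is missing from the target are inserts
UPDATE I$_EMP i
SET    i.UPDATE_FLAG = 'I'
WHERE  NOT EXISTS (SELECT 1 FROM TGT_EMP t WHERE t.EMPNO = i.EMPNO);
-- step 5: apply inserts ('I') and updates ('U'); rows flagged 'N' are left alone
INSERT INTO TGT_EMP (EMPNO, ENAME, SAL)
SELECT EMPNO, ENAME, SAL FROM I$_EMP WHERE UPDATE_FLAG = 'I';
UPDATE TGT_EMP t
SET    t.SAL = (SELECT i.SAL FROM I$_EMP i WHERE i.EMPNO = t.EMPNO)
WHERE  t.EMPNO IN (SELECT EMPNO FROM I$_EMP WHERE UPDATE_FLAG = 'U');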
Types of IKM:
There are three main types of IKM:
1) IKM Control Append: insert only into the target
2) IKM Incremental Update: insert or update
3) IKM Slowly Changing Dimension: for SCD Type 1 and Type 2
4) CKM (Checking Knowledge Module):
The CKM is used for constraint-related data quality checks. It supports checks on primary key, not null and unique key constraints.
There are two types of data quality check with a CKM:
1) FLOW CONTROL
2) STATIC CONTROL
1) FLOW CONTROL:
Flow control is enabled at the IKM level and requires a CKM; the CKM then does the following processing:
flow control validates the data in the I$ table before it is inserted into the target table. If the I$ table has any duplicate or null rows, they are deleted from the I$ table and inserted into the E$ error table.
CKM processing steps (flow control):
Step 1: drop the E$ table if it already exists.
Step 2: create the SNP_CHECK_TAB table in the staging area.
SNP_CHECK_TAB: a common error table shared by all interfaces; it contains the error message, error number, interface name and schema name.
Step 3: create the E$ table with the same structure as the I$ table, plus additional columns such as error number, error description, error date and error type (PK or NOT NULL).
Step 4: validate the I$ table; any duplicate or null rows are selected and inserted into the E$ table with the error message, error type and error number, and those rows are then deleted from the I$ table.
Note: flow control validates the data in the I$ table before it is inserted into the target table; invalid data is never inserted into the target, only into the E$ table.
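A minimal sketch of step 4 for a NOT NULL check, assuming tables I$_EMP and E$_EMP and a mandatory ENAME column (column and error-table names are illustrative; the generated CKM code is driven by the constraints defined in the model):
-- copy the offending rows into the error table with a reason
INSERT INTO E$_EMP (EMPNO, ENAME, SAL, ERR_TYPE, ERR_MESS)
SELECT EMPNO, ENAME, SAL, 'NOT NULL', 'ENAME is null'
FROM   I$_EMP
WHERE  ENAME IS NULL;
-- then remove them from the flow so they never reach the target
DELETE FROM I$_EMP WHERE ENAME IS NULL;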
2) STATIC CONTROL:
Static control is also a data quality check enabled at the IKM level using a CKM, but it validates the data at the target table level, after the data has been inserted into the target. If the target table has any duplicate or null rows, they are selected and inserted into the E$ table, but they are not deleted from the target table.
Note: static control does not validate at the I$ table level; it validates only the target table, after the data has been moved from the I$ table to the target, and it does not delete anything from the target. We use it only for information, to know how many rows in the target are duplicates or nulls.
Based on the requirement we can use flow control, static control or both; these options are enabled at the IKM level:
FLOW_CONTROL=TRUE/FALSE and STATIC_CONTROL=TRUE/FALSE.
How do we reload error records from the E$ table into the target table?
We can reprocess errored records from the E$ table into the target table using one option at the IKM level: RECYCLE_ERRORS=TRUE.
Note: recycle errors re-validates the rows in the E$ table; if the E$ table now contains valid data it is inserted into the target table, otherwise it is not.
5) JKM (Journalizing Knowledge Module):
The JKM is used for the CDC (change data capture) feature; when we implement CDC we go for a JKM.
6) SKM (Service Knowledge Module):
The SKM is used to expose data as public web services, for example to SOA or other web-service consumers.
INTERFACE:
An interface is the object that maps source and target datastores (tables); we use interfaces to load data from source to target databases.
In an interface we use the following knowledge modules:
1) LKM
2) IKM
3) CKM
Note: in ODI 11g this object is called an interface; in ODI 12c it is called a mapping.
Creating a new interface:
Goto => Designer => Projects => New Project => Folder => Interfaces => Right click => New Interface.
Before creating the interface we first need to identify the KMs based on the source and target databases.
Source: Oracle
Target: Oracle
LKM: LKM SQL to SQL
IKM: IKM SQL Incremental Update or IKM SQL Control Append
CKM: CKM SQL
We need to import all the above knowledge modules into the project:
Right click on Knowledge Modules => Import Knowledge Modules.
All the KM XML files are available in the ODI installation, for example:
C:\Oracle\product\11.1.1\Oracle_ODI_1\oracledi\xml-reference
Create Interface:
Name: user defined, e.g. EMP_INTF
Optimization Context: select the development server context.
Select the target logical schema if the staging area is on the target; if you maintain staging somewhere other than the target, select that logical schema instead. Here we keep staging on the target, so we select the target logical schema.
Goto => Mapping tab.
Here we see two panels: the left side is for the source tables and the right side for the target table.
Drag and drop the target table from the target model into the target datastore area on the right.
Drag and drop the source table from the source model into the source datastore area on the left.
KM selection:
Goto => Flow tab => click on the source diagram => select LKM SQL to SQL in the properties window.
Click on the target diagram => select IKM SQL Control Append in the properties window.
Goto => Controls tab => select CKM SQL.
Enable or disable flow control:
Goto => Flow tab => click on the target diagram => go to the Property Inspector and set FLOW_CONTROL: TRUE/FALSE.
If you enable FLOW_CONTROL: TRUE, the target table should have a primary key (or another key constraint) defined.
Interface practice with example:
Interface from File to SQL (Oracle):
Step 1:
Identify the KMs:
1) LKM File to SQL
2) IKM SQL Incremental Update or Control Append
3) CKM SQL or CKM Oracle
Step 2:
Create the data server and physical schema for the File technology.
Goto => Topology => Physical Architecture => Technologies => File => Right click => New Data Server
Name: SOURCE_DS
Username/password: not required for a file
Goto => JDBC tab => select the ODI File JDBC driver
JDBC URL: jdbc:snps:dbfile
Create a physical schema for the file data server:
Right click on SOURCE_DS => New Physical Schema.
The schema name should be the path of your source file folder, including the folder name: C:\SOURCE_FILES
Step 3:
Create the logical schema.
Goto => Logical Architecture => Technologies => File => Right click => New Logical Schema.
Name: SOURCE_FILE_LS
Step 4:
Create the model for the File technology.
Goto => Designer => Models => New Model
Name: SOURCE_FILE_MDL
Technology: File
Logical Schema: SOURCE_FILE_LS
Save.
Note: for file technologies we do not do a direct reverse engineering of the whole model; we have to add each file (datastore) manually.
Adding a file to the model:
Right click on the model => New Datastore
Name: EMPF
Browse => Resource Name => select the source file from the source folder.
Goto => Files tab
File Format: Delimited
Heading (number of lines): 1 (the header row holds the column names)
Field Separator: Other (,)
Goto => Columns tab => click on Reverse Engineer.
Save the file datastore.
Step 5:
Create the interface for loading data from the file to SQL.
Projects => Interfaces => Right click => New Interface.
Name: FILE_TO_SQL_INTF
Select the target logical schema.
Goto => Mapping tab => drag and drop the target table into the target mapping area and the source file into the source mapping area.
If the source file data contains any spaces, use a trim function in the mappings:
TRIM() removes spaces on both sides
LTRIM() removes leading (left-side) spaces
RTRIM() removes trailing (right-side) spaces
Use TRIM() in all the mappings as needed.
Goto => Flow tab => click on the source diagram => select LKM File to SQL.
Click on the target diagram and select IKM SQL Control Append or Incremental Update.
Save the interface and run it; after it completes we can see the data in the target table.
EXCEL TO DATABASE (SQL):
Step 1: Create the source .xls data file.
Open the Excel file, rename the worksheet to EMP, then select all the data including the column headers and go to Formulas => Define Name => define the name as EMP (workbook scope).
Save.
Step 2: Create a System DSN for Excel.
Goto => Control Panel => Administrative Tools => ODBC Data Sources
Goto => System DSN tab => click Add
Select the Microsoft Excel driver (*.xls, *.xlsx, ...)
DSN (data source name): XL_DSN
Select Workbook (choose the EMP Excel file) and click OK.
Step 3: Create the data server for Excel.
Goto => Topology => Physical Architecture => Microsoft Excel => Right click => New Data Server.
Name: XL_DS
Goto => JDBC tab => select the ODBC-JDBC bridge driver
In the JDBC URL provide the System DSN created above: XL_DSN
Create the physical schema:
Right click on the created data server XL_DS => New Physical Schema
Select the default schema and save.
Create the logical schema:
Goto => Logical Architecture => Technologies => Microsoft Excel => Right click => New Logical Schema.
Name: XL_LS
Assign the physical schema for every context:
GLOBAL => XL_DS.Default schema
TEST => XL_DS.Default schema
PROD => XL_DS.Default schema
Step 4: Create the model for Excel.
Goto => Designer => Models => New Model
Name: XL_MDL
Technology: Microsoft Excel
Logical Schema: XL_LS
Save.
Goto => Reverse Engineering tab => select the objects to reverse engineer as SYSTEM TABLE.
Note: for Excel and XML data files we need to select the object type SYSTEM TABLE for reverse engineering.
Click the Reverse Engineer button; it imports the Excel sheet into the model with the name EMP, the name we defined in Excel under Formulas => Define Name.
Knowledge modules for Excel to database:
1) LKM SQL to SQL
2) IKM SQL or Oracle Control Append
3) CKM SQL or CKM Oracle
Note: for Excel and XML we do not use the File-type KMs; we have to use the SQL KMs only.
Note: SQL to FILE
While loading data from SQL (a database) to a file, the staging area should not be on the target; staging must always be somewhere other than the target. We can maintain staging on the source or another database, but not on the file target.
KM identification for SQL to FILE:
1) LKM SQL to SQL (source SQL and staging SQL)
2) IKM SQL to File Append (staging SQL and target file)
KM identification for FILE to SQL:
1) LKM File to SQL (source file and staging SQL)
2) IKM SQL Control Append (target is SQL)
KM identification for FILE to FILE:
1) LKM File to SQL (source file and staging SQL)
2) IKM SQL to File Append (staging SQL and target file)
CDC (Change Data Capture):
CDC is used to capture only the changed data from the source tables. To implement CDC we need a JKM (Journalizing Knowledge Module).
CDC works only for database tables, not for flat files, Excel or XML files.
Note: if we want to implement CDC on a source table, the table must satisfy the following requirements:
1) The source table should have a primary key.
2) The source table should allow DML triggers to be created on it.
What is a DML (PL/SQL) trigger? Whenever an event (delete, insert or update) occurs on the source table, the trigger fires and inserts the modified rows into our staging (journal) tables.
Enable CDC at table level in ODI:
Step 1: there are two types of journalizing:
1) Simple
2) Consistent Set
JKM Simple is used for individual tables with no data dependency.
JKM Consistent Set is used for a set of tables with data dependencies, such as parent and child tables.
Accordingly there are two JKMs: for simple journalizing use JKM Oracle Simple; for consistent journalizing use JKM Oracle Consistent.
Import JKM Oracle Simple.
Step 2: enable CDC at table level in the source model.
Goto => Designer => source model => right click on the EMP table => Changed Data Capture => Add to CDC.
Step 3: after adding the table to CDC we need to start the journal.
Before starting the journal we need to set the JKM at the model level:
Open the source model => go to the Journalizing tab => select Simple => select JKM Oracle Simple.
Right click on EMP => Changed Data Capture => Start Journal.
The journalizing process creates the following objects in the source schema for the EMP table:
1) T$EMP - a DML trigger on the source EMP table
2) J$EMP - a table that holds the changed data
3) JV$DEMP - a view created over both EMP and J$EMP
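Conceptually, the journal objects look roughly like the sketch below. This is a simplified illustration only: the object names follow the note above, but the columns and the DDL the JKM actually generates are more elaborate.
-- journal table: one row per change, keyed by the source primary key
CREATE TABLE J$EMP (
  JRN_FLAG VARCHAR2(1),   -- 'I', 'U' or 'D'
  JRN_DATE DATE,          -- when the change happened
  EMPNO    NUMBER(4)      -- primary key of the changed source row
);
-- trigger that records every DML change on EMP into J$EMP
CREATE OR REPLACE TRIGGER T$EMP
AFTER INSERT OR UPDATE OR DELETE ON EMP
FOR EACH ROW
DECLARE
  v_flag VARCHAR2(1);
BEGIN
  IF INSERTING THEN v_flag := 'I';
  ELSIF UPDATING THEN v_flag := 'U';
  ELSE v_flag := 'D';
  END IF;
  INSERT INTO J$EMP (JRN_FLAG, JRN_DATE, EMPNO)
  VALUES (v_flag, SYSDATE, NVL(:NEW.EMPNO, :OLD.EMPNO));
END;
/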
Step 4: enable CDC in the interface.
Goto => Interface => source mapping datastore area => click on the source EMP table => go to the Property Inspector => select "Journalized data only".
After enabling this journalizing flag on the interface, the interface selects data from the JV$DEMP view; this view selects from both EMP and J$EMP and returns only the modified data.
After the interface finishes, the processed rows are deleted from the J$ table.
Note: whenever a change happens on the source table, the DML trigger fires; for an insert, update or delete on the source, the trigger writes into the J$EMP table the JRN_FLAG ('I', 'U' or 'D'), the JRN_DATE (modified date) and the PK column value (EMPNO).
While running the interface, data is selected from the JV$D view, which contains only the modified data; after loading the data into the target table, those rows are deleted from the J$EMP table.
SCD (Slowly Changing Dimension):
OLTP: online transaction processing.
A database that handles day-to-day transactions, with data frequently inserted and updated, is called an OLTP database; it usually contains only normalized tables.
OLAP: online analytical processing.
A database that maintains history data and is used only for data analysis (reporting), not for transactional manipulation (inserts or updates), is called an OLAP database or data warehouse.
In an OLAP database the tables have different names:
1) Dimension tables
2) Fact tables
1) Dimension tables: tables that maintain master data (primary key information) or parent data are called dimension tables.
2) Fact tables: tables that maintain summary (aggregated) data, or child data, are called fact tables.
Note: in OLTP we talk about PK (parent) and FK (child) tables; in OLAP we talk about dimension tables (parent, holding the PK) and fact tables (child, holding the FK).
In data warehousing there are different types of dimensions; the one we discuss here is the slowly changing dimension.
SLOWLY CHANGING DIMENSION:
If the data of a dimension table changes over time, we call it a slowly changing dimension.
There are three types of SCD:
1) SCD Type 1
2) SCD Type 2
3) SCD Type 3
Example table EMP:
ENAME   D.O.B        D.O.J        Experience   SAL
RAM     13-Jun-1984  05-Jul-2011  3.5          5 lac p.a.
He got a hike in 2012. How do we maintain this data in the three types of dimension?
1) SCD Type 1: Type 1 is used when we do not need history for a particular dimension table. SCD Type 1 always overwrites the existing data when changed data is received.
Table EMP (he got a hike from 5 to 6 lac in 2012):
ENAME   D.O.B        D.O.J        Experience   SAL          MODIFIED_DATE
RAM     13-Jun-1984  05-Jul-2011  3.5          6 lac p.a.   05-Jul-12
Note: Type 1 always overwrites the existing data; it does not maintain history. If you do not need history, go for Type 1.
2) SCD Type 2: Type 2 is used to maintain full history. Whenever changed information arrives, it is inserted as a new record, so history is kept for every change; for example, if one employee gets a hike 10 times, 10 rows are inserted for that employee.
To maintain SCD Type 2 the table should contain the following additional columns:
1) Starting timestamp (start date)
2) Ending timestamp (end date)
3) Active record flag
Table EMP (he got a hike from 5 to 6 lac in 2012). Initial row:
ENAME  D.O.B        D.O.J        Experience  SAL         T_STARTDATE  T_ENDDATE    ACTIVE_FLAG
RAM    13-Jun-1984  05-Jul-2011  3.5         5 lac p.a.  05-Jul-11    01-01-2100   Y
In 2012 he got a hike, so with SCD Type 2 we insert a new record, update ACTIVE_FLAG to 'N' on the old record and set its ending timestamp to the end date:
ENAME  D.O.B        D.O.J        Experience  SAL         T_STARTDATE  T_ENDDATE    ACTIVE_FLAG
RAM    13-Jun-1984  05-Jul-2011  3.5         5 lac p.a.  05-Jul-11    04-Jul-2012  N
RAM    13-Jun-1984  05-Jul-2011  4.5         6 lac p.a.  05-Jul-12    01-01-2100   Y
Again he got a hike in 2013:
ENAME  D.O.B        D.O.J        Experience  SAL         T_STARTDATE  T_ENDDATE    ACTIVE_FLAG
RAM    13-Jun-1984  05-Jul-2011  3.5         5 lac p.a.  05-Jul-11    04-Jul-2012  N
RAM    13-Jun-1984  05-Jul-2011  4.5         6 lac p.a.  05-Jul-12    04-Jul-2013  N
RAM    13-Jun-1984  05-Jul-2011  5.5         7 lac p.a.  05-Jul-13    01-01-2100   Y
Whenever the data changes like this, a new row is inserted, the old record's status is updated to N (inactive) and its ending timestamp is set to the end date.
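A rough SQL sketch of that Type 2 pattern for a salary change, assuming a dimension table named EMP_DIM with the columns shown above (the names are illustrative; the ODI SCD IKM generates equivalent logic from the column behaviours set in the model):
-- close the currently active row for the employee whose SAL changed
UPDATE EMP_DIM
SET    T_ENDDATE = SYSDATE, ACTIVE_FLAG = 'N'
WHERE  ENAME = 'RAM' AND ACTIVE_FLAG = 'Y';
-- insert the new version as the active row with an open-ended end date
INSERT INTO EMP_DIM (ENAME, SAL, T_STARTDATE, T_ENDDATE, ACTIVE_FLAG)
VALUES ('RAM', 600000, SYSDATE, DATE '2100-01-01', 'Y');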
3) SCD Type 3: Type 3 is used when we want to maintain history only once. It does not insert a new row; the history is kept in the same row using additional columns. For example, to keep one previous value of a column we create one additional column for it in the table.
Table EMP (initial load):
ENAME  D.O.B        D.O.J        Experience  PREV_SAL    CUR_SAL     MODIFIED_DATE
RAM    13-Jun-1984  05-Jul-2011  3.5         0           5 lac p.a.  05-Jul-11
He got a hike in 2012:
ENAME  D.O.B        D.O.J        Experience  PREV_SAL    CUR_SAL     MODIFIED_DATE
RAM    13-Jun-1984  05-Jul-2011  4.5         5 lac p.a.  6 lac p.a.  05-Jul-12
He got a hike in 2013:
ENAME  D.O.B        D.O.J        Experience  PREV_SAL    CUR_SAL     MODIFIED_DATE
RAM    13-Jun-1984  05-Jul-2011  5.5         6 lac p.a.  7 lac p.a.  05-Jul-13
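A minimal sketch of the Type 3 update, assuming the PREV_SAL/CUR_SAL columns above on a table named EMP_DIM3 (names and the new salary value are illustrative):
-- shift the current value into the history column, then overwrite it in place
UPDATE EMP_DIM3
SET    PREV_SAL      = CUR_SAL,
       CUR_SAL       = 700000,
       MODIFIED_DATE = SYSDATE
WHERE  ENAME = 'RAM';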
In ODI we have a separate IKM for SCD: IKM Oracle Slowly Changing Dimension.
SCD properties (column behaviours):
1) Surrogate key (PK): we need one PK on the SCD table.
2) Overwrite on change: use for columns that do not need history.
3) Add row on change: use for columns that do need history (Type 2 behaviour).
4) Starting timestamp: used for SCD Type 2 only.
5) Ending timestamp: used for SCD Type 2 only.
6) Current record flag: used for SCD Type 2 only.
Steps for implementing SCD Type 1 & 2 in ODI:
Step 1:
For this exercise we create a new table that supports SCD Type 2, with the additional columns shown below.
SCD2:
------
CREATE TABLE SCD2
(
  EMPNO       NUMBER(4,0),
  ENAME       VARCHAR2(10 BYTE),
  JOB         VARCHAR2(9 BYTE),
  MGR         NUMBER(4,0),
  HIREDATE    DATE,
  SAL         NUMBER(7,2),
  COMM        NUMBER(7,2),
  DEPTNO      NUMBER(2,0),
  START_DATE  DATE,
  END_DATE    DATE,
  ACTIVE_FLAG VARCHAR2(1 BYTE)
);
After creating this table we have to do reverse engineering.
Step 2: change the OLAP type of the table to "Slowly Changing Dimension" and enable the SCD behaviour on each column at Model => SCD table => Column => Description tab.
Enable an SCD behaviour for every column; none should be left empty:
EMPNO: Natural Key
ENAME: Overwrite on change
JOB: Overwrite on change
MGR: Overwrite on change
HIREDATE: Overwrite on change
SAL: Add row on change
COMM: Overwrite on change
DEPTNO: Overwrite on change
START_DATE: Starting timestamp
END_DATE: Ending timestamp
ACTIVE_FLAG: Current record flag
All these properties are enabled at Model => target model => SCD2 table => Attributes (columns) => open the column => go to the Description tab => select the slowly changing dimension behaviour.
Step 3: import IKM Oracle Slowly Changing Dimension.
Step 4: create the interface.
INTERFACE OTHER OBJECTS:
-----------------------
1)JOINS
2)FILTERS
3)LOOKUP
4)DATASETS
1)JOINS:
Joins are used to join more than one table. ODI supports all the native SQL joins:
1) Equijoin (inner join)
2) Left outer join
3) Right outer join
4) Full outer join
5) Cross join
Examples:
--------
Source tables: EMP, DEPT
Target table: EMP_DEPT
EMP deptnos: 10,20,30,60
DEPT deptnos: 10,20,30,40,50
Equijoin (inner join / natural join): 10,20,30
Left outer join (EMP, DEPT): 10,20,30,60
Right outer join (EMP, DEPT): 10,20,30,40,50
Full outer join: 10,20,30,40,50,60
Cross join: cross product, EMP * DEPT
FILTERS:
--------
Filters are used to filter the data at the source or staging level; the condition is applied in the WHERE clause and we can use any filter condition, for example loading only DEPTNO=30 into the target table.
In filters we can use all the comparison and logical operators, such as =, >, <, >=, <=, <> and OR, AND, NOT.
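In the generated SQL such a filter simply becomes part of the WHERE clause; a minimal sketch for that DEPTNO example (table and column names taken from these notes):
-- only the DEPTNO=30 rows reach the staging/target flow
SELECT EMPNO, ENAME, DEPTNO
FROM   EMP
WHERE  DEPTNO = 30;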
LOOKUP:
--------
A lookup is used to select data from another table while joining the source tables. It behaves either as a left outer join or as a SELECT-clause subquery, and it suits optional, reference-based data selection.
Lookup advantage: a lookup can give better selection performance compared to a plain left outer join.
A lookup can be implemented in two ways:
1) Left outer join
2) SQL subquery expression
1) Left outer join query:
select
EMP.EMPNO C1_EMPNO,
EMP.ENAME C2_ENAME,
EMP.JOB C3_JOB,
EMP.MGR C4_MGR,
EMP.HIREDATE C5_HIREDATE,
EMP.SAL C6_SAL,
EMP.COMM C7_COMM,
EMP.DEPTNO C8_DEPTNO,
DEPT.DEPTNO C9_D_DEPTNO,
DEPT.DNAME C10_D_DNAME,
DEPT.LOC C11_D_LOC
from SCOTT.EMP EMP
LEFT OUTER JOIN SCOTT.DEPT DEPT ON
EMP.DEPTNO=DEPT.DEPTNO
where (1=1)
2) SQL subquery expression (scalar subqueries in the SELECT list):
select
EMP.EMPNO C1_EMPNO,
EMP.ENAME C2_ENAME,
EMP.JOB C3_JOB,
EMP.MGR C4_MGR,
EMP.HIREDATE C5_HIREDATE,
EMP.SAL C6_SAL,
EMP.COMM C7_COMM,
EMP.DEPTNO C8_DEPTNO,
(select DEPT.DEPTNO from SCOTT.DEPT DEPT where EMP.DEPTNO=DEPT.DEPTNO) C9_D_DEPTNO,
(select DEPT.DNAME from SCOTT.DEPT DEPT where EMP.DEPTNO=DEPT.DEPTNO) C10_D_DNAME,
(select DEPT.LOC from SCOTT.DEPT DEPT where EMP.DEPTNO=DEPT.DEPTNO) C11_D_LOC
from SCOTT.EMP EMP
where (1=1)
DATASETS:
----------
Datasets correspond to the SQL set operators:
1) UNION: union does not display duplicate rows
2) UNION ALL: union all displays duplicates
3) INTERSECT: intersect displays the common values
4) MINUS: minus displays A-B, i.e. it removes the common values and displays only the values from A
Example, using DEPTNO values:
EMP_BANG: 10,20,30,60
EMP_HYD: 10,20,30,40,50
EMP_BANG UNION EMP_HYD: 10,20,30,40,50,60
EMP_BANG UNION ALL EMP_HYD: 10,10,20,20,30,30,40,50,60
EMP_BANG INTERSECT EMP_HYD: 10,20,30
EMP_BANG MINUS EMP_HYD: 60
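In plain SQL these dataset operators map directly onto set-operator queries; a small sketch assuming the two tables above each have a DEPTNO column:
-- distinct departments present in either table
SELECT DEPTNO FROM EMP_BANG
UNION
SELECT DEPTNO FROM EMP_HYD;
-- departments present in EMP_BANG but not in EMP_HYD
SELECT DEPTNO FROM EMP_BANG
MINUS
SELECT DEPTNO FROM EMP_HYD;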
PROJECT OBJECTS:
---------------
1) Markers
----------
Markers are used to show the status (progress) or priority of particular objects. We can use them on objects such as projects, packages, interfaces and procedures, and we can also customize our own priorities and progress markers.
2) VARIABLES:
-------------
A variable is an object that stores a value of some data type (number, varchar or date) that we can use dynamically.
There are two kinds of variables by behaviour:
1) Declare variable: we assign a default value and that value stays the same; it does not change.
2) Refresh variable: a refresh variable gets its value dynamically each time the program runs.
Note: a variable can store only one value at a time; if we pass a second value it overwrites the existing one.
VARIABLES usage:
-----------------
There are two scopes of variables in ODI:
1) Project variable
2) Global variable
Example: variable name V_NAME
Usage for a project variable: #PROJECTNAME.V_NAME or :PROJECTNAME.V_NAME
Usage for a global variable: #GLOBAL.V_NAME or :GLOBAL.V_NAME
These variables can be used in procedures, interfaces and packages, and inside a variable we can use another variable.
Global variables can also be used in Topology and Security.
In an interface mapping we can use them as #PROJECTNAME.VARIABLENAME or #GLOBAL.VARIABLENAME (or with the : prefix).
The same variable can be used as 4 step types in a PACKAGE:
1)DECLARE VARIABLE
2)REFRESH VARIABLE
3)EVALUATE VARIABLE
4)SET VARIABLE
We will see these in the package examples.
3)SEQUENCES:
-------------
Sequences are used to generate a sequence of values for inserting data into a table. A sequence maintains a current value and an increment-by value.
Types of sequences:
------------------
1) Standard sequence
2) Specific sequence
3) Native sequence
1) Standard sequence:
--------------------
This sequence is created at the ODI level; we provide the increment-by value, it starts at zero and, based on the increment, it gives us the next value.
Sequence usage:
---------------
Project sequence: #PROJECTNAME.SEQUENCENAME_NEXTVAL or :PROJECTNAME.SEQUENCENAME_NEXTVAL
Global sequence: #GLOBAL.SEQUENCENAME_NEXTVAL or :GLOBAL.SEQUENCENAME_NEXTVAL
Example, sequence name SEQ_EMPID: #NEW_PROJECT.SEQ_EMPID_NEXTVAL
A standard sequence starts at 0 and, based on our increment, gives us the incremented values.
A specific sequence starts based on our own condition and gives the incremented value depending on the value the condition returns.
Both standard and specific sequences use the same increment mechanism; we go for a specific sequence when we want to start from some specific value.
Native sequence: available from version 11.1.1.5 and above; with it we directly integrate database sequences. If we use a native sequence, the ODI increment is disabled.
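A native sequence simply wraps a database sequence; a minimal Oracle sketch of the kind of object it would point at (the sequence name is illustrative):
-- database sequence that a native ODI sequence would reference
CREATE SEQUENCE SEQ_EMPID START WITH 1 INCREMENT BY 1;
-- the database hands out the next value, so ODI's own increment is not used
SELECT SEQ_EMPID.NEXTVAL FROM DUAL;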
4) USER FUNCTIONS:
------------------
User functions are used for our own customized, parameter-based requirements.
Example: standard functions like SUM, MIN and AVG take parameters; we pass a parameter and get a value back. In the same way, if we have our own requirement we can create a new user function and reuse it across the project.
For a user function the syntax and the technology code are mandatory.
User function parameters have to be given in the syntax like this:
FUNNAME($(P1),$(P2))
Example:
--------
Right click on User Functions => New User Function
Name: FUNNAME
Syntax: FUNNAME($(P1),$(P2)) (the parameters are mandatory)
Goto => Implementations tab => click on + => select the Oracle technology => write the code below in the code area.
==================================
==================================
CASE
WHEN $(P1) IS NULL THEN
$(P2)
ELSE
$(P1)
END
We can use this user function in the mappings of interfaces, e.g. FUNNAME(COL1,COL2).
PROCEDURES:
-----------
A procedure is a step-by-step object used to execute code of any technology. We can write our own code in any technology, across multiple steps. We use procedures for writing our own code or scripts such as SQL scripts, PL/SQL scripts, Java, Jython, Linux/Unix and Windows commands, etc.
In a procedure step there are two commands:
1)Command on Source
2)Command on Target
1) COMMAND ON SOURCE:
---------------------
The Command on Source is used only for selection (selecting data).
2) COMMAND ON TARGET:
--------------------
The Command on Target can be used for all DML (insert, update, delete), DDL (create table, drop table, truncate, alter table), selects and other programming-language scripts; it can use the data selected by the Command on Source.
Whatever columns we select in the COMMAND ON SOURCE we can use in the COMMAND ON TARGET as :COLUMNNAME or :ALIASNAME.
Command On SOURCE:
-------------------
SELECT EMPNO,ENAME NAME FROM EMP
COMMAND ON TARGET:
------------------
INSERT INTO SAMPLE VALUES(:EMPNO,:NAME)
PACKAGE:
--------
A package is a workflow object; we use it to call multiple interfaces, procedures, variables and even other packages. In a package we can also use ODI tools such as FTP tools, the mailing tool, and local file copy/move/delete tools.
SCENARIO:
---------
A scenario is an executable (compiled) object generated from an interface, procedure, package or variable. After developing our interfaces, procedures, packages and variables we generate scenarios; a scenario is not editable or readable, it can only be executed and scheduled.
In production we use only scenarios; we cannot use the direct interfaces, procedures, packages or variables, only the scenarios generated from those objects.
How do we generate a scenario?
------------------------
After finishing development of interfaces, procedures, packages and variables we can generate scenarios:
right click the developed object (interface, procedure, package or variable) => Generate Scenario.
These scenarios can be executed and scheduled directly.
Can I edit a scenario?
No, we cannot edit a scenario. We edit the direct object (the interface, etc.) and then either regenerate the scenario or create a new one.
Regenerate scenario: it overwrites the existing scenario.
New scenario: it creates a new scenario, so the old and new scenarios are both available.
Can I execute a scenario without an agent in production?
------------------------------------------------------
No. To execute a scenario we need an agent, either a standalone agent or a J2EE agent.
Can I schedule a scenario without an agent?
-----------------------------------------
No. Without an agent we cannot schedule a scenario; to schedule one we need at least one agent, standalone or J2EE.
STANDALONE AGENT CONFIGURATION:
------------------------------
Step 1:
------
Goto => ODI home directory => oracledi => agent => bin folder
C:\Oracle\product\11.1.1\Oracle_ODI_1\oracledi\agent\bin
Edit odiparams.bat for Windows, or odiparams.sh for Unix/Linux.
Step 2:
------
Go to the repository connection information and change your master repository database details:
JDBC URL: localhost:1521:orcl
Username: mclass
Password: encrypted password (for mclass)
Step 3: Encrypt the password.
---------------------------
Open a CMD prompt and change directory to the agent\bin folder:
C:\Oracle\product\11.1.1\Oracle_ODI_1\oracledi\agent\bin
Type: encode <password> and press Enter.
Right click => Mark => select the encrypted password => right click to copy it.
Paste it into the repository connection information in the odiparams.bat file.
Step 4:
-----
Provide the encrypted SUPERVISOR password:
encode welcome
Mark => right click => select => right click => paste it into the supervisor password entry in odiparams.bat.
Step 5:
------
Change the work repository name: WORKREP1
After these changes, save the odiparams.bat file and close it.
Step 6:
------
Goto => ODI Studio => Topology => Physical Architecture => Agents => Right click => New Agent
Name: OracleDIAgent
Port: 20910
Save the agent.
Step 7:
------
Go to the agent\bin directory and run agent.bat; it starts the standalone agent.
Or we can start the agent from the command line: go to the agent\bin directory and type:
agent -NAME="OracleDIAgent" -PORT=20910
Step 8:
------
Create the logical agent so it can be accessed in the Designer.
LOAD PLANS:
-----------
A load plan is an extension of the package concept; with it we can schedule scenarios to run in parallel, serially or as a hierarchical load.
In a load plan we can use only scenarios; we cannot use direct objects such as interfaces, procedures, packages or variables.
Creating a new load plan.
SOLUTION:
---------
A solution is used to create a version of an entire project together with its dependent objects, such as models; for version control we go for solutions.
Create a new solution => drag and drop your project into the new solution area; it asks for confirmation to create the version. Click Yes to create a new version for all dependent objects.
Import/Export:
--------------
Import/export is used to move ODI objects from one repository to another.
When we want to move from the DEVELOPMENT to the TEST environment, we can smart export the entire project from development and import it into the test environment.
When we want to move from TEST to PRODUCTION, we smart export only the scenarios and load plans and import them into production using the smart import feature.
After exporting, save the XML file as a backup. Open the XML file and note the following points:
1) at the top of the XML file, the repository version and ODI version;
2) at the bottom of the XML file, the master repository ID and work repository ID.
Log in to the other environment (TEST, QA or PRODUCTION) where we want to import; the repository version and ODI version must be the same, and the development work and master repository IDs must NOT be the same as those of the environment we are moving to.
If the work and master repository IDs are the same, we need to change (renumber) the IDs for the QA, TEST or PRODUCTION environment using the renumber feature: Topology => Repositories => Master Repository => Right click => Renumber, and the same for the work repository.
Now we can import the XML file into the new environment.
If the ODI repository versions are different, we first need to migrate from the old version to the new version using the Upgrade Assistant (UA) repository migration utility.