Analytic Functions

You can get any part of data and any sorted order you want with Analytic functions. Its syntax is OVER ((PARTITION BY … ORDER BY …)). Here is some samples with analytic functions I have used. First sample is getting number of (COUNT) employees by departments and ordered by (ROW_NUMBER ) departments.

select   A.DEPTNO

, row_number () over (partition by A.DEPTNO order by A.EMPNO) DEPT_EMP_ROWNO

, A.EMPNO

, row_number () over (order by A.DEPTNO, A.EMPNO) EMP_ROWNO

, count ( * ) over (partition by A.DEPTNO) DEPT_TOTAL_EMP_CNT

, count ( * ) over () TOTAL_EMP_CNT

, count (distinct A.DEPTNO) over () TOTAL_DEPT_CNT

from   SCOTT.EMP A

order by   A.DEPTNO, A.EMPNO;

RANK clause is used for getting row number of given order. RANK and DENSE_RANK is differentiates as RANK is considering number of records and at last you can get count of records and DENSE_RANK is not equal to. You must use rank when number of records is important. RATIO_TO_REPORT() gives us report percentages.

select   EMPNO

, SAL

, round (RATIO_TO_REPORT (SAL) over () * 100, 2) RAPOR_SAL_PERCENT

, rank () over (order by SAL desc) TOP_SAL_ROWNO

, dense_rank () over (order by SAL desc) DETAIL_TOTAL_SAL_ROWNO

, DEPTNO

, rank () over (partition by DEPTNO order by SAL desc) DEPT_SAL_ROWNO

, dense_rank () over (partition by DEPTNO order by SAL desc) DETAIL_DEPT_SAL_ROWNO

, round (RATIO_TO_REPORT (SAL) over (partition by DEPTNO) * 100, 2) DEPT_SAL_PERCENT

, sum (SAL) over (partition by DEPTNO) TOP_DEPT_SAL

, round ( ( (sum (SAL) over (partition by DEPTNO)) / (sum (SAL) over ())) * 100, 2) TOTAL_DEPT_PERCENT

, sum (SAL) over () TOTAL_SAL

from   SCOTT.EMP

order by   SAL desc;

You can access following or previous rows with LEAD(<value>, <offset>, <default>) and LAG(<value>, <offset>, <default>) clauses. Offset is 1 then, next record, 2 then following second record is get.

LEAD (hiredate, 3, TO_DATE (’29990101′, ‘YYYYMMDD’))

OVER (PARTITION BY deptno ORDER BY hiredate ASC)

Null records cen be sorted at last or first with NULLS LAST and NULLS FIRST clauses.

LEAD (hiredate, 1, TO_DATE (’29990101′, ‘YYYYMMDD’))

OVER (PARTITION BY deptno ORDER BY hiredate ASC NULLS LAST)

Calculate how many days past after last person is hired. (days_past)

SELECT deptno, empno, hiredate

, LAG (hiredate, 1

, TO_DATE ('10000101', 'YYYYMMDD'))

OVER (PARTITION BY deptno ORDER BY hiredate ASC)

previous_hiredate

, LEAD (hiredate, 1, TO_DATE ('29990101', 'YYYYMMDD'))

OVER (PARTITION BY deptno ORDER BY hiredate ASC)

next_hiredate

, hiredate- LAG (hiredate) OVER (PARTITION BY deptno ORDER BY hiredate ASC NULLSLAST) days_past

FROM scott.emp ORDER BY deptno, hiredate ASC;

Calculating running totals with getting previous records.

with temp_running_total as (

select 1 col,100 val from dual union all

select 2 col,200 val from dual union all

select 3 col,250 val from dual union all

select 4 col,300 val from dual

)

select col, val, sum(val) over (order by col rows between unbounded preceding andcurrent row)running_total

from temp_running_total

You can get first and last records with FIRST_VALUE() and LAST_VALUE() clauses . Null values will be not calculated using IGNORE NULLS clause. This clause is also used in MAX() and MIN(). Previous and following all records can be get with ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING clause.

SELECT deptno, empno, job, hiredate , FIRST_VALUE (hiredate IGNORE NULLS)

OVER (partition by deptno ORDER BY hiredate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

dept_first_hiredate

, LAST_VALUE (hiredate IGNORE NULLS)

OVER (partition by deptno ORDER BY hiredate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

dept_last_hiredate

,FIRST_value (case when job='MANAGER' then

hiredate else null end IGNORE NULLS)

OVER (partition by deptno ORDER BY hiredate

ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING )

dept_mngr_hiredate

FROM scott.emp

ORDER BY deptno, hiredate ASC;

Rolling between records:

Previous 2 and next 3 records ROWS BETWEEN 2 PRECEDING AND 3 FOLLOWING

Previous all and next 3 records ROWS BETWEEN UNBOUNDED PRECEDING AND 3 PRECEDING

Following 3 and following all records

ROWS BETWEEN 3 FOLLOWING AND UNBOUNDED FOLLOWING

Previous all records and current record

ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

Current record and following all

ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING

You can get after or before with count of records with ROWS and values with RANGE. Current records values+- Offset value, great isn't it? :D

WITH TEMP_RANGE AS(

SELECT 'A' NAME, 0 VALUE, 0-4 "PREC", 0+2 "FOLLOW" FROM DUAL

UNION

SELECT 'B', 3, 3-4, 3+2 FROM DUAL

UNION

SELECT 'C', 4, 4-4 , 4+2 FROM DUAL

UNION

SELECT 'D', 5, 5-4 , 6+2 FROM DUAL)

SELECT NAME, VALUE, PREC,FOLLOW,

COUNT(*) OVER ( ORDER BY VALUE RANGE

BETWEEN 4 PRECEDING AND 2 FOLLOWING) CNT FROM TEMP_RANGE

A new analytic function is introduced; Listagghttp://nimishgarg.blogspot.com/2010/02/oracle-new-string-aggregation.html

Top-n rows calculated with rownum. When it comes to n -3 between n-5 Row_number is the savior.

select * from (

SELECT a.deptno,

a.empno, ROW_NUMBER () OVER (ORDER BY a.empno) emp_rowno

FROM scott.emp a) b

where emp_rowno between 6 and 10

Pivot is used for transmuting columns to rows.

SELECT *

FROM (SELECT job, deptno, sum(sal) sal FROM emp GROUP BY job, deptno)

PIVOT ( sum(sal) FOR deptno IN (10, 20, 30, 40) )

To use analytic function on ODI convert below sql Or you can edit knowledge modules

sum(sal) over (partition by deptno)

to

<?out.print(“SUM”);?> (sal) OVER (PARTITION BY deptno).

Or you can edit knowledge modules:

Create option ANALYTIC as checkbox,

Edit Insert new rows detail as

<% if (odiRef.getOption("ANALYTIC").equals("0")) {%>
<%=odiRef.getGrpBy()%>
<%=odiRef.getHaving()%>
<%}else{%><%}%>

Synchronous and Asynchronous execution in ODI

In data warehouse designing, an important step is to deciding which step is before/after. Newly added packages and required DW data must be analyzed carefully. Synchronous addings can lengthen ETL duration. Interfaces, procedures without generated scenario cannot be executed in parallel. Only scenario executions can be parallel in ODI. Default scenario execution is synch in ODI. If you want to set a scenario to executed in parallel then you will write “-SYNC_MODE=2″ on command tab or select Synchronous / Asynchronous option Asynchronous in General tab. I have created a package as interfaces executes as; INT_JOBS parallel INT_REGIONS synch INT_REGIONS synch INT_COUNTRIES synch INT_LOCATIONS parallel INT_EMPLOYEES parallel (Interfaces are independent.) Selecting beginning and ending times and durations from repository tables as ODI 11g operator is not calculating these values. It is obvious in ODI 10g operator. SELECT sess_no...

TechWare

Search This Blog

Analytic Functions

Labels

Comments

Post a Comment

Popular posts from this blog

ODI KM Adding Order by Option

Synchronous and Asynchronous execution in ODI

Oracle Data Integrator Tools:OdiSqlUnload, OdiFileCopy ,OdiZip