Tamilselvan G
Beacon Infotech Corporation
Class-4
2007-11-25
Analytic Functions
ROW_NUMBER
RANK
DENSE_RANK
COUNT
LEAD
NTILE
SUM
FIRST_VALUE
LAST_VALUE
Ratio_To_Report
LAG
AVG
Rollup Details:
ROLLUP's action is straightforward: it creates subtotals which "roll up" from the most
detailed level to a grand total, following a grouping list specified in the ROLLUP clause.
ROLLUP takes as its argument an ordered list of grouping columns. First, it calculates
the standard aggregate values specified in the GROUP BY clause. Then, it creates
progressively higher-level subtotals, moving from right to left through the list of grouping
columns. Finally, it creates a grand total.
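The roll-up behaviour described above can be sketched as follows. This is a minimal illustration, not the author's query: the table name sales_by_state is hypothetical, and its columns (st, product, sales) are assumed to match the sample sales data in this deck.

```sql
-- Hypothetical table: sales_by_state(st, product, sales)
-- ROLLUP(st, product) groups at three levels, moving right to left:
--   (st, product)  detail rows
--   (st)           one subtotal per state
--   ()             grand total
-- GROUPING(col) returns 1 on rows where col has been rolled up.
SELECT DECODE(GROUPING(st),      1, 'ALL STATES',   st)      AS st,
       DECODE(GROUPING(product), 1, 'ALL PRODUCTS', product) AS product,
       SUM(sales)                                            AS total_sales
FROM   sales_by_state
GROUP  BY ROLLUP (st, product);
```

In general, ROLLUP over n columns produces n + 1 grouping levels.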
CUBE Details:
CUBE takes a specified set of grouping columns and creates subtotals for all possible
combinations of them. In terms of multi-dimensional analysis, CUBE generates all the
subtotals that could be calculated for a data cube with the specified dimensions. If you
have specified CUBE(col1, col2, col3 ), the result set will include all the values that would
be included in an equivalent ROLLUP statement plus additional combinations.
Both functions were introduced in Oracle 8.1.5.
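For comparison, a CUBE over the same two columns might look like the sketch below. The table name sales_by_state is hypothetical; the column names are assumed from the sample data in this deck.

```sql
-- Hypothetical table: sales_by_state(st, product, sales)
-- CUBE(st, product) groups by every combination of the listed columns:
--   (st, product), (st), (product), ()
-- The (product) grouping is what ROLLUP(st, product) does not produce.
SELECT DECODE(GROUPING(st),      1, 'ALL STATES',   st)      AS st,
       DECODE(GROUPING(product), 1, 'ALL PRODUCTS', product) AS product,
       SUM(sales)                                            AS total_sales
FROM   sales_by_state
GROUP  BY CUBE (st, product);
```

CUBE over n columns produces 2^n groupings, so the number of subtotal rows grows quickly as columns are added.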
SDATE                ST PRODUCT      SALES
-------------------- -- ------- ----------
07-MAY-2006 21:27:18 GA TV             100
06-MAY-2006 21:27:18 GA PHONE           40
05-MAY-2006 21:27:18 GA PHONE           30
05-MAY-2006 21:27:18 GA COMP           200
04-MAY-2006 21:27:18 FL TV             200
02-MAY-2006 21:27:18 FL PHONE           20
03-MAY-2006 21:27:18 FL COMP           220

7 rows selected.
9 rows selected.
DECODE(GRO DECODE(GROUP SUM(SALES)
---------- ------------ ----------
ALL STATES ALL PRODUCTS        810
ALL STATES TV                  300  ------- Extra row
ALL STATES COMP                420  ------- Extra row
ALL STATES PHONE                90  ------- Extra row
FL         ALL PRODUCTS        440
FL         TV                  200
FL         COMP                220
FL         PHONE                20
GA         ALL PRODUCTS        370
GA         TV                  100
GA         COMP                200
GA         PHONE                70

12 rows selected.

In cube, we will get the total of the 2nd column used in the cube function.
Type
-----------------
NUMBER(38)
VARCHAR2(20)
VARCHAR2(20)

CUST_LNAME
-------------------
TOM
DAVID
BOBBY
WITHOUT ROLLUP
select *
from ( select /*+ ordered */
cust_fname, cust_lname, to_char(b.cust_id) as cust,
sum(order_qty) ord_tot, sum(supplied_qty) ord_supp
from cust_order a, customer b
where b.cust_id = a.cust_id
group by cust_fname, cust_lname, to_char(b.cust_id)
UNION
select /*+ Ordered */
'', '', 'Report Total' as cust,
sum(order_qty) ord_tot, sum(supplied_qty) ord_supp
from cust_order a, customer b
where b.cust_id = a.cust_id
)
order by 1 NULLS LAST
CUST_FNAME          CUST_LNAME          CUST            ORD_TOT   ORD_SUPP
------------------- ------------------- ------------ ---------- ----------
ELLISON             BOBBY               1003             639105     127791
PETER               DAVID               1002             638318     170388
SCOTT               TOM                 1001             639977     212985
                                        Report Total    1917400     511164

# of LIO = 28,409
WITH ROLLUP
select *
From (
select decode(grouping(cust_fname),1, '*All First Name', cust_fname) FNAME,
decode(grouping(cust_lname),1, '*All Last Name', cust_lname) LNAME,
decode(grouping(b.cust_id), 1, '*Report Total', b.cust_id) custid,
sum(order_qty) ord_tot,
sum(supplied_qty) ord_supp
from cust_order a,
customer b
where b.cust_id = a.cust_id
group by rollup(cust_fname, cust_lname, b.cust_id)
) c
where (c.custid = '*Report Total' and
c.lname = '*All Last Name' and
c.fname = '*All First Name' ) OR
(c.custid != '*Report Total' )
FNAME                LNAME                CUSTID            ORD_TOT   ORD_SUPP
-------------------- -------------------- -------------- ---------- ----------
PETER                DAVID                1002               638318     170388
SCOTT                TOM                  1001               639977     212985
ELLISON              BOBBY                1003               639105     127791
*All First Name      *All Last Name       *Report Total     1917400     511164
Even though I reduced the elapsed time by half, I am not satisfied. Can anyone reduce the elapsed time by a further half?
2. Analytic Functions
One of the coolest things that happened in Oracle 8.1.6 was the introduction
of analytic functions.
Prior to Oracle 8.1.6, developers wrote complex SQL statements for reports
such as the following:
1. Top-N sales reps by state
2. Ratio to total
3. Moving average (moving window calculation)
4. Accessing previous and next rows in a group
5. First/last row within a group (window)
6. Ranking and percentile
7. Linear regression statistics
With the help of analytic functions, today we can write more
scalable application code that not only outperforms the old SQL code
but also solves complex business requirements.
Processing Order
1. Joins, WHERE, GROUP BY and HAVING
2. Partitions are created; the analytic function is applied to each row in each partition
3. Final ORDER BY
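One practical consequence of this processing order is that an analytic function cannot be used directly in a WHERE clause, because WHERE is evaluated before the function is applied. A minimal sketch, assuming the ana_sales table (st, ename, sale_amt) used elsewhere in this deck; the sale_amt > 0 filter is only there to show where an ordinary predicate belongs:

```sql
-- Not allowed: the analytic function would have to run before WHERE.
-- Oracle rejects this with ORA-30483 (window functions are not allowed here).
--   SELECT st, ename
--   FROM   ana_sales
--   WHERE  ROW_NUMBER() OVER (PARTITION BY st ORDER BY sale_amt DESC) = 1;

-- Works: compute the analytic value in an inline view, filter outside it.
SELECT st, ename, sale_amt
FROM  (SELECT st, ename, sale_amt,
              ROW_NUMBER() OVER (PARTITION BY st
                                 ORDER BY sale_amt DESC) AS rn
       FROM   ana_sales
       WHERE  sale_amt > 0)   -- step 1: ordinary WHERE on the base rows
WHERE  rn = 1                 -- filter on the analytic result (step 2)
ORDER  BY st;                 -- step 3: final ORDER BY
```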
Output is:

ST   SALE_AMT     NTILE4
-- ---------- ----------
AL          0          1
FL         10          1
NY         80          1
GA        100          1
NY        120          2
GA        200          2
FL        220          2
GA        300          2
FL        330          3
GA        400          3
FL        450          3
GA        500          3
FL        510          4
NY        910          4
NY        730          4

15 rows selected.
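The query behind this output did not survive in the slides. A query of roughly the following shape would produce an NTILE4 column like the one shown; it is an assumption, not the author's statement, and it uses the ana_sales table named elsewhere in this deck.

```sql
-- Assumed shape of the NTILE query; not taken from the original slide.
-- NTILE(4) splits the ordered rows into 4 buckets of near-equal size.
-- When the row count is not divisible by 4, the lower-numbered buckets
-- receive the extra rows (15 rows -> buckets of 4, 4, 4 and 3).
SELECT st,
       sale_amt,
       NTILE(4) OVER (ORDER BY sale_amt) AS ntile4
FROM   ana_sales
ORDER  BY sale_amt;
```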
Beacon Infotech Corporation
www.oracleact.com
Analytic_Function(argument1, argument2, ...)
OVER ( PARTITION clause ORDER BY clause WINDOW clause )
Window Clause
Where
Analytic_Function is the name of the function, such as MIN, MAX, ROW_NUMBER, RANK,
DENSE_RANK, etc.
Arguments can number 0, 1, 2, or 3. (No more than 3 arguments are allowed.)
query_partition_clause
Use the PARTITION BY clause to partition the query result set into groups based on one or more
value_expr. If you omit this clause, then the function treats all rows of the query result set as a
single group.
Order_by_clause
Use the order_by_clause to specify how data is ordered within a partition. For all analytic functions
except PERCENTILE_CONT and PERCENTILE_DISC (which take only a single key), you can order
the values in a partition on multiple keys, each defined by a value_expr and each qualified by an
ordering sequence.
Within each function, you can specify multiple ordering expressions. Doing so is especially useful
when using functions that rank values, because the second expression can resolve ties between
identical values for the first expression.
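The partition and ordering rules above can be illustrated with a short sketch. It assumes the ana_sales table (st, ename, sale_amt) from this deck; using ename as the second ordering key is an illustrative choice, not something the slides prescribe.

```sql
SELECT st, ename, sale_amt,
       -- One ordering key: reps with equal sale_amt share the same rank.
       RANK() OVER (PARTITION BY st
                    ORDER BY sale_amt DESC)        AS rank_one_key,
       -- Two ordering keys: ename resolves ties on sale_amt.
       RANK() OVER (PARTITION BY st
                    ORDER BY sale_amt DESC, ename) AS rank_two_keys,
       -- No PARTITION BY: the whole result set is treated as one group.
       COUNT(*) OVER ()                            AS total_rows
FROM   ana_sales;
```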
windowing_clause
Some analytic functions allow the windowing_clause.
ROWS | RANGE These keywords define for each row a window (a physical or logical set of rows)
used for calculating the function result. The function is then applied to all the rows in the window.
The window moves through the query result set or partition from top to bottom.
ROWS specifies the window in physical units (rows).
RANGE specifies the window as a logical offset.
You cannot specify this clause unless you have specified the order_by_clause. Some window
boundaries defined by the RANGE clause let you specify only one expression in the
order_by_clause.
The value returned by an analytic function with a logical offset is always deterministic. However,
the value returned by an analytic function with a physical offset may produce nondeterministic
results unless the ordering expression results in a unique ordering. You may have to specify
multiple columns in the order_by_clause to achieve this unique ordering.
BETWEEN ... AND Use the BETWEEN ... AND clause to specify a start point and end point for the
window. The first expression (before AND) defines the start point and the second expression (after
AND) defines the end point.
If you omit BETWEEN and specify only one end point, then Oracle considers it the start point, and the end
point defaults to the current row.
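A minimal sketch of the ROWS and RANGE forms described above. The table name weather is hypothetical; the columns rdate and faren are assumed from the temperature data shown in the LAG and LEAD section of this deck.

```sql
-- Hypothetical table: weather(rdate DATE, faren NUMBER)
SELECT rdate, faren,
       -- Physical window: the current row and the 2 rows before it.
       AVG(faren) OVER (ORDER BY rdate
                        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS avg_3_rows,
       -- BETWEEN omitted: 2 PRECEDING is the start point and the end
       -- point defaults to the current row, so this is the same window.
       SUM(faren) OVER (ORDER BY rdate
                        ROWS 2 PRECEDING)                         AS sum_3_rows,
       -- Logical window: every row whose rdate is <= the current rdate.
       SUM(faren) OVER (ORDER BY rdate
                        RANGE BETWEEN UNBOUNDED PRECEDING
                                  AND CURRENT ROW)                AS running_total
FROM   weather
ORDER  BY rdate;
```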
UNBOUNDED PRECEDING Specify UNBOUNDED PRECEDING to indicate that the window starts at
the first row of the partition. This is the start point specification and cannot be used as an end point
specification.
UNBOUNDED FOLLOWING Specify UNBOUNDED FOLLOWING to indicate that the window ends at
the last row of the partition. This is the end point specification and cannot be used as a start point
specification.
CURRENT ROW As a start point, CURRENT ROW specifies that the window begins at the current
row or value (depending on whether you have specified ROWS or RANGE, respectively). In this case
the end point cannot be value_expr PRECEDING.
As an end point, CURRENT ROW specifies that the window ends at the current row or value
(depending on whether you have specified ROWS or RANGE, respectively). In this case the start
point cannot be value_expr FOLLOWING.
value_expr PRECEDING or value_expr FOLLOWING For RANGE or ROWS:
If value_expr FOLLOWING is the start point, then the end point must be value_expr FOLLOWING.
If value_expr PRECEDING is the end point, then the start point must be value_expr PRECEDING.
If you are defining a logical window defined by an interval of time in numeric format, then you may
need to use conversion functions.
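A logical window over a time interval might be written as in the sketch below. The table name weather is hypothetical (columns assumed from the temperature data in this deck), and NUMTODSINTERVAL is shown as one example of the conversion functions mentioned above.

```sql
-- Hypothetical table: weather(rdate DATE, faren NUMBER)
SELECT rdate, faren,
       -- Interval literal: rows dated within the 2 days up to the current row.
       AVG(faren) OVER (ORDER BY rdate
                        RANGE BETWEEN INTERVAL '2' DAY PRECEDING
                                  AND CURRENT ROW)                    AS avg_last_3_days,
       -- Numeric value converted to an interval with NUMTODSINTERVAL.
       MAX(faren) OVER (ORDER BY rdate
                        RANGE BETWEEN CURRENT ROW
                                  AND NUMTODSINTERVAL(2, 'DAY') FOLLOWING) AS max_next_3_days
FROM   weather
ORDER  BY rdate;
```

Unlike a ROWS window, a RANGE window adapts to gaps in the dates: a missing day simply means fewer rows fall inside the interval.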
FAQ
1. What shall I do if all the reps in a state made 0 sales? Can I include those guys in
the report?
2. List the set of sales reps who made the top 3 sales by state.
3. List up to 3 people who made the top sales. If 4 or more reps happen to make
the largest sales, the answer would be no rows. If 2 reps make the highest sales
and 2 reps make the same second-highest sales, the answer will be only 2 reps
(highest). Or does the ABC Corp want all 4 reps?
4. Sort the sales reps by salary from greatest to lowest within a state. List the
first 3 reps (or fewer if fewer than 3 reps work in a state).
Name       Null?    Type
---------- -------- --------------
ST                  CHAR(2)
ENAME               VARCHAR2(10)
SALE_AMT            NUMBER(6,2)
SOMETXT1            VARCHAR2(3000)
SOMETXT2            VARCHAR2(3000)

ST ENAME        SALE_AMT
-- ---------- ----------
GA Tamil             100
GA Tom               200
GA Scott             300
GA Veera             400
GA Kumar             500
FL Bush               10
FL Gore              220
FL Ellison           330
FL Gates             450
FL Ravi              510
NY Alex               80
NY Mike              120
NY Lux               730
NY Mary              730
NY Kristine          910
AL Ann                 0
AL Arasu               0
AL Ram                 0
AL Gopi                0

19 rows selected.
ST ENAME        SALE_AMT         DR
-- ---------- ---------- ----------
FL Ravi              510
FL Gates             450
FL Ellison           330
GA Kumar             500
GA Veera             400
GA Scott             300
NY Kristine          910
NY Lux               730
NY Mary              730
NY Mike              120

10 rows selected.
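The query for this slide is missing from the extracted text. A DENSE_RANK query of the following shape would answer "top 3 sales by state" and keep tied reps together; it is a sketch under assumptions, not the author's statement. In particular, the sale_amt > 0 filter is a guess at why the AL reps (all with 0 sales) do not appear in the output.

```sql
SELECT st, ename, sale_amt, dr
FROM  (SELECT st, ename, sale_amt,
              -- Ties share a rank and the next rank is not skipped,
              -- so "top 3" means the top 3 distinct sale amounts.
              DENSE_RANK() OVER (PARTITION BY st
                                 ORDER BY sale_amt DESC) AS dr
       FROM   ana_sales
       WHERE  sale_amt > 0)   -- assumed filter, see note above
WHERE  dr <= 3
ORDER  BY st, dr;
```

Because tied reps share a rank, a state can return more than 3 rows, as NY does here with two reps at 730.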
select *
...
count(*)
...
from ana_sales
...
/

ST ENAME        SALE_AMT        CNT
-- ---------- ---------- ----------
FL Ravi              510          1
FL Gates             450
FL Ellison           330
GA Kumar             500
GA Veera             400
GA Scott             300
NY Kristine          910
NY Lux               730             !!!!!
NY Mary              730             !!!!!

9 rows selected.

Note: !!!!! Both guys made the same sales.
Sort the sales reps by salary from greatest to lowest within a state. List
the first 3 reps (or fewer if fewer than 3 reps work in a state).
I updated the sale_amt to 500 for all guys in NY state; let us see what we get.
SQL> get top-4
select *
...
row_number()
...
) rn
from ana_sales
...
where rn <= 3
/

ST ENAME        SALE_AMT         RN
-- ---------- ---------- ----------
FL Ravi              510
FL Gates             450
FL Ellison           330
GA Kumar             500
GA Veera             400
GA Scott             300
NY Alex              500
NY Mike              500
NY Lux               500

9 rows selected.
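Only fragments of the ROW_NUMBER query survive above. A complete query in the same spirit could look like the sketch below; the ename tie-breaker is an addition for illustration and is not in the original fragments.

```sql
SELECT st, ename, sale_amt, rn
FROM  (SELECT st, ename, sale_amt,
              -- ROW_NUMBER never repeats a value within a partition, so
              -- rn <= 3 returns at most 3 rows per state even when reps tie.
              -- ename is added as a second key (an assumption) so that the
              -- choice among tied reps is repeatable.
              ROW_NUMBER() OVER (PARTITION BY st
                                 ORDER BY sale_amt DESC, ename) AS rn
       FROM   ana_sales)
WHERE  rn <= 3
ORDER  BY st, rn;
```

Without a tie-breaking key, which 3 of the tied NY reps are returned is arbitrary and may change between runs.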
ST ENAME        SALE_AMT     DR     RA     RN
-- ---------- ---------- ------ ------ ------
FL Ravi              510      1      1      1
FL Gates             450      2      2      2
FL Ellison           330      3      3      3
FL Gore              220      4      4      4
FL Bush               10      5      5      5
GA Kumar             500      1      1      1
GA Veera             400      2      2      2
GA Scott             300      3      3      3
GA Tom               200      4      4      4
GA Tamil             100      5      5      5
NY Lux               500      1      1      1
NY Mary              500      1      1      2
NY Kristine          500      1      1      3
NY Alex              300      2      4      4
NY Mike              300      2      4      5

15 rows selected.
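A side-by-side comparison like the one above can be produced with a single query. This is a sketch, assuming the DR, RA and RN columns stand for DENSE_RANK, RANK and ROW_NUMBER over the ana_sales table:

```sql
SELECT st, ename, sale_amt,
       DENSE_RANK() OVER (PARTITION BY st ORDER BY sale_amt DESC) AS dr,  -- ties share a rank, no gaps
       RANK()       OVER (PARTITION BY st ORDER BY sale_amt DESC) AS ra,  -- ties share a rank, gaps follow
       ROW_NUMBER() OVER (PARTITION BY st ORDER BY sale_amt DESC) AS rn   -- always unique
FROM   ana_sales
ORDER  BY st, sale_amt DESC;
```

The NY rows show the difference: three reps tie at 500, so the next amount (300) gets DR = 2 but RA = 4, while RN simply counts 1 through 5.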
1.3 LAG & LEAD - Accessing the previous row and the next row

RDATE                FAREN LEAD
-------------------- ----- ----
07-MAY-2006 00:00:00    55   56
08-MAY-2006 00:00:00    56   52
09-MAY-2006 00:00:00    52   53
10-MAY-2006 00:00:00    53   59
11-MAY-2006 00:00:00    59   50
12-MAY-2006 00:00:00    50   52
13-MAY-2006 00:00:00    52   51
14-MAY-2006 00:00:00    51   45
15-MAY-2006 00:00:00    45

9 rows selected.
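The statement that produced this output is not in the extracted slides. A query along the following lines would return the next and previous readings; the table name weather and the column aliases are hypothetical.

```sql
-- Hypothetical table: weather(rdate DATE, faren NUMBER)
SELECT rdate, faren,
       LEAD(faren, 1) OVER (ORDER BY rdate) AS next_faren,  -- value from the following row
       LAG(faren, 1)  OVER (ORDER BY rdate) AS prev_faren   -- value from the preceding row
FROM   weather
ORDER  BY rdate;
```

LEAD returns NULL on the last row and LAG returns NULL on the first row, unless a default is supplied as an optional third argument.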
RDATE
--------------------
07-MAY-2006 00:00:00
08-MAY-2006 00:00:00
09-MAY-2006 00:00:00
10-MAY-2006 00:00:00
11-MAY-2006 00:00:00
12-MAY-2006 00:00:00
13-MAY-2006 00:00:00
14-MAY-2006 00:00:00
15-MAY-2006 00:00:00

9 rows selected.
SELECT * FROM T1 ;

EMPLID          PAY_DATE
--------------- -----------
A1              31-JAN-2000
A1              28-FEB-2000
A1              31-MAR-2000
A1              31-MAY-2000
B1              31-JAN-2000
B1              31-MAR-2000
B1              31-JUL-2000
C1              31-JAN-2000
C1              28-FEB-2000
C1              31-MAR-2000
C1              31-DEC-2000
D1              31-JAN-2000
D1              31-MAR-2000
D1              31-OCT-2000

14 rows selected.

EMPLID          PAY_DATE    LV
--------------- ----------- -----------
A1              31-JAN-2000 31-MAY-2000
A1              28-FEB-2000 31-MAY-2000
A1              31-MAR-2000 31-MAY-2000
A1              31-MAY-2000 31-MAY-2000
B1              31-JAN-2000 31-JUL-2000
B1              31-MAR-2000 31-JUL-2000
B1              31-JUL-2000 31-JUL-2000
C1              31-JAN-2000 31-DEC-2000
C1              28-FEB-2000 31-DEC-2000
C1              31-MAR-2000 31-DEC-2000
C1              31-DEC-2000 31-DEC-2000
D1              31-JAN-2000 31-OCT-2000
D1              31-MAR-2000 31-OCT-2000
D1              31-OCT-2000 31-OCT-2000
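An LV column like the one above, holding the last pay date per employee on every row, can be produced with LAST_VALUE. The following is a sketch against the T1 table shown here; the original statement is not in the extracted slides, so the exact windowing clause is an assumption.

```sql
SELECT emplid, pay_date,
       -- The explicit window matters. With an ORDER BY and no windowing
       -- clause, the default window ends at the current row, so LAST_VALUE
       -- would simply return the current row's pay_date.
       LAST_VALUE(pay_date) OVER (PARTITION BY emplid
                                  ORDER BY pay_date
                                  ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND UNBOUNDED FOLLOWING) AS lv
FROM   t1
ORDER  BY emplid, pay_date;
```

For this particular result, MAX(pay_date) OVER (PARTITION BY emplid) would give the same values.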
PRODUCT                   SALES
-------------------- ----------
P1                          280
P2                          400
P3                          200
P4                          300
P5                          190
P6                          310

PRODUCT                   SALES SALES_PER
-------------------- ---------- ---------
P1                          280     16.67
P2                          400     23.81
P3                          200     11.90
P4                          300     17.86
P5                          190     11.31
P6                          310     18.45
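The SALES_PER column is each product's share of total sales (1680 across the six products), expressed as a percentage. A sketch using RATIO_TO_REPORT follows; the table name product_sales is hypothetical, and the rounding to 2 decimals is assumed from the output.

```sql
-- Hypothetical table: product_sales(product, sales)
SELECT product, sales,
       -- RATIO_TO_REPORT returns sales / SUM(sales) over the window;
       -- an empty OVER () makes the window the whole result set.
       ROUND(100 * RATIO_TO_REPORT(sales) OVER (), 2) AS sales_per
FROM   product_sales
ORDER  BY product;
```

The same value can be written as 100 * sales / SUM(sales) OVER (), which is what RATIO_TO_REPORT abbreviates.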
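NAME       TEST_DATE            SUBJECT        MARKS
---------- -------------------- --------- ----------
DAVID      15-JUN-2005 00:00:00 CHEMISTRY         60
DAVID      15-JUL-2005 00:00:00 CHEMISTRY         30
DAVID      15-AUG-2005 00:00:00 CHEMISTRY         35
DAVID      15-SEP-2005 00:00:00 CHEMISTRY         36
DAVID      15-OCT-2005 00:00:00 CHEMISTRY         62
DAVID      15-NOV-2005 00:00:00 CHEMISTRY         50
DAVID      15-DEC-2005 00:00:00 CHEMISTRY         40
SCOTT      15-JUN-2005 00:00:00 CHEMISTRY         60
SCOTT      15-JUL-2005 00:00:00 CHEMISTRY         70
SCOTT      15-AUG-2005 00:00:00 CHEMISTRY         45
SCOTT      15-SEP-2005 00:00:00 CHEMISTRY         66
SCOTT      15-OCT-2005 00:00:00 CHEMISTRY         62
SCOTT      15-NOV-2005 00:00:00 CHEMISTRY         30
SCOTT      15-DEC-2005 00:00:00 CHEMISTRY         35
MIKE       15-JUN-2005 00:00:00 CHEMISTRY         30
MIKE       15-JUL-2005 00:00:00 CHEMISTRY         20
MIKE       15-AUG-2005 00:00:00 CHEMISTRY         35
MIKE       15-SEP-2005 00:00:00 CHEMISTRY         26
MIKE       15-OCT-2005 00:00:00 CHEMISTRY         72
MIKE       15-NOV-2005 00:00:00 CHEMISTRY         40
MIKE       15-DEC-2005 00:00:00 CHEMISTRY         50

21 rows selected.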
School Policy:
The principal of a school declared that any student who scored less than 40 in 3
consecutive months would be declared failed for that academic year. He asked
the teacher to prepare a list of such students.
SUBJECT
---------
CHEMISTRY
CHEMISTRY
SUBJECT   TEST_DATE
--------- ---------
CHEMISTRY 15-JUN-05
CHEMISTRY 15-JUL-05
CHEMISTRY 15-AUG-05
CHEMISTRY 15-SEP-05
CHEMISTRY 15-OCT-05
CHEMISTRY 15-NOV-05
CHEMISTRY 15-DEC-05
CHEMISTRY 15-JUN-05
CHEMISTRY 15-JUL-05
CHEMISTRY 15-AUG-05
CHEMISTRY 15-SEP-05
CHEMISTRY 15-OCT-05
CHEMISTRY 15-NOV-05
CHEMISTRY 15-DEC-05
CHEMISTRY 15-JUN-05
CHEMISTRY 15-JUL-05
CHEMISTRY 15-AUG-05
CHEMISTRY 15-SEP-05
CHEMISTRY 15-OCT-05
CHEMISTRY 15-NOV-05
CHEMISTRY 15-DEC-05
21 rows selected.
1.6 A Practical Example - Getting 3 consecutive rows using LAG and LEAD

select distinct name, subject
from (
       select name, subject, test_date, marks,
              lag(marks,1)  over (partition by name, subject order by test_date) pr_marks,
              lag(marks,2)  over (partition by name, subject order by test_date) pr_to_pr_marks,
              lead(marks,1) over (partition by name, subject order by test_date) ne_marks,
              lead(marks,2) over (partition by name, subject order by test_date) ne_to_ne_marks
       from marks
     )
where (marks < 40 and pr_marks < 40 and pr_to_pr_marks < 40)
   OR (marks < 40 and ne_marks < 40 and ne_to_ne_marks < 40)
/

NAME       SUBJECT
---------- ---------
DAVID      CHEMISTRY
MIKE       CHEMISTRY
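Note:
OPTIMIZER_INDEX_CACHING tells the optimizer what percentage of index blocks to assume are
already in the buffer cache. OPTIMIZER_INDEX_COST_ADJ scales the cost of index access paths,
as a percentage of the normally computed cost.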
Case 1
Let us first see what the cost will be for a query with the default values of 0 and 100 for
optimizer_index_caching and optimizer_index_cost_adj respectively.
SQL> alter session set optimizer_index_caching = 0;
Session altered.
SQL> alter session set optimizer_index_cost_adj = 100 ;
Session altered.
SQL> explain plan for
  2  select a.cust_id , b.name
  3  from sales a, state b
  4  where a.state = b.stateid and b.stateid = 'GA' ;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------
| Id  | Operation           | Name     | Rows  | Bytes | Cost  |
--------------------------------------------------------------------
|   0 | SELECT STATEMENT    |          |  573K |   15M | 45551 |
|   1 |  NESTED LOOPS       |          |  573K |   15M | 45551 |
|     |                     |          |       |    21 |       |
|*  3 |                     | STATE_PK |       |       |       |
|     |                     | SALES    |       |       |       |
Case 2
I changed OPTIMIZER_INDEX_COST_ADJ to 10 but kept the default value for OPTIMIZER_INDEX_CACHING.
SQL> alter session set optimizer_index_caching = 0 ;
Session altered.
SQL> alter session set optimizer_index_cost_adj = 10 ;
Session altered.
SQL> explain plan for
select a.cust_id , b.name from sales a, state b
where a.state = b.stateid and
b.stateid = 'GA' ;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
| Id  | Operation                     | Name        | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             |  573K |   15M | 37683  (1) |
|   1 |  NESTED LOOPS                 |             |  573K |   15M | 37683  (1) |
|   2 |   TABLE ACCESS BY INDEX ROWID | STATE       |     1 |    21 |     2 (50) |
|*  3 |    INDEX UNIQUE SCAN          | STATE_PK    |     1 |       |            |
|   4 |   TABLE ACCESS BY INDEX ROWID | SALES       |  573K | 4480K | 37682  (1) |
|*  5 |    INDEX RANGE SCAN           | SALES_IDX_2 |  573K |       | 30095  (0) |
------------------------------------------------------------------------------------------
The query plan changed completely, from a full table scan to indexed access on the SALES
table. The total cost of the query came down from 45,551 to 37,683. The TABLE ACCESS BY
INDEX ROWID cost of 37,682 includes the INDEX RANGE SCAN cost of 30,095 for
SALES_IDX_2.
Case 3
In real life, some index leaf blocks are always cached in memory.
Now I tell Oracle that 50% of index blocks are cached in the SGA by changing the
OPTIMIZER_INDEX_CACHING parameter.
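The session for this case is missing from the extracted slides. Following the pattern of the other cases, it would presumably look like the sketch below. Keeping optimizer_index_cost_adj at 10 is an assumption carried over from cases 2 and 4; the resulting plan and costs are not reproduced here because they are not in the source.

```sql
-- Assume 50% of index blocks are already in the buffer cache.
ALTER SESSION SET optimizer_index_caching = 50;
-- Assumption: same value as in cases 2 and 4.
ALTER SESSION SET optimizer_index_cost_adj = 10;

EXPLAIN PLAN FOR
SELECT a.cust_id, b.name
FROM   sales a, state b
WHERE  a.state = b.stateid
AND    b.stateid = 'GA';

-- Show the plan and its estimated costs.
SELECT * FROM TABLE(dbms_xplan.display);
```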
Case 4
If you further increase the parameter OPTIMIZER_INDEX_CACHING, then the cost of the INDEX SCAN will
come down further.
Let us test this concept with a new value, 90.
SQL> alter session set optimizer_index_caching = 90 ;
Session altered.
SQL> alter session set optimizer_index_cost_adj = 10 ;
Session altered.
SQL> explain plan for
  2  select a.cust_id , b.name
  3  from sales a, state b
  4  where a.state = b.stateid and b.stateid = 'GA' ;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
| Id  | Operation                     | Name        | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             |  573K |   15M | 34980  (0) |
|   1 |  NESTED LOOPS                 |             |  573K |   15M | 34980  (0) |
|   2 |   TABLE ACCESS BY INDEX ROWID | STATE       |     1 |    21 |     2 (50) |
|*  3 |    INDEX UNIQUE SCAN          | STATE_PK    |     1 |       |            |
|   4 |   TABLE ACCESS BY INDEX ROWID | SALES       |  573K | 4480K | 34979  (0) |
|*  5 |    INDEX RANGE SCAN           | SALES_IDX_2 |  573K |       |  3069  (0) |
------------------------------------------------------------------------------------------
The cost of the INDEX RANGE SCAN has been reduced to 3069.
In conclusion
When SYSTEM statistics are off:
Remember one thing: the above tests and the various costs obtained do not
tell you which query will run faster.
I just demonstrated the impact of changing those 2 parameters.
There is no rule of thumb for fixing those 2 values. However, one thing is sure: the
default values of 0 and 100 for OPTIMIZER_INDEX_CACHING and
OPTIMIZER_INDEX_COST_ADJ respectively are set for data warehouse systems.
If your system is OLTP, then these 2 parameter values should be changed. I would
test the system with different values before choosing the final values.
Setting OPTIMIZER_INDEX_CACHING to 90 and OPTIMIZER_INDEX_COST_ADJ to 10 (or 15)
should perform well for an OLTP system.
When SYSTEM statistics are on:
Starting with 9i and 10g, you can collect system statistics (CPU and I/O). When those
statistics are available, the optimizer uses them in building the execution
plan and ignores the OPTIMIZER_INDEX_CACHING and
OPTIMIZER_INDEX_COST_ADJ values.
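For reference, system statistics are gathered with the DBMS_STATS package. The sketch below shows one common workload-based approach; it is an illustration rather than part of the original slides, and it assumes the session has the privileges needed to gather system statistics and to read SYS.AUX_STATS$.

```sql
-- See which system statistics, if any, are currently stored.
SELECT pname, pval1
FROM   sys.aux_stats$
WHERE  sname = 'SYSSTATS_MAIN';

-- Start capturing workload statistics ...
BEGIN
  DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'START');
END;
/

-- ... let a representative workload run, then stop the capture.
BEGIN
  DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'STOP');
END;
/
```

The capture period should cover typical activity, since the measured CPU speed and I/O timings feed directly into the optimizer's cost estimates.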