CUBE,ROLLUP,GROUPING SETS,GROUPING, GROUPING_ID, GROUP_ID

Oracle delivers very nice extensions for GROUP BY clause which are very useful for developers and DBAs:

ROLLUP – operator to produce subtotal values
CUBE – operator to produce cross-tabulation values
GROUPING, GROUPING_ID, GROUP_ID – functions used to identify rows created by ROLLUP, CUBE
GROUPING SETS – expression to produce a single result set

Setup

To present how the extensions are working I prepared sample data

create table test_ok
(dept_id varchar2(10),
 user_id varchar2(10),
 salary  number,
 commission_pct number);

begin
  for i in 1..10
  loop
    insert into test_ok 
    values('DEP'||mod(i,5),'USER'||i,i, mod(i,2)*5);
  end loop;
  commit;
end;
/

select * from test_ok
order by dept_id, commission_pct;

DEPT_ID    USER_ID        SALARY COMMISSION_PCT
---------- ---------- ---------- --------------
DEP0       USER10             10              0 
DEP0       USER5               5              5 
DEP1       USER6               6              0 
DEP1       USER1               1              5 
DEP2       USER2               2              0 
DEP2       USER7               7              5 
DEP3       USER8               8              0 
DEP3       USER3               3              5 
DEP4       USER4               4              0 
DEP4       USER9               9              5 

 10 rows selected

GROUP functions

Group function takes multiple rows as input and returns one aggregate result. Popular and very often used group functions are MAX, MIN, SUM and many more …

select max(salary), min(salary), sum(salary) from test_ok;

MAX(SALARY) MIN(SALARY) SUM(SALARY)
----------- ----------- -----------
         10           1          55

 GROUP BY clause

If you want to know what is MIN, MAX, SUM salary in each department you need to use GROUP BY [columns] clause.

select dept_id, max(salary), min(salary), sum(salary) 
from test_ok
group by dept_id
order by 1;

DEPT_ID    MAX(SALARY) MIN(SALARY) SUM(SALARY)
---------- ----------- ----------- -----------
DEP0                10           5          15 
DEP1                 6           1           7 
DEP2                 7           2           9 
DEP3                 8           3          11 
DEP4                 9           4          1

5 rows selected

We can specify more than one column in GROUP BY. As a result we get more combination of aggregated results.

select dept_id, commission_pct, max(salary), min(salary), sum(salary)
from test_ok
group by dept_id, commission_pct
order by 1, 2;

DEPT_ID    COMMISSION_PCT MAX(SALARY) MIN(SALARY) SUM(SALARY)
 ---------- -------------- ----------- ----------- -----------
 DEP0                    0          10          10          10
 DEP0                    5           5           5           5
 DEP1                    0           6           6           6
 DEP1                    5           1           1           1
 DEP2                    0           2           2           2
 DEP2                    5           7           7           7
 DEP3                    0           8           8           8
 DEP3                    5           3           3           3
 DEP4                    0           4           4           4
 DEP4                    5           9           9           9

10 rows selected

Number of results returned for GROUP BY col_1,…,col_n for a table equals to simple query: select distinct col_1,.., col_n from table.

ROLLUP extension for GROUP BY increases number of returned results by creation of extra subtotals. Number of results equals to N+1 combinations where N is number of columns specified in GROUP BY ROLLUP.

distinct col_1, col_2, col_3 - default subtotal for GROUP BY
distinct col_1, col_2, null  - extra subtotal for ROLLUP
distinct col_1, null , null  - extra subtotal for ROLLUP
distinct null,  null , null  - extra subtotal for ROLLUP
select dept_id, sum(salary) 
from test_ok
group by rollup(dept_id)
order by 1;

DEPT_ID    SUM(SALARY)
---------- -----------
DEP0                15 
DEP1                 7 
DEP2                 9 
DEP3                11 
DEP4                13 
                    55  <- extra subtotal (NULL)

 6 rows selected

select dept_id, commission_pct, sum(salary) 
from test_ok
group by rollup(dept_id, commission_pct)
order by 1, 2;

DEPT_ID  COMMISSION_PCT SUM(SALARY)
-------- -------------- -----------
DEP0                  0          10 
DEP0                  5           5 
DEP0                             15 <- extra subtotal (DEPT_ID, NULL)
DEP1                  0           6 
DEP1                  5           1 
DEP1                              7 <- extra subtotal (DEPT_ID, NULL)
DEP2                  0           2 
DEP2                  5           7 
DEP2                              9 <- extra subtotal (DEPT_ID, NULL)
DEP3                  0           8 
DEP3                  5           3 
DEP3                             11 <- extra subtotal (DEPT_ID, NULL)
DEP4                  0           4 
DEP4                  5           9 
DEP4                             13 <- extra subtotal (DEPT_ID, NULL)
                                 55 <- extra subtotal (NULL,    NULL)

 16 rows selected

CUBE extension for GROUP BY increases number of returned results by creation of extra subtotals. Number of results equals to 2n combinations where N is number of columns specified in GROUP BY CUBE.

distinct col_1, col_2, col_3 - default subtotal for GROUP BY
distinct col_1, col_2, null  - extra subtotal for CUBE
distinct col_1, null , null  - extra subtotal for CUBE
distinct col_1, null , col_3 - extra subtotal for CUBE
distinct null , col_2, col_3 - extra subtotal for CUBE
distinct null , col_2, null  - extra subtotal for CUBE
distinct null , null , col_3 - extra subtotal for CUBE
distinct null,  null , null  - extra subtotal for CUBE
select dept_id, commission_pct, sum(salary) 
from test_ok
group by cube(dept_id, commission_pct)
order by 1, 2;

DEPT_ID COMMISSION_PCT SUM(SALARY)
------- -------------- -----------
DEP0                 0     10 
DEP0                 5      5 
DEP0                       15 <- extra subtotal (DEPT_ID, NULL)
DEP1                 0      6 
DEP1                 5      1 
DEP1                        7 <- extra subtotal (DEPT_ID, NULL)
DEP2                 0      2 
DEP2                 5      7 
DEP2                        9 <- extra subtotal (DEPT_ID, NULL)
DEP3                 0      8 
DEP3                 5      3 
DEP3                       11 <- extra subtotal (DEPT_ID, NULL)
DEP4                 0      4 
DEP4                 5      9 
DEP4                       13 <- extra subtotal (DEPT_ID, NULL)
                     0     30 <- extra subtotal (NULL, COMMISSION_PCT)
                     5     25 <- extra subtotal (NULL, COMMISSION_PCT)
                           55 <- extra subtotal (NULL, NULL)

 18 rows selected

GROUPING – this function is used to distinguish superaggregate rows from regular grouped rows. It can return 0 or 1.

select 
  dept_id, commission_pct, sum(salary), 
  grouping(dept_id) g1, grouping(commission_pct) g2
from test_ok
group by rollup(dept_id, commission_pct)
order by 1, 2;

DEPT_ID    COMMISSION_PCT SUM(SALARY)         G1         G2
---------- -------------- ----------- ---------- ----------
DEP0                    0          10          0          0 
DEP0                    5           5          0          0 
DEP0                               15          0          1 
DEP1                    0           6          0          0 
DEP1                    5           1          0          0 
DEP1                                7          0          1 
DEP2                    0           2          0          0 
DEP2                    5           7          0          0 
DEP2                                9          0          1 
DEP3                    0           8          0          0 
DEP3                    5           3          0          0 
DEP3                               11          0          1 
DEP4                    0           4          0          0 
DEP4                    5           9          0          0 
DEP4                               13          0          1 
                                   55          1          1 

16 rows selected

You can use it to replace NULL values for subtotal to some meaningful values or exclude quickly some rows in HAVING clause

select 
  decode(grouping(dept_id), 1, 'Subtotal', dept_id) dept_id, 
  decode(grouping(commission_pct), 1, 'Subtotal', 
                  commission_pct) commission_pct,
  sum(salary), grouping(dept_id) g1, grouping(commission_pct) g2
from test_ok
group by cube(dept_id, commission_pct)
order by 1, 2;
DEPT_ID    COMMISSION_PCT     SUM(SALARY)         G1         G2
---------- ------------------ ----------- ---------- ----------
DEP0       0                           10          0          0 
DEP0       5                            5          0          0 
DEP0       Subtotal                    15          0          1 
DEP1       0                            6          0          0 
DEP1       5                            1          0          0 
DEP1       Subtotal                     7          0          1 
DEP2       0                            2          0          0 
DEP2       5                            7          0          0 
DEP2       Subtotal                     9          0          1 
DEP3       0                            8          0          0 
DEP3       5                            3          0          0 
DEP3       Subtotal                    11          0          1 
DEP4       0                            4          0          0 
DEP4       5                            9          0          0 
DEP4       Subtotal                    13          0          1 
Subtotal   0                           30          1          0 
Subtotal   5                           25          1          0 
Subtotal   Subtotal                    55          1          1 

 18 rows selected
select 
  decode(grouping(dept_id), 1, 'Subtotal', dept_id) dept_id, 
  decode(grouping(commission_pct), 1, 'Subtotal', 
                                 commission_pct) commission_pct,
  sum(salary), grouping(dept_id) g1, grouping(commission_pct) g2
from test_ok
group by cube(dept_id, commission_pct)
having grouping(commission_pct) != 1
order by 1, 2;
DEPT_ID    COMMISSION_PCT     SUM(SALARY)         G1         G2
---------- ------------------ ----------- ---------- ----------
DEP0       0                           10          0          0 
DEP0       5                            5          0          0 
DEP1       0                            6          0          0 
DEP1       5                            1          0          0 
DEP2       0                            2          0          0 
DEP2       5                            7          0          0 
DEP3       0                            8          0          0 
DEP3       5                            3          0          0 
DEP4       0                            4          0          0 
DEP4       5                            9          0          0 
Subtotal   0                           30          1          0 
Subtotal   5                           25          1          0 

 12 rows selected

GROUPING_ID – function returns unique number associated with subtotal combination.

For example for GROUP BY CUBE (col_1, col_2) we have 4 subtotals and for each we have unique number returned by GROUPING_ID function. You can call it aggregation “level” number.

col_1, col_2 - 0 <- default GROUP BY
col_1,  null - 1
null , col_2 - 2
null ,  null - 3
select 
  dept_id, commission_pct, sum(salary), 
  grouping(dept_id) g1, grouping(commission_pct) g2,
  grouping_id(dept_id, commission_pct) gid
from test_ok
group by cube(dept_id, commission_pct)
order by 1, 2;

DEPT_ID    COMMISSION_PCT SUM(SALARY) G1 G2 GID
---------- -------------- ----------- -- -- ---
DEP0                    0          10  0  0   0 
DEP0                    5           5  0  0   0 
DEP0                               15  0  1   1 
DEP1                    0           6  0  0   0 
DEP1                    5           1  0  0   0 
DEP1                                7  0  1   1 
DEP2                    0           2  0  0   0 
DEP2                    5           7  0  0   0 
DEP2                                9  0  1   1 
DEP3                    0           8  0  0   0 
DEP3                    5           3  0  0   0 
DEP3                               11  0  1   1 
DEP4                    0           4  0  0   0 
DEP4                    5           9  0  0   0 
DEP4                               13  0  1   1 
                        0          30  1  0   2 
                        5          25  1  0   2 
                                   55  1  1   3 

 18 rows selected

GROUP_ID – in case you have duplicate subtotals for GROUP BY you can filter them out using this function. The function starts with 0 and then assign higher number for each duplicated subtotal.

 select 
  dept_id, commission_pct, sum(salary),
  group_id ()
from test_ok
group by dept_id, rollup(dept_id, commission_pct)
order by 1, 2;

DEPT_ID    COMMISSION_PCT SUM(SALARY) GROUP_ID()
---------- -------------- ----------- ----------
DEP0                    0          10          0 
DEP0                    5           5          0 
DEP0                               15          1 <- duplication
DEP0                               15          0 
DEP1                    0           6          0 
DEP1                    5           1          0 
DEP1                                7          1 <- duplication 
DEP1                                7          0 
DEP2                    0           2          0 
DEP2                    5           7          0 
DEP2                                9          0 
DEP2                                9          1  <- duplication
DEP3                    0           8          0 
DEP3                    5           3          0 
DEP3                               11          0 
DEP3                               11          1  <- duplication
DEP4                    0           4          0 
DEP4                    5           9          0 
DEP4                               13          1 
DEP4                               13          0 

 20 rows selected

GROUPING SETS – this function is used to manually define which subtotals should be generated for GROUP BY clause. It’s very useful to avoid generating many subtotals like extensions ROLLUP and CUBE.

Lets’ remind how many subtotals is generated for GROUP BY, ROLLUP, CUBE

GROUP BY col_1, col_2 produces following subtotals:
col_1, col_2
GROUP BY ROLLUP (col_1, col_2) produces following subtotals:
col_1, col_2
col_1, null
null , null
GROUP BY CUBE (col_1, col_2) produces following subtotals:
col_1, col_2
col_1, null
null , col_2
null , null

GROUP BY GROUPING SETS enables to control it

GROUP BY GROUPING SETS ((col_1, col_2)) 
produces following subtotals:
col_1, col_2
GROUP BY GROUPING SETS ((col_1, null)) 
produces following subtotals:
col_1, null
GROUP BY GROUPING SETS ((col_1, col_2),(null, null)) 
produces following subtotals:
col_1, col_2
null, null
GROUP BY GROUPING SETS (col_1,(null, null)) 
produces following subtotals:
col_1, null
null, null
GROUP BY GROUPING SETS (col_1, col2) 
produces following subtotals:
col_1, null
null, col2
select 
  decode(grouping(dept_id), 1, 'Subtotal', dept_id) dept_id, 
  decode(grouping(commission_pct), 1, 'Subtotal', 
                                 commission_pct) commission_pct,
  sum(salary), grouping(dept_id) g1, grouping(commission_pct) g2
from test_ok
group by grouping sets((dept_id, commission_pct),(null,null))
order by 1, 2;

DEPT_ID    COMMISSION_PCT                           SUM(SALARY) G1 G2
---------- ---------------------------------------- ----------- -- --
DEP0       0                                                 10  0  0 
DEP0       5                                                  5  0  0 
DEP1       0                                                  6  0  0 
DEP1       5                                                  1  0  0 
DEP2       0                                                  2  0  0 
DEP2       5                                                  7  0  0 
DEP3       0                                                  8  0  0 
DEP3       5                                                  3  0  0 
DEP4       0                                                  4  0  0 
DEP4       5                                                  9  0  0 
Subtotal   Subtotal                                          55  1  1 

 11 rows selected

Have a fun 🙂
Tomasz

 

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.