Oracle delivers very nice extensions for GROUP BY clause which are very useful for developers and DBAs:
ROLLUP – operator to produce subtotal values
CUBE – operator to produce cross-tabulation values
GROUPING, GROUPING_ID, GROUP_ID – functions used to identify rows created by ROLLUP, CUBE
GROUPING SETS – expression to produce a single result set
Setup
To present how the extensions are working I prepared sample data
create table test_ok (dept_id varchar2(10), user_id varchar2(10), salary number, commission_pct number); begin for i in 1..10 loop insert into test_ok values('DEP'||mod(i,5),'USER'||i,i, mod(i,2)*5); end loop; commit; end; / select * from test_ok order by dept_id, commission_pct; DEPT_ID USER_ID SALARY COMMISSION_PCT ---------- ---------- ---------- -------------- DEP0 USER10 10 0 DEP0 USER5 5 5 DEP1 USER6 6 0 DEP1 USER1 1 5 DEP2 USER2 2 0 DEP2 USER7 7 5 DEP3 USER8 8 0 DEP3 USER3 3 5 DEP4 USER4 4 0 DEP4 USER9 9 5 10 rows selected
GROUP functions
Group function takes multiple rows as input and returns one aggregate result. Popular and very often used group functions are MAX, MIN, SUM and many more …
select max(salary), min(salary), sum(salary) from test_ok; MAX(SALARY) MIN(SALARY) SUM(SALARY) ----------- ----------- ----------- 10 1 55
GROUP BY clause
If you want to know what is MIN, MAX, SUM salary in each department you need to use GROUP BY [columns] clause.
select dept_id, max(salary), min(salary), sum(salary) from test_ok group by dept_id order by 1; DEPT_ID MAX(SALARY) MIN(SALARY) SUM(SALARY) ---------- ----------- ----------- ----------- DEP0 10 5 15 DEP1 6 1 7 DEP2 7 2 9 DEP3 8 3 11 DEP4 9 4 1 5 rows selected
We can specify more than one column in GROUP BY. As a result we get more combination of aggregated results.
select dept_id, commission_pct, max(salary), min(salary), sum(salary) from test_ok group by dept_id, commission_pct order by 1, 2; DEPT_ID COMMISSION_PCT MAX(SALARY) MIN(SALARY) SUM(SALARY) ---------- -------------- ----------- ----------- ----------- DEP0 0 10 10 10 DEP0 5 5 5 5 DEP1 0 6 6 6 DEP1 5 1 1 1 DEP2 0 2 2 2 DEP2 5 7 7 7 DEP3 0 8 8 8 DEP3 5 3 3 3 DEP4 0 4 4 4 DEP4 5 9 9 9 10 rows selected
Number of results returned for GROUP BY col_1,…,col_n for a table equals to simple query: select distinct col_1,.., col_n from table.
ROLLUP extension for GROUP BY increases number of returned results by creation of extra subtotals. Number of results equals to N+1 combinations where N is number of columns specified in GROUP BY ROLLUP.
distinct col_1, col_2, col_3 - default subtotal for GROUP BY distinct col_1, col_2, null - extra subtotal for ROLLUP distinct col_1, null , null - extra subtotal for ROLLUP distinct null, null , null - extra subtotal for ROLLUP
select dept_id, sum(salary) from test_ok group by rollup(dept_id) order by 1; DEPT_ID SUM(SALARY) ---------- ----------- DEP0 15 DEP1 7 DEP2 9 DEP3 11 DEP4 13 55 <- extra subtotal (NULL) 6 rows selected select dept_id, commission_pct, sum(salary) from test_ok group by rollup(dept_id, commission_pct) order by 1, 2; DEPT_ID COMMISSION_PCT SUM(SALARY) -------- -------------- ----------- DEP0 0 10 DEP0 5 5 DEP0 15 <- extra subtotal (DEPT_ID, NULL) DEP1 0 6 DEP1 5 1 DEP1 7 <- extra subtotal (DEPT_ID, NULL) DEP2 0 2 DEP2 5 7 DEP2 9 <- extra subtotal (DEPT_ID, NULL) DEP3 0 8 DEP3 5 3 DEP3 11 <- extra subtotal (DEPT_ID, NULL) DEP4 0 4 DEP4 5 9 DEP4 13 <- extra subtotal (DEPT_ID, NULL) 55 <- extra subtotal (NULL, NULL) 16 rows selected
CUBE extension for GROUP BY increases number of returned results by creation of extra subtotals. Number of results equals to 2n combinations where N is number of columns specified in GROUP BY CUBE.
distinct col_1, col_2, col_3 - default subtotal for GROUP BY distinct col_1, col_2, null - extra subtotal for CUBE distinct col_1, null , null - extra subtotal for CUBE distinct col_1, null , col_3 - extra subtotal for CUBE distinct null , col_2, col_3 - extra subtotal for CUBE distinct null , col_2, null - extra subtotal for CUBE distinct null , null , col_3 - extra subtotal for CUBE distinct null, null , null - extra subtotal for CUBE
select dept_id, commission_pct, sum(salary) from test_ok group by cube(dept_id, commission_pct) order by 1, 2; DEPT_ID COMMISSION_PCT SUM(SALARY) ------- -------------- ----------- DEP0 0 10 DEP0 5 5 DEP0 15 <- extra subtotal (DEPT_ID, NULL) DEP1 0 6 DEP1 5 1 DEP1 7 <- extra subtotal (DEPT_ID, NULL) DEP2 0 2 DEP2 5 7 DEP2 9 <- extra subtotal (DEPT_ID, NULL) DEP3 0 8 DEP3 5 3 DEP3 11 <- extra subtotal (DEPT_ID, NULL) DEP4 0 4 DEP4 5 9 DEP4 13 <- extra subtotal (DEPT_ID, NULL) 0 30 <- extra subtotal (NULL, COMMISSION_PCT) 5 25 <- extra subtotal (NULL, COMMISSION_PCT) 55 <- extra subtotal (NULL, NULL) 18 rows selected
GROUPING – this function is used to distinguish superaggregate rows from regular grouped rows. It can return 0 or 1.
select dept_id, commission_pct, sum(salary), grouping(dept_id) g1, grouping(commission_pct) g2 from test_ok group by rollup(dept_id, commission_pct) order by 1, 2; DEPT_ID COMMISSION_PCT SUM(SALARY) G1 G2 ---------- -------------- ----------- ---------- ---------- DEP0 0 10 0 0 DEP0 5 5 0 0 DEP0 15 0 1 DEP1 0 6 0 0 DEP1 5 1 0 0 DEP1 7 0 1 DEP2 0 2 0 0 DEP2 5 7 0 0 DEP2 9 0 1 DEP3 0 8 0 0 DEP3 5 3 0 0 DEP3 11 0 1 DEP4 0 4 0 0 DEP4 5 9 0 0 DEP4 13 0 1 55 1 1 16 rows selected
You can use it to replace NULL values for subtotal to some meaningful values or exclude quickly some rows in HAVING clause
select decode(grouping(dept_id), 1, 'Subtotal', dept_id) dept_id, decode(grouping(commission_pct), 1, 'Subtotal', commission_pct) commission_pct, sum(salary), grouping(dept_id) g1, grouping(commission_pct) g2 from test_ok group by cube(dept_id, commission_pct) order by 1, 2;
DEPT_ID COMMISSION_PCT SUM(SALARY) G1 G2 ---------- ------------------ ----------- ---------- ---------- DEP0 0 10 0 0 DEP0 5 5 0 0 DEP0 Subtotal 15 0 1 DEP1 0 6 0 0 DEP1 5 1 0 0 DEP1 Subtotal 7 0 1 DEP2 0 2 0 0 DEP2 5 7 0 0 DEP2 Subtotal 9 0 1 DEP3 0 8 0 0 DEP3 5 3 0 0 DEP3 Subtotal 11 0 1 DEP4 0 4 0 0 DEP4 5 9 0 0 DEP4 Subtotal 13 0 1 Subtotal 0 30 1 0 Subtotal 5 25 1 0 Subtotal Subtotal 55 1 1 18 rows selected
select decode(grouping(dept_id), 1, 'Subtotal', dept_id) dept_id, decode(grouping(commission_pct), 1, 'Subtotal', commission_pct) commission_pct, sum(salary), grouping(dept_id) g1, grouping(commission_pct) g2 from test_ok group by cube(dept_id, commission_pct) having grouping(commission_pct) != 1 order by 1, 2;
DEPT_ID COMMISSION_PCT SUM(SALARY) G1 G2 ---------- ------------------ ----------- ---------- ---------- DEP0 0 10 0 0 DEP0 5 5 0 0 DEP1 0 6 0 0 DEP1 5 1 0 0 DEP2 0 2 0 0 DEP2 5 7 0 0 DEP3 0 8 0 0 DEP3 5 3 0 0 DEP4 0 4 0 0 DEP4 5 9 0 0 Subtotal 0 30 1 0 Subtotal 5 25 1 0 12 rows selected
GROUPING_ID – function returns unique number associated with subtotal combination.
For example for GROUP BY CUBE (col_1, col_2) we have 4 subtotals and for each we have unique number returned by GROUPING_ID function. You can call it aggregation “level” number.
col_1, col_2 - 0 <- default GROUP BY col_1, null - 1 null , col_2 - 2 null , null - 3
select dept_id, commission_pct, sum(salary), grouping(dept_id) g1, grouping(commission_pct) g2, grouping_id(dept_id, commission_pct) gid from test_ok group by cube(dept_id, commission_pct) order by 1, 2; DEPT_ID COMMISSION_PCT SUM(SALARY) G1 G2 GID ---------- -------------- ----------- -- -- --- DEP0 0 10 0 0 0 DEP0 5 5 0 0 0 DEP0 15 0 1 1 DEP1 0 6 0 0 0 DEP1 5 1 0 0 0 DEP1 7 0 1 1 DEP2 0 2 0 0 0 DEP2 5 7 0 0 0 DEP2 9 0 1 1 DEP3 0 8 0 0 0 DEP3 5 3 0 0 0 DEP3 11 0 1 1 DEP4 0 4 0 0 0 DEP4 5 9 0 0 0 DEP4 13 0 1 1 0 30 1 0 2 5 25 1 0 2 55 1 1 3 18 rows selected
GROUP_ID – in case you have duplicate subtotals for GROUP BY you can filter them out using this function. The function starts with 0 and then assign higher number for each duplicated subtotal.
select dept_id, commission_pct, sum(salary), group_id () from test_ok group by dept_id, rollup(dept_id, commission_pct) order by 1, 2; DEPT_ID COMMISSION_PCT SUM(SALARY) GROUP_ID() ---------- -------------- ----------- ---------- DEP0 0 10 0 DEP0 5 5 0 DEP0 15 1 <- duplication DEP0 15 0 DEP1 0 6 0 DEP1 5 1 0 DEP1 7 1 <- duplication DEP1 7 0 DEP2 0 2 0 DEP2 5 7 0 DEP2 9 0 DEP2 9 1 <- duplication DEP3 0 8 0 DEP3 5 3 0 DEP3 11 0 DEP3 11 1 <- duplication DEP4 0 4 0 DEP4 5 9 0 DEP4 13 1 DEP4 13 0 20 rows selected
GROUPING SETS – this function is used to manually define which subtotals should be generated for GROUP BY clause. It’s very useful to avoid generating many subtotals like extensions ROLLUP and CUBE.
Lets’ remind how many subtotals is generated for GROUP BY, ROLLUP, CUBE
GROUP BY col_1, col_2 produces following subtotals: col_1, col_2
GROUP BY ROLLUP (col_1, col_2) produces following subtotals: col_1, col_2 col_1, null null , null
GROUP BY CUBE (col_1, col_2) produces following subtotals: col_1, col_2 col_1, null null , col_2 null , null
GROUP BY GROUPING SETS enables to control it
GROUP BY GROUPING SETS ((col_1, col_2)) produces following subtotals: col_1, col_2
GROUP BY GROUPING SETS ((col_1, null)) produces following subtotals: col_1, null
GROUP BY GROUPING SETS ((col_1, col_2),(null, null)) produces following subtotals: col_1, col_2 null, null
GROUP BY GROUPING SETS (col_1,(null, null)) produces following subtotals: col_1, null null, null
GROUP BY GROUPING SETS (col_1, col2) produces following subtotals: col_1, null null, col2
select decode(grouping(dept_id), 1, 'Subtotal', dept_id) dept_id, decode(grouping(commission_pct), 1, 'Subtotal', commission_pct) commission_pct, sum(salary), grouping(dept_id) g1, grouping(commission_pct) g2 from test_ok group by grouping sets((dept_id, commission_pct),(null,null)) order by 1, 2; DEPT_ID COMMISSION_PCT SUM(SALARY) G1 G2 ---------- ---------------------------------------- ----------- -- -- DEP0 0 10 0 0 DEP0 5 5 0 0 DEP1 0 6 0 0 DEP1 5 1 0 0 DEP2 0 2 0 0 DEP2 5 7 0 0 DEP3 0 8 0 0 DEP3 5 3 0 0 DEP4 0 4 0 0 DEP4 5 9 0 0 Subtotal Subtotal 55 1 1 11 rows selected
Have a fun 🙂
Tomasz