GROUP BY

GROUP BY clause specifies a result table that consists of a grouping of the rows of intermediate result table that is the result of the previous clause. The GROUP BY clause allows you to find the characteristics of groups of rows rather than individual rows.

ORDER BY clause specifies the particular order in which you want selected rows returned. This is sorted by ascending or descending collating sequence of a column’s or an expression’s value.

The ORDER BY statement is used to sequence the rows in the final output. The GROUP BY statement is used to combine multiple rows into one. The ORDER BY statement can only be applied to the result set of the SQL statement. Unlike the GROUP BY, it can not be used on any intermediate result set (e.g. a subquery or a nested-table expression). Nor can it be used in a view definition.

The GROUP BY statement is used to group individual rows into combined sets based on the value in one, or more, columns. The GROUPING SETS clause is used to define multiple independent GROUP BY clauses in one query. The ROLLUP and CUBE clauses are shorthand forms of the GROUPING SETS statement.

Notes:

  • Specifying the same field multiple times in an ORDER BY list is allowed, but silly. Only the first specification of the field will have any impact on the data output order.
  • If the ORDER BY column list does not uniquely identify each row, those rows with duplicate values will come out in random order. This is almost always the wrong thing to do when the data is being displayed to an end-user.
  • Use the TRANSLATE function to order data regardless of the case. Note that this trick may not work consistently with some European character sets.
  • NULL values always sort high.

GROUP BY & ORDER BY Rules and Restrictions:

  • There can only be one GROUP BY per SELECT. Multiple select statements in the same query can each have their own GROUP BY.
  • Every field in the SELECT list must either be specified in the GROUP BY or must have a column function applied against it.
  • The result of a simple GROUP BY (i.e. with no GROUPING SETS, ROLLUP, or CUBE clause) is always a distinct set of rows, where the unique identifier is whatever fields were grouped on.
  • There is no guarantee that the rows resulting from a GROUP BY will come back in any particular order unless an ORDER BY is also specified.
  • Variable-length character fields with differing numbers on trailing blanks are treated as equal in the GROUP. The number of trailing blanks, if any, in the result is unpredictable.
  • When grouping, all null values in the GROUP BY fields are considered equal.

GROUPING SETS Statement

The GROUPING SETS statement enables one to get multiple GROUP BY result sets from a single statement.

ROLLUP Statement

A ROLLUP expression displays sub-totals for the specified fields. This is equivalent to doing the original GROUP BY, and also doing more groupings on sets of the left-most columns.

CUBE Statement

A CUBE expression displays a cross-tabulation of the sub-totals for any specified fields. As such, it generates many more totals than the similar ROLLUP.

Input
WEEK         DAY_WEEK    SALES_PERSON    UNITS_SOLD 
13           6          LUCCHESSI                 3        
13           6          LUCCHESSI                 1        
13           6          LEE                       2        
13           6          LEE                       2        
13           6          LEE                       3        
13           6          LEE                       5        
13           6          GOUNOT                    3        
13           6          GOUNOT                    1        
13           6          GOUNOT                    7        
13           7          LUCCHESSI                 1        
13           7          LUCCHESSI                 2        
13           7          LUCCHESSI                 1        
13           7          LEE                       7        
13           7          LEE                       3        
13           7          LEE                       7        
13           7          LEE                       4        
13           7          GOUNOT                    2        
13           7          GOUNOT                   18        
13           7          GOUNOT                    1

Example 1: A query with a basic GROUP BY clause over 3 columns:

SELECT WEEK(SALES_DATE) AS WEEK,
       DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
       SALES_PERSON, SUM(SALES) AS UNITS_SOLD
  FROM SALES
WHERE WEEK(SALES_DATE) = 13
GROUP BY WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE), SALES_PERSON
ORDER BY WEEK, DAY_WEEK, SALES_PERSON;



 WEEK        DAY_WEEK    SALES_PERSON    UNITS_SOLD 
  13           6            GOUNOT           11        
  13           6            LEE              12        
  13           6            LUCCHESSI         4        
  13           7            GOUNOT           21        
  13           7            LEE              21        
  13           7            LUCCHESSI         4  

Example 2: The following query produces the result based on two different grouping sets of rows from the SALES table.

SELECT WEEK(SALES_DATE) AS WEEK,
       DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
       SALES_PERSON, SUM(SALES) AS UNITS_SOLD
  FROM SALES 
WHERE WEEK(SALES_DATE) = 13
GROUP BY GROUPING SETS ((WEEK(SALES_DATE), SALES_PERSON),
                        (DAYOFWEEK(SALES_DATE), SALES_PERSON))
ORDER BY WEEK, DAY_WEEK, SALES_PERSON;



WEEK        DAY_WEEK    SALES_PERSON      UNITS_SOLD 
13           -          GOUNOT                   32        
13           -          LEE                      33        
13           -          LUCCHESSI                 8         
 -           6          GOUNOT                   11         
 -           6          LEE                      12        
 -           6          LUCCHESSI                 4        
 -           7          GOUNOT                   21        
 -           7          LEE                      21        
 -           7          LUCCHESSI                 4

The rows with WEEK 13 are from the first grouping set while the other rows are from the second grouping set.

Example 3: If you use the three distinct columns involved in the grouping sets in Example 2 in a ROLLUP clause, you can see grouping sets for (WEEK, DAY_WEEK, SALES_PERSON), (WEEK, DAY_WEEK), (WEEK), and grand-total.

SELECT WEEK(SALES_DATE) AS WEEK,
       DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
       SALES_PERSON, SUM(SALES) AS UNITS_SOLD
  FROM SALES
WHERE WEEK(SALES_DATE) = 13
GROUP BY ROLLUP (WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE),  
                 SALES_PERSON)
ORDER BY WEEK, DAY_WEEK, SALES_PERSON;


WEEK        DAY_WEEK    SALES_PERSON      UNITS_SOLD 
13           6          GOUNOT                   11        
13           6          LEE                      12        
13           6          LUCCHESSI                 4        
13           6          -                        27        
13           7          GOUNOT                   21        
13           7          LEE                      21        
13           7          LUCCHESSI                 4        
13           7          -                        46       
13           -          -                        73         
 -           -          -                        73 

Example 4: The same query as Example 3, using CUBE instead of ROLLUP results in additional grouping sets for (WEEK, SALES_PERSON),  (DAY_WEEK, SALES_PERSON), (DAY_WEEK), and (SALES_PERSON) in the result.

SELECT WEEK(SALES_DATE) AS WEEK,
       DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
       SALES_PERSON, SUM(SALES) AS UNITS_SOLD
  FROM SALES
WHERE WEEK(SALES_DATE) = 13
GROUP BY CUBE ( WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE), 
                SALES_PERSON )
ORDER BY WEEK, DAY_WEEK, SALES_PERSON;



WEEK        DAY_WEEK    SALES_PERSON     UNITS_SOLD 
13           6         GOUNOT                   11        
13           6         LEE                      12        
13           6         LUCCHESSI                 4        
13           6         -                        27        
13           7         GOUNOT                   21        
13           7         LEE                      21        
13           7         LUCCHESSI                 4        
13           7         -                        46        
13           -         GOUNOT                   32        
13           -         LEE                      33        
13           -         LUCCHESSI                 8        
13           -         -                        73   
 -           6         GOUNOT                   11        
 -           6         LEE                      12        
 -           6         LUCCHESSI                 4        
 -           6         -                        27        
 -           7         GOUNOT                   21        
 -           7         LEE                      21        
 -           7         LUCCHESSI                 4        
 -           7         -                        46        
 -           -         GOUNOT                   32        
 -           -         LEE                      33        
 -           -         LUCCHESSI                 8        
 -           -         -                        73 

Example 5: The following query returns a result set that includes a grand total of the selected rows from the SALES table together with a group of rows aggregated by SALES_PERSON and MONTH.

SELECT SALES_PERSON,
       MONTH(SALES_DATE) AS MONTH,
       SUM(SALES) AS UNITS_SOLD
  FROM SALES
GROUP BY GROUPING SETS((SALES_PERSON, MONTH(SALES_DATE)),())
ORDER BY SALES_PERSON, MONTH;


  SALES_PERSON    MONTH   UNITS_SOLD 
   GOUNOT           3          35
   GOUNOT           4          14
   GOUNOT          12           1
   LEE              3          60
   LEE              4          25
   LEE             12           6
   LUCCHESSI        3           9
   LUCCHESSI        4           4
   LUCCHESSI       12           1
      -             -         155 

Example 6: This example shows two simple ROLLUP queries followed by a query that treats the two ROLLUP clauses as grouping sets in a single result set and specifies row ordering for each column involved in the grouping sets.

SELECT WEEK(SALES_DATE) AS WEEK,
       DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
       SUM(SALES) AS UNITS_SOLD
  FROM SALES
GROUP BY ROLLUP (WEEK(SALES_DATE), DAYOFWEEK(SALES_DATE))
ORDER BY WEEK, DAY_WEEK;
 

WEEK        DAY_WEEK    UNITS_SOLD 
13           6          27        
13           7          46        
13           -          73        
14           1          31        
14           2          43        
14           -          74        
53           1           8        
53           -           8        
 -           -         155


SELECT MONTH(SALES_DATE) AS MONTH,
       REGION,
       SUM(SALES) AS UNITS_SOLD
  FROM SALES
GROUP BY ROLLUP (MONTH(SALES_DATE), REGION)
ORDER BY MONTH, REGION;

MONTH       REGION          UNITS_SOLD 
3         Manitoba                 22         
3         Ontario-North             8         
3         Ontario-South            34         
3         Quebec                   40         
3         -                       104         
4         Manitoba                 17         
4         Ontario-North             1         
4         Ontario-South            14         
4         Quebec                   11         
4         -                        43        
12        Manitoba                  2        
12        Ontario-South             4        
12        Quebec                    2        
12        -                         8        
 -        -                       155       


SELECT WEEK(SALES_DATE) AS WEEK,
       DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
       MONTH(SALES_DATE) AS MONTH,
       REGION,
       SUM(SALES) AS UNITS_SOLD
  FROM SALES
GROUP BY GROUPING SETS (ROLLUP(WEEK(SALES_DATE), 
                        DAYOFWEEK(SALES_DATE)),
                        ROLLUP(MONTH(SALES_DATE), REGION))
ORDER BY WEEK, DAY_WEEK, MONTH, REGION;
 

WEEK        DAY_WEEK    MONTH       REGION          UNITS_SOLD 
13           6           -           -                      27        
13           7           -           -                      46        
13           -           -           -                      73        
14           1           -           -                      31        
14           2           -           -                      43        
14           -           -           -                      74        
53           1           -           -                       8        
53           -           -           -                       8         
 -           -           3         Manitoba                 22         
 -           -           3         Ontario-North             8         
 -           -           3         Ontario-South            34         
 -           -           3         Quebec                   40         
 -           -           3         -                       104         
 -           -           4         Manitoba                 17         
 -           -           4         Ontario-North             1         
 -           -           4         Ontario-South            14         
 -           -           4         Quebec                   11         
 -           -           4         -                        43         
 -           -          12         Manitoba                  2         
 -           -          12         Ontario-South             4         
 -           -          12         Quebec                    2         
 -           -          12         -                         8         
 -           -           -         -                       155         
 -           -           -         -                       155            

Using the two ROLLUP clauses as grouping sets causes the result to include duplicate rows. There are even two grand total rows.

Observe how the use of ORDER BY has affected the results:

  • In the first grouped set, week 53 has been repositioned to the end.
  • In the second grouped set, month 12 has now been positioned to the end and the regions are now displayed in alphabetic order.
  • Null values are sorted high.

Group By in Join

 ID NAME       DEPT JOB   YEARS   SALARY COMM
-- ---- ---- ----- ------ ----- ------
10 Sanders 20 Mgr 7 18357.50 -
20 Pernal 20 Sales 8 18171.25 612.45
80 James 20 Clerk - 13504.60 128.20
90 Koonitz 42 Sales 6 18001.75 1386.70
100 Plotz 42 Mgr 7 18352.80 -
130 Yamaguchi 42 Clerk 6 10505.90 75.60
140 Fraye 51 Mgr 6 21150.00 -
150 Williams 51 Sales 6 19456.50 637.65
160 Molinare 10 Mgr 7 22959.20 -
190 Sneider 20 Clerk 8 14252.75 126.50
200 Scoutten 42 Clerk - 11508.60 84.20
210 Lu 10 Mgr 10 20010.00 -
220 Smith 51 Sales 7 17654.50 992.80
230 Lundquist 51 Clerk 3 13369.80 189.65
240 Daniels 10 Mgr 5 19260.25 -
250 Wheeler 51 Clerk 6 14460.00 513.30
260 Jones 10 Mgr 12 21234.00 -

One side of join – using common table expression

WITH STAFF2(DEPT, AVGSAL) AS 
 (SELECT DEPT 
 ,AVG(SALARY) 
 FROM STAFF 
 GROUP BY DEPT 
 HAVING AVG(SALARY) > 18000 
 ) 
 SELECT A.ID 
 ,A.NAME
 ,A.DEPT
 FROM STAFF A
 ,STAFF2 B
 WHERE A.DEPT = B.DEPT
 ORDER BY A.ID;
 

ANSWER
 ID  NAME     DEPT
 160 Molinare 10
 210 Lu       10
 240 Daniels  10
 260 Jones    10

One side of join – using full-select

SELECT A.ID 
      ,A.NAME 
      ,A.DEPT 
  FROM STAFF A 
 ,(SELECT DEPT AS DEPT 
 ,AVG(SALARY) AS AVGSAL 
  FROM STAFF 
GROUP BY DEPT 
HAVING AVG(SALARY) > 18000)AS B
WHERE A.DEPT = B.DEPT
ORDER BY A.ID;
 

ANSWER
 ID  NAME     DEPT
 160 Molinare 10
 210 Lu       10
 240 Daniels  10
 260 Jones    10

Read DB2 blogs: Click Here IBM DB2 Manual: Click Here

Scroll to Top