The GROUP BY clause permits a WITH
ROLLUP modifier that causes extra rows to be added to
the summary output. These rows represent higher-level (or
super-aggregate) summary operations. ROLLUP
thus enables you to answer questions at multiple levels of
analysis with a single query. It can be used, for example, to
provide support for OLAP (Online Analytical Processing)
operations.
Suppose that a table named sales has
year, country,
product, and profit
columns for recording sales profitability:
CREATE TABLE sales
(
year INT NOT NULL,
country VARCHAR(20) NOT NULL,
product VARCHAR(32) NOT NULL,
profit INT
);
The table's contents can be summarized per year with a simple
GROUP BY like this:
mysql> SELECT year, SUM(profit) FROM sales GROUP BY year;
+------+-------------+
| year | SUM(profit) |
+------+-------------+
| 2000 | 4525 |
| 2001 | 3010 |
+------+-------------+
This output shows the total profit for each year, but if you also want to determine the total profit summed over all years, you must add up the individual values yourself or run an additional query.
Or you can use ROLLUP, which provides both
levels of analysis with a single query. Adding a WITH
ROLLUP modifier to the GROUP BY
clause causes the query to produce another row that shows the
grand total over all year values:
mysql> SELECT year, SUM(profit) FROM sales GROUP BY year WITH ROLLUP;
+------+-------------+
| year | SUM(profit) |
+------+-------------+
| 2000 | 4525 |
| 2001 | 3010 |
| NULL | 7535 |
+------+-------------+
The grand total super-aggregate line is identified by the value
NULL in the year column.
ROLLUP has a more complex effect when there
are multiple GROUP BY columns. In this case,
each time there is a “break” (change in value) in
any but the last grouping column, the query produces an extra
super-aggregate summary row.
For example, without ROLLUP, a summary on the
sales table based on year,
country, and product might
look like this:
mysql>SELECT year, country, product, SUM(profit)->FROM sales->GROUP BY year, country, product;+------+---------+------------+-------------+ | year | country | product | SUM(profit) | +------+---------+------------+-------------+ | 2000 | Finland | Computer | 1500 | | 2000 | Finland | Phone | 100 | | 2000 | India | Calculator | 150 | | 2000 | India | Computer | 1200 | | 2000 | USA | Calculator | 75 | | 2000 | USA | Computer | 1500 | | 2001 | Finland | Phone | 10 | | 2001 | USA | Calculator | 50 | | 2001 | USA | Computer | 2700 | | 2001 | USA | TV | 250 | +------+---------+------------+-------------+
The output indicates summary values only at the
year/country/product level of analysis. When
ROLLUP is added, the query produces several
extra rows:
mysql>SELECT year, country, product, SUM(profit)->FROM sales->GROUP BY year, country, product WITH ROLLUP;+------+---------+------------+-------------+ | year | country | product | SUM(profit) | +------+---------+------------+-------------+ | 2000 | Finland | Computer | 1500 | | 2000 | Finland | Phone | 100 | | 2000 | Finland | NULL | 1600 | | 2000 | India | Calculator | 150 | | 2000 | India | Computer | 1200 | | 2000 | India | NULL | 1350 | | 2000 | USA | Calculator | 75 | | 2000 | USA | Computer | 1500 | | 2000 | USA | NULL | 1575 | | 2000 | NULL | NULL | 4525 | | 2001 | Finland | Phone | 10 | | 2001 | Finland | NULL | 10 | | 2001 | USA | Calculator | 50 | | 2001 | USA | Computer | 2700 | | 2001 | USA | TV | 250 | | 2001 | USA | NULL | 3000 | | 2001 | NULL | NULL | 3010 | | NULL | NULL | NULL | 7535 | +------+---------+------------+-------------+
For this query, adding ROLLUP causes the
output to include summary information at four levels of
analysis, not just one. Here is how to interpret the
ROLLUP output:
Following each set of product rows for a given year and
country, an extra summary row is produced showing the total
for all products. These rows have the
product column set to
NULL.
Following each set of rows for a given year, an extra
summary row is produced showing the total for all countries
and products. These rows have the country
and products columns set to
NULL.
Finally, following all other rows, an extra summary row is
produced showing the grand total for all years, countries,
and products. This row has the year,
country, and products
columns set to NULL.
The following items list some behaviors specific to the MySQL
implementation of ROLLUP.
When you use ROLLUP, you cannot also use an
ORDER BY clause to sort the results. In other
words, ROLLUP and ORDER BY
are mutually exclusive. However, you still have some control
over sort order. GROUP BY in MySQL implicitly
sorts results, and you can use explicit ASC
and DESC keywords with columns named in the
GROUP BY list to specify sort order for
individual columns. (The higher-level summary rows added by
ROLLUP still appear after the rows from which
they are calculated, regardless of the sort order.)
Implicit GROUP BY sorting in MySQL
5.5 is deprecated. To achieve a specific sort order
of grouped results, it is preferable to use an explicit
ORDER BY clause. You can work around the
restriction that prevents using ROLLUP with
ORDER BY by placing the
ROLLUP clause within a subquery. For example:
mysql>SELECT * FROM (SELECT year, country, SUM(profit) FROM sales GROUP BY year WITH ROLLUP)->derived_t1 ORDER BY year;
LIMIT can be used to restrict the number of
rows returned to the client. LIMIT is applied
after ROLLUP, so the limit applies against
the extra rows added by ROLLUP. For example:
mysql>SELECT year, country, product, SUM(profit)->FROM sales->GROUP BY year, country, product WITH ROLLUP->LIMIT 5;+------+---------+------------+-------------+ | year | country | product | SUM(profit) | +------+---------+------------+-------------+ | 2000 | Finland | Computer | 1500 | | 2000 | Finland | Phone | 100 | | 2000 | Finland | NULL | 1600 | | 2000 | India | Calculator | 150 | | 2000 | India | Computer | 1200 | +------+---------+------------+-------------+
Using LIMIT with ROLLUP
may produce results that are more difficult to interpret,
because you have less context for understanding the
super-aggregate rows.
The NULL indicators in each super-aggregate
row are produced when the row is sent to the client. The server
looks at the columns named in the GROUP BY
clause following the leftmost one that has changed value. For
any column in the result set with a name that matches any of
those names, its value is set to NULL. (If
you specify grouping columns by column number, the server
identifies which columns to set to NULL by
number.)
Because the NULL values in the
super-aggregate rows are placed into the result set at such a
late stage in query processing, you cannot test them as
NULL values within the query itself. For
example, you cannot add HAVING product IS
NULL to the query to eliminate from the output all but
the super-aggregate rows.
On the other hand, the NULL values do appear
as NULL on the client side and can be tested
as such using any MySQL client programming interface.
MySQL permits a column that does not appear in the
GROUP BY list to be named in the select list.
In this case, the server is free to choose any value from this
nonaggregated column in summary rows, and this includes the
extra rows added by WITH ROLLUP. For example,
in the following query, country is a
nonaggregated column that does not appear in the GROUP
BY list and values chosen for this column are
indeterminate:
mysql>SELECT year, country, SUM(profit)->FROM sales GROUP BY year WITH ROLLUP;+------+---------+-------------+ | year | country | SUM(profit) | +------+---------+-------------+ | 2000 | India | 4525 | | 2001 | USA | 3010 | | NULL | USA | 7535 | +------+---------+-------------+
This behavior occurs if the
ONLY_FULL_GROUP_BY SQL mode is
not enabled. If that mode is enabled, the server rejects the
query as illegal because country is not
listed in the GROUP BY clause. For more
information about nonaggregated columns and GROUP
BY, see Section 12.16.3, “MySQL Handling of GROUP BY”.