Data grouping and aggregate functions

Operator GROUP BY

Sometimes you need to find out information not about the objects themselves, but about the specific groups that they form. To do this, use the GROUP BY operator and aggregate functions.

The General structure of the query with GROUP BY

MySQL
SELECT [constants, aggregate_function, grouping_fields] 
FROM table_name 
GROUP BY grouping_fields;

Example

To demonstrate the work of the GROUP BY operator, we derive the total amount of money spent, grouped by family members.

MySQL
SELECT family_member, SUM(unit_price * amount) FROM Payments 
GROUP BY family_member;

Below is the data set in the Payments table:

family_memberunit_priceamount
120001
221001
3205
43501
41502
51001
21201
255001
52301
52301
52501
52501
322001
2660001
185
175
283
188
175
21503
31001
1103
13001
31508

When the request is executed, the grouping is done by the family_member field and the total amount spent on purchases by each family member is added up.

In order to better understand how grouping by field occurs, we transform the source data into a new, grouped view:

family_memberunit_priceamount
120001
3001
103
75
88
75
85
221001
1201
55001
1503
83
660001
31508
1001
22001
205
41502
3501
52501
2501
2301
2301
1001

As you can see, groups of records have been formed, united by onefamily_member. After that, we can apply the sum formula inside each of these groups, which will multiply the quantity of the product by its value, and then add up all the resulting values:

family_memberSUM(unit_price * amount)
12504
274194
33600
4650
51060
It should be borne in mind that for GROUP BY all NULL values are treated as equal, i.e. when grouped by a field containing NULL values, all such rows will fall into one group.

Aggregate functions

An aggregate function performs a calculation on a set of values and returns a single value.

Aggregate functions are used for non-null values. The exception is the COUNT() function.
FunctionDescription
SUM(поле_таблицы)Returns the sum of the values
AVG(поле_таблицы)Returns the average value
COUNT(поле_таблицы)Returns the number of records
MIN(поле_таблицы)Returns the minimum value
MAX(поле_таблицы)Returns the maximum value

HAVING operator

To filter rows by values of aggregate functions, the HAVING operator is used.

General query structure with the HAVING operator

MySQL
SELECT [constants, aggregate_function, grouping_fields] 
FROM table_name 
GROUP BY grouping_fields
HAVING conditions_for_selection_restrictions_after_grouping

Difference between HAVING and WHERE

  • WHERE — records are selected by condition first, and then can be grouped, sorted, etc.
  • HAVING — records are grouped first, and then selected by condition, while, unlike WHERE, you can use the values of aggregate functions in it

Example of use

Display the total amount spent on purchases for each family member, where the total purchase amount is less than 5000 rubles:

MySQL
SELECT family_member, SUM(unit_price * amount) AS sum 
FROM Payments GROUP BY family_member
HAVING sum < 5000);

Tasks for self-testing

Data sorting, operator ORDER BY
Multi-table queries, JOIN operator