Partitions in SQL Window Functions

In the previous article, we briefly mentioned what partitions are and how to use them in window functions. Now it's time to dive deeper into partitions 🤓.

Understanding partitions

Partitions are subsets of rows that are defined for a window function based on one or more columns in a table.

They are used to segment the data, allowing for more detailed analysis and calculations such as aggregation or ranking within each group.

By partitioning, for example, based on the type of housing in a table with housing price data, we can calculate the average price for each type of housing in a separate column.

Partitioning schema

Applying partitions in SQL

To use a partition with a window function, you need to follow the following syntax:

MySQL
SELECT <window_function>(<table_field>)
OVER (
    PARTITION BY <partition_columns>
)

Example usage

Now let's look at an example of using a partition with a window function using a simple example.

Consider the Rooms table from the Airbnb database, specifically the home_type and price fields:

MySQL
SELECT home_type, price FROM Rooms;
home_typeprice
Private room149
Entire home/apt225
Private room150
Entire home/apt89
Entire home/apt80
Entire home/apt200
Private room60
Private room79
Private room79
Entire home/apt150
Entire home/apt135
Private room85
Private room89
Private room85
Entire home/apt120
Entire home/apt140
Entire home/apt215
Private room140
Entire home/apt99
Entire home/apt190
Entire home/apt299
Private room130
Private room80
Private room110
Entire home/apt120
Private room60
Private room80
Entire home/apt150
Private room44
Entire home/apt180
Private room50
Private room52
Private room55
Private room50
Private room70
Private room89
Private room35
Entire home/apt85
Private room150
Shared room40
Private room68
Entire home/apt120
Private room120
Private room135
Entire home/apt150
Entire home/apt150
Private room130
Entire home/apt110
Entire home/apt115
Private room80

We can see that all rental homes are divided into 3 categories: "Private room," "Entire home/apt," and "Shared room."

Each category of housing has its own price range. To find out the average price within a specific category and compare it to the current price, we can use window functions.

Let's add another column, avg_price, to our result table that calculates the average price per category. It will look like this:

MySQL
SELECT
    home_type, price,
    AVG(price) OVER (PARTITION BY home_type) AS avg_price
FROM Rooms
home_typepriceavg_price
Entire home/apt225148.6667
Entire home/apt180148.6667
Entire home/apt150148.6667
Entire home/apt85148.6667
Entire home/apt120148.6667
Entire home/apt120148.6667
Entire home/apt299148.6667
Entire home/apt190148.6667
Entire home/apt99148.6667
Entire home/apt215148.6667
Entire home/apt140148.6667
Entire home/apt120148.6667
Entire home/apt150148.6667
Entire home/apt135148.6667
Entire home/apt150148.6667
Entire home/apt110148.6667
Entire home/apt115148.6667
Entire home/apt200148.6667
Entire home/apt150148.6667
Entire home/apt80148.6667
Entire home/apt89148.6667
Private room6889.4286
Private room5089.4286
Private room7089.4286
Private room8089.4286
Private room8989.4286
Private room14989.4286
Private room3589.4286
Private room15089.4286
Private room13089.4286
Private room12089.4286
Private room13589.4286
Private room13089.4286
Private room15089.4286
Private room6089.4286
Private room7989.4286
Private room7989.4286
Private room8589.4286
Private room8989.4286
Private room8589.4286
Private room14089.4286
Private room5589.4286
Private room8089.4286
Private room11089.4286
Private room6089.4286
Private room8089.4286
Private room4489.4286
Private room5089.4286
Private room5289.4286
Shared room4040

What's happening in the added line?

  • PARTITION BY home_type divides all records into different partitions based on the unique values in the home_type column.
  • Then, for each record, AVG(price) calculates the average price (price) within its partition (home_type).

The result of executing this part of the query will be the avg_price column, which indicates the average price for each record's housing category (home_type).

Partitions on multiple columns

Partitioning can also be done on multiple columns, allowing for more complex and precise segmentation for analysis.

For example, for our Rooms table, we can create partitions based on two columns: the housing category home_type and the presence of a TV in the accommodation has_tv .

Here's an example query with partitioning on two columns:

MySQL
SELECT
    home_type, has_tv, price,
    AVG(price) OVER (PARTITION BY home_type, has_tv) AS avg_price
    FROM Rooms
home_typehas_tvpriceavg_price
Entire home/apt0225170
Entire home/apt0180170
Entire home/apt080170
Entire home/apt0200170
Entire home/apt0150170
Entire home/apt0150170
Entire home/apt0190170
Entire home/apt0215170
Entire home/apt0140170
Entire home/apt199132.6667
Entire home/apt185132.6667
Entire home/apt1150132.6667
Entire home/apt1120132.6667
Entire home/apt1120132.6667
Entire home/apt1299132.6667
Entire home/apt1120132.6667
Entire home/apt1135132.6667
Entire home/apt1150132.6667
Entire home/apt1110132.6667
Entire home/apt189132.6667
Entire home/apt1115132.6667
Private room08578.5455
Private room03578.5455
Private room015078.5455
Private room05578.5455
Private room05278.5455
Private room05078.5455
Private room06878.5455
Private room06078.5455
Private room013578.5455
Private room08578.5455
Private room08978.5455
Private room112096.4706
Private room18096.4706
Private room114996.4706
Private room113096.4706
Private room18996.4706
Private room17096.4706
Private room15096.4706
Private room14496.4706
Private room18096.4706
Private room16096.4706
Private room111096.4706
Private room18096.4706
Private room113096.4706
Private room114096.4706
Private room17996.4706
Private room17996.4706
Private room115096.4706
Shared room14040

Here, PARTITION BY home_type, has_tv creates unique partitions for each combination of home_type and has_tv, allowing us to calculate the average price of housing for the current housing category with or without a TV.

Partitions on two columns