SQL in practice
4 min read·

CTE in SQL: Common Table Expressions Explained Simply

A CTE (Common Table Expression) is a named query result that exists only within a single SQL statement. A CTE is declared with the WITH operator, and afterwards you can refer to it by name like a regular table:

MySQL 8.1
WITH cte_name AS (
    SELECT ...
)
SELECT * FROM cte_name;

Essentially a CTE is a subquery that was given a name and moved to the top of the query. This simple trick radically changes readability: a complex query becomes a sequence of understandable steps.

A Subquery That Got a Name

Let's find orders above the average order value in a delivery service database. With a subquery it looks like this:

MySQL 8.1
SELECT order_id, total_amount
FROM orders
WHERE total_amount > (SELECT AVG(total_amount) FROM orders)
ORDER BY total_amount DESC, order_id
LIMIT 5;

And now the same thing with a CTE:

MySQL 8.1
WITH avg_check AS (
    SELECT AVG(total_amount) AS avg_amount FROM orders
)
SELECT order_id, total_amount
FROM orders
WHERE total_amount > (SELECT avg_amount FROM avg_check)
ORDER BY total_amount DESC, order_id
LIMIT 5;
order_idtotal_amount
2099.96
2699.96
21891.81
23780.62
25879.15

On a query this short the gain isn't obvious yet. It shows up once there is more than one step.

Multiple CTEs in One Query

CTEs can be declared in a chain separated by commas, and each next one may use the previous ones. Let's find products that bring in more revenue than the average product in their category:

MySQL 8.1
WITH product_revenue AS (
    SELECT p.category, p.name, SUM(o.total_amount) AS revenue
    FROM orders o
    JOIN products p ON p.product_id = o.product_id
    GROUP BY p.category, p.name
),
category_avg AS (
    SELECT category, AVG(revenue) AS avg_revenue
    FROM product_revenue
    GROUP BY category
)
SELECT pr.name, pr.category, pr.revenue, ROUND(ca.avg_revenue, 2) AS category_avg
FROM product_revenue pr
JOIN category_avg ca ON ca.category = pr.category
WHERE pr.revenue > ca.avg_revenue
ORDER BY pr.category, pr.revenue DESC;
namecategoryrevenuecategory_avg
Green SmoothieDrinks272.48227.15
LatteDrinks265.70227.15
Fresh Orange JuiceDrinks257.16227.15
Sushi SetFood1728.19843.88
Olive Oil 500mlGrocery789.92515.75
Dark ChocolateSnacks306.47242.84

The query reads top to bottom like a recipe: first compute revenue per product, then the category average, finally compare. The same query built on nested subqueries would have to be read inside out — and the per-product revenue, used twice, would have to be written out twice. The query is editable right on the page: flip the WHERE comparison to < and you will see the products lagging behind their category average.

CTE, Subquery or Temporary Table

All three tools solve a similar problem — "put an intermediate result somewhere". The difference is in lifetime and capabilities:

CriterionCTESubqueryTemporary table
Lives forOne queryOne spot in a queryThe whole session
Reusable twiceYes, within the queryNo, copied as textYes, in any session queries
Supports recursionYesNoNo
Needs write privilegesNoNoSometimes yes
Can be indexedNoNoYes

A practical rule: a subquery for a one-off detail, a CTE for readability and reuse within a query, a temporary table when several consecutive queries need the result or it needs an index.

What fundamentally distinguishes a CTE from a temporary table created with CREATE TEMPORARY TABLE?

Recursive CTEs

A recursive CTE is the only way in standard SQL to traverse a structure of unknown depth: an org chart, a category tree, a comment thread. It is a middle-level interview classic.

Let's start with the minimal example — generating the numbers 1 through 5:

MySQL 8.1
WITH RECURSIVE numbers AS (
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1 FROM numbers WHERE n < 5
)
SELECT n FROM numbers;
n
1
2
3
4
5

A recursive CTE always has three parts:

  • The anchorSELECT 1 AS n: the starting row, executed once.
  • The recursive partSELECT n + 1 FROM numbers: refers to the CTE itself and runs again and again, each time receiving the rows of the previous step as input.
  • The termination conditionWHERE n < 5: as soon as the recursive part returns an empty result, the traversal ends. Without this condition the query would loop forever.

How a recursive CTE executes: the anchor and the recursion steps

What happens if you remove the termination condition from a recursive CTE?

Traversing a Hierarchy: The Org Chart

Now a real task. There is an employees table where everyone references their manager:

MySQL 8.1
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(50),
    manager_id INT
);
employee_idnamemanager_id
1Maria<NULL>
2Ivan1
3Olga1
4Peter2
5Anna2
6Sergey3
7Dmitry4

The question "list all employees with their depth in the hierarchy" cannot be answered with a regular JOIN: we don't know how many levels the hierarchy has. A recursive CTE solves it in one query:

MySQL 8.1
WITH RECURSIVE hierarchy AS (
    SELECT employee_id, name, manager_id, 1 AS level
    FROM employees
    WHERE manager_id IS NULL
    UNION ALL
    SELECT e.employee_id, e.name, e.manager_id, h.level + 1
    FROM employees e
    JOIN hierarchy h ON e.manager_id = h.employee_id
)
SELECT employee_id, name, level
FROM hierarchy
ORDER BY level, employee_id;
employee_idnamelevel
1Maria1
2Ivan2
3Olga2
4Peter3
5Anna3
6Sergey3
7Dmitry4

The anchor found the CEO (manager_id IS NULL), the first recursion step found her direct reports, the second the reports of those reports, and so on to the bottom. Termination happens by itself here: when a level has no subordinates, the JOIN returns an empty result.

The same template works for product categories, nested menus, comment threads — any table where a row references its parent.

When a CTE Can Slow a Query Down

CTEs are about readability, not speed. In most cases the optimizer unfolds a CTE the same way as a subquery and there is no performance difference. But there are nuances worth knowing:

  • Materialization. The DBMS may compute the entire CTE into a temporary area and only then apply outer filters. PostgreSQL before version 12 always did this: a WHERE condition outside the CTE did not "push down" into it, and a CTE over millions of rows was computed in full. Starting with PostgreSQL 12, single-use CTEs are inlined into the query, and the behavior can be controlled with MATERIALIZED / NOT MATERIALIZED.
  • Reuse. If a CTE is referenced several times in a query, the DBMS may compute it once and reuse it — a plus. But it may also compute it twice: check the query plan with EXPLAIN.
  • Unbounded recursion. A mistake in the termination condition or a cycle in the data (an employee who manages themselves) leads to infinite recursion. MySQL aborts it at 1000 iterations by default (cte_max_recursion_depth), PostgreSQL keeps going until it hits resource limits, so with untrusted data it's worth adding a safety belt like WHERE level < 100.

What's Next

Related articles