Skip to content

Instantly share code, notes, and snippets.

@swanandp
Last active November 29, 2023 17:23
Show Gist options
  • Save swanandp/ebc55ea0ba0364e137b6013c6eaedcbf to your computer and use it in GitHub Desktop.
Save swanandp/ebc55ea0ba0364e137b6013c6eaedcbf to your computer and use it in GitHub Desktop.
A demonstration of recursive query in SQL
CREATE TABLE posts
(
id bigserial PRIMARY KEY,
content text NOT NULL,
reply_to_id bigint REFERENCES posts (id)
);
TRUNCATE posts RESTART IDENTITY; -- this deletes all rows and restarts the auto-increment ID
/*
Let's create nested data like this:
-> R
-> R-O
-> R-O-A
-> R-O-A-D
-> R-O-D
-> R-A
-> R-A-P
-> S
-> H
*/
INSERT INTO posts(content)
VALUES ('R');
INSERT INTO posts(content, reply_to_id)
VALUES ('RO', 1),
('RA', 1);
INSERT INTO posts(content, reply_to_id)
VALUES ('ROA', 2),
('RAP', 3);
INSERT INTO posts(content, reply_to_id)
VALUES ('ROAD', 4);
INSERT INTO posts(content, reply_to_id)
VALUES ('ROD', 2);
INSERT INTO posts(content)
VALUES ('S');
INSERT INTO posts(content, reply_to_id)
VALUES ('SH', 7);
SELECT *
FROM posts;
-- select all children for 'R'
WITH RECURSIVE leaves(id, content, reply_to_id) AS
(SELECT id, content, reply_to_id
FROM posts
WHERE content = 'R'
UNION ALL
SELECT p.id, p.content, p.reply_to_id
FROM leaves l
INNER JOIN posts p ON l.id = p.reply_to_id)
SELECT *
FROM leaves;
-- select all children for 'S'
WITH RECURSIVE leaves(id, content, reply_to_id) AS
(SELECT id, content, reply_to_id
FROM posts
WHERE content = 'S'
UNION ALL
SELECT p.id, p.content, p.reply_to_id
FROM leaves l
INNER JOIN posts p ON l.id = p.reply_to_id)
SELECT *
FROM leaves;
-- select all children for 'RO'
WITH RECURSIVE leaves(id, content, reply_to_id) AS
(SELECT id, content, reply_to_id
FROM posts
WHERE content = 'RO'
UNION ALL
SELECT p.id, p.content, p.reply_to_id
FROM leaves l
INNER JOIN posts p ON l.id = p.reply_to_id)
SELECT *
FROM leaves;

How does it work?

The general format of a recursive query is:

WITH RECURSIVE cte_name(arg1, arg2, arg3) AS( -- arg1, arg2 are column outputs from seed_data_query
    seed_data_query -- non-recursive term, executes first
    UNION ALL -- can be just UNION, which returns unique rows. It's slower.
    recursive_query_based_on_seed_data  -- recursive term
) SELECT * FROM cte_name;

When executing, the seed_data_query executes first, giving a set of base results. Note that this is not the "base case" from recursion. In recursion, base case is a terminating clause and is the last to execute. Here, the non-recursive part acts as seed data for the recursive part. Termination of a recursive CTE happens when the recursive part yields 0 rows.

The result of seed_data_query is passed as argument to the next query. For recursive_query_based_on_seed_data, cte_name(arg1, arg2, arg3) is available as a temporary table with columns arg1, arg2, arg3, etc. All your regular join operations are possible with this temporary table.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment