Skip to content

Instantly share code, notes, and snippets.

@mithi
Last active September 30, 2020 14:53
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save mithi/e78fadb62562b2c16a31e18cf77eaf18 to your computer and use it in GitHub Desktop.
Save mithi/e78fadb62562b2c16a31e18cf77eaf18 to your computer and use it in GitHub Desktop.

Test data

CREATE TABLE nodes (
    id SERIAL PRIMARY KEY,
    parent VARCHAR(2) NOT NULL,
    child VARCHAR(2) NOT NULL
);

INSERT INTO nodes (parent, child) VALUES  ('B1', 'C1');
INSERT INTO nodes (parent, child) VALUES  ('B1', 'C2');
INSERT INTO nodes (parent, child) VALUES  ('B1', 'C3');
INSERT INTO nodes (parent, child) VALUES  ('B2', 'C2');
INSERT INTO nodes (parent, child) VALUES  ('B3', 'C2');
INSERT INTO nodes (parent, child) VALUES  ('B3', 'C4');
INSERT INTO nodes (parent, child) VALUES  ('B8', 'C4');

postgres=# SELECT * FROM nodes;
 id | parent | child 
----+--------+-------
  1 | B1     | C1
  2 | B1     | C2
  3 | B1     | C3
  4 | B2     | C2
  5 | B3     | C2
  6 | B3     | C4
  7 | B6     | C8
(7 rows)

Bidirected version of the graph

(2x the number of rows) Where the parent of the child is also the child of the parent

postgres=# SELECT parent, child from nodes UNION SELECT child, parent from nodes;
 parent | child 
--------+-------
 C4     | B3
 C2     | B3
 B3     | C4
 B1     | C2
 B1     | C3
 B1     | C1
 B3     | C2
 C2     | B1
 C3     | B1
 C8     | B6
 B2     | C2
 C1     | B1
 B6     | C8
 C2     | B2
(14 rows)

Our starting / base case

SELECT parent, child FROM nodes where parent = 'B1';

parent | child
--------+-------
 B1     | C1
 B1     | C2
 B1     | C3

Rows that include the nodes connected to 'B1'

(https://www.postgresqltutorial.com/postgresql-recursive-query/)

WITH RECURSIVE included_nodes (parent, child) AS (
    SELECT parent, child FROM bidirected WHERE parent='B1'
    UNION
    SELECT bidirected.parent, bidirected.child FROM bidirected INNER JOIN included_nodes ON bidirected.parent = included_nodes.child
),

bidirected AS (SELECT parent, child from nodes UNION SELECT child, parent FROM nodes)
SELECT * FROM included_nodes;

parent | child 
--------+-------
 B1     | C2
 B1     | C3
 B1     | C1
 C2     | B3
 C2     | B1
 C3     | B1
 C1     | B1
 C2     | B2
 B3     | C4
 B3     | C2
 B2     | C2
 C4     | B3
(12 rows)

The distinct nodes included in the graph

WITH RECURSIVE included_nodes (parent, child) AS (
    SELECT parent, child FROM bidirected WHERE parent='B1'
    UNION
    SELECT bidirected.parent, bidirected.child FROM bidirected INNER JOIN included_nodes ON bidirected.parent = included_nodes.child
),

bidirected AS (SELECT parent, child from nodes UNION SELECT child, parent FROM nodes)
SELECT DISTINCT parent FROM included_nodes;

 parent 
--------
 B3
 C1
 B1
 C2
 C3
 C4
 B2
(7 rows)

Number of distinct nodes

WITH RECURSIVE included_nodes (parent, child) AS (
    SELECT parent, child FROM bidirected WHERE parent='B1'
    UNION
    SELECT bidirected.parent, bidirected.child FROM bidirected INNER JOIN included_nodes ON bidirected.parent = included_nodes.child
),

bidirected AS (SELECT parent, child from nodes UNION SELECT child, parent FROM nodes)
SELECT COUNT(DISTINCT parent) FROM included_nodes;
 count 
-------
     7
(1 row)
@mithi
Copy link
Author

mithi commented Sep 29, 2020

Another example

CREATE TABLE nodes (
    id SERIAL PRIMARY KEY,
    parent VARCHAR(2) NOT NULL,
    child VARCHAR(2) NOT NULL
);

INSERT INTO nodes (parent, child) VALUES  ('a', 'c');
INSERT INTO nodes (parent, child) VALUES  ('b', 'f');
INSERT INTO nodes (parent, child) VALUES  ('a', 'g');
INSERT INTO nodes (parent, child) VALUES  ('c', 'h');
INSERT INTO nodes (parent, child) VALUES  ('b', 'j');
INSERT INTO nodes (parent, child) VALUES  ('d', 'f');
INSERT INTO nodes (parent, child) VALUES  ('e', 'k');
INSERT INTO nodes (parent, child) VALUES  ('l', 'h');

 id | parent | child
----+--------+-------
  1 | a      | c
  2 | b      | f
  3 | a      | g
  4 | c      | h
  5 | b      | j
  6 | d      | f
  7 | e      | k
  8 | l      | h
(8 rows)

In summary

Identifier | Gr_ID    |    Gr.Members
---------------------------------------------------
a       |      1      |   (a,c,g,h,l)
b       |      2      |   (b,d,f,j)
c       |      1      |   (a,c,g,h,l)
d       |      2      |   (b,d,f,j)
e       |      3      |   (e,k)
f       |      2      |   (b,d,f,j)
g       |      1      |   (a,c,g,h,l)
h       |      1      |   (a,c,g,h,l)
j       |      2      |   (b,d,f,j)
k       |      3      |   (e,k)
l       |      1      |   (a,c,g,h,l)

The statement

WITH RECURSIVE included_nodes (parent, child) AS (
    SELECT parent, child FROM bidirected WHERE parent='a'
    UNION
    SELECT bidirected.parent, bidirected.child FROM bidirected INNER JOIN included_nodes ON bidirected.parent = included_nodes.child
),

bidirected AS (SELECT parent, child from nodes UNION SELECT child, parent FROM nodes)
SELECT DISTINCT parent FROM included_nodes;

Results for a, c, h

 parent 
--------
 a
 g
 l
 c
 h
(5 rows)

Results for b or f

 parent 
--------
 d
 b
 j
 f
(4 rows)

Result for k or e

 parent 
--------
 k
 e
(2 rows)

@mithi
Copy link
Author

mithi commented Sep 29, 2020

3rd example

CREATE TABLE nodes (
    id SERIAL PRIMARY KEY,
    parent VARCHAR(2) NOT NULL,
    child VARCHAR(2) NOT NULL
);

INSERT INTO nodes (parent, child) VALUES  ('1', '10');
INSERT INTO nodes (parent, child) VALUES  ('1', '11');
INSERT INTO nodes (parent, child) VALUES  ('1', '12');
INSERT INTO nodes (parent, child) VALUES  ('2', '10');
INSERT INTO nodes (parent, child) VALUES  ('3', '11');
INSERT INTO nodes (parent, child) VALUES  ('3', '13');
INSERT INTO nodes (parent, child) VALUES  ('3', '14');
INSERT INTO nodes (parent, child) VALUES  ('4', '15');
INSERT INTO nodes (parent, child) VALUES  ('5', '15');
INSERT INTO nodes (parent, child) VALUES  ('6', '16');

postgres=# select * from nodes
postgres-# ;
 id | parent | child 
----+--------+-------
  1 | 1      | 10
  2 | 1      | 11
  3 | 1      | 12
  4 | 2      | 10
  5 | 3      | 11
  6 | 3      | 13
  7 | 3      | 14
  8 | 4      | 15
  9 | 5      | 15
 10 | 6      | 16
(10 rows)

1 or 12

 parent 
--------
 13
 2
 11
 12
 10
 3
 14
 1
(8 rows)

4 or 15

 parent 
--------
 4
 15
 5
(3 rows)

6 or 16

 parent 
--------
 6
 16
(2 rows)

@mithi
Copy link
Author

mithi commented Sep 30, 2020

MISC

SELECT parent, array_agg(child) as children
FROM nodes where parent = 'B1'
GROUP BY parent;

 parent |  children
--------+------------
 B1     | {C1,C2,C3}

SELECT parent, array_agg(child) as children
FROM nodes GROUP BY parent;

 parent |  children
--------+------------
 B3     | {C2,C4}
 B1     | {C1,C2,C3}
 B2     | {C2}
 

https://learnsql.com/blog/get-to-know-the-power-of-sql-recursive-queries/
https://www.fusionbox.com/blog/detail/graph-algorithms-in-a-database-recursive-ctes-and-topological-sort-with-postgres/620/
https://stackoverflow.com/questions/58081344/sql-recursive-cte-graph-traversal
https://www.postgresql.org/docs/9.1/queries-with.html
https://www.postgresqltutorial.com/postgresql-recursive-query/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment