Frontier Software

Graphs

Postgresql uses WITH RECURSIVE to traverse graphs stored as databases.

UNION ALL vs UNION

The result of UNION does not contain any duplicate rows unless the ALL option is specified. ALL prevents elimination of duplicates. (Therefore, UNION ALL is usually significantly quicker than UNION; use ALL when you can.) DISTINCT can be written to explicitly specify the default behavior of eliminating duplicate rows. — UNION Clause

Sometimes, using UNION instead of UNION ALL can accomplish this by discarding rows that duplicate previous output rows. — Cycle Detection

I’m opting for UNION without ALL to help avoid cycles.

Basic depth-first search:

WITH RECURSIVE search_graph(init_id, next_id) AS (
    SELECT t.init_id, t.next_id
    FROM tictactoe_graph t
    WHERE t.init_id = 1
  UNION
    SELECT t.init_id, t.next_id
    FROM tictactoe_graph t, search_graph st
    WHERE t.init_id = st.next_id
)
SELECT * FROM search_graph;

Depth-first with ordering information:

WITH RECURSIVE search_graph(init_id, next_id, path) AS (
    SELECT t.init_id, t.next_id, ARRAY[t.init_id]
    FROM tictactoe_graph t
    WHERE t.init_id = 1
  UNION
    SELECT t.init_id, t.next_id, path || t.init_id
    FROM tictactoe_graph t, search_graph st
    WHERE t.init_id = st.next_id
)
SELECT * FROM search_graph ORDER BY path;

or alterntatively

WITH RECURSIVE search_graph(init_id, next_id) AS (
    SELECT t.init_id, t.next_id
    FROM tictactoe_graph t
    WHERE t.init_id = 1
  UNION
    SELECT t.init_id, t.next_id
    FROM tictactoe_graph t, search_graph st
    WHERE t.init_id = st.next_id
) SEARCH DEPTH FIRST BY init_id SET ordercol
SELECT * FROM search_graph ORDER BY ordercol;

To create a breadth-first order, you can add a column that tracks the depth of the search. This also avoids cycles.

WITH RECURSIVE search_graph(init_id, next_id, depth) AS (
    SELECT t.init_id, t.next_id, 0
    FROM tictactoe_graph t
    WHERE t.init_id = 1
  UNION
    SELECT t.init_id, t.next_id, st.depth + 1
    FROM tictactoe_graph t, search_graph st
    WHERE t.init_id = st.next_id
)
SELECT * FROM search_graph;

or alterntatively

WITH RECURSIVE search_graph(init_id, next_id) AS (
    SELECT t.init_id, t.next_id
    FROM tictactoe_graph t
    WHERE t.init_id = 1
  UNION ALL
    SELECT t.init_id, t.next_id
    FROM tictactoe_graph t, search_graph st
    WHERE t.init_id = st.next_id
) SEARCH BREADTH FIRST BY init_id SET ordercol
SELECT * FROM search_graph ORDER BY ordercol;

Breadth first with ordering information:

WITH RECURSIVE search_graph(init_id, next_id, depth, path) AS (
    SELECT t.init_id, t.next_id, 0, ARRAY[t.init_id]
    FROM tictactoe_graph t
    WHERE t.init_id = 1
  UNION
    SELECT t.init_id, t.next_id, st.depth + 1, path || t.init_id
    FROM tictactoe_graph t, search_graph st
    WHERE t.init_id = st.next_id
)
SELECT * FROM search_graph ORDER BY path;

Cycle detection

Besides breadth-first with depth as above, storing path can be used by either depth or breadth to avoid cycles:

WITH RECURSIVE search_graph(init_id, next_id, depth, is_cycle, path) AS (
    SELECT t.init_id, t.next_id, 0, false, ARRAY[t.init_id]
    FROM tictactoe_graph t
    WHERE t.init_id = 1
  UNION
    SELECT t.init_id, t.next_id, st.depth + 1, t.init_id = ANY(path), path || t.init_id
    FROM tictactoe_graph t, search_graph st
    WHERE t.init_id = st.next_id AND NOT is_cycle
)
SELECT * FROM search_graph ORDER BY path;

There is built-in syntax to simplify cycle detection. The above query can also be written like this:

WITH RECURSIVE search_graph(init_id, next_id, depth) AS (
    SELECT t.init_id, t.next_id, 0
    FROM tictactoe_graph t
    WHERE t.init_id = 1
  UNION
    SELECT t.init_id, t.next_id, st.depth + 1
    FROM tictactoe_graph t, search_graph st
    WHERE t.init_id = st.next_id
) CYCLE init_id SET is_cycle USING path
SELECT * FROM search_graph;

Unexplored nodes

WITH RECURSIVE search_graph(init_id, next_id) AS (
    SELECT t.init_id, t.next_id
    FROM tictactoe_graph t
    WHERE t.init_id = 1
  UNION ALL
    SELECT t.init_id, t.next_id
    FROM tictactoe_graph t, search_graph st
    WHERE t.init_id = st.next_id
)
SELECT next_id FROM search_graph
WHERE next_id IN (SELECT id FROM tictactoe_states WHERE NOT terminal)
AND next_id NOT IN (SELECT init_id FROM tictactoe_graph);

Data-Modifying Statements in WITH