Frontier Software

Transitive Closures

By Robert Laing

A transitive closure is written R+ using a notation developed by regular expressions and finite automata pioneer Stephen Cole Kleene.

R is a binary relation uRv where u and v are nodes in a set S joined by edges. More modern notation would be to write these as u → v, nearly identical to dot’s u -> v notation and easily translates to Prolog rules arc(u, v).

R+ is the set of all these pairs, with the proviso u ≠ v. The set which includes u = v is called a reflexive-transitive closure and written R*, making it possibly larger than R+ because it includes any uRu relations, ie edges which loop-back to a given node.

In the first example tree in graph traversal, there are no loop-back edges, so R+ and R* are the same.

tree.svg

To illustrate the difference, lets add a loop-back edge to node b.

tree.svg

The addition of the new arc

arc(b, b).

causes the query tc(a, X). to endlessly repeat bef, failing to explore the rest of the graph.

b, c, d, b, e, f, b, e, f, ...

Thanks to a relatively new addition to SWI Prolog, this endless cycle trap is easily fixed by adding :- table tc/2. to the top of tree.pl so that it looks like this:

:- table tc/2.

arc(a, b).
arc(a, c).
arc(a, d).
arc(b, b).
arc(b, e).
arc(b, f).
arc(c, g).
arc(c, h).
arc(c, i).
arc(d, j).
arc(e, k).
arc(f, l).
arc(f, m).
arc(h, n).
arc(i, o).
arc(i, p).
arc(j, q).
arc(j, r).
arc(j, s).
arc(m, t).

tc(X, Y) :-
  arc(X, Y).

tc(X, Z) :-
  arc(X, Y),
  tc(Y, Z).

Now running the query tc(a, X). I got Xs in this order, which is different to what I got a while back:

p, e, t, i, m, n, l, g, h, k, s, j, f, d, c, r, o, b, q

Tabling allows us to make our Prolog code even more succinct, and write our recursive rule in a more pure maths way.

:- table arc/2.

arc(a, b).
arc(a, c).
arc(a, d).
arc(b, b).
arc(b, e).
arc(b, f).
arc(c, g).
arc(c, h).
arc(c, i).
arc(d, j).
arc(e, k).
arc(f, l).
arc(f, m).
arc(h, n).
arc(i, o).
arc(i, p).
arc(j, q).
arc(j, r).
arc(j, s).
arc(m, t).

arc(X, Z) :- arc(X, Y), arc(Y, Z).

Prolog recursion style-tip

Something most of us Prolog novices learn the hard way is it’s very easy to hang your computer or get stack overflow errors if you’re not cautious about how you write recursive statements.

Returning to our original example with the b-loop along with tabling removed, but with the order of the two statements in the recursive tc(X, Z) rule switched:

arc(a, b).
arc(a, c).
arc(a, d).
arc(b, e).
arc(b, f).
arc(c, g).
arc(c, h).
arc(c, i).
arc(d, j).
arc(e, k).
arc(f, l).
arc(f, m).
arc(h, n).
arc(i, o).
arc(i, p).
arc(j, q).
arc(j, r).
arc(j, s).
arc(m, t).

tc(X, Y) :-
  arc(X, Y).

tc(X, Z) :-
  tc(X, Y),
  arc(Y, Z).

Running the query tc(a, X). starts promisingly with the order of Xs appearing alphabetical (ie breadth first in our example) until we get to t, whereapon a Ctrl-C is needed to get the commandline back.

Using separate names for recursive rules and always making the last statement the recursive one for the non-base case as in

tc(X, Y) :-
  arc(X, Y).

tc(X, Z) :-
  arc(X, Y),
  tc(Y, Z).

tends to simplify things for both human and computer readers.

The kind of recursive graph traversal xRyRz used above has become synonymous with Kleene stars, and in artificial intelligence textbooks it’s often called the A* algorithm. In Prolog textbooks, what I’ve called tc/2 is commonly written a_star/2.

An asterisk’s ASCII value is 42, and according to science fiction folklore, that’s why author and programer Douglas Adams picked it as the answer to life, the universe and everything in his The Hitchhiker’s Guide to the Galaxy novels.

Snags with the above is the order of traversal is hard to guess and tabling assumes we are not generating nodes as we explore the tree. For this we need breadth first or depth first searches.