Frontier Software

Graph Traversal

By Robert Laing

Traversing trees comes up a lot in coding. For intance, I needed to write a recursive HTML template to generate the table of the contents on the left of this web page using Hugo, and I posted my solution to Hugo’s discourse group for anyone interested.

The pattern I followed is called a transitive closure. Whenever I need some computer science theory, I turn to a textbook written by Turing Award winners Al Aho and Jeff Ullman which they have kindly made freely available online. They introduce transivitive closures in chapter 7, The Set Data Model, and expand on it in chapter 9, The Graph Data Model.

To illustrate ways of doing graph traversal in Prolog, I’ll initially use the following directed acyclic graph (DAG) before introducing a cycle to complicate things a little later.

After dabbling with the various JavaScript diagraming tools such as d3 and vis, I’ve fallen back on Graphviz’s DOT because its syntax for describing graphs is very prologish, and that’s what I’ve used here.

Dot unfortunately is not in the huge list of languages Hugo’s syntax highlighter supports.

digraph G {
  a -> b;
  a -> c;
  a -> d;
  b -> e;
  b -> f;
  c -> g;
  c -> h;
  c -> i;
  d -> j;
  e -> k;
  f -> l;
  f -> m;
  h -> n;
  i -> o;
  i -> p;
  j -> q;
  j -> r;
  j -> s;
  m -> t;
}

Running dot -T svg -o tree.svg tree.dot produces the following diagram:

tree.svg

The above diagram can be turned into Prolog code as below. You could either make that a file, called say tree.pl which can be loaded in a the swipl repl as consult(tree)., or you could simply cut ’n paste it into the left-hand box of the SWISH browser-based version of SWI Prolog.

arc(a, b).
arc(a, c).
arc(a, d).
arc(b, e).
arc(b, f).
arc(c, g).
arc(c, h).
arc(c, i).
arc(d, j).
arc(e, k).
arc(f, l).
arc(f, m).
arc(h, n).
arc(i, o).
arc(i, p).
arc(j, q).
arc(j, r).
arc(j, s).
arc(m, t).

tc(X, Y) :-
  arc(X, Y).

tc(X, Z) :-
  arc(X, Y),
  tc(Y, Z).

At this stage we are just concerned about visiting every descendent of a given starting node in our tree, not so much the order.

For the arcs ordered as above, the query tc(a, X) produces Xs in in the following order

b, c, d, e, f, k, l, m, t, g, h, i, n, o, p, j, q, r, s

This works fine provided we don’t introduce any cycles such as the small modification below:

tree_cycle.svg

Now our path is going to be

b, c, d, e, f, k, e, k, e, k, e,...

Various techniques for avoiding a graph traverser going into an endless loop are available and will be discussed in the coming sections.

  1. Transitive Closures
  2. CyclesWithTabling
  3. BreadthDepthFirst

Typically, we want to use graph traversal for PathFinding. To keep things simple here, we’ll cover that in a separate section later which expands on the examples here.

Combined with tabling which I’ll introduce in the next section. this gives a very succinct way of visiting all linked nodes in a graph. The snag is, we don’t have much control over the order in which the nodes are visited.

As I relearnt transversing the directories I’d split these notes into to create a table of content, order is usually very important in graph traversal applications. For instance, Hugo offers a variety of ways to order pages. The one I’ve used is giving each of my content files a weight attribute, but it could also be done alphabetically by title or by the time the file was last edited.

Depth First

Breadth First

Transitive Closures and Prolog are nearly synonymous. However, there are times we want to do GraphTraversal by listifying the tree, for instance with Puzzle Solving where we don’t initially have a tree to search via a transitive closure. This involves storing unexplored nodes in a list, conventionally called the frontier. Whether children of the currently visited node are stacked at the front (depth first) or queued to the back (breadth first) of the frontier can make a huge difference to the speed and quality of search results.

Transitive Closures

By Robert Laing

A transitive closure is written R+ using a notation developed by regular expressions and finite automata pioneer Stephen Cole Kleene.

R is a binary relation uRv where u and v are nodes in a set S joined by edges. More modern notation would be to write these as u → v, nearly identical to dot’s u -> v notation and easily translates to Prolog rules arc(u, v).

R+ is the set of all these pairs, with the proviso u ≠ v. The set which includes u = v is called a reflexive-transitive closure and written R*, making it possibly larger than R+ because it includes any uRu relations, ie edges which loop-back to a given node.

Route Finding

cycles-with-tabling

By adding arc(k, e). we alter the previous example in Transitive Closures to this:

tree_cycle.svg

Unless guarded against, this will cause tc(a, X). to never escape looping between k and e. Thanks to a relatively recent addition to SWI Prolog Tabled execution, all that’s needed to avoid this is adding one line of code prolog:- table tc/2. to that previously shown in TransitiveClosures.

:- table tc/2.

arc(a, b).
arc(a, c).
arc(a, d).
arc(b, e).
arc(b, f).
arc(c, g).
arc(c, h).
arc(c, i).
arc(d, j).
arc(e, k).
arc(f, l).
arc(f, m).
arc(h, n).
arc(i, o).
arc(i, p).
arc(j, q).
arc(j, r).
arc(j, s).
arc(k, e).
arc(m, t).

tc(X, Y) :-
    arc(X, Y).

tc(X, Z) :-
    arc(X, Y),
    tc(Y, Z).

Tabling changes the order of traversal returned by tc(a, X) to this:

iterative-deepening

By Robert Laing

A snag we hit in puzzle solving is the graph isn’t handily prestored in a database for route finding using transitive closures with infinite loop protection thanks to cycles with tabling. We have to turn to what’s sometimes called generative recursion to dynamically build our graph as we explore it. For this, we’ll return to the techniques introduced in BreadthDepthFirst.

We’ll be using iterative deepening depth-first search which gets depth first to mimic breadth first by doing it repeatedly, lengthening its depth-limit one step at a time. The reason is unbounded depth first instead of finding the shortest route, finds the leftmost route. Breadth first does find the shortest route, but even the cheap RAM available on modern computers is often insufficient for it to complete this task.