# Graph Traversal

*By Robert Laing*

Traversing trees comes up a lot in coding. For intance, I needed to write a recursive HTML template to generate the table of the contents on the left of this web page using Hugo, and I posted my solution to Hugo’s discourse group for anyone interested.

The pattern I followed is called a transitive closure. Whenever I need some computer science theory, I turn to a textbook written by Turing Award winners Al Aho and Jeff Ullman which they have kindly made freely available online. They introduce transivitive closures in chapter 7, The Set Data Model, and expand on it in chapter 9, The Graph Data Model.

To illustrate ways of doing graph traversal in Prolog, I’ll initially use the following directed acyclic graph (DAG) before introducing a cycle to complicate things a little later.

After dabbling with the various JavaScript diagraming tools such as
d3 and vis, I’ve fallen back on Graphviz’s DOT because its syntax for describing graphs is very *prologish*, and that’s what I’ve used here.

Dot unfortunately is not in the huge list of languages Hugo’s syntax highlighter supports.

```
digraph G {
a -> b;
a -> c;
a -> d;
b -> e;
b -> f;
c -> g;
c -> h;
c -> i;
d -> j;
e -> k;
f -> l;
f -> m;
h -> n;
i -> o;
i -> p;
j -> q;
j -> r;
j -> s;
m -> t;
}
```

Running `dot -T svg -o tree.svg tree.dot`

produces the following diagram:

The above diagram can be turned into Prolog code as below. You could either make that a file, called say *tree.pl*
which can be loaded in a the swipl repl as *consult(tree).*, or you could simply cut ’n paste it into the
left-hand box of the SWISH browser-based version of SWI Prolog.

```
arc(a, b).
arc(a, c).
arc(a, d).
arc(b, e).
arc(b, f).
arc(c, g).
arc(c, h).
arc(c, i).
arc(d, j).
arc(e, k).
arc(f, l).
arc(f, m).
arc(h, n).
arc(i, o).
arc(i, p).
arc(j, q).
arc(j, r).
arc(j, s).
arc(m, t).
tc(X, Y) :-
arc(X, Y).
tc(X, Z) :-
arc(X, Y),
tc(Y, Z).
```

At this stage we are just concerned about visiting every descendent of a given starting node in our tree, not so much the order.

For the arcs ordered as above, the query `tc(a, X)`

produces Xs in in the following order

```
b, c, d, e, f, k, l, m, t, g, h, i, n, o, p, j, q, r, s
```

This works fine provided we don’t introduce any cycles such as the small modification below:

Now our path is going to be

```
b, c, d, e, f, k, e, k, e, k, e,...
```

Various techniques for avoiding a graph traverser going into an endless loop are available and will be discussed in the coming sections.

- Transitive Closures
- CyclesWithTabling
- BreadthDepthFirst

Typically, we want to use graph traversal for PathFinding. To keep things simple here, we’ll cover that in a separate section later which expands on the examples here.

Combined with tabling which I’ll introduce in the next section. this gives a very succinct way of visiting all linked nodes in a graph. The snag is, we don’t have much control over the order in which the nodes are visited.

As I relearnt transversing the directories I’d split these notes into to create a table of content, order is usually very important in graph traversal applications. For instance, Hugo offers a variety of ways to order pages. The one I’ve used is giving each of my content files a weight attribute, but it could also be done alphabetically by title or by the time the file was last edited.

## Depth First

## Breadth First

Transitive Closures and Prolog are nearly synonymous. However, there are times we want to do GraphTraversal by *listifying* the tree, for instance with Puzzle Solving where we don’t initially have a tree to search via a transitive closure. This involves storing unexplored nodes in a list, conventionally called the *frontier*. Whether children of the currently visited node are *stacked* at the front (depth first) or queued to the back (breadth first) of the *frontier* can make a huge difference to the speed and quality of search results.

## Transitive Closures

*By Robert Laing*

A transitive closure is written *R ^{+}* using a notation developed by regular expressions and
finite automata pioneer Stephen Cole Kleene.

*R* is a binary relation *uRv* where *u* and *v* are nodes in a set *S* joined by edges. More modern
notation would be to write these as *u → v*, nearly identical to dot’s `u -> v`

notation and easily translates to Prolog
rules `arc(u, v).`

*R ^{+}* is the set of all these pairs, with the proviso

*u ≠ v*. The set which includes

*u = v*is called a

*reflexive-transitive closure*and written

*R*, making it possibly larger than

^{*}*R*because it includes any

^{+}*uRu*relations, ie edges which

*loop-back*to a given node.

## Route Finding

## cycles-with-tabling

By adding `arc(k, e).`

we alter the previous example in Transitive Closures to this:

Unless guarded against, this will cause `tc(a, X).`

to never escape looping between *k* and *e*. Thanks to a relatively recent addition to SWI Prolog Tabled execution, all that’s needed to avoid this is adding one line of code `prolog:- table tc/2.`

to that previously shown in TransitiveClosures.

```
:- table tc/2.
arc(a, b).
arc(a, c).
arc(a, d).
arc(b, e).
arc(b, f).
arc(c, g).
arc(c, h).
arc(c, i).
arc(d, j).
arc(e, k).
arc(f, l).
arc(f, m).
arc(h, n).
arc(i, o).
arc(i, p).
arc(j, q).
arc(j, r).
arc(j, s).
arc(k, e).
arc(m, t).
tc(X, Y) :-
arc(X, Y).
tc(X, Z) :-
arc(X, Y),
tc(Y, Z).
```

Tabling changes the order of traversal returned by `tc(a, X)`

to this:

## iterative-deepening

*By Robert Laing*

A snag we hit in puzzle solving is the graph isn’t handily prestored in a database
for route finding using transitive closures
with infinite loop protection thanks to cycles with tabling.
We have to turn to what’s sometimes called *generative recursion* to dynamically build our graph as we explore it. For this, we’ll return to the techniques introduced in BreadthDepthFirst.

We’ll be using iterative deepening depth-first search which gets depth first to mimic breadth first by doing it repeatedly, lengthening its depth-limit one step at a time. The reason is unbounded depth first instead of finding the shortest route, finds the leftmost route. Breadth first does find the shortest route, but even the cheap RAM available on modern computers is often insufficient for it to complete this task.