Frontier Software

Or

By Robert Laing

Once again, lets use Stanford University dean Jennifer Widom’s basic SQL examples from her college applications data as an illustration, modifying the same SQL examples done in conjunction as “which students applied for ‘CS’ and ‘EE’?”.

Which students applied for ‘CS’ or ‘EE’?

As with conjunction’s p ∧ q being a spiky version of set theory’s P ∩ Q, disjunction’s p ∨ q is a spiky version of set theory’s P ∪ Q.

In SQL, this query can be written using the UNION operator:

SELECT sid FROM apply WHERE major = 'CS'
UNION
SELECT sid FROM apply WHERE major = 'EE'
ORDER BY sid;

Which produces the set {123, 345, 543, 876, 987}. Since ‘EE’ is a subset of ‘CS’, ‘CS’ ∪ ‘EE’ = ‘CS’.

college-classes.svg

Whereas doing this without a set operator for and involved a self-join, the or version is much simpler:

SELECT DISTINCT sid
FROM apply
WHERE major = 'CS' OR major = 'EE'
ORDER BY sid;

In Prolog, this query can be written:

order_by([asc(SID)], distinct(SID, (apply(SID, _, 'CS', _); apply(SID, _, 'EE', _)))).

Note the only difference from the and version is a comma has been replaced by a semicolon.

While disjunction can be done in Prolog as above using a semicolon — and it’s fine for a simple binary or query as in this example — using semicolons for disjunction in the sense of flow control is a recipe for unreadable spaghetti code.

Prolog offers an elegant alternative form of or for flow control in that each case can be written as a separate rule, which I’ll explain later.

Membership

Another way of thinking of disjunction is as pi ∈ {p1, p2, …, pn}. In SQL, this example could be written like so:

SELECT DISTINCT sid 
FROM apply
WHERE major = ANY(ARRAY['CS', 'EE'])
ORDER BY sid;

Something I found confusing is ANY/SOME along with ALL in the above case are are boolean functions which expect an array as input argument. They have WHERE subquery counterparts which share the same names.

Prolog has a handy member(?Elem, ?List) builtin which can be used as follows to do an or query without any semicolons:

order_by([asc(SID)], distinct(SID, (apply(SID, _, Major, _), member(Major, ['CS', 'EE'])))).

An advantage of thinking of disjunction-style boolean queries as membership rather than ors is these types of queries are typically used to handle synonyms, which tend to proliferate.

For instance, a real life version of this example database is bound to have to deal with some colleges calling the major ‘CS’, others ‘Computer Science" etc. As the list of synonyms grows, it easier to add them in an array or list than appending or tests.