Difference

February 4, 2022

By Robert Laing

Lets start by repeating the example in not, however using SQL’s EXCEPT operator, which ties into set theory’s P - Q.

In relational algebra, the query we want is Π_sID(Student) - Π_sID(σ_{Major = ‘CS’}(Apply)).

Note that relational algebra involving different tables/sets requires them to be projected into a shared type of compound data structure. Here I’m keeping things simple by only working with their common column sID. In fill values I’ll go into the complexities that negation sometimes causes because of missing columns.

Here’s the graphical hint of what the following queries should produce:

Which students have not applied for CS?

SELECT sid FROM student
EXCEPT
SELECT sid FROM apply WHERE major = 'CS'
ORDER BY sid;

In Prolog, P - Q translates fairly directly as

student(SID, _, _, _),
\+ apply(SID, _, 'CS', _).

Of the students who applied for CS, which did not apply for EE?

SQL

SELECT sid FROM apply WHERE major = 'CS'
EXCEPT
SELECT sid FROM apply WHERE major = 'EE'
ORDER BY sid;

Prolog

apply(SID, _, 'CS', _), 
\+ apply(SID, _, 'EE', _).

The result for both is CS - EE = {543, 876, 987}.

Of the students who applied for EE, which did not apply for CS?

EE is a subset of CS, so EE - CS = ∅.

SQL

SELECT sid FROM apply WHERE major = 'EE'
EXCEPT
SELECT sid FROM apply WHERE major = 'CS'
ORDER BY sid;

It returns:

 sid 
-----
(0 rows)

These can be found as subqueries using NOT EXISTS as showin in implication.

Prolog

apply(SID, _, 'EE', _), 
\+ apply(SID, _, 'CS', _).

It returns false.

This query provides a clue how to do implication in SQL and Prolog.

When P - Q = ∅, we know P ⊆ Q — it’s only the case for diagram 1 or 2 — which ties into implication. When both P - Q = ∅ and Q - P = ∅, ie P ⊆ Q and Q ⊆ P, then we know we have equivalence P = Q.

Diagram 1 illustrates classical logic’s equivalence and in that secion I’ll look at translating set theory into SQL and Prolog queries to find majors holding the same students. In the starting state of the college examples, there aren’t any, but that can easily be changed by, say enrolling student 678 for psychology, which would then make history and psychology equal {678, 765}.

What the first three diagrams illustrate is that whereas in normal arithmetic we could say p - q = 0 means p = q, with sets we need to check that both P - Q = ∅ and Q - P = ∅ hold before concluding P = Q.

In these examples I’ve been sidestepping the problem of a type mismatch in the form of the student and apply tables not having the same columns by projecting both down to their single shared column. Next in fill values we’ll look at how SQL and Prolog take different approaches to the unknown values often created when taking set differences.