Negation

By Robert Laing

It is a very common mistake to imagine that the denial of a proposition gives the right to affirm the contrary; whereas it should be that the affirmation of a proposition gives a right to deny the contrary. — First Notions of Logic, Augustus De Morgan

In logic, negation is a unary operator which toggles whichever of the two values a variables holds — I’m using 1 or 0, but it could be true or false, ⊤ or ⊥, yes or no, on or off, … — to the other.

p	p
1	0
0	1

Of the many notations for NOT, I favour p over ¬p to avoid confusion with arithmetic’s minus sign. Whereas AND in logic algebra is nearly identical to product and OR is nearly identical to sum, -p is meaningless when p can only be 1 or 0.

Translating logic negation to set complement is full of traps for the unwary.

For instance thinking that when φ is a = b as in P = σ_{a = b}(R), it means we can get P^c by simply φ as in σ_{a != b}(R) is a mistake most people will make on their first attempt at writing a database query for “Which students have not applied for CS?” used in the next section NOT. I suggest trying to write the correct query in SQL or Prolog yourself before proceeding to get an idea of the spiked pitfalls that await.

The Law of Excluded Middle

p + p = 1

Logic’s law of the excluded middle says that p or not p is a tautology, ie always true. That is, something is either true or false; there is no middle ground.

By negating both sides of the above equation and then using De Morgan’s Law on the left-hand side, the law of excluded middle can alternatively be written:

p · p = 0

We need the above in equivalence to get from (p + q) · (q + p) to p · q + p · q.

I discovered the word casuistry — the use of clever but unsound reasoning — reading old logic textbooks, and I’d say confusing people by stating p and then suggesting a misleading p is a common form of this.

An example is the barber puzzle, associated with Russell’s Paradox. Betrand Russel used it as an illustration to simplify why he didn’t share Gottlob Frege’s conviction that all of mathematics could be boiled down to boolean functions, the φ in what relational algebra calls selection σ_φ(R).

I can’t claim to fully understand Russell’s Paradox, but the barber puzzle in no ways makes me think Frege was wrong.

The barber puzzle

A town’s barber claims to shave everyone who does not shave himself. So who shaves the barber?

I’d say the conundrum relies on a slight of hand to trick people into thinking if p = “shaved by the barber” then p = “self-shaved”, whereas “not shaved by the barber” and “self-shaved” are separate propositions.

The puzzle encourages us to jump to the wrong conclusion that we are looking at diagram 5 where “shaved by the barber” and “self-shaved” are disjointed sets.

Actually, it’s diagram 4. We can think of the left-hand set as σ_{“Shaved by the barber”}(P) and the right-hand set as σ_{“Self-shaved”}(Q). The barber could pass both tests, making him the sole element of σ_{“Shaved by the barber”}(P) ∩ σ_{“Self-shaved”}(Q) without contradicting the law of the excluded middle.

That’s assuming the barber is a clean shaven man. Why are we sure the barber isn’t a woman?

Set Complement

It’s easy to forget circular sets have surrounding rectangles, and the town won’t just consist of cleanly shaven men.

Forgetting the surrounding rectangle, E, is a trap that awaits in getting “Which students have not applied for CS?” right.

Negation as failure

Prolog follows a convention called negation as failure (NAF to its friends) which tripped me up initially, and I suspect will most other people learning the language.

To illustrate NAF, lets quickly return to our simple query for “Which studends applied for ‘CS’?” and let me repeat the way this data is stored in Prolog:

%! apply(?SID:integer, ?CName:text, ?Major:text, ?Decision:text) is nondet
apply(123, 'Stanford', 'CS',             'Y').
apply(123, 'Stanford', 'EE',             'N').
apply(123, 'Berkeley', 'CS',             'Y').
apply(123, 'Cornell',  'EE',             'Y').
apply(234, 'Berkeley', 'biology',        'N').
apply(345, 'MIT',      'bioengineering', 'Y').
apply(345, 'Cornell',  'bioengineering', 'N').
apply(345, 'Cornell',  'CS',             'Y').
apply(345, 'Cornell',  'EE',             'N').
apply(678, 'Stanford', 'history',        'Y').
apply(987, 'Stanford', 'CS',             'Y').
apply(987, 'Berkeley', 'CS',             'Y').
apply(876, 'Stanford', 'CS',             'N').
apply(876, 'MIT',      'biology',        'Y').
apply(876, 'MIT',      'marine biology', 'N').
apply(765, 'Stanford', 'history',        'Y').
apply(765, 'Cornell',  'history',        'N').
apply(765, 'Cornell',  'psychology',     'Y').
apply(543, 'MIT',      'CS',             'N').

Querying the above with apply(SID, _, 'CS', _). will cause Prolog to run through the above facts in the provided order, matching 123 twice, then 345 etc.

I initially assumed the query “Which studends have not applied for ‘CS’?” could be done by negating the first query, ie \+ apply(SID, _, 'CS', _).

But it turns out the negated version doesn’t run throug the facts to check each value of SID — negated Prolog queries don’t give meaningful results if they are given unset variables. Personally, I think an ERROR: Arguments are not sufficiently instantiated... would be more helpful than false. But the NAF convention is to reply to ambiguous questions with false rather than throw errors.

Before using negated statements, we have to instantiate (ie lookup and set) all the variables they contain to avoid simply getting a meaningless false. Spoiling the next subsection not we can find “Which studends have not applied for ‘CS’?” like so:

student(SID, _, _, _), 
\+ apply(SID, _, 'CS', _).

Note the above two statements will only work in that order. If you put the negated statement first, NAF causes it to simply return false — one of many examples of how claiming Prolog is a declarative programming language (one where control flow doesn’t matter because anded together statements are supposedly commutative and associative) just causes confusion, a discussion I’ll take further in series.

Next, lets write a query for “Which students have not applied for CS?”

Not

By Robert Laing

Lets turn the query “Which students applied for ‘CS’?” around, again using Stanford University dean Jennifer Widom’s basic SQL examples using her college applications data.

Which students have not applied for CS?

A trap many people will fall into is to rewrite the σ_{Major = ‘CS’}(Apply) selection to not equal, editing it into something like σ_{Major != ‘CS’}(Apply).

That would return the set for not ‘CS’ as {123, 234, 345, 678, 765, 876}. Recalling that the students who applied for ‘CS’ were {123, 345, 543, 876, 987}, we see we shouldn’t have {123, 345, 876} in our result. The reason those students match Major <> 'CS' is student 123 also applied for ‘EE’, 345 applied for ‘EE’ and ‘bioengineering’ besides ‘CS’, while 876 applied for ‘biology’ and ‘marine biology’ besides ‘CS’.

Difference

By Robert Laing

Lets start by repeating the example in not, however using SQL’s EXCEPT operator, which ties into set theory’s P - Q.

In relational algebra, the query we want is Π_sID(Student) - Π_sID(σ_{Major = ‘CS’}(Apply)).

Note that relational algebra involving different tables/sets requires them to be projected into a shared type of compound data structure. Here I’m keeping things simple by only working with their common column sID. In fill values I’ll go into the complexities that negation sometimes causes because of missing columns.

Fill Values

By Robert Laing

The problem of how to deal with unknown values for logical variables led to the development of three-valued logic, which SQL implemented with NULL, and Prolog with unset variables.

Which students have not made any college applications yet?

In preparation of outer joins, lets here do the portion called an antijoin

Student ▷ Apply

Our basic goal here is to find the set {456, 567, 654, 789}.

In the example so far involving both the Student and Apply tables — “Which students have not applied for CS?” — I sidestepped the problem that the student table has columns sID, sName, GPA and sizeHS while Apply has columns sID, cName, major and decision by projecting both to their single common column, SID.

De Morgan's Law

By Robert Laing

What I took to calling copula notation because I read somewhere that the ∧ and ∨ symbols are called copulas (but I can’t find the reference again) is a handy mnemonic for remembering two substitution rules commonly called De Morgan’s Laws:

¬(p ∧ q) can be substituted with ¬p ∨ ¬q
¬(p ∨ q) can be substituted with ¬p ∧ ¬q

There’s an interesting relationship with set theory in that