# Negation

*By Robert Laing*

It is a very common mistake to imagine that the— First Notions of Logic, Augustus De Morgandenialof a proposition gives the right to affirm the contrary; whereas it should be that theaffirmationof a proposition gives a right todenythe contrary.

In logic, negation is a unary operator which toggles whichever of the two values a variables holds
— I’m using 1 or 0, but it could be *true* or *false*, ⊤ or ⊥, *yes* or *no*,
*on* or *off*, … — to the other.

p | p |
---|---|

1 | 0 |

0 | 1 |

Of the many notations for *NOT*, I favour p over ¬p
to avoid confusion with arithmetic’s minus sign. Whereas
AND in logic algebra is nearly identical to
product
and OR is nearly identical to
sum, -p is meaningless when p can only be 1 or 0.

Translating logic negation to set complement is full of traps for the unwary.

For instance thinking that when φ is a = b as in P = σ_{a = b}(R), it means we can get
P^{c} by simply φ as in
σ_{a != b}(R) is a mistake most people will make on their first attempt at writing
a database query for
“Which students have *not* applied for CS?” used in the next section
NOT. I
suggest trying to write the correct query in SQL or Prolog yourself before proceeding
to get an idea of the spiked pitfalls that await.

## The Law of Excluded Middle

p + p = 1

Logic’s law of the excluded middle says that *p or not p* is a tautology, ie always true.
That is, something is either true or false; there is no middle ground.

By negating both sides of the above equation and then using De Morgan’s Law on the left-hand side, the law of excluded middle can alternatively be written:

p · p = 0

We need the above in equivalence to get from (p + q) · (q + p) to p · q + p · q.

I discovered the word *casuistry* — the use of clever but unsound reasoning —
reading old logic textbooks, and I’d say confusing people by stating p
and then suggesting a misleading p
is a common form of this.

An example is the barber puzzle, associated with
Russell’s Paradox.
Betrand Russel
used it as an illustration to simplify why he didn’t share
Gottlob Frege’s conviction that
all of mathematics could be boiled
down to boolean functions, the φ in what relational algebra calls selection σ_{φ}(R).

I can’t claim to fully understand Russell’s Paradox, but the barber puzzle in no ways makes me think Frege was wrong.

### The barber puzzle

A town’s barber claims to shave everyone who does not shave himself. So who shaves the barber?

I’d say the conundrum relies on a slight of hand to trick people into thinking if p = “shaved by the barber” then p = “self-shaved”, whereas “not shaved by the barber” and “self-shaved” are separate propositions.

The puzzle encourages us to jump to the wrong conclusion that we are looking at diagram 5 where “shaved by the barber” and “self-shaved” are disjointed sets.

Actually, it’s diagram 4. We can think of the left-hand set as
σ_{“Shaved by the barber”}(P) and the right-hand set as σ_{“Self-shaved”}(Q).
The barber could pass both tests, making him the sole element of
σ_{“Shaved by the barber”}(P) ∩ σ_{“Self-shaved”}(Q) without contradicting the
law of the excluded middle.

That’s assuming the barber is a clean shaven man. Why are we sure the barber isn’t a woman?

## Set Complement

It’s easy to forget circular sets have surrounding rectangles, and the town won’t just consist of cleanly shaven men.

Forgetting the surrounding rectangle, E, is a trap that awaits in getting
“Which students have *not* applied for CS?” right.

### Negation as failure

Prolog follows a convention called negation as failure (NAF to its friends) which tripped me up initially, and I suspect will most other people learning the language.

To illustrate NAF, lets quickly return to our simple query for “Which studends applied for ‘CS’?” and let me repeat the way this data is stored in Prolog:

```
%! apply(?SID:integer, ?CName:text, ?Major:text, ?Decision:text) is nondet
apply(123, 'Stanford', 'CS', 'Y').
apply(123, 'Stanford', 'EE', 'N').
apply(123, 'Berkeley', 'CS', 'Y').
apply(123, 'Cornell', 'EE', 'Y').
apply(234, 'Berkeley', 'biology', 'N').
apply(345, 'MIT', 'bioengineering', 'Y').
apply(345, 'Cornell', 'bioengineering', 'N').
apply(345, 'Cornell', 'CS', 'Y').
apply(345, 'Cornell', 'EE', 'N').
apply(678, 'Stanford', 'history', 'Y').
apply(987, 'Stanford', 'CS', 'Y').
apply(987, 'Berkeley', 'CS', 'Y').
apply(876, 'Stanford', 'CS', 'N').
apply(876, 'MIT', 'biology', 'Y').
apply(876, 'MIT', 'marine biology', 'N').
apply(765, 'Stanford', 'history', 'Y').
apply(765, 'Cornell', 'history', 'N').
apply(765, 'Cornell', 'psychology', 'Y').
apply(543, 'MIT', 'CS', 'N').
```

Querying the above with `apply(SID, _, 'CS', _).`

will cause Prolog to run
through the above *facts* in the provided order, matching 123 twice, then 345 etc.

I initially assumed the query “Which studends have not applied for ‘CS’?” could be
done by negating the first query, ie `\+ apply(SID, _, 'CS', _).`

But it turns out the negated version doesn’t run throug the *facts* to check each value
of SID — negated Prolog queries don’t give meaningful results if they are given
unset variables. Personally, I think an
`ERROR: Arguments are not sufficiently instantiated...`

would be more helpful than *false*. But the *NAF* convention is to reply to
ambiguous questions with *false* rather than throw errors.

Before using negated statements, we have to *instantiate* (ie lookup and set)
all the variables they contain to avoid simply getting a meaningless *false*.
Spoiling the next subsection
not
we can find “Which studends have not applied for ‘CS’?” like so:

```
student(SID, _, _, _),
\+ apply(SID, _, 'CS', _).
```

Note the above two statements will only work in that order. If you put the negated statement first,
NAF causes it to simply return *false* — one of many examples
of how claiming Prolog is a declarative
programming language (one where control flow doesn’t matter because *anded* together statements
are supposedly commutative and associative) just causes confusion, a discussion I’ll take further in
series.

Next, lets write a query for “Which students have not applied for CS?”

## Not

*By Robert Laing*

Lets turn the query “Which students applied for ‘CS’?” around, again using Stanford University dean Jennifer Widom’s basic SQL examples using her college applications data.

## Which students have *not* applied for CS?

A trap many people will fall into is to rewrite the σ_{Major = ‘CS’}(Apply) selection
to *not equal*, editing it into something like σ_{Major != ‘CS’}(Apply).

That would return the set for *not* ‘CS’ as
{123, 234, 345, 678, 765, 876}. Recalling that the students who applied for
‘CS’ were {123, 345, 543, 876, 987}, we see we shouldn’t have {123, 345, 876} in our result.
The reason those students match `Major <> 'CS'`

is student 123 also applied for ‘EE’,
345 applied for ‘EE’ and ‘bioengineering’ besides ‘CS’,
while 876 applied for ‘biology’ and ‘marine biology’ besides ‘CS’.

## Difference

*By Robert Laing*

Lets start by repeating the example in not, however using SQL’s EXCEPT operator, which ties into set theory’s P - Q.

In relational algebra, the query we want is
Π_{sID}(Student) - Π_{sID}(σ_{Major = ‘CS’}(Apply)).

Note that relational algebra involving different tables/sets requires them to be
*projected* into a shared type of compound data structure. Here I’m keeping things simple
by only working with their common column sID.
In fill values I’ll go into the complexities
that negation sometimes causes because of missing columns.

## Fill Values

*By Robert Laing*

The problem of how to deal with unknown values for logical variables led to the development of
three-valued logic, which
SQL implemented with *NULL*,
and Prolog with unset variables.

## Which students have not made any college applications yet?

In preparation of outer joins, lets here do the portion called an antijoin

Student ▷ Apply

Our basic goal here is to find the set {456, 567, 654, 789}.

In the example so far involving both the `Student`

and `Apply`

tables —
“Which students have not *applied* for CS?” — I sidestepped the problem
that the student table has columns `sID, sName, GPA`

and `sizeHS`

while
Apply has columns `sID, cName, major`

and `decision`

by
projecting both to their single common column, SID.

## De Morgan's Law

*By Robert Laing*

What I took to calling *copula notation* because I read somewhere that the ∧ and ∨ symbols are called
copulas (but I can’t find the reference again) is a handy mnemonic for remembering two substitution
rules commonly called De Morgan’s Laws:

- ¬(p ∧ q) can be substituted with ¬p ∨ ¬q
- ¬(p ∨ q) can be substituted with ¬p ∧ ¬q

There’s an interesting relationship with set theory in that