Frontier Software

Fill Values

By Robert Laing

The problem of how to deal with unknown values for logical variables led to the development of three-valued logic, which SQL implemented with NULL, and Prolog with unset variables.

Which students have not made any college applications yet?

In preparation of outer joins, lets here do the portion called an antijoin

Student ▷ Apply

Our basic goal here is to find the set {456, 567, 654, 789}.

college-classes.svg

In the example so far involving both the Student and Apply tables — “Which students have not applied for CS?” — I sidestepped the problem that the student table has columns sID, sName, GPA and sizeHS while Apply has columns sID, cName, major and decision by projecting both to their single common column, SID.

To explain the problem, lets look at the type (ie columns or attributes) of the two tables we want to think of as sets P and Q, thereby allowing P - Q:

Student Table
sid sname gpa sizehs
123 Amy 3.9 1000
234 Bob 3.6 1500
345 Craig 3.5 500
456 Doris 3.9 1000
567 Edward 2.9 2000
678 Fay 3.8 200
789 Gary 3.4 800
987 Helen 3.7 800
876 Irene 3.9 400
765 Jay 2.9 1500
654 Amy 3.9 1000
543 Craig 3.4 2000
Apply Table
sid cname major decision
123 Stanford CS Y
123 Stanford EE N
123 Berkeley CS Y
123 Cornell EE Y
234 Berkeley biology N
345 MIT bioengineering Y
345 Cornell bioengineering N
345 Cornell CS Y
345 Cornell EE N
678 Stanford history Y
987 Stanford CS Y
987 Berkeley CS Y
876 Stanford CS N
876 MIT biology Y
876 MIT marine biology N
765 Stanford history Y
765 Cornell history N
765 Cornell psychology Y
543 MIT CS N

SQL

Here’s the error I get if I try to take the difference of these two tables in SQL as is:

dbclass=> SELECT SID, SName, GPA, SizeHS FROM Student
EXCEPT
SELECT SID, CName, Major, Decision FROM Apply;
ERROR:  EXCEPT types real and text cannot be matched
LINE 3: SELECT SID, CName, Major, Decision FROM Apply;

To get the rows of the Student table to match Apply’s column, I need to fill in the missing values with NULL. This is the first time I’m using relational algebra’s rename operator ρ, which in SQL is AS.

SELECT sID, NULL AS cName, NULL AS major, NULL AS decision
FROM Student
EXCEPT
SELECT sID, NULL, NULL, NULL
FROM Apply
ORDER BY sID;

When psql outputs the table, it represents the NULL values as blanks.

Student ▷ Apply
sid cname major decision
456      
567      
654      
789      

Prolog

Prolog doesn’t throw an error when asked to do Student - Apply. Instead, it simply ignores the missing apply column values while keeping the known student column values:

?- student(SID, SName, GPA, SizeHS), \+ apply(SID, CName, Major, Decision).
SID = 456,
SName = 'Doris',
GPA = 3.9,
SizeHS = 1000 ;
SID = 567,
SName = 'Edward',
GPA = 2.9,
SizeHS = 2000 ;
SID = 789,
SName = 'Gary',
GPA = 3.4,
SizeHS = 800 ;
SID = 654,
SName = 'Amy',
GPA = 3.9,
SizeHS = 1000 ;
false.

Prolog doesn’t have the equivalent of a NULL. Lets use the Template argument of findall(+Template, :Goal, -Bag) to project the four sID values for which there are no CName, Major, Decision:

findall(apply(SID, CName, Major, Decision), (
  student(SID, _, _, _), 
  \+ apply(SID, CName, Major, Decision)
), List).

The result is:

List = [
apply(456, _, _, _), 
apply(567, _, _, _), 
apply(789, _, _, _), 
apply(654, _, _, _)
].

Instead of NULL, Prolog uses unset variables as placeholders.