Frontier Software

Outer Join

By Robert Laing

As explained in inner join relational algebra uses various bowtie symbols to describe binary operations on two tables, which are anologous to combining two sets as I’ll try to explain in this diagram.

P - Q P ∩ Q Q - P P Q

The P ∩ Q portion could be considered inner join P ⋈ Q. A left outer join P ⟕ Q is (P - Q) ∪ (P ∩ Q), a right outer join P ⟖ Q is (P ∩ Q) ∪ (Q - P), and a full outer join P ⟗ Q is (P - Q) ∪ (P ∩ Q) ∪ (Q - P).

To be able to use relations — tables — as sets, we need to project both to the same columns, which as explained in fill values can create problems for the unwary.

Left Outer Join

Now I want to modify the Three-way inner join example to include {456, 567, 789, 654}. This can be done with a three-way left outer join

Student ⟕ Apply ⟕ College

sid sname gpa sizehs major decision cname state enrollment
123 Amy 3.9 1000 CS Y Berkeley CA 36000
123 Amy 3.9 1000 EE Y Cornell NY 21000
123 Amy 3.9 1000 EE N Stanford CA 15000
123 Amy 3.9 1000 CS Y Stanford CA 15000
234 Bob 3.6 1500 biology N Berkeley CA 36000
345 Craig 3.5 500 bioengineering Y MIT MA 10000
345 Craig 3.5 500 EE N Cornell NY 21000
345 Craig 3.5 500 bioengineering N Cornell NY 21000
345 Craig 3.5 500 CS Y Cornell NY 21000
456 Doris 3.9 1000          
543 Craig 3.4 2000 CS N MIT MA 10000
567 Edward 2.9 2000          
654 Amy 3.9 1000          
678 Fay 3.8 200 history Y Stanford CA 15000
765 Jay 2.9 1500 history N Cornell NY 21000
765 Jay 2.9 1500 history Y Stanford CA 15000
765 Jay 2.9 1500 psychology Y Cornell NY 21000
789 Gary 3.4 800          
876 Irene 3.9 400 CS N Stanford CA 15000
876 Irene 3.9 400 biology Y MIT MA 10000
876 Irene 3.9 400 marine biology N MIT MA 10000
987 Helen 3.7 800 CS Y Stanford CA 15000
987 Helen 3.7 800 CS Y Berkeley CA 36000
SELECT sID, sName, GPA, sizeHS, major, decision, College.cName, state, enrollment
FROM Student LEFT OUTER JOIN Apply USING(sID) LEFT OUTER JOIN College USING(cName)
ORDER BY sID;

Full Outer Join

Lets insert a row into the apply table with no matching sID in the student table and cName in the College table:

INSERT INTO Apply VALUES (321, 'Harvard', 'history', 'N');

A reminder that USING allows us to dispense with tablename.shared_column syntax, which would cause unwanted NULL values in those respected columns otherwise.

SELECT sID, sName, GPA, sizeHS, major, decision, cName, state, enrollment
FROM Student FULL OUTER JOIN Apply USING(sID) FULL OUTER JOIN College USING(cName)
ORDER BY sID;

A synonymous way to do that is

SELECT sID, sName, GPA, sizeHS, major, decision, cName, state, enrollment
FROM Student NATURAL FULL OUTER JOIN Apply NATURAL FULL OUTER JOIN College
ORDER BY sID;
sid sname gpa sizehs major decision cname state enrollment
123 Amy 3.9 1000 EE Y Cornell NY 21000
123 Amy 3.9 1000 CS Y Stanford CA 15000
123 Amy 3.9 1000 EE N Stanford CA 15000
123 Amy 3.9 1000 CS Y Berkeley CA 36000
234 Bob 3.6 1500 biology N Berkeley CA 36000
321       history N Harvard    
345 Craig 3.5 500 bioengineering Y MIT MA 10000
345 Craig 3.5 500 bioengineering N Cornell NY 21000
345 Craig 3.5 500 EE N Cornell NY 21000
345 Craig 3.5 500 CS Y Cornell NY 21000
456 Doris 3.9 1000          
543 Craig 3.4 2000 CS N MIT MA 10000
567 Edward 2.9 2000          
654 Amy 3.9 1000          
678 Fay 3.8 200 history Y Stanford CA 15000
765 Jay 2.9 1500 history N Cornell NY 21000
765 Jay 2.9 1500 psychology Y Cornell NY 21000
765 Jay 2.9 1500 history Y Stanford CA 15000
789 Gary 3.4 800          
876 Irene 3.9 400 biology Y MIT MA 10000
876 Irene 3.9 400 marine biology N MIT MA 10000
876 Irene 3.9 400 CS N Stanford CA 15000
987 Helen 3.7 800 CS Y Stanford CA 15000
987 Helen 3.7 800 CS Y Berkeley CA 36000

Don’t forget to do delete the added row to get the database back to the starting state:

DELETE FROM Apply WHERE sID = 321;