Outer Join
By Robert Laing
As explained in inner join relational algebra uses various bowtie symbols to describe binary operations on two tables, which are anologous to combining two sets as I’ll try to explain in this diagram.
The P ∩ Q portion could be considered inner join P ⋈ Q. A left outer join P ⟕ Q is (P - Q) ∪ (P ∩ Q), a right outer join P ⟖ Q is (P ∩ Q) ∪ (Q - P), and a full outer join P ⟗ Q is (P - Q) ∪ (P ∩ Q) ∪ (Q - P).
To be able to use relations — tables — as sets, we need to project both to the same columns, which as explained in fill values can create problems for the unwary.
Left Outer Join
Now I want to modify the Three-way inner join example to include {456, 567, 789, 654}. This can be done with a three-way left outer join
Student ⟕ Apply ⟕ College
sid | sname | gpa | sizehs | major | decision | cname | state | enrollment |
---|---|---|---|---|---|---|---|---|
123 | Amy | 3.9 | 1000 | CS | Y | Berkeley | CA | 36000 |
123 | Amy | 3.9 | 1000 | EE | Y | Cornell | NY | 21000 |
123 | Amy | 3.9 | 1000 | EE | N | Stanford | CA | 15000 |
123 | Amy | 3.9 | 1000 | CS | Y | Stanford | CA | 15000 |
234 | Bob | 3.6 | 1500 | biology | N | Berkeley | CA | 36000 |
345 | Craig | 3.5 | 500 | bioengineering | Y | MIT | MA | 10000 |
345 | Craig | 3.5 | 500 | EE | N | Cornell | NY | 21000 |
345 | Craig | 3.5 | 500 | bioengineering | N | Cornell | NY | 21000 |
345 | Craig | 3.5 | 500 | CS | Y | Cornell | NY | 21000 |
456 | Doris | 3.9 | 1000 | |||||
543 | Craig | 3.4 | 2000 | CS | N | MIT | MA | 10000 |
567 | Edward | 2.9 | 2000 | |||||
654 | Amy | 3.9 | 1000 | |||||
678 | Fay | 3.8 | 200 | history | Y | Stanford | CA | 15000 |
765 | Jay | 2.9 | 1500 | history | N | Cornell | NY | 21000 |
765 | Jay | 2.9 | 1500 | history | Y | Stanford | CA | 15000 |
765 | Jay | 2.9 | 1500 | psychology | Y | Cornell | NY | 21000 |
789 | Gary | 3.4 | 800 | |||||
876 | Irene | 3.9 | 400 | CS | N | Stanford | CA | 15000 |
876 | Irene | 3.9 | 400 | biology | Y | MIT | MA | 10000 |
876 | Irene | 3.9 | 400 | marine biology | N | MIT | MA | 10000 |
987 | Helen | 3.7 | 800 | CS | Y | Stanford | CA | 15000 |
987 | Helen | 3.7 | 800 | CS | Y | Berkeley | CA | 36000 |
SELECT sID, sName, GPA, sizeHS, major, decision, College.cName, state, enrollment
FROM Student LEFT OUTER JOIN Apply USING(sID) LEFT OUTER JOIN College USING(cName)
ORDER BY sID;
Full Outer Join
Lets insert a row into the apply table with no matching sID in the student table and cName in the College table:
INSERT INTO Apply VALUES (321, 'Harvard', 'history', 'N');
A reminder that USING
allows us to dispense with tablename.shared_column
syntax, which
would cause unwanted NULL
values in those respected columns otherwise.
SELECT sID, sName, GPA, sizeHS, major, decision, cName, state, enrollment
FROM Student FULL OUTER JOIN Apply USING(sID) FULL OUTER JOIN College USING(cName)
ORDER BY sID;
A synonymous way to do that is
SELECT sID, sName, GPA, sizeHS, major, decision, cName, state, enrollment
FROM Student NATURAL FULL OUTER JOIN Apply NATURAL FULL OUTER JOIN College
ORDER BY sID;
sid | sname | gpa | sizehs | major | decision | cname | state | enrollment |
---|---|---|---|---|---|---|---|---|
123 | Amy | 3.9 | 1000 | EE | Y | Cornell | NY | 21000 |
123 | Amy | 3.9 | 1000 | CS | Y | Stanford | CA | 15000 |
123 | Amy | 3.9 | 1000 | EE | N | Stanford | CA | 15000 |
123 | Amy | 3.9 | 1000 | CS | Y | Berkeley | CA | 36000 |
234 | Bob | 3.6 | 1500 | biology | N | Berkeley | CA | 36000 |
321 | history | N | Harvard | |||||
345 | Craig | 3.5 | 500 | bioengineering | Y | MIT | MA | 10000 |
345 | Craig | 3.5 | 500 | bioengineering | N | Cornell | NY | 21000 |
345 | Craig | 3.5 | 500 | EE | N | Cornell | NY | 21000 |
345 | Craig | 3.5 | 500 | CS | Y | Cornell | NY | 21000 |
456 | Doris | 3.9 | 1000 | |||||
543 | Craig | 3.4 | 2000 | CS | N | MIT | MA | 10000 |
567 | Edward | 2.9 | 2000 | |||||
654 | Amy | 3.9 | 1000 | |||||
678 | Fay | 3.8 | 200 | history | Y | Stanford | CA | 15000 |
765 | Jay | 2.9 | 1500 | history | N | Cornell | NY | 21000 |
765 | Jay | 2.9 | 1500 | psychology | Y | Cornell | NY | 21000 |
765 | Jay | 2.9 | 1500 | history | Y | Stanford | CA | 15000 |
789 | Gary | 3.4 | 800 | |||||
876 | Irene | 3.9 | 400 | biology | Y | MIT | MA | 10000 |
876 | Irene | 3.9 | 400 | marine biology | N | MIT | MA | 10000 |
876 | Irene | 3.9 | 400 | CS | N | Stanford | CA | 15000 |
987 | Helen | 3.7 | 800 | CS | Y | Stanford | CA | 15000 |
987 | Helen | 3.7 | 800 | CS | Y | Berkeley | CA | 36000 |
Don’t forget to do delete the added row to get the database back to the starting state:
DELETE FROM Apply WHERE sID = 321;