
SQL Joins

Paul W. Harkins
2009-02-01
Purpose
One of the most common interview questions for IT jobs is "can you explain the
difference between an inner join and an outer join?" This question is common for a very
simple reason: understanding the difference between an inner join and an outer join, and
which one to use for a particular purpose, is the key to writing the complex SQL queries
necessary to capture and analyze the complex data used in most large applications and
many small applications. This knowledge very clearly separates the employees who
need no supervision from the employees who will be constantly asking for assistance.
The purpose of this document is not just to give you a cheat sheet so you can get
through an interview sounding like you actually know SQL, but rather to give you an
understanding of joins that will allow you to excel in your current position even if you are
not selected for a new one.
That said, this is not an SQL training course. This guide is intended to fill a gap in
popular SQL learning materials and courses. Nearly all instructional materials and
instructors will at least attempt to explain the difference between an inner and an outer
join, but few do so in a way that students can understand if they do not already know the
difference. My intention is to explain the difference in a way that the rest of us can
understand. To this end, I will avoid "big words" and discussion of the underlying
mathematical theory where possible.
Assumptions
Throughout this document we will assume the following:
- You already understand SQL well enough to write single-table queries.
- You are at least vaguely familiar with the purpose of the WHERE clause.
- You have access to a relational database that supports SQL queries.
- There is already data in your database.
- You already have access to a query execution tool (even if the tool is just the
  command line interface to the database management system) and you know
  how to execute queries with it.
If any of the assumptions above are incorrect, the information below may not be
of much value to you. Most SQL manuals are clear enough to give the average person the
required knowledge, and most of the manuals also include either a database with data in it
or instructions to create one. Please consult your manual and/or your database
administrator for further information.
Sample Tables
The tables in the diagram below show a common database structure. Many of the
fields that would normally be in these tables are not in them because they are not needed
to explain the concepts. Below the diagram is a listing of the data in each of the tables.
In the structure below, a customer can have many orders. An order can have many
products, and a product can be on many orders.
There is one customer in the database who does not have any orders. There are
also several orders that do not have products. For the purpose of this exercise, I turned off
referential integrity checks, which allowed me to create orders that did not have
customers. It might seem like this should never happen in a production system, but quite
often an initial data load will be executed without referential integrity checks because the
load can run much faster. We would all hope that all required data would be included, but
sometimes decisions are made to load only part of the legacy data. Even when all data
is expected to be loaded, someone has to write the SQL to verify that all data was
correctly loaded, so I will consider this a valid example even though I had to disable
referential integrity to create it.
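For readers who want to follow along without the original diagram and data listing, the
structure described above could be sketched roughly as follows. The table and column
names are taken from the queries later in this document. The column types, the key
definitions, and every row except Nick (the customer without orders) are assumptions
made for illustration, and the script is meant for an empty scratch database.

```sql
-- Hypothetical reconstruction of the sample tables (types and rows are assumed).
CREATE TABLE CUST (
    CUST_ID   INTEGER PRIMARY KEY,
    CUST_NM   VARCHAR(50)
);

CREATE TABLE ORD (
    ORD_ID    INTEGER PRIMARY KEY,
    CUST_ID   INTEGER,           -- no foreign key, to mimic disabled integrity checks
    ORD_TS    TIMESTAMP NULL,    -- when the order was placed
    SHIP_TS   TIMESTAMP NULL     -- when the order shipped
);

CREATE TABLE PRD (
    PRD_ID    INTEGER PRIMARY KEY,
    PRD_DESC  VARCHAR(100)
);

-- Junction table: one row per product on an order (many-to-many).
CREATE TABLE ORD_PRD (
    ORD_ID    INTEGER,
    PRD_ID    INTEGER,
    PRIMARY KEY (ORD_ID, PRD_ID)
);

INSERT INTO CUST (CUST_ID, CUST_NM) VALUES (1, 'Alice');  -- assumed customer
INSERT INTO CUST (CUST_ID, CUST_NM) VALUES (2, 'Nick');   -- has no orders

INSERT INTO ORD (ORD_ID, CUST_ID, ORD_TS, SHIP_TS)
    VALUES (10, 1, '2009-01-15 10:00:00', '2009-01-16 09:00:00');
INSERT INTO ORD (ORD_ID, CUST_ID, ORD_TS, SHIP_TS)
    VALUES (11, 1, '2009-01-20 14:30:00', '2009-01-21 08:00:00');  -- no products
INSERT INTO ORD (ORD_ID, CUST_ID, ORD_TS, SHIP_TS)
    VALUES (12, 99, '2009-01-22 11:15:00', '2009-01-23 08:00:00'); -- no customer 99

INSERT INTO PRD (PRD_ID, PRD_DESC) VALUES (100, 'Widget');
INSERT INTO PRD (PRD_ID, PRD_DESC) VALUES (101, 'Gadget');  -- never ordered

INSERT INTO ORD_PRD (ORD_ID, PRD_ID) VALUES (10, 100);
INSERT INTO ORD_PRD (ORD_ID, PRD_ID) VALUES (12, 100);
```

Leaving the foreign key off ORD.CUST_ID is what lets order 12 point at a customer that
does not exist, which is the situation the disabled integrity checks produced.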
Inner Join
An inner join will output only the results from each table where the join
conditions are met. It will not output any rows from either table where the conditions are
not met. In the examples below, the join condition is CUST.CUST_ID = ORD.CUST_ID.
Only rows where the CUST_ID field matches in both tables will be displayed in the
output.
There are two basic ways to implement an inner join. Most of you are probably
already familiar with one of them.
1) You can put your join conditions in the WHERE clause:
SELECT
    *
FROM
    CUST
    , ORD
WHERE
    CUST.CUST_ID = ORD.CUST_ID
;
2) You can use INNER JOIN in the FROM clause (Note: in most relational database
management systems, JOIN is equivalent to INNER JOIN):
SELECT
    *
FROM
    CUST
    INNER JOIN ORD
        ON CUST.CUST_ID = ORD.CUST_ID
;
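To see the inner join behavior on a small scale, the sketch below builds two cut-down
tables and runs the join. Only the table and column names and the fact that Nick has no
orders come from this document. The other rows, the column types, and the use of an
explicit column list instead of SELECT * are assumptions, and the script expects an
empty scratch database.

```sql
-- Cut-down, hypothetical versions of CUST and ORD (only the join columns).
CREATE TABLE CUST (CUST_ID INTEGER PRIMARY KEY, CUST_NM VARCHAR(50));
CREATE TABLE ORD  (ORD_ID  INTEGER PRIMARY KEY, CUST_ID INTEGER);

INSERT INTO CUST (CUST_ID, CUST_NM) VALUES (1, 'Alice');
INSERT INTO CUST (CUST_ID, CUST_NM) VALUES (2, 'Nick');   -- no orders
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (10, 1);
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (11, 1);
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (12, 99);        -- no customer 99

-- Only rows whose CUST_ID appears in both tables survive the join.
SELECT
    CUST.CUST_NM
    , ORD.ORD_ID
FROM
    CUST
    INNER JOIN ORD
        ON CUST.CUST_ID = ORD.CUST_ID
ORDER BY
    ORD.ORD_ID
;
-- Returns two rows: (Alice, 10) and (Alice, 11).
-- Nick and order 12 are both absent because neither has a match.
```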
Left Outer Join
Outer joins are used to display the results from one table where there are no
corresponding results in another table (where the join conditions are not met). If you want
to know which of your customers have not ordered anything, you would need an outer
join because if a customer has not ordered anything, there will be no entry in the ORD
table to match the customer, and an inner join would not output that customer at all.
Much of the confusion around outer joins is caused by the use of LEFT and
RIGHT. Look at the query below:
SELECT * FROM CUST LEFT OUTER JOIN ORD ON CUST.CUST_ID = ORD.CUST_ID;
With the query written on a single line, the first table (CUST) is to the left of the
second table (ORD). A left outer join will output all of the rows in the first (or left) table
and only the rows from the second (or right) table that match the join conditions. This
means that where the join conditions are not met, the output for the second (or right) table
will be filled with nulls (some query tools will actually display <NULL>, and others will
simply display a blank field).
In the result set below, you can see that Nick has not ordered anything. If you go
back and look at the result sets for the inner join examples, you will see that Nick was not
displayed at all because there was no matching entry with his ID in the ORD table.
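A common way to answer the "which customers have not ordered anything" question is to
take the left outer join and keep only the rows where the right-hand table came back
null. The sketch below shows that pattern on cut-down tables. Everything except the
table names, the column names, and Nick having no orders is assumed for illustration,
and the script expects an empty scratch database.

```sql
-- Cut-down, hypothetical versions of CUST and ORD (only the join columns).
CREATE TABLE CUST (CUST_ID INTEGER PRIMARY KEY, CUST_NM VARCHAR(50));
CREATE TABLE ORD  (ORD_ID  INTEGER PRIMARY KEY, CUST_ID INTEGER);

INSERT INTO CUST (CUST_ID, CUST_NM) VALUES (1, 'Alice');
INSERT INTO CUST (CUST_ID, CUST_NM) VALUES (2, 'Nick');   -- no orders
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (10, 1);
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (11, 1);

-- Every customer is kept; ORD columns are null where no order matched.
-- Filtering on the null ORD_ID leaves only the customers without orders.
SELECT
    CUST.CUST_ID
    , CUST.CUST_NM
FROM
    CUST
    LEFT OUTER JOIN ORD
        ON CUST.CUST_ID = ORD.CUST_ID
WHERE
    ORD.ORD_ID IS NULL
;
-- Returns one row: (2, Nick).
```

The filter tests ORD.ORD_ID because it is the primary key of ORD and so can only be null
when the join found no matching order.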
Right Outer Join
Right outer joins are used for the same purpose that left outer joins are used for,
but they are much less common. The reason they are much less common is that in most
cases where one might want to use a right outer join, he or she can simply swap the order
of the tables in the query and use a left outer join instead.
If we use the same query we used for the left outer join example above and
change it to a right outer join without changing the table order, we will see all of the
orders that have customers and all of the orders that do not have customers, where the
same query with a left outer join showed us all of the customers that have orders and all
of the customers that do not have orders.
SELECT * FROM CUST RIGHT OUTER JOIN ORD ON CUST.CUST_ID = ORD.CUST_ID;
Once again, the non-matching results will be displayed as null. Notice in the
output below that the null values are on the left where they were previously on the right
when we used a left outer join. Since we did not specify the order to output the fields (we
used SELECT *), the fields from the left table are displayed on the left and the fields
from the right table are displayed on the right.
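The claim that a right outer join can be replaced by a left outer join with the tables
swapped can be checked with a sketch like the one below. The rows and column types are
assumptions, and an explicit column list is used so that both queries put the fields in
the same order. Note that some systems do not accept RIGHT OUTER JOIN at all (SQLite
before version 3.39 is one example), in which case only the second query will run.

```sql
-- Cut-down, hypothetical versions of CUST and ORD (only the join columns).
CREATE TABLE CUST (CUST_ID INTEGER PRIMARY KEY, CUST_NM VARCHAR(50));
CREATE TABLE ORD  (ORD_ID  INTEGER PRIMARY KEY, CUST_ID INTEGER);

INSERT INTO CUST (CUST_ID, CUST_NM) VALUES (1, 'Alice');
INSERT INTO CUST (CUST_ID, CUST_NM) VALUES (2, 'Nick');   -- no orders
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (10, 1);
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (11, 1);
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (12, 99);        -- no customer 99

-- Right outer join: every order is kept, CUST columns are null where unmatched.
SELECT
    CUST.CUST_NM
    , ORD.ORD_ID
FROM
    CUST
    RIGHT OUTER JOIN ORD
        ON CUST.CUST_ID = ORD.CUST_ID
ORDER BY
    ORD.ORD_ID
;

-- Same result with the tables swapped and a left outer join.
SELECT
    CUST.CUST_NM
    , ORD.ORD_ID
FROM
    ORD
    LEFT OUTER JOIN CUST
        ON CUST.CUST_ID = ORD.CUST_ID
ORDER BY
    ORD.ORD_ID
;
-- Both queries return (Alice, 10), (Alice, 11), (NULL, 12).
```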
Full Outer Join
Since I do not currently have access to a DB2 database that I can create the
sample tables in and MySQL does not support full outer joins, I will describe the concept
and provide simulated output below.
Full outer joins are used where we need all rows from both left and right tables
but some rows in each table do not have corresponding entries in the other table. In our
example, we will be looking at all customers and all orders. Where a customer matches
an order, we want to display the results on one line. Where a customer does not have any
orders, we want to display the customer and some null fields where the order should be.
Where an order does not have a customer assigned to it, we want to display the order and
some null fields where the customer should be.
SELECT * FROM CUST FULL OUTER JOIN ORD ON CUST.CUST_ID = ORD.CUST_ID;
Full outer joins are rare, but in any situation where one is needed, a full outer join
is much less complex than the alternatives.
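One such alternative, for systems like MySQL that lack FULL OUTER JOIN, is to combine a
left outer join with the unmatched rows of a right outer join. The sketch below is one
way to do that under assumed data. It is not taken from the original examples, and it
expects an empty scratch database.

```sql
-- Cut-down, hypothetical versions of CUST and ORD (only the join columns).
CREATE TABLE CUST (CUST_ID INTEGER PRIMARY KEY, CUST_NM VARCHAR(50));
CREATE TABLE ORD  (ORD_ID  INTEGER PRIMARY KEY, CUST_ID INTEGER);

INSERT INTO CUST (CUST_ID, CUST_NM) VALUES (1, 'Alice');
INSERT INTO CUST (CUST_ID, CUST_NM) VALUES (2, 'Nick');   -- no orders
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (10, 1);
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (11, 1);
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (12, 99);        -- no customer 99

-- Part 1: all customers, with their orders where any exist.
SELECT
    CUST.CUST_NM
    , ORD.ORD_ID
FROM
    CUST
    LEFT OUTER JOIN ORD
        ON CUST.CUST_ID = ORD.CUST_ID
UNION ALL
-- Part 2: only the orders that matched no customer, so that
-- matched rows from part 1 are not repeated.
SELECT
    CUST.CUST_NM
    , ORD.ORD_ID
FROM
    CUST
    RIGHT OUTER JOIN ORD
        ON CUST.CUST_ID = ORD.CUST_ID
WHERE
    CUST.CUST_ID IS NULL
;
-- Returns four rows in no guaranteed order:
-- (Alice, 10), (Alice, 11), (Nick, NULL), (NULL, 12).
```

The single FULL OUTER JOIN query shown earlier expresses the same intent in one
statement, which is why it is preferable wherever the database supports it.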
Using Multiple Joins
Looking through the dataset we started with, you might have noticed that not only
do we have customers without orders and orders without customers, but we also have
orders with no products. The customers without orders might be explained by a legacy
data conversion that did not include outdated orders or by an initial contact with a
customer who has not yet decided if he or she wants to order anything at all. The orders
without customers and the orders with no products on them, however, probably indicate
that we have some problems with the software that created the database entries (whether
that software is the user interface or a data conversion utility, the problem remains).
Since solving problems is what we do best in IT (at least if we want to retain our
jobs after someone asks us what we do all day), we might want to document the data that
is missing so the developer can look at the source code and correct it and so we can
retroactively correct the data if possible.
The query below will display every order along with its associated customer (or a
null value if there is no customer) and the products that are on the order (or a
null value if there are no products).
SELECT
    CUST.CUST_NM
    , ORD.ORD_TS
    , ORD.SHIP_TS
    , ORD.ORD_ID
    , PRD.PRD_DESC
FROM
    ORD
    LEFT OUTER JOIN CUST
        ON CUST.CUST_ID = ORD.CUST_ID
    LEFT OUTER JOIN ORD_PRD
        ON ORD.ORD_ID = ORD_PRD.ORD_ID
    LEFT OUTER JOIN PRD
        ON ORD_PRD.PRD_ID = PRD.PRD_ID
;
The same query, modified to only display entries where customers or products are null, is
actually far more useful because you do not have to sort through all of the valid data to
locate the invalid data:
SELECT
    CUST.CUST_NM
    , ORD.ORD_TS
    , ORD.SHIP_TS
    , ORD.ORD_ID
    , PRD.PRD_DESC
FROM
    ORD
    LEFT OUTER JOIN CUST
        ON CUST.CUST_ID = ORD.CUST_ID
    LEFT OUTER JOIN ORD_PRD
        ON ORD.ORD_ID = ORD_PRD.ORD_ID
    LEFT OUTER JOIN PRD
        ON ORD_PRD.PRD_ID = PRD.PRD_ID
WHERE
    PRD.PRD_ID IS NULL
    OR CUST.CUST_ID IS NULL
;
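To try the filtered query end to end, a small self-contained sketch follows. The four
table names and their columns come from the queries above. The rows and column types are
assumptions, and the timestamp columns are left out to keep the script short. It expects
an empty scratch database.

```sql
-- Hypothetical cut-down tables; ORD_TS and SHIP_TS are omitted for brevity.
CREATE TABLE CUST    (CUST_ID INTEGER PRIMARY KEY, CUST_NM VARCHAR(50));
CREATE TABLE ORD     (ORD_ID  INTEGER PRIMARY KEY, CUST_ID INTEGER);
CREATE TABLE PRD     (PRD_ID  INTEGER PRIMARY KEY, PRD_DESC VARCHAR(100));
CREATE TABLE ORD_PRD (ORD_ID  INTEGER, PRD_ID INTEGER, PRIMARY KEY (ORD_ID, PRD_ID));

INSERT INTO CUST (CUST_ID, CUST_NM) VALUES (1, 'Alice');
INSERT INTO PRD (PRD_ID, PRD_DESC) VALUES (100, 'Widget');
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (10, 1);    -- valid order
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (11, 1);    -- order with no products
INSERT INTO ORD (ORD_ID, CUST_ID) VALUES (12, 99);   -- order with no customer
INSERT INTO ORD_PRD (ORD_ID, PRD_ID) VALUES (10, 100);
INSERT INTO ORD_PRD (ORD_ID, PRD_ID) VALUES (12, 100);

-- Start from ORD so every order is kept, then flag the incomplete ones.
SELECT
    CUST.CUST_NM
    , ORD.ORD_ID
    , PRD.PRD_DESC
FROM
    ORD
    LEFT OUTER JOIN CUST
        ON CUST.CUST_ID = ORD.CUST_ID
    LEFT OUTER JOIN ORD_PRD
        ON ORD.ORD_ID = ORD_PRD.ORD_ID
    LEFT OUTER JOIN PRD
        ON ORD_PRD.PRD_ID = PRD.PRD_ID
WHERE
    PRD.PRD_ID IS NULL
    OR CUST.CUST_ID IS NULL
ORDER BY
    ORD.ORD_ID
;
-- Returns two rows: (Alice, 11, NULL) and (NULL, 12, Widget).
-- Order 10 is complete, so the WHERE clause filters it out.
```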
Exercises
1. Modify the query with multiple joins to also include customers that do not have
orders.
2. Write a query to return all of the products that have never been ordered.
3. Write a query to return only orders that have neither customers nor products
associated with them.
4. Start applying this knowledge at work.
