Vous êtes sur la page 1sur 12

SQL: Joins

A join is used to combine rows from multiple tables. A join is performed whenever two or more tables is
listed in the FROM clause of an SQL statement.

There are different kinds of joins. Let's take a look at a few examples.

Inner Join (simple join)

Chances are, you've already written an SQL statement that uses an inner join. It is is the most common type
of join. Inner joins return all rows from multiple tables where the join condition is met.

For example,

SELECT suppliers.supplier_id, suppliers.supplier_name, orders.order_date


FROM suppliers, orders
WHERE suppliers.supplier_id = orders.supplier_id;

This SQL statement would return all rows from the suppliers and orders tables where there is a matching
supplier_id value in both the suppliers and orders tables.

Let's look at some data to explain how inner joins work:

We have a table called suppliers with two fields (supplier_id and supplier_ name).
It contains the following data:

supplier_id supplier_name
10000 IBM
10001 Hewlett Packard
10002 Microsoft
10003 Nvidia

We have another table called orders with three fields (order_id, supplier_id, and order_date).
It contains the following data:

order_id supplier_id order_date


500125 10000 2003/05/12
500126 10001 2003/05/13

If we ran the SQL statement below:

SELECT suppliers.supplier_id, suppliers.supplier_name, orders.order_date


FROM suppliers, orders
WHERE suppliers.supplier_id = orders.supplier_id;

Our result set would look like this:

supplier_id name order_date


10000 IBM 2003/05/12
10001 Hewlett Packard 2003/05/13

The rows for Microsoft and Nvidia from the supplier table would be omitted, since the
supplier_id's 10002 and 10003 do not exist in both tables.

Mahendra Sajjala Page 1 of 12 Joins


Outer Join

Another type of join is called an outer join. This type of join returns all rows from one table and only those
rows from a secondary table where the joined fields are equal (join condition is met).

For example,

select suppliers.supplier_id, suppliers.supplier_name, orders.order_date


from suppliers, orders
where suppliers.supplier_id = orders.supplier_id(+);

This SQL statement would return all rows from the suppliers table and only those rows from the orders table
where the joined fields are equal.

The (+) after the orders.supplier_id field indicates that, if a supplier_id value in the suppliers table does not
exist in the orders table, all fields in the orders table will display as <null> in the result set.

The above SQL statement could also be written as follows:

select suppliers.supplier_id, suppliers.supplier_name, orders.order_date


from suppliers, orders
where orders.supplier_id(+) = suppliers.supplier_id

Let's look at some data to explain how outer joins work:

We have a table called suppliers with two fields (supplier_id and name).
It contains the following data:

supplier_id supplier_name
10000 IBM
10001 Hewlett Packard
10002 Microsoft
10003 Nvidia

We have a second table called orders with three fields (order_id, supplier_id, and order_date).
It contains the following data:

order_id supplier_id order_date


500125 10000 2003/05/12
500126 10001 2003/05/13

If we ran the SQL statement below:

select suppliers.supplier_id, suppliers.supplier_name, orders.order_date


from suppliers, orders
where suppliers.supplier_id = orders.supplier_id(+);

Our result set would look like this:

supplier_id supplier_name order_date


10000 IBM 2003/05/12
10001 Hewlett Packard 2003/05/13
10002 Microsoft <null>
10003 Nvidia <null>

Mahendra Sajjala Page 2 of 12 Joins


The rows for Microsoft and Nvidia would be included because an outer join was used.
However, you will notice that the order_date field for those records contains a <null>
value.

What's the difference between an inner join and an outer join?


When you wish to get data from two related tables, you can use an
inner join or an outer join to define how the data is related.

For my examples, let's assume where is a table called


"student" to hold student information, a table called
"attendance" to hold student's attendance, and a table
called StudentSchedule that holds a student's current schedule:

Student has the following fields:


ID
FirstName
LastName
BirthDate

Attendance has the following fields:


ID
AttendanceDate
AttendanceCode
MinutesAbsent

StudentSchedule as the following fields:


ID
Course

An inner join gets data from both tables where the specified data
exists in both tables. For example, if you wanted a list of
students in your database that were absent on December 4, 2003, you
would use an inner join between the two examples tables "Student"
and "Attendance":

SELECT Student.ID, Student.FirstName, Student.LastName,


Attendance.AttendanceCode,
Attendance.MinutesAbsent FROM Student INNER JOIN Attendance
ON Student.ID=Attendance.ID
WHERE Attendance.AttendanceDate='12/4/2003'

The above statement will only return students with attendance


information on the specified date. Students who do not have
attendance would not display.

Like so:
ID FirstName LastName AttendanceCode MinutesAbsent
10 Steve Bartman Tardy 22
32 Dale Thropmorton ExcAbsent 200

(maybe there are 200 kids in the database, but only Steve and
Dale where absent on 12/4/2003. They are the only students to
display)

An outer join gets data from the source table at all times, and
returns data from the outer joined table ONLY if it matches the
criteria. You would use this type of join using my examples
tables if you wanted a list of all students in a specified
course, and you wanted attendance information if it existed. You
would use an inner join between Student and StudentSchedule to
Mahendra Sajjala Page 3 of 12 Joins
only get the students in the speicified course (for example
'ENGLISH 9'), but you would use an outer join against Attendance
because you want ALL students in the course, not just students
with attendance information on 12/4/2003.

When using outer joins, fields will be set to NULL if data does
not exist in the outer-joined table.

SELECT Student.ID, Student.FirstName, Student.LastName,


Attendance.AttendanceCode,
Attendance.MinutesAbsent
FROM Student
INNER JOIN StudentSchedule ON
StudentSchedule.ID=Student.ID
LEFT OUTER JOIN Attendance ON
Student.ID=Attendance.ID AND
Attendance.AttendanceDate='12/4/2003'
WHERE StudentSchedule.Course='ENGLISH 9'

ID FirstName LastName AttendanceCode MinutesAbsent


10 Steve Bartman Tardy 22
32 Dale Thropmorton ExcAbsent 200
44 Jennifer Lopez NULL
NULL
Description
A join is an operation that combines two tables to produce a third, subject to a
restrictive condition. Every row of the new table must satisfy the restrictive
condition. Joins provide the means of linking data in one table with data in another
table and are frequently used in defining reports and queries.
A join is used in a SELECT statement as part of the FROM clause.
Caché supports many different syntactical forms of JOIN. However, these many
formulations refer to the following four types of joins.
Type of Join Description

Same as using a comma to separate two table names in a FROM


CROSS JOIN
clause.

Same as JOIN. Symbolic representation: "=" (in a WHERE


INNER JOIN
clause).

LEFT OUTER Same as LEFT JOIN. Symbolic representation: "=*" (in a WHERE
JOIN clause).

RIGHT OUTER Same as RIGHT JOIN. Symbolic representation: "*=" (in a


JOIN WHERE clause).

The LEFT OUTER JOIN and the RIGHT OUTER JOIN are in most respects functionally
identical (with reversed syntax), and thus are frequently referred to collectively as
one-way outer joins.
The INNER JOIN, LEFT OUTER JOIN, and RIGHT OUTER JOIN can take one (and only
one) of the following qualifiers: a NATURAL keyword prefix, an ON clause, or a
USING clause. Thus a NATURAL JOIN is any of these three types of join prefixed with
the NATURAL keyword.

Mahendra Sajjala Page 4 of 12 Joins


The following joins are only supported as the first join within a join expression: a
RIGHT OUTER JOIN, a NATURAL JOIN (any type), or a JOIN with a USING clause
(any type).
Note that a JOIN can be recursive, such as involving the join argument in the last
line of syntax, which can include any form of JOIN at that point.
JOIN Definitions
A CROSS JOIN is a join that crosses every row of the first table with every row of the
second table. This results in a Cartesian product, a large, logically comprehensive
table with much data duplication.
An INNER JOIN is a join that links rows of the first table with rows of a second table,
excluding any row in the first table that finds no corresponding row in the second
table. This results in a small table in which every field contains unique data.
A One-Way Outer Join (LEFT OUTER JOIN or RIGHT OUTER JOIN) is a join that links
rows of the first (source) table with rows of a second table, including all rows from
the first table even if there is no match in the second table. This results in a table in
which some fields of the first (source) table may be paired with NULL data.
When specifying a one-way outer join, the order in which you name the tables in the
FROM clause is very important. For a LEFT OUTER JOIN, the first table you specify is
the source table for the join. For a RIGHT OUTER JOIN the second table you specify
is the source table for the join. These two types of one-way outer joins have different
restrictions.
Appending the word NATURAL to an inner or outer join specifies that you are joining
on all the columns of the tables that have the same name.
JOIN Restrictions
For an INNER JOIN, there is no restriction on the conditional expression — anything
works.
For LEFT or RIGHT joins, the conditional expression in the ON clause must consist of
ANDs or comparisons meeting one of the following criteria (given "A LEFT JOIN B ON
conditional-expression" or "A RIGHT JOIN B ON conditional-expression"):

• A column from A equals a column from B (or vice versa).


• For a LEFT JOIN, a field from B is a constant expression.
• For a RIGHT JOIN, a field from A is a constant expression.
• For a RIGHT OUTER JOIN, the left operand cannot be a join expression, a
view, or a (query).
• For a LEFT OUTER JOIN, the right operand cannot be a join expression. It can,
however, be a view or a (query).

For further details, see OUTER JOIN Limitations in Using Caché SQL.
For a NATURAL join, or a join with a USING clause, only simple base table references
are supported for either operand.
One-Way Outer Joins
Caché supports one-way outer joins: LEFT OUTER JOIN and RIGHT OUTER JOIN.
Caché does not support two-way outer joins.
With standard "inner" joins, when rows of one table are linked with rows of a second
table, a row in the first table that finds no corresponding row in the second table is
excluded from the output table.

Mahendra Sajjala Page 5 of 12 Joins


With one-way outer joins, all rows from the first table are included in the output
table even if there is no match in the second table. With one-way outer joins, the
first table pulls relevant information out of the second table but never sacrifices its
own rows for lack of a match in the second table.
For example, if a query lists Table1 first and creates a left outer join, then it should
be able to see all the rows in Table1 even if they don't have corresponding records in
Table2.
When specifying a one-way outer join, the order in which you name the tables in the
FROM clause is very important. For a left outer join, the first table you specify is the
source table for the join. For a right outer join the second table you specify is the
source table for the join.
Outer Join Syntax
Caché supports three formats for representing outer joins:

1. The ANSI standard syntax: LEFT OUTER JOIN and RIGHT OUTER JOIN. SQL
Standard syntax puts the outer join in the FROM clause of the SELECT
statement, rather than the WHERE clause, as shown in the following example:

FROM tbl1 LEFT OUTER JOIN tbl2 ON (tbl1.key = tbl2.key)

2. The ODBC Specification outer join extension syntax, using the escape-syntax
{oj ... }
3. Symbolic outer join extension syntax, using a condition such as A=*B in the
WHERE clause. A left outer join is specified using the symbol =* in place of =
in the WHERE clause. (Note that this is the reverse of the syntax used by
Microsoft SQL Server.) A right outer join is specified using the symbol *= in
place of = in the WHERE clause.

These three formats are interchangeable and can also be mixed. However, we
recommend the use of ANSI standard syntax, as it is the only one compatible with
ODBC (and portable to the latest Microsoft products). Additionally, it can specify
many operations not specifiable in principle with symbolic syntax.
Examples of Outer Joins
As an illustration of the difference between an inner join and a one-way outer join,
consider the query:
SELECT Patient.PName,Doctor.DName
FROM Patient,Doctor
WHERE Patient.Doctor = Doctor.Doctor
The FROM clause specifies a cross join, and the WHERE clause restricts the results to
an inner join. Thus, this query returns one row for each row of the "Patient" table
that has a non-null "Doctor" field. By contrast, examine the following left outer join
query:
SELECT Patient.PName,Doctor.DName
FROM Patient,Doctor
WHERE Patient.Doctor =* Doctor.Doctor
Though it differs only in the =* operator, this query returns all the rows of the
previous query plus the rows of the "Patient" table that have null value for the

Mahendra Sajjala Page 6 of 12 Joins


"Patient.Doctor" field. For such rows, "Doctor.Name" is returned as null. This is
known as "null padding." Caché automatically performs null padding.
The left outer join condition is expressed by the following syntax:
A.x LEFT OUTER JOIN B.y
This specifies that every row in A is returned. For each A row returned, if there is a B
row such that A.x=B.y, all of the corresponding B values are also returned.
If there is no B row such that A.x=B.y, null padding causes all B values for that A row
to return as null.
Order of Operations
One-way outer join conditions, including the necessary null padding, are applied
before other conditions. Therefore, a condition in the WHERE clause that cannot be
satisfied by a null-padded value (for example, a range or equality condition on a field
in B) effectively converts the one-way outer join of A and B into a regular join (an
inner join).
For example, if you add the clause "AND Doctor.Age < 45" to the two "Patient" table
queries above, it makes them equivalent. However, if you add the clause "AND
(Doctor.Age < 45 OR Doctor.Age IS NULL)", it preserves the difference between the
two queries.
Mixing Outer and Inner Joins
In general, Caché supports all syntax of mixed inner and outer join, in any order, as
long as no parenthesis is used. There are a few restrictions on the conditions within
the ON clause of an outer join. For example, for a left outer join these conditions
must be of the form A=B or of the form X=B, where A is a field in the left table, B is
a field in the right table, and X is a constant or a host variable.
The following examples display the results of the JOIN commands performed on
Table1 and Table2.
     Table1                Table2
Column1  Column2      Column1  Column3
  aaa      bbb          ggg      hhh
  ccc      ccc          xxx      zzz
  xxx      yyy
  hhh      zzz
CROSS JOIN Example
The statement:
SELECT * FROM Table1 CROSS JOIN Table2
yields the table:
Column1  Column2  Column1  Column3
  aaa      bbb      ggg      hhh
  aaa      bbb      xxx      zzz
  ccc      ccc      ggg      hhh
  ccc      ccc      xxx      zzz
  xxx      yyy      ggg      hhh
  xxx      yyy      xxx      zzz
  hhh      zzz      ggg      hhh
  hhh      zzz      xxx      zzz
NATURAL JOIN Example
The statement:
SELECT * FROM Table1 NATURAL JOIN Table2
Mahendra Sajjala Page 7 of 12 Joins
yields the table
Column1  Column2  Column1  Column3
  xxx      yyy      xxx      zzz
Note that the Caché implementation of NATURAL JOIN does not merge columns with
the same name.
JOIN with an ON Clause Example
The statement:
SELECT * FROM Table1 INNER JOIN Table2
ON Table1.Column1=Table2.Column3
yields the table:
Column1  Column2  Column1  Column3
  hhh      zzz      ggg      hhh
JOIN with a USING Clause Example
The statement:
SELECT * FROM Table1 INNER JOIN Table2
USING (Column1)
yields the table:
Column1  Column2  Column1  Column3
  xxx      yyy      xxx      zzz 
LEFT OUTER JOIN Example
The statement:
SELECT * FROM Table1 LEFT OUTER JOIN Table2
ON Table1.Column1=Table2.Column3
yields the table:
Column1  Column2  Column1  Column3 
  aaa      bbb      null     null
  ccc      ccc      null     null
  xxx      yyy      null     null
  hhh      zzz      ggg      hhh
RIGHT OUTER JOIN Example
The statement:
SELECT * FROM Table1 RIGHT OUTER JOIN Table2
ON Table1.Column1=Table2.Column3
yields the table:
Column1  Column2  Column1  Column3
  hhh      zzz      ggg      hhh
  null     null     xxx      zzz

So what are the other types of Joins?


As I told you initially, there are many other types of Joins. Inner Joins and Outer
Joins are 2 of them.

What are Inner Joins?


Inner Joins are the most commonly used type of Joins are are almost the same
thing that you just learnt above. Check the following query

SELECT * FROM (authors INNER JOIN books ON authors.author_id =


books.author_id)

Mahendra Sajjala Page 8 of 12 Joins


This would return 4 rows of data having all the information about the 4 books.
This gives exactly the same result as the following query

SELECT * FROM authors,books WHERE authors.author_id = books.author_id

Basically Inner Joins combines all the records in the first table with all the
records in the second table and then selects those rows depending on the
criteria that is present after the the ON keyword in the query.

The most important thing to remember in Inner Joins is that only those records
from both the tables are combined where there is a corresponding value in both
the tables. You will understand this point clearly when you read about Outer
Joins.

What are Outer Joins?


Outer Joins are almost similar to Inner Joins, but in Outer Joins all the records in
one table are combined with records in the other table even if there is no
corresponding common value. To get this point assume that our authors table is
now the one shown below instead of the original one. The books table remains
the same.

author_id lastname firstname company


1 Hunter Jason O' Reilly
2 Kanetkar Yashwant BPB Publications
3 Pai Kiran Pai Ltd. Note that there is a
new row added.

There are 2 types of Outer Joins - Left Join and Right Join. Consider an
example of Left Join

SELECT * FROM (authors LEFT JOIN books ON authors.author_id =


books.author_id)

This query would now (considering the new authors table) return 5 rows of data.
The last row would correspond to the record of the author named Kiran Pai but
there would be NULL values for the fields such as title,pages,book_id... since
there are no books written by Kiran Pai in our books table.

This is exactly what Left Joins do - they combine all the records in the first table
with those in the second table irrespective of whether there are corresponding
values in 2 tables. By corresponding I mean like in our case since there were no
books by Kiran Pai in our books table there are no corresponding values for Kiran
Pai in our books table. But inspite of there being no corresponding values all the
values were joined and thus you got the NULL values in your final query result.
This point is very important to understand Joins.. so please do read it again in
case you have not understood it clearly.

Could you explain the concept of Inner Joins once again with respect to
this Left Joins?
Yeah.. now you will understand better. As I mentioned in Left Joins all the
records in the first table are combined with data from the second table. And if
there is no corresponding data in the second table then a NULL value is inserted
into the results. Whereas in Inner Joins, records from the 2 tables are combined
only if there are corresponding values in both the tables. For example consider
Mahendra Sajjala Page 9 of 12 Joins
the following Inner Join on the books table and our new authors table -

SELECT * FROM (authors INNER JOIN books ON authors.author_id =


books.author_id)

This query would return only 4 rows of data and not 5 inspite of the new Kiran
Pai entry being present. This is because it could find no corresponding entry in
the books table for the author named Kiran Pai (there were no books written by
Kiran Pai)

There are are 2 types of Outer Joins ..right??


Yeah.. there are 2 types. One is Left Join and the other obviously is Right Join.
Consider the same example as shown previously for Left Join, but this time use a
Right Join instead of a Left Join

SELECT * FROM (authors RIGHT JOIN books ON authors.author_id =


books.author_id)

This time you would NOT GET 5 rows of data, instead you would get only 4 rows
of data. Why so??

Since while joining the data from the books table with the authors table, the
books table data is considered first.. since it is a Right Join. Thus for every row
in the books table a check for corresponding value in the authoirs table would be
made. Thus while adding all the rows of the books table to the joined table,
since there were no books written by Kiran Pai, so that particular record from the
authors table was not added at all to the joined table. Remember that the joined
table is the kind of temporary table that is created while using Joins. That table
is refined depending on the criteria present in the query.

Inner Joins and Outer Joins aren't really clear in my mind right now.
Why don't you explain it once again in short ?
Actually, all that is required is a re-reading of the previous 4-5 questions. In case
you still haven't got it, there is only way out. Create actual tables using some
software and then run some sample queries on those tables. Remember not to
put corresponding entries for all your data in both of your tables.

By that I mean in case you are making 2 tables - one for artists and one for
songs. See to it that in case you have 10 artists names in the artists table, you
do not have songs sung by all of those 10 artists in your songs table. Add songs
by only 6-7 artists. When you have such a kind of database where there isn't
corresponding information for all the records in one table in the other table,
Inner Joins and Outer Joins show their real working. In case you have songs
sung by all 10 artists in your Songs table, most of your query results would be
the same irrespective of what kind of Join you use. Only the order of the rows in
the resultset might be altered. So you would be confused..you wouldn't
understand the difference between the working of different Joins.

Are there other ways in which I can use Joins?


Yeah you can combine 2 Joins to create more complex Joins. I shall not go into
the details of those kind of Joins, but I shall outline the structure of the query
which you have to make. This kind of thing is required when you are combining

Mahendra Sajjala Page 10 of 12 Joins


data from say 3 tables. Your query would look something like the following

SELECT ... , ... FROM ( .... INNER JOIN .... ON .... = .... ) INNER JOIN .... ON ....
= .... WHERE .... = .....

This is not really for beginners, but if you can get it.. you are not a beginner..
you are already on your way to be an expert.

So basically there are 3 Joins right?? Cross Join, Inner Join and Outer
Join?
There is also a Self Join, which I shall not be explaining in this article. Just
remember that it exists. It is a fairly complex topic which you may not need right
away.

Give me more examples of Joins with new tables


I could do that.. but that would basically amount to waste of time. On the
contrary I shall explain how to make effective use of Joins along with another
SQL feature called Sub Query.

What's Sub Query?


Use the same tables that I have used before. In case you want to find the titles
of the books written by the Jason. What would you do?? Give it a thought and
try to figure some way to get it done.. then read on.

Yes I can do it.. I shall first make a query and find his author_id from
the authors table and then I shall make another query and get a list of
books having that id associated with it from the books table.
Ok.. good. but aren't you using 2 queries? Can you get it in one query?

That where Sub Queries come in use. You could find the author_id for that
particular author and then also find the books written by him in the same query.

Mahendra Sajjala Page 11 of 12 Joins


This sounds exciting.. How do I do it?
Ok here is what I am trying to do now. In English I could state it as follows - "I want
a list of the books that have been written by author whose firstname is Jason" . In
SQL I could possibly use the following

SELECT title FROM authors,books WHERE ( books.author_id = (SELECT author_id


from authors where firstname = 'Jason') AND books.author_id = authors.author_id)

Seems complex.. but its actually simple.. (This is after all SQL In Simple English :)
The first part is simple.. you select the title of the books whenever the criteria is
met. Lets analyze the criteria..

The inner most part SELECT author_id from authors where firstname = 'Jason'
evaluates to the 1 (its just a simple query which you learnt right at the beginning).
Thus the entire query now becomes

SELECT title FROM authors,books WHERE (books.author_id = 1 AND books.author_id


= authors.author_id)

Here we are using the 2 original tables the authors table with 2 rows and the books
table with 4 rows. Now when the joined table is created it would be consisting of 8
rows (2x4 as explained earlier). From these 8 rows only 4 would be containing
correct information and the rest of them would be just present as a result of the
various combinations that were possible. Now the books.author_id =
authors.author_id part of the WHERE clause separates these 4 correct rows and
discards the remaining rows. And finally the other part of the WHERE clause
books.author_id = 1 would selects the final single row from the 4 correct rows.

Thus you would get the names of the books written by the author named Jason
Hunter.

Mahendra Sajjala Page 12 of 12 Joins

Vous aimerez peut-être aussi