Joining tables yields meaningful information
Databases are not just for storing data, but for retrieving information. Unless you can retrieve relevant information from your database, a powerful server running the latest software is about as much use as a shoebox full of old receipts. You probably normalize your data thoroughly when building your application (for information about normalization, see Michelle Poolet, "Why You Need Data Normalization," Premiere issue), so that you have many small tables, each referring to some entity such as a student or a class. Now, when you need to retrieve information, you usually have to join multiple tables together, so that you can, for example, generate a list of which students are in each class.
Join Syntax
SQL Server supports two different ways to write a query with a join. One is the Transact-SQL (T-SQL) syntax, which SQL Server has supported since version 1. This method is familiar to users of programs such as Microsoft Access. But SQL Server also supports the ANSI-92 SQL syntax for writing joins. In both SQL Server 6.x and 7.0, you may use either syntax, but the results will differ. In a few cases involving NULLs, only the ANSI syntax gives the correct result; the T-SQL syntax query runs, but the output is not correct. Several other facts support a preference for the ANSI syntax, not the least of which is that it is now the preferred syntax for SQL Server 7.0. I will mention some other reasons as I explain the various types of join. As usual, I will use the PUBS database for this discussion. The examples shown here will work on SQL Server 6.5 and 7.0, so use whichever version you have available for testing.
Inner Joins
The simplest form of a join is the INNER JOIN. In English, you are saying, "Give me all the data that matches in both tables." You have to specify which column, or columns, in one table match the corresponding columns in the second table. Suppose you want a list of publishers and the titles they publish, as Screen 1 shows. In the old T-SQL style, you write this join as:
SELECT pub_name, title
FROM publishers, titles
WHERE publishers.pub_id = titles.pub_id
The SELECT clause says which columns to output, the FROM clause lists the tables, and the WHERE clause tells SQL Server how to join the tables. In this case, the publisher ID, which the pub_id column represents, is the common link between the publishers table and the titles table. Now here's the same query in the ANSI syntax:
SELECT pub_name, title
FROM publishers INNER JOIN titles
ON publishers.pub_id = titles.pub_id
Notice the differences. First, you type the word JOIN in the query, between the names of the tables you're joining. The word INNER is optional, because this is the default join type. Second, you do not need a WHERE clause for the join, so you can reserve the WHERE clause to restrict the rows you want displayed. Instead, you use the ON keyword to specify which columns to base the join on. In either syntax, the order of the tables is not critical. So
SELECT pub_name, title
FROM publishers INNER JOIN titles
ON publishers.pub_id = titles.pub_id
works the same way as
SELECT pub_name, title
FROM titles INNER JOIN publishers
ON titles.pub_id = publishers.pub_id
which reverses the order of the tables in the FROM clause.
My preference is to list the columns in the order in which I want them to appear in the output data, then list the tables in the FROM clause in the order in which I need them to retrieve the specified columns.
Equijoins and Natural Joins
An equijoin is an INNER JOIN in which you ask for the join columns from both tables. You may think that this join will return redundant data, and of course, you are correct. The whole premise of this join is that the data in the join column is the same in both tables. An equijoin would look like this:
SELECT pub_name, publishers.pub_id, titles.pub_id, title
FROM publishers INNER JOIN titles
ON publishers.pub_id = titles.pub_id
Screen 2 shows the results. Although this topic may seem of only theoretical interest, it does bring up an important point. The INNER JOIN examples did not request the pub_id column to be included in the output data. This example does. And if you request a column that exists in both tables, you must specify which table's column you want. Therefore, the query shows publishers.pub_id and titles.pub_id, to distinguish the two pub_id columns.
A more reasonable request would be to have the publisher ID in the output, but you need to see it only once. This approach is a natural join and it looks like this:
SELECT pub_name, publishers.pub_id, title
FROM titles INNER JOIN publishers
ON titles.pub_id = publishers.pub_id
Prev. page  
[1]
2
3
next page