• subscribe
December 23, 2003 12:00 AM

Take Control of Joins

Have a say in the logical order of join processing
SQL Server Pro
InstantDoc ID #40621
Downloads
40621.zip

When you write a multitable join that uses inner joins exclusively (or for that matter, only cross joins or only full outer joins), the table order you specify in the FROM clause doesn't affect the query result. However, when you're using left or right outer joins, if you revise the table order in the FROM clause, you change the logical meaning of the query. When you're writing a multitable query in which you need to join different tables by using different join types (e.g., some using inner joins, others using left or right outer joins), finding a logically correct solution can be tricky. Most T-SQL programmers are familiar with the conventional manner of writing joins—for each pair of tables you're joining, you write the left table name, the join type, the right table name, and the JOIN condition, in that order.

However, most T-SQL programmers don't know about a couple of undocumented techniques you can use for writing multitable joins. The main advantage of these undocumented techniques is that both give you more power to control the logical order of join processing than the conventional techniques do. With more options to control the logical order of join processing, you can write queries that closely resemble the way you think. Although I discuss both techniques for the sake of completeness, I recommend that you use only the first technique because it's much more readable and easier to maintain than the second.

The Orders Case Study
To demonstrate the undocumented join syntaxes, I use a common order-application scenario that uses four tables: Customers, Orders, OrderDetails, and Products. First, run the code in Listing 1 to create and populate the tables with sample data.

Each customer in the Customers table might have zero or more orders in the Orders table. In the sample data, Active customer 1001 placed two orders, while Inactive customer 1002 placed no orders. Each order in the Orders table can have one or more lines in the OrderDetails table, each containing a different product. Each product in the Products table might appear in zero or more order lines in the OrderDetails table. In the sample data, a customer ordered Used product 301 and Used product 302 at least once, but no customers ordered Unused product 303.

If you want to return order-detail information that includes the customer's company name, the order ID, the product name, and the quantity without returning either customers who placed no orders or products that no one ordered, you need to use inner joins exclusively, as the code in Listing 2 shows. This code generates the result that Figure 1 shows.

Adding a slight twist to the previous requirement forces you to mix inner and outer joins. Say you now also need to return customers who placed no orders and products that no one ordered. Figure 2 shows the desired result. Handling this task using conventional join techniques is fairly simple, but I find this scenario useful to demonstrate the undocumented syntaxes. In practice, you might need to handle much more complex problems that might be simpler to solve by using the undocumented join methods. Once you learn the syntaxes of the techniques I present, you can implement them in any scenario.

Using Conventional Join Techniques
The query that Listing 3 shows is one way to solve the problem by using conventional join techniques. The first step is to use a left outer join between Customers and Orders based on matching values in the custid columns of both tables. The result of the first join is a hidden virtual table (call it VT1) that contains matching rows from both tables and shows NULLs in the columns from the Orders table for customers who made no orders. Next, the code joins VT1 to the OrderDetails table. Typically, you'd use an inner join between Orders and OrderDetails. However, using an inner join between VT1 and OrderDetails eliminates customers who placed no orders because you already returned those with NULLs in the orderid column (the join column).

To keep in the result customers who made no orders, you must use a left outer join to the OrderDetails table or join Orders to OrderDetails, then use a right outer join to Customers. This process returns another hidden virtual table (call it VT2). Next, the code performs a full outer join to return matching rows from VT2 and Products, customers who made no orders, and products that no one ordered.

Usually, the best solution represents the way you think so that you and others can more easily understand the code when you review it. But Listing 3's solution might not represent your way of thought. For example, because Orders and OrderDetails are closely related, I prefer a simple solution that performs a left outer join between Customers and the result of an inner join between Orders and OrderDetails, then performs a full outer join to the Products table. Using conventional join techniques, I'd revise the query as Listing 4 shows to match my way of thinking.



ARTICLE TOOLS

Comments
  • Maurice Pelchat
    8 years ago
    Jan 23, 2004

    I was surprised that in your article that you didn't mention the documented technique about optimizer hints:

    Select ... From tableA inner LOOP join tableB on...

    The main advantage of this technique is that it is simple to implement (i use it indirectly to force use of indexes without using the hint with index()), and it also forces the order.

    The main disavantage of this technique is that it force also the type of join. But if the nested loop join is suitable for the job, the result is quite understandable.

    I used it many times to get the proper join in the proper order, and more recently today.

    By the way SQL 7.0 tends to go away too quickly from nested loop joins in favor of other join operators that give less performance (in the context of the query).

    I experienced that behavior many times in my pratice of query optimizations.

You must log on before posting a comment.

Are you a new visitor? Register Here