• subscribe
March 28, 2008 12:00 AM

Set Operations

SQL Server 2005 Tips
SQL Server Pro
InstantDoc ID #98159
Downloads
98159.zip

Treatment of NULLs
One of the key benefits of using set operations over alternative methods is that a set operation treats NULLs as equal when comparing rows. Take the following query as an example:

 SELECT Country, Region, City FROM
  dbo.Customers
  INTERSECT
  SELECT Country, Region, City FROM
  dbo.Employees;

This query returns distinct locations where both employees and customers exist. Web Table 1 shows the outputof the query. Notice that the location UK, NULL, London appears in the output because both customers and employees exist in this location. The set operation considers the NULL region in the customer location (UK, NULL, London) as being equal to the NULL region in the employee location (UK, NULL, London).

Other language elements such as ON and WHERE filters have a different treatment when comparing two NULLs. For example, an attempt to write a solution that uses an INNER JOIN statement, as Web Listing 6 shows, would return the output shown in Web Table 2. Notice that the location UK, NULL, London doesn’t appear in the output even though it does appear in both Customers and Employees. This location was eliminated by the INNER JOIN statement. When the JOIN predicate in the ON clause compared the NULL region in both sides, the comparison yielded UNKNOWN. An ON filter (as well as WHERE and HAVING) eliminates rows for which the predicate yields UNKNOWN. To get the exact logical equivalent of the INTERSECT query, you’d need to explicitly handle NULLs as in Web Listing 7. At this point, the solution is so convoluted that you can really appreciate the brevity and simplicity of the INTERSECT set operation.

In a similar manner, the EXCEPT operation will also treat two NULLs as equal. For example, the following query returns the output that Web Table 3 shows, with the distinct locations where employees exist but not customers:

 SELECT Country, Region, City FROM
  dbo.Employees
  EXCEPT
  SELECT Country, Region, City FROM
  dbo.Customers;

Notice that the location UK, NULL, London doesn’t appear in the output because it does appear in both Employees and Customers. Now run the alternative solution in Web Listing 8, using the NOT EXISTS predicate. You’ll get the output that Web Table 4 shows.

This time the location UK, NULL, London was returned even though it appears in both Employees and Customers. When the inner query’s filter compared the NULL region in the customer row and the NULL region in the employee row, the predicate evaluated to UNKNOWN, and the row was filtered out as if London, NULL, UK doesn’t exist in the Customers table. Hence, the location was returned from the Employees table. To get the true logical equivalent of the EXCEPT query, you’d need to handle NULLs explicitly, as in Web Listing 9. But again, this alternative solution is quite convoluted, emphasizing the advantage of using the shorter, simpler, and more elegant EXCEPT version.



ARTICLE TOOLS

Comments
  • Swati
    4 years ago
    Apr 28, 2008

    excellent. useful.

  • Rho Trading
    4 years ago
    Apr 07, 2008

    Exceptionally useful, even beyond Itzik's usual excellence!

  • meganbearly
    4 years ago
    Apr 07, 2008

    Hi gitega,
    Thanks so much for your feedback. Below is Itzik Ben-Gan's response to your questions.

    “When the comparison columns are all that you need to return from a query, then yes, you can consider EXCEPT as an alternative to outer joins (or the NOT EXISTS predicate).
    Just remember what I explained in the article in terms of the difference between EXCEPT and the alternatives in terms of NULL comparisons.

    If you need to return also columns that you’re not comparing, a join gives you this option while a set operation doesn’t.

    Cheers,
    Itzik”

    Please feel free to contact me if you have additional questions.

    Megan Bearly
    Associate Editor, SQL Server Magazine
    mbearly@penton.com

  • Lorrin
    4 years ago
    Apr 04, 2008

    Great article!

    How would the outputs of EXCEPT statement differ from a left join when looking for rows that need to be inserted into target table ? OR Stated another way, is the EXCEPT statement an alternative to a LEFT JOIN in this use case (identify rows to insert to a target table)?

You must log on before posting a comment.

Are you a new visitor? Register Here