DOWNLOAD THE CODE:
Download the Code 21453.zip

Finally, a use for the cross join

SQL Server developers must be endlessly inventive. Among the nifty devices bouncing around in their bag of tricks is the technique of using T-SQL to create T-SQL statements. When a potentially tedious project I worked on cried out for this technique, I ended up writing many such statements, which I share with you here. The project began when a client asked me to create 375 tables for a staging database. I spend a lot of time at the keyboard, but the idea of writing code to create 375 tables was daunting. So, I looked for another way to create the tables.

The solution to my problem lay in the data definition and the client's intended use for the tables. The 375 tables needed to hold data that automated data pumps funnel in from external data sources. After the data was lodged in the staging tables, my client could run the data through a sequence of data-integrity checks, transform it, then load it into a production warehouse. The data is organized into 75 regions that are numbered 01 through 75 and normalized at the external sources so that each region's data is distributed into five related tables. Because the data-integrity checks and transformations vary by table and by region, the region data must remain compartmentalized into these 75 separate tables. According to the client's established naming convention, the five types of tables are labeled A, B, C, D, and E. Consequently, I must name Region 01's tables A01, B01, C01, D01, and E01, for example, so that the automated data pump can find its target.

Because the table names and table structures are so standardized, I could develop a shortcut process for creating the 375 tables. Although several kinds of shortcuts would have worked in this scenario, I decided to try a type of query that I've had little use for in the past—the cross join. Most often, you use the cross join to quickly create large test data sets. SQL Server Books Online (BOL) defines a cross join as "[a join] that does not have a WHERE clause." That description is partially correct. A cross join doesn't contain a phrase in either the FROM clause or the WHERE clause that forces a match between the join columns of the two tables that you're linking. If you write a join by using the ANSI technique—which inserts the join expression into the FROM clause—you can specify CROSS JOIN, as the following code shows:

USE pubs
GO
SELECT au_fname, au_lname, royaltyper
FROM Authors CROSS JOIN TitleAuthor

This query contains no instruction that says, "match values of column A to values of column B," as you expect in a standard INNER JOIN or OUTER JOIN query. Instead, a cross join's resultset is a Cartesian product, which "multiplies" each row from the first table by all rows from the second table. The Authors table contains 23 rows, and the TitleAuthor table consists of 25 rows, so the resultset from this query, the Cartesian product (23 x 25), has 575 rows.

After I decided to use cross joins to help build all these tables, I proceeded to write my queries. First, for tables A through E, I created a set of "template" tables that contained the appropriate column names and data types for each of the tables but no rows of data. Next, I built two more tables—one that contained the region prefixes (A, B, C, D, and E) and one that contained the region numbers (01 through 75). Then, I cross-joined the two tables and created yet another table that contained the 375 new table names. Because I was going to build from the template tables that I created in the first step, I wrote a second query that created another table containing 375 rows, each row a SELECT INTO statement. After I created the 375 SELECT INTO statements as rows in a table, I could easily select from the table, copy the resultset, paste it into Query Analyzer, and execute all 375 SELECT INTO statements, thereby creating 375 tables.

   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.