Notice that the execution plan contains no startup filters. The plan shows that SQL Server accesses only the local table. This query differs from the previous two in that SQL Server didn't auto-parameterize this query. SQL Server determined that it would perform partition elimination (in this case, eliminating the partitions on Node2 and Node3) at compile time and not at runtime, so you don't see the partition elimination in the execution plan. Nevertheless, SQL Server performed the elimination. The following query, which is similar to the previous one, won't be able to reuse an auto-parameterized plan because such a plan doesn't exist. Figure 5 shows the execution plan for this query:
SELECT *
FROM Customers
WHERE customerid = 'WILMK'
OR customerid = 'WOLZA'
In this plan, SQL Server also performs partition elimination at compile time, so you see only one remote query, which SQL Server issues against Node3. (For more information about auto-parameterization and plan caching, see Kalen Delaney, Inside SQL Server, "SQL Server 7.0 Plan Caching," September 1999.)
Now let's see some more magic by examining how SQL Server handles GROUP BY queries in distributed partitioned views. Consider the following GROUP BY query and its execution plan, which Figure 6 shows:
SELECT customerid, COUNT(*) AS count_custs
FROM Orders
GROUP BY customerid
Notice that SQL Server calculates aggregates in the local node and issues a GROUP BY query against each remote node. Later, SQL Server concatenates all the results and aggregates each node's aggregates to form the final result set. Splitting the aggregation in this example is called "local-global aggregation." This method is much more efficient than bringing all the individual rows from all the nodes and aggregating them locally. By executing the GROUP BY query on the source servers, SQL Server transfers much less data across the network. In addition, this method distributes the processing load to exploit the power of all the servers. Note that you might get a different plan for this query. This particular plan was generated when we configured Node1 to use a maximum of 64MB of memory. The optimizer came up with a plan that pushes the other nodes' resources to the limit. Similarly, if you partition two tables on the same column with the same partitioning criteria, SQL Server also locally performs any join queries that you run against the views. Or, you can use a method that returns all the rows to the requesting server and performs the join at that server. This option lets you split the join processing to the participating servers and exploits each server's resources for performing local joins. Now let's compare a join query against two views that are partitioned on the same partitioning criteria with a join query against two views that are partitioned on different partitioning criteria. To make this comparison, partition the Orders table by the orderid column instead of the customerid column, as Listing 1 shows.
Now run the following join query against the views Customers and Orders, which are partitioned with the same partitioning criteria:
SELECT *
FROM Customers AS C JOIN Orders AS O
ON C.customerid = O.customerid
Compare this query's execution plan, which Figure 7 shows, to the following query's execution plan, which Figure 8 shows. The following query runs against the views Customers and Orders2, which use different partitioning criteria:
SELECT *
FROM Customers AS C JOIN Orders2 AS O
ON C.customerid = O.customerid
Notice that in the first query, each node processed the join locally, and later SQL Server concatenated the results in the local server. In the second query, SQL Server brought the unchanged results of the Customer's partitions and the Orders2 partitions to the local server. Then SQL Server performed the join on the local server without using the remote servers' resources.
More Distributed Partitioned Views to Come
Now that you're familiar with the basic steps to create, modify, and query distributed partitioned views, you need to know how to handle the exceptions. In an upcoming issue, we'll offer suggestions for achieving similar functionality when your tables or views don't meet the criteria for distributed partitioned views that we described here.