DOWNLOAD THE CODE:
Download the Code 46117.zip

The first invocation of the recursive member returns target pages that are two links away from the source. The second invocation returns target pages that are three links away from the source, and so on, until the recursive member finds no more linked pages. The outer query then selects the distinct list of source/target pairs from the result of the CTE. We applied DISTINCT because without it, you get duplicate result rows when multiple paths lead from a source page to a target. If you remove the DISTINCT, you'll see two additional rows in your result set. Both the (1,7) and (1,8) pairs have duplicates because paths exist from 1 to 4 to 7 and 1 directly to 7 for the pair (1,7) and 1×4×7×8 and 1×7×8 for the pair (1,8) in the input graph.

With a minor addition to the CTE, as Listing 3 shows, you can also calculate the minimum distance (shortest path) between the nodes of this acyclic directed graph. In other words, the minimum number of links that you need to follow to get from one page to another. In the anchor member, add the constant 1 as the distance because the anchor returns direct links. In the recursive member, the recursive member calculates the distance as the parent's distance plus 1. The outer query groups the result by source and target page IDs, and returns the minimum distance for each pair. Running the code in Listing 3 gives the desired results that Table 2 shows.

Note that the Web links graph is equidistant—that is, the distance between any source and target pair is considered the same (one click away), as opposed to a non-equidistant graph, which has an attribute containing the distance between nodes in cases where there might be a different distance for different pairs. So, we use the constant 1 as the distance between source and target Web pages, both in the anchor member and as the increment in the recursive member. Non-equidistant graphs (e.g., a map of roads and highways with different distances between various pairs of source and target locations) require a slight revision to calculate the cumulative distance. Simply replace the constant 1 with the attribute holding the actual distance value between the nodes.

Cyclic Graphs
Cyclic graphs, in which a path might exist between a node and itself, are more complex to deal with than acyclic graphs, mainly because you could end up traversing paths infinitely. An example of a cyclic graph is a Web site, where you might end up in the same page you started browsing after following a series of links. The trick in dealing with cyclic graphs is to identify the cycle so that your code will know when to stop traversing.

Run the code in Listing 4 to repopulate the Links table with a graph that contains cycles. Figure 2 illustrates the relationships between the Web pages in the repopulated Links table. The Links table now represents a cyclic directed graph.

Listing 5 shows a solution that returns all possible paths and their lengths in a cyclic directed graph. In a moment, we show two modifications of this query that produce transitive closure and the shortest path between any two nodes of a cyclic directed graph. The query in Listing 5 navigates paths starting from each page as anchor, since each page might be the beginning of some navigation. While navigating, the query calculates the distance from the anchor page and constructs an enumerated path of page IDs leading from source to target. The recursive member filters out nodes for which it detects a cycle; a cycle is detected if a target page ID already appears in the source page's enumerated path. Your CTE identifies a cycle by using the LIKE predicate in the recursive member's filter. (For more information about the enumerated path construction and cycle detection, see "Cycling with CTEs.")

Running the code in Listing 5 produces the desired output that Table 3 shows. To return the transitive closure of links (cyclic directed graph) that Table 4 shows, you can revise the outer query in Listing 5 so that it returns only the distinct list of source and target page IDs. The code in Listing 6 has two revisions to the code in Listing 5. It returns only the pairs of source and target page IDs without returning the distance and the path attributes, and only the distinct list of pairs, generating as a result the actual transitive closure. Remember that transitive closure contains the distinct list of source and target items that have a path between them.

Finally, the code in Listing 7 shows how to return the shortest paths (as opposed to all possible paths) between pages in the Links relation representing a cyclic directed graph. In Listing 7, the CTE's body is the same as the CTE in Listing 5. The outer query calculates the minimum distance for each pair of pages, then generates the derived table MD from this data. Then the outer query joins all paths (an instance of TC called AP) to MD. The JOIN condition is based on source- and target-page ID matches, and the distance from AP must be equal to the minimum distance from MD. The join between AP and MD is the main addition to Listing 5's code, which returns all paths. The significance of the join is that you get back only pairs of pages that have the minimum distance between them, plus the actual paths. Run the code in Listing 7 to get the desired result, which Table 5 shows.

Transitive Closure and Recursive CTEs
Now you can utilize the new recursive-querying capabilities of SQL Server 2005 to provide short, elegant, ANSI-compliant, set-based solutions to transitive closure and shortest path problems. You can implement the techniques we've described in applications that handle requests for hierarchical information such as you'd find in BOMs and applications that deal with reachability questions, such as for road systems. The solutions for acyclic graphs are fairly simple and are based on traversal of one level at a time. The solutions for cyclic graphs use similar traversal of levels but also utilize cycle-detection techniques to stop traversing when a cycle is detected. Using the techniques we demonstrated in this article, you can quickly respond to requests for finding a path between two items and finding the shortest path between two items.

End of Article

Prev. page     1 [2]     next page -->



You must log on before posting a comment.

If you don't have a username & password, please register now.

 
 

ADS BY GOOGLE