• subscribe
May 26, 2004 12:00 AM

Cycling with CTEs

Common table expressions can help you manipulate hierarchies with ease
SQL Server Pro
InstantDoc ID #42452
Downloads
42452.zip

Now, add a filter to the outer query to return employees only if the Cycle value is 1, thus returning only employees for whom you detected a cycle. After detecting the cycle, you can fix the data by correcting the employees' manager IDs. The code in Listing 4 implements this algorithm. Note that you must run this listing's code from the same connection as the one where you created the #EmployeeCycle temporary table. Figure 1 shows the result of running Listing 4's code. SQL Server didn't abort the query, which returned Employee 12, because it detected a cycle. The Path column helps you identify the full cycle so that you can fix the data in your database.

Sorting Siblings
A request I frequently get from readers and customers about hierarchical data manipulation is, How can I sort siblings under each parent according to a specified sort order? Say you need to return the subtree of Employee 12, sorting siblings by employee ID or by first name. SQL Server 2000 and earlier releases have no simple way to handle such a request, and unfortunately, SQL Server 2005 doesn't bring a built-in solution to this problem. However, with a bit of creativity, you can use CTEs to craft a custom solution. In much the same way I constructed an enumerated path of employee IDs in Listing 4's solution, you can construct a binary path of employee IDs leading to each employee. You convert the current employee ID from an int data type to a binary(4), then concatenate it to the path of the employee's manager. Call the concatenated binary path SortCol, and use it in the ORDER BY clause of the outer query, and you get siblings sorted by their employee IDs. Running the code that Listing 5 shows returns Employee 12 and all levels of subordinates, with siblings sorted by their employee IDs.

The code in Listing 5 also generates a pseudo column, Level, using the technique for calculating the employee level below the root of the subtree that I demonstrated in "Get in the Loop with CTEs." The outer query uses the Level column's value to indent each employee a certain number of spaces and pipes ( | ) according to his or her level in the hierarchy. The outer query returns the employee ID in parentheses followed by the employee name. Figure 2 shows the result of running Listing 5's code, listing employees who have the same manager, sorted by employee ID.

A more complex problem involves sorting by a column such as FirstName, which doesn't lend itself to constructing short binary paths out of small, fixed-size values because first names have varying lengths. To concatenate name values of fixed sizes so that you can sort correctly, first use the following query to return, for each employee, the employee's sort position among siblings under the same manager:

SELECT EmployeeID, FirstName,
    ManagerID,
  ROW_NUMBER() OVER(PARTITION BY
    ManagerID ORDER BY FirstName)
FROM HumanResources.Employee
  AS E1
JOIN Person.Contact AS C
  ON C.ContactID = E1.ContactID

This query returns all rows from the Employee table and the corresponding first names from the Contact table. The query uses the new SQL Server 2005 ROW_NUMBER() function to return row numbers of employees separated by manager ID and sorted by first name. In other words, the function calculates the sort position of the employee among the employees who have the same manager (siblings) according to the employees' first names. Now, use Listing 5's solution, but instead of referring to the Employee table and constructing a binary path out of employee IDs, refer to a CTE that contains the query that calculates the sort positions by first name (call it EmpPos). Then, construct a binary path from the positions, as Listing 6 shows, instead of constructing the path from the employee IDs. Notice in the result, which Figure 3 shows, that employees who have the same manager are sorted by first name.

"It's the Question That Drives Us"
CTEs let you easily and efficiently manipulate multiparent hierarchies, avoiding lengthy iterative code. However, CTEs don't provide complete functionality for detecting cycles and sorting siblings. Sometimes, not having the "perfect" tools can be motivating because it gives you room for thought and creativity. One of my favorite quotes from The Matrix film series applies to puzzle enthusiasts like me. Neo, a computer programmer, is puzzling over what the Matrix is. Trinity, knowing the answer and its complexity, tells Neo, "It's the question that drives us—the answer is out there." The tough question or puzzle drives you to look for the answer. With the techniques I've shown, you can create your own cycle-detection and sorting solutions for working with hierarchies.



ARTICLE TOOLS

Comments
  • Anonymous User
    7 years ago
    Sep 01, 2005

    Another great article from Itzik - on June CTP I needed to cast(BOM.PerAssemblyQty as int) to get listing2 to work. [CTEs will ensure we upgrade to 2005 soon.]

You must log on before posting a comment.

Are you a new visitor? Register Here