DOWNLOAD THE CODE:
Download the Code 16123.zip

Calculating Aggregates and Subtree Depth
Now for the neat stuff—problems that aren't easy to solve without recursion. Suppose you want to get the sum of all salaries of employees in a certain subtree, starting with a given manager. Listing 4 shows how you can use the dbo.ufn_GetSubtreeSalary function to perform this task. Note that the function is surprisingly short, considering the complex task at hand; it contains only a RETURN clause. The function asks for the manager's salary plus the sum of the salaries of the subtrees of each manager's direct subordinates. Now, try to use this new function to calculate the total salary for the subtree that starts with Janet (whose empid is 3), and you'll get 20000:

SELECT dbo.ufn_GetSubtreeSalary(3)

You can similarly calculate the depth of a subtree, as Listing 5 shows. The dbo.ufn_GetSubtreeDepth function also returns the result of a CASE expression, but the function is slightly different in this listing. The CASE expression first checks whether the given manager has subordinates. If so, the function returns 1 plus the maximum depth of all the manager's direct subordinates—hence the recursion. If the given manager has no subordinates, the CASE expression determines whether that manager exists. If she does, the function returns 1 (a manager with no subordinates is 1 level deep). Finally, if the manager doesn't exist, the CASE expression returns NULL. Now, try this function to calculate the depth of the whole tree by supplying Nancy's employee ID (Nancy is the big boss) as the argument, and you'll get 5, which means that the tree is 5 levels deep.

SELECT dbo.ufn_GetSubtreeDepth(1)

Getting a Chain of Management
Until now, these examples have used scalar UDFs—UDFs that return a single value. Now, let's see how to use table-valued UDFs, which return a rowset or a table (i.e., for use in the FROM clause). For example, a common requirement when working with hierarchical structures is to return a whole subtree starting with a given manager. Listing 6 shows the script that creates the ufn_GetSubtree function. Note that the returned table has the same structure as the original Employees table, with two additional columns: lvl and path. The lvl column holds the level in the subtree starting with 0, and the path column holds the path of the employee in the form .id0.id1...idn. This string contains all the employee IDs, starting with the top-level employee in the subtree and ending with the current employee. All the employee IDs in the path are preceded and followed by dots.

The path column allows proper sorting of the subtree rows that the dbo.ufn_GetSubtree function in my example returns. Because the path values of all subordinates of a particular employee are prefixed with their managers' paths, the values are sorted right after their manager. The function first inserts into the @tree table variable the manager's row, which you provided as an argument. Next, the function starts a loop, which iterates as long as the previous insert operation has result rows. The code in the loop appends the direct subordinates of the previous insert operation to the @tree table variable. You can identify the level of the previously inserted employees by keeping the current level of the employees in the subtree in the variable @lvl.

Now you can test the ufn_GetSubtree function. To get details about Andrew (whose empid is 2) along with details about his subordinates in all levels, issue the following query:

SELECT * FROM ufn_GetSubtree(2)
ORDER BY path

To get a hierarchical depiction of all the employees in the Employees table, issue the following query:

SELECT REPLICATE (' | ', lvl) + empname AS employee
FROM ufn_GetSubtree(1)
ORDER BY path

Figure 1 shows the results. To get the leaf nodes (employees who aren't managers) in Andrew's subtree, you issue the following query:

SELECT * FROM ufn_GetSubtree(2) AS S
WHERE NOT EXISTS(SELECT * FROM Employees AS E
WHERE E.mgrid = S.empid)

The last task is to report the chain of management leading to a particular employee by using the ufn_GetMgmtChain function, which Listing 7 shows. Note that you implement the ufn_GetMgmtChain function in a way similar to the ufn_GetSubtree function, except that you go up the chain until you reach the top. To get the management chain leading to James (whose empid is 14), you can issue the following query:

SELECT * FROM ufn_GetMgmtChain(14)
ORDER BY lvl DESC

In "Maintaining Hierarchies," I used the same lvl and path columns that I mention in this article, but I added them to the original Employees table and maintained their values with triggers. In terms of the SELECT queries' performance, adding such columns to the Employees table is more efficient than calculating the values of lvl and path in a UDF. The ufn_GetSubtree function needs to scan the whole table to calculate the values of lvl and path for each row before you can apply a filter that uses those columns. The solution that holds lvl and path as columns in the Employees table can exploit an index placed on the path column, thus not requiring a full table scan. However, using the ufn_GetSubtree function doesn't require any schema changes to the Employees table, and modifications to the Employees table don't incur the additional cost of maintaining the lvl and path columns, as in the triggers solution. The advent of UDFs has opened endless programmatic possibilities. UDFs are powerful tools for manipulating hierarchical data without having to alter a table's structure.

End of Article

Prev. page     1 [2]     next page -->



You must log on before posting a comment.

If you don't have a username & password, please register now.

 
 

ADS BY GOOGLE