SideBar    Temporary Tables: Local vs. Global
DOWNLOAD THE CODE:
Download the Code 19663.zip

Recursive stored procedures populate a tree from a SQL Server table

Storing and representing data in a hierarchical, or tree, form is a common requirement in database development and an elegant solution for many real-world programming problems. T-SQL provides a powerful way of rendering the rows of a resultset in tree form (i.e., indented to resemble a tree). Relational database management systems (RDBMSs) such as SQL Server inherently produce resultsets as a list of rows, but you can produce the same list as a tree—if you set up the correct schema and relationships. Here, we develop three stored procedures to build a tree from a SQL Server table. The client application calls one of the stored procedures, which uses the other two as needed. Then, an Active Server Pages (ASP) or HTML page uses the output from the stored procedures to produce an HTML list box without requiring any client processing.

Our T-SQL tree solution uses stored procedures, cursors, and global temporary tables (the names of which begin with two pound symbols—##). One of our examples also uses ASP or HTML to interface with the SQL Server output, although a Visual Basic (VB) or Visual C++ (VC++) client can use the same output. We used SQL Server Enterprise Manager to design the database schema and Microsoft Visual InterDev 6.0 to develop the stored procedures and the ASP page. (SQL Server Query Analyzer and Notepad are possible alternatives to Enterprise Manager and Visual InterDev, respectively.)

SQL Server features let you provide robust solutions to many complex problems. Tree generation is only one example of what you can do. We use the following SQL Server features in this article:

  • storing hierarchical data in a table structure or schema
  • using recursion in T-SQL stored procedures
  • using a global temporary table
  • making a solution data-tier scalable by using SQL Server's system process ID (SPID)
  • employing CURSOR OUTPUT VARYING to use the output resultset from one stored procedure in the parent stored procedure
  • using the sp_executesql system stored procedure to execute queries constructed on the fly

Data Structure Must Support a Tree
To present a resultset as a tree, your schema or table structure must let you process the data in hierarchical, or tree, form. The key to creating a tree is to define the parent-child relationship with no limit on the number of child, or nested, levels possible. For example, let's look at a simple project plan: A project contains n main activities, main activities have subactivities, and subactivities have further subactivities (or sub-subactivities). Consider the schema diagram in Figure 1. The following code creates the project activity table:

CREATE TABLE project_activity_schedule (
	activity_code VARCHAR(10) NOT NULL PRIMARY KEY,
	activity_description VARCHAR(255) NOT NULL,
	planned_start_date datetime NULL,
	parent_activity_code VARCHAR(10) NULL
)

A unique code in the activity_code column identifies each activity in the project. The activity_description and planned_start_date columns provide more information about each activity. But the column that converts the table's structure from a simple list-type structure to a hierarchical structure is parent_activity_code. This column's type and width are the same as those of activity_code.

The parent_activity_code column uses a foreign key to reference the activity_code column in the same table; thus, activity_code and parent_activity_code have a recursive relationship, which the pipe in the upper left corner of Figure 1 depicts. Therefore, each activity that has a parent activity has that parent activity's code in the activity's parent_activity_code column. Activities without a parent have NULL in the parent_activity_code column. You can use Enterprise Manager's Database Designer to create this recursive relationship, or you can use the following statements after you've created the table:

ALTER TABLE project_activity_schedule
ADD CONSTRAINT fk_parent_child_activity
FOREIGN KEY (parent_activity_code) REFERENCES
project_activity_schedule(activity_code)

The activity_code's data type depends on the project requirements. For simplicity, we stored only two attributes for each activity—the description and the start date—but you can use more attributes if you want.

Let's determine whether the structure of the data set in Table 1 satisfies the requirement for a tree with unlimited levels. (Although the schema provides unlimited nesting levels, SQL Server imposes a limit of 32 levels for nested stored procedures.) Of the three root-level activities, or nodes, A1 has no subactivities, A2 contains three subactivities, and A3 has two subactivities. These subactivities have A2 and A3, respectively, as their parent_activity_code. You can further extend this structure to 32 levels by specifying the appropriate parent code. Now, let's create the three stored procedures required to build the tree.

The Three Stored Procedures
The client application calls the topmost stored procedure, tsp_prj_activity_tree_list (Tree List), to create a unique, global temporary table. Web Listing 1 shows the Tree List procedure (for download instructions, see the More on the Web box, page 56). Then, the Tree List procedure calls the tsp_prj_activity_tree_build procedure (Build), which actually builds the tree by using recursion and feeds the tree into the temporary table. The Build procedure calls the tsp_prj_activity_child_list procedure (Child List), which issues a query to select rows from the table and returns the rows to the Build procedure. (Figure 2 shows the relationships among the three stored procedures.) Finally, the Tree List procedure extracts the tree from the temporary table and drops the table.

Why create the table by using sp_executesql rather than the CREATE TABLE statement? The answer lies in the word unique. Because you can invoke multiple instances of this procedure at one time, using only one table would lead to errors. All procedures would try to create the same table and feed data into it. Either the different procedures would fail at creation because the table already existed, or when one procedure terminated, it would drop the table, making all the other simultaneous procedures fail. Also, the data fed into the table wouldn't make any sense if the data came from various procedures because its context would be missing. Thus, each concurrent instance of the procedure needs its own unique table.

You can create unique tables if you add @@SPID to the static table name—in this case, ##prj_activity_tree_table—so that the final tables' names are similar to ##prj_activity_tree_table1 and ##prj_activity_tree_table2. SQL Server assigns a unique SPID to each logged-on session. You can use the @@SPID global variable to obtain this ID's value, which is an integer. Therefore, for each user who makes a connection and invokes the procedure, we add this SPID to the table name, making the name unique. Because you don't know this part of the table name beforehand, you must construct the query on the fly as a Unicode string, then use sp_executesql to execute the query.

Using a global temporary table and SPID lets the table persist across stored procedures and support simultaneous multiple instances in each stored procedure. (For information about the differences between local and global temporary tables, see the sidebar "Temporary Tables: Local vs. Global.") To get a list of all current connections and their associated SPIDs, you can run the sp_who system stored procedure. (For more information about SPIDs, see SQL Server Books OnlineBOL.)

You use @gtemp_table_name to store the table name and @sql_statement_string to store the queries that you use to create the table, select records from it, and drop it. You enclose the entire operation between the BEGIN TRANSACTION and COMMIT TRANSACTION statements to ensure that the operation remains independent. Because the operation spans stored procedures and involves multiple WRITE operations, such as CREATE TABLE and DROP TABLE, you should ensure that WRITE operations either succeed or fail completely (i.e., that the procedure leaves no operation incomplete because of errors). Transaction control isn't mandatory here because SQL Server has an automatic mechanism for cleaning up global and local temporary tables. However, good programming practice dictates that you ensure completion in such contexts, and when you extend this example to build an expanding-collapsing tree, you must also maintain state, which involves a WRITE operation.

   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

This solution works but it can be better. Recursive stored procedures are still limited to 32 levels deep. A better alternative is to create a temporary table and fill that table the top level parents. Then repeatedly fill the table with children of the parent already in the table until no records are added. Sorting is accomplished by building a sort string as you go (sort_order = SPACE(tree_level) + parent_key + child_key). This needs one procedure one local temporary table and one loop.

Billy

Another way to generate resultset in tree form, is adding a column to the table (or create a view or use the function in the select statement) to record the level in the tree, then you can use this value to generate the indent.

Example: (I use same table definition and sample data)

CREATE FUNCTION fnMyFunction ( @activity_code VARCHAR(10) ) RETURNS INT BEGIN DECLARE @i INT

SET @i = 0

IF @activity_code IS NOT NULL WHILE @activity_code IS NOT NULL BEGIN SET @i = @i + 1

SELECT @activity_code = parent_activity_code FROM project_activity_schedule WHERE activity_code = @activity_code

END

RETURN @i END GO

CREATE TABLE project_activity_schedule ( activity_code VARCHAR(10) NOT NULL PRIMARY KEY, activity_description VARCHAR(255) NOT NULL, planned_start_date datetime NULL, parent_activity_code VARCHAR(10) NULL, treeLevel AS dbo.fnMyFunction(parent_activity_code) ) GO

ALTER TABLE project_activity_schedule ADD CONSTRAINT fk_parent_child_activity FOREIGN KEY (parent_activity_code) REFERENCES project_activity_schedule(activity_code) GO

SET NOCOUNT ON INSERT project_activity_schedule VALUES('A1', 'Site clearance - clearing of trees, garbage, etc.', '20000620', NULL) INSERT project_activity_schedule VALUES('A2', 'Excavation for foundation', '20000623', NULL) INSERT project_activity_schedule VALUES('A2.1', 'Arrangement of 25 unskilled laborers', '20000623', 'A2') INSERT project_activity_schedule VALUES('A2.2', 'Arrange of machinery', '20000623', 'A2') INSERT project_activity_schedule VALUES('A2.3', 'Excavation up to 25m x 25m x 1m', '20000623', 'A2') INSERT project_activity_schedule VALUES('A3', 'Laying of foundation', '20000630', NULL) INSERT project_activity_schedule VALUES('A3.1', 'Foundation activity 1', '20000630', 'A3') INSERT project_activity_schedule VALUES('A3.2', 'Foundation activity 2', '20000707', 'A3')

DECLARE @indent_string VARCHAR(25)

SET @indent_string = CHAR(9)

SELECT REPLICATE(@indent_string, treeLevel) + activity_code + ' ' + activity_description + ' (' + CONVERT(VARCHAR, planned_start_date) + ')' FROM project_activity_schedule

SET NOCOUNT OFF GO

DROP TABLE project_activity_schedule DROP FUNCTION fnMyFunction GO

Result:

-------------------------------------------------------------------------- A1 Site clearance - clearing of trees, garbage, etc. (Jun 20 2000 12:00AM) A2 Excavation for foundation (Jun 23 2000 12:00AM) A2.1 Arrangement of 25 unskilled laborers (Jun 23 2000 12:00AM) A2.2 Arrange of machinery (Jun 23 2000 12:00AM) A2.3 Excavation up to 25m x 25m x 1m (Jun 23 2000 12:00AM) A3 Laying of foundation (Jun 30 2000 12:00AM) A3.1 Foundation activity 1 (Jun 30 2000 12:00AM) A3.2 Foundation activity 2 (Jul 7 2000 12:00AM)

Alejandro Mesa

On Friday, Dec 13 2002, I posted a comment, about another solution to the sample posted with the article "Using T-SQL to Generate a Resultset in Tree Form". The solution, I sent to you, worked fine because casualty, I did not use any ORDER BY clause in the select statement to assure the result's order.

Here are two revisited versions, the first one, use "ORDER BY activity_code" because the way data was enter (A1, A2, A2.1, A2.2, etc.). The second one use a function that return the concatenation of parents and child's key in reverse order (first parent, second, third, ..., child's key) to assure the order of the result. This solution has a lot of cons but if the concatenation of the keys does not exceeds 8000 characters, then this solution fit well. It does not perform very well in terms of time because the HierarchyLevel and HierarchyOrder is calculated for each node in the tree.

I created two functions, one (HierarchyLevel) returns the level of the node in the tree, and the other (HierarchyOrder) returns the concatenation of the parents and child key, to be used in the ORDER BY clause.

CREATE TABLE project_activity_schedule ( activity_code VARCHAR(10) NOT NULL PRIMARY KEY, activity_description VARCHAR(255) NOT NULL, planned_start_date datetime NULL, parent_activity_code VARCHAR(10) NULL ) GO

ALTER TABLE project_activity_schedule ADD CONSTRAINT fk_parent_child_activity FOREIGN KEY (parent_activity_code) REFERENCES project_activity_schedule(activity_code) GO

CREATE FUNCTION fnHierarchyLevel ( @activity_code VARCHAR(10) ) RETURNS INT BEGIN DECLARE @i INT

SET @i = 0

WHILE EXISTS (SELECT parent_activity_code FROM project_activity_schedule WHERE activity_code = @activity_code) SELECT @i = @i + 1, @activity_code = parent_activity_code FROM project_activity_schedule WHERE activity_code = @activity_code

RETURN @i END GO

CREATE FUNCTION fnHierarchyOrder ( @activity_code VARCHAR(10) ) RETURNS VARCHAR(8000) BEGIN DECLARE @s VARCHAR(8000) DECLARE @t TABLE ( ColA INT NOT NULL IDENTITY, Value VARCHAR(10) )

SET @s = ''

WHILE EXISTS (SELECT parent_activity_code FROM project_activity_schedule WHERE activity_code = @activity_code) BEGIN INSERT @t VALUES(@activity_code)

SELECT @activity_code = parent_activity_code FROM project_activity_schedule WHERE activity_code = @activity_code END

SELECT @s = @s + Value FROM @t ORDER BY ColA DESC

RETURN @s END GO

SET NOCOUNT ON

INSERT project_activity_schedule VALUES('A1', 'Site clearance - clearing of trees, garbage, etc.', '20000620', NULL) INSERT project_activity_schedule VALUES('A2', 'Excavation for foundation', '20000623', NULL) INSERT project_activity_schedule VALUES('A2.1', 'Arrangement of 25 unskilled laborers', '20000623', 'A2') INSERT project_activity_schedule VALUES('A2.2', 'Arrange of machinery', '20000623', 'A2') INSERT project_activity_schedule VALUES('A2.3', 'Excavation up to 25m x 25m x 1m', '20000623', 'A2') INSERT project_activity_schedule VALUES('A3', 'Laying of foundation', '20000630', NULL) INSERT project_activity_schedule VALUES('A3.1', 'Foundation activity 1', '20000630', 'A3') INSERT project_activity_schedule VALUES('A3.2', 'Foundation activity 2', '20000707', 'A3')

DECLARE @indent_string VARCHAR(25)

SET @indent_string = CHAR(9)

SELECT Tree = REPLICATE(@indent_string, dbo.fnHierarchyLevel(activity_code) - 1) + activity_code + ' ' + activity_description + ' (' + CONVERT(VARCHAR, planned_start_date) + ')' FROM project_activity_schedule ORDER BY activity_code

SELECT Tree = REPLICATE(@indent_string, dbo.fnHierarchyLevel(activity_code) - 1) + activity_code + ' ' + activity_description + ' (' + CONVERT(VARCHAR, planned_start_date) + ')' FROM project_activity_schedule ORDER BY dbo.fnHierarchyOrder(activity_code)

SET NOCOUNT OFF GO

DROP TABLE project_activity_schedule DROP FUNCTION dbo.fnHierarchyLevel DROP FUNCTION dbo.fnHierarchyOrder GO

Excuse me for my poor english.

Sincerely,

Alejandro Mesa

Alejandro Mesa

Great!!!

loge

http://www.codeproject.com/cs/database/Trees_in_SQL_databases.asp

Anonymous User