• subscribe
March 17, 2003 12:00 AM

XML Trees: Step by Step

How to return data as hierarchical XML
SQL Server Pro
InstantDoc ID #38047
Downloads
38047.zip

Relational databases often store data in a tree or hierarchy structure, in which each item is related to another in a parent-child relationship. An example is the classic company management structure such as the one that Figure 1 shows. The hierarchy usually has an arbitrary depth, and each item contains information only about its relative position in the tree. Judging from the many questions that appear on newsgroups, using SQL Server to return the entire tree in XML format is a common requirement. For example, many people need to display a hierarchy on a Web page or report. Figure 2 shows a tree displayed as XML.

Although SQL Server 2000's XML support is flexible, it has no direct support for returning hierarchical XML to any arbitrary depth. However, you can use the FOR XML EXPLICIT T-SQL command to achieve this result. (For more details about using this technique, see Bob Pfeiff, "In Control with XML EXPLICIT," March 2001, InstantDoc ID 19810.)

Some techniques for accessing hierarchical data through T-SQL can become quite complex; this article explains the techniques and builds up the T-SQL in small steps to make it easier to understand. On the way, I cover methods such as using dynamic SQL and reveal a few tricks and tips, including how to use temporary tables instead of UNIONs in your FOR XML EXPLICIT statements to reduce the size and complexity of these statements. The article assumes an understanding of SQL Server 2000's XML capabilities and the use of the universal table. ("In Control with XML Explicit" also covers the universal table.)

Universal Structure
The usual way of representing a hierarchy in a database is to use a table that has a self join, as Figure 3 shows. In this table, which contains information about each employee, the ReportsTo column is a foreign key that points to another row in the same table, forming the self join. A refinement of this approach is to split the information between two tables, one containing the information about employees and the other containing information about relationships, as Figure 4 shows. Listing 1, page 26, shows a script to create the tables and populate them with data (for simplicity, I didn't populate all the columns).

To return the data in XML format, you use the Explicit mode, writing queries to produce a result set that has a specific format. By adding the FOR XML EXPLICIT clause to the end of your query, you can return the results as XML rather than as a recordset. To create the XML that Figure 2 shows, you first need to create a query that produces the result set that Figure 5, page 26, shows. Note that the column names, the data order, and the data itself are each important for establishing the hierarchy.

The structure in Figure 5 is known as the universal table. This term is confusing because there is no actual table—you just need to write a query that returns data in this format. The universal table format is a means of visualizing the output of the required query. Adding FOR XML EXPLICIT to that query produces the XML in Figure 2. SQL Server is fussy about the universal table's structure. The column names and the row order must all be correct, or an error will occur—or worse, the query will produce XML with an incorrect structure. By comparing the data in Figure 5 to the XML in Figure 2, you can start to work out how SQL Server maps the data and column names to XML.

The special Tag and Parent columns contain metadata that defines the XML hierarchy. If Parent is 0 or null, this row becomes the root node. For all other rows, the Parent column stores the parent's tag. The row order is important here—the children must immediately follow the parent.

The names of the other columns also contain metadata that SQL Server uses to define the rest of the XML structure. In the example in Figure 5, Employee!1!EmployeeID adds the attribute EmployeeID to an element named Employee. The 1 denotes the tag number and ensures that the element is correctly placed in the hierarchy. The value of the attribute is the value of the column. The column Employee!1!FirstName adds a second attribute, FirstName, to the Employee element. Now you know enough about the universal table to produce hierarchical tree structures.



ARTICLE TOOLS

Comments
    There are no comments to display. Be the first one!
You must log on before posting a comment.

Are you a new visitor? Register Here