Inference is a quick, easy way to load an XML document into a DataSet. Tables, columns, and relationships are created automatically by introspection, a process whereby the DataSet examines the XML document's structure and content. Although using inference significantly reduces your programming effort, it introduces unpredictability in your implementation because small changes to the XML document can cause the DataSet to create differently shaped tables. These changes in shape can cause your application to break unexpectedly. Therefore, I recommend that you always reference a schema for production applications and limit your use of inference to building prototypes.
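To make the risk concrete, here is a minimal sketch of how a small change in a document can change the inferred shape. The two XML fragments, the class name, and the helper methods are hypothetical and serve only as illustration; the only DataSet call involved is ReadXml with XmlReadMode.InferSchema.

```csharp
using System;
using System.Data;
using System.IO;

public static class InferenceShapeSketch
{
    // Hypothetical order document that contains a single Order element.
    public const string OneOrder =
        "<Customer><Name>Ann</Name><Order>1001</Order></Customer>";

    // The same hypothetical document after a second Order element is added.
    public const string TwoOrders =
        "<Customer><Name>Ann</Name><Order>1001</Order><Order>1002</Order></Customer>";

    // Loads the XML without a schema so that the DataSet infers the tables and columns.
    public static DataSet Infer(string xml)
    {
        var ds = new DataSet();
        using (var reader = new StringReader(xml))
        {
            ds.ReadXml(reader, XmlReadMode.InferSchema);
        }
        return ds;
    }

    // Writes each table name followed by its column names.
    public static void PrintShape(DataSet ds)
    {
        foreach (DataTable table in ds.Tables)
        {
            Console.WriteLine(table.TableName);
            foreach (DataColumn column in table.Columns)
            {
                Console.WriteLine("  " + column.ColumnName);
            }
        }
        Console.WriteLine();
    }

    public static void Main()
    {
        PrintShape(Infer(OneOrder));
        PrintShape(Infer(TwoOrders));
    }
}
```

Under the documented inference rules, a nonrepeating element that has only text becomes a column, whereas a repeating element becomes a table. So the first document should produce a single Customer table that has an Order column, and the second should produce a separate Order table. Code written against the first shape would no longer find the column it expects.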
Now let's look at an example of how you can easily use a schema to build a client-side DataSet cache that you can use to update your SQL Server database.
Mapping an XML Order
Suppose you're writing an application that accepts orders from your customers in the XML format that the XSD schema in Figure 1 defines. The schema defines three complex types that provide the order's customer data, order data, and line items. A top-level Customer element defines the XML document's root. The containment hierarchy defines relationships between the elements: An Order element contains a LineItem element, and a Customer element contains an Order element. Figure 2 shows an instance of an XML document that matches Figure 1's schema.
The C# code in Listing 1, page 38, uses the ReadXmlSchema method to load the schema from Figure 1 into a DataSet called orderDS. ReadXmlSchema creates three DataTables that correspond to the Customer, Order, and LineItem elements that the schema defines. So that you can verify that the schema created the expected tables in the relational cache, the printDSShape method writes the table name for each table to the console, followed by a list of columns and the data type for each column.
Look closely at the column names in Figure 3, page 38. The Customer_Id and Order_Id columns are present in the DataTables although they aren't specified in the schema. The ReadXmlSchema method automatically adds these columns to the DataSet. The DataSet uses the columns as foreign keys to model the relationships between a Customer element and its Order element and between an Order element and its LineItem element. Because XML typically uses nested relationships instead of foreign keys, the DataSet automatically generates its own primary and foreign keys between the DataTables and stores them in these columns.
Also look carefully at the data types in Figure 3: the DataSet has mapped the data types from XML Schema data types to the corresponding .NET data types. When you load an XML document into the DataSet, the DataSet converts each value from the XML to the corresponding .NET type.
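The following sketch shows the schema-loading step in isolation. It is not Listing 1: the inline schema is a hypothetical stand-in for Figure 1 (the element names CustomerID, OrderDate, ProductID, and Quantity are assumptions), and the class and method names are mine. It loads the schema from a string rather than a file so that the example is self-contained, then prints each table's columns, data types, and relations.

```csharp
using System;
using System.Data;
using System.IO;

public static class SchemaShapeSketch
{
    // Hypothetical stand-in for Figure 1's schema; the child element names are assumptions.
    public const string OrderSchema = @"
<xs:schema xmlns:xs=""http://www.w3.org/2001/XMLSchema"">
  <xs:element name=""Customer"">
    <xs:complexType><xs:sequence>
      <xs:element name=""CustomerID"" type=""xs:string""/>
      <xs:element name=""Order"" maxOccurs=""unbounded"">
        <xs:complexType><xs:sequence>
          <xs:element name=""OrderDate"" type=""xs:dateTime""/>
          <xs:element name=""LineItem"" maxOccurs=""unbounded"">
            <xs:complexType><xs:sequence>
              <xs:element name=""ProductID"" type=""xs:int""/>
              <xs:element name=""Quantity"" type=""xs:int""/>
            </xs:sequence></xs:complexType>
          </xs:element>
        </xs:sequence></xs:complexType>
      </xs:element>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>";

    // Builds the relational shape from the schema; no data is loaded here.
    public static DataSet LoadOrderSchema()
    {
        var orderDS = new DataSet();
        using (var reader = new StringReader(OrderSchema))
        {
            orderDS.ReadXmlSchema(reader);
        }
        return orderDS;
    }

    // Writes each table's columns with their .NET types, then the relations.
    public static void PrintShape(DataSet ds)
    {
        foreach (DataTable table in ds.Tables)
        {
            Console.WriteLine(table.TableName);
            foreach (DataColumn column in table.Columns)
            {
                Console.WriteLine("  {0} ({1})", column.ColumnName, column.DataType.Name);
            }
        }
        foreach (DataRelation relation in ds.Relations)
        {
            Console.WriteLine("{0}: {1}.{2} -> {3}.{4}", relation.RelationName,
                relation.ParentTable.TableName, relation.ParentColumns[0].ColumnName,
                relation.ChildTable.TableName, relation.ChildColumns[0].ColumnName);
        }
    }

    public static void Main()
    {
        PrintShape(LoadOrderSchema());
    }
}
```

The generated Customer_Id and Order_Id columns should appear in the output even though the schema never mentions them, and the relations list shows how the DataSet uses them to link parent and child tables.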
After loading the schema into the DataSet, all you have to do to complete the relational mapping is load the XML data into the DataSet. Listing 1's ReadXml method opens the file named Order.xml, which Figure 2 shows. Then, it reads the data from the file into the three DataTables that the DataSet created when you read the schema in the previous step. Your XML order is now accessible through the DataSet.
To demonstrate how to access the data in the DataSet, Listing 1's printDSData method navigates through the DataTables and, for each table, displays the column names, followed by all rows in the DataTable. Figure 3 shows the automatically generated values for the Customer_Id and Order_Id columns that the ReadXmlSchema method added to the DataSet.
Also notice that three elements that appear in Order.xml (PO, Address, and Description) aren't mapped into the DataTables. This data is omitted because the schema you supplied to the DataSet didn't contain these elements, and the DataSet simply ignores any data not described in the schema when it's creating the shape of the relational cache and loading the XML data. This convenient feature lets your code work properly even if additional data you didn't anticipate is included in the XML order you receive from your customer.
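As a rough illustration of the load step and of how unmapped elements are dropped, the sketch below reads a small order into a schema-defined DataSet. Both the schema and the order document are hypothetical stand-ins for Figure 1 and Figure 2, and the class and method names are mine. I pass XmlReadMode.IgnoreSchema explicitly to make the "discard what doesn't match" behavior visible in the code; that is my choice for this sketch, not necessarily what Listing 1 does.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;

public static class OrderLoadSketch
{
    // Hypothetical stand-in for Figure 1's schema (no PO, Address, or Description).
    public const string OrderSchema = @"
<xs:schema xmlns:xs=""http://www.w3.org/2001/XMLSchema"">
  <xs:element name=""Customer""><xs:complexType><xs:sequence>
    <xs:element name=""CustomerID"" type=""xs:string""/>
    <xs:element name=""Order"" maxOccurs=""unbounded""><xs:complexType><xs:sequence>
      <xs:element name=""OrderDate"" type=""xs:dateTime""/>
      <xs:element name=""LineItem"" maxOccurs=""unbounded""><xs:complexType><xs:sequence>
        <xs:element name=""ProductID"" type=""xs:int""/>
        <xs:element name=""Quantity"" type=""xs:int""/>
      </xs:sequence></xs:complexType></xs:element>
    </xs:sequence></xs:complexType></xs:element>
  </xs:sequence></xs:complexType></xs:element>
</xs:schema>";

    // Hypothetical stand-in for Order.xml; PO, Address, and Description are not in the schema.
    public const string OrderXml = @"
<Customer>
  <CustomerID>ALFKI</CustomerID>
  <PO>9572658</PO>
  <Address>1 Main Street</Address>
  <Order>
    <OrderDate>2003-01-15T00:00:00</OrderDate>
    <LineItem><ProductID>11</ProductID><Description>Widget</Description><Quantity>12</Quantity></LineItem>
    <LineItem><ProductID>42</ProductID><Description>Gadget</Description><Quantity>10</Quantity></LineItem>
  </Order>
</Customer>";

    // Reads the schema first, then loads the data into the tables the schema defined.
    public static DataSet LoadOrder()
    {
        var orderDS = new DataSet();
        using (var schemaReader = new StringReader(OrderSchema))
        {
            orderDS.ReadXmlSchema(schemaReader);
        }
        using (var dataReader = new StringReader(OrderXml))
        {
            // IgnoreSchema: data that doesn't match the existing schema is discarded.
            orderDS.ReadXml(dataReader, XmlReadMode.IgnoreSchema);
        }
        return orderDS;
    }

    // Writes each table's column names followed by every row's values.
    public static void PrintData(DataSet ds)
    {
        foreach (DataTable table in ds.Tables)
        {
            var names = new List<string>();
            foreach (DataColumn column in table.Columns)
            {
                names.Add(column.ColumnName);
            }
            Console.WriteLine(table.TableName + ": " + string.Join(", ", names));
            foreach (DataRow row in table.Rows)
            {
                Console.WriteLine("  " + string.Join(", ", row.ItemArray));
            }
        }
    }

    public static void Main()
    {
        PrintData(LoadOrder());
    }
}
```

In the output, the Customer and LineItem tables should carry no PO, Address, or Description columns, and the generated key values in each child row should match the key of its parent row.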
Building Applications That Use the Cache
Now that you've learned how to use the DataSet to build a relational cache for XML data, you can apply this knowledge to implement applications that execute business logic and update SQL Server. Implementing business logic is relatively straightforward when you use the DataSet programming model. ADO.NET gives you several alternatives for updating data in SQL Server, including using DataAdapters, writing your own queries, and executing stored procedures. DataSets make mapping XML data to a relational model easy; the rest is up to you.