As the November 7 release of Visual Studio 2005 and SQL Server 2005 approaches, we're seeing a flood of courses, technical articles, and marketing pieces pour out of the content providers like so many cars trying to get out of town on a Friday evening. Much of this new content unabashedly extols the virtues of ADO.NET 2.0, the new version of ADO.NET, and the tools you can use to create applications that reference it. Many of the articles simply highlight the long list of new, flashy features in ADO.NET 2.0, but I believe most developers care more about how these features will solve specific development challenges. Therefore, I'm going to introduce ADO.NET 2.0 by listing some important problems and explaining how the new release addresses them in smarter, faster, and better ways.
Smarter at Managing Connections
One of the uglier problems with ADO.NET has been how the data providers handle (or mishandle) the connection pool. For example, when a connection dies in ADO.NET 1.1, the pooling mechanism keeps the carcass around until some unsuspecting application tries to reuse it, and only then does the pooler toss out the corpse. However, the pooler leaves other dead connections for subsequent Open calls to trip over. ADO.NET 2.0 implements a new approach: Once it finds a bad connection, it flushes that connection and all other bad connections from the pool, which greatly simplifies your exception handlers. You can also add code to force the pool (or all the pools) to flush. ADO.NET 2.0 also replaces the connection pool performance counters. The new counters seem to work (the old ones clearly did not), so you can more accurately monitor your connection-pool status.
When working with the Visual Studio IDE to create a new connection in Visual Studio 2003, you have a choice of OLE DB providersbut no .NET Data Providers. To address this problem, ADO.NET 2.0 exposes a new DbConnectionStringBuilder class to help developers build real .NET Data Providerspecific connection strings. Given the ability to use new .NET Framework APIs to list providers and servers, tool builders will have an easier time building ConnectionString dialog boxes.
Some applications, especially those that have evolved from JET architectures, try (and fail) to use a single connection to post updates against rowsets that aren't fully populated. For example, developers might execute a SELECT statement by using a DataReader and, as they step though the rows, try to execute an UPDATE command through the same connection. Regardless of the wisdom (or lack thereof) of this approach, ADO.NET 2.0 and SQL Server 2005 now support Multiple Active Resultsets (MARS). Although I've yet to embrace this feature, Microsoft seems pretty proud of it. MARS lets you execute additional operations on a single connection (under guarded circumstances). I believe that simply opening an additional connection usually makes more sense, but MARS is off by default, so developers can choose whether to use it.
Handling Data FasterFor Smart and Dumb Stuff
To be fair, I accept that some developers really need to bring 100,000 rows to the client and manage the data in a DataTable. Microsoft accommodates these folks by redesigning the way ADO.NET constructs, stores, indexes, and fetches DataTable and DataSet objects. If you keep your DataTable smaller than 10,000 rows, you won't see much difference in performance. But developers who build gigantic DataTables will see a dramatic performance improvementas much as 80 times faster (or so Microsoft tells me). This enhancement shouldn't encourage you to download the entire list of state Medicare subscribers into memory. There are better, faster ways to deal with large volumes of datasuch as using the new BulkCopy class to import the rows into a server-side table and process them there by using a stored procedure or even a new CLR-based procedure.
It also seems that lots of developers use a DataReader to fetch data (because they heard it was faster) but subsequently use their own code to load the data into a DataTable or DataSet. ADO.NET 2.0's DataTable and DataSet now expose the Load method to let developers directly load a DataTable from a DataReader. You can also expose a Data-Reader stream from an existing DataSet or DataTable by using the CreateReader method. Yep, that means the DataReader might return multiple rowsets accessed through the NextResult method.
But suppose someone asks you how to make an application run faster when moving 10,000 or more rows to the client. You (like I) might casually ask, "Why?" Often, it's because the customer wants to execute a bulk operationbut doesn't know it. Generally, such customer requests take one of two forms.
- The customer asks you to fetch a bazillion rows from a table, make a change, and send the rows back. Of course, some customers aren't aware of stored procedures that can usually make all the needed changes right on the server in a fraction of the time. I provide these folks with a tutorial about writing intelligent server-side UPDATE operations. But people use the tools they're comfortable with.
- The customer asks you to fetch a bazillion rows from another database and move them to SQL Server. Sometimes these folks want to execute logic on the data as it goes by, but typically they just want to import a lot of rows from a mainframe in their Cleveland office. In the past, I recommended that these folks use DTS or bulk copy program (BCP) for the operation. In ADO.NET 2.0, you can easily import data from any data source by using the SqlBulkCopy class. You can bulk copy data exposed by any .NET data providerSqlClient, OLE DB, ODBC, or any third-party providerto SQL Server for processing. Once the data is on the server, I suggest the client alter or filter the rows in place. What takes hours or days to do with previous ADO versions takes seconds or minutes with ADO.NET 2.0.
To illustrate this last point, I recently wrote an application to use the MSDN Index CD to find a DVD in my Universal collection. Unfortunately, the Microsoft Internet Explorer (IE)based search tool that accompanies the Index CD takes several minutes for a simple search. I uploaded the MSDN Index data by using the ADO.NET 2.0 SqlBulkCopy class to move the ADO classic Advanced Data TableGram (ADTG)-persisted .rs files from the Index CD to SQL Server. After the files loaded, I created appropriate indexes and was able to reduce the query time from several minutes to less than 5 seconds. The BCP operation moved about 450,000 rows in less than 30 seconds. I also tried to import the XML data from the CD, but the results were appalling. Why did it take so much longer? Perhaps it's because XML is one of the most inefficient (but one of the most flexible) data-storage mechanisms.
Another feature not to be overlooked is the new ADO.NET 2.0 TableAdapter. You can best generate this class by using the new drag-and-drop mechanisms in the Visual Studio 2005 IDE. The TableAdapter class is designed to expose the new functionality of the DataTable object, which now implements many of the DataSet methods. The DataTable object lets you serialize an individual DataTable by using either XML or binary serialization. Add to that new binding mechanisms and rowset navigation controls, and you have a more powerful tool to build interactive applications. ASP.NET also gets some new ADO.NET support. For example, you can now automatically update the DataSets you build and bind by using new "full-duplex" data-management technology.
Prev. page  
[1]
2
next page