When you fulfill these requirements, you can use any bulk-loading toolincluding BCP, BULK INSERT, tools that use the OLE DB interface IRowsetFastload (including the DTS Data Pump and XML Bulk Load), or the ODBC BCP APIto load data with minimal logging.
Verify Minimal Logging
To verify that your operation is using minimal logging, you can use one of two techniques, depending on whether you're loading into a heap or a table that contains a clustered index. If you're loading into a heap (i.e., a table without a clustered index), you can easily see whether you're using minimal logging by checking the lock type that SQL Server uses. With minimal logging, SQL Server acquires a table lock of req_mode = 12. Information about the table lock resides in the syslockinfo system table. To access the information, wrap the BULK INSERT statement in a BEGIN TRAN statement, as Listing 1 shows. The code uses the sp_lock stored procedure to load data and show the lock results, returning output similar to that in Figure 4. The presence of the lock type TAB and mode BU indicates that SQL Server is performing a minimum-logged bulk load into a heap.
When you're loading into a table that contains a clustered index, determining whether you're using minimal logging is more difficult. SQL Server doesn't use a BU lock with a clustered index, so you can't use the lock type to validate whether you're using minimal logging in the bulk operation. Instead, you need to examine the number and size of the log records. To do so, use the following query:
SELECT [# Log Records]
= count(*),
[# Log Bytes] = sum
([Log Record Length])
FROM ::fn_dblog(NULL, NULL)
For example, our team used the code that Listing 2 shows to run a test of this method. We loaded a native BCP file containing 20,000 rows and 20,280,000 bytes, then changed the variables that influence the amount of logging to simulate different scenarios. We received the results that Table 1 shows, which reveal how changing variables affects logging.
After you've determined how to get the best data-load performance for one session, you can work on scaling out your data-load operation. Even when you know how to optimize one session, you can't simply start running multiple sessions in parallel to speed up your data load. The Customer Advisory Team will cover parallel data loading in a future column.
To maximize your data-loading throughput, you must understand how the transaction log, recovery model, and locking affect your data-load potential. The Customer Advisory Team works to help most of its customers improve their data-loading throughput because these days, almost every project starts with existing data.
End of Article
Prev. page
1
2
[3]
next page -->