Executive Summary:
The amount of logging can affect a SQL Server operation’s performance. Learn which insert methods can benefit from minimal logging, to improve your statements’ performance.
|
SQL Server supports several methods for inserting data into tables, including SELECT INTO, BULK INSERT, bcp.exe, INSERT SELECT FROM OPENROWSET(BULK …), and regular INSERT SELECT. Some of the insert methods perform minimal logging in certain circumstances. Because writes to the transaction log in SQL Server are sequential, the amount of logging can affect an operation’s performance. Operations that perform minimal logging run substantially faster than operations that perform full logging, as well as reduce the load on the disks holding the transaction log.
In this article I describe the insert methods that can benefit from minimal logging and the requirements they must meet. I provide you with tools to determine how much logging was involved in a particular operation and to discover what information was logged. I also cover some important enhancements in SQL Server 2008 in terms of minimally logged insert methods.
Note that SQL Server supports other types of minimally logged operations, such as index and large object operations, but this article focuses on insert methods. Also, SQL Server supports minimal logging for certain insert methods that don’t use T-SQL (e.g., that use SQL Server Integration Services—SSIS, or the bulk-copy API), but this article focuses on methods using T-SQL.
For more information about Bulk Import Optimizations, see the series of blogs written by Sunil Agarwal, who is with the SQL Server development team at Microsoft (blogs.msdn.com/user/Profile.aspx?UserID=13989).
Prior to SQL Server 2008
Prior to SQL Server 2008, only the following insert methods benefitted from minimal logging: SELECT INTO, BULK INSERT, bcp.exe, and INSERT SELECT FROM OPENROWSET(BULK …). Let’s discuss the SELECT INTO statement first, because the other insert methods (which I refer to as the bulk import methods) have different requirements for minimal logging.
One requirement that all insert methods share in order to allow minimal logging is that the database recovery model must be set to SIMPLE or BULK_LOGGED. If you set the database recovery model to FULL, all insert methods will incur full logging. Of course, before you set your database recovery model to something other than FULL, you need to make sure the setting is acceptable for your environment’s recovery needs. For more information about recovery models, see SQL Server Books Online.
SELECT INTO. The only requirement necessary to allow minimal logging for the SELECT INTO statement is to set the recovery model to SIMPLE or BULK_LOGGED. Prior to SQL Server 2008, the regular INSERT SELECT statement couldn’t benefit from minimal logging; therefore, SELECT INTO had an advantage over INSERT SELECT. However, SELECT INTO has several shortcomings compared with INSERT SELECT.
SELECT INTO creates the target table and populates it with data, combining the Data Definition Language (DDL) and the Data Manipulation Language (DML) activities into one statement. SELECT INTO doesn’t give you control over definition of the target table—this is dictated by the queried source. Also, in SQL Server—unlike in some other database platforms—both DDL and DML are transactional. The implication is that while a SELECT INTO statement is running, both the data involved in the DML part and the metadata involved in the DDL part (rows written to system tables) are exclusively locked. So if you try to query the system catalog while a SELECT INTO statement is in progress, and your query attempts to grab locks on metadata rows that are locked by the SELECT INTO transaction, your request will be blocked. Even though a minimally logged SELECT INTO statement can run substantially faster than a fully logged INSERT SELECT statement, the SELECT INTO statement can take a long time to run if you’re dealing with large volumes of data.