• subscribe
August 06, 2009 12:00 AM

Minimally Logged Inserts

Improve your insert statements’ performance
SQL Server Pro
InstantDoc ID #102462

 

Executive Summary:

The amount of logging can affect a SQL Server operation’s performance. Learn which insert methods can benefit from minimal logging, to improve your statements’ performance.

 

SQL Server supports several methods for inserting data into tables, including SELECT INTO, BULK INSERT, bcp.exe, INSERT SELECT FROM OPENROWSET(BULK …), and regular INSERT SELECT. Some of the insert methods perform minimal logging in certain circumstances. Because writes to the transaction log in SQL Server are sequential, the amount of logging can affect an operation’s performance. Operations that perform minimal logging run substantially faster than operations that perform full logging, as well as reduce the load on the disks holding the transaction log.

In this article I describe the insert methods that can benefit from minimal logging and the requirements they must meet. I provide you with tools to determine how much logging was involved in a particular operation and to discover what information was logged. I also cover some important enhancements in SQL Server 2008 in terms of minimally logged insert methods.

Note that SQL Server supports other types of minimally logged operations, such as index and large object operations, but this article focuses on insert methods. Also, SQL Server supports minimal logging for certain insert methods that don’t use T-SQL (e.g., that use SQL Server Integration Services—SSIS, or the bulk-copy API), but this article focuses on methods using T-SQL.

For more information about Bulk Import Optimizations, see the series of blogs written by Sunil Agarwal, who is with the SQL Server development team at Microsoft (blogs.msdn.com/user/Profile.aspx?UserID=13989).

Prior to SQL Server 2008
Prior to SQL Server 2008, only the following insert methods benefitted from minimal logging: SELECT INTO, BULK INSERT, bcp.exe, and INSERT SELECT FROM OPENROWSET(BULK …). Let’s discuss the SELECT INTO statement first, because the other insert methods (which I refer to as the bulk import methods) have different requirements for minimal logging.

One requirement that all insert methods share in order to allow minimal logging is that the database recovery model must be set to SIMPLE or BULK_LOGGED. If you set the database recovery model to FULL, all insert methods will incur full logging. Of course, before you set your database recovery model to something other than FULL, you need to make sure the setting is acceptable for your environment’s recovery needs. For more information about recovery models, see SQL Server Books Online.

SELECT INTO. The only requirement necessary to allow minimal logging for the SELECT INTO statement is to set the recovery model to SIMPLE or BULK_LOGGED. Prior to SQL Server 2008, the regular INSERT SELECT statement couldn’t benefit from minimal logging; therefore, SELECT INTO had an advantage over INSERT SELECT. However, SELECT INTO has several shortcomings compared with INSERT SELECT.

SELECT INTO creates the target table and populates it with data, combining the Data Definition Language (DDL) and the Data Manipulation Language (DML) activities into one statement. SELECT INTO doesn’t give you control over definition of the target table—this is dictated by the queried source. Also, in SQL Server—unlike in some other database platforms—both DDL and DML are transactional. The implication is that while a SELECT INTO statement is running, both the data involved in the DML part and the metadata involved in the DDL part (rows written to system tables) are exclusively locked. So if you try to query the system catalog while a SELECT INTO statement is in progress, and your query attempts to grab locks on metadata rows that are locked by the SELECT INTO transaction, your request will be blocked. Even though a minimally logged SELECT INTO statement can run substantially faster than a fully logged INSERT SELECT statement, the SELECT INTO statement can take a long time to run if you’re dealing with large volumes of data.



ARTICLE TOOLS

Comments
  • Zarzuela
    2 years ago
    May 10, 2010

    you talk about some times performance penalty when using the T-610 but don't specify when or why.

    Some guidance?

  • Zarzuela
    2 years ago
    May 10, 2010

    Incredible article, as usually, but the links are bloken (figures and listings), aren't them?

  • Steve
    3 years ago
    Oct 01, 2009

    I should clarify that all I care to minimally log are data warehouse INSERTs. Thus, I can get away with the simple rule of "...ignoring 'Log Record Length' and simply look at aggregate counts of LOP_INSERT_ROWS."

  • Steve
    3 years ago
    Sep 30, 2009

    lovalvob...
    Itzik mentioned TF 610, but his code does not use it. Giving credit where it's due, he also pointed to Sunil's blog, where you may find the following helpful: http://blogs.msdn.com/sqlserverstorageengine/archive/2008/03/23/minimal-logging-changes-in-sql-server-2008-par
    t-2.aspx

    TF 610 is responsible for virtually all the magic shown (INSERT INTO non-empty clustered index w/out TABLOCK, yet no meaningful logging!)

    BTW, I always have to keep in mind that BULK INSERT (my preference) behaves differently than INSERT INTO, with respect to locking. For more on BULK INSERT's behavior, go to part 3 of Sunil's blog post.

    As Byzantine as minimal logging is, the real trick is detecting minimal logging--or a lack thereof. Itzik's code makes a great contribution in this area. Personally, however, I ignore "Log Record Length" and simply look at aggregate counts of LOP_INSERT_ROWS. Anything less than the number of rows actually inserted means that minimal logging took place...or that you're pulling from a truncated log :)

  • Barry
    3 years ago
    Sep 23, 2009

    Very useful, but it appears to be a good deal less effective when there are indexes (even non-clustered). Can you comment?

You must log on before posting a comment.

Are you a new visitor? Register Here