• subscribe

Solutions to VLT concerns around statistics and maintenance!


Posted @ 2/9/2012 1:49 PM By Kimberly L. Tripp

 

Question: Partitioned tables were the new shiny feature in SQL Server 2005 – why are partitioned views even still available? Are there any benefits that they provide?

Answer: In the first post (Partitioned Tables v. Partitioned Views–Why are they even still around?) I raised some of the general questions and concerns that come with VLT (very large table). Today, I’ll start to tackle why partitioned views can be a fantastic choice for partitioning large sets – even for new design.

When I first posted, some folks emailed me and said that the most obvious reason to choose partitioned views is that they don’t require Enterprise Edition. While that’s a fantastic reason, it’s still not the only reason (or my favorite). The most important reason for me is that there are still a few concerns around VLT that even partitioned tables don’t solve. In this post, I’ll tackle two concerns with partitioned tables.

The first issue is related to statistics. Statistics always cover the entire set described by the index (or statistic). For the purpose of this discussion, I’m going to focus primarily on table-level statistics (meaning statistics that are NOT filtered). And, while you can use filtered statistics with partitioned tables – there are significant limits to fast-switching. As a result, they might solve one problem yet introduce another. So, why are table-level statistics an issue? Really, in concept, they’re not. But, remember, a statistic must be relatively small to be useful. With larger and larger tables (especially those that have a lot of uneven data distribution), it becomes more and more difficult for a statistic to be accurate.

Additionally, it also takes longer and longer for the statistic to get updated. There is a special trace flag available to change the invalidation threshold for statistics (so that they don’t wait as long to get updated) but there’s nothing that changes their accuracy (except filtered stats – which then eliminate fast-switching if the filtered statistic isn’t over the entire set). And, unfortunately there’s also an issue with regard to how/when filtered statistics get updated. Simply put, you’ll need to manage their updating in order to keep them more accurate. So, none of the solutions to make statistics for VLT more accurate really work.

However, what does solve the problem? Keeping your tables smaller and more manageable. Conceptually, this IS “partitioning” but partitioned tables don’t solve the problem. Instead, partitioning your data into “partitioned views” is exactly that solution. Instead of having 20 years’ worth of sales in one VLT, consider one table per year. These “yearly” tables could each be partitioned tables or just standalone, non-partitioned tables. Then, how do you query 20 tables? If you constrain the tables by date (possibly by ID as well if there’s a correlation) and then use a UNION ALL view to bring them together, then your queries can do partition elimination. As long as your queries supply either the date (or the ID, if the base tables are also constrained by ID) then the optimizer can compare your WHERE clause to the constraints that exist. If the tables have been “checked” against the constraint then the constraint is deemed trusted. Trusted constraints can be used by the optimizer to eliminate partitions. Tables without any relevant data will not be accessed. This is the definition of partition elimination. The end result for statistics is that each table’s table-level statistic will be more accurate because the data set over which it covers is smaller.

This brings me to the second issue – maintenance. While partitioned tables do offer some maintenance benefits (fast-switching for data loads and/or data removal) they do not offer online rebuilds at the partition level (in any release or edition – even SQL Server 2012). So, if you have a single, VLT of 20 years’ worth of data with only the most recent/current data being updated then you only need to defragment that last year. In a single partitioned table you’ll need to take the partition offline to rebuild it. If you have separated this into 20 individual years of data then you can rebuild the most current year (at the table-level) online. You can always do only reorganization of your data but a rebuild cannot be done at the partition level as an online operation. Online operations are only supported at the table-level.

So, for part 2 – those are two of my favorite reasons to consider partitioned views over partitioned tables! I’ll tackle a bit more around this scenario in part 3.

Keep those comments/emails coming!
kt

Related Content:

Comments

Add A Comment
  • Posted @ February 23, 2012 05:41 PM by Kimberly L. Tripp

    Thanks Rowland! It's funny when you put everything together... often I'll end up with PVs and PTs together but rarely PTs alone. There are just too many benefits to splitting tables into smaller tables (even if they are partitioned).

    Cheers,
    kt

  • Posted @ February 10, 2012 08:04 AM by Rowland Gosling

    Hard to argue against your proposition that partitioned views are better management tools for VLTs.

    Nicely done Kimberly.

    Rowland

You must log on before posting a comment.

Are you a new visitor? Register Here


More about Kimberly and Paul . . .

Paul worked on the SQL Server team for nine years in development and management roles, writing many of the DBCC commands, and he was ultimately responsible for SQL Server 2008’s core storage engine. Paul blogs at SQLskills.com/blogs/Paul.

Kimberly worked on the SQL Server team in the early 1990s as a tester and writer before leaving to found SQLskills and embrace her passion for teaching and consulting. Kimberly blogs at SQLskills.com/blogs/Kimberly.

They've written Microsoft white papers and books for SQL Server 2008, 2005 and 2000, and they're regular, top-rated presenters worldwide on database maintenance, high availability, disaster recovery, design, performance tuning, and SQL Server internals. Together they teach SQLskills Immersion Events, which are wildly popular classes that teach the essential skills required to be an effective DBA; these classes also cover the required topics of the Microsoft Certified Master (MCM) certification.


Check out Kimberly and Paul’s SQLskills website for more information about their upcoming speaking engagements.