• subscribe
February 28, 2000 10:32 AM

Creating Horizontally Partitioned Views

SQL Server Pro
InstantDoc ID #8234
Downloads
8234.zip

Improve query performance and maintain high data availability

Every database management system (DBMS) developer eventually encounters a situation in which a table stores a huge amount of historical data but users retrieve only small, logically distinct portions at a time. For example, an international trading firm's database might have tables containing millions of records' worth of trades spanning a year or more, but users typically need to retrieve data for only one month at a time. To improve query performance, a DBA or a developer often splits a large table into separate tables with the same structure, each containing data restricted to a given logical separation. For instance, the trading firm's database could take the form of 12 tables, one for each month. If the yearly total was 120 million records, you might expect an average of 10 million records per table. Most users will issue queries that span a month or less, so they can choose the monthly table they want to query and sort through 10 million records instead of 120 million.

Querying smaller tables is more efficient, but it complicates the creation of front-end applications and reporting tools that need the hard-coded table names to relate to a particular query. Worse yet, if any queries span more than a month, creating programs in a robust fashion becomes almost impossible without arbitrarily applying UNION statements throughout the code.

Fortunately, SQL Server 7.0 has a feature known as horizontally partitioned views, which provides a highly effective alternative to that complexity. A horizontally partitioned view lets you divide a large table into smaller sub-tables yet provide only one view as the means of interacting with all the tables. A user or developer needs to query only one view to extract records from any of the sub-tables. In addition to the convenient interface, this type of partitioned view provides a considerable performance boost.

However, besides a brief description in SQL Server Books Online (BOL) and a small mention in the SQL Server resource kit, this useful and elegantly implemented feature is surprisingly sparsely documented. To remedy that lack, let's look at what horizontally partitioned views are, how to implement them, and how to distribute data into the correct tables automatically.

Create Horizontally Partitioned Views
To improve query performance, you could split a large table, such as the one with 120 million records, into 12 more-manageably sized tables, each containing one month's records. Because the table maintains the same structure but is partitioned along row-level boundaries—in this case, by month—it is horizontally instead of vertically partitioned. To treat all these tables as one object, you need to rejoin them with a partitioned view. The view definition in Listing 1 shows the fundamental structure of a partitioned view. Issuing queries against one object instead of 12 is one big advantage this method offers when you're creating applications and reports.

For the query plan to work most efficiently, the SQL Server engine needs a guarantee that each table contains only the data it was intended to contain. To obtain this guarantee, use check constraints. You need to create each table with the constraints in the manner that Listing 2 illustrates. The listing includes table and index creation data-definition language (DDL), but note that the check constraint is designed to ensure that the table includes only data for a particular month—January, in this example. This check constraint on the trade date guarantees that each table contains only trades for a given month. Now, when someone issues a query such as

SELECT * FROM V_TRADES 
WHERE
TRADEDATE  BETWEEN '1/10/1999 AND '1/15/1999' 
AND   TRADETYPE = 'B'

the SQL Server engine will quickly examine all the tables that compose the view and determine through the check constraints that only the TRADE_011999 table might contain records that match the date portion of the WHERE clause. The engine will then query the TRADE_011999 table without attempting to select any records from the other tables that form part of the union.

Populate the Tables
After you've established partitioning as a method for seamlessly improving query performance, you still need a way to place the trades in the correct table. Conceivably, trade data might come in from the end of the previous month, from the current month, or from deferred trades that occurred 2 months ago. For the table population to work as seamlessly as the queries, you must create a stored procedure that can tell which table the record belongs in based on information in the record—in this case, the trade date. Because the table name the insert statements will use is variable, the most efficient way to create a stored procedure for this purpose is by using the sp_executesql system stored procedure, as Listing 3 shows. The example in Screen 1 shows a query on data from February 1999. The query plan shows that SQL Server will scan only the February table and check only the month constraint for the other tables.

A partitioned structure for large tables provides some benefits that go beyond query performance. By partitioning tables, you can easily archive old data in removable media as the months progress. For example, if you divide a 7GB table into 12 tables of 600MB each, you can store each month on a CD-ROM. Without this partitioning, an organization faces the daunting prospect of copying data from the main table to an archive table and then deleting the copied rows. In a system that runs 24 x 7, the locks such an operation generates might render the database inaccessible for an unacceptably long time. Horizontal partitioning helps you maintain high availability without costly overhead.



ARTICLE TOOLS

Comments
  • asomar
    8 years ago
    Jul 28, 2004

    I creared partitioned view, i see performance gain for reading data but performance degrades for my inserts.

  • marcelo
    10 years ago
    Sep 21, 2002

    I created a partitioned view on 12 tables, one for each month. The primary key of each table is (Date, id) and is clustered. I created a check constraint on date for each table. I did some tests to check the execution time of the same query over the table with data of the whole year and the partitioned view created. For my surprise the second was much bigger! Checking the execution plan of the query over the view it makes a index seek for all the tables, including the ones with date out of the between range. Is ir right? How to make sure The view I created is considered partitioned?

  • LUCY KHOVYEVA
    11 years ago
    Aug 20, 2001

    Hi,
    My name is Lucy. I've read your article about creating horizontally partitioned views (article dated April 2000 in SQL Server magazine).
    I working with an application which has a similar problem- large table, but user retrieves only small amount of data. Of course performance is horrible. Partition views would be definitely a solution for our intent to improve performance. Your article was really helpful, but it leaves a few questions unanswered. I've tried to search for an answer on web, but could not find anything. You would be a real help if you just take a few minutes and answer on two of my questions.

    1. What is the maximum number of tables can be used in single horizontally partitioned view?

    2. Does horizontally partitioned view technique works with queries where inner join is used for specifying the criteria?
    Ex:
    select * from vwJobAnalysis inner join tblPeriod on vwJobAnalysis.per_end_dt =tblPeriod.per_end_dt

    It seems that SQL Server scans all tables even when only one table has a data that matches the criteria in table tblPeriod. (I've created a few tables, all tables are have the same structure and no indexes. Each table has a constraint on field "per_end_dt" equals to specific period). SQL Server scans only one table if a change my syntax to :

    SELECT * FROM vwJobAnalysis WHERE per_end_dt IN ("08/15/2001")

    Please let me know if I did something wrong.
    -Thanks in advance.
    Lucy Khovyeva.

  • Zeev
    11 years ago
    May 02, 2001

    Where can i Fined Listing 1 and Listing 2 ??

  • david freibrun
    12 years ago
    Aug 28, 2000

    Do horizontally partitioned views work with a large number of tables? Say, over 100? I created the view with just 10 tables and defined contraints on the tables and the results were considerably slower than when I was querying an individual table. I need to find out if I am missing something.

You must log on before posting a comment.

Are you a new visitor? Register Here
  • SP1?
    I know there is a SP1 for SQL 2008 R2 available....and there is a "feature pack" as well... ...
  • SQL database mirroring
    I have SQL Server 2008 R2 Enterprise 64bit on Windows 2008 R2 Enterprise 64bit.  Each SQL Server has...
  • Dell Compellent Disk Drive
    Does anybody has experience with Dell Compellent Disk Drive? Basically, this system manages all disk...
  • Sql server performance tuning
    I need to find a tool that help me to optimize sql server,queries,improve the performance and solve ...