Each range of values in a partition is restricted by
boundaries that are specified in the FOR VALUES
clause. Note that if you’re using datetime data types
for boundary values and your company has an office in
Europe, you’ll have to decide on an international standard
for datetime, so that it’s uniform across your company.
SQL Server assumes that us_english is the default
language for the session, so if that’s not the case, you’ll
want to create a user-defined function (UDF) that will
convert various date formats into us_english, and reference
that UDF in the FOR VALUES clause. You don’t
have to use literals in the FOR VALUES clause; you
can reference variables, functions, and UDFs.
Creating a Partition Scheme
Now that you’ve created a partition function, you need
to create a partition scheme. The partition scheme
maps partitions to various file groups, as shown in the
following command:
CREATE PARTITION SCHEME
MyPartitionScheme
AS MyPartitionFunction
TO (MyFilegroup1, MyFilegroup2,
MyFilegroup3, MyFilegroup4,
MyFilegroup5)
MyPartitionScheme is the name of the partitioning
scheme, and MyPartitionFunction refers to the partition
function. This command maps the boundary
values into partitions that are then assigned to one
or more file groups. Data rows with Date_of_Event
datetime values prior to 1/01/03 are assigned to MyFilegroup1.
Rows with datetime values greater than or
equal to 1/01/03 and prior to 1/01/05 are assigned to
MyFilegroup2. Rows with datetime values greater than
or equal to 1/01/05 and prior to 1/01/07 are assigned
to MyFilegroup3. All other rows with datetime
values greater than or equal to 1/01/07 are assigned to
MyFilegroup4.
For each set of boundary values (remember,
boundary values are in the FOR VALUES clause
of the partition function statement), there will be
(# of boundary values + 1) partitions. The previous
CREATE PARTITION SCHEME statement includes
three boundary values and four partitions. No matter
whether the partitions are created RANGE RIGHT or
RANGE LEFT, there will always be (# of boundary
values + 1) partitions, up to 1000 partitions per table.
So why are there five file groups instead of four in this example? (Remember, the partition function statement
had only three boundary values.) The fifth file group is
the optional “next-used” file group. Let’s look at how
this file group is used and how important it is in the
partition scheme.
In the CREATE PARTITION FUNCTION
example, the last boundary value is 1/01/2007, so
records with a Date_of_Event value greater than or
equal to the 1/01/2007 will be stored in partition 4
(P4). When 1/01/2009 rolls around, you’ll have to
create a new partition to maintain the storage scheme
that you’ve established. If your original CREATE
PARTITION SCHEME statement didn’t include a
next-used file group, you wouldn’t be able to split P4
into P4, which will hold data from 2007 through the
end of 2008, and partition 5 (P5), which will hold data
from 1/01/2009 and beyond. If your partitioning plan
involves periodically creating new partitions to hold
new data, as in this example, make sure you include the
next-used file group in your CREATE PARTITION
SCHEME statement.
You don’t have to assign one partition to one file
group, as I did in Figure 4. Instead, you can map multiple
partitions to a single file group or you can map all
the partitions to one file group. However, you can’t map
one partition to multiple file groups.
Creating the partition scheme might be the most
important step in the partitioning process. In the
future, you might want to combine the data from two
adjacent partitions into one partition, add a boundary
value to an existing partition, or move data from a
populated partition into an empty partition. To perform
these operations, you’ll want to do some advance
planning and create the partition scheme so that it
will support these activities. SQL Server 2005 Books
Online (BOL) provides some good planning resources,
including “Planning Guidelines for Partitioned Tables
and Indexes” (msdn2.microsoft.com/en-us/library/ms180767.aspx)
Creating a Partitioned Table
Creating a partitioned table isn’t all that different from
creating a regular table; you just have to reference the
name of the partition scheme in the ON clause. To do
so, run the command
CREATE TABLE SALESFact
(SalesFact_KEY bigint identity
(1, 1) primary not clustered NOT
NULL,
… all the other columns in the
table, ending up with the
partitioning column… ,
Date_of_Event datetime
NOT NULL)
ON MyPartitionScheme (Date_of_Event)
By specifying the name of a partition scheme, you’re
indicating that this table is a partitioned table. Obviously,
the partition scheme and the partition function
have to exist in the database before you can create the
table.
You can combine the data from multiple partitions
into one partition. However, you can only merge two
adjacent partitions at a time, so if you’re trying to “unpartition”
a table that has multiple partitions, you’ll
have to repeat this step many times. To merge two
partitions, run the command
ALTER PARTITION FUNCTION
MyPartitionFunction()
MERGE RANGE (‘1/01/2003’)
This command will merge the partitions as follows:
partition 1 (P1) will merge into P2, meaning that P2 will
then contain all rows with a Date_of_Event value prior
to 1/01/05. Internally (i.e., in the sys.partitions system
table), the partitions will be renumbered, starting with
one (not zero). P1 and P2 will become P1, P3 will
become P2, and P4 will become P3. I recommend
sketching out the DDL before you merge any partitions
because you could potentially cripple your operations
for sustained periods of time if you’re not careful
when you perform these merges. In fact, Eric Hanson,
Microsoft’s lead program manager for query processing
on the SQL Server Relational Engine Team, recommends
performing splits and merges only on empty
partitions, even if that means you have to temporarily
empty a partition.
Easily Manage Monster Tables
SQL Server 2005’s ability to horizontally partition
a table into non-overlapping sections and place
each section on a separate partition makes it easy to
manage large fact tables. Because the SQL Server 2005
Enterprise and Developer Editions are partitionaware,
properly-written queries will access only the
rows within the relevant partition, thus running much
faster than if they had to traverse the contents of the
entire table.
End of Article
Prev. page
1
[2]
next page -->