Datetime calculations are at the heart of many databases. Every day, programmers face
challenges related to the manipulation of datetime data. Nearly every time SQL Server Magazine's editors and authors brainstorm about subjects that we should cover
in the magazine, we agree about the need to cover datetime-related subjects simply
because they're so practical. With that in mind, I'd like to begin a series of articles in
which I discuss datetime challenges and calculations. This
month, I focus on trimming techniques—that is, returning
only the date or the time of a given datetime value.
Datetime Storage Format
One of the most common challenges in working with datetime datatypes (e.g., DATETIME, SMALLDATETIME) in SQL Server is that there's no separation between date
and time. But, of course, you often have the need to store just dates or just times. Storing
your dates and times in datetime datatypes has several advantages over storing them in
other datatypes (e.g., character strings): You get inherent integrity enforcement (i.e., invalid
values are rejected), and you can use the datetime functions in your calculations.
When you need to store only dates or only times, the trick is to trim the part you don't
need. However, because the datetime datatypes include both a date and a time portion,
you won't actually be trimming; rather, in practice, you'll be zeroing the irrelevant part.
The storage format that SQL Server uses internally to represent datetime values is two
4-byte integers for DATETIME and two 2-byte integers for SMALLDATETIME. One
integer is an offset in terms of days from the base date January 1, 1900, and the other is an
offset from midnight (in terms of clockticks for DATETIME—in which one clocktick is
31/3 milliseconds—and in terms of minutes for SMALLDATETIME). When you need to
store only dates, you'll store a date at midnight; technically, you'll "zero" the time portion
of the datatime value. Knowing that the time portion is always midnight in the values
you're manipulating, you can simply ignore it. By doing so, you'll be able to easily work
with dates.
Notice what you get when you convert a character string containing only a string
representation of a date to DATETIME:
SELECT CAST('20070212' AS DATETIME);
You get the output 2007-02-12 00:00:00.000. SQL Server assumes midnight as the time
value. Internally, it stored 0 as the integer representing the offset from midnight. Assuming you specify dates with midnight in the
time portion when you store them in a
DATETIME column called date_col in a
table, when you want to filter rows with a
certain date (e.g., February 12, 2007) you'll
use the filter
WHERE date_col = '20070212'
The column name date_col that appears to
the left of the equals sign is a DATETIME,
and the literal (i.e., constant) to the right
of the equals sign is a character string (i.e.,
VARCHAR) that contains only a date.
DATETIME has a higher datatype precedence than VARCHAR, so SQL Server
will implicitly convert the VARCHAR
value to DATETIME. Because no time
component was specified in the literal,
SQL Server will assume midnight as the
time component, and thus there's basis for
comparing date_col to a character string
that contains only the date component.
Similarly, if you want to store only times,
you can zero the integer that represents the
offset from the base date; in other words,
you store the times with the base date.
Notice what you get when you convert a
character string that contains only a time
to DATETIME:
SELECT CAST('01:23:43.210' AS
DATETIME);
You get the output 1900-01-01
01:23:43.210. SQL Server assumes the
base date as the date value. Internally,
it stored 0 as the integer representing
the offset from the base date. Assuming
you stored times with the base date in a
DATETIME column called time_col in a
table, when you want to filter rows with
a certain time (e.g., 01:23:43.210), you'll
use the filter
WHERE time_col = '01:23:43.210'
Again, SQL Server will implicitly convert the
literal that appears to the right of the equals
sign to DATETIME assuming the base date,
and thus the values are comparable.
Extracting Date Only
Now that you understand the storage format
of datetime datatypes and the fact that date
and time are technically inseparable, you can
start handling common calculation needs.
Suppose you need to extract only the date
portion from a datetime value—for example,
GETDATE(), which returns the system's
datetime. You need to produce a datetime
value with the input date at midnight.
You can perform this calculation in three
ways. In the first technique
SELECT CAST(
CONVERT(CHAR(8), GETDATE(),
112) -- 'YYYYMMDD'
AS DATETIME);
the CONVERT function converts the
input datetime value to a character string
using style 112 (YYYYMMDD). This style
extracts only the date portion from the input
value. The CAST function converts the date
character string back to DATETIME. When
a character string expressed in this format is
converted to a datetime datatype, it's independent of any language- or date-related
settings that are in effect for your session.
The second technique that lets you set the
time portion to midnight is to convert the
input datetime value to an integer, subtract 0.50000004, and convert the result back to
datetime:
SELECT CAST(CAST(GETDATE()-
0.50000004 AS INT) AS
DATETIME);
When a datetime value is converted to an
integer, SQL Server returns the offset in terms
of days from the base date; the time portion
is rounded down to 0 days if it's smaller than
or equal to 11:59:59.993 and otherwise up to
one day. By subtracting 0.50000004 portion
of a day from the input datetime value, you
compensate for cases in which the time portion is later than 11:59:59.993, in which case it would have otherwise been rounded up to
the next day. When converting an integer to
a datetime, SQL Server simply assumes this
integer as the offset from the base date, and
stores 0 as the other integer representing the
offset from midnight. Although this expression is short (and efficient, as I'll demonstrate
shortly), I have to say that I feel uneasy with
it. I'm not sure I can put my finger on exactly why—maybe because it's too technical, and
you can't see datetime-related logic in it.
I like the third technique best of all. I
learned it from SQL Server MVP Steve Kass.
It's very cool! Here goes:
SELECT DATEADD(day, DATEDIFF(day,
'19000101', GETDATE()),
'19000101');
The DATEDIFF function calculates the offset
in terms of days between the base date—
January 1, 1900—and the input date—GETDATE(). Call that offset diff. The DATEADD
function adds diff days to the base date. And
you have the input date at midnight. The
anchor date doesn't have to be the base date
of January 1, 1900. The important thing to
remember is that you should use the same
date in both the DATEDIFF function and
the DATEADD function.
Remember that there's no way for you
to specify a datetime literal; rather, here you specify a character string (i.e., '19000101')
that SQL Server will implicitly convert to a
datetime datatype. Similarly, you can specify
an integer value representing an anchor date.
Remember that converting the integer 0 to
a datetime yields the base date at midnight.
Bearing this in mind, you can shorten the
expression to
SELECT DATEADD(day, DATEDIFF(day,
0, GETDATE()), 0);
I ran a test to compare the performance
of the three techniques that I've presented.
Listing 1 shows the code I used, and Table 1 shows the performance measures.
I ran the calculation in a loop of
1,000,000 iterations.
After subtracting the overhead time
involved with the code surrounding
the actual calculation, the first technique appears to be the slowest,
taking. more than twice as long as
the second and third techniques.
The second technique seems to be
the fastest, but it's just a bit faster
than the third technique, which I
believe to be the most elegant.
Prev. page  
[1]
2
next page