To test the function, run the following code:
SELECT *
FROM dbo.fn_HistSteps(‘20080501
00:00’, ‘20080511 00:00’) AS S
ORDER BY n;
You should get the steps shown in Web Table 1,
without the num_events column.
To test the function with nonround range boundary
points, query it with the following inputs:
SELECT *
FROM dbo.fn_HistSteps(‘20080501
12:30’, ‘20080510 10:00’) AS S
ORDER BY n;
You should get the steps shown in Web Table 2; again,
excluding the num_events column.
Now that your function works for one-day intervals,
you can add logic to support a requested date
part (@date_part) and number of parts (@num_parts).
The revision to the function isn’t complicated. You
need to substitute all expressions that currently use
the date part day with a CASE expression that uses
the requested date part. Also, when calculating diff,
you’ll to need divide the value by @num_parts as part
of the expression. To create the revised fn_HistSteps
function, run the code in Web Listing 4. Note that if
the function is invoked with an unrecognized part, the
CASE expressions will default to ELSE NULL, the
query filter will filter out all rows, and the function will
return an empty set.
Now you can specify the date part and the number
of parts as inputs. For example, to get a steps table
for the range ‘20080501 00:00’ - ‘20080502 00:00’,
with four-hour intervals, you’d query the function as
follows:
SELECT *
FROM dbo.fn_HistSteps(‘20080501
00:00’, ‘20080502 00:00’, ‘hour’,
4) AS S
ORDER BY n;
Producing the Actual
Histogram
Most of the work is now behind you; what’s left is to
join the fn_HistSteps function with the Events table
to match steps and events, group the results by step,
and return the count of active events in each step. To
check whether an event (starting at event_start and
ending at event_end) overlaps with a step (starting at from_dt and ending at to_dt), you can use the following
predicate:
event_start < to_dt AND event_end >
from_dt
Note that you can use <= instead of <, and >= instead
of > depending on how you want to treat the boundary
point itself (inclusive versus exclusive).
Here’s the full query that would give you a daily
histogram for the range ‘20080501 00:00’ - ‘20080511
00:00’, producing the output in Web Table 1:
SELECT n, from_dt, to_dt,
COUNT(event_id) AS num_events
FROM dbo.fn_HistSteps(‘20080501
00:00’, ‘20080511 00:00’, ‘day’,
1) AS S
LEFT OUTER JOIN dbo.Events AS E
ON E.event_start < S.to_dt
AND E.event_end > S.from_dt
GROUP BY n, from_dt, to_dt
ORDER BY n;
Notice that an OUTER JOIN is used here instead of
an INNER JOIN in order to return empty steps (steps/
intervals with a num_events value of 0) as well.
The following query returns a daily histogram for
the range ‘20080501 12:30’ - ‘20080510 10:00’, producing
the output in Web Table 2:
SELECT from_dt, to_dt, COUNT(event_
id) AS num_events
FROM dbo.fn_HistSteps(‘20080501
12:30’, ‘20080510 10:00’, ‘day’,
1) AS S
LEFT OUTER JOIN dbo.Events AS E
ON E.event_start < S.to_dt
AND E.event_end > S.from_dt
GROUP BY from_dt, to_dt
ORDER BY from_dt;
And finally, the following query returns a fourhour
step histogram for the range ‘20080501 00:00’
- ‘20080502 00:00’, producing the output in Web
Table 3:
SELECT n, from_dt, to_dt,
COUNT(event_id) AS num_events
FROM dbo.fn_HistSteps(‘20080501
00:00’, ‘20080502 00:00’, ‘hour’,
4) AS S
LEFT OUTER JOIN dbo.Events AS E
ON E.event_start < S.to_dt
AND E.event_end > S.from_dt
GROUP BY n, from_dt, to_dt
ORDER BY n;
One Step at a Time
Dealing with temporal data can be quite tricky. In fact,
I’ve written extensively about datetime manipulation
(see the Learning Path). When you face such challenges
in which the solution isn’t trivial, it’s important to
break the problem into steps as I’ve done in this article.
In addition, when handling all of a task’s requirements
at once is too complicated, a useful approach is to relax
some of the requirements, solve a simpler form of the
problem, then reintroduce the complexity layers.
End of Article
Prev. page
1
[2]
next page -->