• subscribe
December 13, 2011 09:54 AM

Microsoft SQL Server 2012: How To Write T-SQL Window Functions, Part 2

Using offset and distribution functions
SQL Server Pro
InstantDoc ID #141022
Last month, I started a series of articles about the profound window functions and their support in SQL Server 2012. I explained the concept of SQL windowing, I described the elements involved in window specifications (partitioning, ordering, and framing), and I discussed the difference between the two window frame unit options ROWS and RANGE. I showed examples using window aggregate functions. This month, I cover two other types of window functions: offset and distribution. If you missed part 1 of this article series, see "SQL Server 2012's Window Functions, Part 1."

As a reminder, you need to use SQL Server Denali CTP3 or later to run the sample code for this series of articles.  You also need to install the sample database TSQL2012. You also need to download the source code file that creates and populates the sample database.

Window Offset Functions

Window offset functions let you return a value from a row that's in a certain offset from the current row (LAG and LEAD) or from the first or last row in the window frame (FIRST_VALUE and LAST_VALUE). Let's start with the LAG and LEAD functions.

Window offset functions LAG and LEAD. The LAG function returns a value from a row in the window partition that, based on the window ordering, is the specified number of rows before the current row. Similarly, the LEAD function returns a value from a row in the window partition that, based on the window ordering, is the specified number of rows after the current row. By default, the LAG and LEAD functions assume an offset of one row if an explicit offset wasn't specified.

You indicate the value you want to return from the row in question as the first argument to LAG and LEAD. If you want to indicate an explicit offset, you indicate it as the second argument to the function. If a row isn't found in the requested offset, the functions return a NULL. If you want to return a different value in case a row isn't found, you can indicate such a value as the third argument to the function.

As an example, the following query returns for each customer order the value of the customer's previous order (LAG), as well as the value of the customer's next order (LEAD):

USE TSQL2012;

SELECT custid, orderdate, orderid, val,

LAG(val) OVER(PARTITION BY custid

ORDER BY orderdate, orderid) AS prevval,

LEAD(val) OVER(PARTITION BY custid

ORDER BY orderdate, orderid) AS nextval

FROM Sales.OrderValues; 

Figure 1 shows the output of this query.

Because the calculations are supposed to be performed for each customer independently, the functions partition the window by custid. As for window ordering, it's based on orderdate and orderid as a tiebreaker. Observe that the functions rely on the default offset 1 and return NULL when a row isn't found in the applicable offset.

You can freely mix in the same expression detail elements from the row as well as calls to window functions. For example, the following query computes the difference between the customer's current and previous order values, as well as the difference between the customer's current and next order values:

SELECT custid, orderdate, orderid, val,

val - LAG(val) OVER(PARTITION BY custid

ORDER BY orderdate, orderid) AS diffprev,

val - LEAD(val) OVER(PARTITION BY custid

ORDER BY orderdate, orderid) AS diffnext

FROM Sales.OrderValues; 

Figure 2 shows the output of this query.

As I mentioned, the default when an explicit offset isn't specified is 1 -- but you can indicate your own value as a second argument to the function. You can also indicate as a third argument what to return instead of a NULL when a row isn't found in the requested offset. Here's an example that specifies 2 as the offset and 0 as the value to return when a row isn't found:

SELECT custid, orderdate, orderid, val,

LAG(val, 2, 0) OVER(PARTITION BY custid

ORDER BY orderdate, orderid) AS prev2val,

LEAD(val, 2, 0) OVER(PARTITION BY custid

ORDER BY orderdate, orderid) AS next2val

FROM Sales.OrderValues; 

Figure 3 shows the output of this query.



ARTICLE TOOLS

Comments
  • Sam Bendayan
    4 months ago
    Jan 16, 2012

    Are the values in Figure 7 accurate? They show the same value for percentileDisc for all rows and only 2 distinct values in percentileCont.

You must log on before posting a comment.

Are you a new visitor? Register Here