• subscribe
May 14, 2009 12:00 AM

How to Determine the Average Number of Days Between Orders or Other Important Events

Use the ROW_NUMBER function for fast queries
SQL Server Pro
InstantDoc ID #101924
Downloads
101924.zip

I've been working on a large B2B Customer Relationship Management (CRM) program. One important Key Performance Indicator (KPI) we wanted to develop for the program was the average number of days between orders. We wanted this KPI to determine the customers' usual purchase patterns so we could answer questions such as "Do some customers purchase more often than the average customer?" and "Is the average time between purchases increasing?" (An increase could be a sign that the customer is at risk for leaving the program.)

Determining the average number of days between orders involves finding two consecutive orders from the same customer and calculating the number of days between them. The first solution I tried used a looping algorithm that worked something like this:

  1. Find the first order for a customer and save the order date in a temporary variable.
  2. Find the next order and save that order date in a second temporary variable.
  3. Find the difference between the two dates and save that value.
  4. Move the second temporary variable's value into the first temporary variable, retrieve the next record, and save that record's order date in the second temporary variable. Find the difference between the two dates and save that value.
  5. Continue looping through all the records for a customer, comparing the dates from consecutive orders until you encounter the next customer's records.

Not only were the algorithm queries complicated to write, they didn't perform very well. When I tested the algorithm against the FactInternetSales table, it took more than four minutes to complete on my server running SQL Server 2005 SP2. FactInternetSales has only around 60,000 records and 19,000 customers. I knew that the performance of this algorithm would only get worse as these numbers got larger.

I decided to try a different approach. I began by using the ROW_NUMBER function with an OVER clause to get a chronological list of each customer's orders. After I saved the output to a temporary table, I used an INNER JOIN operation to bring adjacent records together. Callout A in Listing 1 shows this query.


Listing 1: DaysBetweenOrders.sql


As Figure 1 shows, the query produces a list of orders sorted by date for each customer and an additional column that specifies the sequence of each order in each customer list.


Figure 1: Sample results from the query that produces a chronological list of each customer's orders




ARTICLE TOOLS

Comments
  • RICHARD
    3 years ago
    Jul 09, 2009

    The algorithm can be simplified significantly. All you need is the number of orders and the first and last order dates:

    AvgDaysBetweenOrders = (LastOrderDate - FirstOrderDate)/(NumberOfOrders - 1)

    This is very similar to what jcelko has proposed.

  • Joe
    3 years ago
    Jun 28, 2009

    Why not pick a window (start_date end_date) and use (size of window in days) / (# of orders in window) = (avg days between orders)? Ex: If you look at a 30 day window and you had 5 orders ANY TIME in the window, then you averaged one order every 6 days. Draw a time line on paper, represent your orders as counters on the line and it is easier to see.

  • Mike
    3 years ago
    May 28, 2009

    Awesome Article....I will be using this solution over and over again....Thanks

You must log on before posting a comment.

Are you a new visitor? Register Here