Tuesday, February 14, 2012

Analytic sales forecast

Analytic functions have been very helpful at my work. One of the good examples is trying to forecast next years sales for each item taking into consideration seasonal variations and whether the item has been going up in sales the last years or declining. In this blog post I demonstrate how to use REGR_SLOPE for that purpose.

We have our sales statistics by month (though we're thinking about going to weekly data.)
So for the simple demo we create this very simple table:

SQL> CREATE TABLE sales (
  2   item  VARCHAR2(10),
  3   mth  DATE,
  4   qty  NUMBER
  5  )
  6  /

Table created.

We populate it with real data for the years 2008, 2009 and 2010 for two items:
  • Snowchain, which typically sells in the winter and has been going up in sales.
  • Sunshade, which sells in the summer and has been slightly declining.
(My demo script that populates the table can be downloaded here.)

Let us now pretend we are now in January 2011 and wishes to forecast the 2011 sales.
Let me start by showing the result we are aiming for (larger picture here) :













Snowchain (in blue) sells in the winter months. The dotted line shows it is generally going up in sales. The stapled line (forecast) is in a sense the 2010 curve moved 12 months to the right along the dotted line. Therefore the 2011 forecast of 882 is higher than 2010 sales of 691.

Similarly for Sunshade (in red), but the peaks are here in the summer months and the sales decline.

Let us see what we can do with REGR_SLOPE analytic function :

SQL> SELECT sales.item
  2     , sales.mth
  3     , sales.qty
  4     , REGR_SLOPE(
  5          sales.qty
  6        , EXTRACT(YEAR FROM sales.mth) * 12 + EXTRACT(MONTH FROM sales.mth)
  7       )
  8       OVER (
  9          PARTITION BY sales.item
 10          ORDER BY sales.mth
 11          RANGE BETWEEN INTERVAL '23' MONTH PRECEDING AND CURRENT ROW
 12       ) slope
 13    FROM sales
 14   ORDER BY sales.item, sales.mth;

ITEM       MTH               QTY      SLOPE                                     
---------- ---------- ---------- ----------                                     
Snowchain  2008-01-01         79                                                
Snowchain  2008-02-01        133         54                                     
Snowchain  2008-03-01         24      -27,5                                     
...
Snowchain  2010-10-01          1 -2,2743478                                     
Snowchain  2010-11-01         73 -2,3630435                                     
Snowchain  2010-12-01        160 -,99086957                                     
Sunshade   2008-01-01          4                                                
Sunshade   2008-02-01          6          2                                     
Sunshade   2008-03-01         32         14                                     
...
Sunshade   2010-10-01         11 ,217391304                                     
Sunshade   2010-11-01          3 -,20043478                                     
Sunshade   2010-12-01          5 -,57391304                                     

72 rows selected.

REGR_SLOPE gives us the slope of the linear extrapolation of the dots in the graph.
The vertical axis of the graph contains sales.qty .
The horizontal axis is a numeric representation of the month (unit 1 = 1 month.)
PARTITION BY considers each item separately.
ORDER BY defines the order the RANGE clause will consider.
RANGE then says to calculate the slope based on the current row and 23 months back (= 2 years data.)

Note: our slope keeps changing as it uses a sliding 2-year window (rather than the simplified dotted line in the graph above.) By calculating the slope with a sliding window we can handle items that used to go up in sales the first years we sold it, but then started to decline as for example new technology gets on the market.

The calculated slope we can then use for moving each graph point 12 months into the future:

SQL> SELECT item
  2     , mth
  3     , qty
  4     , qty + 12 * slope qty_next_year
  5    FROM (
  6   SELECT sales.item
  7        , sales.mth
  8        , sales.qty
  9        , REGR_SLOPE(
 10      sales.qty
 11    , EXTRACT(YEAR FROM sales.mth) * 12 + EXTRACT(MONTH FROM sales.mth)
 12          )
 13          OVER (
 14      PARTITION BY sales.item
 15      ORDER BY sales.mth
 16      RANGE BETWEEN INTERVAL '23' MONTH PRECEDING AND CURRENT ROW
 17          ) slope
 18     FROM sales
 19       )
 20   WHERE mth >= DATE '2010-01-01'
 21   ORDER BY item, mth;

ITEM       MTH               QTY QTY_NEXT_YEAR                                  
---------- ---------- ---------- -------------                                  
Snowchain  2010-01-01        167    188,313043                                  
Snowchain  2010-02-01        247    304,855652                                  
Snowchain  2010-03-01         42    96,3913043                                  
Snowchain  2010-04-01          0    42,6991304                                  
Snowchain  2010-05-01          0    30,8869565                                  
Snowchain  2010-06-01          0    19,0747826                                  
Snowchain  2010-07-01          0     7,2626087                                  
Snowchain  2010-08-01          1    -3,4295652                                  
Snowchain  2010-09-01          0    -16,121739                                  
Snowchain  2010-10-01          1    -26,292174                                  
Snowchain  2010-11-01         73    44,6434783                                  
Snowchain  2010-12-01        160    148,109565                                  
Sunshade   2010-01-01          2    -11,617391                                  
Sunshade   2010-02-01          8    -11,137391                                  
Sunshade   2010-03-01         28    9,11304348                                  
Sunshade   2010-04-01         26    8,86086957                                  
Sunshade   2010-05-01         23    9,66434783                                  
Sunshade   2010-06-01         46    39,1130435                                  
Sunshade   2010-07-01         73    79,4486957                                  
Sunshade   2010-08-01         25    31,7147826                                  
Sunshade   2010-09-01         13    18,0504348                                  
Sunshade   2010-10-01         11    13,6086957                                  
Sunshade   2010-11-01          3    ,594782609                                  
Sunshade   2010-12-01          5    -1,8869565                                  

24 rows selected.

For each month in 2010 we take the quantity and add the slope multiplied by 12 months.
That will give us the quantity expected for the same month next year.

But we like the numbers to be non-negative integers :

SQL> SELECT item
  2     , ADD_MONTHS(mth, 12) mth
  3     , GREATEST(ROUND(qty + 12 * slope), 0) forecast
  4    FROM (
  5   SELECT sales.item
  6        , sales.mth
  7        , sales.qty
  8        , REGR_SLOPE(
  9      sales.qty
 10    , EXTRACT(YEAR FROM sales.mth) * 12 + EXTRACT(MONTH FROM sales.mth)
 11          )
 12          OVER (
 13      PARTITION BY sales.item
 14      ORDER BY sales.mth
 15      RANGE BETWEEN INTERVAL '23' MONTH PRECEDING AND CURRENT ROW
 16          ) slope
 17     FROM sales
 18       )
 19   WHERE mth >= DATE '2010-01-01'
 20   ORDER BY item, mth;

ITEM       MTH          FORECAST                                                
---------- ---------- ----------                                                
Snowchain  2011-01-01        188                                                
Snowchain  2011-02-01        305                                                
Snowchain  2011-03-01         96                                                
Snowchain  2011-04-01         43                                                
Snowchain  2011-05-01         31                                                
Snowchain  2011-06-01         19                                                
Snowchain  2011-07-01          7                                                
Snowchain  2011-08-01          0                                                
Snowchain  2011-09-01          0                                                
Snowchain  2011-10-01          0                                                
Snowchain  2011-11-01         45                                                
Snowchain  2011-12-01        148                                                
Sunshade   2011-01-01          0                                                
Sunshade   2011-02-01          0                                                
Sunshade   2011-03-01          9                                                
Sunshade   2011-04-01          9                                                
Sunshade   2011-05-01         10                                                
Sunshade   2011-06-01         39                                                
Sunshade   2011-07-01         79                                                
Sunshade   2011-08-01         32                                                
Sunshade   2011-09-01         18                                                
Sunshade   2011-10-01         14                                                
Sunshade   2011-11-01          1                                                
Sunshade   2011-12-01          0                                                

24 rows selected.

Here we round the numbers and make sure they don't go negative.
Also we add 12 months to the date so we get the month the forecast is for.
And presto - we have seen the future :-)

We can then concatenate this to the actual sales if we wish :

SQL> SELECT item
  2     , mth
  3     , qty
  4     , type
  5    FROM (
  6   SELECT sales.item
  7        , sales.mth
  8        , sales.qty
  9        , 'Actual' type
 10     FROM sales
 11   UNION ALL
 12   SELECT item
 13        , ADD_MONTHS(mth, 12) mth
 14        , GREATEST(ROUND(qty + 12 * slope), 0) qty
 15        , 'Forecast' type
 16     FROM (
 17      SELECT sales.item
 18    , sales.mth
 19    , sales.qty
 20    , REGR_SLOPE(
 21         sales.qty
 22       , EXTRACT(YEAR FROM sales.mth) * 12 + EXTRACT(MONTH FROM sales.mth)
 23      )
 24      OVER (
 25         PARTITION BY sales.item
 26         ORDER BY sales.mth
 27         RANGE BETWEEN INTERVAL '23' MONTH PRECEDING AND CURRENT ROW
 28      ) slope
 29        FROM sales
 30          )
 31    WHERE mth >= DATE '2010-01-01'
 32       )
 33   ORDER BY item, mth;

ITEM       MTH               QTY TYPE                                           
---------- ---------- ---------- --------                                       
Snowchain  2008-01-01         79 Actual                                         
Snowchain  2008-02-01        133 Actual                                         
Snowchain  2008-03-01         24 Actual                                         
Snowchain  2008-04-01          1 Actual                                         
Snowchain  2008-05-01          0 Actual                                         
Snowchain  2008-06-01          0 Actual                                         
Snowchain  2008-07-01          0 Actual                                         
Snowchain  2008-08-01          0 Actual                                         
Snowchain  2008-09-01          1 Actual                                         
Snowchain  2008-10-01          4 Actual                                         
Snowchain  2008-11-01         15 Actual                                         
Snowchain  2008-12-01         74 Actual                                         
Snowchain  2009-01-01        148 Actual                                         
Snowchain  2009-02-01        209 Actual                                         
Snowchain  2009-03-01         30 Actual                                         
Snowchain  2009-04-01          2 Actual                                         
Snowchain  2009-05-01          0 Actual                                         
Snowchain  2009-06-01          0 Actual                                         
Snowchain  2009-07-01          0 Actual                                         
Snowchain  2009-08-01          1 Actual                                         
Snowchain  2009-09-01          0 Actual                                         
Snowchain  2009-10-01          3 Actual                                         
Snowchain  2009-11-01         17 Actual                                         
Snowchain  2009-12-01        172 Actual                                         
Snowchain  2010-01-01        167 Actual                                         
Snowchain  2010-02-01        247 Actual                                         
Snowchain  2010-03-01         42 Actual                                         
Snowchain  2010-04-01          0 Actual                                         
Snowchain  2010-05-01          0 Actual                                         
Snowchain  2010-06-01          0 Actual                                         
Snowchain  2010-07-01          0 Actual                                         
Snowchain  2010-08-01          1 Actual                                         
Snowchain  2010-09-01          0 Actual                                         
Snowchain  2010-10-01          1 Actual                                         
Snowchain  2010-11-01         73 Actual                                         
Snowchain  2010-12-01        160 Actual                                         
Snowchain  2011-01-01        188 Forecast                                       
Snowchain  2011-02-01        305 Forecast                                       
Snowchain  2011-03-01         96 Forecast                                       
Snowchain  2011-04-01         43 Forecast                                       
Snowchain  2011-05-01         31 Forecast                                       
Snowchain  2011-06-01         19 Forecast                                       
Snowchain  2011-07-01          7 Forecast                                       
Snowchain  2011-08-01          0 Forecast                                       
Snowchain  2011-09-01          0 Forecast                                       
Snowchain  2011-10-01          0 Forecast                                       
Snowchain  2011-11-01         45 Forecast                                       
Snowchain  2011-12-01        148 Forecast                                       
Sunshade   2008-01-01          4 Actual                                         
Sunshade   2008-02-01          6 Actual                                         
Sunshade   2008-03-01         32 Actual                                         
Sunshade   2008-04-01         45 Actual                                         
Sunshade   2008-05-01         62 Actual                                         
Sunshade   2008-06-01         58 Actual                                         
Sunshade   2008-07-01         85 Actual                                         
Sunshade   2008-08-01         28 Actual                                         
Sunshade   2008-09-01         24 Actual                                         
Sunshade   2008-10-01         19 Actual                                         
Sunshade   2008-11-01          6 Actual                                         
Sunshade   2008-12-01          8 Actual                                         
Sunshade   2009-01-01          2 Actual                                         
Sunshade   2009-02-01         13 Actual                                         
Sunshade   2009-03-01         29 Actual                                         
Sunshade   2009-04-01         60 Actual                                         
Sunshade   2009-05-01         29 Actual                                         
Sunshade   2009-06-01         78 Actual                                         
Sunshade   2009-07-01         56 Actual                                         
Sunshade   2009-08-01         22 Actual                                         
Sunshade   2009-09-01         11 Actual                                         
Sunshade   2009-10-01         13 Actual                                         
Sunshade   2009-11-01          5 Actual                                         
Sunshade   2009-12-01          3 Actual                                         
Sunshade   2010-01-01          2 Actual                                         
Sunshade   2010-02-01          8 Actual                                         
Sunshade   2010-03-01         28 Actual                                         
Sunshade   2010-04-01         26 Actual                                         
Sunshade   2010-05-01         23 Actual                                         
Sunshade   2010-06-01         46 Actual                                         
Sunshade   2010-07-01         73 Actual                                         
Sunshade   2010-08-01         25 Actual                                         
Sunshade   2010-09-01         13 Actual                                         
Sunshade   2010-10-01         11 Actual                                         
Sunshade   2010-11-01          3 Actual                                         
Sunshade   2010-12-01          5 Actual                                         
Sunshade   2011-01-01          0 Forecast                                       
Sunshade   2011-02-01          0 Forecast                                       
Sunshade   2011-03-01          9 Forecast                                       
Sunshade   2011-04-01          9 Forecast                                       
Sunshade   2011-05-01         10 Forecast                                       
Sunshade   2011-06-01         39 Forecast                                       
Sunshade   2011-07-01         79 Forecast                                       
Sunshade   2011-08-01         32 Forecast                                       
Sunshade   2011-09-01         18 Forecast                                       
Sunshade   2011-10-01         14 Forecast                                       
Sunshade   2011-11-01          1 Forecast                                       
Sunshade   2011-12-01          0 Forecast                                       

96 rows selected.

And that is the data used for the graph shown above. "Forecast" data is the stapled lines.


The complete spool output of the demo script can be downloaded here.
And you can read more in Oracle documentation on REGR_SLOPE or the data warehousing guide on linear regression.

1 comment: