When work with transactions or time series data, it is often convenient to create a moving average. I have seen many different ways of computing moving averages in SAS. Some more efficient than others. In this post, I present three different examples of computing moving averages using the Data Step, the IML Language and finally PROC EXPAND. First of all, let us create a simple test data set for our examples.
data testdata; input time value @@; datalines; 0 100 1 101 2 103 3 107 4 116 5 108 6 113 7 106 8 110 9 115 10 121 11 117 12 123 13 127 14 121 15 126 16 131 17 125 18 125 19 130 20 135 ; |
The test data set contains 21 ordered observations with a time variable and a value variable. This is a typical time series data set.
Data Step Example
First, let us consider creating a moving average using the data step. First, I set the number of periods , I want to calculate my moving average on. Then I use array processing to calculate the desired number of lags, calculate the average over the current observation and the calculated lags, and finally drop the lagged variables again. You should be cautious about the tail when you create a moving average. In the first or second observations, I do not want a three period backwards looking moving average. That would not make sense. Therefore, I handle the cases where the observation number is less than the desired number of periods separately, and calculate the average only on the periods available. The result is seen on the right.
%let n=3; data DataStepMoveAv; set testdata; n=_n_; array cum_{&n}; do i = &n to 2 by -1; cum_[i]=cum_[i-1]; end; cum_[1]=value; if n < &n then MovAv = sum(of cum_[*]) / n; else MovAv = sum(of cum_[*]) / &n; retain cum_:; drop i n; run; |


The Data Step way is in my opinion a bit overly complicated in this context. It requires many complicated features of the data step and it is hard to implement by-group processing. Though this is done beautifully in the macro from the book Cody’s Collection of Popular SAS Programming Tasks and How to Tackle Them. You can find the macro in the free code examples from the book in the link.
SAS/IML
Next we look at an example using PROC IML. First, I read in the data created above. Then I set the number of periods I want to look bank (n=3). Next, I allocate a vector to hold the Moving Average values. Then I use
proc iml; use testdata; read all var _ALL_; close testdata; n=3; MovAv = j(nrow(value), 1, .); do i = 1 to nrow(value); idx = max(1,i-n+1):i; MovAv[i] = mean(value[idx]); end; title "Backward Moving Average from SAS/IML"; call series(time, MovAv); quit; |
The IML way is in my opinion superior over the Data Step method, because of the simplicity of the code. Also it is easier to calculate rolling averages for different groups and much easier to calculate both weighted and exponentially weighted rolling averages. Furthermore, it is much easier to adjust to a centered or forward-looking moving average in a vectorized language such as IML than in the data step, where we would also have to incorporate leads of observations in the code. For examples of creating Weighted Moving Averages and Exponentially Weighted Moving Averages in SAS/IML, see the article Rolling Statistics in SAS/IML by Rick Wicklin.
PROC EXPAND
I have specified four Convert Statements in the procedure. The first one creates a simple three period backwards moving averages in the previous examples. In the next one, I specify transformout=(movave(1 2 3)) in the statement options. This creates a Weighted Moving Average, standardizing the weights 1, 2 and 3 so that the variable WMovAv is given as
Keep in mind that the variable in the ID Statement must be a date variable.
proc expand data=testdata out=MoveAv method=none; id time; convert value=MovAv / transformout=(movave 3); convert value=WMovAv / transformout=(movave(1 2 3)); convert value=CMovAv / transformout=(cmovave 3); convert value=EWMovAv / transformout=(ewma .4); run; |
Summary
There are many different ways to create moving averages in SAS. In this post it is clear, that PROC EXPAND is superior over other alternatives. The only reason for which I would use other methods than PROC EXPAND is if I did not have access to SAS/ETS. In fact, it it not only rolling averages, but rolling statistics in general that PROC EXPAND creates swiftly. Consult the Transformation Operations part of the PROC EXPAND Documentation to see how many rolling statistics the procedure can create easily. Also, check out Easy Rolling Statistics with PROC EXPAND for a great article on how to think of and calculate rolling statistics.
Also, see the related posts Create Rolling Statistics in SAS with PROC SQL, Using Temporary Array to Store Lagged Values in SAS and Create Moving Average with the SAS Hash Object.
You can download the entire program from this post, including the code creating the plots here.