Today, I demonstrate a little trick by example, that allows you to store lagged values in a temporary array. This technique enables you to calculate rolling statistics in SAS based on multiple lags of a variable. Without using the Lag Function at all. In the example to come, I will use the example data below.
data have; input id var @@; datalines; 1 1 1 2 1 3 1 4 1 5 2 1 2 . 2 3 2 . 2 5 3 1 3 2 3 4 3 . 3 5 ;
In the code below, I demonstrate the technique. I split up the explanation in 6 steps:
- First, I create a Temporary Array to hold the lag values. I want to have two lagged values and the current observation value. That means, I want three elements in my array. If you wonder why the array has indexes 0 to 2, you will get the answer in step 4.
- Next, I use the Call Missing Routine to set the elements of the array to missing between by groups. Obviously, we do not want to calculate statistics across groups.
- Now, I use a DoW Loop to read the data and increment the obs variable by one for each iteration. I use a DoW Loop and read By Groups separately to prevent lags to cross groups.
- This is where the magic happens. Here, I fill the temporary array lag with values. I use the Mod Function to point to the element that I want to insert the current value of var into. In the first iteration of the DoW Loop, the value is inserted into the array element indexed by 1. This is because Mod(1, 3)=1. In the next iteration the value is inserted into the array with index 2. Next, it is inserted into the element with index 0 (Mod(3, 3)=0). Then the process starts over.
- The fact that the lag array now contains values of var for the two prior observations enables us to use any function that computes descriptive statistics on it. Obviously, the result is a backwards rolling statistic defined by the function. In this example, I use the SAS N Function to count the number of non missing observations. Furthermore, I use the SAS Mean Function to calculate a backwards rolling average.
- Finally, to help you understand this technique, see the variable temp. It shows you what the temporary array currently stores.
data want; array lag[0:2] _temporary_; /* 1 */ call missing(of lag[*], obs); /* 2 */ do obs=1 by 1 until (last.id); /* 3 */ set have; by id; lag[mod(obs, 3)]=var; /* 4 */ count=n(of lag[*]); /* 5 */ avg=mean(of lag[*]); temp = catq('d', '|', of lag[*]); /* 6 */ put temp=; output; end; run;
This post demonstrates an example of how to use a temporary array to store lags in SAS. A quite handy little trick if you want to compute multiple lags in the data step. Or if you want to compute rolling statistics in the data step and not using PROC SQL.
Finally, you can download the entire code from this post here.