Replace Missing Values With Previous Non Missing in SAS
A frequently asked question on the SAS Communities is how to replace missing values with the previous/most recent non missing value. In this post, I will demonstrate how to do this with the UPDATE Statement. The update statement is a bit overlooked, but in situations like this, it is a life saver.
A SAS Example
First, let us create some example SAS data. I create a very simple data set with ID and value as variables. Here, ID has three levels: 1, 2 and 3 and value is a numeric variable with a few missing values.
data MyData; input ID$ Value; datalines; 1 2 1 . 1 . 1 4 2 . 2 9 2 . 2 . 3 3 3 . 3 0 ;
Obviously, this SAS data set is for demonstration purposes only. As you can see, Value has a few missing values. I want to replace these missing values with the previous non-missing value. Furthermore, I want to do this By Group. Meaning, I want to “drag” non missing values down over missing values, but not crossing IDs.
I use the UPDATE Statement with MyData as both the master data set and the transaction SAS data set. I specify ID in the BY Statement, so naturally I assume that the data is sorted by the ID variable.
data Want; update MyData(obs=0) MyData; by ID; output; run; proc print data=Want; run;
The key to this trick is the UPDATE Statement. I use MyData(obs=0) as the master data set to load the relevant variables and no observations into the PDV. Next, I also use MyData (all observations included) as the transaction data set and update the master data set from this. Also, I have to use an explicit OUTPUT Statement, because by default SAS only updates the first observation in each by group and outputs that.
You can see the resulting data to the right. As desired, we have replaced the missing values with the most recent non missing value. Also, we do not cross over any By Groups.
This post demonstrates how to replace missing values with previous non-missing values in SAS. I make heavily use of the update statement using the same data set as both the master and transaction data set. If you are not familiar with the UPDATE Statement and the syntax of the master – and transaction data set, I strongly encourage you to consult the SAS Documentation and the great examples there.
I have previously written posts about how to Replace Missing Values With Mean in SAS and Replace Missing Values With Zero. Also, see the blog posts Using the Update Statement in the SAS Data Step and Using the Modify Statement in the SAS Data Step.
You can download the entire code from this post here.