A frequent question in the SAS Online Community is how to select or do some calculation on every n’th observation in a SAS Data set. Today, I will demonstrate three approaches to do so. Obviously, this can be done in many different ways. Though, the approaches here are pretty simple and efficient.
Use the Point= Option
First, let us use the Point= Option in the Data Step Set Statement. In the code below, I use a DoW Loop to read every tenth observation of the sample data set sashelp.shoes. I do so with the iteration variable p, which I iterate until p is larger than the number of observations minus 10. This gives me every tenth observation.
data point; do p = 1 by 10 until (p >= (nobs-10)); set sashelp.shoes point=p nobs=nobs; output; end; stop; run;
Use the Mod Function
Next, let us look at my favorite method. Using the Mod Function. In the beautiful and simple code below, I use the Mod Function with the Automatic _N_ Variable to output only if the remainder of the division of _N_ by 10 is equal to one. That is the case for 1, 11, 21 and so on. The code below creates the exact same result as the code above.
The Mod Function is one of my favorite SAS Functions. Probably because it has such a variety of purposes. It never ceases to amaze me, the problems it can solve. A good example is using it as a hash function for a fitting prime in the Array Hasing Method.
data mod; set sashelp.shoes; if mod(_N_, 10) = 1; run;
Use PROC SQL
Finally, my least favorite method (there had to be three, right?). In the code below, I use the undocumented Monotonic() Function in PROC SQL. Which is the nearest equivalent of the Data Step _N_ you will find and basically do the same as above. The code is not as tight as above, though it does produce the same output.
Proc SQl; Create table sql as select * from sashelp.shoes where (mod(monotonic () ,10))=0; quit;
In this short post, I demonstrate three approaches to selecting every n’th observation from a SAS data set. In the examples, I chose every tenth. I do so with two data step approaches and an SQL approach. Of the three, I recommend the Mod Function technique.
You can download the entire code from this post here.