In this post, I investigate the Return Statement in the SAS Data Step. Most SAS programmers know that it exists. However, few really understand it. Today, I will dig into the documentation and clarify what the Return Statement does. Where does it send processing? Do other statements affect it? It turns out that it is not that hard once you get the hang of it.

In the examples in this post, I will use the simple demonstration data set below.

data have;
input x y;
datalines;
1 1
1 2
2 3
2 4
;

The Implicit Return Statement

Let us start with the basics. Every data step has an implicit Return Statement at the bottom of the data step. This is what sends processing back to the top of the data step for the next iteration, where the next input observation is read. Most data steps also have an implicit Output Statement at the bottom of the data step. However, if an explicit Output Statement (or a Remove or Replace Statement used with the Modify Statement) is present anywhere in the step, the implicit output does not take place. Take a look at the program below. It does exactly the same thing if you omit the explicit Output and Return Statements.

data want;
   set have;
 
   output;
   return;
run;
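
For comparison, here is a minimal sketch of the same step with both statements left out. The data set name want_implicit is my own choice for illustration. The Proc Compare step is optional and should report no differences between the two data sets.

data want_implicit;
   set have;
   /* no explicit OUTPUT or RETURN: SAS supplies both at the bottom of the step */
run;

proc compare base=want compare=want_implicit;
run;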

Comparable to the Delete Statement?

The Return Statement documentation says that the statement “Stops executing statements at the current point in the DATA step and returns to a predetermined point in the step.” If that were all there is to it, why not just use the Delete Statement? The reason is found further down in the documentation. There, it is clear that if we use a Return Statement in a data step without an explicit Output Statement, an implicit Output Statement writes the current observation before processing returns to the top. Take a look at the code below. Given this knowledge, how many observations are written to the return data set? 4 is correct, because no explicit Output Statement is present (the line *output; is only a comment).

data return;
   set have;
   if x = 2 then return;
   *output;
run;

This is not the case with the Delete Statement. The documentation is pretty clear about it: “When DELETE executes, the current observation is not written to a data set, and SAS returns immediately to the beginning of the DATA step for the next iteration.” Consequently, the data step below outputs only two observations.

data delete;
   set have;
   if x = 2 then delete;
run;
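
To see the effect of an explicit Output Statement, here is a sketch of the return step from above with the Output Statement active rather than commented out. The data set name return_output is hypothetical. Because an explicit Output Statement is now present in the step, the Return Statement no longer triggers an implicit output, so only the two observations with x = 1 should be written, just as with the Delete Statement.

data return_output;
   set have;
   if x = 2 then return;   /* back to the top of the step, nothing is written */
   output;                 /* reached only when x is not 2 */
run;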

The Link and Go To Statements

There are two data step statements that alter the behavior of the Return Statement: Link and Go To. First, let us consider the Link Statement. Its documentation says this about how it interacts with the Return Statement: “The LINK statement tells SAS to jump immediately to the statement label that is indicated in the LINK statement and to continue executing statements from that point until a RETURN statement is executed. The RETURN statement sends program control to the statement immediately following the LINK statement.”

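Knowing this, how many observations does SAS output in the data step below? Remember that the first Return Statement executed after a Link Statement does not send execution back to the top of the data step. 8 is the correct answer, because that Return Statement sends processing to the statement that immediately follows the Link Statement, which here is the labeled Output Statement. Thus, SAS outputs the observation once more. The Return Statement is then executed a second time, this time with no Link Statement pending, so it sends processing back to the top of the data step.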

data link;
   set have;
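   link here;   /* jump to the label; the next RETURN comes back to this point */
   here:
      output;   /* runs twice per observation: via LINK, then by falling through */
      return;   /* first time: back to after LINK; second time: top of the step */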
run;
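
If the double output is not what you want, a common pattern is to place a Return Statement between the Link Statement and the label, so that execution never falls through into the labeled section. The sketch below assumes the same have data set; the data set name link_guarded is hypothetical. It should write 4 observations, one per input observation.

data link_guarded;
   set have;
   link here;   /* jump to the label and come back here afterwards */
   return;      /* no LINK pending: back to the top of the step */
   here:
      output;   /* runs once per observation, only via LINK */
      return;   /* back to the statement after LINK */
run;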

Next, let us do the exact same thing with the Go To Statement. The documentation on the Go To Statement says: “Directs program execution immediately to the statement label that is specified and, if followed by a RETURN statement, returns execution to the beginning of the DATA step.” Now, consider the same example as above, but with the Go To Statement instead. How many observations does the data step output? 4 is correct, because the Return Statement sends processing back to the top of the data step, where SAS reads the next input observation. Thus, it outputs once for each observation read.

data goto;
   set have;
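   goto here;   /* jump to the label; no return point is remembered */
   here:
      output;   /* runs once per observation */
      return;   /* back to the top of the step */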
run;
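
One way to convince yourself of the flow is to trace it in the log with Put Statements. The sketch below is my own illustration, not taken from the documentation. It uses data _null_ so that no data set is created, and the message texts are arbitrary. I would expect one 'at label' line per input observation and no 'not reached' line at all, because Go To jumps over that statement and the Return Statement then goes straight back to the top.

data _null_;
   set have;
   goto here;
   put 'not reached: ' _n_=;   /* skipped on every iteration */
   here:
      put 'at label: ' _n_=;   /* written once per observation */
      return;                  /* back to the top of the step */
run;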

Summary

In this post, I investigate the Return Statement. We see how the implicit Return Statement at the bottom of the data step works and that an explicit Return Statement still writes the current observation when the step has no explicit Output Statement. Also, we learn that other statements such as Link and Go To affect where the Return Statement sends processing. I recommend that you read the documentation thoroughly.

As a related post, see When Does the SAS Data Step Set Variable to Missing?

You can download the entire code from this post here.