Once in a while, you see a SAS programming statement that makes little sense. For many, the If 0 Then Set data step Statement is one of them. This post shows by example what the statement does and why it is convenient.
A SAS data step runs in two phases. First, the data step compiles. During compilation, SAS checks for syntax errors and sets up the Program Data Vector (PDV). Next, the data step executes. During execution, data is read into memory and the PDV, conditional logic and calculations run, and data is output to a SAS data set. If you are not comfortable with compilation and execution logic, I encourage you to take the SAS Programming 1 e-Learning Course.
A Simple Example
The Set Statement in the SAS data step plays a role during both compilation and execution. During compilation, the data step reads the variable definitions (names, types, lengths and other attributes) from the data set listed in the Set Statement. During execution, SAS reads the observations from the input data set in sequential order. However, we can bypass the execution part of the Set Statement with conditional logic that always fails. Remember, the data step compiler does not evaluate conditional logic. It merely checks for syntax errors and sets up the PDV, so it processes the Set Statement whether or not that statement can ever execute.
In SAS, a numeric expression is false when it evaluates to zero (or to a missing value). Therefore "If 0 Then Set" could just as well be coded "If (1=2) Then Set" or "If ("a"="b") Then Set". However, the first form is the conventional one and is considered good programming practice.
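The truth test behind this can be checked in the log. The step below is a minimal sketch that uses only base SAS; the variable names are made up for illustration.

```sas
data _null_;
   zero  = 0;
   miss  = .;
   other = -3.5;
   if zero  then put 'not printed: 0 is false';
   if miss  then put 'not printed: a missing value is false too';
   if other then put 'printed: any other numeric value is true';
   if (1=2) then put 'not printed: the comparison evaluates to 0';
run;
```

Only the third message should show up in the log, which is why any always-false expression can stand in for the 0.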
Let us look at a very simple example. Below, I create the data set test1. The only statement in the data step is if 0 then set sashelp.class. During compilation, the data step sets up the PDV with the same variables as in the sashelp.class data set. The variables in test1 have the same length and attributes as in sashelp.class. During execution, SAS reads the if 0 part and concludes that the condition fails by construction. Consequently, no observations are read from sashelp.class and the data step terminates.
data test1;
   if 0 then set sashelp.class;
run;

/*
data test1;
   if (1=2) then set sashelp.class;
run;
*/
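One detail is worth checking in the log. As far as I can tell, the step above still writes a single observation with all values missing, because the implicit output at the end of the first iteration runs before SAS stops the step. If the goal is an empty shell with only the variable definitions, a common variant adds a Stop Statement. The sketch below assumes that goal; the data set name test1_empty is made up here.

```sas
data test1_empty;
   if 0 then set sashelp.class;   /* compile time only: copies the variable definitions */
   stop;                          /* ends the step before the implicit output runs */
run;

/* Inspect the result: the same five variables as sashelp.class, zero observations */
proc contents data=test1_empty;
run;
```

The Stop Statement does not change what happens at compile time, so the PDV is set up exactly as before.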
We can achieve the same result with a Length Statement or Attrib Statement. However, this requires that we know exactly which variables to read into the PDV, along with their types and lengths, which means more work and possibly a few PROC CONTENTS calls. The approach above is more flexible whenever an existing data set already holds the variables we need.
data test2;
   length Age 8 Height 8 Name $8 Sex $1 Weight 8;
run;
A Hash Object Application
The If 0 Then Set approach is especially convenient in hash object applications. When we use hash objects, the data step performs a process called “parameter type matching”. This means that the variables defined in a hash object must exist in the PDV. This makes sense because the hash object and the PDV must be able to pass variable values to one another. The If 0 Then Set approach offers a simple and flexible way to add to the PDV exactly the variables that are added to the relevant hash object. Consider the hash lookup in the blog post SAS Hash Object Lookup Example.
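To see the requirement in isolation, here is a minimal sketch with a hypothetical lookup table, work.lookup, and made-up variables id and score. None of these names come from the original example.

```sas
/* Hypothetical lookup table, made up for this sketch */
data work.lookup;
   input id $ score;
   datalines;
A 10
B 20
;
run;

data _null_;
   /* Compile time only: puts ID and SCORE into the PDV.
      Without this line, I would expect SAS to stop with an
      "undeclared symbol" error when the hash object is defined. */
   if 0 then set work.lookup;

   declare hash h(dataset:'work.lookup');
   h.defineKey('id');
   h.defineData('score');
   h.defineDone();

   id = 'B';          /* key value to look up */
   rc = h.find();     /* rc = 0 signals a match; SCORE is then filled from the hash */
   put rc= score=;
   stop;
run;
```

Because the Set Statement and the dataset: argument tag name the same table, the key and data variables get their type and length from one place.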
The code is below and the two data sets from the example can be found here.
In the code below, you can see that I list the same data set, work.emphours, in the If 0 Then Set Statement and in the dataset: argument tag in the Declare Statement. This way, I ensure that the exact same variables with the exact same attributes exist in both the hash object and the Program Data Vector. Further down, I use a second Set Statement with the Employees data set. However, here the statement both compiles and executes. Consequently, data is read into the PDV here.
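data wanthash(drop=rc);
   /* Compile time only: adds the emphours variables to the PDV */
   if 0 then set work.emphours;

   /* Build the hash object once, on the first iteration */
   if _N_ = 1 then do;
      declare hash h(dataset:'work.emphours');
      h.defineKey('empid');
      h.defineData('hours', 'sickdays', 'seniority');
      h.defineDone();
      call missing(hours, sickdays, seniority);
   end;

   /* This Set Statement compiles and executes: one employee per iteration */
   set Employees;

   /* Look up the current empid; rc is 0 when a match is found */
   rc=h.find();
run;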
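One thing to watch when adapting this pattern: variables that enter the PDV through a Set Statement are retained across iterations, so after a failed lookup the data variables may still hold the values from the previous match. The sketch below shows one way to guard against that. The two small data sets are hypothetical stand-ins for the ones in the example (their real contents are not shown in this post), and wanthash_checked is a made-up name.

```sas
/* Hypothetical stand-ins for the two data sets in the example */
data work.emphours;
   input empid hours sickdays seniority;
   datalines;
1 160 2 5
2 120 0 1
;
run;

data work.employees;
   input empid name $;
   datalines;
1 Ann
3 Bob
2 Cid
;
run;

data wanthash_checked;
   if 0 then set work.emphours;
   if _N_ = 1 then do;
      declare hash h(dataset:'work.emphours');
      h.defineKey('empid');
      h.defineData('hours', 'sickdays', 'seniority');
      h.defineDone();
   end;
   set work.employees;
   /* find() returns 0 on a match; otherwise clear the lookup variables
      so that values from an earlier employee do not carry over */
   if h.find() ne 0 then call missing(hours, sickdays, seniority);
run;
```

In this made-up data, empid 3 has no match in work.emphours, so its hours, sickdays and seniority should come out missing rather than repeating the values found for empid 1.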
In this post, I have explained the logic behind the If 0 Then Set statement. We have seen that the statement is a clever way to set up the PDV while bypassing the execution part of the Set Statement. Furthermore, we have seen that the approach is especially valuable in a hash object context, where parameter type matching is unavoidable and becomes a lot smoother and more flexible when the approach from this post is applied.
You can see more examples of hash object application where the If 0 Then Set Statement is applied in the blog posts Group Variable Values With Hash Object In SAS, Run Time Effect Of The Hash Object HASHEXP Argument and A SAS Hash Object Of Hash Objects (Hash Of Hash).
You can download the entire code from this example here.