The SAS hash object is flexible in many ways. However, programmers often struggle when they want to read many variables into the key or data portion of the object. This post discusses a few approaches to handling the problem without manually writing out all the variables.

In the examples that follow, I use the data below. It has a single numeric key variable k and 100 character data variables d1-d100.

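data have;
    /* _N_ serves as the loop counter, so no extra variable lands in the output */
    do _N_ = 1 to 10;
        k = ceil(rand('uniform')*100);   /* random integer key between 1 and 100 */
        array d $ 8 d1-d100;             /* implicit array of 100 character variables of length 8 */
        do over d;
            d = uuidgen();               /* UUID text, truncated to the first 8 characters */
        end;
        output;
    end;
run;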

A Metadata Approach Before the Hash Object

The first approach reads the variable names from SAS metadata, specifically the DICTIONARY.COLUMNS table. This is probably the first approach that comes to mind. Suppose that we want to read all variables with the prefix “d” into the data portion of the hash object. In the SQL query below, I read the names of all variables in the data set above that start with “d” into the macro variable d. I use the QUOTE function to put double quotes around each name and separate the names by commas, which is the form the DefineData() method accepts. Note that libname and memname are stored in uppercase in the dictionary table, while the comparison on name is case sensitive. Run the code below and check the log.

proc sql noprint;
    select quote(strip(name)) into : d separated by ','
    from dictionary.columns
    where libname = "WORK" and memname = "HAVE" and char(name, 1) = "d";
quit;
 
%put &d.;

Next, I simply use the macro variable d in the DefineData() method to read all the variables with the prefix d into the data portion. Run the code below and check the hashcontent data set. As expected, it contains only the variables with the prefix d, because the Output() method writes the data portion of the hash object and the key k is not part of it.

data _null_;
    if 0 then set have;
 
    declare hash h (dataset : "have");
    h.definekey ("k");
    h.definedata (&d.);
    h.definedone ();
 
    h.output (dataset : "hashcontent");
run;
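
The same dictionary query can select on more elaborate patterns than a one-letter prefix. The sketch below is only an illustration: the macro variable d_subset, the output data set hashsubset and the chosen pattern are my own assumptions, not part of the original example. It keeps the character variables whose names are d followed by one digit and a zero.

proc sql noprint;
    /* d_subset is a hypothetical macro variable name used for illustration */
    select quote(strip(name)) into : d_subset separated by ','
    from dictionary.columns
    where libname = "WORK" and memname = "HAVE"
      and type = "char"
      and prxmatch('/^d\d0$/', strip(name))   /* d10, d20, ..., d90 */
    order by varnum;
quit;

%put &d_subset.;

data _null_;
    if 0 then set have;

    declare hash h (dataset : "have");
    h.definekey ("k");
    h.definedata (&d_subset.);
    h.definedone ();

    /* hashsubset is a hypothetical output data set name */
    h.output (dataset : "hashsubset");
run;

The order by varnum clause is there so that the names come out in data set order. Any other condition that is valid in a PROC SQL where clause could take the place of the regular expression.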

A SAS Array Technique

The metadata approach above is quite flexible. It lets us put variables into the key or data portion of the object based on advanced patterns. However, there is a shorter and arguably better way. First, realize that we are not limited to a single DefineData() method call. Instead of listing all data variables in one call, we can call DefineData() several times, as below. Here I only read in d1, d2 and d3, but the same idea lets us generate the calls dynamically.

data _null_;
    if 0 then set have;
 
    declare hash h (dataset : "have");
    h.definekey ("k");
    h.definedata ("d1");
    h.definedata ("d2");
    h.definedata ("d3");
    h.definedone ();
 
    h.output (dataset : "hashcontent");
run;

The approach has three steps. First, create an array of all the variables of interest. Next, loop over each element in the array. Finally, call the DefineData() method for each element encountered. See the example below. Here, I create an implicit array of all variables that start with the letter d. Note that I can do this only because the If 0 Then Set statement on the line above has the compiler set up these variables in the PDV. Next, I use Do Over, which works for implicit arrays only, to loop over each element in the array. For each element, I call the DefineData() method and pass it the variable name returned by the VNAME function. Run the code below and verify that the result is identical to the one from the metadata approach.

data _null_;
    if 0 then set have;
    array d d:;
 
    declare hash h (dataset : "have");
    h.definekey ("k");
    do over d;
        h.definedata (vname(d));
    end;
    h.definedone ();
 
    h.output (dataset : "hashcontent");
run;
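
Do Over and implicit arrays are older syntax, so some readers may prefer an explicit array. The following is a minimal sketch of the same idea with a subscripted array and the DIM function. The index variable _i is my own choice, named so that the d: prefix list does not pick it up.

data _null_;
    if 0 then set have;
    array d {*} d:;                      /* explicit array of all variables starting with d */

    declare hash h (dataset : "have");
    h.definekey ("k");
    do _i = 1 to dim(d);                 /* _i is a hypothetical index variable */
        h.definedata (vname(d[_i]));     /* one DefineData() call per array element */
    end;
    h.definedone ();

    h.output (dataset : "hashcontent");
run;

This should give the same hashcontent data set as the Do Over version. The only difference is how the loop addresses the array elements.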

I prefer this approach to the metadata technique for several reasons. It is simpler, and no initial read of the metadata is necessary. Finally, the approach is quite flexible. Suppose I want to read all character variables in a data set into the data portion of a SAS hash object. This is easily accomplished with the _character_ keyword, as below. The _numeric_ keyword does the same for numeric variables.

data _null_;
    if 0 then set have;
    array d _character_ /* or _numeric_ */;
 
    declare hash h (dataset : "have");
    h.definekey ("k");
    do over d;
        h.definedata (vname(d));
    end;
    h.definedone ();
 
    h.output (dataset : "hashcontent");
run;
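
The examples so far fill only the data portion, but the array technique can be applied to the key portion in the same way. Below is a sketch under a few assumptions of my own: a small data set have2 with two key variables k1 and k2 and two data variables d1 and d2, the array names keyvars and datavars, and the output name hashcontent2. None of these come from the example data above.

/* have2 is a hypothetical data set with a composite key (k1, k2) */
data have2;
    do k1 = 1 to 2;
        do k2 = 1 to 3;
            d1 = k1 * 10 + k2;
            d2 = -d1;
            output;
        end;
    end;
run;

data _null_;
    if 0 then set have2;
    array keyvars k:;                    /* all variables starting with k */
    array datavars d:;                   /* all variables starting with d */

    declare hash h (dataset : "have2");
    do over keyvars;
        h.definekey (vname(keyvars));    /* one DefineKey() call per key variable */
    end;
    do over datavars;
        h.definedata (vname(datavars));  /* one DefineData() call per data variable */
    end;
    h.definedone ();

    h.output (dataset : "hashcontent2");
run;

Because the Output() method writes only the data portion, hashcontent2 holds d1 and d2 but not the keys. To see the keys in the output as well, they would also have to be defined as data.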

Summary

In this post, I demonstrate two approaches to reading many variables into the key or data portion of a SAS hash object: one using metadata and the other using SAS arrays. Which one to use depends on the circumstances. However, the array solution is usually preferable because of its simplicity and flexibility.
