When we work with arrays in SAS, it can be beneficial to have the values in ascending or descending order. Therefore, it is a common task to sort an array in SAS. If you browse the SAS Community or SAS-L, you will find many different approaches to this problem. In this post, I will demonstrate four different techniques. Some efficient. Others not so much. I will go through a Proc Sort approach, Call Sortn, a Hash Object approach and the Quicksort algorithm macro. I will not go into detail about each approach. Rather, I provide one or two examples and material for further reading.

Using Proc Sort

First, let us take a naive approach. Most programmers know about Proc Sort. Visually speaking, Proc Sort performs vertical sorts: it reorders the observations in a data set. What we are after is a horizontal sort, across the elements of an array within a single observation. At least for the one-dimensional array. Therefore, we have to reshape the data appropriately for Proc Sort to work.

The example below consists of three steps. First, I take the array x and create one observation for each element in it. Next, I use Proc Sort and sort the data. Finally, I read the sorted data in a data step and assign each observation value to an array element, starting from the first one, which is the smallest because of the preceding sort.

data have;
   array x {10} (8 10 4 7 2 5 1 9 3 6);
   do _N_ = 1 to dim (x);
      _x = x [_N_];
      output;
   end;
run;
 
proc sort data=have;
   by _x;
run;
 
data want(drop=_:);
   array x {10};
   do _N_ = 1 by 1 until (lr);
      set have(keep=_x) end=lr;
      x [_N_] = _x;
   end;
run;

Call Sortn Routine

If all we want to do is sort a one-dimensional numeric array in ascending order, Call Sortn is the obvious choice. The Call Sortn Routine was implemented in SAS 9.1.3, but was not documented until version 9.2. In the example below, I create a numeric array x with ten elements. Then, I use the Call Sortn Routine with the Of keyword and specify x[*] as the argument. This tells the routine to sort all elements in x. Finally, the Put Statement at the bottom prints the result in the log. Run the example code below and verify that the result is correct.

data _null_;
   array x {10} (8 10 4 7 2 5 1 9 3 6);
   call sortn (of x[*]);
 
   put (x[*])(=);
run;

The routine has a character counterpart in the Call Sortc Routine.
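
As a minimal sketch of that character version, the step below sorts a small character array. The array name c and its values are made up for illustration, and all elements are declared with the same length.

data _null_;
   /* hypothetical character array, every element has length 8 */
   array c {5} $ 8 ('pear' 'apple' 'fig' 'kiwi' 'banana');
   call sortc (of c[*]);

   put (c[*])(=);
run;

The log should list the values in ascending alphabetical order, from apple to pear.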

Using a Hash Object

Next, let us move to an approach, not obvious to most. Using a hash object to sort an array in SAS. In this technique, I exploit the internal flexibility of the hash object and the Ordered: Argument in the Declare Statement. In the example below, I sort a numeric array x with a hash object. The code has three main parts.

  1. First, I declare the hash object h. I use Multidata:"Y" to allow duplicate key values. Then, I use the Ordered:"A" argument to specify that I want the key values in ascending order. I specify _x as the only key (and data) variable. Finally, I declare an iterator object and link it to the hash object h.
  2. In step 2, I read each element in the array x. I traverse the array with the _N_ variable. I save the current value of the x element in _x, which is the key variable in the hash object h. Finally, I call the Add() Method to insert the array element in the hash object. The Ordered argument will take care of the sort process.
  3. Finally, in step three, I iterate over the entire array again. I do so with the lbound and hbound functions to account for arrays with bases different from 1. For each array element, I use the Next() Iterator Method to move to the next item in the hash object and insert its value back into the array. In sorted order.

data _null_;
 
   array x {10} (8 10 4 7 2 5 1 9 3 6);
 
   declare hash h (multidata : "Y", ordered : "A");   /* 1 */
   h.definekey ("_x");
   h.definedone();
   declare hiter hi ("h");
 
   do _N_ = lbound(x) to hbound(x);                   /* 2 */
       _x = x[_N_];
       rc = h.add();
   end;
 
   put (x[*])(=);
 
   do _N_ = lbound(x) to hbound(x);                   /* 3 */
       hi.next();
       x[_N_] = _x;
   end;
 
   put (x[*])(=);
 
run;

Here, I merely scratch the surface of array sorting with the SAS hash object. It may seem like overkill at first. But the technique is extremely flexible and efficient. One of the main advantages is that it performs stable sorting in the case of duplicate values. Also, it can handle sorting of parallel arrays and even multi-dimensional arrays. The technique is briefly discussed in the book Data Management Solutions Using SAS Hash Table Operations. However, if you really want to get the technique down, read the article Sorting Arrays Using the Hash Object by Paul Dorfman. Highly recommended.
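
To give an idea of the parallel-array case, here is a minimal sketch that extends the example above. It is an illustration only, not code taken from the book or the article. The array y and the variable _y are assumptions added for the example. The only change to the hash object is a DefineData call that stores _y alongside the key _x, so the iterator brings both values back.

data _null_;

   array x {5} (3 1 3 2 1);
   array y {5} (1 2 3 4 5);   /* hypothetical companion array */

   declare hash h (multidata : "Y", ordered : "A");
   h.definekey ("_x");
   h.definedata ("_x", "_y"); /* keep the y value with its x key */
   h.definedone();
   declare hiter hi ("h");

   do _N_ = lbound(x) to hbound(x);   /* load both arrays */
       _x = x[_N_];
       _y = y[_N_];
       rc = h.add();
   end;

   do _N_ = lbound(x) to hbound(x);   /* unload in key order */
       rc = hi.next();
       x[_N_] = _x;
       y[_N_] = _y;
   end;

   put (x[*])(=);
   put (y[*])(=);

run;

The x array contains duplicates on purpose. Compare the y values within each group of equal x values against their input order to check the stable behavior for yourself.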

The Quicksort Algorithm Macro

Finally, let us sort an array in SAS with the Quicksort Algorithm. The quicksort algorithm is popular, simple and very efficient. If you have never heard of the quicksort algorithm before, see the introduction here. The quicksort algorithm is cleverly implemented in the article Quicksorting An Array by Paul Dorfman. Below, I present three examples of how to use the Qsort macro and how flexible it is.

First, I simply sort a numeric array ascending. This is equivalent to the Call Sortn approach.

data _null_;
   array x {10} (8 10 4 7 2 5 1 9 3 6);
   %qsort (Arr=x);
 
   put (x[*])(=);
run;

The Qsort macro handles sorting parallel arrays swiftly. Suppose I have two arrays x and y. Then, I can sort the x array ascending and reorder the elements in y accordingly. See the example below. I simply specify both arrays in the Arr= argument and only x in the By= argument.

data _null_;
   array x {10} (8 10 4 7 2 5 1  9 3 6);
   array y {10} (3 1  7 4 9 6 10 2 8 5);
   %qsort (Arr=x y, By=x);
 
   put (x[*])(=);
   put (y[*])(=);
run;

Finally, suppose I want to sort only the first 5 elements in x. And leave the rest of the elements untouched. This is easily handled by the macro by specifying lb=1 and hb=5. An extremely handy feature of the Qsort macro.

data _null_;
   array x {10} (8 10 4 7 2 5 1 9 3 6);
   %qsort (Arr=x, lb=1, hb=5);
 
   put (x[*])(=);
run;

If you are interested in learning about how the macro is constructed, the code is well commented in the article. Otherwise, simply use the examples presented.

Discussion

So, what approach to use? As with most problems with multiple solutions, it depends. However, do not use Proc Sort to sort arrays in SAS. Luckily, there are far better options for you to choose from. Call Sortn and Call Sortc handle the ascending sort of a one-dimensional array in SAS well. But that is it. Unfortunately, they are not very flexible. Flexible array sorts require flexible tools. Luckily, that is exactly what the hash object and the Qsort macro offer. Each tool has its own strengths and I highly encourage you to read the material that I posted above.

As a rule of thumb, here are my favorite aspects of each approach. The hash object offers stable array sorting with no extra work required. It is a simple side effect of how key values are inserted into and retrieved from the object. A stable sort means that elements with equal key values keep their original relative order after the sort. The Qsort macro, however, offers the flexibility of effortlessly sorting only parts of the array. This can be done with the hash object as well, but in the macro, the work is already done.

As a side note, all the techniques presented above handle Temporary Arrays as well.
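
For instance, a minimal sketch of Call Sortn applied to a temporary array could look like this. The array name t and the helper variable _t are assumptions for the example. Because temporary array elements are not named variables, I print them one at a time in a loop rather than with a variable list in the Put Statement.

data _null_;
   array t {10} _temporary_ (8 10 4 7 2 5 1 9 3 6);
   call sortn (of t[*]);

   /* write the sorted elements on one log line */
   do _N_ = lbound(t) to hbound(t);
      _t = t[_N_];
      put _t @;
   end;
run;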

Summary

In this post, I demonstrate four ways to sort an array in SAS. These are Proc Sort, Call Sortn, the Hash Object and the Qsort macro. We see examples of all four approaches. Finally, I discuss pros and cons of the different approaches and how to choose between them. I aim to present the different approaches mainly by example and only explain briefly how the techniques work. Perhaps, I will go into detail with one of the approaches in a future post.

For other posts about SAS arrays, read Set All Array Elements to Zero in SAS and An Array Hashing Scheme in SAS.

You can download the entire code from this post here.