In this post, I investigate the Suminc and Keysum arguments in the SAS Hash Object Declare Statement. Little literature exists on the topic, and the documentation is sparse: it offers few examples and does not fully explain the inner workings. I attempt to clear up a few misconceptions about the two arguments.

The Key Summary Value

If we want to understand the Suminc and Keysum Arguments, we first have to understand the Key Summary. The key summary is a value in memory, associated with a key, that increments each time the key is referenced by a fitting method. However, not all object methods increment the key summary. Some initialize or re-initialize it to the current value of the variable specified in the Suminc Argument (more on that below). A rule of thumb is this: the Add(), Replace() and Ref() Methods initialize the key summary. In this context, think of loading the hash object from a data set with the dataset: argument tag as a series of Add() Method calls. Other methods that reference the key increment the key summary.

However, take care when you use the Ref() Method, because it is basically the Check() and Add() Methods in one. It turns out that if the key summary has already been initialized, the Ref() Method increments it. It does not re-initialize it, as the Replace() and Add() Methods do.
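
The Ref() behavior can be sketched as follows. This is a minimal example that borrows the variable names k, s and t from the other examples in this post and uses the suminc argument tag, which the next section covers. The values in the comments follow from the rules described above; they are not copied from a log, so run the step and check the log to confirm them.

data _null_;
   declare hash h (suminc : 's');
   h.definekey ('k');
   h.definedone ();

   k = 1;
   s = 2;

   h.ref ();           /* Key not in the table yet: added, summary should be 2 */
   h.sum (sum : t);
   put "After first Ref(): " @25 t =;

   h.ref ();           /* Key already present: summary should increment to 4   */
   h.sum (sum : t);
   put "After second Ref(): " @25 t =;
run;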

As a final note, we can retrieve the current value of the key summary with the SAS Hash Object Sum Method. It is a common misunderstanding that the Sum Method does summation operations. That is not the case.

The Suminc Argument

Now, let us take a look at the Suminc Argument. In the hash object declaration below, I specify s as the suminc variable. This means that the key summary initializes to, and increments by, the current value of s according to the set of rules described above. Run the code example below and check the log. It makes apparent what the different methods do to the key summary. After each method call, I use the Sum() Method to assign the current value of the key summary to the PDV variable t.

data _null_;
   declare hash h (suminc : 's');
   h.definekey ('k');
   h.definedone ();
 
   k = 1;
   s = 2;
 
   h.add ();           /* Initializes the key summary to 2         */
   h.sum (sum : t);
   put "After Add() Method: " @25 t =;
 
   h.check ();         /* Increments the key summary to  4         */
   h.sum (sum : t);
   put "After Check() Method: " @25 t =;
 
   h.find ();          /* Increments the key summary to  6         */
   h.sum (sum : t);
   put "After Find() Method: " @25 t =;
 
   h.replace ();       /* Re-Initializes the key summary to s = 2  */
   h.sum (sum : t);
   put "After Replace() Method: " @25 t =;
 
run;

The documentation and literature on the topic are sparse. However, do read the article Let Hash SUMINC Count For You and this thread at SAS-L for further discussion.

The Keysum Argument

Next, let us look at the Keysum argument. The documentation says that the Keysum Argument “specifies the name of a variable that tracks the key summary for all keys. A key summary is a count of how many times a key has been referenced on a FIND method call.” This is not strictly accurate, and there are a few points that I think need to be added. First of all, the key summary is not a simple count. A count implies an increment of 1. However, the Keysum increases by the value of the suminc variable, which can take any value, including non-integers.
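
To illustrate the point about non-integer increments, here is a small sketch in the style of the Suminc example above. It assumes, as described earlier, that the key summary increments by whatever value s holds at the time of the call, so s is changed between the two method calls. The values in the comments are what those rules predict, not log output; run the step to verify them.

data _null_;
   declare hash h (suminc : 's');
   h.definekey ('k');
   h.definedone ();

   k = 1;

   s = 0.5;
   h.add ();           /* Should initialize the key summary to 0.5        */
   h.sum (sum : t);
   put "After Add() Method: " @25 t =;

   s = 1.25;
   h.find ();          /* Should increment by the current s, giving 1.75  */
   h.sum (sum : t);
   put "After Find() Method: " @25 t =;
run;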

Secondly, the Find() Method is not the only method that increases the Keysum variable. Take a look at the code below. Run it and check the test output data set from the Output Hash Object Method. The result implies that Ref() and Add() also increment the value of Keysum. However, it seems that a Find() Method call must be present for the Keysum variable to take values other than zero. Try commenting out the Find() Method calls. Now, the Keysum value is not incremented, but remains 0. However, this does not imply that the Key Summary is equal to zero. If you comment out the Find() Method calls, but insert a Sum() Method call after the last Ref() Method call, you will see that the Key Summary is equal to 6.

data _null_;
   declare hash h (suminc : 's', keysum : 'keysum');
   h.definekey ('k');
   h.definedone ();
 
   k = 1;
   s = 2;
   keysum = 0;
 
   h.add ();
   h.ref ();
   h.ref ();
 
   h.find ();
   h.find ();
 
   h.output (dataset : 'test');
run;
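
The variation described before the code above, with the Find() Method calls commented out and a Sum() Method call inserted after the last Ref() Method call, might look like the sketch below. The variable t and the put statement are additions for illustration. According to the discussion above, the log should show the key summary as 6, while Keysum in the test data set stays at 0.

data _null_;
   declare hash h (suminc : 's', keysum : 'keysum');
   h.definekey ('k');
   h.definedone ();

   k = 1;
   s = 2;
   keysum = 0;

   h.add ();
   h.ref ();
   h.ref ();

   /* h.find (); */
   /* h.find (); */

   h.sum (sum : t);    /* Retrieve the key summary directly */
   put "Key summary: " @25 t =;

   h.output (dataset : 'test');
run;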

As with the Suminc Argument, the documentation is short. The best explanation in the documentation is found in the Maintaining Key Summaries section in Using the Hash Object. There are not many examples to refer to out there. However, there are a few at Stack Overflow here and here. I recently opened a discussion about the Keysum argument at Stack Overflow.

Summary

In this post, I investigated the rarely used Suminc and Keysum arguments in the SAS Hash Object. I attempted to clear up a few frequent misconceptions by example and addressed a few inaccuracies in the usually spot-on SAS documentation. I tried to make things as clear and simple to follow as possible. In a future blog post, I will dig deeper and demonstrate what kind of real-life problems we can solve with the two options.

Not long ago, I investigated another rarely used option in the blog post The Hash Object Memrc Argument in Definedone. Also, see the related posts Create a FIFO Queue in SAS and An interesting PDV Application of the SAS Hash Object.

You can download the entire code from this post here.