Most SAS programmers find the hash object hard at first. Its syntax differs from most other SAS syntax, and several conditions must be met for a hash object operation to succeed. In this post, I list 6 of the most common reasons why the hash object fails, yields errors in the log or gives wrong results. The examples do not relate to a specific task. Rather, they demonstrate basic conditions that must be fulfilled in each situation.
Missing Parameter Type Matching
Missing parameter type matching is probably the most common source of hash object errors. Parameter type matching means that the key and data variables in the hash object must exist in the data step PDV under exactly the same names. These host variables are what the data step uses to send data to and retrieve data from the object. Even if we do not intend to perform any search or add operation, this condition must be fulfilled.
Consider the code below. Here, I declare a hash object h. I define the key variable k and the data variable d. This code yields an error: ERROR: Undeclared key symbol k for hash object at line xx. This happens because the key and data variables do not have host variables in the PDV. The Definedone() Method is responsible for checking proper parameter type matching. Consequently, xx in the log error message will point to the Definedone() line. If you uncomment the commented-out statements (k=.; d=.;), no error appears because both k and d then exist as host variables in the PDV.
    data _null_;
       declare hash h();
       h.definekey('k');
       h.definedata('d');
       h.definedone();
       /*k=.; d=.;*/
    run;
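Assigning missing values is not the only way to create the host variables. A minimal sketch of one alternative, assuming both k and d are numeric as in the example above: declare them with a LENGTH statement, then use CALL MISSING to avoid "uninitialized" notes in the log.

```sas
data _null_;
   /* The LENGTH statement creates k and d in the PDV at compile time */
   length k d 8;

   declare hash h();
   h.definekey('k');
   h.definedata('d');
   h.definedone();   /* parameter type matching is checked here */

   /* Assign missing values so the log does not report k and d as uninitialized */
   call missing(k, d);
run;
```

When the hash object is loaded from an existing data set, the variables usually already exist in the PDV because the data set is read in the same step, so no extra declaration is needed.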
The Hash Object Exceeds Memory
The hash object is an in-memory construct. Not surprisingly, the memory available to your SAS session limits the amount of data an object can hold. Consider the code below, which tries to add 10 million items, each carrying a 1,000-byte character value. This gives me the error message: ERROR: The SAS System stopped processing this step because of insufficient memory.
    data _null_;
       declare hash h();
       h.definekey('k');
       h.definedata('d');
       h.definedone();
       k=.;
       length d $ 1000;
       do k=1 to 1e7;
          rc=h.add();
       end;
    run;
You cannot make the hash object independent of memory. However, there are techniques to limit its memory footprint. I have previously written about hash object memory management in other blog posts.
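Before loading a large object, it can help to estimate how much memory it will need. The sketch below is one rough way to do that, not a definitive method: it loads a small sample of 1,000 items (an arbitrary number chosen for illustration) and multiplies the Item_Size attribute by the Num_Items attribute. The variable names item_size, num_items and approx_bytes are my own, and the result should be read as a lower bound since it does not include the overhead of the object itself.

```sas
data _null_;
   length k 8 d $ 1000;

   declare hash h();
   h.definekey('k');
   h.definedata('d');
   h.definedone();
   call missing(k, d);

   /* Load a small sample instead of the full 1e7 items */
   do k = 1 to 1000;
      rc = h.add();
   end;

   /* Attributes are read without parentheses */
   item_size    = h.item_size;   /* bytes per item */
   num_items    = h.num_items;   /* items currently in h */
   approx_bytes = item_size * num_items;

   put item_size= num_items= approx_bytes=;
run;
```

Scaling approx_bytes to the intended number of items gives a first impression of whether the object will fit in the memory available to the session.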
Failed Unassigned Method Call
As I demonstrate in the blog post Assigned Vs Unassigned Hash Object Method Call in SAS, you can either call a method assigned or unassigned. A failed unassigned method call yields an error in the log. A failed assigned method call simply returns a non-zero return code to the assigned variable.
Consider the code below. The unassigned Find() Method call yields the error: ERROR: Key not found in the log because the key variable k=1 does not exist in h. Conversely, if we make the call assigned, no error will appear. The method call will simply return a non-zero return code.
The lesson: Always be aware of whether you should call a method assigned or not. Do you want a failed call to yield an error in the log? Call the method unassigned. Do you want a failed method call to return a non-zero return code and take action from there? Call the method assigned.
    data _null_;
       declare hash h();
       h.definekey('k');
       h.definedata('d');
       h.definedone();
       k=.; d=.;
       h.find(key: 1);
    run;
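For comparison, here is a sketch of the same step with an assigned call. The return code variable rc and the message text are my own choices for illustration. Because the return code is captured, no error is written to the log, and the step can decide what to do when the key is missing.

```sas
data _null_;
   declare hash h();
   h.definekey('k');
   h.definedata('d');
   h.definedone();
   k = .; d = .;

   /* Assigned call: a failed lookup returns a non-zero code instead of an error */
   rc = h.find(key: 1);

   if rc ne 0 then put 'Key 1 was not found in h. ' rc=;
run;
```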
Duplicate Keys Without the Multidata:'Y' Argument Tag
This should be obvious to most. However, I have fallen into this one myself too many times not to mention it. When you want to add multiple elements to h for the same key, use the Multidata:'Y' Argument Tag in the Declare Statement. Consider the code below. The second Add() Method call gives the message ERROR: Duplicate key in the log.
    data _null_;
       declare hash h();
       h.definekey('k');
       h.definedata('d');
       h.definedone();
       k=.; d=.;
       h.add(key: 1, data: 1);
       h.add(key: 1, data: 1);
    run;
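A sketch of the same step with the Multidata:'Y' Argument Tag in place. The data values 1 and 2 are chosen only to tell the two items apart. With multiple items per key, Find() retrieves the first item and Find_Next() walks through the remaining ones until it returns a non-zero code.

```sas
data _null_;
   declare hash h(multidata: 'Y');
   h.definekey('k');
   h.definedata('d');
   h.definedone();
   k = .; d = .;

   /* Both calls succeed because duplicate keys are allowed */
   rc = h.add(key: 1, data: 1);
   rc = h.add(key: 1, data: 2);

   /* Retrieve every item stored under key 1 */
   k = 1;
   rc = h.find();
   do while (rc = 0);
      put k= d=;
      rc = h.find_next();
   end;
run;
```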
Remember the Parentheses
Method calls have parentheses. Object attributes do not. Luckily, there are only two attributes to keep track of: Item_Size and Num_Items. If a call is not one of these two, remember the parentheses. Otherwise, SAS yields a syntax error in the log, as the Find call without parentheses in the code below does.
    data _null_;
       declare hash h();
       h.definekey('k');
       h.definedata('d');
       h.definedone();
       k=1; d=1;
       rc=h.add(key: 1, data: 1);
       rc=h.find;
    run;
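A corrected sketch of the step, with the method call and the two attributes side by side. The variable names n and size are my own and only there to hold the attribute values.

```sas
data _null_;
   declare hash h();
   h.definekey('k');
   h.definedata('d');
   h.definedone();
   k = 1; d = 1;

   rc   = h.add(key: 1, data: 1);
   rc   = h.find();       /* method: parentheses required */
   n    = h.num_items;    /* attribute: no parentheses */
   size = h.item_size;    /* attribute: no parentheses */

   put rc= n= size=;
run;
```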
Matched Values Are Retained
Be aware that matched hash object values are retained in the SAS data step. Consider the code below. Here, I create the same object as above and fill it with elements for key values 1 and 3 (not 2). Next, I try to find the key values 1, 2 and 3. Obviously, only 1 and 3 are found successfully. Since 2 does not exist in h, the Find() Method returns a non-zero return code and nothing is inserted into the host variable d. The correct value for d in the second observation would be missing. However, since d is retained from the previous (successful) Find() Method call, the value written to the output is 1. This is rarely desirable. Therefore, be sure to initialize the relevant data values to missing before the Find() Method call. That way, no retained values are falsely assigned to host variables.
    data test;
       declare hash h();
       h.definekey('k');
       h.definedata('d');
       h.definedone();
       k=.; d=.;
       h.add(key: 1, data: 1);
       h.add(key: 3, data: 3);
       do k=1, 2, 3;
          *call missing(d);
          rc=h.find();
          output;
       end;
    run;
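An alternative to clearing d before every lookup is to clear it only when the lookup fails. The sketch below is one way to write that, using a hypothetical output data set name test_fixed. The Find() call is used directly in the IF condition, so no separate return code variable is needed.

```sas
data test_fixed;
   declare hash h();
   h.definekey('k');
   h.definedata('d');
   h.definedone();
   k = .; d = .;

   rc = h.add(key: 1, data: 1);
   rc = h.add(key: 3, data: 3);
   drop rc;

   do k = 1, 2, 3;
      /* A failed lookup leaves the old value in d, so reset it explicitly */
      if h.find() ne 0 then call missing(d);
      output;
   end;
run;
```

In test_fixed, the observation for k=2 has a missing d, while k=1 and k=3 carry the values 1 and 3.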
In this post, I have explored 6 of the most common reasons why the SAS hash object fails, yields errors or produces undesired results. Of the six points, missing parameter type matching is probably the most common. Parameter type matching is a crucial hash object concept, and getting it wrong usually signals a gap in understanding of the hash object in general. For more on the basics, see 5 Tips to Learn and Understand the Hash Object in SAS.
Did I miss anything? Feel free to reach out! Debugging can be hard with hash objects. See the blog post Print Content of a SAS Hash Object in the Log for help.
You can download the entire code from this post here.