The SAS hash object is an in-memory data structure, which is one of the reasons for its efficiency. It is therefore crucial that the object fits into memory. SAS issues an error if it does not, but it is still a good idea to make sure that the hash object does not come close to occupying all available memory. In this post, I demonstrate how to measure the memory footprint of a hash object with the XMRLMEM option.
Measure The Size of a SAS Hash Object
There are several ways to estimate the memory consumption of a hash object. If you load the items from a data set, you can take the record length from PROC CONTENTS and multiply it by the number of records. This gives you a reasonable estimate. If the hash object items do not come from a data set, you can multiply the ITEM_SIZE and NUM_ITEMS attributes instead.
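As a rough illustration of the two estimates, here is a minimal sketch. The data set work.Demo and its variables are made up for this example. Reading the record length and record count from DICTIONARY.TABLES (the OBSLEN and NOBS columns) is my own shortcut for looking them up in the PROC CONTENTS listing, not something the estimate depends on.

```sas
/* Hypothetical demo data: 1,000 rows, two numeric variables */
data work.Demo;
   do x = 1 to 1000;
      y = x * 2;
      output;
   end;
run;

/* Estimate 1: record length times number of records,       */
/* taken from DICTIONARY.TABLES instead of the PROC CONTENTS */
/* listing (assumption: the data set lives in WORK)          */
proc sql noprint;
   select obslen * nobs into :est_bytes trimmed
   from dictionary.tables
   where libname = 'WORK' and memname = 'DEMO';
quit;
%put Data set based estimate: &est_bytes bytes;

/* Estimate 2: ITEM_SIZE times NUM_ITEMS on the loaded hash object */
data _null_;
   if 0 then set work.Demo;           /* define x and y in the PDV */
   declare hash h(dataset:"work.Demo");
   h.defineKey('x');
   h.defineData('y');
   h.defineDone();
   item_bytes = h.item_size;          /* bytes per item   */
   n_items    = h.num_items;          /* items in the hash */
   est_bytes  = item_bytes * n_items;
   put "Hash based estimate: " est_bytes sizekmg10.2;
run;
```

The two numbers will not necessarily agree, since the record length of a data set and the item size of a hash object are not measured the same way. Both are estimates only.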
While the above approaches estimate the memory occupation well, the method I will show you measures the memory occupation exactly. In the example page, Check Available Memory to SAS Session, I demonstrate how you can check how much memory is currently available to the SAS session. We can use this exact technique here. In the example below, I read some made-up data into a hash object. I wrap the creation of the hash object in statements that measure the available memory before and after the hash object is initialized. This gives me the exact size of the object.
/* Made-up data: 10 million rows with a single numeric variable */
data MyData;
   do x=1 to 10e6;
      output;
   end;
run;

data _null_;
   if 0 then set MyData;                        /* define x in the PDV */
   before=input(getoption('xmrlmem'),20.);      /* available memory before */
   if _N_=1 then do;
      declare hash h(dataset:"MyData");
      h.defineKey('x');
      h.defineDone();
   end;
   after=input(getoption('xmrlmem'),20.);       /* available memory after */
   hashsize=before-after;
   put "Hash Object Takes Up:" hashsize sizekmg10.2;
run;
I put the result in the log. The object in question takes up 461.46 MB of memory.
Estimate The Size of an Empty SAS Hash Object
The result you get from measuring the object size with the XMRLMEM option will differ from the estimate based on PROC CONTENTS output or on the ITEM_SIZE and NUM_ITEMS attributes. The reason is that even an empty hash object takes up memory. How much depends on the HASHEXP argument, which sets the number of hash buckets to 2 to the power of HASHEXP, each bucket holding a binary search tree. As an example, let us see how much memory an empty hash object takes up with the maximum number of search trees possible, HASHEXP:20.
data _null_;
   k=.;                                         /* key variable must exist in the PDV */
   before=input(getoption('xmrlmem'), 20.);     /* available memory before */
   if _N_ = 1 then do;
      declare hash h(hashexp:20);
      h.defineKey('k');
      h.defineDone();
   end;
   after=input(getoption('xmrlmem'), 20.);      /* available memory after */
   hashsize=before-after;
   put "Hash Object Takes Up:" hashsize sizekmg.;
run;
The empty hash object only takes up 13 MB of memory. And remember, this is with the maximum number of search trees possible. Whenever you read large amounts of data into hash objects, it is usually a good idea to specify HASHEXP:20.
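To see how the size of an empty hash object changes with HASHEXP, the same before and after measurement can be wrapped in a small macro and run for a few values. This is only a sketch. The macro name empty_hash_size and the chosen HASHEXP values are my own picks for illustration, and the sizes you see will depend on your SAS version and platform.

```sas
/* Hypothetical helper: measure an empty hash object for one HASHEXP value */
%macro empty_hash_size(exp);
   data _null_;
      k = .;                                       /* key variable in the PDV */
      before = input(getoption('xmrlmem'), 20.);   /* available memory before */
      declare hash h(hashexp:&exp);
      h.defineKey('k');
      h.defineDone();
      after = input(getoption('xmrlmem'), 20.);    /* available memory after  */
      hashsize = before - after;
      put "HASHEXP=&exp, empty hash object takes up: " hashsize sizekmg10.2;
   run;
%mend empty_hash_size;

/* Each call runs its own DATA step, so the previous object is gone */
%empty_hash_size(0)
%empty_hash_size(8)
%empty_hash_size(16)
%empty_hash_size(20)
```

Running each measurement in a separate DATA step keeps the results independent, because the hash object is deleted when its step ends.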
In this post, we have learned how to measure the memory consumption of a hash object with the XMRLMEM option. We have also seen that even an empty hash object takes up memory, although its size is usually small enough to ignore.
I use the technique presented in this post in the blog posts Three Basic Techniques to Reduce SAS Hash Object Size and Two Advanced Techniques to Reduce SAS Hash Object Size. Also, see the relevant blog post The Hash Object Memrc Argument in Definedone.
You can download the entire code from this post here.