How Much Memory Does a SAS Hash Object Occupy?
The SAS hash object is an in-memory data structure, which is one of the reasons it is so efficient. It also means that your hash object must fit in memory. SAS will issue an error if it does not, but it is still a good idea to make sure that the hash object is not close to occupying the entire memory space. In this post, I will demonstrate how to measure the memory occupation of a hash object.
Measure The Size of a Hash Object
There are several ways to estimate the memory consumption of a hash object. If you add items from a data set, you can take the record length from PROC CONTENTS and multiply it by the number of records. This will give you a nice estimate. If the hash object items do not come from a data set, you can multiply the ITEM_SIZE and NUM_ITEMS attributes to do the same.
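As a rough sketch of the attribute-based estimate, the step below loads a small made-up data set and multiplies ITEM_SIZE by NUM_ITEMS. The data set name Sample and the variable names items, bytes and estimate are chosen only for this illustration. The result covers the stored items only, so treat it as an approximation.

```sas
/* Hypothetical sample data, used only for this illustration */
data Sample;
   do x=1 to 1e5;
      output;
   end;
run;

data _null_;
   if 0 then set Sample;               /* define x in the PDV without reading data */
   declare hash h(dataset:"Sample");
   h.defineKey('x');
   h.defineDone();
   items=h.num_items;                  /* number of items currently in the hash object */
   bytes=h.item_size;                  /* size of a single item in bytes */
   estimate=items*bytes;
   put "Estimated Hash Object Size:" estimate sizekmg10.2;
   stop;                               /* one pass is enough */
run;
```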
While the above approaches estimate the memory occupation well, the method I will show you measures it exactly. In the example page, Check Available Memory to SAS Session, I demonstrate how you can read the XMRLMEM option to check how much memory is currently available to the SAS session. We can use this exact technique here. In the example below, I read some made-up data into a hash object. I wrap the creation of the hash object in statements that measure the available memory before and after the hash object is initialized. The difference between the two readings is the size of the hash object.
data MyData;
   do x=1 to 10e6;
      output;
   end;
run;

data _null_;
   if 0 then set MyData;                       /* define x in the PDV without reading data */
   before=input(getoption('xmrlmem'),20.);     /* available memory before the load */
   if _N_=1 then do;
      declare hash h(dataset:"MyData");
      h.defineKey('x');
      h.defineDone();
   end;
   after=input(getoption('xmrlmem'),20.);      /* available memory after the load */
   hashsize=before-after;
   put "Hash Object Takes Up:" hashsize sizekmg10.2;
run;

I put the result in the log. The hash object in question takes up 461.46 MB of memory.
Estimate The Size of an Empty Hash Object
The result you get from measuring the hash object size with the XMRLMEM option will differ from the estimate you get from PROC CONTENTS output or from the ITEM_SIZE and NUM_ITEMS attributes. The reason is that even an empty hash object takes up memory. How much depends on the HASHEXP option, which controls the number of binary search trees. As an example, let us see how much memory an empty hash object with the maximum possible number of search trees takes up.
data _null_;
   k=.;                                        /* define the key variable */
   before=input(getoption('xmrlmem'), 20.);    /* available memory before the declaration */
   if _N_ = 1 then do;
      declare hash h(hashexp:20);              /* 20 is the largest HASHEXP value */
      h.defineKey('k');
      h.defineDone();
   end;
   after=input(getoption('xmrlmem'), 20.);     /* available memory after the declaration */
   hashsize=before-after;
   put "Hash Object Takes Up:" hashsize sizekmg.;
run;

The empty hash object takes up only 13 MB of memory. And remember, this is with the maximum number of search trees possible. Whenever you read large amounts of data into hash objects, it is usually a good idea to specify HASHEXP:20.
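To put the "maximum number of search trees" in numbers: HASHEXP:n requests 2**n of them, so the default of 8 gives 256 and the maximum of 20 gives 1,048,576. The small step below only does that arithmetic for a few values. The variable name buckets is mine and is not a hash object attribute.

```sas
data _null_;
   do hashexp=0, 8, 16, 20;            /* 8 is the default, 20 is the maximum */
      buckets=2**hashexp;              /* number of search trees requested */
      put hashexp= buckets= comma12.;
   end;
run;
```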
In this post, we have seen how to measure the memory consumption of a hash object, using the XMRLMEM option to do so. We have also seen that even an empty hash object takes up memory, though its size is usually small enough to ignore.
I use the technique presented in this post in the blog posts Three Basic Techniques to Reduce SAS Hash Object Size and Two Advanced Techniques to Reduce SAS Hash Object Size.
Also, I use the hash object's ability to grow and shrink directly in memory in the post The SAS Hash Object as a Dynamic Placeholder.
You can download the entire code from this post here.