In the blog post SAS Hash Object Lookup Example I demonstrate how to perform a simple one to one lookup with the hash object. However, the usual business case is more complicated than this. Luckily, the SAS hash object is very flexible. This post demonstrates an example of a one to many merge with the Hash Object Do_Over Method. This means that we look up several data values for each given key value in a data set.
First, let us create some example data to work with. I create two simple data sets. The Customers data set contain customer IDs of interest. The Transactions data set contains transaction data for various customer IDs. The data is for demonstration purposes only.
data Customers; input CustID $ @@; datalines; 112 113 114 ; data Transactions; input CustID $ Quantity; datalines; 111 20 111 40 112 10 112 50 112 10 113 60 113 20 114 50 114 30 114 50 114 30 114 40 115 90 115 70 ;
Find(), Find_Next() Method Example
First, let us look at the traditional way to lookup several data values for a given key variable. I want to look up every transaction that each of the customers in the Customers data set has made. Therefore, I want to lookup (possibly) several data values from Transactions for each customer.
I can do this by setting up a hash object with CustID as the key and Quantity as the data variable. Then I read the Customers data set sequentially. Finally, I use a do loop and use the Find() Method in the first iteration to look up the first data value for the implicitly specified key. If the key variable is found, I invoke the Output Statement and use the Find_Next() Method to read the next data values for the key into the PDV. This continues until all data values are retrieved for the given key. The If 0 Then Set Statement ensures parameter type match.
data FindFindNext(drop=rc); if 0 then set Transactions; if _N_=1 then do; declare hash h(dataset:"Transactions", multidata:"Y"); h.definekey("CustID"); h.definedata("Quantity"); h.definedone(); end; set Customers; do rc=h.find() by 0 while (rc=0); output; rc=h.find_next(); end; run;
Do_Over() Method Example
The Find() and Find_Next() Method example above is the traditional hash object approach to lookup multiple data values for a given key value. However, in SAS 9.4, the Do_Over() Method was introduced to ease this process. The Do_Over method combines the two method calls from above into a single method call.
The code below creates exactly the same data as the example above. However, the code is arguably a lot simpler. Though, behind the scenes almost exactly the same happens. If the key is found, the first instance of the data values for the key is read into the PDV. Then, the Do_Over Method checks if another set of data values exist for the same key. If it does, then they are read into the PDV. If not, SAS goes to the next key value and starts over.
data Do_Over; if 0 then set Transactions; if _N_=1 then do; declare hash h(dataset:"Transactions", multidata:"Y"); h.definekey("CustID"); h.definedata("Quantity"); h.definedone(); end; set Customers; do while (h.do_over()=0); output; end; run;
This post demonstrates an example of how the SAS Hash Object Do_Over() Method lets you look up multiple data values for a given key. The Do_Over() Method is available since SAS 9.4 and replaces the traditional Find()/Find_Next calls. The Do_Over() Method is a nice example of the flexibility of the SAS Hash Object. In future posts, I will show more simple examples of how to exploit the many features of the Hash Object to your advantage.
All the methods called in the post are called assigned. This is always the case for hash object methods that search and retrieve data. Read more in the post Assigned Vs Unassigned Hash Object Method Call in SAS. Also, check out the Introduction to the Hash Iterator Object in SAS.
I use the Do_Over Method in the blog posts Complicated Fuzzy Grouping With The SAS Hash Object, Partial Key Lookup in SAS Hash Object and Single Key Changes in Multi Key Hash Objects.
You can download the entire code from this post here.