My 10 All Time Favourite SAS Articles
I like to read SAS articles. In fact, articles inspire many of the posts on my blog. Today, I will show you the 10 articles that taught me the most. This was not an easy task: there are so many great articles, in so many fields within SAS, by so many great authors. Obviously, the articles that I present here reflect my own interests within the SAS language. At the bottom of the post, I link to online resources where you can find articles that reflect your interests.
Proc Format Articles
The first two articles are about PROC FORMAT, one of my favorite procedures.
The Power of the PROC FORMAT by Jonas Bilenas. A boiled-down version of the similarly named book The Power of the PROC FORMAT. A very nice overview of how to think of a SAS format. You also learn how to create and implement custom formats. Furthermore, it demonstrates uses of formats that are not obvious to many programmers, such as lags, leads and lookups.
Ten Things You Should Know About PROC FORMAT by Jack N Shoemaker. A very nice set of PROC FORMAT facts that you should familiarize yourself with. Good examples are multilabel formats and hybrid formats. A very nice and easily readable article.
Data Step Articles
How to Think Through the SAS DATA Step by Ian Whitlock. If you want to know exactly how to think of data step processing, you have to read this article. After reading it, you will understand what the Program Data Vector is, what the _N_ automatic variable is, how compilation differs from execution, and much more. Also, did you know that the W in DoW Loop stands for Whitlock?
Direct Addressing Techniques of Table Look-Up by Paul Dorfman. This article is from 1998. However, it is still relevant today. The article covers three lookup techniques: Direct Addressing, Bitmapping and Hashing. Be aware that it was published before the hash object as we know it was available in the Data Step. Therefore, arrays are used to hold key and data variables. Different techniques for handling hash collisions are discussed. This article is a true masterpiece. Also, it was the inspiration for the blog post An Array Hashing Scheme in SAS.
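To make the idea of hashing with plain arrays concrete, here is a minimal sketch. It is written in Python rather than SAS, purely as a language-neutral model, and it is not taken from the article. The table size, the modulo hash and the linear-probing collision policy are all assumptions chosen for illustration; the names put, get and _slot are hypothetical.

```python
SIZE = 11  # assumed table size: a prime, larger than the number of keys stored

keys = [None] * SIZE  # parallel arrays: one holds keys, one holds data
data = [None] * SIZE


def _slot(key):
    """Return the slot holding `key`, or the first free slot on its probe path."""
    i = key % SIZE  # hash the key to a starting address
    while keys[i] is not None and keys[i] != key:
        i = (i + 1) % SIZE  # collision: linear probing, try the next slot
    return i


def put(key, value):
    i = _slot(key)
    keys[i] = key
    data[i] = value


def get(key):
    i = _slot(key)
    return data[i] if keys[i] == key else None


put(5, "a")
put(16, "b")  # 16 % 11 == 5, so this collides with key 5 and moves on
put(27, "c")  # collides again and lands two slots further

print(get(16))  # b
print(get(38))  # None (hashes to the same address, but was never stored)
```

The sketch assumes non-negative integer keys and a table that always keeps at least one free slot; a full table would make the probe loop for a missing key run forever.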
Re-Mapping A Bitmap by Lessia S. Shajenko and Paul Dorfman. The article above briefly introduces the concept of Bitmap Searching. This article takes it much further. You will learn how to verify the presence of a key variable, regardless of the key type. Also, you will learn different techniques to use as many bits as possible, thus reducing memory consumption to a minimum. A confusing topic at first, but if you hang in there, it will be worth every second you spend learning it. This article was the direct inspiration for the blog post Bitmap Searching in the SAS Data Step.
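The core idea of a bitmap search is that each possible key owns exactly one bit, so checking for a key means checking whether its bit is set. The following is a minimal sketch in Python rather than SAS, assuming non-negative integer keys with a known upper bound; the Bitmap class and its method names are hypothetical, and the article's own techniques for other key types are not reproduced here.

```python
class Bitmap:
    """One bit per possible key; keys are assumed to lie in 0..max_key."""

    def __init__(self, max_key):
        # 8 keys fit in each byte, so this is far smaller than one slot per key
        self.bits = bytearray(max_key // 8 + 1)

    def add(self, key):
        # key >> 3 selects the byte, key & 7 selects the bit inside that byte
        self.bits[key >> 3] |= 1 << (key & 7)

    def __contains__(self, key):
        return bool(self.bits[key >> 3] & (1 << (key & 7)))


bitmap = Bitmap(max_key=1000)
for key in (3, 64, 1000):
    bitmap.add(key)

print(64 in bitmap)      # True
print(65 in bitmap)      # False
print(len(bitmap.bits))  # 126 bytes cover keys 0 through 1000
```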
Leads and Lags: Static and Dynamic Queues in the SAS DATA STEP by Mark Keintz. SAS programmers often misuse or misunderstand the Lag Function. That will not be the case after reading this article. The key lesson is this: the Lag Function is a queue, not a lookback. Treating it as a lookback can create undesirable results, especially when handling By Groups. I did not fully understand the Lag Function until I read this article. This article was the inspiration for the blog posts Investigating the SAS Lag Function By Example and Dynamic Lags in SAS with the Hash Object.
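The queue behaviour is easiest to see in a toy model. The sketch below is Python, not SAS, and only mimics the documented semantics of a one-deep lag: every call returns whatever the previous call stored and then stores the current value. The Lag1 class, the lagged helper and the sample rows are hypothetical and not from the article.

```python
class Lag1:
    """Toy model of a one-deep lag queue. None stands in for a missing value."""

    def __init__(self):
        self.stored = None

    def __call__(self, value):
        # Return what the previous call stored, then store the current value
        previous, self.stored = self.stored, value
        return previous


def lagged(rows, call_on_every_row):
    """Lag x within each group; rows are (group, x) pairs sorted by group."""
    lag = Lag1()
    out = []
    prev_group = None
    for group, x in rows:
        first_in_group = group != prev_group
        if call_on_every_row:
            value = lag(x)  # the queue is updated on every row
            out.append(None if first_in_group else value)
        else:
            # Mistake: the queue is only updated when the call actually runs
            out.append(None if first_in_group else lag(x))
        prev_group = group
    return out


rows = [("A", 10), ("A", 20), ("B", 30), ("B", 40)]

print(lagged(rows, call_on_every_row=False))  # [None, None, None, 20]
print(lagged(rows, call_on_every_row=True))   # [None, 10, None, 30]
```

In the conditional version, the last row receives 20, the value from the previous call rather than from the previous row. Calling the lag on every row and blanking the first row of each group afterwards gives the intended result.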
From Stocks to Flows: Using SAS Hash Objects for FIFO, LIFO, and other FO’s by Mark Keintz. A very nice text that explains how to exploit the dynamic nature of the SAS Hash Object to implement various stack and queue structures in SAS. Highly recommended.
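For readers new to the terms, the difference between FIFO and LIFO comes down to which end of the structure items are taken from. The sketch below is Python, not SAS, and uses a double-ended queue instead of a hash object, so it only illustrates the two orderings and not the article's implementation. The consume function and the (price, units) lots are hypothetical.

```python
from collections import deque


def consume(lots, quantity, order="FIFO"):
    """Take `quantity` units from `lots`, a list of (price, units) pairs.

    FIFO takes from the oldest lot first, LIFO from the newest.
    Returns (taken, remaining), both as lists of (price, units).
    """
    remaining = deque(lots)
    taken = []
    while quantity > 0 and remaining:
        price, units = remaining.popleft() if order == "FIFO" else remaining.pop()
        used = min(units, quantity)
        taken.append((price, used))
        quantity -= used
        if units > used:
            # Put the unused part of the lot back where it came from
            leftover = (price, units - used)
            if order == "FIFO":
                remaining.appendleft(leftover)
            else:
                remaining.append(leftover)
    return taken, list(remaining)


lots = [(10.0, 5), (12.0, 5), (15.0, 5)]  # oldest lot first

print(consume(lots, 7, "FIFO"))
# ([(10.0, 5), (12.0, 2)], [(12.0, 3), (15.0, 5)])
print(consume(lots, 7, "LIFO"))
# ([(15.0, 5), (12.0, 2)], [(10.0, 5), (12.0, 3)])
```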
The Swiss Army Knife of SAS Procedures by Michael Raithel. This article is the best introduction to the Datasets Procedure out there. Also, the Swiss Army Knife of SAS Procedures is a very fitting description. No other procedure can solve the same range of high-level data operations as PROC DATASETS. An absolute must-read for all SAS data scientists.
Getting Started with the SAS/IML Language by Rick Wicklin. If you are interested in learning the Interactive Matrix Language in SAS, this article is a good way to start. It gives you a nice overview of how to think of vectors and matrices. Also, it introduces fundamental IML concepts such as user-defined functions, data simulation and much more. For more IML material, Rick Wicklin is also the author of the book Statistical Programming with SAS/IML Software and the blog The Do Loop. Both are packed with IML tips.
Stupid Human Tricks with PROC EXPAND by David L. Cassell. Proc Expand is the kind of procedure that solves, in 5 lines of code, problems that would otherwise require 30 lines of data step code. Therefore, it is one of my favorite procedures, especially when handling time series data. This article introduces the syntax and many capabilities of the Expand Procedure. Highly recommended for anyone handling time series data in SAS.
A Cup of Coffee and Proc FCMP: I Cannot Function Without Them by Peter Eberhardt. A nice and gentle introduction to PROC FCMP. PROC FCMP is one of the most underused procedures in all of the SAS language. Probably because it is quite a mouthful at first sight. However, once you get familiar with the basics, you will discover a whole new world of possibilities. This article will help you do so.
In this post, I present my 10 favorite SAS articles. Did I miss any? If you read a SAS article that you think deserves to be on the list, feel free to reach out.
If you want to browse SAS articles, lexjansen.com is the place to go. You can also find them on the SAS Institute's site. However, lexjansen.com stores them all in one place. I highly encourage you to browse the site for SAS articles in your field of interest.