Proc Summary is my favorite SAS procedure for calculating descriptive statistics. Actually, it is one of my favorite SAS procedures overall. On this example page, I walk through a few simple examples of how to use Proc Summary. I also show a few tricks for getting exactly the layout and statistics you want.
Before we go to the examples, you should understand one thing. Proc Summary is the same procedure as Proc Means. There are only minor differences between the two, and the main one is the default behavior: Proc Summary stores descriptive statistics in a data set, while Proc Means displays them in an output destination, for example the HTML destination. You can read about the exact differences here.
In the examples below, I use the sashelp.class data set.
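To make the relationship between the two procedures concrete, here is a minimal sketch in which each one takes on the other's default behavior. The data set name means_out is my own choice for illustration.

```sas
/* Proc Means with NOPRINT and an OUTPUT statement writes a data set
   and displays nothing, which is what Proc Summary does by default */
proc means data=sashelp.class noprint;
   var height;
   output out=means_out;
run;

/* Proc Summary with PRINT displays a report in the open output
   destination, which is what Proc Means does by default */
proc summary data=sashelp.class print;
   var height;
run;
```

In practice, I would simply pick the procedure whose default matches what I want.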
A Simple Proc Summary Example
First, let us look at a simple example. In the code snippet below, I specify the variable of interest in the Var Statement. I also use the Output Statement to name the output data set. The result is an output data set, class, with five observations, one for each default statistic. The default statistics of Proc Summary are N, MIN, MAX, MEAN and STD. The data set contains four variables. The variable height holds the value of the computed statistic. Besides that, SAS creates three new variables:
- _TYPE_: Indicates which combination of the class variables is used to compute the statistic. The _TYPE_ variable can be a little hard to grasp at first. However, a nice explanation is in the Results Section of the documentation. In many cases, we only want the statistics for which all class variables contribute to the calculation. See the NWAY Option in the last code snippet for an easy way to achieve this.
- _FREQ_: The number of observations that contribute to the calculation of the statistic in Proc Summary.
- _STAT_: The name of the statistic.
proc summary data=sashelp.class;
   var height;
   output out=class;
run;
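A quick way to see the three automatic variables is to print the output data set. The sketch below repeats the step above so that it runs on its own and adds a Proc Print. Since there is no class variable, I would expect _TYPE_ to be 0 on every row and _FREQ_ to equal the 19 observations in sashelp.class.

```sas
proc summary data=sashelp.class;
   var height;
   output out=class;
run;

/* One row per default statistic; _STAT_ tells you which statistic
   the value in height belongs to */
proc print data=class noobs;
run;
```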
Add a Class Variable
Next, let us add a class variable to Proc Summary. This creates 15 observations instead of the 5 above: 5 for each sex and 5 overall. Also, Sex now appears as a variable in the data set. Not surprisingly, the Sex variable identifies the level of sex that each statistic is calculated for. It is blank on the overall rows.
proc summary data=sashelp.class;
   class sex;
   var height;
   output out=class;
run;
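To see where the 15 observations come from, one option is to count the output rows by _TYPE_ and Sex. This is only a sketch; the Proc Freq step is my addition and not part of the original example.

```sas
proc summary data=sashelp.class;
   class sex;
   var height;
   output out=class;
run;

/* _TYPE_ = 0 marks the overall rows (Sex is blank) and _TYPE_ = 1 the
   rows per level of Sex. MISSING keeps the blank level in the table. */
proc freq data=class;
   tables _type_*sex / list missing;
run;
```

The table should show three groups of five rows: one overall, one for F and one for M.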
Choose Statistics to Calculate
In most situations, we do not want all the default statistics in the output SAS data set. We can specify the statistics we want to store in the Output Statement of Proc Summary. In the code snippet below, I specify mean and sum. This way, I get two new variables with the names mean and sum. When I request statistics like this, Proc Summary does not output the _STAT_ variable, which is desirable in most situations. You can see the available statistics in the Output Statement Documentation.
proc summary data=sashelp.class;
   class sex;
   var height;
   output out=class mean=mean sum=sum;
run;
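Naming every output variable by hand gets tedious with several analysis variables. As a sketch of one alternative, the Autoname Option in the Output Statement builds the names from the analysis variable and the statistic. The second analysis variable weight and the data set name class_auto are my own choices for illustration.

```sas
proc summary data=sashelp.class;
   class sex;
   var height weight;
   /* No names after mean= and sum=. AUTONAME creates Height_Mean,
      Weight_Mean, Height_Sum and Weight_Sum instead of letting both
      statistics compete for the plain variable names. */
   output out=class_auto mean= sum= / autoname;
run;
```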
Use the NWAY Option in Proc Summary
Finally, let us look at the NWAY Option in the Proc Summary Statement. The NWAY Option is related to the _TYPE_ variable in the output data set. When we specify NWAY, Proc Summary limits the output to the observations with the highest _TYPE_ value. This means that SAS outputs only the observations where all class variables (if any) contribute to the statistic. Consequently, no overall statistics appear in the output.
proc summary data=sashelp.class nway;
   class sex;
   var height;
   output out=class mean=mean sum=sum;
run;
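The effect of NWAY is easier to see with more than one class variable. The sketch below adds age as a second class variable and runs the same step with and without NWAY. The second class variable and the data set names class_all and class_nway are my own choices for illustration.

```sas
/* Without NWAY, every combination of the class variables is output:
   _TYPE_ = 0 overall, 1 by Age, 2 by Sex, 3 by Sex and Age */
proc summary data=sashelp.class;
   class sex age;
   var height;
   output out=class_all mean=mean;
run;

/* With NWAY, only the rows where both Sex and Age contribute
   (_TYPE_ = 3) are output */
proc summary data=sashelp.class nway;
   class sex age;
   var height;
   output out=class_nway mean=mean;
run;

/* Count the rows per _TYPE_ value in the data set without NWAY */
proc freq data=class_all;
   tables _type_;
run;
```

Note that the rightmost class variable corresponds to the lowest _TYPE_ value, which is why Age alone gives _TYPE_ = 1 and Sex alone gives _TYPE_ = 2.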
Above, I provide a gentle introduction to Proc Summary in SAS. We see that the procedure lets us create descriptive statistics with little coding. Obviously, I only scratch the surface in this post. There are many options and statements that I do not discuss. I encourage you to browse the documentation and familiarize yourself with some of them. An example of a little-known option is the Idgroup Option in the Output Statement. I use this in the blog post 3 Ways to Select Top N By Group in SAS.
The documentation is the best reference for Proc Summary. However, the brilliant book Carpenter’s Guide to Innovative SAS Techniques has an entire chapter on the topic. Read chapter 7 and you will be good to go. Highly recommended.
You can download the entire code from this example page here.