Multilabel formats allow programmers to create overlapping classes in SAS. This can be quite handy when the classes are not mutually exclusive. This post demonstrates how to handle such cases with the Multilabel Option in the Format Procedure.

The examples that follow use the data set below, which is for demonstration purposes only.

data creditdata;
   array first_names{20} $20 _temporary_ ("Paul", "Allan", "Thomas", "Michael", "Chris", "David", "John", "Jerry", "James", "Robert",
                                          "William", "Richard", "Bob", "Daniel", "Paul", "George", "Larry", "Eric", "Charles", "Stephen");
   array last_names{20}$20 _temporary_ ("Smith", "Johnson", "Williams", "Jones", "Brown", "Miller", "Wilson", "Moore", "Taylor", "Hall",
                                        "Anderson", "Jackson", "White", "Harris", "Martin", "Thompson", "Robinson", "Lewis", "Walker", "Allen");
   call streaminit(123);
   do ID=1 to 1e5;
      first_name=first_names[ceil(rand("Uniform")*20)];
      last_name=last_names[ceil(rand("Uniform")*20)];
      creditrate=rand('integer', 1, 10);
      output;
   end;
 
   format ID z6.;
run;

Applying the Multilabel Option

Let us assume that the creditrate variable in the above data set represents the customer's credit rating. The smaller the value, the more creditworthy the customer is. Now, we want to categorize the credit ratings into buckets like ‘Strong Approval’, ‘Weak Approval’, ‘Approval’, and so on. However, note that, e.g., ‘Strong Approval’ and ‘Approval’ are not mutually exclusive categories. If you have a strong approval rating, you are still approved. This is awkward to handle with if-then-else logic, because each observation ends up in exactly one branch. However, the Multilabel Option in PROC FORMAT handles cases like this neatly.
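To make the contrast concrete, here is a minimal sketch of what the data step alternative might look like. It is not part of the approach used in this post; the data set name credit_long and the variable bucket are made up for illustration. Because one if-then-else chain can assign only one label per row, the sketch writes two rows per customer: one for the narrow bucket and one for the broad bucket.

```sas
/* Sketch only: credit_long and bucket are hypothetical names.      */
/* Each customer gets one output row per matching bucket, so the    */
/* result has twice as many rows as creditdata.                     */
data credit_long;
   set creditdata;
   length bucket $20;

   /* The narrow buckets are mutually exclusive, so else-if works */
   if      1 <= creditrate <= 2  then bucket = 'Strong Approval';
   else if 3 <= creditrate <= 6  then bucket = 'Weak Approval';
   else if 7 <= creditrate <= 8  then bucket = 'Weak Decline';
   else if 9 <= creditrate <= 10 then bucket = 'Strong Decline';
   output;

   /* The broad buckets overlap the narrow ones: a second row is needed */
   if      1 <= creditrate <= 6  then bucket = 'Approval';
   else if 7 <= creditrate <= 10 then bucket = 'Decline';
   output;
run;
```

This works, but the bucket boundaries are hard-coded in the step and the input data is duplicated, which is exactly what a multilabel format avoids.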

In the Format Procedure below, I create the numeric format appr. In the options, specified in parentheses before the ranges, I use the Multilabel Option to allow for overlapping ranges. If I leave out this option, SAS writes an error to the log: “ERROR: These two ranges overlap: 1-2 and 1-6 (fuzz=1E-12).” I use the Notsorted Option so that the later summary statistics display the ranges in the order specified in the procedure. I use the Default= Option simply because I like to control the length of both formats and variables whenever I can.

proc format library=work;
   value appr (default=20 multilabel notsorted)
      1-2  = 'Strong Approval'
      3-6  = 'Weak Approval'
      1-6  = 'Approval'
      7-8  = 'Weak Decline'
      9-10 = 'Strong Decline'
      7-10 = 'Decline'
   ;
run;
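One way to check that the format was stored as intended is to print its definition from the catalog. The sketch below assumes the appr format was created in the WORK library as shown above. It uses the FMTLIB Option together with a Select Statement so that only this one format is listed.

```sas
/* Print the stored definition of the appr format from work.formats. */
/* Assumes the format above has already been created in this session. */
proc format library=work fmtlib;
   select appr;
run;
```

The listing shows each range with its label, which makes it easy to confirm that the overlapping ranges were all accepted.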

Using the Multilabel Format in Summary Procedures

Next, let us put the multilabel format to work. I use PROC MEANS to calculate frequencies of credit approvals. Needless to say, you can calculate all kinds of descriptive statistics here. However, since the focus is on the format rather than the statistics, I will keep it simple.

In the Class Statement Options, I use the MLF Option to tell SAS that the format has overlapping ranges. Next, I use the Preloadfmt Option and the Order=data Option to make sure that the procedure maintains the order specified in the above Format Procedure. In the Format Statement, I simply specify the created numeric appr format.

I am most comfortable with the Means Procedure. However, you can use other summary statistics procedures as well. It depends on the specific situation. You can see examples of other approaches in the article Creating and Using Multilabel Formats.

proc means data=creditdata n maxdec=1 nonobs;
   class creditrate / mlf preloadfmt order=data;
   format creditrate appr.;
run;
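As an illustration of another summary procedure, here is a minimal sketch of the same count with PROC TABULATE, whose Class Statement also accepts the MLF, Preloadfmt and Order= Options. It assumes the creditdata data set and the appr format created earlier in this post. The row and column headings ('Credit decision' and 'Customers') are my own choices, not something the format requires.

```sas
/* Sketch: same multilabel count, this time with PROC TABULATE.     */
/* The headings 'Credit decision' and 'Customers' are illustrative. */
proc tabulate data=creditdata;
   class creditrate / mlf preloadfmt order=data;
   format creditrate appr.;
   table creditrate='Credit decision', n='Customers'*f=comma10.;
run;
```

As in the PROC MEANS step, a customer with a rating of 1 should be counted under both the narrow and the broad approval label, so the rows do not add up to the number of observations.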

Summary

In this post, we have seen a simple example of writing and utilizing multilabel formats in SAS. We have seen that multilabel formats swiftly handle classification problems with overlapping ranges. Overlapping ranges are much harder to handle within data step logic, where if-then-else statements are probably the first choice of many programmers.

Read the related posts Looking Up Data With PROC FORMAT, 5 Picture Format Options You Should Know and Nesting Formats in SAS With Proc Format.

You can download the entire code from this post here.