Convert From Character to Date in SAS
In the blog posts Convert from Character to Numeric and Convert from Numeric to Character I demonstrate the correct way of converting variable types in SAS by use of the INPUT and PUT Function respectively. A common use of converting a variable from character to numeric in SAS is when a date is stored as a character value. Lets us take a look at how to address this problem.
First off, let me make one thing clear. You should never ever store a date as a character variable! It makes it impossible to do calculations based on these values. See the blog post Date Basics in SAS to see examples of these calculations. Therefore it is crucial that your dates are stored as valid dates, i.e. as integers indicating the number of days since 01jan1960.
When you want to convert a date variable stored as text into a valid SAS date, you essentially want to convert a variable from character to numeric. In the article above, I present both the correct way and a naive approach of doing so. The naive approach (which you should not resolve to) is to use the character variable in a numeric context. This is an example where this would not work. Instead, we have to use the correct approach, namely the INPUT function.
SAS Code Example
When we convert a date stored as text into a valid SAS date, we first have to analyze in what format it is stored. Or more correctly, what informat the INPUT function should use to read the variable in order to convert the variable correctly. Consequently, the conversion requires two simple steps:
- Find out what format the text date is stored in.
- Use the INPUT Function to convert the character variable to a valid SAS date.
Usually, I use this table of SAS Date Formats to answer number 1 above. In the
data Dates; CharDate1="20oct2017"; /* Format: date9 */ CharDate2="20/10/2017"; /* Format: ddmmyy10 */ CharDate3="10/20/2017"; /* Format: mmddyy10 */ Numdate1=input(CharDate1, date9.); NumDate2=input(CharDate2, ddmmyy10.); NumDate3=input(CharDate3, mmddyy10.); /* format Num: date9.; */ run; proc contents data=Dates; run; proc print data=Dates; run;
To the right, you can see the output from the PROC CONTENTS procedure. As you can see, the three CharDate variables all have type Character, with the length equal to the number of letters forming them. As a result of the conversion from the INPUT function, the NumDate variables all have numeric types. If you run the Print Procedure, you will see that all the numeric date values resolve to the same numeric value, namely 21112, which is the date equivalent of the 14. October 2017.
I will point this out again, because it really is that important. Never ever store your date values as character variables. They will loose all computational power, because they are merely text values. Also you can not sort them in a meaningful way through time. If you sort them, they will be sorted alphabetically, which will make no sense at all.
You can download the entire code from this post here.