Convert from Character to Numeric in SAS – The Lazy and The Right Way
When you deal with data in SAS, you will often face the challenge of converting a variable. A SAS variable can either be numeric or character. This is called the type of the variable. Once a variable has been declared, its type can not be changed. So when we want to convert variables in SAS, we are really talking about converting a variable value and assigning it to a new variable. In this post, I will focus on converting a character variable to numeric. First, I show a common pitfall that unfortunately, many SAS users resolve to. Next, I show the correct conversion method putting you in charge of the conversion.
From Character to Numeric – The Lazy Way
“I have a number stored as a character variable in SAS. Luckily, I can multiply it by 1 and it becomes numeric“. It is true that when you use a SAS character variable in an arithmetic context, SAS will interpret it as a numeric variable if it makes sense. But beware! It is very dangerous to blindly rely on this assumption.
Let us consider the example of a character variable containing the number 1000 with a thousand separator “.” and decimal separator “,”. Then we try to convert this value by multiplying by 1.
data _null_; CharNum = "1.000,00"; /* Character value 1000 */ Number = CharNum*1; /* Naive convert method */ put Number=; /* Print to log */ run;
As you can see from running this code, it returns a missing value as a result of a failed conversion. Also a note is printed to the log, indicating that a character value has been implicitly converted to numeric. Next let us consider the value 1mio in a character variable written with commas as thousand separator. This is common in Europe.
data _null_; CharNum = "1,000,000"; /* EU character value of 1mio */ Number = CharNum*1; /* Naive conversion method */ put Number=; /* Print to log */ run;
Again, we check the log and confirm that the conversion method of multiplying by 1 fails. The converted numeric variable is missing.
From Character to Numeric – The Right Way
It seems that the conversion method of using a character variable in a simple arithmetic context is not sustainable for converting variables. Because you can not tell the arithmetic operation what format is applied to the number stored as text. However, you can tell exactly that to the INPUT Function. The syntax of the INPUT function is INPUT(source, informat.), where source is the character value you want to convert and informat is the format applied to the value i.e. the informat for the function.
Below I convert the same character variables as above, but using the INPUT Function. I consult the SAS Documentation on Numeric Formats to learn what informats to use. It seems that I can use the comma10.2 informat to tell the INPUT function to interpret periods as thousand separators and the and the commax10. informat to interpret commas as thousand separators.
data ConvertData; CharNum1 = "1.000,00"; /* Character value of 1000 */ CharNum2 = "1,000,000"; /* EU character value 1mio */ Number1 = input(CharNum1, commax10.2); /* Correct convert method */ Number2 = input(CharNum2, comma10.); /* Correct convert method */ put Number1= Number2=; /* Print to log */ run;
From the printed values to the right, it seems that the INPUT function correctly interprets the formats of the numbers stored as text and converts the variables the desired way.
In conclusion, you should not rely on easy, fast track methods to convert from character type to numeric in SAS. As you have seen, you are not in control of the conversion this way, because you cannot tell an arithmetic operation what your character value looks like. Therefore, you should always use the INPUT Function when you convert from character to numeric in SAS.
Soon I will reverse the process and blog about how to Convert From Numeric to Character. Because unfortunately, many SAS users also take the lazy approach here.
You can download the entire program from this post here.