8 SAS National Language Support
8.1 Intro
SAS allows you to work with a wide variety of language encodings, and provides user interfaces for a handful of them as well. This also means that your SAS output might be encoded in many different ways. However, Rmarkdown assumes that all input is in the UTF-8 encoding (which should accommodate the whole variety).
8.2 Setup
Set up your document by loading SASmarkdown.
library(SASmarkdown)
Using a language encoding that is not your SAS default requires additional setup when SAS is started. This is accomplished by adding a -config option to the SAS command line. In Rmarkdown this is done through the engine.opts setting.
To then read a SAS output file in a non-“latin1” encoding into Rmarkdown, use the encoding chunk option.
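As a rough sketch of how the -config option and the engine.opts setting fit together, the option string can be assembled by a small helper. The function name nls_sasopts and its arguments are hypothetical, and the default installation path is simply the one used in the examples in this document; adjust it to your own SAS installation.

```r
# Hypothetical helper: build SAS command-line options that point SAS
# at the configuration file for a given language code (e.g. "fr", "zh").
nls_sasopts <- function(lang,
                        sas_root = "C:/Program Files/SASHome/SASFoundation/9.4") {
  cfg <- file.path(sas_root, "nls", lang, "sasv9.cfg")
  # The path contains a space, so it is wrapped in single quotes
  sprintf("-nosplash -ls 75 -config '%s'", cfg)
}

sasopts <- nls_sasopts("fr")

# Pass the options to the SAS engines (only if knitr is available)
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(engine.opts = list(sas = sasopts, saslog = sasopts))
}
```

Keeping the language code as the only varying argument makes it easy to switch between configurations without retyping the full path.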
8.3 Default Language
Your default language depends on how you have SAS configured. In my case, SAS defaults to an English language interface, and a “latin1” (or “wlatin1” on Windows) language encoding.
First, consider an example that gives us the default listing output:
proc means data=sashelp.class(keep=height);
run;
The MEANS Procedure
Analysis Variable : Height
N Mean Std Dev Minimum Maximum
------------------------------------------------------------------
19 62.3368421 5.1270752 51.3000000 72.0000000
------------------------------------------------------------------
8.4 French
I can switch to French by pointing SAS to the French language configuration file.
sasopts <- "-nosplash -ls 75 -config 'C:/Program Files/SASHome/SASFoundation/9.4/nls/fr/sasv9.cfg'"
knitr::opts_chunk$set(engine.opts=list(sas=sasopts, saslog=sasopts))
Because French output is also encoded in the “latin1” standard, nothing special is required to use the SAS output.
The result is:
proc means data=sashelp.class(keep=height);
run;
La procédure MEANS
Variable d'analyse : Height
N Moyenne Ec-type Minimum Maximum
------------------------------------------------------------------
19 62.3368421 5.1270752 51.3000000 72.0000000
------------------------------------------------------------------
And using HTML output:
proc means data=sashelp.class(keep=height);
run;
8.5 Chinese
Switching to a Chinese encoding requires one extra step. Not only do we have to ensure that SAS produces Chinese output (this might be your default, but it is not mine), but we also have to instruct Rmarkdown to transcode from Chinese (in this case, the “gbk” standard) to UTF-8.
So the SAS setup is:
sasopts <- "-nosplash -ls 75 -config 'C:/Program Files/SASHome/SASFoundation/9.4/nls/zh/sasv9.cfg'"
knitr::opts_chunk$set(engine.opts=list(sas=sasopts, saslog=sasopts,
sashtml=sasopts, sashtmllog=sasopts))
This Chinese output is encoded in the “gbk” standard, so an encoding chunk option is required to use the SAS output properly.
The code chunk looks like this:
```{sas, encoding="gbk"}
proc means data=sashelp.class(keep=height);
run;
```
The result is:
proc means data=sashelp.class(keep=height);
run;
MEANS PROCEDURE
分析变量: Height 身高(英寸)
数目 均值 标准差 最小值 最大值
--------------------------------------------------------------------
19 62.3368421 5.1270752 51.3000000 72.0000000
--------------------------------------------------------------------
And with HTML output:
proc means data=sashelp.class(keep=height);
run;
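The transcoding step described in this section can be illustrated in plain R with iconv(). This is only a sketch of the idea, not a description of how SASmarkdown implements it: a short UTF-8 string is encoded to “gbk” bytes (standing in for what SAS writes to its output file) and then converted back to UTF-8. It assumes your platform's iconv supports the name "GBK".

```r
# "分析变量" written with Unicode escapes so the script itself is encoding-safe
utf8_text <- "\u5206\u6790\u53d8\u91cf"

# Encode to GBK bytes, standing in for the bytes in a SAS output file
gbk_bytes <- iconv(utf8_text, from = "UTF-8", to = "GBK", toRaw = TRUE)[[1]]

# Transcode the raw bytes back to UTF-8
round_trip <- iconv(list(gbk_bytes), from = "GBK", to = "UTF-8")

length(gbk_bytes)        # GBK stores each of these characters in two bytes
round_trip == utf8_text  # the text survives the round trip
```

If the wrong source encoding is named in the second iconv() call, the result is either NA or unreadable text, which is what appears in the rendered document when the encoding chunk option is missing or incorrect.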
8.6 Documentation
To see what encodings you may use for SASmarkdown, use the function iconvlist(). The result is platform dependent.
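Because the list is long, it can help to search it. A minimal sketch, where the search pattern is just an illustration for simplified Chinese encodings:

```r
# All encoding names known to this R installation (platform dependent)
encodings <- iconvlist()

# Case-insensitive search for names related to simplified Chinese
grep("^(gb|euc-cn)", encodings, ignore.case = TRUE, value = TRUE)
```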
To see what encodings SAS is capable of producing see Encoding Values for a SAS Session. The options again depend on your operating system (platform).
To see what encoding SAS is using, run PROC OPTIONS and check the SAS log:
2 proc options option=encoding;
3 run;
SAS (R) PROPRIETARY SOFTWARE RELEASE 9.4 TS1M6
ENCODING=EUC-CN 指定 SAS 会话的默认字符集编码。
Be aware that the encoding name given by SAS may not match the encoding name used by R. Here, for instance, SAS calls its encoding “EUC-CN”, but that value fails in R (despite being in the iconvlist).
After a little experimentation, I find “gb2312” and “gbk” (an expanded version of “gb2312”) both appear to work. (But I don’t read or speak any Chinese language, so someone tell me if my choice is wrong!) I came to these guesses by
- looking at the output file with the Notepad++ text editor, which guessed the file is “gb2312”
- and by saving SAS HTML output and looking at the character set it declared, which was “gbk”.
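This kind of experimentation can also be done from R. The sketch below reads a file as raw bytes and reports which candidate encoding names iconv() can convert to UTF-8 without failing. The function name try_encodings and the candidate list are hypothetical, and the demonstration file is generated on the spot rather than produced by SAS. A successful conversion does not prove the encoding is the right one (for example, “latin1” will accept almost any bytes), so the converted text still needs to be inspected.

```r
# Hypothetical helper: which candidate encodings convert a file to UTF-8?
try_encodings <- function(path,
                          candidates = c("EUC-CN", "GB2312", "GBK", "latin1")) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  vapply(candidates, function(enc) {
    # Unsupported names raise an error; invalid byte sequences give NA
    out <- tryCatch(iconv(list(bytes), from = enc, to = "UTF-8"),
                    error = function(e) NA_character_)
    !is.na(out)
  }, logical(1))
}

# Demonstration file: two Chinese characters written as GBK bytes
demo_file <- tempfile(fileext = ".lst")
writeBin(iconv("\u5206\u6790", from = "UTF-8", to = "GBK", toRaw = TRUE)[[1]],
         demo_file)

result <- try_encodings(demo_file)
result
```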
Written using
- SASmarkdown version 0.8.0.
- knitr version 1.40.
- R version 4.2.2 (2022-10-31 ucrt).