The SAS Transport File Format is an openly documented specification maintained by SAS, a commercial company with a variety of software products for statistics and business analytics. These include the application now known as SAS/STAT, which originated in the late 1960s at North Carolina State University as SAS (an acronym for Statistical Analysis System). The transport format was originally developed in the late 1980s, when the corporate entity was known as SAS Institute, Inc. and the software as SAS, to support data transfers between statistical software systems, especially between SAS applications running on different operating systems. SAS considers it non-proprietary. This description covers the original version, now termed Version 5. The format is referred to in several ways, including XPORT and XPT; in this description, "SAS_xport_5" is used. Version 8 was introduced in October 2012; see Usage Note 46944: New SAS transport format and tools available. References on the Web to the SAS transport format that do not specify a version should probably be assumed to refer to Version 5.
See SAS_xport_family for a summary of the structure that is common to both versions of the format.
SAS_xport_5 is subject to certain restrictions. The list below is adapted from Appendix A1 in the Stata manual section entitled Import and export datasets in SAS XPORT format:
- The dataset may contain at most 9,999 variables. This limit follows from the 4 decimal digits available for the variable count in the NAMESTR header.
- The names of the variables and value labels may not be longer than eight characters and are case insensitive; for example, two names that differ only in letter case refer to the same variable.
- Variable labels may not be longer than 40 characters.
- The contents of a variable may be numeric or string:
- Numeric variables may be integer or floating point. Floating-point values may not have absolute values smaller than 5.398e-79 or greater than 9.046e+74. The range and precision are controlled by the IBM Double Precision (8-byte) numeric format. For more on how numeric formats are stored, see Numeric Precision in SAS Software.
- String variable values may not exceed 200 characters. String variables are padded with blank (space) characters to the fixed length declared in the descriptor for the variable. Hence, when variables are read, it cannot be determined whether the original variable value had trailing blanks.
- When a value is missing, a missing-value code is stored in the first byte of the data location for the variable, and the remaining bytes are padded with 0x00 to the declared length of the variable.
- Value labels are not written in the dataset. Suppose that you have a variable in the data with values 0 and 1, and the values are labeled for gender (0=male and 1=female). When the dataset is written in transport format, you can record that a variable label is associated with the variable, but you cannot record the association with the value labels male and female. Value-label definitions are typically stored in a second dataset or in a text file.
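
The storage rules in the list above for numeric, missing, and string values can be illustrated with a minimal Python sketch. It is not taken from SAS or Stata. The function names are hypothetical. The set of missing-value marker bytes ('.', '_', and 'A' through 'Z') is an assumption based on SAS's published record layout for Version 5 transport files. The sketch also assumes full 8-byte numeric fields and ASCII text.

```python
import math
from typing import Optional

# Assumption: first-byte markers for missing values are '.', '_', 'A'..'Z'.
MISSING_MARKERS = frozenset(b"._ABCDEFGHIJKLMNOPQRSTUVWXYZ")


def decode_numeric(raw: bytes) -> Optional[float]:
    """Decode an 8-byte IBM double; return None for a missing value."""
    if len(raw) != 8:
        raise ValueError("this sketch handles full 8-byte numeric fields only")
    fraction = int.from_bytes(raw[1:], "big")  # 56-bit hexadecimal fraction
    if fraction == 0:
        # Trailing bytes are all 0x00: either a missing marker or zero.
        return None if raw[0] in MISSING_MARKERS else 0.0
    sign = -1.0 if raw[0] & 0x80 else 1.0
    exponent = (raw[0] & 0x7F) - 64  # excess-64 exponent, base 16
    # value = sign * (fraction / 2**56) * 16**exponent
    return sign * math.ldexp(fraction, 4 * exponent - 56)


def encode_string(value: str, width: int) -> bytes:
    """Pad a string with spaces to the fixed width declared for the variable."""
    data = value.encode("ascii")
    if width > 200 or len(data) > width:
        raise ValueError("string values are limited to the declared width (max 200)")
    return data.ljust(width, b" ")


def decode_string(raw: bytes) -> str:
    """Strip the space padding; original trailing blanks cannot be recovered."""
    return raw.decode("ascii").rstrip(" ")
```

Under these assumptions, the bytes 41 10 00 00 00 00 00 00 decode to 1.0, and 00 10 00 00 00 00 00 00 decode to 16^-65, which is approximately the lower bound of 5.398e-79 quoted in the list. Encoding "abc" and "abc  " to a width of 8 yields identical bytes, which is why trailing blanks are lost on reading.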