The SAS Transport File Format is an openly documented specification maintained by SAS, a commercial company with a variety of software products for statistics and business analytics. Those products include the application now known as SAS/STAT, which originated in the late 1960s at North Carolina State University as SAS (an acronym for Statistical Analysis System). The transport format was originally developed in the late 1980s, when the corporate entity was known as SAS Institute, Inc. and the software as SAS, to support data transfers between statistical software systems, especially between SAS applications running on different operating systems. SAS considers the format non-proprietary. It is referred to in several ways, including XPORT. This description covers the second version, termed Version 8 after the first version of SAS that supported it. Version 8, named "SAS_xport_8" here, was introduced in October 2012; see Usage Note 46944: New SAS transport format and tools available. References on the Web to the SAS transport format that do not specify a version should probably be assumed to refer to Version 5.
See SAS_xport_family for a summary of the structure that is common to both versions of the format. For SAS_xport_8, all header labels include the string "V8".
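The "V8" marker in header labels suggests a simple way to tell the two versions apart. The following is a minimal sketch, not a definitive detector: it assumes that header records are 80 bytes long and begin with the literal text "HEADER RECORD*******", as in the published SAS record layouts, and the function name `is_v8_header` is hypothetical.

```python
# Assumption: every header record starts with this literal prefix.
HEADER_PREFIX = b"HEADER RECORD*******"
# Assumption: header records are fixed-length, 80 bytes each.
RECORD_LENGTH = 80


def is_v8_header(record: bytes) -> bool:
    """Return True if a header record carries the "V8" marker in its label."""
    if not record.startswith(HEADER_PREFIX):
        return False  # not a header record at all
    label = record[len(HEADER_PREFIX):RECORD_LENGTH]
    return b"V8" in label


def file_looks_like_v8(path: str) -> bool:
    """Check only the first record of a file; a heuristic, not a validation."""
    with open(path, "rb") as handle:
        return is_v8_header(handle.read(RECORD_LENGTH))
```

Checking only the first record is a deliberate shortcut. A stricter reader would verify that every header record in the file carries the marker.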
Several restrictions in Version 5 [SAS_xport_5] were lifted for this version:
- The field in the NAMESTR header that records the number of variables was widened from 4 decimal digits to 6 decimal digits.
- Variable names can be up to 32 characters long and are stored in their original case (upper or lower). Variable names can contain any character other than null (Hex 0x00). Version 5 allows only alphanumeric characters and underscores in names.
- The 40-character limit on variable labels was raised to 256 characters.
- The contents of a variable may be numeric or string:
  - No change was made to the representation of numeric variables, which may be integer or floating point. Floating point variables may not have absolute values smaller than 5.398e-79 or greater than 9.046e+74. The range and precision are controlled by the IBM Double Precision (8-byte) numeric format. For more on how numeric formats are stored, see Numeric Precision in SAS Software.
  - The 200-byte limit on character variables was raised to 32,767 bytes.
- When a value is missing, a missing-data code is stored in the first byte of the data location for the variable. The rest of the value is padded with Hex 0x00 bytes to the declared length of the variable.
- Value labels are not written in the dataset. Suppose that you have a variable in the data with values 0 and 1, and the values are labeled for gender (0=male and 1=female). When the dataset is written in Transport format, you can record the variable label associated with the variable, but you cannot record the association with the value labels male and female. Value-label definitions may be stored in a second dataset or in a text file.
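
The numeric representation and the missing-value rule in the list above can be illustrated with a short decoder. This is a sketch under stated assumptions rather than a reference implementation. It assumes a full 8-byte field in big-endian IBM hexadecimal floating point (1 sign bit, a 7-bit base-16 exponent biased by 64, and a 56-bit fraction). It also assumes that the missing-value codes are the ASCII characters ".", "_", and "A" through "Z", as described in the published SAS record layouts. The function name `decode_ibm_double` is hypothetical.

```python
# Assumption: missing values are flagged by one of these ASCII codes in the
# first byte, with all remaining bytes set to 0x00.
MISSING_CODES = frozenset(b"._ABCDEFGHIJKLMNOPQRSTUVWXYZ")


def decode_ibm_double(raw: bytes):
    """Decode an 8-byte IBM hexadecimal float; return None for a missing value."""
    if len(raw) != 8:
        raise ValueError("this sketch handles only full 8-byte numeric fields")

    # Missing value: a code in byte 0, then Hex 0x00 padding.
    if raw[0] in MISSING_CODES and raw[1:] == b"\x00" * 7:
        return None

    sign = -1.0 if raw[0] & 0x80 else 1.0
    exponent = (raw[0] & 0x7F) - 64             # power of 16, excess-64
    fraction = int.from_bytes(raw[1:], "big")   # 56-bit fraction as an integer
    if fraction == 0:
        return 0.0
    return sign * (fraction / 2**56) * 16.0**exponent


# The smallest normalized magnitude is 16**-65, roughly 5.398e-79,
# which matches the lower bound quoted for floating point variables.
smallest = decode_ibm_double(bytes.fromhex("0010000000000000"))
```

Note that the bytes `41 10 00 00 00 00 00 00` decode to 1.0, while `41 00 00 00 00 00 00 00` is read as the missing value coded "A". The two share a first byte, so the zero padding is what distinguishes a missing value from an ordinary number.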