Sustainability of Digital Formats: Planning for Library of Congress Collections


RAR Archive File Format Family


Identification and description

Full name RAR Archive File Format Family
Description

RAR (Roshal ARchive), named for its creator, software developer Eugene Roshal, is a proprietary archive file format that supports data compression, error recovery, and file spanning. There are at least six main versions and subversions of the RAR format. Because the early versions have little formal documentation, only selected versions are currently described at this site. The early versions RAR1.3 and especially RAR1.5 are thought to be the basis for later versions, but no public documentation is available about them. Comments welcome. RAR4 is also known as RAR version 2.9 in some documentation. The current version as of this writing is RAR5. See Relationships below for more details.

Similar in purpose to ZIP files, RAR files are data containers in which one or more files are stored in compressed form. The RAR format, along with RAR's compression applications and libraries, is proprietary and under copyright held by Alexander Roshal, brother of Eugene Roshal. RAR files are the native format of WinRAR software, which is licensed to win.rar GmbH. They can only be created with this tool, although there are several options for opening RAR files. See External Dependencies for details.

Structurally, a RAR file is composed of variable-length blocks of required and optional data. The precise composition of the blocks evolved with the versions. At its core, a RAR file consists of a marker or introductory block, an archive block that includes the archive header and file headers, and a closing block that includes additional comments or other information needed to process the file correctly. The order of these blocks may vary, but the first block must be a marker block followed by an archive header block. The archive block is the most complex because it contains the headers for the archive itself as well as the file headers:

  • Self-extracting module (optional): Also known as SFX, this is any data preceding the file signature. Its size and contents are not defined.
  • File signature: Specific to the version of the format. It must be searched for from the beginning of the file through at least the maximum SFX module size. According to the ad hoc description at RARLab and to PRONOM, the file signatures for RAR2.0, RAR3, RAR4, and RAR5 all begin with "Rar!" (Hex: 52 61 72 21). See subtypes for more details. See File Signatures.
  • Archive encryption header (optional): Present only in archives with encrypted headers. Each header after this one begins with a 16-byte AES-256 initialization vector followed by the encrypted header data. The size of the encrypted header data block is aligned to a 16-byte boundary. The encryption version is declared in the tag of the same name; for RAR5, only AES-256 (value = 0) is supported.
  • Main archive header which, among other things, includes the optional Locator tag to quickly access the positions of different service blocks without scanning the entire archive.
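
The signature search described in the list above can be sketched as follows. This is a minimal illustration, not RARLab's implementation. The shared "Rar!" prefix comes from this description. The trailing signature bytes follow RARLab's technical notes, and the 1 MB search window and the function name find_rar_signature are assumptions made for the example.

```python
from typing import Optional, Tuple

# The 4-byte prefix "Rar!" (52 61 72 21) is cited in this description.
# The trailing bytes follow RARLab's technical notes; verify them against
# the version-specific subtype descriptions before relying on them.
RAR5_SIGNATURE = b"Rar!\x1a\x07\x01\x00"
OLDER_SIGNATURE = b"Rar!\x1a\x07\x00"  # RAR 1.5 through 4.x family

# Assumption: an optional SFX module is no larger than about 1 MB.
MAX_SFX_SIZE = 1024 * 1024


def find_rar_signature(data: bytes) -> Optional[Tuple[int, str]]:
    """Return (offset, label) of the first RAR signature found, or None."""
    # Only look as far as an SFX module could push the signature.
    window = data[: MAX_SFX_SIZE + len(RAR5_SIGNATURE)]
    pos = window.find(b"Rar!")
    while pos != -1:
        if window.startswith(RAR5_SIGNATURE, pos):
            return pos, "RAR5"
        if window.startswith(OLDER_SIGNATURE, pos):
            return pos, "RAR 1.5-4.x"
        # "Rar!" appeared inside SFX data by coincidence; keep scanning.
        pos = window.find(b"Rar!", pos + 1)
    return None
```

A non-zero offset in the result indicates that data, presumably an SFX module, precedes the archive proper.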

After the main archive header, but still within the archive block, come one or more file headers, one for each file within the archive. File headers are followed by optional Service headers.

  • File header: Includes among other data the Compression Record (stored as values 0 - 5 where 0 means no compression) and an optional Hash record for the standard CRC32 checksum. If another hash algorithm is used, it is stored in the extra area record.
  • Service headers: Optional headers that store supplementary information.
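
As an illustration of the two file header fields mentioned above, the sketch below checks a compression value against the documented 0 - 5 range and verifies extracted data against a stored CRC32. The function names are hypothetical, parsing of the header itself is out of scope, and the comparison assumes the stored value is the plain CRC32 of the unpacked, unencrypted data.

```python
import zlib


def is_stored_uncompressed(compression_value: int) -> bool:
    """Interpret the compression value; 0 means no compression."""
    if not 0 <= compression_value <= 5:
        raise ValueError(f"compression value out of range: {compression_value}")
    return compression_value == 0


def crc32_matches(unpacked_data: bytes, stored_crc32: int) -> bool:
    """Compare the standard CRC32 of the data with the value from the header."""
    # Mask to 32 bits so the result is an unsigned value on any platform.
    return (zlib.crc32(unpacked_data) & 0xFFFFFFFF) == stored_crc32
```

A mismatch signals that the extracted bytes differ from what was archived, which is the kind of damage the format's error recovery features are meant to address.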

The End of Archive Marker follows the last File Header to close out the archive block. RAR reads nothing after this marker, which permits third-party tools to add extra information, such as a digital signature, to the archive.

Relationship to other formats
    Has subtype RAR1.3, RAR Archive File Format, Version 1.3. No information available. Not described on this site at this time.
    Has subtype RAR1.5, RAR Archive File Format, Version 1.5. Often described as the basis for subsequent versions but no detailed information available. Not described at this site at this time.
    Has subtype RAR2, RAR Archive File Format, Version 2
    Has subtype RAR3, RAR Archive File Format, Version 3
    Has subtype RAR4, RAR Archive File Format, Version 4. RAR4 is also known as RAR version 2.9.
    Has subtype RAR5, RAR Archive File Format, Version 5

Local use

LC experience or existing holdings RAR files have appeared in various personal papers collections as a submission format for large PST email files.
LC preference None

Sustainability factors

Disclosure Proprietary format with limited public information.
    Documentation The full specification is not publicly available but file structure information for RAR5 is available through the RARLab website.
Adoption According to one report, RAR format has gained much popularity over these years as compared to its competitor archive formats like 7Z, zip, etc ... [because it has] better data compression rate than ZIP and uses a lossless compression.
    Licensing and patents RAR is a proprietary file format created solely by the compression software WinRAR. The decompression code is available for use in other programs and the license holder allows for its distribution, but with a license provision (detailed in the license.txt file from the UnRAR source code) that "Unrar source may be used in any software to handle RAR archives without limitations free of charge, but cannot be used to re-create the RAR compression algorithm, which is proprietary. Distribution of modified Unrar source in separate form or as a part of other software is permitted, provided that it is clearly stated in the documentation and source comments that the code may not be used to develop a RAR (WinRAR) compatible archiver."
Transparency Transparency is low because the compression algorithms are proprietary and not publicly available.
Self-documentation RAR files contain supporting metadata in headers to easily identify and organize the compressed files within.
External dependencies RAR files can only be created by WinRAR software but can be opened in other tools (aside from WinRAR) including The Unarchiver, PeaZip, RAR Opener, 7-Zip and many others. The now-defunct unrarLib tool only works with RAR files up to version 2.
Technical protection considerations RAR supports optional AES encryption, a type of block cipher which uses an algorithm that encrypts data on a per-block basis. There are various forms of the AES standard and the implementation used by RAR has changed with different versions of the format. RAR5 (current version as of this writing in March 2017) uses AES-256, a change from AES-128 used in RAR4.
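
The 16-byte figures in the archive encryption header description (initialization vector length and alignment boundary) match the AES block size. The arithmetic can be sketched as below. The function name is hypothetical, and the sketch only computes sizes under the stated layout; it does not perform encryption or reproduce RAR's actual padding bytes.

```python
AES_BLOCK_SIZE = 16  # bytes; also the IV length given in the header description


def encrypted_header_span(header_size: int) -> int:
    """Bytes occupied by one encrypted header: IV plus aligned header data."""
    if header_size < 0:
        raise ValueError("header_size must be non-negative")
    # Round the header size up to the next multiple of 16.
    aligned = (header_size + AES_BLOCK_SIZE - 1) // AES_BLOCK_SIZE * AES_BLOCK_SIZE
    return AES_BLOCK_SIZE + aligned
```

For example, a 17-byte header would be aligned to 32 bytes and, with its 16-byte initialization vector, would span 48 bytes.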

Quality and functionality factors

Other
Bundling/compression Separate functionality factors for comparing formats that are used to bundle and/or compress files have not been developed. From the perspective of digital preservation, consideration of the sustainability factors above is more important than the degree of compression.

File type signifiers and format identifiers

Tag | Value | Note
Filename extension | rar | For the data volume set
Filename extension | rev | For the recovery volume set
Filename extension | r00 | According to Wikipedia, in early versions of the format, multi-volume files were split with the first file name .rar followed by .r01, .r02 etc.
Internet Media Type | application/x-rar-compressed | From File-Extensions.org
Internet Media Type | application/vnd.rar | From PRONOM
File signature | See note. | Wikipedia states that RAR1.3 lacks a signature. Forensics Wiki states that older versions of the RAR file format have a file signature of "52 45 7E 5E", but there is no documentation to support this. Comments welcome. See subtypes for details.
PRONOM PUID | x-fmt/264 | For RAR2. See http://www.nationalarchives.gov.uk/PRONOM/x-fmt/264
PRONOM PUID | fmt/411 | PRONOM labels this as RAR version 2.9 but other documentation including WinRAR 5.0 and RAR for Android refer to this version as RAR4. See http://www.nationalarchives.gov.uk/PRONOM/fmt/411
PRONOM PUID | fmt/613 | For RAR5. See http://www.nationalarchives.gov.uk/PRONOM/fmt/613
Wikidata Title ID | Q243303 | No version distinctions. See https://www.wikidata.org/wiki/Q243303.
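
The filename extensions listed above could be used for a first-pass triage of files in a collection, as in the sketch below. The function name and the returned labels are hypothetical, and treating any extension of the form r plus two digits as an old-style volume is an assumption based on the r00 entry and its note. An extension is only a hint; the file signature remains the more reliable identifier.

```python
import re
from pathlib import PurePath

# Assumption: old-style multi-volume parts use "r" followed by two digits.
_OLD_STYLE_VOLUME = re.compile(r"r\d{2}")


def classify_rar_extension(filename: str) -> str:
    """Map a filename extension to the role it suggests within a RAR set."""
    ext = PurePath(filename).suffix.lower().lstrip(".")
    if ext == "rar":
        return "data volume"
    if ext == "rev":
        return "recovery volume"
    if _OLD_STYLE_VOLUME.fullmatch(ext):
        return "old-style continuation volume"
    return "not a RAR extension"
```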

Notes

General

According to the RARLab website, RAR has several advantages over ZIP files including "more convenient multipart (multivolume) archives, tight compression including special solid, multimedia and text modes, strong AES-128 encryption, recovery records helping to repair an archive even in case of physical data damage, Unicode support to process non-English file names and a lot more." In addition, RAR archives usually provide a noticeably higher compression ratio than the ZIP file format.

History  

Format specifications


Useful references

URLs


Last Updated: 07/11/2017