The Microsoft Compound File Binary (CFB) file format is used for storing storage objects and stream objects in a hierarchical structure within a single file. See CFB_3 for a full description of the file system structure.
There are two active versions of CFB: version 3 and version 4. One major distinction between them is the sector size, which is 512 bytes for version 3 and 4096 bytes for version 4.
The minimum size of a compound file is three sectors: one header, one FAT sector and one directory sector.
- Compound files with 4096-byte sectors support 64-bit sizes for the file and for user-defined data streams, up to slightly less than 16 terabytes.
- The maximum number of directory entries (storage objects and stream objects) is roughly 4 billion. This corresponds to a maximum directory sector chain length of slightly less than 512 GB for a 4096-byte sector compound file.
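The size figures above can be checked with some quick arithmetic. The sketch below is illustrative only: it assumes that "roughly 4 billion" means about 2^32, that sectors are addressed with 32-bit numbers, and that each directory entry occupies 128 bytes (a value not stated in this description). All names are hypothetical.

```python
# Rough arithmetic behind the CFB size limits quoted above.
# Assumptions (not stated in the text): 32-bit sector numbers and
# 128-byte directory entries. The real limits are slightly lower
# because some values are reserved.

SECTOR_SIZE_V3 = 512
SECTOR_SIZE_V4 = 4096
DIR_ENTRY_SIZE = 128          # assumed size of one directory entry, in bytes
MAX_COUNT_32BIT = 2 ** 32     # "roughly 4 billion"


def min_compound_file_size(sector_size: int) -> int:
    """Header sector + one FAT sector + one directory sector."""
    return 3 * sector_size


# Upper bound on file size with 4096-byte sectors: 2^32 sectors * 4096 bytes.
MAX_FILE_BYTES_V4 = MAX_COUNT_32BIT * SECTOR_SIZE_V4

# Upper bound on the directory sector chain: 2^32 entries * 128 bytes.
MAX_DIR_CHAIN_BYTES = MAX_COUNT_32BIT * DIR_ENTRY_SIZE

if __name__ == "__main__":
    print(min_compound_file_size(SECTOR_SIZE_V3))   # bytes, version 3
    print(min_compound_file_size(SECTOR_SIZE_V4))   # bytes, version 4
    print(MAX_FILE_BYTES_V4 // 2 ** 40, "TiB")
    print(MAX_DIR_CHAIN_BYTES // 2 ** 30, "GiB")
```

Under these assumptions the bounds come out at exactly 16 TiB and 512 GiB, which is consistent with the "slightly less than" wording once reserved values are excluded.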
A file in the CFB format begins with a 512-byte header. In a compound file with 4096-byte sectors, the remainder of the first sector after the header is padded with zeroes. Values given below are as they occur in the physical file, for example when viewed using a hex dump utility.
- Header Signature for the CFB format with 8-byte Hex value D0CF11E0A1B11AE1. Gary Kessler notes that the beginning of this string looks like "DOCFILE".
- 16 bytes of zeroes
- 2-byte Hex value 3E00 indicating CFB minor version 3E. The specification states that the minor version should always be indicated as 3E.
- 2-byte Hex value 0400 indicating CFB major version 4.
- 2-byte Hex value FEFF indicating little-endian byte order for all integer values. This byte order applies to all CFB files.
- 2-byte Hex value 0C00 giving the sector shift (0x000C, or 12) and so indicating the sector size of 2^12 = 4096 bytes used for major version 4.
- 480 bytes for the remainder of the 512-byte header
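The field layout listed above can be read with a few lines of Python. This is a minimal sketch, not a full CFB reader: the function name and the returned keys are hypothetical, only the first 32 bytes are decoded, and the last field is treated as a power-of-two exponent (a "sector shift"), which is consistent with 0x000C giving 2^12 = 4096.

```python
import struct

# 8-byte signature as it appears in the physical file.
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")

# Little-endian layout of the first 32 header bytes:
# 8-byte signature, 16 bytes of zeroes, then four 2-byte integers.
_HEADER_PREFIX = struct.Struct("<8s16sHHHH")


def parse_cfb_header_prefix(data: bytes) -> dict:
    """Decode the leading fields of a CFB header (hypothetical helper)."""
    if len(data) < _HEADER_PREFIX.size:
        raise ValueError("need at least 32 bytes of header")
    signature, zeroes, minor, major, byte_order, sector_shift = (
        _HEADER_PREFIX.unpack_from(data, 0)
    )
    if signature != CFB_SIGNATURE:
        raise ValueError("not a CFB file: bad signature")
    return {
        "zero_field_ok": zeroes == bytes(16),
        "minor_version": minor,            # 0x003E expected
        "major_version": major,            # 3 or 4
        "byte_order": byte_order,          # bytes FE FF read as 0xFFFE
        "sector_size": 1 << sector_shift,  # 2 ** sector_shift
    }


if __name__ == "__main__":
    # Synthetic version 4 header built from the values listed above.
    sample = (
        CFB_SIGNATURE
        + bytes(16)
        + bytes.fromhex("3E000400FEFF0C00")
        + bytes(480)
    )
    print(parse_cfb_header_prefix(sample))
```

Because every integer in a CFB file is little-endian, the on-disk bytes FE FF decode to the integer 0xFFFE, and 04 00 decodes to major version 4. To inspect a real file, the first 512 bytes could be read with `open(path, "rb").read(512)` and passed to the same function.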