Sustainability of Digital Formats: Planning for Library of Congress Collections


PDF_1_4, PDF Version 1.4


Identification and description

Full name PDF (Portable Document Format), version 1.4
Description

PDF (Portable Document Format), developed by Adobe Systems Incorporated, is described by Adobe as a general document representation language. PDF represents formatted, page-oriented documents. These documents may be structured or simple. They may contain text, images, graphics, and other multimedia content, such as video and audio. There is support for annotations, metadata, hypertext links, and bookmarks.

Version 1.4 was the basis for the first versions of ISO standards PDF/X and PDF/A.

Production phase In general, a final-state format for delivery to end users. PDF/X, based on PDF 1.4, is a middle-state format for submission of images to printers (prepress).
Relationship to other formats
    Subtype of PDF, Portable Document Format Family
    Has earlier version PDF_1_3, PDF, Versions 1.0-1.3
    Has later version PDF_1_5, PDF, Version 1.5
    Has subtype PDF/A-1, PDF for Long-term Preservation, Use of PDF 1.4. The first version of PDF/A was based on PDF 1.4
    Has subtype PDF/X, PDF for Prepress Graphics File Exchange. The first version of PDF/X was based on PDF 1.4

Local use

LC experience or existing holdings

The Library of Congress creates PDFs as service formats for some content it creates or makes available, including for some digitized historical materials, primarily to support convenient downloading and printing. Some of this content is in version PDF 1.4. Examples (as of early 2019) include text transcriptions made for books and pamphlets digitized for American Memory in the late 1990s: a broadside and a travel book from 1862.

The National Digital Newspaper Program, which produces Chronicling America, requires awardees to deliver one PDF per page, using detailed guidelines. These guidelines require XMP metadata following specific conventions and require that "The PDF will be compatible with Acrobat 5.0 or later." Hence the earliest version of PDF accepted is PDF 1.4; in practice, as of early 2019, all the awardees and LC itself appear to be using PDF 1.4. Each of these PDFs holds a single page image, with OCR text available for searching. Example: newspaper page from July 1930.

LC preference See PDF.

Sustainability factors

Disclosure Fully documented by Adobe Systems. Incorporated as a normative reference into ISO standards for PDF/A-1 and PDF/X-1. See also PDF.
    Documentation PDF Reference, Third Edition. Adobe Portable Document Format, Version 1.4. See also PDF.
Adoption

PDF 1.4 is widely used as the basis for the first version of PDF/X (for prepress graphics exchange) and for PDF/A-1.

As of early 2019, the LibreOffice Export to PDF command produces a PDF 1.4 file. Also as of early 2019, the printer/copier/scanners (MFDs) used by the Library of Congress can scan multipage documents directly to PDF 1.4 files. The MFDs support no other PDF option.

    Licensing and patents See PDF.
Transparency See PDF.
Self-documentation Version 1.4 can include XMP metadata packages. XMP is Adobe's framework for including arbitrary blocks of metadata, using a representation in RDF.
External dependencies See PDF.
Technical protection considerations See PDF.
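
The Self-documentation entry above notes that PDF 1.4 can carry XMP metadata packages. The following is a minimal sketch, not a definitive implementation, of how such packages might be located by scanning the raw bytes of a file for the `<?xpacket begin=... ?>` and `<?xpacket end=... ?>` wrappers that XMP uses. The function name `find_xmp_packets` is hypothetical, and the sketch assumes the metadata stream is stored without a compression filter; a compressed metadata stream would not be found this way.

```python
import re

# XMP packets are wrapped in <?xpacket begin=... ?> ... <?xpacket end=... ?>
# processing instructions. This pattern assumes the packet is stored as
# plain (unfiltered) bytes inside the PDF file.
XPACKET_RE = re.compile(rb"<\?xpacket begin=.*?<\?xpacket end=.*?\?>", re.DOTALL)


def find_xmp_packets(data):
    """Return every raw XMP packet found in the given PDF bytes (hypothetical helper)."""
    return XPACKET_RE.findall(data)
```

A caller could read a file with `open(path, "rb").read()` and pass the result to `find_xmp_packets`; parsing the RDF inside each packet is left to an XML or RDF library.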

Quality and functionality factors

Text
Normal rendering See PDF.
Integrity of document structure See PDF.
Integrity of layout and display See PDF.
Support for mathematics, formulae, etc. See PDF.
Functionality beyond normal rendering See PDF.

File type signifiers and format identifiers

Tag | Value | Note
Filename extension | pdf | See PDF.
Internet Media Type | application/pdf | Media type registered with IANA. See also PDF.
Magic numbers | Hex: 25 50 44 46 2D 31 2E 34; ASCII: %PDF-1.4 | From PRONOM. However, the magic number value in the header (%PDF-1.4) declaring the PDF version with which the file complies can be overridden elsewhere in the file. See Notes below for more detail.
Pronom PUID | fmt/18 | See http://www.nationalarchives.gov.uk/PRONOM/fmt/18 for PDF 1.4.
Wikidata Title ID | Q26085326 | See https://www.wikidata.org/wiki/Q26085326 for PDF 1.4.
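
As an illustration of the magic number listed above, the following minimal sketch checks whether a byte string begins with the PDF 1.4 signature. The names `PDF_1_4_MAGIC` and `header_declares_pdf_1_4` are hypothetical, and the check covers only the header declaration, which the file may override elsewhere.

```python
# Hex 25 50 44 46 2D 31 2E 34 is the ASCII string "%PDF-1.4".
PDF_1_4_MAGIC = bytes.fromhex("25 50 44 46 2D 31 2E 34")


def header_declares_pdf_1_4(data):
    """Return True if the bytes start with the PDF 1.4 magic number."""
    return data.startswith(PDF_1_4_MAGIC)
```

In practice only the first few bytes of a file need to be read, for example with `open(path, "rb").read(8)`.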

Notes

General

Identification of chronological versions of PDF can be given in two places in a PDF file. All PDF files should have a version identified in the header with the 5 characters %PDF- followed by a version number. For PDF files conforming to ISO 32000-1:2008 or earlier specifications (i.e., prior to ISO 32000-2:2017), the version number has the form 1.N, where N is a digit between 0 and 7. For example, PDF 1.4 is identified by %PDF-1.4. However, beginning with PDF 1.4, a conforming PDF writer may use the Version entry in the document Catalog to override the version specified in the header. The location of the Catalog within the file is indicated in the Root entry of the file trailer/footer. This override feature was introduced to facilitate the incremental updating of a PDF by simply adding to the end of the file.

As a result, it is necessary to locate the Catalog within the file to get the correct version number. Unless the PDF is "linearized," in which case the Catalog is up front, this will require reading the trailer and then using the reference there to locate the Catalog, which will typically be compressed. This has practical implications because format identification tools, including DROID, typically look for particular characters at the beginning of a file (i.e., in the header), to permit identification with minimal effort. DROID can look for characters at the end of the file, but is not able to follow an indirect reference or decompress file contents. When the version number is not the same in the header and the Catalog, there is potential for format identification errors.
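
The two places a version can be declared might be compared along the following lines. This is a rough sketch under stated assumptions, not how DROID or any other tool works: all function names are hypothetical, and instead of following the trailer's Root reference it naively scans the raw bytes for a `/Version` name, so it only sees a Catalog that is stored uncompressed and could be misled by a `/Version` key in some other dictionary. It also assumes, following the PDF Reference as understood here, that the Catalog entry applies only when it names a later version than the header.

```python
import re

HEADER_RE = re.compile(rb"%PDF-(\d\.\d)")
# Naive pattern for a Version entry such as "/Version /1.6" (assumption:
# the Catalog dictionary is stored as plain, uncompressed bytes).
CATALOG_VERSION_RE = re.compile(rb"/Version\s*/(\d\.\d)")


def header_version(data):
    """Return the version declared in the file header, or None."""
    match = HEADER_RE.match(data)
    return match.group(1).decode("ascii") if match else None


def catalog_version_naive(data):
    """Return the last /Version value found in the raw bytes, or None.

    The last occurrence is used because incremental updates append
    newer objects to the end of the file.
    """
    matches = CATALOG_VERSION_RE.findall(data)
    return matches[-1].decode("ascii") if matches else None


def _as_tuple(version):
    return tuple(int(part) for part in version.split("."))


def effective_version(data):
    """Return the Catalog version if it is later than the header's, else the header's."""
    header = header_version(data)
    override = catalog_version_naive(data)
    if header and override and _as_tuple(override) > _as_tuple(header):
        return override
    return header
```

A mismatch between `header_version` and `effective_version` would flag exactly the kind of file for which header-only identification could report the wrong version.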

History PDF 1.4 was published in November 2001, and corresponds to Acrobat version 5. PDF 1.4 was incorporated into the first versions of the PDF/X and PDF/A ISO standards.


Last Updated: 01/12/2019