Sustainability of Digital Formats: Planning for Library of Congress Collections


PDF, Version 1.7 (ISO 32000-1:2008)


Identification and description

Full name PDF (Portable Document Format), version 1.7, Base level (ISO 32000-1:2008)
Description

PDF (Portable Document Format), developed by Adobe Systems Incorporated, is described by Adobe as a general document representation language. PDF represents formatted, page-oriented documents. These documents may be structured or simple. They may contain text, images, graphics, and other multimedia content, such as video and audio. There is support for annotations, metadata, hypertext links, and bookmarks.

The original version 1.7 of the PDF format was released November 2006 and associated with Acrobat and Adobe Reader 8.0. Version 1.7 was published as ISO 32000-1 in July 2008.

Among other new features, this version of PDF introduces an extensibility mechanism based on an Extensions Dictionary. This mechanism was used by Adobe to introduce new features, but is also available for other vendors or developers to use to establish published extensions. Adobe stated that it would maintain a publicly available registry of vendors to be used to identify extensions at http://adobe.com/go/ISO32000Registry. As of early 2019, the URL provided only a registration form. A PDF Name Registry is now available at https://github.com/adobe/pdf-registry.

Production phase In general, a final-state format for delivery to end users.
Relationship to other formats
    Subtype of PDF, Portable Document Format
    Has earlier version PDF_1_6, PDF, Version 1.6
    Has extension PDF_1_7_ext03, PDF, Version 1.7, ExtensionLevel 3

Local use

LC experience or existing holdings See PDF.
LC preference See PDF.

Sustainability factors

Disclosure Approved as international standard, ISO 32000-1:2008.
    Documentation

ISO 32000-1:2008. Document management -- Portable document format -- Part 1: PDF 1.7. Confirmed in 2018.

Adobe makes available an ISO-approved copy of the standard at http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf. An equivalent document, although organized and formatted differently, is PDF Reference, Sixth Edition, Version 1.7. November 2006, found at http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/pdf_reference_1-7.pdf

Adoption Widely adopted. However, in 2019, many PDF creation tools still create files that identify themselves as conforming to earlier versions of PDF.
    Licensing and patents

From the text of ISO 32000-1:2008, "The International Organization for Standardization draws attention to the fact that it is claimed that compliance with this document may involve the use of patents concerning the creation, modification, display and processing of PDF files which are owned by the following parties: Adobe Systems Incorporated, 345 Park Avenue, San Jose, California, 95110-2704, USA. ISO takes no position concerning the evidence, validity and scope of these patent rights. The holders of these patent rights has assured the ISO that they are willing to negotiate licenses under reasonable and non-discriminatory terms and conditions with applicants throughout the world. In this respect, the statements of the holders of these patent rights are registered with ISO. Information may be obtained from those parties listed above."

In association with the adoption of PDF, version 1.7 as an ISO standard (ISO 32000-1:2008), Adobe issued a Public Patent License, granting "every individual and organization in the world the royalty-free right, under all Essential Claims that Adobe owns, to make, have made, use, sell, import and distribute Compliant Implementations."

See PDF for more information from Adobe about royalty-free use of Adobe patents.

Transparency See PDF.
Self-documentation Version 1.4 and later of PDF can include XMP metadata packages. XMP is Adobe's framework for including arbitrary blocks of metadata, using a representation in RDF.
External dependencies

See PDF.

Starting with version 1.7, PDF document files may use extensions developed by organizations other than Adobe. Documents that employ these extensions to the baseline specification may not be fully functional in commonly available PDF viewers, such as Adobe Reader. See the PDF Name Registry on GitHub; the four-character names in the registry may be used to identify proprietary extensions or as prefixes associated with digital signatures. Adobe uses the prefix "ADBE."

Technical protection considerations See PDF.

Quality and functionality factors

Text
Normal rendering See PDF.
Integrity of document structure See PDF.
Integrity of layout and display See PDF.
Support for mathematics, formulae, etc. See PDF.
Functionality beyond normal rendering See PDF.

File type signifiers and format identifiers

Tag | Value | Note
Filename extension | pdf | See PDF.
Internet Media Type | application/pdf | Media type registered with IANA. See also PDF.
Magic numbers | Hex: 25 50 44 46 2D 31 2E 37; ASCII: %PDF-1.7 | From PRONOM. However, the magic number value in the header (%PDF-1.7) declaring the PDF version with which the file complies can be overridden elsewhere in the file. See Note below for more detail on determining which chronological version a PDF document declares itself to comply with.
Pronom PUID | fmt/276 | See http://www.nationalarchives.gov.uk/PRONOM/fmt/276 for PDF 1.7.
Wikidata Title ID | Q26085317 | See https://www.wikidata.org/wiki/Q26085317 for PDF 1.7.
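
As an illustration of the magic number listed above, the following minimal Python sketch tests whether a byte string begins with the PDF 1.7 signature. The names PDF_17_MAGIC and header_declares_pdf_17 are hypothetical, chosen only for this example. The check looks at the header alone, so it cannot detect a version override elsewhere in the file.

```python
# Signature bytes as listed by PRONOM for PDF 1.7 (hex 25 50 44 46 2D 31 2E 37).
PDF_17_MAGIC = bytes.fromhex("25 50 44 46 2D 31 2E 37")  # equals b"%PDF-1.7"


def header_declares_pdf_17(data: bytes) -> bool:
    """Return True if the data begins with the PDF 1.7 header signature.

    This inspects the header only; a Version entry in the document
    Catalog may declare a different version.
    """
    return data.startswith(PDF_17_MAGIC)
```

A caller would typically pass the first few bytes read from a file opened in binary mode.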

Notes

General

The chronological version of PDF with which a file complies can be declared in two places in the file. All PDF files should have a version identified in the header with the 5 characters %PDF- followed by a version number. For PDF files conforming to ISO 32000-1:2008 or earlier specifications (i.e., prior to ISO 32000-2:2017), the version number has the form 1.N, where N is a digit between 0 and 7. For example, PDF 1.7 is identified by %PDF-1.7. However, beginning with PDF 1.4, a conforming PDF writer may use the Version entry in the document Catalog to override the version specified in the header. The location of the Catalog within the file is indicated by the Root entry of the file trailer. This override feature was introduced to facilitate incremental updating of a PDF by simply appending to the end of the file. As a result, it is necessary to locate the Catalog within the file to get the correct version number. Unless the PDF is "linearized," in which case the Catalog is near the beginning of the file, this requires reading the trailer and then following the reference there to the Catalog, which will typically be compressed.

This has practical implications because format identification tools, including DROID, typically look for particular characters at the beginning of a file (i.e., in the header) to permit identification with minimal effort. DROID can look for characters at the end of the file, but cannot follow an indirect reference or decompress file contents. When the version number in the header differs from the one in the Catalog, there is potential for format identification errors.

The JHOVE PDF module does take account of the situation, stating that for PDF 1.0 - 1.6, "The PDF version is determined by the data specified in the PDF header and the Version key of the document catalog dictionary. In the event that these two values do not match, the Version key is taken as the authoritative value."
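
The two declaration points described above can be sketched in Python as follows. This is a simplified illustration, not how JHOVE or DROID work: it scans the raw bytes for a Version entry instead of following the trailer's Root reference, so it only finds a Catalog that is stored uncompressed. The function names are hypothetical. Preferring the Catalog value when both are present mirrors the JHOVE rule quoted above.

```python
import re
from typing import Optional, Tuple

# Header form described above: "%PDF-" followed by a version such as 1.7.
HEADER_RE = re.compile(rb"%PDF-(\d\.\d)")
# Version entry as it would appear in an uncompressed Catalog, e.g. "/Version /1.7".
CATALOG_VERSION_RE = re.compile(rb"/Version\s*/(\d\.\d)")


def declared_versions(data: bytes) -> Tuple[Optional[str], Optional[str]]:
    """Return (header_version, catalog_version); either may be None."""
    header = HEADER_RE.match(data)  # the header must be at the very start
    catalog = CATALOG_VERSION_RE.findall(data)
    header_version = header.group(1).decode("ascii") if header else None
    # Incremental updates append to the file, so the last entry found wins.
    catalog_version = catalog[-1].decode("ascii") if catalog else None
    return header_version, catalog_version


def effective_version(data: bytes) -> Optional[str]:
    """Prefer the Catalog's Version entry, falling back to the header."""
    header_version, catalog_version = declared_versions(data)
    return catalog_version or header_version
```

A file whose header says %PDF-1.4 but whose Catalog carries /Version /1.7 would be reported as 1.7 by effective_version, while a header-only tool would report 1.4.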

Extension mechanism for PDF format: PDF 1.7 introduced an extension mechanism based on an Extensions Dictionary. Adobe used this mechanism to specify features introduced with Acrobat 9.0 (June 2008) and 9.1 (June 2009). See PDF_1_7_ext03 and PDF_1_7_ext05. Vendors developing extensions were expected to choose 4-character identifiers and be listed in a registry. Adobe uses the identifier ADBE. As of early 2019, http://adobe.com/go/ISO32000Registry does not lead to a registry, but to a PDF with a form for submitting applications. Meanwhile, a PDF Name Registry is available as a spreadsheet on GitHub at https://github.com/adobe/pdf-registry. The plan for a registry was one of a small set of functional differences between Adobe's original specification for PDF 1.7 and the final ISO 32000-1:2008. For more detail about the mechanism for extending the PDF standard, see 7.12.2 Developer Extensions Dictionary and Annex E in ISO 32000-1:2008.
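
As a rough illustration of the extension mechanism, the Python sketch below scans raw bytes for a developer entry of the form /ADBE << /BaseVersion /1.7 /ExtensionLevel 3 >>. The key names BaseVersion and ExtensionLevel and this key order are assumptions taken from a reading of the ISO 32000-1 clause cited above. A real file may order the keys differently, include other keys, or store the Catalog in a compressed stream, in which case this scan finds nothing. The function name find_extensions is hypothetical.

```python
import re
from typing import List, Tuple

# Assumed layout: a 4-character developer prefix followed by a dictionary
# holding BaseVersion and ExtensionLevel, in that order.
EXTENSION_RE = re.compile(
    rb"/([A-Za-z0-9]{4})\s*<<\s*"
    rb"/BaseVersion\s*/(\d\.\d)\s*"
    rb"/ExtensionLevel\s+(\d+)\s*>>"
)


def find_extensions(data: bytes) -> List[Tuple[str, str, int]]:
    """Return (prefix, base_version, extension_level) for each match."""
    return [
        (prefix.decode("ascii"), base.decode("ascii"), int(level))
        for prefix, base, level in EXTENSION_RE.findall(data)
    ]
```

Under these assumptions, a file using Adobe's ExtensionLevel 3 would yield an entry with the prefix ADBE, base version 1.7, and level 3.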

Recommended practice to facilitate recognition of a PDF document as a binary file: Both Adobe's PDF Reference for version 1.7 and ISO 32000-1:2008 recommend that "If a PDF file contains binary data, as most do ..., it is recommended that the header line be immediately followed by a comment line containing at least four binary characters—that is, characters whose codes are 128 or greater. This ensures proper behavior of file transfer applications that inspect data near the beginning of a file to determine whether to treat the file’s contents as text or as binary." This practice is required in PDF documents conforming to any version of PDF/A.
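
The recommended practice can be checked with a short Python sketch like the one below. It assumes the header occupies the first line and the comment occupies the second, and it inspects only the first 1024 bytes; both the function name and that limit are choices made for this example, not requirements of the specification.

```python
def has_binary_comment(data: bytes) -> bool:
    """Return True if the header line is immediately followed by a comment
    line containing at least four bytes with values of 128 or greater."""
    lines = data[:1024].splitlines()
    if len(lines) < 2 or not lines[0].startswith(b"%PDF-"):
        return False
    second = lines[1]
    if not second.startswith(b"%"):  # PDF comments begin with a percent sign
        return False
    high_bytes = sum(1 for byte in second[1:] if byte >= 128)
    return high_bytes >= 4
```

For example, a file beginning with the header line and then a comment of four bytes such as E2 E3 CF D3 would pass this check, while a file whose second line starts an object definition would not.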

History

PDF 1.7 was released in November 2006 in association with version 8 of Acrobat and Adobe Reader. In January 2007, Adobe announced the intention to pursue standardization through TC 171/SC 2 of ISO. This process led to publication as ISO 32000-1 in July 2008. There are substantial editorial differences between the two specification documents, particularly in the order of material. Small functional differences may reflect asynchrony between the Adobe product development cycle and the ISO standardization process, but Adobe describes the specifications as "matching."

To quote from ISO 32000-1:2008, "The first version of PDF was designated PDF 1.0 and was specified by Adobe Systems Incorporated in the PDF Reference 1.0 document published by Adobe and Addison Wesley. Since then, PDF has gone through seven revisions designated as: PDF 1.1, PDF 1.2, PDF 1.3, PDF 1.4, PDF 1.5, PDF 1.6 and PDF 1.7. All non-deprecated features defined in a previous PDF version were also included in the subsequent PDF version. Since ISO 32000-1 is a PDF version matching PDF 1.7, it is also suitable for interpretation of files made to conform with any of the PDF specifications 1.0 through 1.7. Throughout this specification in order to indicate at which point in the sequence of versions a feature was introduced, a notation with a PDF version number in parenthesis (e.g., (PDF 1.3)) is used. Thus if a feature is labelled with (PDF 1.3) it means that PDF 1.0, PDF 1.1 and PDF 1.2 were not specified to support this feature whereas all versions of PDF 1.3 and greater were defined to support it."



Last Updated: 03/03/2019