Sustainability of Digital Formats: Planning for Library of Congress Collections |
Full name | PDF (Portable Document Format), versions 1.0-1.3 |
---|---|
Description |
PDF (Portable Document Format), developed by Adobe Systems Incorporated, is described by Adobe as a general document representation language. PDF represents formatted, page-oriented documents. These documents may be structured or simple. They may contain text, images, graphics, and other multimedia content, such as video and audio. There is support for annotations, metadata, hypertext links, and bookmarks. Versions through 1.3 are described here together for convenience. |
Production phase | In general, a final-state format for delivery to end users. |
Relationship to other formats | |
Subtype of | PDF_family, Portable Document Format |
Has later version | PDF_1_4, PDF, Version 1.4 |
LC experience or existing holdings | The Library of Congress creates PDFs as service formats for some content it creates or makes available, including for some scanned historical materials, primarily to support convenient downloading and printing. Some of this content is in version PDF 1.0. Examples (as of early 2019) include items consisting of scanned pages with no text: a rare book and a piece of sheet music. See also PDF_family. |
---|---|
LC preference | See PDF. |
Disclosure | PDF 1.0-1.3 were fully documented by Adobe Systems Incorporated. Specifications for versions 1.0 and 1.3 were published by Addison Wesley. Full references were provided in the bibliography for ISO 32000-1:2008. |
---|---|
Documentation |
The PDF Reference editions that specify PDF versions 1.0 through 1.3 are:
|
Adoption |
PDF 1.3 has been very widely used. As of early 2019, it is still the version of PDF built into macOS 10.14 (Mojave) and used by the Pages word-processing application. |
Licensing and patents | See PDF_family. |
Transparency | See PDF_family. |
Self-documentation | Metadata capabilities in versions of PDF prior to 1.4 are very limited. |
External dependencies | See PDF_family. |
Technical protection considerations | See PDF_family. |
Text | |
---|---|
Normal rendering | See PDF_family. |
Integrity of document structure | See PDF_family. |
Integrity of layout and display | See PDF_family. |
Support for mathematics, formulae, etc. | See PDF_family. |
Functionality beyond normal rendering | See PDF_family. |
Tag | Value | Note |
---|---|---|
Filename extension | pdf | See PDF_family. |
Internet Media Type | application/pdf | Media type registered with IANA. See also PDF_family. |
Magic numbers | Hex: 25 50 44 46 2D 31 2E 30 (ASCII: %PDF-1.0); Hex: 25 50 44 46 2D 31 2E 31 (ASCII: %PDF-1.1); Hex: 25 50 44 46 2D 31 2E 32 (ASCII: %PDF-1.2); Hex: 25 50 44 46 2D 31 2E 33 (ASCII: %PDF-1.3) | From PRONOM. However, these magic number values in the header (e.g., %PDF-1.3) declaring the PDF version with which the file complies can be overridden elsewhere in the file. See Note below for more detail. |
Pronom PUID | fmt/14; fmt/15; fmt/16; fmt/17 | For PDF versions 1.0, 1.1, 1.2, 1.3, respectively. |
Wikidata Title ID | Q26085339; Q26085336; Q26085333; Q26085330 | For PDF versions 1.0, 1.1, 1.2, 1.3, respectively. |
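As an illustration of how the header signatures and PUIDs listed above relate, the following minimal Python sketch maps the first eight bytes of a file to a PUID. The function name `puid_from_header` and the dictionary `PUID_BY_HEADER` are hypothetical names introduced here, and the sketch assumes the signature sits at byte offset 0. It is not how DROID or PRONOM are implemented, and it looks only at the header, so it cannot detect a version declared elsewhere in the file.

```python
from typing import Optional

# Header signatures and PUIDs copied from the table above (values from PRONOM).
PUID_BY_HEADER = {
    b"%PDF-1.0": "fmt/14",
    b"%PDF-1.1": "fmt/15",
    b"%PDF-1.2": "fmt/16",
    b"%PDF-1.3": "fmt/17",
}


def puid_from_header(data: bytes) -> Optional[str]:
    """Return the PUID whose 8-byte signature begins `data`, or None.

    Assumption: the signature starts at offset 0. Only the header is
    inspected, so any later override of the version is not seen.
    """
    return PUID_BY_HEADER.get(data[:8])
```

A caller would typically pass the first few bytes read from a file opened in binary mode, for example `puid_from_header(open(path, "rb").read(8))`, where `path` is whatever file is being examined.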
General |
The version of PDF to which a file conforms can be declared in two places in a PDF file. All PDF files should have a version identified in the header with the 5 characters %PDF- followed by a version number. For PDF files conforming to ISO 32000-1:2008 or earlier specifications (i.e., prior to ISO 32000-2:2017), the version number has the form 1.N, where N is a digit between 0 and 7. For example, PDF 1.3 is identified by %PDF-1.3. However, beginning with PDF 1.4, a conforming PDF writer may use the Version entry in the document Catalog to override the version specified in the header. The location of the Catalog within the file is indicated in the Root entry of the file trailer. This override feature was introduced to facilitate incremental updating of a PDF by simply adding to the end of the file.
In early 2019, a test was performed on macOS 10.14 (Mojave): a .pages document created in the Pages word-processing application was imported into the Preview application and exported to PDF. The resulting file had %PDF-1.3 in the header, but version 1.4 was specified in the Catalog.
As a result, it is necessary to locate the Catalog within the file to get the correct version number. Unless the PDF is "linearized," in which case the Catalog is up front, this will require reading the trailer and then using the reference there to locate the Catalog, which will typically be compressed. This has practical implications because format identification tools, including DROID, typically look for particular characters at the beginning of a file (i.e., in the header) to permit identification with minimal effort. DROID can look for characters at the end of the file, but is not able to follow an indirect reference or decompress file contents. When the version number is not the same in the header and the Catalog, there is potential for format identification errors. 
The JHOVE PDF module does take account of the situation, stating that for PDF 1.0 - 1.6, "The PDF version is determined by the data specified in the PDF header and the Version key of the document catalog dictionary. In the event that these two values do not match, the Version key is taken as the authoritative value." |
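The lookup described in this note can be sketched in Python as follows. This is a simplified illustration, not the JHOVE implementation: the function names are hypothetical, and the code assumes a classic trailer with an uncompressed Catalog object, found by pattern matching rather than by parsing the cross-reference table. It will not find a Catalog stored in a compressed object stream, which is exactly the case the note warns about. Following the JHOVE statement quoted above, the Catalog value is preferred when both are present.

```python
import re
from typing import Optional

HEADER_RE = re.compile(rb"%PDF-(\d\.\d)")
ROOT_RE = re.compile(rb"/Root\s+(\d+)\s+(\d+)\s+R")
VERSION_RE = re.compile(rb"/Version\s*/(\d\.\d)")


def header_version(data: bytes) -> Optional[str]:
    """Version declared in the header, assuming it starts at offset 0."""
    m = HEADER_RE.match(data)
    return m.group(1).decode("ascii") if m else None


def catalog_version(data: bytes) -> Optional[str]:
    """Version entry of the Catalog named by the last /Root reference.

    Assumption: the Catalog is an uncompressed "N G obj ... endobj" body.
    """
    roots = ROOT_RE.findall(data)
    if not roots:
        return None
    num, gen = roots[-1]  # the last trailer reflects incremental updates
    obj_re = re.compile(
        rb"(?<!\d)" + num + rb"\s+" + gen + rb"\s+obj(.*?)endobj", re.DOTALL
    )
    bodies = obj_re.findall(data)
    if not bodies:
        return None
    m = VERSION_RE.search(bodies[-1])  # the last definition of the object
    return m.group(1).decode("ascii") if m else None


def effective_version(data: bytes) -> Optional[str]:
    """Prefer the Catalog's Version entry; fall back to the header."""
    return catalog_version(data) or header_version(data)
```

A file like the one produced in the Mojave test, with %PDF-1.3 in the header and a Version entry of 1.4 in an uncompressed Catalog, would be reported as 1.4 by `effective_version`, whereas a header-only check would report 1.3.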
---|---|
History | Documentation for PDF version 1.0 was published in June 1993 in association with the release of Acrobat 1.0. PDF version 1.1 was published in March 1996 in association with the release of Acrobat 2.0. PDF version 1.2 was published in November 1996 in association with the release of Acrobat 3.0. PDF version 1.3 was published in July 2000 in association with the release of Acrobat 4.0. PDF 1.4 was published in November 2001, and corresponds to Acrobat 5.0. |
|