Sustainability of Digital Formats: Planning for Library of Congress Collections

Document Container File: Core (based on ZIP 6.3.3)

Identification and description

Full name ISO/IEC 21320-1 [as of January 2014, in committee draft stage] Information technology -- Document Container File -- Part 1: Core (formal name). Profile of ZIP File Format, Version 6.3.3 from PKWARE.
Description

ISO/IEC CD 21320-1 - Information technology -- Document Container File -- Part 1: Core is an activity under ISO/IEC JTC 1/SC 34/WG1 to develop a refinement of the widely used ZIP format from PKWARE (termed ZIP_PK here). The format described here and termed "ZIP_21320_1" will be a profile of ZIP_6_3_3, as specified in PKWARE's APPNOTE.TXT, Version 6.3.3. See ZIP_PK for more on the ZIP File Format in general.

The new ISO "work item" was approved in August 2011 and is under way, as ISO/IEC NP 21320-1 - Information technology -- Document Container File -- Part 1: Core. In November 2012, a working draft of the proposed standard, ISO/IEC 21320-1, was made available for discussion.

ZIP_21320_1 describes itself as a compatible profile of ZIP_6_3_3. The specification consists of restrictions on the full ZIP specification, each keyed to specific paragraph numbers in PKWARE's APPNOTE.TXT, Version 6.3.3. The restrictions in the current draft are expected to be maintained, and they include:

  • Files stored in document container files may only be stored uncompressed or using the "deflate" mechanism as defined in RFC 1951
  • The encryption features defined in APPNOTE.TXT are prohibited
  • The digital signature features defined in APPNOTE.TXT are prohibited
  • The "patch data" features defined in APPNOTE.TXT are prohibited
  • Document container files should not be segmented or span multiple volumes
  • Filenames may/should be encoded in UTF-8 (which allows for ASCII filenames). Whether to mandate use of the UTF-8 indicator is the main technical issue awaiting resolution
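
The restrictions listed above lend themselves to a mechanical check. The following is a minimal sketch in Python, using only the standard-library zipfile module, of how an archive might be screened against some of them. The function name check_container and the FLAG_* constant names are illustrative and do not come from the draft; the bit positions follow the general purpose bit flag definitions in APPNOTE.TXT. The sketch does not detect central-directory digital signatures or spanned archives, which zipfile does not expose, and it assumes the UTF-8 indicator is required for non-ASCII filenames, a point the draft leaves open.

```python
import io
import zipfile

# General purpose bit flags, numbered as in APPNOTE.TXT.
FLAG_ENCRYPTED = 0x0001          # bit 0: entry is encrypted
FLAG_PATCH_DATA = 0x0020         # bit 5: compressed patched data
FLAG_STRONG_ENCRYPTION = 0x0040  # bit 6: strong encryption
FLAG_UTF8 = 0x0800               # bit 11: filename and comment are UTF-8

# Compression methods 0 (stored) and 8 (deflate).
ALLOWED_METHODS = {zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED}


def check_container(source):
    """Return a list of problems found in a ZIP archive (path or file object)."""
    problems = []
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            name = info.filename
            if info.compress_type not in ALLOWED_METHODS:
                problems.append(
                    f"{name}: compression method {info.compress_type} is not allowed"
                )
            if info.flag_bits & (FLAG_ENCRYPTED | FLAG_STRONG_ENCRYPTION):
                problems.append(f"{name}: entry is encrypted")
            if info.flag_bits & FLAG_PATCH_DATA:
                problems.append(f"{name}: entry uses the patch data feature")
            # Assumption: non-ASCII names must carry the UTF-8 indicator.
            if not info.flag_bits & FLAG_UTF8 and not name.isascii():
                problems.append(f"{name}: non-ASCII filename without UTF-8 flag")
    return problems


if __name__ == "__main__":
    # Build a small archive in memory and screen it.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("hello.txt", "hello")
    # An empty list means none of the checked restrictions were violated.
    print(check_container(buffer))
```

An empty result only means that none of the checked restrictions were violated; it is not evidence of conformance to the eventual standard.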
Production phase May be used at any lifecycle phase for bundling/packaging files together for exchange, storage, or distribution.
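
As an illustration of such bundling, the sketch below writes a set of in-memory files to a container using only the stored and deflate methods, with no encryption, signing, or splitting requested. It is an example under stated assumptions, not a reference implementation: write_container and the 64-byte threshold for storing small files uncompressed are hypothetical choices, and the output is not guaranteed to conform to the eventual standard. Python's zipfile module sets the UTF-8 filename indicator automatically for names that are not ASCII.

```python
import io
import zipfile

# Hypothetical threshold: payloads smaller than this are stored uncompressed.
STORE_BELOW_BYTES = 64


def write_container(target, members):
    """Write members (a mapping of archive name to bytes) to a ZIP archive.

    Only the stored (0) and deflate (8) compression methods are used.
    """
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in members.items():
            if len(data) < STORE_BELOW_BYTES:
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            archive.writestr(name, data, compress_type=method)


if __name__ == "__main__":
    buffer = io.BytesIO()
    write_container(
        buffer,
        {"mimetype": b"text/plain", "content/page.txt": b"example text " * 50},
    )
    # List each member with its compression method (0 = stored, 8 = deflate).
    with zipfile.ZipFile(buffer) as archive:
        for info in archive.infolist():
            print(info.filename, info.compress_type)
```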
Relationship to other formats
    Subtype of ZIP_6_3_3, ZIP File Format, Version 6.3.3 (PKWARE)
    Subtype of ZIP_PK, ZIP File Format (PKWARE)

Local use

LC experience or existing holdings See ZIP_PK.
LC preference See ZIP_PK.

Sustainability factors

Disclosure ISO/IEC NP 21320-1 is a standards effort, approved in August 2011 under ISO/IEC JTC 1/SC 34/WG1 [Markup Languages], to specify the "core" requirements for a document container file that is a profile of ZIP_PK.
    Documentation

When approved and published, the standard will be available from ISO as ISO/IEC 21320-1 Information technology -- Document Container File -- Part 1: Core at http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=60101.

A committee draft of ISO/IEC 21320-1 dated 2013-02-19 is available, perhaps temporarily. Version 6.3.3 of ZIP, on which draft ISO/IEC 21320-1 is based, is documented in APPNOTE.TXT, Version 6.3.3 (September 2012).

Adoption It is too early (as of January 2014) to discuss adoption of this version of ZIP. However, the objective of the effort is to define a profile of ZIP that is compatible with the largest number of existing applications and hence provide the greatest level of interoperability. See ZIP_PK for discussion of adoption of ZIP in general.
Licensing and patents The features in this profile of ZIP are chosen to avoid patent and licensing implications. See ZIP_PK for discussion of patent issues for the parent ZIP format.
Transparency Encryption of individual files and of the central directory is prohibited. Hence this profile of ZIP_PK is more transparent than its parent format.
Self-documentation The ZIP format per se and this profile in particular provide no metadata support beyond what is needed to support unpacking the ZIP archive and extracting the component files. The document format specifications that build on restricted subsets of the ZIP format and might be expected to incorporate this profile in future versions are likely to mandate or facilitate some level of descriptive and structural metadata. For example, OOXML's OCF and EPUB both incorporate files that provide metadata for the document as a whole. Relationships between component files are also likely to be explicit in such document formats, either through a generic relationship representation or use of prescribed application-specific naming conventions.
External dependencies See ZIP_PK.
Technical protection considerations Encryption as supported within the ZIP specification is prohibited in this profile of the ZIP file format. However, it is possible for applications to apply encryption or DRM to the file as a whole or implement application-specific technical protection. Examples of the latter include SCORM and EPUB. See ZIP_PK.

Quality and functionality factors

Other
Bundling/compression Separate functionality factors for comparing formats that are used to bundle and/or compress files have not been developed. From the perspective of digital preservation, consideration of the sustainability factors above is more important than the degree of compression.

File type signifiers and format identifiers

Tag | Value | Note
Filename extension | zip, ZIP | Other extensions are used for particular applications that use the ZIP format as a container.
Internet Media Type | application/zip | Other Internet Media Types are used for particular applications that use the ZIP format as a container.
File signature | See related format. | See ZIP_PK.

Notes

General

The ZIP format is designed for cross-platform data exchange and efficient data storage for a set of related files. ZIP_PK is a de facto industry standard, developed, maintained, and openly documented by PKWARE.

See also ZIP_PK.

History

The original version of the format was developed by Phil Katz (hence the "PK" in PKWARE). Since the first specification was published in 1990, PKWARE has updated the format as supported in its products and issued new versions of the specification in a document called APPNOTE.TXT. See ZIP_PK for a more detailed history. The formats defined by versions 6.3.2 (September 2007) and 6.3.3 (September 2012) of APPNOTE.TXT are technically identical. Version 6.3.3 of APPNOTE.TXT states that the changes from version 6.3.2 are "formatting changes to support easier referencing of this APPNOTE from other documents and standards."

As described in http://en.wikipedia.org/wiki/ZIP_(file_format), a proposed project to create an ISO/IEC international standard for a format compatible with ZIP failed to pass a 2010 ballot of national standards bodies. Instead, a study period was initiated, resulting in recommendations documented in ISO/IEC JTC 1/SC 34 N 1621. The recommendations were (a) to have PKWARE continue its maintenance of the ZIP Application Note, (b) to plan for a new multi-part ISO standard to build on top of the ZIP Application Note, and (c) to propose a new work item for Part 1 of the new standard for a Document Container File. The new work item was approved in August 2011 and is under way, as ISO/IEC CD 21320-1 - Information technology -- Document Container File -- Part 1: Core.


Last Updated: 02/27/2017