Sustainability of Digital Formats: Planning for Library of Congress Collections


DOCX Strict (Office Open XML), ISO 29500-1: 2008-2016


Identification and description

Full name DOCX Strict (Office Open XML, WordprocessingML), ISO 29500-1:2008-2016; also ECMA-376, Editions 2-5.
Description

The Strict variant of DOCX disallows a variety of elements and attributes that are permitted in the more common Transitional variant (DOCX/OOXML_2012). The markup for the Strict variant is essentially a subset of markup for the Transitional variant, but the schemas use different namespaces and are distributed separately in complete form.
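Because the two variants use different namespaces, the namespace of the root element in the Main Document part is usually enough to tell them apart. The Python sketch below is illustrative only. The Strict namespace is the one listed under File type signifiers in this description; the Transitional namespace URI, the default part name, and the helper names classify_main_part and classify_docx are assumptions introduced for the example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace listed in this description for the Strict variant.
STRICT_MAIN_NS = "http://purl.oclc.org/ooxml/wordprocessingml/main"
# Assumption: namespace used by Transitional files (not quoted in this description).
TRANSITIONAL_MAIN_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def classify_main_part(xml_bytes):
    """Return 'strict', 'transitional', or 'unknown' from the root element's namespace."""
    root = ET.fromstring(xml_bytes)
    # ElementTree renders qualified names as '{namespace}local-name'.
    if root.tag.startswith("{"):
        namespace = root.tag[1:].split("}", 1)[0]
    else:
        namespace = ""
    if namespace == STRICT_MAIN_NS:
        return "strict"
    if namespace == TRANSITIONAL_MAIN_NS:
        return "transitional"
    return "unknown"


def classify_docx(source, part_name="word/document.xml"):
    """Classify a .docx package (path or file object).

    Assumes the Main Document part has the conventional name; a more careful
    reader would locate it through the package relationships instead.
    """
    with zipfile.ZipFile(source) as package:
        return classify_main_part(package.read(part_name))
```

A call such as classify_docx("report.docx") (a hypothetical file name) would return one of the three labels. A result of "strict" only reports which namespace the file declares; it does not validate the content against the Strict schemas.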

Among the disallowed elements and attributes are:

  • Deprecated element names related to text layout incorporating left and right that had been replaced by more correctly named and functionally equivalent names incorporating start and end.
  • Attribute values for non-Unicode character sets.
  • Legacy numbering level properties and other elements related to a legacy numbering framework.
  • All elements and attributes related to VML, a deprecated markup language for drawings, replaced by DrawingML.
  • Attributes specifying deprecated and redundant mechanisms for generating hash values to support checks against content corruption.
  • Compatibility settings intended to preserve visual fidelity of documents produced in earlier word-processing applications, particularly in relation to spacing, margins, pagination, etc.
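One item in the list above, the exclusion of VML, lends itself to a simple mechanical check. The following Python sketch reports any elements in the VML namespace found in a single XML part. It is a rough screening aid under stated assumptions, not a conformance test: the VML namespace URI is not quoted in this description, and the function name find_vml_elements is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI conventionally used for VML markup.
VML_NS = "urn:schemas-microsoft-com:vml"


def find_vml_elements(xml_bytes):
    """Return the local names of all elements in the VML namespace, in document order."""
    root = ET.fromstring(xml_bytes)
    marker = "{" + VML_NS + "}"
    return [
        element.tag[len(marker):]
        for element in root.iter()
        if isinstance(element.tag, str) and element.tag.startswith(marker)
    ]
```

An empty result for every XML part in a package suggests that the file avoids VML, but says nothing about the other disallowed features listed above.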

The Strict variant of DOCX described here was introduced in 2008 during standardization as ISO/IEC 29500. The intention was to exclude features of ECMA-376, Edition 1 that were present simply to handle bugs and features of earlier word-processors or to permit continued use of deprecated markup (e.g., VML markup for drawings). The markup specification was split into Strict (Part 1) and Transitional (Part 4) with the intent that applications would create new documents in the Strict variant. In practice, however, pressure for backwards compatibility has meant that most applications mark up new files using the Transitional namespace, even when the files use no features that are incompatible with the Strict specification.

Note that the Strict variant of DOCX does allow extensions conforming to MCE/OOXML_2012. Microsoft has used MCE to add functionality to Microsoft Word. See [MS-DOCX], Word Extensions to the Office Open XML (.docx) File Format. A summary of the extensions through Word 2016 is given in [MS-DOCX]: 1.3 Structure Overview (Synopsis).
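As a rough illustration of how MCE extensions show up in a file, the Python sketch below reads the Ignorable attribute from the root element of an XML part and maps each listed prefix to its declared namespace. Several points are assumptions not stated in this description: the MCE namespace URI, the restriction to the root element (the attribute may also appear on other elements), and the function name ignorable_namespaces.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: namespace URI for Markup Compatibility and Extensibility attributes.
MCE_NS = "http://schemas.openxmlformats.org/markup-compatibility/2006"


def ignorable_namespaces(xml_bytes):
    """Map each prefix in the root element's mc:Ignorable list to its namespace URI.

    Only the root element is inspected; prefixes with no visible declaration
    map to None.
    """
    declared = {}
    root = None
    # 'start-ns' events arrive before the 'start' event of the declaring element.
    for event, payload in ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            declared[prefix] = uri
        else:
            root = payload
            break  # stop at the root element; attributes are already available
    if root is None:
        return {}
    listed = root.get("{%s}Ignorable" % MCE_NS, "").split()
    return {prefix: declared.get(prefix) for prefix in listed}
```

A non-empty result indicates that the part declares extension namespaces that a consumer may ignore. Identifying which application feature each namespace represents requires the vendor documentation, such as [MS-DOCX].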

For discussion of other aspects of the Strict DOCX format, see the description of the more common Transitional variant of DOCX, DOCX/OOXML_2012.

Production phase Can be used in any production phase. Likely used primarily for creating documents (initial state) and for editing and review (middle-state). Documents that are formally published are often converted to a format that is designed for final publication and not for convenient editing.
Relationship to other formats
    Subtype of OOXML_Family, OOXML Format Family -- ISO/IEC 29500 and ECMA 376
    Subtype of OPC/OOXML_2012, Open Packaging Conventions (Office Open XML), ISO 29500-2:2008-2012
    Modification of DOCX/OOXML_2012, DOCX Transitional (Office Open XML), ISO 29500:2008-2016, ECMA-376, Editions 1-5. The Transitional form of DOCX allows additional legacy markup to address backward compatibility with bugs and features of older word-processors. The legacy markup is specified in Part 4 of ISO/IEC 29500.
    May contain MCE/OOXML_2012, Markup Compatibility and Extensibility (Office Open XML), ISO 29500-3:2008-2015, ECMA-376, Editions 1-5
    Defined via XML, Extensible Markup Language (XML)

Local use

LC experience or existing holdings As of 2017, the Library of Congress is not aware of any documents in the Strict form of DOCX in its collections.
LC preference The Library of Congress Recommended Formats Statement for Textual and Musical works, as of 2016, lists the OOXML family of formats, which includes the DOCX format, as acceptable for textual works and electronic serials. It does not distinguish between the Strict form and the more common Transitional form (DOCX/OOXML_2012) in its preferences.

Sustainability factors

Disclosure International open standard. Maintained by ISO/IEC JTC1 SC34/WG4 as Part 1 of ISO/IEC 29500, first published in 2008.
Documentation

ISO/IEC 29500-1, Information technology -- Document description and processing languages -- Office Open XML File Formats -- Part 1: Fundamentals and Markup Language Reference. Latest version (dated 2016 as of February 2017) is available from ISO/IEC Publicly Available Standards.

All editions of the OOXML standards as published by ECMA are available from ECMA-376: Office Open XML File Formats. The split between Strict and Transitional variants of DOCX was introduced in Edition 2 of ECMA-376 which is identical to ISO/IEC 29500:2008.

Adoption

The Strict variant of DOCX does not appear to be widely used as of February 2017, although support has been added to several applications in recent years. The ability to read Strict DOCX files was first implemented by Microsoft in Word 2010; in Office for Windows, the ability to write Strict files as an option was added in Word 2013 and is available in Word 2016 and Office 365. Office for Mac 2011 could neither read nor write Strict files. The latest version of Word for a desktop Mac (in Office for Mac 2016) can read but not write Strict files.

Versions of LibreOffice since 4.2.3 can read Strict DOCX files. The Feature Comparison provided by LibreOffice for version 5.3 (released in early 2017) indicates that Strict DOCX files can be read but not written. However, the presence of a project titled "Support OOXML strict export" on a LibreOffice to-do list suggests that this capability may be introduced before long. A test using LibreOffice 5.2 confirmed that DOCX files written by that application are always in the more common Transitional form, regardless of which of the two .docx options is chosen from the dropdown menu in the Save As feature. LibreOffice presents two options because of a few small differences found in some files produced by Microsoft Office, particularly by Office 2007. See Useful References below.

Whether the Strict variant of DOCX becomes more widely used in the future will likely depend on whether pressure on software vendors from governments for its adoption outweighs market pressure, which currently seems to favor backwards compatibility.

Licensing and patents See the more common Transitional form of DOCX, DOCX/OOXML_2012 and OOXML Format Family.
Transparency See the more common Transitional form of DOCX, DOCX/OOXML_2012.
Self-documentation See the more common Transitional form of DOCX, DOCX/OOXML_2012.
External dependencies See the more common Transitional form of DOCX, DOCX/OOXML_2012.
Technical protection considerations See the more common Transitional form of DOCX, DOCX/OOXML_2012.

Quality and functionality factors

Text
Normal rendering See the more common Transitional form of DOCX, DOCX/OOXML_2012 for functionality supported.

File type signifiers and format identifiers

Tag Value Note
Filename extension docx
Used for Strict and the more common Transitional form of DOCX.
Internet Media Type application/vnd.openxmlformats-officedocument.wordprocessingml.document
From IANA registration.
XML namespace declaration http://purl.oclc.org/ooxml/wordprocessingml/main
This namespace declaration is for the Strict variant of DOCX. It occurs in the mandatory Main Document part of a DOCX file (package) with the name /word/document.xml and is mapped to the prefix w.
Other Target="word/document.xml"
Will occur in the top-level Relationships part (the /_rels/.rels part of an OPC package), in the <Relationships> element of a DOCX file. In the Strict variant, it will be the target of a relationship of type http://purl.oclc.org/ooxml/relationships/officeDocument. See root namespace and source relationship for Main Document Part in ISO/IEC 29500-1:2012, §11.3.10.
Pronom PUID fmt/412
See http://www.nationalarchives.gov.uk/PRONOM/fmt/412. As of February 2017, PRONOM does not distinguish between Strict and Transitional versions of DOCX.
Wikidata Title ID Q26207818
Office Open XML Wordprocessing Document, Strict, ISO/IEC 29500:2012. See https://www.wikidata.org/wiki/Q26207818
Wikidata Title ID Q26207786
Office Open XML Wordprocessing Document, Strict, ISO/IEC 29500:2011. See https://www.wikidata.org/wiki/Q26207786
Wikidata Title ID Q26207675
Office Open XML Wordprocessing Document, Strict, ISO/IEC 29500:2008. See https://www.wikidata.org/wiki/Q26207675
Wikidata Title ID Q26211533
Office Open XML Wordprocessing Document, Strict, ISO/IEC 29500:2012, with Microsoft extensions. See https://www.wikidata.org/wiki/Q26211533
Wikidata Title ID Q26211506
Office Open XML Wordprocessing Document, Strict, ISO/IEC 29500:2011, with Microsoft extensions. See https://www.wikidata.org/wiki/Q26211506
Wikidata Title ID Q26208225
Office Open XML Wordprocessing Document, Strict, ISO/IEC 29500:2008, with Microsoft extensions. See https://www.wikidata.org/wiki/Q26208225
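The relationship-based signifier in the table above can be checked without assuming the name of the Main Document part. The Python sketch below reads the top-level Relationships part and reports which officeDocument relationship type it finds, together with the target. The Strict relationship type is taken from the table; the Transitional relationship type, the Relationships namespace URI, and the function names are assumptions introduced for the example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumption: namespace of the OPC Relationships part.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Relationship type given in this description for the Strict variant.
STRICT_REL = "http://purl.oclc.org/ooxml/relationships/officeDocument"
# Assumption: relationship type used by Transitional files.
TRANSITIONAL_REL = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
)


def main_document_relationship(rels_xml):
    """Return (variant, target) for the first officeDocument relationship, or None."""
    root = ET.fromstring(rels_xml)
    for relationship in root.iter("{%s}Relationship" % RELS_NS):
        rel_type = relationship.get("Type")
        if rel_type == STRICT_REL:
            return ("strict", relationship.get("Target"))
        if rel_type == TRANSITIONAL_REL:
            return ("transitional", relationship.get("Target"))
    return None


def inspect_package(source):
    """Read /_rels/.rels from a .docx package (path or file object) and classify it."""
    with zipfile.ZipFile(source) as package:
        return main_document_relationship(package.read("_rels/.rels"))
```

For a Strict file with the conventional layout, inspect_package would be expected to return ("strict", "word/document.xml"). Since PRONOM does not distinguish the two variants as of this writing, a check of this kind is one way a repository could record the distinction itself.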

Notes

General See the more common Transitional form of DOCX, DOCX/OOXML_2012.
History For chronologies of the OOXML standard and for versions of Microsoft Office, see OOXML_Family. See also the more common Transitional form of DOCX, DOCX/OOXML_2012.

Format specifications


Useful references

URLs


Last Updated: 02/21/2017