|HyperText Markup Language (HTML) 5, including all 5.x versions
HyperText Markup Language (HTML) is the standard markup language for creating web pages and web applications. This format description is for HTML 5, standardized in two coordinated efforts. One is by the Web Hypertext Application Technology Working Group (WHATWG), which maintains a specification for HTML as a modularized "living standard" at https://html.spec.whatwg.org/. The second is at the World Wide Web Consortium (W3C), where a series of snapshots of well-supported modules has been compiled and published as W3C Recommendations: 5.0 (2014); 5.1 (2016); 5.2 (2017). The latest W3C Recommendation is a result of a May 2019 agreement between W3C and the WHATWG regarding the development of a single version of the HTML specification. Under that agreement, the latest recommendations for HTML can be found at https://html.spec.whatwg.org/multipage/. See Notes below for more on the relationship between W3C and WHATWG.
A key objective for the WHATWG was backwards compatibility to ensure good rendering of existing websites. Another objective was that the specification be detailed enough that implementers such as browser developers can achieve complete interoperability without reverse-engineering. With these objectives in mind, the dependence of HTML on SGML was eliminated for HTML 5 and the specification includes details on how browsers are to render pages and how parsers are to handle shortcuts such as missing end tags. For example, the specification states that "A <tr> element’s end tag may be omitted if the <tr> element is immediately followed by another <tr> element, or if there is no more content in the parent element." HTML parsers/validators will accept these omissions. The specification also incorporates two serializations, the backwards-compatible HTML serialization (documented in 8. The HTML syntax) and a stricter XML-based serialization (documented in 9. The XML syntax), sometimes referred to as XHTML5, and compatible with the earlier W3C Recommendations for XHTML. The XML-based serialization requires the use of a different Internet media type and has different rules for declarations at the beginning of the document.
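The end-tag rule quoted above can be illustrated with a minimal sketch in Python. It uses the standard library's html.parser module, which only tokenizes markup and does not implement the HTML 5 parsing algorithm, so the inference of omitted </tr> tags is hand-written here. The function name count_rows and the sample table are hypothetical, and the sketch assumes a single, non-nested table.

```python
from html.parser import HTMLParser


class RowCounter(HTMLParser):
    """Counts <tr> elements and the </tr> end tags that had to be inferred."""

    def __init__(self):
        super().__init__()
        self.row_open = False   # True while a <tr> has not yet been closed
        self.rows = 0           # number of <tr> start tags seen
        self.implied_ends = 0   # end tags inferred rather than written

    def _close_row(self, implied):
        if self.row_open:
            self.row_open = False
            if implied:
                self.implied_ends += 1

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # A <tr> that follows a still-open <tr> implies its end tag.
            self._close_row(implied=True)
            self.row_open = True
            self.rows += 1

    def handle_endtag(self, tag):
        if tag == "tr":
            self._close_row(implied=False)
        elif tag in ("table", "tbody", "thead", "tfoot"):
            # No more content in the parent element also implies </tr>.
            self._close_row(implied=True)


def count_rows(markup):
    """Return (rows, implied_end_tags) for a simple, non-nested table."""
    parser = RowCounter()
    parser.feed(markup)
    parser.close()
    return parser.rows, parser.implied_ends


sample = "<table><tr><td>a</td><tr><td>b</td></tr><tr><td>c</td></table>"
print(count_rows(sample))  # (3, 2)
```

In the sample, the first and third rows omit their end tags and the second row writes it explicitly, so two of the three end tags are inferred. A conforming HTML 5 parser applies many more rules of this kind than the two shown.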
HTML 5 incorporated major changes and extensions over HTML 4.01. Major motivations for the changes are described in a slide presentation by Olle Olsen of W3C in 2008, when the first working draft from W3C for HTML 5 was published. Detailed changes, such as new elements and attributes, are documented in HTML5 Differences from HTML4 (2014). Note that the HTML 5 specifications use lower case for element names. Some of the most significant extensions include:
|Relationship to other formats
|HTML_family, HTML File Format Family
|Has earlier version
|HTML_4_01, HyperText Markup Language (HTML) 4.01
|Has earlier version
|XHTML_1_1, Extensible HyperText Markup Language (XHTML) 1.1, Module-based XHTML
|EPUB 3.0, EPUB, Electronic Publication, Version 3.0 (2011). ISO/IEC TS 30135:2014. EPUB 3 uses the XML syntax for HTML, i.e. the successor to XHTML.
|WebVTT, Web Video Text Tracks Format (WebVTT). WebVTT files are created for displaying timed text in connection with the HTML5 <track> element.
Timed Text Markup Language Version 1 (TTML1).
The TTML1 specification states “While TTML is not expressly designed for direct (embedded) integration into an HTML or a SMIL document instance, such integration is not precluded.”
TTML may provide a "standard content format to reference from a <track> element in an HTML5 document."
|LC experience or existing holdings
|The Library of Congress home page archived on January 11, 2011 used XHTML 1.0 Transitional. For the new design introduced on January 12, 2011, HTML 5 was used. See also HTML_family.
HTML 5, developed and published as a "living standard" under the auspices of WHATWG, is a non-proprietary format, openly developed and published, and freely implementable.
The most current specification for HTML 5 is the HTML Living Standard from WHATWG. Between October 2014 and December 2017, W3C published a sequence of Recommendations, incorporating patches and enhancements from the WHATWG specification adopted to resolve bugs registered against the previous W3C HTML 5.x specification and more accurately representing implementations in browsers or other user agents. Under a 2019 agreement between W3C and the WHATWG, W3C will no longer independently publish HTML specifications. As a result, the URL for the latest W3C Recommendation [ https://www.w3.org/TR/html/ ] now resolves to the HTML Living Standard from WHATWG.
According to W3Techs (Web Technology Surveys), in early March 2018, of websites based on HTML, 87% use HTML 5. That statistic omits the roughly 20% of all websites that use XHTML. Support in browsers for individual elements can be assessed via CanIUse or in tables at the bottom of entries for individual elements in the MDN HTML elements reference.
|Licensing and patents
No concerns. See HTML_family.
The transparency of image and video files intended for incorporating into the rendered display depends on the formats of those files. Note that such files are not stored within the HTML file, but referenced by URL. The URL may be absolute or relative to the HTML file.
See also HTML_family.
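Because image and video files are referenced by URL rather than stored in the HTML file, a tool that gathers a page's dependencies has to resolve each reference against the location of the HTML file. The following is a minimal sketch using Python's standard urllib.parse.urljoin. The example.org addresses and the function name resolve_reference are hypothetical, and the sketch assumes the document does not declare a <base> element, which would change the base URL.

```python
from urllib.parse import urljoin


def resolve_reference(document_url, reference):
    """Return the absolute URL for a reference found in an HTML file."""
    return urljoin(document_url, reference)


page = "https://example.org/collections/index.html"

# A relative reference is resolved against the location of the HTML file.
print(resolve_reference(page, "images/logo.png"))
# https://example.org/collections/images/logo.png

# An absolute reference is returned unchanged.
print(resolve_reference(page, "https://cdn.example.org/video/intro.mp4"))
# https://cdn.example.org/video/intro.mp4
```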
|Technical protection considerations
|Extensions typically employed with the usual HTML 5 serialization.
|Internet Media Type
|The media type for the usual HTML 5 serialization is text/html
|The specification for the HTML serialization of HTML 5 requires that a conforming document have a document type declaration of <!DOCTYPE html>, matched without case sensitivity.
|Wikidata Title ID
|Extensions sometimes used for documents in the XML serialization of HTML 5.
|Internet Media Type
|The recommended media type for use with the XML serialization of HTML 5 is application/xhtml+xml. See 2002 registration at IETF RFC 3236 and its 2014 update at https://www.iana.org/assignments/media-types/application/xhtml+xml.
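The signifiers in the table above can be combined into a rough identification check. The sketch below, in Python, maps a Content-Type value to the serialization it signals and tests for the <!DOCTYPE html> declaration without case sensitivity. The function names are hypothetical, and the pattern is a simplification: it ignores comments or a byte order mark before the declaration and does not accept the legacy variants that the specification also permits.

```python
import re

# Media types named in the table above, mapped to the serialization they signal.
SERIALIZATION_BY_MEDIA_TYPE = {
    "text/html": "HTML syntax",
    "application/xhtml+xml": "XML syntax",
}

# Simplified pattern for <!DOCTYPE html>, matched without case sensitivity.
DOCTYPE_PATTERN = re.compile(r"\s*<!doctype\s+html\s*>", re.IGNORECASE)


def serialization_for(content_type):
    """Return the serialization signalled by a Content-Type value, or None."""
    # Drop parameters such as "; charset=utf-8" before the lookup.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return SERIALIZATION_BY_MEDIA_TYPE.get(media_type)


def has_html5_doctype(text):
    """Return True if the text begins with the <!DOCTYPE html> declaration."""
    return DOCTYPE_PATTERN.match(text) is not None


print(serialization_for("text/html; charset=utf-8"))  # HTML syntax
print(serialization_for("application/xhtml+xml"))     # XML syntax
print(has_html5_doctype("<!doctype HTML>\n<html>"))   # True
```

A document type declaration from an earlier version, such as one carrying a PUBLIC identifier for XHTML 1.0 Transitional, does not match the pattern and is reported as False.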
Relationship between W3C and WHATWG: The relationship between W3C and the WHATWG is described in the History section from the HTML 5 specification. This note is based largely on that section. W3C had stopped development of HTML in 1998, with the redirection of focus onto the XML-based XHTML. XHTML 1.0, as essentially equivalent to HTML 4.01, did not raise compatibility issues for browser vendors and was well adopted. However, the 2003 W3C Recommendation for XForms, positioned as the next generation of Web forms, would have required browsers to implement rendering engines that were incompatible with many existing HTML Web pages. In 2004, Apple, Mozilla, and Opera jointly announced their intent to continue working on extensions to HTML and formed the WHATWG. To quote from the official history, "The WHATWG was based on several core principles, in particular that technologies need to be backwards compatible, that specifications and implementations need to match even if this means changing the specification rather than the implementations, and that specifications need to be detailed enough that implementations can achieve complete interoperability without reverse-engineering each other."
In 2007, after a change of mind, W3C formed a working group to work with WHATWG. Until 2011, W3C's working group and WHATWG worked together under the same editor, Ian Hickson. In 2011, the groups concluded that they had different goals. The W3C wanted to reach closure on an HTML 5.0 Recommendation, while the WHATWG wanted to keep maintaining the HTML specification continuously and adding new features. Since then, the W3C WG responsible for HTML has been adopting patches from the WHATWG that address bugs and add enhancements already widely supported in browsers, and has published a sequence of W3C Recommendations.
Many individuals and small businesses use another approach to build their websites. Businesses such as WordPress.com and Squarespace offer hosting, templates, and design tools on a single platform. They provide one-stop shopping allowing the non-technical to build and maintain websites. WordPress.com is a hosting platform based on the open-source WordPress.org project. Squarespace templates use HTML 5. WordPress templates are known as "themes"; the version of HTML declared is controlled by the theme. The compilers of this resource have not investigated differences among the features used in HTML files generated using different web platforms and frameworks. Comments welcome.
HTML 5, also called HTML5, was the first update since HTML 4.01, published as a W3C Recommendation in December 1999. HTML 5 builds both on experience with XHTML and on extensions built by browser vendors. Requirements included support for varying screen sizes; more reliable and interoperable support for audio and video; and features to support interactive applications.
The first working draft of HTML 5 was published in January 2008. Adoption of HTML 5 was gradual between 2008 and 2012, and then steady. HTML 5 was finally issued as a W3C Recommendation in 2014.
For a more complete discussion and chronology of versions for the HTML format, see HTML_family.