|Web Video Text Tracks Format (WebVTT)
Web Video Text Tracks Format (WebVTT) is defined by the World Wide Web Consortium (W3C), an international community that develops open standards for the long-term growth of the web. It is specified in WebVTT: The Web Video Text Tracks Format, a W3C Candidate Recommendation (referenced throughout this document).
The WebVTT time-indexed file format is intended for marking up external text track resources in connection with the HTML <track> element, specifically HTML 5. The <track> tag specifies text tracks for <audio> and <video> elements. WebVTT files provide captions or subtitles for video content.
Structure of WebVTT:
According to the specification, WebVTT box model consists of three elements:
WebVTT files are containers for chunks of data that are time-aligned with a video or audio resource. The file starts with a header, followed by a series of data blocks. Data blocks with a start and end time are WebVTT cues. Per the HTML specification, the kinds of data a text track can carry are subtitles, captions, descriptions, chapters, and metadata. A single WebVTT file can contain data of only one kind, e.g. it is either a chapter file or a metadata file, not both. WebVTT caption/subtitle cues are rendered as overlays on top of a video viewport.
A WebVTT file must consist of a WebVTT file body: the string “WEBVTT”, followed by data blocks, line terminators, and other optional characters.
Comment data blocks can be included. Each is preceded by a blank line, starts with the word “NOTE”, and ends with a blank line.
Example of WebVTT:
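As a minimal sketch of the layout described above, the following Python embeds a small WebVTT file and splits it into cues. The cue text, the identifier intro-2, and the helper names to_seconds and parse_vtt are illustrative assumptions, not taken from the specification. The sketch deliberately ignores cue settings, STYLE and REGION blocks, and the error-recovery rules that a conforming parser must follow.

```python
import re

# Illustrative file: a header, one NOTE comment, and two cues.
SAMPLE = """WEBVTT

NOTE This comment block is not displayed.

00:00:01.000 --> 00:00:04.000
Hello, and welcome.

intro-2
00:00:04.500 --> 00:00:07.250
This cue has an identifier line above its timings.
"""

# Timestamps are [hh:]mm:ss.ttt; the hours part is optional.
STAMP = r"(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
TIMING = re.compile(rf"^({STAMP})[ \t]+-->[ \t]+({STAMP})")


def to_seconds(stamp):
    """Convert a WebVTT timestamp string to seconds."""
    parts = stamp.split(":")
    hours = int(parts[-3]) if len(parts) == 3 else 0
    return hours * 3600 + int(parts[-2]) * 60 + float(parts[-1])


def parse_vtt(text):
    """Return a list of cue dicts; simplified, not a conforming parser."""
    # Drop an optional BOM and normalize line terminators.
    text = text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    blocks = [b for b in re.split(r"\n{2,}", text.strip("\n")) if b]
    if not blocks or not re.match(r"^WEBVTT(?:[ \t]|$)", blocks[0].split("\n")[0]):
        raise ValueError("missing WEBVTT header")
    cues = []
    for block in blocks[1:]:
        lines = block.split("\n")
        if lines[0] == "NOTE" or lines[0].startswith(("NOTE ", "NOTE\t")):
            continue  # comment block
        ident = None
        if "-->" not in lines[0]:
            ident, lines = lines[0], lines[1:]  # optional cue identifier
        match = TIMING.match(lines[0]) if lines else None
        if not match:
            continue  # STYLE, REGION, or malformed block: skipped here
        cues.append({
            "id": ident,
            "start": to_seconds(match.group(1)),
            "end": to_seconds(match.group(2)),
            "text": "\n".join(lines[1:]),
        })
    return cues


cues = parse_vtt(SAMPLE)
```

Each entry in cues carries the optional identifier, the start and end times in seconds, and the cue payload text.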
Uses of WebVTT:
The main use for WebVTT files, according to the specification, is captioning or subtitling video content. WebVTT files can also be used for time-aligned metadata for delivering paired cues, chapters for navigating within a file, and text video descriptions that convey the visual context.
Phil Cluff in Subtitles, Captions, WebVTT, HLS, and Those Magic Flags from January 2020, states “WebVTT isn’t just used for subtitles and captions (though those are the primary use cases), it can also be used for other forms of structured metadata that you might want to deliver alongside your content...WebVTT strikes an elegant balance between functionality, readability, and extensibility, being the only specification flexible enough to have a place to carry structured metadata. WebVTT is supported seamlessly on a comprehensive set of web players and OTT devices, which makes it great for streaming delivery.”
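To illustrate the metadata use case, the sketch below reads a WebVTT file whose cue payloads are JSON objects. The field names chapter and thumbnail, the file content, and the helper read_json_cues are hypothetical; WebVTT itself does not define any metadata schema, and the payload can be any string that contains neither a blank line nor the sequence "-->".

```python
import json

# Hypothetical metadata track: every cue payload is one JSON object.
METADATA_VTT = """WEBVTT

00:00:00.000 --> 00:00:10.000
{"chapter": "Opening", "thumbnail": "thumbs/0001.jpg"}

00:00:10.000 --> 00:00:25.000
{"chapter": "Interview", "thumbnail": "thumbs/0002.jpg"}
"""


def read_json_cues(text):
    """Return (start, end, payload) tuples for cues with JSON payloads.

    Simplified: assumes blocks are separated by one blank line and that
    cues have no identifier line.
    """
    records = []
    for block in text.strip().split("\n\n")[1:]:  # skip the header block
        timing, _, payload = block.partition("\n")
        if "-->" not in timing:
            continue  # not a cue (e.g. a NOTE block)
        start, _, end = timing.partition("-->")
        # Anything after the end time would be cue settings; drop it.
        records.append((start.strip(), end.split()[0], json.loads(payload)))
    return records


records = read_json_cues(METADATA_VTT)
```

A player script could then look up the record whose time range contains the current playback position and act on its fields.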
|WebVTT files can be used across any production phase. A variety of software programs are available to aid users in creating, editing, converting, validating, and publishing WebVTT files.
|Relationship to other formats
|HTML_5, HyperText Markup Language 5. WebVTT files are created to display timed text in connection with the HTML5 <track> element.
|CSS, Cascading Style Sheet. Style sheets applied to an HTML page containing a <video> element can target WebVTT cues/regions. Style sheets can also be embedded in WebVTT files.
|JSON, JSON Data Interchange Format. WebVTT files can consist of time-aligned metadata that can be any string and is often provided as a JSON construct.
|SRT, SubRip Format. WebVTT was broadly based on SRT and was initially called WebSRT, sharing the same .srt extension. Later it was renamed to WebVTT and introduced with the <track> tag for HTML5.
|LC experience or existing holdings
|The Packard Campus uses both sidecar and embedded captions in preservation and access files via WebVTT, SRT, and SCC. See FADGI's 2022 Survey Results: The Current State of Accessibility Features for Audiovisual Collections Content in Five FADGI Institutions for more details.
|The Library of Congress has not defined format preferences for caption or subtitle files.
WebVTT is an open specification published by the World Wide Web Consortium. The WebVTT specification is based on the Draft Community Group Report of the Web Media Text Tracks Community Group and is produced by the W3C Timed Text Working Group as a Candidate Recommendation, with the intention to become a W3C Recommendation.
WebVTT: The Web Video Text Tracks Format – W3C Candidate Recommendation (April 2019) https://www.w3.org/TR/webvtt1/
WebVTT: The Web Video Text Tracks Format – Draft Community Group Report (February 2023) https://w3c.github.io/webvtt/
WebVTT was designed as an extension of SRT (fdd000569) that adds useful optional features not available in SRT. Because of those added features, WebVTT may not be supported on as many players as SRT.
Speechpad.com states in the article WebVTT (Web Video Text Tracks) of May 2021 (Wayback Machine link), “The WebVTT file format is supported by most video players, streaming platforms, authoring tools, editing software, including: YouTube, Microsoft Player Framework, Vimeo, Adobe Premiere Pro, DVD Studio Pro,” to name a few. See link for full list.
|Licensing and patents
The W3C Patent Policy has the goal of assuring that all W3C Recommendations can be implemented on a royalty-free basis.
WebVTT files are text files saved in the Video Text Track (VTT) format, so they can be opened and edited in a plain text editor.
None beyond availability of supporting software.
|Technical protection considerations
WebVTT IANA Security Considerations: “Text track files themselves pose no immediate risk unless sensitive information is included within the data. Implementations, however, are required to follow specific rules when processing text tracks, to ensure that certain origin-based restrictions are honored. Failure to correctly implement these rules can result in information leakage, cross-site scripting attacks, and the like.”
Good support. WebVTT files are simple text files encoded as UTF-8.
|Integrity of document structure
Good support. WebVTT files must follow a specified format described in the W3C specification that includes the WebVTT file body encoded as UTF-8.
According to Andreas Tai in Balisage Paper: WebVTT versus TTML: XML considered harmful for Web Captions? August 2013, “WebVTT does not use a formal grammar to describe the syntax but a sequence of rules written in normative prose.”
The WebVTT specification states, “As with any text-based format, it is possible to construct malicious content that might cause buffer over-runs, value overflows, and the like.” And “Implementers should take care in implementing a parser that over-long lines, field values, or encoded values do not cause security problems.”
|Integrity of layout and display
While it is not essential to the function of a WebVTT file, the text can be styled and positioned to display as the creator pleases. Style can be defined directly in the text file by using the string “STYLE” after any headers but before the first cue. Style customizations include size, positioning, and fonts.
Style sheets applied to an HTML page containing a <video> element can target WebVTT cues/regions. Style sheets can also be embedded in WebVTT files.
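The placement rule for embedded styles (after the header, before the first cue) can be sketched in Python as follows. The function name build_vtt and the sample CSS are assumptions made for illustration. The check reflects the constraint that a STYLE block cannot contain a blank line or the sequence "-->", since either would end or confuse the block.

```python
def build_vtt(cues, css=None):
    """Assemble WebVTT text from (start, end, text) tuples.

    If css is given, it is written as a STYLE block directly after the
    header so that it precedes every cue.
    """
    parts = ["WEBVTT"]
    if css:
        css = css.strip()
        if "-->" in css or "\n\n" in css:
            raise ValueError("STYLE block may not contain '-->' or blank lines")
        parts.append("STYLE\n" + css)
    for start, end, text in cues:
        parts.append(f"{start} --> {end}\n{text}")
    # Blocks are separated by a blank line.
    return "\n\n".join(parts) + "\n"


document = build_vtt(
    [("00:00:00.000", "00:00:02.000", "Hello")],
    css="::cue { color: yellow; }",
)
```

Whether the embedded style is honored depends on the user agent; players without CSS support are expected to ignore it.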
The WebVTT specification states “WebVTT can embed CSS style sheets, which will be applied in user agents that support CSS. Under these circumstances, the privacy and security considerations of CSS apply, with the following caveats.”
|Support for mathematics, formulae, etc.
Little to no information on WebVTT’s support of mathematics, chemical formulae, diagrams, etc.
|Functionality beyond normal rendering
|Internet Media Type
|EF BB BF 57 45 42 56 54 54 0A
EF BB BF 57 45 42 56 54 54 0D
EF BB BF 57 45 42 56 54 54 20
EF BB BF 57 45 42 56 54 54 09
EF BB BF 57 45 42 56 54 54 EOF
57 45 42 56 54 54 0A
57 45 42 56 54 54 0D
57 45 42 56 54 54 20
57 45 42 56 54 54 09
57 45 42 56 54 54 EOF
|WebVTT files all begin with one of the following byte sequences (where "EOF" means the end of the file). See https://www.iana.org/assignments/media-types/text/vtt.
|Wikidata Title ID
|File format. See https://www.wikidata.org/wiki/Q3566973.
Downloadable WebVTT sample.
Every WebVTT file begins with an optional UTF-8 BOM, then the ASCII string "WEBVTT", and finally a space, tab, line break, or the end of the file.
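The signature rule (optional UTF-8 BOM, the ASCII string "WEBVTT", then a space, tab, line break, or end of file) can be checked with a few lines of Python. This is a sketch for format identification only; the function name looks_like_webvtt is an assumption, and a match does not mean the rest of the file is valid.

```python
UTF8_BOM = b"\xef\xbb\xbf"
MAGIC = b"WEBVTT"
# Bytes allowed directly after the magic string: LF, CR, space, tab.
TERMINATORS = (b"\n", b"\r", b" ", b"\t")


def looks_like_webvtt(data: bytes) -> bool:
    """Return True if data starts with a WebVTT signature."""
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    if not data.startswith(MAGIC):
        return False
    following = data[len(MAGIC):len(MAGIC) + 1]
    # An empty slice means the file ends right after "WEBVTT" (EOF case).
    return following == b"" or following in TERMINATORS
```

In practice only the first ten bytes of a file need to be read, for example with open(path, "rb").read(10).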
Interesting - According to Andreas Tai in Balisage Paper: WebVTT versus TTML: XML considered harmful for Web Captions? August 2013, “Another difference to the TTML use case is, that WebVTT is designed to only serve as a web distribution format for subtitles. There is no ambition for it to be used as an intermediary format. And although later in the specification process documents and extensions were published to support the translation from existing US broadcast standards into WebVTT[P11], support for legacy formats had not been a requirement from the beginning.”
WebVTT was initially created and released in 2010. Early drafts were written by WHATWG (Web Hypertext Application Technology Working Group) after discussions about which caption format HTML5 should support, with the choice being between TTML (fdd000568) and a new standard based on the SubRip (SRT, fdd000569) format. The newly chosen format was called WebSRT and shared the same .srt extension before its name was changed to WebVTT.
In November 2014, the Timed Text Working Group published a First Working Draft of WebVTT. In April 2019, it published an updated Candidate Recommendation of WebVTT.