Sustainability of Digital Formats: Planning for Library of Congress Collections


Matroska Multimedia Container


Identification and description

Full name Matroska Multimedia Container
Description

Matroska is an open, non-proprietary multimedia container format based on the EBML for Matroska structure. Since 2018, the in-process specification for version 4 of the Matroska Media Container Format has been published as a sequence of IETF Internet Drafts, available at the Matroska Media Container Format Specifications datatracker. As of 2020, Matroska is a Proposed Standard under active review with the most recent update on April 17, 2020. The published IETF Internet Draft (available through Internet Archive) states in the Status of this document (link available through Internet Archive) section: "This document is a work-in-progress specification defining the Matroska file format as part of the IETF Cellar working group. But since it's quite complete it is used as a reference for the development of libmatroska. Note that versions 1, 2 and 3 have been finalized. Version 4 is currently work in progress. There MAY be further additions to v4."

As a purpose-built multimedia wrapper, Matroska has structures to carry a variety of payloads (as defined in the TrackType element), including video, audio, subtitles, and metadata, as well as other data types such as 'complex', logo, buttons, and control.

Structurally, Matroska is built on the EBML framework with fixed declarations that define the file as a Matroska file. Specifically, within the EBML Header, EBML DocType must be EBMLSchema docType="matroska", EBMLMaxIDLength must be 4 and EBMLMaxSizeLength must be between 1 and 8 inclusive. See EBML for Matroska for more details about EBML. Note that the required EBML 'Root Element' in the EBML Schema is known as a 'Segment' in Matroska.

The Matroska Structure (link available through Internet Archive) includes several diagrams that help visualize the structure. It defines eight Top Level Elements which may occur within the Segment: SeekHead (also known as MetaSeek), Info, Tracks, Chapters, Cluster, Cues, Attachments, and Tags. Each of these contains a number of subelements to further define the content. Technical details about the audio and video payloads are defined in the Tracks element, such as CodecID and CodecName for any Track type; FieldOrder, StereoMode, DisplayWidth, DisplayHeight, AspectRatioType and color for video tracks; and SamplingFrequency, Channels and BitDepth for audio tracks. The extent of the required subelements depends on the codec used for the track but the subelements must provide all the data needed by the codec to decode the data of the specified track. The payload data itself is stored as 'Blocks' (either as SimpleBlocks or as a BlockGroup) in the Cluster Element. Subtitles and captions are well supported via a defined subtitle track and the specification recommends (link available through Internet Archive) "For each subtitle track present, each subtitle frame SHOULD be referenced by a CuePoint Element with a CueDuration Element."

The Attachments element is notable in that it supports the attachment of pictures including cover art, webpages, programs, or even the codec needed to play back the file, much like the Generic Stream Partition functionality of a SMPTE RDD 48 MXF file (see 6.2.4.1 Generic Stream Partitions and Embedding Data [informative] on p. 19).
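
The EBML Header constraints described above (DocType of "matroska", EBMLMaxIDLength of 4, EBMLMaxSizeLength between 1 and 8) can be sketched as a small check. This is a minimal illustration in Python, not a conformance checker: the function names are hypothetical, the element IDs (0x4282, 0x42F2, 0x42F3) and the defaults of 4 and 8 for absent elements are assumptions taken from the EBML specification, and nested elements inside the header are not handled.

```python
EBML_HEADER_ID = 0x1A45DFA3
DOCTYPE_ID = 0x4282          # assumed from the EBML specification
MAX_ID_LENGTH_ID = 0x42F2    # assumed from the EBML specification
MAX_SIZE_LENGTH_ID = 0x42F3  # assumed from the EBML specification


def read_vint(data, pos, keep_marker):
    """Read one EBML variable-length integer; return (value, next_pos).

    Element IDs keep the length-marker bit, element sizes drop it.
    """
    first = data[pos]
    length, mask = 1, 0x80
    while length <= 8 and not (first & mask):
        mask >>= 1
        length += 1
    if length > 8 or pos + length > len(data):
        raise ValueError("invalid or truncated variable-length integer")
    value = first if keep_marker else first & (mask - 1)
    for byte in data[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, pos + length


def read_header_fields(data):
    """Return {element_id: raw_bytes} for the children of the EBML Header."""
    header_id, pos = read_vint(data, 0, keep_marker=True)
    if header_id != EBML_HEADER_ID:
        raise ValueError("not an EBML document")
    size, pos = read_vint(data, pos, keep_marker=False)
    end = pos + size
    fields = {}
    while pos < end:
        element_id, pos = read_vint(data, pos, keep_marker=True)
        element_size, pos = read_vint(data, pos, keep_marker=False)
        fields[element_id] = data[pos:pos + element_size]
        pos += element_size
    return fields


def is_matroska_header(data):
    """Apply the three header constraints named in the text."""
    fields = read_header_fields(data)
    doc_type = fields.get(DOCTYPE_ID, b"").rstrip(b"\x00").decode("ascii", "replace")
    max_id = int.from_bytes(fields.get(MAX_ID_LENGTH_ID, b"\x04"), "big")
    max_size = int.from_bytes(fields.get(MAX_SIZE_LENGTH_ID, b"\x08"), "big")
    return doc_type == "matroska" and max_id == 4 and 1 <= max_size <= 8


# Hand-built header: DocType "matroska", EBMLMaxIDLength 4, EBMLMaxSizeLength 8.
sample = bytes.fromhex("1A45DFA3" "93" "4282" "88" "6D6174726F736B61"
                       "42F2" "81" "04" "42F3" "81" "08")
print(is_matroska_header(sample))
```

A WebM header built the same way would fail the DocType comparison, which is the only header-level difference this sketch looks at.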

Improved support for timecode, as opposed to timestamps (link available through Internet Archive), is still in development as is support for multi-planar and 3D videos.

Production phase A final-state format for end-user delivery.
Relationship to other formats
    Has subtype Matroska_FFV1, Matroska File Format with FFV1 video encoding
    Has subtype Matroska_AVC, Matroska File Format with MPEG-4, Advanced Video Coding (Part 10) (H.264)
    Has subtype Matroska_MPEG-2, Matroska File Format with MPEG-2 Video Encoding (H.262)
    Has subtype Matroska_LPCM, Matroska File Format with LPCM Audio Encoding
    Has subtype Matroska_MP3, Matroska File Format with MP3 Audio Encoding
    Defined via EBML, Extensible Binary Meta Language. Language used to code the Matroska file format.
    Has modified version WebM, WebM

Local use

LC experience or existing holdings As of this writing in August 2020, the Motion Picture, Broadcasting and Recorded Sound Division (MBRS) has ingested 107 ffv1/mkv files. These files were received from CUNY-TV through the American Archive of Public Broadcasting and the Library will receive many more of these files through the AAPB in the coming months.
LC preference FFV1 codec in the Matroska container is an 'acceptable' format in the Recommended Formats Statement (RFS) for Video -- File-based but only for content without closed captions or timecode information. While captions are supported in Matroska, FFmpeg functionality is limited (and does not, for example, support Timed Text Markup Language or TTML). Improved support for timecode, as opposed to timestamps, is still in development as is support for multi-planar and 3D videos.

Sustainability factors

Disclosure

The work-in-progress specification is published concurrently in two locations: the Matroska website and the Internet Engineering Task Force (IETF) Matroska Media Container Format Specifications datatracker. The Matroska Web site includes additional information about licensing, test files, source code repositories and more. The IETF datatracker only publishes the specification documentation. Since 2018, the in-process specification for version 4 of the Matroska Media Container Format has been published as a sequence of Internet Drafts (which must be updated every six months by IETF policy), available at Matroska Media Container Format Specifications datatracker. As of 2020, Matroska version 4 is a Proposed Standard under active review with the most recent Internet Draft published on April 17, 2020 and expiring in October 2020. See history section of the Notes below which also reports on the CELLAR IETF standardization project launched in 2015.

    Documentation

In August 2020, the "source" of the Matroska specification is an XML (link available through Internet Archive) file hosted on Matroska.org's GitHub repository (although that table has not been updated since 2018). This table is also used to generate the semantic data used in libmatroska and libmatroska2.

Specification information is published concurrently in two locations: The Matroska website and the Internet Engineering Task Force (IETF) Matroska Media Container Format Specifications datatracker. Since 2018, the in-process specification for version 4 of the Matroska Media Container Format has been published as a sequence of Internet Drafts, available at Matroska Media Container Format Specifications datatracker. An early version of the specification dated January 11, 2009 authored by Alexander Noé is available on the Matroska site but it is not clear which versions are covered in this document.

Adoption

Early adopters of Matroska in the archival community, mostly in combination with the FFV1 codec, include the City of Vancouver Archives starting in at least 2011. But the success of the No Time to Wait (NTTW) series of symposiums, starting in 2016 and initially sponsored as part of the European PREFORMA (PREservation FORMAts for culture information/e-archives) project, played a large part in the rapid spread of adoption in recent years. The first NTTW in Berlin included a report out by a Matroska working group which includes a recap of the informal discussion as well as an overview of the history to date by Matroska inventor Steve Lhomme (slides and video). Subsequent NTTW conferences (in Vienna 2017, London 2018 and Budapest 2019) have each provided status updates on the maturation of the IETF specification process and increased adoption.

Significant influencers for Matroska adoption include high profile institutions such as the Indiana University Media Digitization and Preservation Initiative (MDPI) and its influential white paper published in March 2017, Encoding and Wrapper Decisions and Implementation for Video Preservation Master Files. Author Mike Casey lays out the decision to choose Matroska (and FFV1) as their preservation master file format: "Both the Matroska specification and its underlying specification for EBML are at a mature and stable stage with thorough documentation and existing validators. Matroska has recently gained native support in the Windows OS and is also the basis for Google’s WebM format container. A number of media communities have adopted Matroska, which has seen extensive Internet usage, because of its features including extensible structured metadata, broad support of audiovisual encodings, subtitle management, etc. Open source software developers have built tools based on the Matroska specification for many years." (p.7) The acceptance of Matroska by MDPI, with its "80 units across IU Bloomington contributing more than 250,000 audio and video recordings" for digitization, gave confidence to others in the audiovisual preservation community to adopt Matroska as a preservation format.

Many other organizations followed suit including NYPL, The National Archives UK, the British Film Institute, City University of New York (CUNY) and more. Recent news as of this writing in August 2020 are two announcements that the Library of Congress has added Matroska and FFV1 as an "acceptable" format for file-based video content without closed captions and/or timecode information and that Meemoo (the Flemish Institute for Archives) will be transcoding or rewrapping all its MXF-jpeg2000 content into MKV-FFV1 from 2021, together with a large scale LTO6 to 8 migration.

The IASA-TC 06 Guidelines for the Preservation of Video Recordings recommends Matroska and FFV1 for a number of different "classes" of video content including digitized analogue video recordings (class 1), digital videotapes with encodings that are “out of reach” or inappropriate for long-term retention (class 2), and digital videotapes with encodings that can be extracted “as data” (class 3). Matroska is also used as a preservation format for digitized motion picture film as an alternative to Digital Moving-Picture Exchange (DPX). Reto Kromer and Kieran O’Leary led this work with a presentation at NTTW in 2016 (see video of presentation) with Kromer later following up in his 2017 paper Matroska and FFV1: One File Format for Film and Video Archiving?: "The Matroska container and the FFV1 video codec are good choices for single-image-based content when making archive masters. Often, a resolution of 2K, or sometimes 4K, an RGB colour space, the 4:4:4 chroma sampling and a bit-depth of 16 bit per colour channel are canonical choices. For stream-based content, the Matroska container and the FFV1 video codec are also good choices for the archive master. A resolution of HD (with pillar-boxing of letter-boxing if required), in general, the Y′CBCR colour space, the 4:2:2 subsampling and a bit-depth of 10 bit are usually considered best practice." Kromer also notes that "the Matroska container is currently not popular enough for it to be recommended for access." See slides from O'Leary and Kromer's talk on "Using Matroska and FFV1 for DPX Preservation" from The Reel Thing XXXVIII in Hollywood, California, 18–20 August 2016. This is echoed by Caroline Gil and Peter Oleksik in their talk Assessing the preservation of DPX image sequences with MoMA.

Many NTTW presentations advocate for the use of Matroska (usually with FFV1). Notable ones include O'Leary's Migrating ProRes/MOV to FFV1/MKV (2019), Peter Bubestinger-Steindl's Presets for FFV1 and MKV: Choosing the right parameters for the job (2019), and MKV and Mass Digitization: What We've Learned Since Giving Uncompressed Video the Boot by Genevieve Havemeyer-King and Ben Turkus of NYPL (video from 2017; presentation starts at about 5:17:00).

Other adopters include Data Archiving and Networked Services (Netherlands), MIPoPS (Moving Image Preservation of Puget Sound), the Smithsonian Institution (DAMS Supported File Formats), and University of Georgia, Walter J. Brown Media Archives (presentation by media archivist Callie Holmes). Matroska also plays a critical role in the WebM open media project because WebM is a subtype of Matroska.

Workflows for Matroska are supported by both proprietary and open source software. MKVToolNix is the de facto reference implementation of a Matroska multiplexer. It bundles a set of Matroska- and WebM-focused tools to get information about Matroska files (via mkvinfo), extract tracks or data from Matroska files (via mkvextract), create Matroska files from other media files (via mkvmerge), and edit properties such as header and chapter information or attachments without remuxing (via mkvpropedit). MediaArea's RAWcooked is open source software which, in conjunction with FFmpeg, encodes from, and decodes to, 'raw' audio-visual image sequences. FFmpeg encodes the audio-visual data into a Matroska container using the video codec FFV1 and the audio codec FLAC. "Whenever necessary, RAWcooked can decode the Matroska file back to the original RAW image sequence, including restoration of the original metadata and sidecar files. It is important to stress that the encoded files can be fully decoded, and that this process will be created bit-by-bit identical files to the originals. Not only is the image and/or sound content fully preserved, but also all enclosed metadata and the all of the file’s characteristics. Therefore, an encoded and decoded RAW file cannot be differentiated from its original." Other tools listed on the Matroska website include: mkclean, a command line tool to clean and optimize Matroska and WebM files; mkvalidator, a command line conformance checker for Matroska and WebM files; and Meteorite, which repairs damaged Matroska files, although this project may be dormant as the last update seems to be prior to 2009.
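
As a rough illustration of the tool roles described above, the following Python sketch builds command lines for FFmpeg and two MKVToolNix tools without running them. The helper names, file names and title value are hypothetical, and the options shown (-c:v ffv1, -c:a flac, --edit info --set title=...) are a minimal subset that should be checked against the documentation for the installed versions; preservation workflows typically add further encoder settings.

```python
import shlex


def ffmpeg_ffv1_flac_command(source, target):
    """Build (but do not run) an FFmpeg command: FFV1 video, FLAC audio, Matroska output."""
    return ["ffmpeg", "-i", source, "-c:v", "ffv1", "-c:a", "flac", target]


def mkvtoolnix_commands(path):
    """Build (but do not run) basic MKVToolNix invocations for one file."""
    return {
        # mkvinfo reports the element structure of a Matroska file.
        "inspect": ["mkvinfo", path],
        # mkvpropedit edits properties in place, without remuxing.
        "set_title": ["mkvpropedit", path, "--edit", "info", "--set", "title=Example title"],
    }


if __name__ == "__main__":
    print(shlex.join(ffmpeg_ffv1_flac_command("master.mov", "master.mkv")))
    for argv in mkvtoolnix_commands("master.mkv").values():
        print(shlex.join(argv))
```

Building argument lists rather than shell strings keeps file names with spaces intact if the commands are later passed to a process launcher.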

Third-party support applications include, for media players, VLC (VideoLAN client), MPC-HC (Media Player Classic - Home Cinema), a free and open-source video and audio player for Windows, BS.player (multimedia player), Zoomplayer (video player for Windows), foobar2000 (media player for Windows), KMPlayer and SMPlayer; and, for encoders, the open source HandBrake and MakeMKV. Editing options include Pinnacle Studio, 4videosoft Video Converter for Mac and Anusoft Video Converter for Mac.

    Licensing and patents

According to the Matroska website, "Matroska has several components that are licensed in different ways to maximize its software and hardware adoption" including LGPL for LibEBML (A simplified binary extension of XML for the purpose of storing and manipulating data in a hierarchical form with variable field lengths) and LibMatroska (C++ library to parse Matroska files, it requires libEBML or libEBML2); BSD for LibEBML2 (Another EBML parser with a similar interface to libEBML but written in C) and Core C (A low level API layer for the C programming language).

Transparency Depends upon included encodings, some of which will depend upon algorithms and tools to read and require sophistication to build tools.
Self-documentation

Built on well-structured EBML, Matroska has multiple options within the file structure to provide technical and descriptive metadata. Technical details about the audio and video payloads are defined in the Tracks element, such as CodecID and CodecName for any Track type; FieldOrder, StereoMode, DisplayWidth, DisplayHeight, AspectRatioType and color for video tracks; and SamplingFrequency, Channels and BitDepth for audio tracks. The "Info" Element includes information about its parent Segment including durations, unique identifiers and title for the Segment. The "Tags" Element "contains metadata describing Tracks, Editions, Chapters, Attachments, or the Segment as a whole."

Matroska has a robust tagging infrastructure which enables structured descriptive labels. The official set of tags covers Nesting Information, Organization Information, Titles, Nested Information, Entities, Search and Classification, Temporal Information, Spacial Information, Personal, Technical Information, Identifiers, Commercial, Legal, and Notes. Tags can be nested for more granularity so that "when a Tag is nested within another Tag, the nested Tag becomes an attribute of the base tag."
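
The nesting behaviour quoted above can be pictured with a small Python sketch. The SimpleTag class and flatten function here are hypothetical illustrations, not part of any Matroska library, and the tag names and values are examples only; the point is that a nested tag is read as an attribute of the tag that contains it.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleTag:
    """Illustrative stand-in for a tag with a name, a value and nested tags."""
    name: str
    value: str
    children: list = field(default_factory=list)


def flatten(tag, prefix=""):
    """Return (path, value) pairs, with nested tags qualified by their base tag."""
    path = f"{prefix}/{tag.name}" if prefix else tag.name
    rows = [(path, tag.value)]
    for child in tag.children:
        rows.extend(flatten(child, path))
    return rows


artist = SimpleTag("ARTIST", "Jane Example", [SimpleTag("SORT_WITH", "Example, Jane")])
for path, value in flatten(artist):
    print(path, "=", value)
```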

External dependencies

The specification points out that Matroska inherits security considerations from EBML which can include the vulnerability of an "easter egg" within the Attachment element which could contain arbitrary and potentially executable data. To combat this, "Matroska Readers that extract or use data from Matroska Attachments SHOULD check that the data adheres to expectations." In addition, a Matroska Attachment may have an inaccurate IANA media type or MIME type.
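
One way a reader might "check that the data adheres to expectations" is to compare an attachment's declared media type with the leading bytes of its payload. The Python sketch below is only an illustration of that idea: the function name is hypothetical, the signature table is deliberately tiny, and a match on leading bytes does not make an attachment safe to open or execute.

```python
# Leading bytes for a few common media types (an illustrative, incomplete table).
KNOWN_SIGNATURES = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
    "image/gif": b"GIF8",
}


def attachment_looks_consistent(declared_media_type, payload):
    """Return True only if the payload starts with the signature of its declared type.

    Media types missing from the table are reported as not verified (False).
    """
    expected = KNOWN_SIGNATURES.get(declared_media_type.lower())
    if expected is None:
        return False
    return payload.startswith(expected)


cover = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16
print(attachment_looks_consistent("image/png", cover))
print(attachment_looks_consistent("image/png", b"MZ\x90\x00"))
```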

Technical protection considerations

Encryption is supported in a generic way "to allow people to implement whatever form of encryption is best for them" according to the specification, which adds that "It is easily possible to use the encryption framework in Matroska as a type of DRM." The ContentEncryption element must be present when the ContentEncodingType element has a value of 1, indicating that encryption is present; it is ignored for any other value. The ContentEncAlgo Element defines the type of encryption used from a controlled vocabulary list. The value '0' in the ContentEncAlgo Element means that the contents have not been encrypted but only signed. Encryption can also be layered within Matroska so multiple types of encryption can be used.


Quality and functionality factors

Moving Image
Normal rendering Good support.
Clarity (high image resolution) Varies according to encoding. See FFV1, MPEG-4_AVC and MPEG-2.
Functionality beyond normal rendering

Matroska can contain 3D video. The authors of this document do not know of examples of 3D video in Matroska files so comments are welcome. The Matroska website notes that "3D support is still in infancy and may evolve to support more features." See Notes below for more about 3D.

Sound
Normal rendering Good support.
Fidelity (high audio resolution) Varies according to encoding. See LPCM and MP3_ENC.

File type signifiers and format identifiers

Tag | Value | Note
Filename extension | mkv | Used for files that contain at least one video track (usually with at least one audio track and optionally with subtitle tracks). This is the most commonly used extension.
Filename extension | mka | Used for audio-only files; can contain any supported audio compression format, such as MP2, MP3, Vorbis, AAC, AC3, DTS, or PCM.
Filename extension | mks | Used for files that only contain subtitles.
Filename extension | mk3d | For stereoscopic video files. See Notes below.
Internet Media Type | video/x-matroska; audio/x-matroska; video/x-matroska-3d | From the Matroska IETF draft specification (link available through Internet Archive). These are not listed with IANA.
Magic numbers | HEX: 0x1A45DFA3; ASCII: EBML | From EBML specification, section 11.2.1. Used for all EBML-based files including Matroska Multimedia Container and WebM.
Magic numbers | See note. | 1A45DFA3{0-32}4282886D6174726F736B614287 from PRONOM fmt/569 entry: 0x1A45DFA3 ('EBML'), 0x428288 (EBML DocType element ID), 0x6D6174726F736B61 ('matroska'), 0x4287 (DocTypeVersion element ID). "The specification allows for ASCII text before the EBML header and considers the level 0 EBML header element to be the beginning of the EBML document. For now this signature is offset to 1024 bytes to allow for some ASCII text to precede the header, but to assume the EBML header of a Matroska multimedia format will begin near the beginning of the file and to mitigate against wasteful processing of large files that are not Matroska multimedia."
Indicator for profile, level, version, etc. | See note. | "DocTypeVersion" and "DocTypeReadVersion" within the EBML Header inform the reading application which version of Matroska to expect. According to the specification, "DocTypeVersion" must be equal to or greater than the highest Matroska version number of any "Element" present in the Matroska file and the "DocTypeReadVersion" must contain the minimum version number that a reading application must support in order to play the file.
Pronom PUID | fmt/569 | Matroska v. 1-4. See http://www.nationalarchives.gov.uk/PRONOM/fmt/569.
Wikidata Title ID | Q223535 | No versions declared. See https://www.wikidata.org/wiki/Q223535.
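
The magic numbers listed above can be applied as a simple byte-pattern search. The Python sketch below follows the PRONOM fmt/569 pattern (the EBML magic, a gap of 0 to 32 bytes, then the DocType bytes for 'matroska' followed by the DocTypeVersion ID). The function name is hypothetical, and treating the 1024-byte allowance as a search window for the start of the signature is an assumption about how the offset is meant to be applied.

```python
import re

EBML_MAGIC = bytes.fromhex("1A45DFA3")
# 0x428288, then 'matroska', then 0x4287, as quoted from the PRONOM entry.
MATROSKA_TAIL = bytes.fromhex("4282886D6174726F736B614287")
SIGNATURE = re.compile(
    re.escape(EBML_MAGIC) + rb".{0,32}" + re.escape(MATROSKA_TAIL),
    re.DOTALL,
)
# Assumption: the signature may begin anywhere within the first 1024 bytes.
SEARCH_LIMIT = 1024 + len(EBML_MAGIC) + 32 + len(MATROSKA_TAIL)


def sniff_container(head):
    """Classify the first bytes of a file as 'matroska', 'ebml' or 'unknown'."""
    if SIGNATURE.search(head, 0, SEARCH_LIMIT):
        return "matroska"
    if EBML_MAGIC in head[:SEARCH_LIMIT]:
        return "ebml"  # EBML-based, but the Matroska DocType pattern was not found
    return "unknown"


# Hand-built EBML Header with DocType 'matroska', for demonstration only.
mkv_head = bytes.fromhex(
    "1A45DFA3A3" "42868101" "42F78101" "42F28104" "42F38108"
    "4282886D6174726F736B61" "42878104" "42858102"
)
print(sniff_container(mkv_head))
```

A file whose DocType is 'webm' shares the EBML magic but not the second pattern, so this sketch reports it as 'ebml' rather than 'matroska'.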

Notes

General

Matroska is an English word derived from the Russian word Matryoshka which means nesting doll, a reference to this format's ability to wrap a number of component elements.

Timestamp and timecode have an interesting history in Matroska. From Specification notes on the Matroska website: "Historically timestamps in Matroska were mistakenly called timecodes. The Timestamp Element was called Timecode, the TimestampScale Element was called TimecodeScale, the TrackTimestampScale Element was called TrackTimecodeScale and the ReferenceTimestamp Element was called ReferenceTimeCode." But the two are not synonymous. Timecode is defined by SMPTE Standard 12M. Timecode can be continuous throughout the file or have gaps (known as discontinuous) and can take various forms, including but not limited to Linear timecode (LTC), Vertical interval timecode (VITC) and Ancillary Time Code (ATC), and it can be of various frame rates and frame counting modes. The typical format is represented as HH:MM:SS:FF (hour:minute:second:frame, with variation depending on whether the timecode is drop-frame or non-drop-frame). Timestamps, on the other hand, mark a specific location in the file and are very useful in transcriptions and chaptering. However, they are not as granular as timecode and start at the beginning of the file. Timestamps usually use HH:MM:SS format to record elapsed time from the beginning of the audio or video file. The gist of this issue is that timestamps are well supported in Matroska but timecode is not. Timecode is not present in every file but it is common on professional broadcast content, and holders of files that do have it may want to retain it for continuity and integrity. The Library of Congress, for example, lists the FFV1 codec in the Matroska container as an 'acceptable' format in the Recommended Formats Statement (RFS) for Video -- File-based but only for content without closed captions or timecode information.
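
The difference between the two notations can be made concrete with a short Python sketch. It formats an elapsed-time timestamp as HH:MM:SS and a frame position as non-drop-frame HH:MM:SS:FF. The function names and the start_frame parameter are hypothetical; the sketch assumes an integer frame rate and does not attempt drop-frame counting.

```python
def timestamp_hms(elapsed_seconds):
    """Format elapsed time from the start of the file as HH:MM:SS."""
    total = int(elapsed_seconds)
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def timecode_non_drop(frame_index, fps, start_frame=0):
    """Format a frame position as non-drop-frame HH:MM:SS:FF.

    start_frame models a timecode that does not begin at zero,
    unlike a timestamp, which counts from the start of the file.
    """
    total_seconds, frames = divmod(frame_index + start_frame, fps)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


print(timestamp_hms(3725.9))
print(timecode_non_drop(93137, 25))
print(timecode_non_drop(12, 25, start_frame=90000))
```

The last call shows why the two cannot be derived from each other without extra information: the same twelfth frame carries a timecode of one hour when the source tape's timecode started there.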

Regarding 3D, multi-planar, and stereoscopic video footage, the Matroska Specification notes Web page offers an extensive discussion, excerpted here: "There are 2 different ways to compress 3D videos: have each 'eye' track in a separate track and have one track have both 'eyes' combined inside (which is more efficient, compression-wise). . . . For the single track variant, there is the StereoMode element which defines how planes are assembled in the track (mono or left-right combined). Odd values of StereoMode means the left plane comes first for more convenient reading. The pixel count of the track (PixelWidth/PixelHeight) should be the raw amount of pixels (for example 3840x1080 for full HD side by side) and the DisplayWidth/Height in pixels should be the amount of pixels for one plane (1920x1080 for that full HD stream). Old stereo 3D were displayed using anaglyph (cyan and red colours separated). For compatibility with such movies, there is a value of the StereoMode that corresponds to AnaGlyph. There is also a "packed" mode (values 13 and 14) which consists of packing 2 frames together in a Block using lacing. The first frame is the left eye and the other frame is the right eye (or vice versa). The frames should be decoded in that order and are possibly dependent on each other (P and B frames). For separate tracks, Matroska needs to define exactly which track does what. TrackOperation with TrackCombinePlanes do that."
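
The pixel arithmetic in the excerpt above, for the single-track side-by-side case, can be sketched in Python as follows. The function name is hypothetical, and the sketch covers only the side-by-side layout described in the quotation; other StereoMode layouts would need their own handling.

```python
def side_by_side_display_size(pixel_width, pixel_height):
    """Return the display size of one plane for a side-by-side stereo track.

    The stored frame holds both planes next to each other, so each plane
    takes half of the stored width and the full stored height.
    """
    if pixel_width % 2:
        raise ValueError("side-by-side frames need an even pixel width")
    return pixel_width // 2, pixel_height


# Full HD side by side: 3840x1080 stored, 1920x1080 per plane.
print(side_by_side_display_size(3840, 1080))
```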

History

Matroska founder Steve Lhomme first started looking for an open and flexible format in 2001 while recording political debates during the French presidential election cycle. He first used AVI (Audio Video Interleaved) but found AVI to be too limited at the time. Although some of the issues, such as the 2GB file size limit and support of variable bitrate, were resolved in the AVI File Format with OpenDML Extensions published in 1997, AVI "still [had] no proper and spec compliant way to support modern compression formats like the excellent, open source Ogg Vorbis audio compression format."

Matroska was forked from the Multimedia Container Format project on December 6, 2002 (and its birthday was celebrated during the NTTW4 conference on the same date in 2019). According to Wikipedia, consulted August 31, 2020, Matroska "was announced on 6 December 2002 as a fork of the Multimedia Container Format (MCF), after disagreements between MCF lead developer Lasse Kärkkäinen and soon-to-be Matroska founder Steve Lhomme about the use of the Extensible Binary Meta Language (EBML) instead of a binary format." Additional major milestones include the start of WebM development (2010) and the start of the IETF standardization process in 2016.

Like EBML, the standardization of Matroska has its roots in the European PREFORMA (PREservation FORMAts for culture information/e-archives) project which had the stated intention "to research critical factors in the quality of standard implementation in order to establish a long-term sustainable ecosystem around developed tools with a variety of stakeholder groups." PREFORMA started in 2014 and was co-funded by the European Commission under its Seventh Framework Programme (link through EU Web archive) which was active from 2007 to 2013. Among the projects funded through this data call was the CELLAR (Codec Encoding for LossLess Archiving and Realtime) working group organized through IETF whose charter lists these goals: "FFV1 is a lossless video codec and Matroska is an extensible media container based on EBML (Extensible Binary Meta Language), a binary XML format. There are open source implementations of both formats, and an increasing interest in and support for use of FFV1 and Matroska. However, there are concerns about the sustainability and credibility of existing specifications for the long-term use of these formats. These existing specifications require broader review and formalization in order to encourage widespread adoption....Using existing work done by the development communities of Matroska, FFV1, and FLAC, the Working Group will formalize specifications for these open and lossless formats. In order to provide authoritative, standardized specifications for users and developers, the Working Group will seek consensus throughout the process of refining and formalizing these standards."

The formalization of EBML was needed in order to firm up Matroska, so sections of the Matroska specification which more directly pertained to EBML were moved into the EBML specification "so that the Matroska specification may build upon the EBML specification rather than act redundantly to it. The updated EBML specification includes documentation on how to define an EBML Schema which is a set of Elements with their definitions and structural requirements rendered in XML form. Matroska’s documentation now defines Matroska through an EBML Schema as a type of EBML expression." According to Ashley Blewer and Dave Rice in their 2016 iPres paper, "In 2004 (two years after the origin of Matroska), Martin Nilsson produced an RFC draft of EBML, which extensively documented the format in Augmented Backus-Naur Form (ABNF). This draft was not published by the IETF but remained on the Matroska site as supporting documentation. Also in 2004, Dean Scarff provided draft documentation for a concept of the EBML Schema."

The PREFORMA project also funded the development of MediaConch (Media CONformance CHecker) open source software. Developed by MediaArea, MediaConch is an implementation checker, policy checker, reporter, and fixer that targets preservation-level audiovisual files specifically for Matroska, Linear Pulse Code Modulation (LPCM) and FF Video Codec 1 (FFV1).

The first IETF Internet Draft (version 00) was published on July 17, 2018, soon followed by updated versions published roughly every six months (as IETF policy requires for Internet Drafts): version 01 (July 26, 2018), version 02 (January 9, 2019), version 03 (July 22, 2019), version 04 (October 27, 2019) and version 05 (April 17, 2020). The full version history is on the IETF Datatracker. The Matroska website also maintains a copy of the specification.


Format specifications


Useful references

URLs


Last Updated: 08/04/2021