Sustainability of Digital Formats: Planning for Library of Congress Collections


MPEG-4 File Format, Version 2

Format Description Properties Explanation of format description terms

Identification and description

Full name ISO/IEC 14496-14:2003. Information technology -- Coding of audio-visual objects -- Part 14: MP4 File Format (formal name); MPEG-4 file format, version 2 (common name)
Description The second MPEG-4 file format developed by the Moving Picture Experts Group (MPEG). The format's object-based design defines a set of tools that provide binary coded representations of individual audiovisual objects, text, graphics, and synthetic objects. (See Notes below.) The format is intended to serve web and other online applications; mobile devices, e.g., cell phones and PDAs; and broadcasting and other professional applications.
Production phase Generally a final-state (end-user delivery) format; may also serve as a middle-state format.
Relationship to other formats
    Subtype of ISO_BMFF, ISO Base Media File Format
    Has subtype MP4_FF_2_V, MPEG-4 File Format, V.2, with Visual Encoding (All Profiles)
    Has subtype MP4_FF_2_AVC, MPEG-4 File Format, V.2, with AVC, No Profile Indicated
    Has subtype MP4_FF_2_AVC_BP, MPEG-4 File Format, V.2, with AVC, Baseline Profile
    Has subtype MP4_FF_2_AVC_MP, MPEG-4 File Format, V.2, with AVC, Main Profile
    Has subtype MP4_FF_2_AVC_EP, MPEG-4 File Format, V.2, with AVC, Extended Profile
    Has subtype MP4_FF_2_AVC_HP, MPEG-4 File Format, V.2, with AVC, High Profile
    Has subtype MP4_FF_2_AVC_H10P, MPEG-4 File Format, V.2, with AVC, High 10 Profile
    Has subtype MP4_FF_2_AVC_H422P, MPEG-4 File Format, V.2, with AVC, High 4:2:2 Profile
    Has subtype MP4_FF_2_AVC_H444P, MPEG-4 File Format, V.2, with AVC, High 4:4:4 Profile
    Has subtype MP4_FF_2_AAC, MPEG-4 File Format, V.2, with Advanced Audio Coding
    Has subtype For other object types, not described at this time
    Has earlier version MP4_FF_1, MPEG-4 File Format, Version 1

Local use

LC experience or existing holdings The content produced by the NDIIPP partnership project with SCOLA consists of foreign television news broadcasts in MP4_FF_2_V, MPEG-4 File Format, V.2, with Visual Encoding. The Library of Congress holds many MPEG-4 files, over 225 TB in early 2023, across numerous collections.
LC preference

The Library of Congress Recommended Formats Statement (RFS) lists MPEG-4 as an Acceptable viewing proxy format for Video - File-Based and Physical Media.

Sustainability factors


Open standard in that it is fully documented and disclosed. As with any ISO-sponsored project, updates to the specification go through the ISO process, which includes funneling feedback through national member bodies, such as ANSI in the case of the USA. This process is transparent in its procedure, but because membership in national bodies is limited (for example, individuals are not eligible to join ANSI as members), it is not considered an open format. Moreover, the specification documents are paywalled.

Developed through ISO technical program ISO/IEC JTC 1/SC 29 for coding of audio, picture, multimedia and hypermedia information. The working group WG11 (for coding of moving pictures and audio) is also known as the Moving Picture Experts Group (MPEG). See the ISO Standards Catalogue for the list of standards published by ISO/IEC JTC1/SC29. See for information specific to MPEG-4 and its many parts.


ISO/IEC 14496-14:2003. Information technology -- Coding of audio-visual objects -- Part 14: MP4 File Format.

The total documentation package for ISO/IEC 14496 is extensive; 17 parts have been published from 1998 to 2004, with more to come. See complete list of documents in Format specifications below.

Adoption Appears to be more widely adopted than MP4_FF_1.  Overall, the adoption of MPEG-4 has been slowed by licensing terms that require some content disseminators to pay fees according to the number of end users or the extent of content delivered. As adoption advances, it may not extend to all profiles, levels, or parts of the standard.  
    Licensing and patents

MPEG-4 Visual, Systems, and Advanced Video Coding licensing is managed by MPEG LA LLC. These licenses cover the manufacture and sale of devices or software and, for some content disseminators, levy fees according to number of end users or the extent of content delivered. The arrangements are updated periodically; for example, in January 2005, MPEG LA announced that the patent portfolio had been expanded to cover the FRExt (Fidelity Range Extensions) associated with MPEG-4_AVC and ITU H.264.

MPEG-4 Audio licensing is managed by Via Licensing Corporation (link available through Internet Archive), an independent subsidiary of Dolby Laboratories. MPEG-4 Audio licensing appears to be limited to the manufacture of devices or software.

Transparency Depends upon the included encodings, but all MPEG-4 encodings require algorithms and tools to read, and building such tools requires sophistication.

The inclusion of metadata of various types is a key element in MPEG-4. As indicated in the notes below, object and scene descriptions are required in order for MPEG-4 content to be presented.

Semantic description is carried by Object Content Information (OCI) descriptors and streams; the standard also permits the inclusion of MPEG-7 data, a separately standardized structure for metadata to support discovery and other purposes.

External dependencies Playback of surround sound requires multiple loudspeakers.
Technical protection considerations MPEG-4 offers a standardized Intellectual Property Management and Protection (IPMP) interface consisting of IPMP-Descriptors (IPMP-Ds) and IPMP-Elementary Streams (IPMP-ES) that allow the design and use of domain-specific IPMP systems.

Quality and functionality factors

Moving Image
Normal rendering Good support. The format supports timescales that manage the playout of time-based media streams and hint tracks employed in streaming applications.
Clarity (high image resolution) Depends upon encoding; see MPEG-4_V and MPEG-4_AVC.
Functionality beyond normal rendering MPEG-4 program streams may be multiplexed in MPEG-2 transport streams. Random access and other features are discussed in the specification.
Fidelity (high audio resolution)

Depends upon encoding; the encodings used are generally lossy and provide moderate to very good fidelity. See, for example, AAC_MP4, considered to be superior to MP3 (MPEG-2 layer 3 audio) at a given bit rate.

The MPEG-4 standard also provides support for other "natural" sound encodings, e.g., parametric coding (HILN or Harmonic and Individual Lines plus Noise) and CELP (Code Excited Linear Prediction) and other encodings for speech. The standard also supports the synthesis of audio and what is called Synthetic-Natural Hybrid Coding (SNHC). The presentation of these elements depends upon the use of AudioBIFs (Audio BInary Format for Scenes). In 2005, the MPEG committee announced two additional audio capabilities: Audio Lossless coding (ALS; lossless compression of multi-channel sound using time-domain prediction and entropy coding) and Scalable to Lossless coding (SLS; a scalable enhancement layer is added to a lossy bitstream that extends the representation to lossless but which can be truncated at delivery time). The compilers of this document do not know the degree to which any of these various elements may be implemented in practice.

Multiple channels

The AAC_MP4 audio structure provides a capability of up to 48 main audio channels, 16 LFE (Low Frequency Encoding or Effects) channels, 16 overdub/multilingual channels, and 16 data streams.

SNHC [and other note-based or synthetic?] sound can be spatially presented using extensions of the concepts initially implemented in Virtual Reality Modeling Language (VRML).

Support for user-defined sounds, samples, and patches Not applicable.
Functionality beyond normal rendering Not fully investigated at this time. Recent published or announced additions to the standard include Part 16, the Animation Extension Framework; Part 17 for "timed text," e.g., subtitles or karaoke; Part 18 for font compression and streaming; and Part 22 for Open Fonts based on the OpenType specification.

File type signifiers and format identifiers

Tag Value Note
Filename extension mp4

Paraphrased from the former site: MP4 can be used for MPEG 4 video files, combined video and audio files, or just plain MPEG 4 audio. M4A files contain only MPEG 4 Audio. Apple started using M4A to identify files unprotected by digital rights management; note that protected QTA_AAC files carry the M4P and M4B (for bookmarkable files) extensions. Apple felt that MP4 was too general (video, video/audio, or audio) and might confuse some media players. Until recently, encoder and player software like Nero and Compact used .mp4 for audio files while WinAmp 5.02, Apple iTunes, and others used .m4a. Today, most audio software developers allow you to choose the file extension you prefer.

The Wikipedia article Apple Lossless (consulted November 2, 2012) reports that the m4a extension is used for files containing either AAC_MP4 or the Apple Lossless encoding, wrapped in the MPEG4_FF_2 (MPEG-4, version 2) file format.

Internet Media Type video/mp4
According to IETF RFC 4337 (March 2006), for files with video and audio streams (including MPEG-J; see note 1).
Internet Media Type audio/mp4
According to IETF RFC 4337 (March 2006), for files with audio but no visual aspect (including MPEG-J; see note 1).
Internet Media Type application/mp4
According to IETF RFC 4337 (March 2006), for files with neither visual nor audio presentations but only MPEG-J (see note 1) or MPEG-7 metadata.
Internet Media Type application/mpeg4-iod
IOD (Initial Object Descriptor) in binary format and (with appended xmt) in textual format, from IETF RFC 4337 (March 2006).
Internet Media Type video/mp4v-es
Additional MIME types referred to in various documents. IETF RFC 3016 reports that MIME types may have indicators for data rate or profile-level appended to them.
Magic numbers Not found.  Comments welcome.   
Uniform Type Identifier (Mac OS) mpg4
Similar in function to a filename extension; the mpg4 type code is documented in IETF RFC 4337 (March 2006).
File type brand (ISO Base Media File Format) mp42
ISO_BMFF includes a file type box that contains major and minor brands (identifiers); this brand is specified in Part 14, Section 4 (ISO/IEC 14496-14:2003. Information technology -- Coding of audio-visual objects -- Part 14: MP4 File Format, p. 6).
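
As a rough illustration of the file type brand described above, the following Python sketch reads a file type box and reports its major brand. It assumes the general ISO base media file format box layout (a 32-bit big-endian size, a four-character type, then for `ftyp` a major brand, a minor version, and a list of compatible brands) and further assumes that the file type box is the first box in the stream. The names `FileTypeBox` and `read_file_type_box` are hypothetical and chosen for this example only.

```python
import io
import struct
from typing import BinaryIO, NamedTuple, Optional, Tuple


class FileTypeBox(NamedTuple):
    major_brand: str
    minor_version: int
    compatible_brands: Tuple[str, ...]


def read_file_type_box(stream: BinaryIO) -> Optional[FileTypeBox]:
    """Parse a leading 'ftyp' box; return None if the stream does not start with one."""
    header = stream.read(8)
    if len(header) < 8:
        return None
    # Box header: 32-bit big-endian size followed by a four-character type.
    size, box_type = struct.unpack(">I4s", header)
    if box_type != b"ftyp" or size < 16:
        return None
    payload = stream.read(size - 8)
    if len(payload) < 8:
        return None
    major_brand, minor_version = struct.unpack(">4sI", payload[:8])
    # Remaining payload is a sequence of four-character compatible brands;
    # ignore any trailing bytes that do not form a complete brand.
    usable = len(payload) - (len(payload) - 8) % 4
    brands = tuple(
        payload[i:i + 4].decode("ascii", "replace") for i in range(8, usable, 4)
    )
    return FileTypeBox(major_brand.decode("ascii", "replace"), minor_version, brands)


# Hypothetical in-memory sample: a 24-byte ftyp box with major brand "mp42".
sample = struct.pack(">I4s4sI4s4s", 24, b"ftyp", b"mp42", 0, b"mp42", b"isom")
box = read_file_type_box(io.BytesIO(sample))
print(box.major_brand, box.compatible_brands)
```

For a file on disk, the same function could be called on a handle opened in binary mode, for example `open(path, "rb")`. A result of None means only that this sketch did not find a leading file type box, not that the file is invalid.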

Notes


The four file formats associated with the ISO/IEC 14496 family of specifications are:

  • MP4_FF_1, "version 1" from Part 1 (2001)
  • MP4_FF_2, "version 2," this document, from Part 14
  • MP4_FF_AVCE, for Advanced Video Coding extensions, from Part 15
  • MP4_XMT, "textual format" from Part 11

Version 2 is very similar to its predecessor MP4_FF_1 as both owe a debt to the QuickTime file format that preceded them. This lineage is shared with the supertype for MP4_FF_2, ISO_BMFF, defined in Parts 12 of both the MPEG-4 and JPEG 2000 standards.

Note that "object-oriented building blocks" are called boxes in this file format and its parent, ISO_BMFF; in contrast, they are called atoms in the predecessor MP4_FF_1 and QuickTime.
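
To make the idea of boxes more concrete, here is a minimal Python sketch that lists the top-level boxes in a seekable stream. It assumes the usual ISO base media file format header conventions: a 32-bit size and four-character type, where a size of 1 signals that a 64-bit size follows and a size of 0 means the box runs to the end of the file. The function name `iter_top_level_boxes` is hypothetical, and the sketch does not descend into nested boxes.

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple


def iter_top_level_boxes(stream: BinaryIO) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each top-level box in a seekable stream."""
    stream.seek(0, io.SEEK_END)
    end = stream.tell()
    offset = 0
    while offset + 8 <= end:
        stream.seek(offset)
        size, box_type = struct.unpack(">I4s", stream.read(8))
        header_len = 8
        if size == 1:
            # A 64-bit size field follows the type field.
            raw = stream.read(8)
            if len(raw) < 8:
                break
            (size,) = struct.unpack(">Q", raw)
            header_len = 16
        elif size == 0:
            # The box extends to the end of the stream.
            size = end - offset
        if size < header_len or offset + size > end:
            break  # malformed or truncated data; stop rather than guess
        yield box_type.decode("latin-1"), offset, size
        offset += size


# Hypothetical in-memory sample: ftyp, free, and an mdat that runs to the end.
sample = (
    struct.pack(">I4s4sI4s4s", 24, b"ftyp", b"mp42", 0, b"mp42", b"isom")
    + struct.pack(">I4s", 8, b"free")
    + struct.pack(">I4s", 0, b"mdat")
    + b"\x00" * 4
)
for box_type, offset, size in iter_top_level_boxes(io.BytesIO(sample)):
    print(box_type, offset, size)
```

Stopping at the first inconsistent size is a deliberately conservative choice for a sketch; a production tool would likely report the problem instead of silently ending the listing.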

The object-based design of MPEG-4 is characterized as follows in Fernando Pereira and Touradj Ebrahimi's The MPEG-4 Book (Upper Saddle River, NJ: IMSC Press, 2002): "MPEG-4 is an ISO/IEC standard developed by MPEG for communicating interactive audiovisual scenes. The standard defines a set of tools that provide binary coded representation of individual audiovisual objects, text, graphics, and synthetic objects. The interactive behaviors of these objects and the way they are composed in space and time to form an MPEG-4 scene are dependent on the scene description, which is coded in a binary format known as binary format for scenes (BIFS) . . . . The audiovisual streams are defined as elementary streams (ESs) and managed according to the object descriptor (OD) framework . . . . In addition, the OD framework defines additional streams for object content information (OCI), MPEG-J [Java APIs], and intellectual property management and protection (IPMP)." (p. 188)

BIFS owes a debt to the Virtual Reality Modeling Language (VRML), even as it extends VRML's capabilities and employs binary encoding. Timing of elements in MPEG-4 is managed by a Synchronization Layer (SL). The delivery of MPEG-4 content is supported by the Delivery Multimedia Framework or DMIF and its application interface.

MPEG-J is described in Part 1 of the standard (ISO/IEC 14496-1:2004). This API for the interoperation of MPEG-4 media players with Java code is contrasted with a conventional parametric system. "By combining MPEG-4 media and safe executable code, content creators may embed complex control and data processing mechanisms with their media data to intelligently manage the operation of the audio-visual session. The parametric MPEG-4 System forms the Presentation Engine while the MPEG-J subsystem controlling the Presentation Engine forms the Application Engine. The Java application is delivered as a separate elementary stream to the MPEG-4 terminal. There it will be directed to the MPEG-J run time environment, from where the MPEG-J program will have access to the various components and required data of the MPEG-4 player to control it." (p. xii)


Format specifications

Useful references


Books, articles, etc.

1 Adapted from MPEG-J (MPEG-4 Java), originally defined in part 1 of the MPEG-4 standard [MPEG-4 Systems Standard]. MPEG-J lets content creators embed simple or complex algorithmic control along with audio and video streams. MPEG-J enables monitoring of network bandwidth (and packet losses) and helps in adapting to a wide range of dynamically varying network conditions.

Last Updated: 03/14/2024