Sustainability of Digital Formats: Planning for Library of Congress Collections


Uncompressed YCbCr Video Picture Stream (4:2:2)


Identification and description

Full name Uncompressed YCbCr Video Picture Stream Family (4:2:2)

A digital, color-difference component video picture stream in which the two chroma components are sampled at half the rate of luma. Reducing the horizontal chroma resolution by one-half reduces the bandwidth of the uncompressed video signal by one-third with little visual impact. As noted in the Vid_Unc_Pix format description, broadcast professionals designate the three components as YCbCr (or Y'CbCr), while specialists in data networks and computer applications tend to use the term YUV (or Y'UV).

Chroma subsampling is usually expressed as a three-part ratio (in this case 4:2:2) although it may also include a fourth part (e.g., 4:2:2:4), when alpha or transparency data is part of the stream. As explained in the Wikipedia article Chroma subsampling, the ratio describes the number of luma and chroma samples "in a conceptual region that is J pixels wide, and 2 pixels high." The three key parts of the ratio are as follows, omitting the alpha channel:

  • J: horizontal sampling reference (width of the conceptual region). Usually and in this case: 4.
  • a: number of chrominance samples (Cr, Cb) in the first row of J pixels. In this case: 2.
  • b: number of (additional) chrominance samples (Cr, Cb) in the second row of J pixels. In this case: 2.

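For 4:2:2 picture data, the conceptual region is a block of eight pixels (four wide by two high) that "contains" 12 samples: 8 luma samples and 4 chroma samples, each chroma sample being a (Cb, Cr) pair. Counted as individual values, that is 16 per region, against 24 for 4:4:4, which accounts for the one-third reduction in bandwidth.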
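The counting in the J:a:b notation can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the code counts each chroma sample as a (Cb, Cr) pair, as the list above does, before also reporting the number of individual values.

```python
from fractions import Fraction

def samples_in_region(j, a, b):
    """Count samples in the J-wide, 2-row conceptual region of a J:a:b scheme."""
    luma = 2 * j                        # one luma sample per pixel, two rows
    chroma_pairs = a + b                # (Cb, Cr) pairs in row 1 plus row 2
    values = luma + 2 * chroma_pairs    # each pair holds one Cb and one Cr value
    return luma, chroma_pairs, values

def fraction_of_444(j, a, b):
    """Size of a J:a:b stream relative to J:J:J (4:4:4) at the same bit depth."""
    return Fraction(samples_in_region(j, a, b)[2], samples_in_region(j, j, j)[2])

luma, pairs, values = samples_in_region(4, 2, 2)
print(luma, pairs, values)        # 8 4 16
print(fraction_of_444(4, 2, 2))   # 2/3
```

The 2/3 result corresponds to the one-third bandwidth reduction mentioned in the description; the alpha channel is omitted, as in the list above.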

Uncompressed YCbCr 4:2:2 video streams are encountered with two different sets of levels, one standardized and one ad hoc. The standardized levels are specified by the International Telecommunication Union Radiocommunication Sector (ITU-R) and are often referred to as "video range," "legal levels," or "studio swing." These levels carry values from 16 to 235 for Y and from 16 to 240 for Cr and Cb, assuming 8 bits per sample (higher values if 10-bit). The specification for "last generation" standard definition picture is ITU-R Recommendation BT.601 (often called Rec. 601 or by its former name, CCIR 601). BT.601 encoding of North American 525-line 60 Hz and European (and other) 625-line 50 Hz signals (both interlaced) yields 720 luminance samples and 360 samples of each chrominance component per line (non-square pixels). The specification for "current generation" digital picture is ITU-R BT.709, which codifies interlaced and progressive scanned picture at a variety of picture sizes and frame rates (square pixels in the specification's later versions). In professional video production, BT.601 and BT.709 signals are carried by the SMPTE-standardized serial digital interfaces (SDI, HD-SDI, etc.).

Meanwhile, ad hoc uncompressed YCbCr 4:2:2 video streams with "wide range" or "super white" levels (from 0 to 255, assuming 8 bits per sample) may be produced in desktop computer graphics systems.

In all cases (BT.601, BT.709, and "wide range") the data for a pair of pixels are stored in the order Cb-Y1-Cr-Y2, with the chrominance samples co-sited with the first luminance sample. Some additional information is provided in the Notes.
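As a minimal sketch of the Cb-Y1-Cr-Y2 ordering described above, the following Python function interleaves one row of samples. It assumes 8-bit samples stored one per byte; the function name and the sample values are hypothetical, and the byte order used by a particular file encoding may differ from this stream order.

```python
def pack_422_row(y, cb, cr):
    """Interleave one row as Cb, Y1, Cr, Y2 for each pair of pixels.

    y holds one luma value per pixel; cb and cr hold one value per pixel pair.
    """
    if len(y) % 2 or len(cb) != len(y) // 2 or len(cr) != len(y) // 2:
        raise ValueError("4:2:2 needs an even width and one Cb/Cr per pixel pair")
    out = bytearray()
    for i in range(len(cb)):
        # The chroma pair is shared by the two luma samples that follow it.
        out += bytes((cb[i], y[2 * i], cr[i], y[2 * i + 1]))
    return bytes(out)

row = pack_422_row(y=[16, 235, 81, 81], cb=[128, 90], cr=[128, 240])
print(list(row))  # [128, 16, 128, 235, 90, 81, 240, 81]
```

Four pixels occupy eight bytes here, two bytes per pixel on average, compared with three bytes per pixel for an 8-bit 4:4:4 stream.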

Uncompressed 4:2:2 video picture streams are encoded (some would say serialized or formatted) when they are incorporated into files, using wrappers like AVI, QuickTime, Matroska, and MXF. The actual byte structure for the encoded picture data is governed by semi-formal specifications and conventions. In many cases, these encodings are identified by their FOURCC codes, a widely used four-character identifier system often associated with Microsoft (but see the complex history in the Wikipedia article FourCC, consulted January 8, 2013). There are a number of possible encodings for uncompressed video; the FOURCC YUV page (consulted January 8, 2013) provides codes for 30 packed and 20 planar encodings (see Notes below).

The three subtypes currently described at this Web site are ones frequently used by preservation-oriented archives when reformatting older analog and media-dependent digital videotapes. All are packed formats, and our description titles use the encodings' FOURCC codes. The descriptions provide additional identifying tags from Apple (sometimes consisting of Apple's own four-character codes), the FFmpeg project, and the Society of Motion Picture and Television Engineers (SMPTE).

Production phase Employed in creation (initial phase), post-production or editing (middle phase), and dissemination (final phase).
Relationship to other formats
    Subtype of Vid_Unc_Pix, Uncompressed YCbCr Video Picture Stream Family
    Has subtype V210, V210 Video Picture Encoding
    Has subtype UYVY, UYVY Video Picture Encoding
    Has subtype YUY2, YUY2 Video Picture Encoding
    Has subtype Other 4:2:2 uncompressed video picture encodings. Not described at this Web site at this time.

Local use

LC experience or existing holdings Underpins many video streams in LC collections, both in digital videotape and in files.
LC preference Not applicable.

Sustainability factors

Disclosure Not relevant to this description; see Disclosure information for the subtypes listed under Relationships.
    Documentation Not applicable. Several relevant online articles are cited in Useful references below.
Adoption Widely adopted.
    Licensing and patents None.
Transparency Transparent.
Self-documentation Not applicable.
External dependencies None.
Technical protection considerations None.

Quality and functionality factors

Moving Image
Normal rendering Supported
Clarity (high image resolution) Excellent, with 10-bit sampling surpassing 8-bit. Streams with 4:4:4 chroma subsampling would provide greater clarity; 4:2:0 and other streams would provide less.
Functionality beyond normal rendering Not applicable.

File type signifiers and format identifiers

Tag Value Note
Pronom PUID See note.  PRONOM has no corresponding entry as of November 2018.
Wikidata Title ID See note.  Wikidata has no corresponding entry as of November 2018.

Notes


BT.601 and BT.709. Recommendation BT.601 of the International Telecommunication Union Radiocommunication Sector (ITU-R), published in 1987, was designed to provide a common digital standard for interoperability among the three analog video/TV systems (NTSC, PAL, and SECAM). BT.601 enables their signals to be converted to digital and then easily converted back to any of the three for distribution. Meanwhile, version 1 of BT.709 was published in 1990, and the recommendation has since seen a number of significant changes and extensions; version 5 was published in 2008.

The Farlex Free Dictionary entry ITU-R BT.709 (consulted January 8, 2013) summarizes the sampling frequencies used for both standards: BT.601 (standard for SDTV), luma sampling rate = 13.5 MHz, chroma sampling rate = 6.75 MHz (4:2:2); BT.709 (standard for HDTV), luma sampling rate = 74.25 MHz, chroma sampling rate = 37.125 MHz (4:2:2).
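The sampling frequencies above translate directly into a total sample rate, as this short Python sketch shows. The function name is hypothetical and the 10 bits per sample is an assumption chosen for illustration. Because the sampling clock runs through the blanking intervals, the figures describe the whole sampled signal rather than the active picture stored in a file.

```python
def bit_rate_422(luma_mhz, chroma_mhz, bits_per_sample):
    """Total rate in Mbit/s for one luma and two chroma sample streams."""
    samples_per_microsecond = luma_mhz + 2 * chroma_mhz  # Y plus Cb plus Cr
    return samples_per_microsecond * bits_per_sample

print(bit_rate_422(13.5, 6.75, 10))     # 270.0
print(bit_rate_422(74.25, 37.125, 10))  # 1485.0
```

These results match the nominal 270 Mbit/s and 1.485 Gbit/s rates usually quoted for the SDI and HD-SDI interfaces mentioned in the description.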

Video and wide range. The ITU-R specifications support values for Y, Cb, and Cr that conform to what is sometimes called "video range," "legal range," or "studio swing" levels. Expressed as 8-bit values, video range runs from 16 to 235 for Y and from 16 to 240 for Cr and Cb. The term video range contrasts with "wide range" or "super white" values from 0 to 255, which are sometimes used when video signals are created in computer-based graphics applications.
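The relationship between the two sets of levels can be illustrated with a small Python sketch. Both function names are hypothetical. The linear stretch with rounding and clipping is an assumed convention for illustration, not something the ITU-R specifications prescribe, and chroma conversion is left out.

```python
def video_range_limits(bits=8):
    """Return (black, luma_peak, chroma_peak) codes scaled from the 8-bit values."""
    shift = bits - 8
    return 16 << shift, 235 << shift, 240 << shift

def luma_video_to_wide(y, bits=8):
    """Stretch a video-range luma code to the full code range, with clipping."""
    black, peak, _ = video_range_limits(bits)
    full = (1 << bits) - 1
    scaled = round((y - black) * full / (peak - black))
    return min(max(scaled, 0), full)  # codes outside video range are clipped

print(video_range_limits(10))                            # (64, 940, 960)
print(luma_video_to_wide(16), luma_video_to_wide(235))   # 0 255
```

The 10-bit limits show what "higher values if 10-bit" means in practice: each 8-bit limit is multiplied by four.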

Packed and planar encodings. In a packed video-picture encoding, the Y, Cb (U) and Cr (V) samples are packed together into macropixels stored in a single array. This contrasts with planar formats where each component is stored as a separate array with the final image consisting of a fusing of the three separate planes. According to the Wiki from the non-profit VideoLAN organization, "Packed formats are very popular inside webcams. In hardware, using separate planes is inefficient: several memory accesses are needed for each pixel. Packed formats are easier and thus cheaper to use. On the other hand, packet formats cannot normally deal with vertical sub-sampling. Otherwise scan lines would have different sizes. So generally, packed formats are horizontally subsampled, especially by a factor of 2 (i.e. YUV 4:2:2)."
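The difference between the two layouts can be sketched in Python by splitting a packed buffer into three planes. This assumes 8-bit samples in Cb-Y1-Cr-Y2 macropixels, the order given in the description; the function name and sample bytes are hypothetical, and real encodings may order the bytes differently.

```python
def packed_to_planar(packed):
    """Split Cb-Y1-Cr-Y2 macropixels into separate Y, Cb, and Cr planes."""
    if len(packed) % 4:
        raise ValueError("expected whole 4-byte macropixels")
    cb = packed[0::4]   # first byte of each macropixel
    cr = packed[2::4]   # third byte of each macropixel
    y = packed[1::2]    # every second byte, starting at the first luma sample
    return bytes(y), bytes(cb), bytes(cr)

y, cb, cr = packed_to_planar(bytes([128, 16, 128, 235, 90, 81, 240, 81]))
print(list(y), list(cb), list(cr))
# [16, 235, 81, 81] [128, 90] [128, 240]
```

In the packed form every macropixel is read in one pass; in the planar form a renderer has to fetch from three arrays to rebuild each pixel, which is the extra memory access the VideoLAN passage refers to.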


Format specifications

Useful references


Last Updated: 08/04/2021