Recommended Formats Statement



Summary of Digital Format Preferences

This is a summary table of digital file format preferences. See each content category for more information and context.

i. Summary of Digital Format Preferences
Content Category | Content Details | Preferred | Acceptable
Textual Works | Digital (In order of preference)
Preferred:
  1. XML-based markup formats, with included or accessible DTD/schema, XSD/XSL presentation stylesheet(s), and explicitly stated character encoding
    1. EPUB3-compliant. (Other versions of EPUB are also preferred formats but EPUB3 is the most common.)
    2. BITS (Book Interchange Tag Suite) version 2.0
    3. Other widely-used book DTDs/schemas (e.g., TEI, DocBook, etc.)
  2. Page-layout formats
    1. PDF/UA (ISO 14289-1-compliant)
    2. PDF/A (ISO 19005-compliant)
    3. PDF (highest quality available, with features such as searchable text, embedded fonts, lossless compression, high resolution images, device-independent specification of colorspace, content tagging; includes document formats such as PDF/X)
Acceptable:
  1. Other structured or markup formats
    1. XHTML or HTML, with DOCTYPE declaration and presentation stylesheet(s)
    2. XML-based document formats (widely-used and publicly-documented), with presentation stylesheet(s) if applicable. Includes DOCX/OOXML 2012 (ISO 29500), ODF (ISO/IEC 26300) and OOXML (ISO/IEC 29500).
    3. SGML, with included or accessible DTD
    4. Other XML-based non-proprietary formats, with presentation stylesheet(s)
    5. XML-based formats that use proprietary DTDs or schemas, with presentation stylesheet(s)
  2. Page-layout formats
    1. PDF (web-optimized)
  3. Other formats
    1. Rich text format (RTF)
    2. Plain text
    3. Widely-used proprietary word-processing formats
Textual Works | Electronic Serials (In order of preference)
Preferred:
  1. Content compliant with the NISO JATS: Journal Article Tag Suite (ANSI/NISO Z39.96-2015) with XSD/XSL presentation stylesheet(s) and explicitly stated character encoding
  2. Page-layout formats
    1. PDF/UA (ISO 14289-1-compliant)
    2. PDF/A (ISO 19005-compliant)
    3. PDF (highest quality available, with features such as searchable text, embedded fonts, lossless compression, high resolution images, device-independent specification of colorspace, content tagging; includes document formats such as PDF/X)
Acceptable:
  1. Other structured or markup formats
    1. Widely-used serials or journal non-proprietary XML-based DTDs/schemas with included or accessible DTD/schema, presentation stylesheet(s) and explicitly stated character encoding.
    2. Proprietary XML-based format for serials or journals (with documentation) with DTD/schema and presentation stylesheet(s)
    3. XHTML or HTML, with DOCTYPE declaration and presentation stylesheet(s)
    4. XML-based document formats (widely-used and publicly-documented), with presentation stylesheet(s) if applicable. Includes DOCX/OOXML 2012 (ISO 29500), ODF (ISO/IEC 26300) and OOXML (ISO/IEC 29500).
  2. Page-layout formats
    1. PDF (web-optimized with searchable text)
  3. Other formats
    1. Rich text format
    2. Plain text
    3. Widely-used proprietary word-processing formats or page-layout formats
    4. Other text- or graphic-based formats not listed here that represent textual works
Still Image Works | Photographs - Digital
  • TIFF (.tif)
  • JPEG2000 (.jp2)
  • PNG (.png)
  • JPEG/JFIF (.jpg)
  • Photoshop (.psd)
  • JPEG2000 Part 2 (.jpf, .jpx)
  • Digital Negative DNG (.dng)
  • Proprietary Camera Raw formats (.nef, .crw, .arw, .iiq)
  • GIF (.gif)
Still Image Works | Other Graphic Images - Digital
  • TIFF (.tif)
  • JPEG2000 (.jp2)
  • PNG (.png)
  • JPEG/JFIF (.jpg)
  • Photoshop (.psd)
  • JPEG2000 Part 2 (.jpf, .jpx)
  • Encapsulated Postscript (.eps)
  • Digital Negative DNG (.dng)
  • Proprietary Camera Raw formats (.nef, .crw, .arw, .iiq)
  • GIF (.gif)
Moving Image Works | Video - File-based (In order of preference)
Preferred:

Final production version with the original production resolution and frame rate (e.g., 1080p24, 720p60) and in the file-based format that was delivered to the content distributor

  1. Interoperable Master Format (IMF) consisting of
    1. Essence files as MXF tracks including video, audio, data and dynamic metadata essences
    2. Composition playlist
    3. Packaging data XML files (asset map, packing list, volume index)
  2. FFV1
    1. Version 3 only, as defined by RFC 9043
    2. Matroska (.mkv) container
  3. ProRes
    1. QuickTime (.mov) container
    2. 4444 (XQ), 4444 or 422 HQ codecs
  4. MPEG-2
    1. Compliant with ISO/IEC 13818
  5. XDCAM
    1. MXF container
    2. HD422, SHD422, HD codecs

Contact archive for guidance regarding pre-production versions.

Acceptable:

Viewing proxy such as

  1. Recordable DVD
  2. Recordable Blu-ray disc
  3. MPEG-4 (.mp4)
Audio Works | Media-independent - Digital (In order of preference)
Preferred:
  1. Final production/release version of content rather than pre-production version
  2. Highest native resolution PCM WAVE file of final version produced (44.1 kHz / 16 bit or higher) in addition to Compact Disc (CD) when both are produced
  3. WAVE file with embedded metadata (Broadcast WAVE) rather than without embedded metadata (LC will specify fields)
  4. File in native resolution rather than up-sampled resolution
  5. Very high resolution file formats such as DSD, PCM 176.4 kHz, 192 kHz, up to 384 kHz when produced for release in addition to Compact Disc (CD) when both are produced
  6. DSD in the released version (e.g., surround-sound or stereo)
  7. Uncompressed files rather than compressed
  8. Compressed version in a major standard compression scheme rather than non-standard scheme
Acceptable:
  1. Uncompressed file of final release version
  2. Highest resolution compressed version in a major standard compression scheme
  3. Lossless compression scheme rather than lossy compression scheme
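
As a rough illustration of the "44.1 kHz / 16 bit or higher" floor mentioned for PCM WAVE files, the following Python sketch reads a WAVE header with the standard-library wave module. The function name and default thresholds are assumptions made for this example; the check does not inspect Broadcast WAVE metadata or tell whether a file was up-sampled.

```python
import wave


def meets_pcm_floor(source, min_rate=44100, min_bits=16):
    """Return True if a PCM WAVE file is at least min_rate Hz and min_bits bits.

    `source` may be a file path or a binary file object. The function name
    and the default thresholds are illustrative assumptions.
    """
    # wave.open raises wave.Error for files it cannot parse as PCM WAVE.
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()          # samples per second
        bits = wav.getsampwidth() * 8      # bytes per sample -> bits
    return rate >= min_rate and bits >= min_bits
```

A caller could run this over a delivery folder and flag any file that returns False for manual review.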
Musical Scores | Digital (In order of preference)
Preferred:
  1. XML-based markup music notational format, with included or accessible DTD/schema, XSD/XSL presentation stylesheet(s), and explicitly stated character encoding
    1. MusicXML
    2. Music Encoding Initiative (MEI)
    3. Other widely-used and publicly documented musical notation DTDs/schemas
  2. Page-layout formats
    1. PDF/UA (ISO 14289-1-compliant)
    2. PDF/A (ISO 19005-compliant)
    3. PDF (highest quality available, with features such as searchable text, embedded fonts, lossless compression, high resolution images; includes document formats such as PDF/X)
Acceptable:
  1. Other structured or markup formats
    1. XHTML or HTML, with DOCTYPE declaration and presentation stylesheet(s)
    2. SGML, with included or accessible DTD
  2. Page-layout formats
    1. PDF (web-optimized)
  3. Other formats
    1. Widely-used proprietary music notation formats
    2. Other music composition formats (including graphics-based formats) not listed here
Datasets | Datasets (In order of preference)
  1. Platform-independent, character-based formats are preferred over native or binary formats as long as data is complete, and retains full detail and precision. Preferred formats include well-developed, widely adopted, de facto marketplace standards, e.g.
    1. Formats using well known schemas with public validation tool available
    2. Line-oriented, e.g. TSV, CSV, fixed-width
    3. Platform-independent open formats, e.g. .db, .db3, .sqlite, .sqlite3
  2. Any proprietary format that is a de facto standard for a profession or supported by multiple tools (e.g. Excel .xls or .xlsx, Shapefile)

  3. Character Encoding, in descending order of preference:
    1. UTF-8, UTF-16 (with BOM),
    2. US-ASCII or ISO 8859-1
    3. Other named encoding
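
The character encoding order above can be sketched as a small classifier. This is a minimal Python illustration, not an official test: the function name and the `declared` parameter are assumptions, ISO 8859-1 cannot be detected from bytes alone and so relies on the caller's declaration, and pure US-ASCII data is reported in the first tier because it is also valid UTF-8.

```python
import codecs


def encoding_tier(data: bytes, declared: str = "") -> int:
    """Return 1, 2 or 3 for the preference tier of a byte string (1 is best).

    `declared` is a hypothetical caller-supplied encoding name, used only
    when the bytes themselves are not conclusive.
    """
    # A UTF-32 LE BOM begins with the UTF-16 LE BOM, so rule it out first.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return 3
    # UTF-16 counts as tier 1 only when it carries a byte order mark.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return 1
    try:
        data.decode("utf-8")  # also succeeds for pure US-ASCII data
        return 1
    except UnicodeDecodeError:
        pass
    try:
        name = codecs.lookup(declared).name if declared else ""
    except LookupError:
        name = ""
    # Python normalizes "latin-1", "ISO-8859-1", etc. to "iso8859-1".
    return 2 if name == "iso8859-1" else 3
```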

For data:

  1. Non-proprietary, publicly documented formats endorsed as standards by a professional community or government agency, e.g. CDF, HDF
  2. Text-based data formats with available schema

For aggregation or transfer:

  1. ZIP, RAR, tar, 7z with no encryption, password or other protection mechanisms.
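
For ZIP files specifically, the "no encryption" condition can be spot-checked with Python's standard-library zipfile module. This is a minimal sketch under stated assumptions: the function name is hypothetical, it covers only ZIP (not RAR, tar or 7z), and it looks only at the per-entry encryption flag, so other protection mechanisms would need separate checks.

```python
import zipfile


def encrypted_members(zip_source) -> list:
    """List the names of ZIP entries whose header marks them as encrypted.

    `zip_source` may be a path or a binary file object. An empty list means
    no entry has the encryption flag set.
    """
    with zipfile.ZipFile(zip_source) as archive:
        # Bit 0 of the general-purpose bit flag marks an encrypted entry.
        return [
            info.filename
            for info in archive.infolist()
            if info.flag_bits & 0x1
        ]
```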
GIS, Geospatial and Non-GIS Cartographic | Geographic Information System (GIS): Vector Data

Most complete data (all layers, appendices), even if proprietary, with a preference for preserving the native format and projection of the data

Vector formats compatible with widely adopted GIS including

  • Shapefile, which is comprised of at least a SHP, SHX, and DBF file and optionally a PRJ (highly recommended), XML (highly recommended), SBN, and/or SBX.
  • Esri File Geodatabase
  • OGC GeoPackage
  • GeoJSON (may have scalability issues)
  • KML
  • GML
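
Because a Shapefile is a set of sibling files rather than a single file, a delivery can be checked for the components named in the list above. The Python sketch below is illustrative only: the function names and the report layout are assumptions, matching is case-sensitive, and the optional SBN and SBX files are not examined.

```python
from pathlib import Path

REQUIRED = (".shp", ".shx", ".dbf")   # minimum set named above
RECOMMENDED = (".prj", ".xml")        # described above as highly recommended


def has_sidecar(base: Path, ext: str) -> bool:
    """Check for base-name + ext; also accept the name.shp.xml spelling."""
    return base.with_suffix(ext).exists() or Path(str(base) + ext).exists()


def shapefile_report(shp_path) -> dict:
    """Report which required and recommended components are missing."""
    base = Path(shp_path)
    return {
        "missing_required": sorted(e for e in REQUIRED if not has_sidecar(base, e)),
        "missing_recommended": sorted(e for e in RECOMMENDED if not has_sidecar(base, e)),
    }
```

Calling shapefile_report on the .shp path returns two lists; an empty "missing_required" list means the minimum set is present.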
GIS, Geospatial and Non-GIS Cartographic | GIS Vector and Raster Combined (In order of preference)
Preferred:
  1. Most complete data (all layers, appendices), even if proprietary, with a preference for preserving the native format and projection of the data
  2. Vector and raster formats compatible with widely adopted GIS including:
    1. Esri File Geodatabase
    2. OGC GeoPackage
    3. Formats compatible with recommendations and tools from geospatial open source and open data communities; formats supported by well supported open source software libraries such as GDAL, OGR and GeoTools
Acceptable:
  • TerraGo GeoPDF
  • Geospatial PDF
GIS, Geospatial and Non-GIS Cartographic | GIS Raster and Georeferenced Images
Preferred:
  • Most complete data (all layers, appendices), even if proprietary, with a preference for preserving the native format and projection of the data
  • Raster formats compatible with widely adopted GIS including GeoTIFF
  • OGC GeoPackage

Acceptable:
  1. TIFF (.tif) files with accompanying TIFF World File (.tfw and .tifw)
  2. GML in JPEG 2000

Design and 3D | 2D and 3D Computer Aided Design (raster)
  • TIFF (.tif)
  • JPEG2000 (.jp2)
  • PNG (.png)
  • JPEG/JFIF (.jpg)
  • Digital Negative DNG (.dng)
  • GIF (.gif)
  • Photoshop (.psd)
  • JPEG2000 Part 2 (.jpf, .jpx)
  • Encapsulated Postscript (.eps)
Design and 3D | 2D and 3D Computer Aided Design (vector)
  • Scalable vector graphics (.svg)
  • AutoCAD Drawing Interchange Format (.dxf)
  • Shapefile
  • Computer Graphics Metafile (CGM, WebCGM)
  • Extensible 3D (X3D)
  • 3D Manufacturing Format (3MF)
  • Non-proprietary formats endorsed as standards by a professional community or government agency, e.g. IFC, STEP
  • Page-layout formats, e.g. PDF/UA (ISO 14289-1-compliant), PDF/A (ISO 19005-compliant), PDF (highest quality available, with features such as searchable text, embedded fonts, lossless compression, high resolution images; includes document formats such as PDF/X)
  • PDF/E-1
  • Encapsulated Postscript (.eps)
  • Proprietary vector formats, e.g., AutoCAD Drawing file Family (.dwg)
Design and 3D | Scanned 3D Objects (output from photogrammetry scanning)
Preferred: Not applicable
Acceptable:
  • STereoLithography (.stl)
  • Reflectance Transformation Imaging (.rti)
  • Polygon File Format (.ply)
  • Wavefront (.obj)
Software and Video Games | Content (In order of preference)
Preferred:
  1. Uncompiled source code: Version of a game that is ready to be sent to console manufacturers for certification. Contains files and folders created by a game developer and is still human-readable using either a text editor, visual programming tool, or an integrated development environment (IDE).
  2. Gold master build (specific file types will vary depending on the company producing the build): Version of the software or video game that meets all of a publisher’s and platform’s requirements and is considered the finished product. If a game is released on multiple platforms, each platform has its own requirements for how a user interacts with the game, so the look and feel of the gold master build varies by publisher and platform.
  3. Distribution file (e.g. ipa [Mac iOS], apk [Android], exe [Windows]): The distribution file is disseminated for public use regardless of the means of dissemination (on physical media or via an online source) and comprises one or more files from the gold master build.
    • Media-based release: The version of a distribution file disseminated via a media-based physical object (cartridge, disc- or disk-based media, etc.). This is a type of distribution file.
    • Internet-based release: The release version of the software distributed via an online- or internet-based source, including mobile applications.
Acceptable:
  1. Hard drive/flash drive/writable disk containing the unpublished version of game/software content
  2. Video of use, such as YouTube and Twitch streams, can substitute for the absence of a preservation copy of the game or inability to recreate significant online dependencies.
Web Archives | Websites
Preferred:

The Library, and other organizations involved in web archiving, are preserving web content in the Web Archive (WARC) format using record-at-a-time GZIP compression, as described in Appendix A of the WARC Standard.

Acceptable:
  • Internet Archive's ARC_IA format, a precursor to the WARC format
  • Web Archive Collection Zipped (WACZ), as used in the Webrecorder project 
  • CDX as a component file for WARC file content
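
To illustrate what record-at-a-time GZIP compression means in practice, the Python sketch below compresses each record as its own gzip member and concatenates the members into one file, then reads them back one member at a time. The function names are assumptions and the records are placeholder bytes, not valid WARC records; consult the WARC Standard for the actual record layout.

```python
import gzip
import zlib


def pack_records(records):
    """Compress each record as a separate gzip member and concatenate them."""
    return b"".join(gzip.compress(record) for record in records)


def unpack_records(blob):
    """Yield the original records by decompressing one gzip member at a time."""
    while blob:
        # wbits=31 tells zlib to expect a gzip header and trailer.
        member = zlib.decompressobj(wbits=31)
        record = member.decompress(blob) + member.flush()
        yield record
        # Bytes after the end of this member belong to the next record.
        blob = member.unused_data
```

Since every member is a complete gzip stream, a reader that knows the byte offset of a member can decompress that one record without touching the rest of the file, and the concatenated file as a whole is still readable by ordinary gzip tools.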
Email | Messages
Preferred: No preferred formats at this time while the Library builds its capacity for email archiving.
Acceptable:
  • For individual messages (as supported by client):
    • EML
    • MSG
    • PDF
  • For aggregated groups of messages (e.g., entire inbox or folder, as supported by client):
    • PST Unicode
    • PST ANSI
    • MBOX
    • PDF
  • Contact the Library of Congress for additional guidance.
Email | Attachments

Attachments and embedded data should remain in their original format.
