Sustainability of Digital Formats: Planning for Library of Congress Collections |
|
Full name | Common Data Format |
---|---|
Description | CDF is a conceptual data abstraction for storing, manipulating, and accessing multidimensional data sets. The basic component of CDF is a software programming interface that is a device-independent view of the CDF data model. In addition to the actual data being stored, CDF also stores user-supplied descriptions of the data, known as metadata. This self-describing property allows CDF to be a generic, data-independent format that can store data from a wide variety of disciplines. The application developer is insulated from the actual physical file format for reasons of conceptual simplicity, device independence, and future expandability. CDF files created on any given platform can be transported to any other platform to which CDF is ported and used with any CDF tools or layered applications. CDF versions 2.7 and later support Java application programming interfaces (APIs), in addition to the C and Fortran APIs of earlier versions. |
Production phase | Generally used for middle- and final-state archiving. |
Relationship to other formats | |
Has subtype | Has several versions not documented separately here. |
LC experience or existing holdings | None |
---|---|
LC preference | The Library of Congress Recommended Format Specifications for Datasets lists the CDF file format as an acceptable format. |
Disclosure | Fully documented. Specifications of the format and the APIs in Java, C, and Fortran are freely available. Source code for the CDF software package is also freely available. |
---|---|
Documentation | Available from http://cdf.gsfc.nasa.gov/. Documentation includes the CDF User's Guide and a complete list of APIs, with their descriptions, in reference manuals for the supported programming languages. Maintained by the Space Physics Data Facility (SPDF) at NASA/Goddard Space Flight Center. |
Adoption | In use in various versions since 1985. From CDF FAQ: "The CDF software package is used by hundreds of government agencies, universities, and private and commercial organizations as well as independent researchers on both national and international levels. CDF has been adopted by the International Solar-Terrestrial Physics (ISTP) project as well as the Central Data Handling Facilities (CDHF) as their format of choice for storing and distributing key parameter data." CDF is supported by commercial and open source data analysis and visualization software such as IDL, MATLAB, and IBM's Data Explorer (DX). |
Licensing and patents | None. |
Transparency | TBD. |
Self-documentation | CDF control information acts as an embedded data dictionary. Additional metadata appropriate for any particular dataset can be stored as attribute entries as part of the application data within the CDF. Guidelines for the Space Physics community are found at http://spdf.gsfc.nasa.gov/sp_use_of_cdf.html |
External dependencies | None. |
Technical protection considerations | None. |
Dataset | |
---|---|
Normal functionality | Good support. Structured representation of typed data. |
Support for software interfaces (APIs, etc.) | The basic component of CDF is a software programming interface that is a device-independent view of the CDF data model. Hence the specification focuses on an API rather than on organization of data in files. APIs in Fortran and C are available for all versions, in Java for version 2.7 and up. |
Data documentation (quality, provenance, etc.) | Capabilities for embedding user documentation for the dataset as a whole or for particular elements through a data dictionary can support documentation of precision, provenance, etc. |
Beyond normal functionality | CDF is designed to support multi-dimensional data. The CDF structure is based on variable definitions (name, data type, number of dimensions, sizes, etc.), where a collection of data elements is defined in terms of a variable. The structure of CDF allows one to define an unlimited number of variables that are completely independent of one another (loosely coupled) and disparate in nature, a group of variables that show a strong dependency on one another (tightly coupled), or both simultaneously. Compared to the HDF format, CDF permitted cross-linking data from different instruments and spacecraft in ISTP with one development effort (according to https://web.archive.org/web/20160801173718/http://nssdc.gsfc.nasa.gov/nost/fep/researcher-szabo-cdf.html). |
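The variable-based structure described above can be illustrated with a minimal Python sketch. It is a conceptual model only and does not use or reproduce the CDF library's API: the class names `Variable` and `Dataset` and the method `dependents_of` are hypothetical. The attribute name `DEPEND_0` is borrowed from the ISTP metadata conventions as one plausible way a tightly coupled variable might point at the variable it depends on; that choice is an assumption, not something stated in this description.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Variable:
    """Hypothetical model of one variable definition plus its metadata."""
    name: str
    data_type: str                  # label only, e.g. "CDF_DOUBLE"; not validated
    dim_sizes: Tuple[int, ...]      # () means one scalar value per record
    attributes: Dict[str, Any] = field(default_factory=dict)

    @property
    def num_dims(self) -> int:
        return len(self.dim_sizes)


@dataclass
class Dataset:
    """Hypothetical container: dataset-level metadata plus named variables."""
    global_attributes: Dict[str, Any] = field(default_factory=dict)
    variables: Dict[str, Variable] = field(default_factory=dict)

    def add(self, var: Variable) -> None:
        if var.name in self.variables:
            raise ValueError(f"duplicate variable name: {var.name}")
        self.variables[var.name] = var

    def dependents_of(self, name: str) -> List[str]:
        # Assumption: coupling is recorded in a DEPEND_0 attribute.
        return [v.name for v in self.variables.values()
                if v.attributes.get("DEPEND_0") == name]


ds = Dataset(global_attributes={"Project": "ISTP"})
ds.add(Variable("Epoch", "CDF_EPOCH", ()))
# Tightly coupled: each B_field record is tied to an Epoch record.
ds.add(Variable("B_field", "CDF_DOUBLE", (3,),
                attributes={"DEPEND_0": "Epoch", "UNITS": "nT"}))
# Loosely coupled: unrelated to the other variables.
ds.add(Variable("Station_id", "CDF_INT4", ()))
```

In this sketch the metadata travels with the data, which is the self-describing property the description refers to; a reader of the dataset can discover which variables are coupled by inspecting attributes rather than relying on outside documentation.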
Tag | Value | Note |
---|---|---|
Filename extension | cdf | From http://www.fileinfo.com/. |
Wikidata Title ID | Q1116060 | See https://www.wikidata.org/wiki/Q1116060. |
General | In 2002, the CDF office developed an XML-based markup language called CDF Markup Language (CDFML) to describe CDF data and metadata. Translators among various data formats, including CDF, are available at http://cdf.gsfc.nasa.gov/html/dttools.html |
---|---|
History | CDF was designed and developed in 1985 by the National Space Science Data Center (NSSDC) at NASA/GSFC. CDF was originally written in FORTRAN and ran only in VAX/VMS environments. CDF V3.0 was released on February 10, 2005. V3.0 is backward compatible with CDF V2.7, V2.6, and V2.5, but not vice versa. Libraries for CDF 3.0.0 and later will read a file that was created with the CDF 2.5, 2.6, or 2.7 library and will save it in the version under which it was originally created (not 3.0). A file created from scratch with CDF 3.0.0 or later will be stored in the new format. Files in the 3.0 format cannot be read by earlier versions of the CDF library. As of December 2021, the latest version of the CDF library is 3.8.1. |
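The compatibility rules in the History entry can be restated as a small Python sketch. This is only an encoding of the statements above, not code from the CDF library: the function names `format_written` and `can_read` are hypothetical, versions are simplified to `(major, minor)` tuples, and all 3.x files are treated as a single "3.0 format", which is an assumption. Cases the text does not cover are reported as unknown rather than guessed.

```python
from typing import Optional, Tuple

Version = Tuple[int, int]          # (major, minor), e.g. (3, 8) for 3.8.1

V3: Version = (3, 0)
LEGACY: set = {(2, 5), (2, 6), (2, 7)}   # versions named in the text


def format_written(library: Version, original: Optional[Version] = None) -> Version:
    """Format a 3.x library saves in; `original` is None for a new file."""
    if library < V3:
        raise ValueError("the text only describes libraries 3.0.0 and later")
    if original is None:
        return V3                  # created from scratch: new format
    if original in LEGACY:
        return original            # kept in the version it was created under
    if original >= V3:
        return V3                  # assumption: 3.x files stay in the 3.0 format
    raise ValueError("file version not covered by the text")


def can_read(library: Version, file_format: Version) -> Optional[bool]:
    """True/False where the text is explicit, None where it is silent."""
    if library >= V3:
        if file_format in LEGACY or file_format >= V3:
            return True
        return None                # files older than 2.5 are not discussed
    if file_format >= V3:
        return False               # pre-3.0 libraries cannot read the 3.0 format
    return None                    # pre-3.0 library with pre-3.0 file: not stated


new_file = format_written((3, 8))            # new file under library 3.8
resaved = format_written((3, 8), (2, 6))     # 2.6 file edited under library 3.8
old_reads_new = can_read((2, 7), (3, 0))     # 2.7 library, 3.0-format file
```

Returning `None` for uncovered cases keeps the sketch from asserting more than the History entry actually says.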
|