Sustainability of Digital Formats: Planning for Library of Congress Collections |
Full name | Digital Forensics XML |
---|---|
Description |
Digital Forensics XML (DFXML) is an XML-based (Extensible Markup Language) language designed to represent a wide range of forensic information and processing results. Since its inception in 2007, DFXML has been used to archive forensic processing steps, reducing the need to re-process digital evidence. It also acts as an interchange format, giving independent forensic tools and organizations a shared way to exchange structured information, and it captures metadata and provenance information about the operation of software tools. Initially designed to represent the output of digital forensics tools, particularly SleuthKit tools, DFXML has since expanded to work with bulk_extractor and is supported by the DFXML Python and DFXML C++ codebases. DFXML can represent disk images; files; file system metadata; modification, access, and change (MAC) times; file hashes; sector hashes; transmission control protocol (TCP) flows; and hash sets. Its provenance details include the origin of data, classification, use restrictions, and the tools employed in the forensic process, as well as the computer on which the application program was compiled, linked libraries, and the runtime environment. This information is useful in research and courtroom testimony. DFXML enhances composability by providing a language for describing common forensic processes (e.g., cryptographic hashing), forensic work products (e.g., the location of files on a hard drive), and metadata (e.g., file names and timestamps). It serves as the basis for a Python module (dfxml.py) that simplifies the creation of sophisticated forensic processing programs. |
Production phase | Middle-state and archival. |
Relationship to other formats | |
Subtype of | XML, Extensible Markup Language |
LC experience or existing holdings | None |
---|---|
LC preference | The Library of Congress has not yet expressed any format preference for digital forensic data. |
Disclosure | Fully disclosed. The original DFXML source code repository is now considered legacy and directs users to the official schema repository, version 1.2.0. The legacy repository is retained for historical reasons, housing legacy GitHub Issues and maintaining historical version control that wasn't transferred to the new repository. Additionally, DFXML has official Python and C++ codebases. |
---|---|
Documentation |
|
Adoption |
The assumed maintainer of the specification is the National Institute of Standards and Technology. Comments welcome. Widely adopted. A non-exhaustive list includes:
|
Licensing and patents | According to the specification, DFXML was “developed by the National Institute of Standards and Technology by employees of the Federal Government in the course of their official duties”. DFXML is hosted by the National Institute of Standards and Technology (NIST) at the National Software Reference Library (NSRL). |
Transparency | DFXML is open and text-based, and thus can be read using basic text editors. However, deployment of DFXML requires the use of complex tools. |
Self-documentation |
DFXML is identified through the root element <dfxml> and its XML namespace declaration. For example: <dfxml xmlns="http://www.forensicswiki.org/wiki/Category:Digital_Forensics_XML" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dfxmlext="http://www.forensicswiki.org/wiki/Category:Digital_Forensics_XML#extensions" version="1.2.0"> |
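As a rough illustration of this kind of identification, the following Python sketch checks whether a document's root element is a namespaced <dfxml> and reads its version attribute. The function name identify_dfxml and the SAMPLE string are hypothetical; the namespace URI is copied from the declaration quoted above. Documents that omit the namespace would not be recognized by this particular check.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the example declaration in this section.
DFXML_NS = "http://www.forensicswiki.org/wiki/Category:Digital_Forensics_XML"


def identify_dfxml(xml_text):
    """Return (is_dfxml, version) for an XML document given as a string.

    Hypothetical helper: is_dfxml is True only when the root element is
    <dfxml> in the DFXML namespace; version is the root's version
    attribute, or None when absent or when the document is not DFXML.
    """
    root = ET.fromstring(xml_text)
    # ElementTree spells namespaced tags as "{namespace-uri}localname".
    if root.tag != "{%s}dfxml" % DFXML_NS:
        return (False, None)
    return (True, root.get("version"))


# Illustrative document, not taken from a real forensic report.
SAMPLE = (
    '<dfxml xmlns="http://www.forensicswiki.org/wiki/Category:Digital_Forensics_XML" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/" version="1.2.0"></dfxml>'
)

if __name__ == "__main__":
    print(identify_dfxml(SAMPLE))
```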
External dependencies | For use with XML Schema Definition (XSD), the specification states: If you intend to use this file as a DFXML document validator, note that you will also need to download two accompanying .xsd files under the "ref" directory. The easiest way to do this is by downloading the repository as a Git clone, or by downloading the zip archive from the GitHub page. |
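One possible way to run such a validation is to call the xmllint command line tool from Python, sketched below. The file names dfxml.xsd and report.xml are placeholders, and the helper names are hypothetical. The sketch assumes the schema sits in a cloned repository so that the accompanying .xsd files under "ref" can be found, and it returns None rather than failing when xmllint is not installed.

```python
import shutil
import subprocess


def xmllint_command(schema_path, document_path):
    """Build the xmllint invocation that validates a document against an XSD."""
    # --noout suppresses printing the parsed document;
    # --schema selects W3C XML Schema validation.
    return ["xmllint", "--noout", "--schema", schema_path, document_path]


def validate(schema_path, document_path):
    """Return True if valid, False if not, or None if xmllint is unavailable."""
    if shutil.which("xmllint") is None:
        return None
    result = subprocess.run(
        xmllint_command(schema_path, document_path),
        capture_output=True,
        text=True,
    )
    # xmllint exits with status 0 only when parsing and validation succeed.
    return result.returncode == 0


if __name__ == "__main__":
    # Placeholder paths; substitute the schema from the cloned repository
    # and the DFXML document to be checked.
    print(validate("dfxml.xsd", "report.xml"))
```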
Technical protection considerations | None. |
Tag | Value | Note |
---|---|---|
Filename extension | dfxml xml |
DFXML can be embedded in other XML-based formats or used as a standalone document. Documents may informally have the file extension .XML or .DFXML, such as this sample file system report. |
Internet Media Type | See related format. | See XML. |
Magic numbers | See related format. | See XML. |
Indicator for profile, level, version, etc. | See note. | Version for the schema that generated the XML is required. See line 68 of the schema. |
XML DOCTYPE declaration | See note. |
"dfxml" is the root element and the namespace prefix used in the schema. At minimum, the required declaration is: <dfxml xmlns="http://www.forensicswiki.org/wiki/Category:Digital_Forensics_XML" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dfxmlext="http://www.forensicswiki.org/wiki/Category:Digital_Forensics_XML#extensions" version="1.2.0"> The version attribute is required, as is the DFXML namespace, which the schema binds to the dfxml prefix: xmlns:dfxml="http://www.forensicswiki.org/wiki/Category:Digital_Forensics_XML" The schema element from the XSD is: <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dfxml="http://www.forensicswiki.org/wiki/Category:Digital_Forensics_XML" targetNamespace="http://www.forensicswiki.org/wiki/Category:Digital_Forensics_XML" elementFormDefault="qualified"> |
Pronom PUID | See note. | PRONOM has no corresponding entry as of January 2023. |
Wikidata Title ID | Q105855984 |
Digital Forensics XML file format. See https://www.wikidata.org/wiki/Q105855984. |
Wikidata Title ID | Q16956577 |
In addition to the Wikidata entry for the DFXML file format, there is a Wikidata entry for the Digital Forensics XML language. See https://www.wikidata.org/wiki/Q16956577. |
General |
DFXML Structure

A DFXML document consists of metadata, creator information, runtime configuration, volume details, and file objects. The XML encapsulates Dublin Core metadata, runtime CPU usage information, and specifics about volumes and individual files. Example, from Garfinkel’s “DFXML and Other Standards”:

    <dfxml>
      <metadata> Dublin Core Metadata </metadata>
      <creator> The program that made this DFXML </creator>
      <configuration> Runtime Configuration </configuration>
      <volume> Information about Volumes </volume>
      <fileobjects>
        <fileobject> Information about a file </fileobject>
      </fileobjects>
      <rusage> Runtime CPU usage information </rusage>
    </dfxml>

Tools for DFXML Validation

The command line tool xmllint, commonly used for parsing XML files, can validate DFXML against its XML Schema. It can also validate a related schema called RegXML. According to Forensics Wiki, RegXML is an XML syntax analogous to DFXML; it uses parts of DFXML in its schema, and its official documentation and schema are independently available on GitHub. |
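To make the structure outlined above more concrete, the following Python sketch builds a skeleton document from in-memory data and then reads the file objects back. It is an illustration under stated assumptions, not the official dfxml.py module. The child element names program, filename, filesize, and hashdigest, the tool name example-tool, and both function names are assumptions not given in the text above, and the sketch omits the XML namespace for brevity.

```python
import hashlib
import xml.etree.ElementTree as ET


def build_dfxml(files, program="example-tool"):
    """Build a skeleton <dfxml> tree from a {filename: bytes} mapping.

    Element names below <creator> and <fileobject> are assumptions made
    for illustration; the output is not guaranteed to be schema-valid.
    """
    root = ET.Element("dfxml", {"version": "1.2.0"})
    creator = ET.SubElement(root, "creator")
    ET.SubElement(creator, "program").text = program
    fileobjects = ET.SubElement(root, "fileobjects")
    for name, data in files.items():
        fo = ET.SubElement(fileobjects, "fileobject")
        ET.SubElement(fo, "filename").text = name
        ET.SubElement(fo, "filesize").text = str(len(data))
        digest = ET.SubElement(fo, "hashdigest", {"type": "sha1"})
        digest.text = hashlib.sha1(data).hexdigest()
    return root


def list_files(root):
    """Return (filename, size, sha1) tuples for every <fileobject> at any depth."""
    rows = []
    for fo in root.iter("fileobject"):
        rows.append((
            fo.findtext("filename"),
            int(fo.findtext("filesize")),
            fo.findtext("hashdigest[@type='sha1']"),
        ))
    return rows


if __name__ == "__main__":
    tree = build_dfxml({"empty.txt": b"", "hello.txt": b"hello"})
    # Serialize and re-parse to mimic reading a report produced elsewhere.
    reparsed = ET.fromstring(ET.tostring(tree))
    for row in list_files(reparsed):
        print(row)
```

Because list_files searches for fileobject elements at any depth, it works whether they are grouped under a container element, as in the example above, or placed elsewhere in the document.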
---|---|
History |
Developed by Simson L. Garfinkel, DFXML has been used to describe forensic data since 2006. The original DFXML paper by Garfinkel in 2009, “Navigating Unmountable Media with the Digital Forensics XML File System”, introduced Fiwalk as an extension to The SleuthKit. Fiwalk uses The SleuthKit's internal bindings to parse storage directly and reports the results in XML format. |
|