Tools for preservation metadata implementation
This document describes tools (e.g. software, scripts, stylesheets) that support the implementation of preservation metadata, particularly as defined in the PREMIS data dictionary. The tools listed were not necessarily developed specifically for PREMIS; some serve the implementation of preservation metadata more generally, and each listing states the tool's relationship to PREMIS. As of this writing (July 2007), not all of the categories below are represented here. A tool may do one or more of the following:
- Tools for extracting technical metadata from objects
- Tools for converting extracted metadata into the PREMIS XML schema elements
- Tools for generating a METS object with appropriate slots for PREMIS metadata (i.e., an amdSec with digiprovMD, techMD, etc.)
- Tools for converting JHOVE output to PREMIS elements
- Tools for recording events and outcomes (e.g. format validation, fixity check, etc.)
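Two of the categories above, a METS object with slots for PREMIS metadata and the recording of an event outcome, can be illustrated together. The following is a minimal sketch in Python, not the output of any tool listed here. It assumes the PREMIS 2.x namespace URI, and the function names and section IDs are hypothetical. The PREMIS fragments are simplified and are not claimed to be schema-valid.

```python
import hashlib
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlns/premis-v2"  # assumption: PREMIS 2.x; adjust to your version


def m(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def p(tag: str) -> str:
    """Qualify a tag name with the PREMIS namespace."""
    return f"{{{PREMIS_NS}}}{tag}"


def add_section(amd_sec, section, section_id, mdtype):
    """Add a techMD or digiprovMD slot and return its xmlData element."""
    sec = ET.SubElement(amd_sec, m(section), ID=section_id)
    wrap = ET.SubElement(sec, m("mdWrap"), MDTYPE=mdtype)
    return ET.SubElement(wrap, m("xmlData"))


def build_mets(data: bytes, expected_sha256: str) -> ET.Element:
    """Build a skeletal METS document with one PREMIS object and one event."""
    mets = ET.Element(m("mets"))
    amd_sec = ET.SubElement(mets, m("amdSec"))

    # techMD slot: simplified PREMIS object carrying only the file size.
    tech = add_section(amd_sec, "techMD", "TECH1", "PREMIS:OBJECT")
    obj = ET.SubElement(tech, p("object"))
    chars = ET.SubElement(obj, p("objectCharacteristics"))
    ET.SubElement(chars, p("size")).text = str(len(data))

    # digiprovMD slot: simplified PREMIS event recording a fixity check.
    prov = add_section(amd_sec, "digiprovMD", "DIGIPROV1", "PREMIS:EVENT")
    event = ET.SubElement(prov, p("event"))
    ET.SubElement(event, p("eventType")).text = "fixity check"
    info = ET.SubElement(event, p("eventOutcomeInformation"))
    matches = hashlib.sha256(data).hexdigest() == expected_sha256
    ET.SubElement(info, p("eventOutcome")).text = "pass" if matches else "fail"
    return mets


if __name__ == "__main__":
    payload = b"example content"
    doc = build_mets(payload, hashlib.sha256(payload).hexdigest())
    print(ET.tostring(doc, encoding="unicode"))
```

A production record would also carry identifiers, dates, and agent links, which are omitted here to keep the structure visible.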
Each listing states what the tool does, who developed it, and when and for what purpose it was developed.
Archivists' Toolkit (University of California, San Diego, New York University, and the Five Colleges, Inc.)
Description: The Archivists' Toolkit is an open source archival data management system that provides integrated support for accessioning, description, donor tracking, name and subject authority work, and location management for archival materials. It includes integrated support for managing archival materials from acquisition through processing, a customizable interface, ingest of legacy data in multiple formats (e.g. EAD and MARCXML), a rapid data entry interface for creating container lists, generation of reports, export of EAD 2002, MARCXML, METS, MODS, and Dublin Core, and support for desktop or networked, single- or multi-repository installations.
Availability: The source code has not yet been made available generally. Contact info@archiviststoolkit for further information.
Documentation URL: http://www.archiviststoolkit.org/
Last update: July 25, 2007
DAITSS (Florida Center for Library Automation)
Description: DAITSS is an OAIS-compliant, open source preservation repository system that supports ingest, dissemination, and preservation strategies based on format transformation. It has no online public access component but can be used as a preservation back end to institutional repository or digital library systems.
Tool URL: http://daitss.fcla.edu/
Documentation URL: http://daitss.fcla.edu/wiki/DocumentationPage
Last update: July 25, 2007
DROID (The National Archives (UK))
Description: DROID (Digital Record Object Identification) is an automatic file format identification tool developed by the National Archives of the UK in conjunction with its PRONOM online registry of technical information. PRONOM holds technical information about the structure of file formats and about the software and hardware environments required to support them; it was developed initially as an internal resource for National Archives staff and subsequently as a public, web-based resource. DROID uses byte signatures stored in PRONOM to identify and report the specific file format versions of digital files. DROID detects the addition of new signatures to the PRONOM database and automatically downloads updates via the Web, so that its signature set stays current. It is designed for batch processing and can be used via a GUI or a command line interface, which supports integration with other systems. DROID is a standalone, platform-independent Java tool and is freely available to download from the PRONOM website.
Tool URL: http://www.nationalarchives.gov.uk/aboutapps/pronom/
Documentation URL: http://www.nationalarchives.gov.uk/aboutapps/fileformat/pdf/droid_api_1.rtf
Last update: July 25, 2007
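The idea of identifying a format by byte signature can be sketched briefly. The Python below is an illustration only and is not DROID's code or API: the `Signature` type, the `identify` function, and the four-entry table are hypothetical, and the table is hand-written from widely known magic numbers rather than taken from PRONOM. PRONOM signatures are assumed to be richer than this (for example, patterns that are not anchored at the start of the file).

```python
from typing import List, NamedTuple, Optional


class Signature(NamedTuple):
    """A fixed byte pattern expected at a fixed offset (hypothetical type)."""
    format_name: str
    offset: int
    pattern: bytes


# Hand-written illustrative table; a real tool would load these from a registry.
SIGNATURES: List[Signature] = [
    Signature("GIF 87a", 0, b"GIF87a"),
    Signature("GIF 89a", 0, b"GIF89a"),
    Signature("PNG", 0, b"\x89PNG\r\n\x1a\n"),
    Signature("PDF", 0, b"%PDF-"),
]


def identify(data: bytes) -> Optional[str]:
    """Return the first format whose signature matches, or None."""
    for sig in SIGNATURES:
        end = sig.offset + len(sig.pattern)
        if data[sig.offset:end] == sig.pattern:
            return sig.format_name
    return None


if __name__ == "__main__":
    print(identify(b"GIF89a\x01\x00\x01\x00"))
    print(identify(b"no known signature here"))
```

Note that the two GIF entries differ only in their version bytes, which is how a signature can report a specific format version rather than just a format family.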
Echodep (University of Illinois Urbana-Champaign)
Description: ECHO DEPository is a digital research and development project at the University of Illinois Urbana-Champaign, in partnership with OCLC and funded by the Library of Congress under the National Digital Information Infrastructure Preservation Program (NDIIPP). The HandS tool suite is a package of components that provide open source tools in the context of the Echodep METS profiles.
Tool URL: http://sourceforge.net/projects/echodep/
Documentation URL: http://dli.grainger.uiuc.edu/echodep/HnS/JavaDocs/
Last update: July 25, 2007
JHOVE (JSTOR/Harvard Object Validation Environment)
Description: JSTOR and the Harvard University Library collaborated on this project to develop an extensible framework for object validation. Representation information (format type) is important to all digital repositories, since ingest, storage, access, and preservation decisions may depend upon the format, and it is necessary to automate the process of identifying and validating the formats of digital objects. JHOVE performs format-specific identification, validation, and characterization of digital objects. These actions are performed by modules for the various format types, and the output of the process is controlled by output handlers, using an extensible plug-in architecture. JHOVE is written in Java and provides an application programming interface (API) for format-specific digital object validation. It can be downloaded with either a command line interface or a GUI.
Tool URL: http://hul.harvard.edu/jhove/distribution.html
Documentation URL: http://hul.harvard.edu/jhove/documentation.html
Last update: July 25, 2007
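The separation between format modules and output handlers described above can be sketched as two small plug-in registries. This is a Python illustration of the general pattern, not JHOVE's Java API: the registries, decorators, and the `characterize` function are hypothetical names, and the two modules perform only a trivial well-formedness check.

```python
import json
from typing import Callable, Dict

# Hypothetical plug-in registries: one for format modules, one for output handlers.
MODULES: Dict[str, Callable[[bytes], dict]] = {}
HANDLERS: Dict[str, Callable[[dict], str]] = {}


def module(name: str):
    """Register a function as the module for one format."""
    def register(func):
        MODULES[name] = func
        return func
    return register


def handler(name: str):
    """Register a function that renders a module's result."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register


@module("ASCII")
def ascii_module(data: bytes) -> dict:
    ok = all(byte < 0x80 for byte in data)
    return {"format": "ASCII", "status": "well-formed" if ok else "not well-formed"}


@module("UTF-8")
def utf8_module(data: bytes) -> dict:
    try:
        data.decode("utf-8")
        ok = True
    except UnicodeDecodeError:
        ok = False
    return {"format": "UTF-8", "status": "well-formed" if ok else "not well-formed"}


@handler("text")
def text_handler(result: dict) -> str:
    return "\n".join(f"{key}: {value}" for key, value in result.items())


@handler("json")
def json_handler(result: dict) -> str:
    return json.dumps(result, sort_keys=True)


def characterize(data: bytes, module_name: str, handler_name: str) -> str:
    """Run one module on the data and render the result with one handler."""
    return HANDLERS[handler_name](MODULES[module_name](data))


if __name__ == "__main__":
    print(characterize(b"hello", "ASCII", "text"))
```

Because modules return a plain result and handlers only render it, a new format or a new output style can be added without touching the other side.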
METS Java Toolkit (Harvard University Library)
Description: This tool uses Java to construct, validate, and process METS objects. It can read in a METS document and expose it as a Java object, which can be modified and then written out as METS again. The toolkit is a Java binding framework in which each schema element of a METS file (e.g. techMD, @LABEL) is represented in memory by an instantiated object; nodes and values can be set on that object, which can then be added to the content model of its parent. The toolkit supports both local and global validation of METS files.
Tool URL: http://hul.harvard.edu/mets/download.html
Documentation URL: http://hul.harvard.edu/mets/doc/
Last update: July 25, 2007
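The read, modify, and write cycle described above can be sketched in a few lines. The following uses Python's standard `xml.etree.ElementTree` rather than the toolkit's Java classes, so it shows the workflow and not the toolkit's API. The function name `relabel_and_add_techmd` is hypothetical, and no schema validation is performed.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)  # keep a readable prefix on output


def relabel_and_add_techmd(mets_xml: str, label: str, techmd_id: str) -> str:
    """Parse a METS document, set @LABEL, add an empty techMD, and serialize."""
    root = ET.fromstring(mets_xml)
    root.set("LABEL", label)

    amd_sec = root.find(f"{{{METS_NS}}}amdSec")
    if amd_sec is None:
        # Caveat: appending here ignores the element order the METS schema
        # expects when other sections (e.g. fileSec) are already present.
        amd_sec = ET.SubElement(root, f"{{{METS_NS}}}amdSec")

    # The child is created first, then attached to the content of its parent.
    ET.SubElement(amd_sec, f"{{{METS_NS}}}techMD", ID=techmd_id)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    source = '<mets xmlns="http://www.loc.gov/METS/" LABEL="old label"/>'
    print(relabel_and_add_techmd(source, "new label", "TECH1"))
```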
New Zealand metadata extractor (National Library of New Zealand)
Description: The Metadata Extraction Tool was developed by the National Library of New Zealand to programmatically extract preservation metadata from a range of file formats. It is designed to automatically extract preservation-related metadata from digital files and output that metadata in XML formats for use in preservation activities. It is now available as open source software.
Tool URL: http://meta-extractor.sourceforge.net/
Documentation URL: http://meta-extractor.sourceforge.net/documentation.htm
Last update: July 25, 2007
PREMIS in METS Toolbox (Florida Center for Library Automation)
Description: The PREMIS in METS Toolbox is a set of open-source tools developed to support the implementation of PREMIS in the METS container format. It provides the following:
Tool URL: http://pim.fcla.edu
Documentation URL: http://pim.fcla.edu/resources
Programming language: Schematron + XSLT + Ruby
Operating system/runtime environment: Any OS that supports these; tested on Linux, but any flavor of Unix should do
Licensing: no restrictions
Version: 0.2.1.2 (beta)
Last update: November 05, 2009
Rosetta (Ex Libris)
Description: Rosetta is a commercial product developed by Ex Libris for the management of digital assets in libraries and academic environments, enabling institutions to create, manage, preserve, and share locally administered digital collections. Rosetta consists of a number of modules, each designed to address different needs, functions, and workflows in the life cycle of a digital object, including ingestion and metadata extraction, creation of a METS object, and editing of metadata (both descriptive and technical).
Availability: From Ex Libris as a commercial product.
Tool URL: http://www.exlibrisgroup.com/category/RosettaOverview
Documentation URL: http://www.exlibrisgroup.com/category/RosettaOverview
Last update: November 04, 2009
Statistics New Zealand Prototype PREMIS Creation Tool
Description: This tool is a set of programs using XSL and VBScript that takes output from JHOVE, the New Zealand Metadata Extractor, and DROID and produces PREMIS object records. It can run on single or multiple files. To create PREMIS output, an XSL stylesheet is run to bring all of the outputs together. The resulting file consists of a stream of multiple PREMIS object records, which a script can split into separate files. The PREMIS object schema has been slightly modified so that the source of the value in each element can be recorded.
Availability: Requires login and password; see: http://www.loc.gov/standards/premis/pigInfo.jpg
Tool URL: http://pigpen.lib.uchicago.edu:8888/pigpen/40
Documentation: Requires login and password; see: http://www.loc.gov/standards/premis/pigInfo.jpg
Documentation URL: http://pigpen.lib.uchicago.edu:8888/pigpen/40/Creating_premis_object_records.doc
Last update: July 25, 2007
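The merging step, in which values from several extraction tools are combined into one object record while keeping track of where each value came from, might look roughly like the following. This is a Python sketch of the idea only; the tool itself uses XSL and VBScript, and its actual schema modification is not documented here. The `source` attribute, the function name, the input dictionaries, and the "first tool listed wins" rule are all assumptions made for illustration, and the record is not namespace-qualified or schema-valid.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_object_record(file_name: str,
                        tool_outputs: Dict[str, Dict[str, str]]) -> ET.Element:
    """Merge per-tool field values into one simplified object record.

    tool_outputs maps a tool name to the fields that tool reported.
    When two tools report the same field, the tool listed first wins
    (an assumed precedence rule, not the original tool's behavior).
    """
    record = ET.Element("object")
    ET.SubElement(record, "originalName").text = file_name

    seen = set()
    for tool, values in tool_outputs.items():
        for field, value in values.items():
            if field in seen:
                continue
            seen.add(field)
            element = ET.SubElement(record, field)
            element.text = value
            element.set("source", tool)  # hypothetical provenance attribute
    return record


if __name__ == "__main__":
    outputs = {
        "DROID": {"formatName": "PDF"},
        "JHOVE": {"formatName": "PDF 1.4", "size": "1024"},
    }
    print(ET.tostring(build_object_record("report.pdf", outputs), encoding="unicode"))
```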
