NAME: Metadata, Dublin Core and USMARC: a review of current efforts
SOURCE: Library of Congress
SUMMARY: This paper summarizes discussions about developing standards for a simple resource description record for Internet resources (metadata). A series of workshops has been convened and has made progress on developing common models for description of Internet resources to support resource discovery and retrieval. The paper reviews the results of each of these workshops, including the Dublin Metadata Workshop held in Dublin, Ohio in March 1995, the Warwick Metadata Workshop held in Coventry, UK in April 1996, and the Image Metadata Workshop held in Dublin, Ohio in September 1996. Some projects experimenting with Dublin Core style metadata are discussed. In addition, the paper presents the revised list of data elements (the "Dublin Core") and gives a mapping to USMARC fields (revised from Discussion Paper No. 86).
KEYWORDS: Metadata; Dublin Core
RELATED: 96-2 (Jan. 1996); DP86 (June 1995)
1/21/97 - Forwarded to USMARC Advisory Group for discussion at the 1997 Midwinter MARBI meetings.
2/16/97 - Results of USMARC Advisory Group discussion - Stu Weibel from OCLC updated the group on milestones since the Image Metadata Workshop in September. These include: changes to the Dublin Core and general consensus that it was stable; a statement of consensus about the element set and embedding it in HTML in a recently released Internet Request for Comments (RFC); the upcoming fourth metadata workshop in Canberra, Australia; agreement on changing the PICS (Platform for Internet Content Selection) standard to enable the use of metadata packages. Specific comments about the mapping of Dublin Core to USMARC detailed in the discussion paper should be sent to LC.
DISCUSSION PAPER NO. 99: Metadata, Dublin Core and USMARC 1. BACKGROUND The term "metadata" has been increasingly used in various communities interested in information on the Internet to mean data about information resources being made available. Bibliographic records, which have been created for many years in the library world, are essentially metadata; they provide descriptive and other information about an information object. With the rapid development of the World Wide Web and the increasing numbers of Internet resources available, metadata for these resources are necessary for effective resource discovery and retrieval. The USMARC Advisory Group considered Discussion Paper No. 86 (Mapping the Dublin Core Metadata Elements to USMARC) in June 1995. The paper reviewed the developments concerning metadata at the OCLC/NCSA Metadata Workshop in March 1995 and presented a mapping of the elements to MARC fields. In addition, it considered problems in the mapping and options for their resolution. The Library of Congress agreed to keep the group current on future activities on this subject. A proposal was initiated and approved based on this effort (Proposal No. 96-2: Define a Generic Author Field in the Bibliographic, Classification, and Community Information Formats). Since the initial workshop in 1995, two additional ones have been convened. At the same time, other metadata standards for more specific applications of metadata have been developed and used (e.g. geospatial). As a result of many discussions (especially conducted by electronic mail after the workshops), the list of Dublin Core data elements has been slightly revised. In addition, some participants of the second workshop at the University of Warwick, UK in April 1996 have issued papers dealing, among other things, with syntax and implementation. Some experiments have been initiated to use metadata, particularly in Web documents. 2.
DUBLIN METADATA WORKSHOP (March 1995) The OCLC/NCSA Metadata Workshop, held in Dublin, Ohio on March 1-3, 1995, was organized by OCLC and the National Center for Supercomputing Applications (NCSA) to address the problem of providing metadata for network-accessible materials. The original intent was to recognize various "stakeholder" communities with an interest in the search and retrieval of Internet resources, to understand the uses descriptive metadata would serve for these communities, and to achieve if possible some consensus on a limited data element set for identifying these resources. Workshop participants included librarians and archivists, researchers, computer and information scientists, software developers, publishers, and members of Internet Engineering Task Force (IETF) working groups. Within these constituencies there was tremendous diversity of approach. Some participants were concerned with electronic data resources in general while others focused on particular types of materials, such as humanities texts or geospatial metadata. Some were interested in the network services and protocols that would make use of the metadata, while others took the point of view of the author, publisher or end-user. The one thing that united all participants was a belief that nearly any standard metadata would be better than none, since there had been little agreement and no standardization at the time. Nonetheless, early in the course of the workshop it became evident that no single data element set, whether limited or unlimited, would satisfy the widely divergent and highly specific needs of the various stakeholders. The emphasis therefore shifted to something that was perceived as both useful and doable: the definition of a simple data element set that could be used by information providers to describe their own resources.
The goal was to draft a single sheet of instructions that an author or publisher mounting a document on a network server would be able to follow without excessive effort or additional knowledge. Such a data element set, if it could become an official or de facto standard, would have several uses. It would encourage authors and publishers to provide metadata simultaneously with their data. It would allow the developers of authoring tools for network publishing to include templates for this information directly in their software, making it even easier for the information providers to supply it. The metadata created by the information providers would serve as a basis for more detailed cataloging or description when warranted by specific communities. In addition, it would ensure a common core set of elements that could be understood across communities, even if more specific information was required within a particular interest group. Because of the inadequacy of current search engines to provide relevant search results given the huge number of Internet resources being searched, fielded searching could be provided to allow for more precision if metadata were available for them. Because it was agreed that a fairly short list of data elements would be most useful and simple for naive users to use, the concept of extensibility was established. The metadata element set concentrated on describing intrinsic properties of the resource. Extrinsic data, such as cost, access limitations, etc. was considered outside the scope of the core set. The extensibility mechanism would allow for the base set to be extended for a variety of purposes. (Implementation considerations were considered in the second metadata workshop.) The extensibility mechanism means that a particular user community may establish a list of additional elements that may be incorporated for specialized purposes. 
A scheme sub-element is defined for some of the elements in the core set and may also be used for the extensible sets. This allows for the specification of established schemes or sets of rules that govern the syntax or semantics of an element. The simple resource description record that emerged from this first metadata workshop has come to be known as the "Dublin Core". It is a core set in the sense that it is a small number of elements, judged to have general applicability, that will be universally understood if the standard is followed. It is not a core data element set in the sense of being a minimum number of required elements. 3. WARWICK METADATA WORKSHOP (April 1996) Because the implementation of a simple resource description record requires a formal syntax and deployment strategy, the second metadata workshop was convened. It was organized by the UK Office for Library and Information Networking (UKOLN) and OCLC's Office of Research. The agenda included the identification and resolution of impediments to the deployment of a Dublin Core resource description record. The participants at the meeting recognized the need for a wider set of metadata types and a framework for extensibility for interchange of different types of metadata. There was consensus in many areas, and broader plans for implementation were considered. Specific areas of consensus that resulted in separate documents subsequent to the workshop included: Syntax: a concrete syntax for the Dublin Core was established, expressed as a Document Type Definition (DTD) in Standard Generalized Markup Language (SGML); this syntax was mapped to existing HyperText Markup Language (HTML) tags for embedding metadata in Web documents. Warwick Framework: a container architecture for bringing together different packages of metadata, which are separately accessible and maintainable.
This allows for extensibility for other types of metadata not part of the Dublin Core element set (e.g., terms and conditions, administrative data). User's Guide: a guide to authors for preparing resource descriptions and for administrators of collections. This would include both a simple high-level guide and another one for more complex resource descriptions. Several proposals dealing with syntax have been presented subsequent to the workshop, including MIME and SGML-based implementations. In addition, participants at the World Wide Web Consortium Workshop on Distributed Indexing and Searching reached consensus on embedding metadata in HTML. This provides an impetus for implementation to begin. Since no single element set will satisfy all metadata requirements for different communities with different levels of complexity, the Warwick Framework was embraced as an important step forward and a necessity for accommodating a number of metadata models. Participants at the workshop agreed that some applications would need a fuller resource description record than the Dublin Core set of elements provides, and that other types of metadata outside that core are also needed (such as administrative, terms and conditions, etc.) The Warwick Framework provides an architecture that could satisfy the need for complementary or overlapping metadata models and allows for the interchange of different metadata packages. In the Warwick Framework, a package is an object with a specific type of metadata for a particular purpose. The metadata packages may be embedded in the object described or may exist separately with a URI (i.e., a URL or Uniform Resource Name (URN)) reference. These packages are brought together in a conceptual container through linkages. A Dublin Core type record might be one package, a MARC record another. Other packages might include (this list is not exhaustive): terms and conditions, domain-specific metadata (e.g. geospatial), rights, administrative data. 
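The container-and-package idea described above can be pictured with a small data structure. The following is only a sketch under stated assumptions: the class names, attribute names, package type strings, and example.org addresses are all hypothetical and do not come from the Warwick Framework documents, which at this stage define an architecture rather than an encoding.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Package:
    """One typed metadata package (illustrative model only)."""
    package_type: str                         # e.g. "dublin-core", "marc", "rights"
    content: Optional[Dict[str, str]] = None  # metadata carried inside the package
    uri: Optional[str] = None                 # or a URI (URL/URN) where the metadata lives

    def is_reference(self) -> bool:
        """True when the package only points at externally held metadata."""
        return self.content is None and self.uri is not None


@dataclass
class Container:
    """Conceptual container linking the packages that describe one resource."""
    resource: str
    packages: List[Package] = field(default_factory=list)

    def add(self, package: Package) -> None:
        self.packages.append(package)

    def of_type(self, package_type: str) -> List[Package]:
        return [p for p in self.packages if p.package_type == package_type]


# Hypothetical resource with one embedded package and one held elsewhere.
container = Container(resource="http://example.org/report.html")
container.add(Package("dublin-core", content={"TITLE": "Annual Report", "DATE": "19961203"}))
container.add(Package("terms-and-conditions", uri="http://example.org/report-terms.html"))

for package in container.packages:
    where = package.uri if package.is_reference() else "embedded"
    print(package.package_type, "->", where)
```

Keeping the package type as an open string rather than a fixed enumeration mirrors the point that the list of package types is not exhaustive.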
Participants agreed that a registry of metadata package types was necessary. There are several issues that require examination before finalizing the framework. These include the overlapping of data elements in multiple metadata sets; how a type registry will work and how the system would deal with new metadata types; what the syntax for transferring sets of packages will be; what sort of structure will be used for encoding the data in each package; how efficient this distributed architecture will be on the Internet; and how retrieval of metadata will work. A draft user guide was prepared by a subgroup after the workshop and distributed, but its future is uncertain because of the many new issues that were presented. However, there is general consensus that a user guide will be necessary, probably in two forms. One would be a simple user guide addressing the basic Dublin Core elements that would be readily understood by the casual Internet user putting up documents. Another would be a more complex one incorporating qualifiers and flags for richer, more complex resource descriptions. 4. CNI/OCLC WORKSHOP ON METADATA FOR NETWORKED IMAGES (Sept. 1996) Another workshop was held in Dublin, Ohio in September 1996 to explore the usefulness of extending the Dublin Core element set to the area of digital images. During the first workshop, the scope was limited to "document-like objects (DLO)" in order to make progress. Although this term was never fully defined, many considered a document-like object to be essentially text. This third metadata workshop focused on the description of visual resources such as photographs, slides, and image files. It was decided that databases or applications having visual outputs of a dynamic nature need to be considered separately. 
The consensus of the workshop was as follows: "The Dublin Core, within the context of the Warwick Framework, affords a foundation for the development of a simple resource description model to support network-based discovery of images (items or collections of items, online or offline)." One refinement to the Dublin Core that the group supported was the addition of a rights element, to be used either with simple rights information about the object or as a link to a separate rights metadata package if the information were more complex. In addition, it endorsed the inclusion of the Coverage element (about which there had been some controversy over whether it was too domain specific), which is very important for images. It is important to note that only within the context of the Warwick Framework is the Dublin Core considered adequate, since richer, domain-specific description will always be necessary for specialized applications. 5. PROJECTS USING DUBLIN CORE METADATA There are several projects that are experimenting with using the Dublin Core. Many of these involve interoperability between different databases including MARC records, digital objects, and other forms of metadata. The following are a few of these (mainly those that include experimentation with MARC records); some descriptions are from the Dublin Core home page. (http://purl.oclc.org/metadata/dublin_core): SOLINET's Monticello Electronic Library. The basic function of Monticello Electronic Library is to link distributed regional resources regardless of source or type of information. The Dublin Core Element Set is being used to provide semantic interoperability between several databases of electronic media and record types including SGML EAD Finding Aid, MARC and GILS collections. The National Library of Australia and the National Library of New Zealand's National Document and Information Service (NDIS) Project. 
The National Document and Information Service (NDIS) Project will provide access to 70 databases covering law, journals, library resources, community resources, and research. The Dublin Core will be used to normalize a wide variety of data under a single structure. Resource Discovery Project, Distributed Systems Technology Centre (Australia). This project is investigating issues with locating and retrieving information in large networked environments (e.g. the Digital Library). Technical problems under consideration are resource access, resource services, and resource description. The project has a prototype system that maps MARC records to the Dublin Core, which then merges them with Web resources also mapped to the Dublin Core. The Nordic Metadata Project. Among Nordic countries there is a special need for a shared metadata creation system, as it will further facilitate the already active use of ILL and document delivery services within Scandinavia. The Dublin Core is being used to provide and enhance end-user services by making a diversity of digital documents more easily searchable and deliverable over the Net. The Nordic Metadata Project is creating conversion programs between NORMARC (similar to USMARC) and Dublin Core data elements. Library of Congress Ameritech Digital Library Competition. Another project experimenting with Dublin Core metadata elements is the Library of Congress' National Digital Library Program. With a grant from Ameritech, LC is sponsoring a competition to enable libraries, museums, historical societies, and archival institutions to create digital collections of primary resource material, which will augment the collections already digitized through this program. Collections that are digitized as part of this program must be supported by access aids, usually through catalog records, finding aids, or the provision of searchable reproductions of textual materials.
One type of access aid is a bibliographic record in MARC format. Participants can choose to create less than full-level records in MARC format with fields as specified in the Ameritech guidelines. Alternatively, bibliographic records following the Dublin Core approach may be used. These would use SGML tagging if a Document Type Definition emerges from the Dublin/Warwick metadata participants by the time the awards for the Ameritech competition are made. In addition, institutions may choose to incorporate bibliographic information in the headers either using the Text Encoding Initiative guidelines (in SGML) or using Dublin Core descriptive elements embedded in the document header using HTML as agreed upon at the W3C Workshop on Distributed Indexing and Searching (mentioned above). The Library of Congress will work out details with awardees choosing to take the Dublin Core approach. Z39.50 and Dublin Core. The Z39.50 Profile for Simple Distributed Search and Ranked Retrieval (ZDSR) will employ the Dublin Core elements. ZDSR originally was based on the Stanford Protocol for Internet Search and Retrieval (STARTS), an initiative of the Stanford Digital Library Project, which developed requirements for distributed searching and ranked retrieval during the Spring and Summer of 1996. The profile assumes that queries pertain to text documents and that retrieval records consist of document metadata. Thus, a client searches for documents, and retrieves document descriptors. Those descriptors are compatible with those in the Dublin Core element set. 6. DUBLIN CORE/MARC CROSSWALK A mapping between the elements in the Dublin Core and USMARC fields is necessary so that conversions between various syntaxes can occur accurately. Once Dublin Core style metadata is widely provided, it might interact with MARC records in various ways such as the following: Enhancement of simple resource description record. 
A cataloging agency may wish to extract the metadata provided in Dublin Core style (presumably in HTML or SGML) and convert the data elements to MARC fields, resulting in a skeletal record. That record might then be enhanced as needed to add additional information generally provided in the particular catalog. Searching across syntaxes and databases. Libraries have large systems with valuable information in MARC bibliographic records (which may also be called metadata). Over the past few years, with the expansion of electronic resources over the Internet, other syntaxes have also been considered for providing metadata. The Library of Congress has worked with a group of SGML experts to create a Document Type Definition (DTD) for MARC, so that conversions can be made between SGML and MARC in a standardized way. It will be important for systems to be able to search metadata in different syntaxes and databases and have commonality in the definition and use of elements. A mapping between the Dublin Core elements and MARC was initially provided in Discussion Paper No. 86. As a result, a proposal was presented and approved to define a generic author field (Proposal No. 96-2 (Define a Generic Author Field in the Bibliographic, Classification, and Community Information Formats)), which allowed for more consistency between the Dublin Core Author element, which did not distinguish type of name, and a MARC field. A revised mapping (also called crosswalk) between Dublin Core elements and MARC fields is given below. This should replace the previous mapping in Discussion Paper No. 86. The list of Dublin Core elements is current as of December 1996. There seems to be broad consensus on this list, and the elements and their names are not expected to change substantively. In some cases, there are two mappings provided. The first is a simple mapping and is used if the Dublin Core elements are used without qualifiers.
The second is for a more complex description for which the elements have qualifiers. There could be a mixture, but if the particular element is unqualified, then the simple mapping for that element should be used. Certain defaults have been assumed as to definitions and qualifiers; if these change, the list will need to be adjusted. This list has been made consistent with the GILS/MARC mapping where possible. Where applicable, subfields are given. Earlier metadata workshops supported the notion of defining qualifiers for elements when more complex descriptions are needed, but the list of qualifiers is not entirely agreed upon. When the list of qualifiers becomes standardized it will be necessary to modify this document and add to it as appropriate. Only the most obvious qualifiers have been included now. USMARC fields are listed with field number, then in parentheses field name/subfield name (if both are the same, no subfield name is included). If the value of the indicator is not provided, use a blank (H'20'). The label is a shortened form of the element name.

1 Title
Label: TITLE
The name given to the resource by the CREATOR or PUBLISHER.
USMARC: 245$a (Title Statement/Title proper) (1st indicator=0)

2 Author or Creator
Label: CREATOR
The person(s) or organization(s) primarily responsible for the intellectual content of the resource. For example, authors in the case of written documents, artists, photographers, or illustrators in the case of visual resources.
Qualifier possible: type.
USMARC: 100$a (Main Entry--Personal Name) (1st indicator=1)
  If type=corporate: 110$a (Main entry--Corporate Name)

3 Subject and Keywords
Label: SUBJECT
The topic of the resource, or keywords or phrases that describe the subject or content of the resource. The intent of the specification of this element is to promote the use of controlled vocabularies and keywords. This element might well include scheme-qualified classification data (for example, Library of Congress Classification Numbers or Dewey Decimal numbers) or scheme-qualified controlled vocabularies (such as Medical Subject Headings or Art and Architecture Thesaurus descriptors) as well.
Qualifier possible: scheme.
USMARC: 653$a (Index Term--Uncontrolled)
  If scheme=LCSH: 650$a (Subject added entry--topical term)
  If scheme=LCC: 050$a (Library of Congress Call Number/Classification number)
  If scheme=DDC: 082$a (Dewey Decimal Call Number/Classification number)

4 Description
Label: DESCRIPTION
A textual description of the content of the resource, including abstracts in the case of document-like objects or content descriptions in the case of visual resources. Future metadata collections might well include computational content description (spectral analysis of a visual resource, for example) that may not be embeddable in current network systems. In such a case this field might contain a link to such a description rather than the description itself.
USMARC: 520$a (Summary, etc. note)

5 Publisher
Label: PUBLISHER
The entity responsible for making the resource available in its present form, such as a publisher, a university department, or a corporate entity. The intent of specifying this field is to identify the entity that provides access to the resource.
USMARC: 260$b (Publication, Distribution, etc. (Imprint)/Name of publisher, distributor, etc.)

6 Other Contributors
Label: CONTRIBUTORS
Person(s) or organization(s) in addition to those specified in the CREATOR element who have made significant intellectual contributions to the resource but whose contribution is secondary to the individuals or entities specified in the CREATOR element (for example, editors, transcribers, illustrators, and convenors).
Qualifiers possible: role, type.
USMARC: 720$a (Added Entry--Uncontrolled Name/Name) $e [content of role qualifier]
  If type=personal: 700$a (Added Entry--Personal Name) $e [content of role qualifier]
  If type=corporate: 710$a (Added Entry--Corporate Name) $e [content of role qualifier]

7 Date
Label: DATE
The date the resource was made available in its present form. The recommended best practice is an 8 digit number in the form YYYYMMDD as defined by ANSI X3.30-1985. In this scheme, the date element for the day this is written would be 19961203, or December 3, 1996. Many other schemes are possible, but if used, they should be identified in an unambiguous manner.
Qualifier possible: modified.
USMARC: 260$c (Date of publication, distribution, etc.)
  If type=modified: 005 (Date and time of latest transaction)

8 Resource Type
Label: TYPE
The category of the resource, such as home page, novel, poem, working paper, technical report, essay, dictionary. It is expected that RESOURCE TYPE will be chosen from an enumerated list of types. A preliminary set of such types can be found at the following URL: http://www.roads.lut.ac.uk/Metadata/DC-ObjectTypes.html
USMARC: 516$a (Type of Computer File or Data Note)

9 Format
Label: FORMAT
The data representation of the resource, such as text/html, ASCII, Postscript file, executable application, or JPEG image. The intent of specifying this element is to provide information necessary to allow people or machines to make decisions about the usability of the encoded data (what hardware and software might be required to display or execute it, for example). As with RESOURCE TYPE, FORMAT will be assigned from enumerated lists such as registered Internet Media Types (MIME types). In principle, formats can include physical media such as books, serials, or other non-electronic media.
USMARC: 538$a (System Details Note)

10 Resource Identifier
Label: IDENTIFIER
String or number used to uniquely identify the resource. Examples for networked resources include URLs and URNs (when implemented). Other globally-unique identifiers, such as International Standard Book Numbers (ISBN) or other formal names would also be candidates for this element.
Qualifier possible: scheme.
USMARC: 024$a (with 1st indicator=8) (Other Standard Identifier/Standard number or code)
  If scheme=URL: 856$u (Uniform Resource Locator) (1st indicator=7)
  If scheme=ISBN: 020$a (International Standard Book Number)
  If scheme=ISSN: 022$a (International Standard Serial Number)
  If scheme=URN: 856$u with initial "urn:" (1st indicator=7)

11 Source
Label: SOURCE
The work, either print or electronic, from which this resource is derived, if applicable. For example, an html encoding of a Shakespearean sonnet might identify the paper version of the sonnet from which the electronic version was transcribed.
USMARC: 786$t (Data Source Entry/Title) (1st indicator=0)

12 Language
Label: LANGUAGE
Language of the intellectual content of the resource. Where practical, the content of this field should coincide with the Z39.53 three character codes for written languages.
Qualifier possible: scheme.
USMARC: 546$a (Language note)
  If scheme=Z39.53: 041$a (Language code)
  If scheme=USMARC: 041$a (Language code)
A three-character language code standard is currently being balloted as: ISO 639-2 (not yet available electronically)

13 Relation
Label: RELATION
Relationship to other resources. The intent of specifying this element is to provide a means to express relationships among resources that have formal relationships to others, but exist as discrete resources themselves. For example, images in a document, chapters in a book, or items in a collection. A formal specification of RELATION is currently under development. Users and developers should understand that use of this element should be currently considered experimental.
USMARC: 787$n (Nonspecific Relationship Entry/Note) (with 1st indicator=0)

14 Coverage
Label: COVERAGE
The spatial locations and temporal durations characteristic of the resource. Formal specification of COVERAGE is currently under development. Users and developers should understand that use of this element should be currently considered experimental.
Possible qualifier: type.
USMARC: 500$a (General note)
  If type=spatial: 255$c (Cartographic Mathematical Data/Statement of coordinates)
  If type=temporal: 513$b (Type of Report and Period Covered Note/Period covered)

15 Rights Management
Label: RIGHTS
The content of this element is intended to be a link (a URL or other suitable URI as appropriate) to a copyright notice, a rights-management statement, or perhaps a server that would provide such information in a dynamic way. The intent of specifying this field is to allow providers a means to associate terms and conditions or copyright statements with a resource or collection of resources. No assumptions should be made by users if such a field is empty or not present.
USMARC: 506$a (Restrictions on Access Note/Terms governing access)

7. FUTURE EFFORTS A fourth metadata workshop to be held in Canberra, Australia in March 1997 was recently announced. This workshop will address major open issues concerning deployment of the Dublin Core and afford developers and planners an opportunity to share experiences with others. Agenda items will include: extensibility issues (how to extend the core elements to accommodate a variety of users), element structure (identification of default schemes and subelement conventions), and element refinement (semantics and clearer definitions). It is hoped that Internet search engines will begin to incorporate Dublin Core style metadata for resource search and discovery.
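To make the idea of Dublin Core style metadata in an HTML header concrete, here is a minimal sketch that writes element/value pairs as META tags whose names carry the "DC." prefix used in the agreed HTML coding. The function name and the sample record are hypothetical, and pairing NAME with the standard HTML CONTENT attribute is an assumption about the full tag form; this paper itself only shows the NAME part.

```python
from html import escape


def dublin_core_meta_tags(record):
    """Render (element, value) pairs as HTML META tags with a "DC." prefix.

    Illustrative only: the NAME/CONTENT pairing is an assumption about
    the embedding, not a quotation of the agreed specification.
    """
    lines = []
    for element, value in record:
        name = escape("DC." + element, quote=True)
        content = escape(value, quote=True)
        lines.append(f'<META NAME="{name}" CONTENT="{content}">')
    return "\n".join(lines)


# Hypothetical record; an element such as Subject may repeat.
record = [
    ("Title", "Metadata, Dublin Core and USMARC"),
    ("Subject", "Metadata"),
    ("Subject", "Dublin Core"),
    ("Date", "19961203"),
]

print(dublin_core_meta_tags(record))
```

A list of pairs is used instead of a dictionary so that repeated elements are preserved in order.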
Currently AltaVista and InfoSeek, two well-known search engines on the World Wide Web, are using the description and keyword META tags, which may be embedded in the headers of HTML documents. However, the syntax is not entirely consistent with those elements in the Dublin Core. The meta tag, used to indicate a data element name, in AltaVista is "keywords", while in Dublin Core it is "subject" and the agreed-upon HTML coding for Dublin Core indicates the scheme used (DC means Dublin Core; e.g., META NAME="DC.Subject"). It is likely that future development of Dublin Core style metadata and implementation into existing search software would greatly enhance search and retrieval of documents on the Internet. The challenge is to build strong consensus on this potential new standard, to work out further details about implementation, to convince implementors that it is solid and has strong support, and to get information managers and providers to supply metadata for their resources. 8. REFERENCES Workshop Reports. OCLC/NCSA Metadata Workshop I. _OCLC/NCSA Metadata Workshop Report_. Stuart Weibel, Jean Godby, Eric Miller, and Ron Daniel (June 1995). (http://purl.oclc.org/metadata/dublin_core_report) OCLC/UKOLN Metadata Workshop II. _The Warwick Metadata Workshop: a framework for the deployment of resource description_. Lorcan Dempsey and Stuart Weibel, D-Lib Magazine (July/August 1996). (http://www.dlib.org/dlib/july96/07weibel.html) _The Warwick Framework: a container Architecture for aggregating sets of metadata_. Carl Lagoze, Clifford Lynch, Ron Daniel (June 28, 1996). (http://cs-tr.cs.cornell.edu:80/Dienst/UI/2.0/Describe/ncstrl.cornell%2fTR96-1593) CNI/OCLC Workshop on Metadata for Networked Images. Workshop Home Page and Workshop's Executive Summary. (http://purl.oclc.org/metadata/image) Other Sources (This list is by no means exhaustive.) _Australian Digital Library Initiatives_. Renato Iannella (editor) (December 1996).
(http://www.dlib.org/dlib/december96/12iannella.html) _Review of Metadata Formats_. Rachel Heery (October 1996). (http://www.ukoln.ac.uk/metadata/review) _The State of the Dublin Core_. Stuart Weibel (January 1997). To be issued in: _International Journal of Digital Libraries, Special Issue on Metadata and Digital Libraries_. Dublin Core Home Page. (http://purl.oclc.org/metadata/dublin_core) Metadata Resources (Home Page). (http://www.nlc-bnc.ca/ifla/documents/libraries/cataloging/metadata)
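As a closing illustration of the crosswalk in section 6, the simple and qualified mappings can be held in a lookup table. This is a sketch only: it covers a subset of the elements, ignores indicators and the $e role subfield, and the names CROSSWALK and to_usmarc are hypothetical. Falling back to the simple mapping when a qualifier is not recognized is an assumption; the paper states only that unqualified elements use the simple mapping.

```python
# (label, qualifier) -> (USMARC field, subfield); qualifier None is the simple mapping.
# Values are taken from the crosswalk in section 6; only some elements are included.
CROSSWALK = {
    ("TITLE", None): ("245", "a"),
    ("CREATOR", None): ("100", "a"),
    ("CREATOR", "type=corporate"): ("110", "a"),
    ("SUBJECT", None): ("653", "a"),
    ("SUBJECT", "scheme=LCSH"): ("650", "a"),
    ("SUBJECT", "scheme=LCC"): ("050", "a"),
    ("SUBJECT", "scheme=DDC"): ("082", "a"),
    ("DESCRIPTION", None): ("520", "a"),
    ("PUBLISHER", None): ("260", "b"),
    ("IDENTIFIER", None): ("024", "a"),
    ("IDENTIFIER", "scheme=URL"): ("856", "u"),
    ("IDENTIFIER", "scheme=ISBN"): ("020", "a"),
    ("IDENTIFIER", "scheme=ISSN"): ("022", "a"),
    ("LANGUAGE", None): ("546", "a"),
    ("LANGUAGE", "scheme=Z39.53"): ("041", "a"),
}


def to_usmarc(label, value, qualifier=None):
    """Return (field, subfield, value) for one Dublin Core element.

    Assumption: an unrecognized qualifier falls back to the simple mapping.
    A label missing from the table raises KeyError.
    """
    key = (label.upper(), qualifier)
    field, subfield = CROSSWALK.get(key) or CROSSWALK[(label.upper(), None)]
    return field, subfield, value


print(to_usmarc("Subject", "Metadata"))
print(to_usmarc("Subject", "Cataloging", "scheme=LCSH"))
print(to_usmarc("Identifier", "http://purl.oclc.org/metadata/dublin_core", "scheme=URL"))
```

A table keeps the mapping easy to revise when the list of qualifiers becomes standardized, which the paper anticipates.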