The Library of Congress >> Especially for Librarians and Archivists >> Standards

MARC Standards



MARC PROPOSAL NO. 2019-03

DATE: December 12, 2018
REVISED:

NAME: Defining Subfields $0 and $1 to Capture URIs in Field 024 of the MARC 21 Authority Format

SOURCE: PCC Task Group on URIs in MARC

SUMMARY: This is a proposal to capture machine actionable and parseable URIs in the 024 field of the MARC 21 Authority Format by adding:

  1. Subfield $0 for URIs that identify a ‘Record’ or ‘Authority’ entity describing a Thing (e.g. madsrdf:Authorities, SKOS Concepts for terms in controlled or standard vocabulary lists) and,
  2. Subfield $1 for URIs that directly identify a Thing itself (sometimes referred to as a Real World Object or RWO, whether actual or conceptual).

The proposed changes facilitate conversion from MARC to RDF by separating standard numbers or codes that are not machine actionable URIs, which are already accommodated in 024 $a, from machine dereferenceable HTTP URIs, which would be recorded in $0 and $1.

Note: Standard vocabulary terms from controlled lists, such as MARC lists, are not generally considered Authority "records"; however, when those terms are represented as SKOS concepts and assigned actionable/dereferenceable URIs, they do carry with them "record"-like data in a particular vocabulary scheme. The latter are referred to in this paper as Authority "records," alongside more traditional Authorities in a record format.

KEYWORDS: Field 024 (AD); Other Standard Identifier (AD); Subfield $0, in field 024 (AD); Authority record control number or standard number (AD); Subfield $1, in field 024 (AD); Real World Object URI (AD); Uniform Resource Identifier (AD); URIs

RELATED: 2018-DP08, 2017-08, 2017-DP01

STATUS/COMMENTS:
12/12/18 – Made available to the MARC community for discussion.

01/27/19 – Results of MARC Advisory Committee discussion: Approved, with the amendment to revise the Appendix A -- Control Subfields definition of $0 in the Authority format to reflect the changes approved for field 024 in this paper.

03/29/19 – Results of MARC Steering Group review: Agreed with the MAC decision.


Proposal No. 2019-03: Defining Subfields $0 and $1 to Capture URIs in Field 024

1. BACKGROUND

Discussion Paper 2018-DP08 suggested four options for adding machine dereferenceable URIs to the MARC Authority 024 field and disambiguating among standard numbers or codes that are not machine actionable, machine dereferenceable Authority URIs, and machine dereferenceable RWO URIs:

  1. Put both standard numbers or codes that are not machine actionable and URIs in $a, and disambiguate Authority URIs and RWO URIs with second indicator values 0 and 1, respectively.
  2. Retain the current definition of 024 $a for standard numbers or codes that are not machine actionable, define $0 and $1 for Authority and RWO URIs, respectively, and restrict $0/$1 to URIs providing machine actionable or machine dereferenceable data.
  3. Deprecate the 024 $a and replace it with $0 for both standard numbers or codes that are not machine actionable and machine dereferenceable Authority URIs and add $1 as defined in Proposal 2017-08.
  4. Redefine the 024 $a for recording both standard numbers or codes that are not machine actionable and machine dereferenceable Authority URIs and add $1 as defined in Proposal 2017-08.

Per the MARC Advisory Committee’s (MAC) direction, this proposal pursues approval of Option 2. MAC also required that use of $2 be optional with $0 and $1; however, we believe that requirement is already met by the use of first indicator 8.

In approving Proposal 2017-08, MAC established a distinction, reflected in the definitions of $0 and $1, between machine dereferenceable HTTP identifiers for authorities and for RWOs. The PCC URI Task Force has identified the 024 field in the Authority Format as another MARC field where it is useful to add those subfields to make that distinction.

Experiments by the PCC URI Task Force and others in converting MARC 21 to linked data suggest that there are benefits to storing URIs in MARC 21 to facilitate conversion to RDF.  That said, RDF requires more semantic precision than MARC 21 currently contains. Therefore, changes to the 024 field in the Authority format (similar to the recent changes to the subfield $0 and addition of the $1 [see 2017-DP01, 2017-08]) are an important prerequisite for the conversion to linked data.

A scope note: The Uniform Resource Locator (or URL) is another important type of URI, which provides addresses for human-readable websites, documents, or web pages. Because the focus of this paper is linked data designed for machine consumption, document URLs are out of scope: neither URLs nor the use of $u to capture them is part of the proposal.

2. DISCUSSION

2.1. Current Definition and Scope of Field 024 in the MARC Authority Format



Field Definition and Scope: Standard number or code associated with the entity named in the 1XX field which cannot be accommodated in another field (e.g., fields 020 (International Standard Book Number) and 022 (International Standard Serial Number)). The source of the standard number or code is identified in subfield $2 (Source of number or code).
Subfields in this field are defined for consistency with field 024 in the MARC 21 Format for Bibliographic Data.

2.2. Context

As described in 2017-DP01 and 2017-08, in the MARC 21 format it is critical to distinguish RWO URIs from URIs for Authorities and skos:Concepts to allow for meaningful conversions to other formats, such as RDF.

The Authority 024 is currently defined as, “Standard number or code associated with the entity named in the 1XX field which cannot be accommodated in another field …” It allows for capturing any external identifiers (both URIs and non-URI identifiers) for the thing described in the record that are not provided for elsewhere in the MARC Authority record. The 024 in the MARC 21 Authority format, like the definition of the $0 prior to Proposal 2017-08, lacks a machine actionable way to make meaningful mapping assertions from various datasets when converting MARC Authority Records to RDF. To make such assertions possible, the 024 needs to allow the following types of data to be disambiguated:

(1) standard numbers and codes that are not machine actionable
(2) URIs for machine actionable Authority/Concepts
(3) RWO URIs

2.3. Justification for Changes to MARC Authority Field 024

2.3.1. URIs and Equivalency

According to linked data design principles [COOL URIs, https://www.w3.org/TR/cooluris/], the semantic web infrastructure relies on the unique identification of entities, known in semantic web terms as ‘Real World Objects’ (RWOs) or, more colloquially, ‘Things.’ For example, a Person and a MARC 21 Authority record about the person are different RWOs, or Things, and each needs to be uniquely identified with its own distinct URI for semantic clarity (for background see 2018-DP08 and 2017-08). The distinction between authority URIs in $0 and RWO URIs in $1 adds some of that clarity in MARC.

The minting of multiple URIs for the same or similar things is inevitable. To help align these duplicative entities, there are established semantic web conventions for mappings between the same or similar entities. The decision regarding which equivalency relationships to use to map different entities depends on the type of entities being mapped and the degree of confidence with respect to their equivalencies.

In RDF, if you say that two entities are the same as each other using the common RDF property owl:sameAs (see http://www.w3.org/TR/owl-ref/#sameAs-def), then everything stated about one entity is also true of the other. The use of this property should be carefully considered because corresponding inferences can lead to messy data if the two things are not in fact the same. For instance, two authority records from different national authority files describing the same person are not the same resource. Each authority record has unique traits: different dates of creation and/or of modification, different sources of information, different processes asserted on them, etc.
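
The risk can be illustrated with a minimal Python sketch. The record URIs, dates, and property names below are hypothetical and stand in for two authority records about one person; the merge function is a deliberately naive reading of owl:sameAs, not a reasoner.

```python
# Hypothetical statements about two authority records describing one person.
descriptions = {
    "http://example.org/fileA/123": {"created": {"1979-05-01"}, "source": {"File A"}},
    "http://example.org/fileB/xyz": {"created": {"2004-11-17"}, "source": {"File B"}},
}


def merge_same_as(descriptions, uri_a, uri_b):
    """Naive owl:sameAs: every statement about one URI also holds for the other."""
    merged = {}
    for prop in sorted(set(descriptions[uri_a]) | set(descriptions[uri_b])):
        merged[prop] = descriptions[uri_a].get(prop, set()) | descriptions[uri_b].get(prop, set())
    return merged


merged = merge_same_as(
    descriptions,
    "http://example.org/fileA/123",
    "http://example.org/fileB/xyz",
)
# The merged resource now carries two creation dates and two sources.
print(sorted(merged["created"]))
```

Under these assumptions, a single "record" ends up with conflicting administrative data, which is the kind of messy inference the paragraph above warns about.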

Rather than asserting that two name authority records are owl:sameAs, we want to assert that the focus of each authority record is the same Person, which is identified by the URI for the Person/RWO. The foaf:focus (see http://xmlns.com/foaf/spec/#term_focus) property is designed to align SKOS terms to their RWO equivalents. URIs that directly identify a Person/RWO provide a bridge between different authority records focusing on the same Person.  More directly, the two authorities could be related using skos:exactMatch or skos:closeMatch (see http://www.w3.org/TR/skos-reference/#mapping) depending on the circumstance, or even looser properties such as rdfs:seeAlso (see http://www.w3.org/TR/rdf-schema/#ch_seealso) or schema:sameAs (see https://schema.org/sameAs).
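
As a rough illustration of this bridging role, the following Python sketch groups authority URIs by the RWO URI they share as foaf:focus. All example.org URIs are made up for illustration, and triples are modeled as plain tuples rather than with an RDF library.

```python
from collections import defaultdict

FOAF_FOCUS = "http://xmlns.com/foaf/0.1/focus"

# Hypothetical triples: authority records from two different files, two of
# which point at the same (made-up) Person URI via foaf:focus.
triples = [
    ("http://example.org/fileA/123", FOAF_FOCUS, "http://example.org/person/42"),
    ("http://example.org/fileB/xyz", FOAF_FOCUS, "http://example.org/person/42"),
    ("http://example.org/fileB/other", FOAF_FOCUS, "http://example.org/person/99"),
]


def authorities_by_focus(triples):
    """Group authority URIs by the RWO URI they share as foaf:focus."""
    groups = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate == FOAF_FOCUS:
            groups[obj].append(subject)
    return dict(groups)


bridged = authorities_by_focus(triples)
for rwo, authorities in bridged.items():
    print(rwo, "<-", authorities)
```

The two authority records remain distinct resources; only their shared focus links them.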

As explained above, equivalency relationships in RDF are made explicitly; however, in MARC the relationships between or among entities are often less clear.  For purposes of MARC to RDF conversion, we need such clarity, particularly in regard to authority data.  We generally attain this clarity in MARC by stating the relationship in a subfield, e.g., $4, or defining certain fields or subfields to represent specific kinds of relationships between the Subject resource and the entity identified in the MARC field.

For the MARC Authority 024 field, we propose to follow previously approved practice to add machine dereferenceable HTTP URIs in $0 and $1 to facilitate MARC to RDF conversion and disambiguate URIs for Authorities from URIs for Things/RWOs.  Further, to make the equivalency relationships clear between the 024 $0, $1, and the base MARC Authority, an exact match relationship should be understood with $0 and a primary focus relationship should be understood with $1.  Such scoping of URIs in the 024 keeps equivalency relationships simple and clear, negating any need to introduce $4 in the MARC Authority 024.  For RDF conversion, it allows $0 URIs to reliably be considered a skos:exactMatch to the base Authority, and 024 $1 URIs to reliably be considered the foaf:focus of the base Authority.
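
One way a converter might apply this scoping is sketched below in Python. The function name, the subfield-to-property table, and the choice of the id.loc.gov URI as the base Authority URI are assumptions made for illustration; the subfield values are taken from example 4.4 of this paper.

```python
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"
FOAF_FOCUS = "http://xmlns.com/foaf/0.1/focus"

# Hypothetical lookup table: 024 subfield code -> RDF property it implies.
SUBFIELD_TO_PROPERTY = {"0": SKOS_EXACT_MATCH, "1": FOAF_FOCUS}


def field_024_to_ntriples(base_authority_uri, subfields):
    """Return N-Triples lines for the $0/$1 URIs of one 024 field.

    `subfields` is a list of (code, value) pairs. Codes other than $0 and $1
    (for example $a or $2) yield no triples in this sketch.
    """
    lines = []
    for code, value in subfields:
        prop = SUBFIELD_TO_PROPERTY.get(code)
        if prop is not None:
            lines.append(f"<{base_authority_uri}> <{prop}> <{value}> .")
    return lines


# Subfields from example 4.4; the base Authority URI is an assumption.
triples = field_024_to_ntriples(
    "http://id.loc.gov/authorities/names/n79034525",
    [
        ("a", "500010879"),
        ("0", "http://vocab.getty.edu/ulan/500010879"),
        ("1", "http://vocab.getty.edu/ulan/500010879-agent"),
        ("2", "gettyulan"),
    ],
)
for line in triples:
    print(line)
```

Because the relationship is fixed by the subfield code, the converter needs no $4 or other relationship data to choose the property.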

2.3.2. Change to Field Definition and Scope

While the MARC Authority 024 scope note indicates that “subfields are defined for consistency” with the 024 field in MARC Bibliographic, the two fields do not have identical definitions:

  1. MARC Authority 024: “Standard number or code associated with the entity named in the 1XX field which cannot be accommodated in another field (e.g., fields 020 (International Standard Book Number) and 022 (International Standard Serial Number)).”
  2. MARC Bibliographic 024: “Standard number or code published on an item which cannot be accommodated in another field (e.g., field 020 (International Standard Book Number), 022 (International Standard Serial Number) , and 027 (Standard Technical Report Number)).”

These differences are sufficient reason not to synchronize this proposal for the Authority 024 with other MARC format 024 fields. There is a vast difference between a URI that appears on a resource and a URI associated with the entity in the Authority 1XX field. The entity that a bibliographic record describes (and therefore what the 024 URI identifies) can be ambiguous because that record captures Work, Publication, Item, Agent, etc. information; however, the Authority record clearly describes just the entity named in the 1XX field. These two 024 fields are therefore not the same and do not have the same focus.

2.3.3. Changes to $2

With the expansion of 024 to include machine dereferenceable HTTP URIs in $0/$1, it is advisable to make a corresponding change in the definition of $2, which currently states:

$2 - Source of number or code (NR)
MARC code that identifies the source of the number or code. Used only when the first indicator contains value 7 (Source specified in subfield $2). Code from: Standard Identifier Source Codes.

This title and definition refer to the non-machine-actionable standard numbers or codes recorded in 024 $a; however, with the addition of 024 $0/$1, the source should be understood to refer to all the identifiers in a given 024 field. If the identifiers are from different sources, they should be recorded in separate 024 fields.

3. PROPOSED CHANGES


Proposed Changes to MARC Authority Field 024 for Recording URIs
(additions and changes are underlined):

024 - Other Standard Identifier (R)

First Indicator
Type of standard number or code
7 - Source specified in subfield $2
8 - Unspecified type of standard number or code

Second Indicator
Undefined
# - Undefined

Subfield Codes
$a - Standard number or code (NR)
$c - Terms of availability (NR)
$d - Additional codes following the standard number or code (NR)
$q - Qualifying information (R)
$z - Canceled/invalid standard number or code (R)
$0 - Authority record control number or standard number (NR)
$1 - Real World Object URI (NR)
$2 - Source of number or code (NR)
$6 - Linkage (NR)
$8 - Field link and sequence number (R)

FIELD DEFINITION AND SCOPE
Standard number or code or URI associated with the entity named in the 1XX field which cannot be accommodated in another field (e.g., fields 020 (International Standard Book Number) and 022 (International Standard Serial Number)).

The source of the standard number or code or URI is identified in subfield $2 (Source of number or code).  It is recommended to identify a source for standard numbers or codes in $a; however, indicating source of a URI in $0 or $1 is optional when there is no $a. When identifiers are from different source vocabularies they should be recorded in separate occurrences of the field.

Authority URIs recorded in the 024 $0 should reflect an exact match of the base MARC Authority, i.e., indicating a high degree of confidence that the concepts can be used interchangeably across a wide range of information retrieval applications. RWO URIs in $1 should reflect the focus, i.e., the underlying or 'focal' entity, of the base Authority.

SUBFIELD CODES
$0 - Authority record control number or standard number
A machine actionable and parseable URI that identifies a name or label for an entity.  When dereferenced, the URI points to information describing that name.  See description of this subfield in Appendix A: Control Subfields.

$1 - Real World Object URI
See description of this subfield in Appendix A: Control Subfields.

$2 - Source of number or code
MARC code that identifies the source of the identifiers.  Used only when the first indicator contains value 7 (Source specified in subfield $2).  Code from: Standard Identifier Source Codes.
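
The constraints above could be checked mechanically. The Python sketch below is one possible reading of them, not an official validation rule set: the function name and messages are hypothetical, and it assumes that a "machine dereferenceable HTTP URI" begins with http:// or https://.

```python
def check_024(ind1, subfields):
    """Return a list of problems found in one 024 field.

    `ind1` is the first indicator; `subfields` is a list of (code, value) pairs.
    """
    problems = []
    codes = [code for code, _ in subfields]

    # $a, $0, $1 and $2 are defined as non-repeatable (NR).
    for code in ("a", "0", "1", "2"):
        if codes.count(code) > 1:
            problems.append(f"${code} is not repeatable")

    # $2 is used only with first indicator 7, and value 7 promises a $2.
    if ind1 == "7" and "2" not in codes:
        problems.append("first indicator 7 requires $2")
    if ind1 != "7" and "2" in codes:
        problems.append("$2 is used only with first indicator 7")

    # Assumption: an HTTP URI starts with http:// or https://.
    for code, value in subfields:
        if code in ("0", "1") and not value.startswith(("http://", "https://")):
            problems.append(f"${code} should hold an HTTP URI")

    return problems


print(check_024("8", [("a", "Q762"), ("2", "wikidata")]))
```

A field that mixes identifiers from different sources cannot be detected this way, since a single $2 applies to every identifier in the field; that rule still depends on cataloger judgment.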

4. EXAMPLES

4.1. Associated standard numbers or codes with sources identified (current practice)

010 ## $a n  79034525
024 7# $a 24604287 $2 viaf
024 7# $a 500010879 $2 gettyulan
024 7# $a Q762 $2 wikidata

4.2. Standard number or code from VIAF with source identified and RWO URI from Wikidata, no source identified

010 ## $a n  79034525
024 7# $a 24604287 $2 viaf
024 8# $1 http://www.wikidata.org/entity/Q762

4.3. Authority and RWO URIs from id.loc.gov and RWO URI from Wikidata, no sources identified

010 ## $a n  79034525
024 8# $0 http://id.loc.gov/authorities/names/n79034525 $1 http://id.loc.gov/rwo/agents/n79034525
024 8# $1 http://www.wikidata.org/entity/Q762

4.4. Standard number or code, Authority URI, and RWO URI, all sourced from ULAN

010 ## $a n  79034525
024 7# $a 500010879 $0 http://vocab.getty.edu/ulan/500010879 $1 http://vocab.getty.edu/ulan/500010879-agent $2 gettyulan

4.5. Standard number or code and Authority URI from ULAN with source identified and Authority and RWO identifiers from id.loc.gov, no source identified

010 ## $a n  79034525
024 7# $a 500010879 $0 http://vocab.getty.edu/ulan/500010879 $2 gettyulan
024 8# $0 http://id.loc.gov/authorities/names/n79034525 $1 http://id.loc.gov/rwo/agents/n79034525

4.6. RWO URI, no source identified

010 ## $a n  79034525
024 8# $1 http://www.wikidata.org/entity/Q762
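
For readers who want to experiment with the examples above, the following Python sketch splits one display line into tag, indicators, and subfields. It handles only the textual display convention used in this paper (three-character tag, space, two indicators, space, then $-prefixed subfields), not a real MARC serialization such as ISO 2709 or MARCXML; the function name is hypothetical.

```python
import re

# A subfield is "$" plus a one-character code, whitespace, then a value that
# runs until the next " $x " marker or the end of the line.
SUBFIELD = re.compile(r"\$([a-z0-9])\s+(.*?)(?=\s+\$[a-z0-9]\s|$)")


def parse_display_line(line):
    """Split a line such as '024 8# $1 http://...' into its parts."""
    tag, indicators, rest = line[:3], line[4:6], line[7:]
    return tag, indicators, SUBFIELD.findall(rest)


# The first 024 line of example 4.2.
tag, indicators, subfields = parse_display_line("024 7# $a 24604287 $2 viaf")
print(tag, indicators, subfields)
```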

5. BIBFRAME DISCUSSION

The recommendations proposed in this document will facilitate the conversion of MARC records to BIBFRAME. The distinction between Thing and Authority URIs in subfields $1 and $0, respectively, is consistent with the BIBFRAME 2.0 model and with previous changes to the MARC format.

6. SUMMARY OF PROPOSED CHANGES

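Make the following changes to field 024 in the MARC 21 Authority Format (see section 3 above for full description):

  1. Define subfield $0 (Authority record control number or standard number) (NR).
  2. Define subfield $1 (Real World Object URI) (NR).
  3. Revise the field definition and scope to include URIs.
  4. Revise the description of subfield $2 (Source of number or code) so that it refers to the source of all identifiers in the field.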


(03/29/2019)