NAME: Redefinition of subfield $q (File transfer mode) in field 856 of the USMARC formats
SOURCE: OCLC Metadata Workshops; Library of Congress
SUMMARY: This paper proposes a redefinition of subfield $q (File transfer mode) to Electronic format type. This would result in recording the type of file, or MIME type, in subfield $q, instead of the current definition, which requires recording "ASCII" or "binary" to indicate what mode of transfer is necessary.
KEYWORDS: Field 856 (Bibliographic/Holdings/Classification/Community Information); Electronic Location and Access; Subfield $q, in field 856 [Bibliographic/Holdings/Classification/Community Information]; File transfer mode; Electronic format type
RELATED: 96-1 (January 1996); DP99 (February 1997)
5/1/97 - Forwarded to USMARC Advisory Group for discussion at the 1997 Annual MARBI meetings.
6/29/97 - Results of USMARC Advisory Group discussion - Approved. An error in the original proposal was corrected to state that subfield $q would remain non-repeatable.
8/21/97 - Result of final LC review - Approved.
PROPOSAL NO. 97-8: Redefinition of Subfield $q (File transfer mode) in field 856

1. BACKGROUND

Field 856 (Electronic Location and Access) was initially developed and approved by the USMARC Advisory Group in January 1993. At that time, the Internet Engineering Task Force was finalizing the draft standard for a locator, the Uniform Resource Locator (URL). During discussions of field 856, participants agreed that the field should enable a system to create a "hot link" to allow for the transfer of a file, the connection to another host, or the initiation of an email message through information recorded in the field. If the resource described in the record was available by telnet, the information should enable a connection; if the resource was available by email, it should enable the initiation of an email message; if by FTP, it should enable the transfer of a file. One piece of information that participants deemed necessary for FTP was whether the file is transferred as ASCII or binary. Thus subfield $q was defined as File transfer mode.

Proposal No. 96-1 (Changes to Field 856 (Electronic Location and Access) in the USMARC Formats) was presented to the USMARC Advisory Group for discussion at the January 1996 MARBI meetings. One proposed change was to broaden the definition of subfield $q to include file format type, since the subfield had not been widely used and file format is related to file transfer mode. OCLC confirmed that the subfield had been rarely or never used in the INTERCAT database. However, the proposal was not approved for the following reasons: 1) concern was expressed that the need for file format to be explicit may be a temporary situation, and that in the future files may become more self-defining; 2) it was suggested that it would be better to wait and see if this change is still needed in the future, since no specific need had been demonstrated.

2. DISCUSSION

In the past few years, the availability of all types of resources over the Internet has exploded. The World Wide Web, which was only under development when enhancements were made to MARC to accommodate description of Internet resources, now allows for the integration of multimedia resources. Software that is necessary for display of digitized images or playing of digital audio files is activated depending upon the file format. Often the file extension indicates the type of file and determines whether it is transferred in binary or ASCII mode (ASCII is the default; all other types of files are transferred using binary). The specification of whether a file is transferred as ASCII or binary was not included in the URL standard; such information may be assumed from the file extension.

In creating MARC records for Internet resources, catalogers have been confused about where to include information about file format. Field 516 (Type of file) is a note field containing nature and scope information about the file described. In some cases this information has been combined in the field with file format (e.g. "Electronic journal in ASCII format"). In other cases, field 538 (System Details Note) has been used, since requirements for processing the file are dependent upon the type of compression used or file format type.

File format is a data element included in the Dublin Core, a list of core data elements needed for Internet resource discovery and retrieval. This list was developed by a wide range of participants at four different workshops convened between March 1995 and March 1997. It is also an element in the Government Information Locator Service (GILS) profile as Available Linkage Type, which is a subelement of Available Linkage (which maps to field 856). In the mapping of the Dublin Core elements to MARC, field 538 was used for Format (see Discussion Paper No. 99: Metadata, Dublin Core and USMARC: a review of current efforts).
However, this mapping is not entirely adequate, since field 538 is a note that can contain information other than file format, and since file format has also been recorded in other MARC fields. A revised version of the mapping, which looked at equivalent data elements in GILS and included them in the Dublin Core mapping, was recently made available and maps the element to field 856$q, for lack of a better equivalent data element (http://www.loc.gov/marc/dccross.html).

If a subfield were defined in field 856 for file format, then the information could be given at the level of the location, rather than for the intellectual work as a whole. Using field 538 does not relate the file format type to the location. In recent discussions of whether separate records need to be created for different file formats, the majority of respondents have endorsed using one record for the intellectual work and repeating 856 fields for different file formats. Recording such information within field 856 would allow for the file format to be associated with a particular file at a particular host. However, other note fields would still be available for recording file format if this were desirable.

The term "electronic format" is suggested, since the term "file" might not be appropriate in all situations. Electronic format type is often referred to as "Internet Media Type" (IMT), a new name for "MIME type". An Internet Request for Comments (RFC2046) entitled "Multipurpose Internet Mail Extensions (MIME) Part II: Media Types" (replacing RFC1521, "MIME (Multipurpose Internet Mail Extensions)") defines the general structure of the MIME media typing system and defines an initial set of media types. It includes content types and subtypes and uses the Internet Assigned Numbers Authority (IANA) as a central registry for specific values. Another document, RFC2048, specifies IANA registration procedures for media types.
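
The type/subtype structure of an Internet Media Type can be illustrated with a short Python sketch. It is not part of the proposal. The function names `split_media_type` and `guess_transfer_mode` are hypothetical, the set of top-level types is limited to those that appear in Attachment A, and the rule that only text/* implies ASCII transfer is an assumption made for illustration. Values that are not in type/subtype form are left unparsed rather than rejected, since the proposal does not restrict $q to registered values.

```python
from typing import Optional, Tuple

# Top-level types that appear in Attachment A of this proposal.
# This is an illustrative set, not the full IANA registry.
KNOWN_TOP_LEVEL_TYPES = {
    "application", "audio", "image", "message",
    "model", "multipart", "text", "video",
}


def split_media_type(value: str) -> Optional[Tuple[str, str]]:
    """Return (type, subtype) if value looks like type/subtype, else None."""
    parts = value.strip().lower().split("/")
    if len(parts) != 2:
        return None  # free text such as "ASCII" is allowed, just not parsed
    top, sub = parts
    if top not in KNOWN_TOP_LEVEL_TYPES or not sub:
        return None
    return top, sub


def guess_transfer_mode(value: str) -> str:
    """Assumed rule: text/* transfers as ASCII, anything else as binary."""
    parsed = split_media_type(value)
    if parsed is not None and parsed[0] == "text":
        return "ASCII"
    return "binary"


print(split_media_type("text/html"))
print(guess_transfer_mode("application/pdf"))
```

Under these assumptions, a system could derive the old File transfer mode value from the new Electronic format type value whenever the latter is given in type/subtype form.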
The National Digital Library Federation's Making of America II Project formed a task group to look at architecture issues in its digitization project. The group identified a need to indicate electronic format type in the metadata provided for complex digital objects. In addition, when Uniform Resource Names (URNs) rather than URLs are in more widespread use, the format type will no longer be implicit in the file name recorded as part of the URL.

If subfield $q were redefined as Electronic format type, it may be desirable to record the data in a standard form, using those file format types registered with IANA. However, the data should not be restricted or controlled to using this form, since neither the Dublin Core element set nor the GILS profile requires a standardized use, and thus both allow for free text.

Subfield $q should not be repeatable, since it is difficult to envision a situation where all the other information in the field (especially the URL) would apply to different electronic file formats. If more detailed information is needed to accommodate multiple formats, this information would be given in subfield $z, as for electronic journals available by email subscription that make more than one format available. Subfield $q might be used by a system as a clue to how it deals with the object, and it would only be confusing to repeat the subfield. Thus, recording more than one electronic format requires repeating field 856.

Example:

  130 0#$aEmerging infectious diseases (Online)
  245 00$aEmerging infectious diseases$b[computer file]
  260 ##$aAtlanta, GA$bNational Center for Infectious Diseases$bCenters for Disease Control and Prevention,$c[1995-
  516 8#$aASCII, Acrobat, and PostScript file formats
  530 ##$aOnline version of: Emerging infectious diseases (Print).
  776 1#$tEmerging infectious diseases (print)$x1080-6040$w(DLC) 96648093 $2 (OCoLC) 31848353
  856 00$umailto:lists#list.cdc.gov$isubscribe$fEID-*$zInclude desired file format following the hyphen in the filename: EID-ASCII, EID-PDF, or EID-PS
  856 10$aftp.cdc.gov$dpub/EID$lanonymous$zEach issue is in a separate subdirectory (e.g. vol1no1). There are additional subdirectories for each file format
  856 40$uhttp://www.cdc.gov/ncidod/EID/eid.htm$qtext/html

4. PROPOSED CHANGES

The following is presented for consideration:

  * In the USMARC Bibliographic/Holdings/Classification/Community Information Formats, redefine subfield $q (File transfer mode) as Electronic format type; the subfield remains non-repeatable.

------------------------------------------------------------------

ATTACHMENT A

Examples of Internet Media Types

Following are examples of Internet Media Types (MIME) and their subtypes that are registered with IANA (this is not a comprehensive list). They would be expressed as type/subtype, e.g. application/msword; text/html. For additional information see: ftp://ftp.isi.edu/in-notes/iana/assignments/media-types

  Types/Subtypes                File extension (where available)
  application/oda               oda
  application/pdf               pdf
  application/postscript        ai eps ps
  application/octet-stream      bin
  application/x-powerpoint      ppt
  application/wordperfect5.1    wp
  application/zip               zip
  audio/basic                   au snd
  audio/x-aiff                  aif aiff aifc
  audio/x-wav                   wav
  image/gif                     gif
  image/jpeg                    jpeg jpg jpe jif
  image/tiff                    tiff tif
  image/x-portable-bitmap       pbm
  image/x-cmu-raster            ras
  message/http
  message/rfc822
  message/news
  model/iges
  model/vrml
  multipart/encrypted
  multipart/mixed
  text/html                     html
  text/plain                    txt
  text/x-sgml                   sgml sgm
  video/mpeg                    mpeg mpg mpe
  video/quicktime               qt mov
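
The extension list in Attachment A could serve as a hint when supplying a value for subfield $q. The following Python sketch is illustrative only and is not part of the proposal. The names `EXTENSION_TO_MEDIA_TYPE`, `media_type_for_url` and `field_856` are hypothetical, the lookup table copies only a few rows from Attachment A, the default indicators "40" are taken from the HTTP example in this proposal, and www.example.org is a placeholder host. Because the lookup is only a hint, an unknown extension yields no $q rather than a guessed value.

```python
import posixpath
from typing import Optional
from urllib.parse import urlparse

# A few rows copied from Attachment A (file extension -> type/subtype).
EXTENSION_TO_MEDIA_TYPE = {
    "pdf": "application/pdf",
    "ps": "application/postscript",
    "zip": "application/zip",
    "gif": "image/gif",
    "jpg": "image/jpeg",
    "jpeg": "image/jpeg",
    "tif": "image/tiff",
    "html": "text/html",
    "txt": "text/plain",
    "sgml": "text/x-sgml",
    "mpg": "video/mpeg",
    "mov": "video/quicktime",
}


def media_type_for_url(url: str) -> Optional[str]:
    """Look up a media type from the file extension in the URL path."""
    path = urlparse(url).path
    ext = posixpath.splitext(path)[1].lstrip(".").lower()
    return EXTENSION_TO_MEDIA_TYPE.get(ext)


def field_856(url: str, indicators: str = "40") -> str:
    """Build an 856 field string, adding at most one $q when a type is known."""
    field = f"856 {indicators}$u{url}"
    media_type = media_type_for_url(url)
    if media_type is not None:
        field += f"$q{media_type}"
    return field


print(field_856("http://www.example.org/docs/report.pdf"))
print(field_856("http://www.cdc.gov/ncidod/EID/eid.htm"))
```

The second call produces a field without $q, because the extension "htm" is not listed in Attachment A. In such a case the cataloger would supply the value directly, as the example record in this proposal does with text/html.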