Seminar on Cataloging Digital Documents

Sarah E. Thomas, Ph.D.

Director for Cataloging

More than 75 North American librarians attended a seminar on cataloging digital documents coordinated by the Library of Congress on October 12-14. The seminar, which focused on the emerging area of providing access to electronic texts and other media, commenced with a field trip to the University of Virginia in Charlottesville, where the University Library hosted the participants and organized tours of five centers actively engaged in the creation and use of electronic data, as well as a session on the cataloging and organization of electronic materials. Included in the tour were the Electronic Text Center, the Digital Image Center, the Social Science Data Center and Geographic Information Systems Lab, the Music and Special Collections centers, and the Institute for Advanced Technology in the Humanities. Visitors saw firsthand some of the equipment used for capturing and viewing the data and, more importantly, heard center experts describe the philosophy of their efforts and some of the challenges they face. Among the most critical issues are standards for digitizing, access issues, and methods of conversion. The centers conducted a vigorous outreach campaign to make faculty and students aware of the potential of electronic texts, images, maps, and music for augmenting classroom teaching and advancing understanding and knowledge. Center staff encouraged researchers to create their own interactive digital texts and to assist in the conversion process. An important element in the provision of access to the digital materials was the TEI (Text Encoding Initiative) header, a convention for embedding codified information about the digitized text in the document itself. Edward Gaynor, Head of Original Cataloging at the University of Virginia, facilitated the visit to the electronic centers, which, for many participants, was the highlight of the seminar.

On the second day of the seminar, attendees convened at the Library of Congress to hear a series of presentations by individuals representing a broad spectrum of perspectives about how librarians and others should provide access to digital materials. Susan Hockey, Director of the Center for Electronic Texts in the Humanities, laid the foundation for future discussion with an exposition of TEI and SGML (Standard Generalized Markup Language). SGML is a metalanguage for defining markup, and the TEI guidelines apply it to offer a means of conveying information about the texts in which the markup is contained. Following Hockey was LC's Carl Fleischhauer, who described some of the difficulties inherent in using MARC bibliographic records as a basis for indexing materials in the American Memory project. Lynn Marko, University of Michigan Library, explored the changing role of the cataloger in the electronic environment, warning that if catalogers were not alert to the new potential to provide access to bibliographic materials in innovative ways, they might find themselves as superfluous as did members of the ice industry in New England following the introduction of refrigeration. Marko drew the attention of the audience to the lessons in business professor James M. Utterback's book Mastering the Dynamics of Innovation, in which he examines the demise of this enterprise in detail.

In the evolving field of electronic cataloging, there is a clear need for standards and guidelines. Joan Swanekamp, Head of Original and Special Materials Cataloging at Columbia University, discussed the Guidelines for Bibliographic Description of Interactive Multimedia (Chicago: ALA, 1994), recently published by the American Library Association, and their utility for catalogers working with such materials. This extension of traditional cataloging is a candidate for a new chapter in the Anglo-American Cataloguing Rules, or for incorporation in revisions of either Chapter 9 (computer files) or other appropriate chapters. Also building on tradition, in this case that of the MARC format, AACR2, and LCSH, was a demonstration of Text Capture and Electronic Conversion (TCEC), a program developed by Richard Thaxter and David Williamson at the Library of Congress. TCEC enables a cataloger to retrieve electronic texts via the Internet and, using the power of the bibliographic workstation and OS/2, to cut, paste, and quickly convert data to the MARC format. These procedures vastly reduce keystrokes, increase the accuracy of transcription, and eliminate much of the clerical work involved in creating a bibliographic record.

Edward Gaynor next engaged the participants in an exercise of cataloging an electronic image, with many present discovering to their chagrin that the task was more difficult than they had imagined. Numerous choices for main entry led some to question the value of adhering to this practice for electronic materials. Should a photograph of a Palladian villa be entered under Palladio, the photographer, the creator of the digital image, or the villa? The problem confounded the group and underscored the need for more documentation and training. Closing out the afternoon was Diane Vizine-Goetz, Consulting Research Scientist, OCLC, who described an OCLC project to catalog Internet resources and the incidence of bibliographic records for electronic resources in the Online Union Catalog. David Bearman, Archives and Museum Informatics, challenged the audience to think critically about the application of traditional bibliographic control to digital documents, strongly suggesting that new technologies bring with them a means of obviating cataloging as it is practiced today.

On Friday, October 14, following intensive small group discussion, critical issues for an action plan emerged. Because there is as yet no consensus concerning how best to provide access to electronic materials, participants called for increased communication and analysis, including an email network, a videoconference, white papers on specific topics, and possible follow-up ALCTS (Association for Library Collections and Technical Services) regional institutes. Another recommendation called for the mapping of SGML to MARC and vice versa. With changing modes of publication and distribution, there is a need to explore whether cataloging these items using MARC and traditional bibliographic description is sufficient or whether other methods of intellectual access will be essential. One certainty is that it is necessary to draw colleagues in public service and collection development areas into the discussion. OCLC, which helped sponsor the seminar, is embarking on a pilot to catalog Internet resources. OCLC seeks collaborators to participate in the project, which is funded by the Department of Education and will help provide access to electronic resources while examining issues of maintenance and utility. Many libraries, including the Library of Congress, expressed interest in cooperating in this endeavor. The seminar participants benefited from the exposure to many new ideas and perspectives, and developed a network of individuals and institutions with which they share a common interest, one that will endure as they strive to prepare themselves for organizing digital resources.

The Seminar on Cataloging Digital Documents is one in a series of meetings the Library of Congress is sponsoring on aspects of the digital library. The Cataloging Directorate will make the proceedings available through MARVEL and over the World Wide Web in December.