Annual Report

of the

Bibliographic Enrichment Advisory Team (BEAT)

December 1994

Many of the activities described in this report were made possible by the generous financial support of the Edward Lowe Foundation.

1994 was the second full year of BEAT operations. The group, originally constituted at the end of 1992, continued to develop tools to aid catalogers and searchers in creating and locating information, sought to enrich the content of Library of Congress bibliographic records as well as improve access to the data the records contain, and conducted further research and development in areas that can contribute to furthering these efforts. Some of the Team's work sustained and continued initiatives begun in 1993, while other efforts set new directions for activity.

The Team's efforts in 1994 were directed primarily at three areas: Library of Congress Classification, Subject Headings and subject terminology, and bibliographic enrichment. A fourth area, research and development, pursued initiatives mainly in content enrichment, catalogers' tools, and new data access techniques. At the end of 1994 the Team conducted a status review and formulated plans for 1995 and beyond. A summary of activity in these categories, covering the specific work done and the current status of each, is presented below, with primary emphasis on the first three areas of the Team's efforts. Other related BEAT activities are also noted.

Library of Congress Classification

By the end of 1993 almost all of the work of converting the Library of Congress Classification (LCC) schedules selected by BEAT to machine readable form had been completed. The proofing of these schedules was completed in 1994 (proofing of Class "T" [Technology] is being completed as this is being written). Those schedules that are self-contained (Class "Z" [Bibliography], Class "J" [politics and government, political science], and shortly Class "T") have been or will be forwarded to the Library's Cataloging Distribution Service (CDS) in preparation for publishing and for making them available to CDS subscribers worldwide.

BEAT Team project chair Gabe Horchler, with the assistance of Business and Economics Cataloging Team members Jim McGovern and Fred Augustyn, conducted a complete revision of the schedules for classes J-JV (politics, government, political science) before having them converted to on-line form, work that improved the quality and heightened the currency of these schedules.

All the schedules done by BEAT, both full and partial, have been added to the Library's master database and can now be accessed by staff on-line. Notably, these BEAT efforts with class schedules have now been merged into the regular LCC maintenance and workflow of the Library's Cataloging Policy and Support Office, and have thus become part of the Library's ongoing effort in on-line classification.

BEAT acquired a license for the multi-user version of Minaret, the database program used to create classification records. This will aid 1995 efforts in classification research by giving the Team access to the program and to various databases as needed. Access has been established and some test data has been created.

One effort that started during the year and will continue into 1995 concerns expanding the access terminology that classification records provide by drawing on the subject vocabulary of the Library of Congress Subject Headings. The Team identified approximately 7,000 Subject Headings entries with reference to classification in Class T. As the proofing of Class T is completed, it is expected that Barbara Biebrich of the Arts and Sciences Cataloging Division will spearhead efforts (in conjunction with the Cataloging Policy and Support Office) to add relevant subject headings to the appropriate classification in Class T, thus providing a direct link between the vocabularies used in both systems. This effort is a significant expansion of the work of adding index terms to classification records that is being done by the Business and Economics Cataloging Team. The idea has proved attractive enough that BEAT has elected, as one of its 1995 activities, to extend the concept to Classes H-HJ.

Of particular significance here is that BEAT R&D efforts such as these have proved workable and of sufficient value that they now merit continued and expanded focus in additional BEAT projects, or warrant migration from the BEAT R&D arena to the realm of ongoing and regular LC projects. It is quite likely that such enrichment of classification will go forward with renewed interest as more staff come to have access to on-line classification.

Library of Congress Subject Headings

The major BEAT activity focused on adding access vocabulary of relevance to small business and entrepreneurship to the Library's Subject Headings List. That project was completed, with almost 900 new terms or cross-references having been incorporated into LCSH as a result of the Team's activity. BEAT produced a list of the specific headings and references, which has been provided to the Library's Business Reference Team for use in reference work. Although this work was conducted under BEAT auspices, it was done largely by the professional subject specialist staff of the Cataloging Directorate's Business and Economics Cataloging Team, which suggests that similar structured efforts could be undertaken in the future by specialists in other subject areas as well. Because of its success, this project too will see its application expanded to another subject area in 1995.

As an adjunct to the table of contents (TOC) work in progress and as an integral component of future TOC implementation, the Business and Economics Cataloging Team continued to flag important new works in business received by the team for cataloging and produced photocopies of their tables of contents for scanning. This selection also underscores the integrated nature of such BEAT efforts: the work is now carried out within the regular work of the team, and the information concerning the selected titles is also shared with the Library's Business reference staff.

Bibliographic Enrichment: Abstracts

The Team realized a long-desired objective of the 1993 Table of Contents (TOC) initiatives by incorporating summaries describing the nature and coverage of approximately 100 serial titles from among the works selected for inclusion in a bibliography, the Entrepreneur's Reference Guide To Small Business Information, compiled as a result of a Business Research Project initiative. These descriptions were added to the 520 field of the cataloging records for the serials they describe.
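The step of adding a summary to the 520 field can be pictured with a short sketch. It is an illustration only, not the routine BEAT used: the Field and Record classes and the add_summary function are hypothetical names, and the record is modeled as a simple in-memory list rather than a real MARC file. The one fact carried over from the report is that the summary goes in field 520; placing the text in subfield a and keeping fields in tag order are assumptions based on common MARC practice.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """Hypothetical stand-in for one variable field of a MARC record."""
    tag: str
    indicators: str
    subfields: list[tuple[str, str]]


@dataclass
class Record:
    """Hypothetical stand-in for a bibliographic record."""
    fields: list[Field] = field(default_factory=list)


def add_summary(record: Record, summary: str) -> Field:
    """Add a 520 (summary) field, keeping the fields in tag order."""
    text = " ".join(summary.split())  # collapse line breaks and extra spaces
    if not text:
        raise ValueError("summary is empty")
    new_field = Field("520", "  ", [("a", text)])

    # Insert before the first field whose tag sorts after 520.
    position = len(record.fields)
    for i, existing in enumerate(record.fields):
        if existing.tag > "520":
            position = i
            break
    record.fields.insert(position, new_field)
    return new_field
```

Called on a record that already holds a title field and a subject field, add_summary places the new 520 between them and leaves the existing fields untouched.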

Bibliographic Enrichment: Tables of Contents

BEAT acquired equipment (a scanner and a personal computer) and software (with imaging and optical character recognition (OCR) capability) with which to conduct an experiment to create machine readable data from photocopies of Tables of Contents from business books, which in turn would be added to the bibliographic record for the book. The objective was to develop an automated model to compare with baseline work done in a manual mode in 1993. The work to automate the efficient creation of reliable and accurate digitized text from hard copy proved very difficult, with the result that it took most of the year to construct a work flow and scanning and editing routines robust enough to merit testing and demonstration. A group of test TOC materials has been assembled, and documentation for the procedures -- both for scanning and editing -- has been prepared. The project is now at the staff-recruitment and training stage.
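The editing stage described above, turning raw OCR output into text fit for a bibliographic record, might look roughly like the following sketch. It is not BEAT's routine, which the report does not describe in detail. The function names are hypothetical, and the sketch assumes that each OCR line holds one chapter title followed by a dot leader or a run of spaces and a page number. Joining the entries with " -- ", as is customary in a MARC 505 contents note, is also an assumption; the report says only that the data was added to the bibliographic record.

```python
import re

# Matches a dot leader (two or more dots, possibly spaced out) or a run of
# two or more spaces, followed by a page number at the end of the line.
_LEADER_AND_PAGE = re.compile(r"(?:\s*\.){2,}\s*\d+\s*$|\s{2,}\d+\s*$")


def clean_toc_line(line: str) -> str:
    """Strip the leader and page number, then normalize spacing."""
    without_page = _LEADER_AND_PAGE.sub("", line)
    return " ".join(without_page.split())


def toc_to_contents_note(lines: list[str]) -> str:
    """Join the non-empty cleaned entries into one contents string."""
    entries = [clean_toc_line(line) for line in lines]
    return " -- ".join(entry for entry in entries if entry)
```

A number that is part of the title, such as a year, is left alone as long as it is not preceded by a leader or a wide gap. Real OCR output would need far more correction than this, which is consistent with the difficulty the Team reports.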

Though a production level routine is not yet in place, the Team's TOC work during the year did yield a number of useful scanning, editing, and automated data entry routines for the production of MARC records. Several of these are in fact being used elsewhere in the Cataloging Directorate to speed, or even to make possible, the creation of data that LC can use in its name authority or bibliographic records. Three of the more significant beneficiaries of these efforts are the Overseas Data Entry programs, SERLOC (Serials Location work), and the methodology developed for an international cooperative effort -- the creation of name authority records from hard copy data received from the British Library.

Bibliographic Enrichment: Research and Development

BEAT demonstrated the linking of citations (from the aforementioned Entrepreneur's Reference Guide) to MARC record surrogates as well as to images of actual title pages and tables of contents for sample citations. Using a World Wide Web server set up by BEAT team members, all the citations in the bibliography were linked to a text surrogate for the full bibliographic record representing the cataloging for the work cited. For an additional demonstration, several links were also made to images showing the actual title pages or tables of contents of works cited in the publication. The result of this R&D is available for viewing on-line at LC and can also be accessed through the Internet.
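The linking of citations to record surrogates can be illustrated with a small sketch that builds an HTML list in which each citation points to a text file holding its record. This is an assumption-laden illustration, not a description of the BEAT server: the function name, the record identifiers, the base address, and the one-file-per-record layout ending in ".txt" are all hypothetical.

```python
from html import escape


def citation_links(citations: list[tuple[str, str]], base_url: str) -> str:
    """Build an HTML list linking each citation to its record surrogate.

    Each item in `citations` is (record_id, citation_text). The URL layout
    (base_url + record_id + ".txt") is a made-up convention for this sketch.
    """
    items = []
    for record_id, citation in citations:
        href = f"{base_url.rstrip('/')}/{record_id}.txt"
        items.append(
            f'<li><a href="{escape(href, quote=True)}">{escape(citation)}</a></li>'
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Links to page images could be produced the same way by pointing the address at an image file instead of a text file.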

In addition, a copy of the bibliography was converted to the BookManager format, with extensive hypertext, indexing, and retrieval built in. This publication is available on-line to anyone at the Library of Congress who wishes to access it. The Library's systems office (ITS) is also exploring ways to make this electronic text available on the Internet while retaining the full hypertext links and searching capabilities that characterize this version of the Entrepreneur's Reference Guide.

In the course of project work this year BEAT also found other areas where these R&D efforts could make practical and productive improvements in operations or could facilitate the enrichment of data or access to it. Accordingly, the Team's 1995 plan includes expansion of R&D and new initiatives as a direct outgrowth of the 1994 experiences. These include additional work to improve the quality of information contained in Serial records, improvement of terminology in Public Finance, enhancement of classification terminology in the classes that relate to Business and Economics (H-HJ), and a dynamic process for access to tables of contents or other types of information for cataloged materials in the Library's collections. The 1995 plan has been submitted to the Business Research Project.

Electronic CIP

Building upon the techniques developed by BEAT in what we have called the Text Capture and Electronic Conversion projects (TCEC), the Directorate's Electronic CIP experiment is incorporating table of contents data from books submitted in electronic form from publishers participating in the experiment. These books are cataloged electronically, and when the structure, content, or nature of the tables of contents is appropriate, this data is included in the catalog record. Thus far, about half of the approximately 100 titles submitted electronically have had TOC data added to their MARC records.

Business Research Project Support; BEAT Outreach

BEAT staff regularly attended planning and coordinating meetings of the Business Research Project, which has funded and supported the activities and initiatives of the BEAT Team since its inception. During the year BEAT provided document review, descriptive information, and demonstrations of some of its work for BRP visitors and other staff.

The history and current activities of the BEAT Team were presented to a meeting of the Law Classification Advisory Committee held at LC. Documents describing BEAT initiatives and projects were prepared and distributed to LC staff through the Collections Services Newsletter, and to the American library community through documents prepared for the Director and subsequently distributed at the meetings of the American Library Association.

The chair of BEAT, John Byrum, usually accompanied by his senior technical staff (Robert August, Richard Thaxter, and David Williamson), made presentations on BEAT activities to the Cataloging Council and the Directorate's Catalog Management Team. Related demonstrations of various TCEC techniques were made to the Librarian of Congress and senior Library management staff, to the Service Unit's Training Office, and to a wide variety of staff as well as many visitors.

The Team itself continued to meet bi-weekly to exchange information and to report on the progress or status of work under way. Minutes from these meetings were widely circulated throughout the Library, reflecting the representation of many key Library units in the BEAT membership.