Crossing a Digital Divide:
AACR2 and Unaddressed Problems of Networked Resources

Comments by Glenn Patton
Presented at the Library of Congress Bicentennial Conference on Bibliographic Control in the New Millennium
November 15-17, 2000

Final version

Many of you, I'm sure, remember this famous exchange from the 1967 movie, The Graduate, where Mr. McGuire (played by Walter Brooke) completely confuses Ben Braddock (played by Dustin Hoffman) with this cryptic conversation:

Mr. McGuire: Come with me for a minute. Ben - I just want to say one word to you - just one word.
Ben Braddock: Yes, sir.
Mr. McGuire: Are you listening?
Ben Braddock: Yes, I am.
Mr. McGuire (gravely): "Plastics."[1]

Inflation being what it has been over the past 30 years, it shouldn't be surprising that there might now be more than one word that I want to share with you without, I hope, similarly confusing you.

Actually, I have three words related to topics in Matthew Beacom's paper that I would like to spend my brief time on this afternoon: "published," "hierarchies," and "granularity."

I want to make some comments about each and ask a few questions in hopes that these may stimulate some discussions during both the general sessions and in the small group discussions.

Is there a rationale for considering all networked resources published?

Beacom notes in his paper that "some have argued that we should treat everything on the Internet as published." As many of you are aware, that idea first surfaced in Nancy Olson's Cataloging Internet Resources,[2] prepared for OCLC as part of two Internet cataloging projects. It was subsequently included in provisions of the International Standard Bibliographic Description for Electronic Resources (ISBD(ER)).[3]

From my perspective as someone who participated in the decision to include this in the OCLC guidelines, part of the reason for doing so was the view that, as Beacom notes, "the Internet makes it so easy 'to place materials before the public'" (one of the dictionary definitions of "publish"). However, there was also a pragmatic aspect to the recommendation. Over the years since high-quality photocopiers and laser printers became prevalent, my OCLC colleagues and I had spent what seemed like an inordinate amount of time helping catalogers decide whether "borderline" publications (like genealogies, local histories, and other local publications) were really "published." Much of this seemed to fall into the category of "unproductive" dithering that didn't, in the end, make any significant difference in access to the materials. At least in part, the guideline was designed to make that a moot point for similar electronic resources, a pragmatic view that I still share.

What does "publication" mean as we move from "find" to "identify" to "select" to "acquire/obtain access to/use"?

Matthew Beacom raises an excellent question when he asks, "What does 'published' mean on the Internet?" That is a question that we need to consider not only as catalogers but also in relation to other uses of bibliographic data. For example, we all know that "publisher" plays a significant role in the selection process. Think of all those approval plans that bring in all the publications of a particular publisher on a particular subject. They're set up that way because of the reputation of the publisher. Is there a parallel for networked resources?

Whole <--> part relationships in the networked environment

Moving on to "hierarchies" and linking, Bernhard Eversberg, one of our German colleagues who's participated in the list discussions, has raised the issue of whole/part relationships and how U.S. cataloging practice for many kinds of multi-part items is an impediment to sharing data internationally.[4]

Networked resources offer new possibilities for linking and we need to explore the potential for linking different types of records together, perhaps linking bibliographic descriptions at the collection level to other types of metadata for individual items in that collection.

A shift from "passive" to "active and immediate" hierarchical relationships?

It also seems to me that we are seeing a shift from relatively passive relationships, such as those expressed in print publications by "series title-pages" that give information about other volumes in the series or even by lists of works by the same author, to much more immediate and "in your face" relationships, such as a page where the user is exposed not only to the table of contents for an issue of the electronic journal IMF Staff Papers but also to basically all of the information that is available at the International Monetary Fund's web site.[5]

Are we seeing a return to the 19th-century mixed catalog?

The third word for today is "granularity." Matthew Beacom mentions the possibility of a return to article-level cataloging. In a posting to the BIBCONTROL list, Pauline Cochrane reminds us that, in the 19th century, library catalogs sometimes contained journal article indexing (before we gave all that over to the commercial indexing and abstracting services).[6] As a result, Chapter 13 of AACR2, and its equivalents in previous versions of the rules, are among the least used portions of those rules.

It also seems clear that, in addition to the e-journal aggregations and article databases that seem to be transforming journal publishing, much of what is available on the Internet shares characteristics of "essays in a collection" or "chapters in a larger work," to mention only a couple of other targets for traditional In-analytics.

At what level of granularity are CORC participants creating records?

One thing that has become obvious in working with the CORC project participants is the potential need for guidance about the appropriate level at which to describe a networked resource. Do you describe only at the "site" level or at a level below that -- a subsite that forms some kind of logical unit -- or at the individual item level, be that a paper or article, an image, or some other kind of resource?

To aid in looking at this issue, my colleague, Chandra Prabha of the OCLC Office of Research, has been examining a set of CORC resource descriptions created during the period from July 1999 through June 2000. One of the characteristics that she has looked at is the granularity of the cataloging unit. Preliminary analysis indicates that 60% of the descriptions represent something that appears to be a "whole" item, while 33% represent something that is a part of a larger whole, with the remaining 7% falling into a gray area that cannot be easily categorized.

This issue is very much involved with the question of "how can we ever hope to control something so vast and changeable as the Web" and I hope that one of the outcomes of this conference might be the beginning of some guidance on the issue of cataloging granularity. I think we all understand the idea that we're not cataloging "every takeout menu and place mat," as Robin Wendler noted in her comments, but catalogers need some help determining what it is that they are or should be cataloging.

A Parting Thought

I ran across a quotation in a recent issue of The Economist that made me think about the current state of cataloging for networked resources:

"Everything that can be invented has been invented." With these sweeping words, the Commissioner of the United States Office of Patents recommended in 1899 that his office be abolished, so spectacular had been the wave of innovation in the late 19th century.[7]

Beacom's 12 recommendations make it clear that "everything that can be cataloged has not been cataloged." Action on these recommendations would give us the consistency and the flexibility to handle networked resources ... and whatever else is lurking around the corner ... and, in the process, keep the library and the cataloger in the center of a networked environment.

