The EAD Tag Library(8) identifies the names and definitions of all EAD data elements defined in the DTD. These EAD Application Guidelines provide interpretative guidance to enable archivists to apply the DTD accurately and effectively when encoding their repositories' finding aids. These three documents (the DTD, the Tag Library, and the Application Guidelines) together comprise complete documentation for EAD Version 1.0.
This first chapter places EAD within the broader context of other archival descriptive standards and explains the choice of SGML as its technical environment. It emphasizes that while EAD's development began in the United States and its structure is rooted in this country's descriptive practices, EAD's developers incorporated significant concepts from the international descriptive framework provided by the General International Standard Archival Description (ISAD(G))(9) and from national descriptive content standards such as the Canadian Rules for Archival Description (RAD).(10) In addition, EAD elements were assigned language-neutral nomenclature designed to circumvent terminological differences and thereby encourage international application and acceptance of the DTD.
The EAD development process has been thoroughly documented elsewhere,(11) but one point is important to emphasize in the context of these Guidelines, namely that both the philosophical underpinnings and the structural particulars of EAD are firmly rooted in archival principles, tradition, and theory. The EAD development group analyzed archival finding aids as documents, as well as the descriptive principles embodied in the aforementioned ISAD(G) framework, RAD, and Archives, Personal Papers, and Manuscripts (APPM);(12) from this the group developed and articulated a set of design principles. These principles provided a conceptual foundation intended to ensure that EAD would remain grounded in the realities of past and current theory and practice.(13)
One key design principle states that EAD will accommodate both the creation of new finding aids and the conversion of existing (or legacy) data. EAD is indeed sufficiently flexible to achieve this, but at the same time, it seeks to foster structural uniformity across finding aids in the belief that adherence to a consistent data model increases successful document interchange among repositories and that greater standardization of finding aids would generally be a positive development. Another important design principle specifies that while a large and diverse universe of archival descriptive data exists, EAD is intended to accommodate data that supports description, control, navigation, indexing, and online or print presentation, but not necessarily data that is only intended to address local collection management needs.
In the United States, for example, some repositories have prepared published summaries of their holdings, and during the economic depression of the 1930s, the government funded a major historical records survey as a work relief project. In the late 1950s, more systematic efforts were initiated to assemble summary descriptions of resources nationwide. The commencement in 1959 of the multivolume National Union Catalog of Manuscript Collections,(15) followed by the 1961 publication of Hamer's path-breaking Guide to Archives and Manuscripts in the United States,(16) helped identify the location and general scope of the manuscript collections within participating repositories.(17)
Helpful as these paper-based projects were, however, it was not until the advent of the MARC AMC format in the late 1970s that repositories in the United States gained the ability to disseminate information about their holdings more widely via national bibliographic systems. MARC AMC provided (18) to the resulting catalog records enabled archival holdings to be searched with the same flexibility and precision as published materials. The advances of MARC AMC notwithstanding, the MARC records could accommodate only summary information about holdings; they could not absorb all the data in a detailed finding aid. They could, however, point to the existence of detailed paper-based finding aids. Nevertheless, it remained problematic that these detailed finding aids were not yet part of a shared online environment. This was especially frustrating because many finding aids gradually were being produced using word processing or database systems.
Although numerous American repositories embraced the MARC AMC format, most European institutions did not, working instead toward development of ISAD(G), which was adopted by the International Council on Archives (ICA) in 1993. ISAD(G) defines twenty-six elements "that may be combined to constitute the description of an archival entity" at any level.(19) ISAD(G) also provides a set of definitions for archival terminology and formulates four general principles to guide archivists in multilevel description. The primary motivating factor behind the development of this standard was the recognition that some level of descriptive consistency would be required in order to facilitate the exchange and retrieval of archival information in unified, multirepository or multinational information systems.
EAD is a more specific structural standard than ISAD(G) in that EAD is focused on the particular type of archival finding aid typically called an inventory or register. As mentioned earlier, however, the developers of EAD looked closely at ISAD(G) and made certain that its elements were accommodated within the EAD data structure.(20) Moreover, EAD is completely compatible with ISAD(G)'s principles of multilevel description. This comfortable fit between the ISAD(G) and EAD data structures is a primary reason why interest in EAD has been strongly international in scope.
Archivists who lack experience thinking explicitly about multilevel description when implementing a hierarchical data structure such as EAD will find that ISAD(G) provides a vital framework within which to situate EAD-related decision making.
This new environment provides us with an opportunity to reconceptualize how we deliver information to our users, both traditional archival users and entirely new potential audiences. Archivists were quick to recognize that the Internet provided opportunities for electronic dissemination of finding aids, and many rapidly established Gopher sites for that purpose. The results of these experiments were tantalizing but ultimately discouraging. Gopher software could manage finding aids only as simple text files lacking structural or typographical formatting and important features such as footnotes; this made lengthy finding aids difficult to navigate. Moreover, no mechanism existed to link the finding aids to any corresponding MARC records. A user searching a repository's online catalog therefore had to exit the catalog and log into the Gopher site to verify whether a finding aid existed (for those still using Gophers after Web-based online catalogs became available, this particular problem was eliminated).
The emergence of the World Wide Web in the early 1990s offered significant advantages over Gophers. Hypertext Markup Language (HTML), the SGML DTD in which Web-based documents are currently encoded, furnished the mechanism to display finding aids with additional typographical nuances and navigational techniques. Moreover, the essence of the Web (the ability to create dynamic hyperlinks among documents stored at different locations) made it possible, particularly with the appearance of Web-based online catalogs, to link a MARC record to its corresponding finding aid.
It soon became clear, however, that HTML also has significant limitations. The principal problem lies in the fact that HTML is designed to provide only procedural encoding to facilitate improved layout and appearance; the intellectual structure or content of documents cannot be meaningfully encoded. For example, HTML easily represents features such as differing point sizes for headings or italics for formal titles, but it cannot distinguish a scope and content note from a biographical summary, a personal name from a geographic name, or a title from a date. Thus, HTML is unable to represent visually or permanently store the complex content and structure of archival finding aids. This means that HTML cannot enable sophisticated searching or navigation, ensure data permanence, or facilitate future data migration. Moreover, although the basic rules and structure of HTML are relatively stable, its development environment is quite volatile and idiosyncratic, lacking the rigor of standards that is essential to successful information exchange and data migration.
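The distinction between procedural and descriptive encoding can be illustrated with a minimal sketch. It assumes two invented fragments: one marked up with HTML presentation tags and one with EAD-style element names (persname, geogname, date). The sample sentence and the helper function texts are hypothetical and serve only to show what a program can and cannot recover from each encoding.

```python
import xml.etree.ElementTree as ET

# Presentational markup: both names are simply italicized, so software
# cannot tell the person from the place.
HTML_FRAGMENT = "<p><i>Jane Doe</i> wrote from <i>Boston</i> in <b>1902</b>.</p>"

# Descriptive markup using EAD-style element names (invented sample data).
EAD_FRAGMENT = (
    "<scopecontent><p><persname>Jane Doe</persname> wrote from "
    "<geogname>Boston</geogname> in <date>1902</date>.</p></scopecontent>"
)


def texts(fragment, tag):
    """Return the text of every element named `tag` in an XML fragment."""
    root = ET.fromstring(fragment)
    return [element.text for element in root.iter(tag)]


# From the HTML we can only ask "what is italic?"
italic_strings = texts(HTML_FRAGMENT, "i")

# From the descriptive encoding we can ask "which personal names occur?"
personal_names = texts(EAD_FRAGMENT, "persname")
place_names = texts(EAD_FRAGMENT, "geogname")

print(italic_strings)
print(personal_names, place_names)
```

In the HTML fragment the person and the place are indistinguishable italic strings, whereas the descriptive fragment supports a query for personal names alone.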
One example of a more generalized scheme is the Text Encoding Initiative (TEI), an international cooperative effort to develop an SGML DTD for scholarly texts.(22) Pitti looked closely at TEI because it was an important humanities-based computing initiative, but he ultimately found its goals incompatible with the needs of finding aids. This was because TEI was designed to encode literary and other texts as objects of study, and such documents are very different from the type of descriptive metadata that archival finding aids represent. As a result there are many elements in TEI that are not needed in EAD. More significantly, key elements required for finding aids are not available in TEI. EAD was, however, made as consistent with TEI as possible: the basic TEI header structure was incorporated into EAD,(23) and element names and attributes conflict as little as possible. Moreover, there has been active communication between the EAD and TEI developers in order to ensure that EAD remains a compatible part of the larger universe of humanities-based computing initiatives.
As noted above, SGML is inherently hierarchical. EAD reflects the ability of a well-crafted SGML DTD to identify the constituent intellectual and physical parts of a predominantly text-based document as distinct fields or elements, and then to nest component parts, or subelements, within them. This nesting capability allows the encoder of a finding aid (and subsequently a researcher using the encoded finding aid online) to work first with high-level elements that reflect an overview of the finding aid, and then to unfold progressively more detailed sections. Conversely, certain browser software can enable a user to search an EAD finding aid directly at item- or folder-level, then to broaden or contextualize the search by examining other items contained at the same level, or to move further up in the hierarchy to such elements as a scope and content note for a particular series or for an entire collection.
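The navigation described above, from a folder-level hit outward to its siblings and upward to series- and collection-level scope notes, might be sketched as follows. This is an illustration under stated assumptions rather than a description of any particular browser software: the sample finding aid is invented, the fragment is expressed as XML so that a standard parser can read it, and the helper names (parent_of, title, context) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented, heavily abbreviated finding aid using EAD element names.
SAMPLE = """
<archdesc level="collection">
  <did><unittitle>Doe Family Papers</unittitle></did>
  <scopecontent><p>Papers of a Boston merchant family.</p></scopecontent>
  <dsc>
    <c01 level="series">
      <did><unittitle>Correspondence</unittitle></did>
      <scopecontent><p>Letters arranged chronologically.</p></scopecontent>
      <c02 level="file">
        <did><unittitle>Letters, 1902</unittitle></did>
      </c02>
      <c02 level="file">
        <did><unittitle>Letters, 1903</unittitle></did>
      </c02>
    </c01>
  </dsc>
</archdesc>
"""

root = ET.fromstring(SAMPLE)

# ElementTree keeps no parent pointers, so build a child-to-parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def title(component):
    """Return the unit title recorded in a component's <did>."""
    return component.findtext("did/unittitle")


def context(component):
    """Walk upward and collect (title, scope note) for each enclosing level."""
    trail = []
    node = parent_of.get(component)
    while node is not None:
        if node.find("did") is not None:  # skip wrappers such as <dsc>
            trail.append((title(node), node.findtext("scopecontent/p")))
        node = parent_of.get(node)
    return trail


# Start directly at folder level, as a search hit would.
hit = next(c for c in root.iter("c02") if title(c) == "Letters, 1903")

# Broaden: other components at the same level under the same parent.
siblings = [title(c) for c in parent_of[hit] if c.tag == hit.tag and c is not hit]

# Contextualize: series first, then the collection as a whole.
trail = context(hit)
print(siblings)
print(trail)
```

Because each component is nested inside its parent, the same encoded document supports both top-down browsing and bottom-up contextualization without any duplicated description.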
Employing the principle of inheritance, SGML enables elements at a lower level in a hierarchy to inherit the information encoded in higher-level elements; this complies with the ISAD(G) rule regarding the nonrepetition of information.(24) This means that an encoder need not repeat descriptive data that already was entered at a higher level within the finding aid. Inheritance is illustrated in chapter 3, particularly in the figures in section 3.5.2.5.
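One way an application might resolve inherited information is to look for a value at the component itself and, failing that, at each successively higher level. The following minimal sketch assumes an invented sample encoded as XML; the function inherited and the parent map are hypothetical helpers, and actual delivery software may resolve inheritance quite differently.

```python
import xml.etree.ElementTree as ET

# Invented sample: the repository is stated once, at collection level;
# the series overrides the collection-level origination.
SAMPLE = """
<archdesc level="collection">
  <did>
    <unittitle>Doe Family Papers</unittitle>
    <origination>Doe family</origination>
    <repository>Example Historical Society</repository>
  </did>
  <dsc>
    <c01 level="series">
      <did>
        <unittitle>Business Records</unittitle>
        <origination>Doe and Sons (Firm)</origination>
      </did>
      <c02 level="file">
        <did><unittitle>Ledgers</unittitle></did>
      </c02>
    </c01>
  </dsc>
</archdesc>
"""

root = ET.fromstring(SAMPLE)
parent_of = {child: parent for parent in root.iter() for child in parent}


def inherited(component, name):
    """Return the nearest value of did/<name>, searching upward from component."""
    node = component
    while node is not None:
        value = node.findtext(f"did/{name}")
        if value:
            return value.strip()
        node = parent_of.get(node)
    return None  # not recorded at any level


ledgers = root.find("dsc/c01/c02")
print(inherited(ledgers, "unittitle"))    # stated at file level
print(inherited(ledgers, "origination"))  # inherited from the series
print(inherited(ledgers, "repository"))   # inherited from the collection
```

The file-level component records only its own title, yet its origination and repository remain recoverable from the levels above it, so the encoder has no need to repeat them.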
In order to be Web-deliverable, XML simplifies some of SGML's complexities; EAD included few of these complexities and so was easily made XML-compliant. The full implications of XML with respect to EAD implementation are covered in section 4.3.2 and in chapter 6.
XML was adopted by the World Wide Web Consortium as a Web standard in 1998. Version 5.0 of Microsoft's Internet Explorer browser supports XML documents, and as of early 1999, Netscape had incorporated XML into the beta versions of its next browser release.
Some questions surrounding the coexistence of MARC and EAD derive from two aspects of MARC implementation that have concerned some archivists: first, the fact that a MARC record is just a summary, not the complete finding aid; and second, that the preparation of a MARC record adds one more resource-intensive step to the arrangement and description of archival materials. EAD seeks to address both of these concerns by identifying the relationships between MARC data elements and their corollaries within encoded finding aids. This is achieved by specifying encoding analogs for EAD elements that correspond directly to specific MARC fields (see section 3.5.3.1 for details).
The use of encoding analogs provides the potential for repositories to consolidate EAD encoding and MARC cataloging into a single activity by generating a basic MARC record automatically from EAD; the opposite also can be accomplished by importing a MARC record into an EAD finding aid in order to add collection-level descriptive information and controlled access points to an existing container listing. Either activity would be accomplished by means of a programming script (see section 4.3.4 for more information). A MARC record exported from an EAD finding aid potentially could be uploaded into a larger MARC system, such as RLIN, OCLC, or a local online catalog. Repositories following this course would still retain the option of further editing the resulting MARC records using whatever MARC-based editing software they normally utilize. Automated routines have not yet been developed for these processes, but repositories wishing to explore these options are advised to consult the MARC-to-EAD crosswalk found in appendix B to identify concordances between data elements.
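The kind of programming script envisioned above might begin along the following lines. This is a hedged sketch only: the sample description is invented, the function marc_like_fields is hypothetical, and the output is a plain list of tag and text pairs rather than a valid MARC record. The pairings shown (100 for a personal name main entry, 245 for the title, 520 for the summary, 651 for a geographic subject) are illustrative; the crosswalk in appendix B remains the place to identify appropriate concordances.

```python
import xml.etree.ElementTree as ET

# Invented collection-level description with encodinganalog attributes.
SAMPLE = """
<archdesc level="collection">
  <did>
    <origination>
      <persname encodinganalog="100">Doe, Jane, 1870-1940</persname>
    </origination>
    <unittitle encodinganalog="245">Jane Doe papers</unittitle>
  </did>
  <scopecontent encodinganalog="520">
    <p>Correspondence and diaries.</p>
  </scopecontent>
  <controlaccess>
    <geogname encodinganalog="651">Boston (Mass.)</geogname>
  </controlaccess>
</archdesc>
"""


def marc_like_fields(ead_xml):
    """Collect (MARC tag, text) pairs from elements carrying an encodinganalog."""
    root = ET.fromstring(ead_xml)
    fields = []
    for element in root.iter():
        tag = element.get("encodinganalog")
        if tag:
            # Gather text from the element and its children, then
            # collapse runs of whitespace left over from indentation.
            text = " ".join("".join(element.itertext()).split())
            fields.append((tag, text))
    return sorted(fields)


fields = marc_like_fields(SAMPLE)
for tag, text in fields:
    print(tag, text)
```

A production routine would also need to supply indicators, subfield codes, and the fixed-length data that a full MARC record requires; none of that is attempted here.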
While EAD provides a much more flexible and detailed data structure for archival description than does MARC, EAD is a data structure standard, not a data content standard, and therefore does not mandate authoritative forms of content for any of its elements. This is potentially a significant drawback for information exchange. Standardization of the content of EAD descriptive elements can be achieved, however, if repositories or consortia develop and adhere to specific data content conventions, or "best practices." The content of EAD elements that have encoding analog attributes can be chosen based on a data content standard such as RAD or APPM, or a data value standard such as the Library of Congress Name Authority File (LCNAF) or Library of Congress Subject Headings (LCSH).
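Because EAD itself does not enforce content, a repository or consortium that adopts a "best practices" convention may wish to check its encoded finding aids against that convention. The sketch below assumes a hypothetical local list of authorized name forms standing in for a data value standard, and an invented rule that any personal name claiming a source must appear in that list; the function name and sample data are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical local list of authorized forms (a stand-in for a data
# value standard such as a name authority file).
AUTHORIZED_NAMES = {"Doe, Jane, 1870-1940", "Roe, Richard, 1865-1931"}

# Invented access points; the second claims a source but is not in
# the authorized form, and the third claims no source at all.
SAMPLE = """
<controlaccess>
  <persname source="lcnaf">Doe, Jane, 1870-1940</persname>
  <persname source="lcnaf">Jane Doe</persname>
  <persname>Roe, Richard</persname>
</controlaccess>
"""


def unauthorized_names(xml_text, authorized):
    """List persname values that cite a source yet are absent from `authorized`."""
    root = ET.fromstring(xml_text)
    return [
        element.text
        for element in root.iter("persname")
        if element.get("source") and element.text not in authorized
    ]


problems = unauthorized_names(SAMPLE, AUTHORIZED_NAMES)
print(problems)
```

The DTD accepts all three names as valid; only a convention applied on top of it, whether by software or by editorial review, distinguishes the authorized form from the others.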
The summer issue (Context and Theory) contains six articles written by members of the EAD development team that provide background information on these topics: aspects of the history of archival description and of information systems that establish the context within which EAD was developed; the nature of structured information in general and of EAD's structure in particular; administrative and technical issues that must be considered prior to implementing EAD; and EAD's significance as an emerging standard for archival description.(27)
The fall issue (Case Studies) contains six case studies written by EAD "early implementers," which is to say archivists at institutions that implemented EAD while it was still under development, prior to publication of the Version 1.0 DTD in August 1998. The first case study describes the process of "reengineering" finding aids to conform to EAD's data structure and to maximize user comprehension within the Web environment; this article may be the very best place for an archivist contemplating EAD implementation to begin reading.(28) The other five articles detail the software, hardware, and encoding choices made by particular institutions in the course of EAD implementation. The case studies may be particularly meaningful after you have read chapters 1 through 3 of these Guidelines, because the significance of the various institutions' choices will then be clearer.
As with most adult education, attending a workshop can be an exceptionally helpful way in which to begin learning a new standard, particularly one based in state-of-the-art information technologies. The combination of a well-informed instructor and a cadre of fellow students, eager to learn and to share their experiences, can serve both to demystify many aspects of EAD and to build confidence in your ability to succeed.(32) On the other hand, it is important to note that any complex standard takes time to master fully; a workshop can only give you the basics and get you started on the right foot. These Guidelines can help reinforce and expand on such instruction and lead you to additional resources to address your increasingly sophisticated learning needs.
The Library of Congress