The document element <ead> encloses all other elements. It indicates to a computer that what follows is a machine-readable version of a finding aid that has been encoded using the SGML document type definition known as Encoded Archival Description. Setting the AUDIENCE attribute in <ead> to "external" will display the contents of all of the subelements, unless the attribute in an individual subelement is set to "internal." The element <ead> also has a RELATEDENCODING attribute that can be used to declare a descriptive encoding system, such as MARC, Dublin Core, or ISAD(G), to which many EAD elements can be mapped using the ENCODINGANALOG attribute. <eadheader> and <frontmatter>, the other two high-level elements inside <ead>, are discussed in detail in section 3.6 at the end of this chapter.
The required <eadheader> is an essential part of a properly encoded finding aid; it contains metadata about the title, author, and creation date of the finding aid, as well as information about the language in which the finding aid is written and details about its encoding. The optional <frontmatter> element includes subelements that can be used to create nicely formatted title pages and other publication-type prefatory material such as acknowledgements and introductions.
Since EAD permits a great deal of flexibility in the order of information within <archdesc>, these Guidelines discuss the <archdesc> subelements in an order that matches a suggested sequence for the information in an online finding aid. This order is unlikely to correspond exactly to the manner in which you collect and compile the information in your finding aids. As mentioned in section 3.2, information for portions of a finding aid may have been gathered over a long period of time. When processing a collection, particularly one that is large and complex, you may draft descriptions of discrete and possibly noncontiguous parts as you complete their organization and arrangement. Presenting this information in the most "logical" order for your researchers offers a challenge that these Guidelines aim to help you address.
Note that some of the tagged examples in this chapter omit required elements when the additional tagging would obscure the point being made in the accompanying text. Whenever you have a question as to where an element is available or which parent elements are required, consult the EAD Tag Library.
As noted in section 3.5, the EAD element that encompasses the text of the archival finding aid is <archdesc>, within which are nested all other descriptive elements. The <archdesc> element is a wrapper; it holds the other elements together in a cohesive package. In addition, its required LEVEL attribute identifies the highest level of description represented in the finding aid, which is usually set to "fonds," "collection," or "record group." Occasionally a finding aid may relate to only one "series," "subgroup," "subseries," "file," or "item," and those alternative values could also be selected for the <archdesc> LEVEL attribute. Although EAD accommodates such instances, these Guidelines have been written with the assumption that most inventories and registers will capture at least a basic description of the highest-level fonds, collection, or record group before describing individual components, and that the <archdesc> LEVEL attribute should therefore reflect the highest tier in the collection's hierarchy. Lower tiers in the hierarchy are identified by setting the LEVEL attribute in the component descriptions, as explained in section 3.5.2.
There are several other attributes in <archdesc> that can be used to control or provide information for the entire finding aid:
An <archdesc> start-tag that includes this full complement of attributes may look like this (the attributes may be arranged in any order):
<archdesc audience="external" relatedencoding="marc" langmaterial="eng" legalstatus="public" level="fonds" type="register">
Having specified this control information in <archdesc>, the archivist then proceeds to identify some basic facts about the collection by using the Descriptive Identification <did> element.
Because these questions apply to all levels of archival groupings, from the fonds, record group, or collection, down to the item, <did> is available at all levels of description. While a single occurrence of <did> is required (usually as the first element within <archdesc>), specific elements within <did> are not, because not all will be needed at every level of description. For example, once Origination <origination> or Repository <repository> data has been specified for an entire body of materials, it may not need to be repeated at the series or file level.
One of the advantages of bundling such information in <did> is that it serves as a wrapper for these essential pieces of information in an online environment, where retrieval of coherent chunks of descriptive information about a given archival unit is critically important to the end user's understanding of a search result. It may also encourage good descriptive practice by reminding archivists to include the same basic data at all levels of description.
The first occurrence of <did>, which represents the highest level of description for a given body of materials, should allow a researcher to determine whether the materials are pertinent to his or her line of inquiry without having to read far down into the finding aid. To facilitate this resource discovery and recognition, the first <did>, referred to hereafter as the "high-level <did>," should include the following elements, which are discussed in greater detail below:
At its most basic, the high-level <did> might therefore look like this (with the sequence of <did> subelements determined by the repository):
<did> <repository>Harry Ransom Humanities Research Center</repository> <origination>Stoppard, Tom</origination> <unittitle>Tom Stoppard Papers</unittitle> <unitdate>1944-1995</unitdate> <physdesc>68 boxes (28 linear feet)</physdesc> <abstract>The papers of British playwright Tom Stoppard (b. 1937) encompass his entire career and consist of multiple drafts of his plays, from the well-known <title render="italic">Rosencrantz and Guildenstern Are Dead</title> to several that were never produced, correspondence, photographs, and posters, as well as materials from stage, screen, and radio productions from around the world.</abstract> </did>
Other elements such as ID of the Unit <unitid> and Physical Location <physloc> should also be included if the repository assigns a unique identifier (perhaps an accession number) to the materials, and if a physical location of the entire collection is specified in the finding aid. It is also recommended that the high-level <did> be given a Heading <head> and be encoded fairly specifically with various subelements and ENCODINGANALOG attributes; this will enable search engines to retrieve a basic description of the collection and will facilitate the extrapolation of a skeletal MARC record. Again, each of these subelements and attributes is explained in the sections that follow. A fully encoded <did> might look like this:
<did> <head>Summary Description of the Tom Stoppard Papers</head> <repository> <corpname>The University of Texas at Austin <subarea>Harry Ransom Humanities Research Center</subarea> </corpname> </repository> <origination> <persname source="lcnaf" encodinganalog="100">Stoppard, Tom</persname> </origination> <unittitle encodinganalog="245">Tom Stoppard Papers, </unittitle> <unitdate type="inclusive">1944-1995</unitdate> <physdesc encodinganalog="300"> <extent>68 boxes (28 linear feet)</extent> </physdesc> <unitid type="accession">R4635</unitid> <physloc audience="internal">14E:SW:6-8</physloc> <abstract>The papers of British playwright Tom Stoppard (b. 1937) encompass his entire career and consist of multiple drafts of his plays, from the well-known <title render="italic">Rosencrantz and Guildenstern Are Dead</title> to several that were never produced, correspondence, photographs, and posters, as well as materials from stage, screen, and radio productions from around the world.</abstract> </did>
Such detailed markup, which includes subelements and attributes, is recommended at the highest level of your multilevel description but may not be necessary or even desirable at the component level. Conversely, other <did> subelements such as Container <container>, Note <note>, Digital Archival Object <dao>, or Digital Archival Object Group <daogrp> are often unnecessary in the high-level <did>. The subelement <container> is discussed in section 3.5.2.4; the latter three elements are mentioned briefly in the discussion of the high-level <did> and are discussed more thoroughly in section 3.5.1.7 and section 7.3.6.
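The extrapolation of a skeletal MARC record from ENCODINGANALOG values can be sketched in a few lines of Python. This is an illustration only, not a prescribed workflow: the function name `marc_skeleton` is hypothetical, the sample is an abbreviated copy of the fully encoded <did>, and a real conversion would also have to supply indicators, subfields, and punctuation.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the fully encoded <did> (sample data only).
SAMPLE = """
<did>
  <origination>
    <persname source="lcnaf" encodinganalog="100">Stoppard, Tom</persname>
  </origination>
  <unittitle encodinganalog="245">Tom Stoppard Papers, </unittitle>
  <unitdate type="inclusive">1944-1995</unitdate>
  <physdesc encodinganalog="300">
    <extent>68 boxes (28 linear feet)</extent>
  </physdesc>
</did>
"""

def marc_skeleton(did):
    """Map each ENCODINGANALOG value to the text of the element carrying it.

    Hypothetical helper: repeated fields, indicators, and subfields are ignored.
    """
    fields = {}
    for elem in did.iter():
        analog = elem.get("encodinganalog")
        if analog:
            # Join nested text and collapse runs of whitespace.
            fields[analog] = " ".join("".join(elem.itertext()).split())
    return fields

skeleton = marc_skeleton(ET.fromstring(SAMPLE))
for field, value in sorted(skeleton.items()):
    print(field, value)
```

Elements without an ENCODINGANALOG attribute (here, <unitdate>) are simply skipped, which is one reason to encode the high-level <did> fairly specifically.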
All subelements within <did> have a LABEL attribute. This attribute functions somewhat like the Heading <head> element (which is used in lieu of LABEL for non-<did> elements; see section 3.5.1.7.1) in that it can be used to generate print or display constants. The <did> subelements carry a LABEL attribute instead of a <head> subelement primarily because the information contained in the <did> subelements tends to be brief (frequently only a few words), in contrast to such elements as <scopecontent> and <bioghist>, which tend to consist of longer narrative chunks of text. If each <did> subelement contained <head>, a Paragraph <p> would also be necessary in order to enter the text of the element; this would effectively double the amount of tagging for such small bits of information.
The LABEL attribute is especially useful at the highest-level <did> to aid readers in interpreting the collection summary description, while LABEL or <head> information may be less frequently necessary elsewhere in the finding aid. For example, this markup
<did> <repository label="Repository:"> <corpname>The University of Texas at Austin <subarea>Harry Ransom Humanities Research Center</subarea> </corpname> </repository> <origination label="Creator:"> <persname source="lcnaf" encodinganalog="100">Stoppard, Tom</persname> </origination> <unittitle label="Title:">Tom Stoppard Papers, <unitdate type="inclusive">1944-1995</unitdate> </unittitle> </did>
can generate the following display based on specifications in a stylesheet:
Repository: The University of Texas at Austin Harry Ransom Humanities Research Center
Creator: Stoppard, Tom
Title: Tom Stoppard Papers, 1944-1995
A stylesheet is a text file or output specification that is used by a processing system in conjunction with the encoded finding aid to control how the document will be displayed or formatted.(56) Stylesheets define the appearance of each element in each of its contexts within the document. Any element can be assigned specific display features, such as font size, style, and color. A stylesheet also can be used to insert preceding characters or spacing to an element, rather than using the LABEL attribute as noted in the example above. A stylesheet allows you to modify the element's display or formatting features in relation to where the element appears in the finding aid. For example, in certain contexts, such as in the high-level <did>, you may want the <unittitle> of the collection or fonds to appear on a separate line, in a certain font size and style, and preceded by the word "Title," followed by a colon. Elsewhere in the finding aid, you may want the <unittitle> of a component to appear inline and in a smaller font size than that of the higher-level <unittitle>. The stylesheet allows you to control those decisions so you need not hardwire formatting codes into a document in the same way that you do in a word processing or HTML document.
Omitting such "formatting" instructions from your data allows you to change the appearance of all your finding aids by simply changing the stylesheet; this also will simplify the encoding of each individual EAD finding aid. Keep in mind, however, that an end user can choose to replace the stylesheet you created with a different stylesheet. Use of the LABEL attribute may better ensure that words you designate will stay with the finding aid regardless of the stylesheet attached to it.
Note also that it is possible to create multiple stylesheets to use with your EAD finding aids. For example, you may create one stylesheet for online display and another for printed output. Creating your finding aid in EAD allows you to separate what the text actually is from how the text is rendered, thereby making it possible to process or format the same text in different ways.
Note that in most instances the repository that provides intellectual access to the materials is also the institution that holds physical custody, but when that is not the case, the name and other pertinent information about the physical custodian should be encoded in the <physloc> element.
<origination label="creator">Mary Hutchinson</origination>
in combination with stylesheet instructions, could result in either of the following outputs:
Creator: Mary Hutchinson
Mary Hutchinson, creator
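How one LABEL value can yield either output is easiest to see in code. The following Python sketch stands in for a stylesheet; it is not XSLT or DSSSL, and the function `render_origination` and its `label_first` switch are hypothetical names introduced only for illustration.

```python
import xml.etree.ElementTree as ET

def render_origination(elem, label_first=True):
    """Format an <origination> element using its LABEL attribute.

    Hypothetical stand-in for a stylesheet rule: the markup stays the same,
    and only the rendering choice changes the output.
    """
    label = elem.get("label", "")
    name = " ".join("".join(elem.itertext()).split())
    if not label:
        return name
    if label_first:
        return f"{label.capitalize()}: {name}"
    return f"{name}, {label}"

origination = ET.fromstring(
    '<origination label="creator">Mary Hutchinson</origination>'
)
print(render_origination(origination))
print(render_origination(origination, label_first=False))
```

Because the label travels with the element, either rendering remains available no matter which stylesheet is attached to the finding aid.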
In addition to the LABEL attribute, <origination>, like many EAD elements, has an ENCODINGANALOG attribute that enables you to specify a MARC or other encoding scheme field that relates to this element (in this case, the MARC 100 field). Either of the following options is valid in EAD, but the latter is more specific:
<origination encodinganalog="100" label="creator">Mary Hutchinson</origination>
<origination label="creator"> <persname encodinganalog="100">Mary Hutchinson</persname> </origination>
It would also be possible to invert the <persname> data (Hutchinson, Mary) to match the formatting of a MARC 100 field for retrieval purposes. Using the NORMAL attribute would accomplish the same purpose:
<origination label="creator"> <persname encodinganalog="100" normal="Hutchinson, Mary">Mary Hutchinson</persname> </origination>
For more information about "name" subelements, see the discussion of <controlaccess> in section 3.5.3.
In the high-level <did>, the <unittitle> is the title of the collection, or perhaps of a subgroup or series, depending on what the highest level of description in the finding aid is. If the title of the body of materials includes a formal title, it may be desirable to nest the Title <title> element within <unittitle> for display or retrieval purposes. For example:
<unittitle encodinganalog="245">Stuart Johnson Collection of <title>Alice in Wonderland</title> Memorabilia</unittitle>
This markup would permit the display or printing of the title Alice in Wonderland in any fashion desired by the repository, including but not limited to italics, through use of the RENDER attribute in <title> or a stylesheet. In addition, it would facilitate the retrieval of the phrase "alice in wonderland" in a <unittitle> or <title> search, and, through the use of the ENCODINGANALOG attribute, the export of the text within the <unittitle> element to a MARC record for the archival collection.
<unittitle>Stuart Johnson Collection of <title>Alice in Wonderland</title> Memorabilia, <unitdate type="inclusive"> 1905-1928</unitdate></unittitle>
<unittitle>Stuart Johnson Collection of <title>Alice in Wonderland</title> Memorabilia, </unittitle> <unitdate type="inclusive">1905-1928</unitdate>
Each repository should choose one of the above methods of encoding with regard to the relative placement of <unittitle> and <unitdate> and be consistent both within an individual finding aid and across all finding aids. National descriptive standards may provide guidance on this point. Archivists who catalog their materials using APPM are accustomed to thinking about span and bulk dates for a body of materials as part of the title, while RAD users treat such dates as a separate data element in the Dates of Creation Area.(58) In either case, <unittitle> and <unitdate> information may be displayed together.
The element <unitdate> also has a NORMAL attribute that allows dates to be stated in a standardized form: YYYYMMDD. Use of this attribute would facilitate retrieval of date information if implemented consistently. However, date information is provided in numerous formats in finding aids, as in the following examples:
March 17, 1946
17 March 1946
1946 March 17
ca. 1946
1946?
1940s
n.d.
undated
Because of this variety, the additional markup required to supply normalized dates for searching purposes may be prohibitively time-consuming.
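Part of that effort could in principle be automated. The Python sketch below is a rough illustration under stated assumptions: the helper `normalize_date` and the short list of layouts are hypothetical, month names are assumed to be in English, and only dates that name a single day are handled. Approximate or missing dates ("ca. 1946", "1940s", "undated") are returned as None because they still call for an archivist's judgment.

```python
from datetime import datetime

# Layouts that identify a single day unambiguously (an assumed, partial list).
FORMATS = ("%B %d, %Y", "%d %B %Y", "%Y %B %d")

def normalize_date(text):
    """Return YYYYMMDD for a recognized full date, or None otherwise."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y%m%d")
        except ValueError:
            continue  # Try the next layout.
    return None

for sample in ("March 17, 1946", "17 March 1946", "1946 March 17",
               "ca. 1946", "1940s", "undated"):
    print(f"{sample!r} -> {normalize_date(sample)}")
```

A value produced this way could then be supplied in the NORMAL attribute, for example normal="19460317", while the element content keeps the form used in the finding aid.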
It is possible to put a basic statement of extent into the <physdesc> element without using any subelements:
<physdesc>149 cubic feet</physdesc>
<physdesc>3800 photographs</physdesc>
In many cases, this level of markup is sufficient. The use of subelements, such as Extent <extent> and Genre/Physical Characteristic <genreform>, as well as attributes in <physdesc> can, however, render the physical description much more specific:
<physdesc encodinganalog="300">149 cubic feet</physdesc>
<physdesc> <extent>3800</extent> <genreform>black and white prints (photographs)</genreform> </physdesc>
<physdesc> <extent>46</extent> <genreform>sound recordings</genreform> </physdesc>
What is the benefit of such additional encoding? Any time information is encoded at a more granular (detailed) level, the ability to manipulate and reuse the data is enhanced. For retrieval purposes, you might want to search for a particular type of photograph, such as albumen prints, salted paper prints, or hand-colored prints. While this can be done using a keyword search, searching for these types of terms within a <physdesc> or <genreform> element improves the relevance of the search result.
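The difference between a keyword search and an element-scoped search can be shown with a small Python sketch. The sample finding aid and both function names are invented for illustration; a production system would use a proper search engine rather than a loop over elements.

```python
import xml.etree.ElementTree as ET

# Invented sample: the term appears both as a genre and in narrative text.
SAMPLE = """
<archdesc level="collection">
  <did>
    <physdesc>
      <extent>3800</extent>
      <genreform>albumen prints</genreform>
    </physdesc>
  </did>
  <scopecontent>
    <p>Correspondence discussing the sale of albumen prints.</p>
  </scopecontent>
</archdesc>
"""

def keyword_hits(root, term):
    """Return the tag of every element whose own text mentions the term."""
    term = term.lower()
    return [el.tag for el in root.iter() if el.text and term in el.text.lower()]

def genreform_hits(root, term):
    """Return matching text from <genreform> elements only."""
    term = term.lower()
    return [el.text for el in root.iter("genreform")
            if el.text and term in el.text.lower()]

root = ET.fromstring(SAMPLE)
print(keyword_hits(root, "albumen prints"))
print(genreform_hits(root, "albumen prints"))
```

The keyword search also matches the paragraph that merely mentions the prints, whereas the scoped search returns only the material actually described as albumen prints.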
<abstract>The archive comprises records mainly from the pre-1837 Archdeaconry before its removal from the jurisdiction of York to that of Lincoln. Most of the documents stem from the Archdeacon's twice-yearly visitations and the cases pursued in his court, the earliest dating from the 16th century. The Archdeaconry of Nottingham joined with the county of Derby [from the Diocese of Lichfield] to form the Diocese of Southwell.</abstract>
The <abstract> element could easily be confused with <note>, which also is available within <did>. The <note> element should not be used for summary descriptive information, but rather to cite the source of a quotation (as in a footnote), provide a short explanatory statement or user directive, or for miscellaneous purposes such as to indicate the basis for an assertion. In general, the generic text element <note> should never be used when a more specific structural EAD element is more appropriate.
<note><p>Note to researchers: To request materials, please note both the location and box numbers shown below.</p></note>
In the high-level <did>, a <note> also could be used to alert the reader to the fact that the materials described in the high-level <did> are in fact a component of a larger body of materials that had to be described in separate EAD instances because of the difficulties encountered in parsing and downloading a single large finding aid file for the entire fonds or record group. The creation and linking of separate finding aids for a single large collection is discussed in greater detail in connection with the Archival Reference <archref> element in section 7.3.3.
Because of its utility as explanatory text, the <note> element is also available outside <did>, as explained in section 3.5.1.7.3.
Two attributes are available in <unitid> that are not available anywhere else in EAD and which should be used only in the high-level <did>: COUNTRYCODE and REPOSITORYCODE. COUNTRYCODE provides the unique code, taken from the ISO 3166 Codes for the Representation of Names of Countries, for the country in which the archival materials are held. REPOSITORYCODE contains another unique code, taken from the official repository code list for the country in which the repository is located, for the repository responsible for the intellectual control of the materials being described.(59) These two attributes relate specifically to the ISAD(G) reference codes in the Identity Statement Area(60) and guarantee uniqueness of the <unitid> in a multinational finding aid database. If desired, the attribute values could be manipulated by a stylesheet to display or print the name of the country and the name of the repository as part of the <unitid> information. At the highest level of description, a <unitid> might look like this:
<unitid countrycode="gbr" repositorycode="067">ES</unitid>
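To see why the two attributes guarantee uniqueness in a multinational database, consider the following Python sketch. The helper `reference_key` is hypothetical, and the second <unitid> (with countrycode "usa") is made up purely to show a collision on the local identifier; it does not correspond to any real repository.

```python
import xml.etree.ElementTree as ET

def reference_key(unitid):
    """Combine country code, repository code, and local ID into one key."""
    return (
        unitid.get("countrycode"),
        unitid.get("repositorycode"),
        (unitid.text or "").strip(),
    )

# The example from the text, plus an invented <unitid> with the same local ID.
first = ET.fromstring('<unitid countrycode="gbr" repositorycode="067">ES</unitid>')
second = ET.fromstring('<unitid countrycode="usa" repositorycode="067">ES</unitid>')

print(reference_key(first))
print(reference_key(second))
print(reference_key(first) == reference_key(second))
```

The local identifier "ES" alone is ambiguous, but the three-part key is not.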
<physloc>14E:SW:6-8</physloc>
<physloc>The Mary Hutchinson Papers are stored offsite, and 24-hour notice is required to retrieve the materials.</physloc>
<physloc> is repeatable, so both types of information can be provided when needed. If the repository chooses to include the shelf location in the finding aid for its own internal use, the information can be encoded but shielded from public access by using the AUDIENCE attribute (if your server is capable of suppressing information coded as "internal" when delivering your EAD files to users):
<physloc audience="internal">14E:SW:6-8</physloc>
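What such server-side suppression might involve is sketched below in Python. This is a minimal illustration, not a description of any particular delivery system: the function `strip_internal` is hypothetical, and the sample <did> is assembled from values used earlier in this section.

```python
import xml.etree.ElementTree as ET

def strip_internal(root):
    """Remove every element whose AUDIENCE attribute is "internal".

    Hypothetical filter: ElementTree elements have no parent pointer, so
    each element is visited as a parent and its children are checked.
    """
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("audience") == "internal":
                parent.remove(child)
    return root

did = ET.fromstring(
    '<did><unitid type="accession">R4635</unitid>'
    '<physloc audience="internal">14E:SW:6-8</physloc></did>'
)
public = strip_internal(did)
print(ET.tostring(public, encoding="unicode"))
```

The shelf location stays in the master file for staff use, and only the filtered copy is delivered to researchers.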
The Library of Congress