Metadata Encoding and Transmission Standard (METS) Official Web Site
METS_Profile: @xsi:schemaLocation="http://www.loc.gov/METS_Profile/ http://www.loc.gov/standards/mets/profile_docs/mets.profile.v1-2.xsd http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-0.xsd"
title:
UCB Paged Text Object Profile
abstract:
This profile represents a specific subset of the Model Paged Text Object Profile. UC Berkeley Library METS objects with associated text content files, or with both image content files and text content files, implement this profile.
date:
2004-04-27T08:00:00
contact:
name:
Rick Beaubien
address:
Library Systems Office, Rm. 386 Doe Library, University of California, Berkeley, CA 94720-6000
phone:
(510) 643-9776
email:
rbeaubie@library.berkeley.edu
related_profile: @RELATIONSHIP="subset of" @URI="http://www.loc.gov/mets/profiles/00000005.xml"
Model Paged Text Object Profile
related_profile: @RELATIONSHIP="extends" @URI="http://www.loc.gov/mets/profiles/00000002.xml"
UCB Imaged Object Profile
extension_schema:
name:
NISOIMG
context:
mets/amdSec/techMD/mdWrap/xmlData
note:
Used for technical metadata about image content files.
extension_schema:
name:
textmd
context:
mets/amdSec/techMD/mdWrap/xmlData
note:
Used for technical metadata about text content files.
description_rules:

All applications of the MODS schema in conforming METS documents follow the MODS User Guidelines published by the Library of Congress' Network Development and MARC Standards Office.

controlled_vocabularies:
vocabulary:
name:
Model Paged Text Object Profile <file> USE attribute values
maintenance_agency:
Library Systems Office, The General Library, University of California, Berkeley
values:
value:
archive image
value:
reference image
value:
thumbnail image
value:
tei transcription
value:
tei translation
value:
ocr
value:
ocr dirty
context: @ID="vc1" @RELATEDMAT="fileSec2"

mets/fileSec/fileGrp/@USE

mets/fileSec/fileGrp/file/@USE

description:

These are the supported values for <file> and <fileGrp> USE attributes in paged text objects conforming to this profile.

"archive image", "reference image" and "thumbnail image" are appropriate values to describe the USE of image content files. "archive image" designates image masters; "thumbnail image" image thumbnails; and "reference image" any intermediate resolutions intended for reference purposes.

"tei transcription" and "tei translation" are appropriate values to describe associated structured text files encoded according to TEI rules; "tei transcription" designates direct TEI transcriptions of text-based materials; "tei translation" designates TEI translations of these materials from their original language.

"ocr" and "ocr dirty" should be used to designate versions of the text produced by OCR technologies. "ocr dirty" distinguishes OCR text that is not suitable for presentation to the user from clean "ocr".

A given segment of the source material could be represented by more than one content file of a particular USE. For example, a digitized manuscript page could be represented by one image master (USE="archive image") and one thumbnail image (USE="thumbnail image"), but by two JPEG reference images of different resolutions (USE="reference image").
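
The vocabulary above lends itself to a mechanical check. The following is a minimal sketch, not part of the profile, that lists USE values falling outside the vocabulary; the function name and the sample document are hypothetical, and it assumes the METS namespace http://www.loc.gov/METS/.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.loc.gov/METS/}"

# Values copied from the controlled vocabulary above.
ALLOWED_USE = {
    "archive image", "reference image", "thumbnail image",
    "tei transcription", "tei translation", "ocr", "ocr dirty",
}

def unsupported_use_values(root):
    """Return USE values on <fileGrp> and <file> that are not in the vocabulary."""
    bad = []
    for tag in ("fileGrp", "file"):
        for el in root.iter(NS + tag):
            use = el.get("USE")
            if use is not None and use not in ALLOWED_USE:
                bad.append(use)
    return bad

# Hypothetical sample: "ocr clean" is not a supported value.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/">
  <fileSec>
    <fileGrp USE="archive image"><file ID="f1"/></fileGrp>
    <fileGrp><file ID="f2" USE="ocr clean"/></fileGrp>
  </fileSec>
</mets>"""

use_problems = unsupported_use_values(ET.fromstring(SAMPLE))
print(use_problems)
```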

vocabulary:
name:
Model Paged Text Object <structMap> TYPE attribute values
maintenance_agency:
Library Systems Office, The General Library, University of California, Berkeley
values:
value:
physical
value:
logical
value:
mixed
context: @ID="vc2" @RELATEDMAT="structMap2"

mets/structMap/@TYPE

description:

These are the supported values for the <structMap> TYPE attribute in METS documents conforming to this profile.

"physical" designates a purely physical structure. For example, a book divided into page views.

"logical" designates a purely logical structure. For example, a book divided into chapters; or a diary divided into diary entries.

"mixed" designates a mixed structure. For example, a book divided into chapters, divided into page views.

structural_requirements:
metsRootElement:
requirement: @ID="metsRoot1"

The root <mets> element must include a LABEL attribute value.

requirement: @ID="metsRoot2"

The root <mets> element must include an OBJID attribute value containing a valid ARK that uniquely identifies the object in its owning repository.
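
As an illustration only, the two root-element requirements could be checked along the following lines. The ARK pattern is an assumption (a loose "ark:/<digits>/<name>" shape test), since this profile does not define what makes an ARK valid; the function name and the sample OBJID are hypothetical. Uniqueness within the owning repository cannot be verified from the document alone and is not attempted here.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a loose shape test for ARKs; the profile itself gives no pattern.
ARK_PATTERN = re.compile(r"ark:/\d+/\S+")

def check_root(root):
    """Return a list of problems for requirements metsRoot1 and metsRoot2."""
    problems = []
    if not root.get("LABEL"):
        problems.append("metsRoot1: missing LABEL")
    if not ARK_PATTERN.search(root.get("OBJID", "")):
        problems.append("metsRoot2: OBJID does not contain an ARK")
    return problems

# Hypothetical sample documents.
good_root = ET.fromstring(
    '<mets xmlns="http://www.loc.gov/METS/" LABEL="Sample diary" '
    'OBJID="ark:/13030/kt0000000x"/>'
)
bad_root = ET.fromstring('<mets xmlns="http://www.loc.gov/METS/" OBJID="12345"/>')

print(check_root(good_root))
print(check_root(bad_root))
```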

metsHdr:
requirement: @ID="metsHdr1"

Conforming METS documents must contain a metsHdr element.

requirement: @ID="metsHdr2"

The <metsHdr> element must include the CREATEDATE attribute value. It must also include the LASTMODDATE attribute value if this does not coincide with the CREATEDATE.

requirement:

The <metsHdr> element must include a child <agent> element identifying the person or institution responsible for creating the METS object.
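
A minimal sketch of how the <metsHdr> requirements might be tested follows. The function name and sample documents are hypothetical. The conditional LASTMODDATE rule is not checked, because whether the document was modified after creation cannot be known from the document itself.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.loc.gov/METS/}"

def check_mets_hdr(root):
    """Return a list of problems with the <metsHdr> of a METS document."""
    hdr = root.find(NS + "metsHdr")
    if hdr is None:
        return ["metsHdr1: no <metsHdr> element"]
    problems = []
    if not hdr.get("CREATEDATE"):
        problems.append("metsHdr2: missing CREATEDATE")
    if hdr.find(NS + "agent") is None:
        problems.append("metsHdr: missing child <agent>")
    return problems

# Hypothetical sample documents.
good_hdr = ET.fromstring("""<mets xmlns="http://www.loc.gov/METS/">
  <metsHdr CREATEDATE="2004-04-27T08:00:00">
    <agent ROLE="CREATOR"><name>Example Library</name></agent>
  </metsHdr>
</mets>""")
bad_hdr = ET.fromstring('<mets xmlns="http://www.loc.gov/METS/"><metsHdr/></mets>')

print(check_mets_hdr(good_hdr))
print(check_mets_hdr(bad_hdr))
```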

dmdSec:
requirement: @ID="dmdSec1"

Conforming METS documents may, but need not, contain one or more <dmdSec> elements. Each <dmdSec> may in turn contain an <mdRef> or an <mdWrap>.

requirement: @ID="dmdSec2"

If a <dmdSec> of a conforming document contains an <mdWrap> with <xmlData>, the <xmlData> must conform to the MODS schema.

amdSec:
requirement: @ID="amdSec1"

Conforming METS documents may but need not contain an <amdSec> element. This <amdSec> may but need not contain one or more <techMD> elements, <sourceMD> elements, <rightsMD> elements and/or <digiprovMD> elements.

requirement: @ID="amdSec2"

A conforming METS document will contain no more than one <amdSec> element. All <techMD>, <sourceMD>, <rightsMD> and <digiprovMD> elements will appear in this single <amdSec> element.

requirement: @ID="amdSec3"

If one or more <techMD> elements pertaining to image content files are present, they must contain <xmlData> of NISOIMG type conforming to the MIX schema.

requirement: @ID="amdSec4"

If one or more <techMD> elements pertaining to text content files are present, they must contain <xmlData> conforming to the textmd schema.

requirement: @ID="amdSec5"

If one or more <rightsMD> elements are present, they must contain <xmlData> conforming to the METSRights schema.

requirement: @ID="amdSec6"

Any <sourceMD> or <digiprovMD> elements should contain <xmlData> conforming to a METS Editorial Board endorsed schema whenever such a schema exists and covers the requisite concepts.

requirement: @ID="amdSec7"

Source metadata pertaining to image content files may be expressed as part of any MIX encoded technical metadata in <techMD> elements rather than in separate <sourceMD> elements. This might occur whenever the available source metadata is minimal and covered by the MIX schema.

fileSec:
requirement: @ID="fileSec1"

The <fileSec> of a conforming METS document must contain a parent <fileGrp> for each file format/use represented by the content files. For example, the <fileSec> of a typical METS document implementing this profile might contain one <fileGrp> representing TIFF master images, one <fileGrp> representing high resolution JPEG reference images, one <fileGrp> representing medium resolution JPEG reference images, one <fileGrp> representing GIF thumbnail images, and one <fileGrp> representing TEI transcriptions. Each of these <fileGrp> elements may or may not contain subsidiary <fileGrp> elements representing subgroups of the content files.

requirement: @ID="fileSec2" @RELATEDMAT="vc1"

Each <file> represented in the <fileSec> must have an associated USE attribute. The USE attribute may be expressed directly at the <file> element level. Alternatively, the USE attribute may be expressed on the <fileGrp> that is the immediate parent of a <file> element; in this case it is taken to pertain to all <file> elements in the <fileGrp>. Supported <file>/<fileGrp> USE attribute values appear in the <controlled_vocabularies> section of this document.
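
The inheritance rule in this requirement can be sketched as follows: a <file> takes its own USE if present, otherwise the USE of its immediate parent <fileGrp>. This is an illustrative reading of the requirement, with hypothetical function and sample names; a result of None marks a <file> that has no associated USE and so fails the requirement.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.loc.gov/METS/}"

def effective_uses(root):
    """Map each <file> ID to its own USE or, failing that, its parent <fileGrp> USE."""
    # ElementTree keeps no parent pointers, so build a child-to-parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    uses = {}
    for file_el in root.iter(NS + "file"):
        use = file_el.get("USE")
        if use is None:
            use = parent_of[file_el].get("USE")  # immediate parent only
        uses[file_el.get("ID")] = use
    return uses

# Hypothetical sample: f3 has no USE of its own and none on its parent.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/">
  <fileSec>
    <fileGrp USE="archive image"><file ID="f1"/></fileGrp>
    <fileGrp><file ID="f2" USE="ocr"/><file ID="f3"/></fileGrp>
  </fileSec>
</mets>"""

uses = effective_uses(ET.fromstring(SAMPLE))
print(uses)
```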

requirement: @ID="fileSec3"

Any <file> element may reference any number of pertinent top-level administrative metadata elements within the <amdSec> via its ADMID attribute value. It should only reference ID values at the <techMD>, <rightsMD>, <sourceMD> and/or <digiprovMD> levels of the <amdSec>.

structMap:
requirement: @ID="structMap1"

A conforming METS document must contain only one <structMap>.

requirement: @ID="structMap2" @RELATEDMAT="vc2"

A conforming <structMap> must contain a TYPE attribute. Supported TYPE values appear in the <controlled_vocabularies> section of this document ("logical", "physical", or "mixed").

requirement: @ID="structMap3"

Each <div> must include a LABEL attribute value.

requirement: @ID="structMap4"

A <div> element at any level may point to one or more pertinent <dmdSec> elements via its DMDID attribute value. However, the DMDID attribute should only reference IDs specified at the <dmdSec> element level, and not IDs at lower levels. For example, a <div> DMDID attribute should not reference an ID value of an element within the <xmlData> section of a <dmdSec>.

requirement: @ID="structMap5"

A <div> element may or may not directly contain <fptr> elements. (In other words, a <div> of the <structMap> may or may not have content files directly associated with it).

requirement: @ID="structMap6"

An <fptr> element must either 1) directly point to a <file> element via its FILEID attribute; or 2) contain an <area> element that points to a <file> element; or 3) contain a <seq> element comprising multiple <area> elements that point to the relevant <file> elements. METS documents implementing this profile must not use the <par> element. <structMap>s of "physical" and "mixed" TYPEs must not use either the <par> or <seq> elements.
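
The structural rules stated so far for <structMap> (exactly one map, a supported TYPE, no <par>, and no <seq> in "physical" or "mixed" maps) could be tested roughly as below. This is a sketch under the assumption that these rules are checked independently of the rest of the profile; the function name and sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.loc.gov/METS/}"
STRUCTMAP_TYPES = {"physical", "logical", "mixed"}

def check_struct_maps(root):
    """Return problems for requirements structMap1, structMap2 and structMap6."""
    problems = []
    maps = root.findall(NS + "structMap")
    if len(maps) != 1:
        problems.append(f"structMap1: expected 1 <structMap>, found {len(maps)}")
    for sm in maps:
        sm_type = sm.get("TYPE")
        if sm_type not in STRUCTMAP_TYPES:
            problems.append(f"structMap2: unsupported TYPE {sm_type!r}")
        if next(sm.iter(NS + "par"), None) is not None:
            problems.append("structMap6: <par> is not allowed")
        uses_seq = next(sm.iter(NS + "seq"), None) is not None
        if sm_type in {"physical", "mixed"} and uses_seq:
            problems.append(f"structMap6: <seq> is not allowed in a {sm_type} map")
    return problems

# Hypothetical sample: a physical map that wrongly uses <seq>.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/">
  <structMap TYPE="physical">
    <div LABEL="Page 1">
      <fptr><seq><area FILEID="f1"/><area FILEID="f2"/></seq></fptr>
    </div>
  </structMap>
</mets>"""

struct_problems = check_struct_maps(ET.fromstring(SAMPLE))
print(struct_problems)
```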

requirement: @ID="structMap7"

An <fptr> element could directly contain an <area> element if only a portion of an integral file manifests the parent <div>. This is likely to occur in either of two cases.

1) The parent <div> element represents just a segment of the entire document and the <fptr> represents a tei transcription or a tei translation. In this case, the <area> element under the <fptr> would point to the <file> element representing the TEI document (via its FILEID attribute) and must at least indicate the starting point of the relevant section of the referenced TEI file via the <area> BEGIN attribute. The BEGIN attribute, in this case, would have a BETYPE of "IDREF". The <area> element might also express the end point of the relevant section of the referenced file via its END attribute, but it need not do so.

2) When a <structMap> represents a logical structure, its individual <div> elements may each be manifested by only a portion of the associated image content files represented by its child <fptr> elements. In this case, an <fptr> element representing an image content file could, but need not, contain an <area> element that specifies the shape and coordinates of the relevant section of the image via the <area> element's SHAPE and COORDS attribute values.
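
To make the first case concrete, here is a small sketch that builds an <fptr> whose <area> points into a TEI file by ID. The helper name, the file ID "tei1" and the TEI element ID "ch2" are all hypothetical; only the attribute names (FILEID, BEGIN, END, BETYPE) come from the requirement above.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS)

def tei_segment_fptr(div, file_id, begin_id, end_id=None):
    """Append an <fptr>/<area> to div that references part of a TEI file."""
    fptr = ET.SubElement(div, f"{{{METS}}}fptr")
    attrs = {"FILEID": file_id, "BETYPE": "IDREF", "BEGIN": begin_id}
    if end_id is not None:
        attrs["END"] = end_id  # END is optional under this profile
    ET.SubElement(fptr, f"{{{METS}}}area", attrs)
    return fptr

# Hypothetical division manifested by the part of file "tei1" starting at ID "ch2".
div = ET.Element(f"{{{METS}}}div", {"LABEL": "Chapter 2"})
fptr = tei_segment_fptr(div, "tei1", "ch2")
print(ET.tostring(div, encoding="unicode"))
```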

requirement: @ID="structMap8"

An <fptr> element would contain a <seq> element if multiple files needed to be "played" in sequence to manifest a division. This might be the case if the <structMap> expressed a logical structure and a <div> in that structure required several files to manifest it. For example, the <div> elements in the <structMap> for a diary might represent diary entries; and some of these entries might span multiple physical pages, and hence require multiple image content files to manifest them. In this case, the <div> representing the spanned diary entry would contain at least one <fptr> element; this <fptr> element would contain a <seq> element which in turn contained a separate <area> element pointing to each <file> element representing a page the diary entry spans. The <area> elements may include SHAPE and COORDS attribute values to identify the relevant sections of the associated image files, but they need not do so.

requirement: @ID="structMap9"

Each <fptr> element that does not contain subsidiary <area> or <seq> elements must point directly to a <file> element in the <fileSec> via its FILEID attribute. Similarly, each <area> element appearing under an <fptr> element or a <seq> element must point directly to a <file> element via its FILEID attribute.
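
One way to test this pointer rule, offered as a sketch with hypothetical names: collect the IDs of all <file> elements, then report every bare <fptr> and every <area> whose FILEID is missing or matches none of them.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.loc.gov/METS/}"

def dangling_file_pointers(root):
    """Return (element name, FILEID) pairs that do not resolve to a <file>."""
    file_ids = {f.get("ID") for f in root.iter(NS + "file") if f.get("ID")}
    dangling = []
    for fptr in root.iter(NS + "fptr"):
        has_child = (fptr.find(NS + "area") is not None
                     or fptr.find(NS + "seq") is not None)
        # Only a bare <fptr> must carry FILEID itself.
        if not has_child and fptr.get("FILEID") not in file_ids:
            dangling.append(("fptr", fptr.get("FILEID")))
    for area in root.iter(NS + "area"):
        if area.get("FILEID") not in file_ids:
            dangling.append(("area", area.get("FILEID")))
    return dangling

# Hypothetical sample: the <area> points at "f9", which is not in the <fileSec>.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/">
  <fileSec><fileGrp USE="reference image"><file ID="f1"/></fileGrp></fileSec>
  <structMap TYPE="logical">
    <div LABEL="Entry 1">
      <fptr FILEID="f1"/>
      <fptr><area FILEID="f9"/></fptr>
    </div>
  </structMap>
</mets>"""

dangling = dangling_file_pointers(ET.fromstring(SAMPLE))
print(dangling)
```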

structLink:
requirement: @ID="structLink1"

A conforming METS document may contain a <structLink> element. This profile, however, establishes no guidelines or expectations for its use.

behaviorSec:
requirement: @ID="behaviorSec1"

A conforming METS document may contain a <behaviorSec> element. This profile, however, establishes no guidelines or expectations for its use.

multiSection:
requirement: @ID="multi1"

Only <file> elements will reference <techMD>, <sourceMD>, <rightsMD> and/or <digiprovMD> elements. In other words, documents implementing this profile will express administrative metadata in conjunction with content files only rather than in conjunction with <div> elements in the <structMap>.

requirement: @ID="multi2"

Only <div> elements will reference <dmdSec> elements. In other words, documents implementing this profile will express descriptive metadata in conjunction with divisions of the <structMap> and not in conjunction with individual content files (<file> elements).
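
The two cross-section rules can be read as constraints on where the ADMID and DMDID attributes may appear. The sketch below, with hypothetical names and sample data, flags any ADMID found on an element other than <file> and any DMDID found on an element other than <div>.

```python
import xml.etree.ElementTree as ET

def misplaced_metadata_links(root):
    """Return (requirement ID, element name) pairs for misplaced ADMID/DMDID."""
    problems = []
    for el in root.iter():
        local = el.tag.rsplit("}", 1)[-1]  # strip the namespace part of the tag
        if el.get("ADMID") and local != "file":
            problems.append(("multi1", local))
        if el.get("DMDID") and local != "div":
            problems.append(("multi2", local))
    return problems

# Hypothetical sample: DMDID on a <file> and ADMID on a <div> both violate the rules.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/">
  <fileSec>
    <fileGrp USE="ocr"><file ID="f1" ADMID="tech1" DMDID="dmd1"/></fileGrp>
  </fileSec>
  <structMap TYPE="logical">
    <div LABEL="Entry 1" DMDID="dmd1" ADMID="rights1"/>
  </structMap>
</mets>"""

link_problems = misplaced_metadata_links(ET.fromstring(SAMPLE))
print(link_problems)
```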

technical_requirements:
content_files:
requirement:

If a METS document conforming to this profile has associated image content files, the master (archive) images must be represented and must be in TIFF format.

requirement:

At least one version of any image content must be of JPEG or GIF format. In other words, at least one content file format must be natively supported by typical internet browsers.

requirement:

All "tei translation" and "tei transcription" files must be encoded according to version 1 of the "TEI Text Encoding in Libraries: Guidelines for Best Encoding Practices" maintained by the Digital Library Federation (http://www.diglib.org/standards/tei.htm).

  The Library of Congress >> Standards
  July 1, 2011
