Program for Cooperative Cataloging - Library of Congress

Windows Clipboard and MARC Records: A Proposal

Prepared for the PCC Standing Committee on Automation by:
Mark Wilson
Head, Research & Development
The Library Corporation


Background

There are four generalized methods for transferring data between Windows applications: OLE2, DDE, vendor-specific messaging, and the Windows clipboard facility. Of the four, only the Windows clipboard method takes advantage of functionality that already exists in every Windows application. Implementing this method of data transfer imposes no great burden on any vendor and would assure the library community of at least a minimal level of standardized communication between Windows applications developed by different vendors.

There are exactly five MARC components to be addressed in terms of clipboard activity:

  1. A MARC record;
  2. A MARC record leader;
  3. An array of one or more MARC fields;
  4. Field data containing indicators;
  5. Field data without indicators.

An additional clipboard component, text from a non-MARC application, need not be addressed because Windows automatically handles such transfers.

Consider for the purposes of this discussion two applications: a source application wishing to select and pass a MARC component and a target application ready to receive and display the passed component. For the target there are only two real issues. The target must be able to discover which MARC component has been passed and determine what should be done with it. The source application has a simpler role to play; it need only format and identify the MARC component it wishes to pass via the clipboard. A secondary issue, exception handling, is not addressed in this document.

The resolution of these issues lies in a well-defined standard. Passed components need to have an object identifier (OID) known to both applications, need to be defined in terms of composition and layout, and, for the peace of mind of the librarians using both applications, need to exhibit standard behaviors when cut or pasted between applications. In practice, all that need be done is for vendors to prepare their applications to register the OID, follow a clipboard layout scheme, and behave as described below.

As an aside, an OID is a programming construct known to developers but hidden from users. OIDs can be registered with Microsoft to limit the possibility of confusion between MARC and non-MARC applications. This implies that a word processor, for instance, would "know" how to handle a pasted MARC component. However, since Microsoft has invested heavily in OLE2, and since the developers of non-MARC applications may have much higher priorities than making their applications MARC aware, the Committee might best assume it is on its own and create a set of MARC component OIDs, descriptions, and behavior specifications.

There are two possible OID implementation schemes. In the first, each MARC component would have its own identifier. In the second, there would exist a single MARC OID and each component would have a prefatory header declaring its nature, for instance, MARC1, MARC2, MARC3, MARC4, MARC5. There are no real differences between the two schemes save in the instance when a MARC component is passed to a non-MARC application. In this instance, MARC aware applications should, but would not be required to, provide a default text rendering for non-MARC applications.
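
As a rough illustration of the second scheme, a payload placed under a single MARC OID could carry a short tag ahead of the component data. The five-byte tag width, the byte-string representation, and the names `wrap_component` and `unwrap_component` are assumptions made for this sketch; the proposal itself does not fix a header layout.

```python
# Sketch of the single-OID scheme: a prefatory tag names the component.
# The 5-byte ASCII tag width is an assumption, not part of the proposal.
COMPONENT_TAGS = {
    1: b"MARC1",  # MARC record
    2: b"MARC2",  # leader
    3: b"MARC3",  # array of fields
    4: b"MARC4",  # field data with indicators
    5: b"MARC5",  # field data without indicators
}
TAG_TO_COMPONENT = {tag: number for number, tag in COMPONENT_TAGS.items()}
TAG_LENGTH = 5


def wrap_component(component: int, data: bytes) -> bytes:
    """Source side: prefix the component data with its identifying tag."""
    return COMPONENT_TAGS[component] + data


def unwrap_component(payload: bytes) -> tuple[int, bytes]:
    """Target side: return (component number, data).

    Raises ValueError when the payload does not start with a known tag.
    """
    tag = payload[:TAG_LENGTH]
    if tag not in TAG_TO_COMPONENT:
        raise ValueError("payload does not start with a known MARC tag")
    return TAG_TO_COMPONENT[tag], payload[TAG_LENGTH:]
```

Under the first scheme (one OID per component) the tag would be unnecessary, since the clipboard format identifier alone would tell the target which component it has received.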

This document seeks to describe consistent source and target behavior. In brief, pasting the following components to a MARC aware target results in the following behaviors:

  1. MARC Record -- replace all data.
  2. Leader -- replace some data.
  3. Fields -- add to record but do not replace any data.
  4. Data with indicators -- replace all data in the field.
  5. Data, no indicators -- insert into field and do not replace any data.
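
One way a target might organize these five behaviors is a simple lookup from component number to paste action. The sketch below is illustrative only; the `PasteAction` enumeration and the `action_for` helper are hypothetical names, not part of the proposal.

```python
from enum import Enum


class PasteAction(Enum):
    """Paste behaviors listed above, one per MARC component."""
    REPLACE_RECORD = "replace all data in the window"
    MERGE_LEADER = "replace some leader data"
    ADD_FIELDS = "add fields without replacing any data"
    REPLACE_FIELD = "replace all data in the field"
    INSERT_TEXT = "insert into the field without replacing any data"


# Component numbers follow the numbering used in this document.
PASTE_ACTIONS = {
    1: PasteAction.REPLACE_RECORD,
    2: PasteAction.MERGE_LEADER,
    3: PasteAction.ADD_FIELDS,
    4: PasteAction.REPLACE_FIELD,
    5: PasteAction.INSERT_TEXT,
}


def action_for(component: int) -> PasteAction:
    """Return the paste action for a component number (1 to 5)."""
    try:
        return PASTE_ACTIONS[component]
    except KeyError:
        raise ValueError(f"unknown MARC component: {component}") from None
```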

In all instances, the source must provide the OID, put the data into the clipboard in the formats described below, and ensure that any source internal representation of the data is rendered in the USMARC II extended character set. The target must be ready to recognize the OID, alter its window as described below, and be prepared to translate from the extended character set into any target internal representation.

Component 1: A MARC Record.

Identifier: OID + MARC1 or unique OID1.
Description: A MARC record in MARC II Communications format, consisting of leader, directory, and data segments as defined by USMARC Communications format standards.
Source Behavior: The source must ensure the record is formatted under USMARC II standards and that all internal representations are translated into the MARC extended character set. A pasted MARC record need not fulfill cataloging standards, i.e., so long as a valid leader, directory and at least one field are selected, the object will be considered a MARC record for the purpose of clipboard transfer. If the source application permits selection of the leader and several, but not all fields, it must adjust the passed leader and directory to reflect the actual contents of the passed record.
Target Behavior: The target, on discovering a MARC record OID in the clipboard, will discard all data in its window and replace it with a construction of the pasted MARC record by interpreting the passed leader and directory.
Exceptions: The Committee may wish to establish default behaviors for pasted records that exceed the target's record or field size limits. An alternative is to leave this behavior undefined save that the target application may not fail under these error conditions.
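
The adjustment of the leader and directory for a partial selection can be sketched as follows. The sketch relies on the standard MARC communications layout (24-character leader with the record length in positions 0-4 and the base address of data in positions 12-16, 12-character directory entries, hex 1E field terminator, hex 1D record terminator). The function name `build_record` and the representation of fields as (tag, data) pairs are assumptions for illustration, and character set translation is not shown.

```python
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"
LEADER_LENGTH = 24


def build_record(leader: bytes, fields: list[tuple[str, bytes]]) -> bytes:
    """Assemble a record from a leader and the selected (tag, data) fields.

    Field data must not already end with a field terminator. The record
    length and base address in the leader are recomputed; every other
    leader position is passed through unchanged.
    """
    if len(leader) != LEADER_LENGTH:
        raise ValueError("leader must be exactly 24 bytes")
    directory = b""
    body = b""
    for tag, data in fields:
        if len(tag) != 3:
            raise ValueError("field tags must be three characters")
        field = data + FIELD_TERMINATOR
        if len(field) > 9999:
            raise ValueError("field too long for a 4-digit directory length")
        # Directory entry: tag, 4-digit field length, 5-digit start offset.
        directory += f"{tag}{len(field):04d}{len(body):05d}".encode("ascii")
        body += field
    base_address = LEADER_LENGTH + len(directory) + 1  # directory terminator
    record_length = base_address + len(body) + 1       # record terminator
    if record_length > 99999:
        raise ValueError("record too long for a 5-digit record length")
    new_leader = (
        f"{record_length:05d}".encode("ascii")
        + leader[5:12]
        + f"{base_address:05d}".encode("ascii")
        + leader[17:]
    )
    return new_leader + directory + FIELD_TERMINATOR + body + RECORD_TERMINATOR
```

A source that lets the user select only some fields would call such a routine with the selected subset, so the directory it passes never refers to fields left behind.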

Component 2: A MARC Leader.

Identifier: OID + MARC2 or OID2.
Description: A leader is exactly 24 characters of ASCII text in USMARC defined sequence. Although the leader is defined as 24 characters, all but ten of these characters are system defined and two of those ten are either implementation defined or undefined. Only the descriptive elements (record type, bib level, etc.) are of interest when pasted to the clipboard.
Source Behavior: When only the leader is selected, the source should pass the leader contents unmodified and in USMARC II Communications format sequence. The source may, but is not required to, construct a dummy leader in which all but the descriptive elements are ASCII blanks.
Target Behavior: The target, on discovering a MARC leader OID in the clipboard, should extract only the descriptive elements for pasting to its window. All other elements in the target's leader should reflect the state of the record, if any, in the target window. The target is not responsible for updating any other element in its window based upon information in the leader (for instance, the 008 field). The target would:
  1. Maintain any system defined characters in its own window (record length, offset to data, etc.).
  2. Replace the descriptive leader characters in the target with those of the source (record type, bib level, etc.).
  3. Treat implementation defined characters as it sees fit; this behavior is left undefined. OCLC, for instance, uses an implementation defined character for private information. An application would be free to copy, discard, or change this value.
Exceptions: A well-behaved application would refuse to paste a leader to an inappropriate part of its window.
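
A minimal sketch of the leader paste follows. The proposal does not enumerate which positions count as descriptive, so the set used here (record status, type of record, bibliographic level, encoding level, and descriptive cataloging form, at positions 5, 6, 7, 17, and 18) is an assumption, as is the function name `merge_leader`.

```python
LEADER_LENGTH = 24

# Assumption: positions treated as "descriptive" for this sketch.
# 5 = record status, 6 = type of record, 7 = bibliographic level,
# 17 = encoding level, 18 = descriptive cataloging form.
DESCRIPTIVE_POSITIONS = (5, 6, 7, 17, 18)


def merge_leader(target_leader: str, pasted_leader: str) -> str:
    """Copy only the descriptive positions from the pasted leader.

    System defined positions (record length, base address, and so on)
    keep the values already held by the target.
    """
    if len(target_leader) != LEADER_LENGTH or len(pasted_leader) != LEADER_LENGTH:
        raise ValueError("a leader must be exactly 24 characters")
    merged = list(target_leader)
    for position in DESCRIPTIVE_POSITIONS:
        merged[position] = pasted_leader[position]
    return "".join(merged)
```

Because only the listed positions are read, the same routine accepts either an unmodified source leader or a dummy leader whose other positions are blank.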

Component 3: An Array of Fields.

Identifier: OID + MARC3 or OID3.
Description: An array of MARC fields consists of one or more fields without a MARC leader component. A standard must be developed to indicate the number of fields in the array, the size of the array, and to identify the individual members of the array. As a suggestion, that standard might be:
  1. A header, consisting of Field Count and Array Size;
  2. For each field, a repeated structure consisting of Field Number, Field Size, and Field Data.
Source Behavior: The source must be prepared to present the clipboard with an array of fields as defined in the Description portion of this section. All internal data must be rendered in the USMARC extended character set. The Committee should determine whether terminators (hex 1E) should be stripped or required for each field, or whether it is the responsibility of the target to inspect each passed field for the presence or lack of field terminators.
Target Behavior: The target will add all passed fields to its window; no existing field will be replaced or altered. The target is expected to deal with the issues of changes in record size, etc., in a consistent fashion. The target may, but is not required to, place the fields in their appropriate position within the record. The target may, but is not required to, deal with duplicate or inappropriate fields as the result of an array paste.
Exceptions: As with a full MARC record, the Committee may wish to suggest default behaviors when the number or length of fields exceeds the target application's limits.
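
The suggested array layout could be serialized along the following lines. All widths are assumptions chosen for this sketch (4-digit Field Count, 5-digit Array Size counting only the bytes after the header, 3-character Field Number, 4-digit Field Size), as are the names `pack_fields` and `unpack_fields`; the proposal leaves these details to the Committee.

```python
HEADER_LENGTH = 9  # assumed: 4-digit field count + 5-digit array size


def pack_fields(fields: list[tuple[str, bytes]]) -> bytes:
    """Source side: serialize (field number, data) pairs into one payload."""
    entries = b""
    for tag, data in fields:
        if len(tag) != 3 or len(data) > 9999:
            raise ValueError("field does not fit the assumed layout")
        # Per-field structure: Field Number, Field Size, Field Data.
        entries += f"{tag}{len(data):04d}".encode("ascii") + data
    if len(fields) > 9999 or len(entries) > 99999:
        raise ValueError("array does not fit the assumed header")
    header = f"{len(fields):04d}{len(entries):05d}".encode("ascii")
    return header + entries


def unpack_fields(payload: bytes) -> list[tuple[str, bytes]]:
    """Target side: recover the (field number, data) pairs."""
    count = int(payload[0:4])
    size = int(payload[4:9])
    if len(payload) != HEADER_LENGTH + size:
        raise ValueError("array size does not match payload length")
    fields = []
    offset = HEADER_LENGTH
    for _ in range(count):
        tag = payload[offset:offset + 3].decode("ascii")
        length = int(payload[offset + 3:offset + 7])
        start = offset + 7
        fields.append((tag, payload[start:start + length]))
        offset = start + length
    if offset != len(payload):
        raise ValueError("field sizes do not add up to the array size")
    return fields
```

Carrying both a count and a total size lets the target reject a truncated or oversized paste before it touches its own record.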

Component 4: Field Data with Indicators.

Identifier: OID + MARC4 or OID4.
Description: Text in which the initial two characters (or perhaps the only characters) are designated as indicators.
Source Behavior: The source must be able to identify a cut or copy that does not include field number but does include indicators. To avoid complexities, the source should ensure that if one indicator is cut or copied, both are. If this is not to be the rule, then the description of the data will have to include an indicator count. Field terminators (hex 1E) should be stripped.
Target Behavior: The target will replace all data in the pasted-to field with the data from the clipboard. This behavior is suggested because:
  1. The indicators must be pasted to their defined position;
  2. It is impossible to determine by inspection what should happen to any data already in the target field with respect to the pasted data.
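
The replace-all behavior for this component might look like the sketch below. The `Field` class and the `paste_with_indicators` function are hypothetical, and the sketch assumes the rule that both indicators are always passed together.

```python
from dataclasses import dataclass

FIELD_TERMINATOR = "\x1e"
INDICATOR_COUNT = 2  # assumes both indicators are always cut or copied


@dataclass
class Field:
    tag: str
    indicators: str
    data: str


def paste_with_indicators(target: Field, clipboard_text: str) -> Field:
    """Replace the target field's indicators and data with the pasted text.

    The field number is kept; everything else in the field is discarded.
    Any field terminator left in the pasted text is stripped.
    """
    text = clipboard_text.replace(FIELD_TERMINATOR, "")
    if len(text) < INDICATOR_COUNT:
        raise ValueError("pasted text must begin with two indicators")
    return Field(target.tag, text[:INDICATOR_COUNT], text[INDICATOR_COUNT:])
```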

Component 5: Field Data without Indicators.

Identifier: OID + MARC5 or OID5.
Description: Any ASCII text. Any field terminator (hex 1E) should be stripped. The Committee might decide that this paste is identical to a paste from a non-MARC aware application. However, if an OID identifies this MARC component, the target application is assured that the data is in the USMARC extended character set.
Source Behavior: The source should ensure that any text cut or copied is rendered in the USMARC extended character set.
Target Behavior: The target should insert the text following standard Windows pasting behavior.
  January 3, 2008