Windows Clipboard and MARC Records: A Proposal
Prepared for the PCC Standing Committee on Automation by:
Mark Wilson
Head, Research & Development
The Library Corporation
Background
There are four generalized methods for transferring data between Windows applications:
OLE2, DDE, vendor specific messaging, and the Windows clipboard facility. Of
the four, only the Windows clipboard method takes advantage of an already existing
functionality in every Windows application. Implementation of this method of
data transfer imposes no great burden on any vendor and would assure the library
community of at least a minimal level of standardized communication between
Windows applications developed by different vendors.
There are exactly five MARC components to be addressed in terms of clipboard
activity:
- A MARC record;
- A MARC record leader;
- An array of one or more MARC fields;
- Field data containing indicators;
- Field data without indicators.
An additional clipboard component, text from a non-MARC application, need
not be addressed because Windows automatically handles such transfers.
Consider for the purposes of this discussion two applications: a source application
wishing to select and pass a MARC component and a target application ready
to receive and display the passed component. For the target there are only
two real issues. The target must be able to discover which MARC component has
been passed and determine what should be done with it. The source application
has a simpler role to play; it need only format and identify the MARC component
it wishes to pass via the clipboard. A secondary issue, exception handling,
is not addressed in this document.
The resolution of these issues lies in a well-defined standard. Passed components
need to have an object identifier (OID) known to both applications, need to
be defined in terms of composition and layout, and for the peace of mind of
the librarians using both applications, need to exhibit standard behaviors
when cut or pasted between applications. In practice, all that need be done
is for vendors to prepare their applications to register the OID, follow a
clipboard layout scheme, and behave as described below.
As an aside, an OID is a programming construct known to developers but hidden
from users. OIDs can be registered with Microsoft to limit the possibility
of confusion between MARC and non-MARC applications. This implies that a word
processor, for instance, would "know" how to handle a pasted MARC component.
However, since Microsoft has invested heavily in OLE2, and since the developers
of non-MARC applications may have much higher priorities than making their
applications MARC aware, the Committee might best assume it is on its own and
create a set of MARC component OIDs, descriptions, and behavior specifications.
There are two possible OID implementation schemes. In the first, each MARC
component would have its own identifier. In the second, there would exist a
single MARC OID and each component would have a prefatory header declaring
its nature, for instance, MARC1, MARC2, MARC3, MARC4, MARC5. There are no real
differences between the two schemes save in the instance when a MARC component
is passed to a non-MARC application. In this instance, MARC aware applications
should, but would not be required to, provide a default text rendering for
non-MARC applications.
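As a minimal sketch of the second scheme, the fragment below prepends one of the five prefatory headers to a payload and lets a target recover the component type from it. The header names MARC1 through MARC5 come from this proposal; the five-byte header width, the component labels, and the function names tag_payload and identify_payload are assumptions made only for illustration.

```python
from typing import Tuple

# Header names are from the proposal; the labels on the right are hypothetical.
COMPONENTS = {
    b"MARC1": "record",
    b"MARC2": "leader",
    b"MARC3": "fields",
    b"MARC4": "data_with_indicators",
    b"MARC5": "data_without_indicators",
}
HEADER_LEN = 5  # assumed fixed width of the prefatory header


def tag_payload(header: bytes, payload: bytes) -> bytes:
    """Source side: prefix the payload with its component header."""
    if header not in COMPONENTS:
        raise ValueError("unknown MARC component header: %r" % header)
    return header + payload


def identify_payload(clip: bytes) -> Tuple[str, bytes]:
    """Target side: split clipboard bytes into (component, payload)."""
    header, payload = clip[:HEADER_LEN], clip[HEADER_LEN:]
    if header not in COMPONENTS:
        raise ValueError("not a MARC clipboard component")
    return COMPONENTS[header], payload
```

Under the first scheme the lookup table would instead be keyed by five separately registered identifiers, and no header would be carried in the data itself.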
This document seeks to describe consistent source and target behavior. In
brief, pasting the following components to a MARC aware target results in the
following behaviors:
- MARC Record -- replace all data.
- Leader -- replace some data.
- Fields -- add to record but do not replace any data.
- Data with indicators -- replace all data in the field.
- Data, no indicators -- insert into field and do not replace any data.
In all instances, the source must provide the OID, put the data into the clipboard
in the formats described below, and ensure that any source internal representation
of the data is rendered in the USMARC II extended character set. The target
must be ready to recognize the OID, alter its window as described below, and
be prepared to translate from the extended character set into any target internal
representation.
Component 1: A MARC Record.
Identifier: OID + MARC1 or unique OID1.
Description: A MARC record in MARC II Communications format, consisting
of leader, directory, and data segments as defined by USMARC Communications
format standards.
Source Behavior: The source must ensure the record is formatted under
USMARC II standards and that all internal representations are translated into
the MARC extended character set. A pasted MARC record need not fulfill cataloging
standards, i.e., so long as a valid leader, directory and at least one field
are selected, the object will be considered a MARC record for the purpose of
clipboard transfer. If the source application permits selection of the leader
and several, but not all fields, it must adjust the passed leader and directory
to reflect the actual contents of the passed record.
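The adjustment of leader and directory for a partial selection might look like the following sketch. It assumes the usual communications format layout: a 24-character leader with record length in positions 0-4 and base address of data in positions 12-16, 12-character directory entries (tag, 4-digit length, 5-digit start), field terminator hex 1E, and record terminator hex 1D. The function name build_record and the (tag, body) tuple representation are hypothetical.

```python
FT = b"\x1e"  # field terminator
RT = b"\x1d"  # record terminator
LEADER_LEN = 24


def build_record(leader: bytes, fields) -> bytes:
    """Assemble a record from the selected fields only.

    `fields` is a list of (tag, body) pairs, where `body` holds the
    indicators and subfield data without a field terminator.
    """
    if len(leader) != LEADER_LEN:
        raise ValueError("leader must be exactly 24 bytes")
    directory = b""
    data = b""
    for tag, body in fields:
        if len(tag) != 3:
            raise ValueError("field tag must be 3 characters")
        chunk = body + FT
        if len(chunk) > 9999:
            raise ValueError("field too long for a directory entry")
        # Directory entry: tag, field length, offset from base address.
        directory += tag.encode("ascii") + b"%04d%05d" % (len(chunk), len(data))
        data += chunk
    base = LEADER_LEN + len(directory) + 1   # +1 for the directory terminator
    total = base + len(data) + 1             # +1 for the record terminator
    if total > 99999:
        raise ValueError("record too long for the leader")
    # Rewrite only the two computed leader elements; keep the rest as passed.
    new_leader = b"%05d" % total + leader[5:12] + b"%05d" % base + leader[17:]
    return new_leader + directory + FT + data + RT
```

Because the directory is rebuilt from whatever was selected, the passed record is internally consistent even when it contains only a subset of the source record's fields.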
Target Behavior: The target, on discovering a MARC record OID in
the clipboard, will discard all data in its window and replace it with a construction
of the pasted MARC record by interpreting the passed leader and directory.
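On the target side, interpreting the passed leader and directory could be sketched as below. The same layout assumptions apply (base address in leader positions 12-16, 12-character directory entries, hex 1E field terminator); parse_record is a hypothetical name, and a real target would then translate each field body into its own internal representation.

```python
FT = b"\x1e"  # field terminator
LEADER_LEN = 24
ENTRY_LEN = 12  # tag (3) + field length (4) + starting position (5)


def parse_record(record: bytes):
    """Return (leader, [(tag, body), ...]) from a communications-format record."""
    leader = record[:LEADER_LEN]
    if len(leader) != LEADER_LEN:
        raise ValueError("record shorter than a leader")
    base = int(leader[12:17])
    directory = record[LEADER_LEN:base - 1]  # drop the directory terminator
    if len(directory) % ENTRY_LEN:
        raise ValueError("directory length is not a multiple of 12")
    fields = []
    for i in range(0, len(directory), ENTRY_LEN):
        entry = directory[i:i + ENTRY_LEN]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        body = record[base + start:base + start + length]
        if body.endswith(FT):
            body = body[:-1]  # the terminator is not part of the field data
        fields.append((tag, body))
    return leader, fields
```

The target would then discard the contents of its window and display the returned leader and fields in their place.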
Exceptions: The Committee may wish to establish default behaviors
for pasted records that exceed the target's record or field size limits. An
alternative is to leave this behavior undefined save that the target application
may not fail under these error conditions.
Component 2: A MARC Leader.
Identifier: OID + MARC2 or OID2.
Description: A leader is exactly 24 characters of ASCII text in USMARC
defined sequence. Although the leader is defined as 24 characters, all but
ten of these characters are system defined and two of those ten are either
implementation defined or undefined. Only the descriptive elements (record
type, bib level, etc.) are of interest when pasted to the clipboard.
Source Behavior: When only the leader is selected, the source should
pass the leader contents unmodified and in USMARC II Communications format
sequence. The source may, but is not required to, construct a dummy leader
in which all but the descriptive elements are ASCII blanks.
Target Behavior: The target, on discovering a MARC leader OID in
the clipboard, should extract only the descriptive elements for pasting to
its window. All other elements in the target's leader should reflect the state
of the record, if any, in the target window. The target is not responsible
for updating any other element in its window based upon information in the
leader (for instance, the 008 field). The target would:
1. Maintain any system defined characters in its own window (record length, offset to data, etc.).
2. Replace the descriptive leader characters in the target with those of the source (record type, bib level, etc.).
3. Handle implementation defined characters as it sees fit; this behavior is left undefined. OCLC, for instance, uses an implementation defined character for private information. An application would be free to copy, discard, or change this value.
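The leader paste described above can be sketched as a position-by-position merge. This proposal does not enumerate the descriptive positions, so the set used here (5, 6, 7, 8, 17, 18, 19) is purely an assumption for illustration, as is the function name paste_leader; the Committee would need to fix the actual list.

```python
LEADER_LEN = 24

# ASSUMPTION: which leader positions count as "descriptive". The proposal
# names only record type and bib level as examples.
DESCRIPTIVE_POSITIONS = (5, 6, 7, 8, 17, 18, 19)


def paste_leader(target_leader: bytes, clip_leader: bytes,
                 positions=DESCRIPTIVE_POSITIONS) -> bytes:
    """Copy only the descriptive elements from the clipboard leader."""
    if len(target_leader) != LEADER_LEN or len(clip_leader) != LEADER_LEN:
        raise ValueError("a leader is exactly 24 characters")
    merged = bytearray(target_leader)  # system defined elements stay as they are
    for pos in positions:
        merged[pos] = clip_leader[pos]
    return bytes(merged)
```

Because only the listed positions are read, the same function accepts either an unmodified source leader or a dummy leader whose other elements are blanks.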
Exceptions: A well-behaved application would refuse to paste a leader
to an inappropriate part of its window.
Component 3: An Array of Fields.
Identifier: OID + MARC3 or OID3.
Description: An array of MARC fields consists of one or more fields
without a MARC leader component. A standard must be developed to indicate the
number of fields in the array, the size of the array, and to identify the individual
members of the array. As a suggestion, that standard might be: 1. A header,
consisting of Field Count and Array Size; 2. For each field, a repeated structure
consisting of Field Number, Field Size, and Field Data.
Source Behavior: The source must be prepared to present the clipboard with an array of fields as
defined in the Description portion of this section. All internal data must
be rendered in the USMARC extended character set. The Committee should determine
whether terminators (hex 1E) should be stripped or required for each field,
or whether it is the responsibility of the target to inspect each passed field
for the presence or lack of field terminators.
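One possible concrete form of the suggested layout (a header of Field Count and Array Size, then Field Number, Field Size, and Field Data for each field) is sketched below. The fixed-width ASCII digit encoding, the widths themselves (4, 5, 3, and 4 characters), the reading of Array Size as the byte length of everything after the header, and the names pack_fields and unpack_fields are all assumptions; the proposal leaves these details to the Committee. Terminators are assumed to be stripped here.

```python
def pack_fields(fields) -> bytes:
    """Source side: serialize [(tag, data), ...] into the array layout."""
    body = b""
    for tag, data in fields:
        if len(tag) != 3:
            raise ValueError("field number must be 3 characters")
        # Field Number (3) + Field Size (4 digits) + Field Data
        body += tag.encode("ascii") + b"%04d" % len(data) + data
    # Header: Field Count (4 digits) + Array Size (5 digits)
    return b"%04d%05d" % (len(fields), len(body)) + body


def unpack_fields(clip: bytes):
    """Target side: recover [(tag, data), ...] and check the header."""
    count, size = int(clip[0:4]), int(clip[4:9])
    body = clip[9:9 + size]
    if len(body) != size:
        raise ValueError("array is shorter than its declared size")
    fields, pos = [], 0
    for _ in range(count):
        tag = body[pos:pos + 3].decode("ascii")
        length = int(body[pos + 3:pos + 7])
        data = body[pos + 7:pos + 7 + length]
        if len(data) != length:
            raise ValueError("field is shorter than its declared size")
        fields.append((tag, data))
        pos += 7 + length
    if pos != size:
        raise ValueError("declared sizes do not match the array contents")
    return fields
```

Carrying both a count and a size lets the target reject a damaged array before it changes anything in its window.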
Target Behavior: The target will add all passed fields to its window;
no existing field will be replaced or altered. The target is expected to deal
with the issues of changes in record size, etc., in a consistent fashion. The
target may, but is not required to, place the fields in their appropriate position
within the record. The target may, but is not required to, deal with duplicate
or inappropriate fields as the result of an array paste.
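A target that chooses to place pasted fields in tag order, without replacing or altering anything already present, might do so along these lines. The (tag, data) tuple representation and the name paste_fields are hypothetical; a target that does not order fields could simply append.

```python
def paste_fields(existing, pasted):
    """Add every pasted field; never replace or alter an existing one.

    Each pasted field is inserted before the first existing field with a
    higher tag, so a duplicate tag lands after the fields already present.
    """
    result = list(existing)  # work on a copy of the window's field list
    for tag, data in pasted:
        index = next(
            (i for i, (other, _) in enumerate(result) if other > tag),
            len(result),
        )
        result.insert(index, (tag, data))
    return result
```

Duplicate fields are deliberately kept, since the proposal leaves their handling to the target's discretion.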
Exceptions: As with a full MARC record, the Committee may wish to
suggest default behaviors when the number or length of fields exceeds the target
application's limits.
Component 4: Field Data with Indicators.
Identifier: OID + MARC4 or OID4.
Description: Text in which the initial two characters (or perhaps
the only characters) are designated as indicators.
Source Behavior: The source must be able to identify a cut or copy
that does not include field number but does include indicators. To avoid complexities,
the source should ensure that if one indicator is cut or copied, both must
be. If this is not to be the rule, then the description of the data will have
to include an indicator count. Field terminators (hex 1E) should be stripped.
Target Behavior: The target will replace all data in the pasted-to
field with the data from the clipboard. This behavior is suggested because:
1. The indicators must be pasted to their defined position; 2. It is impossible
to determine by inspection what should happen to any data already in the target
field with respect to the pasted data.
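The source and target behaviors for this component can be sketched as a pair of small functions. The sketch assumes the rule that both indicators always travel together, so the first two characters of the clipboard data are the indicators. The dictionary representation of a field and the names copy_with_indicators and paste_with_indicators are hypothetical.

```python
FT = b"\x1e"  # field terminator


def copy_with_indicators(indicators: bytes, data: bytes) -> bytes:
    """Source side: both indicators plus data, with terminators stripped."""
    if len(indicators) != 2:
        raise ValueError("both indicators must be cut or copied together")
    return indicators + data.replace(FT, b"")


def paste_with_indicators(field: dict, clip: bytes) -> dict:
    """Target side: replace all data in the pasted-to field."""
    if len(clip) < 2:
        raise ValueError("clipboard data must begin with two indicators")
    return {
        "tag": field["tag"],        # the field number is not on the clipboard
        "indicators": clip[:2],     # pasted to their defined position
        "data": clip[2:],           # existing data is discarded, not merged
    }
```

The clipboard data may consist of the two indicators alone, in which case the pasted-to field ends up with new indicators and empty data.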
Component 5: Field Data without Indicators.
Identifier: OID + MARC5 or OID5.
Description: Any ASCII text. Any field terminator (hex 1E) should
be stripped. The Committee might decide that this paste is identical to a paste
from a non-MARC aware application. However, if an OID identifies this MARC
component, the target application is assured that the data is in the USMARC
extended character set.
Source Behavior: The source should ensure that any text cut or copied is rendered
in the USMARC extended character set.
Target Behavior: The target should insert the text following standard Windows
pasting behavior.
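For completeness, the insert behavior for this component amounts to the following sketch, in which the field data is treated as a byte string and the caret position as an offset into it. The name paste_text and the caret parameter are hypothetical, and replacement of a highlighted selection (which standard Windows pasting would also cover) is not modeled.

```python
FT = b"\x1e"  # field terminator


def paste_text(field_data: bytes, caret: int, clip: bytes) -> bytes:
    """Insert clipboard text at the caret without replacing existing data."""
    if not 0 <= caret <= len(field_data):
        raise ValueError("caret is outside the field")
    clip = clip.replace(FT, b"")  # terminators never travel with the text
    return field_data[:caret] + clip + field_data[caret:]
```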