MARC-SGML and SGML-MARC Conversion Program User Guide
1 Introduction
The term "MARC DTD" (MAchine Readable Cataloging Document Type
Definition) refers to implementations of Standard
Generalized Markup Language (SGML). SGML is a technique for
representing documents in machine-readable form which was approved
as an international standard, ISO 8879 (Information
processing--Text and office systems--Standard Generalized Markup
Language). It was developed to fill the need for a non-proprietary
standard for text encoding so that machine-readable data could be
exchanged between dissimilar text encoding environments. SGML is
widely used in the publishing industry where documents are created
using various computer systems. SGML supports the definition of
sets of elements, some of them abstract, that constitute specific
document types (for example, journal articles). The MARC DTDs treat
machine-readable cataloging records as a distinct type of document.
They define all the elements that might constitute a MARC record in
parallel with the lists of data elements defined in the five USMARC
formats.
The MARC-SGML encoding was designed to be an alternate
structure for the information in standard MARC structure records
(the structure defined in ISO 2709; Information and
documentation--Format for information exchange). The MARC-SGML
implementation provides full mappability between the two standard
data structures. MARC-SGML was developed because there are some
situations in which users find SGML (ISO 8879) a more appropriate
structure than MARC (ISO 2709). Because some users will find that
for one task or process they prefer to use data encoded using the
MARC structure and for another they prefer SGML, it is very helpful
to have tools to convert from one to the other as needed.
The Network Development and MARC Standards Office at the Library of
Congress funded development of two software programs for converting
between the MARC and SGML structures. The programs are:
- mrc2sgm.pl
- MARC record to tag-valid SGML converter
- sgm2mrc.pl
- Tag-valid SGML to MARC record converter
1.1 Organization of this Manual
- Chapter 1, Introduction
- This introduction.
- Chapter 2, MARC-SGML Conversion
- Describes the inputs,
outputs, and operation of the
mrc2sgm.pl conversion program.
- Chapter 3, SGML-MARC Conversion
- Describes the inputs,
outputs, and operation of the
sgm2mrc.pl conversion program.
- Chapter 4, MARC Description File
- Outlines the use of
the MARC Description File that controls the conversion. For more
information, see the Maintenance Manual.
- Chapter 5, Character-Entity Map File
- Outlines the use
of the Character-Entity Map File that controls the conversion
between character codes in the MARC records and entities in the
corresponding SGML structure. For more information, see the
Maintenance Manual.
- Chapter 6, Log File
- Describes the format of the log
file generated by the conversion programs.
- Chapter 7, Command File
- Describes the format of the
command file that may be used as an alternative to typing all the
command options on the command line.
- Chapter 8, Limitations
- Lists some of the limitations of
the conversion processes.
- Appendix A: User Interface
- Lists the complete command
options of both conversion programs.
- Appendix B: Error Messages
- Lists the complete error
messages from both conversion programs. Each list is in
alphabetical order.
- Appendix C: Installation
- How to install the
programs.
1.2 Maintenance Manual
Additional information about the format of the MARC Description
File and the Character-Entity Map File as well as descriptions of
the functioning of the two programs is contained in the companion
Maintenance Manual.
1.3 Typographic Conventions
When text is to be input to a computer or output by a computer,
and some pieces of that text vary, the variable pieces are shown in
italics in this manual. This means that the user is to replace
what is shown in italics by a specific value or that in computer
output, the program will substitute an actual value.
2 MARC-SGML Conversion
2.1 Input
Input to the MARC-SGML conversion utility is a string of one or
more MARC-structured records. Ideally these are valid USMARC
records, but other varieties of MARC and locally-extended MARC can
also be converted.
The input must be and the output will be "well-formed", but
neither is guaranteed to be valid. "Well-formed" means that the
MARC input files must contain MARC records with correctly
structured Leader, Directory, and variable control and data fields,
followed by an End-of-Record mark. MARC Leader positions 06 and 07
must be present and contain values that are legal according to the
MARC Description File. (NOTE: Definition of positions 06 and 07 in
the Description File can be modified by the user of the conversion
program and are thus not limited to values specified as valid in
USMARC.) With this small exception, no particular fields in the
Leader, control, or data fields are required nor are particular
subfields required within a given field. Field and subfield order
need not follow the USMARC format specifications. In short, the
record must be well-structured MARC but need not meet the full
requirements of valid USMARC records.
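The structural requirements above can be sketched as a small check. The following is a minimal illustration in Python and is not part of the distributed PERL programs. The function name and the ALLOWED_06_07 set are hypothetical; in practice the legal values for Leader positions 06 and 07 come from the MARC Description File. The sketch relies only on the ISO 2709 layout: a 24-character Leader whose positions 00-04 hold the record length, and a record that ends with an end-of-field character (hex 1E) followed by an end-of-record character (hex 1D).

```python
import re

FIELD_TERMINATOR = b"\x1e"   # end-of-field character
RECORD_TERMINATOR = b"\x1d"  # end-of-record character
LEADER_LENGTH = 24
# Assumption for illustration only: the real list of legal values is
# declared in the MARC Description File, not hard-coded like this.
ALLOWED_06_07 = {"am", "as", "cm"}


def structure_problems(record: bytes, allowed=ALLOWED_06_07) -> list:
    """Return a list of structural problems; an empty list means none found."""
    if len(record) < LEADER_LENGTH + 2:
        return ["record is shorter than a Leader plus terminators"]
    problems = []
    leader = record[:LEADER_LENGTH].decode("latin-1")
    # Leader positions 00-04: total record length as five decimal digits.
    if not re.fullmatch(r"[0-9]{5}", leader[0:5]) or int(leader[0:5]) != len(record):
        problems.append("Leader record length does not match actual length")
    # Leader positions 06 and 07 must be one of the declared combinations.
    if leader[6:8] not in allowed:
        problems.append("Leader positions 06-07 are not an allowed value")
    # The last field's terminator is followed by the record terminator.
    if not record.endswith(FIELD_TERMINATOR + RECORD_TERMINATOR):
        problems.append("record does not end with end-of-field, end-of-record")
    return problems
```

A record for which the function returns an empty list is only structurally plausible; as noted above, that does not make it a valid USMARC record.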
2.2 Output
The output is a data stream of tagged but unvalidated SGML
data, with tagged elements containing many sub-elements for each
properly-structured MARC record in the input file. Each record
element will contain unique subelements for its Leader, each of its
variable control fields, variable data fields, and subfields, in
addition to any grouping elements specified in the MARC Description
File. Ideally, the resulting tagged element is MARC-SGML data that
is valid according to one of the MARC DTDs, but SGML parsing is not
required for conversion.
The input must be and the output will be "well-formed", but
neither is guaranteed to be valid. "Well-formed" means that in
tagged output all start-tags and end-tags will be paired and that
all attributes will be quoted. (NOTE: This is not "well-formed" in
the XML (eXtensible Markup Language) sense, since XML uses a
different syntax for empty tags and requires that all entities
be declared.) Element and
attribute names are constructed according to the mechanism
established in the MARC Bibliographic and Authority DTDs. Grouping
elements are constructed according to the MARC Description file.
The specific elements, grouping elements, and the relationships
between the elements will not necessarily be those in the current
MARC DTDs. Therefore the output instances are not guaranteed to
parse cleanly. NOTE: Since the rules of SGML as specified in
ISO 8879 have been followed, a DTD could be written that is valid
for any particular tagged MARC instance.
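The two properties named above (paired tags and quoted attributes) can be tested without a DTD. The following Python sketch is an illustration only, not the converter's own logic. It assumes that attribute values contain no ">" character and it compares element names exactly as written; the element and attribute names in the usage note are hypothetical samples.

```python
import re

# A start-tag or end-tag: group 1 is "/" for end-tags, group 2 is the
# element name, group 3 is the raw attribute text (possibly empty).
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.\-]*)([^<>]*)>")
# One attribute whose value is wrapped in double or single quotes.
QUOTED_ATTR = re.compile(r"""\s+[A-Za-z][A-Za-z0-9.\-]*\s*=\s*("[^"]*"|'[^']*')""")


def is_well_formed(sgml: str) -> bool:
    """True if all tags are properly paired and all attributes are quoted."""
    open_elements = []
    for match in TAG.finditer(sgml):
        closing, name, attrs = match.groups()
        if closing:
            # An end-tag must close the most recently opened element.
            if not open_elements or open_elements.pop() != name:
                return False
        else:
            # Once quoted attributes are removed, only white space may remain.
            if QUOTED_ATTR.sub("", attrs).strip():
                return False
            open_elements.append(name)
    return not open_elements
```

For example, is_well_formed('<mrcb><mrcb245 i1="1">Title</mrcb245></mrcb>') is true, while the same text with i1=1 (unquoted) is rejected. Passing this check says nothing about validity against a MARC DTD.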
Unless an output filename is specified as a command option, the
program's output is written to the file named "stdout.sgm" in the
current directory. Unless a log file name is specified as a command
option, the program's log is written to the file "mrc2sgm.log" in
the current directory.
2.3 Control of the Conversion Process
The conversion process is table-driven, with top-level
controlling data coming directly from the conversion operator and
detailed controlling data provided in the (user modifiable)
MARC Description File, which specifies the information
necessary to generate tags and attributes in the output SGML. There
is no direct connection to the MARC DTDs except as is built into
this file.
2.4 Invoking the Program
On DOS and Windows systems, the program is run from the DOS
prompt. The program command line must be preceded by "perl -S", as
in the following examples, so that the program is run by the PERL
interpreter.
On Unix systems, the "perl -S" prefix is not necessary, since Unix
systems are able to start the PERL interpreter automatically to
execute the program.
2.5 Examples
The following examples illustrate the use of the
mrc2sgm.pl program in the DOS and Windows command format.
The complete program options are listed in Appendix A.
The user will type (or place in a batch file) the following
commands and parameter settings:
perl -S mrc2sgm.pl file.mrc
The file of MARC records, "file.mrc", will be converted to SGML.
Since no options were given, the default minimal conversion of
significant SGML characters will be performed, the output will be
written to "stdout.sgm", and the log file written to "mrc2sgm.log".
perl -S mrc2sgm.pl -charconv file.mrc
This is the same as the previous example except that "-charconv"
specifies that non-ASCII characters will be converted to SGML
entities using the character conversion table referenced by the
program.
perl -S mrc2sgm.pl -command mycmd.cmd file.mrc
The program is executed using the options contained in the
command file "mycmd.cmd".
perl -S mrc2sgm.pl -command mycmd.cmd -log mylog.log file.mrc
The program is executed using the options contained in the
command file "mycmd.cmd", except that the command line option
"-log mylog.log" overrides both the default log file name and
any "-log" log file name assignment in the command file.
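When the conversion is driven from a script rather than typed at the prompt, the command lines shown above can be assembled programmatically. The following Python sketch is a hypothetical helper, not part of the distribution; it only builds the argument list and uses only options documented in Appendix A.

```python
def build_command(program, input_file, *, command_file=None, log_file=None,
                  output_file=None, charconv=False, via_perl=True):
    """Assemble an argument list for mrc2sgm.pl or sgm2mrc.pl."""
    # On DOS and Windows the program is started through "perl -S";
    # on Unix the prefix can be dropped (via_perl=False).
    argv = ["perl", "-S", program] if via_perl else [program]
    if command_file:
        argv += ["-command", command_file]
    if charconv:
        argv.append("-charconv")
    if log_file:
        argv += ["-log", log_file]
    if output_file:
        argv += ["-o", output_file]
    argv.append(input_file)  # the input file is always the last argument
    return argv
```

The resulting list can be handed to Python's standard subprocess.run function to start the conversion.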
3 SGML-MARC Conversion
3.1 Input
Input to the conversion utility is an SGML instance consisting
of one or more logical MARC records, marked up using element tags
from one of the MARC DTDs. The input data is assumed to be valid
(parsable) SGML although not necessarily parsed. Since this
conversion utility is table-driven and does not reference a DTD,
the utility cannot verify the validity of the SGML input data.
3.2 Output
The output from the SGML-MARC converter is a datastream of MARC
record data. When the input is structurally sound, the output will
be "well-formed", but neither is guaranteed to be valid.
"Well-formed" means that the MARC output files will contain MARC
records with a correctly structured Leader, Directory, and variable
control and data fields. Each record will terminate with the
USMARC end-of-record character (hexadecimal 1D). Leader positions
06 and 07 will contain values that are legal according to the MARC
Description File. (NOTE: As with the MARC-SGML converter,
definition of positions 06 and 07 in the Description File can be
modified by the user of the conversion program and are thus not
limited to values specified as valid in USMARC.) With this small
exception, no particular fields in the Leader, control, or data
fields are required nor are particular subfields required within a
given field. Fields will be listed in the Directory in ascending
numerical order. In short, the record is well-structured MARC but
need not meet the full requirements of valid USMARC records.
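The output structure described above can be illustrated with a short sketch that assembles one record from a Leader template and a set of fields. This Python code is an illustration of the ISO 2709 layout only and is not taken from sgm2mrc.pl; the function name and the form of its arguments are assumptions. Each Directory entry is 12 characters (3-character tag, 4-digit field length, 5-digit starting position), and the record length in Leader positions 00-04 cannot exceed 99,999.

```python
FIELD_TERMINATOR = b"\x1e"   # end-of-field character
RECORD_TERMINATOR = b"\x1d"  # end-of-record character
MAX_RECORD_LENGTH = 99999    # record length occupies five digits in the Leader


def build_record(leader: str, fields: list) -> bytes:
    """Build a record from a 24-character Leader template and (tag, data) pairs."""
    if len(leader) != 24:
        raise ValueError("Leader must be 24 characters")
    directory = b""
    data = b""
    # sorted() is stable, so repeated tags keep their relative order while
    # the Directory ends up in ascending tag order.
    for tag, value in sorted(fields, key=lambda field: field[0]):
        body = value + FIELD_TERMINATOR
        if len(tag) != 3 or len(body) > 9999:
            raise ValueError("bad tag or field too long for a Directory entry")
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    base_address = 24 + len(directory) + 1   # Leader + Directory + terminator
    total = base_address + len(data) + 1     # plus the end-of-record character
    if total > MAX_RECORD_LENGTH:
        raise ValueError("record too long")
    # Fill in Leader positions 00-04 (length) and 12-16 (base address of data).
    head = f"{total:05d}{leader[5:12]}{base_address:05d}{leader[17:]}"
    return head.encode("ascii") + directory + FIELD_TERMINATOR + data + RECORD_TERMINATOR
```

Given the fields 245 and 001 in that order, the sketch lists 001 first in the Directory, matching the ascending order described above.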
Unless an output filename is specified as a command option, the
program's output is written to the file "stdout.mrc" in the current
directory.
Unless a log file name is specified as a command option, the
program's log is written to the file "sgm2mrc.log" in the current
directory.
3.3 Control of the Conversion Process
The conversion process is table-driven, with top-level
controlling data coming directly from the conversion operator and
detailed controlling data provided in the user modifiable
MARC Description File, which specifies the information
necessary to create positionally-defined elements with their
correct length. There is no direct connection to the MARC DTDs
except as is built into this file.
3.4 Invoking the Program
On DOS and Windows systems, the program is run from the DOS
prompt. The program command line must be preceded by "perl -S", as
in the following examples, so that the program is run by the PERL
interpreter.
On Unix systems, the "perl -S" prefix is not necessary, since Unix
systems are able to start the PERL interpreter automatically to
execute the program.
3.5 Examples
The following examples illustrate the use of the
sgm2mrc.pl program in DOS and Windows command format. The
complete program options are listed in Appendix A.
The user will type (or place in a batch file) the following:
perl -S sgm2mrc.pl file.sgm
The file of SGML structured MARC records, "file.sgm", will be
converted to the MARC record structure. Since no options were
given, the default minimal conversion of the entities for
significant SGML characters will be performed, the output will be
written to "stdout.mrc", and the log file written to "sgm2mrc.log".
perl -S sgm2mrc.pl -charconv file.sgm
This is the same as the previous example except that "-charconv"
specifies that the entities for non-ASCII characters will be
converted to characters in the USMARC Basic and Extended Latin
character sets using the character conversion table.
perl -S sgm2mrc.pl -command mycmd.cmd file.sgm
The program is executed using the options contained in the
command file "mycmd.cmd".
perl -S sgm2mrc.pl -command mycmd.cmd -log mylog.log file.sgm
The program is executed using the options contained in the
command file "mycmd.cmd", except that the command line option
"-log mylog.log" overrides both the default log file name and
any "-log" log file name assignment in the command file.
4 MARC Description File
The conversion utilities are controlled by a single SGML file
containing a description of the MARC record format. The program
itself is not hard-wired for any implementation of MARC records,
and it reads the description file to find out what to expect in the
MARC records or the MARC SGML. An error is signaled if an input
MARC record or MARC SGML element does not conform to the
description.
The MARC Description File format is explained in the
Maintenance Manual.
5 Character-Entity Map File
These files control the conversion of characters in the MARC
data to both ISO-defined and MARC-specific entities in the SGML
output and the conversion of entities in the SGML to MARC records.
Two conversion files are required by the program: an
upper-register-to-entity conversion file and a character-to-entity
conversion file. An additional file may be specified by the user.
The mapping in the selected conversion file is converted into
program code executed by the program to perform the
character-to-entity conversion.
If no conversion is specified, the default, built-in conversion
takes place. It affects only a small set of significant SGML
characters and control characters.
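The idea of a minimal conversion can be sketched as a small two-way table lookup. The Python code below is an illustration under stated assumptions: the three mappings shown use the common ISO entity names for the ampersand and angle brackets and are a hypothetical stand-in for the built-in table, whose actual contents (including its handling of control characters) are defined by the programs and their map files.

```python
import re

# Hypothetical table for illustration; the real mappings are held in the
# conversion programs and the Character-Entity Map Files.
SANITY_MAP = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def chars_to_entities(text: str, table=SANITY_MAP) -> str:
    """MARC-to-SGML direction: replace each mapped character by its entity."""
    return "".join(table.get(ch, ch) for ch in text)


def entities_to_chars(text: str, table=SANITY_MAP) -> str:
    """SGML-to-MARC direction: replace each mapped entity by its character."""
    reverse = {entity: ch for ch, entity in table.items()}
    # Longest entities first, so that one entity is never mistaken for a
    # prefix of another; a single pass avoids decoding text twice.
    names = sorted(reverse, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    return pattern.sub(lambda m: reverse[m.group(0)], text)
```

Converting in one direction and then the other returns the original text, which is the round-trip property the two programs are meant to provide for mapped characters.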
The Character-Entity Map File format is explained in the
Maintenance Manual.
6 Log File
Both conversion utilities generate log files. If not specified
in a command option, the default log file for mrc2sgm.pl is
"mrc2sgm.log", and for sgm2mrc.pl is "sgm2mrc.log". A log file
lists:
- The time the conversion started and ended;
- The files used in the conversion;
- The number of records processed, converted, and skipped;
and
- Any error messages.
Except for the content of the error messages, the format of the log
files from the two conversion utilities is identical.
Example 1
In this example, no command file was specified, the default
SGML character conversion was used, there were no errors, and eight
records were processed.
mrc2sgm.pl started at Thu Nov 13 19:36:08 1997
------------------------------------------------------------
Command File:
MARC Description File: C:\Perl\lib/Marcconv/marcdesc.sgm
Input File: sgm2mrc.mrc
Output File: stdout.sgm
Character Conversion: SGML
Conversion File:
Log File: mrc2sgm.log
------------------------------------------------------------
End of input file reached.
Records processed: 8
Records converted: 8
Records skipped: 0
mrc2sgm.pl ended at Thu Nov 13 19:36:12 1997
Example 2
In this example, the command file is "sgm2mrc.cmd", character
conversion was used and the conversion file is listed, there was
one error, and 41 records were processed.
c:\LOCAL\BIN/sgm2mrc.pl started at Fri Nov 14 17:03:12 1997
------------------------------------------------------------
Command File: sgm2mrc.cmd
MARC Description File: C:\Perl\lib/Marcconv/marcdesc.sgm
Input File: marc_hd.sgm
Output File: sgml_hd.mrc
Character Conversion: Character conversion
Conversion File: C:\Perl\lib/Marcconv/charconv.sgm
Log File: sgml_hd_mrc.log
------------------------------------------------------------
Record 5 has end-tag, "mrcb841-", that does not match variable data
element format; record not converted.
End of input file reached.
Records processed: 41
Records converted: 40
Records skipped: 1
c:\LOCAL\BIN/sgm2mrc.pl ended at Fri Nov 14 17:03:40 1997
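Because the summary lines have a fixed form, a batch job can read them back to decide whether any records need attention. The following Python sketch is a hypothetical helper that relies only on the three "Records ...:" lines shown in the examples above.

```python
import re

# Matches the three summary lines that close every log file.
COUNT_LINE = re.compile(r"^Records (processed|converted|skipped):\s*(\d+)$")


def read_counts(log_text: str) -> dict:
    """Return the record counts found in a conversion log as a dictionary."""
    counts = {}
    for line in log_text.splitlines():
        match = COUNT_LINE.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts
```

For the log in Example 2 the function returns 41 processed, 40 converted, and 1 skipped; a non-zero "skipped" count signals that the error messages in the log should be reviewed.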
7 Command File
Command files have a very simple format. Lines that begin with
the number sign ("#") are treated as comments and are ignored, and
all other lines are treated as one or more command line parameters.
The parameters in the command file, if used, are read by the
program before the parameters on the command line, so the command
line parameters can override those specified in a command file.
Example 1
In the following example, two command line options are shown:
the program's output will be written to "mrc2sgm.sgm" instead of
"stdout.sgm", and the program will use "marcdesc.sgm" as its MARC
Description File instead of the default.
-o mrc2sgm.sgm -marcdesc marcdesc.sgm
Example 2
In the following example the command file specifies the same
two command line options, but it includes a comment, and each
command is on a separate line. The effect is exactly the same
since line breaks and white space are not important in command
files (as long as there is at least one space or line break between
successive words).
# Command files are easy!!
-o mrc2sgm.sgm
-marcdesc marcdesc.sgm
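The command file rules can be summarized in a few lines of code. The Python sketch below is an illustration of the format as described in this chapter, not the programs' own parser. The function names are hypothetical, and the rule that the last occurrence of an option wins is an assumption based on the statement that command line parameters are read after those in the command file.

```python
def read_command_file(text: str) -> list:
    """Return the parameter words in a command file, skipping comment lines."""
    words = []
    for line in text.splitlines():
        if line.startswith("#"):   # comment line, ignored
            continue
        words.extend(line.split())  # spaces and line breaks are equivalent
    return words


def effective_args(command_file_text: str, command_line_args: list) -> list:
    """Command file parameters first, then those typed on the command line."""
    return read_command_file(command_file_text) + list(command_line_args)


def last_value(args: list, option: str):
    """Value following the last occurrence of option, or None if absent."""
    value = None
    for index, word in enumerate(args[:-1]):
        if word == option:
            value = args[index + 1]
    return value
```

Applied to the two examples above, read_command_file returns the same four words, which is why the two command files have exactly the same effect.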
8 Limitations
8.1 Input Validation
The mrc2sgm.pl program does not validate the content of input
MARC records. If the input MARC records do not contain valid MARC
data elements, unexpected results may occur.
Likewise, the sgm2mrc.pl program does not validate the content
of input SGML MARC records. Although sgm2mrc.pl does use the
NSGMLS parser, it uses the parser's non-validating mode that
recognizes elements' start tags and end tags only. If the input is
not proper SGML, unexpected results may occur.
8.2 Character Issues
The character-to-entity and entity-to-character conversions
handle specific character values and SGML entities. The
character values correspond to characters in the USMARC Extended
Latin character set (ANSI/NISO Z39.47). If alternative character
sets, such as the USMARC Arabic or Cyrillic sets, are used,
unexpected results may occur.
8.3 SGML
The output produced by mrc2sgm.pl is unvalidated SGML. The
tags that are output are determined by information in the MARC
Description File and in incoming MARC records themselves, not by a
DTD. The SGML output must be parsed without error against a DTD
before it can be considered type-valid SGML.
Appendix A: Program Commands
MARC-to-SGML Conversion
The command mrc2sgm.pl is used to invoke the MARC-to-SGML
conversion utility. The command can be followed by a variety of
parameter settings, each of which follows the command, separated by
a space, as follows.
mrc2sgm.pl [-command file] [-sgmlconv | -registerconv |
-charconv | -userconv file] [-log file] [-o file]
[-marcdesc file] [-help] input-file
Remember, the variable parts of these parameter settings (such as
"file") are determined by the user. Not all parameter settings must
be used. When a parameter setting is not provided by the user,
system defaults are applied by the conversion utility. Descriptions
of each parameter setting are as follows:
- -command file
- Read program command options from "file".
- -sgmlconv
- Perform minimal, "SGML sanity" character conversion using the
built-in conversion table. This is the default character
conversion.
- -registerconv
- Supports conversion between upper-register characters and
lower-register characters using the built-in conversion table. The
minimal conversion will also be performed.
- -charconv
- Convert characters to entities using the built-in conversion
table. The minimal SGML conversion will also be performed.
- -userconv file
- Perform character conversion using the user-supplied conversion
specification in "file". An error will be signaled if "file" is not
specified, if "file" cannot be opened, or if "file" is not a file
of the correct format. The minimal SGML conversion will also be
performed.
- -log file
- Write the output log to "file". If this option is not specified,
the log will be written to "mrc2sgm.log" in the current directory.
- -o file
- Write the unvalidated SGML output to "file" instead of to the
default file "stdout.sgm".
- -marcdesc file
- Read the MARC Description File named "file" instead of the
default MARC Description File that the program automatically reads
on initialization.
- -help
- Print help information, then quit.
- input-file
- The name of the input MARC record file.
SGML-to-MARC Conversion
The command sgm2mrc.pl is used to invoke the SGML-to-MARC
conversion utility. The command can be followed by a variety of
parameter settings, each of which follows the command, separated by
a space, as follows.
sgm2mrc.pl [-command file] [-sgmlconv | -registerconv |
-charconv | -userconv file] [-log file] [-o file]
[-marcdesc file] [-help] input-file
Remember, the variable parts of these parameter settings (such as
"file") are determined by the user. Not all parameter settings must
be used. When a parameter setting is not provided by the user,
system defaults are applied by the conversion utility. Descriptions
of each parameter setting are as follows:
- -command file
- Read program command options from "file".
- -sgmlconv
- Perform minimal, "SGML sanity" character conversion using the
built-in conversion table. This is the default character
conversion.
- -registerconv
- Supports conversion of upper-register characters and
lower-register characters using the built-in conversion table. The
minimal conversion will also be performed.
- -charconv
- Convert entities to characters using the built-in conversion
table. The minimal SGML conversion will also be performed.
- -userconv file
- Perform character conversion using the user-supplied conversion
specification in "file". An error will be signaled if "file" is not
specified, if "file" cannot be opened, or if "file" is not a file
of the correct format. The minimal SGML conversion will also be
performed.
- -log file
- Write the output log to "file". If this option is not specified,
the log will be written to "sgm2mrc.log" in the current directory.
- -o file
- Write the MARC output to "file" instead of to the default file
"stdout.mrc".
- -marcdesc file
- Read the MARC Description File named "file" instead of the
default MARC Description File that the program automatically reads
on initialization.
- -help
- Print help information, then quit.
- input-file
- The name of the input MARC SGML record file.
Appendix B: Error Messages
MARC-to-SGML Conversion
The MARC-to-SGML conversion utility, encoded in the "mrc2sgm.pl"
PERL script, generates a variety of useful messages when error
conditions are encountered during execution of the program. The
possible error messages are listed below in alphabetical order.
Could not locate required file "File"
One of the data files required for correct operation could not
be found. Check to make sure that the program was installed
correctly and that the Marcconv directory was installed in PERL's
lib directory.
End of input file reached
The program has finished processing the input file. While
errors may have occurred that caused records to be skipped, the
entire input file has been read by the program.
Extraneous characters at end of file after RecordCount records
The input file did not end with an end-of-record character.
Invalid Leader cp 06-07 in record RecordCount; record not converted.
The data in Leader character positions 06 and 07 did not match
any of the allowed values declared in the MARC Description File.
mrc2sgm.pl ended at "Time"
Processing ended at the indicated time.
mrc2sgm.pl started at "Time"
Processing began at the indicated time.
Record RecordCount directory length is "Length", which is not
divisible by 12. Skipping record.
The Directory should be a sequence of 12-character entries
ending with an end-of-field character. If the Directory length,
excluding the end-of-field character, is not a multiple of 12,
there is an error in the Leader and/or the Directory, and the
record is skipped.
Record RecordCount does not end with EOF, EOR sequence. Skipping
record.
The last two characters of every MARC record should be an
end-of-field character (hex '1E'), which terminates the last field,
and an end-of-record character (hex '1D'), which terminates the
record itself. When one or both of these is not present, the
record is skipped.
Records converted: ConvertCount
The program successfully converted the number of records
indicated.
Records processed: RecordCount
The program processed the number of records indicated, which
may be greater than the number that were successfully converted.
Records skipped: SkipCount
The indicated number of records were skipped because of errors
in the records.
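The Directory check behind the "not divisible by 12" message can be illustrated as follows. This Python sketch is not taken from mrc2sgm.pl and the function name is hypothetical. It uses the ISO 2709 entry layout: a 3-character tag, a 4-digit field length, and a 5-digit starting position, following a 24-character Leader.

```python
FIELD_TERMINATOR = 0x1E  # end-of-field character, as a byte value
LEADER_LENGTH = 24


def directory_entries(record: bytes) -> list:
    """Return (tag, field_length, start) tuples read from the Directory."""
    # The Directory runs from the end of the Leader to the first
    # end-of-field character.
    end = record.index(FIELD_TERMINATOR, LEADER_LENGTH)
    directory = record[LEADER_LENGTH:end]
    if len(directory) % 12 != 0:
        raise ValueError(
            f"directory length is {len(directory)}, which is not divisible by 12")
    entries = []
    for offset in range(0, len(directory), 12):
        entry = directory[offset:offset + 12].decode("ascii")
        entries.append((entry[0:3], int(entry[3:7]), int(entry[7:12])))
    return entries
```

A record with a damaged Directory raises an error here, which corresponds to the case in which the conversion program skips the record.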
SGML-to-MARC Conversion
The SGML-to-MARC conversion utility, encoded in the "sgm2mrc.pl"
PERL script, generates a variety of useful messages when error
conditions are encountered during execution of the program. The
possible error messages are listed below in alphabetical order.
Could not locate required file "File"
One of the data files required for correct operation of the
PERL script could not be found. Check to make sure that the
program was installed correctly and that the MARCCONV directory was
installed in PERL's LIB directory.
End of input file reached
The program has finished processing the input file. While
errors may have occurred that caused records to be skipped, the
entire input file has been read by the program.
sgm2mrc.pl ended at "Time"
Processing ended at the indicated time.
sgm2mrc.pl started at "Time"
Processing began at the indicated time.
Record RecordCount has end-tag, "Name", that does not match
positionally-defined element format; record not converted.
The indicated end-tag does not match the
</[dtd_type][marc_tag]-[subtype]> format. An example of a valid
end-tag is "</mrca008-ci>".
Record RecordCount has end-tag, "Name", that does not match record
without indicators or subfields element format; record not
converted.
The indicated end-tag does not match the </[dtd_type][marc_tag]>
format. An example of a valid end-tag is "</mrca005>".
Record RecordCount has end-tag, "Name", that does not match
variable data element format; record not converted.
The indicated end-tag does not match the </[dtd_type][marc_tag]>
format. An example of a valid end-tag is "</mrca100>".
Record RecordCount has invalid field cp 00, "Value", for
positionally-defined field "Field"; record not converted.
The indicated value does not match any of the allowed values
specified in the MARC Description File.
Record RecordCount has invalid Leader cp 06-07, "Value", for
positionally-defined field "Field"; record not converted.
The indicated value does not match any of the allowed values
specified in the MARC Description File.
Record RecordCount has tag, "Name", that does not match
positionally-defined element format; record not converted.
The indicated start-tag does not match the
<[dtd_type][marc_tag]-[subtype]> format. An example of a valid
start-tag is "<mrca008-ci>".
Record RecordCount has unknown group tag, "Name"; record not
converted.
The tag found where a group tag was expected was not a valid
group tag for the present document type as defined in the MARC
Description File.
Record RecordCount has unknown Leader cluster element "Name";
record not converted.
The tag found where a Leader cluster element was expected did
not match the <[dtd_type]ldr-[record_type]-[cp]> format. Examples
with valid formats are "<mrcbldr-bd-06>" and "<mrcaldr-cd-16-19>".
Record RecordCount has unknown Leader element "Name"; record not
converted.
The tag found where the Leader element was expected did not
match the <[dtd_type]ldr> format. An example of a valid Leader
start-tag is "<mrcaldr>".
Record RecordCount has unknown top-level tag "Name"; record
not converted
The tag found where the top-level tag for the SGML for a MARC
record was expected did not match any of the document types defined
in the MARC Description File.
Record RecordCount too long; record not converted.
The MARC record constructed from the SGML exceeded the 99,999
character limit for MARC records.
Records converted: ConvertCount
The program successfully converted the number of records
indicated.
Records processed: RecordCount
The program processed the number of records indicated, which
may be greater than the number that were successfully converted.
Records skipped: SkipCount
The indicated number of records were skipped because of errors
in the records.
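The tag formats named in these messages can be expressed as patterns. The following Python sketch is an illustration only, built from the example tags given above. It assumes that the DTD prefix is "mrca" or "mrcb", that a MARC tag is three characters, and that subtypes and record types are lowercase letters; the actual rules are determined by the MARC Description File.

```python
import re

# Assumption: only the two prefixes that appear in the examples above.
DTD = r"mrc[ab]"

# Order matters: the more specific Leader patterns are tried first.
PATTERNS = [
    ("Leader cluster", re.compile(DTD + r"ldr-[a-z]+-\d\d(?:-\d\d)?")),
    ("Leader", re.compile(DTD + r"ldr")),
    ("positionally-defined", re.compile(DTD + r"[0-9a-z]{3}-[a-z]+")),
    ("field", re.compile(DTD + r"[0-9a-z]{3}")),
]


def classify(name: str):
    """Return the kind of element a tag name denotes, or None if no format fits."""
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(name):
            return kind
    return None
```

Under these assumptions classify("mrca008-ci") returns "positionally-defined", while the name "mrcb841-" from the log file example in Chapter 6 matches no format and returns None.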
Appendix C: Installation
System Requirements
The conversion software requires a PERL script interpreter,
version 5.003 or later and the NSGMLS parser, version 1.2. PERL
interpreters are available free of charge from
http://www.perl.com. Likewise, versions of NSGMLS for a
variety of operating systems are available from
http://www.jclark.com/sp.
NOTE: Both PERL and NSGMLS need to be listed in the "PATH"
statement of your system's configuration ("config.sys") file.
Before You Start
Before you start to install any of the utilities and support
files, you will need to know three things about the hardware and
software on your computer:
- Location of the PERL executable file, which should be "perl.exe"
on DOS and Windows systems and "perl" on Unix systems;
- Location of PERL's "LIB" directory;
- Directory in which you want the mrc2sgm.pl and sgm2mrc.pl files
installed. This should be a directory on your path.
Contents of Distribution
- manifest
- File listing used by install.me
- readme.1st
- Installation instructions
- bin
- Directory containing the program files
- lib
- Directory containing PERL libraries and
modules
Installation
First you must have the appropriate MARC SGML conversion file
(a compressed file for the operating system you want to use can be
downloaded from the Library of Congress' MARCDTD server). PKUNZIP
(a decompression utility available from PKWARE), a PERL
interpreter, and a copy of NSGMLS that will run on your operating
system are also needed. These four should be either in the same
directory, or if in separate directories, be sure they are included
in your operating system's configuration file as part of the "PATH"
statement so that the operating system will know where these
programs are as you step through installation. The MARC SGML
converters, PKUNZIP, PERL, and NSGMLS are available for different
operating systems, so be sure you have obtained the files for the
operating system you intend to use.
Installation is performed from the operating system prompt. In
DOS and Windows this will be the "C:\" prompt, or the prompt for
whatever drive you chose to install your conversion utilities to.
The "pkunzip" utility will be used to decompress the MARC SGML
conversion software. (It may be easiest if you put PKUNZIP in the
same directory as the "zipped" file, or at least in the path so
that the zipped files can be found by PKUNZIP. If you are
unfamiliar with PKUNZIP, you will need to familiarize yourself with
that utility first. It's really quite easy to use!) The PERL
interpreter will be needed to run the "install.me" script among the
MARC SGML conversion utility files once they have been
uncompressed.
Once you have uncompressed the MARC SGML "zip" file, type the
following at the system prompt:
perl install.me
Sample Installation
As the installation script runs you will be prompted to type in
the location of the directories where certain files should be
written. The directory and path names are always system specific,
so you will need to know where your PERL interpreter is on your
hardware. There is normally a separate subdirectory called "perl"
which contains other subdirectories including "bin" (for binary
files and other executable programs) and "lib" (for various library
files accessed by programs as they
run). The following is a sample of the dialog with a system during
installation. You will supply some of the text below as answers to
questions appearing on the screen during installation.
C:> perl install.me
Pathname of perl executable: \perl\bin\perl.exe
Directory to install executables: ("") \local\bin
Directory to install library files: ("") \perl\lib
You have specified the following:
PERL path: \local\bin
Bin directory: \local\bin
Lib directory: \perl\lib
Is this correct? ['y'] y
(\perl\bin\perl.exe)
Installing programs to "\local\bin":
bin/mrc2sgm.pl => \local\bin\mrc2sgm.pl
bin/sgm2mrc.pl => \local\bin\sgm2mrc.pl
Installing lib files to "\perl\lib":
lib/SGMLS => \perl\lib\SGMLS
lib/SGMLS.pm => \perl\lib\SGMLS.pm
lib/sgmlspl.pl => \perl\lib\sgmlspl.pl
lib/sgmlspl.pm => \perl\lib\sgmlspl.pm
lib/Marcconv => \perl\lib\Marcconv
Notes on Installation Program
You must be in the same directory as the
install.me program. The programs require a PERL
interpreter, version 5 or later and NSGMLS version 1.2. (NOTE: The
PERL scripts used in the MARC SGML conversion utilities are
currently configured to work with NSGMLS version 1.2. Changes in
later versions of NSGMLS have been shown to cause problems which
should be fixed in later releases of the conversion utilities.)
The '#!' line in each installed program is set to point to the
PERL executable specified in the installation process. Programs
are updated to include the path location of the installed library
files. Thus, you can install the library files in any location,
and the program will still work.
Manual Installation
If for some reason you cannot or do not want to run the
install.me program to install the conversion utilities, it
is possible to install them manually, but several additional steps
are involved. To install manually you must:
- Copy library (*.pl) files in the "lib" directory to the library
location you want.
- Copy the program files to the location you want them
installed.
- After copying the program files to their proper location, you
must edit each one so that it has the complete pathname of the PERL
interpreter. Edit the "#!/..." line (the first line of each
program file) so that it has the complete pathname of your perl
interpreter. NOTE: This step is unnecessary for DOS and Windows
users.
- Add a new line right after the "#!/..." line that contains the
following: unshift(@INC, "/path/to/lib/files")
The statement "/path/to/lib/files/" is the path to where you
copied the library files. NOTE: For DOS and Windows users, you
will have to use "\\" (without the quotes) as the directory
separator if using double quotes to delimit the path. For example:
unshift(@INC, "C:\\path\\to\\lib\\files")
- If the location to which you copied the library files is
already part of PERL's standard library search path, you do not
need to add the "unshift(...)" statement just described.
Library of Congress Help Desk
(02-19-1999/rkb)