Federal cost-cutting, a presidential mandate to pursue strategic planning, and the challenge of adapting to technology-driven changes are all guiding federal libraries towards collaborative projects.
The Consortium of Navy Libraries (CNL) is an example of successful cooperation between different agency units. The group was formed in 1997 to promote Navy library programs and services, advise the Librarian of the Navy on matters affecting library services, provide customer education, share technical support and resources, and explore possibilities for leveraged buying and distributed costs.
FLICC's Consortial Purchasing Task Group is another example of library collaboration; the group is at work on a pilot project to test interagency purchasing of electronic journals through FEDLINK. The CNL is ahead of the curve; the group is currently working to take advantage of FEDLINK procurement mechanisms by aggregating its members' Dialog accounts into one interagency agreement to qualify for group rates.
All Navy and Marine Corps libraries and information centers are eligible for membership. The CNL's stated mission is "to facilitate state-of-the-art access to library and information services to all Navy personnel in support of their missions, whether for operational readiness, research and development, situation awareness, decision making, education and training, or personal enrichment, wherever, whenever, and in an appropriate format." Members have formed working groups on procurement and licensing, strategies, providing around the clock service, and standards.
Recently, the CNL proposed to the Department of the Navy Chief Information Office that the group be tasked with participating in a new Navy initiative, the Naval Virtual Intranet (NVI), a Navy-wide intranet with features such as e-mail and security. The first pilot will be in the Virginia Tidewater region. The CNL has proposed that library content be introduced in the pilot, which is planned for later in 1998.
To further assist library managers in locating information about procurement policies and mechanisms and technology planning, the CNL has posted its charter and links to Web resources on acquisitions and information technology on its Web site (http://stinet.dtic.mil/). Work is underway to add links to collection development sites that are pertinent to CNL subject interests.
XML Release Supports Indexing and Metadata Creation
By Jessica Clark
With a February release of the eXtensible Markup Language (XML) 1.0 specification, the World Wide Web Consortium (W3C) built on its mission to offer stable standards for developing robust, interoperable, international Web sites. The new encoding scheme allows Web site developers in different industries to create customized elements to mark up content and data for use with specific applications.
The standard was created and developed by the W3C XML Working Group, which includes key industry players and experts in structured documents and electronic publishing. "XML is extensible, internationalized, robust, simple, and built for the Web," said Tim Bray, principal at Textuality and co-editor of the XML 1.0 specification. "Its arrival enables whole new classes of applications, and is a major step towards fulfilling some of the Internet's unrealized potential."
New Web standards work in tandem
Like Hypertext Markup Language (HTML), XML is an encoding scheme derived from Standard Generalized Markup Language (SGML), a widely used international text processing standard. A key characteristic of SGML is the ability to create "document type definitions" (DTDs). HTML is only one DTD; in its original form, it contained the basic tags needed to mark up standard reports and publications for transmission and display via Web browsers. XML is a simple "dialect" of SGML that allows site developers to fashion their own DTDs for use in indexing, workflow management, database processing, push technology, publishing, or financial transactions.
As mentioned in the January FEDLINK Technical News article (http://www.loc.gov/flicc/tn/98/01/tn9801.html#TECH NEWS), the W3C hopes that the HTML 4.0 standard will encourage site designers to use one set of elements to mark up the structure of a document and another to mark up its design and formatting. The W3C is now working on eXtensible Style Language (XSL), a stylesheet specification for XML similar to the Cascading Style Sheets (CSS) used to specify design for HTML documents.
W3C developers hope that the end result of these interlocking standards will be attractive sites that display on a variety of platforms and may reference or be referenced by powerful databases. XML will also enhance linking abilities; links coded in XML may lead to spans of text or to multiple reference points.
Encoding XML documents
With XML, site developers can create elements that laypeople may easily recognize. For example, a catalog record for a vacation brochure could be coded this way:
<?xml version="1.0" standalone="yes"?>
It is not hard to see how such open-ended encoding could easily be processed by database programs.
Unlike HTML, XML requires the document creator to either include a DTD or declare a document to be "standalone." XML elements must either contain a closing tag or be specified as not requiring one. XML encoding is also case-sensitive and displays any white space that the original document contains. For more information on XML encoding, see the resources listed below.
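The encoding rules described above can be tried out with a short Python sketch that uses the standard library's `xml.etree.ElementTree` parser. The record is hypothetical: the element names `brochure`, `title`, `destination`, and `price` are assumptions made for illustration, not the article's original example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog record; element names and values are illustrative only.
RECORD = """<?xml version="1.0" standalone="yes"?>
<brochure>
  <title>Island Getaways</title>
  <destination>Hawaii</destination>
  <price currency="USD">1200</price>
</brochure>"""


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(RECORD)
# Custom elements can be read back by name, much as a database program might.
fields = {child.tag: child.text for child in root}

# Element names are case-sensitive: <Title> is not closed by </title>.
mismatched_case = "<brochure><Title>Island Getaways</title></brochure>"
# Every element needs a closing tag (or the empty-element form, e.g. <note/>).
missing_close = "<brochure><title>Island Getaways</brochure>"

print(fields)
print(is_well_formed(mismatched_case), is_well_formed(missing_close))
```

The parser accepts the first record and rejects the two malformed variants, which is the kind of strictness that HTML browsers of the time did not enforce.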
Creating metadata with XML
XML is expected to become a popular scheme for encoding metadata. As Rebecca Guenther explained at the 1998 FLICC Information Technology Update (see story, page 3), Dublin Core elements will be coded in XML as well as HTML. The Resource Description Framework (RDF), another initiative currently under development by the W3C, will provide a standard architecture to support site developers in creating sets of metadata elements which best suit their purposes.
XML-compliant editors, browsers, and validation programs are currently in development. Current versions of HTML browsers cannot parse XML elements, but industry watchers expect that new releases of popular browsers will. SGML editors and browsers, however, can read XML documents.
For more information on XML and SGML, visit:
Metadata 101: Beyond Traditional Cataloging
How can the exploding universe of heterogeneous electronic resources best be cataloged? Around the world, coalitions of information professionals and software developers are working on a solution as they develop standards for "metadata."
What is metadata? It is most simply defined as "data about data": structured information that describes an electronic document, image, movie, or sound. Metadata may be used to aid identification, description, and location of electronic or traditional documents or artifacts.
FLICC's 1998 Information Technology Update brought the creators and users of different metadata schemes together on January 28 to brief federal librarians on current efforts. Metadata 101: Beyond Traditional Cataloging was co-sponsored by FLICC and CENDI, an interagency working group composed of senior scientific and technical information managers from major programs in the Department of Commerce, Department of Education, Department of Energy, National Aeronautics and Space Administration, National Library of Medicine, Department of Defense, and Department of the Interior.
An introduction to metadata
In the past, different professionals have dealt with different types of metadata: librarians with card catalogs, museum professionals with collections records, geographers with map legends, physicians with patient charts. Now, information professionals, computer programmers, librarians, and content specialists are working on a common set of standards which will allow both automated search engines and end users to locate and use different types of online materials.
Understanding the Dublin Core
Rebecca Guenther, Senior MARC Standards Specialist, Library of Congress, described the process of developing the Dublin Core (DC), a core element set for metadata used in the discovery and identification of digital resources. The 15 standard elements are designed to be included either within the document which contains the resource, or in an associated file on the same server. Current DC elements include: TITLE, AUTHOR or CREATOR, SUBJECT and KEYWORDS, DESCRIPTION, PUBLISHER, OTHER CONTRIBUTOR, DATE, RESOURCE TYPE, FORMAT, RESOURCE IDENTIFIER, SOURCE, LANGUAGE, RELATION, COVERAGE, and RIGHTS MANAGEMENT.
Guenther explained the terms used to describe metadata. Semantics refers to the meaning of a metadata element, for example, "author." Content refers to the specific value of the element, for example, "Mark Twain." Syntax refers to the way that metadata is coded; coding languages may include Hypertext Markup Language (HTML), Standard Generalized Markup Language (SGML), eXtensible Markup Language (XML), or ISO 2709/Z39.2. Format refers to the layout of metadata. Popular metadata record formats include USMARC, Text Encoding Initiative (TEI), IAFA/Whois++Templates, and DC.
Using Dublin Core in HTML documents
Dublin Core is a flexible standard which focuses more on semantics than on syntax or format. A convention was established for encoding DC metadata in HTML, and conventions for other encoding syntaxes such as XML are under development. DC Elements are incorporated in HTML documents with the <META> tag. A sample set of metadata tags for the HTML version of this story might look like this:
<META NAME= "DC.Date" CONTENT= "February, 1998">
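As a rough illustration of the HTML convention, the Python sketch below builds one `<META>` tag per Dublin Core element. The `DC.Title` and `DC.Publisher` names are assumed by analogy with the `DC.Date` sample above, and the helper name `dc_meta_tags` is hypothetical.

```python
from html import escape


def dc_meta_tags(record):
    """Build one <META> tag per Dublin Core element in the record."""
    lines = []
    for element, value in record.items():
        # Escape quotes and angle brackets so values cannot break the tag.
        name = escape(f"DC.{element}", quote=True)
        content = escape(value, quote=True)
        lines.append(f'<META NAME="{name}" CONTENT="{content}">')
    return "\n".join(lines)


# Illustrative values; element names beyond DC.Date are assumptions.
record = {
    "Title": "Metadata 101: Beyond Traditional Cataloging",
    "Publisher": "Federal Library and Information Center Committee",
    "Date": "February, 1998",
}
print(dc_meta_tags(record))
```

The output would be placed in the `<HEAD>` of the HTML document so that indexing programs can read it without displaying it to the reader.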
Developing an international data standard
The DC element set was developed through a series of international meetings which began with the 1995 Metadata Workshop, organized by OCLC and the National Center for Supercomputing Applications (NCSA). Participants named the standard after the meeting location: Dublin, Ohio, the home of OCLC. In the first workshop, the participants hoped to build an interdisciplinary consensus about a core set of elements for describing "document-like-objects" that would facilitate resource discovery and be simple, cross-disciplinary, and international.
The Dublin meeting produced a key set of 13 elements which were repeatable, optional, and extensible. A second workshop in April of 1996, held in Warwick, England, was co-sponsored by the UK Office of Libraries and Networking (UKOLN) and OCLC. The meeting produced the "Warwick Framework," a conceptual model which posits different communities of metadata creators, who develop and maintain different sets of elements under the same set of metadata principles. A September 1996 meeting, sponsored by the Coalition for Networked Information (CNI) and OCLC, focused on developing elements for describing visual resources. Attendees decided that the DC would work for visual as well as textual resources; online discussion after this meeting resulted in the addition of two more elements.
Structuralists vs. Minimalists
A subsequent meeting in Canberra, Australia focused on increasing the international range of participants and refining element qualifiers. Two distinct camps developed: a minimalist camp, who wanted to limit the DC to 15 simple elements, and a structuralist camp, who insisted on the importance of sub-elements and qualifiers. The group decided to approve three "Canberra Qualifiers": SCHEME, which refers to the formal data content standard; LANGUAGE, which refers to the language in which the element content is expressed; and TYPE or SUB-ELEMENT, which allows creators to extend other elements.
At an October 1997 workshop in Helsinki, Finland, participants agreed to work with the World Wide Web Consortium (W3C) to align DC elements with a formal data model called the "Resource Description Framework" (RDF). RDF is an infrastructure designed to support metadata across many Web-based activities, and is the realization of the Warwick Framework. The meeting also produced the "Finnish Finish," an agreement to stick with the 15 DC elements. Working groups were established to further discuss the DATE, RELATION, and RIGHTS MANAGEMENT elements, sub-elements, the data model, the relationship between DC and Z39.50, and format and resource types.
Dublin Core pilot programs
More than 30 major DC projects are being implemented in 10 countries. The library and museum communities are now exploring the adoption of DC metadata for describing non-electronic resources so that they may be located via the Web. "There is interdisciplinary and international recognition that Dublin Core is the lingua franca for resource discovery," said Guenther. She noted that most search engines now reference <META> tags when indexing Web pages. The W3C is also modifying specifications for the Platform for Internet Content Selection (PICS), designed for filtering internet content, to support generalized metadata like the DC.
"Who creates metadata?" Guenther asked. There are several possible answers: the creator of the Web documents, professional catalogers, search engine companies, or perhaps even automated computer programs. If librarians wish to take part in the creation of metadata, they will need to educate themselves and their supervisors about relevant issues and participate in ongoing debates as standards solidify.
Interfacing with other metadata schemes
Guenther listed other often-used metadata schemes, including MARC, the Government Information Locator Service (GILS), the Visual Resources Association (VRA) core categories, Categories for the Description of Works of Art (CDWA), the Federal Geographic Data Committee Standard (FGDC), and the Encoded Archival Description (EAD). "It is necessary to create metadata crosswalks to standardize the mappings for interoperability," Guenther said. She noted that MARC works well for cataloging Internet resources, and DC records may be converted into MARC records and vice versa. Information about the DC/MARC/GILS crosswalk is available from the LC Network Development and MARC Standards Office (http://www.loc.gov/marc/dccross.html).
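The idea of a crosswalk can be sketched in a few lines of Python: a lookup table maps each element of one scheme to a field in another. The MARC tags and subfields below are illustrative assumptions for this sketch, not the published crosswalk, and the function name `dc_to_marc` is hypothetical; consult the LC document cited above for the authoritative mappings.

```python
# Illustrative mapping only: these MARC tags and subfields are assumptions
# for the sketch and should be checked against the published crosswalk.
DC_TO_MARC = {
    "Title": ("245", "a"),
    "Creator": ("720", "a"),
    "Subject": ("653", "a"),
    "Description": ("520", "a"),
    "Publisher": ("260", "b"),
    "Date": ("260", "c"),
}


def dc_to_marc(dc_record):
    """Return (fields, unmapped): MARC-style tuples and leftover DC elements."""
    fields = []
    unmapped = []
    for element, value in dc_record.items():
        if element in DC_TO_MARC:
            tag, subfield = DC_TO_MARC[element]
            fields.append((tag, subfield, value))
        else:
            # Report elements the table does not cover instead of dropping them.
            unmapped.append(element)
    return sorted(fields), unmapped


fields, unmapped = dc_to_marc(
    {"Title": "Metadata 101", "Date": "1998", "Coverage": "United States"}
)
print(fields, unmapped)
```

Returning the unmapped elements separately makes gaps in the table visible, which matters when two schemes do not line up one to one.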
Dublin Core: next steps
Guenther explained the next steps in the adoption of Dublin Core as a standard. The W3C will implement RDF and XML, while the American Library Association (ALA) will form an ALA Task Force on Metadata. Dublin Core developers will create user guides and tools for the automatic generation of metadata. To track ongoing developments, see the Dublin Core home page (http://purl.oclc.org/metadata/dublin_core/).
A different model: developing digital geospatial metadata
While the Dublin Core elements represent the most general categories of metadata that might be shared across disciplines and platforms, the Content Standard for Digital Geospatial Metadata (CSDGM) is an example of a complex, content-specific metadata scheme. Richard Pearsall, Metadata Coordinator for the Federal Geographic Data Committee (FGDC), described the development and application of CSDGM.
CSDGM was developed in 1994 as a direct result of Executive Order 12906, "Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure." Recommended by the National Performance Review, the order (known as NSDI) was issued to coordinate state, local, tribal, and federal geospatial data collection efforts. It mandated the creation of a publicly accessible clearinghouse of geospatial data for use in public and private sector applications in transportation, community development, agriculture, emergency response, environmental management, and information technology. The order charged the FGDC, a committee of federal managers from agencies collecting geospatial data, with establishing and overseeing this clearinghouse. In order for the digital clearinghouse to work, the committee had to develop standard conventions for metadata.
Pearsall explained that the FGDC consulted existing standards and standards experts while developing CSDGM, including the ALA, the American National Standards Institute, the American Society for Testing and Materials, the Federal Information Processing Standards, GILS, USMARC Standards, and professional glossaries. The CSDGM standard is designed to support common uses of metadata, such as transferring information from one medium to another, determining the fitness of a dataset, and online searching and comparison of datasets. The committee decided that the CSDGM syntax should be SGML, and that no format would be specified. Certain elements of CSDGM overlap with the DC elements, and may be read by search engines which index documents with the DC elements.
CSDGM contains more than 300 metadata elements, 50 of which are mandatory. The standard is applicable to dataset series, datasets, and individual geographic features. The FGDC tried to provide a common set of terminology, definitions, and information about values to be provided. The FGDC advises that the metadata is best created by the scientist or cartographer who collected the data, but may also be created by "metadata managers" in each agency. "Standardized data is still one of the problems we all face," said Pearsall. "Everyone wants to describe objects in their own terms."
The main sections in the standard include:
Many federal, state, and local agencies have adopted the CSDGM and are adding resources to the National Spatial Data Clearinghouse, which now contains more than 100,000 metadata records.
For general information about the standard, see the FGDC Metadata home page (http://www.fgdc.gov/metadata/metadata.html). Participating agencies have also written programs to aid in the compilation and validation of geospatial metadata; a list of these is available on the FGDC site (http://www.fgdc.gov/metadata/toollist/gmms.html). "We're looking for anyone who's interested in working with us to create content-specific applications," said Pearsall. He urged interested attendees to contact the committee at [email protected].
More to come
The second half of the Metadata 101 program featured case studies from Montana State University, the U.S. Bureau of the Census, the National Biological Information Infrastructure, the Smithsonian Institution Libraries, and the Intelink Management Office. The remarks of these speakers will be covered in the April issue of FEDLINK Technical Notes.
OCLC enhances ILL
OCLC has added two new fields to the ILL Management Statistics quote and comma delimited files: Copyright Compliance and Library Type. These additions first appeared in the February 1998 files.
The Copyright Compliance field appears in the Borrowing Data Files, with possible values of CCG, CCL, or blank. Please note that the Copyright Compliance field on the February report will contain data only if the record was added or replaced after January 22, 1998, the date that OCLC added the field to the ILL Management Statistics program.
The Library Type field appears in both the Borrowing and Lending Data Files. It displays information that the institution provided in its OCLC profile. OCLC internal programs also use an alphabetic code to distinguish the type of library. ILL Management Stats converts this code into the following named equivalents:
These two new data fields are added to the end of the current ILL Management Statistics files because OCLC did not want to interfere with current users who have created programs that track data based on column location. (For example, if a user has a program that looks for "LC Class Range" in Column M and OCLC had inserted Library Type into Column K, "LC Class Range" would have moved from Column M to Column N.)
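One way to avoid depending on column position at all is to look fields up by name. The Python sketch below assumes the file carries a header row; the sample rows, the `Request Number` column, and the library type values are hypothetical, and real files may differ.

```python
import csv
import io

# Hypothetical extract; real column names and values may differ.
SAMPLE = (
    '"Request Number","LC Class Range","Copyright Compliance","Library Type"\n'
    '"1001","QA","CCG","Federal"\n'
    '"1002","Z","","Academic"\n'
)


def read_rows(text):
    """Parse quote-and-comma-delimited text into dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_rows(SAMPLE)
# Lookup by name keeps working wherever the column sits in the file.
lc_ranges = [row["LC Class Range"] for row in rows]
# .get() tolerates older files that predate the two new fields.
compliance = [row.get("Copyright Compliance", "") for row in rows]
print(lc_ranges, compliance)
```

A program written this way would not break if fields were later inserted mid-file, although it does rely on the header names staying stable.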
ILL Document Supplier program
Effective immediately, the OCLC ILL Document Supplier Program has a new participant. The Research Investment, Inc. (OCLC symbol RSK) provides published and unpublished materials on a global basis. Materials include conference proceedings, court documents, dissertations, theses, newspaper articles, and scientific and technical reports. For more information on Research Investment, see their Name-Address Directory (NAD) record (104011); visit their home page at http://www.researchinvest.com; or contact them by phone at (216) 752-0300, by fax at (216) 752-0330, or by email at [email protected].
ILL Direct Request
In early March 1998, OCLC added more functionality to the ILL Direct Request service that allows ILL Direct Request to process more requests and address some user concerns. The newest features are discussed below.
These enhancements will be incorporated into the online documentation on the ILL Direct Request Web site (http://www.oclc.org/oclc/menu/drill.htm). Additional enhancements are scheduled for ILL Direct Request later this year. Details about these enhancements will be provided later this spring.
Fixed-Fee pricing for cataloging
FY 1998/1999 will be the second year that OCLC offers fixed-fee pricing for cataloging as an alternative to transaction-based pricing. Cataloging fixed-fee pricing is available to all qualifying general members, including tape-loading members.
FEDLINK now has quotes available for qualifying libraries. Please contact the OCLC team at FEDLINK Network Operations to request a copy of the quote and an order form. Orders are due at FEDLINK by May 29, 1998. Please call or fax requests for quotes or orders to allow adequate time for processing.
Libraries that are currently fixed-fee subscribers will not automatically renew. These libraries must also complete a new order form to continue on fixed-fee pricing for FY 1998/1999.
OCLC calculated a cataloging fixed-fee to cover the period of July 1998 through June 1999. This fixed-fee is based upon annual transaction averages for the 104 covered product codes. The list of product codes is supplied with the quote and outlined below.
The fee covers most online and off-line cataloging product codes, including credits. OCLC has also added tape loading and Z39.50 cataloging product codes this year. Software products, e.g., CatME for Windows, and WorldCat Collection Sets (formerly, MajorMicrofilm sets) are not covered.
OCLC used the 24 months from January 1996 through December 1997 to calculate annual transaction averages. It then applies current fiscal year prices (i.e., OCLC FY 1997/1998) and, if applicable, adds an increment equal to the next fiscal year's (i.e., FY 1998/1999) projected price increase. At the time of this writing, OCLC has not released its prices for the next fiscal year, but FEDLINK will have projected price information in late April and will reconfirm the preliminary fixed-fee quotes at that time.
In the second and subsequent years of participation in fixed-fee pricing for cataloging, OCLC offers the following discount: if the second or subsequent year's calculated fixed-fee price is greater than the previous year's fixed-fee price, OCLC will offer a discount equal to half the difference between the two amounts.
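The calculation described above can be sketched as a worked example in Python. The product codes, counts, and prices are invented for illustration, the projected increase is assumed to be a percentage applied to the whole fee, and the discount is read as half the difference between the two years' fees; OCLC's actual method may differ in detail.

```python
def annual_averages(counts_24_months):
    """Halve each 24-month transaction total to get a one-year average."""
    return {code: total / 2 for code, total in counts_24_months.items()}


def fixed_fee(counts_24_months, current_prices, projected_increase=0.0):
    """Price the annual averages at current rates, then add the increase."""
    averages = annual_averages(counts_24_months)
    base = sum(averages[code] * current_prices[code] for code in averages)
    # Assumption: the increment is a percentage of the base fee.
    return base * (1 + projected_increase)


def discounted_fee(new_fee, previous_fee):
    """Apply the renewal discount: half of any increase over last year's fee."""
    if new_fee <= previous_fee:
        return new_fee
    return new_fee - (new_fee - previous_fee) / 2


# Hypothetical product codes, 24-month counts, and unit prices.
counts = {"CAT1": 2400, "CAT2": 600}
prices = {"CAT1": 0.50, "CAT2": 2.00}

quote = fixed_fee(counts, prices, projected_increase=0.05)
renewal = discounted_fee(quote, previous_fee=1100.0)
print(round(quote, 2), round(renewal, 2))
```

Under these assumptions a library whose calculated fee rose from one year to the next would pay the midpoint of the two amounts.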
Cataloging fixed-fee pricing is optional; libraries may opt to remain on transaction pricing. Use the following characteristics to determine whether a library is a good candidate for cataloging fixed-fee pricing.
Cataloging Label program
OCLC has added a new document on the OCLC Web site for OCLC Cataloging Label Program users. Called "Creating Label Files in Passport for Windows," the document offers instructions for label program users who want to verify that they have the correct version of the SaveBlock macro, to display a label in OCLC Cataloging with Passport software, to save the label to the text file, and to import the file into the Label Program.
For more information on the label program, see the document (http://www.oclc.org/oclc/man/10174lbl/pp2lp.htm). It is also linked to the Label Program Home Page (http://www.purl.org/oclc/label). Just pick "Label Files from Passport".
In March, FEDLINK began a new Web-based service to help library managers monitor OCLC usage. Each month during the fiscal year, the OCLC Usage Analysis Reports will analyze usage in each of the major OCLC service categories, such as cataloging and resource sharing. By providing these reports, FEDLINK hopes to help members manage costs, explain library processes to agency managers, determine if money is being spent efficiently, and track usage over time and by library.
There are two analysis reports provided for each library: the summary provides totals of transactions and costs by category for each month in the fiscal year; and the detail displays all the billable products and transactions in each category. The two reports are linked together so users can move easily from the monthly summary to the detail report for that month.
Both reports group charges into two types:
The summary report shows the cost per service category (including charges and credits), the percentage of the total cost that each category represents, the cost per productive transaction, the overhead costs per transaction, and the ratio of searches to productive transactions.
"Productive" cataloging charges include copy cataloging charges such as FTUs (First Time Use), original inputs, record upgrades, etc. "Productive" resource sharing charges include the count of ILL requests and loans.
"Overhead" charges are all transactions and products not directly involved in the productive categories. Examples of overhead items include telecommunications charges, manuals, and other documentation. The report also calculates the search ratio, which is the number of searches divided by the number of productive transactions. The ratio allows members to track the efficiency of their search practices.
(Note regarding Reference charges: because most FirstSearch users are now purchasing searches in bulk, the charges associated with the purchase will only appear in the month they were billed. The usage (searches) will appear in the usage detail report, but will have no dollar amount.)
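The summary figures described above reduce to a few divisions, sketched here in Python. The function name, argument names, and sample numbers are hypothetical, and the sketch assumes that overhead is spread over productive transactions, which the report description does not spell out.

```python
def summarize(category_cost, overhead_cost, productive_count, search_count):
    """Compute per-transaction figures like those in the summary report."""
    if productive_count == 0:
        raise ValueError("no productive transactions to divide by")
    return {
        "cost_per_productive": category_cost / productive_count,
        # Assumption: overhead is divided by productive transactions.
        "overhead_per_transaction": overhead_cost / productive_count,
        # Searches divided by productive transactions, as defined in the text.
        "search_ratio": search_count / productive_count,
    }


# Hypothetical month: $500 in cataloging charges, $100 overhead,
# 200 productive transactions, 600 searches.
summary = summarize(500.0, 100.0, 200, 600)
print(summary)
```

A falling search ratio over several months would suggest that staff are finding the records they need with fewer searches.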
Each month's summary links to a line-item detail report, sorted into the service categories listed above. Charges are listed in descending order of cost, showing the most expensive items first. Each line item is marked with the OCLC symbol charged, which should help agencies that manage more than one symbol identify cost and usage patterns among their libraries. "Production" line items are marked with an asterisk (*), and search line items are marked with a plus (+).
The new usage analysis files are the latest part of a growing collection of data tools that FEDLINK offers to members via the ALIX-FS system on the FLICC/FEDLINK Web site. OCLC monthly service charges have been available on ALIX-FS since 1994 in comma-delimited ASCII format, designed to be imported into a spreadsheet or database program. Members can also view a balance report for all their transfer pay accounts (updated nightly), and can download invoice and delivery order data that has appeared on their monthly statements of accounts (updated weekly).
FEDLINK adds OCLC records each month to ALIX-FS once invoice records arrive from OCLC. Usually, these records are added on or after the 15th of the following month. To access these new reports, go to the FLICC Web page (http://www.loc.gov/flicc), choose the Member Financial Services button, log on to ALIX-FS, go to OCLC Account Information section, and select either the detail or summary report.
For assistance with your account or with your ALIX-FS ID and password, contact the FEDLINK Hotline at (202) 707-4900 or send email to [email protected].
FEDLINK Technical Notes is published by the Federal Library and Information Center Committee. Send suggestions of areas for FLICC attention or for inclusion in FEDLINK Technical Notes to:
Federal Library and Information Center Committee
Library of Congress, 101 Independence Avenue SE, Washington, DC 20540-4935
Executive Director: Susan Tarr
Editor-In-Chief: Robin Hatziyannis
FLICC was established in 1965 (as the Federal Library Committee) by the Library of Congress and the Bureau of the Budget for the purpose of concentrating the intellectual resources of the federal library and related information community. FLICC's goals are: to achieve better utilization of library and information center resources and facilities; to provide more effective planning, development, and operation of federal libraries and information centers; to promote an optimum exchange of experience, skill, and resources; to promote more effective service to the nation at large; and to foster relevant educational opportunities.
Library of Congress
Comments: Library of Congress Help Desk (04/01/98)