Pinyin Conversion Project

Planning Meeting

October 7, 1999, 9:00 a.m. - 4:00 p.m.
Conference Room LM 642
Madison Building, Library of Congress


Minutes

Attending:

Columbia University
Robert Wolven

Council of East Asian Libraries
Peter Zhou

Harvard University
Dale Flecker
Jeffry Horrell

Library of Congress
Philip Melzer
Beacher Wiggins

OCLC
Georgia Brown
Lynn Kellar
Glenn Patton

Research Libraries Group
Karen Smith-Yoshimura

University of California at Berkeley
Bernie Hurley
Lee Leighton

University of California at Los Angeles
Sarah Elman

University of Chicago
Frances McNamara
Judith Nadler

University of Michigan
Joan Butler
Xiaofei Chen
Julie Grygotis
Phyllis Valentine

Handouts

Contents

  1. Introductions; Purpose of the Meeting
  2. Status Reports
    1. LC Status Report
    2. OCLC Status Report
    3. RLG Status Report
  3. Definition of Day 1
  4. Prerequisites for Day 1
    1. The Marker on BIB Records
    2. Conversion of the Name Authority File
    3. Conversion of Subject Authorities
  5. Day 2, Day 3
  6. Timeline
  7. Conversion of Non-Chinese Records
  8. Cleanup
  9. Services Provided by the Utilities
  10. Standard for Accuracy of Conversion
  11. Headings for Place Names in Taiwan
  12. Communications

1. Introductions; Purpose of the Meeting

Horrell chaired the meeting and delivered opening remarks. He thanked Wiggins for organizing the meeting, and said that it was important to meet here because of the role of the Library of Congress. The participants then introduced themselves.

Horrell described the pinyin conversion as being significant for cataloging. Large libraries face a variety of issues with regard to processes that are critical to our futures. Participants were urged to be clear about desired outcomes, and to seek a collective plan. He reviewed the agenda.

Flecker said that lacking in discussion up to now has been how libraries outside of LC will deal with changes. Because the changes lend themselves to large-scale automation, he hoped there would be agreement on how to proceed. Participants should send a consistent message to OCLC and RLG, and arrive at a generally desirable approach. To do this, libraries must sacrifice small differences to reach consensus. Valentine said that one concern was the mixture of CJK and other files sharing one authority file.

Hurley asked who the participants represent; California has its own system, while others have vendors. Nadler said that since the critical mass of players is not present, do we send our recommendations to others for input? How final are our recommendations? Flecker said that the participants have no standing as a group, but need to talk anyway. Horrell said that the group should seek consensus. Zhou represents CEAL, which is comprised of members from over 100 collections. Their primary concern is having sufficient time for planning. He saw a lack of planning on the part of individual libraries up to now.


2. Status Reports

2A. LC Status Report

Melzer gave the LC status report. Wiggins has proposed that Day 1 occur on June 1, 2000, and that Day 1 would take place when converted LC records begin appearing in the RLIN database (and OCLC?). After that time, pinyin romanization would be the standard in new and changed bib and authority records. The proposal was intended to provide all parties with a target around which to focus planning and discussion. Wiggins said that he hoped to reach consensus on what Day 1 meant for LC and others, and to put conversion of NARs in sync with other plans.

The sequence of conversion tasks, and which tasks will be undertaken as cleanup, will become clearer as the capabilities of conversion programs become known and plans for conversion of NARs take shape.

LC has formed a working group to coordinate the conversion of NARs with OCLC and plan for NAR conversion activities within LC.

LC is ready to begin conversion of subject headings. Henceforth, new subject headings should use pinyin for systematically romanized Chinese words. The change will be a transition over a number of months, rather than a clean break. Converted headings will appear on weekly lists. 35 headings with potential to double-convert have been identified, and will be set aside and converted on or about Day 1. New romanization guidelines will be issued in several days, to be used to support the conversion of subject headings.

Additions and changes have been made to the draft conversion specifications, including the creation of data dictionaries to convert headings for all of the conventional place names that were recently changed to current BGN-approved form. Plans are being made to prevent the program from converting headings for place names in Taiwan (to preserve the Wade-Giles forms of heading recommended by BGN).

Flecker asked which fields in LC bibliographic records would be converted. Smith-Yoshimura said all variable fields, except those that the spec says not to convert; Melzer said that testing will determine which subfields can be converted safely.


2B. OCLC Status Report

Patton outlined OCLC's plans for products and services. Discussion at the RLG Forum at ALA identified areas for attention.

LC Chinese records were loaded into WLN because it has a linked file; those OCLC/WLN capabilities were then used to trace back from LC bib records via headings to NARs, to define the universe of headings on Chinese records (and, by extension, on non-Chinese records as well). Patton distributed statistics showing that there are 205,000 unique headings on LC bib records. The total includes all headings, Wade-Giles and other; not all would require conversion. It is encouraging that 68% of the headings are only used once. Patton saw these as constituting a large universe of NARs that could be converted by machine. Then, if there were a conversion problem, the negative effect would be limited.

OCLC will work with LC to develop specifications to convert NARs. Those specs should mirror the specs used for bib records. Special attention will be paid to conversion of x-refs. Some of the challenges include names in WG form with western forenames, and authorized 'previously used' x-refs. OCLC is now working on mapping syllables. OCLC will attempt to identify the universe of non-Chinese records with Chinese text or headings; to convert all 43 million bib records in WorldCat; and to scan the entire name authority file (names and subjects).

Brown said that OCLC wants to convert the authority file first. She wondered if RLG could make use of OCLC's authority file changes to find and change headings on non-Chinese bib records. Because it's such a large project, changes to the bib file will probably have to be accomplished in batches of 250,000; there cannot be a "hot cutover". OCLC favors the marking of fields because doing so will make a gradual conversion possible.

Kellar reported that the OCLC Office of Research is making progress in writing an algorithm to identify Wade-Giles and pinyin syllables. She saw a 4% error rate ("not convert" rate), but is working to reduce it. Programmers can identify Wade-Giles with 3 adjacent syllables, but not (yet) with two. They are encountering difficulty in identifying personal names because those names tend to be brief.
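
The adjacency idea Kellar described can be illustrated with a minimal sketch. This is not OCLC's algorithm, whose details were not given at the meeting. It assumes a hypothetical rule: text is flagged as Wade-Giles when a run of adjacent tokens are all valid Wade-Giles syllables and at least one of them cannot be pinyin. The syllable sets are tiny illustrative samples, and the names `WG_ONLY`, `SHARED` and `looks_wade_giles` are invented for this example.

```python
# Illustrative samples only; a real table would list every syllable.
# Syllables valid in Wade-Giles but not in pinyin:
WG_ONLY = {"hsi", "hsiao", "hsin", "chih", "shih", "tzu",
           "chung", "ching", "chien", "jen", "tien"}
# Syllables spelled the same way in both schemes (apostrophes aside):
SHARED = {"chang", "pan", "li", "wang", "shan", "min", "hua", "mei"}


def looks_wade_giles(text, min_run=3):
    """Return True if text contains min_run adjacent Wade-Giles-valid
    syllables, at least one of which is not a possible pinyin syllable."""
    run = []
    for token in text.lower().replace("-", " ").split():
        token = token.strip(".,;:()")
        if token in WG_ONLY or token in SHARED:
            run.append(token)
            if len(run) >= min_run and any(t in WG_ONLY for t in run):
                return True
        else:
            run = []  # an unknown token breaks the run
    return False


print(looks_wade_giles("Chung hua shih"))       # three syllables
print(looks_wade_giles("Hsi li"))               # a brief name: too short
print(looks_wade_giles("Hsi li", min_run=2))    # looser threshold
```

Under this assumed rule, a two-syllable personal name falls below the default threshold, which is consistent with the difficulty reported for brief names; lowering `min_run` catches such names at the cost of more false positives on text built from shared syllables.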

Flecker asked why OCLC was only working with LC records; Kellar responded that those were the records that have been loaded into WLN. OCLC is working on development with LC -- algorithms, methods, and processes. Kellar saw discussion with LC coming down to where to draw the line on the algorithm between machine and manual conversion. Brown speculated that conversion of NARs would begin in June 2000. The authority file could then be used to find headings on bib records, but that process would require many months. The order of conversion is open to discussion: it could start anywhere, for example, beginning with the most recent records and working backwards, if that were found to be desirable to OCLC users.


2C. RLG Status Report

Smith-Yoshimura reported that RLG programmers had finished their internal design and first preliminary test; the results so far are uncovering gaps and ambiguities in the spec. The draft specs will require a number of adjustments. An environment has been created for LC test records, and a home for records after they have been converted. This makes it convenient for RLG to compare old and new records. Results of testing will be shared with OCLC and LC. RLG and OCLC have agreed to try to reach the same test results, using the same test records.

RLG won't know how long conversion will take until the first production runs. Smith-Yoshimura anticipated that in early 2000 RLG will share the spec with other libraries, and will use its conversion program to test some of their records. RLG will convert LC's Chinese records first, which represent about 20% of the total number of Chinese-language titles in the RLG Union Catalog. RLG expects LC's Chinese records could all be converted by the June 1, 2000 "Day 1" date LC proposed.

Smith-Yoshimura thought that libraries would receive snapshots of their records as soon as RLG had converted their records. RLG now believes that it would be best to convert the largest collections before the others, so that users will see an early and increasing wave of converted records appearing in the RLG Union Catalog. A period of mixed files is seen as being inevitable, with pinyin records gradually replacing most Wade-Giles records. Some non-standard records may be left as they are because they are too non-standard to convert.

Smith-Yoshimura saw libraries having different Day 1s, which RLG could support as long as a library commits to the same "Day 1" for all *its* records. Kellar hoped that, on Day 1, the name authority file will have been completely converted.

[Break]


3. Definition of Day 1

Horrell asked the group to define Day 1. Wiggins said that after Day 1 LC will use pinyin for all romanization of Chinese, including acquisitions. Smith-Yoshimura said that RLG will convert LC records before Day 1. But wouldn't there be a time disconnect if OCLC can only begin converting NARs in June? Kellar said yes, OCLC would accept LC converted bibs, then change the rest of the file. Nadler said that converting NARs before bibs was a vital decision, more important than others. Wiggins said that the June 1 date was not immutable; if it seems not to be workable, LC will consider changing it, and will then work with OCLC and RLG to set a new Day 1. He stressed that conversion had been postponed for too long, and must occur in 2000.

Wolven saw a long transition of mixed files. Libraries should be free to choose to use Wade-Giles or pinyin records for copy cataloging. If RLG has all LC records converted on Day 1, then copy cataloging staff could be told to use pinyin. Smith-Yoshimura urged participants not to update their records from Wade-Giles to pinyin after Day 1, but wait until they are converted by computer programs, in order to avoid creating gap records.

What milestones will occur after Day 1? Participants from large collections felt that libraries must continue to deal with copy and original cataloging differently. Hurley asked if an institution's Day 1 came late, could it perform Wade-Giles copy cataloging up until then? There was consensus that Day 1 meant using pinyin for original cataloging or creating a "new" record without source copy (including acquisitions or in-process records). When a library is ready to have its files converted and be phased in, then it would cease copy cataloging in Wade-Giles. Each library would need to determine a date when Wade-Giles cataloging stops. Smith-Yoshimura said that it would take RLG approximately 2 weeks to scan a library's files; therefore, the end of Wade-Giles copy cataloging could be attuned to that timeframe. Brown agreed that Day 1 should apply to original cataloging; then if a copy cataloger found Wade-Giles in a record, Wade-Giles could be used, until that library's file changes. Brown said that if a Wade-Giles record came to OCLC after Day 1, OCLC would have to identify and convert it.

Hurley thought that Day 2 would occur when a library's records are converted; that date would vary from one institution to another. Flecker explained that libraries face a trade-off: not wanting to change to pinyin until most of the copy cataloging they encounter is converted, and having split files and mixed romanization practice in their catalogs for a long time. Kellar said that OCLC hopes to use converted authorities to lead to access points on bib records, and then convert the access points. Brown felt that, if there were a marker on NARs, then OCLC could distribute pinyin-converted NARs separately; this step would help outside libraries convert their files. Flecker and Hurley agreed that there seemed to be no way of avoiding split files.

Kellar said that she now heard:

Hurley suggested that it would be a small commitment for libraries to also perform copy cataloging in pinyin on Day 1. Leighton thought there could be one Day 1 for LC, and then many others. Flecker sought agreement on one Day 1 for everyone. Zhou thought that for clarity, the timeline should have one Day 1, but not necessarily the same Day 2 or Day 3 for different libraries. After more discussion, Flecker summed up by saying that, after Day 1, original cataloging would be done in pinyin, and records using pinyin would contain a marker.


4. Prerequisites for Day 1

(if Day 1 means that all new/original records are created in pinyin)

4A. The Marker on BIB Records

Brown said that if original cataloging is done in pinyin before conversion, then those bib records must include a marker. Kellar was concerned about records marked Chinese that did not include any Wade-Giles syllables. Wolven proposed that, after Day 1, either original cataloging gets a marker, or the machine supplies it to a record to show that it has been processed.

LC had proposed adding a marker in the 986 field, which is a local field. RLG and OCLC put forward a counter-proposal that fields with unconverted text also be marked at the field level with a marker in a non-alphanumeric subfield. Flecker reported that the LC and joint OCLC/RLG proposals for a marker on bib records had generated much discussion at Harvard. His colleagues disliked a non-alphabetic subfield marker, and many vendor systems could not handle them. There seemed to be agreement that the non-alphabetic subfield would pose problems. Wiggins said that CDS saw problems in redistributing a non-alphanumeric marker; LC would have to evaluate the impact of that option, and may not approve it. Smith-Yoshimura suggested that CDS not distribute the subfields with the problem, but strip the field out and then distribute the records. Wiggins said that the OCLC/RLG proposal for a marker in a local field such as the 986 would not pose a problem for LC cataloging staff.

Brown wondered how the utilities and LC could work with vendors to arrive at a solution. McNamara suggested putting out test files and sending them to vendors. Brown said that OCLC would produce proposals in paper form, and then get feedback from members who work with various local systems.

Hurley asked why a field level marker was necessary if there was a marker in the 986 field. Patton said that LC proposed embedding all the information in a 986. OCLC and RLG proposed a field level marker because if there is partial conversion, the indication in the 986 seemed vague. OCLC and RLG saw a problem, a potential difference in maintaining consistency in records, and in how information would be identified in the field. There is agreement that the marker should be as straightforward as possible. He said that because many local systems rearrange the order of fields, the 986 might become ambiguous. When he heard comments that this problem is not immediately evident, he understood that from the perspective of East Asian catalogers, but OCLC was also concerned with other users of these records. Smith-Yoshimura felt that if a trained cataloger were to look at a record, they would be able to distinguish Wade-Giles and pinyin data right away.

Melzer said that LC put forward a less complex proposal because the cleanup process at LC would probably consist of just one step. Hurley favored a simple marker in the 986 field that indicated "converted completely" or "not converted completely". Nadler wanted something more substantive because some libraries might not get to the cleanup stage for some time. Hurley asked how much information would be necessary for cleanup. Flecker suggested that more input would be needed before a decision could be made. OCLC, RLG and LC will write up a description of the problem and suggest options, and disseminate them for comment. They will also ask their vendors for input. This will be vetted at ALA Midwinter.
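
The difference between the two marker proposals can be pictured with a small sketch. It is an illustration only: the classes `MarcField` and `BibRecord`, the `converted` flag, and both helper functions are hypothetical, and no actual 986 content or subfield code is implied. The record-level summary borrows Hurley's two wordings, while the field-level flag stands in for the kind of per-field marking OCLC and RLG proposed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MarcField:
    tag: str
    text: str
    converted: bool = False  # hypothetical field-level marker


@dataclass
class BibRecord:
    fields: List[MarcField] = field(default_factory=list)


def record_level_marker(record):
    """One summary value per record, as a single local field might carry."""
    if all(f.converted for f in record.fields):
        return "converted completely"
    return "not converted completely"


def fields_needing_cleanup(record):
    """Field-level flags say exactly where manual cleanup is still needed."""
    return [f.tag for f in record.fields if not f.converted]


record = BibRecord([
    MarcField("245", "Hong lou meng", converted=True),
    MarcField("260", "Pei-ching", converted=False),
])
print(record_level_marker(record))     # summary only
print(fields_needing_cleanup(record))  # which fields remain
```

In this sketch the record-level value reports only that something remains unconverted, whereas the per-field flag travels with each field and so gives the same answer even if a local system reorders the fields.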

[Lunch]


4B. Conversion of the Name Authority File

Horrell asked if it would be useful for LC to have an updated name authority file before Day 1. Wiggins said yes, while it is not a prerequisite for conversion, it would be useful in a world of shared authorities. Nadler felt that it was necessary to convert NARs before bibs, and mark them.

Melzer said that if conversion of the name authority file is done in phases, then there could be some overlap with conversion of bib records. Smith-Yoshimura said that scenario would raise a concern about double conversion of certain syllables. Melzer said that LC is keeping that in mind. He shared a handout that explained how he believes that double-conversion could happen. He suggested that the problem could be addressed by attention to the sequence in which authorities are converted. Melzer said that the potential for double conversion among the conventional place names was limited to two headings. 35 potential double converts among subject headings had been identified and will be converted on or about Day 1 to minimize the chance for double conversion. He felt that it should be possible to identify potential double-converts among name headings, put them aside, and convert them on or about Day 1. Wiggins said that the chances would be further minimized if the name authority file can be converted before Day 1.
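
Melzer's handout is not reproduced in these minutes, so the following is only one way to picture the problem, under stated assumptions. Some Wade-Giles syllables convert to a pinyin spelling that is itself a valid Wade-Giles syllable, so running a conversion twice over the same data gives a wrong result. The table `WG_TO_PINYIN` is a small illustrative sample (the correspondences shown are standard), and the function names are invented for this sketch.

```python
# Small illustrative sample of Wade-Giles -> pinyin correspondences.
WG_TO_PINYIN = {
    "ch'ang": "chang",
    "chang": "zhang",
    "t'ang": "tang",
    "tang": "dang",
    "hsi": "xi",
    "chung": "zhong",
}


def convert(syllable):
    """Convert one syllable; anything not in the table passes through."""
    return WG_TO_PINYIN.get(syllable, syllable)


def potential_double_converts(table):
    """Syllables whose converted form would be converted again
    if the program ran a second time over the same data."""
    return sorted(wg for wg, py in table.items() if py in table and py != wg)


print(convert("ch'ang"))            # correct single conversion
print(convert(convert("ch'ang")))   # a second pass changes it again
print(potential_double_converts(WG_TO_PINYIN))
```

Items found this way could be set aside and handled in a single controlled step, which parallels the approach described for the 35 subject headings; a marker showing that a record has already been processed addresses the same risk from the other direction.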

Nadler asked what the implications would be if the name authority file were not converted before Day 1. Hurley thought there would be more work for the original cataloger, and Kellar predicted local system problems with authority control; Hurley thought that California's Day 1 could occur later than LC's. Nadler asked whether OCLC could convert NARs before Day 1. Wolven said that a gap of 30 days would have a much smaller impact on libraries than a gap of 6 months. Hurley then suggested that Day 1 be moved back to October 1, to give OCLC more time to convert NARs. Flecker stated the sense of non-LC attendees that at least most NARs should be converted before Day 1. There seemed to be consensus on October 1 as Day 1. Wiggins said he did not want to see continual slippage, since this project has been put off so frequently in the past. He heard the recommendation of the group, and would take it under advisement. He hoped that the trade-off for the later date would be agreement on having a converted authority file before Day 1. Brown said the conference callers (LC, OCLC and RLG) would discuss dates and work on a proposal regarding conversion of NARs.

Melzer announced that the LC NAR Conversion Group had been looking into how a fixed field might be used as a marker to identify converted NARs. Brown and others voiced strong support for such a marker. Wolven wondered whether the same marker would be used on NARs for machine and human conversions. Wiggins said LC would work with OCLC and RLG to reach a decision on a marker for name authorities.

Nadler said that new NARs should be established in pinyin after Day 1. If Wade-Giles NARs are encountered, should they be converted before they are used? Smith-Yoshimura said that yes, after Day 1 a cataloger should convert it, and add a marker at the same time. Should NACO libraries help with the conversion process? Elman suggested non-NACO libraries report to LC so that LC can convert Wade-Giles headings they encounter and use after Day 1. Hurley hoped that a computer program could do that. Smith-Yoshimura was concerned that NARs represented by more than one hit on bib records would be converted before Day 1.

Smith-Yoshimura said that RLG would produce a list of headings on Chinese records, in Wade-Giles and pinyin forms, which could be used to 1) verify that NARs have been changed and distributed, 2) match converted forms on NARs, and 3) identify headings without NARs. Wolven wondered whether it would be desirable for a machine to generate NARs from no-hitters. Melzer said the list would be used by LC for manual cleanup and review. Should an x-ref be created from Wade-Giles form on new pinyin NARs? Melzer saw no utility in doing so, since he assumed that after conversion the library community would move away from Wade-Giles quickly. Elman and Smith-Yoshimura favored providing a Wade-Giles x-ref for access, while Hurley thought such a reference would be useful for linked authority files.
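
The three uses of the RLG heading list can be sketched as a simple comparison against the authority file. This is a hypothetical illustration, not RLG's program: the function name, the shape of the list (pairs of Wade-Giles and pinyin forms) and the sample headings are assumptions, and real headings would carry dates and other qualifiers.

```python
def check_headings(heading_pairs, nar_headings):
    """Sort bib-record headings into three groups.

    heading_pairs: (wade_giles_form, pinyin_form) tuples from bib records
    nar_headings:  set of headings currently in the authority file
    """
    converted, unconverted, no_nar = [], [], []
    for wade_giles, pinyin in heading_pairs:
        if pinyin in nar_headings:
            converted.append(pinyin)        # NAR changed and distributed
        elif wade_giles in nar_headings:
            unconverted.append(wade_giles)  # NAR exists, still Wade-Giles
        else:
            no_nar.append(wade_giles)       # "no-hitter": no NAR at all
    return converted, unconverted, no_nar


pairs = [
    ("Li, Ch'ing-chao", "Li, Qingzhao"),
    ("Wang, Hsi-chih", "Wang, Xizhi"),
    ("Tu, Fu", "Du, Fu"),
]
authority_file = {"Li, Qingzhao", "Wang, Hsi-chih"}
print(check_headings(pairs, authority_file))
```

The third group corresponds to the no-hitters Wolven asked about; whether a machine should generate NARs for them was left open at the meeting.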


4C. Conversion of Subject Authorities

Flecker asked what conversion was being done in advance of machine conversion. Melzer said that headings for more than 260 Chinese conventional place names, and over 5300 related NARs had been changed to BGN-approved forms. Also, LC planned to begin converting subject headings on October 1, and direct that new subject headings use pinyin romanization. Identification and conversion of subject headings was entirely a manual task. Incremental conversion was intended to give other libraries an opportunity to keep up with the changes. Smith-Yoshimura and Flecker urged LC not to release converted subject headings until Day 1. They felt that it would be better to include the conversion of subject headings in bibliographic records as an aspect of the general conversion. It was suggested that a data dictionary be used to convert 650 fields. LC had not begun to convert subject headings, and Wiggins said LC would reconsider in light of the feelings of the group.


5. Day 2, Day 3

Day 2 was defined as the point at which a library would do all prospective cataloging (both original and copy) in pinyin. Presumably it is also the point at which a library would authorize the retrospective conversion of its existing catalog data by a utility. Wolven thought that after Day 2, no more of a library's records would need to be converted. Nadler said that, to diminish the length of time between Day 1 and Day 2, it would be helpful for libraries to receive a snapshot of their converted records as soon after conversion as possible. Wolven thought that if OCLC converted the WorldCat file backwards by date, then the difference between Day 1 and Day 2 could be reduced. Brown said that OCLC could convert by working backwards, through the OCLC database, thereby making recently created pinyin records available for copy cataloging.

OCLC was asked about its timeframe for conversion of bib records. Brown said that date has not yet been determined. Zhou thought that Day 2 could not occur until OCLC converts its bib records. Hurley wondered if Day 2 would occur when OCLC had completed its conversion, or only a certain percentage. Valentine said local libraries could make plans if they had some dates to work with. Flecker thought the timing of Day 2 was tied to the conversion process. Hurley said that Day 2 would occur when libraries looked for copy and only found pinyin records. Leighton agreed that when the database changes, then Day 2 would occur -- or a library would declare Day 2. Wolven thought Day 2 meant that a library would cease disseminating records that used Wade-Giles romanization.

Nadler recommended that LC should declare that Day 1 and Day 2 must happen in 2000. Wiggins said that LC would declare Day 1 for LC's romanization of Chinese; others can decide on their Day 1's or Day 2's depending on their own situations. It was then agreed that Day 2 would occur for a given library when the majority of recent records appear on the database in pinyin, and that from then on a library would perform all copy cataloging of Chinese material in pinyin.

Nadler saw an advantage to declaring that the difference between Day 1 and Day 2 should not exceed a certain period of time. Flecker said there would be bad effects if there is a long time between Day 1 and Day 2, or if there is a long time between RLG and OCLC conversions (because that would lengthen the time during which people would import Wade-Giles records). He hoped that the gap between Day 1 and Day 2 would be brief, perhaps no more than 6 months. Smith-Yoshimura said that the RLG conversion should take about 6 months to accomplish, but that it would apply only to Chinese records. Someone noted that, if RLG begins to convert on March 1 and Day 1 occurs on October 1, then there will be a gap. Valentine said that if this sort of gap occurs, then Michigan would not provide updates to RLIN during that time, but would only perform them locally; copy cataloging procedures would have to be revised. Brown said that OCLC would convert more than Chinese records, but will note opinions about the order in which conversion takes place.

Flecker thought that a Day 3 would occur, then, when it was no longer necessary to add a marker to bib records.


6. Timeline

Flecker said that he perceived a consensus developing around the following sort of timeframe and sequence of events:

Jan. 2000
  • decision on marker
  • OCLC defines options and services
[a gap period will occur; LC, OCLC and RLG will work to keep the gap period as short as possible, and to mitigate the negative impact]
October 1, 2000
  • Day 1 occurs: official starting date for pinyin
  • LC uses pinyin romanization for all cataloging
  • converted LC bibliographic records have been distributed
  • most NARs have been converted by OCLC and distributed by OCLC and LC
  • other libraries use pinyin romanization for original cataloging, and begin transition to copy cataloging in pinyin
April 1, 2001
  • conversion of OCLC and RLG files has been completed
  • all cataloging is done in pinyin by all libraries
October 1, 2001
  • Day 3 occurs: all cease using the pinyin marker

There was general agreement among participants from large collections that this sort of timeframe and sequence would be workable and desirable.

Brown and Wiggins said that OCLC, RLG and LC would consider these recommended dates as they work together to develop an implementation timeline. Brown cautioned that the dates should be described as tentative or 'wish dates'; OCLC will not be able to agree upon dates until it knows how much work it will take to reach the milestones. Wiggins also cautioned that, if the date of Day 1 is pushed too far back, LC would not be able to agree, and this could affect the relationship between the conversion of NARs and bib records.


7. Conversion of Non-Chinese Records

What plans have been made for conversion of Japanese and Korean records? Smith-Yoshimura said that, because of the time factor, RLG will convert Chinese records first. She said that after conversion of Chinese records, RLG could scan for "chi" in the 041 field at some time in the future, because those would be the most likely candidates among Japanese and Korean records to have headings in Wade-Giles. Taking this step would expand the scope of the project, and before doing so, RLG would have to estimate what would be involved. Brown said that OCLC is prepared to scan the whole database, then asked if that seemed like a wise approach. Flecker thought that it would be a great service to do so, and that OCLC is best equipped to make such a conversion, but he was concerned that the utility of doing so might be counterbalanced by the time it would take to accomplish.


8. Cleanup

Flecker asked Melzer what fields would remain unconverted after machine conversion of LC records. Melzer said that, at this time, he could only speculate, but thought that, among bib records, cleanup might primarily consist of manual conversion of unconverted subfields on Chinese records, and headings on non-Chinese records; among NARs, cleanup might include no-hitters, records that have been set aside because they would be more safely converted manually, and perhaps potential double-converts. He thought it likely that non-unique names would be converted manually.

Hurley asked if there was any way to work jointly on cleanup. Valentine wondered if Michigan's cleanup would be useful to anyone else. Flecker thought that without a way to distribute corrected records to appropriate libraries, one library's manual cleanup would not be helpful to other libraries. Furthermore, once a library's catalog had been converted, there may be no way of taking advantage of LC cleanup work either. Brown noted that OCLC has a notification service: if a record changes, you can use it to change your local record. Smith-Yoshimura said that updates on RLIN benefit one library; there would be shared benefit only if all records in a cluster would be updated. Flecker asked how RLG could propagate that change to others. Kellar suggested that NACO libraries could help with manual conversion or cleanup of NARs.

[Break]


9. Services Provided by the Utilities

Flecker asked about services that the utilities could make available to local libraries, many of which are intimidated by the prospect of conversion of local systems. He suggested that consensus by this group would help the utilities decide what services to offer. Local libraries also want to know what local vendors can do with converted records.

Brown said that OCLC would add a marker to all Chinese bib records, and otherwise only where there is a pinyin-related change. Libraries can send bib records and authorities to OCLC, where they will be converted. Will OCLC convert archival tapes or local records for local systems? Brown said that that is under consideration. Patton noted that OCLC has an archive of every update of every record they have received. Hurley said that if that service were to be made available, some libraries would send large sets to be converted. Leighton asked whether the OCLC archive included records from RLIN; Kellar said she would check. Brown said that OCLC is concerned that some records may change during the time it takes to process them, depending upon the timing of their conversions. Zhou asked about libraries that want only to have changed fields changed locally; several people said that would have to be a local system feature. Hurley and Flecker wanted to send records to OCLC and receive corresponding converted bibs and NARs in return. They said that most users would want their proper subset returned to them.

Would OCLC export all converted NARs? That would be helpful to local libraries, which could then take steps to suppress all non-matches. Brown said that option would be available, but at a certain cost. Kellar said, and Wiggins verified, that converted NARs would be exported to LC, and they would then be redistributed via CDS. What about libraries like Michigan, which gets its NARs from OCLC, not LC? Valentine said that some libraries would be interested in CDS distributing pinyin-converted NARs all together in a lump. Wiggins said he would check to see if that is possible.

Flecker said that libraries need to know what services are available, when they are available, and the cost of each. Brown said that OCLC needs to hear what libraries need to have done. Could OCLC report what conversion services they will offer at the Big Heads meeting at ALA Midwinter? Brown said yes, but would need more time to determine prices; a snapshot would be cheaper than sending files in for conversion. Smith-Yoshimura said that RLG would not charge for conversion, but will charge for a snapshot; there are discounts available for large batches.


10. Standard for Accuracy of Conversion

Flecker asked where the 'sensitivity level' should be set with regard to accuracy of conversion of bib records. Wiggins said he felt that a certain balance would have to be struck, and that we might reach something less than 100% accuracy in the interest of convenience. Nadler asked whether the same level of accuracy would be acceptable to East Asian libraries as to large collections and others. Smith-Yoshimura thought that would depend upon how much attention is given to the exceptions. Melzer thought that, as test results come in, there would be an attempt to strike a balance between spending time and money converting certain problematic subfields, vs. a pass-and-mark procedure that would involve manual cleanup.


11. Headings for Place Names in Taiwan

Although Taiwan has announced that it now plans to convert the romanization of most geographic locations to pinyin at some point in the future, BGN must await formal notification or documentation from the government of Taiwan before changing its official forms of name. Because LC is required to use headings in the form recommended by BGN, the Pinyin Task Group is reluctantly drafting specifications that would prevent conversion of headings for place names in Taiwan. Zhou cautioned that, if LC has no foresight in this regard, he anticipates a second conversion project to change these headings in the near future. He recommended that the list of non-converts be limited to a small number. Smith-Yoshimura suggested that, perhaps if BGN changes its headings, then Taiwan will follow suit. She requested that LC appeal to them to change. Wiggins said he would look into it; Melzer said he would convey the group's concern to CPSO.


12. Communications

Valentine thought that milestones were needed for communication. Nadler suggested communicating in layers. Valentine was concerned that, up to now, communication has not been to the whole library community. Wiggins said that LC, RLG and OCLC hold monthly conference calls; perhaps updates could be provided, based on these calls, in relation to the milestones that have been defined here. Smith-Yoshimura suggested that notes of this meeting could be posted on LC's pinyin home page, and felt that there had been a good deal of publicity already. Melzer collected the email addresses of all participants. LC planned to write up minutes of the meeting, send a draft to participants for editing in the week following the meeting, and then post the minutes on the pinyin home page.

Wiggins requested that time be set aside for an update on pinyin conversion at the Big Heads meeting at ALA Midwinter; Nadler, who is chair of the Big Heads group, agreed.

Zhou said that he wanted to post a brief summary of the meeting on behalf of CEAL right away; he will send draft notes to participants for feedback. He said that he was pleased because all points important to CEAL members were covered in the meeting.

(Notes by Lydia Hsieh, Thomas Tsai and Philip Melzer, Library of Congress)

Library of Congress
Library of Congress Help Desk (10/28/99)