Library of Congress
Pinyin Conversion Project

Frequently Asked Questions

Q - Why do the new pinyin romanization guidelines separate syllables, rather than joining them together into lexical units?

A - The Library of Congress posted and distributed draft pinyin romanization guidelines in 1997 and 1998, and received many comments and suggestions from LC staff as well as from respondents at other institutions. These comments were taken into consideration in drawing up the new guidelines, which were announced and made available to the library community in November 1998. Final guidelines were issued in August 1999 and have appeared on the Library's pinyin home page for more than a year.

The Library reached its decisions about the separation or joining of pinyin syllables carefully and painstakingly. After seeking input on the draft guidelines from the library community in 1997, the Library took the extraordinary step of sending out a special request for comment on that issue alone in the spring of 1998. Many comments were received from national libraries, bibliographic utilities, professional organizations, institutions, and individuals. Opinion was sharply divided. Proponents of both separated and joined syllables put forward well-reasoned arguments that their approach, and not the other, was faithful to the Chinese language, easier to use, and better for computer searching. Some contemporary Chinese dictionaries romanize the language in separated syllables, others in joined syllables. We did not find that Chinese text was being romanized on Chinese publications according to any particular pattern.

After careful consideration, it was decided that syllables should be separated, except in personal names and geographic locations, generally consistent with Wade-Giles practice. Syllables can be aggregated by the use of a joiner, as is the practice on RLIN, or divided by spaces, as is the practice on OCLC. The joiner can either be retained or stripped out of a machine record by other automated systems.

The decision was based on operational necessity, not on theoretical grounds. It was not feasible to link the conversion to vernacular data, because tens of thousands of the Library's Chinese bibliographic records are roman-only, and none of the 158,000 name authorities that were recently converted by OCLC include vernacular script.

Of necessity, we have relied on computer technology to perform the bulk of the conversion, as the National Library of Australia did when it converted to pinyin recently. It was important that the new pinyin romanization system be similar to the Wade-Giles system it is replacing. Continued separation of syllables has facilitated the machine conversion of Chinese text, and ensures that pre- and post-conversion files are compatible with each other. Moving from a system of separated syllables to one of joined syllables would make cleanup operations in bibliographic utilities and libraries prohibitively expensive and time-consuming.
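
As a rough illustration of why separated syllables lend themselves to machine conversion, the following Python sketch converts romanized text one syllable at a time by table lookup. It is not the conversion program used by the Library or by OCLC; the function name and the tiny mapping table are assumptions made for this example, and a real table would have to cover every Wade-Giles syllable.

```python
# Illustrative subset of Wade-Giles -> pinyin syllable pairs (not a full table).
WG_TO_PINYIN = {
    "chung": "zhong",
    "kuo": "guo",
    "pei": "bei",
    "ching": "jing",
    "t'ai": "tai",
    "tai": "dai",
    "hsüeh": "xue",
}


def convert_syllables(text: str) -> str:
    """Convert space-separated Wade-Giles syllables, one token at a time.

    Hypothetical helper: tokens missing from the table are passed through
    unchanged so that they can be reviewed by hand.
    """
    converted = []
    for token in text.split():
        pinyin = WG_TO_PINYIN.get(token.lower())
        if pinyin is None:
            converted.append(token)  # unknown syllable: leave untouched
        elif token[0].isupper():
            converted.append(pinyin.capitalize())  # preserve initial capital
        else:
            converted.append(pinyin)
    return " ".join(converted)


print(convert_syllables("Chung kuo"))  # prints "Zhong guo"
```

Because each syllable maps independently, the output has the same number of tokens in the same positions as the input. No decision about where one word ends and the next begins is needed, which is the kind of judgment that joined syllables would require.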

The new Chinese romanization guidelines provide for the accurate and systematic representation of the sounds of the Chinese language in roman letters, so that the letters and syllables can be utilized for reliable identification, filing and retrieval of information. The guidelines are intended to serve the bibliographic and information needs of the Library, as well as the interests of the wider library community. Insofar as possible, they are intended to be clear and unambiguous; to be easy for library users to learn and to follow; to lend themselves to the greatest possible consistency of application; and to best facilitate the changeover from the Wade-Giles system.

We believe that the Chinese guidelines for connection of syllables are generally intended for the modern language. Like many other libraries, the Library of Congress collects and catalogs material representing the full range of human knowledge, including Chinese classical texts and books on many esoteric and technical subjects. We routinely catalog ancient texts, whose terminology simply cannot be understood in roman form, as well as recently published scientific and technical material whose vocabulary is often new and unique. For this reason, the Library was concerned that connection of syllables would introduce an element of subjectivity into the romanization of Chinese that is not present when syllables are divided -- a subjectivity that would inevitably lead to variance of practice and add to the cost of operations.

In the near future, the Chinese romanization guidelines will be refined and additional examples will be provided. The Library welcomes comments and suggestions that will promote the consistent and accurate application of the new Chinese romanization guidelines.

Library of Congress Help Desk (March 13, 2001)