Use the CJK Compatibility Database to quickly and conveniently replace non-MARC21 characters with their MARC21 equivalents or, where no equivalent exists, with the missing character symbol.
Non-MARC21 characters and their MARC21 equivalents
The Unicode character set includes several hundred duplicate CJK characters (for example, 路, F937, and 路, 8DEF), as well as many others that represent close variants (for example, 步, 6B65, and 歩, 6B69). Generally, one of these variants is a MARC21 character, while the other is not.
Only MARC21 characters can be validated in USMARC records. However, sometimes the most logical way to create a character using a Microsoft IME produces a non-MARC21 character. For example, if one creates the common character 李 by keying 이 in the Korean IME, the result is a non-MARC21 character (F9E1). One must key in 리 to create the valid MARC21 form, 李, 674E.
The character 歩, 6B69, is created with the Japanese IME. But the Japanese form of this character is not a valid MARC21 form. The valid MARC21 equivalent, 步, 6B65, can only be created by the Korean or Chinese IME.
Only MARC21 characters can be validated properly in a MARC21 bibliographic record. Therefore, a non-MARC21 character in a bibliographic record must be replaced by its MARC21 equivalent.
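The difference between duplicates and close variants can be illustrated with a short Python sketch. It relies only on the standard unicodedata module and on the code points quoted above; the variable names are illustrative. Unicode normalization folds a compatibility duplicate such as F937 into 8DEF, but it leaves a close variant such as 6B69 unchanged, which is why variants need an explicit lookup table. Note that normalization only shows how Unicode relates these characters; it does not by itself tell you whether the result is a MARC21 character.

```python
import unicodedata

# Code points quoted in the text above, written as escapes so that the
# example does not depend on how this page itself was encoded.
DUPLICATE_LU = "\uF937"   # duplicate of U+8DEF
DUPLICATE_YI = "\uF9E1"   # produced by keying 이 in the Korean IME
VARIANT_JA = "\u6B69"     # Japanese form, close variant of U+6B65


def code_point(ch: str) -> str:
    """Return the four-digit hexadecimal Unicode value of one character."""
    return f"{ord(ch):04X}"


# NFC normalization replaces compatibility duplicates with the character
# they are canonically equivalent to.
normalized_lu = unicodedata.normalize("NFC", DUPLICATE_LU)
normalized_yi = unicodedata.normalize("NFC", DUPLICATE_YI)

# A close variant is a separate character in its own right, so
# normalization leaves it alone.
normalized_variant = unicodedata.normalize("NFC", VARIANT_JA)

print(code_point(normalized_lu))       # 8DEF
print(code_point(normalized_yi))       # 674E
print(code_point(normalized_variant))  # 6B69
```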
The CJK Compatibility Database
The CJK Compatibility Database includes more than 450 non-MARC21 Chinese, Japanese and Korean characters, Hangul syllables and diacritic marks, matched with their MARC21 equivalents. The list of characters in the database was initially identified by LC staff, and was supplemented by entries in a similar database at Yale University. Characters that do not have a MARC21 equivalent are matched with the missing character symbol 〓.
The database is intended to enable catalogers to quickly and conveniently replace a non-MARC21 character with its MARC21 equivalent. Directions are given below.
The entire list may be viewed by clicking on the tab entitled Browse all entries. The list gives the Unicode value for each character, along with other information that may be helpful in identifying the characters and describing how the MARC21 character may be input.
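The matching that the database performs can be pictured as a simple lookup table. The Python sketch below is only an illustration under stated assumptions: the table holds just the three pairs mentioned on this page, the function name replace_non_marc21 is hypothetical, and the entry that maps to the missing character symbol uses a made-up placeholder character, not a real database entry. The actual database contains more than 450 entries and is not reproduced here.

```python
MISSING_CHARACTER = "\u3013"  # the missing character symbol, 〓

# Illustrative excerpt only: non-MARC21 character -> MARC21 equivalent.
# These three pairs are the ones quoted on this page.
EQUIVALENTS = {
    "\uF937": "\u8DEF",
    "\uF9E1": "\u674E",
    "\u6B69": "\u6B65",
}


def replace_non_marc21(text, table=EQUIVALENTS):
    """Replace every character listed in the table with its mapped value.

    Characters that are not in the table are left unchanged.
    """
    return text.translate(str.maketrans(table))


# A character with no MARC21 equivalent is matched with the missing
# character symbol. U+E000 is a hypothetical placeholder for such a
# character, used here only to show the mechanism.
demo_table = dict(EQUIVALENTS)
demo_table["\uE000"] = MISSING_CHARACTER

fixed = replace_non_marc21("\u6B69\uE000", demo_table)
print([f"{ord(ch):04X}" for ch in fixed])  # ['6B65', '3013']
```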
Updating This Database
The database is a cooperative undertaking, and is intended for the use of all CJK catalogers. If you encounter a non-MARC21 character in the course of your work, please report it to us so that it can be added to the database. Notify Young Ki Lee, Senior Cataloging Specialist, Korean/Chinese Team, Library of Congress, at [email protected]
Directions
Replace a non-MARC21 character with its valid MARC21 equivalent by following these steps:
1) Copy the invalid character from your bibliographic record
2) Open the CJK Compatibility Database Page
3) Paste the invalid character in the white box and select the index "Invalid character"
4) Click "Submit"
Another screen will then appear with the valid alternative
5) Copy the valid alternative character or missing character symbol
6) Paste the valid alternative into your bibliographic record
Note: Characters can also be found by entering the Unicode value (UTF) of the valid MARC21 character, the Unicode value of the non-MARC21 variant, the pronunciation, or the Han'gŭl reading.
When an invalid character is not available in LC's CJK Compatibility Database, search the Unihan database to find its Unicode value (i.e., its UTF-16 code). For detailed instructions, see the PCC CJK NACO Best Practices page.
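If you have the character itself but not its Unicode value, a few lines of Python can report the value to use in a search. This is a minimal sketch using only the standard library; the function name describe is hypothetical. For characters such as those discussed on this page, the four-digit Unicode value and the UTF-16 code are the same.

```python
import unicodedata


def describe(text):
    """List (Unicode value, UTF-16 code, Unicode name) for each character."""
    rows = []
    for ch in text:
        unicode_value = f"{ord(ch):04X}"
        utf16_code = ch.encode("utf-16-be").hex().upper()
        name = unicodedata.name(ch, "<no name>")
        rows.append((unicode_value, utf16_code, name))
    return rows


# The two forms discussed above: the Japanese form and its MARC21 equivalent.
for row in describe("\u6B69\u6B65"):
    print(*row)
# 6B69 6B69 CJK UNIFIED IDEOGRAPH-6B69
# 6B65 6B65 CJK UNIFIED IDEOGRAPH-6B65
```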
Try: 金 鶴 娳 歩