Character Set Negotiation -- Implementation Level and Collections

Issue raised by: Sabrina Huang, 16 Dec 1998 17:09:24 -0000

In the Character Set and Language Negotiation definition, what do "implementationLevel" and "collections" refer to?

  1. Implementation Level
    ISO 10646 defines three implementation levels, numbered 1 through 3. They pertain primarily to the use of "combining characters": one or more combining characters may be combined with a base character to encode a particular element of text. Some text elements that can be encoded this way are also encoded as single characters. For example, the single character LATIN SMALL LETTER A WITH GRAVE can also be represented by a composite sequence: LATIN SMALL LETTER A followed by COMBINING GRAVE ACCENT. A complete list of combining characters is supplied in B.1 of ISO 10646-1.

    • Level 3 permits all combining characters. (Thus in level 3 a given text element may have more than one representation; as in the example above, LATIN SMALL LETTER A WITH GRAVE may appear either as the single character or as the composite sequence described.)
    • In Level 2, certain combining characters are not permitted; the list of those not permitted in level 2 is supplied in B.2 of ISO 10646-1.
    • In Level 1, combining characters are not permitted at all.
    • In addition, in levels 1 and 2, characters from the HANGUL JAMO block are not permitted.
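
The level rules above can be illustrated with a minimal Python sketch. It is an approximation under stated assumptions, not a conformance test: it treats Unicode general categories Mn, Mc and Me as "combining characters" instead of using the actual list in B.1 of ISO 10646-1, and it assumes the HANGUL JAMO block is U+1100 through U+11FF. The function names are hypothetical. Level 2 is not checked, because that would require the B.2 list, which is not reproduced here.

```python
import unicodedata

# Assumption: the HANGUL JAMO block occupies U+1100..U+11FF.
HANGUL_JAMO = range(0x1100, 0x1200)


def has_combining_marks(text: str) -> bool:
    """Approximate test: any character in a Unicode mark category (Mn, Mc, Me).

    The authoritative list is B.1 of ISO 10646-1; this is only a stand-in.
    """
    return any(unicodedata.category(ch).startswith("M") for ch in text)


def has_hangul_jamo(text: str) -> bool:
    """True if any character falls in the assumed HANGUL JAMO range."""
    return any(ord(ch) in HANGUL_JAMO for ch in text)


def fits_level_1(text: str) -> bool:
    """Level 1 as described above: no combining characters, no HANGUL JAMO."""
    return not has_combining_marks(text) and not has_hangul_jamo(text)


# The two representations from the example in the text.
precomposed = "\u00E0"   # LATIN SMALL LETTER A WITH GRAVE
composite = "a\u0300"    # LATIN SMALL LETTER A + COMBINING GRAVE ACCENT

if __name__ == "__main__":
    print(fits_level_1(precomposed))
    print(fits_level_1(composite))
    # Both forms denote the same text element; NFC composes the sequence.
    print(unicodedata.normalize("NFC", composite) == precomposed)
```

The precomposed form passes the level 1 check while the composite sequence does not, which is the practical consequence of the level chosen during negotiation.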

  2. Collections
    In the negotiation definition, "collections" is an object identifier of the form
    1.0.10646.1.x.y.collection1.collection2. ...
    where:
    • x is the implementation level (as described above).
    • y is either 0 or 1. Generally, the value will be 1. (The usage of value 0 is currently not well understood by the Z39.50 community; the value 1 should be used until this clarification is updated. Annex M of ISO 10646-1 may be consulted for details.)
    • collection1, collection2, ... refer to the "collections" specified in Annex A of ISO 10646-1. For example, collection number 1 is "Basic Latin" (positions 0020-007E) and collection number 2 is "Latin-1 Supplement" (positions 00A0-00FF).
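
As an illustration of the object identifier layout described above, the following Python sketch builds and parses the dotted form. The function names and the returned dictionary keys are hypothetical and chosen only for this example; the prefix 1.0.10646.1 and the meaning of x, y and the collection arcs are taken from the text. The default of y = 1 follows the recommendation above.

```python
# Fixed leading arcs of the identifier, as given in the text: 1.0.10646.1
PREFIX = (1, 0, 10646, 1)


def build_collections_oid(level, collections, y=1):
    """Return the dotted OID string 1.0.10646.1.x.y.collection1.collection2...

    level       -- implementation level x (1, 2 or 3)
    collections -- iterable of collection numbers from Annex A of ISO 10646-1
    y           -- 0 or 1; 1 is the value recommended in the text
    """
    if level not in (1, 2, 3):
        raise ValueError("implementation level must be 1, 2 or 3")
    if y not in (0, 1):
        raise ValueError("y must be 0 or 1")
    arcs = PREFIX + (level, y) + tuple(collections)
    return ".".join(str(arc) for arc in arcs)


def parse_collections_oid(oid):
    """Split a dotted OID string back into level, y and collection numbers."""
    arcs = tuple(int(part) for part in oid.split("."))
    if len(arcs) < 6 or arcs[:4] != PREFIX:
        raise ValueError("not an OID of the form 1.0.10646.1.x.y...")
    return {"level": arcs[4], "y": arcs[5], "collections": list(arcs[6:])}


if __name__ == "__main__":
    # Level 3, y = 1, collections 1 (Basic Latin) and 2 (Latin-1 Supplement).
    oid = build_collections_oid(3, [1, 2])
    print(oid)
    print(parse_collections_oid(oid))
```

With level 3, y = 1 and collections 1 and 2, the identifier is 1.0.10646.1.3.1.1.2.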

Status: Approved (August 1999).
Library of Congress