The Library of Congress >> Especially for Librarians and Archivists >> Standards

MARC Standards

HOME >> MARC Development >> Discussion Paper List


MARC DISCUSSION PAPER NO. 2010-DP05


DATE: May 17, 2010
REVISED:

NAME: Language Coding for Moving Images in Field 041 of the MARC 21 Bibliographic Format

SOURCE: Online Audiovisual Catalogers (OLAC)

SUMMARY: This paper suggests revising the application of 008/35-37 and 041 $a and $j for moving image materials to create a spoken/sung/signed versus written language distinction.  In addition it suggests distinguishing between original language and language of intermediate translations that are both currently coded in subfield $h.

KEYWORDS: Field 041 (BD); Language code (BD); Original language (BD)

RELATED:

STATUS/COMMENTS:
5/17/10 - Made available to the MARC 21 community for discussion.

6/27/10 - Results of MARC Advisory Committee discussion: Because field 041 is used widely for all forms of material, how any changes would affect different forms of material (other than moving images) needs to be carefully considered in any future proposal. This should include having some complex examples, for instance for sound recordings. However, there was not consensus as to whether the paper should be brought back as a proposal, and some participants suggested that we may be asking one field to do too much. OLAC will reexamine the issues in the discussion paper and determine whether to pursue it further.


Discussion Paper No. 2010-DP05: Language Coding in Field 041

1. INTRODUCTION

Character position 008/35-37 (Language) and field 041 (Language code) work together to convey language information about a resource. This includes both information about multiple languages and more granular information about a part or aspect of the resource that has additional language characteristics, e.g., language of accompanying materials, language of summaries, etc.  Specific instructions are given for special forms of material. The field description specifies the use of field 041 in addition to 008/35-37 for moving image materials in the following situations:

In 2007, a new 041 $j was added as a separate subfield for video subtitles and captions (MARC proposal 2007-01), which previously were included along with summaries/abstracts in subfield $b. The Online Audiovisual Catalogers (OLAC) now request that several adjustments be made for language coding of moving image materials.  This will allow for more useful presentations of language information to users and enable more useful search limits and search options.  The proposed changes are to:

-  Revise the usage guidelines of 008/35-37 and 041 $a and $j for moving image materials to clearly create a spoken/sung/signed versus written language distinction. The current guidelines do not work well for DVDs, which often have subsidiary spoken languages.
-  Separate the original language and the intermediate translations of the main works into separate subfields in 041. This would also be particularly useful for film and literature, and possibly for other materials.

Field 041 is currently defined as follows:

First Indicator
Translation indication
# - No information provided
0 - Item not a translation/does not include a translation
1 - Item is or includes a translation

Second Indicator 
Source of code
# - MARC language code
7 - Source specified in subfield $2

Subfields
$a - Language code of text/sound track or separate title (R)
$b - Language code of summary or abstract (R)
$d - Language code of sung or spoken text (R)
$e - Language code of librettos (R)
$f - Language code of table of contents (R)
$g - Language code of accompanying material other than librettos (R)
$h - Language code of original and/or intermediate translations of text (R)
$j - Language code of subtitles or captions (R)
$2 - Source of code (NR)
$6 - Linkage (NR)
$8 - Field link and sequence number (R)
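
The field definition above can be sketched as a small data structure. The following is a minimal Python illustration, not part of the MARC format or of any MARC library: the `Field041` class and the `parse_041` function are hypothetical names, and the parser assumes the display notation used in this paper (tag, two indicator characters, then `$`-delimited subfields).

```python
from dataclasses import dataclass, field

# Subfield labels copied from the field 041 definition above.
SUBFIELD_LABELS = {
    "a": "Language code of text/sound track or separate title",
    "b": "Language code of summary or abstract",
    "d": "Language code of sung or spoken text",
    "e": "Language code of librettos",
    "f": "Language code of table of contents",
    "g": "Language code of accompanying material other than librettos",
    "h": "Language code of original and/or intermediate translations of text",
    "j": "Language code of subtitles or captions",
}


@dataclass
class Field041:
    """Hypothetical in-memory form of one 041 field."""
    ind1: str                                       # translation indication
    ind2: str                                       # source of code
    subfields: list = field(default_factory=list)   # (code, value) pairs, in order

    def values(self, code):
        """Return every value recorded under one subfield code."""
        return [v for c, v in self.subfields if c == code]


def parse_041(line):
    """Parse display notation such as '041 1# $a eng $j fre' (assumed layout)."""
    head, _, rest = line.partition("$")
    tag, indicators = head.split()
    if tag != "041" or len(indicators) != 2:
        raise ValueError("not a field 041 line: " + line)
    subfields = []
    for chunk in rest.split("$"):
        chunk = chunk.strip()
        if chunk:
            # First character is the subfield code; the rest is the value.
            subfields.append((chunk[0], chunk[1:].strip()))
    return Field041(indicators[0], indicators[1], subfields)


parsed = parse_041("041 1# $a eng $a fre $j eng")
print(parsed.ind1, parsed.values("a"), parsed.values("j"))
```

Keeping the subfields as an ordered list of pairs, rather than a dictionary, preserves the repeatability (R) of each subfield and the order in which the codes were recorded.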

2. DISCUSSION

2.1 Recommendations for coding language information for moving image materials

The two proposed changes grew out of a discussion at an OLAC meeting about remaining ambiguities in video language coding after the approval of 041 $j. An OLAC task force was thus charged with developing best practices for coding language information for moving image materials, particularly DVDs, and with examining whether any changes could be made to the MARC format (coding or directions) that would improve access to the multiple types of language information found on videos. The Video Language Coding Best Practices Task Force's draft proposals can be found at http://www.olacinc.org/drupal/?q=node/36.

Another problem with the current usage guidelines is that it is not possible to unambiguously identify spoken languages on videos because the language of intertitles on silent films and the language of packaging, credits, or scripts are also included in 008/35-37 and 041 $a for films with no spoken content.

The task force based its recommendations on an examination of a variety of language situations that occur in video cataloging and on the following premises:

  1. Coded language data is intended for use in retrieval, limiting, and sorting.
  2. Coded language data does not need to describe all language-related information about an item that might be of interest to users. Coded language information can be expanded on and complemented by information in 546 free text language notes.
  3. Coded language data is most effective when it supports the retrieval of the language(s) of the main work(s) on the item, rather than the language(s) of supplementary and bonus materials.
  4. Coded language data is most effective when it supports retrieval based on language(s) in which the item is usable, rather than all language(s) that might be found in the item.
  5. For moving image materials, patrons are most interested in retrieving, limiting, and sorting by the following types of language information:

2.2 Making a distinction between spoken/sung/signed languages versus written languages

After reflecting on the purpose of recording coded language information in MARC bibliographic records for moving image materials, OLAC believes that it is more important to distinguish between spoken and written language access than to maintain the current division of “main” language(s) recorded in 008/35-37 and 041 $a and “subsidiary” language(s) (i.e. subtitles and captions) recorded in 041 $j. Since the advent of DVDs, this distinction has not worked well because 008/35-37 and 041 $a can contain both the “main” spoken language and subsidiary alternate soundtracks.

Coded language data is intended for use in retrieval, limiting, and sorting. OLAC considered that the most logical way to limit data for moving images is by spoken or written language.  This distinction might also enable the generation of tabular displays of language information that would be more concise than field 546 notes.

The materials that would be affected by this change are primarily either silent films with intertitles or videos with no spoken content and no written content other than credits (e.g., some musical performances or artistic documentaries). Although sign language does not lend itself to this sort of distinction, discussion on the OLAC-list suggests that it would best be grouped with spoken and sung languages. Further examples of the types of situations that would be affected by this change may be found in OLAC’s draft Video Language Coding Best Practices and Recommendations (http://www.olacinc.org/drupal/?q=node/36).

The changes indicated above would be implemented by the following wording changes in the MARC documentation:

Current: 008/35-37 (Language)
For visual materials, (excluding original or historical projectable graphics), the language content is defined as the sound track, the accompanying sound, the overprinted titles (subtitles) or separate titles (for silent films), sign language when it is the sole medium of communication, or the accompanying printed script (for works with no sound or, if with sound, no narration).

Proposed: 008/35-37 (Language)
For moving image visual materials, code for the sound track, the accompanying sound, or sign language. For works with no sound or, if with sound, no narration, use zxx (no linguistic content).

Current:  zxx - No linguistic content
Item has no sung, spoken, or written textual content. Examples of such items are: 1) instrumental or electronic music; 2) sound recordings consisting of nonverbal sounds; 3) visual materials with no narration, printed titles, subtitles, captions, etc.; 4) computer files that consist of no more than the machine language (e.g., COBOL) or character codes (e.g., ASCII) used in source programs.

Proposed:  zxx - No linguistic content
Item has no sung, spoken, or written textual content. Examples of such items are: 1) instrumental or electronic music; 2) sound recordings consisting of nonverbal sounds; 3) moving image visual materials with no narration, i.e., no sound or with sound but no narration; ...

Current: $a - Language code of text/sound track or separate title   
For visual materials, subfield $a contains the code(s) of languages associated with the item, as well as any language code(s) of the languages of accompanying printed script or accompanying sound. Language code(s) of all languages of other accompanying material are recorded in subfield $g.

Proposed: $a - Language code of text/sound track or separate title  
For moving image visual materials, subfield $a contains the code(s) of spoken languages associated with the item, as well as sign language and any language code(s) of the languages of accompanying sound.

Current: $j - Language code of subtitles or captions
Language code(s) of subtitles or captions (open or closed, intended for users with hearing disabilities).

Proposed: $j - Language code of intertitles, subtitles, or captions
Language codes for written languages providing access to moving image materials, such as intertitles, subtitles, or captions (open or closed, intended for users with hearing disabilities). It does not include the languages of the credits, packaging, or accompanying material. If needed, the language of credits is recorded in field 546 (Language Note) and the language of packaging or accompanying material is recorded in 041 subfield $g (Language code of accompanying material other than librettos). The language(s) are recorded in English alphabetical order.
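
As a rough illustration of how the proposed wording might be applied, the sketch below derives 008/35-37 and the 041 subfields from three lists of language codes. It rests on several assumptions that are not stated in the proposal: the function name `code_languages` is hypothetical, the first spoken language supplied is treated as the one to record in 008/35-37, subfield $a is simply omitted when there is no spoken, sung, or signed language, and "English alphabetical order" is read as ordering by the English name of the language, using a small illustrative lookup table rather than the full MARC code list.

```python
# Illustrative subset of MARC language codes and their English names.
LANGUAGE_NAMES = {
    "ara": "Arabic", "chi": "Chinese", "eng": "English", "fre": "French",
    "ger": "German", "kor": "Korean", "por": "Portuguese",
    "spa": "Spanish", "tha": "Thai",
}


def code_languages(spoken, written, accompanying=()):
    """Return (008/35-37 value, list of (subfield, code)) under the proposed rules.

    spoken       -- spoken, sung, or signed languages, primary language first (assumption)
    written      -- languages of intertitles, subtitles, or captions
    accompanying -- languages of packaging or accompanying material
    """
    # No sound, or sound without narration: code 008/35-37 as zxx.
    lang_008 = spoken[0] if spoken else "zxx"

    subfields = [("a", code) for code in spoken]
    subfields += [("g", code) for code in accompanying]
    # Written access languages go in $j, ordered by English language name.
    ordered = sorted(written, key=lambda code: LANGUAGE_NAMES.get(code, code))
    subfields += [("j", code) for code in ordered]
    return lang_008, subfields


lang, subs = code_languages(
    spoken=["eng", "fre", "spa"],
    written=["eng", "spa", "chi", "tha", "por", "fre"],
    accompanying=["eng"],
)
print(lang, " ".join("$" + c + " " + v for c, v in subs))
```

Credits are deliberately left out of the inputs, since the proposed wording sends the language of credits to field 546 rather than to a coded subfield.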

2.3 New optional 041 subfield for original language of work(s)

For moving image resources it is also important to provide access to the original language of the moving images. This has been discussed on the OLAC-list a number of times and many catalogers have heard from public services librarians or users that they want to be able to search or limit by this information. Users in many situations are interested in films that were originally in French, Spanish, Arabic, etc. and we do not currently have an effective way to provide this information.

Subfield $h is defined in field 041 as Language code of original and/or intermediate translations of text, which mingles the language of the original with that of the intermediate translations. It is only used when something is a translation, and moving image resources that are in another language are not necessarily considered a translation of the original.

A subfield is desired for the primary language(s) of the original work. In the case of moving images, this would generally be a spoken language, but for films that were originally silent with intertitles, it would be a written language. The current proposal does not provide a reliable way to distinguish between original spoken and original written languages. It is not clear to OLAC that the benefits of encoding this distinction would outweigh the added complexity of doing so.

Three options might be considered:

Option 1:  Create a new subfield for intermediate translations and reserve $h for the original language.  This would be reasonable if it can be shown that most current uses of $h are for the original language, with only a small percentage for intermediate translations.  This is the approach taken when $j was added.  MARBI decided to leave the existing $b for the most common usage (summary/abstract) and move the less commonly used (overprinted title/subtitle) into the new subfield.

Option 2:  Create a new subfield for the language of the original and narrow $h to the language of intermediate translations.

Option 3:  Make $h obsolete and create two new subfields, one for the original and one for the intermediate translations. 

Based on feedback that OLAC received on the draft Video Language Coding Best Practices and Recommendations (http://www.olacinc.org/drupal/?q=node/36), catalogers think that such a subfield would be useful for more materials than just film and video. Therefore it is suggested that a subfield for original language be identified for field 041 for use for all resource types.

3. EXAMPLES ($? stands for the proposed original-language subfield in the examples below)

3.1. English language film with English, French, or Spanish soundtracks; closed-captioned in English; optional subtitles in English, French, Spanish, Portuguese, Chinese, or Thai. English packaging and menus.

008/35-37 eng
041 1# $a eng $a fre $a spa $g eng $j eng $j chi $j fre $j por $j spa $j tha $? eng
546 ## Closed-captioned; English or dubbed French or Spanish soundtrack; optional English, French, Spanish, Portuguese, Chinese, or Thai subtitles

Tabular format:  
Spoken: English; French; Spanish
Subtitles/Captions/Intertitles: Chinese; English; French; Portuguese;
  Spanish; Thai
Original: English
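
A tabular display like the one above could be generated directly from the coded subfields. The following Python sketch is illustrative only: `tabular_display` is a hypothetical function, the `?` code stands in for the not-yet-assigned original-language subfield as in these examples, the name lookup is a small subset of the MARC language code list, and names are sorted alphabetically within each row.

```python
# Illustrative subset of MARC language codes and their English names.
LANGUAGE_NAMES = {
    "chi": "Chinese", "eng": "English", "fre": "French",
    "por": "Portuguese", "spa": "Spanish", "tha": "Thai",
}

# Row label and the 041 subfield it is drawn from ("?" is a placeholder).
ROWS = [
    ("Spoken", "a"),
    ("Subtitles/Captions/Intertitles", "j"),
    ("Original", "?"),
]


def tabular_display(subfields):
    """Build one display line per row from (subfield code, language code) pairs."""
    lines = []
    for label, code in ROWS:
        names = sorted(
            {LANGUAGE_NAMES.get(value, value)
             for sub, value in subfields
             if sub == code and value != "zxx"}   # zxx means no linguistic content
        )
        if names:
            lines.append(label + ": " + "; ".join(names))
    return "\n".join(lines)


# Subfields of example 3.1 as (code, value) pairs.
example_3_1 = [
    ("a", "eng"), ("a", "fre"), ("a", "spa"), ("g", "eng"),
    ("j", "eng"), ("j", "chi"), ("j", "fre"), ("j", "por"),
    ("j", "spa"), ("j", "tha"), ("?", "eng"),
]
print(tabular_display(example_3_1))
```

The language of packaging in $g is ignored here because the tabular format shown above has no row for it; a display that needed one could add an entry to `ROWS`.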

3.2. Symphony performance; no spoken/sung language; credits in German; disc menu and packaging in English.

008/35-37 zxx
041 1# $g eng

3.3. A Chaplin silent film on DVD with multiple subtitle tracks.

008/35-37 zxx
041 1# $a zxx $j eng $j chi $j fre $j kor $j por $j spa $j tha $? eng
546 ## Silent film with English intertitles and musical acc.; optional French, Spanish, Portuguese, Chinese, Thai, or Korean subtitles. Optional audio commentary track in English. Menus in English, Spanish or Portuguese.

3.4. An Algerian DVD that is a clear mixture of French and Arabic. The characters often switch between the two languages within a sentence and depending on who they are talking to use either French or Arabic. No subtitles.

008/35-37 ara
041 0# $a ara $a fre $? ara $? fre
546 ## Dialogue consists of a mixture of Arabic and French.

3.5. Joyeux Noël. Soundtrack of DVD and original film in English, French, and German; optional English, Spanish, or Portuguese subtitles.

008/35-37 fre
041 1# $a fre $a eng $a ger $j eng $j por $j spa $h fre $h eng $h ger $? fre $? eng $? ger
546 ## Soundtrack in a mixture of French, English and German; optional English, Spanish, or Portuguese subtitles.

3.6. A hypothetical DVD of a French film that has optional English, French, or German soundtracks and English, Spanish, or Portuguese subtitles.

008/35-37 fre
041 1# $a fre $a eng $a ger $j eng $j por $j spa $h fre $? fre
546 ## French, English, or German soundtracks; optional English, Spanish, or Portuguese subtitles.
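
The search limits that motivate these changes can be sketched against three of the examples above. This is a simplified illustration rather than a description of any catalog system: `RECORDS` and `limit` are hypothetical names, each record is reduced to its 041 subfields, and `?` again stands in for the proposed original-language subfield.

```python
# 041 subfields from examples 3.3, 3.4, and 3.6, as (subfield code, language code) pairs.
RECORDS = {
    "3.3 Chaplin silent film": [
        ("a", "zxx"), ("j", "eng"), ("j", "chi"), ("j", "fre"), ("j", "kor"),
        ("j", "por"), ("j", "spa"), ("j", "tha"), ("?", "eng"),
    ],
    "3.4 Algerian DVD": [
        ("a", "ara"), ("a", "fre"), ("?", "ara"), ("?", "fre"),
    ],
    "3.6 French film": [
        ("a", "fre"), ("a", "eng"), ("a", "ger"), ("j", "eng"),
        ("j", "por"), ("j", "spa"), ("h", "fre"), ("?", "fre"),
    ],
}


def limit(records, subfield, language):
    """Return the titles whose 041 field carries the language in the given subfield."""
    return sorted(
        title for title, subfields in records.items()
        if (subfield, language) in subfields
    )


print("Spoken French:    ", limit(RECORDS, "a", "fre"))
print("English subtitles:", limit(RECORDS, "j", "eng"))
print("Originally French:", limit(RECORDS, "?", "fre"))
```

Because spoken languages sit only in $a and written access languages only in $j under the proposed guidelines, a limit on spoken English returns the French film with its optional English soundtrack but not the silent film, whose only English content is intertitles.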

4. QUESTIONS FOR DISCUSSION

4.1. What impact would these changes in the guidelines have on the use of existing subfields? Will clean-up of existing records be necessary?

4.2. How do these changes apply to forms of material other than moving images?

4.3. Which of the options for distinguishing the original language should be pursued?

4.4. Are there other fields that could be used to convey this information? (In terms of providing access to the original language of the work, OLAC considered using 655 genre/form headings, but catalogers expressed a preference for not including this information here, at least for moving images. OLAC also considered the development of work records that would include original language, but believes that this solution would be more complicated and would take a longer time to implement.)


( 12/21/2010 )