ALTO

| Element | Attribute name | Description |
|---|---|---|
| TextBlock | language | ISO 639-2 language code |
| String | CONTENT | String content (word) |
| | SUBS_TYPE | HypPart1: the content is the first part of a hyphenated word. Applies only to the last word of a line, if that word is hyphenated |
| | | HypPart2: the content is the second part of a hyphenated word. Applies only to the first word of a line, if that word is hyphenated |
| | SUBS_CONTENT | Complete content of a hyphenated word |
| | WC | Word Confidence: confidence level of the OCR result for this string. A float value between 0 (unsure) and 1 (confident) |
| | CC | Confidence level of each character in the string. A list of numbers, one number between 0 (confident) and 9 (unsure) for each character |
| | STYLEREFS | Text style used for this string, if it differs from the parent text block style |
| | STYLE | Any combination of font styles (italics, bold, …) |
| | ALTERNATIVE | (element) Any number of alternative strings to be used instead |
| Illustration | TYPE | A user-defined description of the type of the illustration |
| | FILEID | A link to a separate file that contains just the illustration |
| ComposedBlock | TYPE | A user-defined description of the type of the composed block |
| | FILEID | A link to a separate file that contains just the composed block |

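
As a rough illustration of how the String attributes above fit together, the following Python sketch reads a small, made-up ALTO fragment, merges the two halves of a hyphenated word via SUBS_CONTENT, and pairs each character with its CC value. The sample XML, the function names `logical_words` and `char_confidences`, and the choice to keep the lower WC of the two halves are assumptions of this sketch, not part of ALTO. Real ALTO files declare an XML namespace, which is omitted here for brevity, and the sketch assumes CC is written as single digits, as the table describes.

```python
import xml.etree.ElementTree as ET

# Made-up fragment for illustration only; real ALTO files declare a namespace.
SAMPLE = """
<TextBlock ID="TB1" language="eng">
  <TextLine>
    <String CONTENT="A" WC="0.98" CC="0"/>
    <String CONTENT="docu" SUBS_TYPE="HypPart1" SUBS_CONTENT="document"
            WC="0.80" CC="0 1 0 3"/>
    <HYP CONTENT="-"/>
  </TextLine>
  <TextLine>
    <String CONTENT="ment" SUBS_TYPE="HypPart2" SUBS_CONTENT="document"
            WC="0.75" CC="0 0 7 0"/>
  </TextLine>
</TextBlock>
"""


def logical_words(block):
    """Return (text, word_confidence) pairs, merging hyphenated halves."""
    words = []
    open_hyphen = False  # True right after a HypPart1 string
    for s in block.iter("String"):
        wc = float(s.get("WC")) if s.get("WC") is not None else None
        subs = s.get("SUBS_TYPE")
        if subs == "HypPart2" and open_hyphen:
            # Second half of a word already emitted at HypPart1.
            # Keeping the lower of the two WC values is a choice of this sketch.
            text, prev_wc = words[-1]
            known = [c for c in (prev_wc, wc) if c is not None]
            words[-1] = (text, min(known) if known else None)
            open_hyphen = False
            continue
        if subs in ("HypPart1", "HypPart2"):
            text = s.get("SUBS_CONTENT") or s.get("CONTENT")
        else:
            text = s.get("CONTENT")
        words.append((text, wc))
        open_hyphen = subs == "HypPart1"
    return words


def char_confidences(string_el):
    """Pair each character of CONTENT with its CC digit (0 = confident, 9 = unsure)."""
    digits = [int(d) for d in string_el.get("CC", "") if d.isdigit()]
    content = string_el.get("CONTENT", "")
    if len(digits) != len(content):
        return []  # CC missing, or not one digit per character
    return list(zip(content, digits))


block = ET.fromstring(SAMPLE)
print(block.get("language"))
print(logical_words(block))
print(char_confidences(block.findall(".//String")[1]))
```

Note the opposite directions of the two scales: a high WC means a confident word, while a high CC digit means an uncertain character, so the two values should not be compared directly.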
June 8, 2016