ALTO Technical Metadata for Layout and Text Objects

About ALTO


The Analyzed Layout and Text Object (ALTO) XML Schema was initially developed by the METAe project group for use with the Library of Congress' Metadata Encoding and Transmission Standard (METS). While METS excels at describing the structure of objects, a schema covering the content and layout information of each piece of the object was missing. Claus Gravenhorst, who helped create ALTO for the METAe project, states that:

"During the METAe project, we learned that there is no standard to handle word positions and physical layout information (print space, margins, etc.), an essential feature for high performance repositories that are able to highlight elements within documents. Therefore, the ALTO schema has been developed. In the METS file, there are file pointers to the ALTO files that contain the text, other elements (illustrations, etc.), and word positions. We would like ALTO or a similar schema to become a standard as we do not see an alternative right now." [1]

CCS Content Conversion Specialists GmbH, which had played a crucial role in ALTO's development since its creation during the METAe project, initially maintained the ALTO standard. In August 2009, the Library of Congress (LC) Network Development and MARC Standards Office became the official maintenance agency for the ALTO XML Schema. At that time, LC set up an Editorial Board to help shape and advocate for ALTO. The Board oversees maintenance of the ALTO XML Schema and helps foster its use in the digital library community.

  1. "Editors' Interview with Günter Mühlberger and Claus Gravenhorst of METAe." RLG DigiNews, October 15, 2004.