By BABAK HAMIDZADEH
Development of the World Digital Library (WDL) encompassed three distinct yet inseparable major technical tasks: the deployment of a new application to handle unique cataloging and translation requirements; the development of a set of tools to select, track and assemble content from 25 institutions in 18 countries; and the development and delivery of a state-of-the-art website able to handle the largest and most geographically dispersed set of visitors the Library has ever received in a single day. All development work took place at the Library during a 13-month period, with more than 30 staff members contributing to the effort.
Cataloging & Translation
Critical to the WDL’s success were complete, accurate descriptive records for every item—including new, in-depth contextual narratives written by subject-matter experts—all with high-quality translations in the seven supported languages. Within just four months, a team at the Library of Congress built a cataloging application to satisfy these requirements and the workflows surrounding them.
Starting with existing bibliographic records, a team of catalogers worked concurrently over many months using the cataloging application’s web interface to review, augment and normalize each item’s descriptive record. Once marked complete, records were reviewed and approved for translation. The records were then sent to an externally hosted translation system managed by LingoTek, a translation company based in Utah that worked under contract to the Library of Congress.
As translators from around the world completed their work, the WDL cataloging application automatically fetched the translated records for inclusion within the website.
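The record workflow described above can be pictured as a small state machine. The following is only an illustrative sketch: the state names and the `advance` function are hypothetical and are not taken from the WDL cataloging application.

```python
from enum import Enum


class RecordState(Enum):
    # Hypothetical state names that mirror the steps described in the article.
    IN_CATALOGING = "in_cataloging"    # catalogers review, augment and normalize
    COMPLETE = "complete"              # marked complete by a cataloger
    APPROVED = "approved"              # reviewed and approved for translation
    IN_TRANSLATION = "in_translation"  # sent to the external translation system
    TRANSLATED = "translated"          # translations fetched back for the website


# Each state may only move forward to the next step of the workflow.
ALLOWED = {
    RecordState.IN_CATALOGING: {RecordState.COMPLETE},
    RecordState.COMPLETE: {RecordState.APPROVED},
    RecordState.APPROVED: {RecordState.IN_TRANSLATION},
    RecordState.IN_TRANSLATION: {RecordState.TRANSLATED},
    RecordState.TRANSLATED: set(),
}


def advance(current, target):
    """Return the new state, or raise ValueError if the step is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Under these assumptions, a record cannot skip review: trying to move a record straight from cataloging to translation raises an error.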
The Content Pipeline
In addition to being cataloged and translated, each item needed to be validated, tracked and prepared for display on the website. The set of tools and workflows created to support these activities was called the “content pipeline.”
As CDs, DVDs, and hard drives arrived from around the world, their content was logged into a master inventory and transferred to a secure working area in the Library. Starting from the high-quality original images, various “web-friendly” versions were generated for specific purposes: small “thumbnails” for search and browse pages; hand-selected “showcase” images for the individual item pages; and large numbers of “image tiles” to support zooming in and out to see even the tiniest detail within the image viewer or page turner. Over one million files were created to ensure that WDL’s content would be viewed in the highest quality possible.
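To see why the file count climbs so quickly, consider how zoomable images are commonly tiled: the image is cut into fixed-size squares, then halved in size and cut again, until the whole image fits in a single tile. The sketch below counts the tiles for one image under that scheme. The 256-pixel tile size and the function name are assumptions made for illustration; the article does not state the settings the WDL used.

```python
import math


def tile_pyramid(width, height, tile_size=256):
    """Return a list of (width, height, tile_count) tuples, one per zoom level.

    The first entry is the full-resolution image. Each following level is
    half the size of the previous one, and the list ends with the first
    level that fits in a single tile. tile_size is an assumed value.
    """
    levels = []
    w, h = width, height
    while True:
        cols = math.ceil(w / tile_size)
        rows = math.ceil(h / tile_size)
        levels.append((w, h, cols * rows))
        if cols == 1 and rows == 1:
            break
        # Halve each dimension for the next, coarser zoom level.
        w = max(1, math.ceil(w / 2))
        h = max(1, math.ceil(h / 2))
    return levels


def total_tiles(width, height, tile_size=256):
    """Total number of tile files needed for one image."""
    return sum(count for _, _, count in tile_pyramid(width, height, tile_size))
```

With these assumed settings, a single 4,000 by 3,000 pixel scan produces 257 tiles, so a collection of a few thousand images can readily exceed one million files.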
Once all the various components of an item were complete, they were assembled into an item “package” for deployment. Automated methods to validate completed items, assemble the packages and load them into the website were all part of the content pipeline. Finally, provisions were made for change management; changes to item components would automatically travel through the pipeline and seamlessly replace any existing version within the website.
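The validation and change-management steps can be sketched in a few lines. This is a hypothetical illustration, not the pipeline's actual code: the component names, the use of a SHA-256 fingerprint and all function names are assumptions.

```python
import hashlib

# Hypothetical component names; the article does not list the real ones.
REQUIRED = ("record", "thumbnail", "showcase", "tiles")


def missing_components(package):
    """Return the required components that are absent or empty.

    package maps a component name to that component's content as bytes.
    """
    return [name for name in REQUIRED if not package.get(name)]


def fingerprint(package):
    """Compute one digest over all components, in a stable order."""
    digest = hashlib.sha256()
    for name in sorted(package):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(package[name])
        digest.update(b"\0")
    return digest.hexdigest()


def needs_redeploy(package, deployed_fingerprint):
    """True when any component differs from the version already on the site."""
    return fingerprint(package) != deployed_fingerprint
```

In this sketch, a package is loaded only when `missing_components` returns an empty list, and a later change to any component alters the fingerprint, which signals that the deployed version should be replaced.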
The Website
Development of the WDL’s website (www.wdl.org) had one overarching goal: to make it easy for users to explore the spectacular collection of international treasures. Advanced features such as the home-page timeline and the full-screen image viewer needed to work on every mainstream web browser. Above all, performance needed to be fast, and the site had to be continuously available throughout and after a highly publicized launch.
Technical planning for launch started on Day One. To complement the Library’s ability to host and serve large-scale and high-demand websites, the team chose Akamai, a company specializing in supporting popular sites requiring 24/7 availability. Akamai stores multiple copies of web pages on 48,000 servers in 70 countries. This service ensures optimal performance for WDL users who access the site from every country in the world.
In the end, the goal of the WDL is to provide access to content. By successfully delivering that content to every visitor who began coming through its “digital doors” on April 21, the development team accomplished its mission.
WDL 1.0 Technical Info
- Development Time: 13 Months
- Lines of Code: 50,000
- Test Cases Written: 600
- Development Platform: Linux
- Deployment Platform: Solaris
- Key Technologies: Django, Solr/Lucene, Python, Squid, NginX, MySQL, SeaDragon
- Launch Day Statistics: 7.1 Million Page Views, 600,000 Visitors, 32 Million Peak Hits/Hour
Babak Hamidzadeh is director of the Repository Development Center in the Library’s Office of Strategic Initiatives.