Library of Congress

Web Archiving

If the URL of this page appeared in your web server logs, it is because your website is being crawled by the Internet Archive on behalf of the Library of Congress.

If crawling is impacting the performance of your site, or if you have other concerns, please email archive-crawler-agent[AT]lists[DOT]sourceforge[DOT]net immediately and cc the Library of Congress Web Archiving Team at [email protected].
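
Before writing, it may help to measure how often the crawler is actually hitting your site. The following is a minimal sketch in Python, assuming your server writes Common or Combined Log Format and that the URL of this page shows up somewhere in each crawler request line (for example in the user-agent field). The function name, the sample log lines, and the marker URL https://example.org/crawler-notice are all hypothetical placeholders; substitute the URL you actually saw in your logs.

```python
import re
from collections import Counter

# Matches the timestamp of a Common/Combined Log Format line,
# e.g. [10/Oct/2023:13:55:36 +0000], and captures it down to the minute.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [+-]\d{4}\]")


def crawler_hits_per_minute(log_lines, marker):
    """Count log lines that contain `marker`, grouped by minute."""
    counts = Counter()
    for line in log_lines:
        if marker not in line:
            continue  # not a request from the crawler we are looking for
        match = TIMESTAMP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical marker: replace with the notice URL seen in your own logs.
MARKER = "https://example.org/crawler-notice"

# Hypothetical sample data standing in for a real access log.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /a.css HTTP/1.1" 200 512 "-" "examplebot (+https://example.org/crawler-notice)"',
    '192.0.2.1 - - [10/Oct/2023:13:55:40 +0000] "GET /b.js HTTP/1.1" 200 900 "-" "examplebot (+https://example.org/crawler-notice)"',
    '192.0.2.1 - - [10/Oct/2023:13:56:01 +0000] "GET /c.png HTTP/1.1" 200 2048 "-" "examplebot (+https://example.org/crawler-notice)"',
    '198.51.100.7 - - [10/Oct/2023:13:56:02 +0000] "GET / HTTP/1.1" 200 4096 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for minute, hits in sorted(crawler_hits_per_minute(SAMPLE_LOG, MARKER).items()):
        print(minute, hits)
```

Including per-minute counts like these in your email gives the crawl operators something concrete to act on.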

About the Library of Congress Web Archives

The United States Library of Congress preserves and provides enduring access to the Nation’s cultural artifacts. The Library's traditional functions—acquiring, cataloging, preserving and serving collection materials of historical importance to the Congress and to the American people to foster education and scholarship—extend to digital materials, including websites. The Library has selected your site for inclusion in its historic collection of Internet materials. An email notification with further information has been sent separately to a contact at your organization identified by our team.

How We Collect Your Website

The Library of Congress has contracted with the Internet Archive to collect content from websites at regular intervals. We have a number of ongoing collections; each notification states the specific project that your site is a part of. For more information about our projects, visit our collections page.

The Internet Archive uses the Heritrix crawler to collect websites on behalf of the Library of Congress. Our crawler is instructed to bypass robots.txt in order to obtain the most complete and accurate representation of websites such as yours. Site elements that are sometimes excluded through robots.txt instructions, yet are vital to reproducing a site’s look, feel, and functionality, include images, CSS, and JavaScript, to name a few.
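
To see what a crawler that obeyed robots.txt would miss on a given site, the rules can be evaluated with Python's standard `urllib.robotparser` module. This is only an illustrative sketch, not a description of how Heritrix is configured: the robots.txt content, the paths, the `blocked_paths` helper, and the `examplebot` user-agent name are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that hides page assets and an admin area.
ROBOTS_TXT = """\
User-agent: *
Disallow: /css/
Disallow: /js/
Disallow: /images/
Disallow: /admin/
"""

# Hypothetical paths a crawler might request while capturing a page.
PATHS = [
    "/index.html",
    "/css/site.css",
    "/js/app.js",
    "/images/logo.png",
    "/admin/login",
]


def blocked_paths(robots_txt, paths, user_agent="examplebot"):
    """Return the paths a robots.txt-compliant crawler would skip."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    for path in blocked_paths(ROBOTS_TXT, PATHS):
        print("would be skipped:", path)
```

With these sample rules, only `/index.html` would be fetched. The stylesheet, script, and image would be missing from the archived copy, which is the kind of incomplete capture the bypass is meant to avoid. The `/admin/` entry is an example of an area you may want to raise with the Web Archiving Team instead.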

We would prefer to refrain from crawling any site areas that are not intended for the general public, such as administrative sections. We are happy to discuss ways to facilitate capture of desired elements and content by the Library of Congress or its agents or to mitigate crawling of sections we do not wish to collect.

The Library hopes that you share its vision of preserving web materials. If you have questions, comments or recommendations concerning the collection of your website by the Library of Congress, please do not hesitate to contact the Web Archiving Team.