Information Especially for Webmasters and Site Owners
Technical Questions
-
My site has a password-protected area that requires a user ID and password. Will this protected content be archived?
Normally we don’t archive password-protected content. In cases where our recommending officers are interested in such content, we archive it only with permission from the Web site owner.
-
Your crawler is causing problems with my site. Whom should I contact?
The Library or its agent always aims to crawl sites "politely" to minimize server impact; however, problems may occasionally occur. Please contact us immediately if you notice any problems.
-
How often and for how long will you be collecting my site?
Typically we crawl a Web site once a week or once a month, depending in part on how likely its content is to change. The collection period varies with the scope of a particular project. Some Web captures related to specific time-sensitive events take place for a limited time (e.g., before and immediately after a national election, or immediately following an event such as September 11). Other captures may be ongoing with no specified end date.
-
The URL for the site you want to archive has changed. How do I let you know?
Please contact us with any URL changes.
-
I have a robots.txt exclusion on my Web site blocking other crawlers from certain parts of my site. How does this affect your collecting activity?
The Library attempts to collect as much of the site as possible to create an accurate snapshot for future researchers. If we have permission to collect a site, we generally ignore robots.txt exclusions.
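For site owners who want to see which parts of a site a robots.txt file excludes for crawlers that honor it, the following is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` user agent, the `www.example.org` URLs, and the `blocked_paths` helper are all hypothetical and chosen for illustration; they do not describe the Library's crawler or its configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""


def blocked_paths(robots_txt: str, user_agent: str, urls: list[str]) -> list[str]:
    """Return the URLs that a robots.txt-compliant crawler would skip."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical URLs on a hypothetical site.
urls = [
    "http://www.example.org/index.html",
    "http://www.example.org/private/report.html",
    "http://www.example.org/drafts/notes.html",
]

skipped = blocked_paths(ROBOTS_TXT, "ExampleBot", urls)
print(skipped)
```

A crawl that ignores robots.txt exclusions, as described above for sites that have granted permission, may also collect the URLs that this check reports as skipped.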
The Library’s Permissions Process
-
I got an e-mail from the Library of Congress and it looks like it might be spam. How do I tell if it is safe to click on the link?
Due to the large number of permission requests mailed, the Library uses a database and mail server to send out requests. In the e-mail, the Library asks you to click on a link to respond via a Web form. This process allows the Library to track responses more accurately and quickly. The link begins with http://www.loc.gov/minerva/display_acceptance.php or http://www.loc.gov/minerva/display_crawl_acceptance.php
The e-mail you receive from the Library of Congress contains webcapture@loc.gov in the "from" address, and "Library of Congress Permission Request" in the subject line. At the bottom of the e-mail message is the line, "LC Reference: [collection name, id number]", which is the Library’s internal tracking information.
If you have any questions about the e-mail you received, you may also reply directly to it, and the Library’s project staff will assist you.
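As a rough aid, the link check described above could be sketched in Python as follows. The two expected addresses come from the answer above; the `looks_like_library_link` helper, the `id` query parameter, and the assumption that anything after the `.php` path is a query string are illustrative guesses, not a description of the Library's actual links. Passing this check does not prove a message is genuine, so reply to the e-mail or contact the Library if in doubt.

```python
from urllib.parse import urlsplit

# Host and paths taken from the two link prefixes quoted in the answer above.
EXPECTED_HOST = "www.loc.gov"
EXPECTED_PATHS = {
    "/minerva/display_acceptance.php",
    "/minerva/display_crawl_acceptance.php",
}


def looks_like_library_link(url: str) -> bool:
    """Return True if the URL's scheme, host, and path match the expected prefixes."""
    parts = urlsplit(url.strip())
    return (
        parts.scheme == "http"
        and parts.hostname == EXPECTED_HOST
        and parts.path in EXPECTED_PATHS
    )


# The query parameter name "id" below is hypothetical.
print(looks_like_library_link("http://www.loc.gov/minerva/display_acceptance.php?id=123"))
print(looks_like_library_link("http://www.loc.gov.example.com/minerva/display_acceptance.php"))
```

Parsing the URL rather than comparing raw strings means a look-alike host such as `www.loc.gov.example.com` is rejected even though the text of the link resembles the expected one.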
-
I am having difficulties filling out your permissions form.
Please contact us if you have problems with the form, or reply to the e-mailed permission request and someone from the Library’s project team will assist you.
-
Why have I received multiple permission requests from the Library of Congress?
The Library is currently required to send notice to all selected Web sites in every collection it initiates, even if the site has previously granted or denied permission. Therefore, if different URLs within a Web site have been selected for a collection, you may receive more than one request for permission. Or, if your site was selected for more than one collection, such as the Election 2002 Web Archive and the Election 2004 Web Archive, you will receive multiple requests. The Library is working to improve its tools and policies to minimize this duplication of requests.
Researcher Access to Web Archives
-
If I grant or deny permission to allow the Library to display offsite, what does that mean?
By granting permission to display the archived site offsite, the site owner authorizes the Library of Congress to provide access to the archived copies of its Web site on the Library’s public Web site. If the site owner denies offsite access, the Library may still catalog and identify the site as part of a particular collection on the public Web site, but the archived site itself will be available only to researchers who visit the Library of Congress buildings in Washington, D.C.
-
When will my archived site or the collection it’s in be available to researchers?
Web collections are made available as permissions, Library policy and resources permit. There is normally a significant lag time before the Web archive collections are made available to researchers onsite or offsite.
To retrieve current information about your organization, the public will need to visit your "live" Web site; the Library’s archived copy will not be confused with or compete with your Web site. However, if you have concerns about public access to the archived version of your Web site, you may grant the Library permission to capture your site but deny permission to provide public access over the Library’s Web site. This allows the Library to archive your site but limits access to researchers onsite at the Library of Congress.
-
What if I change my mind about allowing access to researchers offsite?
If you are a copyright owner of or otherwise have exclusive control over materials presently available through this collection and do not wish your materials to be available through the Library of Congress public Web site, please contact the Library of Congress at your earliest convenience.
In making a takedown request, please identify the specific Web site, date and time information, materials you claim rights to, and the nature of your rights (e.g., www.september11site.com, September 14, 2001, 1:45 p.m., page 1, photograph of twin towers, creator John Doe, photograph registered for copyright).
-
How are sites displayed in the archive? Are the archived sites cataloged and described?
Some, but not all, of the Web sites in the archive have been cataloged in MODS (Metadata Object Description Schema). Others are available via browse lists by URL. In the future, we hope to make some of our collections searchable by keyword.
To view an example of how the sites display in the archive, visit the MINERVA project’s Election 2002 Web Archive and perform a search to find bibliographic records that point to the archive. For example, at this Election 2002 web archive record, the link "Archived Site" will retrieve, by date, all captures the Library has made of this site.
Other
-
Will the Library of Congress take over hosting of my site?
No. By archiving your site, the Library of Congress is preserving a snapshot of your site at a particular time. You are still responsible for hosting and maintaining your live Web site.
-
I would like to archive my Web site. Can you help me?
At this time, the Library of Congress does not have a program to archive individual Web sites on request.