X. Web Archives
This format specification covers the Library’s preferred format for archived web content, as well as a preferred “format” for presenting web content for archiving (in other words, best practices that help content creators build preservation-friendly websites). The Library is aware that websites, including the blogs, social media and other web content that make them up, are created and presented in formats meant for viewing in a web browser, and that these formats often differ from the standard format recommended for preservation and long-term access. Because the focus of this document is preservation and long-term access, the following format preferences favor those outcomes and include recommended best practices that better enable preservation of web content.
i. Websites | ||
---|---|---|
Preferred | Acceptable | |
A. Technical Characteristics |
|
|
B. Formats | The Library, and other organizations involved in web archiving, are preserving web content in the Web ARChive (WARC) file format |
|
C. Delivery Method | Capture using tools that produce non-proprietary output, to conform with standard formats and requirements |
Transmission of WARC or ARC_IA files created by web content producers or other archiving organizations |
D. Metadata |
|
The ARC_IA file should be named in a manner that makes the archiving institution easy to identify (see the WARC standard for recommended naming conventions) |
E. Technological Measures |
|
Tools currently available cannot capture all web content, so certain types of web content may not be preservable through web capture at this time. These include:
|
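The format and naming entries in the table above can be illustrated with a minimal sketch. It assumes the `Prefix-Timestamp-Serial-Crawlhost.warc.gz` naming pattern suggested in the WARC standard's informative annex, and the basic WARC/1.0 record layout (a version line, named header fields, a blank line, the content block, then two CRLF pairs). The function names, the prefix `EXAMPLE`, and the host `crawler01` are hypothetical, chosen only for illustration; this is not the Library's tooling and it is not a complete WARC writer.

```python
import uuid
from datetime import datetime, timezone

CRLF = b"\r\n"


def warc_filename(prefix, serial, crawl_host, when):
    """Build a name in the assumed Prefix-Timestamp-Serial-Crawlhost pattern.

    `prefix` is where an institution identifier would go; `when` should be UTC.
    """
    timestamp = when.strftime("%Y%m%d%H%M%S")  # 14-digit timestamp
    return f"{prefix}-{timestamp}-{serial:05d}-{crawl_host}.warc.gz"


def build_record(record_type, payload, when, extra_headers=None):
    """Serialize one uncompressed WARC/1.0 record with its mandatory fields."""
    headers = [
        ("WARC-Type", record_type),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", when.strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("Content-Length", str(len(payload))),
    ]
    headers.extend(extra_headers or [])
    lines = [b"WARC/1.0"] + [f"{k}: {v}".encode("utf-8") for k, v in headers]
    # Header block, blank line, content block, then two CRLFs end the record.
    return CRLF.join(lines) + CRLF + CRLF + payload + CRLF + CRLF


def parse_record_headers(raw):
    """Return (version, fields) parsed from the header block of one record."""
    head, _, _ = raw.partition(CRLF + CRLF)
    lines = head.split(CRLF)
    version = lines[0].decode("utf-8")
    fields = dict(line.decode("utf-8").split(": ", 1) for line in lines[1:])
    return version, fields
```

A caller might pass `datetime.now(timezone.utc)` as `when`, write the output of `build_record("warcinfo", ...)` as the first record of a file named by `warc_filename(...)`, and gzip each record separately before delivery. For production captures, an established web archiving tool that emits WARC is the safer choice.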