Software, E-Resource: 1000 .gov PowerPoint dataset (Dot gov PowerPoint dataset / One thousand dot gov PowerPoint dataset)
About this Item
Title
- 1000 .gov PowerPoint dataset
Other Title
- Dot gov PowerPoint dataset
- One thousand dot gov PowerPoint dataset
Names
- Library of Congress Web Archiving Program
Created / Published
- Washington, D.C. : Library of Congress Web Archiving Program, [2018]
Headings
- Electronic government information--United States
- Presentation graphics software--United States
- Microsoft PowerPoint (Computer file)
- Web archives--United States
Genre
- Data sets
- Web archives
Notes
- "This dataset of 1,000 PowerPoint files was generated from indexes of the Web archives, which were used to derive a random list of 1,000 items identified as PowerPoint files and hosted on .gov domains. The set includes 1,000 unique files and minimal metadata about these, including links to their locations within the Library's web archive."--Web archive datasets website.
- "Dataset originally created 11/6/2018."--README file
- "This dataset is based on exploratory work begun by the Library of Congress's Web Archiving Team in 2018. The goal of the work is to explore the contents of the Library's web archives through analysis of the indexes containing metadata from the harvested web content, as stored in CDX files. The metadata contained in the indexes was used for initial analysis, rather than the archived content stored in WARC and ARC container files, since W/ARC files present significant challenges due to large size and high processing requirements. The CDX indexes used in this initial analysis were six terabytes (TB) in size, which is a fraction of the web archive content in W/ARC files constituting nearly 1.5 petabytes (PB) at the time of analysis (November 2018)."--README file
- Title from Web Archive Datasets website, viewed February 23, 2021.
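The notes above describe deriving a random list of unique PowerPoint files on .gov domains from CDX index metadata. The following is a minimal sketch of how such a selection could work, not the Web Archiving Team's actual method. It assumes the common 11-field, space-separated CDX layout (urlkey, timestamp, original URL, MIME type, status, digest, ...), assumes two PowerPoint MIME types, and treats captures with the same digest as the same file; the function name `sample_gov_powerpoints` is hypothetical.

```python
import random
from urllib.parse import urlsplit

# Assumption: these MIME types identify PowerPoint files. The record does not
# say which MIME types were actually used to build the dataset.
PPT_MIMES = {
    "application/vnd.ms-powerpoint",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def sample_gov_powerpoints(cdx_lines, k=1000, seed=None):
    """Return up to k unique .gov PowerPoint captures as (timestamp, url, digest).

    Assumes each CDX line is space-separated with the timestamp, original URL,
    MIME type, and digest in fields 1, 2, 3, and 5 (zero-based).
    """
    seen = {}
    for line in cdx_lines:
        # Skip the " CDX ..." header line and blank lines.
        if line.startswith(" CDX") or not line.strip():
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # malformed line
        timestamp, url, mime, digest = fields[1], fields[2], fields[3], fields[5]
        host = (urlsplit(url).hostname or "").lower()
        if mime in PPT_MIMES and host.endswith(".gov"):
            # Keep the first capture seen for each content digest.
            seen.setdefault(digest, (timestamp, url, digest))
    unique = list(seen.values())
    rng = random.Random(seed)
    # Sample without replacement; return everything if fewer than k exist.
    return rng.sample(unique, min(k, len(unique)))
```

Working from the index alone keeps the job tractable: the sketch reads one short text line per capture and never opens the much larger W/ARC container files.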
Medium
- Online resource (datasets)
Call Number/Physical Location
- JF1525.A8
Repository
- s-Online Electronic Resource
Library of Congress Control Number
- 2020445282
Online Format
- compressed data
Featured in
- In Conversation: LC Labs staff and Einstein Educator Fellow discuss library data, STEM education, and Primary Source Analysis
- Web Archive Datasets | Experiments | Work | Library of Congress Labs