Collection Items

  • Collection
    Giphy : [collected datasets]
    Giphy web archive
    Giphy, founded by Alex Chung and Jace Cooke in February 2013, is an online database and search engine that allows users to search for and share animated GIF files. The site contains a large collection of reaction GIFs: animated GIFs, typically of a body in motion, that are used online as a response or reaction.
    • Contributor: Library of Congress - American Folklife Center
    • Date: 2013
  • Software, E-Resource
    1000 .gov PDF dataset
    Dot gov PDF dataset | One thousand dot gov PDF dataset
    "This dataset of 1,000 PDF files was generated from indexes of the Web archives, which were used to derive a random list of 1,000 items identified as PDF files and hosted on .gov domains. The set includes 1,000 unique PDF files and minimal metadata about these PDFs, including links to their locations within the Library's web archive."-- Web archive datasets website. "Dataset originally created...
    • Contributor: Library of Congress Web Archiving Program
    • Date: 2019
  • Software, E-Resource
    1000 .gov PowerPoint dataset
    Dot gov PowerPoint dataset | One thousand dot gov PowerPoint dataset
    "This dataset of 1,000 PowerPoint files was generated from indexes of the Web archives, which were used to derive a random list of 1,000 items identified as PowerPoint files and hosted on .gov domains. The set includes 1,000 unique files and minimal metadata about these, including links to their locations within the Library's web archive."-- Web archive datasets website. "Dataset originally created 11/6/2018."--README file...
    • Contributor: Library of Congress Web Archiving Program
    • Date: 2018
  • Software, E-Resource
    3000 .gov tabular dataset
    Dot gov tabular dataset | Three thousand dot gov tabular dataset
    "Each of these datasets consist of 1,000 files generated from indexes of the Web archives, which were used to derive a random list of 1,000 items identified as CSV, tab-separated (TSV), or Excel (XLS) files and hosted on .gov domains. Each set includes 1,000 unique CSV, TSV, and XLS files and minimal metadata about them, including links to their locations within the Library's web...
    • Contributor: Library of Congress Web Archiving Program
    • Date: 2019
  • Software, E-Resource
    1000 .gov image dataset
    Dot gov image dataset | One thousand dot gov image dataset
    "This dataset of 1,000 images was generated from indexes of the Web archives, which were used to derive a random list of 1,000 items identified as image files and hosted on .gov domains. The set includes 1,000 unique image files (primarily with GIF, JPG, PNG, and TIFF extensions) and minimal associated metadata, including links to their locations within the Library's web archive."-- Web archive...
    • Contributor: Library of Congress Web Archiving Program
    • Date: 2018
  • Software, E-Resource
    1000 .gov audio dataset
    Dot gov audio dataset | One thousand dot gov audio dataset
    "This dataset of 1,000 audio files was generated from indexes of the Web archives, which were used to derive a random list of 1,000 items identified as audio files and hosted on .gov domains. The set includes 1,000 unique audio files and minimal metadata about them, including links to their locations within the Library's web archive."-- Web archive datasets website. "Dataset originally created 11/6/2018."--README...
    • Contributor: Library of Congress Web Archiving Program
    • Date: 2018
  • Software, E-Resource
    Iraq selected image metadata dataset
    This dataset includes a CSV containing metadata derived from the CDX line entry for each "image" file. The fields and their contents are described in the "Dataset Field Descriptions" section. It also includes a README file.
    • Contributor: Library of Congress
    • Date: 2019
  • Collection
    Dinosaur comics
    This dataset was generated from content harvested from the Library of Congress's web archive of qwantz.com (Dinosaur Comics!): https://www.loc.gov/item/lcwaN0009953/. It includes minimal metadata about 3,325 image objects from the Dinosaur Comics! web archive as well as the files themselves. This dataset was created as a part of exploratory work done by the Library of Congress's Web Archiving Team. README file This dataset includes: lcwa_dinosaurcomics_image_data.zip compressed...
    • Contributor: Library of Congress Web Archiving Program - North, Ryan
    • Date: 2019
  • Software, E-Resource
    Dataset from tribal leaders directory
    Tribal leaders directory
    "The Tribal Leaders Directory provides contact information for each federally recognized tribe. The electronic, map based, interactive directory also provides information about each BIA region and agency that provides services to a specific tribe. Additionally, the directory provides contact information for Indian Affairs leadership."--Directory website. Available in three formats: CSV, JSON, and XML. Archived by the Library of Congress May 2019. Description based on...
    • Contributor: United States. Bureau of Indian Affairs
    • Date: 2016
  • Collection
    Meme Generator : collected datasets
    Meme Generator | Meme Generator dataset
    Meme Generator allows users to create and share image macros (featuring a picture, or artwork, superimposed with text) in the style of popular internet memes. The site also serves as a searchable collection of user-created images. Meme images, or image macros, are widely used images in which text is written to fit one of a range of templates used in online communication.
    • Contributor: Library of Congress - American Folklife Center
    • Date: 2010
  • Software, E-Resource
    Dataset from U.S. PIAAC Prison Study results : 2014
    "Approximately 1,300 prisoners participated in the U.S. Program for the International Assessment of Adult Competencies (PIAAC), conducted from February through June 2014. Inmates in federal, state, and private prisons in the United States were assessed in literacy, numeracy, and problem solving in technology-rich environments (also called "PS-TRE") with the same assessments administered to a national sample of U.S. adults residing in households in 2012...
    • Contributor: National Center for Education Statistics
    • Date: 2016
  • Software, E-Resource
    Dataset from a picture of subsidized households : 2008
    "Picture of Subsidized Households describes the nearly 5 million households living in HUD-subsidized housing in the United States for the year 2008. Picture 2008 provides characteristics of assisted housing units and residents, summarized at the national, state, public housing agency (PHA), project, census tract, county, Core-Based Statistical Area and city levels. New for 2008: Core-Based Statistical Areas have replaced the Metropolitan Statistical Area (MSA) summary...
    • Contributor: Evett, Steven R. - United States. Department of Housing and Urban Development
    • Date: 2008
  • Software, E-Resource
    Dataset of National Endowment for the Humanities grants, 1980-1989
    NEH grants dataset, 1980-1989
    "Information about NEH grants is contained in the files named NEH_Grantsxxxxx.zip. These files are broken into decades. The data is described in the file NEH_GrantsDictionary.pdf. Note that Metadata for grants that antedate the NEH electronic grants management system is sparser than that for more recent grants. The XML files for grants are available in two formats: one with hierarchical XML (a grant may have...
    • Contributor: National Endowment for the Humanities
    • Date: 2019
  • Software, E-Resource
    Dataset from Residential Energy Consumption Survey : 2009
    2009 RECS Survey Dataset
    "This 2009 version represents the 13th iteration of the RECS program. First conducted in 1978, the Residential Energy Consumption Survey is a national sample survey that collects energy-related data for housing units occupied as a primary residence and the households that live in them. Data were collected from 12,083 households selected at random using a complex multistage, area-probability sample design. The sample represents 113.6...
    • Contributor: United States. Energy Information Administration. Office of the Consumption Data System
    • Date: 2014
  • Collection
    Creepypasta : [collected datasets]
    Creepy pasta
    Creepypasta is a wiki collection of horror-related urban legends or images that have been copy-and-pasted around the Internet. These entries are often brief, user-generated, paranormal stories intended to scare readers. They include gruesome tales of murder, suicide, and otherworldly occurrences.
    • Contributor: American Folklife Center
  • Software, E-Resource
    Quality controlled research weather data : USDA-ARS, Bushland, Texas
    Dataset from USDA-ARS Conservation and Production Laboratory (CPRL), Soil and Water Management Research Unit (SWMRU) research weather station, Bushland, Texas for all days in 2016
    "The dataset contains 15-minute mean weather data from the USDA-ARS Conservation and Production Laboratory (CPRL), Soil and Water Management Research Unit (SWMRU) research weather station, Bushland, Texas (Lat. 35.186714°, Long. -102.094189°, elevation 1170 m above MSL) for all days in 2016. The data are from sensors deployed at standard heights over grass that is irrigated and mowed during the growing season to reference evapotranspiration...
    • Contributor: Evett, Steven R. - Ag Data Commons (U.S.)
    • Date: 2019
  • Web Page
    Rights & Access
    The Library of Congress is providing access to The Selected Datasets Collection for educational and research purposes. The Library has obtained permission for the use of many materials in the Collection, and presents additional materials for educational and research purposes in accordance with fair use under United States copyright law. Researchers should watch for modern documents that may be copyrighted (for example, published in the United...
  • Software, E-Resource
    Dataset from City Pair Program FY21 contract awards
    "The City Pair Program (CPP) was developed to provide discounted air passenger transportation services to federal government travelers. At its inception in 1980, this service covered only 11 markets. The program has since expanded, offering over 12,000 markets in FY20. The two tier fare structure includes a YCA fare and a deeply discounted _CA fare in selected markets."--City Pair Program website. Title devised by...
    • Contributor: United States. General Services Administration
    • Date: 2019
  • Software, E-Resource
    Nonimmigrant visa issuances by visa class and by nationality : FY1997-2018 NIV detail table
    "The general table for classes of non-immigrants issued visas is provided in the annual Report of the Visa Office. It is listed as Table XVI (A) "Classes of Nonimmigrants Issued Visas (Including Crewlist Visas and Border Crossing Cards).""--Nonimmigrant visa statistics website. Title from Nonimmigrant visa statistics website, viewed August 16, 2021.
    • Contributor: United States. Bureau of Consular Affairs
    • Date: 2016
  • Collection
    Enron email dataset
    "This dataset was collected and prepared by the CALO Project (A Cognitive Assistant that Learns and Organizes). It contains data from about 150 users, mostly senior management of Enron, organized into folders. The corpus contains a total of about 0.5M messages. This data was originally made public, and posted to the web, by the Federal Energy Regulatory Commission during its investigation. The email dataset...
    • Contributor: Enron Corp - United States. Federal Energy Regulatory Commission - Cohen, William W.
    • Date: 2015
  • Software, E-Resource
    World production and exports of fresh vegetables eligible for importation into the United States
    "4/22/2016."--Phytosanitary regulation website. "This data product provides information on phytosanitary regulations affecting U.S. imports of 42 fresh fruits and vegetables. From 2008 to 2012, ERS has annually published statistics on the countries eligible to ship these goods to the United States and the extent to which they represent the whole of world trade. In 2015, the agency revised the format and added additional information...
    • Contributor: United States. Animal and Plant Health Inspection Service
    • Date: 2016
  • Software, E-Resource
    Dataset of World Digital Library
    World Digital Library dataset | WDL dataset
    "This dataset includes structured data in several formats, representing the same descriptive metadata as translated between the seven languages: Arabic (ar), English (en), Spanish (es), French (fr), Portuguese (pt), Russian (ru), Chinese (zh). The dataset is presented in multiple structured formats to serve different uses: CSV file for each language including all metadata fields and values. The field 'wdl_id' is the original WDL identifier...
    • Contributor: World Digital Library - Library of Congress - Unesco
    • Date: 2021
  • Software, E-Resource
    Datasets from Office of Coast Survey wrecks and obstructions database
    Wrecks and obstructions datasets
    "The Office of Coast Survey's Wrecks and Obstructions database contains information on the identified submerged wrecks and obstructions within the U.S. maritime boundaries. The data includes the position of each feature (latitude and longitude) along with a brief description. Information to populate the database comes from what is currently available on the electronic navigational chart (ENC) and Coast Survey's Automated Wrecks and Obstructions Information...
    • Contributor: United States. Office of Coast Survey
    • Date: 2018