Software, E-Resource Iraq selected image metadata dataset
About this Item
- Title
- Iraq selected image metadata dataset
- Summary
- This dataset includes a CSV containing metadata derived from the CDX line entry for each "image" file. The fields and their contents are described in the "Dataset Field Descriptions" section. It also includes a README file.
- Contributor Names
- Library of Congress
- Created / Published
- Washington, D.C. : Library of Congress, 2019
- Subject Headings
- - Iraq War, 2003-2011
- - Iraq--History--2003-
- - Afghan War, 2001-
- - United States--Military policy
- - United States--Politics and government--2001-2009
- - Digital images
- - Web archives
- Genre
- Data sets
- Notes
- - "This dataset contains metadata for 306,954 image objects from the Iraq War 2003 Collection. The metadata was extracted using Apache Spark to query across the Library of Congress's Web Archive indexes. This dataset was created to satisfy a specific researcher request and contains metadata about image objects from 21 domains from the years 2003-2006, per the researcher's request."-- Web archive datasets website.
- - "The information included in this metadata CSV was extracted using Apache Spark to query across the Library of Congress's Web Archive CDX indexes. CDX indexes contain one line per web object in the Web Archive and are delimited by a single space. See the CDX specification for more information about the format: https://web.archive.org/web/20171123000432/https://iipc.github.io/warc-specifications/specifications/cdx-format/cdx-2006/. More in-depth information on how exactly the information was extracted can be found in the "How Was It Created" section below. This dataset was created to satisfy a specific researcher request."-- README file
- - Title devised by the cataloger
- Medium
- Online resource (dataset)
- Call Number/Physical Location
- JK1896
- Repository
- s-Online Electronic Resource
- Digital Id
- https://hdl.loc.gov/loc.gdc/gdcdatasets.2019667240
- Library of Congress Control Number
- 2019667240
- Online Format
- compressed data
- LCCN Permalink
- https://lccn.loc.gov/2019667240
- Additional Metadata Formats
- MARCXML Record
- MODS Record
- Dublin Core Record
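The notes above say that CDX indexes hold one space-delimited line per web object and that the dataset covers image objects from 2003-2006. A minimal Python sketch of reading such lines might look like the following. It is not the Library's Spark job: the column order assumes the common 11-field CDX layout, the names `parse_cdx_line`, `is_image`, and `in_years` are hypothetical, and the sample lines are invented for illustration. The README's "Dataset Field Descriptions" section remains the authority on the actual fields.

```python
# ASSUMPTION: the common 11-column CDX layout; the real index header may differ.
CDX_FIELDS = [
    "urlkey", "timestamp", "original_url", "mime_type", "status_code",
    "digest", "redirect", "meta_tags", "compressed_size", "offset", "filename",
]


def parse_cdx_line(line, fields=CDX_FIELDS):
    """Split one space-delimited CDX line into a dict keyed by field name."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) != len(fields):
        raise ValueError(f"expected {len(fields)} fields, got {len(parts)}")
    return dict(zip(fields, parts))


def is_image(record):
    """True when the recorded MIME type is an image type."""
    return record["mime_type"].lower().startswith("image/")


def in_years(record, first, last):
    """True when the capture timestamp (YYYYMMDDhhmmss) falls in [first, last]."""
    return first <= int(record["timestamp"][:4]) <= last


# Hypothetical sample lines, invented for illustration only.
SAMPLE_LINES = [
    "org,example)/photos/a.jpg 20040315120000 http://example.org/photos/a.jpg "
    "image/jpeg 200 AAAA - - 2048 1024 example-0001.warc.gz",
    "org,example)/index.html 20040315120001 http://example.org/index.html "
    "text/html 200 BBBB - - 512 4096 example-0001.warc.gz",
]

image_records = [
    record
    for record in map(parse_cdx_line, SAMPLE_LINES)
    if is_image(record) and in_years(record, 2003, 2006)
]

for record in image_records:
    print(record["timestamp"], record["original_url"], record["mime_type"])
```

Splitting on a single space, rather than on arbitrary whitespace, follows the delimiter described in the README note; a line with a different number of columns raises an error instead of being silently misread.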
More Software, E-Resources like this
- Software, E-Resource: 1000 .gov image dataset.
- - Dot gov image dataset | One thousand dot gov image dataset "This dataset of 1,000 images was generated from indexes of the Web archives, which were used to derive a random list of 1,000 items identified as image files and hosted on .gov...
- - Contributor: Library of Congress Web Archiving Program
- - Date: 2018
- Software, E-Resource: 1000 .gov audio dataset.
- - Dot gov audio dataset | One thousand dot gov audio dataset "This dataset of 1,000 audio files was generated from indexes of the Web archives, which were used to derive a random list of 1,000 items identified as audio files and hosted on...
- - Contributor: Library of Congress Web Archiving Program
- - Date: 2018
- Software, E-Resource: 3000 .gov tabular dataset.
- - Dot gov tabular dataset | Three thousand dot gov tabular dataset "Each of these datasets consist of 1,000 files generated from indexes of the Web archives, which were used to derive a random list of 1,000 items identified as CSV, tab-separated (TSV), or...
- - Contributor: Library of Congress Web Archiving Program
- - Date: 2019
- Software, E-Resource: 1000 .gov PowerPoint dataset.
- - Dot gov PowerPoint dataset | One thousand dot gov PowerPoint dataset "This dataset of 1,000 PowerPoint files was generated from indexes of the Web archives, which were used to derive a random list of 1,000 items identified as PowerPoint files and hosted on...
- - Contributor: Library of Congress Web Archiving Program
- - Date: 2018
- Software, E-Resource: 1000 .gov PDF dataset
- - Dot gov PDF dataset | One thousand dot gov PDF dataset "This dataset of 1,000 PDF files was generated from indexes of the Web archives, which were used to derive a random list of 1,000 items identified as PDF files and hosted on...
- - Contributor: Library of Congress Web Archiving Program
- - Date: 2019