Collection Software, E-Resource Simple English Wikipedia Simplewiki
About this Item
Title
- Simple English Wikipedia
Other Title
- Simplewiki
Summary
- The dataset comprises the content of Simple English Wikipedia, including articles and revision history, in XML. The XML dumps are in an export format and compressed in bzip2 and .7z formats, while the SQL dumps are in mysqldump format. https://meta.wikimedia.org/wiki/Data_dumps
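As an illustration of the format described above, the following is a minimal sketch of how a bzip2-compressed XML dump could be read in a streaming fashion with the Python standard library. The function names `local_name` and `iter_pages` are hypothetical and not part of any Wikimedia tool, and the element names `page`, `title`, and `text` are assumed to match the export XML.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(dump_path):
    """Yield (title, text) for each <page> element in a bzip2-compressed XML dump.

    Assumption: each page holds a <title> and one or more <text> elements;
    when several revisions are present, the last one listed in the file wins.
    """
    with bz2.open(dump_path, "rb") as stream:
        # iterparse reads incrementally, so the file is never fully loaded.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if local_name(elem.tag) != "page":
                continue
            title, text = None, None
            for child in elem.iter():
                name = local_name(child.tag)
                if name == "title":
                    title = child.text
                elif name == "text":
                    text = child.text or ""
            yield title, text
            elem.clear()  # release the page's children once it has been consumed
```

A caller might loop over `iter_pages("simplewiki-latest-pages-articles.xml.bz2")`, where the file name is only an assumed example of a downloaded dump. Streaming is used here because a complete edit history can be far larger than available memory once decompressed.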
Names
- Wikimedia Foundation, publisher
Created / Published
- [San Francisco, CA] : Wikimedia Foundation
Contents
- Articles, templates, media/file descriptions, and primary meta-pages -- All pages with complete edit history -- All pages with complete page edit history -- Log events to all pages and users -- All pages, current versions only -- First-pass for page XML data dumps -- Extracted page abstracts for Yahoo.
Headings
- Electronic encyclopedias
Genre
- Data sets
Notes
- Website for dataset launched November 17, 2003.
- "This is the front page of the Simple English Wikipedia. Wikipedias are places where people work together to write encyclopedias in different languages. We use Simple English words and grammar here. The Simple English Wikipedia is for everyone! That includes children and adults who are learning English." - website home page.
- "A complete copy of all Wikimedia wikis, in the form of wikitext source and metadata embedded in XML. A number of raw database tables in SQL form are also available. These snapshots are provided at the very least monthly and usually twice a month." - Wikimedia Downloads Database backup dumps page.
- First downloaded by the Library of Congress on January 23, 2019.
- Title from website home page (viewed May 8, 2019).
Medium
- textual datasets
Call Number/Physical Location
- AE5
Repository
- s-Online Electronic Resource
Digital Id
- https://simple.wikipedia.org
- https://dumps.wikimedia.org/backup-index.html
- https://hdl.loc.gov/loc.gdc/gdcdatasets.2019205402_20190101
- https://hdl.loc.gov/loc.gdc/gdcdatasets.2019205402_20200120
- https://hdl.loc.gov/loc.gdc/gdcdatasets.2019205402_20210101
- https://hdl.loc.gov/loc.gdc/gdcdatasets.2019205402_20220101
- https://hdl.loc.gov/loc.gdc/gdcdatasets.2019205402_20230101
- https://hdl.loc.gov/loc.gdc/gdcdatasets.2019205402_20240101
Library of Congress Control Number
- 2019205402
Rights Advisory
- Creative Commons Attribution-ShareAlike 3.0 United States https://creativecommons.org/licenses/by-sa/3.0/us/
Online Format
- compressed data