The Wikipedia XML corpus

From WikiLit
Authors: Ludovic Denoyer, Patrick Gallinari
Citation: ACM SIGIR Forum 40 (1): 64-69. 2006 June.
Publication type: Journal article
Peer-reviewed: Yes
DOI: 10.1145/1147197.1147210
Added by Wikilit team: Added on initial load
The Wikipedia XML corpus is a publication by Ludovic Denoyer and Patrick Gallinari.


Abstract

Wikipedia is a well known free content, multilingual encyclopedia written collaboratively by contributors around the world. Anybody can edit an article using a wiki markup language that offers a simplified alternative to HTML. This encyclopedia is composed of millions of articles in different languages.

Research questions

"In this article, we describe a set of XML collections based on Wikipedia."

Research details

Topics: Other corpus topics, Research platform
Domains: Computer science
Theory type: Analysis
Wikipedia coverage: Main topic
Theories: "Undetermined"
Research design: Other
Data source: Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Secondary dataset
Wikipedia page type: Article
Wikipedia language: Multiple

Conclusion

"In this article, we describe a set of XML collections based on Wikipedia."

Comments


Further notes

Facts about "The Wikipedia XML corpus"

Abstract: Wikipedia is a well known free content, multilingual encyclopedia written collaboratively by contributors around the world. Anybody can edit an article using a wiki markup language that offers a simplified alternative to HTML. This encyclopedia is composed of millions of articles in different languages.
Added by wikilit team: Added on initial load
Collected data time dimension: Cross-sectional
Conclusion: In this article, we describe a set of XML collections based on Wikipedia.
Data source: Wikipedia pages
Doi: 10.1145/1147197.1147210
Google scholar url: http://scholar.google.com/scholar?ie=UTF-8&q=%22The%2BWikipedia%2BXML%2Bcorpus%22
Has author: Ludovic Denoyer and Patrick Gallinari
Has domain: Computer science
Has topic: Other corpus topics and Research platform
Issue: 1
Month: June
Pages: 64-69
Peer reviewed: Yes
Publication type: Journal article
Published in: ACM SIGIR Forum
Research design: Other
Research questions: In this article, we describe a set of XML collections based on Wikipedia.
Revid: 10,964
Theories: Undetermined
Theory type: Analysis
Title: The Wikipedia XML corpus
Unit of analysis: Article
Url: http://0-dl.acm.org.mercury.concordia.ca/citation.cfm?doid=1147197.1147210
Volume: 40
Wikipedia coverage: Main topic
Wikipedia data extraction: Secondary dataset
Wikipedia language: Multiple
Wikipedia page type: Article
Year: 2006