Quantitative data and graphics on lexical specificity and index of readability: the case of Wikipedia
Publication | |
---|---|
Authors: | Antonella Elia |
Citation: | RaeL: Revista Electronica de Linguistica Aplicada (8): 248-271. 2009 December. |
Publication type: | Journal article |
Peer-reviewed: | Yes |
Added by Wikilit team: | Yes |
Abstract
This paper is part of a wider corpus-based study focused on Web encyclopedias (Elia 2008). It builds on and extends the comparative analysis of Emigh and Herring (2005). In particular, attention is focused on the English edition of Wikipedia. A quantitative analysis compares Wikipedia vs. Britannica encyclopedic entries. Linguistic features such as type/token ratio, word and sentence length, and Index of Readability are analyzed. The findings show to what extent collaboratively produced Wikipedia entries are readable and standardized in a way not very dissimilar from those produced by experts in the Encyclopaedia Britannica Online.
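The lexical measures named in the abstract can be illustrated with a minimal sketch. The Python below computes a type/token ratio, mean word length, and mean sentence length for a short text. The function names, the regex tokenizer, and the naive sentence splitter are assumptions made for illustration; they are not taken from the paper, which may have relied on different tools and tokenization rules.

```python
import re


def tokenize(text):
    """Return lowercase word tokens (simple regex stand-in for a real tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def split_sentences(text):
    """Naive split on '.', '!' or '?' followed by whitespace (an assumption)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def lexical_profile(text):
    """Compute type/token ratio, mean word length, and mean sentence length."""
    tokens = tokenize(text)
    sentences = split_sentences(text)
    if not tokens or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    types = set(tokens)  # distinct word forms
    return {
        "tokens": len(tokens),
        "types": len(types),
        "type_token_ratio": len(types) / len(tokens),
        "mean_word_length": sum(len(t) for t in tokens) / len(tokens),
        "mean_sentence_length": len(tokens) / len(sentences),
    }


# Illustrative input only; not data from the study.
sample = "The cat sat on the mat. The dog sat too."
profile = lexical_profile(sample)
print(profile)
```

A raw type/token ratio tends to fall as a text gets longer, so a comparison of this kind is most meaningful between samples of similar size.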
Research questions
"In particular, attention is focused on the English edition of Wikipedia. A quantitative analysis compares Wikipedia vs. Britannica encyclopedic entries. Linguistic features such as type/token ratio, word and sentence length, and Index of Readability are analyzed. The findings show to what extent collaboratively produced Wikipedia entries are readable and standardized in a way not very dissimilar from those produced by experts in the Encyclopaedia Britannica Online.
Research Questions:
1. To what extent are lexical density, word length and sentence length similar or different in two online encyclopedias?
2. Is Index of Readability, as a quantifiable parameter, equal or divergent in Britannica and Wikipedia? And if so, to what extent does it differ?
3. Do the different authoring processes affect the style of the encyclopedic genre, as exemplified in Wikipedia and Britannica Online, in terms of lexical specificity and readability?"
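The second research question treats readability as a quantifiable parameter. This page does not say which readability formula the paper used, so the sketch below substitutes the Automated Readability Index (ARI) purely as an assumption; it was chosen because it depends only on word length and sentence length, the other two features the study measures. The function name and the regex-based counting are likewise illustrative.

```python
import re


def automated_readability_index(text):
    """ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43.

    ARI is an assumed stand-in; the paper's own index is not specified here.
    """
    words = re.findall(r"[A-Za-z0-9]+", text)
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    characters = sum(len(w) for w in words)
    return 4.71 * characters / len(words) + 0.5 * len(words) / len(sentences) - 21.43


# Illustrative inputs only; not entries from either encyclopedia.
plain_text = "The cat sat. The dog ran."
dense_text = "Collaborative encyclopedias accumulate extraordinarily heterogeneous contributions."

plain_score = automated_readability_index(plain_text)
dense_score = automated_readability_index(dense_text)
print(plain_score, dense_score)
```

Higher scores indicate text that is harder to read, so two entries on the same topic can be compared by scoring each and looking at the difference.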
Research details
Topics: | Readability and style |
Domains: | Information science |
Theory type: | Analysis |
Wikipedia coverage: | Main topic |
Theories: | "Since the linguistic investigation is mainly frequency based, the count of occurrences was standardized to make the quantitative findings comparable. Standardization of frequency count was made following Biber's theory (1998:263), which demonstrates that raw frequency counts are not directly comparable when textual units have different lengths." |
Research design: | Experiment |
Data source: | Experiment responses, Wikipedia pages |
Collected data time dimension: | Cross-sectional |
Unit of analysis: | Website |
Wikipedia data extraction: | Live Wikipedia |
Wikipedia page type: | Article |
Wikipedia language: | English |
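
The standardization described under "Theories" can be sketched as follows: a raw count is divided by the length of the text and scaled to a common basis, so that texts of different lengths become comparable. The basis of 1,000 words, the function name, and the example counts are assumptions for illustration; this page does not state which basis the paper used.

```python
def normalized_frequency(raw_count, text_length, basis=1000):
    """Scale a raw count to occurrences per `basis` words.

    `basis=1000` is an assumed default, not a value taken from the paper.
    """
    if text_length <= 0:
        raise ValueError("text_length must be positive")
    return raw_count * basis / text_length


# Hypothetical counts: the longer entry has more raw occurrences,
# but the shorter entry uses the feature more densely.
long_entry = normalized_frequency(raw_count=30, text_length=6000)
short_entry = normalized_frequency(raw_count=20, text_length=2000)
print(long_entry, short_entry)
```

In this made-up case the raw counts (30 vs. 20) would suggest the opposite ordering from the normalized rates, which is the comparability problem the quoted passage refers to.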
Conclusion
"This paper is built on and extends the comparative analysis of Emigh and Herring (2005) leading further empirical support to their earlier findings which showed that Wikipedia is not statistically distinguishable from Columbia Encyclopedia in some features. Thus the results from this study seem to support and confirm their conclusion in terms of the different linguistic features measured and according to a different object of comparison: Britannica Online. It shows that the maturity of their style (Wikipedians), at least from the quantitative point of view, is not dissimilar from that of online proprietary encyclopedias."