Quantitative data and graphics on lexical specificity and index of readability: the case of Wikipedia

Authors: Antonella Elia
Citation: RaeL: Revista Electronica de Linguistica Aplicada (8): 248-271. 2009 December.
Publication type: Journal article
Peer-reviewed: Yes
Added by Wikilit team: Yes
Quantitative data and graphics on lexical specificity and index of readability: the case of Wikipedia is a publication by Antonella Elia.


Abstract

This paper is part of a wider corpus-based study focused on Web encyclopedias (Elia 2008). It is built on and extends the comparative analysis of Emigh and Herring (2005). In particular, attention is focused on the English edition of Wikipedia. A quantitative analysis compares Wikipedia vs. Britannica encyclopedic entries. Linguistic features such as type/token ratio, word and sentence length, and Index of Readability are analyzed. The findings show to what extent collaboratively produced Wikipedia entries are readable and standardized in a way not very dissimilar from those produced by experts in the Encyclopaedia Britannica Online.

Research questions

"In particular, attention is focused on the English edition of Wikipedia. A quantitative analysis compares Wikipedia vs. Britannica encyclopedic entries. Linguistic features such as type/token ratio, word and sentence length, and Index of Readability are analyzed. The findings show to what extent collaboratively produced Wikipedia entries are readable and standardized in a way not very dissimilar from those produced by experts in the Encyclopaedia Britannica Online.

Research Questions:
1. To what extent are lexical density, word length and sentence length similar or different in two online encyclopedias?
2. Is Index of Readability, as a quantifiable parameter, equal or divergent in Britannica and Wikipedia? And if so, to what extent does it differ?
3. Do the different authoring processes affect the style of the encyclopedic genre, as exemplified in Wikipedia and Britannica Online, in terms of lexical specificity and readability?"
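The surface measures named in the first research question can be illustrated with a minimal Python sketch. It is not the paper's procedure: the function name `lexical_measures`, the regex tokenizer, and the punctuation-based sentence splitter are assumptions made here for illustration. The Index of Readability is left out because this record does not say which formula the study used.

```python
import re


def lexical_measures(text: str) -> dict:
    """Compute simple surface measures for one text (illustrative only)."""
    # Assumption: sentences end at ., ! or ?; real corpora need a better splitter.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Assumption: a token is a lowercase run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens or not sentences:
        raise ValueError("text must contain at least one word")
    types = set(tokens)  # distinct word forms
    return {
        "type_token_ratio": len(types) / len(tokens),
        "mean_word_length": sum(len(t) for t in tokens) / len(tokens),
        "mean_sentence_length": len(tokens) / len(sentences),
    }


measures = lexical_measures("The cat sat. The cat ran.")
print(measures)
```

Type/token ratio falls as a text gets longer, so entries of very different lengths are usually compared on samples of equal size rather than on whole texts.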

Research details

Topics: Readability and style
Domains: Information science
Theory type: Analysis
Wikipedia coverage: Main topic
Theories: "Since the linguistic investigation is mainly frequency based, the count of occurrences was standardized to make the quantitative findings comparable. Standardization of frequency count was made following Biber's theory (1998:263), which demonstrates that raw frequency counts are not directly comparable when textual units have different lengths."
Research design: Experiment
Data source: Experiment responses, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Website
Wikipedia data extraction: Live Wikipedia
Wikipedia page type: Article
Wikipedia language: English
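The standardization described under "Theories" can be sketched as follows. Raw counts are rescaled to a common text length so that entries of different sizes become comparable. The function name `normalized_frequency` and the default basis of 1,000 words are assumptions for illustration. This record does not state which basis the study used.

```python
def normalized_frequency(raw_count: int, text_length: int, basis: int = 1000) -> float:
    """Rescale a raw count to occurrences per `basis` words (illustrative only)."""
    if text_length <= 0:
        raise ValueError("text_length must be positive")
    # Multiply before dividing so the scaling is applied to the exact integer count.
    return raw_count * basis / text_length


# Hypothetical figures: a feature occurs 20 times in a 4,000-word entry
# and 12 times in a 1,500-word entry.
long_entry = normalized_frequency(20, 4000)
short_entry = normalized_frequency(12, 1500)
print(long_entry, short_entry)
```

In this made-up case the shorter entry has the lower raw count but the higher normalized rate, which is why raw counts alone would mislead.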

Conclusion

"This paper is built on and extends the comparative analysis of Emigh and Herring (2005), lending further empirical support to their earlier findings, which showed that Wikipedia is not statistically distinguishable from Columbia Encyclopedia in some features. Thus the results from this study seem to support and confirm their conclusion in terms of the different linguistic features measured and according to a different object of comparison: Britannica Online. It shows that the maturity of their style (Wikipedians), at least from the quantitative point of view, is not dissimilar from that of online proprietary encyclopedias."
