Ranking of Wikipedia articles in search engines revisited: fair ranking for reasonable quality?

From WikiLit
Authors: Dirk Lewandowski, Ulrike Spree
Citation: Journal of the American Society for Information Science and Technology, 2011.
Publication type: Journal article
Peer-reviewed: Yes
Database(s):
DOI: 10.1002/asi.21423
Added by Wikilit team: Added on initial load
Ranking of Wikipedia articles in search engines revisited: fair ranking for reasonable quality? is a publication by Dirk Lewandowski and Ulrike Spree.


Abstract

This paper aims to review the fiercely discussed question of whether the ranking of Wikipedia articles in search engines is justified by the quality of the articles. After an overview of current research on information quality in Wikipedia, a summary of the extended discussion on the quality of encyclopedic entries in general is given. On this basis, a heuristic method for evaluating Wikipedia entries is developed and applied to Wikipedia articles that scored highly in a search engine retrieval effectiveness test and compared with the relevance judgment of jurors. In all search engines tested, Wikipedia results are unanimously judged better by the jurors than other results on the corresponding results position. Relevance judgments often roughly correspond with the results from the heuristic evaluation. Cases in which high relevance judgments are not in accordance with the comparatively low score from the heuristic evaluation are interpreted as an indicator of a high degree of trust in Wikipedia. One of the systemic shortcomings of Wikipedia lies in its necessarily incoherent user model. A further tuning of the suggested criteria catalog, for instance, the different weighing of the supplied criteria, could serve as a starting point for a user model differentiated evaluation of Wikipedia articles. Approved methods of quality evaluation of reference works are applied to Wikipedia articles and integrated with the question of search engine evaluation.

Research questions

"This paper aims to review the fiercely discussed question of whether the ranking of Wikipedia articles in search engines is justified by the quality of the articles.

1. Which applicable quality standards (heuristics) exist for evaluating Wikipedia articles? In what context were they developed and applied, and do they do justice to the generic markings of Wikipedia articles?
2. Based on the research on existing quality standards, we developed our own heuristics. With the help of these heuristics, human evaluators should be able to make sound and intersubjectively comprehensible quality judgments of individual Wikipedia articles. As we wanted to develop an easy-to-apply tool, our heuristic had to meet the following requirements:
   a. Human evaluators can evaluate individual Wikipedia articles on the basis of the provided criteria catalog and can agree whether a given article meets a certain criterion or not.
   b. On the basis of the criteria catalog, human evaluators attain similar evaluation scores for the same article.
   c. On the basis of the criteria catalog, noticeable differences in the quality of Wikipedia articles can be determined.
3. The calibrated heuristic was applied to Wikipedia articles that scored highly in the retrieval test to find out:
   a. whether there exist noticeable differences in quality among the examples of our sample;
   b. whether there are really bad articles among the highly ranked articles.
4. On this basis, new insight into the user judgment of Wikipedia hits is possible, as it can now be analyzed:
   a. how user relevance judgments of the Wikipedia hits in the search engine results correspond with scores from the heuristic evaluation;
   b. how useful the ranked articles are;
   c. whether the ranking is appropriate, respectively whether good entries are ranked high enough."
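The evaluation procedure implied by requirements 2a–2c can be illustrated with a small scoring routine. Note that the criteria names and judgments below are invented for illustration only; the paper's actual criteria catalog is not reproduced here.

```python
# Hypothetical sketch of a criteria-catalog evaluation: each evaluator marks
# whether an article meets each criterion; scores and per-criterion agreement
# between evaluators can then be compared (cf. requirements 2a and 2b).

CRITERIA = ["accuracy", "completeness", "references",
            "readability", "formal_correctness"]  # illustrative names

def score_article(judgments: dict) -> float:
    """Fraction of catalog criteria an evaluator judged as met (0.0-1.0)."""
    met = sum(1 for c in CRITERIA if judgments.get(c, False))
    return met / len(CRITERIA)

def agreement(judgments_a: dict, judgments_b: dict) -> float:
    """Share of criteria on which two evaluators gave the same verdict."""
    same = sum(1 for c in CRITERIA
               if judgments_a.get(c, False) == judgments_b.get(c, False))
    return same / len(CRITERIA)

evaluator_1 = {"accuracy": True, "completeness": True, "references": False,
               "readability": True, "formal_correctness": False}
evaluator_2 = {"accuracy": True, "completeness": False, "references": False,
               "readability": True, "formal_correctness": False}

print(score_article(evaluator_1))           # 0.6
print(agreement(evaluator_1, evaluator_2))  # 0.8
```

A simple agreement rate like this is the crudest possible check of requirement 2b; a study would typically use a chance-corrected statistic such as Cohen's kappa instead.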

Research details

Topics: Ranking and popularity, Computational estimation of trustworthiness
Domains: Information science
Theory type: Analysis
Wikipedia coverage: Main topic
Theories: "In the theory of specialized lexicography, quality management is firmly grounded on the determination of a user structure consisting of the three aspects of user presupposition: degree of expertise such as layperson or expert, user situation referring to the actual usage such as text production or understanding, and user intention, which can widely vary from gathering factual information to background information or references (Geeb, 1998). So far, Wikipedia has no determined user structure and is trying to serve the needs of the general user as well as the expert. Based on this, it could be concluded that quality problems are to be expected, especially for articles in arcane academic areas like mathematics, as the knowledge gap between the general user and the specialist is large.

In accordance with our theoretical assumption (see previous section) that the quality of an encyclopedia article should always be evaluated not only against the aims and objectives of the encyclopedia but also against its user structure and expectations, we strove to design a flexible and adaptable heuristic."

Research design: Experiment
Data source: Experiment responses, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Live Wikipedia
Wikipedia page type: Article
Wikipedia language: German
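The "different weighting of the supplied criteria" per user model, suggested in the Theories field above as a starting point for a user-model-differentiated evaluation, can be sketched as follows. All weights and criteria here are hypothetical placeholders, not values from the paper:

```python
# Illustrative sketch: the same per-criterion scores yield different overall
# quality judgments depending on the assumed user model (layperson vs. expert),
# reflecting Geeb's (1998) user-structure aspects cited in the Theories field.

WEIGHTS = {
    "layperson": {"readability": 0.4, "completeness": 0.2,
                  "references": 0.1, "accuracy": 0.3},
    "expert":    {"readability": 0.1, "completeness": 0.3,
                  "references": 0.3, "accuracy": 0.3},
}

def weighted_score(criterion_scores: dict, user_model: str) -> float:
    """Combine per-criterion scores (0-1) with user-model-specific weights."""
    weights = WEIGHTS[user_model]
    return sum(w * criterion_scores.get(c, 0.0) for c, w in weights.items())

article = {"readability": 1.0, "completeness": 0.5,
           "references": 0.2, "accuracy": 0.8}

print(round(weighted_score(article, "layperson"), 2))  # 0.76
print(round(weighted_score(article, "expert"), 2))     # 0.55
```

The point of the sketch is only that a highly readable but thinly referenced article scores well for a layperson model and noticeably worse for an expert model, which is the tuning the authors propose.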

Conclusion

"In general, our study confirms that the ranking of Wikipedia articles in search engines is justified by a satisfactory overall quality of the articles. For general informational queries, the negative assessment of Wikipedia articles could not be reinforced with the exception of relatively poor quality concerning orthographical and grammatical correctness.

Our study showed that despite the intense research on Wikipedia quality there is still a lack of commonly agreed on authoritative heuristics as well as evaluation methods (research question 1). However, from the range of existing quality criteria we were able to derive a heuristics adequate for evaluating Wikipedia articles (research question 2). Jurors agreed on the provided criteria catalog (research question 2a).

Our heuristic method is apt for the task of detecting quality distinctions, as the quality differences between articles in the sample were clearly noticeable (research question 2c).

In answer to research question 4b, 4c (“Is the ranking appropriate? Are good entries ranked high enough?”), we can say that the rankings in search engines are at least appropriate. According to the user judgment of relevancy, the search engine providers would even be well advised to rank Wikipedia articles even higher than they do now.

However, a definite assessment is difficult, as relevance judgment is too multifarious and not solely dependent on content quality of the result. Regarding the correspondence of relevance judgments and scores from the heuristic evaluation (research question 4a), we found some conformance, but as relevance is a multifaceted concept, the results can only give an indication with regard to the reliability of the ranking.

In conclusion, the ranked articles were useful (research question 4b). We did not find articles that were useless (research question 4). However, usefulness varied considerably. While we assume that users' trust in Wikipedia lets them judge most articles as relevant, based on the heuristic evaluation we cannot recommend always showing Wikipedia results on top of the results list."

Comments


Further notes
