Analysis of community structure in Wikipedia

From WikiLit
Revision as of 16:34, October 8, 2012 by Arrto (Talk | contribs) (Not added by Wikilit)

Publication
Analysis of Community Structure in Wikipedia
Authors: Dmitry Lizorkin, Olena Medelyan, Maria Grineva
Citation: 18th International Conference on World Wide Web (WWW): 1221-1222. 2009. Madrid, Spain.
Publication type: Conference paper
Peer-reviewed: Yes
Added by Wikilit team: No
Analysis of Community Structure in Wikipedia is a publication by Dmitry Lizorkin, Olena Medelyan, Maria Grineva.

Abstract

We present the results of a community detection analysis of the Wikipedia graph. Distinct communities in Wikipedia contain semantically closely related articles. The central topic of a community can be identified using PageRank. Extracted communities can be organized hierarchically, similar to the manually created Wikipedia category structure.

Further notes

The entire Wikipedia graph can be automatically organized into a hierarchy of communities comprising thematically related Wikipedia articles. Combining this hierarchy with PageRank analysis to identify each community's central topic, we can automatically produce an ontological structure similar to the existing Wikipedia category tree. Evaluating the accuracy of this structure will be part of our future work; however, the initial experiments demonstrate the potential of our method.
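As a rough illustration of the pipeline described above (detect communities, then use PageRank to pick each community's central topic), here is a minimal Python sketch. It makes several assumptions that do not come from this page: the article titles and links are a hypothetical toy graph rather than real Wikipedia data, links are treated as undirected, and a simple deterministic label propagation stands in for whatever community detection algorithm the authors actually used. PageRank is computed by plain power iteration over the whole toy graph.

```python
from collections import Counter, defaultdict

# Hypothetical toy link graph: two dense topic clusters joined by one link.
EDGES = [
    ("Music", "Jazz"), ("Music", "Blues"), ("Music", "Guitar"),
    ("Jazz", "Blues"), ("Jazz", "Guitar"), ("Blues", "Guitar"),
    ("Physics", "Energy"), ("Physics", "Relativity"),
    ("Physics", "Quantum mechanics"), ("Energy", "Relativity"),
    ("Energy", "Quantum mechanics"), ("Relativity", "Quantum mechanics"),
    ("Music", "Physics"),  # the only link between the two clusters
]


def build_adjacency(edges):
    """Build an undirected adjacency map {title: set of linked titles}."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def label_propagation(adj, max_rounds=100):
    """Stand-in community detection: each node repeatedly adopts the most
    common label among its neighbours (ties keep the current label if it is
    among the best, otherwise take the alphabetically smallest)."""
    labels = {node: node for node in adj}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adj):
            if not adj[node]:
                continue  # isolated node keeps its own label
            counts = Counter(labels[nb] for nb in adj[node])
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            new = labels[node] if labels[node] in candidates else candidates[0]
            if new != labels[node]:
                labels[node] = new
                changed = True
        if not changed:
            break
    groups = defaultdict(set)
    for node, label in labels.items():
        groups[label].add(node)
    return list(groups.values())


def pagerank(adj, damping=0.85, iterations=50):
    """Power-iteration PageRank; dangling nodes spread their rank uniformly."""
    n = len(adj)
    rank = {node: 1.0 / n for node in adj}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in adj}
        for node, neighbours in adj.items():
            targets = neighbours if neighbours else adj.keys()
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def central_topics(adj, communities):
    """Map each community's highest-PageRank article to its member set."""
    rank = pagerank(adj)
    return {max(members, key=lambda t: rank[t]): members
            for members in communities}


adj = build_adjacency(EDGES)
communities = label_propagation(adj)
for topic, members in sorted(central_topics(adj, communities).items()):
    print(topic, "->", sorted(members))
```

On this toy graph the sketch separates the music and physics clusters and labels them "Music" and "Physics", since those two articles receive the extra cross-cluster link. Applying the same detection step again inside each community would be one way to approximate the hierarchy mentioned above, though that step is not shown here.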

The community detection analysis is fully language-independent. Thus, it will be particularly useful for Wikipedias in languages where the category structure is not as well developed as in the English Wikipedia. Furthermore, community detection analysis could be used to improve existing categories, which were created by humans without knowledge of the integral hyperlink organization, or to augment Wikipedia search results with same-community terms.
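The search-augmentation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `expand_with_community`, the `limit` parameter, and the article-to-community mapping are all hypothetical, and the mapping is assumed to have been produced beforehand by some community detection step.

```python
def expand_with_community(results, community_of, limit=2):
    """Append up to `limit` same-community titles per hit, skipping duplicates.

    `results` is an ordered list of article titles returned by a search;
    `community_of` maps article title -> community identifier (assumed to
    come from a prior community detection run).
    """
    # Invert the mapping once: community identifier -> list of member titles.
    members = {}
    for title, community in community_of.items():
        members.setdefault(community, []).append(title)

    expanded = list(results)
    seen = set(results)
    for hit in results:
        community = community_of.get(hit)
        if community is None:
            continue  # the hit belongs to no known community
        added = 0
        for related in sorted(members[community]):
            if added == limit:
                break
            if related not in seen:
                expanded.append(related)
                seen.add(related)
                added += 1
    return expanded


# Hypothetical community assignment for a handful of articles.
COMMUNITY_OF = {
    "Jazz": "music", "Blues": "music", "Guitar": "music", "Music": "music",
    "Physics": "physics", "Energy": "physics",
}

suggestions = expand_with_community(["Jazz"], COMMUNITY_OF)
print(suggestions)
```

Related titles are appended in alphabetical order here purely for determinism; a real system would more plausibly order them by a relevance signal such as PageRank within the community.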