Last modified on January 30, 2014, at 20:23

Discovering missing links in Wikipedia

Authors: Sisay Fissaha Adafre, Maarten de Rijke
Citation: LinkKDD '05 Proceedings of the 3rd international workshop on Link discovery: 90-97. 2005.
Publication type: Conference paper
Peer-reviewed: Yes
DOI: 10.1145/1134271.1134284
Added by Wikilit team: Added on initial load
Discovering missing links in Wikipedia is a publication by Sisay Fissaha Adafre and Maarten de Rijke.


Abstract

In this paper we address the problem of discovering missing hypertext links in Wikipedia. The method we propose consists of two steps: first, we compute a cluster of highly similar pages around a given page, and then we identify candidate links from those similar pages that might be missing on the given page. The main innovation is in the algorithm that we use for identifying similar pages, LTRank, which ranks pages using co-citation and page title information. Both LTRank and the link discovery method are manually evaluated and show acceptable results, especially given the simplicity of the methods and conservativeness of the evaluation criteria.
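
The two steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' LTRank implementation: the scoring formula (number of shared citing pages plus a weighted title-word overlap), the function names, the parameters `k` and `title_weight`, and the toy link graph are all assumptions made for illustration.

```python
from collections import Counter


def inlinks(graph):
    """Invert a page -> outgoing-links mapping into page -> set of citing pages."""
    incoming = {page: set() for page in graph}
    for source, targets in graph.items():
        for target in targets:
            incoming.setdefault(target, set()).add(source)
    return incoming


def similar_pages(page, graph, k=3, title_weight=1.0):
    """Step 1 (assumed scoring): rank other pages by co-citation plus title overlap."""
    incoming = inlinks(graph)
    page_words = set(page.lower().split())
    scores = {}
    for other in graph:
        if other == page:
            continue
        # Co-citation: how many pages link to both `page` and `other`.
        cocitation = len(incoming[page] & incoming[other])
        # Title information: how many title words the two pages share.
        title_overlap = len(page_words & set(other.lower().split()))
        score = cocitation + title_weight * title_overlap
        if score > 0:
            scores[other] = score
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:k]


def candidate_links(page, graph, k=3):
    """Step 2: collect links found on similar pages but absent from `page`."""
    existing = graph[page] | {page}
    votes = Counter()
    for neighbour in similar_pages(page, graph, k):
        for target in graph[neighbour] - existing:
            votes[target] += 1
    # Candidates carried by more similar pages rank higher.
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy link graph: page title -> set of pages it links to.
graph = {
    "Dutch cities": {"Amsterdam", "Rotterdam", "Utrecht"},
    "Randstad": {"Amsterdam", "Rotterdam"},
    "Amsterdam": {"Netherlands", "Canal"},
    "Rotterdam": {"Netherlands", "Canal", "Port"},
    "Utrecht": {"Netherlands", "Port"},
    "Netherlands": set(),
    "Canal": set(),
    "Port": set(),
}

print(similar_pages("Amsterdam", graph, k=2))    # ['Rotterdam', 'Utrecht']
print(candidate_links("Amsterdam", graph, k=2))  # [('Port', 2)]
```

In this toy graph, "Rotterdam" and "Utrecht" are cited together with "Amsterdam", and both link to "Port", so "Port" surfaces as a candidate missing link for "Amsterdam". A real system would still need the manual evaluation the abstract mentions to judge whether such candidates are appropriate.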

Research questions

"In this paper we address the problem of discovering missing hypertext links in Wikipedia. The method we propose consists of two steps: first, we compute a cluster of highly similar pages around a given page, and then we identify candidate links from those similar pages that might be missing on the given page."

Research details

Topics: Ranking and clustering systems
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Main topic
Theories: "Undetermined"
Research design: Experiment
Data source: Experiment responses, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Dump
Wikipedia page type: Article
Wikipedia language: English

Conclusion

"The main innovation is in the algorithm that we used for identifying similar pages, LTRank, which ranks pages using co-citation and page title information. Both LTRank and the discovery method were evaluated and showed acceptable results, especially given the simplicity of the methods and conservativeness of the evaluation criteria. Though the methods are not perfect, they could be used as an online authoring aid by revealing a ranked list of important candidate links, and the associated Wikipedia links. To some extent, this would provide a page’s author with a global view of the structure of Wikipedia while locally updating or editing a page."

Comments

"The method proposed to discover missing links in Wikipedia "could be used as an online authoring aid by revealing a ranked list of important candidate links, and the associated Wikipedia links." p.96"


Further notes