Knowledge derived from Wikipedia for computing semantic relatedness
Abstract Wikipedia provides a semantic network for computing semantic relatedness in a more structured fashion than a search engine and with more coverage than WordNet. We present experiments on using Wikipedia for computing semantic relatedness and compare it to WordNet on various benchmarking datasets. Existing relatedness measures perform better using Wikipedia than a baseline given by Google counts, and we show that Wikipedia outperforms WordNet on some datasets. We also address the question whether and how Wikipedia can be integrated into NLP applications as a knowledge base. Including Wikipedia improves the performance of a machine learning based coreference resolution system, indicating that it represents a valuable resource for NLP applications. Finally, we show that our method can be easily used for languages other than English by computing semantic relatedness for a German dataset.
Added by wikilit team Added on initial load  +
Collected data time dimension Cross-sectional  +
Comments "Semantic relatedness computed using the Wikipedia category network consistently correlates better with human judgments than a simple baseline based on Google counts. It is also competitive with WordNet for datasets specifically modeling semantic relatedness human judgments. ... Unfortunately, the Wikipedia categorization still suffers from some limitations, i.e., it cannot be considered a fully-fledged ontology, as the relations between categories are not semantically-typed." p. 206
Conclusion In this article we investigated the use of Wikipedia for computing semantic relatedness and its application to a real-world NLP task, coreference resolution. We assumed the Wikipedia category graph to represent a semantic network modeling relations between concepts, and we computed their relatedness from it. Even if the categorization feature has been introduced into Wikipedia only three years ago, our results indicate that semantic relatedness computed using the Wikipedia category network consistently correlates better with human judgments than a simple baseline based on Google counts. It is also competitive with WordNet for datasets specifically modeling semantic relatedness human judgments. Because all available datasets are small and seem to be assembled rather arbitrarily we perform an extrinsic evaluation with an NLP application, i.e. a coreference resolution system, where we register for some datasets no statistically significant differences between the improvements given by features induced from WordNet and the ones from Wikipedia. Wikipedia provides a large amount of information as encyclopedic entries at the leaves of the category network, e.g. named entities. The encyclopedia gets continuously updated and the derived knowledge can be used to analyze current information. The text and the category network both provide semi-structured information and can be mined with more precision than unstructured data gathered from the web. Unfortunately, the Wikipedia categorization still suffers from some limitations, i.e., it cannot be considered a fully-fledged ontology, as the relations between categories are not semantically-typed. In the near future we will concentrate on making the semantic relations between concepts explicit in the Wikipedia category network (Ponzetto & Strube, 2007b).
The availability of explicit semantic relations will allow for inducing semantic similarity rather than semantic relatedness measures, which may be more suitable for coreference resolution. What is most interesting about our results is that they indicate that a collaboratively created folksonomy can actually be used in NLP applications with the same benefit as hand-crafted taxonomies or ontologies.
Data source Archival records  + , Experiment responses  + , Wikipedia pages  +
Google scholar url http://scholar.google.com/scholar?ie=UTF-8&q=%22Knowledge%2Bderived%2Bfrom%2BWikipedia%2Bfor%2Bcomputing%2Bsemantic%2Brelatedness%22  +
Has author Simone Paolo Ponzetto + , Michael Strube +
Has domain Computer science +
Has topic Semantic relatedness +
Pages 181-212  +
Peer reviewed Yes  +
Publication type Journal article  +
Published in Journal of Artificial Intelligence Research +
Research design Experiment  +
Research questions Wikipedia provides a semantic network for computing semantic relatedness in a more structured fashion than a search engine and with more coverage than WordNet. We present experiments on using Wikipedia for computing semantic relatedness and compare it to WordNet on various benchmarking datasets. Existing relatedness measures perform better using Wikipedia than a baseline given by Google counts, and we show that Wikipedia outperforms WordNet on some datasets. We also address the question whether and how Wikipedia can be integrated into NLP applications as a knowledge base. Including Wikipedia improves the performance of a machine learning based coreference resolution system, indicating that it represents a valuable resource for NLP applications. Finally, we show that our method can be easily used for languages other than English by computing semantic relatedness for a German dataset.
Revid 10,844  +
Theories Undetermined
Theory type Design and action  +
Title Knowledge derived from Wikipedia for computing semantic relatedness
Unit of analysis Article  +
Url http://en.scientificcommons.org/43376530  +
Volume 30  +
Wikipedia coverage Main topic  +
Wikipedia data extraction Dump  +
Wikipedia language English  +
Wikipedia page type Article  + , Information categorization and navigation  +
Year 2007  +
Creation date 15 March 2012 20:29:26  +
Categories Semantic relatedness  + , Computer science  + , Publications  +
Modification date 30 January 2014 20:29:21  +