Computing semantic relatedness using Wikipedia-based explicit semantic analysis
Abstract Computing semantic relatedness of natural language texts requires access to vast amounts of common-sense and domain-specific world knowledge. We propose Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedia. We use machine learning techniques to explicitly represent the meaning of any text as a weighted vector of Wikipedia-based concepts. Assessing the relatedness of texts in this space amounts to comparing the corresponding vectors using conventional metrics (e.g., cosine). Compared with the previous state of the art, using ESA results in substantial improvements in correlation of computed relatedness scores with human judgments: from r = 0.56 to 0.75 for individual words and from r = 0.60 to 0.72 for texts. Importantly, due to the use of natural concepts, the ESA model is easy to explain to human users.
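The abstract describes texts as weighted vectors of Wikipedia-based concepts that are compared with cosine similarity. The following is a minimal sketch of that idea, not the authors' implementation. The three-entry `CONCEPTS` dictionary is a hypothetical stand-in for Wikipedia articles, and the TF-IDF weighting, the whitespace tokenizer, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical stand-ins for Wikipedia articles (concept title -> article text).
CONCEPTS = {
    "Cat": "cat feline pet animal whiskers purr",
    "Dog": "dog canine pet animal bark leash",
    "Computer": "computer software hardware program keyboard",
}


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately naive tokenizer)."""
    return text.lower().split()


def build_inverted_index(concepts):
    """Map each word to {concept: TF-IDF weight} over the concept articles."""
    n = len(concepts)
    term_counts = {c: Counter(tokenize(t)) for c, t in concepts.items()}
    doc_freq = Counter(w for counts in term_counts.values() for w in counts)
    index = {}
    for concept, counts in term_counts.items():
        for word, tf in counts.items():
            idf = math.log(n / doc_freq[word])
            if idf > 0:  # words found in every concept carry no signal
                index.setdefault(word, {})[concept] = tf * idf
    return index


def esa_vector(text, index):
    """Represent a text as a weighted vector over concepts."""
    vec = {}
    for word, tf in Counter(tokenize(text)).items():
        for concept, weight in index.get(word, {}).items():
            vec[concept] = vec.get(concept, 0.0) + tf * weight
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def relatedness(text_a, text_b, index):
    """Relatedness of two texts as the cosine of their concept vectors."""
    return cosine(esa_vector(text_a, index), esa_vector(text_b, index))


INDEX = build_inverted_index(CONCEPTS)
```

In this toy setup, `relatedness("cat purr", "feline whiskers", INDEX)` is high even though the two texts share no words, because both map onto the same concept. Two texts that activate disjoint concepts, such as "cat" and "keyboard", score zero.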
Added by wikilit team Added on initial load  +
Collected data time dimension Longitudinal  +
Comments "Empirical evaluation confirms that using ESA, [the method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedia] leads to substantial improvements in computing word and text relatedness." p. 1611
Conclusion Compared to LSA, which only uses statistical cooccurrence information, our methodology explicitly uses the knowledge collected and organized by humans. Compared to lexical resources such as WordNet, our methodology leverages knowledge bases that are orders of magnitude larger and more comprehensive. Empirical evaluation confirms that using ESA leads to substantial improvements in computing word and text relatedness. Compared with the previous state of the art, using ESA results in notable improvements in correlation of computed relatedness scores with human judgements: from r = 0.56 to 0.75 for individual words and from r = 0.60 to 0.72 for texts. Furthermore, due to the use of natural concepts, the ESA model is easy to explain to human users.
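The reported r values measure how closely computed relatedness scores track human judgements. This record does not say which correlation coefficient was used, so the sketch below assumes a plain Pearson coefficient. The function name and the example scores in the usage note are hypothetical and are not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, each left unnormalized (the factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)
```

For example, `pearson(system_scores, human_scores)` could be called with one computed score and one averaged human rating per word pair, where both lists are made-up placeholders aligned by position.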
Data source Archival records  + , Experiment responses  + , Wikipedia pages  +
Google scholar url http://scholar.google.com/scholar?ie=UTF-8&q=%22Computing%2Bsemantic%2Brelatedness%2Busing%2BWikipedia-based%2Bexplicit%2Bsemantic%2Banalysis%22  +
Has author Evgeniy Gabrilovich + , Shaul Markovitch +
Has domain Computer science +
Has topic Semantic relatedness +
Pages 1606-1611  +
Peer reviewed Yes  +
Publication type Conference paper  +
Published in IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence +
Research design Design science  + , Experiment  +
Research questions We propose a novel method, called Explicit Semantic Analysis (ESA), for fine-grained semantic representation of unrestricted natural language texts. Our method represents meaning in a high-dimensional space of natural concepts derived from Wikipedia (http://en.wikipedia.org), the largest encyclopedia in existence. We employ text classification techniques that allow us to explicitly represent the meaning of any text in terms of Wikipedia-based concepts. We evaluate the effectiveness of our method on automatically computing the degree of semantic relatedness between fragments of natural language text.
Revid 10,709  +
Theories Undetermined
Theory type Design and action  +
Title Computing semantic relatedness using Wikipedia-based explicit semantic analysis
Unit of analysis Article  +
Url http://dl.acm.org/citation.cfm?id=1625535  +
Wikipedia coverage Main topic  +
Wikipedia data extraction Dump  +
Wikipedia language Not specified  +
Wikipedia page type Article  +
Year 2007  +
Creation date 15 March 2012 20:25:37  +
Categories Semantic relatedness  + , Computer science  + , Publications  +
Modification date 30 January 2014 20:21:56  +