Enhancing cluster labeling using Wikipedia
Abstract This work investigates cluster labeling enhancement by utilizing Wikipedia, the free on-line encyclopedia. We describe a general framework for cluster labeling that extracts candidate labels from Wikipedia in addition to important terms that are extracted directly from the text. The "labeling quality" of each candidate is then evaluated by several independent judges and the top evaluated candidates are recommended for labeling. Our experimental results reveal that the Wikipedia labels agree with manual labels associated by humans to a cluster much more than with significant terms that are extracted directly from the text. We show that in most cases even when human's associated label appears in the text pure statistical methods have difficulty in identifying them as good descriptors. Furthermore our experiments show that for more than 85% of the clusters in our test collection the manual label (or an inflection or a synonym of it) appears in the top five labels recommended by our system.
Added by wikilit team Added on initial load  +
Collected data time dimension Cross-sectional  +
Comments "Cluster labeling with Wikipedia is extremely successful, as shown by our results, especially in collections of documents whose topics are covered well by Wikipedia concepts." p. 146
Conclusion Cluster labeling with Wikipedia is extremely successful, as shown by our results, especially in collections of documents whose topics are covered well by Wikipedia concepts. For domain specific collections, with topics that are not completely covered by Wikipedia, the proposed candidates may hurt the system’s performance due to their irrelevance to the documents’ topics. For such collections, an intelligent decision should be made regarding the use of Wikipedia or another external resource; alternatively, a choice could be made to focus only on inner terms for labeling. The decision should be made by analyzing the given collection with respect to Wikipedia. Developing such a collection specific decision making as part of the labeling framework is left for further research.
Conference location Boston, MA, United States +
Data source Archival records  + , Experiment responses  + , Wikipedia pages  +
Dates 19-23 +
Doi 10.1145/1571941.1571967 +
Google scholar url http://scholar.google.com/scholar?ie=UTF-8&q=%22Enhancing%2Bcluster%2Blabeling%2Busing%2BWikipedia%22  +
Has author David Carmel + , Haggai Roitman + , Naama Zwerdling +
Has domain Computer science +
Has topic Ranking and clustering systems +
Month July  +
Pages 139-146  +
Peer reviewed Yes  +
Publication type Conference paper  +
Published in SIGIR '09 Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval +
Publisher Association for Computing Machinery +
Research design Experiment  +
Research questions This work investigates cluster labeling enhancement by utilizing Wikipedia, the free on-line encyclopedia. We describe a general framework for cluster labeling that extracts candidate labels from Wikipedia in addition to important terms that are extracted directly from the text. The “labeling quality” of each candidate is then evaluated by several independent judges and the top evaluated candidates are recommended for labeling.
Revid 10,747  +
Theories Undetermined
Theory type Design and action  +
Title Enhancing cluster labeling using Wikipedia
Unit of analysis Article  +
Url http://dl.acm.org/citation.cfm?id=1571967  +
Wikipedia coverage Sample data  +
Wikipedia data extraction Dump  +
Wikipedia language Not specified  +
Wikipedia page type Article  +
Year 2009  +
Creation date 15 March 2012 20:26:14  +
Categories Ranking and clustering systems  + , Computer science  + , Publications  +
Modification date 30 January 2014 20:25:57  +