Mining domain-specific thesauri from Wikipedia: a case study
Abstract Domain-specific thesauri are high-cost, high-maintenance, high-value knowledge structures. We show how the classic thesaurus structure of terms and links can be mined automatically from Wikipedia. In a comparison with a professional thesaurus for agriculture we find that Wikipedia contains a substantial proportion of its concepts and semantic relations; furthermore it has impressive coverage of contemporary documents in the domain. Thesauri derived using our techniques capitalize on existing public efforts and tend to reflect contemporary language usage better than their costly, painstakingly-constructed manual counterparts.
Added by wikilit team Added on initial load  +
Collected data time dimension Cross-sectional  +
Comments "In a comparison with a professional thesaurus for agriculture we find that Wikipedia contains a substantial proportion of its concepts and semantic relations; furthermore it has impressive coverage of contemporary documents in the domain. Thesauri derived using our techniques capitalize on existing public efforts and tend to reflect contemporary language usage better than their costly, painstakingly-constructed manual counterparts."
Conclusion In a comparison with a professional thesaurus for agriculture we find that Wikipedia contains a substantial proportion of its concepts and semantic relations; furthermore it has impressive coverage of contemporary documents in the domain. Thesauri derived using our techniques capitalize on existing public efforts and tend to reflect contemporary language usage better than their costly, painstakingly-constructed manual counterparts.
Data source Wikipedia pages  +
Doi 10.1109/WI.2006.119 +
Google scholar url http://scholar.google.com/scholar?ie=UTF-8&q=%22Mining%2Bdomain-specific%2Bthesauri%2Bfrom%2BWikipedia%3A%2Ba%2Bcase%2Bstudy%22  +
Has author David N. Milne + , Olena Medelyan + , Ian H. Witten +
Has domain Computer science + , Library science +
Has topic Comprehensiveness + , Data mining +
Pages 442-448  +
Peer reviewed Yes  +
Publication type Conference paper  +
Published in WI '06 Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence +
Research design Statistical analysis  + , Other  +
Research questions Domain-specific thesauri are high-cost, high-maintenance, high-value knowledge structures. We show how the classic thesaurus structure of terms and links can be mined automatically from Wikipedia.
Revid 10,872  +
Theories Undetermined
Theory type Design and action  +
Title Mining domain-specific thesauri from Wikipedia: a case study
Unit of analysis Website  +
Url http://dl.acm.org/citation.cfm?id=1249168  +
Wikipedia coverage Main topic  +
Wikipedia data extraction Dump  +
Wikipedia language English  +
Wikipedia page type Article  +
Year 2006  +
Creation date 15 March 2012 20:29:37  +
Categories Comprehensiveness  + , Data mining  + , Computer science  + , Library science  + , Publications  +
Modification date 30 January 2014 20:29:49  +