Unsupervised query segmentation using generative language models and Wikipedia
Abstract In this paper, we propose a novel unsupervised approach to query segmentation, an important task in Web search. We use a generative query model to recover a query's underlying concepts that compose its original segmented form. The model's parameters are estimated using an expectation-maximization (EM) algorithm, optimizing the minimum description length objective function on a partial corpus that is specific to the query. To augment this unsupervised learning, we incorporate evidence from Wikipedia. Experiments show that our approach dramatically improves performance over the traditional approach that is based on mutual information, and produces comparable results with a supervised method. In particular, the basic generative language model contributes a 7.4% improvement over the mutual information based method (measured by segment F1 on the Intersection test set). EM optimization further improves the performance by 14.3%. Additional knowledge from Wikipedia provides another improvement of 24.3%, adding up to a total of 46% improvement (from 0.530 to 0.774).
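The three percentages in the abstract sum to the stated 46% only if each is read as a gain relative to the 0.530 baseline rather than as a compounding step. That reading is an assumption inferred from the numbers, not something the abstract states. A minimal Python sketch of the arithmetic (variable names are illustrative):

```python
# Figures quoted in the abstract (segment F1, Intersection test set).
baseline_f1 = 0.530   # mutual information based method
final_f1 = 0.774      # full approach with Wikipedia evidence

# Assumption: each percentage is relative to the 0.530 baseline.
steps = {
    "generative language model": 7.4,
    "EM optimization": 14.3,
    "Wikipedia evidence": 24.3,
}

# Sum of the three reported improvements, in percent.
total_pct = sum(steps.values())

# Apply each gain additively to the baseline score.
f1 = baseline_f1
for name, pct in steps.items():
    f1 += baseline_f1 * pct / 100
    print(f"after {name}: F1 = {f1:.3f}")

# Overall relative gain implied by the two endpoint scores.
relative_gain = (final_f1 - baseline_f1) / baseline_f1 * 100
print(f"sum of steps: {total_pct:.1f}%, endpoint gain: {relative_gain:.1f}%")
```

Under this reading the summed steps and the endpoint scores agree to within rounding; compounding the three percentages instead would give a larger total than the one reported.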
Added by wikilit team Added on initial load  +
Collected data time dimension Cross-sectional  +
Comments "the basic generative language model contributes a 7.4% improvement over the mutual information based method (measured by segment F1 on the Intersection test set). EM optimization further improves the performance by 14.3%. Additional knowledge from Wikipedia provides another improvement of 24.3%, adding up to a total of 46% improvement (from 0.530 to 0.774)."
Conclusion Experiments show that our approach dramatically improves performance over the traditional approach that is based on mutual information, and produces comparable results with a supervised method. In particular, the basic generative language model contributes a 7.4% improvement over the mutual information based method (measured by segment F1 on the Intersection test set). EM optimization further improves the performance by 14.3%. Additional knowledge from Wikipedia provides another improvement of 24.3%, adding up to a total of 46% improvement (from 0.530 to 0.774).
Data source Experiment responses  + , Websites  + , Wikipedia pages  +
Doi 10.1145/1367497.1367545 +
Google scholar url http://scholar.google.com/scholar?ie=UTF-8&q=%22Unsupervised%2Bquery%2Bsegmentation%2Busing%2Bgenerative%2Blanguage%2Bmodels%2Band%2BWikipedia%22  +
Has author Bin Tan + , Fuchun Peng +
Has domain Computer science +
Has topic Query processing +
Pages 347-356  +
Peer reviewed Yes  +
Publication type Conference paper  +
Published in Proceeding of the 17th international conference on World Wide Web +
Research design Experiment  + , Statistical analysis  +
Research questions In this paper, we propose a novel unsupervised approach to query segmentation, an important task in Web search. We use a generative query model to recover a query's underlying concepts that compose its original segmented form. The model's parameters are estimated using an expectation-maximization (EM) algorithm, optimizing the minimum description length objective function on a partial corpus that is specific to the query. To augment this unsupervised learning, we incorporate evidence from Wikipedia.
Revid 11,014  +
Theories Undetermined
Theory type Design and action  +
Title Unsupervised query segmentation using generative language models and Wikipedia
Unit of analysis N/A  +
Url http://dl.acm.org/citation.cfm?id=1367545  +
Wikipedia coverage Sample data  +
Wikipedia data extraction Dump  +
Wikipedia language English  +
Wikipedia page type Article  + , Log  +
Year 2008  +
Creation date 15 March 2012 20:32:00  +
Categories Query processing  + , Computer science  + , Publications  +
Modification date 30 January 2014 20:32:04  +