Learning to rank with (a lot of) word features
Abstract In this article we present Supervised Semantic Indexing which defines a class of nonlinear (quadratic) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy). However, unlike LSI our models are trained from a supervised signal directly on the ranking task of interest, which we argue is the reason for our superior results. As the query and target texts are modeled separately, our approach is easily generalized to different retrieval tasks, such as cross-language retrieval or online advertising placement. Dealing with models on all pairs of words features is computationally challenging. We propose several improvements to our basic model for addressing this issue, including low rank (but diagonal preserving) representations, correlated feature hashing and sparsification. We provide an empirical study of all these methods on retrieval tasks based on Wikipedia documents as well as an Internet advertisement task. We obtain state-of-the-art performance while providing realistically scalable methods.
Added by wikilit team Added on initial load  +
Collected data time dimension Cross-sectional  +
Conclusion Our empirical study covered query and document retrieval, cross-language retrieval and ad-placement. Our main conclusions were: (i) we found that the low rank model outperforms the full rank margin ranking perceptron with the same features as well as its sparsified version. We also outperform classical methods such as TFIDF, LSI or query expansion. Finally, it is also better than or comparable to “Hash Kernel”, another new supervised technique, in terms of accuracy, while having advantages in terms of efficiency; and (ii) using correlated feature hashing improves results even further. Both the low rank idea from (i) and correlated feature hashing (ii) prove to be effective ways to reduce the feature space size.
Data source Experiment responses  + , Wikipedia pages  +
Doi 10.1007/s10791-009-9117-9 +
Google scholar url http://scholar.google.com/scholar?ie=UTF-8&q=%22Learning%2Bto%2Brank%2Bwith%2B%28a%2Blot%2Bof%29%2Bword%2Bfeatures%22  +
Has author Bing Bai + , Jason Weston + , David Grangier + , Ronan Collobert + , Kunihiko Sadamasa + , Yanjun Qi + , Olivier Chapelle + , Kilian Weinberger +
Has domain Computer science +
Has topic Ranking and clustering systems +
Issue 3  +
Pages 291-314  +
Peer reviewed Yes  +
Publication type Journal article  +
Published in Information Retrieval +
Research design Experiment  +
Research questions In this article we present Supervised Semantic Indexing (SSI) which defines a class of nonlinear (quadratic) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy). ... We propose several improvements to our basic model for addressing this issue, including low rank (but diagonal preserving) representations, correlated feature hashing (CFH) and sparsification. We provide an empirical study of all these methods on retrieval tasks based on Wikipedia documents as well as an Internet advertisement task. We obtain state-of-the-art performance while providing realistically scalable methods.
Revid 10,850  +
Theories Undetermined
Theory type Design and action  +
Title Learning to rank with (a lot of) word features
Unit of analysis Article  +
Url http://www.springerlink.com/content/y693024624k15n40/  +
Volume 13  +
Wikipedia coverage Sample data  +
Wikipedia data extraction Dump  +
Wikipedia language English  + , Japanese  +
Wikipedia page type Article  +
Year 2010  +
Creation date 15 March 2012 20:29:27  +
Categories Ranking and clustering systems  + , Computer science  + , Publications with missing comments  + , Publications  +
Modification date 30 January 2014 20:29:24  +