Learning to rank with (a lot of) word features

Authors: Bing Bai, Jason Weston, David Grangier, Ronan Collobert, Kunihiko Sadamasa, Yanjun Qi, Olivier Chapelle, Kilian Weinberger
Citation: Information Retrieval 13 (3): 291-314. 2010.
Publication type: Journal article
Peer-reviewed: Yes
DOI: 10.1007/s10791-009-9117-9
Link(s): http://www.springerlink.com/content/y693024624k15n40/
Added by Wikilit team: Added on initial load
Learning to rank with (a lot of) word features is a publication by Bing Bai, Jason Weston, David Grangier, Ronan Collobert, Kunihiko Sadamasa, Yanjun Qi, Olivier Chapelle, Kilian Weinberger.


Abstract

In this article we present Supervised Semantic Indexing which defines a class of nonlinear (quadratic) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy). However, unlike LSI our models are trained from a supervised signal directly on the ranking task of interest, which we argue is the reason for our superior results. As the query and target texts are modeled separately, our approach is easily generalized to different retrieval tasks, such as cross-language retrieval or online advertising placement. Dealing with models on all pairs of word features is computationally challenging. We propose several improvements to our basic model for addressing this issue, including low rank (but diagonal preserving) representations, correlated feature hashing and sparsification. We provide an empirical study of all these methods on retrieval tasks based on Wikipedia documents as well as an Internet advertisement task. We obtain state-of-the-art performance while providing realistically scalable methods.
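
The scoring model at the heart of the paper maps bag-of-words vectors for a query q and a document d to a score f(q, d) = q^T W d, where W holds a weight for every pair of words; the low-rank, diagonal-preserving variant factors W as U^T V + I so the full vocabulary-by-vocabulary matrix never has to be stored. A minimal sketch of this scorer (dimensions, names and the toy data are illustrative assumptions, not the authors' code):

  import numpy as np

  def ssi_score(q, d, U, V):
      # Low-rank, diagonal-preserving SSI score:
      #   f(q, d) = q^T (U^T V + I) d = (U q) . (V d) + q . d
      # i.e. a learned "semantic" term plus the classical
      # vector-space dot product, without materializing W.
      return np.dot(U @ q, V @ d) + np.dot(q, d)

  # Toy example: vocabulary of 5 words, embedding dimension 2.
  rng = np.random.default_rng(0)
  U = rng.normal(size=(2, 5))
  V = rng.normal(size=(2, 5))
  q = np.array([1.0, 0.0, 1.0, 0.0, 0.0])  # bag-of-words query
  d = np.array([0.0, 1.0, 1.0, 0.0, 1.0])  # candidate document
  print(ssi_score(q, d, U, V))

Keeping the identity term means the model can still reward exact word matches, as in TFIDF-style retrieval, while U and V capture synonymy-like correlations between different words.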

Research questions

"In this article we present Supervised Semantic Indexing (SSI) which defines a class of nonlinear (quadratic) models that are discriminatively trained to directly map from the word content in a query-document or document-document pair to a ranking score. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy).....We propose several improvements to our basic model for addressing this issue, including low rank (but diagonal preserving) representations, correlated feature hashing (CFH) and sparsification. We provide an empirical study of all these methods on retrieval tasks based on Wikipedia documents as well as an Internet advertisement task. We obtain state-of-the-art performance while providing realistically scalable methods."

Research details

Topics: Ranking and clustering systems
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Sample data
Theories: "Undetermined"
Research design: Experiment
Data source: Experiment responses, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Dump
Wikipedia page type: Article
Wikipedia language: English, Japanese

Conclusion

"Our empirical study covered query and document retrieval, cross-language retrieval and ad-placement. Our main conclusions were: (i) we found that the low rank model outperforms the full rank margin ranking perceptron with the same features as well as its sparsified version. We also outperform classical methods such as TFIDF, LSI or query expansion. Finally, it is also better than or comparable to“Hash Kernel”, another new supervised technique, in terms of accuracy, while having advantages in terms of efficiency; and (ii) Using Correlated feature hashing improves results even further. Both the low rank idea from (i) and correlated feature hashing (ii) prove to be effective ways to reduce the feature space size."

