Difference between revisions of "Understanding user's query intent with Wikipedia"

From WikiLit
m (added_by_wikilit_team field added)
(Changed the research design)
Line 18:
  |wikipedia_coverage=Main topic
  |theories=Undetermined
- |research_design=Case study, Ethnography
+ |research_design=Case study, Experiment
- |collected_datatype=Direct observation
+ |collected_datatype=Experiment responses
  |collected_data_time_dimension=Cross-sectional
  |unit_of_analysis=N/A

Revision as of 17:03, November 15, 2013

Publication
Understanding user's query intent with Wikipedia
Authors: Jian Hu, Gang Wang, Fred Lochovsky, Jian tao Sun, Zheng Chen
Citation: WWW '09 Proceedings of the 18th international conference on World wide web. 2009.
Publication type: Conference paper
Peer-reviewed: Yes
Database(s):
DOI: 10.1145/1526709.1526773.
Google Scholar cites: Citations
Link(s): Paper link
Added by Wikilit team: Added on initial load
"Understanding user's query intent with Wikipedia" is a publication by Jian Hu, Gang Wang, Fred Lochovsky, Jian tao Sun, and Zheng Chen.


Abstract

Understanding the intent behind a user's query can help a search engine automatically route the query to the corresponding vertical search engines to obtain particularly relevant content, thus greatly improving user satisfaction. There are three major challenges in the query intent classification problem: (1) intent representation; (2) domain coverage; and (3) semantic interpretation. Current approaches to predicting the user's intent mainly utilize machine learning techniques. However, meeting all these challenges with statistical machine learning approaches is difficult and often requires considerable human effort. In this paper, we propose a general methodology for the problem of query intent classification. With very little human effort, our method can discover large quantities of intent concepts by leveraging Wikipedia, one of the best human knowledge bases. The Wikipedia concepts are used as the intent representation space; thus, each intent domain is represented as a set of Wikipedia articles and categories. The intent of any input query is identified by mapping the query into the Wikipedia representation space. Compared with previous approaches, our proposed method can achieve much better coverage when classifying queries in an intent domain even though the number of seed intent examples is very small. Moreover, the method is very general and can be easily applied to various intent domains. We demonstrate the effectiveness of this method in three different applications, i.e., travel, job, and person name. In each of the three cases, only a couple of seed intent queries are provided. We perform quantitative evaluations in comparison with two baseline methods, and the experimental results show that our method significantly outperforms the other methods in each intent domain.
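
The general idea in the abstract (each intent domain is a set of Wikipedia concepts, and a query is classified by mapping it into that concept space) can be illustrated with a toy sketch. This is not the authors' algorithm: the concept sets, the term-to-concept lookup table, and the overlap score below are all hypothetical stand-ins chosen only to make the idea concrete.

```python
from typing import Dict, Set

# Hypothetical data: each intent domain is a set of Wikipedia-style concept
# titles. The paper builds such sets from Wikipedia; these are made up here.
INTENT_CONCEPTS: Dict[str, Set[str]] = {
    "travel": {"Hotel", "Airline", "Tourism"},
    "job": {"Employment", "Recruitment", "Curriculum vitae"},
}

# Hypothetical lookup from a query term to the concepts it may refer to.
TERM_TO_CONCEPTS: Dict[str, Set[str]] = {
    "hotel": {"Hotel"},
    "flights": {"Airline"},
    "hiring": {"Recruitment", "Employment"},
    "cv": {"Curriculum vitae"},
}


def map_query_to_concepts(query: str) -> Set[str]:
    """Map a query into the concept space by looking up each term."""
    concepts: Set[str] = set()
    for term in query.lower().split():
        concepts |= TERM_TO_CONCEPTS.get(term, set())
    return concepts


def classify_intent(query: str) -> Dict[str, float]:
    """Score each domain by the fraction of the query's concepts it contains."""
    concepts = map_query_to_concepts(query)
    scores: Dict[str, float] = {}
    for domain, domain_concepts in INTENT_CONCEPTS.items():
        if concepts:
            scores[domain] = len(concepts & domain_concepts) / len(concepts)
        else:
            # No known concept was found for the query: no evidence either way.
            scores[domain] = 0.0
    return scores


if __name__ == "__main__":
    print(classify_intent("cheap flights paris"))
    print(classify_intent("software hiring"))
```

A simple overlap ratio is used here only because it is easy to read; the scoring and concept-discovery steps described in the paper itself may differ substantially.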

Research questions

"In this paper, we propose a general methodology to the problem of query intent classification."

Research details

Topics: Query processing
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Main topic
Theories: "Undetermined"
Research design: Case study, Experiment
Data source:
Collected data time dimension: Cross-sectional
Unit of analysis: N/A
Wikipedia data extraction: Clone
Wikipedia page type: Article, Log
Wikipedia language: English

Conclusion

"The Wikipedia concepts are used as the intent representation space, thus, each intent domain is represented as a set of Wikipedia articles and categories. The intent of any input query is identified through mapping the query into the Wikipedia representation space. Compared with previous approaches, our proposed method can achieve much better coverage to classify queries in an intent domain even through the number of seed intent examples is very small. Moreover, the method is very general and can be easily applied to various intent domains. We demonstrate the effectiveness of this method in three different applications, i.e., travel, job, and person name. In each of the three cases, only a couple of seed intent queries are provided. We perform the quantitative evaluations in comparison with two baseline methods, and the experimental results shows that our method significantly outperforms other methods in each intent domain."

Comments

"Compared with previous approaches, our proposed method can achieve much better coverage to classify queries in an intent domain even through the number of seed intent examples is very small." p.471


Further notes

Facts about "Understanding user's query intent with Wikipedia"
Added by wikilit team: Added on initial load
Collected data time dimension: Cross-sectional
Comments: "Compared with previous approaches, our proposed method can achieve much better coverage to classify queries in an intent domain even through the number of seed intent examples is very small." p.471
Doi: 10.1145/1526709.1526773
Google scholar url: http://scholar.google.com/scholar?ie=UTF-8&q=%22Understanding%2Buser%27s%2Bquery%2Bintent%2Bwith%2BWikipedia%22
Has author: Jian Hu, Gang Wang, Fred Lochovsky, Jian tao Sun and Zheng Chen
Has domain: Computer science
Has topic: Query processing
Peer reviewed: Yes
Publication type: Conference paper
Published in: WWW '09 Proceedings of the 18th international conference on World wide web
Research design: Case study and Experiment
Research questions: In this paper, we propose a general methodology to the problem of query intent classification.
Revid: 10,078
Theories: Undetermined
Theory type: Design and action
Title: Understanding user's query intent with Wikipedia
Unit of analysis: N/A
Url: http://dl.acm.org/citation.cfm?id=1526709.1526773
Wikipedia coverage: Main topic
Wikipedia data extraction: Clone
Wikipedia language: English
Wikipedia page type: Article and Log
Year: 2009