|A semantic approach for question classification using WordNet and Wikipedia|
|Authors:||Santosh Kumar Ray, Shailendra Singh, B.P. Joshi|
|Citation:||Pattern Recognition Letters 31 (13): 1935-1943. 2010.|
|Publication type:||Journal article|
|Added by Wikilit team:||Added on initial load|
Question Answering Systems, unlike search engines, provide answers to users' questions in succinct form, which requires prior knowledge of what the user expects. The question classification module of a Question Answering System plays a very important role in determining those expectations. In the literature, incorrect question classification has been cited as one of the major factors behind the poor performance of Question Answering Systems, which underscores the importance of designing the question classification module. In this article, we propose a question classification method that exploits the powerful semantic features of WordNet and the vast knowledge repository of Wikipedia to describe informative terms explicitly. We trained our system on a standard set of 5500 questions (by UIUC) and then tested it on five TREC question collections. Comparing our results with some standard results reported in the literature, we observed a significant improvement in question classification accuracy. The accuracy suggests that the method is effective and promising for open-domain question classification.
Judging the correctness of an answer is an important issue in question answering. In this article, we extend question classification as one of the heuristics for answer validation. We propose a World Wide Web based solution in which answers returned by open-domain Question Answering Systems are validated using online resources such as Wikipedia and Google. We applied several heuristics to the answer validation task and tested them against some popular web based open-domain Question Answering Systems on a collection of 500 questions collected from standard sources such as TREC, the Worldbook, and the Worldfactbook. The proposed method seems promising for the automatic answer validation task.
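The classification idea in the abstract above (pick out informative terms, then use semantic relations to map them to an expected answer type) can be illustrated with a small sketch. This is not the authors' algorithm: the `HYPERNYMS` table is a hypothetical stand-in for WordNet lookups, and the class labels, stopword list, and wh-word fallback are assumptions made only for illustration.

```python
# Toy stand-in for WordNet hypernym links (hypothetical data, not from the paper).
HYPERNYMS = {
    "city": "location",
    "country": "location",
    "river": "location",
    "president": "person",
    "author": "person",
    "year": "date",
}

# Hypothetical mapping from general concepts to coarse answer classes.
CLASS_OF_CONCEPT = {"location": "LOCATION", "person": "HUMAN", "date": "NUMERIC"}

# Fallback classes keyed on the wh-word when no informative term is recognised.
WH_DEFAULTS = {"who": "HUMAN", "where": "LOCATION", "when": "NUMERIC"}

STOPWORDS = {
    "what", "which", "who", "where", "when", "how",
    "is", "was", "the", "of", "a", "an", "in", "did", "does",
}


def informative_terms(question):
    """Return lower-cased tokens with punctuation and stopwords removed."""
    tokens = [t.strip("?.,!").lower() for t in question.split()]
    return [t for t in tokens if t and t not in STOPWORDS]


def classify(question):
    """Guess the expected answer class of a question.

    The first informative term with a known hypernym decides the class;
    otherwise the wh-word default is used, else "UNKNOWN".
    """
    for term in informative_terms(question):
        concept = HYPERNYMS.get(term)
        if concept in CLASS_OF_CONCEPT:
            return CLASS_OF_CONCEPT[concept]
    words = question.split()
    first = words[0].lower() if words else ""
    return WH_DEFAULTS.get(first, "UNKNOWN")
```

For example, `classify("Which city is the capital of France?")` resolves through the term "city", while `classify("Who wrote Hamlet?")` falls back to the wh-word default. Adding a new class in this sketch only requires new entries in the two lookup tables, which loosely mirrors the extensibility the paper argues for.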
"In this article, we have proposed a question classification method that exploits the powerful semantic features of the WordNet and the vast knowledge repository of the Wikipedia to describe informative terms explicitly.
In this article, we are extending question classification as one of the heuristics for answer validation. We are proposing a World Wide Web based solution for answer validation where answers returned by open-domain Question Answering Systems can be validated using online resources such as Wikipedia and Google"
|Theory type:||Design and action|
|Wikipedia coverage:||Sample data|
|Theories:||"Automatic validation of the answer is a relatively new concept.
Introduction of AVE (Automatic Validation Exercise) at QA@CLEF in 2006 (Cross Language Evaluation Forum, 2006) gave a major boost to research in this direction."
|Data source:||Experiment responses, Archival records, Wikipedia pages|
|Collected data time dimension:||Cross-sectional|
|Unit of analysis:||Article|
|Wikipedia data extraction:||Live Wikipedia|
|Wikipedia page type:||Article|
"We have tested our approach on TREC datasets and achieved 89.55% classification accuracy which is comparable to earlier reported research works. Based on the results, we can say that proposed method seems to be promising for question classification in the field of open-domain question answering. The distinctive points of the algorithm are lying in its dynamic and extendible properties which are needed for constantly changing open-domain Question Answering Systems. Thus, it seems to be logical assumption that in such scenario question classification with fixed set of classes will not be enough and we need methods which could introduce new set of classes as and when needed."
"Secondary data: USC and TREC data sets; Wikipedia articles; Websites (WordNet)"