A semantic approach for question classification using WordNet and Wikipedia

Authors: Santosh Kumar Ray, Shailendra Singh, B.P. Joshi
Citation: Pattern Recognition Letters 31 (13): 1935-1943. 2010.
Publication type: Journal article
Peer-reviewed: Yes
DOI: 10.1016/j.patrec.2010.06.012.
Link(s): http://www.sciencedirect.com/science/article/pii/S0167865510001996
Added by Wikilit team: Added on initial load
A semantic approach for question classification using WordNet and Wikipedia is a publication by Santosh Kumar Ray, Shailendra Singh, and B.P. Joshi.


Abstract

Unlike search engines, Question Answering Systems provide answers to users' questions in succinct form, which requires prior knowledge of what the user expects. The question classification module of a Question Answering System plays a very important role in determining those expectations. In the literature, incorrect question classification has been cited as one of the major factors in the poor performance of Question Answering Systems, which underscores the importance of designing the question classification module well. In this article, we propose a question classification method that exploits the powerful semantic features of WordNet and the vast knowledge repository of Wikipedia to describe informative terms explicitly. We trained our system on a standard set of 5500 questions (by UIUC) and then tested it on five TREC question collections. We compared our results with some standard results reported in the literature and observed a significant improvement in the accuracy of question classification. The question classification accuracy suggests that the method is effective and promising for open-domain question classification.

Judging the correctness of an answer is an important issue in question answering. In this article, we extend question classification as one of the heuristics for answer validation. We propose a World Wide Web based solution for answer validation in which answers returned by open-domain Question Answering Systems can be validated using online resources such as Wikipedia and Google. We applied several heuristics to the answer validation task and tested them against some popular web-based open-domain Question Answering Systems on a collection of 500 questions drawn from standard sources such as TREC, the Worldbook, and the Worldfactbook. The proposed method seems promising for the automatic answer validation task.

Research questions

"In this article, we have proposed a question classification method that exploits the powerful semantic features of the WordNet and the vast knowledge repository of the Wikipedia to describe informative terms explicitly.

In this article, we are extending question classification as one of the heuristics for answer validation. We are proposing a World Wide Web based solution for answer validation where answers returned by open-domain Question Answering Systems can be validated using online resources such as Wikipedia and Google"

Research details

Topics: Text classification
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Sample data
Theories: "Automatic validation of the answer is a relatively new concept.

Introduction of AVE (Automatic Validation Exercise) at QA@CLEF in 2006 (Cross Language Evaluation Forum, 2006) gave a major boost to research in this direction."

Research design: Experiment
Data source: Experiment responses, Archival records, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Live Wikipedia
Wikipedia page type: Article
Wikipedia language: English

Conclusion

"We have tested our approach on TREC datasets and achieved 89.55% classification accuracy which is comparable to earlier reported research works. Based on the results, we can say that proposed method seems to be promising for question classification in the field of open-domain question answering. The distinctive points of the algorithm are lying in its dynamic and extendible properties which are needed for constantly changing open-domain Question Answering Systems. Thus, it seems to be logical assumption that in such scenario question classification with fixed set of classes will not be enough and we need methods which could introduce new set of classes as and when needed."

Comments

"Secondary data: USC and TREC data sets; Wikipedia articles; Websites (WordNet)"


Further notes

Facts about "A semantic approach for question classification using WordNet and Wikipedia"
Added by wikilit team: Added on initial load
Collected data time dimension: Cross-sectional
Comments: Secondary data: USC and TREC data sets; Wikipedia articles; Websites (WordNet)
Data source: Experiment responses, Archival records, Wikipedia pages
DOI: 10.1016/j.patrec.2010.06.012
Google Scholar URL: http://scholar.google.com/scholar?ie=UTF-8&q=%22A%2Bsemantic%2Bapproach%2Bfor%2Bquestion%2Bclassification%2Busing%2BWordNet%2Band%2BWikipedia%22
Has author: Santosh Kumar Ray, Shailendra Singh, B.P. Joshi
Has domain: Computer science
Has topic: Text classification
Issue: 13
Pages: 1935-1943
Peer reviewed: Yes
Publication type: Journal article
Published in: Pattern Recognition Letters
Research design: Experiment
Revid: 10,640
Theory type: Design and action
Title: A semantic approach for question classification using WordNet and Wikipedia
Unit of analysis: Article
URL: http://www.sciencedirect.com/science/article/pii/S0167865510001996
Volume: 31
Wikipedia coverage: Sample data
Wikipedia data extraction: Live Wikipedia
Wikipedia language: English
Wikipedia page type: Article
Year: 2010