A negative category based approach for Wikipedia document classification
Abstract Profile based methods have been successfully used for the classification of unstructured texts. This paper presents a profile based method for Wikipedia XML document classification. We have used profiles built using negative category information. Our approach exploits the structure of Wikipedia documents to build profiles. Two class profiles are built: one based on the whole content and the other based on the initial description of the Wikipedia documents. In addition, we have also explored the option of using the terms in the section and subsection titles. The effectiveness of cosine and fractional similarity measures in classifying XML documents is compared. Combining the two profile based classifiers is experimentally shown to work better than the individual classifiers.
Added by wikilit team Added on initial load  +
Collected data time dimension Cross-sectional  +
Comments Secondary (INEX: Wikipedia articles)
Conclusion This paper presents a method of Wikipedia classification. Since NCD based profile creation proved to perform well for non-overlapping categories, we have experimented with this method, coupled with the method that exploits IDES and title terms for profile creation. The IDES of the Wikipedia documents, which contain domain specific terms, helped to improve the performance of overall classification. Combination of two classifiers has shown better results than any of the classifiers taken individually. We also plan to extend this method by exploring more Wikipedia specific structures such as links in a document.
Data source Experiment responses  + , Wikipedia pages  +
Doi 10.1504/IJKEDM.2010.032582 +
Google scholar url http://scholar.google.com/scholar?ie=UTF-8&q=%22A%2Bnegative%2Bcategory%2Bbased%2Bapproach%2Bfor%2BWikipedia%2Bdocument%2Bclassification%22  +
Has author Meenakshi Sundaram Murugeshan + , K. Lakshmi + , Saswati Mukherjee +
Has domain Computer science +
Has topic Text classification +
Month April  +
Pages 84-97  +
Peer reviewed Yes  +
Publication type Journal article  +
Published in International Journal of Knowledge Engineering and Data Mining +
Research design Experiment  +
Research questions This paper presents a profile based method for Wikipedia XML document classification. This research aims at exploiting profile-based classification. The focus of the work is on improving the profile creation, thereby improving the performance of classification.
Revid 10,637  +
Theories Undetermined
Theory type Design and action  +
Title A negative category based approach for Wikipedia document classification
Unit of analysis Article  +
Url http://inderscience.metapress.com/content/m538150712242802/  +
Volume 1  +
Wikipedia coverage Sample data  +
Wikipedia data extraction Dump  +
Wikipedia language Not specified  +
Wikipedia page type Article  +
Year 2010  +
Creation date 13 March 2012 12:20:13  +
Categories Text classification  + , Computer science  + , Publications  +
Modification date 30 January 2014 20:19:41  +