Classifying tags using open content resources
Abstract Tagging has emerged as a popular means to annotate on-line objects such as bookmarks, photos and videos. Tags vary in semantic meaning and can describe different aspects of a media object. Tags describe the content of the media as well as locations, dates, people and other associated meta-data. Being able to automatically classify tags into semantic categories allows us to understand better the way users annotate media objects and to build tools for viewing and browsing the media objects. In this paper we present a generic method for classifying tags using third party open content resources, such as Wikipedia and the Open Directory. Our method uses structural patterns that can be extracted from resource meta-data. We describe the implementation of our method on Wikipedia using WordNet categories as our classification schema and ground truth. Two structural patterns found in Wikipedia are used for training and classification: categories and templates. We apply our system to classifying Flickr tags. Compared to a WordNet baseline our method increases the coverage of the Flickr vocabulary by 115%. We can classify many important entities that are not covered by WordNet, such as, London Eye, Big Island, Ronaldinho, geo-caching and wii.
Added by wikilit team Added on initial load  +
Collected data time dimension Cross-sectional  +
Conference location Barcelona, Spain +
Data source Experiment responses  + , Websites  + , Wikipedia pages  +
Dates 9-12 +
Doi 10.1145/1498759.1498810 +
Google scholar url http://scholar.google.com/scholar?ie=UTF-8&q=%22Classifying%2Btags%2Busing%2Bopen%2Bcontent%2Bresources%22  +
Has author Simon Overell + , Börkur Sigurbjörnsson + , Roelof Van Zwol +
Has domain Computer science +
Has topic Text classification +
Pages 64-73  +
Peer reviewed Yes  +
Publication type Conference paper  +
Published in WSDM '09 Proceedings of the Second ACM International Conference on Web Search and Data Mining +
Publisher Association for Computing Machinery +
Research design Experiment  +
Research questions In this paper we present a generic method for classifying tags using third party open content resources, such as Wikipedia and the Open Directory. Our method uses structural patterns that can be extracted from resource meta-data. We describe the implementation of our method on Wikipedia using WordNet categories as our classification schema and ground truth.
Revid 10,694  +
Theories Undetermined
Theory type Design and action  +
Title Classifying tags using open content resources
Unit of analysis Article  +
Url http://dl.acm.org/citation.cfm?id=1498810  +
Wikipedia coverage Sample data  +
Wikipedia data extraction Dump  +
Wikipedia language English  +
Wikipedia page type Article  +
Year 2009  +
Creation date 15 March 2012 20:24:57  +
Categories Text classification  + , Computer science  + , Publications with missing conclusion  + , Publications with missing comments  + , Publications  +
Modification date 30 January 2014 20:21:42  +