Last modified on January 30, 2014, at 20:21

Categorising social tags to improve folksonomy-based recommendations

Authors: Iván Cantador, Ioannis Konstas, Joemon M. Jose
Citation: Journal of Web Semantics. 2011.
Publication type: Journal article
Peer-reviewed: Yes
Added by Wikilit team: Added on initial load
Categorising social tags to improve folksonomy-based recommendations is a publication by Iván Cantador, Ioannis Konstas, Joemon M. Jose.


Abstract

In social tagging systems, users have different purposes when they annotate items. Tags not only depict the content of the annotated items, for example by listing the objects that appear in a photo, or express contextual information about the items, for example by providing the location or the time in which a photo was taken, but also describe subjective qualities and opinions about the items, or can be related to organisational aspects, such as self-references and personal tasks.

Current folksonomy-based search and recommendation models exploit the social tag space as a whole to retrieve those items relevant to a tag-based query or user profile, and do not take into consideration the purposes of tags. We hypothesise that a significant percentage of tags are noisy for content retrieval, and believe that the distinction of the personal intentions underlying the tags may be beneficial to improve the accuracy of search and recommendation processes.

We present a mechanism to automatically filter and classify raw tags in a set of purpose-oriented categories. Our approach finds the underlying meanings (concepts) of the tags, mapping them to semantic entities belonging to external knowledge bases, namely WordNet and Wikipedia, through the exploitation of ontologies created within the W3C Linking Open Data initiative. The obtained concepts are then transformed into semantic classes that can be uniquely assigned to content- and context-based categories. The identification of subjective and organisational tags is based on natural language processing heuristics.

We collected a representative dataset from the Flickr social tagging system, and conducted an empirical study to categorise real tagging data and evaluate whether the resultant tag categories really benefit a recommendation model using the Random Walk with Restarts method. The results show that content- and context-based tags are considered superior to subjective and organisational tags, achieving equivalent performance to using the whole tag space.
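The Random Walk with Restarts method mentioned above can be illustrated with a minimal sketch. It assumes a small undirected graph of users, tags and items stored as a dictionary of weighted neighbours; the node names, the edge weights and the restart probability of 0.15 are illustrative assumptions, not values taken from the paper, and the paper's own graph construction is not reproduced here.

```python
def random_walk_with_restarts(adjacency, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Return a relatedness score for every node with respect to `seed`.

    `adjacency` maps each node to a dict {neighbour: weight}. Every neighbour
    must also appear as a key. All parameter defaults are assumptions.
    """
    nodes = list(adjacency)
    scores = {n: 0.0 for n in nodes}
    scores[seed] = 1.0  # the walker starts at the seed node

    for _ in range(max_iter):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            total = sum(neighbours.values())
            if total == 0:
                # Dangling node: send its walking mass back to the seed.
                nxt[seed] += (1 - restart) * scores[u]
                continue
            for v, w in neighbours.items():
                # Move to a neighbour with probability proportional to the edge weight.
                nxt[v] += (1 - restart) * scores[u] * w / total
        # With probability `restart` the walker jumps back to the seed.
        nxt[seed] += restart
        delta = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores


# Hypothetical toy folksonomy: one user, two tags, three photos.
graph = {
    "user": {"beach": 1.0},
    "beach": {"user": 1.0, "photo1": 1.0, "photo2": 1.0},
    "sunset": {"photo2": 1.0, "photo3": 1.0},
    "photo1": {"beach": 1.0},
    "photo2": {"beach": 1.0, "sunset": 1.0},
    "photo3": {"sunset": 1.0},
}

scores = random_walk_with_restarts(graph, seed="user")
ranked_photos = sorted(
    (n for n in graph if n.startswith("photo")), key=scores.get, reverse=True
)
```

In this sketch, items reachable through the user's own tags end up with higher scores than items reachable only through other tags. Restricting `graph` to tags from selected categories would be one way to compare tag categories in the spirit of the study.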

Research questions

"We present a mechanism to automatically filter and classify raw tags in a set of purpose-oriented categories. Our approach finds the underlying meanings (concepts) of the tags, mapping them to semantic entities belonging to external knowledge bases, namely WordNet and Wikipedia, through the exploitation of ontologies created within the W3C Linking Open Data initiative. The obtained concepts are then transformed into semantic classes that can be uniquely assigned to content- and context-based categories. The identification of subjective and organisational tags is based on natural language processing heuristics."

Research details

Topics: Ontology building
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Sample data
Theories: "Measuring the relatedness of two nodes in the graph can be achieved using the Random Walks with Restarts (RWR) theory (L. Lovasz, 1996)"
Research design: Experiment
Data source: Experiment responses, Websites, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Secondary dataset
Wikipedia page type: Article
Wikipedia language: English

Conclusion

"Analysing our categorisation results, we found that, in most of the cases, ambiguities occurred with social tags classified into both content and context categories, especially in those cases where the social tags corresponded to locations. Thus, although it would be convenient to correctly disambiguate and classify such tags, the results obtained with our recommendation model are still valid as its most accurate recommendations were obtained exploiting content- and context-based tags. Ambiguities in subjective and organisational tags may occur but their influence in the recommendations is relatively much lower. Nonetheless, for recommendation purposes, we find very interesting the possibility of exploring sentiment analysis approaches to enhance our subjective and organisational tag categorisation strategy based on regular expressions. As discussed in the paper, there may exist incorrect tag assignments to subjective subcategories. For example, the tag bad hotel is categorised by our approach as a “quality” tag as it satisfies the [*<adjective><noun>*] regular expression, whereas it should be categorised as an “opinion” tag."
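The adjective-noun pattern described in the conclusion can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the toy part-of-speech lexicon `TOY_POS` and the function names are hypothetical, and a real system would rely on a proper part-of-speech tagger or a lexical resource such as WordNet.

```python
import re

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
TOY_POS = {
    "bad": "ADJ",
    "beautiful": "ADJ",
    "old": "ADJ",
    "hotel": "NOUN",
    "beach": "NOUN",
    "bridge": "NOUN",
}

# Matches any tag whose part-of-speech sequence contains an adjective
# immediately followed by a noun, mirroring [*<adjective><noun>*].
ADJ_NOUN = re.compile(r"\bADJ NOUN\b")


def pos_sequence(tag):
    """Map a multi-word tag to a space-separated part-of-speech string."""
    return " ".join(TOY_POS.get(token, "UNK") for token in tag.lower().split())


def looks_like_quality_tag(tag):
    """Return True when the tag satisfies the adjective-noun pattern."""
    return ADJ_NOUN.search(pos_sequence(tag)) is not None


examples = ["bad hotel", "old bridge", "hotel", "beach"]
matches = {tag: looks_like_quality_tag(tag) for tag in examples}
```

Here "bad hotel" satisfies the pattern just as "old bridge" does, which shows the limitation the authors point out: a purely syntactic rule cannot tell an opinion from a descriptive quality, hence their interest in sentiment analysis.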

Comments

"Websites (Flickr, WordNet), Wikipedia articles"


Further notes