Tree-traversing ant algorithm for term clustering based on featureless similarities

Authors: Wilson Wong, Wei Liu, Mohammed Bennamoun
Citation: Data Mining and Knowledge Discovery 15 (3): 349-381. 2007.
Publication type: Journal article
Peer-reviewed: Yes
DOI: 10.1007/s10618-007-0073-y.
Tree-traversing ant algorithm for term clustering based on featureless similarities is a publication by Wilson Wong, Wei Liu, Mohammed Bennamoun.


Abstract

Many conventional methods for concepts formation in ontology learning have relied on the use of predefined templates and rules, and static resources such as WordNet. Such approaches are not scalable, are difficult to port between different domains, and are incapable of handling knowledge fluctuations. Their results are far from desirable, either. In this paper, we propose a new ant-based clustering algorithm, Tree-Traversing Ant (TTA), for concepts formation as part of an ontology learning system. With the help of Normalized Google Distance (NGD) and n° of Wikipedia (n°W) as measures for similarity and distance between terms, we attempt to achieve an adaptable clustering method that is highly scalable and portable across domains. Evaluations with seven datasets show promising results with an average lexical overlap of 97% and ontological improvement of 48%. At the same time, the evaluations demonstrated several advantages that are not simultaneously present in standard ant-based and other conventional clustering methods.
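For readers unfamiliar with NGD, the following is a minimal sketch of the standard Normalized Google Distance formula as defined by Cilibrasi and Vitányi, computed from search-engine page counts. The function name `ngd`, its parameters, and the page counts in the usage note are illustrative assumptions. This record does not describe how the paper itself obtains or normalizes its counts.

```python
import math


def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
    """Normalized Google Distance between two terms x and y.

    fx, fy -- page counts for each term alone (hypothetical inputs)
    fxy    -- page count for pages containing both terms
    n      -- assumed total number of indexed pages
    """
    # Terms that never occur, or never co-occur, are treated as
    # infinitely distant (an assumption of this sketch).
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))
```

With made-up counts, `ngd(1000, 1000, 1000, 10**6)` is 0 because the two terms always co-occur, and the value grows as the co-occurrence count `fxy` shrinks relative to the individual counts.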

Research questions

"In this paper, we propose a new ant-based clustering algorithm, Tree-Traversing Ant (TTA), for concepts formation as part of an ontology learning system. With the help of Normalized Google Distance (NGD) and n° of Wikipedia (n°W) as measures for similarity and distance between terms, we attempt to achieve an adaptable clustering method that is highly scalable and portable across domains."

Research details

Topics: Ontology building
Domains: Computer science
Theory type: Analysis, Design and action
Wikipedia coverage: Other
Theories: "the TTAs will employ a new measure called n° of Wikipedia (n°W) for quantifying the distance between two terms based on the cross-linking of Wikipedia articles (Wong et al. 2006)."
Research design: Experiment
Data source: Experiment responses, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Dump
Wikipedia page type: Article
Wikipedia language: English

Conclusion

"Seven of the most notable strengths of the TTA with NGD and n°W are: – Able to further distinguish hidden structures within clusters; – Flexible in regards to the discovery of clusters; – Capable of identifying and isolating outliers; – Tolerance to differing cluster sizes; – Able to produce consistent results; – Able to identify implicit taxonomic relationships between clusters; and – Inherent capability of coping with synonyms, word senses and the fluctuation in terms usage."

Comments

"we have proposed the innovative use of featureless similarity based on Normalized Google Distance (NGD) and n° of Wikipedia (n°W). The use of the two similarity measures as part of a new hybrid clustering algorithm called Tree-Traversing Ant (TTA) demonstrated excellent results during our evaluations."


Further notes