Exploring words with semantic correlations from Chinese Wikipedia

From WikiLit
Revision as of 02:20, January 29, 2014 by Mehdi (Talk | contribs) (changed the collected data time dimension)

Publication
Exploring words with semantic correlations from Chinese Wikipedia
Authors: Yun Li, Kaiyan Huang, Fuji Ren, Yixin Zhong
Citation: 5th IFIP International Conference on Intelligent Information Processing. 2008 October 19-22. Beijing.
Publication type: Conference paper
Peer-reviewed: Yes
Database(s):
DOI: 10.1007/978-0-387-87685-6_14
Google Scholar cites: Citations
Link(s): http://www.springerlink.com/content/d11702t4503n2678/
Added by Wikilit team: Added on initial load
Exploring words with semantic correlations from Chinese Wikipedia is a publication by Yun Li, Kaiyan Huang, Fuji Ren, Yixin Zhong.


Abstract

This paper introduces a way of exploring words with semantic relations from Chinese Wikipedia documents. A corpus of structured documents is generated from Chinese Wikipedia pages. Then, considering hyperlinks, text overlaps and word frequencies, word pairs with semantic relations are explored. Words can be self-clustered into groups with tight semantic relations. We roughly measure the semantic relatedness with different document-based algorithms and analyze the reliability of our measures in a comparison experiment.
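
As a rough illustration of how hyperlinks and text overlap might be combined, the following Python sketch treats every hyperlink between two documents as a candidate word pair and scores the pair by the Jaccard overlap of the two documents' token sets. This is an assumption made for illustration, not the authors' algorithm: the abstract does not specify the measures used, and the names `jaccard`, `candidate_pairs`, `score_pairs` and the toy corpus layout are hypothetical. Tokens are assumed to be already segmented.

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard overlap of two token collections (0.0 when both are empty)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def candidate_pairs(corpus):
    """Collect unordered title pairs that are connected by a hyperlink.

    `corpus` maps a title to {"tokens": [...], "links": [...]} (hypothetical layout).
    Links to titles outside the corpus and self-links are ignored.
    """
    pairs = set()
    for title, doc in corpus.items():
        for target in doc["links"]:
            if target in corpus and target != title:
                pairs.add(tuple(sorted((title, target))))
    return pairs


def score_pairs(corpus):
    """Score every hyperlinked pair by the text overlap of its two documents."""
    return {
        (a, b): jaccard(corpus[a]["tokens"], corpus[b]["tokens"])
        for a, b in candidate_pairs(corpus)
    }


# Toy corpus, invented for this example only.
toy_corpus = {
    "tea": {"tokens": ["drink", "leaf", "china", "water"], "links": ["china", "water"]},
    "china": {"tokens": ["country", "asia", "tea", "china"], "links": ["tea"]},
    "water": {"tokens": ["drink", "liquid", "water"], "links": []},
}
scores = score_pairs(toy_corpus)
```

In this toy corpus the pair ("tea", "water") shares two of five distinct tokens, so it scores higher than ("china", "tea"), which shares one of seven. A word-frequency weighting could replace the plain set overlap without changing the overall shape of the sketch.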

Research questions

"In this paper, we work on semantic correlation between Chinese words based onWikipedia documents. A corpus with about 50,000 structured documents is generated fromWikipedia pages. Then considering of hyper-links, text overlaps and word frequency, about 300,000 word pairs with semantic correlations are explored from these documents.We roughly measure the degree of semantic correlations and find groups with tight semantic correlations by self clustering."

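One simple reading of "self clustering" is to link every word pair whose relatedness score passes a threshold and then take the connected groups. The Python sketch below does this with a small union-find structure. The threshold, the function name `cluster_pairs` and the sample scores are hypothetical, and the paper may well cluster differently.

```python
def cluster_pairs(scored_pairs, threshold):
    """Group words connected by pairs scoring at least `threshold`.

    `scored_pairs` maps (word_a, word_b) to a relatedness score.
    Returns a sorted list of sorted word groups; words that appear only
    in pairs below the threshold are left out.
    """
    parent = {}

    def find(word):
        # Return the representative of the word's group, compressing the path.
        parent.setdefault(word, word)
        while parent[word] != word:
            parent[word] = parent[parent[word]]
            word = parent[word]
        return word

    for (a, b), score in scored_pairs.items():
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two groups

    groups = {}
    for word in list(parent):
        groups.setdefault(find(word), []).append(word)
    return sorted(sorted(members) for members in groups.values())


# Scores invented for this example only.
sample_scores = {
    ("tea", "water"): 0.4,
    ("china", "tea"): 0.14,
    ("water", "drink"): 0.5,
    ("asia", "country"): 0.6,
}
groups = cluster_pairs(sample_scores, threshold=0.3)
```

With a threshold of 0.3 the weak pair ("china", "tea") is dropped, so "tea", "water" and "drink" end up in one group and "asia" and "country" in another. Raising the threshold yields fewer, tighter groups.
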
Research details

Topics: Semantic relatedness
Domains: Computer science
Theory type: Analysis
Wikipedia coverage: Sample data
Theories: "Undetermined"
Research design: Experiment
Data source:
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Live Wikipedia
Wikipedia page type: Article
Wikipedia language: Chinese

Conclusion

"In this paper, the Chinese Wikipedia pages are used for semantic related word searching. Considering of hyper-links, text overlaps and word frequency, 360,304 word pairs with semantic correlations are explored from 54,745 structured documents from Wikipedia. We also roughly measured semantic correlations, analyzed the reliability of our measures. As with similar hierarchical structure, algorithms and applications for WordNet, Hownet may be transplanted toWikipedia. Semantic Relatedness is used to measuring the degree of semantic correlations, not considering of the difference of relation types. By analyzing the properties of different algorithms based on text overlap or information contents, we are hoping to find a reliable way of searching for groups with semantic correlations and compute the semantic relatedness. For research on semantic relations in NLP, Wikipedia could be employed more in future works. Acknowledgements This research has been partially supported by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Scientific Research (B), 19300029. Thanks to Associate Professor Suzuki, and Doctor Matsumoto from The University of Tokushima for instructions."

Comments

"For research on semantic relations in NLP, Wikipedia could be employed more in future works"


Further notes

Facts about "Exploring words with semantic correlations from Chinese Wikipedia"
Added by wikilit team: Added on initial load
Collected data time dimension: Cross-sectional
Conference location: Beijing
Dates: 19-22
Doi: 10.1007/978-0-387-87685-6_14
Google scholar url: http://scholar.google.com/scholar?ie=UTF-8&q=%22Exploring%2Bwords%2Bwith%2Bsemantic%2Bcorrelations%2Bfrom%2BChinese%2BWikipedia%22
Has author: Yun Li, Kaiyan Huang, Fuji Ren and Yixin Zhong
Has domain: Computer science
Has topic: Semantic relatedness
Month: October
Peer reviewed: Yes
Publication type: Conference paper
Published in: 5th IFIP International Conference on Intelligent Information Processing
Research design: Experiment
Revid: 10,608
Theories: Undetermined
Theory type: Analysis
Title: Exploring words with semantic correlations from Chinese Wikipedia
Unit of analysis: Article
Url: http://www.springerlink.com/content/d11702t4503n2678/
Wikipedia coverage: Sample data
Wikipedia data extraction: Live Wikipedia
Wikipedia language: Chinese
Wikipedia page type: Article
Year: 2008