Web-based pattern learning for named entity translation in Korean-Chinese cross-language information retrieval

Authors: Yu-Chun Wang, Richard Tzong-Han Tsai, Wen-Lian Hsu
Citation: Expert Systems with Applications 36 : 3990-3995. 2009.
Publication type: Journal article
Peer-reviewed: Yes
DOI: 10.1016/j.eswa.2008.02.067
Added by Wikilit team: Added on initial load
Web-based pattern learning for named entity translation in Korean-Chinese cross-language information retrieval is a publication by Yu-Chun Wang, Richard Tzong-Han Tsai, and Wen-Lian Hsu.


Abstract

Named entity (NE) translation plays an important role in many applications, such as information retrieval and machine translation. In this paper, we focus on translating NEs from Korean to Chinese in order to improve Korean-Chinese cross-language information retrieval (KCIR). The ideographic nature of Chinese makes NE translation difficult because one syllable may map to several Chinese characters. We propose a hybrid NE translation system. First, we integrate two online databases to extend the coverage of our bilingual dictionaries. We use Wikipedia as a translation tool based on the inter-language links between the Korean edition and the Chinese or English editions. We also use Naver.com's people search engine to find a query name's Chinese or English translation. The second component of our system is able to learn Korean-Chinese (K-C), Korean-English (K-E), and English-Chinese (E-C) translation patterns from the web. These patterns can be used to extract K-C, K-E, and E-C pairs from Google snippets. We found KCIR performance using this hybrid configuration over five times better than that of a dictionary-based configuration using only Naver people search. Mean average precision was as high as 0.3385 and recall reached 0.7578. Our method can handle Chinese, Japanese, Korean, and non-CJK NE translation and improve the performance of KCIR substantially.
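
The two-stage design described in the abstract (dictionary-style lookup first, then web-snippet extraction) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the in-memory tables `WIKI_LINKS` and `PEOPLE_SEARCH` stand in for Wikipedia inter-language links and Naver people search, and the single hand-written "Korean name followed by a parenthesized Chinese string" pattern stands in for the patterns the paper learns from the web. It is not the authors' implementation.

```python
import re

# Hypothetical stand-ins for the two online resources named in the abstract.
WIKI_LINKS = {"서울": "首爾"}        # Korean title -> Chinese title (inter-language link)
PEOPLE_SEARCH = {"김대중": "金大中"}  # Korean person name -> Chinese name

# One or more characters from the CJK Unified Ideographs block.
HAN = r"[\u4e00-\u9fff]+"


def translate_from_snippets(name, snippets):
    """Extract candidates matching 'name(漢字)' and return the most frequent one."""
    # Assumed pattern: the Korean name, then a Chinese string in ASCII or
    # fullwidth parentheses. The paper learns its patterns; this one is fixed.
    pattern = re.compile(re.escape(name) + r"\s*[((]\s*(" + HAN + r")\s*[))]")
    counts = {}
    for snippet in snippets:
        for match in pattern.finditer(snippet):
            candidate = match.group(1)
            counts[candidate] = counts.get(candidate, 0) + 1
    if not counts:
        return None
    return max(counts, key=counts.get)


def translate(name, snippets=()):
    """Try the lookup tables first, then fall back to snippet extraction."""
    for table in (WIKI_LINKS, PEOPLE_SEARCH):
        if name in table:
            return table[name]
    return translate_from_snippets(name, snippets)


if __name__ == "__main__":
    snippets = ["노무현(盧武鉉) 전 대통령", "노무현 (盧武鉉) 대통령은", "노무현 대통령"]
    print(translate("서울"))
    print(translate("노무현", snippets))
```

Ordering the lookups before the snippet step mirrors the abstract's description of the online databases as extending dictionary coverage, with pattern-based extraction as the second component; how the paper actually combines and ranks candidates is not stated in this record.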

Research questions

"Named entity (NE) translation plays an important role in many applications, such as information retrieval and machine translation. In this paper, we focus on translating NEs from Korean to Chinese in order to improve Korean–Chinese cross-language information retrieval (KCIR). The ideographic nature of Chinese makes NE translation difficult because one syllable may map to several Chinese characters. We propose a hybrid NE translation system."

Research details

Topics: Cross-language information retrieval
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Other
Theories: "Undetermined"
Research design: Experiment
Data source: Documents, Experiment responses, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Live Wikipedia
Wikipedia page type: Article
Wikipedia language: Chinese, English, Korean

Conclusion

"We found KCIR performance using this hybrid configuration over five times better than that [of] a dictionary-based configuration using only Naver people search. Mean average precision was as high as 0.3385 and recall reached 0.7578. Our method can handle Chinese, Japanese, Korean, and non-CJK NE translation and improve performance of KCIR substantially."
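
For readers unfamiliar with the two metrics quoted above, the following Python sketch shows the standard definitions of mean average precision (MAP) and recall over ranked retrieval results. The function names and the toy document IDs are illustrative assumptions; the record does not say how the paper aggregated recall across queries, so the per-query form below is only one common convention.

```python
def average_precision(ranked, relevant):
    """Average of precision@k taken at each rank k where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    # Dividing by all relevant documents penalizes those never retrieved.
    return total / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_list, relevant_set) pairs, one per query."""
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


def recall(ranked, relevant):
    """Fraction of the relevant documents that appear anywhere in the ranked list."""
    if not relevant:
        return 0.0
    return len(set(ranked) & set(relevant)) / len(relevant)


if __name__ == "__main__":
    # Toy data: two queries with made-up document IDs.
    runs = [
        (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
        (["d5", "d6"], {"d6", "d7"}),
    ]
    print(mean_average_precision(runs))
    print(recall(*runs[1]))
```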

Comments

"We found KCIR performance using this hybrid configuration over five times better than that [of] a dictionary-based configuration using only Naver people search."


Further notes

Facts about "Web-based pattern learning for named entity translation in Korean-Chinese cross-language information retrieval"
Added by wikilit team: Added on initial load
Collected data time dimension: Cross-sectional
Data source: Documents, Experiment responses, Wikipedia pages
Doi: 10.1016/j.eswa.2008.02.067
Google scholar url: http://scholar.google.com/scholar?ie=UTF-8&q=%22Web-based%2Bpattern%2Blearning%2Bfor%2Bnamed%2Bentity%2Btranslation%2Bin%2BKorean-Chinese%2Bcross-language%2Binformation%2Bretrieval%22
Has author: Yu-Chun Wang, Richard Tzong-Han Tsai, Wen-Lian Hsu
Has domain: Computer science
Has topic: Cross-language information retrieval
Pages: 3990-3995
Peer reviewed: Yes
Publication type: Journal article
Published in: Expert Systems with Applications
Research design: Experiment
Revid: 11,039
Theories: Undetermined
Theory type: Design and action
Title: Web-based pattern learning for named entity translation in Korean-Chinese cross-language information retrieval
Unit of analysis: Article
Url: http://dx.doi.org/10.1016/j.eswa.2008.02.067
Volume: 36
Wikipedia coverage: Other
Wikipedia data extraction: Live Wikipedia
Wikipedia language: Chinese, English, Korean
Wikipedia page type: Article
Year: 2009