Japanese-Chinese information retrieval with an iterative weighting scheme

Authors: Chu-Cheng Lin, Yu-Chun Wang, Richard Tzong-Han Tsai
Citation: Journal of Information Science and Engineering 26 (2): 685-697. 2010. Institute of Information Science, Academia Sinica.
Publication type: Journal article
Peer-reviewed: Yes
Added by Wikilit team: Added on initial load
Japanese-Chinese information retrieval with an iterative weighting scheme is a publication by Chu-Cheng Lin, Yu-Chun Wang, and Richard Tzong-Han Tsai.


Abstract

"This paper describes our Japanese-Chinese cross language information retrieval system. We adopt “query-translation” approach and employ both a conventional Japanese-Chinese bilingual dictionary and Wikipedia to translate query terms. We propose that Wikipedia can be regarded as a good dictionary for named entity translation. According to the nature of Japanese writing system we propose that query terms should be processed differently based on their written forms. We use an iterative method for weight-tuning and term disambiguation which is based on the PageRank algorithm. When evaluating on the NTCIR-5 test set our system achieves as high as 0.2217 and 0.2276 in relax MAP (Mean Average Precision) measurement of T-runs and D-runs."
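
The abstract describes the weighting scheme only as iterative and based on PageRank. As a rough illustration (not the authors' implementation), the sketch below runs a generic PageRank power iteration over a small graph of candidate translations. The function name `pagerank_weights`, the `affinity` edge scores, the damping factor of 0.85, and the toy candidate labels are all assumptions introduced here; the paper's actual weight function is not reproduced.

```python
def pagerank_weights(candidates, affinity, damping=0.85, iterations=50):
    """Generic PageRank power iteration over candidate translations.

    candidates: list of candidate translation labels (graph nodes).
    affinity:   dict mapping (a, b) -> non-negative score for edge a -> b
                (hypothetical, e.g. some co-occurrence-style score).
    Returns a dict candidate -> weight; the weights sum to 1.
    """
    n = len(candidates)
    weights = {c: 1.0 / n for c in candidates}
    # Total outgoing affinity per node, used to normalise edge scores.
    out_total = {
        a: sum(affinity.get((a, b), 0.0) for b in candidates if b != a)
        for a in candidates
    }
    for _ in range(iterations):
        # Mass held by nodes without outgoing edges is spread uniformly.
        dangling = sum(weights[a] for a in candidates if out_total[a] == 0.0)
        new_weights = {}
        for b in candidates:
            incoming = sum(
                weights[a] * affinity.get((a, b), 0.0) / out_total[a]
                for a in candidates
                if a != b and out_total[a] > 0.0
            )
            new_weights[b] = (1.0 - damping) / n + damping * (incoming + dangling / n)
        weights = new_weights
    return weights


# Toy example: two candidates for one query term (t1_a, t1_b) and one
# candidate for another term (t2_a). All scores are made up.
candidates = ["t1_a", "t1_b", "t2_a"]
affinity = {
    ("t1_a", "t2_a"): 3.0, ("t2_a", "t1_a"): 3.0,
    ("t1_b", "t2_a"): 1.0, ("t2_a", "t1_b"): 1.0,
}
weights = pagerank_weights(candidates, affinity)
best_for_t1 = max(["t1_a", "t1_b"], key=weights.get)
```

In this toy graph the candidate with the stronger link to the other term's translation ends up with the larger weight, which is the intuition behind using such a scheme for term disambiguation.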

Research questions

"This paper describes our Japanese-Chinese cross language information retrieval system. We adopt “query-translation” approach and employ both a conventional Japanese-Chinese bilingual dictionary and Wikipedia to translate query terms. We propose that Wikipedia can be regarded as a good dictionary for named entity translation. According to the nature of Japanese writing system, we propose that query terms should be processed differently based on their written forms. We use an iterative method for weight-tuning and term disambiguation, which is based on the PageRank algorithm."

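The idea of processing query terms according to their written form can be illustrated by classifying each term by script. The sketch below is only an assumption about how such a check might look: it uses the Unicode Hiragana, Katakana, and CJK Unified Ideographs blocks, and the helper names `script_of` and `written_form` are hypothetical. How the paper routes each written form to a translation resource is not specified in this record.

```python
def script_of(ch):
    """Return the script of a single character, based on its Unicode block."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:   # includes the long-vowel mark U+30FC
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:   # CJK Unified Ideographs (basic block only)
        return "kanji"
    return "other"


def written_form(term):
    """Classify a term as kanji, katakana, hiragana, other, or mixed."""
    if not term:
        return "other"
    scripts = {script_of(ch) for ch in term}
    return scripts.pop() if len(scripts) == 1 else "mixed"


forms = {t: written_form(t) for t in ["東京", "コンピューター", "ひらがな", "食べる"]}
```

A system along these lines could then send Katakana terms (typically transcribed foreign words) and Kanji terms through different translation steps; that routing is left out here because it would be guesswork.
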
Research details

Topics: Cross-language information retrieval
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Sample data
Theories: "Undetermined"
Research design: Mathematical modeling
Data source: Documents, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Live Wikipedia
Wikipedia page type: Article
Wikipedia language: Chinese, Japanese

Conclusion

"We exploited the nature of Japanese vocabulary and the Japanese writing system for better translations. Using Kanji for translation yields significant improvements in our evaluation. The results of the evaluation confirm that foreign terms are widely transcribed in Katakana. To cope with ambiguity, we have adopted an iterative disambiguating scheme. The current implementation of this scheme, which uses the likelihood function as its weight function, proved to be effective in the evaluation. Our system has MAP as high as 0.2276, and outperforms the previous NTCIR-5 CLIR Japanese-Chinese T-runs’ best rigid MAP by 111%, and D-runs’ by 19%."

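The MAP figures quoted above follow the standard Mean Average Precision measure. As a reminder of how that measure is computed, here is a minimal sketch using the usual textbook definition; the function names and the toy document IDs are assumptions, and which documents count as relevant (for example under rigid or relaxed judgments) is decided before calling it.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked result list against a relevant set."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy data: two topics with made-up document IDs.
runs = [
    (["d1", "d2", "d3"], {"d1", "d3"}),
    (["d2", "d1"], {"d1"}),
]
map_score = mean_average_precision(runs)
```
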

Facts about "Japanese-Chinese information retrieval with an iterative weighting scheme"
Added by wikilit team: Added on initial load
Collected data time dimension: Cross-sectional
Data source: Documents, Wikipedia pages
Google Scholar URL: http://scholar.google.com/scholar?ie=UTF-8&q=%22Japanese-Chinese%2Binformation%2Bretrieval%2Bwith%2Ban%2Biterative%2Bweighting%2Bscheme%22
Has author: Chu-Cheng Lin, Yu-Chun Wang, Richard Tzong-Han Tsai
Has domain: Computer science
Has topic: Cross-language information retrieval
Issue: 2
Pages: 685-697
Peer reviewed: Yes
Publication type: Journal article
Published in: Journal of Information Science and Engineering
Publisher: Institute of Information Science, Academia Sinica
Research design: Mathematical modeling
Revid: 11,191
Theories: Undetermined
Theory type: Design and action
Title: Japanese-Chinese information retrieval with an iterative weighting scheme
Unit of analysis: Article
URL: http://cat.inist.fr/?aModele=afficheN&cpsidt=22521034
Volume: 26
Wikipedia coverage: Sample data
Wikipedia data extraction: Live Wikipedia
Wikipedia language: Chinese, Japanese
Wikipedia page type: Article
Year: 2010