Bridging domains using world wide knowledge for transfer learning

From WikiLit
Authors: Evan Wei Xiang, Bin Cao, Derek Hao Hu, Qiang Yang
Citation: IEEE Transactions on Knowledge and Data Engineering 22 (6): 770-783. 2010.
Publication type: Journal article
Peer-reviewed: Yes
DOI: 10.1109/TKDE.2010.31.
Added by Wikilit team: Added on initial load
Bridging domains using world wide knowledge for transfer learning is a publication by Evan Wei Xiang, Bin Cao, Derek Hao Hu, and Qiang Yang.


Abstract

A major problem of classification learning is the lack of ground-truth labeled data. It is usually expensive to label new data instances for training a model. To solve this problem, domain adaptation in transfer learning has been proposed to classify target domain data by using some other source domain data, even when the data may have different distributions. However, domain adaptation may not work well when the differences between the source and target domains are large. In this paper, we design a novel transfer learning approach, called BIG (Bridging Information Gap), to effectively extract useful knowledge in a worldwide knowledge base, which is then used to link the source and target domains for improving the classification performance. BIG works when the source and target domains share the same feature space but different underlying data distributions. Using the auxiliary source data, we can extract a bridge that allows cross-domain text classification problems to be solved using standard semisupervised learning algorithms. A major contribution of our work is that with BIG, a large amount of worldwide knowledge can be easily adapted and used for learning in the target domain. We conduct experiments on several real-world cross-domain text classification tasks and demonstrate that our proposed approach can outperform several existing domain adaptation approaches significantly.
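
The bridging idea in the abstract can be illustrated with a toy sketch. This is not the BIG algorithm from the paper; it swaps in plain label propagation over a word-overlap graph as a stand-in for "standard semisupervised learning", and the documents, the names `propagate_labels`, `jaccard`, and `tokenize`, and the +1/-1 label encoding are all assumptions made for illustration. The point it tries to show is only that unlabeled auxiliary documents can connect source and target documents that share no vocabulary.

```python
def tokenize(text):
    """Lowercase a document and return its set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Word-set overlap in [0, 1]; used here as a hypothetical edge weight."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def propagate_labels(docs, seed_labels, n_iter=50):
    """Spread +1/-1 seed labels over a document similarity graph.

    docs: dict mapping document name -> text.
    seed_labels: dict mapping labeled document name -> +1 or -1.
    Returns a dict mapping every document name -> score in [-1, 1].
    """
    tokens = {name: tokenize(text) for name, text in docs.items()}
    names = list(docs)
    weights = {
        u: {v: jaccard(tokens[u], tokens[v]) for v in names if v != u}
        for u in names
    }
    scores = {name: float(seed_labels.get(name, 0.0)) for name in names}
    for _ in range(n_iter):
        updated = {}
        for u in names:
            if u in seed_labels:
                # Labeled (source) documents stay clamped to their labels.
                updated[u] = float(seed_labels[u])
                continue
            total = sum(weights[u].values())
            weighted = sum(w * scores[v] for v, w in weights[u].items())
            # Isolated documents receive no information and stay at 0.
            updated[u] = weighted / total if total else 0.0
        scores = updated
    return scores


# Toy data (invented): source and target share no words directly.
source = {"src_sports": "football match goal",
          "src_politics": "election vote parliament"}
target = {"tgt_a": "striker penalty referee",
          "tgt_b": "senator ballot campaign"}
# Unlabeled auxiliary documents standing in for a large knowledge base.
auxiliary = {"aux_1": "football striker goal penalty",
             "aux_2": "election ballot vote senator"}
labels = {"src_sports": 1, "src_politics": -1}

without_bridge = propagate_labels({**source, **target}, labels)
with_bridge = propagate_labels({**source, **target, **auxiliary}, labels)

for name in target:
    print(name, without_bridge[name], round(with_bridge[name], 3))
```

Without the auxiliary documents the target documents have no edges to any labeled document, so their scores stay at zero. With the auxiliary documents included, each target document picks up the sign of the source class it is indirectly connected to.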

Research questions

"In this paper, we design a novel transfer learning approach, called BIG (Bridging Information Gap), to effectively extract useful knowledge in a worldwide knowledge base, which is then used to link the source and target domains for improving the classification performance. BIG works when the source and target domains share the same feature space but different underlying data distributions. Using the auxiliary source data, we can extract a “bridge” that allows cross-domain text classification problems to be solved using standard semisupervised learning algorithms. A major contribution of our work is that with BIG, a large amount of worldwide knowledge can be easily adapted and used for learning in the target domain."

Research details

Topics: Text classification
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Sample data
Theories: "Undetermined"
Research design: Experiment, Mathematical modeling, Statistical analysis
Data source: Experiment responses, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Dump
Wikipedia page type: Article
Wikipedia language: Not specified

Conclusion

"By conducting experiments on different difficult domain adaptation tasks, we show that our algorithm can significantly outperform several existing domain adaptation approaches in situations when the source and target domains are far from each other. In each case, an auxiliary domain can be used to fill in the information gap efficiently. We make three major contributions in this paper. 1) Instead of the traditional instance-based or feature-based perspective to view the problem of domain adaptation, we view the problem from a new perspective, i.e., we consider the problem of transfer learning as one of filling in the information gap based on a large document corpus. We show that we can obtain useful information to bridge the source and the target domains from auxiliary data sources. 2) Instead of devising new models for tackling the domain adaptation problems, we show that we can successfully bridge the source and target domains using well developed semisupervised learning algorithms. 3) We propose a minmargin algorithm that can effectively identify and reduce the information gap between two domains."

Further notes

Facts about "Bridging domains using world wide knowledge for transfer learning"
Added by wikilit team: Added on initial load
Collected data time dimension: Cross-sectional
Data source: Experiment responses and Wikipedia pages
Doi: 10.1109/TKDE.2010.31
Google scholar url: http://scholar.google.com/scholar?ie=UTF-8&q=%22Bridging%2Bdomains%2Busing%2Bworld%2Bwide%2Bknowledge%2Bfor%2Btransfer%2Blearning%22
Has author: Evan Wei Xiang, Bin Cao, Derek Hao Hu and Qiang Yang
Has domain: Computer science
Has topic: Text classification
Issue: 6
Pages: 770-783
Peer reviewed: Yes
Publication type: Journal article
Published in: IEEE Transactions on Knowledge and Data Engineering
Research design: Experiment, Mathematical modeling and Statistical analysis
Revid: 10,688
Theories: Undetermined
Theory type: Design and action
Title: Bridging domains using world wide knowledge for transfer learning
Unit of analysis: Article
Url: http://dx.doi.org/10.1109/TKDE.2010.31
Volume: 22
Wikipedia coverage: Sample data
Wikipedia data extraction: Dump
Wikipedia language: Not specified
Wikipedia page type: Article
Year: 2010