Using Wikipedia to bootstrap open information extraction

Publication
Authors: Daniel S. Weld, Raphael Hoffmann, Fei Wu
Citation: ACM SIGMOD Record 37 (4): 62-68. 2009.
Publication type: Journal article
Peer-reviewed: Yes
Database(s):
DOI: 10.1145/1519103.1519113
Added by Wikilit team: Added on initial load
Using Wikipedia to bootstrap open information extraction is a publication by Daniel S. Weld, Raphael Hoffmann, Fei Wu.


Abstract

An abstract is not available.

Research questions

"this paper presents Kylin as a case study of open IE. We start by describing Kylin’s use of Wikipedia to power the self-supervised training of information extractors. Then, in Section 3 we show how Wikipedia training can be seen as a bootstrapping method enabling extraction from the wider set of general Web pages. Not even the best machine-learning algorithms have production-level precision"

Research details

Topics: Information extraction
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Other
Theories: "Undetermined"
Research design: Experiment
Data source:
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Clone
Wikipedia page type: Article
Wikipedia language: Not specified

Conclusion

"This paper describes Kylin, which uses self-supervised learning to train relationally-targeted extractors from Wikipedia infoboxes. We explained how shrinkage and retraining allow Kylin to improve extractor robustness, and we demonstrate that these extractors can successfully mine tuples from a broader set of Web pages. Finally, we argued that the best way to utilize human efforts is by inviting humans to quickly validate the correctness of machine-generated extractions."

Comments

"We advocate an alternative approach: using Wikipedia to generate relation-specific training data for a broad set of thousands of relations."


Further notes

Facts about "Using Wikipedia to bootstrap open information extraction"
Abstract: An abstract is not available.
Added by wikilit team: Added on initial load
Collected data time dimension: Cross-sectional
Comments: We advocate an alternative approach: using Wikipedia to generate relation-specific training data for a broad set of thousands of relations.
Conclusion: This paper describes Kylin, which uses self-supervised learning to train relationally-targeted extractors from Wikipedia infoboxes. We explained how shrinkage and retraining allow Kylin to improve extractor robustness, and we demonstrate that these extractors can successfully mine tuples from a broader set of Web pages. Finally, we argued that the best way to utilize human efforts is by inviting humans to quickly validate the correctness of machine-generated extractions.
Doi: 10.1145/1519103.1519113
Google scholar url: http://scholar.google.com/scholar?ie=UTF-8&q=%22Using%2BWikipedia%2Bto%2Bbootstrap%2Bopen%2Binformation%2Bextraction%22
Has author: Daniel S. Weld, Raphael Hoffmann and Fei Wu
Has domain: Computer science
Has topic: Information extraction
Issue: 4
Pages: 62-68
Peer reviewed: Yes
Publication type: Journal article
Published in: ACM SIGMOD Record
Research design: Experiment
Research questions: this paper presents Kylin as a case study of open IE. We start by describing Kylin’s use of Wikipedia to power the self-supervised training of information extractors. Then, in Section 3 we show how Wikipedia training can be seen as a bootstrapping method enabling extraction from the wider set of general Web pages. Not even the best machine-learning algorithms have production-level precision
Revid: 9,923
Theories: Undetermined
Theory type: Design and action
Title: Using Wikipedia to bootstrap open information extraction
Unit of analysis: Article
Url: http://0-portal.acm.org.mercury.concordia.ca/citation.cfm?id=1519103.1519113&coll=DL&dl=GUIDE&CFID=112057072&CFTOKEN=12571171&preflayout=flat
Volume: 37
Wikipedia coverage: Other
Wikipedia data extraction: Clone
Wikipedia language: Not specified
Wikipedia page type: Article
Year: 2009