The YAGO-NAGA approach to knowledge discovery

From WikiLit
Revision as of 16:30, February 6, 2014 by Ochado (Talk | contribs) (Changed action research to design science)

Authors: Gjergji Kasneci, Maya Ramanath, Fabian M. Suchanek, Gerhard Weikum
Citation: SIGMOD Record 37 (4): 41-47. 2008.
Publication type: Journal article
Peer-reviewed: Yes
DOI: 10.1145/1519103.1519110
Added by Wikilit team: Added on initial load
The YAGO-NAGA approach to knowledge discovery is a publication by Gjergji Kasneci, Maya Ramanath, Fabian M. Suchanek, and Gerhard Weikum.


Abstract

This paper gives an overview on the YAGO-NAGA approach to information extraction for building a conveniently searchable, large-scale, highly accurate knowledge base of common facts. YAGO harvests infoboxes and category names of Wikipedia for facts about individual entities, and it reconciles these with the taxonomic backbone of WordNet in order to ensure that all entities have proper classes and the class system is consistent. Currently, the YAGO knowledge base contains about 19 million instances of binary relations for about 1.95 million entities. Based on intensive sampling, its accuracy is estimated to be above 95 percent. The paper presents the architecture of the YAGO extractor toolkit, its distinctive approach to consistency checking, its provisions for maintenance and further growth, and the query engine for YAGO, coined NAGA. It also discusses ongoing work on extensions towards integrating fact candidates extracted from natural-language text sources.

Research questions

"This paper gives an overview on the YAGO-NAGA approach to information extraction for building a conveniently searchable, large-scale, highly accurate knowledge base of common facts. The paper presents the architecture of the YAGO extractor toolkit, its distinctive approach to consistency checking, its provisions for maintenance and further growth, and the query engine for YAGO, coined NAGA. It also discusses ongoing work on extensions towards integrating fact candidates extracted from natural-language text sources."

Research details

Topics: Information extraction
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Other
Theories: "Undetermined"
Research design: Design science
Data source: Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Live Wikipedia
Wikipedia page type: Other
Wikipedia language: Not specified

Conclusion

"The YAGO knowledge base represents all facts in the form of unary and binary relations: classes of individual entities, and pairs of entities connected by specific relationship types. This data model can be seen as a typed graph with entities and classes corresponding to nodes and relations corresponding to edges. It can also be interpreted as a collection of RDF triples with two adjacent nodes and their connecting edge denoting a (subject, predicate, object) triple."

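The data model described in the quotation above can be illustrated with a minimal sketch. It stores each fact as a (subject, predicate, object) triple and indexes the triples as a labeled graph. The entity names, the relation names (`type`, `subClassOf`, `bornIn`), and the helpers `build_graph` and `neighbors` are assumptions made for illustration; they are not taken from the paper and do not reflect the actual YAGO storage format or the NAGA query engine.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact as a (subject, predicate, object) triple."""
    subject: str
    predicate: str
    object: str


# Illustrative facts only; the names are assumptions, not YAGO data.
FACTS = [
    Triple("Max_Planck", "type", "physicist"),       # class membership of an entity
    Triple("physicist", "subClassOf", "scientist"),  # taxonomy edge between classes
    Triple("Max_Planck", "bornIn", "Kiel"),          # binary relation between entities
]


def build_graph(facts):
    """Index triples as a labeled graph: node -> list of (edge label, neighbor)."""
    graph = defaultdict(list)
    for fact in facts:
        graph[fact.subject].append((fact.predicate, fact.object))
    return graph


def neighbors(graph, node, predicate):
    """Return all nodes reachable from `node` over edges labeled `predicate`."""
    return [obj for pred, obj in graph.get(node, []) if pred == predicate]


graph = build_graph(FACTS)
```

Under this reading, entities and classes are both plain nodes, and the predicate of each triple becomes the label of the edge between its two adjacent nodes, so the graph view and the triple view hold the same information.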
Comments

"YAGO harvests infoboxes and category names of Wikipedia for facts about individual entities, and it reconciles these with the taxonomic backbone of WordNet in order to ensure that all entities have proper classes and the class system is consistent."

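The two goals named in the comment above, that every entity has a class and that the class system is consistent, can be read as two simple checks over the extracted data. The sketch below is one possible interpretation, not the consistency checking described in the paper: it assumes "proper classes" means at least one assigned class per entity and "consistent" means the subclass hierarchy contains no cycle. The function names and the input shapes are hypothetical.

```python
def untyped_entities(entities, entity_classes):
    """Return, sorted, the entities that have no class assigned.

    `entity_classes` maps an entity name to a list of class names (assumed shape).
    """
    return sorted(e for e in entities if not entity_classes.get(e))


def has_cycle(subclass_of):
    """Return True if the class -> superclasses mapping contains a cycle.

    Uses a depth-first search with three node states: unvisited (WHITE),
    on the current search path (GRAY), and fully explored (BLACK).
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GRAY
        for parent in subclass_of.get(node, ()):
            state = color.get(parent, WHITE)
            if state == GRAY:
                return True  # reached a class already on the current path
            if state == WHITE and visit(parent):
                return True
        color[node] = BLACK
        return False

    return any(color.get(c, WHITE) == WHITE and visit(c) for c in list(subclass_of))
```

For example, `has_cycle({"physicist": ["scientist"], "scientist": ["person"]})` reports no cycle, while a mapping in which two classes name each other as superclasses would be flagged.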

Further notes

Facts about "The YAGO-NAGA approach to knowledge discovery"
Added by WikiLit team: Added on initial load
Collected data time dimension: Cross-sectional
Data source: Wikipedia pages
DOI: 10.1145/1519103.1519110
Google Scholar URL: http://scholar.google.com/scholar?ie=UTF-8&q=%22The%2BYAGO-NAGA%2Bapproach%2Bto%2Bknowledge%2Bdiscovery%22
Has author: Gjergji Kasneci, Maya Ramanath, Fabian M. Suchanek, and Gerhard Weikum
Has domain: Computer science
Has topic: Information extraction
Issue: 4
Pages: 41-47
Peer reviewed: Yes
Publication type: Journal article
Published in: SIGMOD Record
Research design: Design science
Revid: 11,152
Theories: Undetermined
Theory type: Design and action
Title: The YAGO-NAGA approach to knowledge discovery
Unit of analysis: Article
URL: http://dx.doi.org/10.1145/1519103.1519110
Volume: 37
Wikipedia coverage: Other
Wikipedia data extraction: Live Wikipedia
Wikipedia language: Not specified
Wikipedia page type: Other
Year: 2008