Last modified on January 30, 2014, at 20:34

YAGO: a core of semantic knowledge

Publication (help)
YAGO: a core of semantic knowledge
Authors: Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum [edit item]
Citation: WWW '07 Proceedings of the 16th international conference on World Wide Web  : 697-706. 2007.
Publication type: Conference paper
Peer-reviewed: Yes
Database(s):
DOI: 10.1145/1242572.1242667.
Google Scholar cites: Citations
Link(s): Paper link
Added by Wikilit team: Added on initial load
Search
Article: Google Scholar BASE PubMed
Other scholarly wikis: AcaWiki Brede Wiki WikiPapers
Web search: Bing Google Yahoo!Google PDF
Other:
Services
Format: BibTeX
YAGO: a core of semantic knowledge is a publication by Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum.


[edit] Abstract

We present YAGO, a light-weight and extensible ontology with high coverage and quality. YAGO builds on entities and relations and currently contains more than 1 million entities and 5 million facts. This includes the Is-A hierarchy as well as non-taxonomic relations between entities (such as HASONEPRIZE). The facts have been automatically extracted from Wikipedia and unified with WordNet, using a carefully designed combination of rule-based and heuristic methods described in this paper. The resulting knowledge base is a major step beyond WordNet: in quality by adding knowledge about individuals like persons, organizations, products, etc. with their semantic relationships - and in quantity by increasing the number of facts by more than an order of magnitude. Our empirical evaluation of fact correctness shows an accuracy of about 95%. YAGO is based on a logically clean model, which is decidable, extensible, and compatible with RDFS. Finally, we show how YAGO can be further extended by state-of-the-art information extraction techniques.

[edit] Research questions

"We present YAGO, a light-weight and extensible ontology with high coverage and quality. YAGO builds on entities and relations and currently contains more than 1 million entities and 5 million facts. This includes the Is-A hierarchy as well as non-taxonomic relations between entities (such as HASONEPRIZE). The facts have been automatically extracted from Wikipedia and unified with WordNet, using a carefully designed combination of rule-based and heuristic methods described in this paper."

Research details

Topics: Semantic relatedness, Ontology building [edit item]
Domains: Computer science [edit item]
Theory type: Design and action [edit item]
Wikipedia coverage: Main topic [edit item]
Theories: "Undetermined" [edit item]
Research design: Experiment [edit item]
Data source: Experiment responses, Websites, Wikipedia pages [edit item]
Collected data time dimension: Cross-sectional [edit item]
Unit of analysis: Article [edit item]
Wikipedia data extraction: Dump [edit item]
Wikipedia page type: Article [edit item]
Wikipedia language: English [edit item]

[edit] Conclusion

"The resulting knowledge base is a major step beyond WordNet: in quality by adding knowledge about individuals like persons, organizations, products, etc. with their semantic relationships - and in quantity by increasing the number of facts by more than an order of magnitude. Our empirical evaluation of fact correctness shows an accuracy of about 95%. YAGO is based on a logically clean model, which is decidable, extensible, and compatible with RDFS. Finally, we show how YAGO can be further extended by state-of-the-art information extraction techniques."

[edit] Comments

""Our empirical evaluation of fact correctness shows an accuracy of about 95%" p. 697"


Further notes[edit]