Large-scale named entity disambiguation based on Wikipedia data
Abstract This paper presents a large-scale system for the recognition and semantic disambiguation of named entities based on information extracted from a large encyclopedic collection and Web search results. It describes in detail the disambiguation paradigm employed and the information extraction process from Wikipedia. Through a process of maximizing the agreement between the contextual information extracted from Wikipedia and the context of a document, as well as the agreement among the category tags associated with the candidate entities, the implemented system shows high disambiguation accuracy on both news stories and Wikipedia articles.
Added by wikilit team Yes  +
Collected data time dimension Cross-sectional  +
Conclusion We presented a large scale named entity disambiguation system that employs a huge amount of information automatically extracted from Wikipedia over a space of more than 1.4 million entities. In tests on both real news data and Wikipedia text, the system obtained accuracies exceeding 91% and 88%.
Conference location Prague +
Data source Experiment responses  + , Wikipedia pages  +
Google scholar url http://scholar.google.com/scholar?ie=UTF-8&q=%22Large-scale%2Bnamed%2Bentity%2Bdisambiguation%2Bbased%2Bon%2BWikipedia%2Bdata%22  +
Has author Silviu Cucerzan +
Has domain Computer science +
Has topic Information extraction + , Other natural language processing topics +
Pages 708-716  +
Peer reviewed Yes  +
Publication type Conference paper  +
Published in Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning +
Publisher Association for Computational Linguistics +
Research design Experiment  +
Research questions The system discussed in this paper performs both named entity identification and disambiguation. ... The disambiguation component, which constitutes the main focus of the paper, employs a vast amount of contextual and category information automatically extracted from Wikipedia .... We augment the Wikipedia category information with information automatically extracted from Wikipedia list pages and use it in conjunction with the context information in a vectorial model that employs a novel disambiguation method.
Revid 10,845  +
Theories Undetermined
Theory type Design and action  +
Title Large-scale named entity disambiguation based on Wikipedia data
Unit of analysis Article  +
Url http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.75.1582  +
Wikipedia coverage Sample data  +
Wikipedia data extraction Live Wikipedia  +
Wikipedia language English  +
Wikipedia page type Article  +
Year 2007  +
Creation date 16 October 2012 16:49:05  +
Categories Information extraction  + , Other natural language processing topics  + , Computer science  + , Publications with missing comments  + , Publications  +
Modification date 30 January 2014 20:29:22  +
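
Illustration The agreement-maximization idea summarized in the abstract can be sketched roughly as follows. This is a toy sketch under stated assumptions, not the paper's implementation: the knowledge base, the candidate lists, the function name `disambiguate`, and the unweighted scoring are all hypothetical and invented for illustration. Each candidate entity is scored by how many of its context terms occur in the document, plus how many of its category tags are shared with the other candidates pooled over the whole document.

```python
from collections import Counter

# Hypothetical toy knowledge base (invented for illustration):
# candidate entity -> (context terms, category tags).
KB = {
    "Columbia (Space Shuttle)": ({"nasa", "shuttle", "mission"}, {"spacecraft"}),
    "Columbia University": ({"campus", "students", "new york"}, {"universities"}),
    "Discovery (Space Shuttle)": ({"nasa", "shuttle", "launch"}, {"spacecraft"}),
    "Discovery Channel": ({"television", "network"}, {"tv channels"}),
}

# Hypothetical mapping from surface forms to candidate entities.
CANDIDATES = {
    "Columbia": ["Columbia (Space Shuttle)", "Columbia University"],
    "Discovery": ["Discovery (Space Shuttle)", "Discovery Channel"],
}


def disambiguate(doc_terms, surface_forms):
    """Pick, for each surface form, the candidate with the highest agreement score."""
    # Pool the category tags of every candidate of every surface form.
    pooled = Counter()
    for form in surface_forms:
        for entity in CANDIDATES[form]:
            pooled.update(KB[entity][1])

    def score(entity):
        contexts, categories = KB[entity]
        # Agreement between the entity's context terms and the document.
        context_agreement = len(contexts & doc_terms)
        # Agreement with the other candidates' categories; subtract 1 per tag
        # so an entity does not count its own contribution.
        category_agreement = sum(pooled[c] - 1 for c in categories)
        return context_agreement + category_agreement

    return {form: max(CANDIDATES[form], key=score) for form in surface_forms}


result = disambiguate({"nasa", "launch", "shuttle"}, ["Columbia", "Discovery"])
print(result)
```

In this toy document both surface forms resolve to the space shuttle readings, because those candidates match the document terms and also share the "spacecraft" tag with each other. A realistic system would need weighting and normalization of the two agreement terms, which this sketch omits.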