Last modified on January 30, 2014, at 20:27

Extracting lexical semantic knowledge from Wikipedia and Wiktionary

Authors: Torsten Zesch, Christof Müller, Iryna Gurevych
Citation: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08). 2008.
Publication type: Conference paper
Peer-reviewed: Yes
Added by Wikilit team: Added on initial load
Extracting lexical semantic knowledge from Wikipedia and Wiktionary is a publication by Torsten Zesch, Christof Müller, and Iryna Gurevych.


Abstract

Recently, collaboratively constructed resources such as Wikipedia and Wiktionary have been discovered as valuable lexical semantic knowledge bases with a high potential in diverse Natural Language Processing (NLP) tasks. Collaborative knowledge bases however significantly differ from traditional linguistic knowledge bases in various respects, and this constitutes both an asset and an impediment for research in NLP. This paper addresses one such major impediment, namely the lack of suitable programmatic access mechanisms to the knowledge stored in these large semantic knowledge bases. We present two application programming interfaces for Wikipedia and Wiktionary which are especially designed for mining the rich lexical semantic information dispersed in the knowledge bases, and provide efficient and structured access to the available knowledge. As we believe them to be of general interest to the NLP community, we have made them freely available for research purposes.

Research questions

"This paper addresses one such major impediment, namely the lack of suitable programmatic access mechanisms to the knowledge stored in these large semantic knowledge bases. We present two application programming interfaces for Wikipedia and Wiktionary which are especially designed for mining the rich lexical semantic information dispersed in the knowledge bases, and provide efficient and structured access to the available knowledge."

Research details

Topics: Computational linguistics
Domains: Computer science
Theory type: Design and action
Wikipedia coverage: Main topic
Theories: "Undetermined"
Research design: Other
Data source: Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Article
Wikipedia data extraction: Dump
Wikipedia page type: Article
Wikipedia language: English, German

Conclusion

"This paper presented Java based APIs that allow for efficient access to Wikipedia and Wiktionary, and demonstrated cases of their usage. As the APIs are freely available for research purposes, we think that they will foster NLP research using the collaborative knowledge bases Wikipedia and Wiktionary"

Comments

"Research design: design science"


Further notes