|Codifying collaborative knowledge: using Wikipedia as a basis for automated ontology learning|
|Authors:||Tao Guo, David G. Schwartz, Frada Burstein, Henry Linger|
|Citation:||Knowledge Management Research & Practice 7 (3): 206-17. 2009 September.|
|Publication type:||Journal article|
|Added by Wikilit team:||Added on initial load|
In the context of knowledge management, ontology construction can be considered as a part of capturing of the body of knowledge of a particular problem domain. Traditionally, ontology construction assumes a tedious codification of the domain expert's knowledge. In this paper, we describe a new approach to ontology engineering that has the potential of bridging the dichotomy between codification and collaboration turning to Web 2.0 technology. We propose to shift the primary source of ontology knowledge from the expert to socially emergent bodies of knowledge such as Wikipedia. Using Wikipedia as an example, we demonstrate how core terms and relationships of a domain ontology can be distilled from this socially constructed source. As an illustration, we describe how our approach achieved over 90% conceptual coverage compared with Gold standard hand-crafted ontologies, such as Cyc. What emerges is not a folksonomy, but rather a formal ontology that has nonetheless found its roots in social knowledge.
"In this paper, we argue that traditional approaches to ontology construction that rely on expert input and published documentation are inconsistent with the dynamic needs to enable situated action. Such approaches require significant effort to address multiple practices and different perspectives of the work domain and often fail in supporting knowledge sharing in time and space. We study an alternative ontology learning technique, which should be more efficient, sufficiently accurate and workable from an engineering perspective. We propose an innovative approach to ontology engineering that has the potential of bridging the traditional dichotomy between codification and collaboration through creative use of the knowledge management technology of Web 2.0. By shifting the primary source of knowledge from the expert to socially emergent bodies of knowledge created as a result of Web 2.0 development, we have identified the potential of using collaborative knowledge, rather than brittle expert knowledge, as the basis for ontology construction."
|Domains:||Information science, Knowledge management|
|Theory type:||Design and action|
|Wikipedia coverage:||Sample data|
|Research design:||Design science, Experiment|
|Data source:||Experiment responses, Wikipedia pages|
|Collected data time dimension:||Cross-sectional|
|Unit of analysis:||Category|
|Wikipedia data extraction:||Live Wikipedia|
|Wikipedia page type:||Article, Information categorization and navigation|
|Wikipedia language:||Not specified|
"We have proposed and illustrated an application of a semiautomatic approach to collaborative ontology learning that shows promising results when compared to two Gold standard hand-crafted ontologies with over 90% CC reached in 1-h effort by a non-expert. Our emerging ability to incorporate such knowledge in ontologies as the basis for knowledge management tools will result in richer, more precise, and more relevant knowledge codification, in an ever-changing world in which access to social knowledge plays an increasingly important role. As we advance testing of the ontology learning component, we expect that the impact on a broader engineering methodology will be substantial, and yet, much more work is needed in this area. Using additional meta-knowledge characteristics of the collaborative corpus as provided by the Wikipedia, API also opens up a number of interesting directions as mentioned above."
"Research design" should be "design science". There is a small evaluation with a comparison against WordNet and Cyc, so one could argue that "Research design" should also include "experiment".
"Wikipedia language" could be set to "English". This is implicit given the words they are using.
"Wikipedia page type" should also include "Information categorization and navigation". These pages are downloaded.
"Unit of analysis" is mostly "category", although pages are also used in their system.