An evaluation of medical knowledge contained in Wikipedia and its use in the LOINC database

Publication
Authors: Jeff Friedlin, Clement J McDonald
Citation: Journal of the American Medical Informatics Association 17 (3): 283-287. 2010 May.
Publication type: Journal article
Peer-reviewed: Yes
DOI: 10.1136/jamia.2009.001180
An evaluation of medical knowledge contained in Wikipedia and its use in the LOINC database is a publication by Jeff Friedlin and Clement J McDonald.

Abstract

The logical observation identifiers names and codes (LOINC) database contains 55 000 terms consisting of more atomic components called parts. LOINC carries more than 18 000 distinct parts. It is necessary to have definitions/descriptions for each of these parts to assist users in mapping local laboratory codes to LOINC. It is believed that much of this information can be obtained from the internet; the first effort was with Wikipedia. This project focused on 1705 laboratory analytes (the first part in the LOINC laboratory name). Of the 1705 parts queried, 1314 matching articles were found in Wikipedia. Of these, 1299 (98.9%) were perfect matches that exactly described the LOINC part, 15 (1.14%) were partial matches (the description in Wikipedia was related to the LOINC part, but did not describe it fully), and 102 (7.76%) were mis-matches. The current release of RELMA and LOINC include Wikipedia descriptions of LOINC parts obtained as a direct result of this project.
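
The percentages in the abstract can be checked with a few lines of arithmetic. The sketch below is not part of the publication; it assumes that each reported percentage is taken relative to the 1314 matched articles, and the helper name percent_of is made up for illustration. The counts are copied as reported.

```python
# Counts copied from the abstract as reported.
parts_queried = 1705
matched_articles = 1314
reported_counts = {
    "perfect matches": 1299,
    "partial matches": 15,
    "mis-matches": 102,
}


def percent_of(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# Assumption: every category is expressed as a share of the matched articles.
shares = {
    label: percent_of(count, matched_articles)
    for label, count in reported_counts.items()
}

# Share of queried parts for which a Wikipedia article was found.
coverage = percent_of(matched_articles, parts_queried)

for label, share in shares.items():
    print(f"{label}: {share}%")
print(f"articles found for queried parts: {coverage}%")
```

Under that assumption the computed shares are 98.86%, 1.14%, and 7.76%, which agree with the abstract's 98.9%, 1.14%, and 7.76% after rounding; articles were found for about 77% of the 1705 queried parts.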

Research questions

"The purpose of this study was twofold. First, we wished to evaluate the degree of medical knowledge contained in the online encyclopedia Wikipedia and the feasibility of using that knowledge as a means of adding description information to a laboratory and clinical observations database (LOINC). Second, we desired to test our software’s ability to automatically extract relevant information from Wikipedia based on queries generated from part names taken from the LOINC part database."

Research details

Topics: Other information retrieval topics
Domains: Computer science, Health
Theory type: Design and action
Wikipedia coverage: Sample data
Theories: "Undetermined"
Research design: Design science, Experiment
Data source: Archival records, Experiment responses, Wikipedia pages
Collected data time dimension: Cross-sectional
Unit of analysis: Subject
Wikipedia data extraction: Live Wikipedia
Wikipedia page type: Article
Wikipedia language: English

Conclusion

"We conclude that Wikipedia contains a surprisingly large amount of scientific and medical data and could effectively be used as an initial knowledge base for specific medical informatics and research projects. The software we developed to automate the matching of LOINC part names to Wikipedia articles performed satisfactorily with high sensitivity and moderate specificity. The current release of RELMA and LOINC include descriptions of LOINC parts obtained from Wikipedia as a direct result of this project."
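
The conclusion describes the matching software in terms of sensitivity and specificity. As a reminder of how these two measures are defined, here is a minimal sketch using the standard definitions. The function names and the counts in the usage example are hypothetical and are not taken from the publication, which this record does not quote in that detail.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of truly relevant articles that the matcher retrieved."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of truly irrelevant cases that the matcher rejected."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts, chosen only to illustrate the formulas.
example_sensitivity = sensitivity(true_positives=8, false_negatives=2)
example_specificity = specificity(true_negatives=3, false_positives=2)

print(f"sensitivity: {example_sensitivity:.2f}")
print(f"specificity: {example_specificity:.2f}")
```

With these made-up counts the sketch prints a sensitivity of 0.80 and a specificity of 0.60; the values reported in the publication itself are not reproduced here.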

Comments

"Wikipedia pages + LOINC database

"Research design" should also include "Design science". "Experiment" is perhaps also ok."

Further notes