Integrating terminologies into standard SQL: a new approach for research on routine data.

Bibliographic Details
Title: Integrating terminologies into standard SQL: a new approach for research on routine data.
Authors: Sander, André, Wauer, Roland
Source: Journal of Biomedical Semantics; 4/24/2019, Vol. 10 Issue 1, pN.PAG-N.PAG, 1p
Subject Terms: SQL, ELECTRONIC health records, TERMS & phrases, TRANSLATIONAL research
Abstract: Background: Most electronic medical records still contain large amounts of free-text data. Semantic evaluation of such data requires that the data be encoded with sufficient classifications or transformed into a knowledge-based database. Methods: We present an approach that allows databases accessible via SQL (Structured Query Language) to be searched directly through semantic queries without the need for further transformations. To this end, we developed I) an extension to SQL named Ontology-SQL (O-SQL) that allows the use of semantic expressions, II) a framework that uses a standard terminology server to annotate database tables containing free text, and III) a parser that rewrites O-SQL to SQL so that such queries can be passed to the database server. Results: I) We compared several semantic queries published to date and were able to reproduce them in a reduced, highly condensed form. II) The quality of the annotation process was measured against manual annotation, and we found a sensitivity of 97.62% and a specificity of 100.00%. III) Different semantic queries were analyzed and yielded F-scores between 0.91 and 0.98. Conclusions: We showed that systematic analysis of medical records containing free text is possible with standard tools. The seamless connection of ontologies and standard technologies from the database field represents an important constituent of unstructured data analysis. The developed technology can be readily applied to relationally organized data and supports the increasingly important field of translational research. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
ISSN: 2041-1480
DOI: 10.1186/s13326-019-0199-z
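The rewriting idea summarized in the abstract (a semantic condition turned into plain SQL over annotated tables) can be sketched roughly as follows. This is not the authors' O-SQL syntax, parser, or schema, none of which are given in this record. The table names (`report`, `annotation`, `is_a`), the concept codes, and the function names are all hypothetical, and SQLite stands in for whatever database server is actually used. The sketch only shows how "all rows annotated with a concept or any of its descendants" might be expressed in standard SQL.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with made-up reports and annotations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Free-text table (hypothetical).
        CREATE TABLE report (id INTEGER PRIMARY KEY, text TEXT);
        -- Annotations linking each report to a concept code (hypothetical).
        CREATE TABLE annotation (report_id INTEGER, concept TEXT);
        -- Direct child -> parent edges of a toy concept hierarchy.
        CREATE TABLE is_a (child TEXT, parent TEXT);

        INSERT INTO report VALUES
            (1, 'Pt with bacterial pneumonia'),
            (2, 'Fracture of left wrist'),
            (3, 'Chronic bronchitis, stable');
        INSERT INTO annotation VALUES
            (1, 'C_BACT_PNEUMONIA'),
            (2, 'C_WRIST_FRACTURE'),
            (3, 'C_BRONCHITIS');
        INSERT INTO is_a VALUES
            ('C_BACT_PNEUMONIA', 'C_PNEUMONIA'),
            ('C_PNEUMONIA', 'C_LUNG_DISEASE'),
            ('C_BRONCHITIS', 'C_LUNG_DISEASE'),
            ('C_WRIST_FRACTURE', 'C_INJURY');
    """)
    return conn


def rewrite_descendant_filter(table, key):
    """Return plain SQL selecting rows of `table` annotated with a concept
    or any of its descendants. The concept is bound to the single `?`
    placeholder. `table` and `key` are trusted identifiers in this sketch."""
    return f"""
        WITH RECURSIVE sub(concept) AS (
            SELECT ?
            UNION
            SELECT is_a.child
            FROM is_a JOIN sub ON is_a.parent = sub.concept
        )
        SELECT DISTINCT t.{key}
        FROM {table} AS t
        JOIN annotation AS a ON a.report_id = t.{key}
        JOIN sub ON sub.concept = a.concept
        ORDER BY t.{key}
    """


def find_reports(conn, concept):
    """Run the rewritten query and return the matching report ids."""
    sql = rewrite_descendant_filter("report", "id")
    return [row[0] for row in conn.execute(sql, (concept,))]


if __name__ == "__main__":
    conn = build_demo_db()
    print(find_reports(conn, "C_LUNG_DISEASE"))
```

In this toy data, a query for the hypothetical code `C_LUNG_DISEASE` matches the pneumonia and bronchitis reports even though neither text contains the words "lung disease". The recursive common table expression is one possible way to expand a concept into its descendants inside the database; a terminology server could equally supply the expanded code list up front.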