MedScan, a natural language processing engine for MEDLINE abstracts

Bibliographic details
Published in: Bioinformatics, Vol. 19, No. 13, pp. 1699 - 1706
Main authors: Novichkova, Svetlana; Egorov, Sergei; Daraselia, Nikolai
Format: Journal Article
Language: English
Publisher: Oxford, Oxford University Press, 01.09.2003
Oxford Publishing Limited (England)
ISSN:1367-4803, 1460-2059, 1367-4811
Description
Summary: Motivation: The importance of extracting biomedical information from scientific publications is well recognized. A number of information extraction systems for the biomedical domain have been reported, but none of them have become widely used in practical applications. Most proposals to date make rather simplistic assumptions about the syntactic aspect of natural language. There is an urgent need for a system that has broad coverage and performs well in real-text applications. Results: We present a general biomedical domain-oriented NLP engine called MedScan that efficiently processes sentences from MEDLINE abstracts and produces a set of regularized logical structures representing the meaning of each sentence. The engine utilizes a specially developed context-free grammar and lexicon. Preliminary evaluation of the system's performance, accuracy, and coverage exhibited encouraging results. Further approaches for increasing the coverage and reducing parsing ambiguity of the engine, as well as its application for information extraction are discussed. Availability: MedScan is available for commercial licensing from Ariadne Genomics, Inc.
Bibliography: Contact: nikolai@ariadnegenomics.com
istex:8DB0571D536A2ED0966F964D82780280F33996FE
ark:/67375/HXZ-1T1VBV4N-J
local:btg207
DOI:10.1093/bioinformatics/btg207