Bioinformatic Workflow Extraction from Scientific Texts based on Word Sense Disambiguation

Bibliographic details
Published in: IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 15, No. 6, pp. 1979-1990
Main authors: Halioui, Ahmed; Valtchev, Petko; Diallo, Abdoulaye Banire
Format: Journal Article
Language: English
Published: United States, IEEE, 1 November 2018
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1545-5963, 1557-9964
Description
Summary: This paper introduces a method for automatic workflow extraction from texts using Process-Oriented Case-Based Reasoning (POCBR). While current workflow management systems mostly implement different complicated graphical tasks based on advanced distributed solutions (e.g., cloud computing and grid computation), workflow knowledge acquisition from texts using case-based reasoning yields more expressive and semantic case representations. In this context, we propose an ontology-based workflow extraction framework to acquire processual knowledge from texts. Our methodology extends classic NLP techniques to extract and disambiguate complex tasks and relations in texts. Using a graph-based representation of workflows and a domain ontology, our extraction process uses a context-aware approach to recognize workflow components in texts: data and control flows. We applied our framework in a technical domain in bioinformatics, namely phylogenetic analyses. An evaluation based on workflow semantic similarities in a gold standard shows that our approach provides promising results in the process extraction domain. Both the data and the implementation of our framework are available at http://labo.bioinfo.uqam.ca/tgowler.
DOI: 10.1109/TCBB.2018.2847336