Does the magic of BERT apply to medical code assignment? A quantitative study

Published in: Computers in Biology and Medicine, Vol. 139, p. 104998
Main authors: Ji, Shaoxiong; Hölttä, Matti; Marttinen, Pekka
Format: Journal Article
Language: English
Published: United States: Elsevier Ltd, 01.12.2021
ISSN: 0010-4825, 1879-0534
Description
Summary: Unsupervised pretraining is an integral part of many natural language processing systems, and transfer learning with language models has achieved remarkable results in downstream tasks. In the clinical application of medical code assignment, diagnosis and procedure codes are inferred from lengthy clinical notes such as hospital discharge summaries. However, it is not clear whether pretrained models are useful for medical code prediction without further architecture engineering. This paper conducts a comprehensive quantitative analysis of the performance of various contextualized language models, pretrained in different domains, for medical code assignment from clinical notes. We propose a hierarchical fine-tuning architecture to capture interactions between distant words and adopt label-wise attention to exploit label information. Contrary to current trends, we demonstrate that a carefully trained classical CNN outperforms attention-based models on a MIMIC-III subset with frequent codes. Our empirical findings suggest directions for building robust medical code assignment models.
Highlights:
• A study on knowledge transfer via mixed-domain and task-adaptive language model pretraining.
• A thorough comparative study to answer the research questions.
• A hierarchical BERT architecture with a label attention mechanism.
• A vanilla CNN model with appropriate training can improve predictive performance.
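
The label-wise attention mentioned in the abstract can be illustrated with a minimal sketch. The following PyTorch example is in the spirit of CAML-style label attention, not the paper's exact architecture; the class name, parameter names, and shapes are illustrative assumptions.

import torch
import torch.nn as nn

class LabelWiseAttention(nn.Module):
    # Minimal sketch of label-wise attention over encoder token states;
    # names and shapes are illustrative assumptions, not the paper's
    # exact architecture.
    def __init__(self, hidden_dim: int, num_labels: int):
        super().__init__()
        # One attention query per medical code (label).
        self.label_queries = nn.Linear(hidden_dim, num_labels, bias=False)
        # One binary classifier per label, applied to its attended context.
        self.label_weights = nn.Parameter(torch.randn(num_labels, hidden_dim))
        self.label_bias = nn.Parameter(torch.zeros(num_labels))

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden_dim) from a BERT or CNN encoder.
        scores = self.label_queries(token_states)   # (batch, seq_len, num_labels)
        attn = torch.softmax(scores, dim=1)         # normalise over tokens
        context = torch.einsum("bsl,bsh->blh", attn, token_states)  # per-label context
        logits = (context * self.label_weights).sum(-1) + self.label_bias
        return logits  # (batch, num_labels); apply a sigmoid for multi-label codes

Each label attends to the token positions most relevant to that code, so long notes such as discharge summaries can contribute different evidence to different codes.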
DOI: 10.1016/j.compbiomed.2021.104998