Automatic Medical Report Generation via Latent Space Conditioning and Transformers
| Published in: | IEEE International Conference on Dependable, Autonomic and Secure Computing (Online), pp. 0428-0435 |
|---|---|
| Main authors: | , , |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 14 November 2023 |
| Subjects: | |
| ISSN: | 2837-0740 |
| Online access: | Get full text |
| Summary: | This paper explores the integration of artificial intelligence (AI) in the healthcare sector, focusing on the development and implementation of a novel framework called VAE-GPT. The architecture combines a Variational Autoencoder (VAE) and a Generative Pre-trained Transformer (GPT) to generate high-quality medical reports. The VAE component enables the model to learn a latent space representation of the images, capturing their underlying patterns and structures. The GPT component uses a transformer-based language model to generate coherent and contextually relevant text. Additionally, a novel metric, Medical Embeddings Attention Distance (MEAD), is proposed to capture the semantic similarity between the generated and training medical reports, taking into account the importance of specific words as determined by the attention module. Experiments on a real dataset demonstrate that the framework achieves performance comparable to the state of the art in generating accurate and informative medical reports. |
|---|---|
| DOI: | 10.1109/DASC/PiCom/CBDCom/Cy59711.2023.10361320 |
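
The summary describes MEAD only at a high level: a semantic distance between reports in which word importance comes from an attention module. The sketch below illustrates that general idea and nothing more. It is not the paper's definition of MEAD, which this record does not give. It assumes each report is already available as a list of token embedding vectors with one non-negative attention weight per token, and the function names (`weighted_mean`, `cosine_distance`, `attention_weighted_distance`) are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def weighted_mean(vectors: Sequence[Vector], weights: Sequence[float]) -> list:
    """Average token embeddings, weighting each token by its attention score."""
    if len(vectors) != len(weights) or not vectors:
        raise ValueError("need one weight per token embedding")
    total = sum(weights)
    if total <= 0:
        raise ValueError("attention weights must sum to a positive value")
    dim = len(vectors[0])
    return [
        sum(w * v[i] for v, w in zip(vectors, weights)) / total
        for i in range(dim)
    ]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def attention_weighted_distance(
    emb_a: Sequence[Vector], att_a: Sequence[float],
    emb_b: Sequence[Vector], att_b: Sequence[float],
) -> float:
    """Illustrative distance between two reports (assumed form, not MEAD itself)."""
    return cosine_distance(weighted_mean(emb_a, att_a), weighted_mean(emb_b, att_b))


# Toy two-token "reports" with 2-dimensional embeddings (made-up values).
tokens = [[1.0, 0.0], [0.0, 1.0]]
same = attention_weighted_distance(tokens, [1.0, 1.0], tokens, [1.0, 1.0])
different = attention_weighted_distance(tokens, [1.0, 0.0], tokens, [0.0, 1.0])
```

In this toy case the two reports contain the same tokens, so the distance changes only because the attention weights emphasise different words, which is the property the summary attributes to the metric.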