Automatic Medical Report Generation via Latent Space Conditioning and Transformers

Bibliographic Details
Published in: IEEE International Conference on Dependable, Autonomic and Secure Computing (Online), pp. 428 - 435
Main Authors: Adornetto, Carlo; Guzzo, Antonella; Vasile, Andrea
Format: Conference Proceeding
Language: English
Published: IEEE, 14.11.2023
ISSN: 2837-0740
Description
Summary: This paper presents a comprehensive exploration of integrating artificial intelligence (AI) in the healthcare sector, focusing on the development and implementation of a novel framework called VAE-GPT. Our architecture combines a Variational Autoencoder (VAE) and a Generative Pre-trained Transformer (GPT) to generate high-quality medical reports. The VAE component enables the model to learn a latent space representation of the images, capturing their underlying patterns and structures. The GPT component leverages the power of transformer-based language models to generate coherent and contextually relevant text. Additionally, a novel metric, Medical Embeddings Attention Distance (MEAD), is proposed to capture the semantic similarity between the generated and training medical reports, taking into account the importance of specific words as determined by the attention module. Experiments on a real dataset demonstrate that our framework achieves performance comparable to the state of the art in generating accurate and informative medical reports.
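To make the latent-space conditioning described above concrete, the following is a minimal, hypothetical PyTorch sketch (not the paper's implementation): a toy VAE encoder maps an image to a latent code, and a small GPT-style decoder consumes that code as a prefix token before predicting report tokens. The MEAD-like function at the end is likewise only an assumption about how an attention-weighted embedding distance might be formed; all class and function names here are invented for illustration and do not come from the paper.

```python
import torch
import torch.nn as nn

class ImageVAE(nn.Module):
    """Toy VAE encoder: maps an image to a latent Gaussian and samples z."""
    def __init__(self, img_dim=64 * 64, latent_dim=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(img_dim, 512), nn.ReLU())
        self.mu = nn.Linear(512, latent_dim)
        self.logvar = nn.Linear(512, latent_dim)

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation trick
        return z, mu, logvar

class LatentConditionedLM(nn.Module):
    """Toy GPT-style decoder: the image latent is projected and prepended as a prefix token."""
    def __init__(self, vocab_size=1000, latent_dim=128, d_model=128, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.latent_proj = nn.Linear(latent_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, z, tokens):
        prefix = self.latent_proj(z).unsqueeze(1)            # (B, 1, d_model)
        h = torch.cat([prefix, self.embed(tokens)], dim=1)   # latent prefix + report tokens
        mask = nn.Transformer.generate_square_subsequent_mask(h.size(1))  # causal mask
        h = self.blocks(h, mask=mask)
        return self.lm_head(h[:, 1:])                        # logits for the report tokens

def attention_weighted_distance(gen_emb, ref_emb, attn_weights):
    """Hypothetical MEAD-like score: cosine distance between aligned token embeddings,
    averaged with attention weights used as word-importance scores."""
    cos = nn.functional.cosine_similarity(gen_emb, ref_emb, dim=-1)   # (B, T)
    w = attn_weights / attn_weights.sum(dim=-1, keepdim=True)
    return ((1.0 - cos) * w).sum(dim=-1)                              # lower = more similar

# Shape check only; the weights are untrained.
vae, lm = ImageVAE(), LatentConditionedLM()
img = torch.rand(2, 1, 64, 64)
tokens = torch.randint(0, 1000, (2, 16))
z, _, _ = vae(img)
logits = lm(z, tokens)   # (2, 16, 1000)
```

The prefix-token conditioning above is just one common way to inject a latent code into a transformer decoder; the actual coupling between the VAE and GPT components in VAE-GPT is specified in the paper itself.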
DOI: 10.1109/DASC/PiCom/CBDCom/Cy59711.2023.10361320