Multi-modal transformer architecture for medical image analysis and automated report generation

Published in: Scientific Reports, Volume 14, Issue 1, Article 19281 (18 pages)
Main authors: Raminedi, Santhosh; Shridevi, S.; Won, Daehan
Medium: Journal Article
Language: English
Published: London: Nature Publishing Group UK, 20 August 2024
ISSN: 2045-2322
Abstract Medical practitioners examine medical images, such as X-rays, write reports based on the findings, and provide conclusive statements. Manual interpretation of results and report writing are time-consuming processes that can delay diagnosis. We propose an automated report generation model for medical images built on an encoder–decoder architecture. The model uses transformer encoders, including the Vision Transformer (ViT) and its variants, the Data-Efficient Image Transformer (DeiT) and the BERT pre-training image transformer (BEiT), adapted to extract visual information from medical images. Reports are transformed into text embeddings, and the Generative Pre-trained Transformer 2 (GPT-2) serves as the decoder that generates the medical reports. A cross-attention mechanism between the vision transformer and GPT-2 enables the model to produce detailed, coherent reports grounded in the visual information extracted by the encoder. We further extend report generation with general knowledge that is independent of the inputs and yields a more comprehensive report in a broad sense. We conduct our experiments on the Indiana University X-ray dataset to demonstrate the effectiveness of our models. Generated medical reports are evaluated with word-overlap metrics such as BLEU scores, ROUGE-L, and retrieval-augmentation answer correctness, and similarity metrics such as skip-thought cosine similarity, greedy matching, vector extrema, and RAG answer similarity. Results show that our model outperforms recurrent models in terms of report generation, answer similarity, and word-overlap metrics.
By automating the report generation process and incorporating advanced transformer architectures and general knowledge, our approach has the potential to significantly improve the efficiency and accuracy of medical image analysis and report generation.
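The cross-attention step described in the abstract (decoder text tokens attending over encoder image-patch features) can be sketched in isolation. This is an illustrative NumPy sketch, not the authors' implementation; the dimensions, random projections, and the `cross_attention` helper name are assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text_states, image_states, d_k=64, seed=0):
    """Single-head cross-attention: queries come from the text decoder
    (GPT-2 side), keys and values from the vision encoder (ViT side)."""
    rng = np.random.default_rng(seed)
    d_text = text_states.shape[-1]
    d_img = image_states.shape[-1]
    # Randomly initialized projections, standing in for learned weights.
    W_q = rng.standard_normal((d_text, d_k)) / np.sqrt(d_text)
    W_k = rng.standard_normal((d_img, d_k)) / np.sqrt(d_img)
    W_v = rng.standard_normal((d_img, d_k)) / np.sqrt(d_img)
    Q = text_states @ W_q            # (n_tokens, d_k)
    K = image_states @ W_k           # (n_patches, d_k)
    V = image_states @ W_v           # (n_patches, d_k)
    scores = Q @ K.T / np.sqrt(d_k)  # (n_tokens, n_patches)
    weights = softmax(scores, axis=-1)  # each token attends over patches
    return weights @ V               # (n_tokens, d_k)

# 5 report tokens attending over 196 ViT patch embeddings (768-dim, as in ViT-Base)
tokens = np.random.default_rng(1).standard_normal((5, 768))
patches = np.random.default_rng(2).standard_normal((196, 768))
out = cross_attention(tokens, patches)
print(out.shape)  # (5, 64)
```

Each row of `weights` sums to 1, so every generated token mixes visual evidence from all image patches, which is what lets the decoder ground report sentences in specific image regions.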
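ROUGE-L, one of the word-overlap metrics named in the abstract, scores a candidate report by the longest common subsequence (LCS) it shares with a reference report. A minimal token-level sketch in F-measure form (the function names are illustrative, not taken from the paper's evaluation code):

```python
def lcs_length(a, b):
    # Classic dynamic-programming longest-common-subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]

def rouge_l(candidate, reference):
    """ROUGE-L F1 over whitespace tokens."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision = lcs / len(c)
    recall = lcs / len(r)
    return 2 * precision * recall / (precision + recall)

print(rouge_l("the lungs are clear", "lungs are clear bilaterally"))  # 0.75
```

Because LCS rewards in-order matches rather than exact n-grams, ROUGE-L tolerates the phrasing variation typical of free-text radiology reports better than strict n-gram precision alone.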
ArticleNumber 19281
Authors:
– Santhosh Raminedi (ORCID: 0000-0002-8795-9454), School of Computer Science and Engineering, Vellore Institute of Technology
– S. Shridevi (ORCID: 0000-0002-0038-7212, email: shridevi.s@vit.ac.in), Centre for Advanced Data Science, Vellore Institute of Technology
– Daehan Won (ORCID: 0000-0002-2566-8061), Department of Systems Science and Industrial Engineering, The State University of New York (SUNY), Binghamton University
ContentType Journal Article
Copyright The Author(s) 2024. Published under the Creative Commons Attribution 4.0 License (http://creativecommons.org/licenses/by/4.0/).
DOI 10.1038/s41598-024-69981-5
Discipline Biology
EISSN 2045-2322
EndPage 18
ExternalDocumentID oai_doaj_org_article_970b98297cc24d31abc4d90a7c18f5f7
PMC11336090
39164302
10_1038_s41598_024_69981_5
Genre Journal Article
GrantInformation_xml – fundername: Vellore Institute of Technology, Chennai
ISICitedReferencesCount 12
ISSN 2045-2322
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords Vision transformer
Retrieval augmentation
Generative pre-trained transformer
Language English
License 2024. The Author(s).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
OpenAccessLink https://doaj.org/article/970b98297cc24d31abc4d90a7c18f5f7
PMID 39164302
PageCount 18
PublicationDate 2024-08-20
PublicationPlace London
PublicationTitle Scientific reports
PublicationTitleAbbrev Sci Rep
PublicationYear 2024
Publisher Nature Publishing Group UK
SSID ssj0000529419
Score 2.5307992
SourceID doaj
pubmedcentral
proquest
pubmed
crossref
springer
SourceType Open Website
Open Access Repository
Aggregation Database
Index Database
Enrichment Source
Publisher
StartPage 19281
SubjectTerms 692/699
692/699/1785
692/699/1785/3193
Algorithms
Automation
Deep learning
Diagnostic Imaging - methods
Efficiency
Generative pre-trained transformer
Humanities and Social Sciences
Humans
Image processing
Image Processing, Computer-Assisted - methods
Information processing
Medical research
multidisciplinary
Natural language processing
Radiology
Retrieval augmentation
Science
Science (multidisciplinary)
Semantics
Vision transformer
Visual perception
Title Multi-modal transformer architecture for medical image analysis and automated report generation
URI https://link.springer.com/article/10.1038/s41598-024-69981-5
https://www.ncbi.nlm.nih.gov/pubmed/39164302
https://www.proquest.com/docview/3094940875
https://www.proquest.com/docview/3095174898
https://pubmed.ncbi.nlm.nih.gov/PMC11336090
https://doaj.org/article/970b98297cc24d31abc4d90a7c18f5f7
Volume 14
WOSCitedRecordID wos001295308500071
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVAON
  databaseName: DOAJ Directory of Open Access Journals
  customDbUrl:
  eissn: 2045-2322
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0000529419
  issn: 2045-2322
  databaseCode: DOA
  dateStart: 20110101
  isFulltext: true
  titleUrlDefault: https://www.doaj.org/
  providerName: Directory of Open Access Journals
– providerCode: PRVHPJ
  databaseName: ROAD: Directory of Open Access Scholarly Resources
  customDbUrl:
  eissn: 2045-2322
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0000529419
  issn: 2045-2322
  databaseCode: M~E
  dateStart: 20110101
  isFulltext: true
  titleUrlDefault: https://road.issn.org
  providerName: ISSN International Centre
– providerCode: PRVPQU
  databaseName: AUTh Library subscriptions: ProQuest Central
  customDbUrl:
  eissn: 2045-2322
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0000529419
  issn: 2045-2322
  databaseCode: BENPR
  dateStart: 20110101
  isFulltext: true
  titleUrlDefault: https://www.proquest.com/central
  providerName: ProQuest
– providerCode: PRVPQU
  databaseName: Biological Science Database
  customDbUrl:
  eissn: 2045-2322
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0000529419
  issn: 2045-2322
  databaseCode: M7P
  dateStart: 20110101
  isFulltext: true
  titleUrlDefault: http://search.proquest.com/biologicalscijournals
  providerName: ProQuest
– providerCode: PRVPQU
  databaseName: ProQuest_Health & Medical Collection
  customDbUrl:
  eissn: 2045-2322
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0000529419
  issn: 2045-2322
  databaseCode: 7X7
  dateStart: 20110101
  isFulltext: true
  titleUrlDefault: https://search.proquest.com/healthcomplete
  providerName: ProQuest
– providerCode: PRVPQU
  databaseName: Publicly Available Content Database
  customDbUrl:
  eissn: 2045-2322
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0000529419
  issn: 2045-2322
  databaseCode: PIMPY
  dateStart: 20110101
  isFulltext: true
  titleUrlDefault: http://search.proquest.com/publiccontent
  providerName: ProQuest
– providerCode: PRVPQU
  databaseName: Science Database
  customDbUrl:
  eissn: 2045-2322
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0000529419
  issn: 2045-2322
  databaseCode: M2P
  dateStart: 20110101
  isFulltext: true
  titleUrlDefault: https://search.proquest.com/sciencejournals
  providerName: ProQuest
linkProvider Directory of Open Access Journals
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Multi-modal+transformer+architecture+for+medical+image+analysis+and+automated+report+generation&rft.jtitle=Scientific+reports&rft.au=Raminedi%2C+Santhosh&rft.au=Shridevi%2C+S.&rft.au=Won%2C+Daehan&rft.date=2024-08-20&rft.pub=Nature+Publishing+Group+UK&rft.eissn=2045-2322&rft.volume=14&rft_id=info:doi/10.1038%2Fs41598-024-69981-5&rft.externalDocID=PMC11336090
thumbnail_l http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=2045-2322&client=summon
thumbnail_m http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=2045-2322&client=summon
thumbnail_s http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=2045-2322&client=summon