Multi-modal transformer architecture for medical image analysis and automated report generation
| Published in: | Scientific Reports, Vol. 14, No. 1, Article 19281 (18 pp.) |
|---|---|
| Main authors: | Raminedi, Santhosh; Shridevi, S.; Won, Daehan |
| Format: | Journal Article |
| Language: | English |
| Published: | London: Nature Publishing Group UK, 20 Aug 2024 |
| ISSN: | 2045-2322 |
| Abstract | Medical practitioners examine medical images, such as X-rays, write reports based on the findings, and provide conclusive statements. Manual interpretation of results and report writing are time-consuming and can delay diagnosis. We propose an automated report generation model for medical images built on an encoder–decoder architecture. The encoder is a transformer: the Vision Transformer (ViT) or one of its variants, the Data-Efficient Image Transformer (DeiT) and the BERT Pre-training of Image Transformers (BEiT), adapted to extract visual information from medical images. Reports are converted into text embeddings, and the Generative Pre-trained Transformer 2 (GPT-2) serves as the decoder that generates the medical report. A cross-attention mechanism between the vision transformer and GPT-2 enables the model to produce detailed, coherent reports grounded in the visual features the encoder extracts. We further extend report generation with general knowledge that is independent of the input and broadens the report's coverage. We conduct experiments on the Indiana University X-ray dataset to demonstrate the effectiveness of our models. Generated reports are evaluated with word-overlap metrics such as BLEU and ROUGE-L, retrieval-augmentation (RAG) answer correctness, and similarity metrics such as Skip-Thought cosine similarity, greedy matching, vector extrema, and RAG answer similarity. Results show that our model outperforms recurrent models on report generation, answer similarity, and word-overlap metrics. 
By automating the report generation process and incorporating advanced transformer architectures and general knowledge, our approach has the potential to significantly improve the efficiency and accuracy of medical image analysis and report generation. |
|---|---|
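The abstract's central mechanism, cross-attention between the vision encoder's image-patch features and GPT-2's text states, can be illustrated with a minimal NumPy sketch. All shapes, variable names, and the single-head formulation here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text_hidden, image_feats, Wq, Wk, Wv):
    """Decoder tokens (queries) attend over encoder patch features (keys/values)."""
    Q = text_hidden @ Wq                       # (T, d) queries from report tokens
    K = image_feats @ Wk                       # (P, d) keys from image patches
    V = image_feats @ Wv                       # (P, d) values from image patches
    scores = Q @ K.T / np.sqrt(Q.shape[-1])    # (T, P) scaled dot products
    weights = softmax(scores, axis=-1)         # each token's attention over patches
    return weights @ V                         # (T, d) image-conditioned token states

rng = np.random.default_rng(0)
T, P, d = 4, 16, 8   # 4 report tokens, 16 image patches, model width 8 (toy sizes)
out = cross_attention(rng.normal(size=(T, d)), rng.normal(size=(P, d)),
                      rng.normal(size=(d, d)), rng.normal(size=(d, d)),
                      rng.normal(size=(d, d)))
print(out.shape)  # (4, 8)
```

In the full model this computation sits inside every decoder layer, so each generated word is conditioned on the encoder's visual features as well as the previously generated text.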
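The word-overlap metrics named in the abstract can be made concrete with a small pure-Python sketch of BLEU-1 (clipped unigram precision, without brevity penalty) and ROUGE-L (F1 over the longest common subsequence); the example sentences are invented for illustration, not drawn from the dataset:

```python
from collections import Counter

def bleu1(candidate, reference):
    """Clipped unigram precision (BLEU-1 without brevity penalty)."""
    cand, ref = candidate.split(), reference.split()
    overlap = Counter(cand) & Counter(ref)   # per-word counts clipped by reference
    return sum(overlap.values()) / len(cand)

def rouge_l_f1(candidate, reference):
    """F1 over the longest common subsequence (LCS) of tokens."""
    a, b = candidate.split(), reference.split()
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[len(a)][len(b)]
    p, r = lcs / len(a), lcs / len(b)
    return 2 * p * r / (p + r) if p + r else 0.0

gen = "the heart size is normal"                 # hypothetical generated report line
ref = "heart size is within normal limits"       # hypothetical reference report line
print(round(bleu1(gen, ref), 2), round(rouge_l_f1(gen, ref), 2))  # → 0.8 0.73
```

Production evaluations typically use the multi-n-gram BLEU with brevity penalty and library implementations, but the overlap intuition is the same.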
| ArticleNumber | 19281 |
| Author | Won, Daehan; Shridevi, S.; Raminedi, Santhosh |
| Author_xml | 1. Raminedi, Santhosh (ORCID 0000-0002-8795-9454), School of Computer Science and Engineering, Vellore Institute of Technology; 2. Shridevi, S. (ORCID 0000-0002-0038-7212; shridevi.s@vit.ac.in), Centre for Advanced Data Science, Vellore Institute of Technology; 3. Won, Daehan (ORCID 0000-0002-2566-8061), Department of Systems Science and Industrial Engineering, The State University of New York (SUNY), Binghamton University |
| ContentType | Journal Article |
| Copyright | The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.1038/s41598-024-69981-5 |
| Discipline | Biology |
| EISSN | 2045-2322 |
| EndPage | 18 |
| ExternalDocumentID | oai_doaj_org_article_970b98297cc24d31abc4d90a7c18f5f7 PMC11336090 39164302 10_1038_s41598_024_69981_5 |
| Genre | Journal Article |
| GrantInformation_xml | Funder: Vellore Institute of Technology, Chennai |
| ISSN | 2045-2322 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Keywords | Vision transformer; Retrieval augmentation; Generative pre-trained transformer |
| Language | English |
| License | Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
| ORCID | 0000-0002-0038-7212 0000-0002-2566-8061 0000-0002-8795-9454 |
| OpenAccessLink | https://www.proquest.com/docview/3094940875?pq-origsite=%requestingapplication% |
| PMID | 39164302 |
| PageCount | 18 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-08-20 |
| PublicationDateYYYYMMDD | 2024-08-20 |
| PublicationDecade | 2020 |
| PublicationPlace | London |
| PublicationTitle | Scientific reports |
| PublicationTitleAbbrev | Sci Rep |
| PublicationTitleAlternate | Sci Rep |
| PublicationYear | 2024 |
| Publisher | Nature Publishing Group UK Nature Publishing Group Nature Portfolio |
| StartPage | 19281 |
| SubjectTerms | 692/699; 692/699/1785; 692/699/1785/3193; Algorithms; Automation; Deep learning; Diagnostic Imaging - methods; Efficiency; Generative pre-trained transformer; Humanities and Social Sciences; Humans; Image processing; Image Processing, Computer-Assisted - methods; Information processing; Medical research; multidisciplinary; Natural language processing; Radiology; Retrieval augmentation; Science; Science (multidisciplinary); Semantics; Vision transformer; Visual perception |
| Title | Multi-modal transformer architecture for medical image analysis and automated report generation |
| URI | https://link.springer.com/article/10.1038/s41598-024-69981-5 https://www.ncbi.nlm.nih.gov/pubmed/39164302 https://www.proquest.com/docview/3094940875 https://www.proquest.com/docview/3095174898 https://pubmed.ncbi.nlm.nih.gov/PMC11336090 https://doaj.org/article/970b98297cc24d31abc4d90a7c18f5f7 |
| Volume | 14 |
| openUrl | Raminedi, Santhosh; Shridevi, S.; Won, Daehan. "Multi-modal transformer architecture for medical image analysis and automated report generation." Scientific Reports, vol. 14, no. 1, p. 19281 (2024-08-20). Nature Publishing Group. eISSN 2045-2322. doi: 10.1038/s41598-024-69981-5 |