Three-Branch BERT-Based Text Classification Network for Gastroscopy Diagnosis Text

During a hospital visit, a significant volume of Gastroscopy Diagnostic Text (GDT) data are produced, representing the unstructured gastric medical records of patients undergoing gastroscopy. As such, GDTs play a crucial role in evaluating the patient’s health, shaping treatment plans, and schedulin...

Bibliographic details
Published in: International journal of crowd science, Vol. 8, No. 1, pp. 56-63
Main authors: Wang, Zhichao; Zheng, Xiangwei; Zhang, Jinsong; Zhang, Mingzhe
Format: Journal Article
Language: English
Published: Tsinghua University Press, 1 March 2024
ISSN: 2398-7294
Online access: Full text
Abstract: During a hospital visit, a significant volume of Gastroscopy Diagnostic Text (GDT) data is produced, representing the unstructured gastric medical records of patients undergoing gastroscopy. GDTs therefore play a crucial role in evaluating the patient’s health, shaping treatment plans, and scheduling follow-up visits. However, because GDTs are free text with no formal structure, physicians often find it challenging to extract meaningful insights from them. Furthermore, while deep learning has made significant strides in the medical domain, to our knowledge there are no readily available text-based pre-trained models tailored for GDT classification and analysis. To address this gap, we introduce a three-branch classification network for GDTs based on Bidirectional Encoder Representations from Transformers (BERT). We leverage the robust representation capabilities of the pre-trained BERT model to deeply encode the texts. A three-branch decoder structure is employed to pinpoint lesion sites and determine cancer stages. Experimental results validate the efficacy of our approach in GDT classification, with a precision of 0.993 and a recall of 0.784 in the early cancer category. In pinpointing cancer lesion sites, the weighted F1 score achieved was 0.849.
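The abstract reports per-class precision and recall and a weighted F1 score. As a reminder of how those metrics are usually defined, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names `per_class_prf` and `weighted_f1` and the lesion-site labels in the toy data are hypothetical, chosen only for illustration, and the numbers below are not results from the paper.

```python
from collections import Counter


def per_class_prf(y_true, y_pred, label):
    """Return (precision, recall, F1) for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(
        per_class_prf(y_true, y_pred, label)[2] * count / total
        for label, count in support.items()
    )


# Toy data with hypothetical lesion-site labels (not taken from the paper).
y_true = ["antrum", "antrum", "body", "cardia", "body", "antrum"]
y_pred = ["antrum", "body", "body", "cardia", "body", "antrum"]

print(per_class_prf(y_true, y_pred, "antrum"))
print(weighted_f1(y_true, y_pred))  # 5/6, about 0.833
```

Weighting by support means frequent classes dominate the score, which is why a weighted F1 is often reported alongside per-class precision and recall for a rare category such as early cancer.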
Authors and affiliations
1. Wang, Zhichao, School of Information Science and Engineering, Shandong Normal University, Jinan, China, 250358
2. Zheng, Xiangwei, School of Information Science and Engineering, Shandong Normal University, Jinan, China, 250358
3. Zhang, Jinsong, School of Information Science and Engineering, Shandong Normal University, Jinan, China, 250358
4. Zhang, Mingzhe, School of Information Science and Engineering, Shandong Normal University, Jinan, China, 250358
DOI: 10.26599/IJCS.2023.9100031
Open access link: https://doaj.org/article/711af7824d9545b7aae2f9f4149e0a89
StartPage 56
SubjectTerms bidirectional encoder representations from transformers (bert)
gastroscopy diagnostic text
text classification