Unbiased microRNA-Disease Association Prediction Using ICD-11 Codes and Negative Sampling.

Bibliographic Details
Title: Unbiased microRNA-Disease Association Prediction Using ICD-11 Codes and Negative Sampling.
Authors:
Chang M: Education and Research Program for Future ICT Pioneers, Seoul National University, Seoul, South Korea; Department of Otorhinolaryngology-Head and Neck Surgery, Chung-Ang University College of Medicine, Seoul, South Korea; Department of Otorhinolaryngology-Head and Neck Surgery, Chung-Ang University Hospital, Seoul, South Korea.
Jo J: Center for Semiconductor Technology, Korea Institute of Science and Technology (KIST), Seoul, South Korea.
Ahn J: Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, South Korea; Institute of Molecular Biology and Genetics, Seoul National University, Seoul, South Korea.
Kang BG: Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, South Korea.
Nam SM: Department of Neurosurgery, Seoul National University Hospital, Seoul, South Korea.
Park CK: Department of Neurosurgery, Seoul National University Hospital, Seoul, South Korea.
Yoon S: Education and Research Program for Future ICT Pioneers, Seoul National University, Seoul, South Korea; Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, South Korea; Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea.
Source: Pharmacology research & perspectives [Pharmacol Res Perspect] 2025 Dec; Vol. 13 (6), pp. e70192.
Publication Type: Journal Article
Language: English
Journal Info: Publisher: John Wiley & Sons Ltd, British Pharmacological Society and American Society for Pharmacology and Experimental Therapeutics; Country of Publication: United States; NLM ID: 101626369; Publication Model: Print; Cited Medium: Internet; ISSN: 2052-1707 (Electronic); Linking ISSN: 20521707; NLM ISO Abbreviation: Pharmacol Res Perspect; Subsets: MEDLINE
Imprint Name(s): Original Publication: [Hoboken, NJ] : John Wiley & Sons Ltd, British Pharmacological Society and American Society for Pharmacology and Experimental Therapeutics, [2013]-
MeSH Terms: MicroRNAs*/genetics; International Classification of Diseases*; Humans; Computational Biology/methods; Computer Simulation
Abstract: We developed a computational model, the "Unbiased microRNA-disease association predictor" (UBMDA), to predict microRNA-disease associations. UBMDA differs from previously reported models in two major ways. First, we did not apply similarity-based feature extraction, which is the main basis of previous studies. Instead, we used International Classification of Diseases 11th Revision (ICD-11) disease codes and microRNA nucleotide sequences as input features. UBMDA can therefore be applied to newly discovered or poorly studied microRNAs and diseases. Second, we constructed an appropriate negative sample dataset. A positive sample dataset of microRNA-disease pairs with proven associations is publicly available, but datasets reporting the absence of an association are rare. We therefore created a negative sample dataset by combining microRNAs and diseases into pairs. Because more commonly studied microRNAs and diseases are more likely to appear in the positive sample dataset, building the negative sample dataset without accounting for this bias could cause the microRNA and disease frequencies to differ between the positive and negative sample datasets, leading to biased prediction. To prevent such an imbalance, we created the negative sample dataset by taking into account the frequency of each microRNA and disease in the positive sample dataset, so that these frequencies were similar between the two datasets. The resulting model has a simple and intuitive structure. UBMDA will contribute to accelerating the development of microRNA-related biomarkers and therapeutics.
(© 2025 The Author(s). Pharmacology Research & Perspectives published by British Pharmacological Society and American Society for Pharmacology and Experimental Therapeutics and John Wiley & Sons Ltd.)
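Illustrative note: the frequency-matched negative sampling described in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `sample_negatives`, its parameters, and the toy identifiers (`miR-a`, `D1`, and so on) are hypothetical, and the record does not say how the published method draws or filters pairs. Here each microRNA and each disease is drawn with probability proportional to its count in the positive set, and any pair that is already a known positive is rejected.

```python
import random
from collections import Counter


def sample_negatives(positives, n_negatives=None, seed=0, max_tries=100_000):
    """Draw (microRNA, disease) pairs that are not known positives, with
    marginal frequencies that approximate those of the positive set.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    positive_set = set(positives)
    ordered = sorted(positive_set)  # fixed order keeps the sampling reproducible
    if n_negatives is None:
        n_negatives = len(ordered)

    # How often each microRNA and each disease occurs among the positives.
    mirna_counts = Counter(m for m, _ in ordered)
    disease_counts = Counter(d for _, d in ordered)
    mirnas, mirna_weights = zip(*mirna_counts.items())
    diseases, disease_weights = zip(*disease_counts.items())

    negatives = set()
    tries = 0
    while len(negatives) < n_negatives and tries < max_tries:
        tries += 1
        # Frequently studied entities are drawn more often, mirroring the
        # imbalance that already exists in the positive set.
        m = rng.choices(mirnas, weights=mirna_weights, k=1)[0]
        d = rng.choices(diseases, weights=disease_weights, k=1)[0]
        if (m, d) not in positive_set:
            negatives.add((m, d))  # duplicates collapse, so matching is approximate
    return sorted(negatives)


# Toy positive set with placeholder identifiers (not real data).
toy_positives = [
    ("miR-a", "D1"), ("miR-a", "D2"), ("miR-a", "D3"),
    ("miR-b", "D1"), ("miR-b", "D2"),
    ("miR-c", "D4"),
]
negatives = sample_negatives(toy_positives, n_negatives=4)
print(negatives)
```

Drawing the two entities independently from their marginal distributions is only one way to keep frequencies similar between the two sets. Because each negative pair is kept at most once, the match is approximate, and an unlabeled pair is not guaranteed to be a true non-association.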
Grant Information: NO. RS-2021-II211343 Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT); Artificial Intelligence Graduate School Program (Seoul National University); no.2022R1A3B1077720 National Research Foundation of Korea grants funded by the Korean Government (Ministry of Science and ICT); BK21 FOUR program of the Education and Research Program for Future ICT Pioneers, Seoul National University; IITP-2026-RS-2024-00397085 Institute of Information & Communications Technology Planning & Evaluation (IITP) under the Leading Generative AI Human Resources Development grant funded by the Korea government (MSIT); HUINNO AIM Company through HA-Rnd2325-predictClinicalDeterioration; GrantNo.3720250100 NAVER Digital Bio Innovation Research Fund, funded by NAVER Corporation; Al-Bio Research Grant through Seoul National University
Contributed Indexing: Keywords: association; convolutional neural network; disease; microRNA; prediction
Substance Nomenclature: 0 (MicroRNAs)
Entry Date(s): Date Created: 20251128 Date Completed: 20251128 Latest Revision: 20251201
Update Code: 20251201
PubMed Central ID: PMC12661550
DOI: 10.1002/prp2.70192
PMID: 41312972
Database: MEDLINE