BoostDILI: Extreme Gradient Boost-Powered Drug-Induced Liver Injury Prediction and Structural Alerts Generation
Over the past 60 years, drug-induced liver injury (DILI) has played a key role in the withdrawal of marketed drugs due to safety concerns. Early prediction of DILI is crucial for developing safer pharmaceuticals, yet current in vitro and in vivo testing methods are complex and cumbersome. In this study, we developed...
| Published in: | Chemical research in toxicology Vol. 38; no. 5; p. 865 |
|---|---|
| Main Authors: | Chutia, Hillul; Borah, Gori Sankar; Mahanta, Hridoy Jyoti; Nagamani, Selvaraman |
| Format: | Journal Article |
| Language: | English |
| Published: | United States, 19.05.2025 |
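| Subjects: | Boosting Machine Learning Algorithms; Chemical and Drug Induced Liver Injury; Humans; Machine Learning; Pharmaceutical Preparations - chemistry; United States Food and Drug Administration |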
| ISSN: | 1520-5010 |
| Abstract | Over the past 60 years, drug-induced liver injury (DILI) has played a key role in the withdrawal of marketed drugs due to safety concerns. Early prediction of DILI is crucial for developing safer pharmaceuticals, yet current in vitro and in vivo testing methods are complex and cumbersome. In this study, we developed an extreme gradient boosting (XGB)-powered machine learning (ML) model for DILI prediction. Comparing various DILI prediction models is challenging because they rely on different public data sets. We comprehensively evaluated the proposed BoostDILI model to address two crucial questions: 1. Can insights derived from public data sets help in DILI prediction for Food and Drug Administration (FDA) approved drugs? 2. Can we generate structural alerts to improve the model's explainability? To address the first question, we developed a DILI prediction model using four publicly available data sets. This effort led to the creation of the BoostDILI model, which achieved a 5-fold CV accuracy of 0.70. A sequential feature selection method was employed to identify relevant descriptors. This model integrates feature-level representations derived from RDKit (12 features) and Mordred (23 features) features. Bayesian statistics was applied to identify high-performance substructures iteratively, and a structural alerts model was developed to address the second question. The developed model was further validated with two FDA-approved drug data sets, DILIst and DILIRank. The BoostDILI model offers a trustable solution for evaluating the DILI risk in preclinical research. The structural alerts help in identifying the substructures that may be responsible for DILI. The data set and the source code are available at https://github.com/Naga270588/BoostDILI. |
|---|---|
| Author | Borah, Gori Sankar; Mahanta, Hridoy Jyoti; Nagamani, Selvaraman; Chutia, Hillul |
| Author_xml | 1. Chutia, Hillul (CSIR-North East Institute of Science and Technology, Jorhat 785006, India); 2. Borah, Gori Sankar (School of Computer Science, The Assam Kaziranga University, Jorhat 785006, India); 3. Mahanta, Hridoy Jyoti (Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India); 4. Nagamani, Selvaraman (ORCID 0000-0002-7825-3994; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India) |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/40241442 (View this record in MEDLINE/PubMed) |
| ContentType | Journal Article |
| DBID | CGR CUY CVF ECM EIF NPM 7X8 |
| DOI | 10.1021/acs.chemrestox.4c00532 |
| DatabaseName | Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed MEDLINE - Academic |
| DatabaseTitle | MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid) MEDLINE - Academic |
| DatabaseTitleList | MEDLINE MEDLINE - Academic |
| Database_xml | 1. PubMed (dbid: NPM; url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed; source type: Index Database); 2. MEDLINE - Academic (dbid: 7X8; url: https://search.proquest.com/medline; source type: Aggregation Database) |
| DeliveryMethod | no_fulltext_linktorsrc |
| Discipline | Public Health Pharmacy, Therapeutics, & Pharmacology |
| EISSN | 1520-5010 |
| ExternalDocumentID | 40241442 |
| Genre | Journal Article |
| ISICitedReferencesCount | 1 |
| ISSN | 1520-5010 |
| IngestDate | Wed Jul 02 05:00:19 EDT 2025; Wed Jun 25 03:22:00 EDT 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 5 |
| Language | English |
| LinkModel | DirectLink |
| Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
| ORCID | 0000-0002-7825-3994 |
| PMID | 40241442 |
| PQID | 3191147662 |
| PQPubID | 23479 |
| ParticipantIDs | proquest_miscellaneous_3191147662; pubmed_primary_40241442 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-05-19 |
| PublicationDateYYYYMMDD | 2025-05-19 |
| PublicationDate_xml | – month: 05 year: 2025 text: 2025-05-19 day: 19 |
| PublicationDecade | 2020 |
| PublicationPlace | United States |
| PublicationPlace_xml | – name: United States |
| PublicationTitle | Chemical research in toxicology |
| PublicationTitleAlternate | Chem Res Toxicol |
| PublicationYear | 2025 |
| SSID | ssj0011027 |
| Snippet | Over the past 60 years, drug-induced liver injury (DILI) has played a key role in the withdrawal of marketed drugs due to safety concerns. Early prediction of... |
| SourceID | proquest; pubmed |
| SourceType | Aggregation Database; Index Database |
| StartPage | 865 |
| SubjectTerms | Boosting Machine Learning Algorithms; Chemical and Drug Induced Liver Injury; Humans; Machine Learning; Pharmaceutical Preparations - chemistry; United States Food and Drug Administration |
| Title | BoostDILI: Extreme Gradient Boost-Powered Drug-Induced Liver Injury Prediction and Structural Alerts Generation |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/40241442; https://www.proquest.com/docview/3191147662 |
| Volume | 38 |
| WOSCitedRecordID | wos001469189800001 |
| linkProvider | ProQuest |
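
The abstract describes selecting descriptors with a sequential feature selection method and reporting 5-fold cross-validated accuracy. The following is a minimal, standard-library-only Python sketch of that general workflow, not the authors' code. It rests on several assumptions: a nearest-centroid classifier stands in for XGBoost, the data are synthetic rather than RDKit or Mordred descriptors, and all function names (`k_fold_indices`, `centroid_predict`, `cv_accuracy`, `forward_select`) are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def centroid_predict(train_X, train_y, test_X):
    """Stand-in classifier (NOT XGBoost): label each row by its nearest class centroid."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    labels = sorted(centroids)  # fixed order makes tie-breaking deterministic
    return [min(labels, key=lambda c: squared_distance(x, centroids[c]))
            for x in test_X]


def cv_accuracy(X, y, features, k=5, seed=0):
    """Mean held-out accuracy over k folds, using only the given feature columns."""
    Xs = [[row[j] for j in features] for row in X]
    scores = []
    for held_out in k_fold_indices(len(Xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(Xs)) if i not in held]
        preds = centroid_predict([Xs[i] for i in train],
                                 [y[i] for i in train],
                                 [Xs[i] for i in held_out])
        correct = sum(p == y[i] for p, i in zip(preds, held_out))
        scores.append(correct / len(held_out))
    return mean(scores)


def forward_select(X, y, n_select, k=5, seed=0):
    """Greedy forward selection: repeatedly add the feature that maximizes CV accuracy."""
    selected, remaining = [], list(range(len(X[0])))
    while remaining and len(selected) < n_select:
        best = max(remaining,
                   key=lambda j: cv_accuracy(X, y, selected + [j], k, seed))
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic toy data (an assumption): column 1 separates the classes,
# columns 0 and 2 are pure noise.
rng = random.Random(42)
y = [i % 2 for i in range(60)]
X = [[rng.gauss(0, 1), label * 4.0 + rng.gauss(0, 1), rng.gauss(0, 1)]
     for label in y]

chosen = forward_select(X, y, n_select=1)
print("selected feature columns:", chosen)
print("5-fold CV accuracy:", round(cv_accuracy(X, y, chosen), 2))
```

On this toy data the search should pick the one informative column. With real descriptors, one would replace `centroid_predict` with a gradient-boosted model and pass descriptor columns as `X`; the authors' actual implementation is in the repository linked in the abstract.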