Human-in-the-loop active learning for goal-oriented molecule generation

Detailed bibliography
Title: Human-in-the-loop active learning for goal-oriented molecule generation
Authors: Nahal, Yasmine; Menke, Janosch, 1995; Martinelli, Julien; Heinonen, Markus; Kabeshov, Mikhail; Janet, Jon Paul; Nittinger, Eva; Engkvist, Ola, 1967; Kaski, Samuel
Source: Journal of Cheminformatics yasminenahal/hitl-al-gomg: hitl-al-gomg .5. 16(1)
Subject terms: Active learning, Goal-oriented molecule generation, Interactive algorithms, Human-in-the-loop, Machine learning
Description: Machine learning (ML) systems have enabled the modelling of quantitative structure-property relationships (QSPR) and structure-activity relationships (QSAR) using existing experimental data to predict target properties for new molecules. These property predictors hold significant potential in accelerating drug discovery by guiding generative artificial intelligence (AI) agents to explore desired chemical spaces. However, they often struggle to generalize due to the limited scope of the training data. When optimized by generative agents, this limitation can result in the generation of molecules with artificially high predicted probabilities of satisfying target properties, which subsequently fail experimental validation. To address this challenge, we propose an adaptive approach that integrates active learning (AL) and iterative feedback to refine property predictors, thereby improving the outcomes of their optimization by generative AI agents. Our method leverages the Expected Predictive Information Gain (EPIG) criterion to select additional molecules for evaluation by an oracle. This process aims to provide the greatest reduction in predictive uncertainty, enabling more accurate model evaluations of subsequently generated molecules. Recognizing the impracticality of immediate wet-lab or physics-based experiments due to time and logistical constraints, we propose leveraging human experts for their cost-effectiveness and domain knowledge to effectively augment property predictors, bridging gaps in the limited training data. Empirical evaluations through both simulated and real human-in-the-loop experiments demonstrate that our approach refines property predictors to better align with oracle assessments. Additionally, we observe improved accuracy of predicted properties as well as improved drug-likeness among the top-ranking generated molecules.
Scientific contribution. We present an adaptable framework that integrates AL and human expertise to refine property predictors for goal-oriented molecule generation. This approach is robust to noise in human feedback and ensures that navigating chemical space with human-refined predictors leverages human insights to identify molecules that not only satisfy predicted property profiles but also score highly on oracle models. Additionally, it prioritizes practical characteristics such as drug-likeness, synthetic accessibility, and a favorable balance between exploring diverse chemical space and exploiting similarity to existing training data.
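The EPIG selection step mentioned in the description can be illustrated with a small sketch. It is not taken from the article or its repository: it assumes a binary property predictor represented by K posterior samples (for example, ensemble members), and the function name `epig_scores` and its array layout are hypothetical. Under those assumptions, EPIG for a candidate molecule is the mutual information between its label and the label of a target molecule, averaged over a set of target molecules.

```python
import numpy as np

def epig_scores(pool_probs, target_probs, eps=1e-12):
    """Monte Carlo EPIG estimate for a binary classifier (illustrative sketch).

    pool_probs:   shape (K, N), P(y=1 | candidate i) under posterior sample k
    target_probs: shape (K, M), P(y=1 | target j) under posterior sample k
    Returns an array of shape (N,), one score per candidate.
    """
    k = pool_probs.shape[0]
    # Per-class probabilities, shapes (K, N, 2) and (K, M, 2).
    pool = np.stack([1.0 - pool_probs, pool_probs], axis=-1)
    targ = np.stack([1.0 - target_probs, target_probs], axis=-1)
    # Joint predictive over (candidate label, target label), averaged over samples.
    joint = np.einsum("knc,kmd->nmcd", pool, targ) / k
    # Product of the two marginal predictives.
    indep = np.einsum("nc,md->nmcd", pool.mean(axis=0), targ.mean(axis=0))
    # KL(joint || product of marginals) is the mutual information per pair.
    mi = np.sum(joint * (np.log(joint + eps) - np.log(indep + eps)), axis=(2, 3))
    # Average over target molecules.
    return mi.mean(axis=1)

# Two posterior samples, two candidates, one target molecule.
pool_probs = np.array([[0.9, 0.5],
                       [0.1, 0.5]])
target_probs = np.array([[0.9],
                         [0.1]])
scores = epig_scores(pool_probs, target_probs)
best = int(np.argmax(scores))  # index of the candidate to send to the oracle
```

In this toy case the posterior samples disagree on the first candidate in the same way they disagree on the target, so labelling it is informative; they agree on the second candidate, so its score is essentially zero.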
File description: electronic
Access URL: https://research.chalmers.se/publication/544431
https://research.chalmers.se/publication/544413
https://research.chalmers.se/publication/544431/file/544431_Fulltext.pdf
Database: SwePub
FullText Text:
  Availability: 0
CustomLinks:
  – Url: https://research.chalmers.se/publication/544431#
    Name: EDS - SwePub (s4221598)
    Category: fullText
    Text: View record in SwePub
  – Url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=search&db=pmc&term=1758-2946[TA]+AND+[PG]+AND+2024[PDAT]
    Name: FREE - PubMed Central (ISSN based link)
    Category: fullText
    Text: Full Text
    Icon: https://imageserver.ebscohost.com/NetImages/iconPdf.gif
    MouseOverText: Check PubMed Central for the article's full text.
  – Url: https://resolver.ebscohost.com/openurl?sid=EBSCO:edsswe&genre=article&issn=17582946&ISBN=&volume=16&issue=1&date=20240101&spage=&pages=&title=Journal of Cheminformatics yasminenahal/hitl-al-gomg: hitl-al-gomg .5&atitle=Human-in-the-loop%20active%20learning%20for%20goal-oriented%20molecule%20generation&aulast=Nahal%2C%20Yasmine&id=DOI:10.1186/s13321-024-00924-y
    Name: Full Text Finder
    Category: fullText
    Text: Full Text Finder
    Icon: https://imageserver.ebscohost.com/branding/images/FTF.gif
    MouseOverText: Full Text Finder
  – Url: https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=EBSCO&SrcAuth=EBSCO&DestApp=WOS&ServiceName=TransferToWoS&DestLinkType=GeneralSearchSummary&Func=Links&author=Nahal%20Y
    Name: ISI
    Category: fullText
    Text: Find this article in Web of Science
    Icon: https://imagesrvr.epnet.com/ls/20docs.gif
    MouseOverText: Find this article in Web of Science
Header DbId: edsswe
DbLabel: SwePub
An: edsswe.oai.research.chalmers.se.5490b520.c090.4e36.a8b0.51f47689b066
RelevancyScore: 1014
AccessLevel: 6
PubType: Academic Journal
PubTypeId: academicJournal
PreciseRelevancyScore: 1014.41540527344
IllustrationInfo
Items – Name: Title
  Label: Title
  Group: Ti
  Data: Human-in-the-loop active learning for goal-oriented molecule generation
– Name: Author
  Label: Authors
  Group: Au
  Data: Nahal, Yasmine; Menke, Janosch, 1995; Martinelli, Julien; Heinonen, Markus; Kabeshov, Mikhail; Janet, Jon Paul; Nittinger, Eva; Engkvist, Ola, 1967; Kaski, Samuel
– Name: TitleSource
  Label: Source
  Group: Src
  Data: Journal of Cheminformatics yasminenahal/hitl-al-gomg: hitl-al-gomg .5. 16(1)
– Name: Subject
  Label: Subject Terms
  Group: Su
  Data: Active learning; Goal-oriented molecule generation; Interactive algorithms; Human-in-the-loop; Machine learning
– Name: Format
  Label: File Description
  Group: SrcInfo
  Data: electronic
– Name: URL
  Label: Access URL
  Group: URL
  Data: https://research.chalmers.se/publication/544431
    https://research.chalmers.se/publication/544413
    https://research.chalmers.se/publication/544431/file/544431_Fulltext.pdf
PLink https://erproxy.cvtisr.sk/sfx/access?url=https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsswe&AN=edsswe.oai.research.chalmers.se.5490b520.c090.4e36.a8b0.51f47689b066
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1186/s13321-024-00924-y
    Languages:
      – Text: English
    Subjects:
      – SubjectFull: Active learning
        Type: general
      – SubjectFull: Goal-oriented molecule generation
        Type: general
      – SubjectFull: Interactive algorithms
        Type: general
      – SubjectFull: Human-in-the-loop
        Type: general
      – SubjectFull: Machine learning
        Type: general
    Titles:
      – TitleFull: Human-in-the-loop active learning for goal-oriented molecule generation
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Nahal, Yasmine
      – PersonEntity:
          Name:
            NameFull: Menke, Janosch
      – PersonEntity:
          Name:
            NameFull: Martinelli, Julien
      – PersonEntity:
          Name:
            NameFull: Heinonen, Markus
      – PersonEntity:
          Name:
            NameFull: Kabeshov, Mikhail
      – PersonEntity:
          Name:
            NameFull: Janet, Jon Paul
      – PersonEntity:
          Name:
            NameFull: Nittinger, Eva
      – PersonEntity:
          Name:
            NameFull: Engkvist, Ola
      – PersonEntity:
          Name:
            NameFull: Kaski, Samuel
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 01
              Type: published
              Y: 2024
          Identifiers:
            – Type: issn-print
              Value: 17582946
            – Type: issn-locals
              Value: SWEPUB_FREE
            – Type: issn-locals
              Value: CTH_SWEPUB
          Numbering:
            – Type: volume
              Value: 16
            – Type: issue
              Value: 1
          Titles:
            – TitleFull: Journal of Cheminformatics yasminenahal/hitl-al-gomg: hitl-al-gomg .5
              Type: main
ResultId 1