Diff-SE: A Diffusion-Augmented Contrastive Learning Framework for Super-Enhancer Prediction
| Title: | Diff-SE: A Diffusion-Augmented Contrastive Learning Framework for Super-Enhancer Prediction |
|---|---|
| Authors: | Haolu Zhou, Yu Han, Yude Bai, Yun Zuo, Wenying He, Fei Guo |
| Year of publication: | 2025 |
| Subjects: | Biological Sciences not elsewhere classified, Information Systems not elsewhere classified, severe class imbalance, recent computational approaches, play crucial roles, mouse cell lines, maximizing intraclass similarity, leveraged sequence features, enhance feature representation, deep learning framework, based data augmentation, contrastive learning strategy, se consistently outperforms, diffusion module models, enhancer prediction super, contrastive learning, se prediction, data sets, species validation, seq experiments, regulatory elements, often suffer, interclass separation, integrates diffusion, gene expression, continuous distribution, baseline model |
| Description: | Super-enhancers (SEs) are cis-regulatory elements that play crucial roles in gene expression and are implicated in diseases such as cancer and Alzheimer’s. Traditional identification methods rely on ChIP-seq experiments, which are costly and time-consuming. While recent computational approaches have leveraged sequence features for SE prediction, they often suffer from severe class imbalance and poor generalization across species. To address these limitations, we propose Diff-SE, a deep learning framework that integrates diffusion-based data augmentation with contrastive learning. The diffusion module models the continuous distribution of SEs to generate biologically meaningful synthetic positive samples, effectively balancing training data. A contrastive learning strategy is then used to enhance feature representation by maximizing intraclass similarity and interclass separation. Experimental results across eight data sets demonstrate that Diff-SE consistently outperforms the baseline model, achieving 10%–30% improvements in precision (PRE), Matthews correlation coefficient (MCC), and F1-score. Furthermore, Diff-SE exhibits superior generalization in cross-species validation between human and mouse cell lines. The code and data sets are available at https://github.com/15831959673/Diff-SE, enabling further research and applications in SE prediction. |
| Document type: | article in journal/newspaper |
| Language: | unknown |
| DOI: | 10.1021/acs.jcim.5c01005.s001 |
| Availability: | https://doi.org/10.1021/acs.jcim.5c01005.s001 https://figshare.com/articles/journal_contribution/Diff-SE_A_Diffusion-Augmented_Contrastive_Learning_Framework_for_Super-Enhancer_Prediction/29481096 |
| Rights: | CC BY-NC 4.0 |
| Accession number: | edsbas.7DB86652 |
| Database: | BASE |
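
The abstract reports results in terms of precision (PRE), Matthews correlation coefficient (MCC), and F1-score, all of which are derived from confusion-matrix counts and are commonly used when positives are rare. As a minimal sketch of how these three metrics are computed, the Python below uses the standard textbook definitions; the function names and the toy labels are illustrative assumptions and are not taken from the Diff-SE repository.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 marks a super-enhancer."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def precision(tp, fp):
    """Fraction of predicted positives that are true positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy, hypothetical labels: 2 positives among 10 samples (imbalanced).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(precision(tp, fp), f1_score(tp, fp, fn), mcc(tp, fp, tn, fn))
```

On these toy labels, accuracy would be 0.8 even though only one of the two positives is recovered, which is why precision, F1-score, and MCC tend to be more informative under class imbalance.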