Improved epigenetic age prediction models by combining sex chromosome and autosomal markers

Bibliographic Details
Published in:Epigenetics & chromatin Vol. 18; no. 1; pp. 45 - 13
Main Authors: Wan, Zhong; Henneman, Peter; Hoefsloot, Huub C. J.; Kloosterman, Ate D.; Verschure, Pernette J.
Format: Journal Article
Language:English
Published: London BioMed Central 15.07.2025
BioMed Central Ltd
Springer Nature B.V
BMC
ISSN:1756-8935
Description
Summary:Background Alterations in epigenetic DNA methylation (DNAm) can be used as an accurate and robust method for biological age prediction. Because DNAm-based prediction modeling has predominantly relied on DNAm patterns on autosomes, we assessed the feasibility of incorporating sex chromosomal DNAm markers into an age prediction model based on six autosomal DNAm CpG markers.

Results We employed random forest regression (RFR) to construct age prediction models with publicly available DNAm Infinium 450K microarray data of sex chromosomes from human whole blood and buffy coat samples, and assessed RFR model performance by the root-mean squared error (RMSE) and the mean absolute deviation (MAD) of cross-validation. Four types of models were constructed: models using DNAm probes on sex chromosomes only; models using probes on sex chromosomes and autosomes together; models using probes on sex chromosomes and/or autosomes with additional stratification by sex and/or age restriction; and reduced models combining the best-performing sex chromosomal probes with the six best-performing autosomal probes from a previous study. Our data indicated no added predictive value of Y chromosomal DNAm markers in our best-performing prediction model, even though we acknowledge the potential of Y chromosomal markers for age prediction. Yet, a significantly improved accuracy of age prediction was observed using a restricted set of X chromosomal probes combined with the six best-predicting autosomal DNAm probes. This reduced model yielded an RMSE of 2.54 years and an MAD of 1.89 years. In particular, four DNAm markers on the X chromosome exhibited a strong correlation with age: cg27064949 (DGAT2L6), cg04532200 (PLXNB3), cg01882566 (RPGR) and cg25140188 (annotated to an intergenic region).

Conclusions Our findings illustrate that an age prediction model built with a set of sex chromosomal markers combined with autosomal age-informative markers may serve as a high-accuracy model to predict chronological age, and may even be competitive with commonly used models built with autosomal DNAm markers only. This study represents a step towards the application of combined autosomal and sex chromosomal epigenetic age prediction models in aging and forensic research.

Highlights
A set of age prediction models based on DNA methylation (DNAm) markers on sex chromosomes and autosomes was constructed using random forest regression (RFR).
From the total dataset containing 1291 whole blood and 547 buffy coat blood samples, 860 whole blood samples were used as training set and 481 as test set, while 365 buffy coat samples were used as training set and 182 as test set.
Cross-validation of the constructed RFR models using more than 10,000 X chromosomal and 30 Y chromosomal DNAm markers from all collected blood samples provided a root-mean squared error (RMSE) ranging from 7.70 to 14.29 years and a mean absolute deviation (MAD) ranging from 6.10 to 11.13 years.
Models constructed using sex-stratified and age-restricted data subsets demonstrated RMSE and MAD values comparable to those of models constructed without stratification or restriction.
Models constructed using a selected set of 37 X chromosomal and six autosomal DNAm markers exhibited significantly improved age prediction performance, with a minimum RMSE of 2.54 years and MAD of 1.89 years.
A total of four X chromosomal DNAm markers were found to exhibit a significant correlation with age, as indicated by a Spearman correlation coefficient of 0.50.
In our data sets, Y chromosomal DNAm markers did not enhance the predictive performance of our best-performing age prediction model, even though we acknowledge their recognized potential for age prediction accuracy.
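
The summary reports model performance as RMSE and MAD and marker-age association as a Spearman correlation coefficient. The following is a minimal Python sketch of how these three quantities are commonly computed. It assumes MAD means the mean absolute difference between predicted and chronological age; the function names and the toy values are hypothetical illustrations and are not taken from the study.

```python
import math


def rmse(actual, predicted):
    """Root-mean squared error between chronological and predicted ages."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mad(actual, predicted):
    """Mean absolute deviation between chronological and predicted ages."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values for illustration only (not data from the study).
ages = [20, 35, 50, 65]                   # chronological age in years
predicted_ages = [22, 33, 53, 64]         # hypothetical model output
methylation = [0.10, 0.20, 0.15, 0.40]    # hypothetical beta values of one CpG

print(f"RMSE: {rmse(ages, predicted_ages):.2f} years")
print(f"MAD: {mad(ages, predicted_ages):.2f} years")
print(f"Spearman rho: {spearman(methylation, ages):.2f}")
```

RMSE penalizes large individual errors more strongly than MAD does, which is one reason the two are often reported together, as in the summary above.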
DOI:10.1186/s13072-025-00606-5