CBCR: A Curriculum Based Strategy For Chromosome Reconstruction

In this paper, we introduce a novel algorithm that aims to estimate chromosomes’ structure from their Hi-C contact data, called Curriculum Based Chromosome Reconstruction (CBCR). Specifically, our method performs this three dimensional reconstruction using cis-chromosomal interactions from Hi-C data...

Full description

Saved in:
Bibliographic Details
Published in:International journal of molecular sciences Vol. 22; no. 8; p. 4140
Main Authors: Hovenga, Van, Oluwadare, Oluwatosin
Format: Journal Article
Language:English
Published: Switzerland MDPI AG 16.04.2021
MDPI
Subjects:
ISSN:1422-0067, 1661-6596, 1422-0067
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:In this paper, we introduce a novel algorithm that aims to estimate chromosomes’ structure from their Hi-C contact data, called Curriculum Based Chromosome Reconstruction (CBCR). Specifically, our method performs this three dimensional reconstruction using cis-chromosomal interactions from Hi-C data. CBCR takes intra-chromosomal Hi-C interaction frequencies as an input and outputs a set of xyz coordinates that estimate the chromosome’s three dimensional structure in the form of a .pdb file. The algorithm relies on progressively training a distance-restraint-based algorithm with a strategy we refer to as curriculum learning. Curriculum learning divides the Hi-C data into classes based on contact frequency and progressively re-trains the distance-restraint algorithm based on the assumed importance of each curriculum in predicting the underlying chromosome structure. The distance-restraint algorithm relies on a modification of a Gaussian maximum likelihood function that scales probabilities based on the importance of features. We evaluate the performance of CBCR on both simulated and actual Hi-C data and perform validation on FISH, HiChIP, and ChIA-PET data as well. We also compare the performance of CBCR to several current methods. Our analysis shows that the use of curricula affects the rate of convergence of the optimization while decreasing the computational cost of our distance-restraint algorithm. Also, CBCR is more robust to increases in data resolution and therefore yields superior reconstruction accuracy of higher resolution data than all other methods in our comparison.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:1422-0067
1661-6596
1422-0067
DOI:10.3390/ijms22084140