Evaluation of methods for modeling transcription factor sequence specificity



Bibliographic details
Published in:	Nature Biotechnology, vol. 31, no. 2, pp. 126-134
Main authors:	Weirauch, Matthew T, Cote, Atina, Norel, Raquel, Annala, Matti, Zhao, Yue, Riley, Todd R, Saez-Rodriguez, Julio, Cokelaer, Thomas, Vedenko, Anastasia, Talukder, Shaheynoor, Bussemaker, Harmen J, Morris, Quaid D, Bulyk, Martha L, Stolovitzky, Gustavo, Hughes, Timothy R
Format:	Journal Article
Language:	English
Published:	New York: Nature Publishing Group US, 1 February 2013
Subjects:	631/114; 631/45/612/822; 631/61/191; Agriculture; Algorithms; analysis; Animals; Binding sites; Bioinformatics; Biomedical Engineering/Biotechnology; Biomedicine; Biotechnology; Computational Biology; Deoxyribonucleic acid; DNA; DNA binding proteins; DNA sequencing; DNA-Binding Proteins - chemistry; DNA-Binding Proteins - genetics; Genome; Genomics; Health aspects; Life Sciences; Mice; Nucleotide Motifs - genetics; Nucleotide sequencing; Physiological aspects; Position-Specific Scoring Matrices; Protein Array Analysis; Proteins; Transcription factors; Transcription Factors - genetics; Transcription Factors - metabolism; United States
ISSN:	1087-0156, 1546-1696

Description
Abstract:	The most comprehensive analysis to date of models of transcription-factor binding specificity reveals the best methods for predicting in vivo binding from in vitro data. Genomic analyses often involve scanning for potential transcription factor (TF) binding sites using models of the sequence specificity of DNA-binding proteins. Many approaches have been developed to model and learn a protein's DNA-binding specificity, but these methods have not been systematically compared. Here we applied 26 such approaches to in vitro protein binding microarray data for 66 mouse TFs belonging to various families. For nine TFs, we also scored the resulting motif models on in vivo data, and found that the best in vitro-derived motifs performed similarly to motifs derived from the in vivo data. Our results indicate that simple models based on mononucleotide position weight matrices trained by the best methods perform similarly to more complex models for most TFs examined, but fall short in specific cases (<10% of the TFs examined here). In addition, the best-performing motifs typically have relatively low information content, consistent with widespread degeneracy in eukaryotic TF sequence preferences.
DOI:	10.1038/nbt.2486
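
Illustrative note: the abstract refers to scanning sequences with mononucleotide position weight matrices and to the information content of motifs. The Python sketch below shows one common way these two quantities are computed. It is not taken from the article: the uniform background (0.25 per base), the pseudocount, the function names, and the toy three-position matrix are all assumptions made for illustration, and only one strand is scanned.

```python
import math

BASES = "ACGT"  # column order assumed for every matrix below


def information_content(ppm):
    """Total information content in bits of a position probability matrix,
    assuming a uniform background of 0.25 per base."""
    total = 0.0
    for column in ppm:
        entropy = -sum(p * math.log2(p) for p in column if p > 0)
        total += 2.0 - entropy  # 2 bits is the maximum per DNA position
    return total


def log_odds(ppm, background=0.25, pseudocount=0.01):
    """Convert probabilities to log2-odds scores against the background.
    The pseudocount (an assumed value) avoids taking log2 of zero."""
    pwm = []
    for column in ppm:
        norm = sum(column) + 4 * pseudocount
        pwm.append([math.log2(((p + pseudocount) / norm) / background)
                    for p in column])
    return pwm


def best_site(sequence, pwm):
    """Return (score, offset) of the highest-scoring window on one strand,
    or None if the sequence is shorter than the matrix."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][BASES.index(base)] for i, base in enumerate(window))
        if best is None or score > best[0]:
            best = (score, start)
    return best


# Hypothetical toy motif that prefers T, G, A at its three positions.
toy_ppm = [
    [0.05, 0.05, 0.05, 0.85],
    [0.05, 0.05, 0.85, 0.05],
    [0.85, 0.05, 0.05, 0.05],
]
toy_bits = information_content(toy_ppm)
toy_hit = best_site("CCTGACC", log_odds(toy_ppm))
```

In this sketch a degenerate motif, one whose columns are closer to uniform, yields a lower `information_content` value, which is the sense in which the abstract describes the best-performing motifs as having relatively low information content.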