Precise physical models of protein-DNA interaction from high-throughput data

A cell's ability to regulate gene transcription depends in large part on the energy with which transcription factors (TFs) bind their DNA regulatory sites. Obtaining accurate models of this binding energy is therefore an important goal for quantitative biology. In this article, we present a pri...

Bibliographic Details
Published in: Proceedings of the National Academy of Sciences - PNAS Vol. 104; no. 2; p. 501
Main Authors: Kinney, Justin B; Tkacik, Gasper; Callan, Jr, Curtis G
Format: Journal Article
Language: English
Published: United States 09.01.2007
ISSN: 0027-8424
Abstract: A cell's ability to regulate gene transcription depends in large part on the energy with which transcription factors (TFs) bind their DNA regulatory sites. Obtaining accurate models of this binding energy is therefore an important goal for quantitative biology. In this article, we present a principled likelihood-based approach for inferring physical models of TF-DNA binding energy from the data produced by modern high-throughput binding assays. Central to our analysis is the ability to assess the relative likelihood of different model parameters given experimental observations. We take a unique approach to this problem and show how to compute likelihood without any explicit assumptions about the noise that inevitably corrupts such measurements. Sampling possible choices for model parameters according to this likelihood function, we can then make probabilistic predictions for the identities of binding sites and their physical binding energies. Applying this procedure to previously published data on the Saccharomyces cerevisiae TF Abf1p, we find models of TF binding whose parameters are determined with remarkable precision. Evidence for the accuracy of these models is provided by an astonishing level of phylogenetic conservation in the predicted energies of putative binding sites. Results from in vivo and in vitro experiments also provide highly consistent characterizations of Abf1p, a result that contrasts with a previous analysis of the same data.
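The abstract describes the workflow only in outline: score a candidate binding-energy model against assay data with a likelihood that needs no explicit noise model, then sample model parameters according to that likelihood. The following is a minimal sketch of that idea, not the authors' implementation. It assumes an additive energy matrix (one energy per position and base), and it uses a toy score equal to the number of sites times the mutual information between a median-thresholded energy prediction and bound/unbound labels. All function names, the median threshold, and the toy score are illustrative assumptions that the abstract does not specify.

```python
import math
import random

BASES = "ACGT"


def site_energy(matrix, site):
    """Additive model (assumption): sum one matrix entry per site position."""
    return sum(matrix[i][BASES.index(b)] for i, b in enumerate(site))


def mutual_information(xs, ys):
    """Plug-in mutual information, in nats, between two discrete sequences."""
    n = len(xs)
    joint, px, py = {}, {}, {}
    for x, y in zip(xs, ys):
        joint[(x, y)] = joint.get((x, y), 0) + 1
        px[x] = px.get(x, 0) + 1
        py[y] = py.get(y, 0) + 1
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def log_likelihood(matrix, sites, bound):
    """Toy noise-agnostic score (assumption): N times the mutual information
    between a thresholded energy prediction and the observed labels."""
    energies = [site_energy(matrix, s) for s in sites]
    cutoff = sorted(energies)[len(energies) // 2]  # median split (assumption)
    predicted = [e < cutoff for e in energies]     # low energy = predicted bound
    return len(sites) * mutual_information(predicted, bound)


def metropolis(sites, bound, length, steps=2000, step_size=0.3, seed=0):
    """Sample energy matrices with single-entry Metropolis updates."""
    rng = random.Random(seed)
    matrix = [[rng.gauss(0, 1) for _ in BASES] for _ in range(length)]
    current = log_likelihood(matrix, sites, bound)
    samples = []
    for _ in range(steps):
        i, j = rng.randrange(length), rng.randrange(len(BASES))
        old = matrix[i][j]
        matrix[i][j] = old + rng.gauss(0, step_size)  # symmetric proposal
        proposed = log_likelihood(matrix, sites, bound)
        # Accept with probability min(1, exp(proposed - current)).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            current = proposed
        else:
            matrix[i][j] = old  # reject: restore the previous entry
        samples.append([row[:] for row in matrix])
    return samples
```

Calling `metropolis(sites, bound, length=len(sites[0]))` on equal-length site strings and a matching list of boolean labels returns one matrix per step. Averaging `site_energy` over those samples would give a probabilistic energy prediction of the kind the abstract mentions, under the assumptions stated above.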
Author Affiliation: Kinney, Justin B: Physics Department and Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
DOI: 10.1073/pnas.0609908104
Grant Information: NIGMS NIH HHS, grant P50 GM071508
Open Access Link: https://www.pnas.org/doi/pdf/10.1073/pnas.0609908104
PMID: 17197415
Subject Terms:
Binding Sites
Biophysical Phenomena
Biophysics
DNA - chemistry
DNA - metabolism
DNA, Fungal - chemistry
DNA, Fungal - metabolism
DNA-Binding Proteins - chemistry
DNA-Binding Proteins - metabolism
Likelihood Functions
Models, Chemical
Protein Array Analysis
Protein Binding
Saccharomyces cerevisiae - metabolism
Saccharomyces cerevisiae Proteins - chemistry
Saccharomyces cerevisiae Proteins - metabolism
Thermodynamics
Transcription Factors - chemistry
Transcription Factors - metabolism
URI https://www.ncbi.nlm.nih.gov/pubmed/17197415
https://www.proquest.com/docview/68413204