StarCDP: Dynamic Programming Algorithms for Fast and Accurate Cell Lineage Tree Reconstruction from CRISPR-Based Lineage Tracing Data

Published in: Journal of Computational Biology
Main authors: Dai, Junyan; Molloy, Erin K
Format: Journal Article
Language: English
Publication details: United States, 23 October 2025
ISSN: 1557-8666
Abstract: CRISPR-based lineage tracing, coupled with single-cell RNA sequencing, has emerged as a promising approach for studying development and disease progression at the cellular level. Thus, cell lineage tree (CLT) reconstruction has attracted significant attention in recent years, including the introduction of Star Homoplasy Parsimony (SHP) to model the unique properties of CRISPR-induced mutations, along with the Startle family of methods. However, CLT reconstruction continues to be challenged by technological limitations in producing consistent phylogenetic signals across CLTs. To address these issues, we present Star-CDP, a collection of dynamic programming algorithms that enable researchers to seek, count, sample, and build consensus trees from solutions to SHP within a constrained search space, defined by subsets of cells from which a solution must draw its clades. When using our procedure to construct clade constraints, Star-CDP runs in polynomial time, enabling scalability to larger numbers of cells than Startle-ILP (integer linear programming), the leading method for SHP. In simulations, Star-CDP's strict consensus achieved the same or higher accuracy (F1 score) compared to the leading parsimony methods, with the greatest gains in accuracy occurring when the phylogenetic signal was limited due to the high ratio of cells to mutations. On lineage tracing data from a mouse model of lung adenocarcinoma, Star-CDP's strict consensus achieved the lowest SHP score and comparable numbers of metastatic reseedings compared to PAUP*'s strict consensus and Startle-NNI (nearest neighbor interchange), all benchmarked on a standard data processing pipeline (although our study also revealed that the pipeline can impact relative performance for migrations/reseedings). Star-CDP is available on GitHub: https://github.com/molloy-lab/Star-CDP.
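The abstract describes counting solutions inside a search space defined by subsets of cells from which a tree must draw its clades. The general idea of dynamic programming over such a clade-constrained space can be illustrated with a minimal sketch. This is not the Star-CDP implementation: the function name `count_constrained_trees` and its inputs are hypothetical, the sketch counts all rooted binary trees in the constrained space rather than only SHP-optimal ones, and it ignores the mutation data entirely.

```python
from functools import lru_cache


def count_constrained_trees(cells, allowed_clades):
    """Count rooted binary trees on `cells` whose clades all come from
    `allowed_clades` (hypothetical helper, for illustration only)."""
    full = frozenset(cells)
    # Assumption: single cells (leaves) and the full cell set (root)
    # are always permitted as clades.
    allowed = {frozenset(c) for c in allowed_clades}
    allowed |= {frozenset([x]) for x in full}
    allowed.add(full)

    @lru_cache(maxsize=None)
    def count(clade):
        if len(clade) == 1:
            return 1  # a leaf can be resolved in exactly one way
        # Pin one cell to the "left" child so each unordered split
        # {left, right} is visited exactly once.
        anchor = min(clade)
        total = 0
        for left in allowed:
            if anchor in left and left < clade:  # proper subset of clade
                right = clade - left
                if right in allowed:
                    total += count(left) * count(right)
        return total

    return count(full)


# Two trees fit these constraints: ((a,b),(c,d)) and (((a,b),c),d).
print(count_constrained_trees("abcd", [{"a", "b"}, {"c", "d"}, {"a", "b", "c"}]))  # → 2
```

Each allowed clade is solved once and each solve scans the allowed set once, so the work in this sketch grows polynomially with the number of allowed clades, which is the general reason a constrained search space can keep such a recursion tractable.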
Author Dai, Junyan
Molloy, Erin K
Author_xml – sequence: 1
  givenname: Junyan
  surname: Dai
  fullname: Dai, Junyan
  organization: University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, Maryland, USA
– sequence: 2
  givenname: Erin K
  surname: Molloy
  fullname: Molloy, Erin K
  organization: University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, Maryland, USA
BackLink https://www.ncbi.nlm.nih.gov/pubmed/41173524 (View this record in MEDLINE/PubMed)
ContentType Journal Article
DBID NPM
7X8
DOI 10.1177/15578666251386082
DatabaseName PubMed
MEDLINE - Academic
DatabaseTitle PubMed
MEDLINE - Academic
DatabaseTitleList MEDLINE - Academic
PubMed
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 2
  dbid: 7X8
  name: MEDLINE - Academic
  url: https://search.proquest.com/medline
  sourceTypes: Aggregation Database
DeliveryMethod no_fulltext_linktorsrc
Discipline Biology
Mathematics
EISSN 1557-8666
ExternalDocumentID 41173524
Genre Journal Article
ISICitedReferencesCount 0
ISSN 1557-8666
IsPeerReviewed true
IsScholarly true
Keywords CRISPR
phylogenetics
Camin-Sokal
lineage tracing
parsimony
star homoplasy
Language English
PMID 41173524
PQID 3267679289
PQPubID 23479
ParticipantIDs proquest_miscellaneous_3267679289
pubmed_primary_41173524
PublicationCentury 2000
PublicationDate 2025-Oct-23
20251023
PublicationDateYYYYMMDD 2025-10-23
PublicationDate_xml – month: 10
  year: 2025
  text: 2025-Oct-23
  day: 23
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle Journal of computational biology
PublicationTitleAlternate J Comput Biol
PublicationYear 2025
SecondaryResourceType online_first
SourceID proquest
pubmed
SourceType Aggregation Database
Index Database
Title StarCDP: Dynamic Programming Algorithms for Fast and Accurate Cell Lineage Tree Reconstruction from CRISPR-Based Lineage Tracing Data
URI https://www.ncbi.nlm.nih.gov/pubmed/41173524
https://www.proquest.com/docview/3267679289