StarCDP: Dynamic Programming Algorithms for Fast and Accurate Cell Lineage Tree Reconstruction from CRISPR-Based Lineage Tracing Data
CRISPR-based lineage tracing, coupled with single-cell RNA sequencing, has emerged as a promising approach for studying development and disease progression at the cellular level. Thus, cell lineage tree (CLT) reconstruction has attracted significant attention in recent years, including the introduct...
| Published in: | Journal of computational biology |
|---|---|
| Main authors: | Dai, Junyan; Molloy, Erin K |
| Format: | Journal Article |
| Language: | English |
| Publication details: | United States, 23.10.2025 |
| Subjects: | |
| ISSN: | 1557-8666 |
| Online access: | Get access details |
| Abstract | CRISPR-based lineage tracing, coupled with single-cell RNA sequencing, has emerged as a promising approach for studying development and disease progression at the cellular level. Thus, cell lineage tree (CLT) reconstruction has attracted significant attention in recent years, including the introduction of Star Homoplasy Parsimony (SHP) to model the unique properties of CRISPR-induced mutations, along with the Startle family of methods. However, CLT reconstruction continues to be challenged by technological limitations in producing consistent phylogenetic signals across CLTs. To address these issues, we present Star-CDP, a collection of dynamic programming algorithms that enable researchers to seek, count, sample, and build consensus trees from solutions to SHP within a constrained search space, defined by subsets of cells from which a solution must draw its clades. When using our procedure to construct clade constraints, Star-CDP runs in polynomial time, enabling scalability to larger numbers of cells than Startle-ILP (integer linear programming), the leading method for SHP. In simulations, Star-CDP's strict consensus achieved the same or higher accuracy (f1-score) compared to the leading parsimony methods, with the greatest gains in accuracy occurring when the phylogenetic signal was limited due to the high ratio of cells to mutations. On lineage tracing data from a mouse model of lung adenocarcinoma, Star-CDP's strict consensus achieved the lowest SHP score and comparable numbers of metastatic reseedings compared to PAUP*'s strict consensus and Startle-NNI (nearest neighbor interchange), all benchmarked on a standard data processing pipeline (although our study also revealed that the pipeline can impact relative performance for migrations/reseedings). Star-CDP is available on GitHub: https://github.com/molloy-lab/Star-CDP. |
|---|---|
| Author | Dai, Junyan; Molloy, Erin K |
| Author_xml | 1. Dai, Junyan (University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, Maryland, USA); 2. Molloy, Erin K (University of Maryland Institute for Advanced Computer Studies, University of Maryland, College Park, Maryland, USA) |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/41173524 (View this record in MEDLINE/PubMed) |
| ContentType | Journal Article |
| DBID | NPM 7X8 |
| DOI | 10.1177/15578666251386082 |
| DatabaseName | PubMed MEDLINE - Academic |
| DatabaseTitle | PubMed MEDLINE - Academic |
| DatabaseTitleList | MEDLINE - Academic PubMed |
| Database_xml | – sequence: 1 dbid: NPM name: PubMed url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: 7X8 name: MEDLINE - Academic url: https://search.proquest.com/medline sourceTypes: Aggregation Database |
| Discipline | Biology; Mathematics |
| EISSN | 1557-8666 |
| ExternalDocumentID | 41173524 |
| Genre | Journal Article |
| ISICitedReferencesCount | 0 |
| ISSN | 1557-8666 |
| IngestDate | Sat Nov 01 19:42:12 EDT 2025 Mon Nov 03 02:11:20 EST 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | CRISPR phylogenetics; Camin-Sokal; lineage tracing; parsimony; star homoplasy |
| Language | English |
| PMID | 41173524 |
| PQID | 3267679289 |
| PQPubID | 23479 |
| ParticipantIDs | proquest_miscellaneous_3267679289 pubmed_primary_41173524 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-Oct-23 20251023 |
| PublicationDateYYYYMMDD | 2025-10-23 |
| PublicationDate_xml | – month: 10 year: 2025 text: 2025-Oct-23 day: 23 |
| PublicationDecade | 2020 |
| PublicationPlace | United States |
| PublicationPlace_xml | – name: United States |
| PublicationTitle | Journal of computational biology |
| PublicationTitleAlternate | J Comput Biol |
| PublicationYear | 2025 |
| SSID | ssj0013607 |
| Score | 2.4480093 |
| SecondaryResourceType | online_first |
| SourceID | proquest pubmed |
| SourceType | Aggregation Database Index Database |
| Title | StarCDP: Dynamic Programming Algorithms for Fast and Accurate Cell Lineage Tree Reconstruction from CRISPR-Based Lineage Tracing Data |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/41173524 https://www.proquest.com/docview/3267679289 |
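
The abstract describes dynamic programming over a constrained search space in which every clade of a solution must be drawn from a given collection of cell subsets. The sketch below illustrates only that general idea, under stated assumptions: it counts all rooted binary trees whose clades come from an allowed set and ignores the SHP score entirely. It is not the Star-CDP implementation, and the function name `count_constrained_trees` and its arguments are hypothetical.

```python
from functools import lru_cache


def count_constrained_trees(leaves, allowed_clades):
    """Count rooted binary trees on `leaves` whose clades all lie in `allowed_clades`.

    Illustrative only: singleton clades and the full leaf set are always
    treated as allowed (an assumption of this sketch).
    """
    allowed = {frozenset(c) for c in allowed_clades}
    allowed |= {frozenset([leaf]) for leaf in leaves}
    root = frozenset(leaves)
    allowed.add(root)

    @lru_cache(maxsize=None)
    def count(clade):
        # A single cell is a tree by itself.
        if len(clade) == 1:
            return 1
        # Require the smallest leaf to sit in the left child so that each
        # unordered split {left, right} is counted exactly once.
        anchor = min(clade)
        total = 0
        for left in allowed:
            if anchor in left and left < clade:  # proper subset of clade
                right = clade - left
                if right in allowed:
                    total += count(left) * count(right)
        return total

    return count(root)


if __name__ == "__main__":
    cells = ["a", "b", "c", "d"]
    constraints = [{"a", "b"}, {"c", "d"}, {"a", "b", "c"}, {"b", "c"}]
    print(count_constrained_trees(cells, constraints))  # 3
```

Because each clade is solved once and every candidate split is checked against the allowed set, the work in this sketch grows with the square of the number of allowed clades rather than with the number of possible trees. The same recursion shape could carry a minimum score instead of a count, though how Star-CDP scores clades under SHP is not specified in this record.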