A Simple Model-Based Approach to Inferring and Visualizing Cancer Mutation Signatures

Bibliographic Details
Published in:	PLoS genetics Vol. 11; no. 12; p. e1005657
Main Authors:	Shiraishi, Yuichi, Tremmel, Georg, Miyano, Satoru, Stephens, Matthew
Format:	Journal Article
Language:	English
Published:	United States, Public Library of Science (PLoS), 01.12.2015
Subjects:	Algorithms; Amino Acid Substitution - genetics; Binding sites; Cancer; Carcinoma - genetics; Carcinoma - pathology; Cell growth; Cluster Analysis; Deoxyribonucleic acid; DNA; DNA Mutational Analysis; Epigenomics; Experiments; Gene mutations; Genetic aspects; Genome; Genomes; Humans; Laboratories; Melanoma; Metastasis; Methods; Models, Statistical; Models, Theoretical; Mutation; Mutation - genetics; Neoplasms - genetics; Neoplasms - pathology; Observations; Tobacco smoke; Transcription, Genetic; Visualization
ISSN:	1553-7404, 1553-7390

Description
Summary:	Recent advances in sequencing technologies have enabled the production of massive amounts of data on somatic mutations from cancer genomes. These data have led to the detection of characteristic patterns of somatic mutations or "mutation signatures" at an unprecedented resolution, with the potential for new insights into the causes and mechanisms of tumorigenesis. Here we present new methods for modelling, identifying and visualizing such mutation signatures. Our methods greatly simplify mutation signature models compared with existing approaches, reducing the number of parameters by orders of magnitude even while increasing the contextual factors (e.g. the number of flanking bases) that are accounted for. This improves both sensitivity and robustness of inferred signatures. We also provide a new intuitive way to visualize the signatures, analogous to the use of sequence logos to visualize transcription factor binding sites. We illustrate our new method on somatic mutation data from urothelial carcinoma of the upper urinary tract, and a larger dataset from 30 diverse cancer types. The results illustrate several important features of our methods, including the ability of our new visualization tool to clearly highlight the key features of each signature, the improved robustness of signature inferences from small sample sizes, and more detailed inference of signature characteristics such as strand biases and sequence context effects at the base two positions 5' to the mutated site. The overall framework of our work is based on probabilistic models that are closely connected with "mixed-membership models" which are widely used in population genetic admixture analysis, and in machine learning for document clustering. We argue that recognizing these relationships should help improve understanding of mutation signature extraction problems, and suggest ways to further improve the statistical methods.
Our methods are implemented in an R package pmsignature (https://github.com/friend1ws/pmsignature) and a web application available at https://friend1ws.shinyapps.io/pmsignature_shiny/.
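The summary's claim that the parameter count drops by orders of magnitude can be made concrete with a small counting sketch. It rests on an assumption that this record does not spell out: that the simplified model gives each mutation feature (substitution type, each flanking base, optionally strand) its own independent categorical distribution, whereas an unconstrained model uses one categorical distribution over every combined pattern. The function names are hypothetical and are not part of pmsignature.

```python
# Illustrative parameter counting for ONE mutation signature.
# Assumption (not stated in this record): the simplified model treats each
# mutation feature as an independent categorical distribution.

N_SUBSTITUTIONS = 6  # substitution types after collapsing complementary pairs
N_BASES = 4          # A, C, G, T at each flanking position
N_STRANDS = 2        # transcribed / non-transcribed


def full_model_params(n_flank: int, use_strand: bool = False) -> int:
    """Free parameters of a single categorical over all combined patterns."""
    patterns = N_SUBSTITUTIONS * N_BASES ** n_flank
    if use_strand:
        patterns *= N_STRANDS
    return patterns - 1  # probabilities sum to one


def independent_model_params(n_flank: int, use_strand: bool = False) -> int:
    """Free parameters when every feature has its own categorical."""
    params = (N_SUBSTITUTIONS - 1) + n_flank * (N_BASES - 1)
    if use_strand:
        params += N_STRANDS - 1
    return params


if __name__ == "__main__":
    # n_flank counts flanking positions in total (e.g. 4 = two 5' and two 3').
    for n_flank in (2, 4):
        for use_strand in (False, True):
            print(
                n_flank,
                use_strand,
                full_model_params(n_flank, use_strand),
                independent_model_params(n_flank, use_strand),
            )
```

Under these assumptions the unconstrained count grows exponentially with the number of flanking positions, while the independent count grows linearly, which is one way to read the "orders of magnitude" reduction.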
Bibliography:	Conceived and designed the experiments: YS MS. Performed the experiments: YS. Analyzed the data: YS MS. Contributed reagents/materials/analysis tools: GT SM. Wrote the paper: YS MS. Designed visualization of mutation signature: GT. The authors have declared that no competing interests exist.
DOI:	10.1371/journal.pgen.1005657