Decoding Lung Cancer Radiogenomics: A Custom Clustering/Classification Methodology to Simultaneously Identify Important Imaging Features and Relevant Genes
| Published in: | Applied Sciences, Vol. 15, No. 7, p. 4053 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Basel: MDPI AG, 1 April 2025 |
| Subjects: | |
| ISSN: | 2076-3417 |
| Summary: | Background: This study evaluated a custom machine learning algorithm for radiogenomic analysis of lung cancer genetic and imaging data, testing whether a custom clustering/classification method could simultaneously identify imaging features that correspond to genetic markers. Methods: CT imaging data and genetic mutation data for 281 subjects with NSCLC were collected from the CPTAC-LUAD and TCGA-LUSC databases on TCIA. The algorithm was run as follows: (1) genetic clusters were initialized using random clusters, binary matrix factorization, or k-means; (2) image classification was run on CT data for these genetic clusters; (3) misclassified subjects were re-classified based on the image classification algorithm; and (4) the algorithm was repeated until it reached an accuracy of 90% or showed no improvement after 10 runs. Input genetic mutations were evaluated for potential medical treatments and severity to provide clinical relevance. Results: The image classification algorithm achieved >90% accuracy after nine algorithm runs and regrouped subjects from five starting clusters into four final clusters; the final image classification accuracy was better than the accuracy obtained with any of the initial clusterings. These clusters were stable across all three test runs. A total of thirty-eight genes from the top hundred across each subject were identified with specific severity or treatment data; twelve of these genes are listed. Conclusion: This small pilot study presented a potential way to identify genetic patterns from image data, along with a methodology that could group images with no labels or only partial labels in future problems. |
| DOI: | 10.3390/app15074053 |
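
The four-step loop described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say which image classifier was used, so a nearest-centroid classifier stands in for it, accuracy is measured on the same subjects rather than on a held-out set, and the names `fit_centroids`, `predict`, and `iterative_relabel` are hypothetical. Only the stopping rule (90% accuracy, or 10 runs without improvement) is taken from the summary.

```python
def fit_centroids(features, labels):
    """Stand-in classifier training: mean feature vector per cluster label."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign x to the cluster with the closest centroid (squared distance)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def iterative_relabel(features, init_labels, target_acc=0.90, patience=10,
                      max_iter=100):
    """Repeat classify-and-reassign until the stopping rule is met.

    features    : per-subject image feature vectors (assumed precomputed)
    init_labels : initial genetic cluster per subject (step 1)
    Returns the final labels and the accuracy recorded on each run.
    """
    labels = list(init_labels)
    best_acc, stale, history = 0.0, 0, []
    for _ in range(max_iter):  # max_iter is a safety cap added for this sketch
        centroids = fit_centroids(features, labels)        # step 2: train
        preds = [predict(centroids, x) for x in features]  # step 2: classify
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        history.append(acc)
        if acc >= target_acc:                              # step 4: target met
            break
        if acc > best_acc:
            best_acc, stale = acc, 0
        else:
            stale += 1
            if stale >= patience:                          # step 4: no progress
                break
        labels = preds  # step 3: misclassified subjects take the predicted cluster
    return labels, history


# Toy usage: four subjects, one image feature each, deliberately poor start.
toy_features = [[0.0], [0.2], [5.0], [5.2]]
toy_labels, toy_history = iterative_relabel(toy_features, [0, 1, 0, 1])
print(toy_labels, toy_history)
```

Because a subject is moved whenever the classifier disagrees with its current cluster, a cluster can end up empty and disappear, which is one way a run could start with five clusters and finish with four.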