Schmalohr, Corinna Lewis (ORCID: 0000-0001-5012-9679) (2025). Modeling the Tissue-Specific Somatic Mutation Rate Along the Genome Based on Genomic Features. PhD thesis, Universität zu Köln.
PDF
dissertation_publik.pdf - Accepted Version. Made available under the CC license: Creative Commons Attribution Non-commercial Share Alike. Download (18MB)
Abstract
Somatic mutations, arising from unresolved DNA damage, play a critical role in driving cancer development and progression. Previous studies have demonstrated that mutation rates vary throughout the genome and are affected by large-scale genomic determinants. However, these studies frequently overlooked important genomic features or lacked the resolution to examine variation specific to individual tissues and mutation types. To address these limitations, we developed a base-pair resolution model to predict somatic mutation rates in the exome. Using cancer mutation datasets and a diverse set of genomic features, we trained and compared several predictive models, including Random Forest, Generalized Linear Models, and LASSO with stability selection. Random Forest performed best and was selected for the majority of analyses. Our findings highlight robust predictive performance, with improved accuracy for specific tissues and mutation types. Key predictors of mutation rate included GC content, replication timing, DNA methylation, histone marks (H3K27ac, H3K4me3, and H3K9ac), RNA expression, transcription factor binding site density, and eQTL annotations. These results underscore the central role of features linked to transcriptional activity in determining local mutation rates. Remarkably, the models were highly similar across tissues, and tissue-specific models could be transferred between tissues without losing predictive power. This finding suggests that the same mutational mechanisms are at play across tissues, enabling a single, generalized model to predict mutation rates effectively across tissues. Extending the approach to the whole genome showed that intergenic regions are subject to the same mutational processes as exonic regions. The models were validated on data from healthy tissues, further supporting their broad applicability.
This study provides a detailed and comprehensive characterization of somatic mutational patterns, leveraging base-pair resolution and an extensive array of genomic predictors. These insights advance our understanding of mutation processes and have the potential to enhance tumor evolution models and driver mutation discovery.
Item Type: Thesis (PhD thesis)
Creators: Schmalohr, Corinna Lewis
URN: urn:nbn:de:hbz:38-755864
Date: 2025
Language: English
Faculty: Faculty of Mathematics and Natural Sciences
Divisions: CECAD - Cluster of Excellence Cellular Stress Responses in Aging-Associated Diseases
Subjects: Life sciences
Date of oral exam: 24 March 2025
Refereed: Yes
URI: http://kups.ub.uni-koeln.de/id/eprint/75586