Publication
Survival Prediction with Multi-Omics Data
Naimur Islam
Master's thesis, Rheinland-Pfälzische Technische Universität Kaiserslautern–Landau, 11/2024.
Abstract
Survival analysis, an essential field of statistics, models time-to-event data, including censored observations. Technological advances in the analysis of biological molecules make it possible to predict survival times and the risk of a specific disease occurring by leveraging omics data. Omics data typically contain thousands of features but only a small number of samples, and the number of features grows further when additional omics types are integrated into a multi-omics dataset. Such datasets are hard for humans to interpret and can also impair a model's ability to analyze the data successfully. The objective of this thesis is to compare the survival prediction performance of feature-selected subsets with that of the full multi-omics dataset. The thesis also explores how the integration of various omics types within a multi-omics dataset influences survival prediction performance. The findings show that it is uncertain whether models built on feature-selected subsets consistently outperform models built on the full multi-omics dataset. Similarly, it is unclear whether integrating more omics types into a multi-omics dataset yields better predictive performance than selectively including fewer, carefully chosen omics types. In general, survival prediction performance depends heavily on the chosen multi-omics dataset, the integration technique, the selected survival analysis model and its configuration, and the performance measure considered.