Comparing survival analysis methods for cancer RNA-seq data
Fig. 5 Assessing the accuracy of the eight methods. ROC curves were used to evaluate the ability of each method to identify a set of positive controls that were tumor type-specific and derived from the literature.
results based on the genes that they had identified (Supplemental Table 1).
As to why the Cox regression method of survival analysis performed much better than the other methods, the most likely explanation is the set of genes chosen by this method. While this result could certainly have been influenced by the choice of “gold standard” (known true positive) lists, it seems unlikely that the lists for every cancer type would each favor this method, particularly since good performance was also observed for the k-means, C-index, and D-index methods. Thus, it seems probable that the Cox regression method and, to a lesser extent, the k-means, C-index, and D-index methods simply perform better than the other methods because they distinguish true positives from false positives more robustly in the presence of noisy data.
most robust performance overall with the C-index coming third. The D-index had superior performance that was above the C-index and Cox regression for all noise levels, with AUC > 0.9 at the maximum noise level of 1.5. This observation was consistent with the results that assessed accuracy using the ROC curves that were based on positive controls using real RNA-seq data. It is possible this result was due to the ability of the Cox regression method to model continuous data more robustly against noise than the other methods. In contrast to its fair performance in the previous evaluations of reliability and accuracy, k-means was not found to be very robust and did not perform well in the presence of noise, despite being a non-parametric approach. The methods used to create positive controls, generate data, and add noise are certainly factors that could have caused or influenced this result.
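To make the AUC values quoted above concrete, the following is a minimal sketch, not the authors' code, of how an AUC can be computed from per-gene scores and a list of positive controls. The function name `roc_auc` and the toy scores are hypothetical; in practice the score could be, for example, the negative logarithm of each method's P-value for a gene.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: one number per gene (higher = more strongly called).
    labels: 1 for a positive-control gene, 0 otherwise.
    The AUC equals the probability that a randomly chosen positive
    control scores higher than a randomly chosen non-control gene,
    with ties counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four genes; the first two are positive controls.
perfect = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
chance = roc_auc([0.9, 0.1, 0.8, 0.3], [1, 1, 0, 0])
```

Under this definition an AUC of 1.0 means every positive control outranks every other gene, and 0.5 corresponds to chance-level ranking, which is the scale on which the noise comparison above should be read.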
Testing for robustness using in silico data identified the D-index as the method that was the least sensitive to different levels of noise
Overall, our tests identified D-index as the method that was the most robust to noise using the in silico datasets. As we varied the noise incrementally in our artificially simulated dataset, both the median and 25th–75th percentile-split methods increasingly lost their ability to detect signal in the data, and the distribution-specific method also dropped precipitously in performance (Fig. 6). The Cox regression method had the second
Comparing the performance of the survival analysis methods using well-known cancer genes
It is useful to examine the results from this study through the lens of several well-known cancer genes for each tumor type. Anoctamin 1 (ANO1) expression has been implicated previously as a marker of poor prognosis and a potential driver of metastasis in HNSC [21,35]. When examining the TCGA HNSC dataset, we find that while almost all methods do detect ANO1 as being significantly related to survival (P-value < 0.05), Cox regression, C-index, and D-index assigned
Fig. 6 Assessing the robustness of the eight methods. A–D. ROC curves demonstrate variable performance of the methods as the noise level in the simulated data increases from 0 to 150%. E. The trend in AUC values across increasing levels of noise indicates that the D-index has the most robust performance of all eight methods, followed by Cox regression.
to ANO1 the most significant P-values (P-value < 0.005). In contrast, the dichotomization methods yielded much higher P-values, both at approximately P-value ∼ 0.03. Since this is close to the significance threshold that is typically chosen in most studies (P-value < 0.05), it stands to reason that if the sample size were smaller, ANO1 might not have been detected at all by these methods.
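The dichotomization approach discussed above can be illustrated with a minimal sketch: split samples at the median expression of a gene, then compare the two groups with a two-group log-rank test. This is an assumed, simplified implementation for illustration only, not the pipeline used in this study; the names `median_split` and `logrank_test` and the toy data are hypothetical.

```python
import math
from statistics import median


def median_split(expression):
    """Label each sample 1 (above the median expression) or 0 (otherwise)."""
    cut = median(expression)
    return [1 if x > cut else 0 for x in expression]


def logrank_test(times, events, groups):
    """Two-group log-rank test.

    times:  follow-up time per sample.
    events: 1 if the event (death) was observed, 0 if censored.
    groups: 0/1 group label per sample.
    Returns (chi2, p) where p comes from a chi-square with 1 df.
    """
    samples = list(zip(times, events, groups))
    event_times = sorted({t for t, e, _ in samples if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [s for s in samples if s[0] >= t]
        n = len(at_risk)
        n1 = sum(1 for s in at_risk if s[2] == 1)
        d = sum(1 for s in at_risk if s[0] == t and s[1])
        d1 = sum(1 for s in at_risk if s[0] == t and s[1] and s[2] == 1)
        # Observed minus expected events in group 1 at this time point.
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of the group-1 event count.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Toy data: samples with the highest expression die earliest.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1] * 8
expression = [8, 7, 6, 5, 4, 3, 2, 1]
groups = median_split(expression)
chi2, p = logrank_test(times, events, groups)
```

Because the split discards how far each sample lies from the cut point, a continuous model such as Cox regression can retain more signal from the same data, which is consistent with the larger P-values reported for the dichotomization methods.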