Abstract
241549
Introduction: Despite thousands of publications on radiomics, sophisticated radiomic features and models have not yet been adopted in the clinic, partly because of insufficient external validation of radiomic models and methodological flaws in assessing biomarker novelty and prognostic power. To help overcome some of these limitations, we designed and validated ROBI, a biomarker discovery tool that selects the biomarkers most likely to reflect new prognostic information while minimizing and controlling the number of false positives (FP).
Methods: ROBI assesses candidate biomarkers (CB) for their predictive potential based on their values in a patient cohort and their association with the outcome (e.g., response to treatment). To avoid selecting candidates that replicate known predictive information, already known predictive biomarkers (KPB) must first be identified. CB whose absolute Spearman correlation coefficient with a KPB exceeds a tunable cut-off are discarded to ensure that the remaining CB capture new information. If multiple KPB are established, multicollinearity is assessed using the Variance Inflation Factor, and CB exceeding a tunable multicollinearity threshold are discarded. A linear model (Cox for survival, logistic regression for classification) controls for confounders (e.g., age). The prognostic ability of each CB is assessed using Harrell's Concordance Index against patient outcome data, or balanced accuracy for classification tasks. These scores are tested for significance using a two-sided permutation test with P permutations. A two-stage linear step-up procedure (TST) controls the false discovery rate (FDR) through a Q parameter and addresses multiple testing. To increase the number of selected biomarkers, we introduced a correlation clustering optimization (CCO) before TST, in which CB carrying similar information are clustered by absolute Spearman correlation and only the biomarker with the best predictive score in each cluster is kept. Because it selects the CB with the best p-values, CCO may optimistically bias the TST FDR. To improve the estimation of the number of FP, ROBI also runs the entire selection process on randomly permuted outcome data.
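As an illustration of the multiple-testing step, the following is a minimal sketch of a two-stage linear step-up procedure in its commonly described form: a first Benjamini-Hochberg pass at level Q/(1+Q) estimates the number of true null hypotheses, and a second pass uses that estimate to relax the threshold. The function names `bh_reject_count` and `two_stage_step_up` are hypothetical and chosen for this example only; the actual ROBI implementation may differ in its details.

```python
def bh_reject_count(sorted_p, q):
    """Number of rejections made by a Benjamini-Hochberg step-up pass at level q.

    `sorted_p` must already be sorted in ascending order.
    """
    m = len(sorted_p)
    k = 0
    for i, p in enumerate(sorted_p, start=1):
        if p <= q * i / m:
            k = i  # keep the largest rank that satisfies the step-up bound
    return k


def two_stage_step_up(p_values, q=0.05):
    """Return the sorted indices of the hypotheses rejected at FDR level q.

    Illustrative sketch only (hypothetical helper, not ROBI's code).
    """
    m = len(p_values)
    if m == 0:
        return []
    order = sorted(range(m), key=lambda i: p_values[i])
    sorted_p = [p_values[i] for i in order]

    q1 = q / (1.0 + q)                   # stage 1 runs at a reduced level
    r1 = bh_reject_count(sorted_p, q1)
    if r1 == 0:
        return []                        # nothing rejected: stop
    if r1 == m:
        return sorted(order)             # everything rejected: stop
    m0_hat = m - r1                      # estimated number of true nulls
    q2 = q1 * m / m0_hat                 # stage 2 level, relaxed by m / m0_hat
    r2 = bh_reject_count(sorted_p, q2)
    return sorted(order[:r2])


# Hypothetical permutation p-values for ten candidate biomarkers.
p_values = [0.001, 0.002, 0.003, 0.025, 0.5, 0.9, 0.8, 0.6, 0.7, 0.3]
selected = two_stage_step_up(p_values, q=0.05)
```

In this toy example the second stage keeps the candidate with p = 0.025, which a single Benjamini-Hochberg pass at the same Q would discard, illustrating why the two-stage variant tends to be less conservative when many candidates are truly informative.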
The probability of selecting only FP is estimated as the proportion of permuted datasets in which at least as many CB are selected as in the non-permuted selection. A total of 500 synthetic datasets (Table 1) and retrospective [18F]FDG PET/CT data from 378 Diffuse Large B Cell Lymphoma (DLBCL) patients with survival data were analyzed to validate the tool. On the DLBCL data, two KPB, the total tumor volume (TTV) and a dissemination feature (Dmax), were measured, and 10,000 random features were generated. Selection was performed and verified on each dataset. Statistical significance was evaluated with Wilcoxon signed-rank tests.
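The permutation-based estimates described above can be sketched as follows. The example assumes that the selection pipeline has already been run once on the real outcomes and once per permuted-outcome dataset, and that only the number of selected CB from each run is retained. The function names and the percentile-style upper bound on the FP count are assumptions made for illustration, not a description of ROBI's internal code.

```python
def prob_only_false_positives(n_selected_observed, n_selected_permuted):
    """Proportion of permuted-outcome runs that select at least as many CB
    as the run on the real (non-permuted) outcomes."""
    if not n_selected_permuted:
        raise ValueError("at least one permuted run is required")
    hits = sum(1 for n in n_selected_permuted if n >= n_selected_observed)
    return hits / len(n_selected_permuted)


def false_positive_summary(n_selected_permuted, level=0.95):
    """Mean FP count and the smallest k such that at least `level` of the
    permuted runs select k or fewer CB.

    Assumption: every CB selected under permuted outcomes counts as a FP.
    """
    if not n_selected_permuted:
        raise ValueError("at least one permuted run is required")
    counts = sorted(n_selected_permuted)
    mean_fp = sum(counts) / len(counts)
    for rank, count in enumerate(counts, start=1):
        if rank >= level * len(counts):
            return mean_fp, count
    return mean_fp, counts[-1]


# Hypothetical counts of selected CB across 100 permuted-outcome runs.
permuted_counts = [0] * 90 + [1] * 8 + [2] + [3]
prob = prob_only_false_positives(3, permuted_counts)
mean_fp, upper_fp = false_positive_summary(permuted_counts)
```

With these made-up counts, only one permuted run in 100 selects three or more CB, so a real-data selection of three CB would be unlikely to consist solely of FP.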
Results: In 99.3% of synthetic datasets, the number of FP fell within ROBI’s 95% confidence interval, even with CCO. ROBI selected significantly more true positives (TP) than FP (p<0.001) (Figure 1). For a given TST Q setting, CCO significantly increased the number of TP, the number of FP, and the difference between them (p<0.001). For a given number of FP, CCO significantly increased the number of TP (p<0.001). The estimated probability of selecting only FP, denoted Prob, was strongly negatively correlated with the number of TP (ρ=-0.96, p<0.001). In 60% of cases with at least one TP, Prob was <0.05. Of the 3.3% of cases in which only FP were selected, 0.6% had Prob<0.05. Among the 378 DLBCL patients, 96 had progressive disease and 55 died. For progression-free survival (PFS) prediction, ROBI successfully retrieved TTV and Dmax from among the 10,000 random features; one FP was also selected. ROBI predicted a 95% chance of having 0 or 1 FP, with an average of 0.1 FP, and estimated the probability of having only FP to be 0.0014. For overall survival (OS) prediction, no CB were selected, probably because censoring was too high.
Conclusions: The ROBI pipeline effectively selected relevant biomarkers while controlling FP, demonstrating robust performance on both synthetic and real datasets.