False Discovery Rates in PET and CT Studies with Texture Features: A Systematic Review

PLoS One. 2015 May 4;10(5):e0124165. doi: 10.1371/journal.pone.0124165. eCollection 2015.

Abstract

Purpose: A number of recent publications have proposed that a family of image-derived indices, called texture features, can predict clinical outcome in patients with cancer. However, the investigation of multiple indices on a single data set can lead to significant inflation of type-I errors. We report a systematic review of the type-I error inflation in such studies and review the evidence regarding associations between patient outcome and texture features derived from positron emission tomography (PET) or computed tomography (CT) images.
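
As a rough illustration of the inflation described above, the sketch below computes the chance of at least one false positive when several indices are each tested at the same significance level. It assumes the tests are independent, which texture features derived from the same image generally are not, and the function name `familywise_error` is hypothetical. It is not the estimation procedure used in the review.

```python
def familywise_error(alpha, m):
    """Probability of at least one false positive across m independent tests,
    each carried out at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    # Error probability grows quickly with the number of indices tested.
    for m in (1, 10, 50, 100):
        print(m, round(familywise_error(0.05, m), 3))
```

With ten indices tested at the 0.05 level, this independence approximation already gives a probability of about 40% of at least one spurious finding.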

Methods: To identify studies, PubMed and Scopus were searched (1/2000-9/2013) using combinations of the keywords texture, prognostic, predictive and cancer. Studies were divided into three categories according to the sources of type-I error inflation and whether an independent validation dataset was used. For each study, the true type-I error probability and the adjusted level of significance were estimated using a correction for the optimum cut-off approach and the Benjamini-Hochberg method. To demonstrate the variable selection bias in these studies explicitly, we re-analyzed data from one of the published studies, substituting 100 random variables for the original image-derived indices. The significance of the random variables as potential predictors of outcome was examined using the analysis methods applied in the identified studies.
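
The Benjamini-Hochberg method named above can be sketched as a standard step-up procedure. The function name `benjamini_hochberg`, the false discovery rate level `q`, and the example p-values are illustrative assumptions, not values taken from the reviewed studies.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * q.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for ten indices tested on one data set.
    example = [0.001, 0.008, 0.039, 0.041, 0.042,
               0.060, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example))
```

In this made-up example five p-values fall below 0.05, but only the two smallest survive the correction.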

Results: Fifteen studies were identified. After applying appropriate statistical corrections, an average type-I error probability of 76% (range: 34-99%) was estimated with the majority of published results not reaching statistical significance. Only 3/15 studies used a validation dataset. For the 100 random variables examined, 10% proved to be significant predictors of survival when subjected to ROC and multiple hypothesis testing analysis.
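
The random-variable experiment can be imitated in a simplified form. The sketch below is not the authors' re-analysis: it assumes a simulated binary outcome instead of survival data, and swaps in Fisher's exact test where the reviewed studies used ROC and survival methods. All names and parameters (`simulate`, `n_patients`, the 20th to 80th percentile cutoff range) are hypothetical. It compares how often pure-noise variables appear significant with a single prespecified cutoff versus a cutoff chosen to minimize the p-value.

```python
import math
import random


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    probs = [prob(x) for x in range(lo, hi + 1)]
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


def min_p_over_cutoffs(values, events, cutoffs):
    """Smallest p-value obtained by dichotomizing values at each cutoff."""
    best = 1.0
    for cut in cutoffs:
        high = [e for v, e in zip(values, events) if v > cut]
        low = [e for v, e in zip(values, events) if v <= cut]
        if not high or not low:
            continue
        a, c = sum(high), sum(low)
        p = fisher_exact_two_sided(a, len(high) - a, c, len(low) - c)
        best = min(best, p)
    return best


def simulate(n_variables=100, n_patients=40, alpha=0.05, seed=0):
    """Return (fixed_rate, optimal_rate): the fraction of random variables
    called significant with a median cutoff and with the best cutoff."""
    rng = random.Random(seed)
    events = [rng.random() < 0.5 for _ in range(n_patients)]
    fixed_hits = optimal_hits = 0
    for _ in range(n_variables):
        # Pure noise: the variable has no relationship with the outcome.
        values = [rng.gauss(0.0, 1.0) for _ in range(n_patients)]
        ordered = sorted(values)
        median_cut = ordered[n_patients // 2 - 1]
        # Candidate cutoffs: observed values between the 20th and 80th percentiles.
        k = n_patients // 5
        candidates = ordered[k:n_patients - k]
        if min_p_over_cutoffs(values, events, [median_cut]) < alpha:
            fixed_hits += 1
        if min_p_over_cutoffs(values, events, candidates) < alpha:
            optimal_hits += 1
    return fixed_hits / n_variables, optimal_hits / n_variables


if __name__ == "__main__":
    fixed_rate, optimal_rate = simulate()
    print("median cutoff:", fixed_rate, "optimal cutoff:", optimal_rate)
```

Because the median cutoff is one of the candidates, the optimal-cutoff rate can never be lower than the fixed-cutoff rate; any excess is produced by the cutoff search alone.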

Conclusions: We found insufficient evidence to support a relationship between PET or CT texture features and patient survival. Further fit-for-purpose validation of these image-derived biomarkers should be supported by appropriate biological and statistical evidence before their association with patient outcome is investigated in prospective studies.

Publication types

  • Meta-Analysis
  • Research Support, Non-U.S. Gov't
  • Review
  • Systematic Review

MeSH terms

  • Area Under Curve
  • False Positive Reactions
  • Humans
  • Image Processing, Computer-Assisted*
  • Kaplan-Meier Estimate
  • Positron-Emission Tomography*
  • Probability
  • ROC Curve
  • Tomography, X-Ray Computed*