False Discovery Rates in PET and CT Studies with Texture Features: A Systematic Review

Anastasia Chalkidou; Michael J O'Doherty; Paul K Marsden

doi:10.1371/journal.pone.0124165

PLoS One. 2015 May 4;10(5):e0124165. doi: 10.1371/journal.pone.0124165. eCollection 2015.

Authors

Anastasia Chalkidou¹, Michael J O'Doherty¹, Paul K Marsden¹

Affiliation

¹ Division of Imaging Sciences and Biomedical Engineering, King's College London, 4th Floor, Lambeth Wing, St Thomas' Hospital, London SE1 7EH, United Kingdom.

Abstract

Purpose: A number of recent publications have proposed that a family of image-derived indices, called texture features, can predict clinical outcome in patients with cancer. However, the investigation of multiple indices on a single data set can lead to significant inflation of type-I errors. We report a systematic review of the type-I error inflation in such studies and review the evidence regarding associations between patient outcome and texture features derived from positron emission tomography (PET) or computed tomography (CT) images.

Methods: To identify studies, PubMed and Scopus were searched (January 2000 to September 2013) using combinations of the keywords texture, prognostic, predictive and cancer. Studies were divided into three categories according to the sources of type-I error inflation and whether an independent validation dataset was used. For each study, the true type-I error probability and the adjusted level of significance were estimated using the optimum cut-off approach correction and the Benjamini-Hochberg method. To demonstrate the variable selection bias explicitly, we re-analyzed data from one of the published studies, substituting 100 random variables for the original image-derived indices. The significance of the random variables as potential predictors of outcome was examined using the same analysis methods as the identified studies.
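As an illustration of the Benjamini-Hochberg method named above, the following is a minimal sketch of the standard step-up procedure. It is not the authors' code; the function name `benjamini_hochberg`, the parameter `q` (the target false discovery rate) and the example p-values are assumptions introduced for illustration only.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level q.

    Illustrative sketch of the standard step-up procedure; the name and
    signature are hypothetical, not taken from the reviewed studies.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * q.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank

    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Made-up p-values for ten hypothetical texture features.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_rejected = benjamini_hochberg(example_p, q=0.05)
```

In this made-up example, five features have unadjusted p-values below 0.05, but only the two smallest survive the procedure, which is the kind of attrition the review reports for published results.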

Results: Fifteen studies were identified. After appropriate statistical corrections were applied, the estimated type-I error probability averaged 76% (range: 34-99%), and the majority of published results did not reach statistical significance. Only 3/15 studies used a validation dataset. Of the 100 random variables examined, 10% proved to be significant predictors of survival when subjected to ROC and multiple hypothesis testing analysis.
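To see why testing many indices on one data set inflates the type-I error, consider the simplest case of m independent tests, each at significance level alpha. The sketch below compares the textbook formula 1 - (1 - alpha)^m with a small Monte Carlo check. It is an assumption-laden illustration, not the estimation method used in the review (which relied on the optimum cut-off correction), and the function names are hypothetical.

```python
import random


def familywise_error(alpha, m):
    """Probability of at least one false positive among m independent
    tests, each performed at level alpha (assumes independence)."""
    return 1.0 - (1.0 - alpha) ** m


def simulate_familywise_error(alpha, m, trials=20000, seed=0):
    """Monte Carlo estimate of the same probability.

    Under the null hypothesis a p-value is uniform on [0, 1], so each
    test is simulated by drawing one uniform random number.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # A trial counts as a hit if any of the m null tests is "significant".
        if any(rng.random() < alpha for _ in range(m)):
            hits += 1
    return hits / trials


analytic = familywise_error(0.05, 10)
simulated = simulate_familywise_error(0.05, 10)
```

Under these assumptions the error probability grows quickly with the number of tests: about 40% for 10 tests at alpha = 0.05, and above 99% for 100 tests. Correlated features, which are common among texture indices, would change the exact figures.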

Conclusions: We found insufficient evidence to support a relationship between PET or CT texture features and patient survival. Further fit-for-purpose validation of these image-derived biomarkers should be supported by appropriate biological and statistical evidence before their association with patient outcome is investigated in prospective studies.

Publication types

Meta-Analysis
Research Support, Non-U.S. Gov't
Review
Systematic Review

MeSH terms

Area Under Curve
False Positive Reactions
Humans
Image Processing, Computer-Assisted*
Kaplan-Meier Estimate
Positron-Emission Tomography*
Probability
ROC Curve
Tomography, X-Ray Computed*
