An estimate of the science-wise false discovery rate and application to the top medical literature

Leah R Jager; Jeffrey T Leek

doi:10.1093/biostatistics/kxt007

Biostatistics. 2014 Jan;15(1):1-12. doi: 10.1093/biostatistics/kxt007. Epub 2013 Sep 25.

Authors

Leah R Jager¹, Jeffrey T Leek

Affiliation

¹ Department of Mathematics, United States Naval Academy, Annapolis, MD 21402, USA.

PMID: 24068246
DOI: 10.1093/biostatistics/kxt007

Abstract

The accuracy of published medical research is critical for scientists, physicians and patients who rely on these results. However, the fundamental belief in the medical literature was called into serious question by a paper suggesting that most published medical research is false. Here we adapt estimation methods from the genomics community to the problem of estimating the rate of false discoveries in the medical literature using reported $P$-values as the data. We then collect $P$-values from the abstracts of all 77 430 papers published in The Lancet, The Journal of the American Medical Association, The New England Journal of Medicine, The British Medical Journal, and The American Journal of Epidemiology between 2000 and 2010. Among these papers, we found 5322 reported $P$-values. We estimate that the overall rate of false discoveries among reported results is 14% (s.d. 1%), contrary to previous claims. We also found that there is no significant increase in the estimated rate of reported false discovery results over time (0.5% more false positives (FP) per year, $P = 0.18$) or with respect to journal submissions (0.5% more FP per 100 submissions, $P = 0.12$). Statistical analysis must allow for false discoveries in order to make claims on the basis of noisy data. But our analysis suggests that the medical literature remains a reliable record of scientific progress.

Keywords: False discovery rate; Genomics; Meta-analysis; Multiple testing; Science-wise false discovery rate; Two-group model.
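
The two-group idea named in the keywords can be illustrated with a small simulation. The sketch below is not the authors' implementation: the abstract does not state the component distributions or the fitting procedure, so the uniform null, the Beta(a, 1) alternative truncated to (0, 0.05], the EM updates, and the function names `simulate_pvalues` and `fit_two_group` are all assumptions made for illustration. It also ignores complications such as rounded or censored $P$-values. Under these assumptions, the mixing weight `pi0` plays the role of the false discovery rate among reported significant results.

```python
import math
import random


def simulate_pvalues(n, pi0, a, alpha=0.05):
    """Draw n significant p-values from an assumed two-group mixture.

    With probability pi0 a p-value is a false discovery (uniform on (0, alpha]);
    otherwise p / alpha follows Beta(a, 1), which piles up near zero when a < 1.
    """
    pvalues = []
    for _ in range(n):
        u = 1.0 - random.random()       # uniform on (0, 1]
        if random.random() >= pi0:      # true discovery
            u = u ** (1.0 / a)          # inverse CDF of Beta(a, 1)
        pvalues.append(alpha * u)
    return pvalues


def fit_two_group(pvalues, alpha=0.05, n_iter=300):
    """EM fit of pi0 * Uniform + (1 - pi0) * Beta(a, 1) to p / alpha."""
    # Work with log(p / alpha); only p-values in (0, alpha] are used.
    logs = [math.log(p / alpha) for p in pvalues if 0 < p <= alpha]
    if not logs:
        raise ValueError("no p-values in (0, alpha]")
    pi0, a = 0.5, 0.5                   # arbitrary starting values
    for _ in range(n_iter):
        # E-step: posterior probability that each p-value is a false discovery.
        # Null density of p / alpha is 1; alternative density is a * u**(a - 1).
        resp = []
        for lu in logs:
            f1 = a * math.exp((a - 1.0) * lu)
            resp.append(pi0 / (pi0 + (1.0 - pi0) * f1))
        # M-step: closed-form updates for the mixing weight and the shape a.
        pi0 = sum(resp) / len(resp)
        weights = [1.0 - r for r in resp]
        a = -sum(weights) / sum(w * lu for w, lu in zip(weights, logs))
    return pi0, a


if __name__ == "__main__":
    random.seed(0)
    simulated = simulate_pvalues(10000, pi0=0.14, a=0.2)
    pi0_hat, a_hat = fit_two_group(simulated)
    print(f"estimated false discovery rate: {pi0_hat:.3f}, shape a: {a_hat:.3f}")
```

The simulated rate of 0.14 is chosen only to mirror the headline estimate in the abstract; the sample size and shape parameter are arbitrary.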

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Biomedical Research / standards*
Computer Simulation
Data Interpretation, Statistical*
False Positive Reactions*
Humans
Publications / standards*
Software
United Kingdom
United States