Sunday, January 04, 2015

GMO Statistics Part 42. False discovery rates and misinterpretation of stats tests

Tree diagram to illustrate the false discovery rate in significance tests. This example considers 1000 tests, in which the prevalence of real effects is 10% and the power is 80%. The lower limb shows that with the conventional significance level, p = 0.05, there will be 45 false positives. The upper limb shows that there will be 80 true positives. The false discovery rate is therefore 45/(45 + 80) = 36%, far bigger than 5%.
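The arithmetic behind the caption can be reproduced in a few lines. Here is a minimal Python sketch, assuming a power of 80% (the value implied by 80 true positives out of 100 real effects):

# Tree-diagram arithmetic: 1000 tests, 10% prevalence of real effects,
# significance level 0.05, assumed power 80%.
n_tests = 1000
prevalence = 0.10     # fraction of tests where a real effect exists
alpha = 0.05          # conventional significance level
power = 0.80          # assumed probability of detecting a real effect

real_effects = n_tests * prevalence    # 100 tests with a real effect
no_effect = n_tests - real_effects     # 900 tests with no effect

false_positives = no_effect * alpha    # 900 * 0.05 = 45
true_positives = real_effects * power  # 100 * 0.80 = 80

fdr = false_positives / (false_positives + true_positives)
print(f"False discovery rate: {fdr:.0%}")  # 36%, far bigger than 5%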
An investigation of the false discovery rate and the misinterpretation of p-values
 David Colquhoun

ABSTRACT
If you use p = 0.05 to suggest that you have made a discovery, you will be wrong at least 30% of the time. If, as is often the case, experiments are underpowered, you will be wrong most of the time. This conclusion is demonstrated from several points of view. First, tree diagrams show the close analogy with the screening-test problem. Similar conclusions are drawn from repeated simulations of t-tests; these mimic what is done in real life, which makes the results more persuasive. The simulation method is also used to evaluate the extent to which effect sizes are over-estimated, especially in underpowered experiments. A script is supplied so that readers can run the simulations themselves, with numbers appropriate for their own work. It is concluded that if you wish to keep your false discovery rate below 5%, you need to use a three-sigma rule, or to insist on p ≤ 0.001. And never use the word ‘significant’.
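The paper supplies a script for running such simulations; it is not reproduced here. As a rough illustration of the idea only, the hypothetical Python sketch below runs many two-sample t-tests in which 10% of experiments have a real effect (the sample size, effect size and number of simulations are illustrative choices, not the author's), and counts what fraction of 'significant' results at p < 0.05 are false positives:

# Hypothetical illustration (not the paper's supplied script): estimate the
# false discovery rate among results with p < 0.05 when only 10% of
# experiments have a real effect and power is roughly 80%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sim = 10_000       # number of simulated experiments
n_per_group = 16     # sample size per group (gives roughly 80% power here)
effect_size = 1.0    # true difference in means, in units of the SD
prevalence = 0.10    # fraction of experiments with a real effect

false_pos = true_pos = 0
for _ in range(n_sim):
    has_effect = rng.random() < prevalence
    a = rng.normal(0.0, 1.0, n_per_group)
    b = rng.normal(effect_size if has_effect else 0.0, 1.0, n_per_group)
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        if has_effect:
            true_pos += 1
        else:
            false_pos += 1

fdr = false_pos / (false_pos + true_pos)
print(f"Estimated false discovery rate at p < 0.05: {fdr:.0%}")

With these settings the estimate should come out close to the one-in-three figure quoted in the abstract.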
DOI: http://dx.doi.org/10.1098/rsos.140216
Published by the Royal Society, online November 19, 2014.
Copyright & Usage
© 2014 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, provided the original author and source are credited.




