The authors identify 52 interventions published in leading medical journals that compare observational and experimental evidence – in other words, a correlation was observed and then subjected to a randomized experimental design. They find that 0 of 52 interventions – again, zero percent – yielded significant results in randomized trials. Zero percent? Five findings were apparently significant in the contrary direction, and not one false positive? Anyway, the article reads like a pretty fundamental indictment of a whole way of doing business, but its prescription is unworkable.

Step 1: Cut all data sets in half. The notion that half of all data be placed in a “lock box” and subjected to an elaborate replication process elevates an important principle to the level of absurdity. The principle of replication holds that a study or experiment, repeated by an independent researcher, should generate the same results. Replication goes to the heart of science as a collective, “self-correcting” enterprise. But most scientists acknowledge that this principle is rarely practiced. And what about fields of knowledge that rely on qualitative and historical data?

But my friend Ben offered a smarter critique:
Here, the authors reject the Bayesian paradigm of statistics wholesale, and claim that the only correct approach to statistics is what the great eugenicists of the 1930s did: calculating p-values and using those for decisions.
Bayesians are fine with a preponderance-of-data approach. One test gives you a bit of evidence that maybe something is true; two tests build your confidence; a hundred tests make you darn certain, but might still leave you confident that one day the whole paradigm will be overturned with a new discovery. There is a voluminous literature that formalizes this.
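Ben's preponderance-of-data point can be made concrete with a toy model. Here is a minimal sketch (my own illustration, not from the article or from Ben) using a Beta-Binomial update: each supportive test nudges the posterior toward certainty, but it never reaches 1, leaving room for the paradigm to be overturned later.

```python
# Toy Bayesian updating: a uniform Beta(1, 1) prior on the probability
# that an effect is real. Each supportive test increments alpha, each
# contrary test increments beta; the posterior mean tracks the
# accumulating weight of evidence.

def posterior_mean(successes: int, failures: int,
                   prior_a: float = 1.0, prior_b: float = 1.0) -> float:
    """Posterior mean of the Beta-Binomial model after observing data."""
    a = prior_a + successes
    b = prior_b + failures
    return a / (a + b)

print(posterior_mean(1, 0))    # one test: a bit of evidence (~0.67)
print(posterior_mean(2, 0))    # two tests: more confidence (0.75)
print(posterior_mean(100, 0))  # a hundred tests: darn certain (~0.99), but never 1
```

The numbers are illustrative; the point is that evidence accumulates smoothly rather than flipping a binary significant/not-significant switch.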
I don’t know whether the authors even realize they have frequentist blinders on. But the failure of the test/control experiment format in real-world situations – especially those involving humans, who are full of confounding factors – has been observed time and time again for almost a century now. Major alternatives, including Bayesian and information-theoretic approaches, have been proposed. This article starts from the presumption that the only correct approach is p-values, and then asks what to do from there; I think the authors would be better off acknowledging that maybe it’s time to scrap the whole framework for non-test/control studies.
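To show what an information-theoretic alternative looks like in practice, here is a small sketch (again my own illustration, with made-up data) that compares two Gaussian models of the same observations by AIC instead of testing a null hypothesis at p < 0.05.

```python
# Model comparison by AIC rather than a p-value decision: fit a
# "no effect" model (mean fixed at 0) and an "effect" model (mean
# estimated from the data), then ask which loses less information.
import math

def gaussian_log_likelihood(data, mu):
    """Log-likelihood of data under N(mu, var), with var set to the
    maximum-likelihood variance around the given mean mu."""
    n = len(data)
    var = sum((x - mu) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2*log L. Lower is better."""
    return 2 * k - 2 * log_likelihood

data = [0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.2, 0.9]  # illustrative sample

# Null model: mean fixed at 0 (one free parameter: the variance).
aic_null = aic(gaussian_log_likelihood(data, 0.0), k=1)

# Alternative model: mean estimated from the data (two free parameters).
mean = sum(data) / len(data)
aic_alt = aic(gaussian_log_likelihood(data, mean), k=2)

# The verdict is a relative information loss, not a reject/accept ruling.
print(aic_null, aic_alt)
```

The extra parameter is penalized explicitly, so the comparison is a trade-off between fit and complexity rather than a binary significance call.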
What he said! Mostly, though, I am appalled at the notion that researchers lock away HALF of their data, which strikes me as a very bad idea.