How do we get away from data mining?

Chris Blattman, citing a recent paper (ungated here) by Rasmussen, Malchow-Møller and Andersen, is wondering why donors and the big RCT research groups aren’t really pushing for a trial registry.

I think the answer is pretty simple: incentives. Trial registries force us to carefully specify our hypotheses up front. The benefits for the quality of the research are obvious: there’s a lot less data mining (atheoretically scouring the data for correlations), and consequently, fewer bogus results.
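To put a rough number on why data mining breeds bogus results, here is a minimal back-of-the-envelope sketch. It assumes something the argument above doesn't state: that a researcher tests k outcomes independently at the usual 5% significance level, and that the intervention truly affects none of them. The function name `familywise_error_rate` is just an illustrative label, and real outcomes are usually correlated, so treat the figures as indicative rather than exact.

```python
# Chance of at least one spurious "significant" result when k outcomes are
# each tested at level alpha and the intervention has no true effect.
# Assumption (for illustration only): the k tests are independent.
def familywise_error_rate(k: int, alpha: float = 0.05) -> float:
    # P(no false positive in k tests) = (1 - alpha)^k, so take the complement.
    return 1.0 - (1.0 - alpha) ** k


for k in (1, 5, 20):
    rate = familywise_error_rate(k)
    print(f"{k:>2} outcomes tested: {rate:.0%} chance of at least one false positive")
```

Under these assumptions, one pre-specified outcome carries a 5% risk of a false positive, five outcomes about 23%, and twenty outcomes about 64%. Pre-registration works by fixing k before anyone has seen the data.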

Yet the amount of viable research decreases significantly once we tie our hands ex ante. From a pure cost standpoint, you can see why donors wouldn’t be happy, given that they fund a lot of this research. If you’re going to spend $50,000 on an RCT, you want the researcher to come up with as many results as possible, even if these results are ‘discovered’ later and are a little bit suspect. An RCT that only answers one question (even if it answers it very, very well) doesn’t appeal to the zealots of value-for-money. Furthermore, to the extent that donors are backing rigorous impact analysis for the purpose of choosing the right tools, more results = more taglines. When the head of DFID defends a new project scaling up a newly proven intervention, he or she wants to be able to say “our project has a proven impact on X, Y and Z”, not just on X. In the quality-quantity trade-off, donors will strictly prefer quantity.

What about the big research houses? Both top-down and bottom-up incentives are a problem here. At the end of the day, someone allocates money to these places (a donor, a foundation, etc.), and that someone is going to have a similar objective function to the donors: more research per dollar is better. What about individual researchers? We all have to publish… and, let’s face it, data mining is easy and fun, if pretty dodgy.

How do we move from the current equilibrium to the one seen in the medical sciences? Maybe the journals need to make the first move by making pre-registration a prerequisite for publication. I can see this making a lot of people really unhappy.