The elusive primary source

Ben Goldacre, enemy of bad science reporting everywhere, takes a swipe at newspaper articles that don’t link to the primary source:

This week the Telegraph ran the headline “Wind farms blamed for stranding of whales”. “Offshore wind farms are one of the main reasons why whales strand themselves on beaches, according to scientists studying the problem”, it continued. Baroness Warsi even cited it as a fact on BBC Question Time this week, arguing against wind farms.

But anyone who read the open access academic paper in PLoS One, titled “Beaked Whales respond to simulated and actual navy sonar”, would see that the study looked at sonar, and didn’t mention wind farms at all. At our most generous, the Telegraph story was a spectacular and bizarre exaggeration of a brief contextual aside about general levels of manmade sound in the ocean by one author at the end of the press release (titled “Whales scared by sonars”). Now, I have higher expectations of academic institutions than media ones, but this release didn’t mention wind farms, certainly didn’t say they were “one of the main reasons why whales strand themselves on beaches”, and anyone reading the press release could see that the study was about naval sonar.

The Telegraph article was a distortion (now deleted, with a miserly correction), perhaps driven by their odder editorial lines on the environment, but my point is this: if we had a culture of linking to primary sources, if they were a click away, then any sensible journalist would have been too embarrassed to see this article go online. Distortions like this are only possible, or plausible, or worth risking, in an environment where the reader is actively deprived of information.

This, of course, is a major problem outside of science reporting as well – journalists reporting social science work commonly distort findings to make stories more exciting – more 'clickable'. This would be fine if we could check up on them, but often finding the original research involves bunging the author's name into Google with a few key phrases and hoping you have access to that particular journal. This gives journalists a huge, scary advantage in information control.

Academic institutions could play a better role in information control – press releases are often eagerly shoved out the door and are usually over-optimistic about any results in hand. Maybe we could require that academic research which is highlighted in the press cannot be gated, or at least that there should be something more explicit than a press release which allows readers to understand some of the fine detail.

It could also be that a web-based blogging culture might overturn these norms – Goldacre points out that bloggers start from a position of zero credibility (I mean, come on, I’ve got a picture from The Matrix at the top of this post), so we have to link a lot more so people know we aren’t just making it all up. I’m not quite so optimistic about blogging culture – but it is a start.

Research error and the reliability of big reports

Reports produced by NGOs and think tanks are often a ragged amalgamation of other, questionable research. In other news, Dr. Frankenstein's results are still awaiting verification through a randomized trial.

Chris Blattman points to an Atlantic Monthly article on the likelihood that most published research delivers false results:

Simply put, if you’re attracted to ideas that have a good chance of being wrong, and if you’re motivated to prove them right, and if you have a little wiggle room in how you assemble the evidence, you’ll probably succeed in proving wrong theories right.

Following the same train of thought, Alex Tabarrok points out that by pure statistical chance, about 5% of all false hypotheses that are tested will give statistically significant results. If you believe that most hypotheses are false, and that we're only successful in identifying true hypotheses part of the time, then our collection of statistically significant results could be largely contaminated with false positives (Tabarrok gives an arbitrarily alarming figure of 25%).
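The contamination argument is just arithmetic, and it's worth seeing the numbers. Here's a minimal sketch with assumed inputs (the 80% false-hypothesis share and 60% power are my illustrative choices, not Tabarrok's stated figures):

```python
# Sketch of the false-positive contamination argument, with assumed numbers.
p_false = 0.80   # assumed: share of tested hypotheses that are false
alpha = 0.05     # significance threshold (false-positive rate)
power = 0.60     # assumed: chance a test detects a genuinely true effect

false_positives = p_false * alpha        # false hypotheses that pass anyway
true_positives = (1 - p_false) * power   # true hypotheses correctly detected

contamination = false_positives / (false_positives + true_positives)
print(f"{contamination:.0%} of significant results are false positives")
# → 25%
```

With these (assumed) inputs, a quarter of all statistically significant results are flukes – the point is how sensitive that share is to how many of the tested hypotheses were plausible in the first place.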

This makes it terribly important to treat these results with caution, and not to take any individual study as the final word on a subject. It is also a very, very good reason to be distrustful of big, conclusive reports; the sort that are often produced by international NGOs and many of the bilateral donors and think tanks.

Researchers are often called upon to make expansive, sometimes global statements about tenuous, uncertain relationships (like the relationship between climate change and HIV/AIDS). The tendency is for these researchers to mine results for useful 'impacts', then use those as underpinning assumptions for bigger leaps of logic: a researcher takes person A's results and person B's results, proclaims they are true, and uses them to produce result C.

Even if report-writers do this very carefully, they are still bound by the limitations of the original studies – and the probability of error goes up. If there is a 25% chance that person A's results are a fluke, and a 25% chance that person B's results are a fluke, then there is only a 0.75 × 0.75 ≈ 56% chance that result C isn't constructed from some false results. This is less of a problem if the researcher is considering many results from a single hypothesis, but if the researcher cherry-picks different hypotheses (say, for example, an assumption of the impact of X on Y and of Z on Q) and strings them together, such flaws become more and more pronounced.
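The compounding gets ugly quickly. A quick sketch, assuming each chained result independently has that same 25% chance of being a fluke:

```python
# How fast confidence erodes as a report chains together more results,
# assuming each is independently sound with probability 0.75.
p_sound = 0.75  # assumed: chance any single underlying result is not a fluke

for k in range(1, 6):
    p_all_sound = p_sound ** k
    print(f"{k} chained result(s): {p_all_sound:.0%} chance none is a fluke")
# → 2 chained results: 56%; 5 chained results: 24%
```

By five stacked assumptions, a conclusion is more likely than not to rest on at least one fluke – which is exactly why specific headline numbers built from long chains of borrowed results deserve suspicion.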

Tabarrok has a list of things everyone should consider in a world where most research is false. I’ll add a few more, pertinent to the uncertain world of development reports and policy briefs:

  1. Be extremely wary of reports touting specific numbers. A report which says that climate change will cost us exactly $50 billion in the next ten years probably has many, many, many assumptions behind it. For each additional assumption, consider the collective probability that the whole estimate is wrong.
  2. Read the footnotes and references behind assumptions, and follow up with the source literature. Be wary if a number is taken from a study that has never been published, or for which there is no clear evidence that it has inspired debate. Be wary if the author does not mention and appreciate the potential problems with those assumptions, or the reference's place in the general literature.
  3. Please, please, put on your causality cap before you start touting any numbers.

Off to Dar es Salaam

In a few days I’m leaving for Dar es Salaam, to help run the baseline survey for a “randomised land rights” project in the slums of the city. I’ll be away for about two months.

I’d welcome any general advice on living in Dar or on field work!

My posting frequency will inevitably plummet during this time. Ranil and I are going to try and arrange for some more guest posts to offset this. For more general (read: pretentious and thesaurusrific) writing on my travels, you can check the blog that I kept in Malawi: Stranger in a Strange land.