I’ll take an evaluation please, but hold the scientists

You need an econometrician dear, not a doctor

Following the massive kerfuffle over the Lancet article on the child mortality impact of the Millennium Village Project, both the authors of the paper and the journal itself have finally responded.

The first response, by Paul Pronyk of the Earth Institute, is reassuringly humble: the authors accept all of the mistakes highlighted by Gabriel Demombynes and Espen Prydz, and even claim that subsequent results will be analysed in a more transparent manner:

The project will invite an independent panel of experts, including critics of the project, to participate in scrutinising the vital events and survey data and in assessing their validity.

The second response, by the editors of the Lancet, is more defensive, arguing that even after failing to show that the fall in infant mortality in Millennium Villages was due to the MVP intervention, the study still had merit – pointing to several other results which were not the focus of the study (and which, in two instances, were not significantly different from those in the ‘control’ villages). I was perturbed by the final statement, which suggests that more independent oversight by medical science professionals is the solution to our concerns:

To ensure that all future data from the project are fully and fairly evaluated, Prof Jeffrey Sachs, the Principal Investigator of the Millennium Villages project, is establishing new internal and external oversight procedures, including the creation of an International Scientific Expert Advisory Group, chaired by Prof Robert Black, Chairman of the Department of International Health, Johns Hopkins Bloomberg School of Public Health, which will report to the Principal Investigator and also communicate its findings to The Lancet. The goal is to provide a further independent means of verifying the quality of the project’s design and analysis. It is important that this work, which is of considerable significance for understanding how countries scale up multiple complex interventions across sectors, receives proper scientific evaluation before, during, and after publication.

Emphasis is mine. This proposed solution misses the point: the problem with the evaluation of the MVP isn’t that it needs more scientists (narrowly defined as researchers from the health community) paying attention. The problem with the evaluation of the MVP is that it has too many scientists paying attention. Let me be clear: while researchers from these fields are amazing at what they do well (especially randomized controlled trials) – they are not as adept at the careful statistical analysis needed to assess the impact of complex, non-randomized interventions. This is why – sadly all too frequently – incredible journals like The Lancet publish research which would be laughed out of a graduate-level applied economics seminar.

Now, to be fair, economists and other social scientists probably do enough injustice to the health literature to give your average epidemiologist an aneurysm, but there’s a difference between wallowing in within-discipline ignorance (economists or health researchers choosing not to know any better) and knowing better and choosing the path of least resistance. If one wanted to be overly cynical, the precise reason why the MVP is publishing in top medical journals has less to do with seeking the most appropriate audience for assessing impact and more to do with choosing a less critical one.

If they want to convince the world that the Millennium Villages are a big deal, they need to at least bring in some social scientists with the statistical know-how to properly evaluate the evidence. Let’s hope that Dr. Pronyk’s independent panel of experts will have an econometrician or two, rather than relying solely on those who have a solid record of publishing in The Lancet.

Sachs the rainmaker

"But kemosabe, this would not stand up to a diff-in-diff"

Many of you will already be familiar with the ongoing debate over the efficacy and evaluation of the Millennium Village Project, the brainchild of the Earth Institute’s Jeffrey Sachs. Due primarily to the work of Michael Clemens at the CGD and Gabriel Demombynes at the World Bank, the MVP’s claims of development impact have finally faced substantial scrutiny, although frequently the debate has felt more like a war of attrition than productive discourse.

Enter the Lancet, a reputable medical journal with a worrying tendency to publish really disreputable social science research, which has just published a study by Sachs et al. showing that, over three years, child mortality (under the age of five) has fallen by roughly 25% across nine Millennium Villages. When compared with ‘control’ villages (which were chosen later and differ from the MVs in many substantial ways), the drop was even larger – close to 31%.

Suddenly the bells started ringing: after all the doubt, the MVP is hailed as being successful in reducing child mortality, with the editor-in-chief of the Lancet rallying behind the paper and the Guardian reporting the results with an astonishing lack of scrutiny. Only in the twitterverse/blogosphere has the response been largely negative (Lee Crawfurd disassembles the results of the Lancet article here).

However undeserved, this might have been a good opportunity for the Earth Institute to bask in its momentary glory. Yet the results might have already been undermined by awful timing: the Lancet study arrived just days after another study, by the World Bank’s Gabriel Demombynes and Karina Trommlerová, showing absolutely massive decreases in child mortality across most of sub-Saharan Africa in the past few years.

To understand why this is a problem for the Lancet study, consider the table below, which I’ve assembled from results from that study and some figures from the World Bank one (admittedly swiped from Michael Clemens’s post on it).

From the WB study I’ve taken the same nine countries used in the Lancet article, listed their declines in mortality and (assuming a linear trend) calculated the average decline in under-5 mortality per year. One caveat: the years considered in the World Bank study do not necessarily coincide with the timing of the Millennium Villages in their respective countries, so we may be comparing trends from different periods. Even so – these figures still provide a rough idea of the relative magnitude of the mortality decline.
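For anyone who wants to reproduce this sort of back-of-the-envelope calculation, here is a minimal sketch of the linear-trend step in Python. The function names and the example figures (120 falling to 90 deaths per thousand between 2003 and 2008) are hypothetical placeholders for illustration only; they are not taken from the World Bank study.

```python
def annual_decline(start_rate, end_rate, start_year, end_year):
    """Average yearly fall in under-5 mortality (deaths per 1,000 births),
    assuming the decline between the two survey years is linear."""
    if end_year <= start_year:
        raise ValueError("end_year must be later than start_year")
    return (start_rate - end_rate) / (end_year - start_year)


def decline_over(start_rate, end_rate, start_year, end_year, window_years):
    """Scale the yearly decline to a window of a given length,
    e.g. the three-year window used for the Millennium Villages."""
    return annual_decline(start_rate, end_rate, start_year, end_year) * window_years


# Hypothetical country: 120 -> 90 deaths per 1,000 between 2003 and 2008.
per_year = annual_decline(120.0, 90.0, 2003, 2008)
three_year = decline_over(120.0, 90.0, 2003, 2008, 3)
print(f"Decline per year: {per_year:.1f} per 1,000")
print(f"Implied three-year decline: {three_year:.1f} per 1,000")
```

The linearity assumption is doing a lot of work here: if mortality fell faster at the start or the end of the survey period, the implied three-year figure will be off.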

Per-country figures are not available in the Sachs et al. study (which is a bit worrying in itself), so I can only compare the average declines in these countries to the average decline in all Millennium Villages. What do the results suggest? While child mortality in the Millennium Villages dropped by 24.6 (fewer children dying per thousand births) over a three-year period, average declines for all countries in the study are broadly similar: 22.5.

The first and most important thing to take from these results is that the Millennium Villages aren’t vastly outperforming aggregate gains in the same countries. This makes it very difficult for the MVP to claim it is making an impact – it’s a bit like claiming credit for rain in Oxford, when it has been raining all over the UK.
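To put a rough number on that gap, here is a naive sketch in Python using only the two averages quoted above (24.6 and 22.5). It assumes both figures are measured on the same three-year basis, and it ignores sampling error and the timing mismatch between the two studies, so treat it as illustrative arithmetic rather than an impact estimate. The function name is my own.

```python
# Averages quoted in this post (fall in deaths per 1,000 births).
mv_decline = 24.6        # Millennium Villages, over three years
national_decline = 22.5  # host-country average; same basis is an assumption


def excess_decline(treated, benchmark):
    """Decline in the treated group beyond what the benchmark saw anyway."""
    return treated - benchmark


gap = excess_decline(mv_decline, national_decline)
share = gap / mv_decline  # fraction of the MV decline not matched nationally

print(f"Excess decline in the villages: {gap:.1f} per 1,000")
print(f"Share of the village decline not matched nationally: {share:.1%}")
```

On these numbers the villages beat the national trend by about 2.1 deaths per thousand, or under a tenth of the headline decline.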

The second thing worth noting: if you look at the above table, taken from the Lancet study, you’ll see that under-five mortality is actually increasing in the control villages. This strongly suggests that control villages are quite different from the rest of the country at large. The Earth Institute has argued that Millennium Villages (and their control counterparts) were selected because they were different – but even if these odd trends in the control villages don’t disqualify them as a counterfactual (which I still think they do), the differences seen here certainly prevent the MVP from having any sort of claims of external validity.

The argument that the Millennium Villages aren’t outperforming the rest of their host countries is not new: Clemens and Demombynes made it over a year ago, when they found that many other claims of ‘impact’ by the MVP were reflected in national statistics. Let’s hope the hype from this study is similarly deflated.