Some thoughts on More Than Good Intentions

It is more than a little embarrassing that it has taken me this long to write up my thoughts on More Than Good Intentions, given that I received and read the book back in April (note to PR people: please don’t let this stop you sending us review copies of interesting books). I doubt readers need an introduction – Karlan and Appel’s book has been reviewed by pretty much every other blogger and development wonk out there, to general acclaim.

Perhaps having a bit more time to reflect and see how the book has been received by others is a good thing – while my initial review would have been more comprehensive, it would also have had a fair bit of needless nitpicking. Instead of the review I would have written several months ago, I offer just a few thoughts on what the book gets right, and where it falls short:

Firstly, the acclaim is well-deserved – the book is a nice window into some of the more recent experimental evidence from developing countries. While it is obviously meant to be accessible to readers with little working knowledge of development interventions or experimental methods, it can still be enjoyed by the more experienced researcher. However, those who are already quite familiar with Dean Karlan’s work, or that of Innovations for Poverty Action in general, might find much of the book redundant.

In terms of what it lends to the big development debates, More Than Good Intentions is more a marginal than a seminal contribution – it will be most successful at helping its audience update their priors on conventional development interventions, rational behaviour and the like, rather than at significantly shifting the terms of the debate. It is refreshing to see a book a little more humble in scope and (mostly) less certain that we’ve got all the answers we need (there is some language at the end about proven interventions, but it isn’t too strong). This is when the randomistas are at their best: concentrating on slowly shifting the body of evidence, rather than petitioning for seismic shifts in policy.

Yet MTGI occasionally feels a little too limited in scope for its own good. By relying almost exclusively on rigorous experimental work (most of which the authors were involved in or closely associated with), they have really limited the amount of evidence they can draw on. Consequently, while the book feels in full stride discussing the nuances of microcredit and microsavings (one could almost see a whole new book here), it begins to feel thin as it moves on to other themes such as education and agriculture, where we get less of a sense of where the research Karlan and Appel highlight sits in the greater body of evidence. Development economists have been researching many of these issues for decades using less rigorous but still worthwhile methods, and it seems odd to discuss a few choice experiments without explaining how they connect to what we already know.

Chances are that you’ve already picked up or read this book. I’d definitely recommend it – but with the same grain of salt needed for all new works – use it to update your priors, not redefine them.

In Today’s Language: “Dudes, WTF? STFU and let me do my job!”

[Image: historically accurate depiction of the Duke of Wellington]

Gentlemen,

Whilst marching from Portugal to a position which commands the approach to Madrid and the French forces, my officers have been diligently complying with your requests which have been sent by His Majesty’s ship from London to Lisbon and thence by dispatch to our headquarters. We have enumerated our saddles, bridles, tents and tent poles, and all manner of sundry items for which His Majesty’s Government holds me accountable. I have dispatched reports on the character, wit and spleen of every officer. Each item and every farthing has been accounted for with two regrettable exceptions for which I beg your indulgence:

Unfortunately the sum of one shilling and ninepence remains unaccounted for in one infantry battalion’s petty cash, and there has been a hideous confusion as to the number of jars of raspberry jam issued to one cavalry regiment during a sandstorm in western Spain.

This reprehensible carelessness may be related to the pressure of circumstance, since we are at war with France, a fact which may come as a bit of a surprise to you gentlemen in Whitehall.

This brings me to my present purpose, which is to request elucidation of my instructions from His Majesty’s Government so that I may better understand why I am dragging an army over these barren plains. I construe that perforce it must be one of two alternative duties, as given below. I shall pursue either with the best of my ability, but I cannot do both:

1.) To train an army of uniformed British clerks in Spain for the benefit of the accountants and copy-boys in London or, perchance…
2.) To see to it the forces of Napoleon are driven out of Spain.

Your most obedient servant,

Wellington

—Attributed to the Duke of Wellington, during the Peninsular Campaign, in a message to the British Foreign Office in London, 11 August 1812.

Reader Duncan gets a virtual high-five for pointing this letter out to me. It’s taken from the introduction of the CGD essay by Andrew Natsios, the former USAID Administrator, entitled The Clash of the Counter-bureaucracy and Development. The central thrusts of his argument (and I confess here to having given it only a skim) are that bureaucratic regulation of aid agencies is extremely cumbersome and is the focus of far too much ‘development’ work, and that this is counterproductive, since the most easily measurable development programmes are the least transformational, and the most transformational are the least easily measurable.


Randomized trials are so 1930s

Jim Manzi, the CEO of Applied Predictive Technologies (a randomized trial software firm), reminds us that we’ve been subjecting public policy to experimental methods for quite some time:

In fact, Peirce and others in the social sciences invented the RFT decades before the technique was widely used for therapeutics. By the 1930s, dozens of American universities offered courses in experimental sociology, and the English-speaking world soon saw a flowering of large-scale randomized social experiments and the widely expressed confidence that these experiments would resolve public policy debates. RFTs from the late 1960s through the early 1980s often attempted to evaluate entirely new programs or large-scale changes to existing ones, considering such topics as the negative income tax, employment programs, housing allowances, and health insurance.

So the randomistas aren’t so much a “new wave” as the “next wave.” More interesting, though, are Manzi’s thoughts on external validity:

By about a quarter-century ago, however, it had become obvious to sophisticated experimentalists that the idea that we could settle a given policy debate with a sufficiently robust experiment was naive. The reason had to do with generalization, which is the Achilles’ heel of any experiment, whether randomized or not. In medicine, for example, what we really know from a given clinical trial is that this particular list of patients who received this exact treatment delivered in these specific clinics on these dates by these doctors had these outcomes, as compared with a specific control group. But when we want to use the trial’s results to guide future action, we must generalize them into a reliable predictive rule for as-yet-unseen situations. Even if the experiment was correctly executed, how do we know that our generalization is correct?

One example he discusses is the frequent experimentation used in crime prevention, and how the (very few) subsequent attempts at replication fared:

Criminologists at the University of Cambridge have done the yeoman’s work of cataloging all 122 known criminology RFTs with at least 100 test subjects executed between 1957 and 2004. By my count, about 20 percent of these demonstrated positive results—that is, a statistically significant reduction in crime for the test group versus the control group. That may sound reasonably encouraging at first. But only four of the programs that showed encouraging results in the initial RFT were then formally replicated by independent research groups. All failed to show consistent positive results.

My biggest fear about the current trend in social science RCT work is not only the failure to confirm positive results, but the failure to confirm negative ones. While there is a small but real incentive to repeat a ‘proven’ randomized study in a new setting, there isn’t much being done to check whether a negligible treatment effect might turn out to be positive somewhere else. And while the big RCT research groups do care about external validity, it is the initial findings that get seared into the minds of policymakers. Flashy graphs which generalize without concern don’t help.
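To make the generalization worry concrete, here is a minimal simulation sketch in Python. Everything in it is hypothetical – the site names, effect sizes and sample sizes are my own invented numbers, not taken from any of the studies above – but it shows how a treatment with a genuinely positive effect in the original study site, and a weaker or no effect elsewhere, can produce one headline significant result followed by a string of failed replications.

```python
# A toy illustration of the external validity problem: the true treatment
# effect varies by site, so a significant result in one site need not
# replicate elsewhere. All numbers below are hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)

def run_trial(true_effect, n=200):
    """Simulate one two-arm trial; return the estimated effect and p-value."""
    control = rng.normal(loc=0.0, scale=1.0, size=n)
    treated = rng.normal(loc=true_effect, scale=1.0, size=n)
    t_stat, p_value = stats.ttest_ind(treated, control)
    return treated.mean() - control.mean(), p_value

# Hypothetical site-specific true effects: positive where the original
# study happened to run, small or absent in the replication sites.
site_effects = {
    "original site": 0.40,
    "replication A": 0.10,
    "replication B": 0.00,
    "replication C": -0.05,
}

for site, effect in site_effects.items():
    estimate, p = run_trial(effect)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{site:15} estimate = {estimate:+.2f}, p = {p:.3f} ({verdict})")
```

The point is not that the original trial was wrong; it is that a single significant estimate is a fact about one site, and turning it into a general predictive rule is a separate, and much harder, step.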

Here’s part of the closing to Manzi’s piece, which is a must-read if you’re interested or involved in this type of work:

It is tempting to argue that we are at the beginning of an experimental revolution in social science that will ultimately lead to unimaginable discoveries. But we should be skeptical of that argument. The experimental revolution is like a huge wave that has lost power as it has moved through topics of increasing complexity. Physics was entirely transformed. Therapeutic biology had higher causal density, but it could often rely on the assumption of uniform biological response to generalize findings reliably from randomized trials. The even higher causal densities in social sciences make generalization from even properly randomized experiments hazardous. It would likely require the reduction of social science to biology to accomplish a true revolution in our understanding of human society—and that remains, as yet, beyond the grasp of science.

Be careful who you nudge

A new study by economists from UCLA looked at the effect of a randomised program which informed households of their energy expenditure relative to their peers:

We show that while the electricity conservation “nudge” of providing feedback to households on own and peers’ home electricity usage works with liberals, it can backfire with conservatives. Our regression estimates predict that a Democratic household that pays for electricity from renewable sources, that donates to environmental groups, and that lives in a liberal neighborhood reduces its consumption by 3 percent in response to this nudge. A Republican household that does not pay for electricity from renewable sources and that does not donate to environmental groups increases its consumption by 1 percent.

An overview of the study at Slate. Hat tip to MR.

J-PAL’s Christmas shopping list

The Abdul Latif Jameel Poverty Action Lab has a new list out of “best buys” for reaching the MDGs. The suggested interventions are mostly intuitive and reasonable: providing free bednets, deworming, basic education, and empowering women using political quotas.

In the past few years, J-PAL has drastically altered the way we validate our policies, mainly by raising the bar of empirical skepticism. However, I question the specific-to-general recommendations they are making: while each individual study is rigorous and convincing, there is an implicit assumption that what works in a few isolated settings will work in every setting.

Statements like “time-limited offers to purchase fertilizers in the harvesting season, with free delivery in the planting season, can massively increase uptake and usage of fertilizers,” should have qualifiers like “in Western Kenya” added to them.

UPDATE: It looks like these are pretty common criticisms.

If only you knew the power of nagging

Give quiche a chance

I’m naturally a bit skeptical of ground-level interventions that don’t involve cash, needles or textbooks. Anything that involves dubiously titled training or “empowerment” sets off my very cynical alarm bells. However, I’m beginning to be persuaded by the evidence that targeted information campaigns work.

First there was Pedro Vicente and Paul Collier’s study of a randomised anti-violence campaign staged prior to the 2007 Nigerian elections, which showed significant reductions in violence in the treated districts. Then there was the “Heckle and Chide” study of minibuses in Kenya: a random treatment group of minibuses was given posters advising passengers to speak up if the drivers drove dangerously (which is pretty much what minibus drivers are born to do). The treatment group saw sizable declines in insurance claims, including those for injury and death.

Now there is a soon-to-be-published paper by Martina Björkman and Jakob Svensson, offering a unique randomised intervention:

  1. Assess local health providers and inform communities of their relative performance using ‘report cards’.
  2. Encourage these communities to form groups to monitor local health performance.
  3. Sit back and see what happens.

A year after the intervention, a repeat study revealed that the treated communities had harder-working health providers, higher rates of immunization, and significantly lower rates of child mortality and of underweight children, all with the same levels of funding.

The best part of the study was the lack of investigation into what the communities were doing to make these changes (there is some rough evidence that the communities became more active in electing and dissolving the local provider management committees). My guess is that a fair amount of nagging was involved.

I’ve come to believe that a crucial part of development is strengthening the accountability link between citizens and their government (not to be confused with enforcing accountability externally), especially when citizens face a trade-off in enforcing it (in this case, the time spent hassling health workers).

A few questions remain: is the effect persistent, or would health workers become more resistant to this informal accountability over time? Is it scalable? And which part of the intervention was key: the information transfer allowing yardstick comparisons between districts, or the “empowerment” workshops? My hunch is the former.

(Bonus points to those who got the Red Dwarf reference).