When blind is not beautiful

"Hello? Is it a placebo effect that you're looking for?

Over at Boring Development, Francisco Toro picks up on the recent Bulte et al. paper, which attempts to implement a double-blind protocol in a ‘standard’ policy RCT. The study’s abstract:

Randomized controlled trials (RCTs) in the social sciences are typically not double-blind, so participants know they are “treated” and will adjust their behavior accordingly. Such effort responses complicate the assessment of impact. To gauge the potential magnitude of effort responses we implement a conventional RCT and double-blind trial in rural Tanzania, and randomly allocate modern and traditional cowpea seed varieties to a sample of farmers. Effort responses can be quantitatively important—for our case they explain the entire “treatment effect on the treated” as measured in a conventional economic RCT. Specifically, harvests are the same for people who know they received the modern seeds and for people who did not know what type of seeds they got; however, people who knew they had received the traditional seeds did much worse. Importantly, we also find that most of the behavioral response is unobserved by the analyst, or at least not readily captured using coarse, standard controls.

So it appears that most of the treatment effects in this study are driven by changes in behaviour among the farmers who knowingly received modern seed varieties. Toro touts this as some sort of massive blow to the randomista movement:

This gap between the results of the open and the double-blind RCTs raises deeply troubling questions for the whole field. If, as Bulte et al. surmise, virtually the entire performance boost arises from knowing you’re participating in a trial, believing you may be using a better input, and working harder as a result, then all kinds of RCT results we’ve taken as valid come to look very shaky indeed.

….

Still, the study is an instant landmark: a gauntlet thrown down in front of the large and growing RCT-Industrial Complex. At the very least, it casts serious doubt on the automatic presumption of internal validity that has long attached to open RCTs. And without that presumption, what’s left, really?

Even if you take the results of this paper at face value (and there are some good reasons we shouldn’t), it’s hard to see here why these results should be that troubling.

The reason that medical researchers use a double-blind protocol in clinical trials is to try and pin down the exact physiological impact of a medicine, independent of any conscious or subconscious behavioural response. Placebo effects have been fairly well established, so figuring out that medicine X has an effect above and beyond the health effects created by taking a sugar pill is important. One very important thing to note, however, is that unless there is a no-placebo control group, researchers using a double-blind protocol will be unable to identify the total average treatment effect of a medicine: we will know what the impact is compared to someone else given a pill, but if we randomly selected someone in the population (with the same characteristics as the study group) to receive the treatment, we can’t say much about what the overall effect will be. Also, fairly critically, while double-blind studies allow us to make the assumption that placebo effects are similar across treatment and control groups, we cannot say anything about how they would compare to an explicit non-blind clinical trial (i.e. placebo effects might be quite different when the treated know they are treated).

Most development randomistas are answering substantially different questions than medical scientists. It is fairly easy to establish the efficacy of a set of agricultural inputs in a controlled setting: we know fertilizer ‘works’ in that it improves yields. We know vaccines work in saving lives and that increasing educational inputs, to some extent, improves educational outcomes. This was Jeffrey Sachs’s reasoning when he sold much of the world on the Millennium Village Project: we know what works, we just need to implement it. But most of us running RCTs aren’t interested in the direct impact of an intervention, holding behaviour constant, because it is precisely this behaviour that matters the most. If our question is “do improved seeds work in a controlled setting?” then a double-blind RCT is well and fine, but if our question is “do improved seeds work when you distribute them openly, as you would do in pretty much any standard intervention?” then you need transparent protocols to get at the average treatment effect you are interested in.
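To make the distinction between the two questions concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration rather than anything taken from the Bulte et al. paper: the function name `simulate_trial`, the harvest units, a hypothetical direct seed effect of 2 and a hypothetical effort effect of 8. The only point is that an open trial picks up the seed effect plus the effort response, while a blind trial, where effort cannot differ by arm, picks up the seed effect alone.

```python
import random
import statistics


def simulate_trial(n, seed_effect, effort_effect, blind, rng):
    """Return the estimated effect: mean treated harvest minus mean control harvest."""
    treated, control = [], []
    for _ in range(n):
        gets_modern = rng.random() < 0.5
        base = rng.gauss(100.0, 10.0)  # baseline harvest, arbitrary units (assumed)
        if blind:
            # Nobody knows their arm, so effort cannot differ between arms.
            # The common level (half the full response) is an arbitrary assumption.
            effort = 0.5 * effort_effect
        else:
            # Open trial: only farmers who know they hold modern seed work harder.
            effort = effort_effect if gets_modern else 0.0
        harvest = base + effort + (seed_effect if gets_modern else 0.0)
        (treated if gets_modern else control).append(harvest)
    return statistics.mean(treated) - statistics.mean(control)


rng = random.Random(42)
open_estimate = simulate_trial(20000, seed_effect=2.0, effort_effect=8.0, blind=False, rng=rng)
blind_estimate = simulate_trial(20000, seed_effect=2.0, effort_effect=8.0, blind=True, rng=rng)
print(f"open RCT estimate:  {open_estimate:.2f}")
print(f"blind RCT estimate: {blind_estimate:.2f}")
```

Under these made-up numbers the open estimate should sit near the sum of the two effects and the blind estimate near the seed effect alone. Neither is "biased"; they simply answer different questions.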

Many economists are interested in mechanisms – in picking apart the behavioural responses to a given treatment. In this respect, the Bulte et al. paper is very interesting: here we have an intervention which works primarily through behavioural response rather than a change in household resources, etc. This is intriguing and worth picking apart for getting a better sense of why interventions like these work. However, from the perspective of a policy wonk, we might care less: if you give people improved seeds then yields go up. If you de-worm children then schooling goes up. These are answers worth knowing even if that’s all we know.

For those of us interested in behavioural responses, we don’t necessarily need to run around running double-blind RCTs to get a handle on them. Consider this excellent paper by Jishnu Das and others on the effect of anticipated versus unanticipated school grants: when parents knew that their child’s school would be receiving more money, they reduced their own spending on school inputs enough to completely offset the gains from the grants. In a world in which we could have run the grant programme as a blinded RCT, it would have looked like grants were successful in raising test scores – but it would have told us precious little about how grants operate in the real world.

There’s another issue here: imposing blinding in many development RCTs creates some substantial ethical issues. Imagine, for instance, that you could fool a Kenyan farmer into not knowing whether or not she received high quality fertilizer or a bag of dirt. The average farmer might behave as if she has received nothing, she might also behave as if she had received a perfectly good bag of fertilizer, or she might hedge and use some of it, realizing that it may not be useful. Some of these decisions may be sub-optimal: if the farmer knew she was in the control group, she might have opted for a different planting method, one which would have resulted in a higher yield. In this particular example, obscuring the treatment from our study group actually runs the risk of doing them harm, especially if they believe they are treated and take complementary actions which are in fact wasteful if they are not actually in the treatment group. 

The thing you should take away from the Bulte et al. study shouldn’t be “all RCTs are biased because we aren’t measuring placebo effects” but instead “behavioural response matters for evaluating real-world policies.” The latter statement actually reinforces the need to have transparent RCTs, rather than to try and mimic the double-blind nature of clinical trials.

Who you gonna call? A mob

“Look, I’d had a lovely supper, and all I said to my wife was: ‘That piece of halibut was good enough for Jehovah.’” “Blasphemy!”

Large crowds are inherently scary. Not, perhaps, at first glance, but those of us who live in large cities do so under the assumption that while crowds are somewhat chaotic and have the potential for danger, they will never be intentionally malevolent, at least not towards us. Yet there is still an unease there, of the type that underlies the kind of horror frequently employed in post-apocalyptic zombie films or John Carpenter flicks, that very quickly the crowd can turn against you.

Perhaps this is not a totally unreasonable source of anxiety – while most violence in London (where I’ve recently started living) tends to be of the individual-on-individual sort, mob violence is more frequently a reality for many living in developing countries. Take, for instance, this new report by the incredibly prolific NGO Twaweza on violence in Tanzania. Drawing upon a nationally representative phone survey, one of its most striking results is that deaths due to mob violence appear to be more common (or are at least perceived as such by respondents) than ordinary murders.

In fact, as many people are killed by mobs as by ordinary citizens, police and the national army combined. If we consider mob violence to be a form of extra-legal justice, imagine a government which executes more people than those who commit murder. While these figures are based on perceptions and should be taken with a grain of assault, it’s worth noting that forensic investigations into violent deaths in Dar es Salaam reveal that at least 10% are due to mob justice, still a staggering number.

Yet, outside of the occasional hushed ex-pat dinner party conversation or resigned lamentations by locals, I’ve rarely heard people actually discuss the causes and consequences of mob violence in much detail (although as I write this I suspect that a reader will soon point out that there is something akin to a Journal of Mobbing Studies which I have overlooked). Malawi, where I lived for some time, seemed particularly afflicted with mob violence centred around automobile accidents, where the drivers of cars found to be at fault were often assaulted, and sometimes killed. This happened with enough frequency to create a culture of “fleeing the scene,” where drivers who were not even directly connected to an accident drove off for fear of being blamed and attacked (this was the basis for a film I shot whilst living there). I began thinking about the issue again when, recently, one of the respondents in a survey I’m helping run in Dar es Salaam was killed by a mob after (purportedly) murdering another resident.

What leads to mobbing and why does it appear to be more prevalent in societies with dysfunctional institutions? Let’s take the armchair economist position for a moment: it would probably be fairly easy to write down a herding model where people update their beliefs about a person’s guilt based on the behaviour of others. Person 1 decides, for whatever reason, that the accused is guilty, person 2 updates his/her beliefs based on person 1’s belief, and so on, until you have a mob which is convinced that the accused is guilty. If you combine that with a utility function in which the disutility or ‘guilt’ one might feel from being personally responsible for a death shrinks as the crowd grows (I take comfort in knowing I probably didn’t throw the fatal stone), it’s easy to see how mobs might easily form in a context where punishment would otherwise be uncertain.
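The herding story can be sketched in a few lines of code. This is only an illustration under assumptions of my own choosing, in the spirit of textbook information-cascade models: each person receives a noisy private signal about guilt (`signal_accuracy` is a hypothetical parameter), sees the verdicts of everyone before them, and follows the crowd once informative verdicts on one side lead by two. The function names are made up for this sketch, and the guilt-dilution part of the argument is left out.

```python
import random


def run_crowd(n_people, signal_accuracy, accused_is_guilty, rng):
    """Return the public verdicts (True means 'guilty') in order of arrival."""
    lead = 0  # informative 'guilty' verdicts minus informative 'innocent' verdicts
    verdicts = []
    for _ in range(n_people):
        correct = rng.random() < signal_accuracy
        private_signal = accused_is_guilty if correct else not accused_is_guilty
        if lead >= 2:
            verdict = True  # cascade: the crowd outweighs any single private signal
        elif lead <= -2:
            verdict = False
        else:
            verdict = private_signal  # still informative, so later arrivals learn from it
            lead += 1 if verdict else -1
        verdicts.append(verdict)
    return verdicts


def share_of_wrong_mobs(trials, n_people, signal_accuracy, seed=0):
    """Share of simulated crowds that end up condemning an innocent person."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(trials):
        verdicts = run_crowd(n_people, signal_accuracy, accused_is_guilty=False, rng=rng)
        if verdicts[-1]:
            wrong += 1
    return wrong / trials


print(share_of_wrong_mobs(trials=5000, n_people=50, signal_accuracy=0.6))
```

In this stylised setup, with signals that are right 60% of the time, a large crowd locks onto the wrong verdict with probability 0.4² / (0.4² + 0.6²), or a little under a third: two early mistaken accusers are enough to drown out everyone who comes after.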

While this sort of explanation is rather intuitive, I find it a bit unsatisfying for three reasons: i) it ignores the drivers of the probability of punishment in the counterfactual, ii) it assumes that mob behaviour is solely defined over the desire to inflict justice on the guilty and iii) it tells us nothing about why people are more likely to be stoned to death in Dar es Salaam than, say, Myrtle Beach. A couple of thoughts:

i) What happens if we don’t stone people to death? Most socio-cultural explanations share a similar premise: people rely on mob violence precisely because they do not trust the formal justice system to get the verdict right. If the police and judiciary are capable of finding and punishing the right person, our need to rely on selection-via-herding decreases. If this is true, then strengthening the formal justice system should reduce mob violence. This falls apart if people can selectively engage the formal system – if mob justice isn’t just about guaranteeing some form of punishment, then perpetrators may still choose vigilantism over bringing in the police. This brings me to the next thought:

ii) Is this really about punishment? As with most social/political/economic concepts, Monty Python got there first: in Life of Brian, overzealous women disguise themselves with fake beards so they can throw stones at people for the fun of it. If mobs are primarily made up of young angry men, then we might begin to suspect this has more to do with the tendency for young, angry men to enjoy a bit of the ultra-violence.

Is there a quick fix here, other than waiting for the legal system to become strong enough to both reasonably guarantee punishment of those that commit the initial crime and those who engage in mob justice? Given the snowballing nature of mob violence, moving quickly to both disrupt the initial signal (that the accused must be guilty) and raise the cost of participation (a less extreme version of the Desmond Tutu method of mob justice defusal, perhaps) might help. Do we need a roving band of mob-busters to save the day?

Or perhaps it is reasonable that mob justice is so infrequently subject to policy discussion – it is something which probably declines as countries get richer and their institutions grow more robust, so is it really deserving of too much scrutiny? 

Don’t damn the man, migrate away from him

“Sorry Luke, it would frankly be immoral for me to suggest you leave your aunt and uncle’s farm to fight the Empire. Keep your head down and consider moving somewhere a little less Empiresque.”

In a recent blog post, Bryan Caplan goes after the argument that poor people who wish to migrate away from dysfunctional states should stay there and fix their political system.

When I point out that would-be immigrants are trying to save themselves and their families from hellish Third World conditions, my critics often respond, “They ought to stay home and try to fix their broken political systems!”

For many of the world’s poor, the chances for successful change are slim to none. When compared with the gains from migration, the decision is a bit of a no-brainer. Furthermore, a person’s decision to migrate (flee) should already contain some information about their ability or will to influence their own political system, so these are often the last people who should be sticking around. Given that most of us living in rich countries go out of our way to protect ourselves and our families from unnecessary risks, the suggestion that poor migrants should put themselves on the line is a little unfair.*

Yet Caplan takes what should be a straightforward counter-argument based on the expected returns to political activism and instead tries to moralize it by hatin’ on political activists.

Thus, suppose Jacques the desperate Haitian father has an opportunity to escape to Miami, where he can shine shoes and send money home to feed his kids.  Instead, he chooses to let his kids go hungry so he stays in Port-au-Prince and fights tyranny with political leaflets and soapbox speeches.  Noble?  No more than John.  The righteous man knows that meeting his family responsibilities is more important than playing Don Quixote.

Then he goes after the very notion of activism itself and, in a one-man demonstration of Godwin’s law, manages to link activism with Hitler.

Indeed, triumphant activists routinely give new meaning to the word “tyranny.”  See Lenin, Hitler, and Mao for starters.

Yikes. It’s one thing to point out that staying in Haiti is not always the most cost-effective way to improve your life; it’s quite another to condemn those who have what I would describe as “activist preferences.” The decision to stand up to the man isn’t an easy one, nor does it often make economic sense, so we should never condemn anyone for failing to stand up against the man when they have everything to lose and nothing to gain.

Yet judging whether or not political activism will be successful is pretty difficult stuff. Actually, I would argue that successful political activism is defined by its unpredictability, which makes it terribly hard to put a normative judgement on. The self-immolation of Mohamed Bouazizi which kicked off the Tunisian revolution and possibly the entire Arab Spring made very little rational sense – Caplan would label Bouazizi as irresponsible for the family he left behind when he killed himself.

I agree that it would have been wrong to condemn Bouazizi if he had instead taken a boat to continental Europe, and I would like to live in a world where other people can easily escape, but shouldn’t we also do our best to support those who have revealed a preference for ‘fighting tyranny’? While the world would be immeasurably better off with more open borders, achieving that milestone does not permit us to ignore the injustices that remain around the globe, be they political or economic.

 

 *Although it should be noted that the migration decision, especially if done illegally, can itself be very risky.

Development as freedom from back pain

The Batman solution to back pain: repeated punches to the back, lots of push ups, gruelling climb out of a pit of despair. Sounds a lot like life during my PhD.

As if turning 30 wasn’t enough of an incentive to start feeling anxious about getting older, I recently started having back trouble. The other day, getting out of bed, I threw my back out, and so ended up on the floor with my iPad, as usual contemplating how I could turn this unfortunate turn of events into a blog post.

Lower back pain is particularly frustrating, because as far as the medical establishment is concerned it is an ailment without a clear treatment. Even the most standard type of treatment prescribed by the NHS (rest, painkillers and physio) only shows very moderate success.

This frustration pales in comparison to that of having everyone tell you what they think you should be doing. Physiotherapy,  Yoga, massage, chiropractor, better posture, swimming, acupuncture, eating rare herbs and lying down (this suggestion came from a Tanzanian friend), or the standard GP response of just deal with it.

Many people, often those who have suffered from pain themselves, will swear by their given treatment. I’ve always found this perplexing: surely if there was an obvious method for curing lower back pain, that method would quickly have spread and someone would have become very rich. There are of course reasons why this might not be the case. Let’s consider a few:

1. None of the treatments work, and people just randomly recover from back pain.

This is particularly disconcerting, but given that most of these treatments haven’t been proven with rigorous methods, it’s perfectly possible that people are just recovering at random. If you are trying out treatment X when you happen to get better, it’s likely that you’re going to start seeing a causal relationship where there isn’t one.

2. People have back pain for random reasons and some treatments only work for some types of lower back pain.

This is possibly even more disconcerting. There are myriad potential causes of back pain, and not every treatment will work for each of them. So even if you run an RCT examining the impact of a given treatment on pain, if the proportion of people suffering from the exact ailment that the treatment will fix is small enough you might end up failing to reject the null hypothesis anyway. So no particular treatment wins because we don’t have a good sense of what causes back pain, nor which treatment is most appropriate for a given circumstance.
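The second scenario is easy to check with a quick simulation. The sketch below is an illustration under invented assumptions, not a model of any real back-pain trial: pain changes are drawn from a standard normal, the treatment lowers pain by a fixed amount but only for a share of patients (`responder_share`, a hypothetical parameter), and each simulated trial runs a simple two-sided z-test at the 5% level.

```python
import math
import random
import statistics


def trial_rejects_null(n_per_arm, responder_share, effect_on_responders, rng):
    """Run one simulated RCT; True if a two-sided z-test at 5% rejects 'no effect'."""
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    treated = []
    for _ in range(n_per_arm):
        pain_change = rng.gauss(0.0, 1.0)
        if rng.random() < responder_share:
            pain_change -= effect_on_responders  # the treatment only helps responders
        treated.append(pain_change)
    diff = statistics.mean(treated) - statistics.mean(control)
    se = math.sqrt(statistics.variance(treated) / n_per_arm
                   + statistics.variance(control) / n_per_arm)
    return abs(diff / se) > 1.96


def power(trials, n_per_arm, responder_share, effect_on_responders, seed=0):
    """Share of simulated trials in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    hits = sum(trial_rejects_null(n_per_arm, responder_share, effect_on_responders, rng)
               for _ in range(trials))
    return hits / trials


for share in (1.0, 0.5, 0.1):
    print(share, power(trials=300, n_per_arm=100, responder_share=share,
                       effect_on_responders=1.0))
```

With these made-up numbers, a treatment that works for everyone is detected almost every time, while the same treatment working for one patient in ten is missed in most trials, even though it is just as effective for the people it suits.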

I feel that most of development is (unfortunately) a lot like back pain. There are a lot of people out there who think they know the answer, but if they are living in worlds 1) or 2), where development is random or countries exhibit heterogeneity in the underlying structural prerequisites, then we’re in for a tough time. This isn’t a call to start lamenting – we just need to be aware of the various biases which lead us to over-prescribe certain policies (situation (1)) and under-prescribe others (situation (2)).

Come work for me in Tanzania (short notice)

We are looking for a short term field manager to run a household survey in Dar es Salaam, Tanzania on short notice.

  • The field manager would oversee data collection exploiting a natural experiment in the roll-out of land titling in the city. The aim of the study is to investigate whether the provision of short-term land titles by the government has led to observable differences in household behaviour and welfare. This work is part of a larger portfolio of research of urban property rights in Tanzania.
  • The position would be for approximately 3 months, based in Dar es Salaam from early January through March 2014.
  • Required: we are looking for candidates who have experience with Stata and working with household survey data. A bachelor’s or MA in a quantitative field is preferred. Previous experience working in developing countries, running surveys or managing complex projects is a plus.
  • You would be working with me, Justin Sandefur and Andy Zeitlin on a fixed contract with the University of Oxford and would oversee a third-party firm which would cover all the practical logistics of data collection. We need someone who understands data, can grasp the research design and can ensure quality.

If you are interested, please write to me at this e-mail address: matt@aidthoughts.org. Please send a CV, an e-mail cover letter explaining why you are interested in the position and your relative strengths, and a description of your experience working with data, field experience, etc. In the subject line, please write: “DSM position: <your name here>”. E-mails which fail to do this will not be considered. We will only be in contact if you make the short list.

UPDATE: the position has now been filled. Thank you everyone who submitted.

I felt a great disturbance in the force

“It’s as if millions of economists searching for a natural experiment suddenly cried out in joy.”

From the BBC:

China is to relax its policy of restricting most couples to having only a single child, state media say.

In future, families will be allowed two children if one parent is an only child, the Xinhua news agency said.

The proposal follows this week’s meeting of a key decision-making body of the governing Communist Party.

Math fail

Until  my early 20s, I never knew that one could become good at math. In high school, I ended up failing 10th-grade math.

That’s Marc Bellemare discussing his struggles with learning mathematics in high school and undergrad. For those of you who don’t know, Marc is now an economics professor and is comfortable enough with math to write theory-heavy journal articles. His story about grappling with the subject and eventually learning to master it is well worth a read, especially for those who believe they are inherently bad at math.

I didn’t struggle with mathematics for quite as long as Marc did, but was nearly dissuaded at a much earlier age by the tyranny of early math education: arithmetic.

I’ve never been particularly good at adding, subtracting, multiplying or dividing. How much should we leave for a tip? I’ll let my calculator decide. It’s no surprise then that I found math in elementary school so daunting: we were required to do randomised times tables, where we had to answer as many addition/multiplication questions as possible before an alarm clock went off. I found this immensely stressful and found it very difficult to remember what 7 x 13 was when I knew that any minute now a clock was going to go BZZZZZZZ (I sense there has been a generational improvement though: my father noted that his math classes at a Roman Catholic seminary involved the lecturer smacking students in the back of the head until they got the question right).

When I go back and look through my elementary report cards, I can see how poorly I did: Cs and Ds in basic math, with worried remarks by teachers. Clearly math wasn’t my thing.

Then I was introduced to algebra. You see, arithmetic was usually taught as an exercise lesson: you don’t think about what 7 x 13 means, you remember it. But once math becomes more abstract, it becomes more conceptual and substantially less about memory. I loved algebra. In fact, I loved algebra, trig, and calculus so much that I went on to major in math in university, where I eventually semi-defected to the economics department.

Readers of this blog will probably be past the point where they make significant choices about their math education, but something to keep in mind when you have kids: it’s incredibly easy to be discouraged by math, especially in the early days when it is more about memorization. Others struggle with the more abstract stuff, but as Marc points out, this is a better reason to double down, rather than abandon it for good.*

 

*Obviously this should only be done to a point – everyone has comparative advantages.

Is the land grab debate a proxy war?

Is the land grab debate about property rights or consolidation?

So much to write, but so little time to do so. Instead I’d just like to end the week with a quick thought on the evolution of the land grabs debate. I’ve been slowly picking my way through Lorenzo Cotula’s fairly comprehensive book on large scale land acquisitions, and I was struck by the following passage:

“Also, in some cases it is difficult to tell whether a reported deal relates to a new plantation, or to the acquisition of an existing plantation – for example, where a state farm is privatized. The two types of deals would have very different consequences for pressures on land, though even acquiring an existing farm can increase land competition – for instance, if an old state farm has been partly occupied by squatters who are evicted following the privatization, or if the deal involves expanding the existing plantation.”

Cotula is reflecting on the difficulties of distinguishing, in the Land Matrix, between land purchases which involve some form of consolidation (land owned by multiple smallholder farmers being converted into large-scale farms) and those which do not change the scale of land ownership. This is an important distinction, as it implies entirely different concerns over large scale land acquisitions.

For a large part, the land grab debate has been presented as an issue of property rights: rural communities are having their (possibly customary) rights to land violated when governments lease or sell the land to large national or international firms. This implies direct welfare losses from losing control of a productive asset – imagine if someone showed up and stole your laptop or your main mode of transport (or your house).

But there is a second issue here: even if property rights were perfectly enforced and all large scale land acquisitions were both fair and voluntary, they would still involve a significant amount of land consolidation, with smallholder plots being converted into much, much larger farms. This raises an important question: once we sort out the rights issues, what form of agriculture would we actually like to encourage in these settings?

It is no secret that many NGOs, such as Oxfam, have a bias towards smallholder farming (let’s leave aside whether or not that bias is justified; it could very well be). Is the current onslaught on large scale land deals by these NGOs purely about protecting the rights of people, or is this just another front in a much larger war on land consolidation?

Not only the state will bleed

There wasn’t much for me to do when I first joined the Budget Division of Malawi’s Ministry of Finance back in 2006. My particular position had been vacant for almost a year, so it took a bit of time before the acting budget director grew accustomed enough to having me around to start diverting work my way. One of the very first things I worked on was an attempt to reconcile the difference between expenditure ceilings set by my department and actual reports of expenditure from the Accountant General’s department.

What complicated this process was the fact that the Accountant General had recently adopted an Integrated Financial Management System (IFMIS), essentially a comprehensive software platform for approving and tracking expenditure. A lot of promises came with IFMIS – the ability to track expenditure in real time and keep a tight leash on expenditure by line ministries. Yet, when I had arrived, the budget department had yet to fully adopt the platform, meaning that our (often fairly specific) budget ceilings had to be manually reconciled with IFMIS-generated expenditure reports.

I doubt that the budget director seriously believed that this greenhorn civil servant was really going to accomplish much with this work and probably saw the task as something to keep me busy while I grew more accustomed to my environment. Even so, I quickly noticed that IFMIS-generated reports seriously deviated from what was being approved by the Budget Division, sometimes even showing expenditure which was above and beyond what had been mandated by our department.

At my director’s prompting, I visited the relevant department at the Accountant General’s to request more detailed reports from IFMIS. The likely culprit was some sort of data problem, and I was curious to get to the bottom of it, seeing the whole exercise as a problem with some sort of technical solution. While the civil servants I spoke to at the AG were friendly enough and agreed to send me reports, upon my return to the Ministry of Finance it was later made clear to me that the AG wasn’t too fond of this unknown fresh-faced mzungu making random requests. Not long after, more pressing work diverted my attention, and this particular issue faded into the background.

Later, our own department grappled with the adoption of IFMIS. While technological solutions are frequently touted as solutions to institutional problems (this platform will eliminate corruption!), my experience was that without some basic level of capacity in place, even the most advanced platform was doomed to fail. Hence, if two government ministries can’t keep their budget tallies synchronised in Excel, they are unlikely to be able to get a more complex ‘black box’ system to work properly. This is problematic, because when finance systems don’t work properly, it’s very difficult to tell the difference between corruption and incompetence.* My feeling at the time was that the discrepancies in the AG’s expenditure records were due to the latter, even though I heard the occasional, unsubstantiated whisper that someone at the AG was stealing money.

This was surprising to me, as there had been a fairly visible crackdown on corruption and leakage during the first term of then-president Bingu wa Mutharika. However, it was widely recognized that during his second, more tumultuous term (which began after I had left the country), government systems became more porous and corruption became more common.

One might have expected things to improve upon Mutharika’s sudden death and the ascension of the pragmatic Joyce Banda to the presidency. Yet despite wowing a lot of donors and even some skeptics – including yours truly – her government seems to have inherited many of its predecessor’s failings: a recent scandal has broken out over allegations that there has been substantial theft by employees of the Accountant General’s department, who exploited loopholes in IFMIS to siphon off money.

We tend to lump all dodgy dealings into the broad category of corruption, but there is a clear difference between institutionalised corruption, where political leaders divert resources towards their own benefit, and the kind of rampant theft which goes on when you have a leader who either is unaware of or cannot control corrupt practices. Banda’s situation clearly falls in the latter – given that she has, until very recently, ruled over a cabinet of former members of Mutharika’s party as well as the opposition – she has always been in a precarious position and thus unable to fully keep everyone in her government in line.

The scandal hasn’t been completely bloodless. The recently-appointed director of the Budget Division, Paul Mphwiyo, was nearly shot to death following his attempts to close the loopholes leading to theft of public resources. I knew Paul during my time in Malawi: he was serving as an assistant budget director when I was working for the Ministry of Finance, although we didn’t often work closely together. Let’s hope he recovers quickly and his assailants are eventually apprehended, although I have my doubts about the latter.

For those wanting to keep tabs on the scandal, Kim Yi Dionne remains an excellent source for recent Malawi news and analysis.

 

*This confusion can be easily exploited.

Update: This post got a little more attention than I thought it would, so I just wanted to add a little addendum.

I want to be cautious about drawing too many conclusions from my (very brief) interaction with the AG’s system – the Cashgate scandal is another animal entirely. In weighing the corruption or incompetence possibilities, it’s highly likely that my situation fell into the latter. I just felt it was worth noting that these things aren’t always clear, and that there was a bit of an administrative wall between the Accountant General’s Office and the Budget Division of the Ministry of Finance (they were, at least when I was there, separate `votes’ on the cabinet and in separate buildings). Also, for the sake of my former department, I want to make it clear that this thing at least seems to be entirely of the AG’s making, and I saw nothing in the Budget Division during my time there that suggested any wrongdoing of this sort.

An apple a day means nothing in a complex system

“But Mulder, the evidence I’m seeing here goes against every single case study and ethnographic paper I’ve ever read.”

Recently there has been much fuss made over how researchers and practitioners should be more cognisant of how development policy plays out in environments which are characterised by complexity. While many have used the presence of complex systems to motivate a move towards more experimentation, tracking and empiricism, others have argued that we should instead eschew rigorous empirical methods (such as RCTs) and one-shot policy instruments and opt for a more dynamic, qualitative approach to development policy.

As of late I have been particularly wary of this second camp, especially of the argument that data-driven methods and randomised controlled trials have little place in a world of complexity. Let me explain why this makes me uneasy.

The human body is itself a complex system, characterised by feedback loops and a lot of unknown parameters. Despite the fact that we know a surprising amount about what makes us tick, thanks to both theory and evidence from biology and medical science, we’re surprisingly inept at determining long term outcomes. Even so, when my complex system throws up signs that things are not well, I go to see my doctor. After examining me and assessing my symptoms, sometimes through laboratory testing, he makes a diagnosis. Based on that diagnosis, he chooses a treatment, often by selecting a pre-approved medication which has been tested using an RCT.

Let’s think about this for a moment. Most medical research is able to cleanly discern short-term benefits to taking a certain medication. While these medicines are developed using a heavy dose of (biological) theory and iterative testing, trials are rarely long enough to determine what the long term benefits or side-effects will be. While researchers can use previous results and theory to determine that chemical X will result in reaction Y in a human body, they rarely can account for all the possible effects. Randomised controlled trials get us part of the way there, but frequently cannot account for long term effects. So, while we can measure the aggregate effect of a treatment on an incredibly complex system in the short run, we really can’t say that much about the long term, nor can we say much about how these treatments might interact with other treatments.

In fact, it is in predictions about health over the long term that the precision of experimentation often gives way to less robust evidence (such as extended observational studies) or more ad hoc forms of rationalization (is milk good or bad for you?). Similarly, many of the bigger questions in development (how do we improve institutions? What causes economic growth?) are more difficult to address using the most rigorous methods. It is in these areas that, quite naturally, the randomistas have been least successful in their domination of the policy debate.

While we should find all of this disconcerting, the (current) inability of medical RCTs to give us definitive answers on what makes us live longer or be healthier in aggregate is hardly a reason we should rely on them any less. Imagine a world in which your doctor didn’t have access to any randomised medical research. Health professionals would have to resort to casual Bayesian inference to treat people (did John die when I gave him chemical Z?), and would have little sense of which medicines were `proven’ to work. We tend to look down on off-label use of medication, but in a world where rigorous scientific testing isn’t the norm, all prescriptions become off-label. It is a world not a million miles away from the one portrayed in the Mitchell & Webb sketch “Homoeopathy A&E.”

The sketch also highlights what the development policy world is like when we toss out rigorous empirical evidence. Yes, decisions are made based on qualitative expertise, but they are made without either definitive evidence (did this make a difference?) or appropriate empirical feedback (are things getting better?). A healthy dose of qualitative work is essential in development policy-making, but a world in which all decisions are made qualitatively is far from ideal: how many of you would wish to be treated by a doctor who had been practising for 40 years, but had never read (or believed) a single medical study?

Just as medicines shown to work in rigorous clinical trials are an essential tool for a doctor navigating the complexities of human health, policies which have been shown to work in some context with an RCT become one of many tools policy-makers can use when operating within a complex policy environment. These types of rigorous trials certainly won’t solve all of our problems, but they are still extremely, extremely useful, even in a complex system. I’m glad that someone is putting out useful, albeit marginal, medicines which make me feel better when I get sick. It would be even better if someone could figure out more comprehensive interventions which take into account my entire biology, but in the meantime I’ll take what I can get.