The sum of parts

You’re doing it wrong

The IMF has a new paper out on gender budgeting efforts in sub-Saharan African countries:

Gender budgeting is an initiative to use fiscal policy and administration to address gender inequality and women’s advancement. A large number of sub-Saharan African countries have adopted gender budgeting. Two countries that have achieved notable success in their efforts are Uganda and Rwanda, both of which have integrated gender-oriented goals into budget policies, programs, and processes in fundamental ways. Other countries have made more limited progress in introducing gender budgeting into their budget-making. Leadership by the ministry of finance is critical for enduring effects, although nongovernmental organizations and parliamentary bodies in sub-Saharan Africa play an essential role in advocating for gender budgeting.

These sorts of efforts have certainly improved in both scope and sophistication. Back when I worked in the budget division of the Malawian Ministry of Finance, I was only asked once to perform any sort of analysis of the gender focus of the budget. The request that landed on my desk had come from the Commonwealth, who wanted to know how many times the word “gender” had been used in any of the previous presentations of the national budget to parliament. After fishing out the transcripts from the Ministry’s library, I eventually discovered the answer was “zero.”

Crowdfunding lives

Crowds sometimes choose inefficient ways to save lives.

A week ago, two climbers from Utah disappeared in a Pakistan mountain range. The two had already attempted to climb that particular peak last year, but a nearly-fatal accident prevented them from reaching the summit.

Since their disappearance, the internet has successfully crowdsourced over $100,000 to mount a rescue mission.

According to the impact calculator at The Life You Can Save, 100 grand could at the very least save dozens of lives if donated to the right charity. But we haven’t seemed to figure out how to make these causes feel quite as urgent as two blokes stuck on top of a mountain.

Economists aren’t really supposed to judge people’s preferences, and it’s unlikely that the plight of the mountaineers is displacing money that otherwise could have been used to de-worm people or give them cash transfers. In fact, there is some (hopefully soon to be released) evidence that even big charity appeals lead to a net increase in people’s propensity to give. But it would be nice if that urgent, empathetic urge to give could be activated in the deeply impersonal world of effective altruism.

Help me test a (very silly) hypothesis by answering a few questions


I’ve held a silly hypothesis in my head ever since I was a grad student, but never had the time/resources to test it. I just recently came across a publication which drastically reduced the costs of testing the idea. It will almost certainly result in a “jokey” paper, but a fun one nonetheless.

But I could use your help. I have constructed an online survey displaying photos of people, and I need respondents to tell me whether these people are smiling, frowning or have neutral expressions. There are over 170 questions, but they are randomized, so even if you only manage to answer a few (and then close the window), it still would help a lot!

I can’t tell you what the idea is just yet, because it might spoil how you answer the questions. More information to follow, once enough people have answered the survey.

Click through here to enter the survey.

On Hirschman, measurement and RCTs


Every intervention is unique, perhaps unintentionally so.

Recently, Dan Honig of Johns Hopkins forwarded Ranil and me some thoughts he had in reaction to an Albert Hirschman book on development projects that he felt was pertinent to the discussion on the pros and cons of RCTs. What follows is a discussion (rant) between Dan, Ranil and me. I’ve edited out the e-maily bits for clarity:


Matt, just after I hit send on this I realized I should have included you on this – I generally think you’re right on RCTs and the stale-ness of the conversation (and discussed this with Ranil a few months back some 2 days after you had dinner with him, hence the cc to Ranil) but feel like I’ve never seen this Hirschmann frame and wondering if it struck you as interesting. And yes, basically I’m trying to catalyze you writing something cool on this so I can quote/reference it down the road

Reading Hirschmann’s Development Projects Observed for the first time, and as I read it he’s with [Lant Pritchett and Michael Woolcock] on RCTs and causal density in international development projects. The quote below is from page 186 of the 1967 edition; italics are his, brackets mine; just before this he suggests we may not be able to identify good indicators of effects ex-ante and thus presumably couldn’t be pre-specified in a trial, meaning presumably we would be ill served by an RCT on a particular intervention even if we ignored external validity concerns.

“The indirect effects [of development projects] are so varied as to escape detection by one or even several criteria uniformly applied to all projects. Upon inspection, each project turns out to represent a unique constellation of experiences and consequences, of direct and indirect effects.”



Hey Dan, that’s a really interesting quote by Hirschman. If my interpretation is correct, it seems to be more damning for empirical evaluation in general than for RCTs in particular.

I’m not sure how I feel about this. Even if you move away from a simple, reduced form causal framework, Hirschman’s critique seems like it would apply. Even if development is a messy, complex thing that can’t really be boiled down in an impact evaluation framework, we still rely on measurement when we talk about development, and any given set of measurements is going to leave out things which might matter which are unmeasured. We can point at improving test scores but leave out student stress, etc, and the set of things that we leave out that might be important will change depending on the context. I guess I see this as a problem of measurement rather than as a problem for RCTs.

I also wonder what this means for how an empirical researcher operates. Over the last few years, I have become incredibly suspicious of surprising, counter-intuitive results, where a researcher measures something outside of the standard set of outcomes and finds a result. In a world of multiple hypothesis tests, expanding the set of outcomes to include as much of Hirschman’s unique constellation as possible will open up the door to a lot of false positives which will end up getting written up and published.

So that was a rant. Um, what do you think Ranil?

Come work for me in London or Washington D.C.


Like development? Like statistics and coding? Come join me and Vij Ramachandran at CGD.

The Center for Global Development, an independent, non-partisan research organization in Washington, DC and London, UK seeks a Research Assistant (RA) to support the work of Matt Collin and Vijaya Ramachandran. The successful candidate will have experience with research in economics, public policy, political science or a related field and will be based either in London or in Washington DC.  Applicants must demonstrate strong quantitative, analytical and communication skills. The position is well-suited to those who are considering doctoral study in the future.



Universal Basic Income: The Next Generation

"Wait, you're saying that in the Federation you don't have to worry about money? You can just schlep around the galaxy seducing alien women?"

Over at FiveThirtyEight, there’s a nice piece by Daniel Flowers on the idea of a universal basic income (UBI). Proponents say it will allow people to choose their careers and live their lives without having to worry about ever being poor. A common criticism is that it will create a massive disincentive to work at all. Several experiments have already been run which have found small, but non-negligible effects on the willingness to work. It is one of the outcomes that a host of new experiments, which will give people a long-term guaranteed basic income, are set to test.

I am a little worried that these new experiments won’t capture the long term, generational impact of a universal basic income. Let’s imagine I really wanted to be a filmmaker (*cough*), but decided to become an economist because filmmaking is more likely to leave me in poverty. If I’m halfway through my career as an economist and I start receiving a basic income, it might be too late for me to really break into filmmaking. Even if I pull it off, adjustment costs will be high, and it’s likely I’d end up embarking on a career which would be less successful than if I had started at a much earlier age. It is mostly these decisions that will be picked up when the targets of a UBI experiment are largely adult workers.

What is more interesting is the impact on the next generation. Let’s imagine the UBI is introduced and governments can credibly commit to providing it for one’s entire lifetime. Now, all those aspiring filmmakers can select into the job of their choice with lower adjustment costs and a higher likelihood of actually being accomplished at what they’d most like to do. Who knows if this would have a net positive or negative impact on the creation of value, but it certainly would lead to better sorting. But even the most ambitious UBI experiments which are being proposed are unlikely to pick up these effects. Instead what we need to do is find a group of high schoolers and offer some of them (randomly) a credible lifetime UBI, then sit back and see how it affects career decisions and labour market participation in the long run.

As a side note – how much does relative poverty in developing countries lead to sub-optimal career decisions?

The road out of hell

In the FT yesterday, Branko Milanovic suggested that we might be able to increase global migration by reducing the citizenship rights of migrants. This is not a new idea – Lant Pritchett brought it up ten years ago and it is widely practiced by many Gulf states.

Over at Crooked Timber, Chris Bertram made it clear he really doesn’t like this idea, comparing Milanovic’s suggestion to “recreating apartheid” (I suspect the word apartheid will eventually be subject to its own form of Godwin’s law):

Milanovic wants us explicitly to abandon the liberal and democratic principles of legitimacy that those who are subject to the laws of a society should (in time in the case of migrants) get to have the right to make those laws. In doing so, he goes far beyond similar proposals (for example from Martin Ruhs) that have been explicitly temporary in nature and have largely focused on labour-market rights. Milanovic’s lack of commitment to the norms of liberal democracy also comes across in the fact that he holds up illegitimate and tyrannical states, such as the Gulf kleptocracies, as models for his proposed policy.

Part of what’s going on here is the economist’s perspective on policy, which just focuses on net improvements in well-being or utility, with income serving as a proxy, and which doesn’t, therefore, see human beings as possessed of basic rights which it is impermissible to violate. Rather, all and any rights can be sacrificed on the altar of income improvement, just in case someone is poor and desperate enough to make a deal (who are we, paternalistically, to stop them?). The road to hell is paved with Pareto improvements.

Let’s be absolutely clear: we are already in hell and we are trying to find a path out of it. International migration restrictions – as they stand – already enforce a global system of apartheid. Most of global inequality in income (and likely in health and happiness) is driven by where you are born. By preventing someone mired in poverty overseas from moving to a place where they can make a better life for themselves – even temporarily – we are implicitly denying that person the same rights that we enjoy every day (rights that most of us have inherited, not earned). These are also arguments that Pritchett made before.

Human beings have a proximity problem: inequalities in outcomes or rights which are proximate to us (on the right side of an arbitrary national boundary) are weighted much higher than massive, gaping inequalities which are harder to observe because the people bearing the brunt of that inequality happen to live overseas.

We would all agree that a migration system which allows for restricted freedom is a worse solution than a system which allows for the same amount of migration with no restrictions on freedom. But the latter system does not exist, nor has anyone managed to propose a convincing way to get there. I don’t know if a Milanovic/Pritchett system would work, but I can think of two main reasons why we might not want to consider it:

  1. There is a lower cost path towards a system which does not limit freedom that we can implement sooner.
  2. Adopting a system based on limited citizenship now will somehow make it harder to move to a free system later on.

If Bertram really wants to make a convincing case against the Milanovics of the world, he needs to start by showing us a better road out of hell.

The limitations of the Absolute Palma Index, in two graphs


Last year, the ODI’s Chris Hoy released a really useful and thoughtful paper pointing out that the basic maths of inequality are often not on the side of the poor. Even if economic growth is evenly spread, the absolute difference between the incomes of the poor and the richest must increase. That is, if you are 10 times as rich as I am and our incomes both grow by 10%, you’ll be taking home more money than I will at the end of the day. If we wanted to see a decrease in absolute differences of income around the world, it would require that the income of the poorest grow a great, great deal faster than that of the richest, something we are unlikely to see any time soon.
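To make that arithmetic concrete, here is a minimal sketch. The incomes (100 and 1,000) and the 10% growth rate are made-up illustrative numbers, not figures from Hoy's paper, and the helper names are hypothetical.

```python
def grow(income, rate):
    """Return income after one period of growth at the given rate."""
    return income * (1 + rate)


def required_poor_growth(poor, rich, rich_rate):
    """Growth rate the poor need just to keep the absolute gap constant."""
    return rich * rich_rate / poor


# Hypothetical incomes: the rich person earns 10 times what the poor person does.
poor, rich = 100.0, 1000.0
rate = 0.10  # both incomes grow by 10%

poor_after, rich_after = grow(poor, rate), grow(rich, rate)

gap_before = rich - poor              # absolute gap before growth
gap_after = rich_after - poor_after   # absolute gap after growth
ratio_before = rich / poor            # relative gap before growth
ratio_after = rich_after / poor_after # relative gap after growth

print(f"absolute gap: {gap_before:.0f} -> {gap_after:.0f}")
print(f"relative gap: {ratio_before:.1f}x -> {ratio_after:.1f}x")
print(f"poor growth needed to hold the gap: {required_poor_growth(poor, rich, rate):.0%}")
```

With these assumed numbers the ratio of incomes is unchanged while the absolute gap widens, and the poorer income would have to double just to stop the gap from growing.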

The unanswered question, and one that Hoy even poses himself at the end of the paper, is whether or not focusing on absolute differences in income makes more sense than doubling down on the relative differences in income that are captured by traditional inequality measures such as the Gini, Theil or Palma indices. We know that income is correlated with lots of good outcomes for the beholder – better health, education, happiness and political power. However, if we are being truly honest with ourselves, we would have to admit that we don’t quite fully understand whether these relationships are absolute or relative in nature (although we suspect both matter for happiness). Do the richest 1% of Americans have more political power in the US than the richest 1% of Nigerians have in Nigeria? These are the questions we must ask ourselves if we are to make a strong case for caring about absolute income differences.

In the meantime, I woke up this morning to find that Nick Galasso from Oxfam has made a pitch for using the “Absolute Palma Index” as the next big measure of inequality. The Absolute Palma is a variation of the Palma Index of inequality, which itself is the ratio of the share of income earned by the top 10% of the distribution and that of the bottom 40% of the distribution. The Absolute Palma, by contrast, is the absolute difference between the average income of the top 10% and the average income of the bottom 40%.
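The two definitions can be sketched in a few lines of Python. This is only an illustration: the function name `palma_measures`, the toy income list, and the simplifying assumptions (equally weighted individuals, a population size divisible by ten) are mine, not taken from the Oxfam pitch or from any of the papers mentioned.

```python
def palma_measures(incomes):
    """Return (palma_ratio, absolute_palma) for a list of individual incomes.

    Assumes equally weighted individuals and a population size divisible
    by 10, so the top 10% and bottom 40% are whole numbers of people.
    """
    ranked = sorted(incomes)
    n = len(ranked)
    if n == 0 or n % 10 != 0:
        raise ValueError("need a non-empty population divisible by 10")

    bottom40 = ranked[: 4 * n // 10]   # poorest 40% of people
    top10 = ranked[n - n // 10:]       # richest 10% of people
    total = sum(ranked)

    # Palma: share of income going to the top 10% over the share of the bottom 40%.
    palma_ratio = (sum(top10) / total) / (sum(bottom40) / total)
    # Absolute Palma: mean income of the top 10% minus mean income of the bottom 40%.
    absolute_palma = sum(top10) / len(top10) - sum(bottom40) / len(bottom40)
    return palma_ratio, absolute_palma


# Hypothetical ten-person economy.
incomes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 55]
ratio, gap = palma_measures(incomes)
print(f"Palma ratio: {ratio:.2f}, Absolute Palma: {gap:.2f}")
```

The first number is unit-free, while the second is denominated in whatever currency the incomes are measured in.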

As the title suggests, I think there are limitations to the Absolute Palma Index, so consider this post a word of caution. I can think of one strong case against absolute measures: while they might be reasonable at describing immediate gains across a country’s income distribution after a year of growth, they aren’t very useful at describing differences between countries across the globe.

I happened to be playing around with data from Christoph Lakner and Branko Milanovic’s paper on the global income distribution, so I decided to see how the Absolute Palma Index varied across countries. Check out the graph below, which looks at how the Absolute Palma Index varies with mean income across countries. I’ve also highlighted countries which are either very unequal, very equal or somewhere in the middle as measured by the traditional Palma Index.



The first thing to note is that there is almost a one-to-one relationship between the log of GDP and the log of the absolute Palma. This is hardly surprising – take any income distribution and raise all incomes by a set percentage and by definition you will see an increase in the Absolute Palma. What this means is that on this index, poor countries do really, really well and rich countries do terribly. And that is most of the story. Log per capita income explains about 93% of the variance in the log of the Absolute Palma. The relative Palma explains most of the remaining unexplained variance, but on the whole has very, very little explanatory power.
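The near one-to-one relationship can be seen with a little algebra. The notation here is an assumption introduced for illustration: $\mu$ is mean income, $s_T$ is the income share of the top 10%, $s_B$ is the income share of the bottom 40%, and $P$ and $AP$ are the relative and Absolute Palma.

```latex
\begin{align*}
\bar{y}_T &= \frac{s_T\,\mu}{0.1} = 10\,s_T\,\mu,
\qquad
\bar{y}_B = \frac{s_B\,\mu}{0.4} = 2.5\,s_B\,\mu, \\
P &= \frac{s_T}{s_B}, \\
AP &= \bar{y}_T - \bar{y}_B = \mu\,\bigl(10\,s_T - 2.5\,s_B\bigr), \\
\log AP &= \log \mu + \log\bigl(10\,s_T - 2.5\,s_B\bigr).
\end{align*}
```

Under this decomposition, log mean income enters with a coefficient of exactly one, and everything about the shape of the distribution is confined to the second term, which depends only on the two shares. Scaling every income by the same factor moves $\mu$ and leaves $s_T$, $s_B$ and $P$ untouched.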

The result is that we get some pretty counter-intuitive results. Even though Denmark, Sweden and Norway are considered by pretty much every person I’ve ever spoken to to be the most equal places on the planet, they come out as being more unequal than countries that are at the top of the relative Palma rankings, places like South Africa, Honduras and Brazil.

Which of these countries would you rather be poor in? Presumably the one with the highest average income for the poorest 10%. If we graph the same relationship, instead using the average income of the bottom decile, we find the relationship is less strong, especially so for the poorest countries of the world. But if I had to choose whether I wanted to be born poor in a country with a high or low Absolute Palma index, sign me up for more inequality!



Now for the caveats: the data here only goes up to 2008, so the basic cross-sectional relationship may have changed (although it hasn’t appeared to have done so in the years leading up to 2008). There is also a difference between moving between countries of different average/median/poorest decile levels and observing individual countries as they grow richer or poorer. This means that there might be use in keeping track of how growth is ‘allocated’ across the income distribution, something which is already done (and was done carefully in Chris Hoy’s paper).

Absolute measures might tell us something interesting in the world, and I welcome more work on them. But there is a world of difference between adding a tool to the (now overflowing) box of inequality measures and pushing for a headline measure that automatically penalizes rich, developed countries for being rich and developed. In addition, before we begin agonizing about absolute differences within countries, someone needs to make a pretty compelling case that they matter more than either absolute levels or relative differences, because these are things we already go to great pains to measure. If we are worried that the incomes of the poor aren’t growing fast enough, then why isn’t it enough to measure that?

Stata code and underlying data available here.

Update: good comments from Chris Hoy below.

The difficulty of getting good feedback

Most of us have very little clue if what we are doing makes any sense

In a piece for Project Syndicate released today, Ricardo Hausmann makes a grand case against evidence-based policies, specifically the rise of randomized controlled trials:

My main problem with RCTs is that they make us think about interventions, policies, and organizations in the wrong way. As opposed to the two or three designs that get tested slowly by RCTs (like putting tablets or flipcharts in schools), most social interventions have millions of design possibilities and outcomes depend on complex combinations between them. This leads to what the complexity scientist Stuart Kauffman calls a “rugged fitness landscape.”

After presenting a theoretical case of an RCT which tests for and fails to find an impact of tablets on learning in schools, he offers up an alternative approach, one that relies on rapid experimentation and adaptation:

Consider the following thought experiment: We include some mechanism in the tablet to inform the teacher in real time about how well his or her pupils are absorbing the material being taught. We free all teachers to experiment with different software, different strategies, and different ways of using the new tool. The rapid feedback loop will make teachers adjust their strategies to maximize performance.

Over time, we will observe some teachers who have stumbled onto highly effective strategies. We then share what they have done with other teachers.

Notice how radically different this method is. Instead of testing the validity of one design by having 150 out of 300 schools implement the identical program, this method is “crawling” the design space by having each teacher search for results. Instead of having a baseline survey and then a final survey, it is constantly providing feedback about performance. Instead of having an econometrician do the learning in a centralized manner and inform everybody about the results of the experiment, it is the teachers who are doing the learning in a decentralized manner and informing the center of what they found.

Hausmann makes a compelling argument here, but it all hinges on an exceptional premise: that teachers have access to a magical device that gives them *real time* feedback on student learning. Iteration and adaptation make a lot of sense… if you are in an environment where you can actually observe the immediate effects of your decisions and be sure that those decisions are having a causal impact.

But most of us are not in those environments. Many teachers might have an idea of how good their particular method is, but in the absence of a technology which can provide them with high-quality real-time feedback, it would be very hard to be sure. Most of us are in an environment where we have little idea whether what we are doing is effective at all. Even after 32 years of direct observation and some experimentation, I still can’t figure out if spicy food gives me indigestion.

Even when we can successfully parse the noise of life and match an action with a reaction, low-level experimentation still opens up the door to all sorts of internal biases. Human beings are fantastic at creating narratives (I feel good today, it must have been because of that thing I did yesterday) which would wither under larger-scale experimentation.

Of course there are clear examples of low level, rapid experimentation being successful when we have access to technologies that give us good, quick feedback. Bridge Academies, which is now one of the largest private school providers in the world, succeeded largely due to a very high degree of internal experimentation. But to accomplish this, Bridge had to have access to a wealth of real time data on student achievement and attendance as well as enough centralized control to be able to experiment across classrooms and schools.

But in reality these kinds of feedback technologies just don’t exist in many contexts, at least not yet. If I am working in a Ministry of Health in a developing country and I want to discern whether a given health intervention has had an impact, I won’t necessarily have access to real time data on hospital admissions. Instead, I would have to rely on costly household surveys which take time to collect. This slows down the process of iteration and adaptation to a point where a randomized controlled trial combined with some qualitative fieldwork actually looks pretty attractive.

RCTs are far from a perfect solution and Hausmann is correct to point out that they can be slow and blunt tools for figuring out exactly how an intervention should be implemented. But that is a reason to complement them with other methods – not to chuck them out the door. If a teacher has come up with a new method of using a tablet through rapid experimentation and it is rolled out to the entire school, that method should be rigorously empirically tested. If an RCT of some new intervention finds no effect, we should turn to more rapid experimentation to find a better way.

We’ve been arguing about RCTs for years now – it is disheartening that this debate still feels very black and white.

The problem with nudges is that sometimes they don’t move things very much

Have you ever prescribed azithromycin when you didn't have to? Know what I mean?

Over-prescribing of antibiotics is a problem because it speeds up the rate at which bacteria develop resistance. In a new study published in the Lancet yesterday, researchers attempted to use a simple ‘nudge’ to get doctors in the UK to prescribe less often:

In this randomised, 2 × 2 factorial trial, publicly available databases were used to identify GP practices whose prescribing rate for antibiotics was in the top 20% for their National Health Service (NHS) Local Area Team. Eligible practices were randomly assigned (1:1) into two groups by computer-generated allocation sequence, stratified by NHS Local Area Team. Participants, but not investigators, were blinded to group assignment. On Sept 29, 2014, every GP in the feedback intervention group was sent a letter from England’s Chief Medical Officer and a leaflet on antibiotics for use with patients. The letter stated that the practice was prescribing antibiotics at a higher rate than 80% of practices in its NHS Local Area Team. GPs in the control group received no communication. The sample was re-randomised into two groups, and in December, 2014, GP practices were either sent patient-focused information that promoted reduced use of antibiotics or received no communication. The primary outcome measure was the rate of antibiotic items dispensed per 1000 weighted population, controlling for past prescribing. Analysis was by intention to treat.

This is a fairly standard behavioural intervention – use information (or, less graciously, spam) to nudge people into behaving in a more optimal way. The behavioural insights/economics crowd loves these interventions because they are cheap, so the cost-effectiveness hurdle is easy to overcome. However, that cheapness sometimes overshadows a bigger problem, that frequently these interventions just don’t have very large effects. Here are the results from the Lancet study:

Between Sept 8 and Sept 26, 2014, we recruited and assigned 1581 GP practices to feedback intervention (n=791) or control (n=790) groups. Letters were sent to 3227 GPs in the intervention group. Between October, 2014, and March, 2015, the rate of antibiotic items dispensed per 1000 population was 126·98 (95% CI 125·68–128·27) in the feedback intervention group and 131·25 (130·33–132·16) in the control group, a difference of 4·27 (3·3%; incidence rate ratio [IRR] 0·967 [95% CI 0·957–0·977]; p<0·0001), representing an estimated 73 406 fewer antibiotic items dispensed. In December, 2014, GP practices were re-assigned to patient-focused intervention (n=777) or control (n=804) groups. The patient-focused intervention did not significantly affect the primary outcome measure between December, 2014, and March, 2015 (antibiotic items dispensed per 1000 population: 135·00 [95% CI 133·77–136·22] in the patient-focused intervention group and 133·98 [133·06–134·90] in the control group; IRR for difference between groups 1·01, 95% CI 1·00–1·02; p=0·105).

Let’s focus on the intervention that worked: the peer information treatment. There was a clear decline in antibiotic use for the treatment group, and so the study focuses on the sheer number of prescriptions that were prevented (73,406). However, in terms of relative impact, the study barely changed behaviour. The treatment group’s prescription rate was a mere 3% lower than the control group’s rate. 
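The size of the effect can be checked directly from the two rates quoted in the abstract. This is a back-of-the-envelope sketch: the variable names are mine, and the crude ratio computed here is not the study's model-adjusted incidence rate ratio, although it lands very close to it.

```python
# Antibiotic items dispensed per 1000 population, as quoted in the abstract.
treated_rate = 126.98   # feedback intervention group
control_rate = 131.25   # control group

absolute_diff = control_rate - treated_rate     # fewer items per 1000 people
relative_diff = absolute_diff / control_rate    # reduction as a share of control
crude_ratio = treated_rate / control_rate       # unadjusted ratio of the two rates

print(f"absolute difference: {absolute_diff:.2f} items per 1000")
print(f"relative reduction:  {relative_diff:.1%}")
print(f"crude rate ratio:    {crude_ratio:.3f}")
```

A statistically precise estimate and a large one are different things: the confidence interval in the abstract is tight, but it is tight around a reduction of only a few percent.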

So if this is about finding cost effective ways to reduce prescribing, then I’m on board. But clearly these sorts of nudges are not going to win the war on antibacterial resistance any time soon.