Angry post about empirical methods and philosophical plumbing

“We wish to announce we will no longer be reporting annual murder rates, because they represent a world view in which there is only one conceptualisation of “murder.” Murder is actually a fairly complicated, complex process, and to simply “count” these murders only stands to hide the philosophical basis for considering a crime a murder and ignores theories of change as to how or why murders happen. Anyway, the stats are all juked anyway.”

Perhaps this is nitpicking, but there was a brief moment, while reading Rosalind Eyben and Chris Roche’s rebuttal post on evidence in policymaking (part of a must-read exchange with Chris Whitty and Stefan Dercon), that nearly resulted in an early-morning brain aneurysm:

Let’s start by insisting that a criterion for rigorous research is that it should be explicit about its assumptions or world-view. We suggest that a weakness in many studies is that they usually focus solely on the methodological and procedural and render invisible their ‘philosophical plumbing’. The evidence-based approaches that Stefan and Chris advocate are imposing a certain view of the world, just as our approaches do. Their claims to the contrary foreclose any possible discussion about the different intellectual traditions in interpreting reality.  Theory invites argument and debate.

This argument is made time and time again by those who are both unfamiliar with and intimidated by empirical methods. Let me be clear here: a comparison of means does very little to “impose a certain view of the world.” It is just a comparison of means. If I have run a randomised controlled trial on fertilizer use, I am answering the question “Did this treatment increase fertilizer use, on average?” To argue that measurement has some sort of inherent, insidious philosophical underpinning is a dangerous and backward way to approach life. A breathalyser test uses various assumptions to measure a person’s blood alcohol level, but I can’t very well go about rejecting its validity because it doesn’t take into account the power relationship between the cop and the driver.
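
To make the point concrete, here is a minimal sketch in Python of what such a comparison of means amounts to. The numbers, sample sizes and variable names below are invented purely for illustration, not taken from any actual trial.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical fertilizer use (kg per plot) for treated and control farmers
treatment = rng.normal(loc=55.0, scale=10.0, size=200)
control = rng.normal(loc=50.0, scale=10.0, size=200)

# The "comparison of means": did the treatment increase fertilizer use, on average?
effect = treatment.mean() - control.mean()

# A two-sample t-test asks whether that difference could plausibly be chance
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)

print(f"Estimated average effect: {effect:.1f} kg per plot (p = {p_value:.3f})")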

Can rigorous empirical research be used to support theory or ideology? Of course. Are empirics often insufficient to answer really difficult questions? Of course. It is also the case that economists tend to think about problems a certain way, and this might not always be the way a problem needs to be thought about. Are sociological, anthropological and political methods often just as useful for providing evidence? Of course. Should these results often be considered carefully, keeping in mind the context and the various complexities and confounding factors? Of course.

But measuring poverty, or infant mortality, while rife with methodological assumptions, does not rely on a certain view of the world, unless you classify “I believe some things should be measured” as a world view. So please, stop rejecting simple statistics as a “different intellectual tradition in interpreting reality” – it is really a very silly thing to say and diverts the argument from what really matters: what tools are best for promoting development, and how best can we implement these tools? Rigorous empirical methods are just another tool in the toolbox. Your view of the world will determine which of these tools you rely on the most.

I swear, I think this blog spends half its time trying to put the die-hard randomistas in their place and the other half trying to put the die-hard qualitatives in their place. I need to have a lie down.

6 thoughts on “Angry post about empirical methods and philosophical plumbing”

  1. Stephen Jones

    January 24, 2013 at 12:45pm

    I think the problem is that people have been partly talking at cross-purposes in this debate (and I sympathise with Lee’s comment on jargon – I don’t think Eyben and Roche’s language has helped their case). I don’t think E&R are arguing against using RCTs to measure average increased fertiliser use. When they refer to ‘evidence-based’ they are actually arguing against an (alleged) dominance of certain types of evidence in decision-making which (allegedly) leads to over-technocratic programmes rather than interventions which might be more likely to lead to longer-term social change. Whitty and Dercon’s (to me, more convincing and coherent) response was along the lines of ‘actually we do use a variety of methods/evidence appropriate to the problem and think that this in fact opens up the political debate’. But I think both sides, ironically, actually need to put forward more evidence about what the results of current ‘evidence-based’ approaches are in practice.

  2. Rob Levy (@aid_complexity)

    January 24, 2013 at 12:48pm

    Ha ha! Excellent Jon Stewart-style rant here. I like the cut of your demagoguery.

    There is a familiar argument being played out here (see my comment (#4) on the Duncan Green blog in question) between those who are having to make decisions and those who can keep their philosophy pure. Of course taking everything into account is better than using a flawed model of reality, but if you’ve got to decide whether to fund a programme or not, you take the best tools you’ve got.

    Final point: when people say “you can no more prove science is right than you can that magic is wrong”, they misunderstand what science is. It’s a methodology, not a set of presuppositions. Same with evidence; it’s a method, not an outcome in itself.

  3. Kartik Akileswaran

    January 25, 2013 at 3:01am

    I agree wholeheartedly. That said, to pick up on some of Stephen’s points (and to play a bit of devil’s advocate), I think there are some broader issues at hand:

    1) Even though Whitty and Dercon say that they use a variety of methods, it’s not clear that this is what actually happens among the randomista/empirical methods crowd. What percentage of rigorous evaluations contain systematic qualitative analyses done by qualitative researchers? I think it’s pretty low. I’m not saying this is always necessary, but we need to see more of this if we’re to believe that using a variety of methods is commonplace. (I’m not that familiar with DFID’s evaluation work, so this is directed more broadly. I also recognize that there are incentives for publication at work here that may discourage this sort of collaboration).

    2) I’m fairly certain that, in private settings, some in the “rigorous evaluation” crowd would be far more dubious about the value of the “sociological, anthropological, and political methods” that you refer to than Whitty and Dercon suggest. Regardless of what’s said in public, if there is this sort of methodological elitism in private, it likely serves to discourage rather than encourage mixed methods approaches.

    3) When randomistas rack up the accolades (let’s face it, recognition matters) and some donors move heavily toward rigorous evaluation at the expense of other methods (I’m thinking of USAID here), then there is definitely a danger of privileging certain forms of evidence without having a good sense about the relative usefulness of said evidence. This is Stephen’s point, and I think it’s critical. Lant Pritchett puts it best: “Where is the randomized experiment that shows evidence from randomized experiments influences policy more than from other sources?”

    (see http://www.brookings.edu/~/media/Events/2008/5/29%20global%20development/2008_pritchett.PDF)

  4. Patricia Rogers

    January 26, 2013 at 3:32am

    These are important discussions to be having.

    As an advocate for an appropriate mix of methods to suit the situation, I also find myself sometimes arguing with pure quants and sometimes with pure quals, and often trying to create platforms for more constructive and informed discussions.

    One point I’d like to respond to in this post is the statement that there are no important assumptions built into a study focused on the average effects of fertilizer use.

    There are several important and invisible assumptions built into this. The most important is that knowing the average impact is a useful, and in many cases sufficient, bit of information to inform decisions.

    The average effect often masks important subgroup differences, where an intervention is ineffective, damaging, or particularly effective for certain people or in certain environments.

    And, partly because of this, and partly because of potential challenges in scaling up, the average effect might not be produced if the intervention is widely adopted, which is often the sort of decision these types of studies are seeking to inform.

  5. Brett

    January 26, 2013 at 6:14pm

    Just wanted to say I really liked the category tag…

  6. Brendan Whitty

    February 1, 2013 at 2:11pm

    It seems to me the statement ‘I believe certain things should be measured’ is quite revealing about a world-view. Methods are linked to disciplines and to epistemologies. What evidence you choose to gather stems from what type of thing you think is important for understanding the world. So for me, the philosophical plumbing is not about a comparison of how you measure something, but about what you choose to ‘measure’ (or know) and what you do with that knowledge. It’s not about rejecting the validity of a method, but about querying the prioritisation of that knowledge over other evidence in development interventions.

    Take the term ‘measurement’ itself – it implies that the extent to which defined outcomes are attained is what’s important. That’s an important thing to know, but certainly not the only thing, and not always the most useful. As Patricia Rogers says, it may mean you measure, say, the average fertilizer use rather than the distribution of that use (with attendant issues of social inequity); you measure the outcome but not the additional difficult-to-measure consequences generated by how the intervention happened – say, the bolstering of an incumbent administration through the rollout of a fertilizer programme just before an election; you measure the outcome rather than how it was achieved, and therefore not how to do it better; you don’t consider what “better” might actually look like (unless it’s simply ‘more’), since values are pre-ordained by the chosen outcome and a focus on quantity; and you don’t look at the meaning given to the whole process, or the value placed on the outcomes (intended and unintended) by those on the receiving end, which isn’t an object susceptible to measurement.

    Of course, your example was just a throw-away and it doesn’t do to focus too much on it – happily, everyone seems to agree that it’s important to capture different information, and that evaluations should capture multiple aspects of a programme. For me, what Eyben and Roche were saying is that different disciplinary backgrounds have a tendency to prioritise certain kinds of knowledge, tend to use tools to do that, tend to use that information in particular ways, and that it’s important to keep the spectrum and the issues open.
