Some more thoughts on land grabs and tricky statistics

I suppose you could label this post as my response to Ricardo and Marloes’s response to my post on their recent media brief on the correlation between governance and land grabs. First, I should say that this is all very exciting – it’s nice to have an actual debate about this. NGOs frequently ignore substantive criticism of their analytical work (to be fair, so do a lot of academics), so I must commend Ricardo and Marloes for their enthusiasm and willingness to get in touch and have a reasonable argument about all this.

I think we’re likely to continue to disagree about when results should be presented (or at least how they should be presented), so I’ll turn my attention to their three main technical points:

 

1) It’s not realistic to assume that investors target poor countries

True, but poor countries themselves might be more likely to put land up for sale. Discerning the difference between targeting and supply-side effects will always be difficult because we only observe actual land deals (in essence, the quantity ‘consumed’). But this is beside the point – spend fifteen minutes in an economics seminar and you’ll learn that a common way of challenging identifying assumptions is to come up with an equally credible alternate story. I’ve shown that, at least in this very basic setup, income is a better predictor of a country having a land deal than governance. While my alternate story might be considered implausible (even if it does fit the data better), I really only put it up to point out how equally flimsy the assumption of investor targeting is.
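To make the alternate story concrete, here is a minimal simulation sketch. It uses entirely synthetic data, not the Land Matrix, and every name, coefficient and sample size in it is an assumption chosen for illustration. Deals depend only on income, yet governance still looks like a predictor simply because the two move together.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def residuals(ys, xs):
    """Residuals from a simple OLS regression of ys on xs (with intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return [y - my - slope * (x - mx) for x, y in zip(xs, ys)]

def simulate(n=5000, seed=1):
    rng = random.Random(seed)
    income = [rng.gauss(0, 1) for _ in range(n)]
    # ASSUMPTION: governance tracks income but has its own noise
    governance = [0.8 * inc + 0.6 * rng.gauss(0, 1) for inc in income]
    # ASSUMPTION: the chance of any deal falls with income ONLY
    deal = [
        1.0 if rng.random() < 1 / (1 + math.exp(1.5 * inc)) else 0.0
        for inc in income
    ]
    raw = pearson(governance, deal)
    # Partial correlation: strip income out of both series first
    partial = pearson(residuals(governance, income), residuals(deal, income))
    return raw, partial

raw, partial = simulate()
print(f"governance vs any deal, raw:                  {raw:+.3f}")
print(f"governance vs any deal, holding income fixed: {partial:+.3f}")
```

In this setup the raw correlation should come out clearly negative while the income-adjusted one sits close to zero. That says nothing about which story is true in the real data; it only shows that a bivariate governance result cannot tell the two apart.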

 

2) My last table is badly specified and then I forget to estimate a hurdle model.

Before delving into the technicalities of this argument, let’s briefly talk about burden of proof. It is Oxfam’s job here to convince us all that investors are targeting countries with poor governance, or at least that there is some consistent correlation between the two. By this very basic metric, I assert that the current analysis falls short, as it doesn’t provide enough evidence to reject the null of no relationship. One doesn’t always need to present and prove an alternate hypothesis, complete with fancy, well-specified econometrics, in order to disprove the one being asserted.

As far as the specification of the first two columns in Table 4 goes – sure, this is pretty much atheoretic wandering. I’m not going to assert that I’m cleanly identifying any individual channels; I’m just seeing whether Oxfam’s relationship stands up (I’m actually trying to help you here, guys) once we start controlling for all these things. Multicollinearity doesn’t seem to be preventing some results from shining through. But yes, this is playtime with Stata, although I admit as much up front. See my point about burden of proof here.

Their final point is a technical one – I’m interested in whether, conditional on a country selling any land, governance is correlated with the number of land deals. Technically, this specification is subject to a form of bias due to selection on unobservables: for example, if hotter countries are more likely to sell land, and there is a correlation between temperature and the governance indicators, then estimates in columns (3) and (4) of Table 4 will be biased.  [OK this is not what their point was - see Paul's comment below.] Ricardo and Marloes would be happier if I estimated a model which took this selection into account.
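For concreteness, here is a rough sketch of the two-part (hurdle) structure at issue, as clarified in Paul's comment below: a logit for whether a country has any deal, followed by a levels regression on the countries that do. It runs on synthetic data only. The data-generating process (income drives the first stage, governance the second), the continuous `amount` outcome and all coefficients are assumptions for illustration, and the small hand-rolled solvers stand in for what one would normally do with a statistics package.

```python
import math
import random

def solve(a, b):
    """Solve a small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, ys):
    """OLS coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)

def logit(rows, ys, iters=25):
    """Logit coefficients by Newton-Raphson, starting from zero."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iters):
        ps = [1 / (1 + math.exp(-sum(b * x for b, x in zip(beta, r)))) for r in rows]
        grad = [sum(r[i] * (y - p) for r, y, p in zip(rows, ys, ps)) for i in range(k)]
        hess = [[sum(r[i] * r[j] * p * (1 - p) for r, p in zip(rows, ps))
                 for j in range(k)] for i in range(k)]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

rng = random.Random(3)
rows, any_deal, amount = [], [], []
for _ in range(2000):
    income = rng.gauss(0, 1)
    governance = 0.5 * income + rng.gauss(0, 1)
    # ASSUMPTION: stage 1 (any deal at all?) depends on income only
    p_deal = 1 / (1 + math.exp(0.5 + 1.5 * income))
    has_deal = 1.0 if rng.random() < p_deal else 0.0
    # ASSUMPTION: stage 2 (how much, given a deal) depends on governance only
    level = 1.0 - 0.8 * governance + rng.gauss(0, 0.5)
    rows.append([1.0, income, governance])
    any_deal.append(has_deal)
    amount.append(level if has_deal else None)

# Stage 1: logit on all countries. Stage 2: OLS on countries with a deal.
stage1 = logit(rows, any_deal)
positive = [(r, y) for r, y in zip(rows, amount) if y is not None]
stage2 = ols([r for r, _ in positive], [y for _, y in positive])

print("stage 1 (const, income, governance):", [round(b, 2) for b in stage1])
print("stage 2 (const, income, governance):", [round(b, 2) for b in stage2])
```

The point of the two stages is that each variable is allowed a different role in each: here income should matter only in the first and governance only in the second, which a single tobit-style equation would not allow.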

But the problem is: as I point out at the end of my piece, I don’t really buy the selection equation in the first place, and this factors into their third point:

 

3) They take issue with my worry about “bias” in how land deals are reported.

I’m worried the Land Matrix is a better measure of “number of reports on land deals” than “number of land deals” and that the measure of “have there been any substantial land deals” in the past ten years is really just a measure of “has anyone bothered to submit a news report to the Land Matrix on your country in the past ten years.” Ricardo and Marloes make a purely theoretical argument that reporting in the UK should be better than reporting in developing countries. If we were talking about general media reporting, I would be inclined to agree, but I’d be surprised if anyone is scanning British newspapers for land deals and submitting the data to the Land Matrix.

Furthermore, consider the final hurdle a land deal must clear to get into the Land Matrix: “entail the conversion of land from local community use or important ecosystem service provision to commercial production.” This seems like it should only be possible in societies where a significant percentage of the population is involved in agriculture and where large-scale commercialisation is yet to happen. Sure, the quality of the British government is one of the reasons there aren’t many dodgy deals going on, but we have to remember that Great Britain has already gone through the long process of moving from smallholder farming to relatively large-scale commercial production. Yet Ricardo and Marloes want to code Great Britain as a zero and include it in the selection equation – I’m just not convinced.
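The reporting worry can be illustrated with a small simulation sketch, again on synthetic data. The key assumption, which is mine and purely illustrative, is that the chance an existing deal ends up in the database falls as governance improves; the 40% deal rate and the other numbers are likewise made up. True deals are drawn independently of governance, so any relationship in the reported series is manufactured by reporting alone.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def simulate(n=5000, seed=2):
    rng = random.Random(seed)
    governance = [rng.gauss(0, 1) for _ in range(n)]
    # ASSUMPTION: a deal truly happens in 40% of countries,
    # regardless of governance
    true_deal = [1.0 if rng.random() < 0.4 else 0.0 for _ in range(n)]
    # ASSUMPTION: the chance that an existing deal gets reported
    # falls with governance (nobody is looking in those countries)
    reported = [
        d if rng.random() < 1 / (1 + math.exp(1.5 * g)) else 0.0
        for d, g in zip(true_deal, governance)
    ]
    return pearson(governance, true_deal), pearson(governance, reported)

true_corr, reported_corr = simulate()
print(f"governance vs true deals:     {true_corr:+.3f}")
print(f"governance vs reported deals: {reported_corr:+.3f}")
```

Under these assumptions the true series should be roughly uncorrelated with governance while the reported one is negatively correlated. Whether real reporting behaves like this is exactly the open question.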

How do we move forward? I’m happy that both Marloes and Ricardo want to continue working on this. This is definitely the best outcome – I can think of several ways that one could try to take it a little bit further:

  • Let’s start exploiting the time dimension: we have a panel – let’s use it – although I do have a fear that, as several have pointed out, there won’t be enough meaningful variation in the WDI indicators across time to actually identify anything.
  • Number of deals and size – these are, save for my kitchen sink regressions, currently unexploited, as is information on whether deals are international or national.
  • Let’s get more data! I’m hesitant to throw a stake into the ground and say it’s time to make a call, especially when the data is as limited as it is. If we could get our hands on district-level data (or, in my wildest dreams, GIS data) on land deals, we could start to say so much more about what’s going on.

Finally, a word to idle academics out there – I implore you to pay more attention to this stuff. We have a hard enough time encouraging replication of our own studies, but I think the world would be a much better place if we sat down from time to time and just tried to recreate “killer facts” that otherwise dominate the discourse. I didn’t start my analysis with any intention to go after Oxfam’s results, but it only took a little while with the data before I realised that the story was much more complex, and worth a second look.

Again, thanks to Ricardo and Marloes for a fun debate (I’ve offered them a second reply if they’d like).

 

3 thoughts on “Some more thoughts on land grabs and tricky statistics”

  1. Paul Clist

    February 22, 2013 at 9:17am

    Matt,

    First up, I am grateful that at least one ‘idle’ academic has sat down to replicate the result and allowed the space for reply. So good work on the blog, as always. One thing that hampers academic engagement in this kind of thing is that we can’t get such analysis published, but the blog is a great vehicle for this type of analysis.

    Second, I think you’ve misunderstood R&M’s point about a double hurdle model. They are pointing out that the data is censored and so a (double) hurdle/Cragg/two-part model (all names for the same thing) is appropriate. This consists of a binary 1st stage equation (land deal yes/no) followed by a 2nd stage levels equation (only including countries with at least one land deal). In this light your table 4 cols 3-4 are second stage regressions with no first stage. For example, GDP could be very important in deciding whether any land deals take place, but governance could determine the amount of deals. So it is not selection on unobservables that is missing but a logit showing selection on observables.

    On the WGI, see Langbein & Knack (2010) “The Worldwide Governance Indicators: Six, One, or None?” in JDS 46(2).

    Best, Paul

  2. Matt

    February 22, 2013 at 12:09pm

    Hi Paul,

    Thanks for the clarification on the double-hurdle, so we’re basically talking a more flexible tobit. This makes sense, but I think what we’re really missing here is some theory. Allowing for different determinants of selection (land deals yes/no) and levels should be based on some assumptions as to why these processes should be separate. R&M’s suggestion that the first stage is all about investor targeting and then some sort of deal is worked out might suggest a craggit is the way to go, but then I’d really like to see these things spelled out ahead of time.

    It also doesn’t deal with the issue of what we’re actually measuring – all of this becomes much more complex if we replace “land deal yes/no” with “reported on a land deal in the Land Matrix yes/no”. I think I’d be more comfortable restricting the first stage to a smaller set of countries for which we think these decisions are actually at play, even if that carried with it other issues.

    Thanks also for the reference – I completely agree that the WGIs are measuring roughly the same thing and usually wouldn’t recommend including all at once, but given that the Oxfam piece suggested they represented different pathways, I thought I’d throw them a bone and toss them all in.

    Matt

  3. Paul Clist

    February 22, 2013 at 4:02pm

    Hi Matt,

    Thanks for the reply. Please don’t misunderstand me – I think most of your comments are very valid and useful, I just wanted to clear up one methodological point. While we’re on the topic, I think the 2 part model has, in many cases, more believable assumptions than the tobit. For example the tobit assumes fixed relative importance in both stages for all variables. Cameron and Trivedi have a nice write up, and the Monte Carlo evidence doesn’t imply tobit should be our ‘default’ with censored data – we can test which estimator is appropriate in each circumstance.

    On the more interesting (i.e. less technical) point, I agree the problem is teasing out what is related to income and what is related to governance. I see this problem in the aid allocation context, when people claim to be giving more aid to better governed countries they are normally just giving more to richer countries, for example. Doing this with c.200 data points with c.50 land deals in a dataset is not a challenge I’d like to take on. Sorry for ending on a pessimistic point, but hopefully R&M can get more data!

    Best, Paul
