Preface: I always say I want to read more papers & summarize them. That can seem like an overwhelming undertaking, but I am forging ahead! This is the first step of what I hope to be a regular habit of reading and summarizing papers. “Building State Capacity” raised a lot of interesting points – it’s the first paper I’ve read in a while. As I refamiliarize myself with academic writing and various development econ concepts, I hope to become increasingly concise.
Program: Use of biometric identification system to administer benefits from two large welfare programs
Where: Andhra Pradesh, India
When: 2010 (baseline) – 2012 (endline)
Sample: 157 subdistricts, 19 million people
Identification strategy: RCT
Results:
- Payment collection became faster and more predictable
- Large reductions in leakage (fraud/corruption)
- Increase in program access: Reduction in gov’t officials claiming benefits in others’ names
- Little heterogeneity of results: No differences based on village or poverty/vulnerability of HH
- Strength of results: “Treatment distributions first-order stochastically dominate control distributions,” which means that “no treatment household was worse off relative to the control household at the same percentile of the outcome distribution”
- Drivers of impact? (non-experimental decomposition)
  - Payment process improvement: driven by the change in the organization responsible for managing fund flow and payments
  - Decrease in fraud: driven by biometric authentication
- Cost effective, for state and beneficiaries
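For intuition on the first-order stochastic dominance result above, here is a minimal numpy sketch (my own toy data, not the paper's) that checks dominance in-sample by comparing the two groups at a grid of quantiles:

```python
import numpy as np

def first_order_dominates(treat, control, qs=np.linspace(0.01, 0.99, 99)):
    """Check (in sample) whether `treat` first-order stochastically
    dominates `control`: at every quantile q, the treatment value is
    at least as large as the control value."""
    t_q = np.quantile(treat, qs)
    c_q = np.quantile(control, qs)
    return bool(np.all(t_q >= c_q))

# Toy illustration with made-up outcome data (e.g., benefit amounts received):
rng = np.random.default_rng(0)
control = rng.normal(loc=0.0, scale=1.0, size=5_000)
treat = control + 0.5  # a uniform upward shift guarantees dominance

print(first_order_dominates(treat, control))  # True for this shifted sample
```

This is exactly the "no treatment household worse off than the control household at the same percentile" idea: the treatment quantile function lies everywhere weakly above the control one.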
Surveys: Baseline and endline household surveys (2 years between)
Randomization: Graduated rollout over 2 years. Treatment subdistricts were first wave, then buffer subdistricts (during survey time), then finally the control subdistricts (note: subdistricts = “mandals” in India)
Stratification: By district and a principal component of socioeconomic characteristics
Analysis: Intent-to-treat (ITT): “estimates the average return to as-is implementation following the ‘intent’ to implement the new system”
“Uptake”: 50% of payments had transitioned to the electronic system within 2 years
Main controls: district FEs, “the first principal component of a vector of mandal characteristics used to stratify,” baseline outcome levels where possible
Standard errors: clustered at the mandal level (the lowest level of stratification)
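To make the ITT-plus-clustering setup concrete, here is a stripped-down numpy sketch (not the authors' full specification; it omits the district fixed effects, PC control, and baseline outcomes) of an ITT regression with standard errors clustered at the assignment (mandal) level:

```python
import numpy as np

def itt_clustered(y, d, cluster):
    """ITT estimate: OLS of outcome y on treatment *assignment* d
    (regardless of uptake), with a cluster-robust "sandwich" variance
    grouped by `cluster` (here, the mandal)."""
    X = np.column_stack([np.ones_like(d, dtype=float), d.astype(float)])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Middle of the sandwich: sum of per-cluster score outer products
    meat = np.zeros((2, 2))
    for g in np.unique(cluster):
        score = X[cluster == g].T @ resid[cluster == g]
        meat += np.outer(score, score)
    V = XtX_inv @ meat @ XtX_inv
    return beta[1], np.sqrt(V[1, 1])  # ITT effect and its clustered SE

# Toy data: 40 hypothetical mandals x 25 households, assignment at mandal level
rng = np.random.default_rng(1)
mandal = np.repeat(np.arange(40), 25)
d = np.isin(mandal, rng.choice(40, 20, replace=False))
y = 0.3 * d + np.repeat(rng.normal(size=40), 25) + rng.normal(size=d.size)
itt, se = itt_clustered(y, d, mandal)
print(itt, se)
```

Clustering at the mandal level matters because treatment is assigned at that level, so household outcomes within a mandal are not independent.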
Robustness checks:
- No differential misreporting: results are not driven by misreporting, whether from collusion between officials and respondents or from inadvertent recall problems
- No spillovers: no evidence of either strategic spillovers (officials diverting funds to control mandals they can more easily steal from) or spatial spillovers (from neighboring gram panchayats, i.e., village councils)
- No effects of survey timing relative to payment time
- No Hawthorne effects
Thoughts & Questions
- “Evaluated at full scale by government”: This minimizes risks around external validity that are often an issue for studies on NGO-operated programs at a smaller scale. Vivalt (2019) found that programs implemented by governments had smaller effect sizes than NGO/academic implemented programs, controlling for sample size; Muralidharan and Niehaus (2017) and others have discussed how results of small pilot RCTs often do not scale to larger populations.
- Love that they remind you of the ITT definition in the text – makes it more readable. Also that they justify why ITT is the policy-relevant parameter (“are net of all the logistical and political economy challenges that accompany such a project in practice”)
- Again, the authors define “first-order stochastically dominate” in the text, which I was wondering about from the abstract. Generally well-written and easy to understand, even after a while away from academic papers!
- What does “non-experimental decomposition” mean? (This is describing how the authors identified drivers of treatment effects)
- Is it particularly strong evidence that treatment distribution was first-order stochastically dominant over control distribution? How do we interpret this statistically? Logically, if treatment was better for all HHs, relative to the closest comparison HH, that’s a good sign. But what if your results were not stat sig but WERE first-order stochastically dominant? What would that mean for interpreting the results?
- What is the difference between uptake and compliance? Uptake = whether treatment HHs take up the intervention/treatment. Compliance = whether the HH complies with its assigned status in the experimental design (applies to both treat and control households). Is that right?
- What does “first stage” mean? In this paper, it seems to be asking, How did treat and control units comply with the evaluation design, and what is the % uptake? (Basically, did randomization meaningfully work?) Is this always what first stage means for RCTs? How does its meaning differ for other identification strategies?
- Reminder: Hawthorne effects = when awareness of being observed alters study participant behavior
- What does “principal component” mean? Is it like an index?
- Authors note that the political case for investment in capacity depends on a) the magnitude and b) the immediacy of returns -> Does that mean policymakers are consistently biased toward policies w/ short-term payoffs? (If yes, I’d expect a drop-off for policies whose payoffs arrive on a longer timeline than the election cycle… or maybe policy just isn’t data-driven enough to see that effect?) Also, would this lead to fewer studies on the long-term effects of programs/interventions with strong short-term payoffs, because there’s little policy appetite for long-term results?
- Challenges of working in policy space: program was almost ended b/c of negative feedback from local leaders (whose rents were being decreased!), but evidence from study, including positive beneficiary feedback, helped state gov’t stay the course! Crazy!
- Reference to “classic political economy problem of how concentrated costs and diffuse benefits may prevent the adoption of social-welfare improving reforms” In future, look up reference: Olson 1965
- Type I and II errors being used in a new (to me) way: Type I as exclusion errors, Type II as inclusion errors. I know these in statistical terms as Type I = false positive (rejecting a true null) and Type II = false negative (failing to reject a false null). In the line following the initial reference, the authors seem to use “exclusion errors” to mean exclusion of intended recipients, so I’m not sure whether these are different types of errors or I’m just not understanding yet. To be explored further.
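On the “principal component” question above: yes, the first principal component works like a data-driven index, the weighted combination of (standardized) variables that captures the most variance. A minimal numpy sketch, with hypothetical mandal-level socioeconomic data:

```python
import numpy as np

def first_principal_component(X):
    """Return each row's score on the first principal component of X
    (rows = mandals, columns = socioeconomic variables). The first PC
    is the weighted combination of standardized columns with maximal
    variance, which is why it often serves as a one-number 'index'."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each variable
    # Leading eigenvector of the covariance of Z gives the weights;
    # np.linalg.eigh returns eigenvalues in ascending order.
    _, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    weights = eigvecs[:, -1]                  # eigenvector of largest eigenvalue
    return Z @ weights                        # one index score per mandal

# Toy illustration: made-up literacy / asset / electrification rates
rng = np.random.default_rng(2)
base = rng.normal(size=100)
X = np.column_stack([base + rng.normal(scale=0.3, size=100) for _ in range(3)])
scores = first_principal_component(X)
print(scores[:5])
```

(The sign of the scores is arbitrary, a standard PCA quirk.) Stratifying on this score groups mandals that look similar across all the underlying variables at once, which is presumably why the authors use it.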
Muralidharan, K., Niehaus, P., & Sukhtankar, S. (2016). Building state capacity: Evidence from biometric smartcards in India. American Economic Review, 106(10), 2895-2929.