Easterly: Academic publishing standards constrain work on important policy questions

“Academic standards are leading us to concentrate on the less important policies. The worst case scenario is development economists risk becoming irrelevant because [they] concentrate on small issues that policy makers don’t think are important.” – Bill Easterly

Summary of Easterly’s interview with VoxDev

Trends:

  • Growth has responded to globalization policies (especially in Africa)
  • BUT there has been an intellectual backlash against globalization

Doubts are legitimate – it’s hard to establish causality for these macro dynamics.

But we do have persuasive correlations: high rates of inflation are strongly negatively correlated with growth rates, and linked to worse welfare. Rigorous causal determination is difficult, though – we can’t rule out a third factor or reverse causality.

Academics are reluctant to study inflation, growth, and globalization; we need to present non-causal correlations if that’s the best we can do. It’s the economist’s responsibility to look at these questions as honestly as possible, even if this isn’t the most rigorous type of evidence. It’s what we’ve got!

Not enough research on big-picture policies.

Zimbabwe is relapsing into high inflation – this will be very destructive and needs to be studied. Venezuela is another example of poor inflation policy. Few policy makers or academic economists today would call these good policies, but they were very common in South America and Africa in the 1970s–90s.

Incentives in economics publishing prioritize rigorous causal identification. Young economists: by all means, stick with this! Tenured professors: stylized facts are also useful pieces of evidence. We need to work on these big, non-causal issues, too.

Evidence on small-scale programs is less relevant for policy makers.

Good model of what we should do more of: Acemoglu’s work. Easterly’s own research.

Paradox: development economists really want to talk about these big-picture questions, but it’s very challenging to publish any research on them because of the huge prioritization of rigor.

It’s a shame that the “brightest young minds in our field” have to do RCTs instead of looking at the big questions.

IMF / World Bank are more interested in policy practicalities so they’re not as biased as journals.

(Small) RCTs useful for NGOs / specific aid agency programs.

Governments want institutional reforms or macro policy changes.

Dev links: Migration & Replication

Migration

No short-term effect of foreign aid on refugee flows

Overview: “We estimate the causal effects of a country’s aid receipts on both total refugee flows to the world and flows to donor countries.”

Data: “Refugee data on 141 origin countries over the 1976–2013 period [combined] with bilateral Official Development Assistance data”

Identification strategy: “The interaction of donor-government fractionalization and a recipient country’s probability of receiving aid provides a powerful and excludable instrumental variable (IV) when we control for country- and time-fixed effects that capture the levels of the interacted variables.”
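
To make the IV logic concrete, here is a minimal manual two-stage least squares sketch with simulated data (all data and variable names are invented; the authors’ actual specification also includes the country- and time-fixed effects described above):

```python
import numpy as np

# Simulated data for a manual 2SLS (toy stand-ins, not the paper's data).
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)                          # instrument: fractionalization x P(aid)
u = rng.normal(size=n)                          # unobserved confounder
aid = 0.8 * z + u + rng.normal(size=n)          # endogenous regressor
refugees = -0.5 * aid + u + rng.normal(size=n)  # outcome

# First stage: regress the endogenous variable on the instrument.
X1 = np.column_stack([np.ones(n), z])
beta_fs, *_ = np.linalg.lstsq(X1, aid, rcond=None)
aid_hat = X1 @ beta_fs

# Second stage: regress the outcome on the first-stage fitted values.
X2 = np.column_stack([np.ones(n), aid_hat])
beta_iv, *_ = np.linalg.lstsq(X2, refugees, rcond=None)
print(f"IV estimate of the aid effect: {beta_iv[1]:.2f}")  # close to -0.5
```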

Findings: “We find no evidence that aid reduces worldwide refugee outflows or flows to donor countries in the short term. However, we observe long-run effects after four three-year periods, which appear to be driven by lagged positive effects of aid on growth.”

Authors: Dreher, Fuchs, & Langlotz

Living abroad doesn’t change individual “commitment to development”

Overview: “Temporary migration to developing countries might play a role in generating individual commitment to development”

Data: “unique survey [of Mormon missionaries] gathered on Facebook”

Identification strategy: “A natural experiment – the assignment of Mormon missionaries to two-year missions in different world regions”

Findings: “Those assigned to the treatment region (Africa, Asia, Latin America) report greater interest in global development and poverty, but no difference in support for government aid or higher immigration, and no difference in personal international donations, volunteering, or other involvement.” (controlling for relevant vars)

Author: Crawfurd

Replication

Lessons from 3ie replications of development impact evaluations

Overview: “focus is internal replication, which uses the original data from a study to address the same question as that study”

Findings: “In all cases the pure replication components of these studies are generally able to reproduce the results published in the original article. Most of the measurement and estimation analyses confirm the robustness of the original articles or call into question just a subset of the original findings.” Plus some advice on how to better translate study findings into policy.

Authors: Brown & Wood

Practical advice for conducting quality replications 

Overview: The same authors share practical advice addressing the challenge “to design a replication plan open to both supporting the original findings and uncovering potential problems.”

Contribution:

1. Tips for diagnostic replication exercises in four groups: validity of assumptions, data transformations, estimation methods, and heterogeneous impacts, plus examples and other resources

2. List of don’ts for how to conduct and report replication research


Building State Capacity: Evidence from Biometric Smartcards in India

Preface: I always say I want to read more papers & summarize them. That can seem like an overwhelmingly massive undertaking. But I am forging ahead! This is the first step of what I hope to be a regular habit of reading and summarizing papers. “Building State Capacity” raised a lot of interesting points – it’s the first paper I’ve read in a while. As I refamiliarize myself with academic writing and various development econ concepts, I hope to become increasingly concise.


Summary

Program: Use of biometric identification system to administer benefits from two large welfare programs

Where: Andhra Pradesh, India

When: 2010 (baseline) – 2012 (endline)

Sample: 157 sub-districts, 19 million people

Identification strategy: RCT

Findings

  1. Payment collection became faster and more predictable
  2. Large reductions in leakage (fraud/corruption)
  3. Increase in program access: Reduction in gov’t officials claiming benefits in others’ names
  4. Little heterogeneity of results: No differences based on village or poverty/vulnerability of HH
  5. Strength of results: “Treatment distributions first-order stochastically dominate control distributions,” which means that “no treatment household was worse off relative to the control household at the same percentile of the outcome distribution” (see the quantile-comparison sketch after this list)
  6. Drivers of impact? (non-experimental decomposition)
    • For payment process improvement: changed organization responsible for managing fund flow and payments
    • For decrease in fraud: biometric authentication
  7. Cost effective, for state and beneficiaries
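
A quick way to see what finding 5 is claiming: treatment first-order stochastically dominates control (for an outcome where higher is better) if the treatment quantile is at least the control quantile at every percentile. A minimal sketch with made-up data:

```python
import numpy as np

# Made-up outcome data; FOSD (higher = better) holds if the treatment
# quantile is >= the control quantile at every percentile.
rng = np.random.default_rng(1)
control = rng.normal(loc=100, scale=20, size=1000)
treatment = rng.normal(loc=110, scale=20, size=1000)

percentiles = np.arange(1, 100)
q_treat = np.percentile(treatment, percentiles)
q_control = np.percentile(control, percentiles)

print("Treatment FOSD over control (sample quantiles):",
      bool(np.all(q_treat >= q_control)))
```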

Methodology details

Surveys: Baseline and endline household surveys (2 years between)

Randomization: Graduated rollout over 2 years. Treatment subdistricts were first wave, then buffer subdistricts (during survey time), then finally the control subdistricts (note: subdistricts = “mandals” in India)

Stratification: By district and a principal component of socioeconomic characteristics

Analysis: Intent-to-treat (ITT): “estimates the average return to as-is implementation following the ‘intent’ to implement the new system”

Uptake: 50% of payments had been converted to electronic within 2 years

Main controls: district FEs, “the first principal component of a vector of mandal characteristics used to stratify,” baseline outcome levels where possible

Standard errors: clustered at mandal level (the lowest level of stratification)
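
A rough sketch of what this specification might look like in code (column names and data are invented; this is my reading of the setup, not the authors’ code):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy data mimicking the design (all columns invented).
rng = np.random.default_rng(2)
n = 2000
df = pd.DataFrame({
    "mandal": rng.integers(0, 157, size=n),   # cluster = unit of randomization
    "district": rng.integers(0, 8, size=n),
    "pc1": rng.normal(size=n),                # stratification principal component
    "baseline_outcome": rng.normal(size=n),
})
df["treated"] = (df["mandal"] % 2 == 0).astype(int)  # toy mandal-level assignment
df["outcome"] = 0.3 * df["treated"] + 0.5 * df["baseline_outcome"] + rng.normal(size=n)

# ITT regression: treatment dummy, district fixed effects, stratification
# control, and baseline outcome, with SEs clustered at the mandal level.
model = smf.ols("outcome ~ treated + C(district) + pc1 + baseline_outcome", data=df)
fit = model.fit(cov_type="cluster", cov_kwds={"groups": df["mandal"]})
print(fit.params["treated"], fit.bse["treated"])
```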

Robustness checks:

  1. No differential misreporting: not driving results, either due to collusion between officials and respondents or to inadvertent recall problems
  2. No spillovers: no evidence of either strategic spillovers (officials diverting funds to control mandals they can more easily steal from) or spatial spillovers (from neighboring gram panchayats – village councils)
  3. No effects of survey timing relative to payment time
  4. No Hawthorne effects

Thoughts & Questions

  1. “Evaluated at full scale by government”: This minimizes risks around external validity that are often an issue for studies on NGO-operated programs at a smaller scale. Vivalt (2019) found that programs implemented by governments had smaller effect sizes than NGO/academic implemented programs, controlling for sample size; Muralidharan and Niehaus (2017) and others have discussed how results of small pilot RCTs often do not scale to larger populations.
  2. Love that they remind you of ITT definition in the text – makes it more readable. Also that they justify why ITT is the policy-relevant parameter (“are net of all the logistical and political economy challenges that accompany such a project in practice”)
  3. Again, authors define “first-order stochastically dominate” in the text, which I was wondering about from the abstract. Generally, well-written and easy to understand after a while not reading academic papers all the time!
  4. What does “non-experimental decomposition” mean? (This is describing how the authors identified drivers of treatment effects)
  5. Is it particularly strong evidence that treatment distribution was first-order stochastically dominant over control distribution? How do we interpret this statistically? Logically, if treatment was better for all HHs, relative to the closest comparison HH, that’s a good sign. But what if your results were not stat sig but WERE first-order stochastically dominant? What would that mean for interpreting the results?
  6. What is the difference between uptake and compliance? Uptake = whether treatment HHs take up the intervention/treatment. Compliance = whether the HH complies with its assigned status in the experimental design (applies to both treat and control households). Is that right?
  7. What does “first stage” mean? In this paper, it seems to be asking, How did treat and control units comply with the evaluation design, and what is the % uptake? (Basically, did randomization meaningfully work?) Is this always what first stage means for RCTs? How does its meaning differ for other identification strategies?
  8. Reminder: Hawthorne effects = when awareness of being observed alters study participant behavior
  9. What does “principal component” mean? Is it like an index? (Roughly, yes – see the PCA sketch after this list.)
  10. Authors note that the political case for investment in capacity depends on a) magnitude and b) immediacy of returns -> Does that mean policy makers are consistently biased toward policies w/ short-term pay-offs? (If yes, would expect there to be a drop-off for policies that have pay-offs on a longer timeline than the election cycle… or maybe policy just isn’t data driven enough to see that effect?) Also, would this lead to fewer studies on long-term effects of programs/interventions with strong short-term pay-offs, because there’s little policy appetite for long-term results?
  11. Challenges of working in policy space: program was almost ended b/c of negative feedback from local leaders (whose rents were being decreased!), but evidence from study, including positive beneficiary feedback, helped state gov’t stay the course! Crazy!
  12. Reference to “classic political economy problem of how concentrated costs and diffuse benefits may prevent the adoption of social-welfare improving reforms” In future, look up reference: Olson 1965
  13. Type I and II errors being referenced in a new (to me) way: Type I as exclusion, Type II as inclusion errors … I know these errors in statistical terms as Type I = false positive (reject true null) and Type II = false negative (fail to reject false null). In the line following the initial reference, the authors seem to refer to exclusion errors as exclusion of intended recipients, so not sure if these are different types of errors or I’m not understanding yet. To be explored further in future.
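
On question 9: roughly, yes. The first principal component is the weighted combination of several correlated variables that captures the most variance in them, so it is often used as a single index. A minimal sketch with invented socioeconomic variables:

```python
import numpy as np

# Three correlated, invented socioeconomic characteristics.
rng = np.random.default_rng(3)
wealth = rng.normal(size=200)
X = np.column_stack([
    wealth + rng.normal(scale=0.5, size=200),  # e.g. literacy rate
    wealth + rng.normal(scale=0.5, size=200),  # e.g. electrification
    wealth + rng.normal(scale=0.5, size=200),  # e.g. asset ownership
])

# First principal component via SVD of the centered data: a single
# index summarizing the three correlated variables.
Xc = X - X.mean(axis=0)
_, _, vt = np.linalg.svd(Xc, full_matrices=False)
index = Xc @ vt[0]
print(index[:5])
```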

Muralidharan, K., Niehaus, P., & Sukhtankar, S. (2016). Building state capacity: Evidence from biometric smartcards in India. American Economic Review, 106(10), 2895–2929.

Weekly Development Links #8

Brought to you by #NEUDC2018! Check out mini summaries of the many awesome papers featured at this conference here,  and download papers here. These are three that really struck me.

1. Psychological trainings increase chlorination rates
Haushofer, John, and Orkin 2018: (RCT in Kenya) “One group received a two-session executive function intervention that aimed to improve planning and execution of plans; a second received a two-session time preference intervention aimed at reducing present bias and impatience. A third group receives only information about the benefits of chlorination, and a pure control group received no intervention.” Executive function and time preference trainings led to stat sig increases in chlorination and stat sig decreases in diarrhea rates.

2. Conditional cash transfers reduce suicides!
Christian, Hensel, and Roth 2018: (RCT in Indonesia) This paper is so cool! One mechanism is mitigating the negative impact of bad agricultural shocks, thereby decreasing depression. “We examine how income shocks affect the suicide rate in Indonesia. We use both a randomized conditional cash transfer experiment, and a difference-in-differences approach exploiting the cash transfer’s nation-wide roll-out. We find that the cash transfer reduced yearly suicides by 0.36 per 100,000 people, corresponding to an 18 percent decrease. Agricultural productivity shocks also causally affect suicide rates. Moreover, the cash transfer program reduces the causal impact of the agricultural productivity shocks, suggesting an important role for policy interventions. Finally, we provide evidence for a psychological mechanism by showing that agricultural productivity shocks affect depression.”

3. Women police stations increased reporting of crimes against women
Amaral, Bhalotra, and Prakash 2018: (in India) “Using an identification strategy that exploits the staggered implementation of women police stations across cities and nationally representative data on various measures of crime and deterrence, we find that the opening of police stations increased reported crime against women by 22 percent. This is due to increases in reports of female kidnappings and domestic violence. In contrast, reports of gender specific mortality, self-reported intimate-partner violence and other non-gender specific crimes remain unchanged.”

BONUS: Amazing 3-D map of world populations
(The Pudding has so many other really interesting and informative graphics, too!)

Weekly Development Links #7

1. 11 years later: Experimental evidence on scaling up education reforms in Kenya (TL;DR gov’t didn’t adopt well)

(This paper was published in Journal of Public Econ 11 years after the project started and 5 years after the first submission!) “New teachers offered a fixed-term contract by an international NGO significantly raised student test scores, while teachers offered identical contracts by the Kenyan government produced zero impact. Observable differences in teacher characteristics explain little of this gap. Instead, data suggests that bureaucratic and political opposition to the contract reform led to implementation delays and a differential interpretation of identical contract terms. Additionally, contract features that produced larger learning gains in both the NGO and government treatment arms were not adopted by the government outside of the experimental sample.”

2. Argument for reporting the “total causal effect”

  • Total causal effect (TCE) = weighted average of the intent-to-treat effect (ITT) and the spillover effect on the non-treated (SNT) – see the toy arithmetic after this list
  • Importance: “RCTs that fail to account for spillovers can produce biased estimates of intention-to-treat effects, while finding meaningful treatment effects but failing to observe deleterious spillovers can lead to misconstrued policy conclusions. Therefore, reporting the TCE is as important as the ITT, if not more important in many cases: if the program caused a bunch of people to escape poverty while others to fall into it, leaving the overall poverty rate unchanged (TCE=0), you’d have to argue much harder to convince your audience that your program is a success because the ITT is large and positive.”
  • Context: Zeitlin and McIntosh’s recent paper comparing cash and a USAID health + nutrition program in Rwanda. From their blog post: “In our own work the point estimates on village-level impacts are consistent with negative spillovers of the large transfer on some outcomes (they are also consistent with Gikuriro’s village-level health and nutrition trainings having improved health knowledge in the overall population). Cash may look less good as one thinks of welfare impacts on a more broadly defined population. Donors weighing cash-vs-kind decisions will need to decide how much weight to put on non-targeted populations, and to consider the accumulated evidence on external consequences.”
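
To make the weighted-average definition in the first bullet concrete, here is a toy calculation with made-up numbers (my reading of the definition, mirroring the poverty example in the quote):

```python
# Made-up numbers: a large positive ITT can coexist with a TCE of zero
# if spillovers hurt the non-treated by as much as treatment helps.
share_treated = 0.5
itt = 0.20   # effect on those assigned to treatment
snt = -0.20  # spillover effect on the non-treated

tce = share_treated * itt + (1 - share_treated) * snt
print(f"TCE = {tce:+.2f}")  # +0.00: overall poverty rate unchanged
```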

3. Why don’t people work less when you give them cash?

Excellent post by the authors of a new paper on VoxDev, listing many different mechanisms and looking at how the effect changes by type of transfer (e.g. gov’t conditional and unconditional transfers, remittances, etc.)

BONUS: More gender equality = greater differences in preferences on values like altruism, patience or trust (ft. interesting map)

Falk & Hermle 2018

Weekly Development Links #4 – #6

Dev links coming to you weekly from now on!

Week #6: Oct 17

1. Cash transfers increase trust in local gov’t

“How does a locally-managed conditional cash transfer program impact trust in government?”

  • Cash transfers increased trust in leaders and perceptions of leaders’ responsiveness and honesty
  • Beneficiaries reported higher trust in elected leaders but not in appointed bureaucrats
  • Government record-keeping on health and education improved in treatment communities

2. Kinda random: sand dams

Read a WB blog post on sand dams as a method for increasing water sustainability in arid regions … but it did not explain how the heck you store water in sand, so I watched this cool video from Excellent Development, a non-profit that works on sand dam projects.

3. USAID increasingly using “geospatial impact evaluations” ft. MAPS!

Outlines an example of a GIE of USAID West Bank/Gaza’s recent $900 million investment in rural infrastructure

Ariel BenYishay, Rachel Trichler, Dan Runfola, and Seth Goodman at Brookings

BONUS: In other geospatial news
LSE blog post on the work of ground-truthing spatial data in Kenya

Week #5: Oct 10

Health Round-Up Edition

1. Dashboards for decisions: Immunization in Nigeria

A new dashboard is being used to improve data on routine immunizations … but it doesn’t look like the underlying data quality has been improved. Is this just better access to bad data?

2. Norway vs. Thailand vs. US

A comparative study of health services for undocumented migrants

3. Traditional Midwives in Guatemala

Aljazeera on the complicated relationship between traditional midwives providing missing services and the gov’t trying to provide those services in health centers

BONUS: Visualizing fires + “good”

  • Satellite imagery of crop burning in India in 2017 vs 2018
  • How good is good? 6.92/10. The YouGov visualization on how people rate different descriptors on a 0-10 scale is really interesting if you look at the distributions – lots of agreement on appalling, average (you’d hope there would be clustering around 5!), and perfect. Then, pretty wide variance for quite bad, pretty bad, somewhat bad, great, really good, and very good. Shows how you should cut out generic good/bad descriptions in your writing and use words like appalling or abysmal that are more universally evocative.

Week #4: Oct 3

1. Tanzania outlaws critiques of their data!?

“Consider a simple policy rule: if a government’s statistics cannot be questioned, they shouldn’t be trusted. By that rule, the Bank and Fund would not report Tanzania’s numbers or accept them in determining creditworthiness—and they would immediately withdraw the offer of foreign aid to help Tanzania produce statistics its citizens cannot criticize.”

2. 12 Things We Can Agree On About Global Poverty?

In August, a CGDev post proposed 12 universally agreed-upon truths about global poverty. Do you agree? Are there other truths we should all agree on?

3. Food for thought on two relevant method issues

  • Peter Hull released a two-page brief on controlling for propensity scores instead of using them to match or weight observations
  • Spillover and estimands: “The key issue is that the assumption of no spillovers runs so deep that it is often invoked even prior to the definition of estimands. If you write the “average treatment effect” estimand using potential outcomes notation, as E(Y(1)−Y(0)), you are already assuming that a unit’s outcomes depend only on its own assignment to treatment and not on how other units are assigned to treatment. The definition of the estimand leaves no space to even describe spillovers.”

BONUS: New IMF Chief Economist
Dr. Gita Gopinath takes over.

Weekly Development Links #3

My final week of taking over IDinsight’s internal development links.

1. Development myths: debunked

Rachel Glennerster asked for examples of development myths, resulting in a list of development myths along with sources and evidence debunking them.

2. Traditional local governance systems (autocratic) underutilize local human capital

A new paper by Katherine Casey, Rachel Glennerster, Ted Miguel, and Maarten Voors. “We experimentally evaluate two solutions to these problems [autocratic local rule by old, uneducated men] in rural Sierra Leone: an expensive long-term intervention to make local institutions more inclusive; and a low-cost test to rapidly identify skilled technocrats and delegate project management to them. In a real-world competition for local infrastructure grants, we find that technocratic selection dominates both the status quo of chiefly control and the institutional reform intervention, leading to an average gain of one standard deviation unit in competition outcomes. The results uncover a broader failure of traditional autocratic institutions to fully exploit the human capital present in their communities.”

3. Aggressive U.S. recruitment of nurses from Philippines did not result in brain drain / negative health impacts

A new paper by Paolo Abarcar and Caroline Theoharides. “For each new nurse that moved abroad, approximately two more individuals with nursing degrees graduated. The supply of nursing programs increased to accommodate this. New nurses appear to have switched from other degree types. Nurse migration had no impact on either infant or maternal mortality.”

BONUS. Data viz: Poverty persists in Africa, falls in other regions

Justin Sandefur shared how the Economist much improved a World Bank graphic to more clearly visualize how the number of people living in poverty has risen slightly in Africa while other regions have seen sharp decreases over time. (I wonder how the graphic would look if it stacked Africa, South Asia, then East Asia & Pacific. Less dramatic contrast between Africa and the other regions? The number of poor in South Asia hasn’t decreased as dramatically as in East Asia, so its trend would look more similar to Africa’s than to East Asia’s until about 2010, I think.)

Weekly Development Links #2

This is part 2 of me taking over IDinsight’s internal development link round-up.

1. This week in gender & econ

2. Two papers on p-hacking or bad reporting in econ papers

3. Mapping trade routes

Tilman Graff shared some really cool visualizations of trade routes, aid, and infrastructure in several African countries. They were created as part of his MPhil thesis.

Footbridges for higher wages

Lant Pritchett and other researchers often argue that development economists are too focused on one-off, micro interventions and fail to see the big picture. They are highly critical of the hype that develops around specific interventions following the release of studies using RCTs or quasi-experimental methods to measure the impact of a specific program – microfinance, for example, had a big moment, and, more recently, cash transfers have dominated many discussions of economic development.

Pritchett’s scorecard comparing first-generation RCT practice to the approach of the non-RCT crowd is an especially brutal assessment of the micro development literature (second table in the link). He writes, “National Development leads to better well being. National development is ontologically a social process (markets, politics, organizations, institutions). RCTs have focused on topics that account for roughly zero of the observed variation in human development outcomes.”

There’s a lot that’s valid about this line of critique, although I think it’s more a call to contextualize learnings, ideally with qualitative research investigating the how and why of a quantitative claim, than a motivation to throw out the micro development approach altogether.

Besides, there is something so satisfying about how a small intervention can have a big impact.

Small bridges, big deal

Brooks and Donovan’s recent paper (full PDF here) found that building footbridges in Northern Nicaragua protected local workers from the typical wage loss seen during flooding, when travel routes are cut off, and even led to increased profits for local farmers.

Their primary finding is best seen through two graphics from the paper. The first shows the distribution of wage earnings before footbridge construction, and you can clearly see a massive disadvantage to those experiencing flooding. In the second, the gap has disappeared.

Figure 1: Distribution of wage earnings BEFORE footbridge construction

Figure 2: Distribution of wage earnings AFTER footbridge construction

They also find positive spillover effects. First, rural villagers were able to take higher paying jobs in nearby towns, increasing their wages and increasing the wages of those left behind, who faced less competition in the local labor market. (A similar mechanism to that found in the No Lean Season research, which offered select villagers incentives to migrate to cities for work and found positive income effects for those households and neighboring non-study households.)

Second, farmer profits increased. Not because of lower trade costs that allowed farmers to buy cheaper inputs, but because they were able to access new purchasing markets for their goods and diversify their income sources.

This paper is amazing because the data viz communicates clearly, the findings are meaningful and positive, and the idea for the research design had to have come from an intimate knowledge of the challenges facing rural citizens of Northern Nicaragua.

A national and local development tool

Infrastructure studies connect easily to those big questions about national development that anti-randomistas would prefer to focus on. While it won’t be footbridges in every location, there are lots of countries where road and transport infrastructure solutions are needed to promote both local and national development.

Papers like this one show how connectivity and access can be an important determinant of economic welfare via multiple mechanisms. Besides income effects like those measured in the Brooks and Donovan paper, there are possible effects for access to credit, healthcare, or other public services that isolated communities would otherwise miss out on.

Gaining entitlements with infrastructure and cash

There’s a seriously inspiring narrative in there – a simple change that leads to more options, more opportunities, more connectivity. As my colleague Sindy was discussing today, there is a pattern whereby interventions that increase options and expand opportunity, such as infrastructure improvements or cash transfers, seem more powerful in effecting broad change than interventions targeting very narrow and specific goals.

There is probably a gain, though, in using both types of interventions at different times, or concurrently.

McIntosh and Zeitlin’s new paper compares a cash transfer program directly with a child nutrition program. The final line of their abstract made me think about paternalism and beneficiary preferences: “The results indicate that programs targeted towards driving specific outcomes can do so at lower cost than cash, but large cash transfers drive substantial benefits across a wide range of impacts, including many of those targeted by the more tailored program.”

People spend their money with different priorities than programs dictate and seem to get more out of it. That suggests to me that cash transfers (or infrastructure improvements) are a way to improve this baseline ability to provide for your household (“entitlements” à la Amartya Sen), while specific health or education interventions are more useful as public service-style campaigns to promote undervalued goods, such as immunizations.

A final thought

I’m generally curious how often Sen’s entitlements approach is explicitly applied to non-famine topics in development research. I’m guessing often. (A two-minute google led me to a PhD thesis called “Poverty as entitlement failures” that sounds interesting.)

Weekly Development Links #1

Each Wednesday at IDinsight, one of our tech team members, Akib Khan, posts a few links (mostly from Twitter!) to what he’s been reading in development that week. For the next three weeks, he’s on leave and I am taking over! Thought I should cross-post my selections (also mostly curated from #EconTwitter):

Cash Transfer Bonanza: The details matter
Blattman et al. just released a paper following up on previous 4-year results from a one-time cash transfer of $400, now reporting 9-year results (see first 3 links). To liven up the internal discussion, I’m adding critiques by Ashu Handa (UNC Transfer Project / UNICEF-Innocenti economist and old family friend), who has cautioned against lack of nuance in interpretation of CT study results, esp. around program implementation details like who is distributing grants, the size of the grants, and how frequently they are given – he studies social protection programs giving repeat cash transfers.

Diff-in-diff treatment timing paper… with GIFs!
Andrew Goodman-Bacon (what a name!) has a new paper that all of #EconTwitter is going crazy over. It deals with methodological issues that arise when using diff-in-diff with treatment that turns on at different times for different groups, and other scenarios where timing becomes important. The paper itself is not for the faint of heart, but the Twitter thread has some great GIFs!
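
For intuition about the setting (this is not Goodman-Bacon’s decomposition itself, just a simulation of the standard two-way fixed-effects diff-in-diff that his paper shows is a weighted average of all two-group/two-period comparisons):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated staggered-adoption panel: units adopt treatment at period 3,
# period 6, or never (coded 99). The two-way fixed-effects diff-in-diff
# regresses the outcome on a treatment dummy plus unit and time dummies.
rng = np.random.default_rng(4)
units, periods = 40, 10
df = pd.DataFrame([(i, t) for i in range(units) for t in range(periods)],
                  columns=["unit", "time"])
adoption = {i: rng.choice([3, 6, 99]) for i in range(units)}
df["treated"] = (df["time"] >= df["unit"].map(adoption)).astype(int)
df["y"] = (0.1 * df["unit"] + 0.2 * df["time"] + 1.0 * df["treated"]
           + rng.normal(size=len(df)))

twfe = smf.ols("y ~ treated + C(unit) + C(time)", data=df).fit()
# Recovers ~1.0 here because the true effect is constant; Goodman-Bacon's
# point is that this breaks down when effects vary over time across cohorts.
print(twfe.params["treated"])
```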

African debt to China: reality doesn’t match the hype

Bonus link: Eritrea & Ethiopia border opening party

Thesis revamp: All hail Ted Miguel, PhD, god of economic writing!

Ted Miguel, god of economic writing

In order to have a high-quality writing sample for the RA jobs I’m applying to this fall, I am revamping my thesis! Joy of joys!

I thought about doing this earlier in the year and even created a whole plan to do it, but I ended up deciding to work on this blog, learn to code, and pursue other, less horrifying professional development activities.

I say horrifying because the thesis I submitted was HORRIBLY WRITTEN. So so so bad. I cringe every time I look back over it. I had tackled a 6-year project (the length of time it took to write the paper I was basing my thesis on, I later found out) in four months’ time. Too little of the critical thinking I had done on how to handle the piles and piles of data I needed to answer my research question actually ended up in writing.

I thought it would be a drag to fix up the paper. I didn’t expect to still be as intrigued by my research topic (democracy and health in sub-Saharan Africa!) or to be as enthusiastic about practicing my economic writing. I’m taking the unexpected enjoyment as a positive sign that life as a researcher will be awesome.

I’ve been thinking critically about the question of democracy and health and how they’re interrelated and how economic development ties into each. I’ve read (skimmed) a few additional sources that I didn’t even think to look for last time and I already have some good ideas for a new framing of why this research is interesting and important. The first time around, I focused a lot on the cool methodology (spatial regression discontinuity design) because that’s what I spent most of my time working on.

My perspective on the research question has been massively refreshed by time apart from my thesis, new on-the-ground development experience, and the papers I’ve read in the interim.

My first tasks have been to re-read the thesis (yuck) and then gather the resources I need to re-write at least the introduction. I am focusing on the abstract and introduction as the first order of business because some of the writing samples I will need to submit will be or can be shorter, and the introduction is as far as most people would get anyway.

To improve my writing and the structure of my introduction, my thesis advisor – who I can now call Erick instead of Professor Gong – recommended reading some of Ted Miguel’s introductions. I printed three and all were well-written and informative in terms of structure; one of them (with Pascaline Dupas) even helped me rethink the context around my research question and link it more solidly to the development economics literature.

The next move is to outline the introduction by writing the topic sentence of each paragraph (a tip taken from my current manager at IDinsight, Ignacio, who is very into policy memo-style writing) using a Miguel-type structure. I’ll edit that structure a bit, then add the text of the paragraphs.

Noble work: Anand Giridharadas on the EKS

There was a discussion on the IDinsight #philosophy Slack channel about a recent Ezra Klein Show (EKS on this blog from now on, since I talk about it all the time) podcast with Anand Giridharadas. My contribution built off someone else’s notes: that Giridharadas is spot on about how companies (IDinsight too, in some ways) sell working for them as an extension of the camaraderie and culture of a college campus, that he doesn’t offer concrete solutions and that’s very annoying, plus some reflections on transitioning from private sector consulting to IDinsight’s social sector, non-profit consulting model. I related more to the moral arguments in the podcast, and this is what I shared:

I connected most with his argument about how the overall negative impact of many big for-profit companies on worldwide well-being vastly outweighs any individual good you can do with the money you earn. One of EA’s recommended pathways to change is making a ton of money and giving it to effective charities, but if you do that by working for an exploitative company, then you’re really contributing to the maintenance of inequality and of the status quo racist, sexist, oppressive system.

My dad was always talking about having a “noble” profession when I was growing up (he’s a teacher and my mom’s a geriatric physical therapist) and even though “noble” is a strange way to put it, I think it is really important to (as much as possible) only be party to organizations and companies that are doing good or at least not doing active harm.

That being said, there are more reasons for going into the private sector and aiming to make money than are really dealt with in the podcast. For example, a few people we’ve talked to in South Africa have mentioned that many highly skilled South Africans are responsible for the education costs for all siblings/cousins and that is a strong motivator to take a higher paying salary.

It becomes very related to the debate about how much development or social sector workers should get paid, relative to competitive private sector jobs. I think IDinsight does a pretty good job of being in the middle for US associates anyways – paying enough that you can even save some, which is more than a lot of non-profits provide, but not necessarily trying to compete with private sector jobs because our model relies a lot on hiring people who are in it to serve, not for the money. Something for us to continue thinking about is how this might exclude candidates who have other financial responsibilities and how we should respond to this issue in how we hire and set salaries.

It’s so frustrating when people identify a problem without offering solutions. The closest he comes to offering solutions is to have organizations stop lobbying for massive tax breaks or in other ways deprioritize the bottom line of profitability. Sounded to me like his vision involves a lot more socialist ideas: the full solutions to these issues would involve massive-scale reorganizing of the existing economic system… although maybe we are heading in that direction with more co-op style companies and triple bottom line for-profit social enterprises? (Don’t know a ton about this co-op stuff – mostly from another Ezra Klein show episode probably, but it sounds cool!) …Maybe his next book will try to map out solutions, though?

I’m pretty sure I just solved life

Disclaimer: I was a little drunk on power (calculations) when I wrote this, but it’s me figuring out that econometrics is something I might want to specialize in!

I think I just figured out what I want to do with the rest of my career.

I want to contribute to how people actually practice data analysis in the development sector from the technical side.

I want to write about study design and the technical issues that go into running a really good evaluation, and I want to produce open source resources to help people understand and implement the best technical practices.

This is always something that makes me really excited. I don’t think I have a natural/intuitive understanding of some of the technical work, but I really enjoy figuring it out.

And I love writing about/explaining technical topics when I feel like I really “get” a concept.

This is the part of my current job that I’m most in love with. Right now, for example, I’m working on a technical resource to help IDinsight do power calculations better. And I can’t wait to go to work tomorrow and get back into it.
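
For a taste of the topic, a minimal sample-size calculation with statsmodels (illustrative parameters only; a real cluster-randomized design would need further adjustment for intra-cluster correlation):

```python
import math
from statsmodels.stats.power import TTestIndPower

# Sample size per arm for a simple two-arm, individual-level RCT.
n_per_arm = TTestIndPower().solve_power(
    effect_size=0.2,  # standardized effect size (Cohen's d)
    alpha=0.05,       # significance level
    power=0.8,        # 1 - Type II error rate
    ratio=1.0,        # equal arm sizes
)
print(f"Required sample size per arm: {math.ceil(n_per_arm)}")  # 394
```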

I’ve also been into meta-analysis papers that bring multiple studies together. In general, the meta-practices, including ethical considerations, of development economics are what I want to spend my time working on.

I’ve had this thought before, but I haven’t really had a concept of making that my actual career until now. But I guess I’ve gotten enough context now that it seems plausible.

I definitely geek out the most about these technical questions, and I really admire people who are putting out resources so that other people can geek out and actually run better studies.

I can explore the topics I’m interested in, talk to people who are doing cool work, create practical tools, and link these things that excite me intellectually to having a positive impact in people’s lives.

My mind is already racing with cool things to do in this field. Ultimately, a website that is essentially an encyclopedia of development economics best practices would be so cool. A way to link all open source tools and datasets and papers, etc.

But top of my list for now is doing a good job with, and enjoying, this power calculations project at work. If it’s as much fun as it was today, I will be in job heaven.


New insights on the development vs. humanitarian sectors

When I was at Middlebury, I took classes like Famine & Food Security and Economics of Global Health, learning more and more about humanitarian aid and international development. It didn’t really sink in that these were two different sectors until today.

I had a chance to talk to someone who worked for REACH – an organization that tries to collect the most accurate data possible from war zones/humanitarian emergency areas to inform policy. Seems like pretty important work.

Our conversation solidified for me that the humanitarian sector is different from the development sector. The humanitarian sector has a totally different set of actors (dominated by the UN) and missions, although the ultimate mission of a better world is the same.

Development is about the ongoing improvement of individuals living in a comparatively stable system; humanitarian aid is about maintaining human rights and dignities when all those systems break down.

There’s some overlap, of course – regions experiencing ongoing war and violence may be targeted by development and humanitarian programs alike, for example. I also think the vocabulary blurs a bit when discussing funding for development and humanitarian aid.

Development isn’t quite sure how it feels about human rights, though. Rights tend to be valued when they lead to economic development, which most development work treats as the end goal.

I’d say that my definition of what I want to do in the development sector bleeds over into the human rights and humanitarian arenas. (I’m sure there’s also an important distinction between human rights sector and humanitarian sector – probably that the humanitarian sector is more about meeting people’s basest needs in crisis, although human rights workers also deal with abuses during crises.)

My interest in humanitarian work has been piqued by this conversation today, though. It was also piqued by my former roommate’s description of her work with Doctors without Borders. The idea of going on an intense mission trip for a period of time, being all-in, then taking a break is kind of appealing. Although REACH itself wasn’t described as a great work experience: really long hours and fairly repetitive work.

Maybe I should read more about the economics/humanitarian aid/data overlap.

Is my job moral? [repost]

If I continue on my current career path, I may end up arbitrating who lives and who dies. (And maybe I’ll tell their story in an economics journal and make a living doing so.)

I am planning on pursuing a career in development work, specifically in the evaluation of development programs. The “gold standard” for evaluating programs is a Randomized Control Trial (RCT).

Consider a non-profit distributing books to children with the goal of improving literacy. The non-profit wants to know whether their books really have any impact on children’s literacy. Ideally, they could look at what happens when they give a group of children the books and also what happens when they don’t give the same children books.

However, due to thus far unchangeable time-space continuum properties, this isn’t possible. So, in order to confidently say that their books had an impact, the non-profit needs to compare the literacy scores of children who received the books with other very similar children who didn’t get books. Let’s say they hire me to run an RCT for this very purpose.

To determine which children will get the books (the treatment group) and which children will serve as the comparison group (the control group), I take a list of 100 schools and randomly assign half of them to receive the extra books program. After the books are distributed and some time has passed, I go back to the schools and I have all the children take literacy tests. I compare the test scores of children in each group, and find that, on average, children who received books did much better on the literacy tests.
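
A stripped-down version of that design in code (simulated school-level scores standing in for real data):

```python
import numpy as np
from scipy import stats

# Simulate the book RCT: randomly assign 50 of 100 schools to treatment,
# then compare average literacy scores across the two groups.
rng = np.random.default_rng(5)
schools = np.arange(100)
treated_ids = rng.choice(schools, size=50, replace=False)
is_treated = np.isin(schools, treated_ids)

# School-level average literacy scores, with a true effect of +5 points.
scores = rng.normal(loc=60, scale=10, size=100) + 5 * is_treated

diff = scores[is_treated].mean() - scores[~is_treated].mean()
t_stat, p_value = stats.ttest_ind(scores[is_treated], scores[~is_treated])
print(f"Treatment - control difference: {diff:.1f} points (p = {p_value:.3f})")
```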

The non-profit is very happy and uses the results to convince more people to donate to their program. Now they can give books to many more children, and presumably those children’s literacy scores will also increase.

This is all good and well. Even if some children in the study were chosen not to receive books, there are several commonly accepted justifications for why we studied them without providing a service:

  • The non-profit did not have enough money to give books to all the schools anyway. Randomly determining which schools received the books makes it as fair as possible.
  • While the books program was unlikely to have negative effects on children, we didn’t know if it would have no effect or a positive effect at the start. So we didn’t know if we were really depriving children of a chance to improve their literacy.
  • Being able to conduct the evaluation could inform policy and global knowledge on effective ways to improve literacy, and could improve decision-making at the non-profit.
  • In this case, maybe the control group children were the first to receive books when the non-profit’s funding increased.

These are common justifications for development evaluations. They seem quite reasonable — randomly giving out benefits might be the fairest option, we don’t know what the effect really is, and the study will contribute to our shared knowledge and lead to better decisions and even better outcomes in the future.

What if, instead of working on literacy, the non-profit wanted to reduce deaths from childbirth by improving access to and use of health facilities by pregnant women?

Suddenly, so much more is at stake.

If I randomly assign half a county to have access to a special taxi service that drives pregnant women to hospitals for safer deliveries, and one of the women who was assigned NOT to receive the taxi service dies because she gave birth at home, is the evaluation immoral? Am I morally culpable for her death?

Because I work with numbers and data, it is easy to separate myself from the potential negative consequences of the work. I didn’t choose her to die — the random number generator made me do it. 


So what if we’re in a situation where a randomized control trial seems immoral? How can we still learn about what works and what doesn’t?

There are other evaluation methods that can give us an idea of what programs work and which don’t. For example, quasi-experimental methods look at situations where comparable control and treatment groups are incidentally defined by the implementation of a policy. Then we can compare two groups without having to be responsible for directly assigning some people to receive a program while others go without.

Qualitative or other non-experimental methods involve gathering data by talking to people, doing research, and meeting with different groups to get various opinions on what’s happening. These methods can also help paint a picture of whether a program is having a positive effect.

But the RCT is the gold standard for a reason. A well-designed RCT can tell us what the effect of a program is with much higher confidence and precision than other methods.

UNICEF Social Policy Specialist Tia Palermo recently wrote a post titled “Are Randomized Control Trials Bad for Children?” for UNICEF’s Evidence for Action blog. She makes a powerful point to consider: What are the alternatives to running RCTs? Are they better or worse?

Palermo sees the alternative as worse: “Is it ethical to pour donor money into projects when we don’t know if they work? Is it ethical not to learn from the experience of beneficiaries about the impacts of a program?” she asks.

Her most convincing argument is that there are ethical implications to every research method we might choose:

“A non-credible or non-rigorous evaluation is a problem because underestimating program impacts might mean that we conclude a program or policy doesn’t work when it really does (with ethical implications). Funding might be withdrawn and an effective program is cut off. Or we might overestimate program impacts and conclude that a program is more successful than it really is (also with ethical implications). Resources might be allocated to this program over another program that actually works, or works better.”

And there are ethical implications to not evaluating programs at all. If non-profits aren’t held to any standard and don’t measure the effect of their program at all, there’s no way to tell which interventions and which non-profits are helping, having no effect on, or even harming the program recipients.

In the case of the woman who died because she didn’t get to a health facility, if the study had never taken place, would she have gotten to a health facility or not? It is impossible to know what would have happened, but it’s not impossible to minimize the risk of harm and maximize the benefits to all study participants. 


Ultimately, RCTs generate important evidence when they are well executed. The findings from such studies can be used to make better decisions at non-profits, at big donor foundations like the Gates Foundation or GiveWell, and at government agencies. All of which can lead to more lives saved, which is the ultimate goal.

So what to do about the ethical implications of randomly determining who gets access to a potentially life-saving program? Or any program that could have a positive impact on people’s lives?

There are a variety of measures in place to ensure ethical conduct in research and many more ~official~ economists are thinking about these ideas.

The 1979 Belmont Report helped establish criteria for ethics in human subjects research, focusing on respect for people’s right to make decisions freely, maximizing benefits and doing no harm, and fairness in who bears any risks or benefits. Institutional Review Boards (IRBs) are governing bodies that ensure these principles are being upheld for all research.

Economists Rachel Glennerster and Shawn Powers wrote a highly recommended piece on these ethical considerations, “Balancing Risk and Benefit: Ethical Tradeoffs in Running Randomized Evaluations,” which I’m currently reading.

Yet persistent concerns about how to run ethical evaluations suggest that there is more work to do.

Taking the time to consider the ethical implications of each project is key. And I think there is more room for evaluators to read deeply on the subject and really dig into how to make evaluations more just and more beneficial to even those in the control group who don’t receive the program.

A driving principle, especially for researchers running RCTs in the development field, could be that an evaluation must have a direct positive impact on all study participants, either during the study or immediately following its completion. There are a variety of ways, some more commonly used than others, that researchers can apply this principle:

  • If we truly don’t know whether the effect of the program is positive or negative, we can make plans to provide the program to control households if it is found to have a positive effect.
  • If we suspect the program has a positive effect, the control group can be offered the program immediately after the study period has ended.
  • We can offer everyone in the study a base service, while the study tests the effectiveness of an additional service provided only to the treatment group. This way, everyone who is contributing time and information to the study receives some benefit in return.
  • Extensive piloting (testing different ideas and aspects of the evaluation before the start of the study) can also reveal potential moral dilemmas to evaluating any particular program.
  • Community interest meetings can be held before the study is implemented to gain community-level consent to participate in the study. These meetings could also be held quite early on to inform research designs and improve the quality of the study results. For example, in some cultures, it is not appropriate for a man to be alone with a woman he is not related to. If this is the case in a study area, then hiring male staff to conduct surveys would lead to a less successful study.
  • Local staff can be hired to conduct any surveys or data collection to ensure that the surveys are culturally appropriate.
  • We always obtain full and knowledgeable consent from participants, which may require translating surveys into participants’ native language.
  • If study participation requires much time or effort from control group individuals, they can be appropriately compensated.
  • All reports on evaluations (RCTs and other designs) can be fully transparent about research decisions and how ethical concerns were addressed. This will contribute to the international research community’s combined knowledge of how to ensure the rights of participants are provided for in RCTs and other research.
  • The learnings from the study can also be shared with the participating community and should add to their knowledge about their own lives; contributing to the abstract “international research community” is not enough.

Enacting these measures requires more of researchers: some have the potential to affect the legitimacy of the evaluation results if they are not properly accounted for in analysis. But a strong sense of ethics and a dedication to the population being served (often low-income individuals from the Global South, contrasted with well-off researchers from the West) demand that we take the extra time in our research to consider all ethical implications.

Originally published on my Unofficial Economist Medium publication, November 4, 2017.