Noble work: Anand Giridharadas on the EKS

There was a recent discussion on the IDinsight #philosophy Slack channel about an Ezra Klein Show podcast episode with Anand Giridharadas (EKS on this blog from now on, since I talk about it all the time). My contribution built off someone else’s notes: that Giridharadas is spot on about how companies (IDinsight too, in some ways) sell working for them as an extension of the camaraderie and culture of a college campus; that he doesn’t offer concrete solutions, which is very annoying; and some reflections on transitioning from private sector consulting to IDinsight’s social sector, non-profit consulting model. I related more to the moral arguments in the podcast, and this is what I shared:

I connected most with his argument about how the overall negative impact of many big for-profit companies on worldwide well-being vastly outweighs any individual good you can do with the money you earn. One of EA’s recommended pathways to change is making a ton of money and giving it to effective charities, but if you do that by working for an exploitative company, then you’re really contributing to the maintenance of inequality and of the status quo racist, sexist, oppressive system.

My dad was always talking about having a “noble” profession when I was growing up (he’s a teacher and my mom’s a geriatric physical therapist) and even though “noble” is a strange way to put it, I think it is really important to (as much as possible) only be party to organizations and companies that are doing good or at least not doing active harm.

That being said, there are more reasons for going into the private sector and aiming to make money than are really dealt with in the podcast. For example, a few people we’ve talked to in South Africa have mentioned that many highly skilled South Africans are responsible for the education costs for all siblings/cousins and that is a strong motivator to take a higher paying salary.

It becomes very related to the debate about how much development or social sector workers should get paid, relative to competitive private sector jobs. I think IDinsight does a pretty good job of being in the middle for US associates anyways – paying enough that you can even save some, which is more than a lot of non-profits provide, but not necessarily trying to compete with private sector jobs because our model relies a lot on hiring people who are in it to serve, not for the money. Something for us to continue thinking about is how this might exclude candidates who have other financial responsibilities and how we should respond to this issue in how we hire and set salaries.

It’s so frustrating when people identify a problem without offering solutions. The closest he comes to offering solutions is to have organizations stop lobbying for massive tax breaks or in other ways deprioritize the bottom line of profitability. Sounded to me like his vision involves a lot more socialist ideas: the full solutions to these issues would involve massive-scale reorganizing of the existing economic system… although maybe we are heading in that direction with more co-op style companies and triple bottom line for-profit social enterprises? (Don’t know a ton about this co-op stuff – mostly from another Ezra Klein show episode probably, but it sounds cool!) …Maybe his next book will try to map out solutions, though?

I’m pretty sure I just solved life

Disclaimer: I was a little drunk on power (calculations) when I wrote this, but it’s me figuring out that econometrics is something I might want to specialize in!

I think I just figured out what I want to do with the rest of my career.

I want to contribute to how people actually practice data analysis in the development sector from the technical side.

I want to write about study design and the technical issues that go into running a really good evaluation, and I want to produce open source resources to help people understand and implement the best technical practices.

This is always something that makes me really excited. I don’t think I have a natural/intuitive understanding of some of the technical work, but I really enjoy figuring it out.

And I love writing about/explaining technical topics when I feel like I really “get” a concept.

This is the part of my current job that I’m most in love with. Right now, for example, I’m working on a technical resource to help IDinsight do power calculations better. And I can’t wait to go to work tomorrow and get back into it.

I’ve also been into meta-analysis papers that bring multiple studies together. In general, the meta-practices, including ethical considerations, of development economics are what I want to spend my time working on.

I’ve had this thought before, but I haven’t really had a concept of making that my actual career until now. But I guess I’ve gotten enough context now that it seems plausible.

I definitely geek out the most about these technical questions, and I really admire people who are putting out resources so that other people can geek out and actually run better studies.

I can explore the topics I’m interested in, talk to people who are doing cool work, create practical tools, and link these things that excite me intellectually to having a positive impact in people’s lives.

My mind is already racing with cool things to do in this field. Ultimately, a website that is essentially an encyclopedia of development economics best practices would be so cool. A way to link all open source tools and datasets and papers, etc.

But top of my list for now is doing a good job with, and enjoying, this power calculations project at work. If it’s as much fun as it was today, I will be in job heaven.

New insights on the development vs. humanitarian sectors

When I was at Middlebury, I took classes like Famine & Food Security and Economics of Global Health, learning more and more about humanitarian aid and international development. It didn’t really sink in that these were two different sectors until today.

I had a chance to talk to someone who worked for REACH – an organization that tries to collect the most accurate data possible from war zones/humanitarian emergency areas to inform policy. Seems like pretty important work.

Our conversation solidified for me that the humanitarian sector is different from the development sector. The humanitarian sector has a totally different set of actors (dominated by the UN) and missions, although the ultimate mission of a better world is the same.

Development is about the ongoing improvement of individuals living in a comparatively stable system; humanitarian aid is about maintaining human rights and dignities when all those systems break down.

There’s some overlap, of course – regions experiencing ongoing war and violence may be targeted by development and humanitarian programs alike, for example. I also think the vocabulary blurs a bit when discussing funding for development and humanitarian aid.

Development isn’t quite sure how it feels about human rights, though. Rights are good when they lead to economic development, which is equivalent to most development work.

I’d say that my definition of what I want to do in the development sector bleeds over into the human rights and humanitarian arenas. (I’m sure there’s also an important distinction between the human rights sector and the humanitarian sector – probably that the humanitarian sector is more about meeting people’s most basic needs in crisis, although human rights workers also deal with abuses during crises.)

My interest in humanitarian work has been piqued by this conversation today, though. It was also piqued by my former roommate’s description of her work with Doctors without Borders. The idea of going on an intense mission trip for a period of time, being all-in, then taking a break is kind of appealing. Although REACH itself wasn’t described as a great work experience. Really long hours, but fairly repetitive work.

Maybe I should read more about the economics/humanitarian aid/data overlap.

3ie: Improve power calculations with a pilot

3ie wrote on June 11 about why you may need a pilot study to improve power calculations:

  1. Low uptake: “Pilot studies help to validate the expected uptake of interventions, and thus enable correct calculation of sample size while demonstrating the viability of the proposed intervention.”
  2. Overly optimistic MDEs: “By groundtruthing the expected effectiveness of an intervention, researchers can both recalculate their sample size requirements and confirm with policymakers the intervention’s potential impact.” It’s also important to know if the MDE is practically meaningful in context.
  3. Underestimated ICCs: “Underestimating one’s ICC may lead to underpowered research, as high ICCs require larger sample sizes to account for the similarity of the research sample clusters.”
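To see how these three assumptions feed into sample size, here is a minimal sketch using the standard design-effect formula, 1 + (m - 1) × ICC, and a normal-approximation sample size for comparing two equal-sized arms. This is not 3ie's calculation: the function names, the 0.2 SD effect, the cluster size of 20, and the ICC values are all hypothetical numbers I picked for illustration, and treating low uptake as a simple scaling of the effect is a simplifying assumption.

```python
from math import ceil
from statistics import NormalDist


def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation from randomizing clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def n_per_arm(mde_sd: float, icc: float, cluster_size: int,
              uptake: float = 1.0, alpha: float = 0.05,
              power: float = 0.8) -> int:
    """Individuals needed per arm in a two-arm cluster trial (equal arms).

    mde_sd is the minimum detectable effect among people who take up the
    program, in standard deviation units (an illustrative assumption).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Low uptake dilutes the average effect across everyone assigned.
    diluted_effect = mde_sd * uptake
    # Sample size per arm if individuals were randomized one by one.
    n_individual = 2 * (z / diluted_effect) ** 2
    # Inflate for the similarity of people within the same cluster.
    return ceil(n_individual * design_effect(cluster_size, icc))


# Hypothetical scenario: how the requirement grows as the assumed ICC rises.
for icc in (0.05, 0.10, 0.20):
    print(f"ICC={icc:.2f}: {n_per_arm(mde_sd=0.2, icc=icc, cluster_size=20)} per arm")
```

Playing with the inputs shows why a pilot matters: halving the assumed uptake roughly quadruples the required sample, and a higher ICC than expected inflates it through the design effect.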

The piece has many strengths, including that 3ie calls out one of their own failures on each point. They also share the practical and cost implications of these mistakes.

At work, I might be helping develop an ICC database, so I got a kick out of the authors’ own call for such a tool…

“Of all of the evaluation design problems, an incomplete understanding of ICCs may be the most frustrating. This is a problem that does not have to persist. Instead of relying on assumed ICCs or ICCs for effects that are only tangentially related to the outcomes of interest for the proposed study, current impact evaluation researchers could simply report the ICCs from their research. The more documented ICCs in the literature, the less researchers would need to rely on assumptions or mismatched estimates, and the less likelihood of discovering a study is underpowered because of insufficient sample size.”

…although, if ICCs are rarely reported, I may have my work cut out for me!

Is my job moral? [repost]

If I continue on my current career path, I may end up arbitrating who lives and who dies. (And maybe I’ll tell their story in an economics journal and make a living doing so.)

I am planning on pursuing a career in development work, specifically in the evaluation of development programs. The “gold standard” for evaluating programs is a Randomized Control Trial (RCT).

Consider a non-profit distributing books to children with the goal of improving literacy. The non-profit wants to know whether their books really have any impact on children’s literacy. Ideally, they could look at what happens when they give a group of children the books and also what happens when they don’t give the same children books.

However, due to thus far unchangeable time-space continuum properties, this isn’t possible. So, in order to confidently say that their books had an impact, the non-profit needs to compare the literacy scores of children who received the books with other very similar children who didn’t get books. Let’s say they hire me to run an RCT for this very purpose.

To determine which children will get the books (the treatment group) and which children will serve as the comparison group (the control group), I take a list of 100 schools and randomly assign half of them to receive the extra books program. After the books are distributed and some time has passed, I go back to the schools and I have all the children take literacy tests. I compare the test scores of children in each group, and find that, on average, children who received books did much better on the literacy tests.
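For the curious, the random assignment step could look something like the sketch below. The school names, the seed, and the function name are all made up for illustration, and a real study would usually stratify and document the procedure much more carefully.

```python
import random


def assign_schools(schools, seed=2017):
    """Randomly split a list of schools into equal treatment and control groups.

    The seed is an arbitrary illustrative choice; fixing it makes the
    assignment reproducible so that others can audit it.
    """
    rng = random.Random(seed)
    shuffled = list(schools)   # copy so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),  # schools that get the books
        "control": sorted(shuffled[half:]),    # comparison schools
    }


# Hypothetical list of 100 schools.
schools = [f"school_{i:03d}" for i in range(1, 101)]
groups = assign_schools(schools)
print(len(groups["treatment"]), len(groups["control"]))
```

Because the split depends only on the random number generator, nothing about a school (its size, location, or how motivated its teachers are) can influence which group it lands in.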

The non-profit is very happy and uses the results to convince more people to donate to their program. Now they can give books to many more children, and presumably those children’s literacy scores will also increase.

This is all well and good. Even if some children in the study were assigned not to receive books, there are several commonly accepted justifications for why we studied them without providing a service:

  • The non-profit did not have enough money to give books to all the schools anyway. Randomly determining which schools received the books makes it as fair as possible.
  • While the books program was unlikely to have negative effects on children, we didn’t know if it would have no effect or a positive effect at the start. So we didn’t know if we were really depriving children of a chance to improve their literacy.
  • Being able to conduct the evaluation could inform policy and global knowledge on effective ways to improve literacy, and could improve decision-making at the non-profit.
  • In this case, maybe the control group children were the first to receive books when the non-profit’s funding increased.

These are common justifications for development evaluations. They seem quite reasonable — randomly giving out benefits might be the fairest option, we don’t know what the effect really is, and the study will contribute to our shared knowledge and lead to better decisions and even better outcomes in the future.

What if, instead of working on literacy, the non-profit wanted to reduce deaths from childbirth by improving access to and use of health facilities by pregnant women?

Suddenly, so much more is at stake.

If I randomly assign half a county to have access to a special taxi service that drives pregnant women to hospitals for safer deliveries, and one of the women who was assigned NOT to receive the taxi service dies because she gave birth at home, is the evaluation immoral? Am I morally culpable for her death?

Because I work with numbers and data, it is easy to separate myself from the potential negative consequences of the work. I didn’t choose her to die — the random number generator made me do it. 


So what if we’re in a situation where a randomized control trial seems immoral? How can we still learn about what works and what doesn’t?

There are other evaluation methods that can give us an idea of what programs work and which don’t. For example, quasi-experimental methods look at situations where comparable control and treatment groups are incidentally defined by the implementation of a policy. Then we can compare two groups without having to be responsible for directly assigning some people to receive a program while others go without.

Qualitative or other non-experimental methods involve gathering data by talking to people, doing research, and meeting with different groups to get various opinions on what’s happening. These methods can also help paint a picture of whether a program is having a positive effect.

But the RCT is the gold standard for a reason. A well-designed RCT can tell us what the effect of a program is with much higher confidence and precision than other methods.

UNICEF Social Policy Specialist Tia Palermo recently wrote a post titled “Are Randomized Control Trials Bad for Children?” for UNICEF’s Evidence for Action blog. She makes a powerful point to consider: What are the alternatives to running RCTs? Are they better or worse?

Palermo sees the alternative as worse: “Is it ethical to pour donor money into projects when we don’t know if they work? Is it ethical not to learn from the experience of beneficiaries about the impacts of a program?” she asks.

Her most convincing argument is that there are ethical implications to every research method we might choose:

“A non-credible or non-rigorous evaluation is a problem because underestimating program impacts might mean that we conclude a program or policy doesn’t work when it really does (with ethical implications). Funding might be withdrawn and an effective program is cut off. Or we might overestimate program impacts and conclude that a program is more successful than it really is (also with ethical implications). Resources might be allocated to this program over another program that actually works, or works better.”

And there are ethical implications to not evaluating programs at all. If non-profits aren’t held to any standard and don’t measure the effect of their program at all, there’s no way to tell which interventions and which non-profits are helping, having no effect on, or even harming the program recipients.

In the case of the woman who died because she didn’t get to a health facility, if the study had never taken place, would she have gotten to a health facility or not? It is impossible to know what would have happened, but it’s not impossible to minimize the risk of harm and maximize the benefits to all study participants. 


Ultimately, RCTs generate important evidence when they are well executed. The findings from such studies can be used to make better decisions at non-profits, at big donor foundations like the Gates Foundation or GiveWell, and at government agencies. All of which can lead to more lives saved, which is the ultimate goal.

So what to do about the ethical implications of randomly determining who gets access to a potentially life-saving program? Or any program that could have a positive impact on people’s lives?

There are a variety of measures in place to ensure ethical conduct in research and many more ~official~ economists are thinking about these ideas.

The 1979 Belmont Report helped establish criteria for ethics in human research, focusing on respect for people’s right to make decisions freely, maximizing benefits and doing no harm, and fairness in who bears any risks or benefits. Institutional Review Boards (IRBs) are governing bodies that ensure these principles are being upheld for all research.

Economists Rachel Glennerster and Shawn Powers wrote a highly recommended piece on these ethical considerations, “Balancing Risk and Benefit: Ethical Tradeoffs in Running Randomized Evaluations,” which I’m currently reading.

Yet persistent concerns about how to run ethical evaluations suggest that there is more work to do.

Taking the time to consider the ethical implications of each project is key. And I think there is more room for evaluators to read deeply on the subject and really dig into how to make evaluations more just and more beneficial to even those in the control group who don’t receive the program.

A driving principle, especially for researchers running RCTs in the development field, could be that an evaluation must have a direct positive impact on all study participants, either during the study or immediately following its completion. There are a variety of ways, some more commonly used than others, that researchers can apply this principle:

  • If we truly don’t know whether the effect of the program is positive or negative, we can make plans to provide the program to control households if it is found to have a positive effect.
  • If we suspect the program has a positive effect, the control group can be offered the program immediately after the study period has ended.
  • We can offer everyone in the study a base service, while the study tests the effectiveness of an additional service provided only to the treatment group. This way, everyone who is contributing time and information to the study receives some benefit in return.
  • Extensive piloting (testing different ideas and aspects of the evaluation before the start of the study) can also reveal potential moral dilemmas in evaluating any particular program.
  • Community interest meetings can be held before the study is implemented to gain community-level consent to participate in the study. These meetings could also be held quite early on to inform research designs and improve the quality of the study results. For example, in some cultures, it is not appropriate for a man to be alone with a woman he is not related to. If this is the case in a study area, then hiring male staff to conduct surveys would lead to a less successful study.
  • Local staff can be hired to conduct any surveys or data collection to ensure that the surveys are culturally appropriate.
  • We always obtain full and knowledgeable consent from participants, which may require translating surveys into participants’ native language.
  • If study participation requires much time or effort from control group individuals, they can be appropriately compensated.
  • All reports on evaluations (RCTs and other designs) can be fully transparent about research decisions and how ethical concerns were addressed. This will contribute to the international research community’s combined knowledge of how to ensure the rights of participants are provided for in RCTs and other research.
  • The learnings from the study can also be shared with the participating community and should add to their knowledge about their own lives; contributing to the abstract “international research community” is not enough.

Enacting these measures requires more of researchers: some have the potential to affect the legitimacy of the evaluation results if they are not properly accounted for in analysis. But a strong sense of ethics and a dedication to the population being served (often low-income individuals from the Global South, contrasted with well-off researchers from the West) demand that we take the extra time in our research to consider all ethical implications.

Originally published on my Unofficial Economist Medium publication, November 4, 2017.

How should I use my professional development time?

So much learning I want to do

  • Coding classes: Advanced R, Intro to Python, Machine Learning
  • Reading academic articles
    • In economics
    • In global health sector
  • Reading development-related articles
  • Summarizing/critiquing work-related articles
    • And post online
    • And share on internal knowledge management channels
  • Read Poor Economics for real (embarrassed I’ve only read half of it though)
  • Read Field Experiments book
  • Stata challenges from work
  • Plan a brown bag lunch presentation or chai & chat on a topic that interests me
    • An opportunity to practice presenting a slide deck
  • Read/plan for Tech Team Bookclub meetings on Machine Learning
  • Create a mapping portfolio by doing GIS challenges (possible??)

So little time…

I have blocked out three hours a week for my own PD. What do I want to prioritize? How scheduled/organized should I be about this?

I want to use the time for a mix of projects. This week, I want to read and write about one academic article related to health care and economics. It’s something I’ve been meaning to do. That should take 2 hours – I’ll polish and post my thoughts on my own time. Then, I can use the rest of the time to investigate what kind of mapping questions I can start looking into. I’m very excited about maps.

Long-term, I can plan to split it up into 2-3 chunks so I can make some progress on each of my projects across time.

  • I’ll come back to the coding and the books later
  • I’ll do Tech Team Bookclub and Stata challenges as they arise at work
  • I’ll plan for a mix of reading & writing about articles and GIS for now
    • Maybe once I have some cool maps made, I’ll do a brown-bag and an internal blog post about spatial data in Africa and how it’s relevant for IDinsight

This week

2 hours: I want to break down what heterodox vs. pluralistic vs. mainstream economics are. The idea of alternative economic models really appeals to me, but I don’t know what the big distinctions or points of conflict are. I’ll find some sources on my own time this week, and on Friday, read them and write up a summary for here.

1 hour: Investigate spatial data available for Kenya, maybe read an article on general spatial data quality in Africa.