I’m pretty sure I just solved life

Disclaimer: I was a little drunk on power (calculations) when I wrote this, but it’s me figuring out that econometrics is something I might want to specialize in!

I think I just figured out what I want to do with the rest of my career.

I want to contribute to how people actually practice data analysis in the development sector from the technical side.

I want to write about study design and the technical issues that go into running a really good evaluation, and I want to produce open source resources to help people understand and implement the best technical practices.

This is always something that makes me really excited. I don’t think I have a natural/intuitive understanding of some of the technical work, but I really enjoy figuring it out.

And I love writing about/explaining technical topics when I feel like I really “get” a concept.

This is the part of my current job that I’m most in love with. Right now, for example, I’m working on a technical resource to help IDinsight do power calculations better. And I can’t wait to go to work tomorrow and get back into it.

I’ve also been into meta-analysis papers that bring multiple studies together. In general, the meta-practices, including ethical considerations, of development economics are what I want to spend my time working on.

I’ve had this thought before, but I haven’t really had a concept of making that my actual career until now. But I guess I’ve gotten enough context now that it seems plausible.

I definitely geek out the most about these technical questions, and I really admire people who are putting out resources so that other people can geek out and actually run better studies.

I can explore the topics I’m interested in, talk to people who are doing cool work, create practical tools, and link these things that excite me intellectually to having a positive impact in people’s lives.

My mind is already racing with cool things to do in this field. Ultimately, a website that is essentially an encyclopedia of development economics best practices would be so cool. A way to link all open source tools and datasets and papers, etc.

But top of my list for now is doing a good job on (and enjoying!) this power calculations project at work. If it’s as much fun as it was today, I will be in job heaven.

Why you should convert categorical variables into multiple binary variables

Take the example of a variable reporting if someone is judged to be very poor, poor, moderately rich, or rich. This could be the outcome of a participatory wealth ranking (PWR) exercise like that used by Village Enterprise.

In a PWR exercise, local community leaders can identify households that are most vulnerable. These rankings can then be used to target a development program (like VE’s graduation-out-of-poverty program that combines cash transfers with business training) to the community members that are most in need.

Let’s say that you want to include the PWR results in a regression analysis as a covariate. You have a dataset of all the relevant variables for each household, including a variable that records whether the household was ranked in the PWR exercise as very poor, poor, moderately rich, or rich.

You need to convert this string variable (text) into a numeric value. You could assign each option a value from 1 to 4, with 1 being “very poor” and 4 meaning “rich” … but you shouldn’t use this directly in your regression.

If you use a variable that moves from 1 to 2 to 3 to 4, you’re implying a linear relationship between those values: that the effect on your outcome variable of going from very poor (1) to poor (2) is the same as the effect of going from poor (2) to moderately rich (3). But you don’t know the real relationship between the different PWR levels, since the ranking is ordinal rather than cardinal. You can’t make the linearity assumption.
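To see the constraint concretely, here’s a quick sketch in Python (with made-up group means): a single slope on a 1–4 coding forces equal predicted steps between adjacent ranks, no matter what the true gaps are.

```python
import numpy as np

# Suppose the true mean outcome by rank is non-linear (numbers invented):
codes = np.array([1, 2, 3, 4])                     # very poor .. rich
true_means = np.array([10.0, 10.5, 14.0, 15.0])    # true gaps: 0.5, 3.5, 1.0

# Fit y = a + b * code by least squares
b, a = np.polyfit(codes, true_means, 1)
fitted = a + b * codes
print(fitted)  # every adjacent gap is forced to equal the slope b
```

The fitted values step up by exactly the same amount each time, even though the true jump from “poor” to “moderately rich” is seven times the jump from “very poor” to “poor.”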

So instead, you should represent the ranking with four binary (dummy) variables: ranked “very poor” or not? “Poor” or not? “Moderately rich” or not? “Rich” or not? (As we’ll see below, only three of them actually enter the regression.)

This Stata support page does a great job of summarizing how to apply this in your regression code, with easy shortcuts for creating binary variables from a categorical one. I like:

reg y x i.pwr
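If you’re working in Python instead of Stata, the same expansion can be sketched with pandas (the variable names here are hypothetical):

```python
import pandas as pd

# Hypothetical household data with the PWR ranking stored as text
df = pd.DataFrame({"pwr": ["very poor", "poor", "moderately rich", "rich", "poor"]})

# Make it an ordered categorical so "very poor" is the first (base) level
order = ["very poor", "poor", "moderately rich", "rich"]
df["pwr"] = pd.Categorical(df["pwr"], categories=order, ordered=True)

# drop_first=True omits the base level, like Stata's i.pwr does with its base
dummies = pd.get_dummies(df["pwr"], prefix="pwr", drop_first=True)
print(list(dummies.columns))
```

Note that without the explicit category order, pandas would sort the labels alphabetically and drop “moderately rich” as the base instead of “very poor.”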

But how do you interpret the results?

When you create dummies (binary variables) out of a categorical variable, you use one of the group dummies as the reference group and don’t actually include it in the regression.

By default, the reference group is the lowest-valued category, in this case “very poor.” So the regression will include three dummies, not four, and being “very poor” is the base condition against which the other rankings are compared.

Let’s say there is a statistically significant, positive coefficient on the “moderately rich” dummy in your regression results. That means that, compared to the base condition of being very poor, being moderately rich has a positive effect on your outcome variable.
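Here’s a sketch of that interpretation in Python with statsmodels, on simulated data where the true “bumps” over the base group are made up, so we know what the coefficients should recover:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
order = ["very poor", "poor", "moderately rich", "rich"]
n = 400
pwr = rng.choice(order, size=n)

# Hypothetical outcome: baseline 10 for "very poor", plus an invented
# bump for each richer rank, plus noise
bump = {"very poor": 0.0, "poor": 1.0, "moderately rich": 2.0, "rich": 3.0}
y = 10 + np.array([bump[p] for p in pwr]) + rng.normal(0, 1, n)

df = pd.DataFrame({"y": y, "pwr": pd.Categorical(pwr, categories=order)})

# C(pwr) expands the categorical into dummies; the first level
# ("very poor") is the omitted reference group
fit = smf.ols("y ~ C(pwr)", data=df).fit()
print(fit.params)
```

Each reported coefficient is the difference in mean outcome relative to the “very poor” base group, which is exactly how you’d read the “moderately rich” dummy in the example above.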

Özler: Decrease power to detect only a meaningful effect


Reading about power, I found an old World Bank Impact Evaluations blog post by Berk Özler on the perils of basing your power calcs on standard deviations without relating those SDs back to the real-life context.

Özler summarizes his main points quite succinctly himself:


  • Think about the meaningful effect size in your context and given program costs and aims.
  • Power your study for large effects, which are less likely to disappear in the longer run.
  • Try to use all the tricks in the book to improve power and squeeze more out of every dollar you’re spending.

He gives a nice, clear example to demonstrate: a 0.3 SD detectable effect size sounds impressive, but for some datasets it would translate to only a 5% improvement, which might not be meaningful in context:

“If, in the absence of the program, you would have made $1,000 per month, now you’re making $1,050. Is that a large increase? I guess, we could debate this, but I don’t think so: many safety net cash transfer programs in developing countries are much more generous than that. So, we could have just given that money away in a palliative program – but I’d want much more from my productive inclusion program with all its bells and whistles.”

Usually (in an academic setting), your goal is to have enough power to detect even a small effect size so that you can get a significant result. But Özler makes the opposite point: it can be advantageous to power your study only for the smallest effect that would actually be meaningful, accepting less power against tiny effects in exchange for a smaller, cheaper sample.
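The tradeoff is easy to see with a standard power calculation. A sketch using statsmodels (a simple two-arm individual-level design, ignoring clustering; all numbers illustrative):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Required sample size per arm for a two-sided test at alpha = 0.05,
# power = 0.80, across different minimum detectable effects (in SDs)
for mde in (0.1, 0.2, 0.3, 0.5):
    n = analysis.solve_power(effect_size=mde, alpha=0.05, power=0.8)
    print(f"MDE = {mde:.1f} SD -> about {n:.0f} per arm")
```

Raising the minimum detectable effect from 0.2 SD to 0.5 SD cuts the per-arm sample from roughly 394 to 64, which is exactly the cost saving Özler is pointing at.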

He also advises, like the article I posted about yesterday, that piloting could help improve power calculations via better ICC estimates: “Furthermore, try to get a good estimate of the ICC – perhaps during the pilot phase by using a few clusters rather than just one: it may cost a little more at that time, but could save a lot more during the regular survey phase.”
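Part of why the ICC estimate matters so much: with m units per cluster and intracluster correlation ρ, the effective sample size shrinks by the Kish design effect, DEFF = 1 + (m − 1)ρ, and the required sample scales with it. A quick sketch (the ICC values are made up):

```python
def design_effect(cluster_size: int, icc: float) -> float:
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

# With 20 households per cluster, revising the ICC estimate from 0.01
# to 0.10 after a pilot more than doubles the required sample
for icc in (0.01, 0.05, 0.10):
    print(f"ICC = {icc:.2f} -> DEFF = {design_effect(20, icc):.2f}")
```

So a pilot that pins down the ICC with a few clusters, as Özler suggests, can prevent a badly under- (or over-) sized main survey.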

My only issue with Özler’s post is his chart, which shows the tradeoff between effect size and the number of clusters. The horizontal axis is labeled “Total number of clusters” – per arm or in total, Berk?!? It’s per arm, not total across all arms. We need more standardized, intuitive language for describing sample sizes in power calcs.