Simpson’s Paradox

I’m not sure about you, but I’m a big fan of quick and dirty analysis. If you can take a quick look at something and make a judgment, that can be the most efficient way to approach things.

Caution is required though, as there are many instances where it is not the best approach and can lead to poor decisions that cost a lot in the long run. For example, if you are recruiting thousands of Face-to-Face monthly donors and comparing suppliers, then taking a quick look at attrition and seeing that it is similar is probably not enough. The cost structure, value and phasing of the attrition will play a huge role, so in this case you need more information than attrition alone.

An example I have recently come across is a state-based Australian charity that tested recruiting interstate. They were mailing a combination of data sources: names from a swap (donor exchange) program, a data cooperative, and cold lists. They took a sensible approach to testing interstate acquisition, not rolling out across everything but focusing on the best-performing lists.

With offshore acquisition packs, the time between planning, execution and lodgment can be quite long, so the details can be overlooked or forgotten.

When it came time to look at the results, the question was asked: how did the interstate prospects perform? The topline results were encouraging. Interstate 1 got a slightly lower response rate than the home state, but it was within the acceptable range, and Interstate 2 even outperformed the home state.

Table1

Based on the quick and dirty analysis, it looked great. Expanding fundraising interstate would open up new revenue streams. However, looking at the results in more detail gave a different picture. The interstate test was only undertaken on some of the best-performing swaps and co-ops. So the only way to make a true comparison is to isolate these lists and ignore the poorer lists that were only mailed to the home state.

Table2

So what does this mean?

Interstate 1 has a response rate over 4 percentage points lower than the home state for swaps, and nearly half the home state’s response rate for co-ops. This difference gets hidden in the topline figures because the interstate data is weighted more heavily towards swaps than the home state data is.

Interstate 2, which ‘beat’ the home state overall, underperformed it at the list level. This is an interesting variant of Simpson’s paradox.
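If it helps to see the mechanics, here is a small Python sketch of how the reversal can happen. The volumes and response counts are made up purely for illustration (they are not the charity’s results), and the names `mailings`, `response_rate` and `overall_rate` are my own. It assumes the home state mailed swaps, co-ops and cold lists, while the interstate test mailed only swaps and co-ops, weighted towards swaps.

```python
# Hypothetical figures for illustration only; NOT the charity's actual results.
# Each entry is (pieces mailed, responses received).
mailings = {
    "Home": {"Swaps": (10_000, 800), "Co-ops": (10_000, 400), "Cold": (20_000, 200)},
    "Interstate": {"Swaps": (8_000, 560), "Co-ops": (2_000, 60)},
}


def response_rate(mailed, responses):
    """Response rate for a single list type."""
    return responses / mailed


def overall_rate(lists):
    """Topline response rate: total responses over total pieces mailed."""
    total_mailed = sum(mailed for mailed, _ in lists.values())
    total_responses = sum(responses for _, responses in lists.values())
    return total_responses / total_mailed


# The quick and dirty view: one number per region.
overall = {region: overall_rate(lists) for region, lists in mailings.items()}

# The like-for-like view: only list types mailed in both regions.
shared = sorted(set(mailings["Home"]) & set(mailings["Interstate"]))
by_list = {
    list_type: {
        region: response_rate(*mailings[region][list_type]) for region in mailings
    }
    for list_type in shared
}

for region, rate in overall.items():
    print(f"{region:<11} overall: {rate:.1%}")
for list_type, rates in by_list.items():
    print(f"{list_type:<7} Home {rates['Home']:.1%} vs Interstate {rates['Interstate']:.1%}")
```

With these invented numbers, the interstate mailing wins overall (6.2% against 3.5%) yet loses on both swaps (7% against 8%) and co-ops (3% against 4%). The overall figure is just a weighted average of the list-level rates, so a region that only mails its strongest lists can look better at the topline while being worse on every list it shares with the home state.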

A couple of links are below for a fuller and better explanation than I can give!

https://en.wikipedia.org/wiki/Simpson%27s_paradox

https://www.statslife.org.uk/the-statistics-dictionary/2012-simpson-s-paradox-a-cautionary-tale-in-advanced-analytics

Andy