Q: If drug B has a higher success rate (percentage of cures) than drug A
when given to women, and also when given to men, does it have a higher
success rate when given to people in general?

[2 Apr 1997]

A: Not necessarily, e.g.

| Drug | Women: Success | Women: Failure | Men: Success | Men: Failure |
|---|---|---|---|---|
| A | 85 | 31 | 4 | 5 |
| B | 3 | 1 | 1 | 1 |

Then for women B (75%) is better than A (73%) and for men B (50%) is again better than A (44%), but for people in general A (71%) is better than B (67%).
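The percentages above can be checked with a few lines of Python. This is only a minimal sketch; the names `trial`, `success_rate` and `pooled` are illustrative and not part of the original puzzle.

```python
# Counts from the table above, stored as (successes, failures).
trial = {
    "A": {"women": (85, 31), "men": (4, 5)},
    "B": {"women": (3, 1), "men": (1, 1)},
}

def success_rate(successes, failures):
    """Percentage of patients cured."""
    return 100 * successes / (successes + failures)

def pooled(groups):
    """Add up successes and failures across the given groups."""
    groups = list(groups)
    return (sum(s for s, _ in groups), sum(f for _, f in groups))

rates = {}
for drug, by_gender in trial.items():
    rates[drug] = {g: success_rate(*counts) for g, counts in by_gender.items()}
    rates[drug]["all"] = success_rate(*pooled(by_gender.values()))

for group in ("women", "men", "all"):
    print(group, {drug: round(rates[drug][group]) for drug in rates})
# women {'A': 73, 'B': 75}
# men {'A': 44, 'B': 50}
# all {'A': 71, 'B': 67}
```

Drug A's overall rate is dominated by its many female patients, who do well on either drug, while half of drug B's patients are men, who do badly on either drug.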

Q: What's the smallest possible "paradoxical" situation (i.e.
smallest total number of people)? There are two versions of the problem
depending on whether we allow entries to be 0.

[2 Apr 1997]

A: When I first considered this puzzle I found the following two examples by hand, and wondered whether they were minimal:

Zeros not allowed (20 people):

| Drug | Women: Success | Women: Failure | Men: Success | Men: Failure |
|---|---|---|---|---|
| A | 3 | 4 | 1 | 5 |
| B | 1 | 1 | 1 | 4 |

Zeros allowed (9 people):

| Drug | Women: Success | Women: Failure | Men: Success | Men: Failure |
|---|---|---|---|---|
| A | 2 | 1 | 0 | 1 |
| B | 1 | 0 | 1 | 3 |

In 2011 I found that the first of these was quoted in *Impossible?
Surprising Solutions to Counterintuitive Conundrums* by Julian Havil
(published 2008, ISBN 978-0-691-13131-3, available from Amazon for example). This inspired me
to settle the question with a quick exhaustive computer search.

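It turns out that for the zeros-not-allowed version the 20-person
example above is not *quite* the best possible. The minimum
possible total is 19, and there are two essentially different
solutions. In both of them the roles of the drugs are reversed relative
to the examples above: A has the higher success rate for women and for
men, but B has the higher success rate overall.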

| Drug | Women: Success | Women: Failure | Men: Success | Men: Failure |
|---|---|---|---|---|
| A | 2 | 1 | 2 | 5 |
| B | 3 | 2 | 1 | 3 |

and

| Drug | Women: Success | Women: Failure | Men: Success | Men: Failure |
|---|---|---|---|---|
| A | 2 | 1 | 3 | 5 |
| B | 3 | 2 | 1 | 2 |

For the zeros-allowed version the 9-person example above
*is* minimal, and is essentially the only solution.
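For readers who want to repeat the search, here is a minimal Python sketch of one way it could be done. It is not the program actually used, and it rests on some assumptions: a "paradox" means one drug is strictly better for both women and men while the other is strictly better overall, and every drug/gender group must contain at least one person so that its success rate is defined. The function names are illustrative.

```python
def compositions(total, parts, lowest):
    """Yield every tuple of `parts` integers >= `lowest` that sums to `total`."""
    if parts == 1:
        if total >= lowest:
            yield (total,)
        return
    for first in range(lowest, total - lowest * (parts - 1) + 1):
        for rest in compositions(total - first, parts - 1, lowest):
            yield (first,) + rest

def compare(s1, f1, s2, f2):
    """+1 if rate s1/(s1+f1) beats s2/(s2+f2), -1 if it loses, 0 if equal."""
    # Cross-multiply to stay in exact integer arithmetic.
    lhs, rhs = s1 * (s2 + f2), s2 * (s1 + f1)
    return (lhs > rhs) - (lhs < rhs)

def is_paradox(cells):
    """Cells are A-women, A-men, B-women, B-men, each as success, failure."""
    aws, awf, ams, amf, bws, bwf, bms, bmf = cells
    # Assumption: an empty group has no defined success rate, so reject it.
    if 0 in (aws + awf, ams + amf, bws + bwf, bms + bmf):
        return False
    women = compare(aws, awf, bws, bwf)
    men = compare(ams, amf, bms, bmf)
    overall = compare(aws + ams, awf + amf, bws + bms, bwf + bmf)
    return women != 0 and women == men and overall == -women

def smallest_paradoxes(lowest):
    """Return (minimum total, all paradoxical tables with that total)."""
    total = 8 * lowest
    while True:
        found = [c for c in compositions(total, 8, lowest) if is_paradox(c)]
        if found:
            return total, found
        total += 1

for lowest, label in ((1, "zeros not allowed"), (0, "zeros allowed")):
    total, tables = smallest_paradoxes(lowest)
    print(label, total, len(tables))
```

The list returned for each version contains every relabelling of the same table (swapping the drugs, swapping the genders, or swapping success and failure together with the drugs), so "essentially different" solutions have to be picked out by grouping tables that differ only by such relabellings.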

[20 Mar 2011]

Q: Here's another striking version. A greengrocer sells apples at
a fixed price per fruit, and oranges similarly. Each day an apple costs
more than an orange. I buy fruit on several days. On average, did my
apples cost me more per fruit than my oranges?

[20 Mar 2011]

A: Not necessarily. For example:

- On Monday apples cost 9p each and oranges cost 8p each, and I buy an apple and two oranges.
- On Tuesday apples cost 3p each and oranges cost 2p each, and I buy two apples and an orange.
- Overall I bought three apples which cost a total of 15p, and three oranges which cost a total of 18p. So on average my apples cost 5p each but my oranges cost 6p each.
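The arithmetic in this example can be checked with a short Python sketch; the names `apples`, `oranges` and `average_price` are illustrative.

```python
# (price per fruit in pence, number bought) for Monday and Tuesday.
apples = [(9, 1), (3, 2)]
oranges = [(8, 2), (2, 1)]

def average_price(purchases):
    """Total spent divided by the total number of fruit bought."""
    total_cost = sum(price * count for price, count in purchases)
    total_fruit = sum(count for _, count in purchases)
    return total_cost / total_fruit

# On each day an apple costs more than an orange.
dearer_each_day = all(a[0] > o[0] for a, o in zip(apples, oranges))

print(dearer_each_day)         # True
print(average_price(apples))   # 5.0
print(average_price(oranges))  # 6.0
```

Most of the apples were bought on the cheap day and most of the oranges on the expensive day, which is what reverses the comparison.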

Simpson's paradox can be interpreted as a sign that we're asking
the wrong question. In the drugs trial we shouldn't be asking whether
drug A is better than drug B, but rather why both drugs are more
effective on women than on men. At the greengrocer's we shouldn't be
asking whether apples cost more than oranges, but rather why the prices
of both fruit changed so much on Tuesday.

[20 Mar 2011]

Q: Call the above situation a 2-level paradox, because we're
measuring the drugs' effectiveness at two levels: the gender level and
the overall population level. Is it possible to have a 3-level
paradox? For example, is it possible that drug A has a higher success
rate on people in general, while drug B has a higher success rate on
women and on men, and yet drug A has a higher success rate on each of
young women, old women, young men and old men?

[Haidar Al-Dhalimy, 2 Apr 2021]

A: Here's a 3-level paradox:

| Drug | Young women: Success | Young women: Failure | Old women: Success | Old women: Failure | Young men: Success | Young men: Failure | Old men: Success | Old men: Failure |
|---|---|---|---|---|---|---|---|---|
| A | 2 | 3 | 3 | 1 | 1 | 3 | 1 | 1 |
| B | 1 | 2 | 5 | 2 | 1 | 4 | 4 | 5 |

The better drug for each subpopulation is shown in bold:

- For young women **drug A cures 2 out of 5 = 40%** and drug B cures 1 out of 3 = 33%.
- For old women **drug A cures 3 out of 4 = 75%** and drug B cures 5 out of 7 = 71%.
- For women overall drug A cures 5 out of 9 = 56% and **drug B cures 6 out of 10 = 60%**.
- For young men **drug A cures 1 out of 4 = 25%** and drug B cures 1 out of 5 = 20%.
- For old men **drug A cures 1 out of 2 = 50%** and drug B cures 4 out of 9 = 44%.
- For men overall drug A cures 2 out of 6 = 33% and **drug B cures 5 out of 14 = 36%**.
- Overall **drug A cures 7 out of 15 = 47%** and drug B cures 11 out of 24 = 46%.

By exhaustive search this example is minimal for the zeros-not-allowed version, although there are other minimal examples.
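The level-by-level comparison in the 3-level example can be reproduced with a short Python sketch that pools neighbouring subpopulations in pairs until only the whole population is left. It assumes the number of finest-level groups is a power of two, and the function names are illustrative.

```python
def winner(a, b):
    """Return "A" or "B" for the higher success rate, or "=" for a tie."""
    (a_s, a_f), (b_s, b_f) = a, b
    # Compare a_s/(a_s+a_f) with b_s/(b_s+b_f) by cross-multiplying.
    lhs, rhs = a_s * (b_s + b_f), b_s * (a_s + a_f)
    return "A" if lhs > rhs else "B" if lhs < rhs else "="

def merge_pairs(groups):
    """Pool neighbouring groups: (g0 + g1), (g2 + g3), ..."""
    return [(groups[i][0] + groups[i + 1][0], groups[i][1] + groups[i + 1][1])
            for i in range(0, len(groups), 2)]

def winners_by_level(a_groups, b_groups):
    """List the better drug in every group, from finest level to coarsest."""
    levels = []
    while True:
        levels.append([winner(a, b) for a, b in zip(a_groups, b_groups)])
        if len(a_groups) == 1:
            return levels
        a_groups, b_groups = merge_pairs(a_groups), merge_pairs(b_groups)

# (successes, failures) for young women, old women, young men, old men.
drug_a = [(2, 3), (3, 1), (1, 3), (1, 1)]
drug_b = [(1, 2), (5, 2), (1, 4), (4, 5)]

print(winners_by_level(drug_a, drug_b))
# [['A', 'A', 'A', 'A'], ['B', 'B'], ['A']]
```

The same function works for deeper examples: a 6-level paradox would be given as 32 finest-level groups per drug.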

Inspired by this example I got a bit carried away and found a 6-level paradox. This diagram is transposed relative to the tables above, so each column aggregates pairs of subpopulations from the previous column. This diagram is also available as a PDF document.

By exhaustive search this example is minimal for the zeros-not-allowed version, although there are probably other minimal examples.

I produced a 6-level example because the table still fits on one page, but I stopped there because I'm not sure we'd learn very much by going further. I can't really see any structure in these examples, and I'm just using a brute-force search so I don't have a method for constructing them. I don't even have a bound on the size of minimal examples at a given level.

In this 6-level example the effect simply reverses at each level. However,
I think that in fact we could specify *any* pattern of red text
in the diagram, and then find a population where colouring the better
drug in each subpopulation yields that pattern, although I haven't
proved it.

This page is maintained by Thomas Bending,
and was last modified on 27 June 2021.

Comments, criticisms and suggestions are welcome.
Copyright © Thomas Bending 2021