Simpson's Paradox
A treatment improves outcomes for men. It also improves outcomes for women. Yet when you combine the data, it seems to hurt outcomes overall. This is Simpson’s paradox: a trend that appears in several groups can disappear or reverse when the groups are combined. The aggregate tells a different story than the parts.
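A few lines of arithmetic make the reversal concrete. The counts below are purely illustrative, chosen so that the treatment wins within each group but loses once the groups are pooled (the group labels and numbers are hypothetical, not real trial data):

```python
# (group, arm) -> (successes, trials). Illustrative counts only:
# the treatment is tried mostly on the harder-to-treat group, so
# pooling reverses the within-group comparison.
data = {
    ("men",   "treated"):   (81, 87),
    ("men",   "untreated"): (234, 270),
    ("women", "treated"):   (192, 263),
    ("women", "untreated"): (55, 80),
}

def rate(successes, trials):
    return successes / trials

# Within each group, the treated arm does better...
for group in ("men", "women"):
    t = rate(*data[(group, "treated")])
    u = rate(*data[(group, "untreated")])
    print(f"{group}: treated {t:.1%} vs untreated {u:.1%}")

# ...but summing counts across groups reverses the comparison.
pooled = {}
for (group, arm), (s, n) in data.items():
    ps, pn = pooled.get(arm, (0, 0))
    pooled[arm] = (ps + s, pn + n)

t = rate(*pooled["treated"])
u = rate(*pooled["untreated"])
print(f"overall: treated {t:.1%} vs untreated {u:.1%}")
```

Running this shows the treated arm ahead in both groups yet behind overall: the same counts, sliced two ways, point in opposite directions.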
The classic example: UC Berkeley was accused of discriminating against women in graduate admissions. Overall, men were admitted at higher rates. But examined department by department, women were admitted at equal or higher rates. The resolution: women disproportionately applied to the most competitive departments, where admission rates were low for everyone. Aggregation hid the real pattern.
The paradox isn’t a quirk of statistics — it’s a warning about causation. Which is the “real” effect: the combined data or the disaggregated? It depends on what you’re asking. If departments are the decision-makers, department-level data matters. If university-wide policy is the question, aggregate data might be relevant. The numbers don’t tell you which question to ask.
The mechanism is confounding. Some hidden variable (department competitiveness, patient severity, treatment allocation) correlates with both the grouping and the outcome. Aggregating erases the confound’s structure, creating the illusion of a different relationship. The same data supports opposite conclusions depending on how you slice it.
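The mechanism can be stated as a weighted average: each arm's pooled rate is its per-group rates weighted by how many of that arm's trials fall in each group, and a confound means the two arms weight the groups differently. A minimal sketch, with hypothetical rates and trial counts:

```python
def pooled_rate(rates_and_ns):
    """Trial-count-weighted average of per-group success rates."""
    total = sum(n for _, n in rates_and_ns)
    return sum(r * n for r, n in rates_and_ns) / total

# (rate, trials) per group. The treated arm wins in BOTH groups
# (0.93 > 0.87 and 0.73 > 0.69), but its trials sit mostly in the
# hard group, while the untreated arm's sit mostly in the easy one.
treated   = [(0.93, 87),  (0.73, 263)]   # weighted toward the hard group
untreated = [(0.87, 270), (0.69, 80)]    # weighted toward the easy group

# The unequal weights drag the treated arm's pooled rate below the
# untreated arm's, reversing the within-group comparison.
assert pooled_rate(treated) < pooled_rate(untreated)
```

Equalize the weights (give both arms the same group composition) and the reversal disappears; that is exactly what the disaggregated view does.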
The practical lesson: be suspicious of aggregate statistics. Ask what happens when you break them down. And be suspicious of disaggregated statistics too — ask what happens when you combine them. If the answers conflict, you’ve found a confound worth understanding.
More fundamentally: data doesn’t speak for itself. Every summary statistic involves choices about grouping and aggregation. Those choices shape what the data seems to say. Simpson’s paradox is just the extreme case of a universal problem: the frame determines the picture.
Related: survivorship bias, signal and noise, epistemology, selection, map and territory