In probability and statistics, Simpson's paradox (or the Yule-Simpson effect) is an apparent paradox in which a correlation (trend) present in different groups is reversed when the groups are combined. This result is often encountered in social-science and medical-science statistics, and it occurs when frequency data are hastily given causal interpretations. Simpson's paradox disappears when causal relations are derived systematically, i.e. through formal analysis.
It is easiest to understand through examples.
1) David Justice had a higher batting average than Derek Jeter in both 1995 and 1996, but a lower one when the two years are combined.
Player | 1995 | 1996 | Combined |
---|---|---|---|
Derek Jeter | 12/48 .250 | 183/582 .314 | 195/630 .310 |
David Justice | 104/411 .253 | 45/140 .321 | 149/551 .270 |
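The reversal in the table above can be checked with a few lines of Python. This is only a minimal sketch: the hits and at-bats are copied from the table, while the names `records`, `batting_average`, and `combined` are made up for this illustration.

```python
# (hits, at-bats) per season, copied from the table above.
# The variable and function names are illustrative assumptions.
records = {
    "Derek Jeter": {"1995": (12, 48), "1996": (183, 582)},
    "David Justice": {"1995": (104, 411), "1996": (45, 140)},
}


def batting_average(hits, at_bats):
    """Batting average is simply hits divided by at-bats."""
    return hits / at_bats


def combined(seasons):
    """Pool hits and at-bats across seasons before dividing."""
    hits = sum(h for h, _ in seasons.values())
    at_bats = sum(ab for _, ab in seasons.values())
    return hits, at_bats


for player, seasons in records.items():
    per_year = {
        year: round(batting_average(*season), 3)
        for year, season in seasons.items()
    }
    total = round(batting_average(*combined(seasons)), 3)
    print(player, per_year, "combined:", total)
```

The combined figure is not the mean of the two yearly averages. It is weighted by at-bats, so Jeter's total is dominated by his strong 1996 (582 at-bats) and Justice's by his weaker 1995 (411 at-bats).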
2) A comparison of two treatments for kidney stones.
Measure | Treatment A | Treatment B |
---|---|---|
Success rate | 78% (273/350) | 83% (289/350) |
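Looking only at the overall success rate, treatment B appears to be better.
Broken down by stone size, however, the picture reverses: treatment A has the higher success rate for both small and large stones.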
Stone size | Treatment A | Treatment B |
---|---|---|
Small Stones | 93% (81/87) | 87% (234/270) |
Large Stones | 73% (192/263) | 69% (55/80) |
Both | 78% (273/350) | 83% (289/350) |
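One way to see why the totals flip is to write each overall rate as a case-weighted average of the per-size rates. The sketch below assumes nothing beyond the counts in the table; `outcomes`, `rate`, `case_mix`, and `overall_rate` are names invented for this example.

```python
# (successes, cases) by stone size, copied from the table above.
# All names here are illustrative assumptions.
outcomes = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, cases):
    """Success rate within one group."""
    return successes / cases


def case_mix(groups):
    """Share of a treatment's cases that fall in each stone-size group."""
    total = sum(n for _, n in groups.values())
    return {size: n / total for size, (_, n) in groups.items()}


def overall_rate(groups):
    """Overall rate = sum over groups of (case share * group rate)."""
    mix = case_mix(groups)
    return sum(mix[size] * rate(s, n) for size, (s, n) in groups.items())


for treatment, groups in outcomes.items():
    shares = {size: round(w, 2) for size, w in case_mix(groups).items()}
    print(treatment, "case mix:", shares,
          "overall:", round(overall_rate(groups), 2))
```

Treatment A was used mostly on large stones (263 of 350 cases), which are harder to treat under either treatment, while treatment B was used mostly on small stones (270 of 350). The unequal case mix, not the treatments themselves, is what pulls A's overall rate below B's.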
At best, Simpson's Paradox is used to argue that association is not causation.
At worst, Simpson's Paradox is used to argue that induction is impossible in observational studies.
References.
http://en.wikipedia.org/wiki/Simpson's_paradox
web.augsburg.edu/~schield/MiloPapers/99ASA.pdf
It has been a while since I last used Tistory..
Writing posts here is really inconvenient.. ;;