Saturday, August 18, 2012

How a statistically inept jury led to a wrongful conviction

In 1999, Sally Clark of Britain was wrongly convicted of murdering her two infant sons that actually died of sudden infant death syndrome (SIDS). "SIDS is the unexpected, sudden death of a child under age 1 in which an autopsy does not show an explainable cause of death. [1]" The case was overturned a little more than three years later and Sally was released. In 2007, Sally was tragically found dead in her home due to alcohol over-intoxication.

An "expert" witness pediatrician Roy Meadow served for the case, and he is thought to have played a major role in convincing the jury of the Sally Clark's guilty verdict by making two major statistical errors in accessing the probability of Sally Clark's innocence. [In my opinion, using arguments of probability instead of hard evidence for convicting criminals is not so just, but let us proceed with this premise regardless.]

The assumption of independence.
Roy Meadow's Claim 1: Data indicate that 1/8,543 infants born from mothers in class A* die of SIDS. Sally belongs to class A. Thus, the probability of Sally's first and second child dying of SIDS is (1/8,543)(1/8,543) = 1 in 73 million.

The data say that, given a randomly selected infant born from a mother in class A, the probability that this newborn will die of SIDS is 1/8,543. The probability that Sally's first child would die of SIDS is then 1/8,543. However, the probability that her second infant would die of SIDS is not 1/8,543 (which is what Roy Meadow assumed to obtain the 1 in 73 million) because we now know something more about Sally: her first child had died of SIDS. In fact, since SIDS is likely due to genetic factors, the probability of Sally's second child dying of SIDS is almost certainly much higher than 1/8,543 because we now know Sally might carry a genetic element that predisposes her children to SIDS. Instead, Roy Meadow made the assumption that Sally's second infant dying of SIDS is completely independent of the event of her first infant's death by SIDS and declared the probability of her second son dying as 1/8,543 in calculating the 1 in 73 million probability.

The prosecutor's fallacy.
Roy Meadow's Claim 2: The probability that two infants of the same mother both naturally die of SIDS (not by murder) is very small. Thus, the probability that Sally Clark is innocent is comparably small.

For simplicity, let us neglect all other possible explanations for the death of Sally Clark's two infants and consider that either one of the two happened: (1) Sally Clark murdered both of her infants. (2) Both infants died of SIDS. Let us denote two events as:
$I$: Sally Clark is innocent.
$E$: the evidence that two of Sally's infants died is observed.

$P(E | I)$ is the probability that, given Sally did not murder her two infants, the evidence would be observed. Since we are neglecting any other explanations of the death of Sally's two infants, this corresponds to hypothesis (2). If even Ray Meadow's underestimate of $P(E|I)$ in Claim 1 were correct, we can still negate Ray Meadow's Claim 2.

$P(I | E)$ is the probability that, given the evidence, Sally Clark is innocent. $P(I | E)$ is the most important quantity that the jury would like to know for a basis in its decision and the quantity that Ray Meadow wrongly assumed to be equal to $P(E | I)$. This is precisely the prosecutor's fallacy: assuming that $P(I | E) = P(E | I)$. In words, the fallacy is that, because the likelihood of Sally's two children both dying without her murdering them is very small, the probability of Sally's innocence given the observed evidence is comparably small.

We show that they are not equal using Bayes' Theorem, derived in Why cocaine users should learn Bayes' Theorem. "Bayes Theorem allows you to separate how likely alternative explanations of an event are, from how likely it was that the event should have happened in the first place. [2]" Relating the conditional probabilities $P(E | I)$ and $P(I | E)$, we immediately see that they are not equal, but carry on to investigate how exactly they differ:

$P(I | E) = \dfrac{P(E | I) P(I) }{ P(E)}$.

Next, rewriting the probability that the evidence is observed by considering the only two ways one may observe the evidence, namely by Sally being innocent or not innocent, $P(E) = P(E | I) P(I) + P(E | \mbox{~}  I)P(\mbox{~} I)$ where $\mbox{~}$ denotes a negation so that $P(\mbox{~} I) = 1 - P(I)$ is the probability Sally is not innocent. Substitute this expression for $P(E)$ into Bayes' Theorem above to arrive at:

$P(I | E)=\dfrac{P(E | I)P(I)}{P(E | I) P(I) + P(E | \mbox{~} I) P(\mbox{~} I)}$.

Well, $P(E | \mbox{~}I)$ is the probability that the evidence would be observed given that Sally murdered her two children-- this is one. Dividing the numerator and denominator of the right hand side by $P(\mbox{~}I)$ brings in a quantity which is the ratio of the probability of an event happening to that of it not happening-- the odds.

$P(I | E) = \dfrac{P(E | I) odds(I) }{ P(E | I) odds(I) + 1}$,

and, according to the prosecutor's claim, $P(E | I)$ is very small so we can say that:

$P(I | E) \approx P(E | I) odds(I)$

$odds(I)$ is not conditioned on any evidence. Although we don't know its value, the odds that a random mother from class A (Sally is essentially this random mother chosen since we don't have any evidence on her) will not murder her two children is quite larger than one. The large $odds(I)$ quantity in this case makes $P(I | E) >> P(E | I)$-- invalidating Roy Meadow's claim 2. Just because the probability of observing the evidence if the defendant were innocent is very small, the probability of the defendant being innocent given the evidence, which may be very valuable to a jury, is not necessarily very small. An estimate in [2] estimates $P(I | E)$ to be 2/3 in Sally's case-- a far distance from the 1 in 73 million figure obtained from making two serious statistical errors that ruined someone's life.

[2] A much better (but longer) account than mine here.
*Class A: infants born from non-smoking, affluent families with a mother over 26 [2].


  1. Awesome! A very good example of why people should know basic probability.

    I may be wrong, but i think there's a slight mistake in the paragraph where you start talking about the odds, it should say P(E|~I) instead of P(E|I) and P(~I) instead of P(I), since you're saying it's the probability of observing the evidence given that Sally is not innocent, which is indeed 1.

    1. Thanks! You are right. Oops, the "~" was in my html code, but the LaTeX interpreter needs "\mbox{~}".