Tuesday, June 19, 2012

Why cocaine users should learn Bayes' Theorem

Diagnostic tests for diseases and drugs are not perfect. Two common measures of test efficacy are sensitivity and specificity. Precisely, sensitivity is the probability that, given a drug user, the test will correctly identify the person as positive. Specificity is the probability that a drug-free patient will indeed test negative. Even if the sensitivity and specificity of a drug test are remarkably high, the false positives can be more abundant than the true positives when drug use in the population is low.

As an illustrative example, consider a test for cocaine that has 99% specificity and 99% sensitivity. Given a population in which 0.5% of people are cocaine users, what is the probability that a person who tested positive for cocaine is actually a cocaine user? The answer: about 33%. In this scenario, despite the high sensitivity and specificity, two thirds of the people who test positive for cocaine are not cocaine users.

To calculate this counter-intuitive result, we need Bayes' Theorem. A geometric derivation uses a Venn Diagram representing the event that a person is a drug user and the event that a person tests positive as two circles, each of area equal to the probability of the particular event occurring when one person is tested: $P(\mbox{user})$ and $P(+)$, respectively. Since these events can both happen when a person is tested, the circles overlap, and the area of the overlapping region is the probability that the events both occur [$P(\mbox{user and }+)$].

We now write a formula for the quantity we are interested in: the probability that a person who tests positive is indeed a drug user, $P(\mbox{user} | +)$. (Read the bar as "given that"; this is a 'conditional probability'.) Conditioning on a positive test restricts us to the world of the positive test circle. The fraction of +'s that are actually drug users is the fraction of the '+ test' circle that is overlapped by the 'drug user' circle:
$P(\mbox{user} | +) = \dfrac{P(\mbox{user and } +)}{ P(+)}$.

We bring the sensitivity into the picture by considering the fraction of the drug users circle that is occupied by positive test results:
$P(+ | \mbox{user}) = \dfrac{P(\mbox{user and }+)}{P(\mbox{user})}$.

Equating the two different ways of writing the joint probability $P(\mbox{user and }+)$, we derive Bayes' Theorem:
$P(\mbox{user} | +) = \dfrac{P(+ | \mbox{user}) P(\mbox{user})}{P(+)}$.

We can already see that, in a population with low drug use, the sensitivity gets multiplied by a small number, $P(\mbox{user})$. Since we do not directly know $P(+)$, we rewrite it by considering the two mutually exclusive and exhaustive ways a person can test positive: as a drug user or as a non-user. We weight each conditional probability by the probability of the corresponding group:
$P(+) = P(+ | \mbox{user}) P(\mbox{user}) + P(+ | \mbox{non-user}) P(\mbox{non-user})$
        $= P(+ | \mbox{user}) P(\mbox{user}) + [1 - P(- | \mbox{non-user})] [1-P(\mbox{user})]$
This is where the specificity enters the picture, since $P(- | \mbox{non-user})$ is the specificity. Substituting the known values gives $P(+) = (0.99)(0.005) + (0.01)(0.995) = 0.0149$. Finally, using Bayes' Theorem, we calculate the probability that a person who tests positive is actually a drug user:
$P(\mbox{user} | +) = \dfrac{(99\%) (0.5\%) }{ (1.49\%) } \approx 33\%$.
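The calculation above can be checked numerically. The following is a minimal sketch in Python; the function name `positive_predictive_value` and its parameter names are my own choices for illustration, not part of any library.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return P(user | +) using Bayes' Theorem.

    sensitivity = P(+ | user), specificity = P(- | non-user),
    prevalence  = P(user). All names here are illustrative.
    """
    # P(+ | non-user) is the false positive rate, 1 - specificity.
    false_positive_rate = 1.0 - specificity
    # Law of total probability: P(+) over users and non-users.
    p_positive = (sensitivity * prevalence
                  + false_positive_rate * (1.0 - prevalence))
    # Bayes' Theorem: P(user | +) = P(+ | user) P(user) / P(+).
    return sensitivity * prevalence / p_positive


# Values from the cocaine test example in the text.
p_user_given_positive = positive_predictive_value(0.99, 0.99, 0.005)
print(f"P(user | +) = {p_user_given_positive:.3f}")  # prints 0.332
```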

The reason for this surprising result is that most (99.5%) of the people tested are not drug users, so even the small probability (1%) that the test incorrectly flags a non-user produces a sizable number of false positives. While the test is good at correctly identifying cocaine users, this group is so small a part of the population that the positives from cocaine users end up outnumbered by the positives from non-users. This result has important implications when zero-tolerance drug policies based on drug tests are implemented in the workforce.
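
One way to check this reasoning is to count people instead of probabilities. As an illustration, assume a hypothetical group of 100,000 people is tested (the group size is my assumption, chosen only to give round numbers; the rates are those of the example):
$\mbox{users} = 0.005 \times 100{,}000 = 500$, of whom $0.99 \times 500 = 495$ test positive;
$\mbox{non-users} = 0.995 \times 100{,}000 = 99{,}500$, of whom $0.01 \times 99{,}500 = 995$ test positive;
$P(\mbox{user} | +) = \dfrac{495}{495 + 995} = \dfrac{495}{1490} \approx 33\%$.
The false positives (995) outnumber the true positives (495) by roughly two to one, which is the same answer Bayes' Theorem gives.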

The same idea holds for diagnostic tests for rare diseases: the number of false positives could be greater than the number of positives for people that actually have the disease.
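
To see how strongly the result depends on how rare the condition is, here is a small Python sketch that repeats the calculation for several prevalences. The function name `prob_condition_given_positive` and the list of prevalences are hypothetical choices for illustration; the 99% sensitivity and specificity are carried over from the cocaine example.

```python
def prob_condition_given_positive(sensitivity, specificity, prevalence):
    """P(condition | +) from Bayes' Theorem (illustrative helper)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical prevalences, from very rare to fairly common.
prevalences = [0.001, 0.005, 0.01, 0.05, 0.2]

results = {
    prevalence: prob_condition_given_positive(0.99, 0.99, prevalence)
    for prevalence in prevalences
}

for prevalence, probability in results.items():
    print(f"prevalence {prevalence:.1%}: P(condition | +) = {probability:.1%}")
```

Under these assumptions, a positive result is only a coin flip at 1% prevalence, because the true positives ($0.99 \times 0.01$) and the false positives ($0.01 \times 0.99$) are then equally numerous. Below that prevalence the false positives dominate.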

[1] http://en.wikipedia.org/wiki/Bayes'_theorem See 'drug testing'. This is where I obtained the example.

Tuesday, June 12, 2012

Simpson's Paradox

Simpson's paradox is a non-intuitive phenomenon in which a correlation that is present in each of several groups is the opposite of what is found when the groups are amalgamated. It illustrates the need to be skeptical of reported statistics that may depend drastically on how the data are aggregated [1], and to be aware of lurking variables that may negate a conclusion about what causes the correlation in the data.

The most interesting example comes from a case in 1973 where UC Berkeley was sued for discrimination against women in graduate school admissions. The aggregate acceptance percentages show that a male applicant was more likely to be admitted than a female applicant (44% vs. 35%). At first glance, one may propose the causal conclusion that Berkeley is biased against females.

However, if we partition the data by department to look for the most discriminatory department, we find that, in four of the six departments, a female applicant is more likely to be accepted than a male applicant. In the remaining two departments, the disparity between men and women is not nearly as drastic as in the amalgamated data above. These data refute the causal conclusion that Berkeley has a significant bias against women.

This reversal of the correlation when the aggregated data set is partitioned (Simpson's paradox) arises from a lurking variable that had not been considered when the lawsuit was filed, namely the department to which one applies. Let us look at the number of males and females that apply to each particular department. We see that the least competitive departments, A and B, are heavily dominated by male applicants, while the most competitive departments, E and F, are dominated by female applicants.

The reason that, in the amalgamated data, a significantly higher percentage of male applicants than female applicants was accepted is that females applied to more competitive departments than males did. Thus, as a whole, a male applicant was more likely to be accepted to Berkeley. But this is because, according to the data, a woman was more likely to apply to a department with a lower average acceptance rate.
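
The mechanism can be reproduced with a toy data set. The sketch below uses hypothetical counts that I made up for illustration (they are NOT the Berkeley data): an "easy" department that attracts mostly men and a "hard" department that attracts mostly women. The department names and variable names are likewise assumptions.

```python
# Hypothetical counts, NOT the Berkeley data: (applicants, admitted).
applications = {
    "easy": {"men": (800, 480), "women": (100, 70)},
    "hard": {"men": (200, 40), "women": (900, 225)},
}

# Acceptance rate of each sex within each department.
per_department = {
    department: {
        sex: admitted / applicants
        for sex, (applicants, admitted) in groups.items()
    }
    for department, groups in applications.items()
}


def pooled_rate(sex):
    """Acceptance rate for one sex after amalgamating all departments."""
    applicants = sum(groups[sex][0] for groups in applications.values())
    admitted = sum(groups[sex][1] for groups in applications.values())
    return admitted / applicants


overall = {sex: pooled_rate(sex) for sex in ("men", "women")}

for department, rates in per_department.items():
    print(f"{department}: men {rates['men']:.1%}, women {rates['women']:.1%}")
print(f"overall: men {overall['men']:.1%}, women {overall['women']:.1%}")
```

In this made-up example, women have the higher acceptance rate in both departments (70% vs. 60% and 25% vs. 20%), yet men have the higher rate overall (52% vs. 29.5%). The pooled rate is an average of the department rates weighted by where each group applies, and here most women apply to the department with the low acceptance rate.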

Several other real-life examples of Simpson's paradox, such as batting averages, kidney stone treatments, and birth weights, can be found on the Wikipedia page [2], from which these data were taken.

[1] P. J. Bickel, E. A. Hammel, J. W. O'Connell. Sex Bias in Graduate Admissions: Data from Berkeley. Science 187 (4175), pp. 398-404. 1975.
[2] http://en.wikipedia.org/wiki/Simpson's_paradox