Predictive Policing, Gender Bias, and Median Survival
“Predictive policing” driven by bias
A recent New York Times story describing the increasing practice of “predictive policing” does a good job of describing what this is all about, at least in broad strokes:
“The strategy… combines elements of traditional policing, like increased attention to crime ‘hot spots’ and close monitoring of recent parolees. But it often also uses other data, including information about friendships, social media activity and drug use, to identify ‘hot people’ and aid the authorities in forecasting crime.”
The devil, however, is in the details. Most statistical models do reasonably well when predicting an average; but predicting an individual’s behavior is a different story and the uncertainty around the prediction can be large enough to render it useless. In this story, we are told nothing about the potential errors associated with prediction of individual behavior. Models are most useful when the data on which they are estimated are a randomly selected subset from a relevant population. Here, it seems that models are based on information collected from those who have already been in trouble with the law.
First, that is not a random sample from any population. Second, it is well known that minorities, poor people and others disadvantaged in some way are already more likely to be jailed or convicted than a white person who commits the same crime. Therefore, these “predictive models” will perpetuate existing biases and increase unfairness in the administration of justice. A little statistical knowledge could have completely transformed this story. —Alicia Carriquiry, Iowa State University
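The gap between predicting an average and predicting one person can be made concrete with a small sketch. It assumes independent, roughly normal outcomes with a known spread, and the numbers (a score with standard deviation 10, estimated from 400 people) are hypothetical and not taken from the story. The function name interval_half_widths is likewise just an illustration.

```python
import math

def interval_half_widths(sigma, n, z=1.96):
    """Approximate 95% half-widths for a group average and for one new individual.

    Assumes independent, roughly normal outcomes with known spread sigma,
    a simplification made only for this illustration.
    """
    # Uncertainty about the average shrinks as the sample grows.
    mean_half_width = z * sigma / math.sqrt(n)
    # Uncertainty about one new person never drops below about z * sigma.
    individual_half_width = z * sigma * math.sqrt(1 + 1 / n)
    return mean_half_width, individual_half_width

# Hypothetical numbers: standard deviation 10, sample of 400 people.
mean_hw, individual_hw = interval_half_widths(sigma=10.0, n=400)
print(f"average:    +/- {mean_hw:.2f}")
print(f"individual: +/- {individual_hw:.2f}")
```

Under these assumptions the interval for an individual is about twenty times wider than the interval for the average, and collecting more data barely narrows it.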
Gender bias in academic grants or Simpson’s Paradox?
Sometimes, numbers tell a different story when aggregated than when they are divided into subgroups. Such is the case with Simpson’s paradox: a strange situation in which looking at categories can tell the opposite story from the aggregate. Imagine for a moment that you are comparing women to men on their rates of flipping heads on each of two coins. The first coin is a fair coin, so that (on average) a flip results in heads 50 percent of the time. The second coin is weighted, so that (on average) a flip turns up heads 10 percent of the time. Unfortunately, we have far fewer female participants than expected, so there are very few flips of the fair coin by women, only ten. But we have 100 flips of the weighted coin by women, and 100 flips of each coin by men. Nonetheless, the data just happen to turn out perfectly, and in each case, the expected number of heads actually occurs:
Women, fair coin: 5 heads in 10 flips
Women, weighted coin: 10 heads in 100 flips
Men, fair coin: 50 heads in 100 flips
Men, weighted coin: 10 heads in 100 flips
Note that men and women get exactly half heads for the fair coin. For the weighted coin, both men and women get exactly 10 percent heads.
Yet when you add up what happens to women versus men, we see a different story: women made 110 flips in all, and only 15 were successful. That is a success rate of 13.6 percent. In contrast, men flipped a total of 200 times and had 60 successes, a success rate of 30 percent. Put together, the data suggest that the coins favor the men. But when you look at the individual groups, it’s clear that, for each coin, men and women fared the same.
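The arithmetic of the coin example can be checked with a short sketch. The counts are the ones described in the text. The dictionary layout and the function names rate and aggregate_rate are just one convenient way to organize them.

```python
# (heads, flips) for each group and coin, as described in the text.
results = {
    "women": {"fair": (5, 10), "weighted": (10, 100)},
    "men": {"fair": (50, 100), "weighted": (10, 100)},
}

def rate(heads, flips):
    """Success rate for a single coin."""
    return heads / flips

def aggregate_rate(by_coin):
    """Success rate after pooling all flips, ignoring which coin was used."""
    total_heads = sum(heads for heads, _ in by_coin.values())
    total_flips = sum(flips for _, flips in by_coin.values())
    return total_heads / total_flips

for group, by_coin in results.items():
    per_coin = {coin: rate(*counts) for coin, counts in by_coin.items()}
    print(group, per_coin, f"overall: {aggregate_rate(by_coin):.1%}")
```

The per-coin rates match for the two groups, while the pooled rates differ only because the groups split their flips between the coins in different proportions.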
The same phenomenon recently occurred with data collected on grant awards in the Netherlands. In the aggregate, women received only 14.9 percent of the grants for which they applied, while men received 17.7 percent. Overall, 42.1 percent of grant applications came from women, but only 37.9 percent of the awards went to women. Scientists reported this result in the Proceedings of the National Academy of Sciences, noting that the difference was statistically significant, and the results were widely reported across the Netherlands. Yet, if one considers the subgroups of scientific fields, the gender bias does not exist, according to a number of statisticians critical of the study. How can this be?
It comes down to looking at the data in specific scientific fields, rather than in the aggregate. Women are more likely to apply in the more competitive funding areas, which have low success rates for both men and women. This is like having most of the women, but only half of the men, flipping the weighted coin, with a small chance of “success” for both. These fields include biomedical sciences and social sciences.
In contrast, fewer women applied to fields like physics, which have comparatively higher funding rates. This is like having many fewer women flipping the fair coin with a 50/50 chance of “success”. In the aggregate, the women seem to have gotten a raw deal, but the reality is that their applications fell disproportionately in the fields with the lowest success rates.
The critique of the PNAS article doesn’t demonstrate that gender bias doesn’t exist within the granting agencies. In fact, the data raise the question of why there is such a discrepancy in application rates, or in funding rates among fields where women have a larger role. But the story certainly points to the importance of identifying lurking variables (in this case, the scientific field) that can make aggregate results seem scientifically conclusive when the data suggest no bias in awarding grants within individual fields. —Rebecca Goldin, George Mason University*
When the median is a matter of life or death
A recent story on CNN argues that “a lung cancer drug called necitumumab, which is not yet approved by the FDA, should only cost between about $500 and $1,300 per cycle based on the fact that it only gives patients a few more weeks of quality life.” Putting aside the issue of cost, it’s not the case that every patient lasts a few more weeks. Based on studies of this drug, the median survival time is 1.6 months longer using this drug with other therapies than using the other therapies alone. In other words, half of the patients receive fewer than six weeks of prolonged life compared to the current standard treatment, and half live more than six additional weeks using necitumumab. But the small gain in median survival time does not by itself indicate that the drug is only a small improvement on a dire situation.
To illustrate, suppose hypothetically that a medicine cures 40 percent of all patients, and does absolutely nothing for the other 60 percent. The median benefit would be zero additional weeks, since more than half of the patients derived no benefit at all. Yet for 40 percent of the patients, the medicine saved their lives. The medicine would be considered a game-changer and would be an exciting development indeed, even with a high price tag! —Rebecca Goldin, George Mason University*
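The hypothetical medicine above can be sketched in a few lines to show how the median hides the benefit. The cohort of 100 patients and the 520 extra weeks (about ten years) credited to each cured patient are assumed figures for illustration only; the text gives no such numbers.

```python
from statistics import mean, median

# Hypothetical cohort of 100 patients: 60 get no benefit, 40 are cured.
# CURED_BENEFIT_WEEKS is an assumed value, not a figure from any study.
CURED_BENEFIT_WEEKS = 520
extra_weeks = [0] * 60 + [CURED_BENEFIT_WEEKS] * 40

median_benefit = median(extra_weeks)
mean_benefit = mean(extra_weeks)
share_helped = sum(weeks > 0 for weeks in extra_weeks) / len(extra_weeks)

print("median extra weeks:", median_benefit)
print("mean extra weeks:", mean_benefit)
print("share of patients helped:", share_helped)
```

The median stays at zero however large the benefit to the cured patients becomes, which is why a single median figure cannot settle how valuable a treatment is.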
To be specific, on necitumumab combined with other therapies, the median survival was 11.5 months (95% CI 10.4–12.6) vs 9.9 months (95% CI 8.9–11.1) for those not using necitumumab; stratified hazard ratio 0.84 (95% CI 0.74–0.96; p=0.01).
*Goldin is supported in part by NSF grant #1201458.