The Challenge of Selection Bias
In a recent New York Times opinion piece, "Generous Republican Benefits," Jennifer Senior makes the point that, although Republicans tend to oppose the government requiring employers to offer paid parental leave, Republican senators are not against paid parental leave per se. As evidence, she offers a survey of Senate offices. As statisticians, we take no position on her eventual argument that the government should require employers to offer paid parental leave. Instead, we want to discuss how to interpret her survey's findings.
Only 26 of the 100 Senate offices responded to her written request for information about their parental-leave policies: the offices of 15 (of 44) Democratic senators, nine (of 54) Republican senators, and both independent senators. Ms. Senior reports that "virtually all" 26 provided paid parental leave of some kind, regardless of party. She attributes the 26 percent response rate to the touchy subject of private employment practices. A valid conclusion from her survey is that at least some Republican and Democratic Senate offices provide paid parental leave, which might be enough evidence to make her argument.
Despite the title of her piece, Ms. Senior's intention was not to defend the parental-leave policies of Senate Republican offices; moreover, nowhere does she claim that virtually all Senate Republican offices provide paid parental leave. Some inattentive readers are nevertheless likely to draw that (lazy) inference from the suggestive title.
Other readers might think that the survey suggests that virtually all Senate offices provide paid parental leave, because virtually all survey respondents did. Such readers are assuming that the missing data (the Senate offices that did not respond) occurred completely at random. The nonresponse has occurred completely at random only if the 74 non-responding Senate offices form a random subset of all the Senate offices. In particular, each office's decision not to respond must have no relationship to whether or not that office provides the benefit.
In any event, pollsters routinely face what may appear to be scandalously low response rates, but they have a historical record to support (and, if necessary, modify) their adjustments for nonresponse. There is no such historical record for Ms. Senior's survey. What if many of the 74 missing Senate offices did not respond because they do not provide paid parental leave and did not want to appear churlish?
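To see how informative nonresponse can mislead, here is a minimal sketch with entirely hypothetical numbers: suppose only 60 of the 100 offices actually provide paid leave, but offices that provide it are far more likely to answer the survey than offices that do not.

```python
# Hypothetical numbers for illustration only; none come from the survey.
offices_with_leave = 60
offices_without = 40
p_respond_with = 0.40     # assumed response rate if an office provides leave
p_respond_without = 0.05  # assumed response rate if it does not

# Expected composition of the respondents
respondents_with = offices_with_leave * p_respond_with     # 24 offices
respondents_without = offices_without * p_respond_without  # 2 offices
observed_share = respondents_with / (respondents_with + respondents_without)

true_share = offices_with_leave / 100
print(f"true share providing leave:     {true_share:.0%}")      # 60%
print(f"share among survey respondents: {observed_share:.0%}")  # 92%
```

Under these assumed response rates, "virtually all" of the roughly 26 expected respondents provide the benefit even though only 60 percent of all offices do.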
The great statistician Abraham Wald studied returning airplanes in World War II to determine where more armor plating was needed. Bullet holes were found almost everywhere except in a few areas. Wald's great insight was that the areas without holes were precisely where more shielding was needed: planes hit in those areas were more likely to have been shot down, and thus were the "nonrespondents" of the study.
Sadly, we have no way of knowing whether an office's decision to respond to Ms. Senior's survey was correlated with its paid-leave policy. Statisticians call such a correlation "selection bias." Given our ignorance, reliable inferences cannot be drawn from her survey, except about the 26 offices that did respond. But readers drawing improper inferences from Ms. Senior's survey shouldn't feel alone. Selection bias remains an underappreciated issue in many fields, ranging from medical research to public policy.
Two examples illustrate the challenges of addressing selection bias in important studies. One source used by analysts to identify predictors of serious injuries in automobile crashes is a database of people seriously injured in automobile crashes. In this database, however, the relevant comparison group, drivers not involved in serious accidents, is omitted, creating an instance of selection bias.
For example, among severely injured people in the Crash Injury Research Engineering Network (CIREN), seat belts do not seem to protect against certain injuries. However, if one includes data about drivers not in serious accidents, all types of seat belt use are associated with reduced injury. By omitting uninjured people, we ignore the many people whom seat belts saved from injury. As a result, we may underestimate the protective effect associated with wearing a seat belt.
Controlled clinical trials have long been the standard for assessing the purported benefit of new drugs and therapies. Volunteers for clinical trials, however, are arguably different in important ways from the broader population that will eventually use the treatments studied in the trials. Enrolling non-representative samples can produce an overestimate of a medicine's effectiveness in a more general setting. This exaggerated effect occurs when the sample has less diversity than the general population.
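A small sketch, with entirely hypothetical subgroup effects and enrollment mixes, shows how a trial sample that over-represents the patients who respond best inflates the estimated effect:

```python
# Hypothetical treatment effects in two patient subgroups
effect_A, effect_B = 0.50, 0.10

# Assumed mixes: trial volunteers skew heavily toward subgroup A,
# while the general population is evenly split.
trial_mix = {"A": 0.90, "B": 0.10}
population_mix = {"A": 0.50, "B": 0.50}

trial_effect = trial_mix["A"] * effect_A + trial_mix["B"] * effect_B
population_effect = population_mix["A"] * effect_A + population_mix["B"] * effect_B

print(f"effect estimated from the trial:   {trial_effect:.2f}")   # 0.46
print(f"effect in the general population:  {population_effect:.2f}")  # 0.30
```

Under these assumptions the trial overstates the population-wide benefit by roughly half, purely because of who volunteered.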
Statisticians have an important role to play in helping researchers, policy makers, and the public at large become more aware of the pitfalls of selection bias. Sometimes stating what is not known, such as the policies of the 74 Senate offices that did not respond to Ms. Senior's survey, can help the reader appreciate the limitations of claims based on a data analysis.
Please note that this is a forum for statisticians and mathematicians to critically evaluate the design and statistical methods used in studies. The subjects (products, procedures, treatments, etc.) of the studies being evaluated are neither endorsed nor rejected by Sense About Science USA. We encourage readers to use these articles as a starting point to discuss better study design and statistical analysis. While we strive for factual accuracy in these posts, they should not be considered journalistic works, but rather pieces of academic writing.