A Principled Approach to Benchmarking in Studies of Racial Discrimination

“… the instability of benchmarking results stems from the absence of a causal foundation”
causal inference
fairness
policing
measurement
proxies

Kai R. D. Cooper, Gregory Lanzalotto, Haosen Ge, Jacob Kaplan, Scott Desposato, Dean Knox, and Jonathan Mummolo. In preparation. “A Principled Approach to Benchmarking in Studies of Racial Discrimination in Traffic Enforcement.”

Authors
Affiliations

Kai Cooper

The Wharton School, University of Pennsylvania

Gregory Lanzalotto

The Wharton School, University of Pennsylvania

Haosen Ge

The Wharton School, University of Pennsylvania

Jacob Kaplan

School of Public and International Affairs, Princeton University

Scott Desposato

Department of Political Science, UC San Diego

Dean Knox

The Wharton School, University of Pennsylvania

Jonathan Mummolo

School of Public and International Affairs, Princeton University

Abstract

Racial bias in policing is well documented. Traffic stops are the most common type of encounter between civilians and police, which makes them an important setting for investigating police behavior. Evidence-based debates in this area frequently rely on case-specific benchmarks, such as per-capita demographics or not-at-fault drivers in vehicle collisions, to evaluate the racial distribution of police stops. However, these measures only approximate the intended comparison because they fail to (i) precisely define the causal contrast, (ii) accurately describe the drivers visible to officers, or (iii) account for disparities due to potentially biased selection decisions by officers. In this work, we formally define racial bias as a violation of an individual fairness criterion: officer decision-making should be independent of race conditional on the behavior of the civilian, i.e., whether or not they are driving dangerously. That behavior, however, is unmeasurable. To address this issue, we build on the growing proximal causal inference literature, using cameras, crashes, and checkpoints as negative outcome controls for identification under assumptions that can be probed via sensitivity analysis. We also extend the method’s viability to a host of imperfect data settings that plague policing, e.g., when race is mismeasured or unavailable. We conclude with applications of the approach to a variety of U.S. police jurisdictions.
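
The fairness criterion described in the abstract can be written as a conditional independence statement. The following is only an illustrative sketch: the symbols D (the officer's stop decision), R (the driver's race), and U (the driver's unobserved behavior, e.g. whether they are driving dangerously) are notation assumed here for exposition and are not taken from the paper, whose formal definition may differ.

```latex
% Assumed notation (not from the paper):
%   D = officer's stop decision, R = driver race, U = unobserved driving behavior
D \perp\!\!\!\perp R \mid U
\quad\Longleftrightarrow\quad
\Pr(D = 1 \mid R = r,\, U = u) \;=\; \Pr(D = 1 \mid R = r',\, U = u)
\quad \text{for all } r,\, r',\, u .
```

Under this reading, racial bias corresponds to any pair of racial groups and any level of driving behavior at which the two stop probabilities differ. Because U is not observed, the equality cannot be checked directly from stop records, which is why the abstract turns to negative controls for identification.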