March 3, 2020
Making the case for algorithms to help with criminal justice decision-making
This new Washington Post piece by a group of California professors and data scientists, headlined "In the U.S. criminal justice system, algorithms help officials make better decisions, our research finds," makes a notable case for using algorithms in criminal justice decision-making. Here are excerpts:
Should an algorithm help make decisions about whom to release before trial, whom to release from prison on parole or who receives rehabilitative services? They’re already informing criminal justice decisions around the United States and the world and have become the subject of heated public debate. Many such algorithms rely on patterns from historical data to assess each person’s risk of missing their next court hearing or being convicted of a new offense.
More than 60 years of research suggests that statistical algorithms are better than unaided human judgment at predicting such outcomes. In 2018, that body of research was questioned by a high-profile study by Dartmouth researchers published in the journal Science Advances, which found that humans and algorithms were about equally good at assessing who will reoffend. But when we attempted to replicate and extend that study, we found something different: Algorithms were substantially better than humans when used in conditions that approximate real-world criminal justice proceedings....
Surprised by the finding, we redid and extended the Dartmouth study with about 600 participants similarly recruited online. This past month, we published our results: The Dartmouth findings do not hold in settings that are closer to real criminal justice situations.
The problem isn’t that the Dartmouth study’s specific results are wrong. We got very similar results when we reran the study by asking our own participants to read and rate the same defendant descriptions that the Dartmouth researchers used. It’s that their results are limited to a narrow context. We repeated the experiment by asking our participants to read descriptions of several new sets of defendants and found that algorithms outperformed people in every case. For example, in one instance, algorithms correctly predicted which people would reoffend 71 percent of the time, while untrained recruits predicted correctly only 59 percent of the time — a 12 percentage point gap in accuracy.
This gap increased even further when we made the experiment closer to real-world conditions. After each question, the Dartmouth researchers told participants whether their prediction was correct — so we did that, too, in our initial experiments. As a result, those participants were able to immediately learn from their mistakes. But in real life, it can take months or years before criminal justice professionals discover which people have reoffended. So we redid our experiment several more times without this feedback. We found that the gap in accuracy between humans and algorithms doubled, from 12 to 24 percentage points. In other words, the gap increased when the experiment was more like what happens in the real world. In fact, in this case, where immediate feedback was no longer provided, our participants correctly rated only 47 percent of the vignettes they read — worse than simply flipping a coin.
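For readers curious about how such accuracy figures are computed, here is a minimal sketch in Python of scoring yes/no reoffense predictions against observed outcomes. The data below is invented for illustration and is not from either study:

```python
# Minimal sketch of how prediction accuracy is scored in studies like
# these. All outcomes and predictions below are invented for
# illustration; they are not data from either study.

def accuracy(predictions, outcomes):
    """Fraction of cases where the yes/no prediction matched reality."""
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    return correct / len(outcomes)

# 1 = reoffended, 0 = did not (hypothetical ground truth)
outcomes        = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
algorithm_preds = [0, 0, 1, 0, 1, 1, 0, 1, 0, 0]  # one error: 90% accurate
human_preds     = [1, 0, 1, 1, 1, 0, 0, 1, 0, 1]  # three errors: 70% accurate

gap = accuracy(algorithm_preds, outcomes) - accuracy(human_preds, outcomes)
print(f"algorithm: {accuracy(algorithm_preds, outcomes):.0%}")
print(f"human:     {accuracy(human_preds, outcomes):.0%}")
print(f"gap:       {gap:.0%}")  # a gap in the spirit of the 12-24 points reported
```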
Why was human performance so poor? Our participants significantly overestimated risk, believing that people would reoffend much more often than they actually did. In one iteration of our experiment, we explicitly and repeatedly told participants that only 29 percent of the people they were assessing ultimately reoffended, but our recruits still predicted that 48 percent would do so. In a courtroom, these “judges” might have incorrectly flagged many people as high risk who statistically posed little danger to public safety.
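The scale of that overestimation is easy to check with back-of-the-envelope arithmetic. The sketch below uses the 29 percent base rate and 48 percent predicted rate from the excerpt; the cohort size of 1,000 is arbitrary:

```python
# Back-of-the-envelope arithmetic on the overprediction described above.
# The 29% and 48% figures come from the excerpt; the cohort size is arbitrary.

population = 1000
actual_reoffenders = population * 29 // 100  # 290 people actually reoffend
flagged_as_risky = population * 48 // 100    # 480 people predicted to reoffend

# Even in the best case, where every actual reoffender is among those
# flagged, the surplus flags are necessarily false positives.
min_false_positives = flagged_as_risky - actual_reoffenders
print(f"At least {min_false_positives} of {population} people are flagged "
      f"despite not reoffending.")  # 190
```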
Humans were also worse than algorithms at exploiting additional information — something that criminal justice officials have in abundance. In yet another version of our experiment, we gave humans and algorithms detailed vignettes that included more than the five pieces of information provided about a defendant in the original Dartmouth study. The algorithms that had this additional information performed better than those that did not, but human performance did not improve.
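As an illustration of what a "statistical algorithm" of this kind looks like, here is a minimal sketch of a logistic regression trained first on five features and then on an expanded set, echoing the comparison described above. The data is synthetic and the setup is hypothetical, not the authors' actual models:

```python
# Minimal sketch of the kind of statistical risk model these studies
# evaluate: a logistic regression fit on 5 features, then on 15.
# All data is synthetic; this is not the authors' actual model.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000

base = rng.normal(size=(n, 5))    # stand-ins for the five vignette items
extra = rng.normal(size=(n, 10))  # stand-ins for "additional information"
X = np.hstack([base, extra])

# Synthetic outcome that depends on both the base and extra features.
logits = base @ rng.normal(size=5) + 0.5 * (extra @ rng.normal(size=10))
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for cols, label in [(slice(0, 5), " 5 features"), (slice(None), "15 features")]:
    model = LogisticRegression(max_iter=1000).fit(X_tr[:, cols], y_tr)
    print(label, f"accuracy: {model.score(X_te[:, cols], y_te):.1%}")
```

On data of this shape, the model given the extra columns will typically score higher, which is the pattern the authors report for algorithms but not for their human participants.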
Our results indicate that statistical algorithms can indeed outperform human predictions of whether people will commit new crimes. These findings are consistent with an extensive literature, including field studies, showing that algorithmic predictions are more accurate than those of unaided judges and correctional officers who make life-changing decisions every day.
I blogged about the prior study in this post, and here are some (of many, many) prior related posts on risk assessment tools:
- ProPublica takes deep dive to identify statistical biases in risk assessment software
- "Assessing Risk Assessment in Action"
- Thoughtful account of what to think about risk assessment tools
- "The Use of Risk Assessment at Sentencing: Implications for Research and Policy"
- Wisconsin Supreme Court rejects due process challenge to use of risk-assessment instrument at sentencing
- "In Defense of Risk-Assessment Tools"
- Parole precogs: computerized risk assessments impacting state parole decision-making
- Thoughtful look into fairness/bias concerns with risk-assessment instruments like COMPAS
- "Gender, Risk Assessment, and Sanctioning: The Cost of Treating Women Like Men"
- Expressing concerns about how risk assessment algorithms learn
- "Under the Cloak of Brain Science: Risk Assessments, Parole, and the Powerful Guise of Objectivity"
- New research findings by computer scientists "cast significant doubt on the entire effort of algorithmic recidivism prediction"
- "The Accuracy, Equity, and Jurisprudence of Criminal Risk Assessment"
- "Report on Algorithmic Risk Assessment Tools in the U.S. Criminal Justice System"
- "Algorithmic Risk Assessment in the Hands of Humans"
- "Beyond the Algorithm Pretrial Reform, Risk Assessment, and Racial Fairness"
- "Federal Criminal Risk Assessment"