Trend Analysis · Sociology & Political Science
Algorithmic Bias in Criminal Justice: When Code Becomes Judge
Recidivism prediction algorithms now influence bail, sentencing, and parole decisions affecting millions. But the mathematical impossibility of satisfying multiple fairness criteria simultaneously means that every algorithm embeds a value judgment about which groups bear the cost of prediction errors.
By Sean K.S. Shin
This blog summarizes research trends based on published paper abstracts. Specific numbers or findings may contain inaccuracies. For scholarly rigor, always consult the original papers cited in each post.
In courtrooms across the United States and increasingly worldwide, judges consult algorithmic risk assessment tools before making decisions about bail, sentencing, and parole. These tools—COMPAS, PSA, ORAS, and their successors—promise objectivity: replace the subjective gut feelings of individual judges with data-driven predictions. The sociological reality is more complex. Algorithms trained on historical criminal justice data inherit the biases embedded in that data—biases reflecting decades of racially disparate policing, prosecution, and sentencing.
The 2016 ProPublica investigation of COMPAS found that the tool falsely flagged Black defendants as future criminals at nearly twice the rate of white defendants. This finding ignited a debate that has only intensified: can algorithmic tools be made fair, or does the very act of predicting criminal behavior from historical data reproduce structural inequality?
Why It Matters
Gao (2025) examines AI's dual role in criminal sentencing—its capacity to both mitigate and exacerbate existing biases. The analysis reveals a fundamental asymmetry: while algorithms can identify and partially correct for some forms of judicial bias (inconsistency between judges, anchoring effects), they can also systematize and scale biases that were previously idiosyncratic. A biased judge affects one courtroom; a biased algorithm affects every courtroom that uses it. The study documents how algorithmic tools create an illusion of neutrality that makes bias harder to identify and challenge—defendants and their lawyers cannot cross-examine an algorithm the way they can challenge a human expert's testimony.
Song (2024) tackles the mathematical foundations of algorithmic fairness and reveals a result that has profound implications for policy: multiple statistical measures of fairness are mathematically incompatible. An algorithm cannot simultaneously satisfy calibration (risk scores mean the same thing across groups), false positive rate parity (equal rates of wrongful high-risk classification), and false negative rate parity (equal rates of wrongful low-risk classification) unless base rates are identical across groups—which they are not, given differential arrest and conviction rates.
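For readers who want the formal statement, the three criteria can be written as follows. The notation is ours, not necessarily Song's: Y is the observed reoffense outcome, S the risk score, Ŷ the binary high-risk flag, and A the group.

```latex
% Calibration: a score of s carries the same meaning in every group
\Pr(Y = 1 \mid S = s, A = a) = s \quad \text{for all scores } s \text{ and groups } a

% False positive rate parity: equal rates of wrongful high-risk classification
\Pr(\hat{Y} = 1 \mid Y = 0, A = a) = \Pr(\hat{Y} = 1 \mid Y = 0, A = b)

% False negative rate parity: equal rates of wrongful low-risk classification
\Pr(\hat{Y} = 0 \mid Y = 1, A = a) = \Pr(\hat{Y} = 0 \mid Y = 1, A = b)

% These quantities are tied together through the base rate p_a = Pr(Y = 1 | A = a):
\mathrm{FPR}_a = \frac{p_a}{1 - p_a} \cdot \frac{1 - \mathrm{PPV}_a}{\mathrm{PPV}_a} \cdot \left(1 - \mathrm{FNR}_a\right)
```

The last line is a standard identity in the fairness literature: if base rates differ across groups, holding the positive predictive value equal forces the false positive or false negative rates apart, and vice versa.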
The Science
The Impossibility of Simultaneous Fairness
Song (2024) formalizes the fairness constraints that courts, legislators, and algorithm designers implicitly navigate. The impossibility theorem means that every recidivism prediction tool necessarily chooses which groups bear the cost of prediction errors. If the tool is calibrated (a "7 out of 10" risk score means the same probability regardless of race), then it will produce unequal false positive rates across groups with different base rates. If it equalizes false positive rates, it sacrifices calibration. This is not a technical problem awaiting a clever solution—it is a mathematical constraint that requires an explicit value judgment about the acceptable distribution of error.
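A minimal numeric sketch makes the constraint concrete (the numbers are synthetic, not drawn from any cited study): give two groups a perfectly calibrated risk score but different base rates, apply the same cutoff, and the false positive rates come apart.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(base_rate, n=100_000):
    """Simulate a perfectly calibrated score for one group.

    Scores are drawn from a Beta distribution whose mean equals the
    group base rate; each person reoffends with probability equal to
    their score, so calibration holds by construction."""
    a = 2.0
    b = a * (1 - base_rate) / base_rate
    scores = rng.beta(a, b, size=n)
    reoffend = rng.random(n) < scores  # P(reoffend | score) = score
    return scores, reoffend

def error_rates(scores, reoffend, threshold=0.5):
    flagged = scores >= threshold
    fpr = flagged[~reoffend].mean()    # wrongly flagged, given no reoffense
    fnr = (~flagged)[reoffend].mean()  # wrongly cleared, given reoffense
    return fpr, fnr

for name, base_rate in [("group A", 0.30), ("group B", 0.50)]:
    scores, reoffend = simulate(base_rate)
    fpr, fnr = error_rates(scores, reoffend)
    print(f"{name}: base rate {base_rate:.2f}  FPR {fpr:.3f}  FNR {fnr:.3f}")

# Both groups see an identically calibrated score, yet the higher
# base-rate group ends up with a markedly higher false positive rate:
# more of its non-reoffenders sit above the threshold. No threshold
# choice fixes this while calibration and unequal base rates both hold.
```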
Bias Detection and Mitigation Attempts
Oikonomou, Bailis, and Bentos (2025) apply fairness-aware machine learning to the Greek prison system—extending the conversation beyond the heavily studied American context. Their analysis of recidivism prediction models reveals that standard ML algorithms (random forests, gradient boosting) produce significant disparities across demographic groups, and that post-hoc fairness interventions (threshold adjustments, re-weighting) can reduce but not eliminate these disparities. Notably, interventions that improve fairness along one dimension often degrade it along another, confirming the impossibility results at the empirical level.
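The abstract does not include implementation detail, but the threshold-adjustment style of post-hoc intervention it describes can be sketched in a few lines (the function names, score grid, and 20% target below are our illustrative choices, not the authors'):

```python
import numpy as np

def fpr_at(scores, labels, threshold):
    """False positive rate of the rule 'flag if score >= threshold'."""
    negatives = scores[labels == 0]
    return (negatives >= threshold).mean()

def equalize_fpr(scores_by_group, labels_by_group, target_fpr=0.20):
    """Post-hoc threshold adjustment: the trained model is untouched,
    but each group gets the cutoff whose FPR lands closest to a common
    target, pushing the groups toward false positive rate parity."""
    thresholds = {}
    grid = np.linspace(0.0, 1.0, 201)
    for group, scores in scores_by_group.items():
        labels = labels_by_group[group]
        gaps = [abs(fpr_at(scores, labels, t) - target_fpr) for t in grid]
        thresholds[group] = grid[int(np.argmin(gaps))]
    return thresholds
```

Equalizing false positive rates this way leaves the scores themselves uncalibrated across groups and generally shifts false negative rates unevenly, which is exactly the improve-one-dimension, degrade-another pattern the authors document.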
Transparency and Explainability
Cavus, Benli, and Altuntas (2025) propose a framework combining deep learning with clustering techniques to create more transparent and bias-resilient recidivism prediction. Their approach clusters defendants into more homogeneous subgroups before applying prediction models, reducing the impact of between-group base rate differences. The framework also incorporates explainability features that allow stakeholders to understand which factors drive individual predictions—a critical requirement for due process.
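The abstract does not specify the architecture, so the sketch below captures only the general cluster-then-predict shape of the idea; KMeans and logistic regression stand in for the paper's deep learning and clustering components, which is a deliberate simplification.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

class ClusterThenPredict:
    """Cluster defendants into more homogeneous subgroups, then fit a
    separate risk model inside each cluster. Within a cluster, base
    rates are more uniform, which blunts the between-group base-rate
    differences that drive the impossibility results. Logistic
    regression keeps per-feature coefficients inspectable, a crude
    stand-in for the framework's explainability component."""

    def __init__(self, n_clusters=5, random_state=0):
        self.clusterer = KMeans(n_clusters=n_clusters,
                                random_state=random_state, n_init=10)
        self.models = {}

    def fit(self, X, y):
        clusters = self.clusterer.fit_predict(X)
        for c in np.unique(clusters):
            mask = clusters == c
            # Assumes both outcomes occur in every cluster; production
            # code would need a guard for one-class clusters.
            self.models[c] = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
        return self

    def predict_proba(self, X):
        clusters = self.clusterer.predict(X)
        risk = np.zeros(len(X))
        for c, model in self.models.items():
            mask = clusters == c
            if mask.any():
                risk[mask] = model.predict_proba(X[mask])[:, 1]
        return risk
```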
Accuracy-Fairness Trade-offs
Farayola et al. (2025) directly examine the trade-off between prediction accuracy and fairness, and its implications for trust and equity. Their analysis demonstrates that fairness-constrained models consistently sacrifice some predictive accuracy—the question is how much accuracy loss is acceptable in exchange for meaningful fairness improvements. The study finds that moderate fairness constraints produce only small accuracy reductions, suggesting that the accuracy-fairness trade-off is less severe than algorithm developers often claim.
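The abstract reports the trade-off at the level of findings rather than code. To make the shape of such an experiment concrete, here is a sketch that uses classical Kamiran-Calders reweighing as the fairness knob; the authors' actual method may differ, and the strength interpolation and demographic-parity gap metric are our choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def reweigh(y, group, strength):
    """Interpolate between uniform weights (strength=0) and full
    Kamiran-Calders reweighing weights (strength=1), which make group
    membership and outcome statistically independent in the weighted
    sample. Assumes every (group, outcome) cell is non-empty."""
    w = np.ones(len(y))
    for g in np.unique(group):
        for label in (0, 1):
            cell = (group == g) & (y == label)
            expected = (group == g).mean() * (y == label).mean()
            w[cell] = 1.0 + strength * (expected / cell.mean() - 1.0)
    return w

def sweep(X, y, group, strengths=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Trace an accuracy-fairness frontier (in-sample, for brevity)."""
    rows = []
    for s in strengths:
        model = LogisticRegression(max_iter=1000)
        model.fit(X, y, sample_weight=reweigh(y, group, s))
        pred = model.predict(X)
        accuracy = (pred == y).mean()
        # Fairness metric: gap in high-risk classification rates
        rates = [pred[group == g].mean() for g in np.unique(group)]
        rows.append((s, accuracy, max(rates) - min(rates)))
    return rows
```

The finding that moderate constraints cost little accuracy corresponds to this frontier being nearly flat at low strengths.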
Fairness Criteria in Recidivism Prediction
| Fairness Criterion | Definition | Implication | Limitation |
|---|---|---|---|
| Calibration | Same risk score = same probability across groups | Risk scores are "honest" | Permits unequal error rates |
| False Positive Rate Parity | Equal rates of wrongful high-risk classification | Equal protection from false incrimination | Requires sacrificing calibration |
| False Negative Rate Parity | Equal rates of wrongful low-risk classification | Equal protection from undetected risk | Conflicts with FPR parity |
| Predictive Parity | Same positive predictive value across groups | Detained groups equally likely to reoffend | Compatible with disparate impact |
| Individual Fairness | Similar individuals receive similar scores | Treats each case on its merits | Defining "similarity" is subjective |
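Each of the first four criteria in the table reduces to a confusion-matrix ratio, so auditing a scored dataset against them takes only a few lines. A sketch (array names are illustrative; individual fairness is omitted because it requires the similarity metric the table flags as subjective):

```python
import numpy as np

def group_report(scores, labels, group, threshold=0.5):
    """Per-group audit of the table's criteria. scores: predicted risk
    in [0, 1]; labels: observed reoffense (0/1); group: membership."""
    report = {}
    for g in np.unique(group):
        s, y = scores[group == g], labels[group == g]
        flagged = s >= threshold
        tp = (flagged & (y == 1)).sum()
        fp = (flagged & (y == 0)).sum()
        fn = (~flagged & (y == 1)).sum()
        report[g] = {
            # Calibration check: predicted vs. observed risk among flagged
            "mean predicted risk (flagged)": s[flagged].mean(),
            "observed reoffense rate (flagged)": y[flagged].mean(),
            "FPR": fp / max((y == 0).sum(), 1),   # FPR parity row
            "FNR": fn / max((y == 1).sum(), 1),   # FNR parity row
            "PPV": tp / max(flagged.sum(), 1),    # predictive parity row
        }
    return report
```

Comparing the resulting rows across groups shows directly which parities hold; by the impossibility result, not all of them will.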
What To Watch
The field is moving from "can we debias algorithms?" toward the harder question: "should we use prediction algorithms in criminal justice at all?" New Jersey replaced cash bail with risk-assessment-guided release decisions, Illinois eliminated cash bail outright, and California voters rejected a risk-assessment substitute for cash bail (Proposition 25, 2020) amid concerns about persistent bias. The critical development to watch is whether the EU AI Act's classification of criminal justice AI as "high risk"—requiring conformity assessments, human oversight, and bias audits—creates a viable regulatory model, or whether the impossibility of simultaneous fairness means that no amount of oversight can make these tools genuinely fair.
References (5)
[1] Gao, Y. (2025). Algorithmic Justice: Can AI Mitigate or Exacerbate Bias in Criminal Sentencing? Academic Journal of Sociology and Management.
[2] Oikonomou, F., Bailis, E., & Bentos, S. (2025). Towards Fair Recidivism Prediction: Addressing Bias in Machine Learning for the Greek Prison System. IEEE IRASET.
[3] Song, J. (2024). Formalizing Fairness: Statistical Measures of Parity for Recidivism Prediction Instruments. Michigan Technology Law Review, 30(2).
[4] Cavus, M., Benli, M.N., & Altuntas, U. (2025). Transparent and Bias-Resilient AI Framework for Recidivism Prediction. Applied Soft Computing, 113160.
[5] Farayola, M. M., Tal, I., Saber, T., Connolly, R., & Bendechache, M. (2025). A Fairness-Focused Approach to Recidivism Prediction: Implications for Accuracy, Trust, and Equity. AI & Society.