Algorithmic Due Process: Mistaken Accountability and Attribution in State v. Loomis
By Ellora Israni - Edited by Evelyn Chang
Algorithms are eating the world.[1] Every industry, from education to health to finance to media, is embracing algorithmic trends,[2] and the justice system is no exception. Police departments are employing algorithms to make communities safer.[3] States are using algorithms to replace outdated, discriminatory bail systems.[4] And parole boards are relying on algorithms to assess recidivism risk.[5] On the whole, algorithms in the law are promising; they allow actors to deploy constrained resources most effectively to modernize practices, reduce bias, and deliver justice.
But there is one application of algorithms to the law that is constitutionally, technically, and morally troubling: algorithm-based sentencing. States are increasingly allowing, or even requiring, judges to consider actuarial risk assessment scores in sentencing decisions.[6] Recently, in State v. Loomis, the Supreme Court of Wisconsin held that the use at sentencing of a proprietary risk assessment tool called COMPAS (an algorithm originally developed to help parole boards assess recidivism risk) did not violate the defendant’s due process rights to be sentenced (a) individually and (b) using accurate information.[7] COMPAS’s author, Northpointe, Inc., refused to disclose its methodology to the court and the defendant;[8] however, COMPAS’s output—a risk assessment score—was referenced by both the State and the trial court during sentencing. Because the algorithm deemed the defendant to be at high risk of recidivism, the sentencing court denied him the possibility of parole and handed down a six-year sentence.[9]
Despite upholding COMPAS’s constitutionality, the Court placed numerous restrictions on its use. The algorithm could not be used to determine whether an offender would be incarcerated or to calculate the length of his or her sentence.[10] Its use had to be accompanied by an independent rationale for the sentence, and any Presentence Investigation Report containing the score had to include an elaborate, five-part warning about the algorithm’s limited utility.[11] The defendant petitioned the U.S. Supreme Court, which declined to hear the case.[12]
That the court allowed an algorithm, into which actors in the justice system have limited visibility, to play even the slightest role in depriving an individual of his liberty is arguably unconstitutional and certainly troubling, from both a technical and a moral standpoint. The Wisconsin Supreme Court’s opinion and the briefs on appeal reflect fundamental misunderstandings about how an algorithm like COMPAS might work and what safeguards would allow it to be useful in sentencing. However, these misunderstandings are also a window into a more promising framework, one that would allow algorithms to lend their power to the justice system without raising constitutional, technical, or moral concerns.
I. The court’s focus on examining COMPAS’s source code, rather than its inputs and outputs, is what allowed it to conclude that COMPAS was constitutional.
In the Wisconsin Supreme Court and on appeal, all parties focused on whether Northpointe’s refusal to disclose its algorithm rendered COMPAS’s use unconstitutional because neither the defendant nor the court could see the source code. That, the defendant argued, violated his right to be sentenced based on information whose correctness he had an opportunity to verify.[13] He knew what information COMPAS was using (his criminal history and a pre-sentencing questionnaire he had filled out) but had no idea how it was using that information to predict his recidivism.[14] Thus, he demanded to see the source code to determine the algorithm’s constitutionality.[15] The State’s brief similarly focused on whether the source code had to be disclosed.[16]
The fixation on examining source code reflects a fundamental misunderstanding about how one might determine whether an algorithm is biased. It is highly doubtful that any programmer would write “unconstitutional code” by, for example, explicitly using the defendant’s race to predict recidivism anywhere in the code. Examining the code is therefore unlikely to reveal any explicit discrimination.
However, a detailed analysis of COMPAS by ProPublica revealed that the algorithm is not, in fact, race-neutral. Black defendants are often predicted to have a higher risk of recidivism than they actually do, and white defendants are often predicted to have a lower risk than they actually do.[17] Thus, even if there are already race-based discrepancies in recidivism today,[18] the algorithm does not simply reflect the discrepancies that already exist. It exacerbates these discrepancies along racial lines, to the point of racism against defendants of color. But if the source code is facially neutral, what explains the bias?
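By way of illustration, the short sketch below shows how a finding like ProPublica’s is typically established: compare predicted risk labels against observed outcomes separately for each group and check whether the error rates diverge. The records here are invented; this is a simplified stand-in for, not a reproduction of, ProPublica’s methodology.

```python
# Simplified, hypothetical version of a disparate-error-rate audit in the
# style of the ProPublica analysis: compare predicted risk to observed
# outcomes separately for each group. The records are invented.
from collections import defaultdict

# Each record: (group, predicted_high_risk, actually_reoffended)
records = [
    ("black", True, False), ("black", True, True), ("black", False, False),
    ("white", False, True), ("white", True, True), ("white", False, False),
]

stats = defaultdict(lambda: {"fp": 0, "neg": 0, "fn": 0, "pos": 0})
for group, predicted_high, reoffended in records:
    s = stats[group]
    if reoffended:
        s["pos"] += 1
        if not predicted_high:
            s["fn"] += 1   # labeled low risk, but did reoffend
    else:
        s["neg"] += 1
        if predicted_high:
            s["fp"] += 1   # labeled high risk, but did not reoffend

for group, s in stats.items():
    fpr = s["fp"] / s["neg"]   # rate of overstating risk
    fnr = s["fn"] / s["pos"]   # rate of understating risk
    print(f"{group}: false positive rate {fpr:.2f}, false negative rate {fnr:.2f}")
```

Unequal false positive and false negative rates across groups are exactly the pattern ProPublica reported, and nothing in that audit requires reading a single line of source code.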
The answer probably lies in the inputs given to the algorithm. Thus, the defendant should be asking to see the data used to train the algorithm and the weights assigned to each input factor, not the source code.[19] The problem is implicit, not intentional, bias. For example, although the code is unlikely to branch on race, it could conceivably consider ZIP code. Using the latter is not explicitly unconstitutional, but it has been shown to be a proxy for the former, which is.[20] If one’s ZIP code receives a higher weight compared to other factors, it could explain the racial biases found in the ProPublica analysis.[21] There is a strong possibility that, even if race or another constitutionally impermissible factor is never mentioned in the source code, the algorithm is achieving the same racist or otherwise inappropriately biased result by placing a high weight on a variable that is a proxy for race.[22] Thus, inputs, rather than source code, should be evaluated to reveal possible sources of unconstitutionality.
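A minimal sketch, using entirely hypothetical weights and data rather than anything disclosed by Northpointe, shows how this can happen: the scoring code below never mentions race, so it is facially neutral, yet a heavily weighted ZIP-code feature reproduces a racial disparity anyway.

```python
# Hypothetical illustration: a facially race-neutral scoring function whose
# disclosed weights let ZIP code act as a proxy for race. Neither the weights
# nor the data describe Northpointe's actual model.
weights = {
    "prior_arrests": 0.4,
    "age_under_25": 0.3,
    "zip_code_53206": 1.5,   # one neighborhood feature dominates the score
}

def risk_score(features):
    # Race never appears in this code, so the source is "facially neutral."
    return sum(weights[name] * value for name, value in features.items())

# In the hypothetical training population, that ZIP code is heavily one race.
defendants = [
    {"race": "black", "features": {"prior_arrests": 1, "age_under_25": 0, "zip_code_53206": 1}},
    {"race": "white", "features": {"prior_arrests": 1, "age_under_25": 0, "zip_code_53206": 0}},
]

for d in defendants:
    print(d["race"], risk_score(d["features"]))
# Identical criminal histories, different scores: the weight on the ZIP-code
# feature, not any explicit mention of race, produces the disparity.
```

The disparity is visible only in the weights and the training data, which is why disclosure of the inputs, not the code, is the meaningful due process demand.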
II. An accurate examination of COMPAS could plausibly reveal that it is more biased, not less, than a judge.
All the parties in Loomis—the court, the State, and even the defendant—allude to the algorithm being less biased than a judge.[23] Computers, after all, are not humans; machines are not racist. However, that perspective fails to recognize that computers, especially machine learning algorithms, are merely reflections of the input data that trains them.[24] And if that input data is truly representative of our biased world, the computer will be biased too.[25] Computers are intelligent, but they are not wise.[26] That distinction is especially important in the context of the law, where, as Oliver Wendell Holmes famously stated, experience is often more important than logic.[27]
Furthermore, machine learning algorithms often work on a feedback loop; if they are not constantly retrained and informed of the flaws in their initial determinations, their biases are continuously affirmed.[28] Algorithms “lean in” to their own biases, incorrectly assuming their own correctness, and drift further and further from fairness. Thus, the need to affirmatively and continuously retrain algorithms against these biases becomes even more urgent over time. This sort of “algorithmic affirmative action”[29] is exactly how Google and Facebook build some of their algorithms.[30] It recognizes that equal opportunity may not always align with societal tendencies, so algorithms must incorporate it by design.[31]
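For a concrete sense of what training against known bias can look like, the sketch below implements one published pre-processing idea, “reweighing” (Kamiran and Calders), on invented labels. It is offered purely as an illustration of correcting skewed training data; it does not describe how Google, Facebook, or Northpointe actually build their systems.

```python
# Sketch of one published bias-correction idea, "reweighing" (Kamiran &
# Calders): weight each training example so that group membership and the
# outcome label look statistically independent before retraining. The labels
# below are invented; this is an illustration, not any company's pipeline.
from collections import Counter

examples = [("a", 1), ("a", 1), ("a", 0), ("b", 0), ("b", 0), ("b", 1)]  # (group, label)

n = len(examples)
group_counts = Counter(g for g, _ in examples)
label_counts = Counter(y for _, y in examples)
pair_counts = Counter(examples)

def weight(group, label):
    # Expected joint frequency under independence, divided by observed frequency.
    expected = (group_counts[group] / n) * (label_counts[label] / n)
    observed = pair_counts[(group, label)] / n
    return expected / observed

for g, y in examples:
    print(f"group={g} label={y} weight={weight(g, y):.2f}")
# A learner trained with these weights no longer treats the skewed base rates
# as evidence about group membership.
```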
Finally, in extolling the virtue of algorithms over judges, the parties fail to account for the uniquely human ability to individualize. This is precisely the argument against mandatory minimum sentences—they steal away judges’ discretion to deliver justice by considering any mitigating factors, rendering them merely cogs in the system[32]—and it is equally cogent against machine sentencing. For example, part of the brutality of using predicted recidivism to guide sentencing is that it does not sufficiently incorporate causation. Is it truly that defendants with higher rates of recidivism warrant longer sentences, or is it that defendants who are given longer sentences are kept out of their communities, unemployed, and away from their families longer, which naturally increases their risk of recidivism?[33] Whereas a machine is programmed to assume unidirectional causation, a judge could investigate causal relationships with more nuance.
Thus, the parties should not be asking whether the algorithm is completely, facially neutral; they should be asking whether it has affirmatively been trained against the racism of the world and told: these are the biases in the inputs you are receiving, so actively combat them.
Algorithms should only be deployed where society has accepted the high error rate of an imprecise tool—only to automate rules, not standards.[34] Recidivism risk algorithms have a 70 percent accuracy rate; that is, there is a 70 percent chance that a randomly selected higher-risk individual is classified as higher risk than a randomly selected lower-risk individual.[35] Parole officers, for whom such algorithms were originally developed, might accept this rate of sending prisoners to the wrong treatment programs. But sentencing commissions—which pride themselves on minimizing Type I[36] errors and operate in a justice system that advertises itself as “innocent until proven guilty”—should not tolerate such a gross error rate.[37] By questioning only the biases of one particular algorithm, the actors in Loomis failed to identify the meta-issue of whether any algorithm is ever appropriate in such a high-stakes context.[38]
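As the gloss above suggests, the 70 percent figure is a concordance statistic (commonly reported as an AUC). The sketch below, using invented scores, makes its meaning concrete: it estimates the probability that a randomly chosen person who went on to reoffend was scored above a randomly chosen person who did not.

```python
# What the "70 percent" means: a concordance statistic (the AUC). With
# invented scores, estimate the probability that a randomly chosen person who
# reoffended was scored above a randomly chosen person who did not.
from itertools import product

reoffended = [7, 5, 9, 4]        # hypothetical risk scores, people who reoffended
did_not_reoffend = [3, 6, 2, 8]  # hypothetical risk scores, people who did not

pairs = list(product(reoffended, did_not_reoffend))
concordant = sum(1 for r, nr in pairs if r > nr)
ties = sum(1 for r, nr in pairs if r == nr)

auc = (concordant + 0.5 * ties) / len(pairs)
print(f"concordance (AUC) = {auc:.2f}")
# At 0.70, roughly three in ten such comparisons get the ordering wrong,
# a sobering error rate when liberty is at stake.
```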
III. If algorithms such as COMPAS are to continue being deployed in the justice system, their creators need to be held to the same standards as are other actors in the system.
The Wisconsin Supreme Court allowed COMPAS to survive due process review, but its allowance was noticeably tepid. The Court’s restrictions on COMPAS’s use render it little more than an algorithmic “pat on the back,” a corroborative affirmation for actors in the criminal justice system that their determinations are correct. Because of the human tendency to allow “words [to] yield to numbers,”[39] the Loomis decision is unlikely to promote the intended skepticism of algorithms.[40] Instead, it will have the opposite effect of reassuring lawyers and judges that their actions are justified by “anchoring” them in algorithmic affirmation.[41] COMPAS will let them sleep at night, knowing that even if they spent all day sending people to prison—or worse, sending them to their deaths—a computer reassured them that was the “correct” thing to do.
The use of COMPAS is morally troubling precisely because sentencing should not be easy. Actors in the criminal justice system should lose sleep over the fact that they are systematically depriving people of their life, liberty, and property. That should be hard. It is a serious, unimaginable thing. Anyone who has a hand in this system should have to grapple with the consequences of their work; as algorithms become a part of the criminal justice system, that “anyone” should include technologists.
Programmers and their work need to be held to the same high ethical and constitutional standards to which we hold other actors in the criminal justice system. We need algorithmic due process.[42] To write and deploy code that is used to justify incarcerating and killing people is grave. Programmers should not be able to hide behind a veil of ignorance; Frankenstein’s creator is responsible for his actions.[43] Just as judges are compelled to explain not only their decisions but also their reasoning via published opinions, programmers must shed some light on their contributions.
Ellora Israni is a J.D. candidate at Harvard Law School. She earned her B.S. in Computer Science from Stanford University and was previously a software engineer at Facebook.
[1] This is a reference to venture capitalist Marc Andreessen’s oft-cited prophecy that “software is eating the world.” See, e.g., Marc Andreessen, Why Software is Eating the World, Wall St. J. (Aug. 20, 2011), https://www.wsj.com/articles/S... [https://perma.cc/DMB9-F3S6].
[2] For example: big data, artificial intelligence, and machine learning.
[3] See Justin Jouvenal, The new way police are surveilling you: Calculating your threat ‘score’, Wash. Post (Jan. 10, 2016), https://www.washingtonpost.com... [https://perma.cc/VS8K-CJKN].
[4] See Episode 783: New Jersey Bails Out, Planet Money (Jul. 12, 2017), http://www.npr.org/sections/money/2017/07/12/536905881/episode-783-new-jersey-bails-out [https://perma.cc/5WMR-U6F2].
[5] See Eric Holder, National Association of Criminal Defense Lawyers 57th Annual Meeting and 13th State Criminal Justice Network Conference, U.S. Dep’t of Justice (Aug. 1, 2014), https://www.justice.gov/opa/sp... [https://perma.cc/L3NL-FUXL].
[6] See John Lightbourne, Damned Lies & Criminal Sentencing Using Evidence-Based Tools, 15 Duke L. & Tech. Rev. 327, 343 (2017).
[7] State v. Loomis, 881 N.W.2d 749, 767 (Wis. 2016).
[8] Northpointe, Inc. said that the algorithm is proprietary and disclosing it would leave the company vulnerable to competitors.
[9] See Excerpts from Circuit Court Sentencing Transcript, La Crosse County Circuit Court (Aug. 12, 2013).
[10] See State v. Loomis, 881 N.W.2d at 769.
[11] Id.
[12] See Loomis v. Wisconsin, 137 S. Ct. 2290 (2017). Both the state of Wisconsin and the United States Solicitor General filed briefs defending COMPAS. See Brief for the United States as Amicus Curiae, Loomis v. Wisconsin, 137 S. Ct. 2290 (2017) (No. 16-6387), 2017 WL 2333897; see also Brief in Opposition, Loomis v. Wisconsin, 137 S. Ct. 2290 (2017) (No. 16-6387). Because the defendant had a right to access and verify for correctness the inputs to the algorithm — his criminal record and a pre-sentencing questionnaire — both briefs said COMPAS did not violate his due process right to be sentenced using accurate information. Because of the numerous prescribed warnings and the fact that the judge has the last word, both briefs said COMPAS did not violate his due process right to be sentenced individually – despite the fact that COMPAS was developed to predict group, not individual, recidivism.
[13] Brief of Defendant-Appellant at 22, State v. Loomis, 881 N.W.2d 749 (2016) (No. 2015AP157-CR), 2015 WL 9412098, at *22; see also Gardner v. Florida, 430 U.S. 349, 351 (1977).
[14] See Rebecca Wexler, When a Computer Program Keeps You In Jail, N.Y. Times (Jun. 13, 2017), https://www.nytimes.com/2017/06/13/opinion/how-computers-are-harming-criminal-justice.html [https://perma.cc/2WZG-LP77].
[15] Brief of Defendant-Appellant, supra note 13, at 22–25.
[16] Brief of Plaintiff-Respondent at 17, State v. Loomis, 881 N.W.2d 749 (2016), 2016 WL 485419, at *17.
[17] See Jeff Larson et al., How We Analyzed the COMPAS Recidivism Algorithm, ProPublica (May 23, 2016), https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm [https://perma.cc/3V5S-W874].
[18] See U.S. Sentencing Comm’n, Recidivism Among Federal Offenders: A Comprehensive Overview 24 (2016).
[19] See Anupam Chander, The Racist Algorithm?, 115 Mich. L. Rev. 1023, 1024 (2017).
[20] See Katherine Noyes, Will big data help end discrimination—or make it worse?, Fortune (Jan. 15, 2015), https://fortune.com/2015/01/15/will-big-data-help-end-discrimination-or-make-it-worse/ [https://perma.cc/97QR-NJ65].
[21] In contrast, when New Jersey replaced its cash bail system with a risk assessment formula earlier this year, it published both the formula and inputs that the formula considers: a combination of age, criminal history, and prior failures to appear in court. Neither race nor any proxy variable is considered. See Laura and John Arnold Found., Public Safety Assessment: Risk Factors and Formula, http://www.arnoldfoundation.org/wp-content/uploads/PSA-Risk-Factors-and-Formula.pdf [https://perma.cc/SM4T-68UH].
[22] Richard Berk et al., Fairness in Criminal Justice Risk Assessments: The State of the Art 31 n.23 (U. Pa. Dep’t of Criminology, Working Paper No. 2017-1.0, 2017) (“Because of racial residential patterns, zip code can be a strong proxy for race.”).
[23] See State v. Loomis, 881 N.W.2d 749, 765 (Wis. 2016); see also Brief of Plaintiff-Respondent, supra note 16, at 8; see also Brief of Defendant-Appellant, supra note 13, at 28–29.
[24] See generally Andrea Roth, Machine Testimony, 126 Yale L.J. 1972 (2017).
[25] Cf. James Vincent, Twitter Taught Microsoft’s AI Chatbot to be a racist asshole in less than a day, The Verge (Mar. 24, 2016, 6:43 AM), https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist [https://perma.cc/L2GZ-95S2].
[26] What is the difference between intelligence and wisdom? “Knowledge is knowing that a tomato is a fruit; wisdom is not putting it in a fruit salad.” – Miles Kington
[27] See Kiel Brennan-Marquez, “Plausible Cause”: Explanatory Standards in the Age of Powerful Machines, 70 Vand. L. Rev. 1249, 1300 n.172 (2017).
[28] See Cathy O’Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy 27 (2016).
[29] See Chander, supra note 19, at 1025.
[30] For example, the query "n***** house" for a time returned “White House” in Google search results, a result learned from user behavior; the company explicitly altered its otherwise machine-learned search algorithm to fix that. Id. at 1043.
[31] See The White House, Big Data: A Report on Algorithmic Systems, Opportunity, and Civil Rights (May 2016), https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/2016_0504_data_discrimination.pdf [https://perma.cc/W2VZ-RY7C] (advocating for designing data systems that promote fairness and safeguard against discrimination from the first step of the engineering process). Another option would be to use COMPAS as one of an explicit set of sentencing factors, similar to how race is considered one of a set of factors in affirmative action policies, and specify precisely how it should be weighed. The Loomis Court did allude to this, but did not offer any concrete guidance as to how the algorithm should be weighed against other factors. See Adam Liptak, Sent to Prison by a Software Program’s Secret Algorithms, N.Y. Times (May 1, 2017), https://www.nytimes.com/2017/05/01/us/politics/sent-to-prison-by-a-software-programs-secret-algorithms.html [https://perma.cc/U3WX-TAR2] (“Justice Bradley made Compas’s role in sentencing sound like the consideration of race in a selective university’s holistic admissions program. It could be one factor among many, she wrote, but not the determinative one.”).
[32] See Matthew Van Meter, One Judge Makes the Case for Judgement, The Atlantic (Feb. 25, 2016), https://www.theatlantic.com/politics/archive/2016/02/one-judge-makes-the-case-for-judgment/463380/ [https://perma.cc/F7QE-AX29].
[33] See Sonja B. Starr, Evidence–Based Sentencing and the Scientific Rationalization of Discrimination, 66 Stan. L. Rev. 803, 813 (2014).
[34] See Danielle Keats Citron, Technological Due Process, 85 Wash. Univ. L. Rev. 1249, 1301 (2008).
[35] See Lightbourne, supra note 6, at 336.
[36] A Type I error, also known as a false positive, is the incorrect rejection of a true null hypothesis: for example, finding guilty a defendant who is actually innocent.
[37] See Lightbourne, supra note 6, at 336.
[38] See Note, State v. Loomis: Wisconsin Supreme Court Requires Warning Before Use of Algorithmic Risk Assessments in Sentencing, 130 Harv. L. Rev. 1530, 1534 (2017). For a proposed alternative, see Tal Zarsky, Automatic - For the People?, Cyberlaw JOTWELL (Nov. 8, 2016), http://cyber.jotwell.com/automatic-for-the-people/ [https://perma.cc/58QB-44BS] (citing the European mandate that humans be allowed to challenge machine-made decisions to preserve dignity in the process).
[39] See Note, supra note 38, at 1536 (explaining the human tendency to place more faith in numerical data, regardless of its actual validity, than in other forms of evidence).
[40] See Brennan-Marquez, supra note 27, at 1257 (“When officials know they will have to account for their decisions...it causes officials to monitor themselves and—ideally—to internalize the constitutional limits and legality principles just explored.”).
[41] See Note, supra note 38, at 1536 (“Behavioral economists use the term “anchoring” to describe the common phenomenon in which individuals draw upon an available piece of evidence — no matter its weakness — when making subsequent decisions. A judge presented with an assessment that shows a higher recidivism risk than predicted may increase the sentence without realizing that “anchoring” has played a role in the judgment.”).
[42] See Katherine Freeman, Algorithmic Injustice: How the Wisconsin Supreme Court Failed to Protect Due Process Rights in State v. Loomis, 18 N.C. J.L. & Tech. On. 75, 99 (2016) (“[A]utomation bias effectively turns a computer program's suggested answer into a trusted final decision.”) (internal quotation marks omitted).
[43] See Frank Pasquale, Secret Algorithms Threaten the Rule of Law, MIT Tech. Rev. (Jun. 1, 2017), https://www.technologyreview.com/s/608011/secret-algorithms-threaten-the-rule-of-law/ [https://perma.cc/RHD9-2A8P] (“Sending someone to jail thanks to the inexplicable, unchallengeable judgments of a secret computer program is too Black Mirror for even hardened defenders of corporate privileges.”).