Your Trial Message

Keep Your Jurors (and Mock Jurors) Human

By Dr. Ken Broda-Bahm:

The courtroom had all the usual elements: oak paneling, parties, a judge, a jury box, a flag. Only in the jury box, instead of live human jurors, there were three large digital monitors on which appeared the “jurors” for this case: ChatGPT (OpenAI), Grok (xAI), and Claude (Anthropic). This event, held last month at the University of North Carolina School of Law, aimed to test the possibilities of using Artificial Intelligence to decide a live criminal case. The three AI engines were fed real-time transcripts of the proceedings, along with legal instructions, before deliberating with each other in front of a live audience to reach a verdict in a robbery case. The fact that observers found the exercise left a lot to be desired will not necessarily undercut the techno-optimistic drive to incorporate quickly evolving AI into broader swaths of the legal process. In fact, some companies are currently offering AI-generated virtual juries as a way to run mock trials without the need for human recruits. Given the economics, I expect that trend to grow as more traditional providers consider incorporating bots into their service offerings.

It is easy to see the appeal of an AI-supported mock trial. When supplied with information, modern AI systems based on large language models (LLMs) are able to respond with analytical answers that — often, not always — sound reasonable, useful, and even plausibly human. Cutting out the costs associated with recruiting and finding space for human respondents could end up being very attractive. The problem is that, when it comes to measuring human judgment, an AI substitution puts simulation ahead of the real thing. After all, providing a seemingly realistic version of a human response is the number one thing that AI currently does well. But does that mean it is a good way to test your legal case? Knowing that it may place me among the Luddites, I think the answer — for now at least — is going to be “no.” In this post, I’ll share a few problems I see with AI juries or AI mock juries.

The Basic Problem: A Lack of Validity

I’ve made a reasonable effort to understand what is going on under the hood with AI, and my concern is that we may be getting the A without the I. That is, I’m clear on the “Artificial” part, but not so sure about the “Intelligence” component. The processes that lead to natural-sounding responses from AI are not based on anything that matches what we consider to be “knowledge,” “judgment,” or “thought.” Instead, they are based on probability algorithms that determine the most likely next words in any sentence. These algorithms are trained on a vast corpus of human communication (i.e., the internet). It’s a souped-up version of auto-complete. So in asking an AI “juror” what it thinks, you are not getting perception, experience, judgment, attitude, or opinion (though you may be getting something that feels like all of the above). What you are getting is an algorithm assembling a plausible sequence of words from patterns elsewhere in the communication universe. You need to keep that in perspective: the communication may read as plausible when drawn from the corpus of human online communication, but that does not make it a valid measure of a juror’s likely reaction to your case.
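For readers who want to see the “souped-up auto-complete” idea made concrete, here is a deliberately tiny sketch. It is purely illustrative: real LLMs use neural networks trained on enormous corpora, not word counts over a sentence or two, but the underlying logic is the same — predict the statistically likeliest next word, with no perception, experience, or judgment anywhere in the loop.

```python
from collections import Counter, defaultdict

# Toy "next-word predictor": count which word most often follows each word
# in a tiny corpus, then generate text by always picking the likeliest successor.
corpus = ("the jury heard the evidence and the jury reached a verdict "
          "after the jury heard the closing arguments").split()

following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def most_likely_next(word):
    """Return the statistically likeliest next word -- no judgment involved."""
    return following[word].most_common(1)[0][0]

# Generate a short, plausible-sounding sequence starting from "the".
word = "the"
sequence = [word]
for _ in range(3):
    word = most_likely_next(word)
    sequence.append(word)

print(" ".join(sequence))  # plausible output, but only pattern-matching
```

The output reads like something a juror might say, yet it reflects nothing beyond word-adjacency statistics — which is the validity problem in miniature.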

The Practical Problem: A Lack of Representativeness

To the extent that virtual mock juror services are based on off-the-shelf AI, as those I’ve looked at seem to be, they are drawing from the entire internet: not from your venue, not from the people likely to show up for jury service, and not from those who are oriented to the jury context (and likely to adopt a somewhat different persona in that context). There are a large number of situational constraints attached to jury duty, and while you can load a lot of that into a complex prompt, that does not mean you would get similar results when testing a case with AI versus testing it with a representative sample of jury-eligible humans from your venue. One benefit of testing a case in front of thirty or forty of those individuals is that you get a concrete sense of which reactions are common across the group and which reactions are outliers. When you have recruited carefully, you can trust the former as being largely representative of the venue. With an AI test, the machine’s response might be cogent, but how representative it is will remain a guess.

The Legal Problem: Open Source Discoverability

Chatting with a computer might feel confidential, but legally, it is not. The prompt you provide as part of the test, including any case summaries and presentations, will be shared with servers that are likely scattered across the globe. While legislation could change this in the future, there currently seems to be no reasonable expectation of privacy or confidentiality surrounding that information. The platform used by the main virtual juror service I looked at is ChatGPT, and the CEO of that company, Sam Altman, has admitted that its prompts are likely discoverable in a lawsuit. “Right now, if you talk to a therapist or a lawyer or a doctor… there’s legal privilege for it. There’s doctor-patient confidentiality, there’s legal confidentiality,” Altman said. “And we haven’t figured that out yet for when you talk to ChatGPT.” More pointedly, he warned, “If you go talk to ChatGPT about your most sensitive stuff and then there’s a lawsuit, we could be required to produce that.” Even if there were no issues with the validity or reliability of the AI feedback on your case, the discoverability problem alone would be a pretty good reason to stick with human mock jurors who can be bound to a confidentiality agreement as part of a work-product-protected mock trial.

While it is easy to find examples of AI giving bad, humorous, or even dangerous advice, its main selling point is that it often gives a pretty reasonable-sounding reaction. But being a good simulation does not necessarily make it a good research tool. For now, when you want to see what humans will think of your case, I recommend you stick to asking humans.

____________________

Image credit: Shutterstock, used under license