Your Trial Message

(formerly the Persuasive Litigator blog)

Don’t Be Entranced By Statistical Claims From Mock Trial Research

By Dr. Ken Broda-Bahm:

At a conference, I once met another consultant who actually claimed that his mock trial findings would line up with actual trial results with a definite confidence interval of, he said, plus or minus five percent.  It was one of those conference moments when you realize that you urgently need to talk to someone else, because to anyone with even a rudimentary understanding of research methods, the claim was absurd on its face.  A mock trial will not predict your actual trial results for a number of reasons.  But beyond that, there are important limitations on the ability to apply statistical tools to the results of mock trial research. 

This is a critical point for litigators, as mock trial consumers, to understand:  While good design and research practices matter, and mock trials can yield many conclusions that are heuristically valuable and of great practical use, only in limited circumstances can mock trial researchers communicate a finding and follow it up with, “…and that is true at a point-o-five level of statistical significance.”  In this post, I provide my own thoughts on the modest role of statistical methods in the small group research context of most mock trials and focus groups.  The goal is to help the litigator/consumer find the best middle ground between two extremes:  an indifferent “methods don’t matter” perspective on the one hand, and a narrow “what is statistical is what is true” conclusion on the other.  Both positions are equally damaging to the intelligent use of mock trial research.

Statistical Approaches Don’t Generally Fit Mock Trials:

Approaches to running and analyzing mock trials differ, and that isn’t a bad thing.  With an interest in emphasizing high standards (a goal I emphatically agree with), some researchers will stress a need to statistically justify the findings of any mock trial project (an application I emphatically disagree with).  High standards are vital, but standards need to be appropriate to the project’s design and expected utility, and levels of significance cannot be the sine qua non used to justify recommendations.

“Statistical significance” is often interpreted in the public mind as a shorthand for that which is true, reliable, and important (von Roten, 2006).  In reality, it is simply a measure of how likely it is that differences as large as those observed in a sample could arise from sampling error alone (an accident of who is picked) rather than from the factor under investigation.  Even that measure rests on a number of assumptions that are unlikely to bear out in most mock trial research.
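To make “an accident of who is picked” concrete before turning to those assumptions, here is a minimal simulation sketch.  It is an illustration only, not a method drawn from any actual project: the function name, the 50 percent verdict rate, and the 15-point gap are all assumptions chosen for the example.  It draws two groups from the very same population, so any gap between them is pure sampling error, and counts how often that gap looks large.

```python
import random


def share_of_spurious_gaps(group_size, true_rate=0.5, gap=0.15,
                           trials=1000, seed=0):
    """Estimate how often two groups drawn from the SAME population
    differ by at least `gap` in their share of plaintiff verdicts.

    All defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = 0
    for _ in range(trials):
        # Each simulated juror favors the plaintiff with probability true_rate.
        share_a = sum(rng.random() < true_rate
                      for _ in range(group_size)) / group_size
        share_b = sum(rng.random() < true_rate
                      for _ in range(group_size)) / group_size
        if abs(share_a - share_b) >= gap:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (12, 30, 400):
        print(f"group size {n:>3}: gap of 15+ points in "
              f"{share_of_spurious_gaps(n):.0%} of simulated pairs")
```

Under these assumptions, two panels of 12 should differ by 15 points or more in roughly half of the simulated pairs and two panels of 30 in about a quarter, while groups of 400 almost never do.  Nothing caused those gaps except who happened to be picked.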

1.  Random Sampling:  Does every member of the population have an equal chance of being chosen for the sample?  In the case of mock trials using database recruits, those who answer ads, or other volunteers, the answer is definitely “no.”  Yet even for mock trials that rely on randomized selection (our practice), there is still the bias of who is home to answer the phone, who chooses to answer, and who agrees to participate.

2.  Ecological Validity:  Do the methods employed in research approximate the real-life situation under investigation?  Clearly, there are many differences between an actual trial and a mock trial simulation. “The uncontrolled and uncontrollable variables,” notes Doug Keene in a comment to a recent post in Deliberations, include “representativeness of presentation, environmental factors such as evidentiary rulings, judge’s whimsy about hardship issues and strikes for cause, and the talent or charisma of opposing counsel, to name a few.”  These factors prevent a clear claim of ecological validity in most attorney work-product mock trials.  Recent commentary (e.g., Wiener, Krauss & Lieberman, 2011) raises a number of questions that attach even to the more systematic “jury simulations” used as the basis for academic articles in peer-reviewed publications like Law and Human Behavior.  The fact that mock trials inevitably differ from actual trials in duration, detail, and structure is an important limit on their statistical generalizability.

3.  Adequate Sample Size:  Are there enough study participants to allow one to generalize to the population?  With the smaller groups typically used in mock trials, relationships between the variables tested will often fail to reach statistical significance, not necessarily because no relationship exists, but because the sample is too small to detect it.  The smaller the effect you’re trying to measure, the larger the sample it takes to reliably measure it.  For very large differences, you can see results in samples as small as 30, or even fewer.  But for the generally more nuanced differences associated with communication approaches, you would need a project with far more mock jurors in order to measure them well.

4.  Control Group:  When results are looked at experimentally, testing the effect of a specific variable, there needs to be a control group that is not exposed to that variable.  When testing potential trial strategies in a mock trial, however, you are generally looking at one approach.  Applying statistical analysis to that single approach would be like scoring a pharmaceutical drug trial in which everyone received the drug and no one received the placebo.  Deciding that a specific part of the message worked or didn’t work is a judgment call made after evaluating the mock jurors’ feedback and deliberations, and not a statistically governed finding.  There are exceptions, where for example a mock trial will build in scenarios to test (e.g., three juries see the case with the possibly precluded smoking gun memo, and three juries see the same case without it), yet in those cases, any significant conclusions would be limited to the single variable manipulated and not to all of the remaining parts of the case story.
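The sample-size point in item 3 can be put in rough numbers.  The sketch below uses a standard textbook normal approximation for comparing two groups, n per group = 2 × ((z for alpha/2 + z for power) / d)², where d is a standardized effect size.  The function name is hypothetical, and the 0.05 significance level, 80 percent power, and the large/medium/small benchmarks of 0.8, 0.5, and 0.2 are conventional assumptions rather than figures from any particular mock trial.

```python
from math import ceil
from statistics import NormalDist


def jurors_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed in EACH of two groups to detect a
    standardized difference `effect_size` with a two-sided test.

    Normal approximation; an exact t-based calculation would ask for
    slightly more people. Defaults are conventional, assumed values.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # value for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    for label, d in (("large", 0.8), ("medium", 0.5), ("small", 0.2)):
        print(f"{label:>6} effect (d={d}): about "
              f"{jurors_per_group(d)} participants per group")
```

On these assumptions a large effect needs about 25 people per group, a medium one about 63, and a small one about 393.  A two-condition comparison of a subtle difference in communication approach sits well beyond the reach of a typical mock trial panel.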

These limitations are well known and rather obvious to those who work within the litigation consulting field.  When, despite that, practitioners frame mock trial results in the language of statistics, the danger is that they are doing so for the purpose of mystification:  They are trying to convey a special or magical importance to the mock trial results.  That is the opposite of the realistic and practical advice that litigators actually need.

Still, There’s a Limited Role for Statistical Analysis of Mock Trial Results:

Even if a mock trial researcher were able to address each of these limitations, any statistical conclusions would, at most, be a statement about the population from which the jury will be drawn, and not a statement about the selected jury.  That is because juries are never selected randomly, but are instead shepherded through the intensely nonrandom process known as voir dire.  Your selected jury will, if both sides are doing their job, look quite different from your raw venue population, preventing any statistical conclusions from being applied to the seated jury itself.

But that does suggest one specific area where it can, in some circumstances, be meaningful to look at statistical results stemming from a mock trial:  jury selection.  When you have a sufficient number of mock jurors, it can be meaningful to correlate various mock juror attitudes with the leanings and verdicts observed in the mock trial, and to generate a selection strategy based on those relationships.  That approach has its limits when applied to just the results of a single mock trial with twenty-five to thirty participants.  Recruiting a larger number of mock jurors can be fairly expensive, but it does generate statistically meaningful patterns that can be applied to jury selection.  This approach, though, differs quite a bit from the conventional mock trial.

And an Unlimited Opportunity for Mock Trials to Provide Benefits At the Qualitative Level:  

Those who equate “research” with “statistics” might wonder at this point, “If there are so many limits to the statistical generalizability of mock trials, then why do them?”  The answer is simple:  Because they are useful.  Not all research-based learning stems from the relationship of numbers that we call “statistics.”  Most mock trials fall under the category of “qualitative research,” or methods that aim for a deeper understanding of human behavior and, in particular, the reasons for behavior.  A well-designed mock trial allows researchers and clients to focus on the reasons that mock jurors find persuasive and use to influence one another in deliberations.  By looking at the content and the patterns of mock jurors’ own attitudes and reason-giving behavior, the project generates findings and recommendations that don’t depend on any statistical stamp of approval.  Sophisticated mock trial users understand that a pattern observed in a mock trial won’t necessarily be repeated in the actual trial.  But it is still useful heuristically, in generating ideas, assessing approaches, and seeing the range of possibilities.

Still, it is important to remember that the label “qualitative” isn’t an invitation to purely subjective philosophizing.  Methods should still be systematic.  Some of the systematic yet qualitative methods employed in mock trials include:

  • Submitting open-ended questionnaire responses to content analysis to identify frequency and pattern.
  • Looking for meaningful correspondence between attitudes and actions.
  • Creating a taxonomy of the arguments mock jurors offer for and against a party or position.
  • Noting the difference between claims that are understood and used by mock jurors, and those that are forgotten or unclear.
  • Identifying the reasons behind mock jurors’ shift of opinion.

There are many others; all are useful, but none are foolproof.  The bottom line is that it is always a good idea to ask us researchers the grand epistemological question:  How do we know what we claim to know?
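As one illustration of what “systematic” can mean, the first method in the list above (content analysis of open-ended responses) often comes down to disciplined counting.  The sketch below is hypothetical throughout:  the five juror records, the code labels, and the function name are invented for the example and are not data from any project.  It assumes a human coder has already read each response and assigned the codes.

```python
from collections import Counter

# Hypothetical, hand-coded questionnaire responses (invented for illustration).
coded_responses = [
    {"juror": 1, "leaning": "plaintiff", "codes": ["ignored warnings", "sympathy"]},
    {"juror": 2, "leaning": "plaintiff", "codes": ["ignored warnings"]},
    {"juror": 3, "leaning": "defense", "codes": ["personal responsibility", "ignored warnings"]},
    {"juror": 4, "leaning": "defense", "codes": ["personal responsibility"]},
    {"juror": 5, "leaning": "defense", "codes": ["personal responsibility", "no proof of harm"]},
]


def tally_codes(responses):
    """Count how many jurors on each side raised each coded reason."""
    tallies = {}
    for response in responses:
        side = tallies.setdefault(response["leaning"], Counter())
        # set() so a juror who repeats a reason is still counted once
        side.update(set(response["codes"]))
    return tallies


if __name__ == "__main__":
    for leaning, counts in tally_codes(coded_responses).items():
        print(leaning)
        for code, n in counts.most_common():
            print(f"  {code}: {n} juror(s)")
```

A tally like this describes the pattern in the room and which reasons carried each side.  It makes no claim about statistical significance, which is exactly the distinction at issue here.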

Reasonable consultants, researchers, and clients can and do differ over exact methods.  But as consumers of this research, litigators should understand that qualitative methods like mock trials should be valued for their common ability to generate useful and meaningful results, and not for their limited ability to generate statistically significant conclusions.

______ 

Wiener, R.L., Krauss, D.A., & Lieberman, J.D. (2011, June 27).  Mock Jury Research: Where Do We Go from Here?  Behavioral Sciences and the Law.  DOI: 10.1002/bsl.989, Link: http://onlinelibrary.wiley.com/doi/10.1002/bsl.989/full

Photo Credit:  Melomane, Flickr Creative Commons (with alpha symbol added by the author).  For a cool effect, move your head closer to or farther away from the image