By Dr. Ken Broda Bahm:
The Super Bowl is over, the Seahawks are victorious, and it is time for the traditional post-game attention to turn toward the commercials (especially for Broncos fans). This year, the winner in the most-discussed category seems to be the Budweiser commercial featuring the heartwarming bond between dog and horse. And we actually knew that before the ad’s ‘debut’ during the game, thanks to the advance publicity. There were also sites allowing viewers not only to see and rate each of the ads, but also to use dial and slider technology to provide a continual response, so we know exactly when our collective “awww” is elicited during the horse and puppy commercial.
These kinds of response systems, variously known as ‘continual response,’ ‘real-time response,’ or simply ‘dial testing,’ are applied in other settings as well. We’ve seen them scrolling across the bottom of the screen during some political debates, for example, and they’re also used in many higher-end mock trials. The fit, at least on its face, would seem to be clear: If you want exact feedback on specific pieces of evidence, individual legal arguments, and distinct rhetorical moments, why not put a dialer in your mock jurors’ hands and see what works? Because Persuasion Strategies counts itself among the more comprehensive and cutting-edge mock trial service providers, we are often asked, “Do you use those dial thingies during your mock trial?” Our answer has always been, “No, we don’t.” The reason comes down to this: based on the best information from sources other than the companies selling the technology, the devices offer more gimmick than gain, and they raise serious questions about whether they’re measuring a real reaction or creating a reaction to measure. In this post, I aim to put together my best explanation for our choice to avoid the technology, and the best reasons for others to be at least skeptical of the dials.
The Dial Testing Hype
If you relied on the marketing pitches provided by companies that sell instant response as part of their services, there would be little doubt about the value of the technology. Statements like this one from a company called Logos Communications are typical: “Dial testing allows you to target the exact moment your message becomes successful. And perhaps more importantly, it shows you when your message isn’t working.” Similarly, Luntz Global, a political consulting group that often runs instant response sessions on the major news networks, enthuses: “Our dial technology allows you to reach into the very hearts and minds of the people you are trying to reach in a measurable, scientific fashion.”
Often, these systems are offered by individuals who know their stuff from a social science research perspective. But one telling fact is how often they emphasize the word “reliable” more than the word “valid.” “Reliable” means that the measure can be repeated over and over again in different contexts while yielding similar results. But “valid” means that it is actually measuring what it claims to be measuring. Your bathroom scale might return the same number day after day (so it’s reliable, as long as you’re maintaining the same weight), but if its calibration is off, the number it provides isn’t your actual weight (it’s not valid).
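To make the distinction concrete, here is a minimal simulation sketch in Python (the weight, calibration error, and day-to-day wobble are illustrative numbers I’ve invented, not data from any study): a miscalibrated scale returns nearly identical readings every day, so it is highly reliable, while consistently missing the true weight, so it is not valid.

```python
import random

random.seed(1)

TRUE_WEIGHT = 170.0  # hypothetical actual weight, in pounds
BIAS = -8.0          # the scale's calibration error
WOBBLE = 0.2         # tiny day-to-day variation in readings

# Thirty days of readings from the miscalibrated scale
readings = [TRUE_WEIGHT + BIAS + random.gauss(0, WOBBLE) for _ in range(30)]

mean_reading = sum(readings) / len(readings)
spread = max(readings) - min(readings)

print(f"Mean reading: {mean_reading:.1f} lbs (true weight: {TRUE_WEIGHT} lbs)")
print(f"Spread across 30 days: {spread:.1f} lbs")
# The sub-pound spread shows the scale is reliable; the 8-pound gap
# between its mean reading and the true weight shows it is not valid.
```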
The Problems With Dial Testing
While proponents of dial testing point to reliability, the real problems are in validity. And that is where the academic support tends to fall down as well. When strong validity is claimed, a closer look often fails to support that conclusion and, in addition, reveals practical problems that outweigh the novelty and the “wow factor” of a continual response system. For an innovative Prezi presentation on the subject, see this one by University of Pittsburgh professor Gordon Mitchell (and if you don’t know what a ‘Prezi presentation’ is, then this is also a good example of the use of that medium).
Drawing from that resource and a few others, here are some of the validity questions and related concerns that fuel my own skepticism about dial testing.
What Are You Measuring?
Any quantitative system of measurement can create the problem of false precision. The numbers make you feel that you’ve captured something real, but even a Cosmopolitan quiz will give you a score. Whether that score relates to anything real is another question.
In the case of dial testing, the question is whether it is measuring a reaction that would occur if the participants did not have that dial in their hands. That is something we can’t really say. Instead, it seems more likely that the instrumentation itself is creating pressure to react and to vary the reactions from moment to moment, pressure that simply wouldn’t exist in the dial-free environment of a jury box. “The data provided by continual response technologies can be both graphic and beguiling,” psychologist Jon Baggaley (1987) writes. “Peaks and troughs in the moment by moment response profile invite instant interpretations of, for example, ‘high visual interest,’ ‘medium programme appeal,’ ‘low presenter credibility’ – and so on, depending on the response measure used. Such interpretations may be quite invalid.” In addition, as Mitchell (2014) notes, the reduced time for decision making means that participants are more likely to fall back on what they take to be the ‘expected’ answers. For example, viewers rating the Budweiser horse and puppy commercial could well register a spike at the moment the animals are reunited, but we don’t know whether they genuinely feel it or whether they simply recognize that as the intended high point of the spot.
How Can You Separate the Signal From the Noise?
A common research barrier is separating the signal from the noise: being able to distinguish what you want to evaluate from a background of random variation. This is a particular concern in pretrial research settings, which feature rich and complex messages combined with sample sizes that are generally smaller than the statistical ideal. Adding dials to that setting adds a great deal of data, but how much of it is worth paying attention to?
The Super Bowl ad rating site provides a very useful example of the problem. If you take the time to rate one of the messages, try keeping the rating constant for a few moments and you will see a message appear: “Please continue to rate this video.” What it is really suggesting, however, is that you should not leave the dial in one place but instead need to keep moving it around. In that way, the instrument is directly creating a reaction, not simply measuring one.
Even without such a message, the problem remains: How can you tell who has left the dial at a resting point because they’re simply paying attention to the ad and not to the slider (which is what I found myself prone to do during the ads), and who is consciously leaving the dial at the same point because their evaluation hasn’t changed? And among those who are reliably moving their dials or sliders, who is doing so meaningfully and who is doing so randomly? With any stimulus as long as a mock trial summary argument, random movement is a distinct possibility, perhaps even a probability. As Baggaley (1987) notes, “Little is known about the criteria by which a meaningful shift in response can be distinguished from a random one” when using dials.
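That indistinguishability is easy to demonstrate. The sketch below is a toy simulation of my own (the panel size, trace length, and dial values are invented for illustration, not drawn from any vendor’s system or any study): half of a small mock jury holds a genuinely fixed opinion and parks the dial, while the other half drifts at random, yet the averaged trace still shows peaks and troughs that invite interpretation.

```python
import random

random.seed(2)

MOMENTS = 60   # one reading per second over a one-minute stretch
JURORS = 12    # a typical mock-jury-sized sample

def genuine_trace(opinion=65):
    """A juror whose evaluation truly doesn't change: dial parked at one value."""
    return [opinion] * MOMENTS

def drifting_trace(start=50):
    """A juror fidgeting at random: a bounded random walk with no meaning."""
    trace, value = [], start
    for _ in range(MOMENTS):
        value = min(100, max(0, value + random.choice([-5, 0, 5])))
        trace.append(value)
    return trace

# Half the panel holds a steady opinion; half just moves the dial.
panel = ([genuine_trace() for _ in range(JURORS // 2)] +
         [drifting_trace() for _ in range(JURORS // 2)])

averaged = [sum(trace[i] for trace in panel) / JURORS for i in range(MOMENTS)]

print(f"Averaged trace ranges from {min(averaged):.1f} to {max(averaged):.1f}")
# Every peak and trough in the averaged line comes entirely from the random
# drifters, yet on a report chart it would look like the whole panel
# reacting to specific moments in the presentation.
```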
Is It Worth the Distraction?
We know from a great deal of research that an individual’s cognitive load has a substantial effect on their decision making. The amount of cognitive capacity and working memory being used up can directly determine whether individuals take the time to work through a decision or fall back on mental shortcuts, or heuristics, in order to reach a quick conclusion. This matters here because the most obvious effect of handing an audience member a dial and asking for continual feedback is that you are dramatically increasing that audience member’s cognitive load. And you’re doing so in a way that lacks ‘ecological validity,’ in the sense that no juror will be holding a dial in court.
This creates the possibility that participants might be so focused on providing continuous responses that they are paying less attention, or at least different attention, to the message they’re hearing. Or it creates the possibility that individuals will ignore or downplay the continual feedback task so they can devote more of their finite cognitive resources to simply listening. The latter might be more likely: research reported in Baggaley (1987) shows that events earlier in a sequence tend to be evaluated more positively than events that occur later, suggesting that at a certain point respondents simply get tired of moving the dial.
If the instrument were simply distracting to participants, that would be bad enough. But the greater danger may be that it distracts the researchers and those who interpret the research as well. Doug Garnett is an advertising expert specializing in infomercials. In a piece entitled “Why I Don’t Use Dial Testing,” he writes, “Dials reveal minute by minute detail that’s unrelated to sales success. And preoccupation with those details leads production teams to avoid dealing with things that are far more important.”
I strongly believe that anyone considering the use of dials in research should experience it themselves first. Try it and see how meaningful it feels from a participant’s perspective.
The Super Bowl itself is a day for attention-getting gimmicks, especially on the advertising front. In pretrial research, however, there is a profound need to focus on what matters most, and not to get lost in the noise, uncertainty, and distraction of even very ‘cool’ research technologies. As Doug Garnett continues, “The true catastrophe of dials groups is that I’ve always ended up at the same place: detail without insight. Those few times when the report reveals something useful, it always comes from the discussion afterward and not from the dials. So why risk misinterpretation, producer distraction, and outright miscommunication with dials?”
______
Other Posts on Mock Trial Research Methods:
- Don’t Be Entranced By Statistical Claims From Mock Trial Research
- Tune in to the Signal and Not the Noise in Pretrial Research
- Know Your Constraints: A Conversation on Mock Trial Design
- Don’t Ride with ‘Frequent Flyers’ in Your Mock Trial Research
______
Baggaley, J. (1987). Continual response measurement: Design and validation. Canadian Journal of Educational Communication, 16(3), 217-238.
Maier, J., Maier, M., Maurer, M., Reinemann, C., & Meyer, V. (2009). Real-Time Response Measurement in the Social Sciences: Methodological Perspectives and Applications. Frankfurt am Main: Peter Lang.
Mitchell, G. R. (2014). Little Annie’s Hidden Curriculum: Pedagogy and Politics of Dial-Metering Public Debates. Keynote Address, 4th International Conference on Argumentation, Rhetoric, Debate, and the Pedagogy of Empowerment (January 11-13, 2013, Doha, Qatar). Prezi URL: http://prezi.com/v9avigf3b4oy/little-annies-hidden-curriculum/
Photo Credit: Picture of the author, taken and edited by Jason Bullinger, Persuasion Strategies