Multiple Response Questions on the Web
Separating the Good from the Bad

Still More on Shirking

It seems that everyone is getting concerned about Web panel members satisficing.  Someone recently sent me a presentation that Burke has been giving that has gotten at least one of our clients concerned.  Here is my response:

I think there is an element of alarmism that may not be justified.

I'll begin by pointing out that Jon Krosnick first started talking about satisficing way back in 1991, before anyone had even thought about Web surveys or online access panels. His work suggests that there is some level of satisficing in all survey modes, and the right question to ask is whether it is any worse with Web surveys using access panels than with, say, CATI using RDD (what many think of as the gold standard).

Jon has actually done some work on this question, and I saw him present some of it at this year's AAPOR conference. His work compared seven panels with an RDD study. He designed a set of six experiments that varied response order and made minor wording changes designed to measure satisficing. For example, he asked this question:

Which of the following do you think is the most important problem facing the United States today?

  1. The government budget deficit
  2. Drugs
  3. Unemployment
  4. Crime and violence

Half of each panel got this version and the other half got a version that reversed Deficit and Crime. According to satisficing theory, position should affect the likelihood of an answer being chosen; in this case, Deficit should be chosen more often when listed first than when listed last. So his outcome measure was the difference between the percent choosing Deficit when it is at the top vs. when it is at the bottom. The RDD study produced a 1% differential and the panels produced a range from 1.1% (SSI) to 12.9% (Greenfield). Across all six experiments he found an average difference of .8% for RDD, three panels at 1% or less (SSI, Harris, and GoZing), Survey Direct at 1.9%, SPSS at 3.2% and Greenfield at 5.5%. His conclusion: "Overall, Web survey responses have about the same robustness to changes in question order as the telephone survey. . . " although "Greenfield panelists seem more sensitive than any of the others."
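For readers who want the arithmetic spelled out, here is a minimal sketch of the outcome measure described above: the percentage-point difference in how often an option is chosen when it appears first versus last. The counts below are hypothetical, invented purely for illustration; they are not the study's actual data.

```python
def order_effect(chose_first: int, n_first: int,
                 chose_last: int, n_last: int) -> float:
    """Percentage-point differential:
    %(option chosen when listed first) - %(option chosen when listed last)."""
    pct_first = 100.0 * chose_first / n_first
    pct_last = 100.0 * chose_last / n_last
    return pct_first - pct_last

# Hypothetical split-sample counts (illustration only):
# 130 of 1,000 chose Deficit when it was listed first,
# 117 of 1,000 chose it when it was listed last.
diff = order_effect(chose_first=130, n_first=1000,
                    chose_last=117, n_last=1000)
print(f"{diff:.1f} percentage points")  # 1.3 percentage points
```

A differential near zero suggests respondents are reading the whole list; a large positive differential is the primacy effect that satisficing theory predicts.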

His colleagues at the Stanford Institute for the Quantitative Study of Society also reported on comparisons of a broad set of measures including demographics, lifestyle issues, general attitudes and beliefs, and technology use. In several instances they compared these same panels and RDD against established benchmarks such as Census data. Their succinctly stated conclusion: "Remarkable comparability of results."

My point is that while there may be some satisficing going on with Web surveys, there is little evidence that it is seriously affecting results. Clearly, some panels are not as good as others and we need to sort them out, but based on research I saw presented in April at ESOMAR's Worldwide Panel Research Conference, the thing we need to worry most about is the number of surveys the panelist is completing. It seems clear that people who do lots and lots of surveys respond differently from those who do few, but we can typically screen these people out in advance by working with our panel providers.

I also would tend to disagree with some of the Burke suggestions for dealing with satisficing. The classic measure (advocated by Krosnick among others) is non-differentiation of responses within gridded items (straightlining), where studies have shown the strongest enticement to satisfice. The level of non-substantive response (missing data) is another. And some people like using survey length, but that is an inherently dirty measure given varying connection speeds and ISP traffic volumes.
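To make the first two measures concrete, here is a minimal sketch of how one might flag straightlining and compute the missing-data rate for a single respondent's grid. This is my own illustration of the general idea, not Krosnick's or Burke's exact formulation; the grid values are hypothetical.

```python
from typing import Optional, Sequence

def is_straightliner(grid: Sequence[Optional[int]]) -> bool:
    """True if every answered item in a grid carries the identical rating
    (complete non-differentiation). None marks an unanswered item."""
    answered = [r for r in grid if r is not None]
    return len(answered) > 1 and len(set(answered)) == 1

def missing_rate(grid: Sequence[Optional[int]]) -> float:
    """Share of grid items left unanswered (non-substantive response)."""
    return sum(r is None for r in grid) / len(grid)

# Hypothetical respondent: a six-item grid of 1-5 ratings.
respondent = [4, 4, 4, 4, None, 4]
print(is_straightliner(respondent))       # True
print(round(missing_rate(respondent), 2)) # 0.17
```

In practice one would compute these per respondent across every grid in the questionnaire and flag (or weight down) the outliers rather than rely on any single grid.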