Does it really matter how we present scales in Web surveys?

In an earlier post I noted the special issue of SMR focused on Web surveys.  In one of the articles in that issue Don Dillman and two of his former students look at the effect of various ways of displaying scales in online questionnaires.  They start by citing a point made some years ago by Norbert Schwarz to the effect that in the traditional interviewer modes scale labels, whether words or numbers, are “vague quantifiers.”  In a visual mode like Web, how the categories are displayed provides additional clues on how to interpret them.  Order, orientation (vertical vs. horizontal), and the distance between categories are all variations in presentation that might influence how a respondent interprets the question.  We could stop right there and note that this may be a major reason why we often see dramatic differences between telephone and Web.  But there is a good deal more.

They did several experiments testing the impact on response of various scale formats.  The scales were all simple five-point scales.  The key findings:

  • They confirmed Roger Tourangeau’s “good is up” hypothesis by showing that consistently presenting the most positive option first, regardless of orientation, results in respondents answering more quickly than if the most negative option is placed first. There is no difference in the resulting response distributions; it’s just easier for people to process the scale that way.
  • They did not find uneven spacing of response options to be problematic as long as the midpoint and endpoints were visually aligned.  In one manipulation there was more space around the midpoint than around other categories, and in another there was more space between the endpoints and the penultimate points at either end than between other categories.  These displays did not affect response.
  • Similarly, separating a "don't know" (DK) response from the rest of the categories with more space does not seem to have affected response.
  • Finally, they found that labels and numbers on all scale points (as opposed to endpoint labels only) seem to slow people down, although with no impact on the response distributions.

All in all, this is a nice little piece and I recommend that anyone interested in the issue get hold of it and dig into more detail.  My only problem with it is that it tests relatively short, five-point scales.  I can’t help but wonder if these findings would replicate on longer scales like those with seven or eleven categories.