
Posts from April 2009

Some light shed on Web survey breakoffs

It's been hard to go to an MR conference over the last couple of years and not get at least one presentation pitching the use of Flash as a cure-all for much of what ails online research—falling response rates, high abandonment, high levels of satisficing, etc. I have never found these presentations terribly convincing. It's hard to deny that many or perhaps even most online surveys are boring; I just have never been convinced that adding some color and some cool answering gadgets is the solution.

A more encouraging effort has come out of MarketTools and their SurveyScore product, an attempt to quantify the impact of survey design on respondent behavior. Their early research seemed to say that it's all about length, but with a hint that screens with lots of grids are also a problem.

But I see some real ground being broken in an article by Andy Peytchev in the current issue of POQ. (Here I disclose that Andy is a former member of the UM team we have done considerable work with, that we collected the data he analyzed, and that he thanks me among others in his first footnote.) He starts by providing us with a framework for thinking about the problem of survey breakoff, itself a service given the state of the debate. And he tries to link that framework to the larger literature on nonresponse. Perhaps I will go into that in a later post, but here I want to focus on his results. His key finding is that respondents are most likely to break off when they encounter a page they perceive as burdensome. No surprise that one of the design elements that seems to drive breakoffs is grids. Others include:

  • Section introductions where the shift to a new topic is a commitment some respondents are not willing to make
  • Sensitive questions or those where judgment is required
  • Open ends or, especially, screens with multiple open ends
  • Multiple questions on a page
  • Pages with novel respondent tasks, in his case, slider bars and constant sums/tallies

One of his more interesting findings is that respondents who break off generally are not satisficers. Rather, they are people who are taking their time, seem to be reading and considering questions carefully, and in general are putting forth good effort. There simply comes a time when they can't go on because the survey becomes too demanding. But he also notes that there is a relationship between likelihood of breakoff and formal education, suggesting that more challenging survey designs are especially difficult for those with less education.

This is an interesting line of research, and I hope to see more of it in the years ahead.


I don’t know

I have this assignment of sorts to read an often-cited article by Jon Krosnick and some colleagues titled "The Impact of 'No Opinion' Response Options on Data Quality," Public Opinion Quarterly, 66:371-403. This is quite timely, as I have just finished a bit of empirical research with some colleagues that cites this article, although I confess I have not read it in several years. The research was a Web-based experiment in which we tried to assess the impact of offering a DK option in a Web survey. This is of special interest because so much of Web research is about transitioning phone studies to the Web and understanding some of the differences one sees. Interviewers seldom read a DK option to respondents. Rather, they hold it in their pocket and use it as a sort of last resort. Online you need to decide whether to display the DK option or not, and when you do it's not unusual to get a significant increase in its use. Hardly surprising. The key research question is whether you get an overall different distribution of substantive responses depending on whether you offer a DK option. In our study (presented by Mick Couper at the General Online Research Conference in Vienna), we got significantly higher rates of nonresponse (i.e., more frequent selection of DK rather than simply skipping the question) when we presented it on the screen. Differences in the distributions of substantive responses, question by question, were difficult to detect. So the takeaway here is that not presenting the DK reduces nonresponse and does not seem to lead to a lot of guessing and nonsensical answers that destabilize the distributions.
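
For readers who want to see what "difficult to detect" amounts to in practice, here is a minimal sketch with made-up counts (nothing from our actual study): tally item nonresponse in each condition, then run a chi-square test on the substantive categories alone to see whether the two distributions differ.

```python
# Made-up counts (not the study's data): does displaying a DK option change
# the distribution of substantive answers, and how much item nonresponse
# does each condition produce?
from scipy.stats import chi2_contingency

# Rows: experimental condition; columns: substantive answer categories
# (say, a 4-point agree/disagree scale), excluding DK selections and skips.
substantive = [
    [120, 210, 180, 90],   # DK option displayed on screen
    [130, 220, 190, 95],   # DK option not displayed; skipping allowed
]
chi2, p, dof, _ = chi2_contingency(substantive)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.3f}")

# Item nonresponse per condition: DK selections plus outright skips,
# over everyone who saw the question.
nonresponse = {"DK shown": (85 + 10, 695), "DK not shown": (0 + 25, 660)}
for condition, (missing, n) in nonresponse.items():
    print(f"{condition}: item nonresponse = {missing / n:.1%}")
```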

Krosnick and his colleagues come to essentially the same conclusion. There is a body of research (see, for example, Philip Converse, "Attitudes and Non-Attitudes: Continuation of a Dialogue," in The Quantitative Analysis of Social Problems, ed. Edward Tufte, 1970) that argues sometimes people really don't have a "preconsolidated opinion" and you can't force them to come up with one. So to keep people from guessing or answering randomly, it's best to offer a DK. Krosnick and his colleagues argue that selecting the DK is a form of satisficing that is most prevalent among respondents with less formal education, in self-administered situations, and toward the end of the questionnaire. In other words, it's a sign that respondents are not willing to put forth the cognitive energy to give a thoughtful answer, so they jump at the chance the DK offers. Take away the option and these people mostly will give valid answers and you will reduce your nonresponse.

Of course, taking away the DK option will not drive nonresponse to zero. Survey research, like life, is full of compromise. Unfortunately, due to a design flaw introduced by yours truly, our research could shed no light on what happens if you don't offer a DK and don't let respondents simply skip the question. On many commercial Web surveys there is no DK and an answer is required, which may produce a different result. That's a variation we need to test, but at least in this experiment our results were not terribly different from what Krosnick describes.


More comparisons of Web to other methods

I am finally getting around to wading through the mother lode of academic research noted in an earlier post way back at the beginning of March.  The special POQ issue has two articles, one looking at Web versus face-to-face and the other comparing CATI, Web and IVR.  The results are not particularly surprising, but it's nice to see one's suspicions confirmed with well-designed and executed research.

Dirk Heerwegh and Geert Loosveldt report on results from a survey in Belgium designed to assess attitudes toward immigrants and asylum seekers. They put considerable effort into designing both the Web and face-to-face surveys based on Dillman's unimode construction principles. In other words, they worked hard at making the two surveys as comparable as possible rather than optimizing each to its own mode. Their results are pretty convincing. The Web survey produced a higher rate of "don't know" responses, more missing data, and less differentiation in scales.

Frauke Kreuter, Stanley Presser, and Roger Tourangeau looked at social desirability bias across three methods--one with an interviewer (CATI), one without an interviewer (Web), and one with sort of an interviewer (IVR). They drew a sample of University of Maryland alumni and asked a variety of questions about academic performance and post-graduation giving. They were able to verify the respondents' answers against university records. In essence, they were able to tell who was telling the truth and who was not. As with Heerwegh and Loosveldt, the results are pretty much what we would expect. Web reporting was the most accurate and CATI the least, with IVR generally somewhere in the middle.

So there you have it.  We used to like to say that "the Internet changes everything."  Well, it does not appear to have changed some basic principles of survey research.



Gravity. It's not just a good idea. It's the law.

Over the last few weeks I have been in a few settings where people took to comparing survey results from a Web panel (or two) with results from phone or in-person studies that used a probability sample.  The results were not all that comparable, and the general sense of the people looking at these results more or less came down to, "How can this be?"  There then followed a search for reasons that mostly came down to either bad panel or bad panelists. 

Should we be surprised when nonprobability samples yield different results than probability samples?  Or surprised when they are the same?

When I do presentations about panels I often like to quote Humphrey Taylor who has argued that we should accept online results as valid because in a number of comparisons with traditional methods (especially in political polling) they have produced similar results.  Or, to quote him directly, "Newton had no theory that explained gravity or that justified his 'laws' of gravity, dynamics, or optics.  But they became widely accepted because they worked." 

Now it's always seemed to me that this gives short shrift to Sir Isaac's Philosophiæ Naturalis Principia Mathematica, but putting that aside, the thing about gravity is that it always works and in very predictable and mathematically precise ways.   Alas, online does not and there is evidence aplenty to demonstrate it.

This may seem to be another of those traditionalist rants against online, but it's not. It's more like a plea to recognize that online is different, that there is no inherent reason why it should produce the same results as probability-based methods. When it doesn't, the search for an explanation ought to start with the fact that the sample frame (i.e., the panel, or all panels for that matter) is biased and that no amount of demographic balancing or weighting is sufficient to fix that. We have come to simplify the problem by assuming that if we can get the demos right we have a good sample, but the heart of the problem is not the demos; it's the cluster of attitudinal and behavioral characteristics that cause some people to go online while others do not, and cause some people to join panels while the vast majority of others do not. At this point we don't understand those differences well enough to measure them, and so dealing with that bias is a hit-and-miss game.

At least until another Isaac Newton comes along.
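
As a footnote on what "demographic balancing or weighting" actually does, here is a toy sketch of raking (iterative proportional fitting) on invented panel data and invented population margins. The weights end up matching the demographic margins exactly, but nothing in the procedure touches the attitudes and behaviors that led people online and into a panel in the first place.

```python
# Toy raking (iterative proportional fitting) on invented panel data and
# invented population margins -- an illustration, not production weighting code.
import pandas as pd

# Hypothetical panel respondents, skewed toward younger women.
sample = pd.DataFrame({
    "sex": ["F", "F", "F", "F", "M", "M", "F", "M", "F", "M"],
    "age": ["18-34", "18-34", "35-54", "18-34", "55+", "35-54",
            "18-34", "18-34", "35-54", "55+"],
})
sample["weight"] = 1.0

# Invented census-style margins (proportions) to weight to.
targets = {
    "sex": {"F": 0.51, "M": 0.49},
    "age": {"18-34": 0.30, "35-54": 0.35, "55+": 0.35},
}

# Alternate over the margins, rescaling weights until they converge.
for _ in range(50):
    for var, target in targets.items():
        current = sample.groupby(var)["weight"].sum() / sample["weight"].sum()
        sample["weight"] *= sample[var].map(
            {cat: target[cat] / current[cat] for cat in target}
        )

# The weighted demographic margins now match the targets...
for var in targets:
    print(sample.groupby(var)["weight"].sum() / sample["weight"].sum())
# ...but the weights know nothing about why these people joined a panel.
```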