Previous month:
February 2009
Next month:
April 2009

Posts from March 2009

It's free!

Michelle Rawling from Oxford Journals has sent me an email clarifying my earlier characterization of the special issue of POQ as "free to AAPOR members."  She writes:

"Public Opinion Quarterly has recently published its annual special issue, this time covering the topic of Web Survey Methods. As it has been in the past, the POQ special issue is free to all readers! For more information, including a complete list of articles and access to them, visit"

Thanks, Michelle, for the clarification.

Yes, but . . .

Catching up on some things that have been sitting around my inbox, I came across this little tidbit from the Q&A part of a Peanut Labs Webinar in which the company pitched its social network sample product. Someone asked, "How do you ensure 'representative' sample from these social media outlets?" I was more than a little surprised at their answer:

"[The sample] Peanut Labs delivers is "representative" of Internet users that belong to social networks. We cannot make claims beyond that – just as no other online sample provider can claim that their samples are representative of anyone other than people who want to join panels."

This sort of truth in advertising has been in short supply in our industry for some while, and not just in all the buzz about social networking and social media. To no small extent the disappointment that many clients have come to feel about online has its roots in the extravagant claims often made by panel companies in particular, but also online evangelists in research agencies whose business models were built around online methods.

But these moments of truth speaking can be fleeting and the PL spokesperson went on to say:

"However, research-on-research studies to be released early on in the new year show that Peanut Labs sample differs very little from offline and online sample in terms of behavior or attitude. The one exception is in their attitude to online advertising, with which they are more intensely involved."

OK, so nobody is perfect.

From the frying pan to the fire

The latest issue of the International Journal of Market Research has an article by Mike Cooke and some colleagues at GfK describing their attempt to migrate Great Britain's Financial Research Survey, which has run for 20 years, from face-to-face to online. Despite hard work by some of the smartest people I know in this business, and after spending around £500,000, they ultimately concluded that full migration to online was not possible unless they were prepared to risk the credibility of the data itself. There simply were too many differences online that would disrupt their 20-year time series. They ultimately settled on a mixed-mode approach that has an online component but with the majority of interviews still done face-to-face.

Seeing the piece in print (I had heard the research presented previously at a conference a year or so ago) reminded me that much of the research one sees on this general topic of conversion of tracking studies from offline to online doesn't have a happy ending. Earlier this year in Toronto at an MRIA conference I heard a paper by Ann Crassweller from the Newspaper Audience Databank in Canada describing her test of the feasibility of converting a tracking study on newspaper readership from telephone to online. She compared results from her telephone survey to those from four different online panels. She was unable to get an online sample from any of the four panels that matched up with her telephone sample on key behavioral items, and the variation among the four panels was substantial. She concluded that at least for now, she needed to stay with telephone.

Fredrik Nauckhoff and his colleagues from Cint had a better story to tell at the ESOMAR Panel Research Conference in 2007. They compared telephone and online results for a Swedish tracker focused on automobile brand and ad awareness. Results mostly were comparable and where they were not the authors felt they were manageable. They did, however, sound a note of caution about the applicability of their results to countries with lower internet penetration than Sweden (81 percent).

I've personally been involved in a number of similar studies over the last few years, most of which I can't describe in any detail because they are proprietary to clients. One exception is work we did back in 2001 on the American Customer Satisfaction Index. We administered the same interview about satisfaction with a leading online book seller by telephone and to an online panel. To qualify for the survey the respondent had to have purchased from the online merchant in the last six months. We found few significant differences in our results. Additional experiments on this study with offline merchants have been less encouraging.

In 2003 we conducted a series of experiments aimed at the possible transition of at least some of Medstat's PULSE survey from telephone to Web. Despite using two different panels and a variety of weighting techniques we were unable to produce online data that did not seriously disrupt the PULSE time series on key measures. This study continues to be done by telephone.

In some cases, despite significant differences between telephone and online, clients still have elected to transition to online, either in whole or in part. In at least one instance of a customer satisfaction study a client felt that the online results were a more accurate reflection of their experience with their customer base. In another, the cost savings were so significant that the client elected to accept the disruption in the time series and to establish new baseline measurement with online.

What all of this suggests to me is that it is impossible to know in advance whether a given study is a good candidate or a poor candidate for transition to online. There is little question that online respondents are different in a whole host of ways—demographic, attitudinal, and behavioral—from the rest of the population and from the survey respondents we typically interview in offline modes. The key is to understand whether those differences matter in the context of whatever we are trying to measure in our survey. We can only learn this through empirical investigation, and even then, explaining the differences in our results can be frustratingly difficult.

The truth spoken here

Two interesting articles in the current issue of the trade rag, Research Business Report, both saying some things you don't hear said often enough.  First comes Bernie Malinof, founder of the Canadian consultancy Element54.  He sums up the current respondent engagement frenzy like this: "A 50-foot rogue wave is rapidly approaching with an explosion of interactive, rich-media questionnaires using formats that may or may not produce better data.  But I guarantee you they will produce different data."

Readers of this blog will know immediately that I agree.  The experiments with this stuff that are routinely presented at industry conferences almost always focus on the fact that respondents like shinier online questionnaires, but few focus on whether or not the data we get is any better.  In my cynical moments I have been heard to say, "A lot of this Flash-enabled stuff is about people advancing their business models and not about producing better data."

Then here comes Kees de Jong, new CEO of SSI, talking about, among other things, how we got ourselves into the panel data quality crisis. He contrasts the generally cautious approach of Europeans to online research with the full-steam-ahead approach here in the U.S.  To wit, "But while that went on in Europe, Harris Interactive's George Terhanian and his peers convinced the U.S. market of the feasibility of online research.  Everyone began doing it, some without asking enough questions, and things overall got a little out of hand . . . U.S. culture is about business building and everyone wanting their piece of it."

And there you have the quandary.  How do you balance the need to innovate with the need to do good research?  How do you know when the legitimate desire to gain a competitive advantage in the marketplace crosses the line into snake oil?

Does it really matter how we present scales in Web surveys?

In an earlier post I noted the special issue of SMR focused on Web surveys.  In one of the articles in that issue Don Dillman and two of his former students look at the effect of various ways of displaying scales in online questionnaires.  They start by citing a point made some years ago by Norbert Schwarz to the effect that in the traditional interviewer modes scale labels, whether words or numbers, are “vague quantifiers.”  In a visual mode like Web, how the categories are displayed provides additional cues on how to interpret them.  Order, orientation (vertical vs. horizontal), and the distance between categories are all variations in presentation that might influence how a respondent interprets the question.  We could stop right there and note that this may be a major reason why we so often see dramatic differences between telephone and Web.  But there is a good deal more.

They did several experiments testing the impact on response of various scale formats.  The scales were all simple five point scales.   The key findings:

  • They confirmed Roger Tourangeau’s “good is up” hypothesis by showing that consistently presenting the most positive option first, regardless of orientation, results in respondents answering more quickly than if the most negative option is placed first. There is no difference in the resulting response distributions; it’s just easier for people to process the scale that way.
  • They did not find uneven spacing of response options to be problematic as long as the midpoint and endpoints were visually aligned.  In one manipulation there was more space around the midpoint than around other categories, and in another there was more space between the endpoints and the penultimate points at either end than between other categories.  These displays did not affect response.
  • Similarly, separating a DK response from the rest of the categories with more space does not seem to have affected response.
  • Finally, they found that labels and numbers on all scale points (as opposed to endpoint labels only) seem to slow people down, although with no impact on the response distributions.

All in all, this is a nice little piece and I recommend that anyone interested in the issue get hold of it and dig into the details.  My only problem with it is that it tests only relatively short, five-point scales.  I can’t help but wonder whether these findings would replicate on longer scales, like those with seven or eleven categories.