For those mobile users sneaking into Web surveys

AAPOR's online journal Survey Practice continues to be one of my favorite places to go for quick updates on the latest methodological research. To wit, the December issue has a nice little piece by Mario Callegaro on mobile users who perhaps unexpectedly show up to take an online survey designed for administration on a laptop or desktop computer. Frequent conference goers have heard lots of papers about the problems of designing for mobile (simple screens, straightforward questions, less than five minutes, etc.). Mario's piece reminds us that the more immediate problem may be dealing with respondents who are ahead of the curve and already expect to do surveys on their smartphone or iPad. He describes how to detect these folks, presents a series of screen captures that show just how badly different mobile devices can scramble a Web survey, and reports on an experiment that shows significantly higher termination rates for mobile versus computer respondents. He concludes with a range of options, from ignoring mobile respondents (as long as they remain a very small percentage, say less than 2 percent) to modifying the survey design to accommodate different screen sizes. It's good advice worth checking out firsthand.

We do a lot of very long and complex Web surveys, sometimes including multiple DCM exercises, all of which make designing for mobile impossible. Our preference has been to detect mobile users at the outset and turn them away with a message that says, in essence, you need to do this on a real computer. We've never tracked how many of those people actually come back later, but we probably should.
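The detect-and-redirect approach described above is usually done by inspecting the HTTP User-Agent header before serving the first survey page. A minimal sketch follows; the keyword list and function names are my own illustration (real deployments typically rely on maintained device-detection libraries, since user-agent strings change constantly):

```python
import re

# Hypothetical keyword list for illustration only; production systems
# should use a regularly updated device-detection library instead.
MOBILE_PATTERN = re.compile(
    r"iphone|ipad|android|blackberry|windows phone|opera mini|mobile",
    re.IGNORECASE,
)

def is_mobile(user_agent: str) -> bool:
    """Return True if the User-Agent string looks like a mobile device."""
    return bool(MOBILE_PATTERN.search(user_agent or ""))

def route_respondent(user_agent: str) -> str:
    """Turn mobile users away at the outset, as described above."""
    if is_mobile(user_agent):
        return ("Please return and complete this survey "
                "on a desktop or laptop computer.")
    return "BEGIN_SURVEY"
```

Logging which respondents were turned away would also make it straightforward to track how many of them come back later.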

To tell you the truth . . .

Those of us who are old enough to remember the early days of online may also recall one troubling finding: the disparity in customer sat ratings across modes, that is, online vs. phone. Online always seemed to be lower. Multiple hypotheses were advanced, but once the empirical work got going it seemed to come down to one of two mode effects: good old-fashioned social desirability bias, or simply differences in the way people use scales in a visual mode like Web as compared to an aural mode like telephone.

I admit to having always been skeptical of social desirability as an explanation when it comes to something like satisfaction with someone's product or service. The literature on social desirability generally has focused on really heavy issues like drug use and abortion. Then I saw some stuff from Harris Interactive that claimed to find effects on behaviors as mundane as how often one brushes one's teeth or goes to church. And now an article (with too many authors to cite) in the summer issue of POQ goes even further and seems to show that subjective items such as attitudes and opinions also are affected by social desirability bias. The experiment was not with Web, but rather used a method they call TCASI (Telephone Computer-Assisted Self-Interviewing). In this approach a person is called and the interview starts as a conventional telephone interview. At one point the respondent is switched over to IVR and the interview moves from interviewer administration to self-administration. In the experiment respondents were randomly assigned either to full telephone or to TCASI. Respondents in the TCASI treatment were more supportive of traditional gender roles and corporal punishment, less supportive of integrated neighborhoods and same-gender sex, and more likely to agree that occasional marijuana use is harmless and to describe themselves as attractive. In other words, the telephone interviews were more likely to deliver socially tolerant and politically correct answers than were the self-administered interviews.

The thing I like about this study is that it doesn't confound the issue with other factors the way a straight Web-telephone comparison might. There is no visual dimension; it's all aural. There also is no issue of comparing panel sample (non-probability) with probability sample.

Of course, the key issue in these discussions always comes down to the simple question: what is the truth and therefore the better method for getting at it?

More comparisons of Web to other methods

I am finally getting around to wading through the mother lode of academic research noted in an earlier post way back at the beginning of March.  The special POQ issue has two articles, one looking at Web versus face-to-face and the other comparing CATI, Web and IVR.  The results are not particularly surprising, but it's nice to see one's suspicions confirmed with well-designed and executed research.

Dirk Heerwegh and Geert Loosveldt report on results from a survey in Belgium designed to assess attitudes toward immigrants and asylum seekers. They put considerable effort into designing both the Web and face-to-face surveys based on Dillman's unimode construction principles.  In other words, they worked hard at making the two surveys as comparable as possible rather than optimizing each to its own mode.  Their results are pretty convincing.  The Web survey produced a higher rate of "don't know" responses, more missing data, and less differentiation in scales.

Frauke Kreuter, Stanley Presser, and Roger Tourangeau looked at social desirability bias across three methods--one with an interviewer (CATI), one without an interviewer (Web), and one with sort of an interviewer (IVR).  They drew a sample of University of Maryland alumni and asked a variety of questions about academic performance and post-graduation giving.  They were able to verify the respondents' answers against university records.  In essence, they were able to tell who was telling the truth and who was not.  As with Heerwegh and Loosveldt, the results are pretty much what we would expect.  Web reporting was the most accurate and CATI the least, with IVR generally somewhere in the middle.

So there you have it.  We used to like to say that "the Internet changes everything."  Well, it does not appear to have changed some basic principles of survey research.

Toronto in January?

I've just come back from Toronto where I gave a talk at NetGain 3.0, a one-day conference put on by MRIA. As the title suggests, the focus was online research and the presentations covered all of the usual ground that conferences like this cover. Now I don't mean that as a knock. I think it's good news that the issues are being widely discussed in all sorts of venues. There may not be a whole lot of new solutions being proposed but at least people are increasingly aware of the problems the industry and clients are wrestling with.

The conference was opened by Pete Cape from the UK and SSI. Pete has been a major voice in the ongoing debate. He took the group through an exercise that quickly exposed that we are an industry of amateurs with little background or formal training in market research. Most people seem to have just stumbled into the business, and that's not just true in Canada. It goes a long way toward explaining why we struggle with many of these methodological issues. Bottom line: as an industry we too often don't really understand what we are selling or the validity of the claims we make for it.

Next up was John Wright, a political pollster from Ipsos-Reid. His talk was equal parts bragging about how accurate their telephone polling has been, presenting lots of data "proving" that online can be just as good as telephone polling if it's done right, and railing at organizations like MRIA and AAPOR for their intransigence around the reporting of margin of error statistics for online studies. The truth is that political polling is one arena where online has been shown to work pretty well, although the art of political polling is arcane enough that we should probably not infer much about other kinds of research. The railing against MRIA and AAPOR was Exhibit A in Pete Cape's argument that research training is desperately needed in our industry. I happened to be sitting next to the Standards Chair for MRIA, and we agreed that John's quarrel was not with MRIA or AAPOR but with the guy who invented the margin of error calculation, with its problematic assumption that you have a probability sample.
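To see why that assumption matters, it helps to look at the calculation itself. The standard margin of error for a proportion is derived under simple random sampling, which an opt-in online panel is not. A minimal sketch (the function name is mine):

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of a 95% confidence interval for a proportion,
    derived under simple random sampling -- the very assumption
    that does not hold for an opt-in online panel."""
    return z * math.sqrt(p * (1 - p) / n)

# Classic worst case: p = 0.5 with n = 1,000 gives roughly
# plus or minus 3.1 percentage points.
moe = margin_of_error(0.5, 1000)
```

Nothing in the formula accounts for who opted in or why, which is exactly why reporting it for a non-probability online study is contentious.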

Next up was a paper by Anne Crassweller that she had also presented in Dublin at the ESOMAR Panels Conference. It's one of those studies chronicling attempts to move a long-term study online and failing to do so because the topic—newspaper readership—is to some degree correlated with online behavior. This would seem to be a classic example of where online does not fit the purpose of the research.

Then came what I thought was the best presentation of the conference, by Barry Watson from Environics. These guys build population segmentation models based on attitudes and values. Barry presented some data comparing three online panels to the general US population. A key segment way overrepresented in the panels is what they call "liberal/progressives." The underrepresented segments included groups they call "disenfranchised" and "modern middle America." To really understand the implications one would have to dig deeper into the segment composition, but this approach of trying to understand the attitudinal and behavioral differences of online panelists versus the general population strikes me as very important and generally missing when people make claims of "representativeness." Mostly the industry has expressed these things in demographic terms, which really are somewhat meaningless in this context.

Barry also gave us the best quote of the conference: "Bias is only a problem when you don't know what it is."

The afternoon was less interesting, even with me kicking it off. My main message: let's stop talking about representativeness and instead focus on understanding bias and how it relates to the business problem we are studying.

Next we had the obligatory argument for "eye candy" to increase respondent engagement and lots of data to show just how widespread social desirability bias can be. And there was a pitch from the RFL people about their "pillars of quality."

When it was all said and done I found it not a bad way to spend a day. I got some fresh perspective and a chance to rant a bit which is always welcome.

Lying about satisfaction?

Back in September I described a WSJ piece that reported on a set of findings from Harris Interactive suggesting that social desirability operates more widely than perhaps I had thought.  Nonetheless, I was not convinced that it was an especially significant concern for customer satisfaction surveys.  Turns out, I might be wrong about that.

We are working on a proposal in which we are looking at the possible impacts of transitioning a customer sat study from telephone to IVR.  While doing my due diligence on this I found a 2002 POQ article (Roger Tourangeau, Darby Miller Steiger, and David Wilson (2002), "Evaluating IVR," POQ, 66, 265-278).  In a set of well-designed experiments they found that telephone interviewing consistently produced higher sat scores than IVR.  While the differences were not major (less than a point on mean scores for a 10-point scale) and not always significant, they were very stable across questions and different scale lengths.

The obvious question (at least for me) is how this might translate to Web where for years we have seen major differences in sat scores when compared to phone.  Of course, with Web we have a second variable, namely seeing the scale displayed rather than having it read.  There is some interesting research there as well, but I'll save that for another post.