This month's Research Business Report has two very interesting and related pieces. In one, the editor takes the ARF ORQC to task for its somewhat languid reporting on the results of its Foundations of Quality initiative and questions the usefulness of the findings released to date. (In case you've not been paying attention, the ARF has spent a ton of money doing basic research on research aimed at improving our understanding of some of the long-talked-about quality concerns with online panels.) At the risk of oversimplifying, I would characterize the ARF results to date as reassurance that the principal panel quality issues the industry has been most worried about over the last few years—fraudulent respondents, multipanel membership, people taking the same survey more than once, high levels of satisficing, etc.—are real but controllable, either by the panel supplier or the researcher.
Unfortunately, in the process of answering what we thought were the big questions, they have raised an even bigger one. Here is one bit of the ARF data that has made it into the public domain. When comparing the incidence of smoking in survey results from 17 panels to benchmark data collected with traditional probability-based methods, they found that (a) the three probability-based methods produced similar results, while (b) the estimates across the 17 panels varied by as much as 14 points. This should not surprise us. Panels are all very different in how they recruit their members and in how they manage those members over time. We can control the demographics, at least at the sampling stage, but the mix of attitudes and behaviors of panelists will vary substantially from panel to panel based on a wide range of things (e.g., where they were sourced, the appeal that was used, the incentive scheme, the survey topics they offer, etc.) that we still don't understand very well. In the end, the results we get for any given study depend heavily, and mostly unpredictably, on the panel we use. This is not good news.
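To make that kind of spread concrete, here is a minimal sketch, using purely hypothetical numbers rather than the ARF's actual figures, of how one might summarize panel estimates for a single measure (say, smoking incidence) against a probability-based benchmark. The function name and the panel values are illustrative assumptions only:

```python
# Illustrative sketch only: hypothetical panel estimates, not the ARF data.
def spread_vs_benchmark(panel_estimates, benchmark):
    """Return the range across panels and each panel's deviation from the benchmark."""
    deviations = {panel: est - benchmark for panel, est in panel_estimates.items()}
    spread = max(panel_estimates.values()) - min(panel_estimates.values())
    return spread, deviations

# Placeholder percentages for a handful of panels (made up for illustration).
panels = {"Panel A": 18.0, "Panel B": 24.5, "Panel C": 31.0, "Panel D": 27.2}
benchmark = 21.0  # hypothetical probability-based benchmark value

spread, deviations = spread_vs_benchmark(panels, benchmark)
print(f"Range across panels: {spread:.1f} points")
for panel, dev in sorted(deviations.items()):
    print(f"{panel}: {dev:+.1f} points vs. benchmark")
```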
Which brings us to the second piece, the RBR lead story. For about the last 18 months Steve Gittleman has been doing something he calls "The Grand Mean Project." In a truly prodigious effort, Steve has amassed survey data covering something like 90 panels worldwide and has been using those data to study variability among panels. He argues that, with the move away from probability-based methods and the consistency that comes with a rigorously applied scientific approach, we have lost our anchor to reality and, with it, any hope of using panels to get at "true values" for survey variables. Rather, the best we can hope for is that the same survey administered to the same panel multiple times will produce consistent results. Administer that survey to another panel, and you likely will get different results. Steve believes he has developed a method that measures panel consistency over time and that it can be used to identify the most consistent panels. In essence, he is making an argument others have made: bias is not necessarily a bad thing as long as we know it's there, it's consistent over time, and we have other sources of information on the same topic that can help us correct it.
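Gittleman's actual metric isn't spelled out here, but the underlying idea, ranking panels by how reproducible their own results are when the same survey is fielded repeatedly, can be sketched with a toy consistency score. This is my illustration under assumed names and made-up numbers, not the Grand Mean Project's method:

```python
from statistics import pstdev

# Toy illustration of "consistency over repeated waves", not Gittleman's metric.
# Each panel maps to the estimates it produced when the same survey question was
# fielded several times (hypothetical percentages, for illustration only).
waves = {
    "Panel A": [22.1, 22.8, 21.9, 22.4],
    "Panel B": [30.5, 26.0, 33.1, 28.7],
    "Panel C": [17.2, 17.5, 16.9, 17.3],
}

def consistency_score(estimates):
    """Lower is better: wave-to-wave standard deviation of a panel's own results."""
    return pstdev(estimates)

# Rank panels from most to least consistent with themselves over time.
for panel in sorted(waves, key=lambda p: consistency_score(waves[p])):
    print(f"{panel}: score = {consistency_score(waves[panel]):.2f}")
```

Note that a score like this says nothing about accuracy; a panel can be perfectly consistent and consistently wrong, which is exactly why the argument leans on outside information to correct for bias.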
A little over ten years ago, Gordon Black, founder of Harris Interactive, asked "whether findings from huge samples of Internet respondents, coupled with sophisticated weighting processes, are as accurate as anything done on the telephone or door-to-door." More and more, it seems the answer is no.