
Posts from August 2015

Failure to replicate

The not-so-big news last week was the NYT article with the intriguing title, “Many Psychology Findings Not as Strong as Claimed, Study Says,” a rehash of this article in Science. In case you missed it, the bottom line is that findings from roughly two-thirds of the published psychology studies examined could not be replicated.

This should not surprise us. Way back in 1987 a group of British researchers compared results published on various cancer-related topics and “found 56 topics in which the results of a case control study were in conflict with the results from other studies of the same relationship.” And, I expect all of us can think of a few things we do that used to be healthy but now are not. And vice versa.

What might all of this mean for MR? 

On more than one occasion Ray Poynter has warned us about publication bias and the fact that results from just one study do not always point to truth. The former is cited as one potential culprit in the Science article, and the latter ought to be obvious to anyone calling him or herself a researcher. But is there something even more troubling at the heart of the problem, something that MR has in spades: questionable sampling practices?

Psychologists are notorious for their use of relatively small convenience samples and the belief that randomizing sample members across treatments cures all. I did not look at all 100 studies in the Science article, but the handful I looked at all collected new samples ranging in size from 60 to 220. Students and people in the street were popular choices. If this is indicative of all 100, I am surprised that only about 60 failed to replicate. Then again, I drew a very small convenience sample.
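A rough simulation makes the point about small samples. This is my own Python sketch, not anything from the Science article, and the numbers are assumptions: a real but modest effect (Cohen's d = 0.4) studied with 30 people per group reaches conventional significance only around a third of the time, so a string of failed replications is exactly what small samples should lead us to expect.

```python
import random
import statistics

random.seed(1)

def one_study(n_per_group=30, effect=0.4):
    """Two-group study with a modest true effect (Cohen's d = 0.4).

    Returns True if the observed difference is 'significant'
    (|t| > 2, roughly p < .05 with ~58 degrees of freedom).
    """
    a = [random.gauss(0.0, 1.0) for _ in range(n_per_group)]
    b = [random.gauss(effect, 1.0) for _ in range(n_per_group)]
    se = (statistics.variance(a) / n_per_group
          + statistics.variance(b) / n_per_group) ** 0.5
    t = (statistics.mean(b) - statistics.mean(a)) / se
    return abs(t) > 2.0

# Fraction of 2,000 simulated studies that come out significant,
# i.e., the power of the design -- well under 50% despite a real effect.
power = sum(one_study() for _ in range(2000)) / 2000
print(round(power, 2))
```

Under these assumptions, two such studies of the same true effect will both come out significant only about one time in ten, which is why "it didn't replicate" often says more about the design than the finding.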

For the most part, MR avoids small samples, except in qualitative research where we generally are smart enough to characterize the results as “directional” at best but seldom representative. Good for us. But we are knee-deep in convenience samples these days. A little less certainty in what we claim they represent wouldn’t hurt.

Online samples: Paying attention to the important stuff

Those of you who routinely prowl the MRX blogosphere may have noticed a recent uptick in worries about speeders, fraudulent respondents, and other undesirables in online surveys. None of this is new. These concerns first surfaced over a decade ago, and I admit to being among those working the worry beads. An awful lot has changed over the last 10 years, but it seems that not everyone has been paying attention.

Yesterday, my buddy Melanie Courtright at Research Now reached her I’m-not-going-to-take-it-any-more moment and posted an overview of what are now widely accepted practices for building and maintaining online sample quality. Most of this is not new, nor is it unique to Research Now. If you are really worried about this stuff, choose your online sample supplier carefully and sleep at night. ESOMAR has for many years provided advice on how to do this (select a supplier, not sleep at night).

Of course, none of this guarantees that you are not going to have some speeders sneak into your survey who will skip questions, answer randomly, choose non-substantive answers (DK or NA), etc. Your questionnaire could be encouraging that behavior, but let’s assume you have a great, respondent-friendly questionnaire. Then the question is, “Does speeding, with its attendant data problems, matter?” The answer is pretty much, “No.” It may offend our sensibilities, but the likely impact on findings is negligible. Partly that’s because we seldom get a large enough proportion of these “bad respondents” to significantly impact our results, but also because their response patterns generally are random rather than biased. See Robert Greszki’s recent article in Public Opinion Quarterly for a good discussion and example.
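A quick way to convince yourself that a small share of random responders mostly adds noise rather than bias is to simulate it. This is my own Python sketch with made-up numbers, not anything drawn from Greszki's article: genuine respondents answer a yes/no question 60/40, while a small fraction of "speeders" answer at random.

```python
import random
import statistics

random.seed(42)

def simulate(n=1000, p_random=0.05, true_share=0.60):
    """Simulate a yes/no survey question with n respondents.

    A fraction p_random answer at random (50/50), as a speeder
    might; everyone else answers 'yes' with probability true_share.
    Returns the observed share of 'yes' answers.
    """
    answers = []
    for _ in range(n):
        if random.random() < p_random:
            answers.append(random.random() < 0.5)          # random responder
        else:
            answers.append(random.random() < true_share)   # genuine answer
    return sum(answers) / n

# Average observed 'yes' share over 200 simulated surveys,
# with and without 5% random responders.
clean = statistics.mean(simulate(p_random=0.0) for _ in range(200))
noisy = statistics.mean(simulate(p_random=0.05) for _ in range(200))
print(round(clean, 3), round(noisy, 3))
```

With 5% random responders on a 60/40 question, the observed share shifts by only about half a percentage point toward 50% — attenuation, not a directional bias — which is why modest amounts of speeding seldom change what a study concludes.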

The second iteration of the ARF's Foundations of Quality initiative also looked at this issue in considerable detail and offered these three conclusions:

  • For all the energy expended on identifying those with low quality responses, they may make less of a difference in results than focusing more clearly on what makes for a good sample provider.
  • Further, when sub-optimal behaviors occur at higher rates, they generally indicate a poorly designed survey – some combination of too long, too boring, or too difficult for the intended respondents. Most respondents do not enter a survey with the intention of not paying attention or answering questions in sub-optimal ways, but start to act that way as a result of the situation they find themselves in.
  • Deselecting more respondents who exhibit sub-optimal behaviors may increase bias in our samples by reducing diversity, making the sample less like the intended population.

The irony in all of this is that the potential harm caused by a few poor-performing respondents pales in comparison to the risk of using samples of people who have volunteered to do surveys online, especially in countries with low Internet penetration. There is a widely accepted belief in the magical properties of demographic quotas to create representative samples of almost any target population. No doubt that works sometimes, but we also know that, depending on the survey topic, other characteristics are needed to select a proper sample. What characteristics and when to use them remain open questions. Few online sample suppliers have proven solutions, and outside of academia little effort is being put into developing one.