The end of “don’t ask, don’t tell” in online survey research?

My colleague, Gregg Peterson, attended last week's CASRO Online Research Conference and has sent me this post.

The era of "don't ask, don't tell" in the world of commercial market research may well have ended last week in Las Vegas at the annual CASRO Online Research Conference. The purveyors of online surveys came face-to-face with the survey-taking monsters they created. Back in our rooms, if we looked up at the big mirrored walls of our swanky conference hotel suites, we noticed the real culprits staring straight back at us.

Hands down, the talk of last week's conference was "the panel of panelists." These were eight people recruited at the back end of a large quantitative study of web survey takers, there to help us understand their world and to tell us what they liked and disliked about the surveys we served up to them. They were selected from among a few hundred qualified Las Vegas respondents (the Las Vegas respondents being a sub-quota of the large national study), initially on the quality and thoughtfulness of their open-ended responses in the quantitative survey and, at a second stage, in a follow-up interview with the panel's eventual moderator, on how well they could articulate their respective panel-taking experiences. (Let's hope the recruiter was unconcerned about demographic diversity, because seven of the eight were not employed full time, none were below the age of 30, and all were Caucasian.)

And articulate they did - all of the scary sins of our industry. Here's the least surprising of what we learned: each of them was a member of multiple panels. The least ambitious belonged to a mere four panels, while the majority seemed to belong to 8-10. When asked who had received multiple invitations to the same survey, all hands shot up immediately. A few happily admitted to taking the same survey multiple times. Some seemed aware that it might not be completely kosher to do so, while one claimed proudly to have responded to all 10 invitations. ("It's not up to me to police you guys.") One very articulate elderly woman suggested that panel companies should let members indicate when they will be on vacation, because it was "so hard to come home to 600 or 800 survey invitations." All were very clear about the importance of incentives, and one panelist admitted to taking surveys and participating in focus groups simply to pay off the mortgage on his vacation home. And he was less than happy about the new speeder-detection tools sometimes embedded in our surveys. "It's easy to go fast when you've seen the same thing over and over again."

As always, there are caveats. This was, after all, a qualitative study - not even pretending to be representative like most of the surveys we report on. That said, there was some good news here. These folks take their survey-taking responsibilities seriously. They did surveys at least in part because they really like giving their opinions. They seemed sincere in their pledge of honesty and their good intentions about providing truthful answers to all questions, including screeners. They only wish we were better at our jobs. They spoke of broken links, screeners that take forever, highly repetitive surveys, progress bars that clearly lie, 50-minute surveys with too small an incentive, incentives that never show up, questions without a "Don't Know" category, missing response categories in our employment questions (retired people do not want to mark themselves as unemployed), and the dread of being confronted with dense text or "a page full of tiny little boxes." They don't think their detergent has a personality, and they were certain that there aren't 60 different ways to describe a soft drink. A few really preferred surveys that gave them a chance to provide detailed open-ended responses, and many articulated the pleasure of taking a visually interesting and well-constructed survey.

What was perhaps most surprising was the reaction of researchers following this session. I heard a senior sales executive of a very large panel company express shock and amazement at the volume of surveys being taken and the professionalism of the respondents on the panel. Everyone was abuzz. Perhaps it was just the relief of getting it out in the open once and for all. We could suddenly talk more freely about what we feared: the lifeblood of our industry may be a small army of highly determined professional survey takers.

And guess what other topic was prominently featured on this year's agenda? Routers and maximizing panel "efficiency".  In other words, let's figure out if we can get these poor souls to do a few more poorly constructed surveys. We have work to do.


The bottom line on online sample routers

So what's the bottom line? I don't think we need to fear routing. Routing is essentially about automating and systematizing what panel companies have been doing for a decade. While some vendors are working on ways to claim greater representivity through the questions they ask in their screeners, that's really a separate issue from routing. Likewise, sample quality solutions such as Imperium and TrueSample are distinct from routing.

So routing does not solve, nor does it pretend to solve, the underlying problems of doing research with volunteers, whether on panels or through other online sourcing methods. It probably does not make things worse, and it may create some significant improvements by bringing more discipline to the online sampling process and making it possible to broaden the sources we can use. What is certain is that the future is one with more routing, not less. My sense is that currently around 40% (give or take 10 points) of US online sample is being routed, and I expect that share to continue to increase and eventually dominate.

Of course, the real bottom line is what effects, if any, we see in survey results using routed sample. It's not clear to me that that research has been done, and I'm glad to see the ARF taking it on as part of their larger online sample quality initiative. Clients have made it very clear that the one thing they absolutely must have is reliability, replicability, consistency, whatever you choose to call it. They can't get one result this week and a different result with the same survey two weeks later. They can live without representivity or classic validity, but they must have reliability. I hope the ARF initiative helps us understand the role routing might play in meeting that need.


The problem of router bias

Everyone worries about router bias, and it's not clear that anyone has figured out how to deal with it. At its core it seems to come down to the priority given to one survey over others and how that impacts the samples delivered to all of those other surveys. Before panels, we drew samples that were independent of one another. But online samples create a dependency, because sample drawing for a group of surveys is centralized, drawing on a fixed pool of respondents (in the case at hand, those entering the router). And, of course, the needs of one survey may conflict with those of another. I think of routers as chaotic systems in action: each time a new survey is introduced you get a classic butterfly effect.

Here is a classic example. Let's say we have a general survey of car buyers as well as a survey of Hummer buyers, both active in the router at the same time. Because of the low incidence of Hummer buyers, all of them might be routed to that survey, leaving the general survey of car buyers with no Hummer buyers in it. (And if you happen to believe, as I do, that Hummer buyers have a unique perspective on the world, that perspective, too, may be underrepresented in other surveys that have nothing to do with cars.) In other words, meeting the needs of low-incidence surveys in the router could well bias the samples for the other surveys active at the same time. It gets really complex when you realize that there are hundreds if not thousands of surveys, all being serviced simultaneously by the same router and from the same pool of respondents.
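To make the Hummer example concrete, here is a minimal toy simulation, purely my own illustration and not any vendor's actual router logic; the pool size, the 2% incidence, and the quotas are invented numbers. It gives the low-incidence survey first claim on qualifiers and fills the general car survey from whoever is left.

    import random

    random.seed(42)

    # Hypothetical pool: roughly 2% of 10,000 respondents are Hummer buyers.
    pool = [{"hummer_buyer": random.random() < 0.02} for _ in range(10_000)]

    hummer_survey = []   # low-incidence survey, quota of 150
    general_survey = []  # general car-buyer survey, quota of 1,000

    for person in pool:
        # Priority rule: the low-incidence survey gets first claim on anyone who qualifies.
        if person["hummer_buyer"] and len(hummer_survey) < 150:
            hummer_survey.append(person)
        elif len(general_survey) < 1_000:
            general_survey.append(person)

    share = sum(p["hummer_buyer"] for p in general_survey) / len(general_survey)
    print(f"Hummer buyers in the general car survey: {share:.1%}")  # ~0%, versus ~2% in the pool

Run it and the general car survey comes back with essentially no Hummer buyers, even though they make up about 2% of the pool. Replace the strict priority rule with random assignment among the surveys a person qualifies for and some Hummer buyers land back in the general survey, which is exactly the mitigation discussed next.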

Some argue that random assignment makes this problem go away, but it's not clear that it does or how widely it's practiced. The fallback position seems to be that this problem is inherent to panels, has always existed, and has been managed inconsistently even within a single panel company. But everyone also agrees that it's a problem requiring a good deal more thought and empirical research before we understand it well enough to deal with it effectively.


Respondents in routers

The heart of routing is screening respondents. Conceptually, routing involves amassing all of the screening questions of the waiting surveys (of which there are typically hundreds) and packaging them so that they can be administered efficiently while minimizing the chances that a respondent gets routed to a survey and then fails to qualify. Some companies also use whatever information they already have on an individual to shorten the screening time. This might include profiling information on their panel members or, for non-panel members, information stored from a previous visit to the router by that individual.

One obvious concern is the fate of potential respondents in this process. Some people may take a long time to qualify, and sometimes a respondent can pass the screener but still not qualify in the survey proper. It's not uncommon to offer screen-outs, and occasionally completes, the opportunity to go back into the router to try to qualify for another survey. Some companies place limits on how long they will let a person stay in the router. Others limit the number of survey opportunities offered, while some take the attitude that it's the Internet and people will do what people will do. Unlike real black holes, people can and do escape.
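A minimal sketch of that bookkeeping, with hypothetical survey names, question fields, and limits of my own; real routers are far more elaborate. It pools the screening questions of the waiting surveys, skips anything already known from a respondent's stored profile, and enforces caps on time in the router and on the number of survey opportunities offered.

    import time

    # Hypothetical waiting surveys and the profile fields they screen on.
    waiting_surveys = {
        "auto_general": {"owns_car"},
        "hummer_study": {"owns_car", "owns_hummer"},
        "soft_drinks":  {"drinks_soda"},
    }

    MAX_SECONDS_IN_ROUTER = 600   # assumed limit on time spent in the router
    MAX_OPPORTUNITIES = 3         # assumed limit on surveys offered per visit

    def questions_to_ask(profile):
        """Union of all waiting screeners, minus what is already on file for this person."""
        needed = set().union(*waiting_surveys.values())
        return needed - set(profile)

    def still_eligible(entered_at, offers_made):
        """Screen-outs can be sent back in, but only within the configured limits."""
        seconds_in_router = time.time() - entered_at
        return seconds_in_router < MAX_SECONDS_IN_ROUTER and offers_made < MAX_OPPORTUNITIES

For a known panelist with a rich profile, questions_to_ask() may return almost nothing, which is the screening shortcut described above; for an anonymous river respondent it returns the full pooled screener.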

There are some major differences in the way all of this is managed. Going into the ARF meeting I thought I understood that there are three main types of router designs:

  1. Serial routers in which screening questions for waiting surveys are asked one after the other until a respondent qualifies. These routers generally are smart enough to remember questions in common across the waiting surveys so that a specific question only gets asked once of each respondent.
  2. Parallel routers in which a randomized set of screening questions is asked and then a respondent is routed to a survey for which he or she qualifies. This assignment might be random or it might be based on an algorithm that considers the needs of all waiting surveys.
  3. Hierarchical routers, which start with a main router where a few basic questions are asked and then pass people to one of several mini-routers that feed surveys with similar qualifiers, on similar topics, etc.

In practice it's nowhere near that clean. Most companies seem to use a mix of hierarchical and serial routing, and everyone has arguments for why their approach is best. (A rough sketch of the serial and parallel cases follows.)
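This sketch is purely my own illustration, not any vendor's implementation, and the survey definitions are hypothetical. Each survey maps screener questions to the answers it needs, and ask stands in for actually putting a question to the respondent.

    import random

    def serial_route(surveys, ask, answers=None):
        """Work through the waiting surveys one after the other; a question shared
        across surveys is asked only once, and the respondent goes to the first match."""
        answers = answers or {}
        for name, quals in surveys.items():
            for question, wanted in quals.items():
                if question not in answers:
                    answers[question] = ask(question)
            if all(answers[q] == wanted for q, wanted in quals.items()):
                return name
        return None

    def parallel_route(surveys, ask):
        """Ask the pooled screener up front, then pick among the surveys the respondent
        qualifies for -- randomly here, though a real router might weight the choice
        by quota shortfalls or deadlines."""
        questions = {q for quals in surveys.values() for q in quals}
        answers = {q: ask(q) for q in questions}
        eligible = [name for name, quals in surveys.items()
                    if all(answers[q] == wanted for q, wanted in quals.items())]
        return random.choice(eligible) if eligible else None

A hierarchical router, in this picture, is just nesting: a top-level router asks a few basic questions and then hands the respondent to one of several mini-routers built along the same lines.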

In my next post I'll describe the problem that everyone worries about: router bias.


Let’s have a look at online sample routers

A couple of weeks back I moderated a roughly four-hour discussion on online sample routers sponsored by the ARF's Foundations of Quality research initiative. The current focus of the initiative is development of a research agenda that touches on the key dimensions of online research, and the role of sample routers, including how they may affect data quality, is one of those issues. The discussion I moderated had about 15 people in it, with reps from the major US panels as well as a handful of full-service research companies with large panels. As is often the case when you get a bunch of research types in a room talking about what they do, it was a very open discussion with lots of sharing of practices, although generally not down to the detail level of secret sauces. This is the first of a whopping four blog posts summarizing what I learned.

I think it's fair to say that routers make a lot of researchers nervous. We have become comfortable with the standard online panel paradigm, in part because it's not unlike working with traditional list samples or even buying RDD sample. But routing is different, if not downright mysterious, to most of us. We imagine streams of potential respondents from all of these disparate sources flowing into a black hole on the Internet, with pre-screened samples delivered to waiting surveys at the other end. Nobody, except the people who build routers, seems sure exactly what goes on in there, including me.

Like many people, I have confused blending with routing. But the objectives of routing are different from those of blending. Blending is basically about sourcing: being smart about leveraging your own panel, river sample, social networks, and other people's panels to create more diversity in a sample or to fill low-incidence quotas. Routing doesn't necessarily involve blending; some companies route only their own panels.

Most routing tries to do four things (in no particular order):

  1. Maximize the likelihood that anyone who wants to do a survey can. Everyone sees the problems in continuing to send willing respondents to surveys for which they don't qualify.
  2. Increase the chances of filling all the quotas for every survey and delivering on time to clients.
  3. Automate sample allocation and deployment based on a set of well-thought-out rules, so that the process is less ad hoc than in the past, when it was done manually, often by project managers and in inconsistent ways.
  4. Centralize decision making about how to optimize use of the available pool of respondents and provide metrics so that these decision makers know what's going on inside the router.

Some would argue that there are other objectives like increasing representivity or improving sample quality but these are extraneous to the primary things that routers are designed to do.

In my next post I'll describe a little about how routers try to do what they are meant to do.