Those pesky robo-polls

A new issue of Survey Practice is out, and among the short articles is one by Jan van Lohuizen and Robert Wayne Samohyl titled "Method Effects and Robo-calls." (Some colleagues and I also have a short piece on placement of navigation buttons in Web surveys.) Like most people I know, I have little regard for the accuracy of robo-calling as a competitor to dual-frame RDD/cell phone surveys using live interviewers, and this article provides some grist for that mill. The paper looks at 624 national polls and the specific issue of Presidential approval. I'll just quote their conclusion:

" . . . while live operator surveys and internet surveys produce quite similar results, robo-polls produce a significantly higher estimate of the disapproval rate of the President and a significantly lower estimate for 'no opinion', attributing the difference in results to non-response bias resulting from low participation rates in robo-polls."

So far so good. But it reminded me of a report I'd recently seen (via Mark Blumenthal) about the latest NCPP report on pollster accuracy. In that study of 295 statewide polls in the 2010 cycle, the average error on the final outcome was 2.4 percentage points for polls with live interviewers, versus 2.6 for robo-polls and 1.7 for Internet polls. Of course, accuracy on Election Day is not the same as accuracy during the course of the campaign. As even casual observers have noticed, there is a tendency for all polls to converge as the election draws near. As this excellent post by Mark Mellman spells out, robo-polls may do well on Election Day but not so well in the weeks prior. I won't speculate as to the reasons.

But I take comfort in all of this. It's always nice to have one's prejudices confirmed.


Let’s get on with it

I spent some time over the weekend putting the finishing touches on a presentation for later this week in Washington at a workshop being put on by the Committee on National Statistics of the National Research Council. The workshop is part of a larger effort to develop a new agenda for research into social science data collections. My topic is "Nonresponse in Online Panel Surveys." Others will talk about nonresponse in telephone surveys and in self-administered surveys generally (presumably mail). The workshop is part of an overall effort driven by the increasing realization on the scientific side of the industry that as response rates continue to fall, a key requirement of the probability sampling paradigm is violated. And so the question becomes: what are we going to do about it?

My message for this group is that online panels as we have used them in MR so far are not the answer. As I've noted in previous posts, response rates for online panels typically are an order of magnitude worse than telephone. At least with the telephone you start out with a good sample. (Wireless substitution is a bit of a red herring and completely manageable in the US.) With online panels you start out with something best described as a dog's breakfast. While it's become standard practice to do simple purposive sampling to create a demographically balanced sample, that's generally not enough. To their credit, Gordon Black and George Terhanian recognized that fact over a decade ago when they argued for "sophisticated weighting processes" that essentially came down to attitudinal weighting to supplement demographic weighting and correct for biases in online samples. But understanding those biases, and which ones are important given a specific topic and target population, is not easy, and it doesn't always work. So a dozen years and $14 billion of online research later, the industry seems to be just weighting online samples by the demographics and stamping them "REPRESENTATIVE."
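For anyone who hasn't looked under the hood, the demographic weighting in question is usually raking (iterative proportional fitting). Here is a minimal sketch in Python; the respondent records and population margins are invented for illustration, so treat it as a sketch of the technique, not anyone's production code.

```python
# Minimal raking (iterative proportional fitting) sketch.
# All respondent records and population margins below are invented
# for illustration only.

respondents = [
    {"age": "18-34", "sex": "F"},
    {"age": "18-34", "sex": "M"},
    {"age": "35-54", "sex": "F"},
    {"age": "55+",   "sex": "F"},
    {"age": "55+",   "sex": "M"},
]

# Assumed population proportions for each margin.
targets = {
    "age": {"18-34": 0.30, "35-54": 0.35, "55+": 0.35},
    "sex": {"F": 0.51, "M": 0.49},
}

weights = [1.0] * len(respondents)

for _ in range(25):  # a fixed number of passes converges easily here
    for var, margin in targets.items():
        total = sum(weights)
        for category, share in margin.items():
            members = [i for i, r in enumerate(respondents) if r[var] == category]
            current = sum(weights[i] for i in members) / total
            if current > 0:
                factor = share / current  # scale the category up or down
                for i in members:
                    weights[i] *= factor

# Normalize so the weights average 1.0, then inspect them.
mean = sum(weights) / len(weights)
print([round(w / mean, 3) for w in weights])
```

The point of the post stands either way: hitting the demographic margins says nothing about whether the people inside each cell resemble the population on anything else.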

The facts would seem to be these. On the one hand, you can still draw a terrific probability sample, but the vast majority of people in that sample will not cooperate unless you make the extraordinary and expensive efforts that only governments have the resources to make. On the other hand, online panels have demonstrated that there are millions of people who are not only willing but sometimes eager to do surveys, yet we've not developed a good science-based approach to taking advantage of them. I take hope in the fact that some people are at least working on the problem. Doug Rivers regularly shares his research on sample matching, which is interesting, although I've not seen applications outside of electoral polling. GMI's new Pinnacle product also is interesting, but so far I've only seen a brochure. And statisticians tell me that there is work on nonprobability sampling in other fields that might be adapted to the panel problem.
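For those unfamiliar with the term, the core idea of sample matching is easy to sketch: draw a target sample from a high-quality frame, then for each target record select the closest available panelist on a set of covariates. The toy data and crude distance function below are my own assumptions for illustration, not Rivers's actual algorithm.

```python
# Toy sketch of the nearest-neighbor idea behind sample matching.
# The records and the distance function are invented; a real system
# matches on many more covariates with a properly chosen metric.

target_frame = [            # e.g., drawn from a probability-based frame
    {"age": 25, "educ": 2},
    {"age": 47, "educ": 3},
    {"age": 68, "educ": 1},
]

panel = [                   # opt-in panelists
    {"id": "p1", "age": 22, "educ": 2},
    {"id": "p2", "age": 50, "educ": 3},
    {"id": "p3", "age": 63, "educ": 1},
    {"id": "p4", "age": 30, "educ": 1},
]

def distance(a, b):
    # Crude standardized distance over two covariates.
    return abs(a["age"] - b["age"]) / 50.0 + abs(a["educ"] - b["educ"])

available = list(panel)
matched = []
for person in target_frame:
    best = min(available, key=lambda p: distance(person, p))
    matched.append(best["id"])
    available.remove(best)  # match without replacement

print(matched)  # these panelists stand in for the target sample
```

The hard part, of course, is choosing covariates that actually carry the selection biases, which is the same problem attitudinal weighting runs into.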

My message to the workshop group this week is simple: "Let's get on with it."


The relentless march of cell-only households

NHIS has released the latest estimates of wireless-only households in the US and, to almost no one's surprise, the steady increase continues. As of June 30, 2010, 26.6 percent of US homes had only wireless telephones, a sharp increase of 2.1 percentage points since December of 2009. Another 15.9 percent of homes report that they take all or mostly all of their calls on a cell phone even though they also have landline service. The demographics of the wireless-only population remain pretty much unchanged—young people, less affluent, less likely to have health insurance, more likely to be Hispanic, etc.

Some of us, and I confess that I had one foot in this camp, once believed that as this group aged and became more established they might gravitate back to landlines, but there is little evidence to support that. As you can see in the graph at the right (the blue line is for adults), if anything, the slope is steepening.

When you look at these numbers it's hard to fathom that anyone doing telephone research might still be calling landlines only, but I see evidence of it every day. Fortunately, there is an emerging body of high quality research that is extremely helpful with the design, costing, and execution of studies that include cell phones. That literature is nicely summarized in AAPOR's 2010 Cell Phone Task Force Report.


So much for robo-polls

For about the last week or so I have been getting regular calls on my home answering machine from Governor Mike Huckabee, who I gather is once again running for President. While it seems to be the Governor's voice, it also is a recording inviting me to do a survey by IVR. Somewhere back in my blog archive there are a couple of posts about what the politicos like to call "robo-polls," that is, RDD samples autodialed with a survey in which the questions are recorded and played back while respondents answer using the telephone keypad. I wrote those posts because a colleague had asked me about the validity of the methodology, specifically with regard to representativeness. I did my best to discredit it.

My posts hardly landed a blow compared to the job that Nate Silver of fivethirtyeight.com has done in a recent post exposing the methodology of the best known of the robo-pollsters, Scott Rasmussen. There are the usual problems with Rasmussen's methodology--limited calling windows (5:00PM-9:00PM weeknights), no call backs, very low response rates, taking the first person who answers the phone, etc.--but Nate has taken things a step further by demonstrating that at best these calls can only reach about 50 percent of the population. In fact, it's probably much lower. Combining the most recent data on cell-only households with time use data from the American Time Use Survey, Nate shows that the odds that people are home and, if they are home, will answer the telephone are somewhere around 50/50.
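The arithmetic is worth spelling out because the probabilities compound. A back-of-the-envelope version: the landline coverage figure below is my own illustrative assumption, and only the roughly 50/50 home-and-answering figure comes from the post discussed above.

```python
# Back-of-the-envelope coverage for a one-shot evening robo-poll.
# p_landline is an assumed illustrative figure; p_home_and_answers
# reflects the roughly 50/50 odds cited in Nate Silver's post.

p_landline         = 0.80  # household reachable on a landline frame (assumed)
p_home_and_answers = 0.50  # home during the 5-9 PM window AND picks up

p_reached = p_landline * p_home_and_answers
print(f"Share of adults the poll can plausibly reach: {p_reached:.0%}")
# With no call backs, everyone in the other ~60 percent is simply lost.
```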

I would like to think that this is the last nail in the coffin of robo-polls, but I know better. It's just one more example of how little the consumers of survey research, whether in the media, the general public, or even our MR clients, understand about the underlying scientific principles of what we do. For this we probably have only ourselves to blame.


Cell-only households continue to rise

CDC has just updated its estimate of cell-only households from the National Health Interview Survey. For those of you not paying attention, this is pretty much the gold standard for tracking the growth of cell-only households. CDC now reports that as of June 2009, 22.7 percent of US homes have at least one cell phone but no landline. Another 14.7 percent report having a landline but mostly using their cell phone(s) to place and receive calls. Putting the two together, a standard landline RDD telephone survey is likely to miss or badly underrepresent about 37 percent of households unless it is augmented with a cell phone sample. While it adds time and expense, calling cell phones is rapidly becoming a standard feature of US telephone surveys.
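The arithmetic behind that 37 percent, for anyone who wants to plug in newer NHIS figures as they come out:

```python
# Households a landline-only RDD frame misses outright (cell only)
# or is likely to underrepresent (landline present but rarely answered).
cell_only   = 0.227
mostly_cell = 0.147

print(f"Missed or underrepresented: {cell_only + mostly_cell:.1%}")  # 37.4%
```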

I'd like to find a silver lining in this but can't.