Posts from November 2012

The real lessons from US electoral polling

There has been a lot of buzz in MR circles about US electoral polling, especially the summary of the polls' accuracy produced by Nate Silver. As I wrote in this blog when Nate's piece appeared, it's only natural for researchers to zero in on data collection methodology, and that has happened in spades, most recently in today's research-live bulletin. But I can't help but wonder whether all of these arguments about telephone, online, cell phones, robopolls, etc. miss the real point.

The political pros who are the consumers of these data (we might call them clients) see a different story, much of it traceable to consistent misjudgments about who would actually show up at the polls and vote. This interview with Obama pollster Joel Benenson makes two really important points. The first is that the likely voter models used by many pollsters no longer work in the world of sophisticated GOTV efforts like the one the Obama campaign engineered. The second point, and perhaps the more important one, is the folly of relying on a single data source. The Obama campaign built its prediction models from multiple sources, some good and some not so good. And they did it in a way that obviously was very effective.

The really interesting story here would be for someone to do a deep dive on the sampling methodologies and likely voter models of the major polling firms, although I don't expect to see that anytime soon.

So let's not make too much of a fuss over this or draw lessons that may not be worth learning. Putting it in MR terms, we might say that where there was failure it was not one of technique but the more fundamental error of not understanding the dynamics of that particular marketplace. And let's not overlook the fact that the most accurate poll of all (it missed by just 0.1%) was by the Columbus Dispatch, and it was a mail survey.

Accuracy of US election polls

Nate Silver does a nice job this morning of summarizing the accuracy of and bias in the 2012 results of the 23 most prolific polling firms. I’ve copied his table below. Before we look at it we need to remember that there is more involved in these numbers than different sampling methods. The target population for most of these polls is likely voters, and polling firms all have a secret sauce for filtering those folks into their surveys. Some of the error can probably be traced to that step.


But to get back to the table, the first thing that struck me was the consistent Republican bias.  The second was the especially poor performance by two of the most respected electoral polling brands, Mason-Dixon and Gallup.  But my guess is that readers of this blog are going to look first at how the polls did by methodology.  In that regard there is some good news for Internet methodologies, although we probably should not make too much of it.
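For readers who want to see how "error" and "bias" differ, here is a minimal Python sketch. It assumes each poll is summarized as a margin (Democrat minus Republican, in points) and that error is the absolute miss while bias is the signed miss; that is a common convention but may not match Nate's exact calculation. The firm names and all of the numbers are invented for illustration and are not taken from his table.

```python
from statistics import mean

# Hypothetical final-poll margins (Democrat minus Republican, in points).
# These figures are invented for illustration only.
polls = {
    "Firm A": [2.0, 1.0, 3.0],
    "Firm B": [-1.0, 0.0, -2.0],
    "Firm C": [1.0, 5.0],
}
actual_margin = 3.0  # hypothetical actual result, same units


def average_error(margins, actual):
    """Mean absolute miss: how far off the polls were, ignoring direction."""
    return mean(abs(m - actual) for m in margins)


def average_bias(margins, actual):
    """Mean signed miss: negative means the polls leaned Republican."""
    return mean(m - actual for m in margins)


for firm, margins in polls.items():
    error = average_error(margins, actual_margin)
    bias = average_bias(margins, actual_margin)
    print(f"{firm}: average error {error:.1f}, average bias {bias:+.1f}")
```

Note that the hypothetical Firm C has an average error of 2 points but no bias at all, because its misses fall on both sides of the result. A firm can be unbiased and still inaccurate.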

As far back as the US elections of 2000, Harris Interactive showed that with the right adjustments online panels could perform as well as RDD. When the AAPOR Task Force on Online Panels (which I chaired) reviewed the broader literature on online panels we concluded this about their performance in electoral polling:

“A number of publications have compared the accuracy of final pre-election polls forecasting election outcomes (Abate, 1998; Snell et al, 1999; Harris Interactive, 2004, 2008; Stirton and Robertson, 2005; Taylor, Bremer, Overmeyer, Siegel, and Terhanian, 2001; Twyman, 2008; Vavreck and Rivers, 2008). In general, these publications document excellent accuracy of online nonprobability sample polls (with some notable exceptions), some instances of better accuracy in probability sample polls, and some instances of lower accuracy than probability sample polls.” (POQ 74:4, p. 743)

So there is an old news aspect to Nate’s analysis. One would hope that by 2012 the debate has moved on from the research parlor trick of predicting election outcomes to the broader and more complicated problem of accurately measuring a larger set of attributes than the relatively straightforward question of whether people are going to vote for Candidate A or Candidate B. In Nate’s table there are nine firms with an average error of 2 points or less, and four of the nine use an Internet methodology of some sort. I say “of some sort” because as best I can determine there are three methodologies at play. Two of the four (Google and Angus Reid) draw their samples to match population demographics (primarily age and gender). IPSOS, on the other hand, tries to calibrate its samples using a combination of demographic, behavioral and attitudinal measures drawn from a variety of what it believes to be “high quality sources.” (YouGov, which is further down the list, does something similar.) RAND uses a probability-based method to recruit its panel. So there are a variety of methodologies at play in these numbers.
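To make the idea of matching a sample to population demographics concrete, here is a minimal Python sketch of simple post-stratification weighting by age and gender. It is not a description of how any of the firms above actually build or adjust their samples. The population shares, the respondent counts and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical population shares by (age group, gender) cell; they sum to 1.
population_share = {
    ("18-44", "F"): 0.25,
    ("18-44", "M"): 0.25,
    ("45+", "F"): 0.26,
    ("45+", "M"): 0.24,
}

# Hypothetical respondents: (age group, gender, supports Candidate A?).
# Younger people are under-represented relative to the population shares.
respondents = (
    [("18-44", "F", True)] * 10 + [("18-44", "F", False)] * 5
    + [("18-44", "M", True)] * 9 + [("18-44", "M", False)] * 6
    + [("45+", "F", True)] * 14 + [("45+", "F", False)] * 21
    + [("45+", "M", True)] * 14 + [("45+", "M", False)] * 21
)


def poststratification_weights(respondents, population_share):
    """Weight per cell = population share / sample share.

    A cell with no respondents gets no weight here; a real study
    would have to collapse or otherwise handle empty cells.
    """
    n = len(respondents)
    cell_counts = Counter((age, gender) for age, gender, _ in respondents)
    return {
        cell: population_share[cell] / (count / n)
        for cell, count in cell_counts.items()
    }


def weighted_support(respondents, weights):
    """Weighted share of respondents who support Candidate A."""
    total = sum(weights[(age, gender)] for age, gender, _ in respondents)
    in_favor = sum(
        weights[(age, gender)]
        for age, gender, supports in respondents
        if supports
    )
    return in_favor / total


weights = poststratification_weights(respondents, population_share)
unweighted = sum(1 for _, _, supports in respondents if supports) / len(respondents)
weighted = weighted_support(respondents, weights)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

With these invented numbers the unweighted estimate is about 47% and the weighted estimate is about 52%, because the under-represented younger cells lean toward Candidate A. Calibration on behavioral and attitudinal measures follows the same logic with more variables, and it depends on trustworthy population figures for each of them.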

Back in 2007, Humphrey Taylor argued that the key to generating accurate estimates from online panels is understanding their biases and how to correct them.  I tried to echo that point in a post about #twittersurvey a few weeks back.  Ray Poynter commented on that post.

My feeling is that the breakthrough we need is more insight into when the reactions to a message or question are broadly homogeneous, and when it is heterogeneous . . . When most people think the same thing, the sample structure tends not to matter very much . . . However, when views, attitudes, beliefs differ we need to balance the sample, which means knowing something about the population. This is where Twitter and even online access panels create dangers.

 I think Ray has said it pretty well.
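Ray's point can be illustrated with a little arithmetic. The Python sketch below assumes a population made up of just two groups, A and B, and compares an unbalanced sample's estimate with the true population value. The function name and all of the figures are hypothetical.

```python
def unweighted_bias(sample_share_a, population_share_a, rate_a, rate_b):
    """Bias of an unweighted estimate when the population splits into groups A and B.

    rate_a and rate_b are the shares of each group holding some view.
    Algebraically the result equals
    (sample_share_a - population_share_a) * (rate_a - rate_b).
    """
    sample_estimate = sample_share_a * rate_a + (1 - sample_share_a) * rate_b
    true_value = population_share_a * rate_a + (1 - population_share_a) * rate_b
    return sample_estimate - true_value


# Group A is 30% of the population but 60% of the sample (hypothetical figures).
homogeneous = unweighted_bias(0.6, 0.3, rate_a=0.55, rate_b=0.55)
heterogeneous = unweighted_bias(0.6, 0.3, rate_a=0.70, rate_b=0.40)

print(f"both groups think alike: bias {homogeneous:+.2f}")
print(f"groups differ:           bias {heterogeneous:+.2f}")
```

When the two groups hold the same view the lopsided sample does no harm. When they differ by 30 points, the same lopsided sample overstates the result by about 9 points, and fixing it requires knowing the true size of each group.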

Representativiteit is dood, lang leve representativiteit! (Representativity is dead, long live representativity!)

I'm in Amsterdam where for the last two days I've attended an ESOMAR conference that began as a panels conference in 2005, morphed into an online conference in 2009 and became 3D (a reference to a broader set of themes for digital data collection) in 2011. This conference has a longstanding reputation for exploring the leading edge of research methods and this one has been no different. There have been some really interesting papers and I will try to comment on a few of them in the days ahead.

But as an overriding theme it seemed to me that mobile has elbowed its way to the front of the pack and, in the process, has become as much art as science. People are doing some very clever things with mobile, so clever that sometimes it takes on the character of technology for technology's sake. Occasionally it even becomes a solution in search of a problem. This is not necessarily a bad thing; experimentation is what moves us forward. But at some point we need to connect all of this back to the principles of sampling and the basic requirement of all research that it deliver some insight about a target population, however defined. Much of the so-called NewMR has come unmoored from that basic principle and the businesses that are our clients are arguing about whether they should be data driven at all or simply rely on "gut."

At the same time we've just seen this fascinating story unfold in the US elections that has been as much about data versus gut as Obama versus Romney. The polls have told a consistent story for months, but there has been a steady chorus of "experts" who have dismissed them as biased or simply missing the real story of the election. An especially focused if downright silly framing of the argument by Washington Post columnist (and former George W. Bush advisor) Michael Gerson dismissed the application of science to predict electoral behavior of the US population as "trivial."

So today, regardless of their political preferences, researchers should take both pleasure and two lessons from the election results. The first is that we are at our best when we put the science we know to work for our clients, and we do them a major disservice when we let them believe that representativity is not important or is magically achieved. Shiny new methods attract business but solid research is what retains it. The second is that while the election results were forecast by the application of scientific sampling, the election itself was won with big data. The vaunted Obama ground game was as much about identifying who to get to the polls as it was about actually getting them there.