Previous month:
January 2010
Next month:
March 2010

Posts from February 2010

So this is what it's come to

Like a lot of us in MR I belong to a panel or two.  Just professional curiosity. My day job pays me enough so I can get by.  I have found two invitations in my inbox on a Saturday afternoon and since they are B2B I have clicked through to the first one.  I am welcomed with an expression of appreciation, accented with an exclamation point at the end. I don't get the point of the punctuation, but at least it's not all in caps.  I make a note to suggest to the person who runs our telephone centers that maybe if she trained interviewers to shout their appreciation at people when they answer the phone, cooperation would improve.  But alas, the appreciation is short lived because before I so much as click the Next button I am sternly warned: "Should our quality checks determine that you have not provided thoughtful attention to this survey, you may be disqualified and forfeit the associated rewards."  Whoa! (Yes, I raised my voice.) They don't tell me what those quality checks are or how they will determine whether I deserve the $9 they've offered me but I am chastened and click a button pledging to be "thoughtful and honest."  I am asked my gender and my age in a series of categories.  But apparently that's not enough because I am then asked my "exact age."  It seems odd to me that they would use two screens to get to my age but, hey, it's their money and I'm being compensated for my time.  I worry about use of the word "exact" but thankfully they only want it in years. My birthday is in March but I round down.  Apparently this is not one of their quality checks because they let me go forward.  Next they want to know if I work in the marketing research industry.  Now the panel company knows I work in MR because I'm sure I answered their profile questionnaire thoughtfully and honestly.  So this might be one of the "quality checks" I've been warned about.  I decide to test them and click "No."  I get to the next question.
It may take the full 30 minutes to answer.  But I'm feeling guilty and want a second chance to answer thoughtfully and honestly on that MR question.  They're not interested in letting me make amends.  There is no button to take me back.  In desperation I use my browser's Back button and change the answer to "Yes."  I breathe a sigh of relief and click the Next button.  Nothing happens.  I am found out. I promise to do better next time.  As far as I can tell, no one is listening.

CASRO Panels Finale

Back again at CASRO Panels. The afternoon is dedicated to survey routers. As this has unfolded it's clear to me that it is the perfect way to end the conference. It's the logical endpoint. So let's go back to the beginning and note that there seem to be four overriding themes to this conference.

The first was implicit in Kim Dedeker's opening remarks about the focus on reliability rather than validity. It's hard anymore to find anyone whose head has been in the game who will argue that panel research as we practice it today (with a few notable exceptions) produces representative samples or results that are projectable to target populations with any specific accuracy. That is at long last a settled issue. As Kim suggested, clients can deal with results that they know have some bias as long as those results are not jumping all over the place from survey to survey. For many of their purposes directional results are just fine.

The second is that the panel data quality crisis (which Kim sort of launched) is no longer the focus. Panel companies and research suppliers have developed a set of solutions to deal with the biggest issues and these are being implemented all over the industry. It may be too soon to pronounce the problem solved, but I think it's clear that we are out of the woods on this one. There still is good and important research on this issue, some of it presented at this conference, but we seem to have figured this one out.

The third is the clear realization that there is wide variability in panels and it's unwise to expect to get consistent results from panel to panel. One of the themes of the ARF research is to protect against variability by making sure that the panel you choose to work with has the depth to support the full run of your research. If the panel can't sustain it and you are forced to change, you could be in for a rocky time.

Finally, the era of the panel as we have known it over roughly the last 15 years is rapidly closing. The old model of sending a bunch of invitations to the panel and directing panelists to a single survey is increasingly untenable. As people have been pointing out for years, the panel model is not sustainable. The pool of willing respondents is limited and we need strategies that tap multiple sources to draw in the number of respondents we need to meet the demand. And so we need to create a nearly constant stream of willing respondents from panels, from river, from social networks, from IM, from SMS messaging, etc.

Which brings us to routers. We need effective ways to allocate those respondents. These things have been around for over a decade, mostly used with river sampling, but they have been black boxes that most of us know very little about. Now getting routers right seems like a critical issue. We heard nice presentations from Western Wats and OTX on various approaches to routing. The main takeaway seemed to be that random assignment of Rs to waiting surveys is the most efficient in terms of sample utilization as well as the best way to moderate the diversity/bias across these multiple sample sources. But it's not that simple. There are other issues as well. For example, we need routers that minimize respondent burden. Some routers keep sending Rs through multiple surveys until they find one that the R qualifies for. People can get trapped in these and are unlikely to come back a second time. Equally problematic is the difficulty of computing some standard metrics we are used to, like response rate or contact rate. Or should we send willing Rs to more than one survey out of the same recruitment? What happens when the input stream varies by source? Are the types of screening questions that Jackie Lorch described in her paper a good idea? The more people talk about it, the more difficult the problem becomes.
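To make the random-assignment idea concrete, here is a minimal sketch of a router. Everything in it is invented for illustration: the survey names, the quota counts, and the `qualifies` rules stand in for whatever screening logic a real router would run. The attempt cap reflects the respondent-burden point above: rather than cycling a person through every waiting survey, it gives up after a few tries.

```python
import random

# Hypothetical waiting surveys: each has remaining quota and a screening rule.
# These names and rules are invented for illustration only.
surveys = {
    "survey_a": {"remaining": 2, "qualifies": lambda r: r["age"] >= 18},
    "survey_b": {"remaining": 1, "qualifies": lambda r: r["owns_car"]},
}

def route(respondent, surveys, max_attempts=3):
    """Randomly assign a respondent to an open survey they qualify for.

    Random order (rather than a fixed priority) is what the presenters
    argued spreads each sample source's bias across all waiting surveys.
    """
    open_surveys = [name for name, s in surveys.items() if s["remaining"] > 0]
    random.shuffle(open_surveys)           # random, not first-come, ordering
    for name in open_surveys[:max_attempts]:  # cap attempts: limit burden
        if surveys[name]["qualifies"](respondent):
            surveys[name]["remaining"] -= 1
            return name
    return None  # release the respondent instead of trapping them
```

The design choice worth noting is the `None` return: a real router would thank the respondent and let them go, which costs a completion today but protects the willingness to come back tomorrow.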

So my main takeaway for the last two days is that a dramatic change is upon us and it's not clear just how ready we are for that change. Most of the buzz in the MR world over the last year or so has been about social media, MROCs, Twitter, etc. Panels were passé.

For better or for worse, panels are becoming a lot more interesting.

CASRO Panels Conference – Day 2

Back at CASRO Panels for another day. First speaker was to be Joel Rubinson from ARF but he has sent his number two, Ray Pettit. He began by reviewing the major findings from the Foundations of Quality project:

  • There are lots of variations in results among panels. In other words, results on a study will change if you change panels. Another consequence is that blending can be dangerous.
  • Purchase intent measures are impacted by a panelist's tenure on the panel.
  • Doing lots of surveys is not necessarily bad.

He also put up a graphic showing all of the things that impact "Panel Data Quality." When I saw it my thought was that they are rebuilding the Total Survey Error Model. Seems like that would have been a good starting point rather than trying to slowly reinvent it more or less piece by piece in a somewhat unsystematic way. It's an existing framework with a rich literature to back it up. Drawing on and participating in that might have been a better approach than reinventing it.

The rest of the presentation was about the QeP process that involves a formalized set of forms and procedures to document things at the panel stage, the individual survey stage, and the research agency editing stage. It's been tested with some big suppliers and met with enthusiasm. They are going to run training programs for it as they roll it out more broadly.

In the Q&A one of the program chairs (Jeff Miller) pressed him on when we will see detailed results. So far we've only seen high level stuff and there has been considerable disappointment around the industry in terms of what we have seen so far. The summary stuff apparently was published in the Journal of Advertising Research December issue. He promised the detailed stuff in March.

Also in the Q&A someone pointed out that just as this is rolling out for panels the whole panel landscape is changing. How quickly they can evolve to deal with that probably is a critical success factor for the initiative.

Next we heard from Nallan Suresh and Michael Conklin from MarketTools. They have been doing some interesting work building regression models to understand what drives respondent engagement. The essence of their model is a combination of behavior and outcomes on debrief questions. Key findings:

  • Shorter surveys are better than longer surveys and the max seems to be around 17-20 minutes.
  • Matrix questions are inherently problematic, although they do shorten surveys so one has to seek a balance.
  • Easily understood questions with intuitive answering devices are better than complex and difficult questions with unfamiliar answering conventions that cause respondents to struggle on some pages of the survey.
  • Key indicators of bad survey design are high rates of abandonment and satisficing.


To their credit, they do not advocate dealing with the problem by color and flash gadgets as many others have done.

This is good common sense stuff, and it's nice to see it backed up with data. As an example they brought a client to testify to his ability to move a fairly complex survey task from a CLT setting to online and simplify it along the way. Lots of data to show that mostly, it worked. A nice story, but it seems like a bit of a non sequitur.

It's not their point, but the cynic in me wonders if maybe the kinds of people who show up at CLT testing are the same kind of people who sign up for panels.

Last presenter in this segment was Adam Porter from e-Rewards/Research Now. He reported on some research designed to get a handle on what Rs view as a good survey versus a bad survey.

  • They found a positive relationship between survey satisfaction and incentive size.
  • They found a negative relationship between survey satisfaction and length.

Not much of a surprise there but the better findings focused on the characteristics of a bad survey:

  • Unclear questions
  • Repetitive questions
  • Not relevant (to the R) questions
  • No way to express an opinion (no DK, not able to skip, etc.)
  • Too detailed
  • Too many clicks

The positives were essentially the opposite of the negatives. But one key point: the single most problematic feature was restricted answer options. No DK, no way to refuse, could not skip, no open end to express an opinion.


CASRO Panels Conference -- Day 1

I’m blogging from the CASRO Panels Conference in New Orleans.  (#caspan on Twitter) This is the third year for this event and based on the program it could be the strongest year yet.  And make no mistake about it; this conference takes its title seriously.  The sessions are overwhelmingly focused on the quality challenges that the panel paradigm faces while also giving some space to some new developments.  The MR blogosphere’s current obsession with social media is getting scant attention.

An aside: Jeffrey Henning is sitting behind me and also blogging the conference.  So the bar is being set very high.  He has told me his secret about how he manages to be so prolific and so smart.  Unfortunately, I can’t act on either.  Worse yet, I am really rushing and so expect an order of magnitude increase in typos.

Diane Bowers opened with encouraging words about the MR industry showing signs of recovery from 2009.  Part of that is their survey data and some of it is the optimism she’s picking up among CASRO members.  Let’s hope so.

True to its theme the conference’s first speaker was the lady who was among the first to wave the warning flag: Kim Dedeker.  She promised that her talk would be about ‘reliability’ rather than ‘validity.’  By that she means a system that produces ‘identical outcomes’ given the same inputs.  It’s about consistency from survey to survey and not necessarily about accuracy.  As a former client-side person she speaks with authority when she says that clients engage with us as part of managing risk on business decisions.  If we can’t help them do that then we lose our credibility.  I didn’t count them, but she used the term ‘science’ at least 10 times.  That’s something clients look to us to provide and the application of science is what delivers on that consistency thing.  She was asked a question about accuracy which sort of gets to validity.  Sometimes it’s important, but often it’s less important than reliability because many client companies have other sources to benchmark against.  As long as they are seeing reasonable consistency over time, they feel reassured.  But she also pointed out to all of us that it is absolutely essential that we keep evolving the science.  Surveys may or may not be dead, but it’s hard to deny that how we do surveys must change, and change rather dramatically.  That is the question we have yet to answer.

Next up was Jamie Baker-Prewitt (no relation) from Burke whose topic was the variation in buying patterns that may exist across different sample sources.  She did a nice job of summarizing the research on research issues that we have all watched go by over the last five years or so.  (I was a bit surprised to hear that we don’t have to worry about coverage error anymore because of high Internet penetration but will soldier on.)  Her study looked at six samples—two classic panels, two river samples, and two social networking samples.  The demographics of the samples were surprisingly uniform, although Facebook seemed to have delivered a much different group (more male, more lower income, older) than one might expect. The two social networks also delivered samples with people who spend more time online than the other samples.  Time forced her to race through product awareness and use measures. It was impossible to keep up but there were lots of instances where there were not a whole lot of differences among these sources, some surprising and some not.  In general social networking sample tends to be an outlier more than others.  The Facebook sample often stands out, especially in terms of brand awareness where the FB respondents just are not as aware as others.  Sample from FB took a real beating.  Very different from the other sources on a variety of measures, but also very expensive and inefficient.  The bottom line seems to be that there is some consistency across the standard panels and river but it was less so for social networking sample, especially FB.  Someone asked about accuracy but she punted on that one.  Remember, it’s about consistency.

The final presentation in this leg was Jackie Lorch from SSI.  Very interesting.  She started with the claim that panels as we have known them are dying and we need to be much more diverse in how we recruit.  So she imagines a combination of panel invitations, river, SMS messages, etc. with everyone coming into a routing hub where they answer some questions and then get routed to one of many online surveys.  I expect she’s right about the need for this kind of multiple sourcing going forward.  It’s the obvious answer to the-panels-are-not-sustainable argument that one hears over and over.  But the interesting part is what happens in the hub where the sample sources get blended together.  She starts from the premise that balancing people on demographics is not enough.  We need more.  We need to take into account the attitudinal and behavioral differences that are at the heart of why panels and online in general fail the representative test.  So they have been doing a lot of work with various kinds of psychometric and segmentation ideas to try to create more representative samples than you can get just with demographic balancing.  My first reaction was that it was like propensity weighting only on the front end.  But the more she talked the clearer it seemed that it was model-based sampling, although she never used the term.  Now I am not a sampler and if you are I suggest you stop now because I am about to send you into terminal eye rolling.  Once again, I soldier on.  You build a model of the distribution of key variables you need to create a representative sample of your target population and then make sure your sample is drawn to conform to it.  This is respectable stuff, but also very difficult.  As a sampler I know once said, “There is nothing wrong with model-based sampling; it’s just that there are a lot of bad models.”  In other words, your sample is only as good as your model and getting the model right is hard.
Modeling to a specific outcome is one thing, but modeling to a whole range of possible and unknown outcomes is really, really difficult.  Some of the people doing online political polling are using this approach.  They know the right proportions of characteristics and behaviors to get in the sample.  They have been able to do it because they study the same problem over and over and it’s one with a known outcome.  But building a general model to cover all of the possible topics in an MR consumer study sounds like a really tough job.  I wish them luck.
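For the non-samplers still with me, the draw-to-conform-to-the-model idea can be sketched in a few lines. Everything here is invented for illustration: `heavy_user` stands in for whatever attitudinal or behavioral variable the model says must match the target population, and the 40% target share is made up. The point is only the mechanism: stratify the pool on the model's key variable and draw each stratum to its prescribed share.

```python
import random

def select_to_model(pool, target_share, size, key="heavy_user", seed=0):
    """Draw `size` respondents so the share with respondent[key] == True
    matches target_share, the proportion the (hypothetical) model prescribes."""
    rng = random.Random(seed)  # seeded for reproducibility
    want_true = round(size * target_share)
    trues = [r for r in pool if r[key]]
    falses = [r for r in pool if not r[key]]
    # Draw each stratum to its modeled share, then combine.
    return rng.sample(trues, want_true) + rng.sample(falses, size - want_true)

# Toy pool where 20% are heavy users; the model (hypothetically) says the
# target population is 40% heavy users, so the draw must over-sample them.
pool = [{"id": i, "heavy_user": i % 5 == 0} for i in range(100)]
sample = select_to_model(pool, target_share=0.4, size=20)
```

Which of course illustrates the sampler's warning too: the draw conforms perfectly to the model, and if the model's 40% is wrong, so is the sample.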

Unfortunately, I had a phone meeting and missed the rest of the afternoon.  There was a discussion on the legal aspects of digital fingerprinting and a panel discussion about communities.  The paper I wish I had heard was Pete Cape’s.  He’s always interesting.  His topic was “Conditioning effects in online communities.”  His abstract says he will try to answer the question of whether surveys of online communities are reliable.  I will have to ask him the answer when I see him today.  But by far the two best papers I have ever heard on this topic sum things up pretty well.   Kristoff de Wulf did one at ESOMAR in Dublin a couple of years back and showed how community members tend to either be in love with the brand to start, fall in love once they join, or become disenchanted and fall away.  Last year in Montreux Ray Poynter put it this way (I am paraphrasing): “If you test a concept in your community and they hate it, go back to the drawing board.  If they love it, go do some real research.”

More later.


Last week a colleague sent me this ad and suggested that it was the worst MR ad ever. 


At least the guy is not in one of those big African cooking pots.

Then today I opened up the latest issue of Research and found this:


So there is a contender.  But it makes you wonder what people are thinking.