Thoughts on Privacy Day

Thursday, January 28 is Data Privacy Day, and I am reminded that roughly 25 years ago I wrote a chapter for a book in which I tried to imagine how what was then clumsily described as computer-assisted information collection (CASIC) might evolve in the decades ahead. I had just read Nicholas Negroponte’s fascinating little book, Being Digital, in which he described a future in which all information, including all of our personal information, would be digital and the many ways in which that might make our lives better, especially with regard to personalization of products and services. At a time when the practice of market, opinion, and social research was dominated by surveys, the implications seemed enormous: the day was coming when we no longer would need to do surveys because all the data would have already been collected. We would just need to analyze it.

The book’s editors, all long-time friends and colleagues in the journey from paper and pencil data collection to automated systems, were not kind in their reaction to my first draft. The key criticism was that the future I imagined was a massive invasion of personal privacy. The public would not stand for it, and strong legal privacy protections that would prevent us from accessing all that data were inevitable.

Their criticism caused me to have a look at what people paying attention to technology and privacy back then were thinking about the future. I learned that there were discussions about technical solutions involving encryption algorithms and personal keys that would give individuals complete control over who could and could not access their personal data, something that now sounds like blockchain. In his book, Slaves of the Machine: The Quickening of Computer Technology, Gregory Rawlins wrote, “Today’s encryption technology could, if used widely enough, make us the last generation ever to have to fear for our privacy.”

Fast forward to 2015 when I was part of a team charged with updating the ICC/ESOMAR Code on Market, Opinion, and Social Research and Data Analytics. The OECD had just finished updating their global privacy principles and as part of that effort had solicited input from academics and the business community.  I was especially struck by a paper written by Peter Cullen, who was then Microsoft’s Chief Privacy Strategist, and two academics, Fred Cate and Viktor Mayer-Schoenberger. In it they argued that the burden on individuals to manage their own privacy was unsustainable. Privacy policies had become undecipherable, often not read, and in too many cases data was being collected without any formal notice at all. Their bottom line was that we could no longer rely on the notice and consent model when collecting personal data. They proposed that we “shift responsibility away from individuals and toward data collectors and data users, who should be held accountable for how they manage data rather than whether they obtain individual consent.”

That made great sense to me then and still does, but we are a long way from getting there. The GDPR was a good start, especially in terms of laying out a set of data protection principles and processes that place a greater burden on data collectors and data users. Many of these same principles are gradually making their way into privacy regulations worldwide. But progress has been slow, data collectors have often resisted, and enforcement has been weak. The call for giving people more control over how their data is used has been long on rhetoric, but short on person-friendly ways of doing so. Mostly they just add more burden.

And so, we need a Data Privacy Day to raise awareness and encourage businesses to develop and maintain data protection practices that shield individuals from the privacy infringements that have become all too commonplace. For its part, ESOMAR has launched its #BeDataSmart initiative with the simple call to recognize that behind every piece of data there is a person, and that person deserves our respect. It really can be just that simple.

In theory at least, those of us who work in the insights sector should not require these sorts of reminders about our responsibilities to those whose data we rely on. A central, long-standing principle at the core of our claim to be a self-regulating industry is that those individuals who participate in research must not suffer adverse consequences as a direct result of that participation. Adapting that principle and inventing the practices to enforce it  in a world not unlike what Negroponte imagined is the central challenge of our time.


Polling Again!

Once again the second biggest story coming out of a US presidential election is what one commentator described as "the absolutely massive failure of political polling." Really? 

Consider this. Nate Silver, relying mostly on surveys, predicted that Joe Biden would win 30 states and that the tipping point state would be Pennsylvania. He correctly forecast the winner in all but two of those 30: Florida and North Carolina. That’s 28 of 30, a 93% hit rate. His projection of the final vote was off by 2.8 points in Florida and by just 1.3 points in North Carolina. Taken as a whole and by pretty much any objective measure, that’s not bad. And whether we like it or not, the public and those of us who work in the insights field are going to have to accept it as the best we can do.

Surveys have become a blunt instrument from which we unreasonably, in my view, expect to get precise measurements. Perhaps that expectation was more reasonable when we were able to generate probability samples from frames with near complete coverage and achieve very high response rates. Even then, our reliance on sampling meant there would be some error in our estimates, and that simultaneously drawing repeated samples from that same frame and achieving similar response rates would still not deliver the exact same estimates every time.
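For readers who want to see that last point in action, here is a minimal sketch in Python. It is only an illustration under assumptions of my own choosing: a candidate whose true support is 52%, a perfect frame, full response, and 200 independent polls of 1,000 people each. The function name `simulate_polls` and every number in it are hypothetical and are not taken from any real poll.

```python
import random
import statistics


def simulate_polls(true_share, sample_size, n_polls, seed=0):
    """Draw repeated simple random samples and return each poll's estimate.

    Assumes an idealized setting: every person is reachable and responds,
    so the only source of error is sampling itself.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = []
    for _ in range(n_polls):
        # Each respondent supports the candidate with probability true_share.
        supporters = sum(rng.random() < true_share for _ in range(sample_size))
        estimates.append(supporters / sample_size)
    return estimates


# Hypothetical inputs chosen for illustration only.
estimates = simulate_polls(true_share=0.52, sample_size=1000, n_polls=200)

print(f"lowest estimate:  {min(estimates):.3f}")
print(f"highest estimate: {max(estimates):.3f}")
print(f"average estimate: {statistics.mean(estimates):.3f}")
print(f"spread (std dev): {statistics.stdev(estimates):.3f}")
```

Even in this best-case setup the individual polls scatter around the true value rather than landing on it, which is the sampling error described above. Real polls face coverage and nonresponse problems on top of it.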

But those days are gone. The key pillars of scientific survey research – a probability sample from a frame with high coverage of the target population and a high response rate – have all but disappeared and with them went our capacity to deliver results that at least come close to meeting the expectations of the media and the public at large. That’s not the fault of political pollsters, but of decades of social and technology-driven change that have dramatically altered what is possible. Although to be fair, there are those who oversell what they do and why it works better than the rest.

So instead we have a hodgepodge of methods. Those methods run the gamut from science to snake oil. Fortunately, most still work reasonably well. They work in part because political pollsters enjoy an advantage that the rest of us in research don’t: they eventually learn the right answer and therefore have a leg up on figuring out what they got wrong and how to fix it for the next round.

Most of what I see written about these massive failures is by the same people who, lacking any grasp of the basics of electoral polling, nonetheless obsess over the polls prior to the election, and then need someone to blame when they see the results. Or perhaps they long for the good old days when old white guys acting as savants would sit around predicting outcomes based on more reliable methods, such as counting yard signs. That aside, the big question may well be not how surveys can improve but how the polling industry can thrive in a world where simply forecasting winners and losers is not enough, and where we must also say precisely by how much. This is a game of expectations, and right now political pollsters are losing it. And just maybe, that's their fault.


Charting a course through the COVID-19 Pandemic

Two interesting pieces of research came out this morning about the impact of the COVID-19 pandemic on research results.

The first comes from Zappi and an online test of 26 concepts and advertisements across six consumer categories (personal care, food and beverage, home hygiene, Telco, QSR, OTC) in five markets (US, UK, China, Italy and Mexico). All 26 had been tested before March 1 and were then retested March 17-18. Zappi’s analysis found few meaningful differences across 78 comparisons. My gut reaction: “How can this be?”

There also is a report just released from the Pew Research Center here in the US. It was conducted for a week stretching from March 10 to March 16 and asked about the perceived threat of COVID-19 to the US economy, the overall health of the US population, daily life in their communities, their personal financial situation, and their personal health. While 70% reported seeing the pandemic as a major threat to the US economy, only about a third saw it as a major threat to their personal financial situation and 27% as a major threat to their health. Another 23% saw COVID-19 as presenting no threat at all to their personal health. The obvious message would seem to be that only a small minority of the US population saw COVID-19 as a major threat to them personally. How can this be?

It might be partially explained by the changes in response over the survey period where four of the five measures rose significantly. For example, 29% of respondents who completed the Pew survey in the first two days (March 10-11) felt that the virus posed a major threat to their personal finances. By the last two days (March 16-17) that measure rose 11 points to 40%. On the other hand, respondent concerns about their personal health remained fairly consistent, although the alarm bells didn’t start ringing loudly here in the US until the week of March 16.

These are two different studies with different samples covering different time periods and there may be multiple ways to reconcile them. I’m not going to try to do that. However, I would note that it's of critical importance to have multiple measures and perspectives if we are to really know what is happening. I also want to reinforce the message that it's more important than ever to continue to do research even as the pandemic continues to unfold and hopefully subsides. The more points of measurement you can bring to bear the better. Businesses, governments, non-profits, indeed any organization that relies on data to make informed decisions, need to understand how the attitudes and behaviors of the people they serve are evolving as we work our way through this crisis. If ever knowledge were power, this is that time.

For details on the Zappi work look here. For the Pew study look here.