Data is showing

Good day and here’s to a stable and prosperous calendar year.
Many groups will be remaining at home in spite of loosened travel and commute restrictions through the rest of this year and possibly beyond. Will that development lead to increased accounts payable solutions as a result?

Thoughts on Privacy Day

Thursday, January 28 is Data Privacy Day, and I am reminded that roughly 25 years ago I wrote a chapter for a book in which I tried to imagine how what was then clumsily described as computer-assisted information collection (CASIC) might evolve in the decades ahead. I had just read Nicholas Negroponte’s fascinating little book, Being Digital, in which he described a future in which all information, including all of our personal information, would be digital and the many ways in which that might make our lives better, especially with regard to personalization of products and services. At a time when the practice of market, opinion, and social research was dominated by surveys, the implications seemed enormous: the day was coming when we no longer would need to do surveys because all the data would have already been collected. We would just need to analyze it.

The book’s editors, all long-time friends and colleagues in the journey from paper and pencil data collection to automated systems, were not kind in their reaction to my first draft. The key criticism was that the future I imagined was a massive invasion of personal privacy. The public would not stand for it, and strong legal privacy protections that would prevent us from accessing all that data were inevitable.

Their criticism caused me to have a look at what people paying attention to technology and privacy back then were thinking about the future. I learned that there were discussions about technical solutions involving encryption algorithms and personal keys that would give individuals complete control over who can and cannot access their personal data, something that now sounds like blockchain. In his book, Slaves of the Machine: The Quickening of Computer Technology, Gregory Rawlins wrote, “Today’s encryption technology could, if used widely enough, make us the last generation ever to have to fear for our privacy.”

Fast forward to 2015 when I was part of a team charged with updating the ICC/ESOMAR Code on Market, Opinion, and Social Research and Data Analytics. The OECD had just finished updating their global privacy principles and as part of that effort had solicited input from academics and the business community.  I was especially struck by a paper written by Peter Cullen, who was then Microsoft’s Chief Privacy Strategist, and two academics, Fred Cate and Viktor Mayer-Schoenberger. In it they argued that the burden on individuals to manage their own privacy was unsustainable. Privacy policies had become undecipherable, often not read, and in too many cases data was being collected without any formal notice at all. Their bottom line was that we could no longer rely on the notice and consent model when collecting personal data. They proposed that we “shift responsibility away from individuals and toward data collectors and data users, who should be held accountable for how they manage data rather than whether they obtain individual consent.”

That made great sense to me then and still does, but we are a long way from getting there. The GDPR was a good start, especially in terms of laying out a set of data protection principles and processes that place a greater burden on data collectors and data users. Many of these same principles are gradually making their way into privacy regulations worldwide. But progress has been slow, data collectors have often resisted, and enforcement has been weak. The call for giving people more control over how their data is used has been long on rhetoric but short on person-friendly ways of doing so. Mostly, the mechanisms on offer just add more burden.

And so, we need a Data Privacy Day to raise awareness and encourage businesses to develop and maintain data protection practices that shield individuals from the privacy infringements that have become all too commonplace. For its part, ESOMAR has launched its #BeDataSmart initiative with the simple call to recognize that behind every piece of data there is a person, and that person deserves our respect. It really can be just that simple.

In theory at least, those of us who work in the insights sector should not require these sorts of reminders about our responsibilities to those whose data we rely on. A central, long-standing principle at the core of our claim to be a self-regulating industry is that those individuals who participate in research must not suffer adverse consequences as a direct result of that participation. Adapting that principle and inventing the practices to enforce it in a world not unlike what Negroponte imagined is the central challenge of our time.

Polling Again!

Once again the second biggest story coming out of a US presidential election is what one commentator described as "the absolutely massive failure of political polling." Really? 

Consider this. Nate Silver, relying mostly on surveys, predicted that Joe Biden would win 30 states and that the tipping point state would be Pennsylvania. He forecast the winner correctly in all but two of those 30: Florida and North Carolina. That’s a 93% hit rate. His projection of the final vote was off by 2.8 points in Florida and just 1.3 points in North Carolina. Taken as a whole and by pretty much any objective measure, that’s not bad. And whether we and they like it or not, the public, along with those of us who work in the insights field, is going to have to accept it as the best we can do.

Surveys have become a blunt instrument from which we unreasonably, in my view, expect to get precise measurements. Perhaps that expectation was more reasonable when we were able to generate probability samples from frames with near complete coverage and achieve very high response rates. Even then, our reliance on sampling meant there would be some error in our estimates, and that simultaneously drawing repeated samples from that same frame and achieving similar response rates would still not deliver the exact same estimates every time.
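
For readers who like to see that last point in action, here is a minimal sketch of repeated sampling under ideal conditions. Everything in it is an assumption made up for illustration: the true share of 52%, the sample size of 1,000, the 20 repeated polls, and the function name `simulate_poll_estimates`. It assumes simple random sampling with full response, which is exactly the situation described above.

```python
import random
import statistics


def simulate_poll_estimates(true_share=0.52, sample_size=1000, n_polls=20, seed=0):
    """Draw repeated simple random samples and return each poll's estimated share.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = []
    for _ in range(n_polls):
        # Each respondent supports the candidate with probability true_share.
        supporters = sum(rng.random() < true_share for _ in range(sample_size))
        estimates.append(supporters / sample_size)
    return estimates


estimates = simulate_poll_estimates()
print(f"lowest estimate:  {min(estimates):.3f}")
print(f"highest estimate: {max(estimates):.3f}")
print(f"spread (std dev): {statistics.stdev(estimates):.3f}")
```

Even with a perfect frame and every sampled person responding, the individual polls scatter around the true value rather than landing on it, which is the sampling error described above.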

But those days are gone. The key pillars of scientific survey research (a probability sample from a frame with high coverage of the target population, and a high response rate) have all but disappeared, and with them went our capacity to deliver results that at least come close to meeting the expectations of the media and the public at large. That’s not the fault of political pollsters, but of decades of social and technology-driven change that have dramatically altered what is possible. Although, to be fair, there are those who oversell what they do and why it works better than the rest.

So instead we have a hodgepodge of methods. Those methods run the gamut from science to snake oil. Fortunately, most still work reasonably well. They work in part because political pollsters enjoy an advantage that the rest of us in research don’t: they eventually know the right answer and therefore have a leg up on figuring out what they got wrong and how to fix it for the next round.

Most of what I see written about these massive failures is by the same people who, lacking any grasp of the basics of electoral polling, nonetheless obsess over the polls prior to the election, and then have a need to find blame when they see the results. Or perhaps they long for the good old days when old white guys acting as savants would sit around predicting outcomes based on more reliable methods, such as counting yard signs. That aside, the big question may well be not how surveys can improve but how the polling industry can thrive in a world where simply forecasting winners and losers is not enough and we must also say precisely by how much. This is a game of expectations, and right now political pollsters are losing it. And just maybe, that's their fault.