A few weeks back I saw a post by online usability specialist Jakob Nielsen titled “User Satisfaction vs. Performance Metrics.” His finding is pretty simple: users generally prefer designs that are fast and easy to use, but satisfaction isn't perfectly correlated with objective usability metrics. Nielsen looked at results from about 300 usability tests in which he asked participants how satisfied they were with a design and compared that to standard usability metrics measuring how well they performed a basic set of tasks using that design. The correlation was around 0.5. Not bad, but not great. Digging deeper, he found that in about 30% of the studies participants either liked the design but performed poorly or disliked the design but performed well.
I immediately thought of the studies we’ve all seen promoting the use of flash objects and other gadgets in surveys by pointing to the high marks they get on satisfaction and enjoyment as evidence that these devices generate better data. The premise is that these measures are proxies for engagement, and that engaged respondents give us better data. Well, maybe and maybe not. Nielsen has offered us one data point. There is another in the experiment we reported on here: while the version of the survey with flash objects scored higher on enjoyment, respondents in that treatment failed classic engagement traps at the same rate as those tortured with plain HTML. Higher enjoyment, same evidence of disengagement.
A cynic might say that at least some of the validation studies we see are more about marketing than survey science. A more generous view is that we are still finding our way when it comes to evaluating new methods. Many of the early online evangelists argued that we could no longer trust telephone surveys because of problems with coverage (wireless substitution) and depressingly low response rates. Yet to prove that online was better, they often conducted tests showing that online results were as good as what telephone was producing, the very mode they had just declared untrustworthy. A few researchers figured out that to be convincing you needed a different point of comparison: election results worked for electoral polling, and others compared their online results to data collected by non-survey means, such as censuses or administrative records. But most didn’t. Now studies promoting mobile often argue for their validity by showing that their results match up well with online. There seems to be a spiral here, and not in a good direction.
The bottom line is that we need to think a lot harder about how to validate new data collection methods. We need to measure the right things.