
Posts from April 2011


This morning's topic is "Optimizing Health Survey Strategies." Let's be clear about one thing: rumors of the death of probability sampling are greatly exaggerated. The NHIS uses a high quality sampling frame and routinely gets 95%. The NSFG gets 79%. A lot of money and time is spent to get these kinds of outcomes, and one of the key questions for this session is whether there are less expensive ways to do health research, albeit outside of these "gold standard" surveys.

Overall there seemed to be three basic themes: (1) use of online panels; (2) mixed mode designs; and (3) better tools for dealing with nonresponse. The online panel argument was mostly about getting access to low, make that very low, incidence populations, in this case sufferers of primary immune deficiency disease. Even very large probability samples uncover small numbers of these individuals. The study worked pretty well and it generated considerable discussion around fit-for-purpose. No one is sure how to make those decisions on a survey by survey basis, but there was some agreement that the credibility of findings from nonprobability samples can be strengthened by validating against other high quality studies. Good quality probability samples with high response rates are needed for generating highly accurate estimates and for calibration, but some relaxing of traditional quality criteria seems possible in other kinds of studies.

The other topic that generated a lot of discussion is imputation. One cynic suggested we just interview two people and impute the rest. As someone who occasionally moves between conferences for the two sectors of the industry (MR and government), I am always impressed with the energy and brainpower that the government side puts into dealing with the nonresponse problem. MR has sort of thrown up its hands and moved on, but the government folks are nothing if not dogged in trying to solve the nonresponse problem, whether in data collection methods, imputation, or weighting. We could learn something from our government colleagues on this score.

HSRM Day 2 Postscript

The discussion after the PM session at HSRM took an interesting turn when a number of people questioned whether we were increasing respondent burden to unsustainable levels. On the MR side of the industry we clearly have gone over the top with many questionnaires, but mostly those questionnaires offend with their repetition and downright tediousness. They test respondents' tolerance for long surveys with detailed questions about things they really don't care about. But some government surveys take respondent burden to a whole new level:

  • Performance tests to assess physical capacity, for example, by standing on one leg with your eyes closed or lifting a weight above your head.
  • Collection of biomarkers (blood samples, urine samples, vaginal swabs, etc.).
  • Questions about difficult-to-recall facts such as how often prescriptions for drugs were filled.
  • Questions about events that are so routine or obscure that the respondent may never have cognitively processed them (such as how much insurance paid versus the co-pay for a hospital visit).

There is no clear answer. It's a tough problem that the industry as a whole continues to grapple with.

HSRM – Day 2 PM

The afternoon session is about monitoring Healthcare Reform. But first I must say how impressive the discussion was at the end of the morning session. At this conference the discussion part of a session is taken very seriously. It is a genuine dialogue among people who know their stuff. None of the usual eye rolling that you sometimes get in the discussion time at other conferences.

But back to evaluating the Affordable Care Act (ACA). The essential problem is this: as the act is implemented, policymakers will be looking for data to understand its impacts, but we currently don't have good, consistent, and reliable measures to produce such data. An alternative title for this session might have been: Measurement Error in Major Health Surveys. We heard about a number of different federal surveys, including some of the flagships like the American Community Survey (ACS) and the Medical Expenditure Panel Survey (MEPS). They all seem to have their measurement problems, whether in question wording, sampling, respondent understanding of the benefits maze of both public and federal health insurance programs, or good old fashioned recall. I was especially struck by the recall problem. For example, there is significant disparity between what MEPS gets from self-reports of emergency room visits and prescription drug use and what it gets from administrative data and reports by respondents' healthcare providers. The differences are explainable but difficult to fix. I don't want to make too big a deal of this. In the data comparisons that people showed us, even the statistically significant differences are small in percentage terms, but at the scale these folks are working, a few percentage points can translate to millions of people.

I was reminded of a quote from the statistician William Kruskal who once said, "If you have one GPS you will always know exactly where you are. If you have two you will never be completely sure."

I suppose the bottom line here is that most of these surveys were originally designed for some other purpose—modeling, monitoring trends—and repurposing them is difficult. But I could not help but be struck by the degree to which people are so open, so completely willing to lay out the inconsistencies in their survey estimates. It's another example of how seriously this part of the industry takes what it does. It's hard to imagine a similar session at any MR conference. Maybe in MR we always get it exactly right.

HSRM – Day 2 AM

There is a bit of a cloud over the conference this morning. A number of the papers were written by and to be presented by government people who could not travel yesterday because of the potential government shutdown. So people from the contractor side are taking up the slack and delivering the government papers.

The morning session's topic is "Advances in Measuring Health Status and Behaviors." There are five papers and they focus mostly on two areas: physical and cognitive capacity in the elderly and various ways of measuring mental impairment. I'm not going to go paper by paper. In a couple of months you'll be able to download them all from the NCHS site. Instead I'll make a couple of observations.

The first is the systematic and thorough way in which these folks go about developing questions, questionnaires, and the procedures to administer them. I was especially amused by testing of methods to measure mental and physical capacity among the elderly! Cognitive interviewing, field tests, and sophisticated measurement models are all part of the toolbox. Those of us in commercial MR seldom get the time and the money to do this kind of serious questionnaire development. Our questionnaires are often downright bad. Some of that is lack of training, but a lot of it comes from putting together questionnaires under incredibly tight deadlines, sometimes questionnaires that will only be used once. But if we are serious about increasing respondent engagement, questionnaire design is going to need a lot more focus than it's currently getting.

The second observation is that these folks may start with very long interviews (an hour is not unusual) but they still struggle with the same issues we do in MR: controlling questionnaire length in the face of the constantly expanding need for more data. So we heard about "planned missing data designs" that administer only part of a long battery of questions to any one respondent and then the imputation models to support a full analysis.
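For readers who have not run into a planned missing data design, here is a minimal sketch in Python of one common variant, the three-form design. It is only an illustration, not the design any of the presenters described: the block names, item names, and helper functions are all hypothetical. A core block goes to everyone, and each form leaves out one of three rotated blocks.

```python
import random

# Hypothetical item blocks: X is asked of everyone; A, B, and C are rotated.
BLOCKS = {
    "X": ["x1", "x2"],
    "A": ["a1", "a2"],
    "B": ["b1", "b2"],
    "C": ["c1", "c2"],
}

# Three-form design: each form omits exactly one rotated block.
FORMS = {
    1: ["X", "A", "B"],  # omits C
    2: ["X", "A", "C"],  # omits B
    3: ["X", "B", "C"],  # omits A
}


def assign_forms(respondent_ids, seed=0):
    """Randomly assign each respondent to one of the three forms."""
    rng = random.Random(seed)
    return {rid: rng.choice(sorted(FORMS)) for rid in respondent_ids}


def items_for(form):
    """Return the items administered on the given form."""
    return [item for block in FORMS[form] for item in BLOCKS[block]]


def missing_for(form):
    """Return the items left unasked (missing by design) on the given form."""
    asked = set(items_for(form))
    all_items = [item for block in BLOCKS.values() for item in block]
    return [item for item in all_items if item not in asked]


if __name__ == "__main__":
    for rid, form in assign_forms(range(1, 7)).items():
        print(rid, form, items_for(form), "missing by design:", missing_for(form))
```

Each respondent answers six of the eight items, and every pair of blocks appears together on at least one form. That overlap is what gives an imputation model something to work with when it fills in the items that were deliberately left unasked.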

Health Survey Research Methods Conference

I am at the 10th Health Survey Research Methods Conference in Peachtree, GA, which is just outside Atlanta. This is a unique conference that has been held every two or three years since 1975. It is a gathering of government health survey researchers and people from the companies that do most of their data collection. It is invitation-only, typically has 80-90 attendees, and is organized around a series of plenary sessions. A condition of attendance is that you agree to show up for all of the sessions. We take all of our meals together and pretty much talk about surveys for three days. It is health survey camp! This is my third time doing this and I enjoy it immensely, although I'm not sure why they keep inviting me.

The conference kicked off Friday night with a double-barreled keynote. Jack Fowler (UMass-Boston) gave us a nice summary of the origins of the basic conference concept and its evolution over three decades. (Proceedings from some of the previous conferences can be found here.) It was an often humorous first-person report, Jack having missed just one conference in the set of ten. He was followed by Ed Sondik from the National Center for Health Statistics, who presented an interesting talk on anticipating the needs for data on the health of the US population. He reminded us that the basic mission of NCHS is to collect and publish the most comprehensive and accurate data possible on health, and within that mission he believes that this is "the best of times." He's very high on the quality of the data, on the government leadership to use it, and on the way in which technology can make everything easier. Despite the enthusiasm, he expressed a couple of concerns. One is that while we have been good at recognizing the importance of and measuring the ABCS of good health (aspirin, blood pressure control, cholesterol control, and smoking cessation), we have been a lot less successful at influencing people's behavior. He noted studies showing the strong link between health-related behaviors and the social networks within which people live, but also noted that we may not be studying and understanding those networks as well as we should. Hence the second concern, that we may not be leveraging all that we could because we are sometimes a bit too pure about the methods we are willing to use. Although he never used the words "fit-for-purpose," there was a fair amount of informal discussion about it once the session moved to the bar.