Wednesday, August 5, 2015

SGCC Consumer Pulse and Segmentation Study: Are 1,000 Consumers Enough?

In 2011, Smart Grid Consumer Collaborative (SGCC) inaugurated our Consumer Pulse research program. We’ve completed five waves so far, most recently in late 2014. Each wave of the Consumer Pulse is a national survey of 1,000 adult (18+) heads of household, interviewed by telephone by the highly regarded research firm Market Strategies International. This is the most reliable and scientifically valid research available on smart energy technology and consumers.
Occasionally, when I talk about the Consumer Pulse research at a public forum, audience members question how we can make statements about the perceptions and preferences of the entire U.S. population when we have interviewed only 1,000 people. This happened just two weeks ago while I was at a conference in Washington, D.C., with an audience member expressing doubts that 1,000 was large enough. While a blog post cannot substitute for a course in probability and statistics, I will try to address that concern here, very briefly.

There are really two ways to answer, one based on everyday experience, the other on theory and (sorry) some slightly complicated mathematics. First, let’s talk about your own experience with polling data. Remember the last presidential election? A day or two before election day, we were all reading polls that predicted the percentage of the national vote each candidate would receive, within a small “margin of error.” Most national political polls are based on one to two thousand interviews with “randomly selected” voters, and nearly all of the polls that were conducted by competent, independent observers turned out to be accurate, within their margins of error.

Even now, the lineup for the first GOP debate, hosted by Fox News, will be determined by polling data from approximately 1,200 survey respondents, which will be used to identify the top 10 candidates based on voter preferences.

To say that survey respondents are “randomly selected” means that each person in the population we want to know about has an equal chance of being selected and asked to respond. This is key because if, for example, people over 50 were more likely to be included in the survey than people under 50, the survey results would be “skewed” in the direction of older people’s opinions – the survey respondents would not be “representative” of the total population.
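To make the effect of skewed selection concrete, here is a small simulation sketch in Python. Everything in it is invented for illustration: the population size, the share of people over 50, the rates of agreement in each age group, and the assumption that older people are three times as likely to be picked in the skewed sample. None of these numbers come from the Consumer Pulse data.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical population (all numbers invented for illustration):
# 40% of people are over 50; 70% of them agree with some statement,
# versus 40% of people under 50.
def make_person():
    older = random.random() < 0.40
    agrees = random.random() < (0.70 if older else 0.40)
    return older, agrees

population = [make_person() for _ in range(200_000)]
true_share = sum(agrees for _, agrees in population) / len(population)

# Random selection: every person has an equal chance of being chosen.
fair_sample = random.sample(population, 1000)
fair_share = sum(agrees for _, agrees in fair_sample) / 1000

# Skewed selection: people over 50 are three times as likely to be chosen.
# (random.choices draws with replacement, which is a fine approximation
# when the sample is tiny relative to the population.)
weights = [3 if older else 1 for older, _ in population]
skewed_sample = random.choices(population, weights=weights, k=1000)
skewed_share = sum(agrees for _, agrees in skewed_sample) / 1000

print(f"Whole population:   {true_share:.1%} agree")
print(f"Equal-chance sample: {fair_share:.1%} agree")
print(f"Skewed sample:       {skewed_share:.1%} agree")
```

Under these made-up assumptions, the equal-chance sample should land close to the population figure, while the skewed sample should overstate agreement because older respondents are over-represented.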

In our Consumer Pulse surveys, great care is taken to make sure that the respondents are selected at random – every US household has an equal chance to be included in the surveys. When data collection is complete, the data are weighted slightly by age, ethnicity, gender and region to align even more closely with national population parameters. This allows us to state that the margin of error for the total sample size of 1,000 is +/–3.1 percentage points at a confidence level of 95%. This means that there is a 95% probability that a Consumer Pulse finding we report is within +/–3.1 percentage points of the finding we would arrive at by interviewing every 18+ head of household in the U.S., if that were possible.
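For readers who want to check the +/–3.1 figure, the sketch below uses the standard textbook approximation for the margin of error of a simple random sample. It assumes a worst-case split of 50/50 on the question (p = 0.5), a z-value of 1.96 for 95% confidence, and it ignores any small effect of the weighting step. The function name and parameters are mine, not part of the Consumer Pulse methodology.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error (as a proportion) for a simple random sample.

    n: number of respondents
    p: assumed true proportion (0.5 is the worst case, giving the widest margin)
    z: z-value for the confidence level (1.96 corresponds to 95%)
    """
    return z * math.sqrt(p * (1 - p) / n)

moe = margin_of_error(1000)
print(f"+/-{moe * 100:.1f} percentage points")  # prints "+/-3.1 percentage points"
```

Note that the population size does not appear in this approximation at all: for a very large population, the margin of error depends almost entirely on the number of people interviewed.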

Proving this claim is where the math comes in. Statistical theory demonstrates that, assuming the people included in a survey are selected randomly, the required sample size n and margin of error E are given by the following formulas:
$$n = \frac{N x}{(N-1)E^{2} + x}$$

$$E = \sqrt{\frac{(N-n)\,x}{n\,(N-1)}}$$
Egads!  I will not try to explain these equations (or pretend that I would be competent to do so), but they are governed by the laws of probability, one of the best understood and most practically useful areas of mathematics. If you would like to delve more deeply into this topic, there are many good statistics textbooks and academic websites to help you do so.
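For the curious, the two formulas above can be tried out in a few lines of Python. The post does not define N or x, so this sketch assumes the usual reading: N is the size of the whole population, and x = z² · r · (100 – r), where z is 1.96 for 95% confidence and r is the expected percentage giving a particular answer (50 is the most conservative choice). With those assumptions, E comes out in percentage points. The population figure used below is a hypothetical round number, not an official household count.

```python
import math

def x_term(z=1.96, r=50.0):
    # Assumed definition of x: z squared times r * (100 - r), with r in percent.
    return z ** 2 * r * (100.0 - r)

def required_sample_size(N, E, z=1.96, r=50.0):
    """Sample size n for population size N and margin of error E (percentage points)."""
    x = x_term(z, r)
    return N * x / ((N - 1) * E ** 2 + x)

def margin_of_error(N, n, z=1.96, r=50.0):
    """Margin of error E (percentage points) for a sample of n drawn from N."""
    x = x_term(z, r)
    return math.sqrt((N - n) * x / (n * (N - 1)))

N = 120_000_000  # hypothetical population size, for illustration only

n_needed = math.ceil(required_sample_size(N, E=3.1))
e_for_1000 = margin_of_error(N, n=1000)

print(f"Interviews needed for +/-3.1 points: {n_needed}")
print(f"Margin of error with 1,000 interviews: +/-{e_for_1000:.2f} points")

# The required sample size changes very little as the population grows.
for size in (100_000, 1_000_000, 100_000_000):
    print(size, math.ceil(required_sample_size(size, E=3.1)))
```

The loop at the end illustrates the point that surprises most people: once the population is large, making it larger still barely changes the number of interviews needed.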
If you have stuck with me this far, I hope you are persuaded that the Consumer Pulse research is well-designed, rigorous and accurate. That is why we feel confident calling it “the most reliable and scientifically valid research available on smart energy technology.”