In 2011, the Smart Grid Consumer Collaborative (SGCC) inaugurated our
Consumer Pulse research program. We’ve completed five waves so far, most
recently in late 2014. Each wave of the Consumer Pulse is a national telephone
survey of 1,000 adult (18+) heads of household, conducted by the highly
regarded research firm Market Strategies International. This is the most
reliable and scientifically valid research available on smart energy technology
and consumers.
Occasionally, when I talk about the Consumer Pulse research
at a public forum, audience members question how we can make statements about
the perceptions and preferences of the entire U.S. population when we have
interviewed only 1,000 people. This
happened just two weeks ago at a conference in Washington, D.C., where an
audience member expressed doubt that 1,000 respondents were enough. While a blog post cannot
substitute for a course in probability and statistics, I will try to address
that concern here, very briefly.
There are really two ways to answer, one based on everyday experience,
the other on theory and (sorry) some slightly complicated mathematics. First, let’s talk about your own experience
with polling data. Remember the last
presidential election? A day or two
before election day, we were all reading polls that predicted the percentage of
the national vote each candidate would receive, within a small “margin of
error.” Most national political polls
are based on one to two thousand interviews with “randomly selected”
voters, and nearly all of the polls conducted by competent, independent
observers turned out to be accurate, within their margins of error.
Even now, the lineup for the first GOP debate, hosted by Fox News, will be
determined by polling data from approximately 1,200 survey respondents,
identifying the top 10 candidates based on voter preferences.
To say that survey respondents are “randomly selected” means
that each person in the population we want to know about has an equal chance of
being selected and asked to respond.
This is key because if, for example, people over 50 were more likely
to be included in the survey than people under 50, the survey
results would be “skewed” in the direction of older people’s opinions – the
survey respondents would not be “representative” of the total population.
In our Consumer Pulse surveys, great care is taken to make
sure that the respondents are selected at random – every U.S. household has an
equal chance to be included in the surveys. When data collection is complete,
the data are weighted slightly by age, ethnicity, gender and region to
align even more closely with national population parameters. This allows us to
state that the margin of error for the total sample size of 1,000 is +/–3.1
percentage points at a confidence level of 95%. This means that there is a 95%
probability that a Consumer Pulse finding we report is within +/–3.1 percentage
points of the finding we would arrive at by interviewing every 18+ head of household
in the U.S., if that were possible.
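As a quick sanity check, the standard large-population approximation for a proportion’s margin of error, E ≈ z·√(p(1−p)/n), reproduces that ±3.1 figure. This is a minimal sketch, using the most conservative assumption of a 50/50 split in responses:

```python
import math

# Margin of error for a sample proportion at 95% confidence.
z = 1.96   # critical value of the normal distribution for 95% confidence
p = 0.5    # most conservative assumed proportion (maximizes the variance)
n = 1000   # sample size

moe = z * math.sqrt(p * (1 - p) / n)
print(f"+/-{moe * 100:.1f} percentage points")  # -> +/-3.1 percentage points
```

Note that the sample size n, not the population size, is what drives the margin of error, which is why 1,000 interviews suffice for a country of hundreds of millions.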
Proving this claim is where the math comes in. Statistical theory demonstrates that,
assuming the people included in a survey are selected randomly,
the required sample size n and margin of error E for a population of size N
are given by the following formulas, where r is the percentage of responses
of interest (50 is the most conservative choice), c is the confidence level
in percent, and Z(c/100) is the corresponding critical value of the normal
distribution:
x = Z(c/100)² · r(100 − r)

n = N·x / ((N − 1)·E² + x)

E = √[ (N − n)·x / (n·(N − 1)) ]

Egads!
I will not try to explain these equations (or pretend that I would be
competent to do so), but they are governed by the laws of probability, one of
the best understood and most practically useful areas of mathematics. If you would like to delve more deeply into
this topic, there are many good statistics textbooks and academic websites to
help you do so.
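For readers who prefer code to algebra, the formulas above can be sketched directly. This is a rough illustration; the figure of 120 million U.S. households is my own round assumption, used only to show that the formulas land near n = 1,000 and E = 3.1:

```python
import math

def required_sample_size(N, E, r=50.0, z=1.96):
    """Sample size n for population N, margin of error E (percentage
    points), and response percentage r, at the confidence level implied
    by the critical value z (1.96 corresponds to 95%)."""
    x = z ** 2 * r * (100 - r)
    return N * x / ((N - 1) * E ** 2 + x)

def margin_of_error(N, n, r=50.0, z=1.96):
    """Margin of error E (percentage points) for a sample of n drawn
    at random from a population of N."""
    x = z ** 2 * r * (100 - r)
    return math.sqrt((N - n) * x / (n * (N - 1)))

N = 120_000_000  # assumed, illustrative count of U.S. households

print(round(required_sample_size(N, E=3.1)))   # roughly 1,000
print(round(margin_of_error(N, n=1000), 1))    # roughly 3.1
```

Plugging in any large N shows the same thing: once the population is in the millions, its exact size barely changes the answer.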
If you have stuck with me this
far, I hope you are persuaded that the Consumer Pulse research is
well-designed, rigorous and accurate. That is why we feel confident calling it “the most reliable and
scientifically valid research available on smart energy technology.”