Lies, Damned Lies, and Statistics (9): Too Small Sample Sizes in Surveys

So many things can go wrong in the design and execution of opinion surveys. And opinion surveys are a common tool in data gathering in the field of human rights.

As it’s often impossible (and undesirable) to question a whole population, statisticians usually select a sample from the population and put their questions only to the people in this sample. They assume that the answers given by the people in the sample are representative of the opinions of the entire population. But that’s only the case if the sample is a fully random subset of the population, meaning that every person in the population has an equal chance of being chosen. It also requires that the sample hasn’t been distorted by other factors, such as self-selection by respondents (a common thing in internet polls) or personal bias on the part of the statistician who selects the sample.

A sample that is too small is also not representative of the entire population. For example, if we ask 100 people whether they approve or disapprove of discrimination against homosexuals, and 55 of them say they approve, we might assume that about 55% of the entire population approves. But it is possible that only 45% of the total population approves, and that we just happened, by chance, to interview an unusually large percentage of people who approve. This may have happened because, by chance and without being aware of it, we selected the people in our sample in such a way that it contains relatively more religious conservatives than society does.
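To get a feel for how likely such a fluke is, here is a minimal Python sketch. It assumes a simple random sample in which each respondent approves independently with the true population share, so the number of approvals follows a binomial distribution. The function name `prob_at_least` is made up for this illustration, and the 1,000-person survey is an added comparison, not part of the example above.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more 'approve' answers among n random respondents
    when the true approval share in the population is p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Chance of seeing 55 or more approvals out of 100 when only 45% really approve
print(prob_at_least(55, 100, 0.45))

# The same 55% result in a survey of 1,000 people (550 approvals)
print(prob_at_least(550, 1000, 0.45))
```

Under these assumptions the first probability is small but far from negligible, while the second is vanishingly small. This covers only bad luck in a random draw, not a sample that is biased in other ways.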

This is the problem of sample size: the smaller the sample, the greater the influence of luck on the results we get. Asking the opinion of 100 people, and taking this as representative of millions of citizens, is like tossing a coin 10 times and assuming – after getting 3 heads and 7 tails – that the probability of heads is 30%. We all know that it’s not 30% but 50%. And we know this because when we increase the “sample size” – i.e. when we toss the coin not 10 times but, say, a thousand times – heads will come up approximately half of the time. Likewise, in our example of the survey on homosexuality, increasing the sample size reduces the chance that religious conservatives (or other groups) are disproportionately represented in the sample.
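The coin example is easy to try out. The following Python sketch simulates many rounds of tossing a fair coin and records the lowest and highest share of heads seen at each "sample size". The function names, the number of repeats and the fixed seed are arbitrary choices made for this illustration.

```python
import random

def share_of_heads(n_flips, rng):
    """Toss a fair coin n_flips times and return the fraction of heads."""
    return sum(rng.random() < 0.5 for _ in range(n_flips)) / n_flips

def spread(n_flips, repeats=1000, seed=0):
    """Repeat the experiment and return the lowest and highest share of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    shares = [share_of_heads(n_flips, rng) for _ in range(repeats)]
    return min(shares), max(shares)

for n in (10, 100, 1000):
    low, high = spread(n)
    print(f"{n:>5} tosses: share of heads ranged from {low:.2f} to {high:.2f}")
```

With 10 tosses the share of heads swings widely from one round to the next; with 1,000 tosses it stays close to one half.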

When analyzing survey results, the first things to look at are the sample size and the level of confidence (usually 95%) that the results are within a certain margin of error (usually plus or minus 5 percentage points). A high level of confidence that the results are correct within a small margin of error indicates that the sample was sufficiently large. It does not by itself show that the sample was random, because the margin of error is calculated on the assumption that it was.
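As a rough guide to how sample size, confidence level and margin of error hang together, here is a small Python sketch using the standard textbook approximation for a proportion from a simple random sample. The formula is an addition to this post, and the function names, the normal approximation, and the cautious default of p = 0.5 are all assumptions.

```python
from math import sqrt, ceil

Z_95 = 1.96  # z-value for a 95% confidence level (normal approximation)

def margin_of_error(n, p=0.5, z=Z_95):
    """Approximate margin of error for a proportion p measured on n respondents."""
    return z * sqrt(p * (1 - p) / n)

def required_sample_size(margin, p=0.5, z=Z_95):
    """Smallest n whose approximate margin of error does not exceed `margin`."""
    return ceil(z**2 * p * (1 - p) / margin**2)

for n in (100, 400, 1000):
    print(f"n = {n:>4}: about +/- {margin_of_error(n) * 100:.1f} percentage points")

print("Respondents needed for +/- 5 points:", required_sample_size(0.05))
```

Under these assumptions, 100 respondents give a margin of roughly plus or minus 10 percentage points, and a margin of 5 points needs about 385 respondents. The calculation says nothing about self-selection or other bias in how the sample was drawn.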
