Surveys can provide valuable data about the prevalence of various conditions, behaviours and traits in a population, but the methods used to sample the population of interest need careful consideration. It is not simply the case that more is better, although the number of respondents to a survey is clearly important: a larger sample increases the precision of the estimate of prevalence (as reflected by a narrower 95% confidence interval (CI)). For example, if the aim of a survey is to estimate the prevalence of urinary incontinence in a population of people living with spinal cord injury (SCI), then 2000 respondents will provide a more precise estimate than 200. However, the number of respondents is less important than whether those who respond are representative of the population. If they are not, then the estimate might be precise (as shown by a narrow CI) but it will not be accurate (that is, the sample prevalence may differ substantially from the population prevalence).
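
The effect of sample size on precision can be illustrated with a minimal sketch. It assumes a simple random sample and uses the normal-approximation (Wald) interval for a proportion; the observed prevalence of 0.40 and the function name `wald_ci` are hypothetical choices made only for illustration.

```python
import math


def wald_ci(p_hat, n, z=1.96):
    """Approximate 95% CI for a proportion (normal approximation).

    p_hat: observed sample prevalence (hypothetical value in this sketch)
    n:     number of respondents
    z:     standard normal quantile for a 95% interval
    """
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clip to the valid range for a proportion.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical observed prevalence of urinary incontinence: 40%.
for n in (200, 2000):
    low, high = wald_ci(0.40, n)
    print(f"n={n}: 95% CI {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

Under these assumptions the interval is roughly ±0.07 with 200 respondents and roughly ±0.02 with 2000: a tenfold increase in sample size narrows the interval by a factor of about 3.2 (the square root of 10). The calculation says nothing about whether the respondents are representative.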

The best way to determine the prevalence of urinary incontinence in a population is to survey everyone in the population. Then there is no need to use inferential statistics. In practice, however, it is nearly always impossible to survey everyone. Even if a questionnaire is sent to everyone on a national registry of people with SCI, it is very unlikely that everyone will receive and complete the survey. Even if they do, the registry itself may be incomplete. A truly random sample would provide very valuable data because it would provide unbiased estimates of characteristics (parameters) of the population. However, most samples are non-random, which makes analysis and interpretation of the data problematic. This is because it is very difficult to determine whether non-random samples are representative of the target population. For example, those with urinary incontinence may be more likely to respond to a survey on urinary incontinence than those without. If this is the case, then the survey will overestimate the prevalence of urinary incontinence in the target population.
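
The overestimation described above can be made concrete with a short calculation. This is a sketch under stated assumptions: the true prevalence (30%) and the two response rates (60% and 30%) are invented for illustration, and the function name is hypothetical.

```python
def expected_respondent_prevalence(true_prev, resp_rate_cases, resp_rate_noncases):
    """Expected prevalence among respondents when response depends on the condition.

    true_prev:          prevalence in the target population
    resp_rate_cases:    probability that a person WITH the condition responds
    resp_rate_noncases: probability that a person WITHOUT the condition responds
    """
    responding_cases = true_prev * resp_rate_cases
    responding_noncases = (1 - true_prev) * resp_rate_noncases
    return responding_cases / (responding_cases + responding_noncases)


# Hypothetical scenario: true prevalence 30%; people with incontinence
# respond twice as often (60%) as people without (30%).
biased = expected_respondent_prevalence(0.30, 0.60, 0.30)
print(f"True prevalence: 0.30, expected survey estimate: {biased:.2f}")
```

In this scenario the expected survey estimate is about 0.46 rather than 0.30, and collecting more responses does not remove the gap: a larger sample only gives a narrower CI around the wrong value. When both groups respond at the same rate, the function returns the true prevalence.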

Simple statistical tests can be used to identify differences between the characteristics of the sample and those of the target population. Sometimes researchers compare the distributions of variables such as age, race, gender, time since injury and type of injury in the sample and the target population. When differences are identified, it is possible to adjust estimates for known imbalances between the sample and the target population [1]. The adjusted estimates may be less biased than naive estimates. However, adjustment for characteristics such as age, race, gender, time since injury and type of injury may not be enough to obtain unbiased estimates because there may be other, more important, characteristics affecting incontinence that are unknown or difficult to measure. Any statistical adjustment of prevalence estimates, even one based on appropriate weighting, therefore remains potentially problematic [2].
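
One common form of such adjustment is post-stratification weighting, sketched below for a single known characteristic. All numbers are hypothetical, as are the stratum labels and function names; this is an illustration of the general idea, not the method used in the cited references.

```python
def naive_prevalence(strata):
    """Unadjusted prevalence: all cases divided by all respondents."""
    cases = sum(s["cases"] for s in strata.values())
    respondents = sum(s["n"] for s in strata.values())
    return cases / respondents


def poststratified_prevalence(strata, population_shares):
    """Weight each stratum's sample prevalence by its known population share."""
    return sum(
        population_shares[name] * s["cases"] / s["n"]
        for name, s in strata.items()
    )


# Hypothetical sample: tetraplegia is over-represented among respondents.
sample = {
    "tetraplegia": {"n": 150, "cases": 90},  # 60% report incontinence
    "paraplegia": {"n": 50, "cases": 15},    # 30% report incontinence
}
# Hypothetical shares of each injury type in the target population.
population_shares = {"tetraplegia": 0.4, "paraplegia": 0.6}

print(f"Naive estimate:    {naive_prevalence(sample):.3f}")
print(f"Adjusted estimate: {poststratified_prevalence(sample, population_shares):.3f}")
```

With these invented figures the naive estimate is 0.525 and the adjusted estimate is 0.420. The adjustment corrects only for the imbalance in the measured characteristic; if respondents within each stratum still differ from non-respondents in unmeasured ways, the adjusted estimate remains biased.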

These are quandaries without simple solutions. They illustrate the need to be careful about how data derived from surveys are reported and interpreted. Importantly, any publication based on survey data needs to clearly articulate the population of interest, the size of that population and the methods used to sample from it. Information on selective inclusion and attrition, if available, and on statistical procedures used to adjust for non-representativeness, also needs to be described. Above all, there needs to be clear acknowledgement of the potential for error and bias in any estimates of prevalence.

Spinal Cord values surveys designed to estimate the prevalence of various conditions, behaviours and traits, but it encourages authors to explicitly address sampling issues and discuss whether estimates derived from surveys are likely to accurately reflect the characteristics of the target population.