Main

Examination of the association between an outcome of interest and an exposure or intervention requires accurate measurements that are representative of the target population. Accurate measurements are also required if the purpose of the study is merely to monitor a population's health so that prevalence or incidence rates can be determined. Spurious associations and inaccurate estimates mainly arise from chance, bias, confounding and/or contamination. We must endeavour to minimise these at the design phase, as often they cannot be adjusted for later when the data are being analysed.

Findings of a study sample will often be extrapolated to a larger population. Deviations from the true population measure may be due to chance, also called random variation, which is measured as random error. Chance can never be eliminated entirely, but it can be minimised by replicating measurements and by increasing the size of the study.
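The effect of study size on random error can be illustrated with a short calculation. The sketch below assumes simple random sampling and uses the usual standard error formula for a sample proportion; the prevalence of 20% and the sample sizes are hypothetical values chosen only for illustration.

```python
import math


def standard_error_of_proportion(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical true prevalence; not taken from any real study.
true_prevalence = 0.20

for n in (100, 400, 1600):
    se = standard_error_of_proportion(true_prevalence, n)
    half_width = 1.96 * se  # approximate 95% confidence interval half-width
    print(f"n={n:5d}  SE={se:.4f}  95% CI half-width={half_width:.4f}")
```

Under these assumptions, quadrupling the sample size halves the standard error, which is why random error can be reduced but never removed altogether.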

Bias occurs when there is a systematic difference between study measurements and the true population values. Types of bias include:

Selection bias: systematic difference between those selected into the sample and those not selected. The sample is therefore not representative of the population.

Observer or measurement bias: systematic difference in measurement of health status or risk factor between observers.

Recall bias: differences in reporting experiences between those who have and those who do not have the outcome of interest; occurs particularly in retrospective studies.

Publication bias: tendency for studies that make strong statements about outcomes of interest to be reported more readily than those that do not.

Bias results from poor study design and cannot be corrected for at the analysis stage.

Confounding occurs when a spurious association is made at the analysis stage between outcome and exposure which, in reality, results from a secondary exposure that was not included in the analysis. For example, a study may find that people of one town have higher rates of oral cancer than those of another. Interpreting this to mean that oral cancer is dependent on area of residence is incorrect if it is known that people of the first town are, in general, heavier smokers than those of the second. Confounding can be minimised at the design stage by:

Randomisation: In an experimental study, subjects are assigned to control and intervention groups at random. This makes it less likely that the members of one group share higher than usual rates of other potentially confounding characteristics.

Matching: By matching pairs of subjects according to potential confounding variables, for example, sex and age, the impact of confounding is kept to a minimum (to be covered in more detail in Article V in this series).

Crossover design: Where two or more interventions are being compared, the same subject is assigned to both interventions, and relative effectiveness of the two interventions is assessed within subjects. The idea is similar to that of matching, but within-subject variation is smaller than between-subject variation. By using the same subject, potentially confounding characteristics are standardised (to be covered in the final article of this series).

Restriction (or blocking): Subjects are grouped together according to characteristics that are potential confounders and a specified, identical proportion of each group is randomly assigned to an intervention or control group. This maintains the balance of subjects with potential confounding characteristics assigned to each arm of the study.

Stratification: Like restriction, this ensures that characteristics possibly influencing the health outcome measurement are optimally balanced between intervention and control groups. Stratification can be used at the analysis stage, but the procedure has implications for sample size and, if it is not considered at the design stage, the statistical power of an experimental study will be greatly reduced.
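The grouped random assignment described under restriction (or blocking) above can be sketched as follows. This is a minimal illustration only: the function name `blocked_assignment`, the subject records and the choice of smoking status as the blocking characteristic are all hypothetical, and half of each block is sent to each arm.

```python
import random
from collections import defaultdict


def blocked_assignment(subjects, block_key, seed=None):
    """Randomly assign half of each block to each arm of the study.

    `subjects` is a list of dicts with an "id" field (hypothetical layout);
    `block_key` returns the potential confounder used to form the blocks.
    """
    rng = random.Random(seed)

    # Group subjects by the potential confounding characteristic.
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_key(subject)].append(subject)

    allocation = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)  # random order within the block
        half = len(shuffled) // 2
        for subject in shuffled[:half]:
            allocation[subject["id"]] = "intervention"
        for subject in shuffled[half:]:
            allocation[subject["id"]] = "control"
    return allocation


# Illustrative data: 8 smokers and 12 non-smokers.
subjects = [{"id": i, "smoker": i < 8} for i in range(20)]
allocation = blocked_assignment(subjects, lambda s: s["smoker"], seed=42)

for arm in ("intervention", "control"):
    smokers = sum(1 for s in subjects
                  if s["smoker"] and allocation[s["id"]] == arm)
    print(arm, "smokers:", smokers)
```

Because the shuffle happens within each block, both arms receive the same number of smokers whichever seed is used, while assignment of any individual subject remains random.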

Confounding can be controlled for at the analysis stage by using statistical models that adjust for more than one variable at once. This can only be done, however, if the confounder is known and the appropriate data have been collected.
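As a simple illustration of adjustment at the analysis stage, the sketch below revisits the two-town example using invented counts. It uses the Mantel-Haenszel pooled odds ratio, a stratified method that is simpler than the multivariable models mentioned above and is substituted here only to keep the example short. All counts are hypothetical and were constructed so that smoking is more common in Town A while, within each smoking stratum, the odds of oral cancer are identical in the two towns.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table:
    a = Town A cases, b = Town A non-cases,
    c = Town B cases, d = Town B non-cases."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across 2x2 tables (a, b, c, d)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts, one 2x2 table per level of the confounder (smoking).
strata = {
    "smokers": (80, 720, 20, 180),
    "non-smokers": (4, 196, 16, 784),
}

# Collapsing over smoking status gives the crude (unadjusted) table.
crude_table = tuple(sum(column) for column in zip(*strata.values()))

crude = odds_ratio(*crude_table)
adjusted = mantel_haenszel_or(strata.values())

print(f"Crude odds ratio (Town A vs Town B): {crude:.2f}")
print(f"Smoking-adjusted odds ratio:         {adjusted:.2f}")
```

With these invented figures the crude odds ratio is well above 1, but the smoking-adjusted odds ratio is 1: the apparent effect of town of residence is entirely attributable to smoking. The adjustment is only possible because smoking status was recorded for every subject.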

Contamination occurs when an intervention administered to the intervention group of an experimental study filters into the control group. An example might be where oral health education is given to the intervention arm and this is repeated informally by a subject in the intervention group to a subject in the control group. This may either dull or entirely mask an existing association between intervention and outcome. Contamination can be avoided by adopting a clustered study design, where clusters defined by geography or dental practice, for example, are the experimental unit assigned to one or more interventions.
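A clustered allocation of this kind might be sketched as below. The function name `cluster_randomise`, the subject labels and the practice labels are hypothetical; the point is only that the random draw is made once per cluster, so every subject attending the same dental practice ends up in the same arm.

```python
import random


def cluster_randomise(subject_to_cluster, arms=("intervention", "control"),
                      seed=None):
    """Assign whole clusters, not individual subjects, to study arms."""
    rng = random.Random(seed)

    # Randomise the order of the clusters, then alternate arms so that
    # the number of clusters per arm is as equal as possible.
    clusters = sorted(set(subject_to_cluster.values()))
    rng.shuffle(clusters)
    cluster_arm = {cluster: arms[i % len(arms)]
                   for i, cluster in enumerate(clusters)}

    # Each subject inherits the arm of his or her cluster.
    return {subject: cluster_arm[cluster]
            for subject, cluster in subject_to_cluster.items()}


# Illustrative data: six subjects attending four dental practices.
practice_of = {
    "s1": "Practice A", "s2": "Practice A",
    "s3": "Practice B", "s4": "Practice B",
    "s5": "Practice C", "s6": "Practice D",
}

for subject, arm in cluster_randomise(practice_of, seed=7).items():
    print(subject, practice_of[subject], arm)
```

Because the cluster is the experimental unit, the analysis and the sample size calculation should also be carried out with the clustering in mind.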