Commentary

Two major areas of uncertainty cloud the issue of whether implementing systematic population screening for oral cancer and precancer would be effective and economically viable. These are: (i) uncertainty over rates of malignant transformation of precancerous lesions; and (ii) whether therapeutic benefits would follow from early detection of potentially malignant lesions or, indeed, individuals at high risk. By synthesising findings from available studies on leukoplakia prevalence, this review provides the most dependable information to date on the first question.

Although entitled a systematic review, the method described for identifying research, selecting studies and assessing study quality did not conform with stringent guidelines for such reviews promulgated by, for example, the UK National Health Service's Centre for Reviews and Dissemination. There is no mention of an advisory group having been formed to oversee the conduct of the study, the only databases searched were Medline and Embase and the selection of studies for inclusion in the review was, as far as can be ascertained, carried out solely by the author with no independent second reviewer. Nevertheless, it is likely that most of the relevant ecological studies were revealed in what, in effect, amounted to a scoping search.

At the same time, the search terms failed to identify a number of screening studies that might also have provided relevant, verified prevalence data on precancer. The yield of studies from the search process was very limited, as is the case in many reviews that attempt to derive pooled estimates of quantitative findings by combining suitable data from available published reports. To obtain a reasonably sized pool of data for meta-analysis, inclusion criteria that were less strict than might have been desirable were adopted. This resulted in the selection of 23 studies. Of these, only six were deemed to be well-designed.

The random-effects method for combining different treatment-effect sizes from heterogeneous clinical trials was the method preferred for gaining a robust pooled estimate of leukoplakia prevalence. This was calculated to be 2.60% (see above). Consideration was given to the advantages of this method compared with the alternative inverse-variance method of calculating a weighted average. Sensitivity analysis showed that, using the random-effects method, the impact of non-valuable studies was not so strong as to significantly change the value of the pooled estimate.
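The difference between the two pooling approaches can be illustrated with a minimal sketch. It assumes prevalences are pooled on the log scale, that the variance of a log prevalence is approximated by (1 − p)/cases, and that between-study variance is estimated with the DerSimonian-Laird formula; none of these choices is confirmed by the review itself, and the function name and any data passed to it are hypothetical.

```python
import math

def pool_prevalence(cases, sizes):
    """Return (fixed_effect, random_effects, tau2) for pooled prevalence.

    Assumption: pooling is done on log prevalence with the delta-method
    variance (1 - p) / cases. This is an illustrative choice only.
    """
    y = [math.log(c / n) for c, n in zip(cases, sizes)]
    v = [(1 - c / n) / c for c, n in zip(cases, sizes)]

    # Inverse-variance (fixed-effect) weighted average.
    w = [1 / vi for vi in v]
    mu_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, y))
    c_term = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c_term)

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (vi + tau2) for vi in v]
    mu_random = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)

    return math.exp(mu_fixed), math.exp(mu_random), tau2
```

When the studies agree closely, tau2 falls to zero and the two estimates coincide. When they are heterogeneous, the random-effects weights become more nearly equal, so large studies dominate less.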

Finally, a weighted average of the annual malignant transformation rate using data from six studies was calculated by the inverse-variance method. The resulting pooled estimate was 1.36%. Using these data together with the global leukoplakia prevalence estimate, the author estimated the number of oral cancer cases occurring in the world per annum as a result of malignant transformation of leukoplakia. The incidence was calculated as 6.2–29.1 cases for every 100 000 people. Interestingly, when this figure was transposed to the world population, the lower limit exceeded the officially reported crude overall annual world-wide oral cancer incidence rate of 5.56 per 100 000 people. The discrepancy was attributed to gross under-reporting of oral cancer cases in less developed countries.

Professor Petti responds

Robustness of the pooled world leukoplakia prevalence estimate

Estimating world leukoplakia prevalence could be helpful in determining the necessity of implementing oral cancer preventive programmes and in deciding the most effective strategy to adopt.

The main problem in estimating global oral leukoplakia prevalence is reducing the various sources of error to the lowest possible level. Errors can be divided into errors in the primary studies and errors in the systematic review.

Primary study errors may be systematic or casual errors. Casual error, because of sample variability, is a function of the sample size, and can be easily investigated by means of assessment of the precision level, which is the inverse of the standard error. Systematic errors are ‘between-study’ discrepancies because of differences in the study design and are partly responsible for the between-study heterogeneity. A high degree of systematic error is not necessarily symptomatic of low study quality: it can also occur between high-quality studies, for example, when different threshold levels of certainty are chosen in leukoplakia diagnosis. High-quality studies adopting the C-1 certainty level (ie, provisional diagnosis made with one clinical examination), according to the certainty factor scale,1 are more likely to report higher prevalence estimates than high-quality studies that adopt the C-4 level (ie, histopathological confirmation of persisting lesions).
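The link between sample size and precision can be made concrete with a small sketch. It assumes the usual binomial standard error of a proportion, sqrt(p(1 − p)/n), which the text does not specify; the function name and figures are hypothetical.

```python
import math

def prevalence_precision(cases, sample_size):
    """Precision of a prevalence estimate, defined as 1 / standard error.

    Assumption: the standard error is the simple binomial one,
    sqrt(p * (1 - p) / n).
    """
    p = cases / sample_size
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    return 1 / standard_error

# Same 2% prevalence observed in a small and a large (hypothetical) sample.
small = prevalence_precision(20, 1000)
large = prevalence_precision(200, 10000)
```

With prevalence held constant, a tenfold larger sample raises precision by a factor of the square root of ten, which is why casual error is described as a function of sample size.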

All the primary studies bring with them, to different extents, both casual and systematic errors. The crucial point in conducting a systematic review on leukoplakia prevalence is determining the highest acceptable error level for the inclusion of the primary studies in the reviewing process.

Review errors result in insufficient primary studies to allow a robust pooled-prevalence estimate to be made, and can be divided into errors in primary study inclusion and errors in primary study publication.

Examples of inclusion errors are the non-inclusion of undetected high-quality primary studies, or the inclusion of low-quality studies. Errors in publication occur because of the tendency of journal editors not to publish, and of investigators not to submit, studies made on samples of small sizes that consequently have a low precision level, reporting low-prevalence estimates.

In writing the paper, I decided to control for these various forms of error using a conservative method to estimate the pooled leukoplakia prevalence (ie, the random-effect method), and to explore the various sources of error. I calculated between-study heterogeneity and made a sensitivity analysis of both the inclusion criteria and the statistical approach adopted.

There was an alternative more intuitive, but less accurately measurable, method to explore all these forms of error which I decided not to include in the paper to avoid overloading it with statistics. This method is the funnel plot. Plotting the logarithms of the primary study point-prevalence estimates on the x-axis (note that normalisation by logarithm transformation is required because of the skewed distribution of point-prevalence estimates) against precision on the y-axis results in a normal distribution of primary studies within a certain degree of heterogeneity. The presence of a high level of the four aforementioned forms of error is easily visible, although not exactly measurable, by means of the funnel plot. More specifically, primary studies with a high degree of casual error are located at the bottom of the graph because of their low precision level, and are widely spread around the logarithm of the true point-prevalence estimate (although the presence of many studies with high casual error level does not lead to biased pooled-prevalence estimates, but to wider confidence intervals).
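The coordinates of such a plot can be sketched as follows. The sketch assumes precision is taken as the inverse of the delta-method standard error of the log prevalence, sqrt((1 − p)/cases); the helper names and the idea of counting low-precision studies on either side of a centre line are illustrative and are not taken from the paper.

```python
import math

def funnel_points(studies):
    """Map (cases, sample_size) pairs to (log_prevalence, precision) points."""
    points = []
    for cases, n in studies:
        p = cases / n
        # Assumed standard error of log(p); precision is its inverse.
        se_log = math.sqrt((1 - p) / cases)
        points.append((math.log(p), 1 / se_log))
    return points

def side_counts(points, centre, max_precision):
    """Count low-precision studies to the left and right of the centre."""
    low = [x for x, precision in points if precision <= max_precision]
    left = sum(1 for x in low if x < centre)
    right = sum(1 for x in low if x > centre)
    return left, right
```

Unequal counts among low-precision studies are only a crude visual cue for asymmetry, in keeping with the remark that the funnel plot shows error without measuring it exactly.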

Studies that should not be used for the systematic review, because of a high degree of systematic error, do not follow the normal distribution and their position in the graph is outside the “funnel” of the remaining primary studies. Lack of detection or inclusion of high-quality studies, which compromises the robustness of the pooled-prevalence estimate, would result in a non-normal study distribution and the “cloud” of primary studies in the graph is flatter than expected. Finally, lack of publication, detection or inclusion of less precise studies that report low prevalence estimates makes the plot asymmetric, with the right-hand side containing more studies for low-precision values than its counterpart.

If there was a high level of publication error, an adjustment would be required because it would invariably lead to higher pooled-prevalence estimates. Such adjustment can be made by means of the so-called trim and fill method.2 First, the asymmetric studies on the right-hand side of the plot are removed (trimmed), leaving a symmetric remainder from which the pooled-point leukoplakia prevalence is re-estimated using the random-effect method. The logarithm of the new pooled-prevalence estimate is the true centre of the funnel. The removed studies are then replaced and their “unpublished” counterparts imputed (filled), like mirror images of the trimmed studies. Specifically, each of these unpublished studies has the same precision value and the same distance from the newly estimated centre as its published counterpart, but it is located on the left-hand side of the plot. After the inclusion of these unpublished studies, a new pooled-prevalence estimate is calculated using the method suggested by the overall heterogeneity level.
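The "fill" step described above amounts to a reflection about the re-estimated centre, as in this minimal sketch. It covers only the mirroring of studies that have already been trimmed; how many studies to trim is decided by the published method and is not reproduced here. The function name and input values are hypothetical.

```python
def fill_mirror_images(trimmed, centre):
    """Impute "unpublished" counterparts of trimmed studies.

    trimmed: list of (log_prevalence, precision) points removed from the
             right-hand side of the funnel plot.
    centre:  log of the pooled prevalence re-estimated without them.

    Each imputed study keeps the precision of its published counterpart
    and lies at the same distance from the centre on the left-hand side.
    """
    return [(2 * centre - x, precision) for x, precision in trimmed]

# Hypothetical trimmed studies reflected about a centre of -4.15.
filled = fill_mirror_images([(-3.0, 2.0), (-3.5, 1.5)], centre=-4.15)
```

The pooled estimate is then recalculated from the original and imputed studies together.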

The funnel plot made using the primary studies included in my systematic review, displayed in Figure 1, clearly shows that publication error was the only truly worrying problem. Using the trim and fill method and including the unpublished studies (located on the left-hand side of the graph and displayed between brackets), the adjusted pooled world leukoplakia prevalence estimate calculated using the random-effect method was 1.93% (95% CI, 1.55–2.31), which is not statistically different at the 95% level from the published estimate (2.60%; 95% CI, 1.72–2.74). (See original paper for raw data.)

Figure 1. Funnel plot filled with the asymmetric “trimmed” studies (identified by the study number, see original article) on the right-hand side of the graphic, the logarithm of the pooled point leukoplakia prevalence estimated using the remaining studies (identified by the vertical line at −4.15 on the x-axis, corresponding to 0.0158) and the “unpublished” filled studies (between brackets) on the left-hand side of the graphic.

Starting from this new value, the crude global annual oral cancer incidence rate would fall, with 95% probability, between 5.60 and 24.54 for every 100 000 people, statistically different at the 95% level from the officially reported incidence for the year 2000 of 5.56 per 100 000 people. This result confirms the robustness of the previously reported pooled estimate of world leukoplakia prevalence. This is despite the fact that study selection and quality assessment did not conform with stringent guidelines so that ultimately too many low-quality studies were included and some important high-quality studies were not detected.