Commentary

In most orthodontic treatment plans some form of anchorage is necessary to control the reciprocal forces of tooth movement.1 It is usually obtained by applying a force to a group of teeth or through extra-oral sources, eg the neck or cranium. However, these techniques have a limited area of application, may still cause unintended loss of anchorage, or depend on the cooperation of the patient.1 Orthodontic mini implants are not subject to most of these limitations, are indicated for a wide variety of treatment mechanics and can be used in both jaws over a wide range of time periods.2

The failure rate of these devices during the application of orthodontic forces is a patient-important outcome. This issue is addressed in the systematic review that is critically appraised in this commentary. The completion of this broad-spectrum review is a major undertaking and the authors are to be complimented for their persistence. However, I want to discuss some limitations of this article that should have been addressed during the preparation of the protocol and the reviewing and editorial processes. The AMSTAR3 and CASP (www.casp-uk.net accessed April 25th 2013) tools were consulted for this critical appraisal.

Most of the limitations of this systematic review are the result of an inadequate framing of the review question. This paper addressed one broad-spectrum question on the failure rates of miniscrews and over 40 ‘narrow-spectrum questions’ on individual variables that could influence these outcomes. This number of questions alone creates manageability issues for both the review team and the editors.4 Broad-spectrum questions are indicated when it is plausible that outcomes are more or less the same for different subpopulations of patients and interventions, but earlier systematic reviews have shown that this is not the case.5,6,7 This paper could still have been framed around a broad-spectrum review question that shows the heterogeneity in failure rates between subpopulations, but it should then have been complemented with only a few narrow-spectrum questions that were well defined a priori.

The materials and methods section of a systematic review should include a specific chapter ‘Criteria for including studies for this review’, in which the types of participants, interventions, outcomes and studies are defined.4 These definitions were either incomplete or missing in this paper and should also have been given for each narrow-spectrum question. For example, the primary outcome ‘failure rates of miniscrews’ was not defined. This creates confusion, because authors of the selected studies have defined this outcome differently or not at all.6,8 Further, patient-important outcomes such as adverse effects of interventions, which are obligatory outcomes in systematic reviews according to the Cochrane Collaboration, were not assessed.4 The time point for measuring outcomes was also not defined. This is an important factor, because varying durations could introduce heterogeneity, and studies too short in duration could have little relevance. In addition, study selection was based on study design labels and not on explicit design features. To avoid ambiguity, the Cochrane Collaboration recommends against using design labels for selecting studies.9

Because the narrow-spectrum research questions and their eligibility criteria were not specifically defined, it is impossible to understand why certain studies were selected and how to assess the validity of their outcomes. It is, for example, not clear why only two studies were selected for assessing associations between insertion torque and failure rates when at least seven of the included articles addressed such an association.10 A table with the names of the excluded studies and the reasons for exclusion would have been helpful. This lack of transparency is problematic for readers, guideline developers and researchers who want to update or replicate all or some parts of this paper.

In addition, outcomes of both randomised controlled trials (RCTs) and non-randomised studies were pooled in the same forest plot. However, when evidence is identified in different study designs, it is preferable to synthesise their effect estimates separately.9,11 The diamond in the forest plot that summarises the outcomes of all study designs should have been omitted, because it is likely to mislead the reader. Separate forest plots for each type of study design should have been shown.
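To illustrate what synthesising by study design means in practice, the following is a minimal sketch only. It assumes each study reports a count of failed miniscrews and a total, and it uses a simple fixed-effect inverse-variance pooling on the logit scale, which is not necessarily the model used in the appraised review. The function names and the input format are hypothetical, and any numbers passed to it would be the reader's own data.

```python
import math
from collections import defaultdict


def pooled_proportion(studies):
    """Fixed-effect inverse-variance pooling of proportions on the logit scale.

    `studies` is a list of (failures, total) pairs. This is an illustrative
    simplification; a random-effects model may be more appropriate when
    heterogeneity is substantial.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for failures, total in studies:
        successes = total - failures
        # Continuity correction only when a cell is zero (assumption).
        if failures == 0 or successes == 0:
            failures, successes = failures + 0.5, successes + 0.5
        logit = math.log(failures / successes)
        variance = 1.0 / failures + 1.0 / successes
        weight = 1.0 / variance
        weighted_sum += weight * logit
        weight_total += weight
    pooled_logit = weighted_sum / weight_total
    # Back-transform the pooled logit to a proportion.
    return 1.0 / (1.0 + math.exp(-pooled_logit))


def pool_by_design(records):
    """Pool failure proportions separately for each study design.

    `records` is a list of (design_label, failures, total) tuples.
    No overall estimate across designs is computed, on purpose.
    """
    groups = defaultdict(list)
    for design, failures, total in records:
        groups[design].append((failures, total))
    return {design: pooled_proportion(group) for design, group in groups.items()}
```

The sketch deliberately returns one estimate per design and no combined summary, which mirrors the recommendation to show separate forest plots rather than a single overall diamond.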

To be useful for clinicians, systematic reviews must not only provide an outcome, but also give the necessary information to judge whether this effect estimate is likely to be correct.12 These judgments are not based exclusively on the assessment of risk of bias. The GRADE Working Group has identified five categories (risk of bias, imprecision, inconsistency, indirectness and publication bias) which are used to make quality ratings of outcomes.12 This rating procedure has been adopted by the Cochrane Collaboration,13 but was not conducted in this systematic review. These assessments should have been done separately for RCTs and observational studies.14 GRADE provides a starting quality rating, ‘high quality’, for outcomes from the former and ‘low quality’ for those from the latter type of studies.12 These ratings can be subsequently up- or down-graded based on the application of the five quality criteria.
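The rating logic described above can be sketched roughly as follows. This is an illustration only: GRADE ratings are structured judgments, not arithmetic, and the names `Quality`, `starting_quality` and `rate_outcome`, the integer "levels" per domain and the clamping to the scale are assumptions made for the example rather than part of any official tool.

```python
from enum import IntEnum


class Quality(IntEnum):
    VERY_LOW = 0
    LOW = 1
    MODERATE = 2
    HIGH = 3


# The five GRADE categories named in the text.
DOMAINS = (
    "risk_of_bias",
    "imprecision",
    "inconsistency",
    "indirectness",
    "publication_bias",
)


def starting_quality(randomised: bool) -> Quality:
    """RCT outcomes start at 'high'; observational outcomes start at 'low'."""
    return Quality.HIGH if randomised else Quality.LOW


def rate_outcome(randomised: bool, downgrades: dict, upgrade_levels: int = 0) -> Quality:
    """Apply down- and up-grading to the starting rating.

    `downgrades` maps a domain name to the number of levels deducted for it
    (hypothetical representation). The result is clamped to the four-level scale.
    """
    unknown = set(downgrades) - set(DOMAINS)
    if unknown:
        raise ValueError(f"unknown GRADE domain(s): {sorted(unknown)}")
    level = int(starting_quality(randomised)) - sum(downgrades.values()) + upgrade_levels
    return Quality(max(int(Quality.VERY_LOW), min(int(Quality.HIGH), level)))
```

For example, `rate_outcome(True, {"risk_of_bias": 1, "imprecision": 1})` evaluates to `Quality.LOW` under these assumptions. How many levels to deduct for each domain remains a judgment by the review team that the sketch does not capture.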

The quality of the outcome ‘implant failure rates’ for the broad-spectrum review question in the five RCTs could be down-rated to ‘low’ quality, because of risk of bias, imprecision and ‘substantial inconsistency’. Because no information on adverse effects was provided, these quality judgments cannot be placed into perspective. The same three domains could also down-rate the quality of the observational studies to ‘very low’ quality. Publication bias would probably lower these ratings further. Some of the selected studies were also identified as ‘suspect of multiple publication bias’ in another systematic review.10

Quality ratings of the outcomes of the narrow-spectrum questions are not possible, because specific PICOT (Participants, Interventions, Comparisons, Outcomes and Time periods) questions were not defined and most of the information needed to make such judgments was not provided by the authors.

In the conclusions the authors state that ‘the small mean failure rates of miniscrews indicate their usefulness in clinical practice’. However, systematic reviewers have been discouraged from making such practice recommendations, because these judgments depend not just on one patient-important outcome, but also on, eg, the quality of the evidence, the balance between desirable and undesirable outcomes, costs, the setting and input from pertinent stakeholders.12

Many of the limitations outlined in the previous sections could have been prevented in the protocol phase of this systematic review. Splitting the article into several publications with narrower-spectrum PICOT questions could have made this broad-spectrum review more manageable and accessible.4 A methodologist with expertise in conducting systematic reviews and meta-analyses should have been included in the review team to assist with methodological issues.

However, the editors and peer reviewers of the American Journal of Orthodontics and Dentofacial Orthopedics (AJODO) also have an important impact on the quality of this paper. This article, for example, is not structured according to a validated format for reporting systematic reviews.15,16,17 Although the AJODO has adopted the PRISMA statement and promotes the protocols of the Cochrane Collaboration for reporting systematic reviews, full endorsement of these reporting formats in this journal has been deficient.18,19,20 The quality of systematic reviews will only improve when editors implement these guidelines and also assess whether authors have complied with them.21 Peer reviewers and editors owe such rigour to the hard work of current and future review teams. Some minor suggestions by these stakeholders could have significantly improved the quality of this paper.

Failure rates of orthodontic mini implants in combination with other patient-important outcomes should be evaluated when implementing these devices for orthodontic anchorage. True failure rates in different subpopulations of patients and interventions will probably vary substantially from those summarised in the forest plot of this systematic review.

Practice points

  • The critical appraisal of this systematic review suggests that the quality of the body of evidence on failure rates of orthodontic mini-implants is low.

  • Editors of journals should not only adopt validated reporting guidelines of systematic reviews, but should also implement them.