Introduction

The Index of Orthodontic Treatment Need (IOTN) was initially developed in the aftermath of the Schanschieff Report1 to avoid unnecessary treatment of mildly misaligned teeth. Since 2006 the IOTN has been used as a sieve in allocating treatment services, where NHS resources are limited, in a fair and transparent way.

The IOTN has allowed orthodontists to standardise their approach to evaluating treatment need and has been considered a useful tool for planning orthodontic provision.2

The IOTN assesses the need for orthodontic treatment according to the highest potential risk to the integrity of the teeth or supporting structures from the malocclusion. It attempts to identify those most likely to benefit from orthodontic treatment. The index is based on the patient's individual need and is an objective and reliable way to select those patients who will benefit most from treatment and prioritise limited NHS resources.2

In summary, the IOTN has two components: the dental health component (DHC) and the aesthetic component (AC).

The IOTN was developed at the University of Manchester by Brook and Shaw3 and was based on the index of treatment priority used by the Swedish Dental Board.4 The index needed to be sufficiently flexible to allow for adjustments of the cut-points in view of the uncertainty of the relative contribution that each occlusal trait makes to the longevity and satisfactory functioning of the dentition.3

The DHC aims to categorise the detrimental effects of various deviant occlusal traits in order of severity. The severity is categorised into five grades (1–5) based on the relative effect of various deviant occlusal traits on the longevity of the dentition. Along with a number grade, a letter is assigned to identify and record specific deviant occlusal anomalies.

The acronym MOCDO (missing teeth; overjets; crossbites; displacement of contact points; overbites) guides the observer to the single worst deviant occlusal trait of the malocclusion.

The AC was developed in Cardiff by Evans and Shaw in 1987 and was adapted from the Standardised Continuum of Aesthetic Need (SCAN) index.5 One thousand orthodontic photographs of varying attractiveness were shown to six non-dental personnel who rated them on a linear scale of attractiveness. The AC consists of 10 photographs showing different levels of dental attractiveness on a scale of 1-10 with 1 being the most attractive and 10 being the least attractive arrangement of teeth.

Since 1989 the IOTN has remained unchanged, and its use is a contractual requirement for orthodontic providers in the National Health Service in England and Wales, in an attempt to provide uniform, objective prescribing of orthodontic treatment. The IOTN allows NHS providers to decide which cases are severe enough to warrant treatment in patients under 18 years of age. To be eligible for NHS orthodontic treatment, patients must have an IOTN DHC of 4 or 5, or a DHC of 3 with an AC of 6 or above, and should be less than 18 years of age on the date of referral. NHS orthodontic care may be approved for adults on a case-by-case basis if there is a severe dental health issue or complex multidisciplinary needs.
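One reading of the eligibility threshold described above can be written as a short decision rule. This is a minimal sketch only: the function name `nhs_eligible` and its parameters are hypothetical, and adult cases, which are approved case by case, are deliberately left outside the rule.

```python
def nhs_eligible(dhc: int, ac: int, age_at_referral: int) -> bool:
    """Return True if a patient meets the routine NHS threshold.

    Assumed reading of the rule: DHC 4 or 5, or DHC 3 with AC of 6 or above,
    and under 18 years of age on the date of referral.
    """
    if not 1 <= dhc <= 5:
        raise ValueError("DHC grade must be between 1 and 5")
    if not 1 <= ac <= 10:
        raise ValueError("AC grade must be between 1 and 10")
    if age_at_referral >= 18:
        # Adults are assessed case by case, which this sketch does not model.
        return False
    return dhc >= 4 or (dhc == 3 and ac >= 6)


if __name__ == "__main__":
    # A borderline DHC 3 case qualifies only when the AC is 6 or above.
    print(nhs_eligible(dhc=3, ac=6, age_at_referral=12))
    print(nhs_eligible(dhc=3, ac=5, age_at_referral=12))
```

The sketch makes the dependence on the AC explicit: the AC changes the outcome only for patients with a DHC of 3.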

Initial development of the IOTN showed it to have an acceptable level of both intra- and inter-observer agreement, with almost perfect agreement being obtained for the DHC and substantial agreement for the AC.6 Studies have demonstrated that dentists of varying training and practical involvement can be easily trained to record the DHC and AC of the IOTN to a satisfactory level.7 It has been shown that dentists who received IOTN training referred patients more appropriately, with a greater proportion of patients having definite treatment need.8

A recent calibration study of third year dental students showed substantial ability of the students to apply both DHC and AC of IOTN. This study also showed that they applied the DHC of the IOTN better than the AC.9 Different ways of teaching IOTN to undergraduates have been examined and the use of a computer aided learning tool has been shown to be a more effective way of teaching dental students IOTN compared to when they were taught with lectures.10 Up until the introduction of direct access in May 2013 it was only dentists that were permitted by the GDC to use IOTN to screen patients. However, the introduction of the direct access guidance permits orthodontic therapists to carry out IOTN screening without the patient needing to see a dentist first. The GDC guidance states that orthodontic therapists who wish to carry out IOTN must be sure that they are trained, competent and indemnified to do so.

Not all of the registrants who are legally able and responsible for assessing who receives orthodontic treatment have been calibrated for accuracy in making these judgements. Incorrect use of the IOTN can have an impact on patient treatment and NHS resources. Inappropriate referrals to NHS orthodontic specialists due to incorrect use of the IOTN can contribute to wasted cost for both patients and commissioners. Incorrect use may also mean that patients are denied the orthodontic treatment they need.

Aim

To determine whether dental registrants, those dental care professionals registered with the General Dental Council (GDC), can use the DHC and AC of the IOTN 'accurately' to an acceptable level of agreement and diagnostic validity.

Objectives

  • To determine frequency of use of IOTN among dental registrants, working in either primary or secondary care.

  • To establish when registrants last had training in the IOTN and if the training was considered verifiable or non-verifiable.

  • To ascertain which factors influence 'accuracy' of use of IOTN.

In this part of the article, 'accuracy' in use of the IOTN among different registrant groups will be discussed, looking at agreement and diagnostic validity. The remaining objectives will be discussed in part 2.

Method

Ethical approval for the study was granted by the Leeds University Dental Research Ethics committee and from the National Research Ethics Service.

Different dental registrants (Table 1) were assessed on their accuracy of use of IOTN by requesting each participant to score both DHC and AC of the IOTN for 14 pre-selected cases. Each case was presented using a set of study models and an intra-oral frontal view photograph. The 14 selected cases presented a range of malocclusions. The IOTN scores were compared to expert scores, which were determined by members of the orthodontic team based at Leeds University, who had been recently calibrated in the use of IOTN.

Table 1 Inclusion criteria for the registrant groups

All participants had access to a clear flexible ruler as well as the IOTN ruler and an Ortho-Care IOTN aide-memoire. There were no time restrictions imposed to ensure the IOTN scores produced by participants represented what would be produced in a clinical setting that has access to similar conditions and facilities.

The participants were also asked to complete a short questionnaire comprising the following questions:

  • Frequency of use of IOTN?

  • Last episode of training/teaching in the use of IOTN?

  • Type of training: verifiable or non-verifiable CPD?

  • Place of work: primary care, secondary care or in both?

  • Year of qualification/specialisation?

Each of the individual questions was presented with various tick box options. Data from the questionnaire was analysed to provide information on training and use of the IOTN. The results from the questionnaire will be discussed in part 2 of this article.

Any registrant that did not fit the groups listed in Table 1 was not permitted to take part in the study.

Participants were recruited at two UK annual national conferences: the British Dental Association (BDA) Conference in April 2014 and the British Orthodontic Conference (BOC) in September 2014. Both organisations sponsored a stand in the main exhibition hall to allow recruitment of delegates. During both conferences only delegates who approached the stand were informed about the study and were provided with both verbal and written information.

Participants for the student groups were recruited at the following study days held at Leeds University:

  • Northern Universities Consortium (NUC) MOrth revision course for postgraduate orthodontic students in their third year (April 2014)

  • Yorkshire Orthodontic Therapy Course (July 2014).

At both study days a short PowerPoint presentation was provided to delegates inviting them to take part in the study. All participants were provided with verbal and written information about the study. Written consent was obtained from each participant. In order to reduce bias the data record sheets were all numbered and the IOTN scores were separated from the questionnaire sheet, ensuring blinding of the results. As an incentive to take part in the study all participants were invited to be included in a prize draw to win an iPad mini.

Statistical analysis

Assessing agreement

To assess the agreement between the participants' scores and the expert scores, Cohen's kappa (k) statistics were calculated for the DHC and AC for each participant. Kappa statistics were used to ensure that chance was corrected for when measuring agreement. This was achieved by subtracting chance expected agreement from the observed agreement and then rescaling.11 In the context of this paper kappa was used as a statistical measure of agreement of categorical variables.12

Landis & Koch12 have defined kappa values of zero to 0.20 to represent poor agreement; 0.21 to 0.40 to represent fair agreement; 0.41 to 0.60 to represent moderate agreement; 0.61 to 0.80 to represent substantial agreement; and values over 0.81 to represent excellent agreement.

The criterion for acceptable agreement ('accurate use' of the IOTN) with the expert scores was set at kappa >0.60. To establish agreement for each of the individual registrant groups, mean kappa results were calculated for the DHC and AC independently.
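The chance correction described above can be illustrated with a short, unweighted Cohen's kappa calculation for one participant against the expert scores. This is a minimal sketch and not the authors' analysis code: the function names and the example scores are hypothetical, the bands follow the wording used in this paper, and treating negative kappa values as 'poor' is an assumption.

```python
from collections import Counter


def cohens_kappa(participant, expert):
    """Unweighted Cohen's kappa: (observed - expected) / (1 - expected)."""
    if len(participant) != len(expert) or not participant:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(participant)
    # Proportion of cases where the participant matches the expert exactly.
    observed = sum(p == e for p, e in zip(participant, expert)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    p_counts, e_counts = Counter(participant), Counter(expert)
    expected = sum(p_counts[c] * e_counts[c] for c in p_counts) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Map kappa to the bands quoted in this paper (negative values: assumed 'poor')."""
    k = round(kappa, 2)
    if k <= 0.20:
        return "poor"
    if k <= 0.40:
        return "fair"
    if k <= 0.60:
        return "moderate"
    if k <= 0.80:
        return "substantial"
    return "excellent"


if __name__ == "__main__":
    # Hypothetical DHC grades for six cases (the study itself used 14 cases).
    participant_dhc = [4, 4, 3, 5, 2, 4]
    expert_dhc = [4, 5, 3, 5, 2, 3]
    k = cohens_kappa(participant_dhc, expert_dhc)
    print(round(k, 2), agreement_band(k), "acceptable" if k > 0.60 else "not acceptable")
```

In this example the observed agreement is 4/6 and the chance-expected agreement is 8/36, giving a kappa of 4/7 (about 0.57), which falls in the moderate band and below the acceptable threshold of 0.60.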

Assessment of diagnostic validity

Validity ensures the index measures what it is supposed to measure. In the case of the IOTN, DHC scores 1–3 indicate 'no or slight need' and scores 4–5 indicate a 'definite treatment need'.13 For the AC, scores 1–7 indicate 'no or slight need' and scores 8–10 indicate a 'definite treatment need'.13 The diagnostic validity of each participant in making the binary yes/no decision as to whether treatment was indicated was assessed. The specificity (the proportion of cases not requiring treatment that were correctly identified as such) and sensitivity (the proportion of cases requiring treatment that were correctly identified as such) were calculated for each participant and used to determine the diagnostic validity in making the binary decision of needing treatment, based on the DHC and AC as separate entities. Diagnostic validity was considered acceptable if specificity and sensitivity were both >70% for the DHC and for the AC of the IOTN. Mean sensitivity and specificity were calculated for the different registrant groups.
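The binary decision described above can be sketched as follows, taking the expert score as the reference standard. The cut-offs (DHC 4 or above, AC 8 or above) come from the text. The function names and example scores are hypothetical, and this is an illustration, not the authors' code.

```python
# Lowest score counted as 'definite treatment need' for each component.
CUTOFFS = {"DHC": 4, "AC": 8}


def needs_treatment(score, component):
    """Collapse an IOTN score into the binary treat / do not treat decision."""
    return score >= CUTOFFS[component]


def sensitivity_specificity(participant, expert, component):
    """Return (sensitivity %, specificity %) against the expert scores.

    A value is None when the expert set contains no cases of that type.
    """
    if len(participant) != len(expert):
        raise ValueError("score lists must be of equal length")
    tp = fn = tn = fp = 0
    for p, e in zip(participant, expert):
        p_need = needs_treatment(p, component)
        e_need = needs_treatment(e, component)
        if e_need and p_need:
            tp += 1  # treatment needed and identified
        elif e_need:
            fn += 1  # treatment needed but missed
        elif p_need:
            fp += 1  # treatment not needed but indicated
        else:
            tn += 1  # treatment not needed and correctly excluded
    sensitivity = 100 * tp / (tp + fn) if tp + fn else None
    specificity = 100 * tn / (tn + fp) if tn + fp else None
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical DHC scores for six cases.
    sens, spec = sensitivity_specificity([4, 4, 3, 4, 2, 2], [5, 4, 4, 3, 2, 1], "DHC")
    print(f"sensitivity {sens:.1f}%, specificity {spec:.1f}%")
```

Here the participant identifies two of the three cases needing treatment and two of the three cases not needing it, so both sensitivity and specificity are 66.7%, below the 70% threshold.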

Assessment of bias

In this study the Wilcoxon signed rank test was used for both the DHC and AC, looking at the different registrant groups, to assess for any bias in the participant scores. Bias is considered present if participant scores are consistently too high (overscoring) or too low (underscoring) when compared to the expert scores.
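The direction of bias can be illustrated by computing the two rank sums that underlie the Wilcoxon signed rank test. The text does not state how scores were paired or pooled within each registrant group, so this is a sketch under the assumption that each participant score is paired with the expert score for the same case. The function name and data are hypothetical, and no p-value is computed.

```python
def signed_rank_sums(participant, expert):
    """Return (W_plus, W_minus) for paired participant and expert scores.

    W_plus sums the ranks of overscored cases and W_minus the ranks of
    underscored cases. Ties with the expert are dropped, and tied absolute
    differences share their average rank.
    """
    if len(participant) != len(expert):
        raise ValueError("score lists must be of equal length")
    diffs = sorted((p - e for p, e in zip(participant, expert) if p != e), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


if __name__ == "__main__":
    # Hypothetical DHC scores: four cases underscored, one overscored, one tied.
    w_plus, w_minus = signed_rank_sums([3, 4, 2, 4, 1, 5], [4, 4, 3, 5, 3, 4])
    direction = "underscoring" if w_minus > w_plus else "overscoring or no bias"
    print(w_plus, w_minus, direction)
```

A W_minus that is much larger than W_plus indicates that the participant's scores tend to fall below the expert scores. In practice the test statistic and its significance would normally be obtained from a statistics package.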

Results

Demographics

A total of 229 participants took part in the study with the number of participants within each registrant group shown in Table 2.

Table 2 Number of participants per registrant group

The number of participants varied between the registrant groups (Table 2). The groups with the highest number of participants were the GDP group followed by the DFT group. The groups with the fewest participants were the SOT group followed by the QOT group.

Assessment of agreement

The mean kappa for the DHC for the individual registrant groups is shown in Table 3 and Figure 1. The mean DHC kappa scores for the different registrant groups ranged from 0.25 (DFT group) to 0.74 (SO group). The mean AC kappa scores for the different registrant groups ranged from 0.13 (SO group) to 0.21 (QOT group) (Table 4). The percentage of participants within each of the groups who achieved k >0.60 for the DHC is shown in Figure 2. For the AC only one participant achieved k >0.60, and they were from the GDP registrant group.

Table 3 Agreement statistics for DHC for the different registrant groups
Figure 1 Mean DHC kappa scores for the different registrant groups

Table 4 Agreement statistics for AC for the different registrant groups
Figure 2 Percentage of participants achieving DHC kappa >0.60 for the different registrant groups

For the DHC all groups achieved an acceptable mean sensitivity of >70%, ranging from 74.1% to 97.2%. Only the PGOS group achieved an acceptable mean specificity of >70%; mean specificity ranged from 26.0% to 82.1% (Table 5). The mean sensitivity scores were higher than the mean specificity scores for all registrant groups, indicating that participants were better at identifying when treatment was needed than when it was not.

Table 5 Sensitivity and specificity for DHC of the IOTN

For the AC, none of the groups achieved an acceptable mean sensitivity of >70%; the mean sensitivity scores ranged from 54.3% (DFT group) to 68.3% (QOT group) (Table 6).

Table 6 Sensitivity and specificity for AC of the IOTN

All the groups achieved an acceptable mean specificity of >70% (Table 6).

For the AC all participant groups were better at identifying those that did not need treatment than those that did, which is the opposite of how they performed with the DHC of the IOTN.

Assessment of bias

The results from the Wilcoxon signed rank test found a trend of underscoring in both the DHC and AC of the IOTN. Each of the registrant groups had underscored a greater number of cases than overscored, indicating a bias in the direction of underscoring.

Discussion

The participation rate varied between the individual registrant groups because of limitations in the recruitment of participants. However, the range is representative of the proportion of dental registrants registered with the GDC, with the GDP and DFT groups being the largest and the QOT and SOT groups being the smallest.

A total of 229 participants took part in the study, which compares favourably to other studies that have looked at the use of IOTN.9,14,15

Agreement levels for the DHC

The study has demonstrated that the SO, PGOS and QOT groups achieved an acceptable level of DHC agreement with the expert scores, with mean kappa greater than 0.60. The SO and PGOS group achieved the highest mean kappa scores with 76% of the SO group and 64% of the PGOS group achieving acceptable agreement with the expert scores (k >0.60). The participants within these groups are either specialists or training to be specialists in orthodontics and hence one would expect them to have the higher agreement kappa scores.

The introduction of direct access by the General Dental Council (GDC) in May 2013, permitting orthodontic therapists to carry out IOTN screening, has influenced the training of orthodontic therapists to include the use of the IOTN. As the majority of the QOT group that took part in the study (n = 18/21) were registered before this date, it is unlikely that the IOTN would have been included in their core training, even though all participants in this group reported receiving training in the IOTN. The SOT group achieved a mean DHC kappa of 0.55, which was close to the acceptable standard of k >0.60 and indicates moderate agreement.

The DFT and GDP groups achieved a mean DHC kappa of 0.25 and 0.29 respectively. These values indicate only fair agreement with the expert scores, below the acceptable standard of agreement for the DHC of the IOTN. The findings for the GDP group are consistent with a recent study assessing IOTN knowledge of GDPs in Scotland, which found a mean kappa of 0.42.14 This Scottish study found that only 10% of GDPs achieved an acceptable level of agreement. In our study 16% of GDPs achieved a kappa greater than 0.60, with a mean kappa of 0.29 for this group, indicating less than acceptable agreement. It is difficult to directly compare the results from our study with the Scottish study, as their study design was questionnaire based. The other differing factor is that use of the IOTN in Scotland has been shown to be lower than in England and Wales, owing to the later implementation of the IOTN in NHS Scotland.14

Ten percent of the DFT group achieved a kappa greater than 0.6 with a mean kappa of 0.25. This group of participants had completed their undergraduate dental training within a year of taking part in the study and are currently working as DFTs within the UK. The findings of this group are inconsistent with previously published literature which has shown that trained dental students can achieve a mean kappa of 0.65.9 The study design was similar in that study models were used, however this group of students had been provided with a set training protocol with the aim to achieve calibration, which in turn provided the students with a substantial ability to apply both the DHC and AC of the IOTN. IOTN teaching is incorporated into the dental undergraduate curriculum,16 but great variability in the delivery of orthodontic teaching and learning has been reported.17 It is essential that the undergraduate dental students are able to perform an orthodontic assessment involving the application of IOTN in order to make appropriate and timely referrals for orthodontic treatment when working in the NHS primary care service.

Assessment of validity when using the DHC

All the registrant groups in this study were considered sensitive when using the DHC, indicating the ability to identify cases needing treatment based on the DHC. However, only the PGOS group was considered to be specific in identifying the cases not needing treatment according to the DHC; none of the other groups achieved a mean specificity greater than 70%. These results indicate that even though agreement was only fair for some of the participant groups, they were all able to identify cases that needed treatment to an acceptable level. The groups with the lowest specificity scores were the DFT and GDP groups, which implies that these groups were more likely to consider a case as needing treatment when in fact it did not. This could potentially lead to inappropriate referrals, with cost implications for both patients and NHS commissioners. These findings are supported by previous studies, which have reported a significant number of referrals for orthodontic treatment to be inappropriate.18

Agreement and validity for the AC

The results for the AC of the IOTN indicated that all dental registrant groups were below the acceptable level of agreement with the expert scores. The mean kappa scores ranged from 0.13 to 0.21 for the different registrant groups, indicating poor to fair agreement (Table 4). The results for the AC were worse than those of previous studies, with the trend confirming that participants were better at using the DHC of the IOTN than the AC, highlighting the subjective nature of this component. This trend is consistent with early IOTN studies assessing reproducibility.7,19 Previous studies have reported poor agreement between calibrated examiners when using the AC of the IOTN on photographs compared with scores recorded clinically or from study models.20 Within our study the participants had access to both the study models and intra-oral photographs when making decisions on the AC of the IOTN.

Our findings suggest that even though all the registrant groups had not achieved an acceptable agreement with the expert scores for the AC, they were all considered to be specific in their decisions with mean specificity greater than 70% for all groups (Table 6). On the other hand none of the groups were considered sensitive when using the AC. These results indicate that participants were able to identify cases that did not require orthodontic treatment to an 'acceptable' level, but were unable to identify those who required treatment to an acceptable level based on the AC. These results imply that patients could be incorrectly refused orthodontic treatment on the grounds of inaccurate use of the AC. Clinically, decisions become more critical in borderline cases with a DHC of 3. If registrants are not applying the AC sensitively then patients could potentially be refused orthodontic treatment when they are entitled to it. The opposite was true for the DHC, in that all registrant groups were deemed sensitive, but not specific in the use of IOTN.

The findings from this study support the findings of previous IOTN calibration studies which have shown the DHC to be applied more accurately than the AC.9

Assessment of bias

The presence of bias in the participants' IOTN scores was assessed for the DHC and AC independently using the Wilcoxon signed rank test. The results showed a tendency for all the groups to underscore both the DHC and AC of the IOTN, indicating the presence of bias. This may be explained by errors of omission, that is, failing to detect specific traits that would indicate a higher level of treatment need. An underscoring bias in participants' IOTN scores implies a potential for registrants to incorrectly identify patients as not requiring treatment.

A limitation of the study is that sensitivity and specificity were calculated to evaluate the validity of participants making a binary decision to treat or not to treat, with a DHC score of 3 classed as no treatment needed. In reality a DHC score of 3 indicates a borderline orthodontic case, as do AC scores of 6 and 7. The validity assessment does not therefore account for borderline cases. Previous studies have also used this method, accepting this limitation.7,9,21,22

Conclusion

Overall agreement for the DHC was varied for the different registrant groups, ranging from fair to substantial agreement. Agreement for the AC ranged from poor to fair. In this study registrants were better at applying the DHC of IOTN when compared with the AC. This study has highlighted significant gaps in knowledge base in the IOTN among dental registrants with unacceptable reproducibility by important groups of the dental profession. The need for further specialist training and provision of tools to help registrants use the IOTN to an acceptable level is essential.