Reliability and accuracy of the torque applied to osteosynthesis screws by maxillofacial surgeons and residents

Applying the right torque to osteosynthesis screws is important for undisturbed bone healing. This study aimed to compare test–retest and intra-individual reliabilities of the torque applied to 1.5 mm and 2.0 mm osteosynthesis screws by residents and oral and maxillofacial surgeons (OMF-surgeons), to define the reference torque intervals, and to compare reference torque interval compliances. Five experienced OMF-surgeons and 20 residents (5 from each of the 4 residency years) were included. Each participant inserted six 1.5 × 4 mm and six 2.0 × 6 mm screws into a preclinical model at two test moments 2 weeks apart (T1 and T2). Participants were blinded to the applied torque. Descriptive statistics, reference intervals, and intra-class correlation coefficients (ICC) were calculated. The OMF-surgeons complied with the reference intervals more often (1.5 mm screws: 95% and 2.0 mm screws: 100%) than the residents (82% and 90%, respectively; P = 0.009 and P = 0.007), with ICCs ranging from 0.85 to 0.95 and from 0.45 to 0.97, respectively. The residents’ accuracy and reliability were inadequate regarding the 1.5 mm screws, but both measures improved at T2 for both screw types compared to T1, indicating a learning effect. Training residents and/or verifying the applied torque by experienced OMF-surgeons remains necessary to achieve high accuracy and reliability, particularly for 1.5 mm screws.

www.nature.com/scientificreports/

To enable evidence-based, standardized, and reliable guidance in the application of osteosynthesis screws and to illustrate a simple and low-cost setup to train clinicians, this study aimed to: (1) assess the test-retest and intra-individual reliabilities of the torque applied by residents and experienced OMF-surgeons, (2) define a reference torque interval for the commonly used 1.5 and 2.0 mm osteosynthesis screws, and (3) compare the compliance with the reference torque interval between OMF-surgeons and residents with varying years of experience.

Materials and methods
The most commonly used titanium osteosynthesis screws in oral and maxillofacial (OMF) surgery were selected, i.e. the 1.5 × 4 mm and 2.0 × 6 mm KLS Martin MaxDrive® screws (Gebrüder Martin GmbH & Co., Tuttlingen, Germany) 2,3,10 . Predrilled 36 × 36 mm high-pressure laminate (HPL) blocks were chosen as a reproducible model, with an elastic modulus similar to that of cortical bone [11][12][13] . Predrilling was performed in a standardized manner with water cooling, using the 1.1 and 1.5 mm diameter drills provided by the manufacturer. To simulate the clinical situation, the thickness of the HPL blocks used for the 1.5 mm screws was 1.0 mm, as these screws are commonly used in the midface where the bone is generally thin (e.g., the anterior wall of the maxillary sinus; Fig. 1a). The HPL blocks used for the 2.0 mm screws were 6.0 mm thick, as these screws are more commonly used in thick cortical bone (e.g., in the mandible; Fig. 1b) 14 .
A total of 25 participants were included: five experienced OMF-surgeons (i.e., with many years' weekly exposure to these osteosynthesis systems in the clinic) and five randomly chosen residents from each of the four residency years (i.e., a total of 20 residents) from University Medical Center Groningen (UMCG, Groningen, the Netherlands) and the Amsterdam University Medical Centers (Amsterdam UMC, the Netherlands), namely Academic Medical Center (AMC) and 'Vrije Universiteit' Medical Center (VUmc). The participants were asked to insert 6 screws of each size as they would do in the clinic ('two-finger tight') at two test moments (T1 and T2) two weeks apart (Fig. 2). The participants were blinded to the applied torque during both test moments. The burr holes were irrigated with water while inserting the screws to simulate the clinical situation. Saline was avoided to prevent possible corrosion of the test environment. The use of water instead of saline was not expected to influence the test results 10 . The applied torque was measured using a calibrated torque meter (Nemesis Howards Torque Gauge, Smart MT-TH 50 sensor; accuracy 2.5 Nmm; Fig. 1c). Screw breakage and stripped screw holes were recorded.

Figure 1. Example of (a) a high-pressure laminate (HPL) block with 1 mm thickness used for the 1.5 mm screws and (b) an HPL block with 6 mm thickness used for the 2.0 mm screws. Note that the screw goes through the 1 mm thick HPL plate (a), i.e. simulating a screw that goes through thin cortical bone (e.g., the anterior wall of the maxillary sinus), while the screw does not go through the 6 mm HPL block (b), i.e. simulating a bone screw in cortical bone. (c) The test setup with a torque meter with an inserted HPL block. The HPL block was positioned in such a way that the screw hole used to insert the screw was always aligned with the axis of the torque meter to ensure accurate torque measurement.
All the participants were asked about their experience with osteosynthesis systems (including experience from other disciplines, e.g. orthopaedics, traumatology) and, regarding the residents, about their current internship and the number and type of internships completed during their residency.
All methods were carried out in accordance with relevant guidelines and regulations, including the Declaration of Helsinki. The protocol of this study was approved by the Institutional Review Board of the University Medical Center Groningen, the Netherlands. All participants provided written informed consent.
Sample size calculation. The number of screws of each screw size per participant and per test moment (i.e., m = 6) was derived from the international standard for mechanical testing of bone screws 15 . The number of included participants was based on an a priori sample size estimation (1) for group comparisons and (2) to assess intra-individual reliability. The sample size calculation was based on data from a study that assessed differences in the torque applied by 4 OMF-surgeons to 1.5 and 2.0 mm osteosynthesis screws 16 . To provide sufficient power for both the 2.0 and 1.5 mm osteosynthesis screws, the 1.5 mm screw values were used. With α = 0.05, power = 0.8, effect size = 0.78, and number of groups = 5, the calculation yielded a sample size of 25 participants (i.e., 5 per group) 16 . For the reliability analyses, an expected intra-class correlation coefficient (ICC) of 0.8 and 12 repeated measurements per screw size also resulted in a sample size of 5 participants per group 17 . Therefore, five experienced OMF-surgeons and five randomly chosen residents from each of the four residency years participated, inserting 6 screws of each screw size at two test moments (i.e., a total of 300 measurements per screw size).
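The measurement counts implied by this design can be tallied with a short sketch. The Python below is illustrative only (the study's own analyses were performed in R); the constant names and group labels are assumptions chosen for readability, while the numbers are taken from the text.

```python
# Illustrative tally of the study design; names are hypothetical, values are from the text.
SCREWS_PER_MOMENT = 6   # screws of each size per participant per test moment (m = 6)
TEST_MOMENTS = 2        # T1 and T2

# Participants per group: experienced OMF-surgeons plus the four residency years.
GROUP_SIZES = {
    "OMF-surgeons": 5,
    "first-year residents": 5,
    "second-year residents": 5,
    "third-year residents": 5,
    "fourth-year residents": 5,
}


def measurements(n_participants: int) -> int:
    """Number of torque measurements per screw size for n participants."""
    return n_participants * SCREWS_PER_MOMENT * TEST_MOMENTS


per_participant = measurements(1)
surgeons = measurements(GROUP_SIZES["OMF-surgeons"])
residents = sum(
    measurements(n) for group, n in GROUP_SIZES.items() if group != "OMF-surgeons"
)
total = surgeons + residents

print(per_participant, surgeons, residents, total)  # 12 60 240 300
```

The 12 repeated measurements per participant and the 300 measurements per screw size match the figures stated above; the 60 surgeon measurements per screw size are the basis for the reference intervals described later.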

Statistical analyses.
All the data were calculated and presented separately for each screw size. The assumptions of normal distribution of continuous data were tested by examining Q-Q plots and histograms, and by performing the Shapiro-Wilk test. Continuous data were presented as mean ± standard deviation (SD) or median (25th to 75th percentile, P25-P75). Categorical data were reported in numbers and percentages.
Multilevel models were fitted using restricted maximum likelihood estimations that took into account variances between screws within one test moment of a certain participant, between participants within one test moment, and between test moments. A linear multilevel model was fitted for continuous outcome data while a logistic multilevel model was fitted for dichotomous outcome variables. Between-group comparisons (e.g., between OMF-surgeons and residents) were performed using a type III analysis of variance (ANOVA) test.

Figure 2. Flowchart of the study procedures to assess the test-retest (at T1 and T2) and intra-individual reliability of the two main groups (i.e., oral and maxillofacial surgeons and residents) and the subgroups (i.e., the different residency years; the dashed lines and lighter colour boxes). OMF, oral and maxillofacial; n, number of participants; m, number of measurements; T1, at baseline; T2, after 2 weeks.

The test-retest reliability at T1, the test-retest reliability at T2, and the intra-individual reliability between T1 and T2 were assessed by calculating the ICC (absolute agreement using a two-way mixed model 17 ) with a 95% confidence interval (CI) per group (Fig. 2). An ICC of ≤ 0.50, 0.50-0.75, 0.75-0.90, and ≥ 0.90 was considered as poor, moderate, good or excellent reliability, respectively 18 . A lower limit of the 95% CI of ICC ≥ 0.70 was deemed sufficient for research purposes 18 . The ICC was calculated by dividing the variance components of the participants and the interaction between the participants and the test moments by the total variance 17,19,20 . Bland-Altman plots with limits of agreement were constructed to assess systematic measurement differences 17 .
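The ICC definition and the reliability thresholds described above can be sketched as follows. This is a minimal Python illustration, not the study's R code: the `VarianceComponents` container, the function names, and the example variances are hypothetical, and the handling of values that fall exactly on 0.75 (assigned to "good" here) is an assumption, since the stated ranges overlap at their boundaries.

```python
from dataclasses import dataclass


@dataclass
class VarianceComponents:
    """Variance components from a fitted multilevel model (hypothetical container)."""
    participant: float   # between-participant variance
    interaction: float   # participant x test-moment interaction variance
    residual: float      # remaining variance (e.g., between screws)

    @property
    def total(self) -> float:
        return self.participant + self.interaction + self.residual


def icc(vc: VarianceComponents) -> float:
    """ICC as (participant + interaction variance) divided by the total variance."""
    return (vc.participant + vc.interaction) / vc.total


def classify_icc(value: float) -> str:
    """Map an ICC to the labels used in the text.

    Assumption: a value of exactly 0.75 is labelled 'good'.
    """
    if value >= 0.90:
        return "excellent"
    if value >= 0.75:
        return "good"
    if value > 0.50:
        return "moderate"
    return "poor"


def sufficient_for_research(ci_lower: float) -> bool:
    """Lower limit of the 95% CI of the ICC must be at least 0.70."""
    return ci_lower >= 0.70


# Made-up variance components, for illustration only (not study data).
example = VarianceComponents(participant=300.0, interaction=50.0, residual=150.0)
example_icc = icc(example)
print(round(example_icc, 2), classify_icc(example_icc))  # 0.7 moderate
```

Note that the label is derived from the point estimate, whereas the sufficiency criterion is applied to the lower confidence limit, so a group can be labelled "good" and still fall short of the criterion.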
Due to the lack of a gold standard for osteosynthesis screw torque, the five experienced OMF-surgeons' measurements (m = 60 screws per screw size) were used to calculate the reference torque intervals for each screw size. We first checked whether the assumption that OMF-surgeons apply osteosynthesis screws consistently was met (i.e., the lower limit of the 95% CI of ICC intra-individual reliability ≥ 0.70). If this assumption was met, the 95% reference intervals of each screw size were calculated based on the experienced OMF-surgeons' multilevel model data.
Here, the variance components of the fixed and random effects were summed (i.e., the total variance), the degrees of freedom were calculated based on the generalized Satterthwaite method (i.e., using the observed variances), and the t-values corresponding to these degrees of freedom and α = 0.05 were applied, as appropriate 21 . The number and percentage of measurements which complied with the reference intervals were calculated per group and compared between groups. P ≤ 0.05 (two-tailed) was considered statistically significant. The Bonferroni correction was applied to all the pairwise comparisons to correct for multiple testing. All analyses were performed in R, version 4.0.5, using the lme4 and blandr packages [22][23][24] .
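The reference interval calculation described above can be written compactly. The notation below is an assumption introduced for illustration: \(\hat{\mu}\) denotes the model-estimated mean torque of the OMF-surgeons, \(\hat{\sigma}^2_k\) the estimated variance components with their degrees of freedom \(\nu_k\), and the degrees-of-freedom expression is the common textbook form of the Satterthwaite approximation, which may differ in detail from the generalized variant in the cited reference.

```latex
\[
\hat{\sigma}^2_{\text{total}} = \sum_{k} \hat{\sigma}^2_k ,
\qquad
\nu \approx \frac{\left(\sum_{k} \hat{\sigma}^2_k\right)^{2}}
                 {\sum_{k} \hat{\sigma}^4_k / \nu_k} ,
\qquad
\text{RI}_{95\%} = \hat{\mu} \;\pm\; t_{1-\alpha/2,\;\nu}\,\hat{\sigma}_{\text{total}} ,
\quad \alpha = 0.05 .
\]
```

Under this form the interval is symmetric around the estimated mean, and its width is driven by the total standard deviation and by the t-quantile, which grows as the approximate degrees of freedom shrink.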

Results
Participants' characteristics. Of the included participants, 16 (64%) were male (all the OMF-surgeons and eleven residents (55%); Table 1). The median age (P25

Table 1. Characteristics of the included participants. Bold P-values represent statistically significant differences. OMF-surgeons, oral and maxillofacial surgeons; P25-P75, 25th to 75th percentile; UMCG, University Medical Center Groningen; AMC, Academic Medical Center; VUmc, 'Vrije Universiteit' Medical Center; NA, not applicable; TMJ, temporomandibular joint. *Calculated by dividing the number of internships followed at academic medical centres by the total number of internships.

First year (n = 5); Second year (n = 5); Third year (n = 5); Fourth year (n = 5)
Gender, n (%)
Medical center, n (%)
Experience with osteosynthesis systems in years, median (P25-P75): 14.8 (9.5-37.0)
Internships followed at academic medical centers, n academic/N total (%)*: NA; 75/88 (85%); 6/6 (100%); 16

OMF-surgeons. The torque applied by the fourth-year residents at T1 was significantly lower than that applied by the first-, second- and third-year residents (Fig. 3b). The residents applied 314.2 ± 84.0 and 330.7 ± 69.9 Nmm torque to the 2.0 mm osteosynthesis screws at T1 and T2, respectively. The torque applied to the 2.0 mm screws by the residents at both test moments was significantly lower than the torque applied by the OMF-surgeons. The torque applied by the fourth-year residents at T1 and T2 was significantly lower than that applied by the first-, second- and third-year residents.
Test-retest and intra-individual reliability. The OMF-surgeons achieved moderate to good test-retest and intra-individual reliability for the 1.5 mm screws (Table 3). The residents (i.e., as one group) achieved good to excellent test-retest and intra-individual reliability for the 1.5 mm screws. The subgroup analysis showed that the test-retest and the intra-individual reliability of the first- and second-year residents ranged from poor to moderate. In contrast, the third- and fourth-year residents achieved moderate to good reliabilities (Table 3). The Bland-Altman plot (Fig. 4a) demonstrated a systematic difference of 0.6 Nmm (limits of agreement (LOA) −37.7 to 38.9 Nmm).
Table 3. Test-retest reliability (at T1 and T2) and intra-individual reliability between T1 and T2. The bold values indicate sufficient reliability (i.e., ICC ≥ 0.7). ICC, intra-class correlation coefficient; 95% CI, 95% confidence interval; OMF-surgeons, oral and maxillofacial surgeons.

The OMF-surgeons achieved moderate to good test-retest and intra-individual reliability for the 2.0 mm screws. The residents achieved excellent test-retest and intra-individual reliability for the 2.0 mm screws. The subgroup analysis showed that the T1 test-retest reliability of the second-, third-, and fourth-year residents ranged from poor to moderate. However, the T2 test-retest reliability and intra-individual reliability of these subgroups increased to good-excellent reliability (Table 3). The Bland-Altman plot (Fig. 4b) showed a systematic difference of −5.9 Nmm (LOA −122.7 to 110.8 Nmm).
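The reported Bland-Altman values can be checked for internal consistency with a small sketch. It assumes the conventional construction in which the limits of agreement are the mean difference ± 1.96 standard deviations of the differences; the text does not state the multiplier, and the function names are hypothetical.

```python
def bias_from_limits(upper: float, lower: float) -> float:
    """Midpoint of the limits of agreement, i.e. the mean (systematic) difference."""
    return (upper + lower) / 2


def sd_from_limits(upper: float, lower: float, z: float = 1.96) -> float:
    """SD of the differences implied by limits defined as bias +/- z * SD (assumed z)."""
    return (upper - lower) / (2 * z)


# Limits of agreement as reported in the text (Nmm).
bias_15 = bias_from_limits(38.9, -37.7)     # 1.5 mm screws
bias_20 = bias_from_limits(110.8, -122.7)   # 2.0 mm screws
sd_15 = sd_from_limits(38.9, -37.7)
sd_20 = sd_from_limits(110.8, -122.7)

print(f"1.5 mm: bias {bias_15:.2f} Nmm, implied SD {sd_15:.1f} Nmm")
print(f"2.0 mm: bias {bias_20:.2f} Nmm, implied SD {sd_20:.1f} Nmm")
```

The midpoints (about 0.6 and −5.95 Nmm) agree with the reported systematic differences of 0.6 and −5.9 Nmm to within rounding of the published limits, and the implied spread of the differences is roughly three times larger for the 2.0 mm screws.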

Reference intervals and complications.
Since the assumption that OMF-surgeons apply osteosynthesis screws consistently was met for both the 1.5 and 2.0 mm osteosynthesis screws, reference intervals for both screw sizes could be calculated, ranging from 73.7 to 127.9 Nmm for the 1.5 mm screws (Fig. 3a) and from 233.9 to 629.5 Nmm for the 2.0 mm screws (Fig. 3b). The OMF-surgeons' compliance with the reference torque interval for the 1.5 mm screws was 57 (95%) whereas the residents' compliance was 195 (82%) (P = 0.009; Table 4; Fig. 3a). The first- and second-year residents complied with the reference interval significantly more often than the third- and fourth-year residents (Table 4). The compliance with the reference interval increased from 82% at T1 to 86% at T2. Screw hole stripping with the 1.5 mm screws was similar among the OMF-surgeons and residents (Table 4). The second-year residents had the highest proportion of stripped screw holes (17%).
The OMF-surgeons complied with the reference torque interval when applying all the 2.0 mm screws (Table 4; Fig. 3b). The residents' compliance with the 2.0 mm screw reference interval was 215 (90%) (P = 0.007; Table 4; Fig. 3b). Compliance with the 2.0 mm screw reference interval was similar among all the residents (Table 4). The reference interval compliance increased from 88% at T1 to 95% at T2. Screw hole stripping with the 2.0 mm screws was similar among the OMF-surgeons and residents.
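Compliance as used here amounts to counting the measurements that fall inside the reference interval for the screw size. The sketch below uses the two reported intervals; the function names are hypothetical, the example torques are made up (not study data), and treating the interval bounds as inclusive is an assumption.

```python
# Reference intervals reported in the text (Nmm).
REFERENCE_INTERVALS_NMM = {
    "1.5 mm": (73.7, 127.9),
    "2.0 mm": (233.9, 629.5),
}


def complies(torque_nmm: float, screw: str) -> bool:
    """True if the torque lies within the reference interval (bounds assumed inclusive)."""
    low, high = REFERENCE_INTERVALS_NMM[screw]
    return low <= torque_nmm <= high


def compliance(torques_nmm, screw):
    """Return (count, fraction) of measurements inside the reference interval."""
    inside = sum(complies(t, screw) for t in torques_nmm)
    return inside, inside / len(torques_nmm)


# Hypothetical torque readings (Nmm) for six 1.5 mm screws, for illustration only.
example = [68.0, 80.5, 99.0, 110.2, 127.9, 140.3]
count, fraction = compliance(example, "1.5 mm")
print(f"{count}/{len(example)} screws within the interval ({fraction:.0%})")  # 4/6 (67%)
```

A reading below the lower bound corresponds to too little torque and a reading above the upper bound to too much, so the same helper could be extended to report the direction of non-compliance.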

Discussion
This study shows a clear effect of "learning-by-doing", with increased compliance with the reference torque intervals and increased reliability for both 1.5 and 2.0 mm osteosynthesis screws at T2 compared to T1. The senior residents showed higher reliability but lower compliance with the reference torque interval compared to the junior residents. Thus, regardless of the residency year, training residents and/or having experienced OMF-surgeons verify the applied torque remains necessary to utilize the full potential of osteosynthesis systems.
A simulated learning environment is very suitable for acquiring the "feeling" of adequate screw fixation with sufficient tightness and of when a screw hole will strip. This study shows that learning-by-doing increases both the test-retest reliability and compliance with the reference torque intervals for both 1.5 and 2.0 mm screws. Although first- and second-year residents showed an increase in reliability with the 1.5 mm screws at T2 compared to T1, these reliabilities were still insufficient at T2 (i.e. ICC < 0.7). All the other groups with insufficient applied torque reliability at T1 increased their reliability to a sufficient level at T2. These results indicate that this test setup has a learning effect on OMF clinicians, resulting in increased reliability and accuracy for both screw types. Bone stripping and screw breakage are more likely to occur when the difference between the torque applied to the screws for adequate fixation (i.e., hand-tight) and the maximum allowed torque (i.e., torque up to screw breakage) is small 25 . Because this setup can increase both the accuracy and the reliability of the applied torque, it is appropriate for educational purposes.
At first glance, the calculated reference intervals for both screw sizes may seem wide. The reference intervals are wide because the dispersion around the mean torque applied by the maxillofacial surgeons (i.e., the standard deviation and, thus, the variance) is relatively large. The high variability of torque applied to osteosynthesis screws between surgeons has also been reported in literature previously 1 . However, as each surgeon applied the torque consistently (i.e., the intra-individual reliability was good to excellent) and there were no signs of systematic difference between T1 and T2 in the Bland-Altman plots, the measured variability between surgeons is part of the actual application of screws. Due to the higher maximum torque needed to adequately insert the 2.0 mm screws, which in turn inevitably results in a loss in precision 16 , the reference interval of the 2.0 mm screws is much wider than that of the 1.5 mm screws. The reliability and the compliance with the reference torque interval were generally better for the 2.0 mm screws than for the 1.5 mm screws, as the latter are more prone to errors (i.e., too little or too much applied torque). An explanation for these differences is that the tactile feedback is higher when applying 2.0 mm screws 1,16 ; other studies have shown that increasing tactile or visual feedback results in increased accuracy and a better ability to predict screw hole stripping 1 . Therefore, complying with the 1.5 mm screw reference interval requires a higher degree of accuracy. Thus, although training is beneficial for both screw sizes, training is particularly needed for the torque applied to 1.5 mm screws.

Table 4. The compliance with the reference intervals and the number of complications during osteosynthesis screw insertion. The bold P-values represent statistically significant differences. Each superscript denotes significant differences in the pairwise comparisons (see P-values below): 'a' is derived from the pairwise comparison between first- and second-year residents, 'b' between first- and third-year residents, 'c' between first- and fourth-year residents, 'd' between second- and third-year residents, 'e' between second- and fourth-year residents, and 'f' between third- and fourth-year residents. 1.5 mm screw reference interval compliance: a P > 0.999; b P = 0.064; c P = 0.064; d P = 0.028; e P = 0.028; f P > 0.999; stripped screw holes: a P = 0.008; b P = 0.712; c P = NA; d P = 0.920; e P = 0.008; f P = 0.712; broken screws: NA. 2.0 mm screw reference interval compliance: non-significant differences between subgroups and, thus, no pairwise comparisons were performed; stripped screw holes: NA; broken screws: NA. *Comparison between OMF-surgeons and residents. #Comparison between the residency years. †Reference interval 1.5 mm screws: 73.7-127.9 Nmm. §Reference interval 2.0 mm screws: 233.9-629.5 Nmm. OMF-surgeons, oral and maxillofacial surgeons; NA, not applicable.
Our study shows that the combination of compliance with the reference interval and intra-individual reliability is currently inadequate among residents for 1.5 mm screws. Although the first- and second-year residents showed higher compliance with the reference interval, the intra-individual reliability of these subgroups was poor and moderate, respectively. The third- and fourth-year residents demonstrated good intra-individual reliability but poorer compliance with the reference interval. On the other hand, regarding the 2.0 mm osteosynthesis screws, the first-, second- and third-year residents had good intra-individual reliability and high compliance with the reference interval. The fourth-year residents displayed good intra-individual reliability but applied too little torque to a substantial proportion of the screws. A post hoc analysis of the fourth-year residents' insertions showed that the torque of 10/17 (59%) of the 1.5 mm and 7/11 (64%) of the 2.0 mm screws was insufficient. A recent review also showed substantial between-surgeon variability in the application of osteosynthesis screws 1 . Therefore, regardless of the residency year, training residents (e.g., by using this test setup) and/or verification of the applied torque by experienced OMF-surgeons remains necessary when applying osteosynthesis systems.
Stripping of the screw holes only occurred when inserting the 1.5 mm screws. Interestingly, the second-year residents showed the highest proportion of stripped screw holes together with the highest compliance with the 1.5 mm reference interval. A possible explanation is the self-tapping technique for osteosynthesis screws, i.e. tightening the screws by clockwise rotation, followed by loosening the screws a bit by rotating anti-clockwise and then tightening the screws further. This technique is necessary to lower the torsional resistance when applying osteosynthesis screws as well as to remove debris that is formed on self-tapping the screw holes. However, when this technique is executed too forcefully, the screw holes get stripped without excess torque having been applied 1 . All the (sub)groups' stripping rates remained lower than the average stripping rate (26%) reported in the literature 1 , probably because the screws used in this study were smaller, necessitating less torque.
The calculated reference intervals and the reported learning effect indicate that training clinicians (e.g., during the residency period, seminars or courses) with this simple yet effective test setup has the potential to improve the effectiveness of osteosynthesis systems. This may enhance patient care quality by increasing fracture or osteotomy stability, resulting in less compromised healing, and by reducing the need for emergency screws following intraoperative stripping of bone, with a corresponding reduction in operation time and costs.
The osteosynthesis screws included in this study are used for fixating fractures and osteotomies in different locations of the facial skeleton, e.g., the crista zygomaticoalveolaris, anterior wall of the maxillary sinus, and mandible. These maxillofacial bones have different mechanical properties 12,13,26,27 . The HPL-blocks used in this study have mechanical properties within the known mechanical property range of maxillary and mandibular bones 11 , making them a suitable bone simulation model. However, the translation of the reference intervals to the clinical setting remains uncertain due to in vivo variabilities in bone density and thickness. Therefore, we advocate that translation of the reference intervals to a clinical setting should not be done until in vivo validation of the calculated reference intervals has been performed.
Although this study focused on maxillofacial osteosynthesis systems, the results of this study also seem applicable to other disciplines that use osteosynthesis systems, e.g., orthopaedic and trauma surgery. A recent systematic review showed that, on average, 26% of all osteosynthesis screws inserted by experienced orthopaedic and trauma surgeons are irreparably damaged or have stripped screw holes 1 . Currently, it remains unknown how residents of these disciplines perform. The authors of that review indicated that there is a need for defining reference torque intervals and that future research should focus on developing methods to train clinicians to apply osteosynthesis screws accurately and reliably 1 . The test setup presented in this study can be easily adjusted by using a different torque meter (i.e., one that can measure higher torque for larger screws) and different HPL blocks, making this test setup useful for educational purposes with different sizes of osteosynthesis systems.
The strengths of this study are the simple, effective and standardized test setup, blinding all the participants to the applied torque, and the thorough study design (i.e., test-retest reliability at T1 and T2, and intra-individual reliability between T1 and T2). The presented low-cost test setup can be easily fabricated for educational purposes. Furthermore, commonly used osteosynthesis screws were applied to a standardized bone model. A limitation of this study is that, although we used a suitable bone simulation model, translation of the reference intervals to the clinical setting remains uncertain due to in vivo variabilities in bone density and thickness. Moreover, bone blocks were not used because their variability in bone mineral density, cortical and spongious bone layer thickness, and block dimensions impedes their use as a standardized and reproducible model and makes reliability assessment uncertain. Another limitation is the lack of a gold standard for torques applied to screws. We, therefore, determined reference intervals based on the torque values of experienced OMF-surgeons. The participating surgeons have many years of experience with osteosynthesis systems in the clinic. However, another group of OMF-surgeons might have given other reference intervals. External validation of the defined reference intervals by future research is therefore desired. Finally, since the error of the torque meter is a fixed absolute value (i.e., 2.5 Nmm), the relative error increases as the measured torque decreases. However, this study aimed to assess