Effectiveness of 2D radiographs in detecting CBCT-based incidental findings in orthodontic patients

Some craniofacial diseases or anatomical variations are found in radiographic images taken for other purposes. These incidental findings (IFs) can be detected in orthodontic patients, as various radiographs are required for orthodontic diagnosis. The radiographic data of 1020 orthodontic patients were interpreted to evaluate the rates of IFs in three-dimensional (3D) cone-beam computed tomography (CBCT) with a large field of view (FOV) and to investigate the effectiveness and accuracy of two-dimensional (2D) radiographs for detecting IFs compared to CBCT. Prevalence was measured in five areas, and accuracy was assessed by sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The accuracies of the various 2D radiographs were compared with a proportion test. A total of 709 (69.5%) of the 1020 subjects showed one or more IFs in CBCT images. The nasal cavity was the most affected area. With the CBCT images as the gold standard, the accuracy of the 2D radiographs differed by area. The highest accuracy was found for soft tissue calcifications with comprehensive radiographs. For detecting nasal septum deviations, postero-anterior cephalograms were the most accurate 2D radiograph. When an IF could not be determined because of ambiguity in the 2D radiographs, treating it as an absence of findings increased the accuracy.

of the CBCT was small 14,18 . When the FOV value is small, only a few parts of the craniofacial area are viewed for evaluation 19 .
As described above, 2D radiographs such as panoramic radiographs and cephalograms are routinely taken in orthodontic patients. Although 2D radiographs are not as revealing as 3D radiographs, they can detect abnormal structures or diseases. Some authors have shown that IFs were found in 6% to 43% of 2D radiographic images [20][21][22], while the rate of IFs in 3D CBCT ranged from 24.6% to 94.3% 18. However, the accuracy of 2D radiographs in finding lesions incidentally has not yet been confirmed. A comparison of the ability of 2D radiographs and 3D CBCT to detect IFs would be valuable, since most clinicians tend to use 2D radiographs instead of 3D CBCT, given their lower radiation doses.
Therefore, this study aimed to investigate the rates of IFs of craniofacial diseases or abnormal structures using a large number of 3D CBCT images taken for orthodontic diagnosis in patients without other symptoms, and to compare the detection accuracy of 2D radiographs with that of 3D CBCT images.

Materials and methods
Samples. For this retrospective study, CBCT images were obtained from selected patients who visited the Department of Orthodontics at Kyung Hee University Dental Hospital for orthodontic treatment with biocreative strategy from January 2010 to July 2019. CBCT radiographs were taken for orthodontic diagnosis only. The experimental protocol was approved by the institutional review board of Kyung Hee University Dental Hospital (IRB No. KH-DT19022), and informed consent for study participation was obtained from all participants or, in the case of minors, from their legally authorized representatives, parents, guardians, or next of kin. The authors confirm that all methods were carried out in accordance with relevant guidelines and regulations.
The inclusion criteria were as follows: Korean patients who underwent CBCT for orthodontic treatment without any craniofacial symptoms, and CBCT images taken in C-mode, regardless of whether the patients proceeded with orthodontic treatment after diagnosis. C-mode covers the craniofacial area with a large field of view (FOV). A panoramic radiograph, lateral cephalogram, and/or postero-anterior cephalogram were taken concurrently with the CBCT, based on the clinicians' decision. There was no age restriction in this study. The exclusion criteria were as follows: patients who visited the clinic for orthodontic treatment to manage previously diagnosed craniofacial diseases, CBCT images taken in modes other than C-mode, and cases in which the radiographs were taken one month or more apart. If multiple CBCTs were taken for one patient during the orthodontic treatment, only the first images were selected and the subsequent images were excluded. A total of 1020 CBCT images from 1020 patients were included (Table 1). As we selected patients according to the presence of CBCT images, some of them did not have 2D radiographic images such as a panoramic radiograph, lateral cephalogram, or postero-anterior cephalogram. The exact number of each type of radiographic image is shown in Table 2.
(5) pathology. In all five categories, CBCT readings were designated as gold standards based on the following: (1) In the maxillary sinus area, mucous retention pseudocyst, flat mucosal thickening (> 3 mm), polypoid mucosal thickening, partial opacification, and total opacification, based on a previous study (Fig. 1) 23.
(2) In the TMJ area, any resorptive change in the condyles. (3) In the nasal cavity, nasal septum deviation and concha bullosa, according to Mladina's types 24. A concha bullosa was defined as present when more than 50% of the vertical height (measured from superior to inferior in the coronal plane) of the middle turbinate was pneumatized. (4) Clinically significant calcifications in the salivary glands, tonsils, or lymph nodes. (5) Any pathologic findings in the craniofacial area; examples included cystic change of bone, enlarged canals, and fibrous dysplasia. All 2D radiographic images were evaluated by the same examiners who evaluated the CBCT images, and in the same manner. There were three types of 2D radiographs: panoramic radiograph, lateral cephalogram, and postero-anterior cephalogram. Some patients did not need all three radiographs for the orthodontic diagnosis. Therefore, the number of images differed depending on the type of radiograph (Table 2). To prevent readings from being prejudiced by other images, the images of the same patient were not read concurrently; the readings were merged only for the comprehensive evaluation.
Basically, all the 2D radiographic images were used to detect the IFs. However, there were a few exceptions due to the limitations of 2D images. Resorptive changes of the condyles could not be detected on lateral or postero-anterior cephalograms, and nasal septum deviation could not be detected on lateral cephalograms. In addition, concha bullosa was excluded from the analysis of 2D radiographic images because it can be detected only in 3D images. When findings were indefinite in 2D images because of overlap with other anatomic structures, obscure lesion borders, and so on, they were classified in the "not determined" category.
www.nature.com/scientificreports/

Statistical analysis.
To evaluate the reliability of interpretation, 50 samples were selected randomly, and inter- and intra-examiner correlations were calculated with percent agreement and kappa statistics. In each area, the data of the 2D radiographic images that showed the highest accuracy were regarded as the reference, and the differences were calculated to determine the effectiveness of each 2D image in detecting IFs. The term "comprehensive radiographs" means that the decision was reached using all the 2D radiographs. If all three 2D radiographs showed no IFs, we called it "absence" on the comprehensive radiographs. If IFs were detected on at least one radiograph, we considered it "presence" on the comprehensive radiographs. To compare the distribution of readings in each set of 2D radiographic images with that in the 3D CBCT, kappa statistics were used. Cohen's kappa coefficient describes the similarity between two groups: < 0.2 means poor agreement; < 0.4 means fair agreement; < 0.6 means moderate agreement; < 0.8 means good agreement; and > 0.8 means very good agreement 25.
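The comprehensive-radiograph rule and the kappa comparison described above can be illustrated with a minimal Python sketch. This is not the authors' code (the analysis was run in R); the function names and the example readings are hypothetical, and a kappa of exactly 0.8, which the text leaves unassigned, is treated here as "very good" by assumption.

```python
from collections import Counter


def comprehensive_reading(readings):
    """Return True ("presence") if any 2D radiograph shows a finding.

    `readings` holds one boolean per available radiograph
    (True = finding seen); all False means "absence".
    """
    return any(readings)


def cohen_kappa(a, b):
    """Cohen's kappa for two equally long sequences of labels."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: product of the two marginal proportions per label.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Map kappa to the agreement bands quoted in the text."""
    if kappa < 0.2:
        return "poor"
    if kappa < 0.4:
        return "fair"
    if kappa < 0.6:
        return "moderate"
    if kappa < 0.8:
        return "good"
    return "very good"  # assumption: exactly 0.8 is placed in this band


# Hypothetical readings for ten patients (1 = finding, 0 = no finding).
two_d = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
cbct = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
kappa = cohen_kappa(two_d, cbct)
print(round(kappa, 3), agreement_band(kappa))
```

In this made-up example the two sets of readings agree in 8 of 10 patients, yet kappa is lower than the raw agreement because part of that agreement is expected by chance.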
As the numbers of cases with and without IFs in CBCT images differed, the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and resultant accuracy could be distorted. Therefore, we selected cases randomly to match the numbers between the group in which IFs were detected and the group in which IFs were not detected with CBCT images. For example, as the number of cases with paranasal sinus findings on CBCT images was 151, 151 panoramic radiographs were supposed to be selected randomly. However, 6 of the 151 patients did not have panoramic radiographs. In that case, we selected 145 panoramic radiographs randomly. Likewise, 3 of the 151 patients did not have lateral cephalograms, so 148 random lateral cephalograms were selected for the statistical analysis. Data in the "not determined" category were included in the "absence" category for conservative interpretation. The cut-off value accuracy test was conducted with the following values: sensitivity, specificity, PPV, NPV, and accuracy. A proportion test was also conducted to compare the accuracy of each 2D radiograph, and statistical significance was evaluated at a P-value of 0.05. All statistical analyses were carried out with the "caret" package in R 3.6.1 (https://cran.r-project.org).
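As a rough illustration of the accuracy measures named above, the following Python sketch computes them from 2D readings scored against CBCT, using the standard confusion-matrix definitions. It is a sketch under stated assumptions, not the authors' R/caret code: the helper names, the `min`-based group matching, and the 20 example cases are all hypothetical.

```python
import random


def to_binary(reading, not_determined_as):
    """Map a 2D reading to True/False; 'not determined' follows the chosen rule."""
    if reading == "not determined":
        return not_determined_as == "presence"
    if reading == "presence":
        return True
    if reading == "absence":
        return False
    raise ValueError(f"unknown reading: {reading!r}")


def balanced_sample(cases, seed=0):
    """Randomly draw equal numbers of CBCT-positive and CBCT-negative cases.

    Each case is a (reading_2d, cbct_present) pair. Matching to the smaller
    group is an assumption of this sketch.
    """
    rng = random.Random(seed)
    positives = [c for c in cases if c[1]]
    negatives = [c for c in cases if not c[1]]
    k = min(len(positives), len(negatives))
    return rng.sample(positives, k) + rng.sample(negatives, k)


def diagnostic_metrics(predicted, truth):
    """Standard definitions, with CBCT (truth) as the gold standard."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


def evaluate(cases, not_determined_as):
    predicted = [to_binary(r, not_determined_as) for r, _ in cases]
    truth = [t for _, t in cases]
    return diagnostic_metrics(predicted, truth)


# Hypothetical data: (2D reading, finding present on CBCT).
cases = (
    [("presence", True)] * 6 + [("not determined", True)] * 1
    + [("absence", True)] * 3 + [("absence", False)] * 6
    + [("presence", False)] * 1 + [("not determined", False)] * 3
)
sample = balanced_sample(cases, seed=0)
as_absence = evaluate(sample, "absence")    # "not determined" counted as absence
as_presence = evaluate(sample, "presence")  # "not determined" counted as presence
print(as_absence["accuracy"], as_presence["accuracy"])
```

In this invented data set most "not determined" readings are negative on CBCT, so counting them as absences gives the higher accuracy; with a different mix the ordering could reverse.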

Results
Intra- and inter-examiner agreements were over 80% in all variables. Kappa coefficients over 0.5 suggested that all the variables were in substantial agreement 26 (Table 3). At least one IF was confirmed in 709 CBCT images, which accounted for 69.5% of all CBCT images. The most common IFs of all craniofacial areas were findings in the nasal cavity, especially nasal septum deviation. Ninety-three patients (9.1%) showed two or more IFs simultaneously. The mean age of the subjects with incidental findings was 22.3 years, and the mean age of the subjects without incidental findings was 20.1 years.
IFs detected on the 2D radiographs (Tables 4 and 5). Table 4 shows the prevalence of IFs in each 2D radiographic image. Of the 2D images, panoramic radiographs showed the highest prevalence of IFs in all areas when the "presence" and "not determined" categories were both counted as IFs. However, when only definite findings in the "presence" category were considered IFs, postero-anterior cephalograms showed the highest prevalence of nasal septum deviation. One hundred panoramic radiographs and 32 postero-anterior cephalograms fell in the "not determined" group for detecting nasal septum deviation, the largest numbers among the "not determined" categories. However, the proportion of the "not determined" group to the "presence" group was higher in detecting pathologies and maxillary sinus findings. The kappa value provides a simple comparison of the distribution of readings. In all categories except the nasal cavity, the kappa values were over 0.4, which meant fairly good agreement between the panoramic radiograph readings and the CBCT readings. In the soft tissue calcification category, there was substantial agreement between the CBCT readings and the lateral cephalogram readings. The kappa value reflects only the similarity of the distributions and does not guarantee the accuracy of readings, because the readings can include not only true positive and negative readings but also false positive and negative readings. Despite this limitation, panoramic radiographs showed a distribution more similar to the CBCT gold standard than cephalograms did. Table 5 shows the source of data in the "not determined" category for each 2D radiograph, as confirmed by the 3D CBCT images. "Not determined" results of panoramic radiographs and postero-anterior cephalograms in the maxillary sinus category, of panoramic radiographs in the TMJ category, and of postero-anterior cephalograms in the nasal cavity category mostly corresponded to absence in CBCT images. However, all the "not determined" results in the soft tissue calcification and pathology categories corresponded to presence in CBCT images.
Evaluation of 2D combination radiography diagnostic ability based on the gold standards by CBCT (Table 6). Table 6 shows the actual accuracy of 2D radiographs (combination of panoramic radiograph, lateral cephalogram, and postero-anterior cephalogram) in detecting IFs compared to 3D CBCT images. We performed the accuracy analysis under two different assumptions: Assumption 1, grouping the "not determined" group with the "absence" group; Assumption 2, grouping the "not determined" group with the "presence" group. The accuracy of the 2D combination radiographs in detecting IFs is indicated in Table 6; the highest accuracy, 85.71%, was observed in the soft tissue calcification findings, and the lowest, 62.22%, in the nasal cavity findings. In most cases, the accuracy tended to be higher under Assumption 1, that is, when ambiguous cases were regarded as having no IFs.
Table 6. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy of 2D combination radiography (PANO + LAT + PA) based on the gold standards by CBCT.
28 found a similar prevalence rate (18.7%) of incidental findings in airways, but the total prevalence rate (92%) was much higher. The differences came from other abnormalities such as temporomandibular joint (TMJ) lesions, soft tissue calcification, and pathological problems. Given that the definition and range of incidental findings vary among authors, the absolute value of the total prevalence rate might not be meaningful. However, the prevalence rate in each area reflects the effectiveness of CBCT in detecting incidental findings of abnormal structures or lesions. The originality of this study lies in the inclusion of a large sample of over 1000 subjects and the use of CBCT images taken with a large FOV. The required size of the FOV varies according to the purpose of the CBCT.
In general, small FOVs cover limited areas such as the maxillary or mandibular area. Recently, CBCT with a small FOV has been required to evaluate bone quality for dental implants. Conversely, large FOVs cover all craniofacial areas, which include not only the maxilla and mandible but also surrounding areas such as the sinuses and airways 18. Lopes et al. suggested that the frequency of IFs was higher in CBCT images with a larger FOV 29. In contemporary orthodontics, a large FOV is beneficial for complicated craniofacial or orthognathic cases 30, and the evaluation of the TMJ area and pharyngeal airway has become more important for orthodontic patients. In addition, CBCT with a large FOV can replace conventional 2D radiographs, such as panoramic radiographs and cephalograms 31, because the region of interest (ROI) of orthodontists includes all craniofacial parts, not only the dental area.
Edward et al. suggested that IFs are detected more frequently in older patients than in younger populations 14. For example, condylar pathologic changes were detected 3.6 times more often in patients aged > 65 years than in younger patients 32. Some authors demonstrated that life-threatening pathologies and malignancies were found in CBCT images 18. Allareddy et al. found three malignancies in 1000 patients 33, and Warhekar et al. found 11 malignancies in 795 patients 34. Although the number of subjects in this study was larger than that of previous studies, neither malignancies nor life-threatening pathologies were found in the CBCT images. The key factor behind these differences might be the young age of the subjects. As the subjects in this study were patients who visited for orthodontic treatment, their mean age was relatively young, 21.7 years. This is much lower than in previous studies whose subjects underwent CBCT for other purposes, such as evaluation for dental implants, rather than for orthodontic diagnosis: for example, 39 34 . Despite the relatively low severity of the pathologies found in this study, some of them required therapeutic interventions (Fig. 1). Additionally, we did not find a significant difference in age between the subjects with incidental findings and the subjects without incidental findings. In contrast, Hernandez et al.'s study showed a significant association between age and the presence of incidental findings 36. However, they used only panoramic radiographs and lateral cephalograms, a set that showed relatively low accuracy in our study. We used 3D CBCT images and additional postero-anterior cephalograms to achieve a more precise result. Over 80% of our subjects were under 29 years old. Because of this concentrated age distribution, there were no significant differences among age groups.
We also focused on the effectiveness of 2D radiographic images. As previous studies reported lower accuracy or effectiveness of 2D radiographic images compared with 3D radiographic images, it could be predicted that false negative findings would lower the sensitivity. Interestingly, however, there were more false positive findings in 2D radiographic images than we expected. For example, in the case shown in Fig. 2, both maxillary sinuses appeared hazy and total opacification of the right sinus was suspected, but the sinuses observed in the CBCT images were clear, without any inflammation.
Unlike in the 3D CBCT evaluation, some findings could not be determined in 2D radiographs and were categorized as "not determined". In 100 panoramic radiographs of 1020 cases, we could not confirm the nasal cavity findings. The higher these ratios, the lower the efficiency of 2D radiographs in detecting IFs might be. For conservative interpretation, we regarded the data in the "not determined" category as "absence" of findings when calculating the accuracy. We traced the source of the data in the "not determined" category in Table 5. Although the "not determined" data in the soft tissue calcification and pathology categories all corresponded to "presence" of findings in CBCT images, most other data corresponded to "absence" in the CBCT results. Hence, we needed the cut-off value accuracy test to confirm the detailed accuracy of each kind of radiograph compared to the gold standard results of the CBCT.
Constantine et al. suggested that panoramic imaging has poor efficiency in finding problems of the maxillary sinuses, presenting a result of 36.7% sensitivity and 51.9% NPV 37. However, higher sensitivity and NPV of the panoramic radiographs were measured in the present study, resulting in higher accuracy. Conversely, the accuracy of lateral and postero-anterior cephalograms was significantly lower than that of the reference, implying a low diagnostic value of cephalograms for detecting maxillary sinus findings. For the assessment of the TMJ area, the specificity was over 90%, but the sensitivity was only about 50% with panoramic radiography. This means that resorptive changes of the condyles were often missed on panoramic radiographs even in cases with actual resorption confirmed on CBCT images. This result partially conforms with the explanation by De Grauwe et al., who suggested that CBCT imaging is superior to the conventional 2D radiographic methods for the diagnosis of TMJ abnormalities, and that CBCT can be regarded as a first diagnostic aid instead of 2D radiography 38. As described above, concha bullosa cannot be found in 2D radiographic images, so only nasal septum deviation was evaluated in 2D images. The highest accuracy was observed with the postero-anterior cephalogram, and panoramic radiography showed significantly lower accuracy. Comprehensive evaluation with the panoramic radiographs and postero-anterior cephalograms also showed a low accuracy, although the difference was statistically insignificant; therefore, accurate findings might be drawn from the postero-anterior cephalogram alone, without the panoramic radiograph. The specificity and PPV were 100% in all 2D radiographs for the findings of soft tissue calcification and pathology. This resulted in higher accuracy in detecting these abnormalities. Since the number of subjects who had soft tissue calcification and pathology was relatively small, the reliability of this statistical result might not be assured.
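The dependence of such accuracy comparisons on sample size can be shown with a short Python sketch of a two-proportion test. The paper does not state which variant of the proportion test was used; the pooled z-test without continuity correction below is an assumption, and the counts are hypothetical, not taken from the tables.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for the difference between two proportions.

    x1, x2: numbers of correct readings; n1, n2: numbers of images read.
    Returns (z, p_value).
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("sample sizes must be positive")
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical: 80% vs 60% accuracy with 200 images per radiograph type...
z_large, p_large = two_proportion_z_test(160, 200, 120, 200)
# ...and the same accuracies with only 10 images per radiograph type.
z_small, p_small = two_proportion_z_test(8, 10, 6, 10)
print(f"n=200: z={z_large:.2f}, p={p_large:.5f}")
print(f"n=10:  z={z_small:.2f}, p={p_small:.3f}")
```

With the same 20-point difference in accuracy, the larger hypothetical sample falls below the 0.05 threshold while the smaller one does not, which is why results for categories with few affected subjects should be read with caution.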
In summary, this study suggests a higher accuracy in detecting IFs of soft tissue calcification, a moderate accuracy for IFs of the paranasal sinus, TMJ, and pathology, and a low accuracy for IFs of the nasal cavity. Maxillary sinus abnormalities were found more accurately on panoramic radiographs than on other 2D radiographs, and when only 2D radiographic images are used, nasal septum deviation should be evaluated on postero-anterior cephalograms alone, without panoramic radiographs. Soft tissue calcification can be found with a higher accuracy using comprehensive 2D radiographic images.
With regard to radiation, there have been controversies about performing CBCT routinely for orthodontic patients, especially young patients. De Grauwe et al. emphasized that CBCT in a pediatric population can be justified in cases that cannot be diagnosed accurately with conventional radiographs, such as impacted teeth or a cleft lip and/or palate, when the effective dose is higher than the conventional 2D series 38. The effective dose of CBCT ranges from 11 to 674 μSv (median value: 61 μSv) in small and medium field of view (FOV), and from 30 to 1073 μSv (median value: 87 μSv) in large FOV 39. The radiation dose of a conventional set of orthodontic radiographs (i.e., panoramic radiograph, and lateral and postero-anterior cephalograms) is 35.81 μSv 40. That is far less than the effective dose of multislice CT, which ranges from 280 to 1410 μSv when providing high-resolution images and has been used traditionally in medical areas 41. Based on these advantages, CBCT has been utilized in orthodontics 42. The ALARA principle, an acronym for "as low as (is) reasonably achievable," has been applied not only in the industrial field but also in the medical field 43. Despite cautious concerns about radiation exposure, the benefits and risks should be compared. The diagnostic value of CBCT has risen in regular orthodontic patients, not just in specific orthodontic patients, with benefits that could not be achieved by conventional 2D radiographs. Nevertheless, we do not take CBCT images in all regular orthodontic patients for whom the conventional 2D radiographic sets have been taken. CBCT images are taken only in patients with special purposes, such as the evaluation for orthognathic surgery, impacted teeth, alveolar bone housing, or transverse discrepancy at the furcation level, that is, in cases with definite benefits. If the benefits are higher than the expected risks, taking the required radiographs could be considered ethically acceptable.
Although it is certain that the radiation dose from medical equipment should be minimized, it is known that the risks of exposure to ionizing radiation used for medical indications are quite low and similar to other risks that are acceptable in everyday life 44. Moreover, multiple factors influence the effective dose of CBCT, and by controlling these factors the effective dose of CBCT can be made lower than that of the conventional radiograph set 45. There is an inevitable drift toward using 3D CBCT data in contemporary dentistry, which means that the mainstream of imaging in dentistry has shifted from conventional 2D radiographs to 3D radiographs. This study was meaningful in that the efficiencies of 2D and 3D radiographs in detecting IFs were compared in this transition period. With the help of artificial intelligence techniques based on deep learning, improved quality of CBCT images can be acquired with a lower radiation dose 46. The detection of IFs may be improved by the development of CBCT equipment with a lower radiation dose and higher resolution.
This study used only radiographic interpretations, using CBCT as the gold standard. A limitation of this study was that there was no histopathological or clinical confirmation of the findings. For example, the findings in the pathology category were expressed based on the radiologic impression, not on the clinical diagnosis. To evaluate the effectiveness of radiographs in detecting IFs, and to overcome the limitation of regarding CBCT images as a gold standard, further studies should match the evaluation of radiographic examinations with clinical information.
In conclusion, (1) 69.5% of subjects showed at least one IF, so clinicians should be responsible for investigating radiographic images carefully; (2) the possibility of detecting IFs on panoramic radiographs and lateral and postero-anterior cephalograms was verified, although the accuracy was not very high; (3) when an IF could not be determined because of ambiguity in the 2D radiographs, treating it as an absence of findings increased the accuracy.