Introduction

Head and neck cancer include oral cavity and oropharyngeal cancers, nasopharyngeal cancers (NPC), laryngeal and hypopharyngeal cancers, et al. According to the globocan 2012, the incidence of head and neck cancer every year is 683235/1000,000, the mortality is 375665/1000,000. And it is the sixth cause of cancer death throughout the world1,2,3. Because of the difficulty in early diagnosis, the five-year survival rate of head and neck cancer is lower than other cancers like breast, cervix and colorectal cancers4. Therefore, early diagnosis of head and neck cancer is extremely important.

There are many kinds of methods to detect head and neck cancer, including radiologic examinations and endoscopy. Radiologic examinations include CT, MRI, PET/CT and so on. CT or MRI provides cross sectional imaging, if the tumors are too small, they can’t be visualized by CT or MRI. The time and cost may be also problem for patients. Additionally, CT or MRI may be beneficial in staging laryngeal carcinoma and planning the most appropriate surgical procedure5. PET/CT identified primary tumors of non-squamous origin that infiltrated the middle or deep submucosal layer, including nonkeratinizing and undifferentiated carcinomas. Small or superficial lesions can easily be missed, because they are too small to take up sufficient FDG for detection on PET, this is the limited resolution of the PET scanner6,7. Therefore, endoscopy may be the prime method to early diagnose head and neck cancer. Among them, narrow band imaging (NBI) is a novel optical digital method of image-enhanced endoscopy which combined with the electronic laryngoscope. The thin capillary network on the mucosal surface shows as brownish and thick blood vessels show as cyan, on the NBI image. To identify suspicious lesions, especially in the colon, stomach, and esophagus, NBI has been applied as the standard endoscopic examination. Besides there were many meta-analysis studies to evaluate the diagnostic performance of NBI in these cancers8,9,10,11,12,13,14,15.

However, there was only one meta-analysis to systematically evaluate the value of NBI in diagnosing head and neck cancer16, to our knowledge, and there was another meta-analysis to assess the diagnostic performance of NBI for second primary lesions in patients with esophageal and head and neck cancer17. Therefore, we aimed to establish the sensitivity, specificity of NBI for differentiation between benign and malignant head and neck lesions.

Results

Study selection

We included 25 studies with total 6187 lesions. The search process and articles eligible for this meta-analysis are shown in Fig. 1. From the initial keyword search, a total of 1044 studies after removing duplications were identified. After screening the title, sixty-two studies were left. Then, eighteen studies were excluded on the basis of abstract. Then, nineteen studies were excluded because of different studies (n = 11) and incomplete data (n = 8) after further reviewing in full text. Finally, twenty-five studies included in the meta-analysis, including two studies that had been reported in abstracts, twenty-three studies were full texts.

Figure. 1
figure 1

Study flow chart.

Characteristics of eligible studies

The characteristics of the 25 included studies are summarized in Table 118,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42. All included studies were published in English except one which was published in Chinese42. The year of publication of studies ranged from 2008 to 2016. These studies were from 6 different countries, most of them were done in Asia Twenty (80%) studies were prospective, only five (20%) were retrospective. Four (16%) studies were performed with magnification system, other twenty-one (84%) studies were without magnification. And eight (32%) studies were assessed with high definition system, twenty (80%) studies were without it, three (12%) studies consisted of both high definition and no high definition. Six studies (24%) were done with Exera series endoscopy systems, three (12%) with Lucera spectrum endoscopy systems, and sixteen (64%) with unknown systems. Sensitivity for differentiation of lesions in all studies varied from 43.2% to 100%, and specificity from 74.5% to 100%. The numbers of patients and lesions, mean age the ratio of male and female, the numbers of pathologists and endoscopists also can be seen in the Table 1.

Table 1 Study characteristics of narrow band imaging.

The potential heterogeneity

Analyzed by meta-regression, the location, type of assessment, type of endoscope system and high definition were not significant sources of heterogeneity (P > 0.05). However, magnification may be related to the source of heterogeneity, diagnostic accuracy of NBI with magnification was 0.1 times higher than NBI without magnification (RDOR = 0.10, 95% CI: 0.02–0.49; P = 0.0065).

The overall diagnostic accuracy of NBI

We pooled the sample statistics directly with Spearman correlation coefficient: −0.155, p-value = 0.461, which manifested there was no threshold effect. In terms of heterogeneity, we used a random effects model.

A total of 6187 lesions were detected. The pooled sensitivity and specificity of NBI were 88.5% (95%CI: 86.6–90.2, I2 = 85.1%, Figs. 2) and 95.6% (95%CI: 95.0–96.2, I2 = 91.9%, Fig. 3), respectively. The pooled PLR and NLR of NBI were 12.33 (95%CI: 7.72–19.67, I2 = 91%, Figs. 4) and 0.11 (95%CI: 0.07–0.17, I2 = 88.3%, Fig. 5), respectively. The pooled DOR was 121.26 (95%: 63.50–231.56, I2 = 80.8%, Supplementary Figure 1). The overall area under the curve (AUC) of SROC was 96.94%, Q statistic was 91.98%. (Supplementary Figure 2).

Figure 2
figure 2

Forest plot showing pooled sensitivity of narrow-band imaging for head and neck cancer.

Figure 3
figure 3

Forest plot showing pooled specificity of narrow-band imaging for head and neck cancer.

Figure 4
figure 4

Forest plot showing pooled positive likelihood ratio of narrow-band imaging for head and neck cancer.

Figure 5
figure 5

Forest plot showing pooled negative likelihood ratio of narrow-band imaging for head and neck cancer.

The diagnostic accuracy of NBI in different locations

A total of 1349 lesions in the head and neck, 2387 lesions in nasopharyngeal carcinoma, 1071 lesions in oral and oropharyngeal cancers and 1380 lesions in laryngeal and hypopharyngeal cancers were detected. The sensitivity in nasopharyngeal carcinoma, laryngeal and hypopharyngeal cancers were higher than in the head and neck cancers, oral and oropharyngeal cancers (92.9% and 91.0% vs 84.6% and 81.9%). The specificity in head and neck cancers, nasopharyngeal carcinoma were higher than in oral and oropharyngeal cancers, laryngeal and hypopharyngeal cancers (97.5% and 97.4% vs 92.0% and 91.5%). Besides, the PLR of NBI in head and neck cancers, nasopharyngeal carcinoma were higher than in oral and oropharyngeal cancers, laryngeal and hypopharyngeal cancers. The NLR in head and neck cancers, nasopharyngeal carcinoma were lower than in oral and oropharyngeal cancers, laryngeal and hypopharyngeal cancers. The DOR and SROC also can be seen in Table 2.

Table 2 Diagnostic performance of narrow band imaging.

The diagnostic accuracy of NBI in different types of assessment

A total of 5073 lesions with real-time assessment and 1114 lesions with post-procedure were detected in retrospective assessment. The sensitivity with real-time was lower than post-procedure (81.5% vs 91.7%). The specificity was higher in real-time (96.4% vs 91.9%). The PLR of NBI in real time was higher than in post-procedure. The NLR, DOR and SROC also can be seen in Table 2.

The diagnostic accuracy of NBI with or without magnification

A total of 482 lesions with magnification and 5705 lesions without magnification were detected. The sensitivity and specificity with magnification were both lower than without magnification (86.0% vs 89.3%, 79.6% vs 96.4%). The PLR of NBI with magnification was lower than without magnification. The NLR, DOR and SROC also can be seen in Table 2.

The diagnostic accuracy of NBI with or without high definition

Three studies used both high definition technology and no high definition technology. A total of 1627 lesions with high definition and 5446 lesions without high definition were detected. The sensitivity with high definition was higher than without high definition (91.6% vs 88.6%). However, the specificity with high definition was lower than without high definition (93.8% vs 96.3%). The NLR of NBI with high definition was lower than without high definition. The PLR, DOR and SROC also can be seen in Table 2.

The diagnostic accuracy of NBI in different types of endoscope system

A total of 282 lesions were detected by the type of Lucera and 950 lesions were detected by the type of Exera. The sensitivity in Lucera system was higher than in Exera system (92.5% vs 85.5%). However, the specificity in Lucera system was lower than in Exera (90.6% vs 92.8%). The NLR of NBI in Lucera was lower than in Exera. The PLR, DOR and SROC also can be seen in Table 2.

Quality assessment and publication bias

Based on QUADAS, the quality assessment of included studies is presented in the Supplementary Table 1. A total of 17 studies scored more than 11 in 23 studies with full text. Besides, the first and fifth items scored more “no” than others. All of included studies scored “yes” in the third, sixth and seventh items. (Supplementary Table 2) There was no significant asymmetry in Deeks’ funnel plot (P = 0.447), representing no striking publication bias.

Discussion

We analyzed the overall diagnostic accuracy of NBI, and found that NBI could be a considerable tool for differentiating benign and malignant lesions in head and neck cancer (sensitivity: 88.5%, specificity: 95.6%, PLR:12.33, NLR: 0.11). Additionally, we found that diagnostic accuracy of NBI was similar in the location, type of assessment, type of endoscope system and high definition (P > 0.05). However, diagnostic accuracy of NBI was different in magnification (P = 0.0065).

To the best of our knowledge, Cosway, B. et al. found that combined use of NBI and white light imaging (WLI) showed high diagnostic accuracies for primary, recurrent, and nasopharyngeal lesions16. This article only included 17 studies and emphasized more on the identification of primary tumors, recurrent tumors, cancers of unknown primary(CUP), and nasopharyngeal tumors. This meta-analysis included 25 studies with 6187 lesions. Meanwhile, we paid more attention on the locations of lesions and the methods of using NBI. In addition, one previous study showed that the sensitivity of NBI in detecting second primary neoplasm of head and neck cancer in patients with esophagus cancer was 87%17, similar with the sensitivity in this study. However the previous meta-analysis included only 3 studies to evaluate diagnostic performance of NBI for detecting head and neck cancer.

There are many studies investigating the performance of NBI in colon, gastric and esophagus cancer, et al.8,9,10,11,12,13, but there is rare study to explore the diagnostic value of NBI in the head and neck cancer16,17. Meanwhile, the difficulty of observing lesions of different locations in head and neck is different, thus the diagnostic accuracy may be also different. However, in this study, we didn’t find the difference. The sensitivity and specificity were both high. Additionally, the PLR of NBI were all more than 10 and the NLR were all less than 0.1 except in oral and oropharynx. Besides, the AUC of SROC were all more than 90%. The results indicated that NBI is a considerable tool to differentiate between benign and malignant lesions in head and neck cancer.

The type of included studies were RCT, prospective and retrospective, and we considered RCT and prospective as real time, the retrospective as post-procedure. Real-time assessment is the optimum selection to evaluate performance, because it avoids bias of photographic selection and performs an in-vivo optical diagnosis. However, real time assessment sometimes can not acquire the timely histological examinations. Thus, we analyzed the type of assessment to investigate the difference. The PLR and AUC of NBI in real time were higher than in post-procedure. However, we didn’t find the difference, which was consistent with the results of the previous study8. The small number of included studies may lead to this result. Therefore, a meta-analysis including more studies is needed to explain the reason.

With the development of optical technology, NBI can detect lesions with magnification or high definition system. Thus, this may influence the sensitivity and specificity of NBI. The utility of NBI is enhanced when it is employed with a magnifying endoscope43,44. This meta-analysis showed that NBI with magnification had a lower specificity (79.6%) than NBI without magnification (96.4%). Besides, the PLR and AUC of NBI with magnification were lower than without magnification. The results were not consistent with the previous study which indicated no significant difference between NBI with or without magnification14,45,46. This may because that the number of included studies of with or without magnification is extremely different. Additionally, the location is also different between this study which is in the head and neck and the previous studies which are in esophagus, stomach or colon. There are more polyps in stomach or colon cancer, while there is more squamous cell carcinoma in head and neck cancer. One study reported that high definition endoscopy could improve the detection and diagnosis of early gastric cancer47. However, this study showed that there was no difference between NBI with or without high definition. This is consistent with the previous study48. It may be related to the small number of studies including NBI with high definition. One previous study found that high definition significantly decreased the performance of NBI12. This previous study reported that this may because of the lack of experiences of endoscopists. More studies need be performed to explore the accuracy of NBI with or without high definition.

Different types of endoscope system may have different performance. The NBI technique developed by Olympus Medical Systems is now available in the most recent models of video-endoscopes that use the non-sequential system of illumination (Lucera Spectrum) or the sequential R/G/B system of illumination (Exera II). The two systems use different technology and improvements over time in the successive versions series endoscopes. Result in this study didn’t show the difference between the two systems, which is consistent with the previous study49. This indicated that the LUCERA and EXERA series provided the same clinical benefit in detecting head and neck cancer.

There were several limitations in this study. First, we didn’t know exactly the standard of experts and non-experts. In some studies, experts were experienced persons, and in other studies, experts were got trained persons, and another studies had no clear explanation of expert. Uneven standard greatly reduced the generalizability. Second, the diagnostic criteria were different. In some studies, they regarded a well-demarcated brownish area with scattered brown spots as susceptible lesions. Some studies used Ni’s classification, the classification of the microvascular endoscopic patterns of Ni, et al.37. Additionally, some studies used visualize intrapapillary capillary loops (IPCLs) pattern and a color change in the area between IPCLs. Several different classification systems used in the included studies also can reduce the generalizability of the overall performance. Because of the limitation of the numbers, we didn’t analyze this factor. Third, some studies included patients after treatment, while some studies included patients without treatment. This may influence the morphology of lesions, and influence the diagnosis Fourth, the heterogeneity of this study was relatively high. We showed that the NBI with or without magnification was the possible source. However, the time of detecting, the lesion size, the sample size, the pathological type and the bias of selection of patients can also reduce the generalizability of the overall performance and increase the heterogeneity of this study. Finally, our study only included studies written by English or Chinese.

In conclusion, narrow band imaging could be used by appropriately trained endoscopists to make a reliable optical diagnosis for head and neck lesions in daily practice. The location, type of assessment, type of endoscope system and high definition were not significant sources of heterogeneity (P > 0.05). The NBI with magnification should have better diagnosis accuracy. However, our article was not consistent with the hypothesis. More articles are needed to estimate the diagnosis value of NBI. Moreover, further studies should be focused on the diagnostic criteria, whether real-time and whether training will help improve the diagnostic performance of narrow band imaging.

Methods

Search strategy

We did a meta-analysis based on the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines50. We searched PubMed from inception to Feb 29, 2016, Embase from Jan 1, 1974 to Feb 29, 2016, Cochrane Library from Jan 1, 1999 to Feb 29, 2016 with following search terms: “narrow band imaging” or “NBI” and “sensitivity”. There was no restriction in PubMed and Embase. The studies were limited to trails in Cochrane Library. Additionally, we checked the reference lists of applicable studies to include other potentially eligible studies.

Inclusion and exclusion criteria

In accordance with PICOS (participants, intervention, control, outcome, study design), the inclusion criteria were as follows:

  1. 1.

    Studies were in English or Chinese.

  2. 2.

    Studies included populations who were suspected to have oral cancers, oropharyngeal cancers, nasopharyngeal cancers, hypopharyngeal cancers, laryngeal cancers or head and neck cancers.

  3. 3.

    Studies investigated detection of narrow band imaging.

  4. 4.

    Studies took pathology as gold standard.

  5. 5.

    Studies provided data of true-positive (TP), false-positive (FP), true-negative (TN), and false-negative (FN); or reported data of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and numbers of patients or lesions.

  6. 6.

    Studies were prospective, randomized controlled trails (RCTs) or retrospective.

The abstracts were also included when they contained available data.

The exclusion criteria were as follows:

  1. 1.

    Studies were in other languages.

  2. 2.

    Studies focused on other lesions, such as oral leukoplakia, ulcers.

  3. 3.

    Studies didn’t use narrow band imaging or had no pathological results.

  4. 4.

    Studies did not have available data.

  5. 5.

    Studies were other research types, such as reviews, case reports, letters.

Data extraction

Two authors independently extracted the following data: the name of first author, year of publication, country, publication (full text or abstract), number of patients, number of lesions, patients’ characteristics (mean year, sex ratio), characteristics of narrow band imaging (magnification, high definition, endoscopy system, real time), endoscopists, pathologists, outcome data (TP, FP, FN, TN or sensitivity, specificity, PPV, NPV), and study design (prospective, RCTs, or retrospective). Extracted data were imported to a standard Excel (Microsoft Office). If there were any disagreements, the two authors would discuss and reach a consensus.

Qualitative assessment

We used the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool to assess risk of bias, applicability and reporting quality1,2,3. Each of the 14 items included in the original QUADAS tool is rated as “yes”, “no”, or “unclear”, with “yes” representing a positive outcome. The majority of the items (items 3, 4, 5, 6, 7, 10, 11, 12 and 14) is related to bias, with only two items related to variability (items 1 and 2) and the quality of reporting (items 8, 9 and 13), respectively1. Two independent authors performed and crosschecked the quality assessment.

Statistical analysis

We estimated the pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR) and diagnostic OR with corresponding 95%CI by a fixed effect model (Mantel-Haenszel method) when homogeneity was fine or a random effect model (DerSimonian-Laird method) when heterogeneity was significant, to indicate the accuracy of NBI in the diagnosis of benign and malignant tumors51. Besides, the summary receiver operating characteristic (SROC) curve was constructed as described by Moses et al.18 to show the results. The areas under the curve (AUC) were calculated to assess the diagnostic accuracy of NBI for the benign and malignant tumors.

We calculated Spearman correlation coefficient to analyze diagnostic threshold. If there was no threshold effect, we pooled these sample statistics directly. If there was threshold effect, we employed fit SROC curve and calculated the AUC and Q statistic. Besides, we used the inconsistency index (I2) to manifest the percentage variability attributable to heterogeneity. If the I2 value >50%, it demonstrates substantial heterogeneity. To investigate the potential sources, we performed subgroup analysis and meta-regression analysis to assess the effects of different locations, type of assessment (real-time vs post-procedure), magnification, high definition, and type of endoscope system.

Publication bias was assessed by Deeks’ funnel plot asymmetry test, with p > 0.05 representing no significant publication bias52. All statistical analysis were performed by using Meta-DiSc statistical software version 1.4 and STATA version 12.0.

Data availaility

All the data we get was from public sources.