
# Mutation load estimation model as a predictor of the response to cancer immunotherapy

## Introduction

Cancer is a leading cause of death worldwide. Cancer therapeutics are intensively studied, and immunotherapy is one of the most promising novel approaches. In this type of therapy, the immune system is recruited to fight tumor development and expansion, and the most successful immunotherapeutics to date have been immune checkpoint inhibitors, such as anti-programmed cell death protein 1 (PD-1), anti-PD-L1, and anti-CTLA-4 antibodies.1 Under normal conditions, T-cells can identify and kill tumor cells by recognizing the antigens on them. However, tumor cells have evolved a mechanism that allows them to avoid being killed by exploiting the tightly regulated nature of T-cells. Specifically, PD-1, a surface receptor on T-cells, is an immune checkpoint molecule responsible for preventing autoimmunity. When PD-1 binds its ligand, PD-L1, the T-cells are deactivated. Tumor cells can therefore present PD-L1 on their surfaces and escape death by deactivating T-cells.2 Immune checkpoint inhibitors have been developed to block the interaction between PD-1 and PD-L1, allowing the immune system to act against the tumor.3 The US Food and Drug Administration (FDA) has approved anti-PD-1 (nivolumab, pembrolizumab), anti-PD-L1 (atezolizumab), and anti-CTLA-4 (ipilimumab) drugs for the treatment of different kinds of cancers, such as melanoma, non-small-cell lung cancer, bladder cancer, head and neck cancer, and renal cell carcinoma.4,5,6 Clinical trials examining the anti-tumor activity of PD-1/PD-L1 blocking antibodies against other solid and hematological malignancies are in progress, demonstrating that the PD-1 pathway represents a promising target for anti-cancer therapy.7

Although the efficacy of immunotherapy has been demonstrated, a treatment response is observed in only a subset of patients.8,9,10 Therefore, it is necessary to identify the patients who are likely to respond and to understand the underlying mechanisms. Rizvi et al.10 demonstrated that the mutation load, defined as the number of nonsynonymous point mutations, may be a useful predictive biomarker for treatment response. An increased number of nonsynonymous point mutations is associated with improved objective response, durable clinical benefit (DCB), and progression-free survival (PFS). However, whole-exome sequencing, which is necessary to determine the mutation load, is not sufficiently cost- and time-effective to be applied as a standard clinical test. In contrast, cancer gene panels composed of about 300–600 cancer-related genes are used in clinical practice to investigate the genetic profile of tumors.11,12 Therefore, the use of next-generation sequencing (NGS) gene panels for the precise estimation of the mutation load and the prediction of treatment response has been investigated. Johnson et al.13 showed that the mutation counts detected in a 315-gene NGS panel for melanoma are highly correlated with those assessed by whole-exome sequencing (Spearman correlation coefficient = 0.995). Additionally, patients with high mutation counts detected by NGS gene panels were shown to have a significantly longer PFS than those with low gene panel mutation counts.12 Further, Roszik et al.14 developed a novel algorithmic method to accurately predict the total mutation load within tumors using approximately 170 genes in the NGS panels. These results indicate that NGS gene panels with hundreds of genes can be used to estimate the mutation load of tumors and to predict the efficacy of immunotherapy.
However, Campesato et al.12 further demonstrated that the predictive accuracy is apparently lost when the number of genes in the NGS panel is lower than 150, suggesting that comprehensive gene panels comprising more than 300 cancer-related genes should be employed. Unfortunately, the cost of NGS gene panels with more than 300 genes is still high, which may put them out of reach for routine clinical testing in most hospitals worldwide.

Here, using publicly available cancer genomics data, we propose a computational framework for constructing a mutation load estimation model for lung adenocarcinoma, the most common type of lung cancer, and we analyze how effectively this model predicts the response to cancer immunotherapy. The framework was also applied to construct mutation load estimation models for melanoma and colorectal cancer. These cancer-specific models may allow the design of customized panels for the targeted sequencing of selected genes to estimate the mutation load, instead of whole-exome sequencing, decreasing the cost and time required for the assessment of the mutation load.

## Results

### Computational framework overview

The flowchart of the computational framework used to construct the mutation load estimation model for lung adenocarcinoma is shown in Fig. 1. We generated the mutation matrix from the somatic mutation data downloaded from The Cancer Genome Atlas (TCGA),15 which served as the training data. Subsequently, the candidate genes were selected based on a set of defined criteria. A simple linear model was then used to construct the mutation load estimation model. The least squares parameter estimation method was employed for parameter identification, and the Bayesian information criterion (BIC) was used for model selection. After the most appropriate model was selected, its performance was evaluated and verified using the mutation information obtained from the independent validation data. Details of this procedure are presented in Materials and methods.

### Mutation load estimation model for lung adenocarcinoma was constructed using only 24 genes

Using the lung adenocarcinoma somatic mutation data downloaded from the TCGA database, a computational framework was developed to construct the mutation load estimation model. After the nonsynonymous point mutations were selected, a mutation matrix with 13,526 genes and 230 patients was generated. Subsequently, based on the defined selection criteria (mutation frequency ≥ 10%, coding DNA sequence (CDS) length ≤ 15,000 bp, and Bonferroni-corrected p-value < 0.05 in the Wilcoxon test), 62 candidate genes were selected (Materials and methods, Supplementary Fig. 1, and Supplementary Table 1).
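The three selection criteria can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary layout, and the gene labels are hypothetical, mutation frequency is assumed to mean the fraction of patients with at least one counted mutation in the gene, and the raw Wilcoxon p-values are taken as precomputed inputs because the groups compared in that test are described in Materials and methods rather than here. The number of tests used for the Bonferroni correction is left as an explicit parameter.

```python
def mutation_frequency(patient_counts):
    """Fraction of patients with at least one counted mutation in the gene (assumed definition)."""
    return sum(1 for c in patient_counts if c > 0) / len(patient_counts)


def select_candidate_genes(genes, n_tests, min_freq=0.10, max_cds=15000, alpha=0.05):
    """Return the sorted names of genes passing all three criteria.

    genes maps a gene name to a dict with hypothetical keys:
      'counts'     -> per-patient mutation counts (one row of the mutation matrix)
      'cds_length' -> CDS length in bp
      'p_value'    -> raw (uncorrected) Wilcoxon p-value, computed elsewhere
    n_tests is the number of hypotheses used for the Bonferroni correction.
    """
    selected = []
    for name, info in genes.items():
        corrected_p = min(1.0, info["p_value"] * n_tests)  # Bonferroni correction
        if (mutation_frequency(info["counts"]) >= min_freq
                and info["cds_length"] <= max_cds
                and corrected_p < alpha):
            selected.append(name)
    return sorted(selected)


# Toy input with made-up values, only to show the call shape.
toy_genes = {
    "GENE_A": {"counts": [1, 0, 0, 0, 2, 0, 0, 0, 0, 0], "cds_length": 3000, "p_value": 1e-6},
    "GENE_B": {"counts": [1] + [0] * 19, "cds_length": 3000, "p_value": 1e-6},  # too rare
    "GENE_C": {"counts": [1, 1, 0, 0], "cds_length": 20000, "p_value": 1e-6},   # CDS too long
    "GENE_D": {"counts": [1, 1, 0, 0], "cds_length": 3000, "p_value": 0.01},    # borderline p
}
print(select_candidate_genes(toy_genes, n_tests=4))
```

Because the Bonferroni correction multiplies each raw p-value by the number of tests, a borderline gene such as the hypothetical `GENE_D` passes with `n_tests=4` but fails with `n_tests=10`.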

For the 62 candidate genes selected, there are $$2^{62} - 1$$ combinations of gene sets, resulting in $$2^{62} - 1$$ possible models. Based on the least squares parameter estimation and BIC for model selection (Materials and methods, Supplementary Methods), the most appropriate mutation load estimation model for lung adenocarcinoma was shown to contain only 24 genes, selected as follows:

$$\hat{y} = 68.72 \cdot \mathit{PXDNL} + 64.27 \cdot \mathit{NOTCH4} + \cdots + 27.1 \cdot \mathit{PAPPA2} + 22.57 \cdot \mathit{ZFHX4} + 47.24,$$
(1)

where $$\hat y$$ is the estimated mutation load using the 24-gene model. The complete list of genes and their corresponding parameters in the constructed estimation model are shown in Table 1. With the model constructed as shown by equation (1), the mutation counts in these 24 genes of a patient allow the estimation of the mutation load.
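Applying equation (1) amounts to a weighted sum of per-gene mutation counts plus a constant. The sketch below is illustrative only: it includes just the four coefficients and the constant term printed in equation (1), so its output is a partial sum and not the 24-gene estimate (the remaining 20 coefficients are listed in Table 1 and are not reproduced here). The function and variable names are hypothetical.

```python
# The four coefficients and the constant term printed in equation (1).
# The other 20 model genes (Table 1) are deliberately omitted, so results
# computed with this dictionary are NOT the full model's estimate.
PARTIAL_COEFFICIENTS = {"PXDNL": 68.72, "NOTCH4": 64.27, "PAPPA2": 27.1, "ZFHX4": 22.57}
CONSTANT_TERM = 47.24


def estimate_mutation_load(gene_counts, coefficients, constant):
    """Constant term plus the coefficient-weighted sum of mutation counts.

    gene_counts maps a gene name to the patient's mutation count in that gene;
    genes absent from the mapping are treated as having zero mutations.
    """
    return constant + sum(coef * gene_counts.get(gene, 0)
                          for gene, coef in coefficients.items())


# Hypothetical patient: one counted mutation in PXDNL and two in ZFHX4.
patient = {"PXDNL": 1, "ZFHX4": 2}
partial = estimate_mutation_load(patient, PARTIAL_COEFFICIENTS, CONSTANT_TERM)
print(round(partial, 2))  # 47.24 + 68.72 + 2 * 22.57 = 161.1
```

One consequence of the linear form is that a patient with no mutations in any model gene receives an estimate equal to the constant term, 47.24.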

### The constructed model for lung adenocarcinoma can be used for the precise estimation of the mutation load and accurate prediction of the immunotherapy treatment response

To evaluate the performance of the constructed model for lung adenocarcinoma, the mutation load for all patients in the training data from TCGA (n = 230) was estimated using this model. The R² between the estimated and actual mutation loads was 0.9336 (Supplementary Fig. 2), indicating that the estimated mutation loads correlate highly with the actual mutation loads. Additionally, to validate the constructed mutation load estimation model, its performance was tested on two independent validation datasets (n = 211; Materials and methods),10,16 and the R² between the estimated and actual mutation loads was 0.7626 for the independent validation cohort (Fig. 2a).
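For readers who want to reproduce this kind of comparison, the following is a minimal sketch of one common definition of R², the coefficient of determination computed from residual and total sums of squares. The text does not state which definition was used (the squared Pearson correlation is another possibility, and the two can differ on validation data), so this choice is an assumption, as is the function name.

```python
def r_squared(actual, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    actual and estimated are equal-length sequences of mutation loads.
    Assumes the actual values are not all identical (SS_tot > 0).
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - e) ** 2 for a, e in zip(actual, estimated))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy values only; these are not data from the study.
print(r_squared([1, 2, 3, 4], [1.5, 2, 3, 3.5]))  # 1 - 0.5 / 5 = 0.9
```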

### Performance verification by random models

Although we demonstrated that our estimation model for lung adenocarcinoma can precisely estimate the mutation load of a patient, and that the estimated mutation load is useful for predicting the response to cancer immunotherapy, we further verified the results by comparing them with those of models constructed using 24 randomly selected genes. Specifically, 24 genes were randomly selected from the generated mutation matrix, and a random model was constructed using the least squares parameter estimation method. The procedure was repeated 10,000 times, resulting in 10,000 random models. Subsequently, the performance of these random models was evaluated. The empirical distribution of R² between the estimated and actual mutation loads in the independent validation cohort for the 10,000 random models is presented in Fig. 3a. The R² of our constructed model (0.7626) was higher than the R² of every random model, giving an empirical p-value of p < 0.0001. Further, based on the random models and the immunotherapy treatment response data, the ROC curves for all 10,000 random models were plotted (Fig. 3b), and the empirical distribution of AUC is shown in Fig. 3c (empirical p-value = 0.0002). For each random model, the optimal discrimination threshold can also be identified using the ROC curve, allowing the classification accuracy to be determined. The empirical distribution of classification accuracy for the 10,000 random models is displayed in Fig. 3d, and the empirical p-value of our constructed model was 0.0001.
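The random-model comparison can be sketched as follows. This is an illustration under assumptions rather than the authors' code: the function names are hypothetical, the empirical p-value is taken to be the fraction of random models scoring at least as high as the constructed model (the exact tie convention is not stated in the text), and the AUC is computed with the pairwise-comparison formulation, which equals the area under the ROC curve.

```python
import random


def draw_random_gene_sets(all_genes, set_size, n_models, seed=0):
    """Draw n_models gene sets of set_size genes each, without replacement within a set."""
    rng = random.Random(seed)
    return [rng.sample(all_genes, set_size) for _ in range(n_models)]


def auc(scores_responders, scores_nonresponders):
    """Probability that a responder scores higher than a non-responder (ties count half)."""
    wins = 0.0
    for r in scores_responders:
        for n in scores_nonresponders:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(scores_responders) * len(scores_nonresponders))


def empirical_p_value(observed, random_scores):
    """Fraction of random-model scores that are at least as large as the observed score."""
    at_least = sum(1 for s in random_scores if s >= observed)
    return at_least / len(random_scores)


# Toy illustration: 10,000 made-up scores, two of which reach the observed value.
toy_scores = [0.10] * 9998 + [0.95] * 2
print(empirical_p_value(0.90, toy_scores))  # 2 / 10000 = 0.0002
```

With 10,000 random models the smallest non-zero empirical p-value under this convention is 1/10,000 = 0.0001, which is why a constructed model that beats every random model is reported as p < 0.0001.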

## Discussion

Immunotherapy using immune checkpoint inhibitors has emerged in recent years as a promising new approach to cancer treatment. However, some patients still do not respond to this type of therapy, and predictive biomarkers that can identify likely responders are being intensively studied, since this information can support medical decision-making. Previous studies demonstrated that the mutation load measured by whole-exome sequencing may predict the sensitivity to cancer immunotherapy. However, owing to its high cost and technical demands, the routine use of whole-exome sequencing is generally not feasible in medical institutions, which hinders its application as a standard clinical test. In this study, we developed a computational framework for constructing a mathematical model that estimates the mutation load of a patient from the genetic information on a small number of genes. The mutation load estimation model constructed for lung adenocarcinoma using only 24 genes was shown to allow the precise estimation of the mutation load and the highly accurate prediction of the immunotherapy response in lung adenocarcinoma patients, and this accuracy was shown to be similar to that of whole-exome sequencing. Furthermore, all performance indices demonstrated that our mutation load estimation model outperforms the random models, which shows the effectiveness of the computational framework proposed in this study.

Previous studies showed that commercial or institutional gene panels consisting of genes known or suspected to be relevant to cancer can be used to estimate the mutation load.25 However, the number of genes in these panels is considerably higher than that in our model, with as many as 170, 315, and 641 genes.12,14 Additionally, only four genes used in our lung adenocarcinoma model are currently included in other cancer gene panels, and only one of them is included in all three panels (Supplementary Table 6). This suggests that the majority of the genes used in our model are not well recognized as cancer-associated genes. Since the mutational profile of these 24 lung adenocarcinoma model genes was shown to be highly associated with the responses to cancer immunotherapy and with the mutation load, the role of these genes in cancer development and progression should be elucidated in future studies.

The genes used in our lung adenocarcinoma mutation load estimation model have a total CDS length of 187,188 bp, which is much shorter than that of the commercial or institutional gene panels.11,25,26 This is an additional advantage of a gene panel based on our mutation load estimation model, since panel cost depends on the total length of the genes selected. Our model should help decrease the cost and time required for panel analysis, which will further accelerate diagnosis and medical decision-making. Additionally, because a gene can have several transcripts and the CDS length retrieved from Ensembl BioMart is reported per transcript, the CDS length of the longest transcript was used when selecting the candidate genes. Therefore, if the panels are developed using the most common transcript of each gene, the total CDS length and cost can be reduced further. Moreover, mutational hotspots can also be considered when developing a gene panel to minimize the cost.

Although the cost can vary across different platforms, panel designs, analysis pipelines, and practices, we believe a customized targeted gene panel based on our 24-gene lung adenocarcinoma model may be a cost-effective solution for mutation load estimation and the prediction of responses to cancer immunotherapy in lung adenocarcinoma patients. A previous study directly compared the cost of a targeted sequencing panel (Einstein_v1, with a targeted region of 4.98 Mb) with that of whole-exome sequencing, using the same sequencing platform.27 The cost of Einstein_v1 was shown to be approximately one-fourth of that of whole-exome sequencing (USD $281.25 vs. $1266). The targeted region in our 24-gene model (approximately 0.2 Mb) is much smaller than that in Einstein_v1, and the cost can be anticipated to decrease further. Additionally, the targeted gene panel approach shortens the turnaround time. A previous study estimated that the data processing CPU time for a 90-gene panel is one-twentieth of that needed for whole-exome sequencing (5 vs. 100 h).28 Furthermore, the targeted gene panel approach can substantially increase the throughput because of its high multiplexing capabilities. For example, in the aforementioned study comparing Einstein_v1 and whole-exome sequencing, whole-exome sequencing allowed only three samples per lane, whereas the targeted sequencing panel Einstein_v1 could analyze 16 samples per lane.27 These are all important issues determining the clinical applicability of a test.
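As a quick arithmetic check of the ratios quoted in this paragraph, using only the figures cited from the two previous studies (cost, CPU time, and samples per lane, in that order):

$$\frac{281.25}{1266} \approx 0.22, \qquad \frac{5\ \mathrm{h}}{100\ \mathrm{h}} = \frac{1}{20}, \qquad \frac{16}{3} \approx 5.3$$

That is, the panel cost was roughly 22% of the whole-exome cost, the processing time was one-twentieth, and the per-lane throughput was about five times higher.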

A limitation of our study is the relatively small number of cases: the immunotherapy treatment response data for lung adenocarcinoma patients included only 28 cases,10 and a larger number of cases is therefore required to validate the performance of the treatment response prediction. Furthermore, the datasets in this study were mostly obtained from Caucasian populations, and the performance of our model in other ethnicities should be tested. Recently, in addition to the mutation load, other features such as microsatellite instability and neoantigen burden have emerged as potential predictive biomarkers for cancer immunotherapy treatment response.29,30,31 Therefore, strategies that integrate different features may provide more effective biomarkers for the accurate prediction of cancer immunotherapy response in the future.32

In summary, we have proposed a computational framework and successfully constructed a mathematical model using only 24 genes that can be used to estimate the mutation load in lung adenocarcinoma precisely. The estimated mutation load can be used to predict the clinical outcome of cancer immunotherapy with high accuracy. Therefore, a customized panel for the targeted sequencing of these selected genes can be designed and used instead of whole-exome sequencing. Consequently, by using our mutation load estimation model, the cost and time needed to assess the mutation load should decrease considerably, and cancer immunotherapy response prediction should become more accessible in the standard clinical setting.

## Materials and methods

### Data used for model construction

Genomics data, specifically somatic mutation information, were used for the construction of the mutation load estimation model. As the training data for the construction of the lung adenocarcinoma model, the somatic mutation data were downloaded from the TCGA database (n = 230).15 As the validation data, the somatic mutation data from two independent studies were retrieved (n = 181 for Imielinski et al.;16 n = 30 for Rizvi et al.,10 excluding four patients with squamous cell carcinoma). Additionally, we retrieved the data showing the treatment responses to anti-PD-1 immunotherapy.10 For the melanoma model, the somatic mutation data were obtained from the TCGA database (n = 333)21 as the training data. The somatic mutation information from four independent studies (n = 333)17,18,19,20 and the clinical outcomes of melanoma patients treated with anti-CTLA-4 (Snyder et al. (n = 64)17 and Van Allen et al. (n = 110)18) or anti-PD-1 therapy (Hugo et al. (n = 38)20) were used as the validation data for the melanoma model. For the colorectal model, the somatic mutation data obtained from the TCGA database (n = 536)22 were used as the training data, while the validation data were the mutation data retrieved from two independent studies (n = 619 for Giannakis et al.;23 n = 72 for Seshagiri et al.24).

### Selection of nonsynonymous point mutations and the construction of mutation matrix

Since the number of nonsynonymous point mutations has been demonstrated to be associated with the clinical benefits of immunotherapy,10 the first step was to select the nonsynonymous point mutations from the training data downloaded from TCGA. Here, the column “Variant_Classification” indicates the translational effect of a variant. There are 11 different types of variant classification in the TCGA lung adenocarcinoma somatic mutation data, and three of them (nonsense mutation, nonstop mutation, and missense mutation) are considered nonsynonymous point mutations. The mutations of these three types were selected and used for mutation matrix construction. The mutation matrix is an m × n matrix, where m indicates the number of genes and n represents the number of patients. Each element in the mutation matrix specifies the number of nonsynonymous point mutations in a gene in one patient. Following the selection of the nonsynonymous point mutations, the “Variant_Type” information in the TCGA somatic mutation raw data, which shows the variant type, was used to calculate the mutation count. The types of variants used here were single-nucleotide polymorphism (SNP), double-nucleotide polymorphism (DNP), and tri-nucleotide polymorphism (TNP), indicating mutations in one, two, or three consecutive nucleotides, respectively. Accordingly, the mutation count was one, two, and three for SNP, DNP, and TNP, respectively. The sum of all mutation counts of a gene in a patient represented the total number of nonsynonymous point mutations. For example, three SNPs, two DNPs, and one TNP in gene A of a patient gave 3 × 1 + 2 × 2 + 1 × 3 = 10 nonsynonymous point mutations in gene A. In this way, the number of nonsynonymous point mutations in each gene for each patient was calculated, generating the mutation matrix.

### Construction of the mutation load estimation model

Based on the selected candidate genes, a linear mathematical model was used to estimate the mutation load:

$$y_m = c + \sum_{i=1}^{n} a_i \cdot x_{mi} + e_m$$
(2)

where $$y_m$$ is the mutation load of the m-th patient, $$x_{mi}$$, i = 1, …, n, indicates the mutation count of the selected model gene i in the m-th patient, $$a_i$$, i = 1, …, n, represents the weighting of each selected model gene i on the mutation load, c specifies the constant term, and $$e_m$$ is the model uncertainty for the m-th patient. The equation shows that the mutation load of a patient can be calculated using the mutation counts of the selected model genes multiplied by the corresponding weightings and adding the constant term and the model uncertainty.

In the mutation load estimation model shown in equation (2), the mutation load $$y_m$$ and the mutation counts of the selected genes $$x_{mi}$$ can be obtained from the generated mutation matrix. On the other hand, the weighting of each selected gene $$a_i$$ and the constant term c are the model parameters that had to be identified. The least squares parameter estimation method was employed for parameter identification, and BIC was used for model selection. BIC is a model selection criterion widely used in the field of system identification.35 It measures the trade-off between the estimation error and model complexity. A model with a lower BIC value can estimate the mutation load more precisely without including too many genes. Therefore, the model with the minimal BIC value was selected as the most appropriate mutation load estimation model. Details are presented in Supplementary Methods.
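The two steps, least squares fitting of equation (2) and scoring a fitted model with BIC, can be sketched in plain Python. This is an illustration under assumptions: the function names are hypothetical, the normal equations are solved by Gauss-Jordan elimination purely to keep the example dependency-free, and the BIC is written in one common Gaussian form, n·ln(RSS/n) + k·ln(n), which may differ from the exact expression given in Supplementary Methods. The procedure used to search over gene subsets is also described there and is not reproduced here.

```python
import math


def fit_least_squares(X, y):
    """Ordinary least squares for equation (2); returns [c, a_1, ..., a_n].

    X is a list of rows (one per patient) of mutation counts for the chosen
    genes, and y is the list of mutation loads. Assumes the design matrix has
    full column rank (otherwise a zero pivot raises ZeroDivisionError).
    """
    rows = [[1.0] + [float(v) for v in row] for row in X]  # prepend constant term
    k = len(rows[0])
    # Normal equations (A^T A) beta = A^T y, stored as an augmented matrix.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def bic(X, y, beta):
    """BIC in the assumed form n*ln(RSS/n) + k*ln(n); requires RSS > 0."""
    n = len(y)
    rss = sum((t - (beta[0] + sum(b * v for b, v in zip(beta[1:], row)))) ** 2
              for row, t in zip(X, y))
    return n * math.log(rss / n) + len(beta) * math.log(n)


# Toy data (not from the study): the load depends on an informative gene only.
y_toy = [10.1, 14.8, 20.1, 25.1, 29.8, 35.1]
X_informative = [[0], [1], [2], [3], [4], [5]]
X_uninformative = [[1], [0], [1], [0], [1], [0]]
for name, X in (("informative", X_informative), ("uninformative", X_uninformative)):
    beta = fit_least_squares(X, y_toy)
    print(name, [round(b, 2) for b in beta], round(bic(X, y_toy, beta), 2))
```

With the same number of parameters, the subset that explains the mutation load better has the lower BIC and would be preferred. For scale, 62 candidate genes give 2^62 − 1, or roughly 4.6 × 10^18, non-empty subsets, so scoring every subset one at a time in this way would be impractical.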

### Statistical analysis

Differences in mutation loads were examined using the Mann–Whitney U-test or the Kruskal–Wallis exact test. The log-rank test was used to compare Kaplan–Meier survival curves. A Cox proportional-hazards regression model was used to estimate hazard ratios and their associated 95% confidence intervals.

### Data availability

All data used in this study were publicly available prior to analysis (Materials and methods).

### Code availability

The code for mutation load estimation model construction is available upon request.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## References

1. Gubin, M. M. et al. Checkpoint blockade cancer immunotherapy targets tumour-specific mutant antigens. Nature 515, 577–581 (2014).
2. Pardoll, D. M. The blockade of immune checkpoints in cancer immunotherapy. Nat. Rev. Cancer 12, 252–264 (2012).
3. Postow, M. A., Callahan, M. K. & Wolchok, J. D. Immune checkpoint blockade in cancer therapy. J. Clin. Oncol. 33, 1974–1982 (2015).
4. Homet Moreno, B. & Ribas, A. Anti-programmed cell death protein-1/ligand-1 therapy in different cancers. Br. J. Cancer 112, 1421–1427 (2015).
5. Topalian, S. L., Drake, C. G. & Pardoll, D. M. Immune checkpoint blockade: a common denominator approach to cancer therapy. Cancer Cell 27, 450–461 (2015).
6. Topalian, S. L., Taube, J. M., Anders, R. A. & Pardoll, D. M. Mechanism-driven biomarkers to guide immune checkpoint blockade in cancer therapy. Nat. Rev. Cancer 16, 275–287 (2016).
7. Lipson, E. J. et al. Antagonists of PD-1 and PD-L1 in cancer treatment. Semin. Oncol. 42, 587–600 (2015).
8. Topalian, S. L. et al. Safety, activity, and immune correlates of anti-PD-1 antibody in cancer. N. Engl. J. Med. 366, 2443–2454 (2012).
9. Prieto, P. A. et al. CTLA-4 blockade with ipilimumab: long-term follow-up of 177 patients with metastatic melanoma. Clin. Cancer Res. 18, 2039–2047 (2012).
10. Rizvi, N. A. et al. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science 348, 124–128 (2015).
11. Cheng, D. T. et al. Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT): a hybridization capture-based next-generation sequencing clinical assay for solid tumor molecular oncology. J. Mol. Diagn. 17, 251–264 (2015).
12. Campesato, L. F. et al. Comprehensive cancer-gene panels can be used to estimate mutational load and predict clinical benefit to PD-1 blockade in clinical practice. Oncotarget 6, 34221–34227 (2015).
13. Johnson, D. B. et al. Targeted next generation sequencing identifies markers of response to PD-1 blockade. Cancer Immunol. Res. 4, 959–967 (2016).
14. Roszik, J. et al. Novel algorithmic approach predicts tumor mutation load and correlates with immunotherapy clinical outcomes using a defined gene mutation set. BMC Med. 14, 168 (2016).
15. The Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543–550 (2014).
16. Imielinski, M. et al. Mapping the hallmarks of lung adenocarcinoma with massively parallel sequencing. Cell 150, 1107–1120 (2012).
17. Snyder, A. et al. Genetic basis for clinical response to CTLA-4 blockade in melanoma. N. Engl. J. Med. 371, 2189–2199 (2014).
18. Van Allen, E. M. et al. Genomic correlates of response to CTLA-4 blockade in metastatic melanoma. Science 350, 207–211 (2015).
19. Hodi, F. S. et al. Improved survival with ipilimumab in patients with metastatic melanoma. N. Engl. J. Med. 363, 711–723 (2010).
20. Hugo, W. et al. Genomic and transcriptomic features of response to anti-PD-1 therapy in metastatic melanoma. Cell 165, 35–44 (2016).
21. The Cancer Genome Atlas Network. Genomic classification of cutaneous melanoma. Cell 161, 1681–1696 (2015).
22. The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330–337 (2012).
23. Giannakis, M. et al. Genomic correlates of immune-cell infiltrates in colorectal carcinoma. Cell Rep. 15, 857–865 (2016).
24. Seshagiri, S. et al. Recurrent R-spondin fusions in colon cancer. Nature 488, 660–664 (2012).
25. Chalmers, Z. R. et al. Analysis of 100,000 human cancer genomes reveals the landscape of tumor mutational burden. Genome Med. 9, 34 (2017).
26. Chen, K. et al. Clinical actionability enhanced through deep targeted sequencing of solid tumors. Clin. Chem. 61, 544–553 (2015).
27. Delio, M. et al. Development of a targeted multi-disorder high-throughput sequencing assay for the effective identification of disease-causing variants. PLoS ONE 10, e0133742 (2015).
28. van Nimwegen, K. J. et al. Is the \$1000 genome as near as we think? A cost analysis of next-generation sequencing. Clin. Chem. 62, 1458–1464 (2016).
29. Dudley, J. C., Lin, M. T., Le, D. T. & Eshleman, J. R. Microsatellite instability as a biomarker for PD-1 blockade. Clin. Cancer Res. 22, 813–820 (2016).
30. McGranahan, N. et al. Clonal neoantigens elicit T cell immunoreactivity and sensitivity to immune checkpoint blockade. Science 351, 1463–1469 (2016).
31. Charoentong, P. et al. Pan-cancer immunogenomic analyses reveal genotype-immunophenotype relationships and predictors of response to checkpoint blockade. Cell Rep. 18, 248–262 (2017).
32. Gibney, G. T., Weiner, L. M. & Atkins, M. B. Predictive biomarkers for checkpoint inhibitor-based immunotherapy. Lancet Oncol. 17, e542–e551 (2016).
33. International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature 431, 931–945 (2004).
34. Kinsella, R. J. et al. Ensembl BioMarts: a hub for data retrieval across taxonomic space. Database 2011, bar030 (2011).
35. Johnson, J. B. & Omland, K. S. Model selection in ecology and evolution. Trends Ecol. Evol. 19, 101–108 (2004).

## Acknowledgements

This work was supported by Ministry of Science and Technology, Taiwan (MOST 104-2221-E-010-008-MY2, MOST 106-2221-E-010-019-MY3).

## Author information

### Affiliations

1. Institute of Biomedical Informatics, National Yang-Ming University, Taipei, 11221, Taiwan: Guan-Yi Lyu & Yu-Chao Wang
2. Department of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, 11221, Taiwan: Yu-Hsuan Yeh
3. Department of Pathology and Laboratory Medicine, Taipei Veterans General Hospital, Taipei, 11217, Taiwan: Yi-Chen Yeh
4. School of Medicine, National Yang-Ming University, Taipei, 11221, Taiwan: Yi-Chen Yeh
5. Center for Systems and Synthetic Biology, National Yang-Ming University, Taipei, 11221, Taiwan: Yu-Chao Wang

### Contributions

Y.-C.Y. and Y.-C.W. conceived of the study. G.-Y.L. and Y.-C.W. developed the method. G.-Y.L., Y.-H.Y., Y.-C.Y., and Y.-C.W. analyzed the data. G.-Y.L., Y.-C.Y., and Y.-C.W. wrote the manuscript.

### Competing interests

The authors declare that they have no competing interests.

### Corresponding authors

Correspondence to Yi-Chen Yeh or Yu-Chao Wang.