Whole blood mRNA expression-based targets to discriminate active tuberculosis from latent infection and other pulmonary diseases

Current diagnostic tests for tuberculosis (TB) cannot predict progression from latent TB infection (LTBI) to reactivation disease. The main barrier to predicting reactivation disease is our limited understanding of the host biomarkers associated with progression from latent infection to active disease. Here, we applied immune-based gene expression profiling on the NanoString platform to identify whole blood markers that can distinguish active TB from other pulmonary diseases (OPD), and that could be further evaluated as predictors of reactivation TB. Among 23 candidate genes that differentiated patients with active TB from those with OPD, nine genes (CD274, CEACAM1, CR1, FCGR1A/B, IFITM1, IRAK3, LILRA6, MAPK14, PDCD1LG2) demonstrated sensitivity and specificity of 100%. Seven genes (C1QB, C2, CCR2, CCRL2, LILRB4, MAPK14, MSR1) distinguished TB from LTBI with sensitivity and specificity between 82 and 100%. This study identified single gene candidates that distinguished TB from OPD and LTBI with high sensitivity and specificity (both > 82%), which may be further evaluated as diagnostic markers for disease and as predictive markers for reactivation TB.


Scientific Reports | (2020) 10:22072 | https://doi.org/10.1038/s41598-020-78793-2 | www.nature.com/scientificreports/

TB diagnosis has extensively been researched 10 and many of them have focused on distinguishing latent infection from active TB 11-13. Unfortunately, none of these gene signatures has so far been translated into a point-of-care (POC) diagnostic test. The translation of gene signature-based assays into clinical practice is hampered by the difficulty of determining which of the multiple gene signatures can be implemented on a diagnostic platform that is simple and cost-effective.
Here, we report the results of an immune-based gene expression profiling study using NanoString technology in patients with active TB and other pulmonary diseases (OPD), healthy donors with latent TB infection (LTBI), and uninfected healthy controls (HC). The aim of this study was to identify whole blood markers that can distinguish active TB from OPD, HC, and LTBI. We identified 23 and seven genes associated with inflammatory mechanisms that distinguished patients with TB from those with OPD and LTBI, respectively, with high sensitivity and specificity.

Results
Demographic and clinical characteristics of the study population. The demographic, clinical, and laboratory features of the 35 study participants are shown in Table 1. Of the 17 TB patients, 13 (76.5%) had a positive sputum smear test, three were positive by Mtb culture, and one had TB confirmed by an Mtb molecular test (Xpert MTB/RIF). Of all TB patients, eight (47.1%) were screened by Mtb culture. The median age was 41.9 (± 14.04) years in the TB group, 42.7 (± 17.06) in the LTBI group, 43.8 (± 9.70) in the OPD group, and 32.5 (± 3.53) in the HC group.
Sample clustering. We evaluated 594 inflammatory genes in whole blood from 17 TB patients and 18 controls (seven with LTBI, six HC, and five with OPD). We further organized these groups to identify whole blood biomarkers to diagnose active TB (TB vs. OPD) and candidates to predict TB reactivation (LTBI vs. TB). First, we evaluated all four study groups together to verify whether the gene panel could distinguish them. Figure 1 shows a heatmap of the normalized data generated via unsupervised hierarchical clustering. The mRNA expression levels of 46 of the 594 genes segregated the study participants into two large groups. Transcripts that showed increased expression (red) clustered among TB patients, while those that showed decreased expression clustered among the non-active TB groups. Two individuals, one from the LTBI group (LTBI1) and one from the HC group (HC3), clustered with patients with active TB.
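The clustering step can be illustrated with a small sketch. The Methods describe hierarchical clustering with a Pearson correlation coefficient distance; the code below assumes that distance is defined as 1 - r, which is a common convention but is not stated in the text. The gene names and expression values are hypothetical and are not study data.

```python
import math

def pearson_distance(x, y):
    """Return 1 - r, where r is the Pearson correlation of two equal-length profiles.

    Assumption: the distance is 1 - r; the paper does not give the exact formula.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / math.sqrt(var_x * var_y)

# Hypothetical normalized expression profiles across four samples.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],  # rises together with gene_a
    "gene_c": [4.0, 3.0, 2.0, 1.0],  # falls as gene_a rises
}
d_ab = pearson_distance(profiles["gene_a"], profiles["gene_b"])
d_ac = pearson_distance(profiles["gene_a"], profiles["gene_c"])
print(d_ab, d_ac)
```

Under this definition, profiles that rise and fall together have a distance near 0 and are merged early in the dendrogram, while opposing profiles have a distance near 2.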
Gene expression data of TB and OPD donors. Asthma is a chronic non-infectious inflammatory disease of the airways and needs to be promptly distinguished from TB by healthcare providers. We identified 23 candidate genes that differentiated most of the TB patients from patients with asthma (OPD group) (p < 0.001 and fold change [FC] > 2) (Fig. 2A). Principal component analysis (PCA) of the gene expression data showed significant separation between TB and OPD patients (Fig. 2B). The findings are also presented as volcano plots of all data, with genes displayed in orange at a significance level of p < 0.05 and a log2-fold change higher than 2 for both groups (Fig. 2C). These analyses identified genes that can be used to distinguish TB and OPD patients, including CD274, PDCD1LG2, and FCGR1A/B (p-value < 0.0001 and log2-fold change ratio > 2.6) (Fig. 2C).
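The selection rule used for the heatmap (p < 0.001 and FC > 2) can be sketched as a simple filter. This is an illustration under stated assumptions, not the authors' NanoStringNorm workflow: the fold change is taken as the ratio of group means, the FC threshold is applied in both directions, the p-values are assumed to be precomputed, and all gene names and numbers are hypothetical.

```python
import math

def log2_fold_change(case_values, control_values):
    """log2 of the ratio of group means (case over control)."""
    case_mean = sum(case_values) / len(case_values)
    control_mean = sum(control_values) / len(control_values)
    return math.log2(case_mean / control_mean)

def select_candidates(results, p_cutoff=0.001, fc_cutoff=2.0):
    """Keep genes with p < p_cutoff and an absolute fold change > fc_cutoff.

    `results` maps gene name -> (log2 fold change, p-value).
    """
    min_abs_log2fc = math.log2(fc_cutoff)
    return [gene for gene, (log2fc, p) in results.items()
            if p < p_cutoff and abs(log2fc) > min_abs_log2fc]

# Hypothetical normalized counts (case, control) and precomputed p-values.
counts = {
    "gene_x": ([800.0, 1200.0], [100.0, 150.0]),
    "gene_y": ([300.0, 300.0], [200.0, 200.0]),
    "gene_z": ([900.0, 1100.0], [200.0, 300.0]),
}
p_values = {"gene_x": 0.0002, "gene_y": 0.0005, "gene_z": 0.02}

results = {gene: (log2_fold_change(case, control), p_values[gene])
           for gene, (case, control) in counts.items()}
selected = select_candidates(results)
print(selected)
```

In this toy input, gene_y fails the fold-change threshold and gene_z fails the p-value threshold, so only gene_x is retained.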

Gene expression data of TB and LTBI donors.
We also compared gene expression levels between the TB and LTBI groups to identify candidate markers able to differentiate them. Both the heatmap (Fig. 3A) and PCA (Fig. 3B) show seven of 594 inflammatory genes that significantly differentiate these groups (p < 0.001 and FC > 2). Volcano plot analyses revealed two promising genes (CCR2 and C1QB, p-value < 0.0001 and log2-fold change ratio > 1.1 and 2.4, respectively) that can be further tested as possible markers of TB reactivation (Fig. 3C).
Receiver operating characteristic (ROC) curve analysis. ROC analysis was used to evaluate the individual discriminatory performance of the genes that showed a p-value < 0.001 on the heatmap for each study group comparison. The values of area under the curve (AUC), sensitivity, specificity, and the optimal cut-off points were calculated for TB diagnostic tests (Table 2) and for differentiating TB and LTBI subjects (Supplementary Fig. S1). Table 3 presents seven possible candidates to be further evaluated as predictors of TB progression, including CCR2, which showed an AUC = 1.0 and both sensitivity and specificity of 100% (see Supplementary Fig. S2).
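The quantities reported from the ROC analysis can be illustrated with a minimal sketch. The text does not say how the optimal cut-off was chosen, so the code assumes Youden's J statistic (sensitivity + specificity - 1) and assumes that higher expression indicates TB; the function name and the expression values are hypothetical and are not study data.

```python
def roc_summary(case_scores, control_scores):
    """Return the AUC and the cut-off that maximizes Youden's J.

    Assumptions: higher scores indicate a case, and a sample is called
    positive when its score is >= the cut-off.
    """
    # AUC as the probability that a random case scores above a random
    # control, counting ties as one half.
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in case_scores for k in control_scores)
    auc = wins / (len(case_scores) * len(control_scores))

    best = None
    for cutoff in sorted(set(case_scores) | set(control_scores)):
        sensitivity = sum(c >= cutoff for c in case_scores) / len(case_scores)
        specificity = sum(k < cutoff for k in control_scores) / len(control_scores)
        youden_j = sensitivity + specificity - 1
        if best is None or youden_j > best["youden_j"]:
            best = {"cutoff": cutoff, "sensitivity": sensitivity,
                    "specificity": specificity, "youden_j": youden_j}
    return auc, best

# Hypothetical normalized expression of one gene in two groups.
tb_scores = [9.1, 8.7, 8.9, 9.4]
ltbi_scores = [7.2, 7.9, 6.8]
auc, best = roc_summary(tb_scores, ltbi_scores)
print(auc, best)

# A second hypothetical gene whose groups overlap.
auc_overlap, best_overlap = roc_summary([3.0, 5.0], [2.0, 4.0])
print(auc_overlap, best_overlap)
```

When the two groups do not overlap, as in the first toy example, the AUC is 1.0 and a cut-off exists with 100% sensitivity and specificity. With small groups such perfect separation is easier to obtain by chance, which is one reason validation in a larger cohort matters.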

Discussion
The World Health Organization (WHO) identified the need for a non-sputum-based test as a high priority for TB diagnosis and suggested that a rapid biomarker-based test should be easy to perform and implement at health posts; should increase the number of patients diagnosed with TB; should have sensitivity > 98% among patients with smear-positive, culture-positive pulmonary TB and ≥ 68% among those with smear-negative, culture-positive pulmonary TB in adults; and would ideally be able to diagnose adults and children, and pulmonary and extrapulmonary TB alike 14. Here, we performed a multiplex gene expression analysis of more than 500 inflammatory genes in whole blood samples in a single assay. With this approach, of the 30 genes identified herein, 23 were candidate targets to diagnose active TB and seven can be validated as biomarkers to distinguish LTBI from TB. All 30 genes showed sensitivity and specificity > 82% and ROC AUC > 0.8. A major challenge in interrupting the TB transmission cycle is predicting when an individual with LTBI will develop active TB. Here, we identified seven genes that were able to discriminate TB patients from LTBI individuals, all presenting high sensitivity and specificity in ROC curve analysis (Table 3). The expression of five (CCRL2, C1QB, C2, LILRB4, and CCR2) of the seven genes placed the donor TB8 (a TB patient) in the cluster enriched for the LTBI group (Fig. 3). It is possible that the other two genes (MSR1 and MAPK14), which shared a pattern of expression similar to that of the TB patients, may be the first set of genes to undergo a change in expression level during progression to active TB. To confirm these findings, the expression of these genes needs to be evaluated in a cohort of LTBI subjects.
We identified 30 candidate genes to be further tested for TB diagnosis and as biomarkers for TB progression. From 23 genes suggested to be suitable for TB diagnosis, ten were related to adaptive immune response, ten were involved in innate immune response, and the other three genes (JAK2, JAK3, and LY96) were not specifically related to either. Conversely, for TB progression, five of seven genes were components of the innate immune system and were increased in TB patients relative to LTBI volunteers (Table 4). These data suggest the involvement of activation of the innate immune response during progression to active TB in latently infected subjects.
Previously identified genes that can discriminate TB patients from non-TB patients and indicate TB risk 11,13,15-25 either do not meet the minimum sensitivity requirements in adults regardless of HIV status for a POC test (95% in smear-positive culture-confirmed cases and 60-80% in smear-negative culture-confirmed cases), or they rely on gene signature-based tests, which are very difficult to implement. Here, although the number of participants was a limitation, we identified single candidate genes for TB diagnosis and progression, all of them presenting high AUC, sensitivity, and specificity. This study provides valuable information for the development of new diagnostic tests for TB. Once validated in a larger population-based study, the expression of the genes identified herein could form the basis of new tools that overcome the limitations of the currently available diagnostic tests, including low sensitivity, long turnaround times, and the need for sputum sample collection. In addition, some of the genes can distinguish sick people with TB from those latently infected. These targets need to be further validated as possible biomarkers to predict TB reactivation in a prospective cohort study.

Methods
Study participants. Subjects were recruited between November 2015 and December 2016. Written informed consent was obtained from all participants. Our study included 35 participants: 17 with active TB and 18 controls, of whom seven were healthy donors with latent M. tuberculosis infection (LTBI), six were uninfected healthy controls (HC), and five were patients with asthma (OPD). All participants were recruited at the Instituto Brasileiro para Investigação de Tuberculose (IBIT), Bahia, Brazil and 2° Centro de Saúde Rodrigo Argolo, Bahia, Brazil. TB patients were confirmed to have active pulmonary TB by chest X-ray and at least one positive result on sputum smear microscopy and/or culture. Symptomatic patients with negative sputum smear microscopy had TB confirmed by culture. TB patients who were not screened by sputum smear microscopy and/or culture were diagnosed with the Xpert MTB/RIF system. Blood samples were collected prior to TB treatment. Household contacts of TB patients were assigned to either the LTBI or the HC group according to the QuantiFERON-TB (QFT) Gold In-Tube test. Those with a negative QFT Gold In-Tube test (cut-off ≤ 0.35 IU/mL) were considered healthy controls, while household contacts with positive results (cut-off > 0.35 IU/mL) were considered LTBI patients. The OPD group was composed of patients who sought care with suspected pulmonary TB but were negative on both sputum smear microscopy and culture. Individuals who tested positive for human immunodeficiency virus and patients taking immunosuppressive drugs were excluded. All subjects were between 18 and 65 years old.

RNA isolation.
For each donor, we collected 2.5 mL peripheral blood in a PAXgene blood RNA tube (Pre-AnalytiX). RNA was isolated and purified with the PAXgene Blood RNA kit (Qiagen), according to the manu-

Data analysis.
The files corresponding to each cartridge were initially analyzed in nSolver Software (NanoString Technologies) for quality control assessment. Then, we analyzed the data in the R statistical environment (version 3.6.3) 26. Distributions of raw counts were evaluated with the quantro package 27. Normalization and differential expression analysis were carried out with the NanoStringNorm package 28. Raw data were normalized with the geometric mean of positive control and housekeeping genes. Hierarchical clustering of differentially expressed genes, using a Pearson correlation coefficient distance, was performed with the ComplexHeatmap package 29. The ability