Tuberculosis (TB) still poses a profound burden on global health, owing to significant morbidity and mortality worldwide. Although a fully functional immune system is essential for the control of Mycobacterium tuberculosis infection, the underlying mechanisms and reasons for failure in part of the infected population remain enigmatic. Here, whole-blood microarray gene expression analyses were performed in TB patients and in latently infected as well as uninfected healthy controls to define biomarkers predictive of susceptibility and resistance. Fc gamma receptor 1B (FCGR1B) was identified as the most differentially expressed gene, and, in combination with four other markers, produced a high degree of accuracy in discriminating TB patients and latently infected donors. We determined differentially expressed genes unique for active disease and identified profiles that correlated with susceptibility and resistance to TB. Elevated expression of innate immune-related genes in active TB and higher expression of particular gene clusters involved in apoptosis and natural killer cell activity in latently infected donors are likely to be the major distinctive factors determining failure or success in controlling M. tuberculosis infection. The gene expression profiles defined in this study provide valuable clues for better understanding of progression from latent infection to active disease and pave the way for defining predictive correlates of protection in TB.
Tuberculosis (TB) is an ancient disease caused by the bacterial pathogen Mycobacterium tuberculosis, which has co-evolved with the human population for millennia.1 Infection with M. tuberculosis is controlled in most infected individuals, while in a minority of cases containment of infection fails over time, and active TB disease develops.2 Although only a minority of exposed individuals develop active disease, morbidity and mortality in humans worldwide remain high, with nearly 2 million deaths annually.3, 4 In the face of this major impact on global health, remarkably little progress has been made in the past towards the development of new tools and elucidation of key features involved in protection and pathogenesis.5, 6, 7
It is generally accepted that a competent immune response is key to control of M. tuberculosis infection.8 Although the adaptive immune response induces development of solid granulomas, which contain the bacteria,9 control of latent M. tuberculosis infection also requires robust innate immunity. The precise mechanisms by which these innate processes contribute to the control of latent infection remain elusive.10, 11 In part of the infected population, latent infection progresses to active disease. Once immune control of infection wanes, granulomas liquefy and containment breaks down. As a result, actively replicating M. tuberculosis are released into the airways, rendering the disease contagious.12 Sensing of M. tuberculosis by macrophages and dendritic cells via pattern recognition receptors such as the Toll-like receptor family triggers signalling cascades, which initiate cytokine secretion, suggesting a role in protection against M. tuberculosis.13 However, the exact role of Toll-like receptors and other pattern recognition receptors in immunity against M. tuberculosis remains controversial.11, 14 Bacterial killing is a key function of macrophages, which are activated by cytokines, notably tumour necrosis factor alpha and interferon gamma, produced mainly by T cells and natural killer (NK) cells. NK cells are well-known key players in innate immunity, which actively induce death of M. tuberculosis-infected monocytes.15
It has been proposed to reduce TB incidence to <1 per 10⁶ inhabitants by 2050.4 To reach this ambitious goal, better intervention measures are needed.7 It has been calculated that a combination of tools for rapid diagnosis, efficacious vaccines and novel drugs will be needed to reduce current TB incidence by >90%.16 Biomarkers will be of critical importance for this approach as they can facilitate the identification of targets for new drugs and vaccines, allow better monitoring of clinical trials and provide guidelines for rapid diagnosis. In the present study, we evaluated transcriptional signatures in TB patients, healthy donors with latent M. tuberculosis infection (LTBI) and healthy non-infected donors (NIDs) in an attempt to define biomarkers predictive of susceptibility for, or resistance against, active TB.
Gene expression profiling in TB patients and household contacts (LTBI and NID)
Genome-wide transcription profiles in whole blood from 33 TB patients, 34 LTBI and from 9 NIDs from a TB-endemic area were generated using microarray chips comprising ∼45 000 unique features. Our strategy is aimed at analysing three separate comparisons to identify differentially expressed genes between study groups, that is, TB versus LTBI, TB versus NID and LTBI versus NIDs (Figure 1). Our ultimate goal is the identification of unique profiles that are associated with active TB. For our first goal, that is, identifying disease-related gene profiles, we focussed on comparisons of TB with LTBI as well as TB with NID. We proposed that these comparisons define patterns that distinguish between active TB disease and health. For our second goal, that is, identifying profiles that correlate with resistance and susceptibility, we focussed on genes that are differentially expressed in TB versus LTBI, but not in TB versus NID. This approach filtered out general disease-related patterns.
A total of 2048 differentially expressed transcripts (representing 1935 genes) were identified between TB patients and LTBI. These included 988 transcripts (927 genes) that were also differentially expressed between TB patients and NIDs. In this analysis, a fold-change cutoff of >0.2 or <−0.2 (log 2 scale) with q<0.01 was used (q-value equals the raw P-value corrected for multiple testing). A clustering analysis was performed to interrogate whether donors within this study could be divided into distinct groups based on their gene expression profiles. The resulting two-way clustered heat map, showing intensity levels in all donors, is depicted in Supplementary Figure 1. This analysis revealed clustering of most of the TB donors, whereas no obvious clusters of LTBI and NIDs were identified. Thus, there was a discriminative pattern between TB versus healthy individuals with or without M. tuberculosis infection, but no apparent difference between LTBI and NID.
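The selection rule quoted above (a log2 fold change above 0.2 or below −0.2, together with q<0.01) can be illustrated with a minimal Python sketch. The `Transcript` class, the probe names and all numerical values are hypothetical and serve only to show how the two thresholds combine; they are not data from this study.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    name: str        # hypothetical probe identifier
    log2_fc: float   # mean log2 fold change (M-value) between two groups
    q_value: float   # P-value corrected for multiple testing


def is_differentially_expressed(t, fc_cutoff=0.2, q_cutoff=0.01):
    """Apply both thresholds from the text: |log2 FC| > 0.2 and q < 0.01."""
    return abs(t.log2_fc) > fc_cutoff and t.q_value < q_cutoff


# Made-up values for illustration only.
transcripts = [
    Transcript("probe_A", 1.35, 0.0004),
    Transcript("probe_B", -0.45, 0.0080),
    Transcript("probe_C", 0.15, 0.0010),   # fails the fold-change cutoff
    Transcript("probe_D", 0.90, 0.0300),   # fails the q-value cutoff
]

selected = [t.name for t in transcripts if is_differentially_expressed(t)]
print(selected)
```

Both conditions must hold at once, so a transcript with a large fold change but a weak q-value is excluded, as is one with a strong q-value but a negligible fold change.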
Validation of differential gene expression by reverse transcriptase (RT)-PCR
From the genes with highest degree of differential expression between TB and LTBI, 10 were selected to validate differential expression levels by quantitative RT-PCR. Beta-2-microglobulin was chosen as reference gene as this gene showed the least variation in expression within and between each study group (data not shown). A single sample from the TB group was not included because of insufficient amounts of RNA. Although most of the high-ranking genes showed distinct levels of expression between study groups, not all microarray results could be confirmed by RT-PCR (Supplementary Figure S2).
The most strongly differentially expressed gene that was identified in TB versus control groups in this study was Fc gamma receptor 1B (FCGR1B). Expression validation by RT-PCR confirmed significant differences in expression of this gene between study groups (Figure 2a). An earlier study by our group had identified a minimal set of three genes (CD64, RAB33A and LTF (lactoferrin)), whose combined expression patterns showed a discriminating power between TB and LTBI in Caucasians.17 These candidate biomarker genes were also included in our RT-PCR validation to investigate whether this subset of genes could also discriminate between TB and LTBI individuals from the South African cohort studied herein. Similar to those previous findings, CD64 and LTF showed a significant difference in gene expression between the study groups, whereas RAB33A expression levels were similar between TB and LTBI (Figure 2b). These findings underline differences and commonalities in candidate TB biomarkers between study populations of different ethnicities. Universal biomarkers shared by a broad variety of ethnicities would be preferable.
Random forest analysis was applied to calculate the discriminating power of these three genes (CD64, LTF and RAB33A) to distinguish between TB donors and LTBI. With CD64 being the most powerful discriminating gene, the combination of the three markers achieved a sensitivity and specificity of 88 and 91%, respectively, in identifying TB patients (Table 1a). Running a similar analysis on all 13 genes that had been subjected to RT-PCR expression evaluation, a combination of five most prominently differentiating genes turned out to give the highest accuracy in discriminating between TB and LTBI, that is, FCGR1B, CD64, LTF, guanylate binding protein 5 and Granzyme A. This intriguing subset of genes formed a biosignature capable of increasing sensitivity and specificity to 94% (30/32) and 97% (33/34), respectively (see Table 1b).
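The percentages reported for the five-gene biosignature follow directly from the counts given in the text, as the short Python sketch below shows. The function names are illustrative. The sketch treats TB as the positive class and LTBI as the negative class, which is the reading suggested by the phrase 'identifying TB patients'.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of TB patients correctly classified as TB."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of LTBI donors correctly classified as LTBI."""
    return true_neg / (true_neg + false_pos)


# Counts taken from the text: 30 of 32 TB patients and 33 of 34 LTBI donors
# were classified correctly by the five-gene signature.
sens = sensitivity(true_pos=30, false_neg=2)
spec = specificity(true_neg=33, false_pos=1)

print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

The denominators (32 and 34) are the numbers of TB and LTBI donors with RT-PCR data, one TB sample having been excluded for insufficient RNA.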
Functional categorization of differentially expressed genes
To categorize differentially regulated genes, we used the web-based tool Database for Annotation, Visualization and Integrated Discovery.18, 19 Functional annotation clustering of genes common in comparisons of TB versus LTBI and TB versus NID revealed a high proportion of genes involved in protein binding, cell communication and signal transduction (Figure 3). Also, genes involved in distinct immune responses, apoptosis and stress responses were enriched in this gene ontology analysis (enrichment of gene ontology terms depicted in Figure 3, P<0.05). Genes differentially expressed in both comparisons behaved similarly: 792/988 transcripts (744/927 genes) changed in the same direction (increased or decreased expression) in both TB versus LTBI and TB versus NID analyses. These genes, which showed highly significant enrichment in gene ontology terms, were mainly involved in immune response regulation and in apoptosis. Our results verify that systemic changes in immune responses are a useful indicator for active TB disease. The similarities in gene expression in both comparisons are likely to reflect a systemic inflammatory response in TB patients compared with healthy individuals with (LTBI) or without (NID) M. tuberculosis infection.
Unique expression profiles in TB disease and LTBI
In an attempt to identify patterns of gene expression indicative of progression to active TB disease from LTBI, we next focussed our analyses on genes differentially expressed in TB versus LTBI only. For this, we filtered out all genes from the TB versus LTBI comparison that were differentially expressed in TB versus NID, as well. This filtering revealed 1060 transcripts (representing 1008 genes) between TB and LTBI, but not between TB and NID. A comparative analysis led to the finding that a gene cluster is involved in apoptosis regulation with reduced expression in TB compared with LTBI (Table 2). Because genes present in both comparisons had been filtered out, this particular gene cluster showed similar levels of expression in TB and NID. Similarly, a group of genes involved in host defence responses and mainly active in granulocyte and macrophage effector function and differentiation was expressed at lower levels in LTBI compared with TB and NID (Table 2). These subsets of genes indicating decreased apoptotic activity and concomitantly increased innate host defence are apparently indicative for progression of TB disease from LTBI.
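The filtering step described above is a set difference: everything differentially expressed in TB versus LTBI, minus everything that is also differentially expressed in TB versus NID. The Python sketch below illustrates this with hypothetical gene identifiers and then checks that the counts reported in the text are consistent with such a subtraction.

```python
# Hypothetical identifiers; real input would be the probe or gene lists
# from the two pairwise comparisons.
tb_vs_ltbi = {"gene_1", "gene_2", "gene_3", "gene_4", "gene_5"}
tb_vs_nid = {"gene_2", "gene_4", "gene_6"}

# Shared genes reflect general disease-related patterns.
shared = tb_vs_ltbi & tb_vs_nid

# Genes unique to TB versus LTBI remain after filtering.
unique_to_tb_vs_ltbi = tb_vs_ltbi - tb_vs_nid

print(sorted(shared))
print(sorted(unique_to_tb_vs_ltbi))

# Consistency check on the counts reported in the text.
transcripts_unique = 2048 - 988   # transcripts in TB vs LTBI only
genes_unique = 1935 - 927         # genes in TB vs LTBI only
print(transcripts_unique, genes_unique)
```

The subtraction reproduces the 1060 transcripts and 1008 genes given above.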
Functional and cell-type-associated expression profiles
Functional categorization of differentially expressed genes unique for TB versus LTBI revealed profound upregulation of Toll-like receptor-associated genes including MyD88 and IRF7. Simultaneously, the data reveal an increased expression of several interferon-inducible genes, including guanylate binding protein 1 and 2, interferon alpha-inducible protein 6 and 27, ISG15 ubiquitin-like modifier and 2′,5′-oligoadenylate synthetase 1 (see Table 3), which might illustrate the induction of interferon responses upon activation of pattern recognition receptor pathways.20, 21 Intriguingly, a distinct gene expression profile between TB and LTBI was observed when analysing expression of well-defined subsets of macrophage- and NK cell-associated genes. Macrophage-associated genes were upregulated in TB patients, whereas NK cell-associated genes were downregulated in TB patients as compared with LTBI donors (Table 3, list of cell-type-associated genes in part derived from Gaucher et al.22). Similarly, expression of complement- and inflammasome-associated genes was elevated in TB patients, whereas expression of B cell-associated genes was, for the most part, decreased. It is to be noted that for all these functional and cell type-associated genes, gene expression levels did not differ between TB and NID. However, we did not perform a comparative phenotypic analysis of cellular composition of macrophages and NK cells in our samples. Thus, we cannot formally exclude that the observed differences in gene expression reflect divergent total cell counts of these cell types in LTBI donors compared with both TB patients and NIDs.
Although several key components involved in M. tuberculosis infection and active TB disease have been described, the mechanisms that determine successful resistance or susceptibility to TB disease remain ill-defined. Definition of correlates of protection in TB will most likely require a multitude of biomarkers, termed biosignature.23, 24 The complexity of the infection process that comprises LTBI and active TB disease necessitates a broader strategy aimed at identifying unique gene expression profiles. In the present study, we applied whole-genome microarray expression analysis on whole-blood samples from TB patients and their controls (LTBI and NID) from a high TB-endemic area to identify such biomic profiles. Previous studies using microarray analysis in TB patients have identified gene patterns, which identify subjects at risk for recurrent disease among patients with cured TB25 and a distinct subset of genes to classify TB patients and LTBI in a Caucasian population.17 TB patients and healthy household control subjects (both latently infected and uninfected) included in this study were recruited from a cohort with high incidence of infection and disease. Accordingly, the recruitment of NID proved difficult, as almost all individuals in this cohort were found to give a positive result in a tuberculin skin test.
FCGR1B (Fc gamma receptor 1B) was identified as the gene with the highest degree of differential expression between the three groups of donors in this study. An earlier study by our group had identified CD64 (alias FCGR1A; Fc gamma receptor 1A) as one of the highest ranking differentially expressed genes in TB-diseased versus latently infected Caucasian subjects.17 Despite their different annotations, these genes have remarkably similar sequences. The oligonucleotide probes on the microarray chips used in both studies were 100% complementary to the sequences of the probes used in either study. The microarray chips in both studies thus targeted transcripts from both genes. In contrast, in our RT-PCR evaluation, we designed primers to specifically target the CD64-encoding mRNA. Primers for FCGR1B were cross-reactive with CD64 owing to the high sequence similarity, thus allowing expression analysis of both genes. Although flow cytometry analyses revealed a higher percentage of monocytes, the main cell type expressing CD64, among peripheral blood mononuclear cells from TB donors compared with LTBI, the ∼15-fold difference in mRNA expression (estimate from RT-PCR analysis) cannot solely be explained by this slightly higher proportion of monocytes. Increased expression of the CD64 gene in monocytes from TB patients was also confirmed at the protein level in our previous study.17 A major function of the family of Fc receptors for IgG (FcγRs) is binding of antibodies by their constant domain. The role of antibodies has long been considered to be of minor importance in TB,26 although more recent studies indicate that antibodies may have a role in control of M. tuberculosis infections.27 In addition to their antibody binding capacity, the members within the FcγR family comprise a group of molecules that can simultaneously trigger activating and inhibitory signalling pathways to set thresholds for cell activation and thus generate a well-balanced immune response.28 The strong upregulation of the activating receptor FCGR1 in TB patients as identified here could reflect a crucial role of this molecule in the sustenance of chronic immune activation in these individuals, which might have a detrimental effect on disease outcome. Marked upregulation of FCGR1 in TB patients of different ethnicity warrants further elucidation of its precise role in TB, notably whether FCGR1 upregulation indicates an immune modulatory function, or an ‘attempt in vain’ of a protective response.
The profiles defined in this study comprise a significant number of differentially expressed genes between individuals in all three study groups. These differences were most pronounced between diseased donors (TB) versus both groups of healthy controls (LTBI and NID), independent of M. tuberculosis infection. Functional annotation clustering of these genes indicated a high prevalence of factors involved in immune modulation, signal transduction, gene expression activity, as well as protein and metal ion binding factors. Additionally, distinct clusters of genes involved in apoptosis regulation and stress responses were differentially expressed between TB patients and both control groups, independent of infection with M. tuberculosis.
Although these clusters can be considered to be disease susceptibility-related, these profiles may not be specific for TB. In general, we assume that many alterations in gene expression patterns reflect an overall activation and differentiation of a sustained inflammatory response. To define TB-specific susceptibility- and resistance-related genes within these clusters, we filtered out all genes that were differentially expressed in both TB versus LTBI and TB versus NID comparisons to exclude general pathology-related differences in gene expression. Differentially expressed gene profiles unique for TB versus LTBI comprised an intriguing subset of genes involved in apoptosis regulation, which were found to be less active in TB compared with LTBI. In addition, a group of gene clusters involved in defence responses of granulocytes and in macrophage effector functions and differentiation was expressed at lower levels in LTBI. These expression levels were likely to be modulated by M. tuberculosis infection. Although bacterial infections often induce an increase in granulocyte numbers in the blood, this subset of genes showed no differences in expression levels between TB and NID. Thus, these patterns most likely indicate a downregulation in expression of these genes in LTBI. The most straightforward conclusion from our findings would be that increased expression of apoptosis-related genes and simultaneous downregulated expression of genes involved in granulocyte activation in LTBI are beneficial for preventing outbreak of active disease. Alternatively, higher gene expression levels and activation of granulocytes and monocytes in TB patients could reflect mobilization of professional phagocytes during active TB in an attempt to strengthen defence in affected lung tissue despite the risk of heightened tissue damage.
Control of M. tuberculosis infection depends on a delicate balance of immune control mainly by macrophages and T lymphocytes. Disturbance of this balance results in progression of latent infection to active TB disease. This imbalance in immune control apparently correlates with increased innate immunity, as indicated by elevated defence response gene expression, Toll-like receptor signalling and macrophage effector functions, as observed in TB patients in this study. Simultaneous reduced expression of particular subsets of genes involved in apoptosis and NK cell activity could affect eradication of replicating M. tuberculosis and thus have a key role in failure of the immune defence to successfully control latent infection. Reciprocally, fine-tuning of apoptosis and the innate immune defence responses appears critical for sustained control of M. tuberculosis in LTBI.
The gene expression profiles identified in this study will need further validation by wide-scale gene expression analyses in different geographical and ethnic cohorts. Such studies are currently ongoing in cohorts from other African countries to validate current biomarker gene profiles and to identify TB-specific gene expression signatures, as well as expression patterns common to TB and other chronic infectious diseases. More detailed analyses of expression profiles that may be crucial for protection and susceptibility in M. tuberculosis-infected individuals will thus be required to ultimately define a unique biosignature for TB.
Materials and methods
Subject enrollment and sample collection
The study presented here was approved by ethical committees in both Stellenbosch (South Africa) and Berlin (Germany) and written informed consent was obtained from all study participants. From a cohort of TB patients and household contacts recruited at Stellenbosch University (South Africa), 33 TB patients, 34 healthy donors with LTBI and 9 healthy NIDs were included in this study (Figure 1a depicts the gender and age distribution of donors within groups). Household contacts are defined herein as immediate family members of a TB case (acid-fast bacilli, sputum-smear positive) sharing at least 3 h contact per day within the previous 3 months. TB patients and controls were characterized based on chest radiography, M. tuberculosis sputum culture results and tuberculin skin testing. All subjects were HIV-negative and samples from TB patients were taken before chemotherapy. From every donor, 2.5 ml of peripheral whole blood was collected in PAXgene tubes (PreAnalytix, Europe BD, Erembodegem, Belgium) and stored at −80°C before processing.
RNA extraction and microarray procedure
RNA was extracted from 2.5 ml of peripheral whole blood collected in PAXgene tubes, using the PAXgene Blood RNA Kit following the manufacturer's protocol (both tubes and kits from PreAnalytix). RNA concentration and integrity were determined using an Agilent 2100 Bioanalyzer (Agilent Technologies, Foster City, CA, USA). RNA labelling was done with the Fluorescent Linear Amplification Kit (Agilent Technologies) according to the manufacturer's instructions. Quantity and labelling efficiency were verified before hybridization of the samples to whole-genome oligonucleotide microarray (Agilent Technologies). In this study, we used two-colour microarray slides on which Cy3- and Cy5- labelled RNA from each study group (TB, LTBI or NID) was co-hybridized with RNA from a gender- and age-matched individual from the other groups. Dye colours were balanced within group comparisons to avoid any dye-specific bias. Because of the lower number of NID available in this study, each RNA sample from this group was co-hybridized with a pool of RNA from three matched individuals from the other two groups.
Microarray chips were processed and scanned at 5 μm using an Agilent scanner. Image analysis was performed with Feature Extraction software (version 6.1.1, Agilent Technologies). Raw microarray data were processed using the function read.maimages of the Bioconductor29 R package limma30 and intensities of the spots not flagged out were background corrected using the normexp method.31 Background-corrected values were lowess normalized to obtain as unbiased red/green ratios as possible. For global comparability, the data of all arrays were quantile normalized.32 Further, features with mean log intensities <7 were also flagged out, as for these low-intensity spots random noise in the data would lead to spurious signals. To statistically verify differential expression in the three pairwise comparisons of LTBI, NID and TB cases, we applied a combination of t-test with subsequent false discovery rate correction based on the approach of Efron and Tibshirani.33
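The idea behind quantile normalization can be shown in a few lines of Python. This is a simplified standard-library sketch of the general technique and not the limma implementation used in the study; the function name and the toy intensities are hypothetical, and ties are broken by order of appearance rather than averaged.

```python
def quantile_normalize(arrays):
    """Give every array the same empirical distribution.

    `arrays` is a list of equal-length lists of log intensities,
    one inner list per microarray.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    rank_means = [
        sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)
    ]
    normalized = []
    for a in arrays:
        # Feature indices ordered from lowest to highest intensity.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy log intensities for two arrays and three features (illustration only).
raw = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(raw)
print(normalized)
```

After normalization each array contains the same set of values, and only the assignment of values to features differs, which is what makes intensities comparable across arrays.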
Data were log transformed and differentially expressed genes were identified based on log-fold changes (M-values) in average gene expression with a q value <0.01 (q equals the P-value corrected for multiple testing) by two-sample t-test. Functional annotation analysis and clustering was performed with the help of the Database for Annotation, Visualization and Integrated Discovery Bioinformatics Resources 2008 (http://david.abcc.ncifcrf.gov).18, 19 Random forest analysis was done according to the method of Liaw and Wiener.34
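To illustrate how raw P-values are converted into values corrected for multiple testing, the Python sketch below implements the Benjamini-Hochberg step-up procedure. This is a commonly used false discovery rate method chosen here for illustration; it is not necessarily identical to the Efron and Tibshirani approach applied in this study. The function name and the P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw P-values from five hypothetical t-tests.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.6]
q_values = benjamini_hochberg(raw_p)

# Features passing the q < 0.01 threshold used in the text.
significant = [i for i, q in enumerate(q_values) if q < 0.01]
print(q_values, significant)
```

Note that the second test has a raw P-value below 0.01 but no longer passes the threshold once the correction for five tests is applied.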
Differential expression of several genes was validated by quantitative RT-PCR (qRT-PCR). cDNA was generated by reverse transcription using oligo-dT primers and Superscript II (Invitrogen GmbH, Darmstadt, Germany). Transcripts were quantified by qRT-PCR based on SYBR Green incorporation (Applied Biosystems GmbH, Darmstadt, Germany) on an ABI PRISM 7900 thermocycler. Target genes and primer sequences are listed in Table 4.
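The text does not state how relative expression was derived from the qRT-PCR data. One widely used convention for SYBR Green assays with a single reference gene is the 2^(−ΔΔCt) method, sketched below in Python purely as an illustration and under that assumption. The threshold cycle (Ct) values are invented, the function name is hypothetical, and equal amplification efficiency of target and reference is assumed.

```python
def fold_change_ddct(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^(-ddCt) convention (assumed, see lead-in)."""
    # Normalize the target gene to the reference gene within each sample.
    d_ct_case = ct_target_case - ct_ref_case
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    # Compare the normalized values between case and control.
    dd_ct = d_ct_case - d_ct_ctrl
    return 2.0 ** (-dd_ct)


# Invented Ct values: the target amplifies three cycles earlier in the case
# sample, while the reference gene is unchanged.
fold = fold_change_ddct(
    ct_target_case=24.0, ct_ref_case=20.0,
    ct_target_ctrl=27.0, ct_ref_ctrl=20.0,
)
print(fold)
```

Because each PCR cycle roughly doubles the product, a difference of three cycles after normalization corresponds to an eightfold difference in transcript abundance.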
We thank ML Grossman for thoroughly revising the paper, J Weiner for biostatistical support and the Microarray Core Facility unit at the Max Planck Institute for Infection Biology in Berlin for sample processing. This study was funded by Grant number 37772 from the Bill & Melinda Gates Foundation through the Grand Challenges in Global Health Initiative, to all authors except DR.
Supplementary Information accompanies the paper on Genes and Immunity website (http://www.nature.com/gene)