Spatial and temporal localization of immune transcripts defines hallmarks and diversity in the tuberculosis granuloma

Granulomas are the pathological hallmark of tuberculosis (TB) and the niche where bacilli can grow and disseminate or the immunological microenvironment in which host cells interact to prevent bacterial dissemination. Here we show 34 immune transcripts align to the morphology of lung sections from Mycobacterium tuberculosis-infected mice at cellular resolution. Colocalizing transcript networks at <10 μm in C57BL/6 mouse granulomas increase complexity with time after infection. B-cell clusters develop late after infection. Transcripts from activated macrophages are enriched at subcellular distances from M. tuberculosis. Encapsulated C3HeB/FeJ granulomas show necrotic centers with transcripts associated with immunosuppression (Foxp3, Il10), whereas those in the granuloma rims associate with activated T cells and macrophages. We see highly diverse networks with common interactors in similar lesions. Different immune landscapes of M. tuberculosis granulomas depending on the time after infection, the histopathological features of the lesion, and the proximity to bacteria are here defined.

In association with the diverse outcome of infection, studies from autopsies have shown important diversity in the granuloma histology. In addition to the encapsulated granuloma with a caseous necrotic center, TB granulomas can be non-necrotizing, neutrophilrich, mineralized, fibrotic or cavitary 4,5 . TB granulomas in commonly used inbred mouse strains such as C57BL/6 do not develop necrotizing lesions, while encapsulated necrotizing granulomas are found in other strains such as the C3HeB/FeJ [6][7][8] . While the histological features of granulomas have been well characterized, the immune and inflammatory mechanisms that underlie variable granuloma dynamics and clinical outcomes of TB infection remain to be further elucidated.
In order to understand the immunological architecture of the murine TB granulomas, we have used a novel method for highly sensitive multiplexed in situ imaging of selected immune mRNA species. The method is based on rolling-circle amplification (RCA) of padlock probes and on sequencing-by-ligation chemistry, as reported 9 . RCA has been used to 4 produce highly specific amplified products that enabled detection of individual mRNA molecules in situ in the unperturbed context of fixed tissues at cellular resolution 9, 10 .
We show that granulomas developed with time showing increasing complexity and diversity of co-expressing molecular networks. Transcripts corresponding to activated macrophages localized at subcellular distances from M. tuberculosis. Moreover, we describe profound differences between different granuloma regions and between encapsulated necrotic and non-encapsulated granulomas. Common networks underlying the observed heterogeneity could be defined for different areas of the granulomas, for granulomas at different time points and from different mouse strains.

Specificity, reproducibility and performance of in situ sequencing
The in situ sequencing technique was used to localize simultaneously 34 immune transcripts coding for chemokine receptors, cytokines, effector molecules and surface molecules that define immune populations in paraformaldehyde-fixed sections of lungs from M. tuberculosis-infected mice. Three consecutive sections from C57BL/6 mice at 3, 8 and 12 weeks after aerosol infection with M. tuberculosis (wpi) and those from C3HeB/FeJ mice that develop encapsulated necrotic granulomas were used 11 . Transcripts were aligned with the histopathological features of the same lung section ( Figure S1).
Non-specific signals (when base calling did not correspond to those built-in the barcoding sequences) were minimized by increasing the signal threshold, while the density of specific signals remained mostly unaltered, indicating a high specificity of the reaction ( Figure S2A). At the fixed threshold selected (0,45), the performance (the total number of signals) ranged from 7 to 15 x 10 4 signals per section ( Figure S2B). All transcripts were simultaneously and differentially detected in the lungs ( Figure S2C). As expected, cc10 mRNA, expressed by Clara cells, located in the epithelium of pulmonary airways, and inos, expressed by inflammatory cells, localized in the granuloma. These showed similar distribution in consecutive slides ( Figure S2D). The ratio of the density of specific mRNAs in annotated lesions versus unaffected areas (defined in Figure S1) from consecutive sections were similar for some but the variance was higher for other transcripts ( Figure S2E). This usually related to the higher sparsity of signals in the latter group. 6 Maturation of the granuloma Several immune transcripts located within the granulomas as shown by a signal and a density plot representation for the frequencies of cd68, inos and cd3 mRNA ( Figure 1A). All but 4 mRNA species (cd4, cd40l, elane and il17a) showed increased localization in the lesions compared to non-affected areas at 3 wpi, while all except cd4 and cd40l preferentially located in the granuloma 12 wpi of C57BL/6 mice in consecutive slides (p<0.01 χ 2 test).
The house keeping actb mRNA was increased 2-fold in the granuloma compared to the unaffected regions, reflecting higher cellular density in the lesion ( Figure S2E). Several transcripts in the lesions displayed higher relative frequencies than actb mRNA ( Figure S2E).
Several transcripts expressed by myeloid cells (i.e. tnf, inos and il6) localized at similar frequencies in granulomas from lungs at all time points after M. tuberculosis infection studied ( Figure 1B). Instead, increased relative enrichment was observed for several transcripts associated with adaptive immune responses at 8 and 12 wpi (i.e. cd19, ccr6, cd68, cxcr3 and cd3e mRNA) ( Figure 1B).
We then analyzed transcripts co-expressed at <10 μm using the Insitunet app 12 .
Transcripts with significant spatial co-expression were displayed as edges in a network map.
Interactions were only studied when observed in at least 2 consecutive slides ( Figure 2A). As an example, out of 561 possible mathematical combinations of pairs of the 34 transcripts at 8 wpi studied (C 34,2 ), 19 transcript pairs significantly co-localized (Table S1).
We observed that granulomas, showed co-expression of inos and different T cellrelated transcripts (cd3e, tcrb, cd8) at 3 wpi ( Figure 2A). Inos sequences were also the principal interacting node in granulomas at 8 and 12 wpi, when the number of co-localizing transcripts in the granulomas increased ( Figure 2A). No transcript co-expression was detected in unaffected regions of the sections. 7 Distinct localization of transcripts within the granuloma An unsupervised clustering of mRNA densities across the pulmonary tissue was then performed. The tissues were divided into hexagons (hexbin) with a 70 μm long radius ( Figure   2B). The hexbins were separated into the minimal number of clusters showing different sequence frequencies (3 clusters for 3 and 8 wpi, 4 clusters 12 wpi). The hexbin clusters identified areas that corresponded either to the granulomas or to unaffected sites (as defined by H&E staining, Figure S1) at all times of infection ( Figure 2B). At 3 wpi clusters differed by the concentration but not on the relative densities of transcripts ( Figure 2B, C). One cluster (blue) located in the granuloma, other (red) in unaffected areas, and the third (green) around the granuloma in areas with mild inflammation or unaffected ( Figure 2B and S1). In contrast, at 12 wpi the clusters contained different relative frequencies of transcripts ( Figure 2B, C).
Three of the 4 hexbin clusters located in the granuloma area. One of these (green) also localized in areas surrounding the granuloma ( Figure 2B). Another cluster (yellow) overlapped with lymphoid rich areas within the granuloma and showed an over-representation of cd19 mRNA, expressed by B cells, and several other transcripts (ccr6, il12, mpo, cd3) ( Figure 2C). However, myeloid markers (cd68, inos, cd11b, cd11c) were mainly expressed in the blue cluster. Thus, distinct areas within the granulomas containing different densities of transcripts were revealed by an unbiased analysis of the images. This indicated an increased compartmentalization of granulomas at late time points after infection.
Cd19 mRNA showed a sparse and indiscriminate expression at 3 wpi, but a strong focal distribution in lymphoid-rich areas that resembles inducible B cell follicles 13,14 within the C57BL/6 granuloma at later time points ( Figure 3A). Immunohistochemical labelling of B cells with anti-B220 in these sections resulted in a similar pattern further confirming the specificity of the in situ sequencing ( Figure 3B).

8
Epithelioid areas in 8-12 wpi granulomas contained a high density of cd68 mRNA and were devoid of cd19 mRNA ( Figure 3C). Instead, cd3 mRNA co-localized with both cd68 and with cd19 mRNA at the epithelioid and lymphoid areas ( Figure 3C). Inos and cd68 mRNA localized in the same areas of the lesions, while other transcripts, such as tnf and il12p35/40 mRNA only partially co-localized with the former ones ( Figure S3A). Ccr6 and cd19 mRNA showed a similar localization ( Figure 3C).
The density of several transcripts including ccr6, cd19, il12 and to a lesser degree cd3, cxcr3 and cd8b mRNA was increased in the lymphoid in relation to the epithelioid areas ( Figure 3D) after a supervised annotation of the lymphoid and the epithelioid areas of the granulomas based on H&E staining. Cc10 and myeloid-associated markers were found among the underrepresented transcripts ( Figure 3D). In contrast, cd11b, inos and cd68 mRNA were enhanced in the epithelioid as compared to the lymphoid area, while a third group (including il6, tnf, ifng, cd127 and ccr4 mRNA) showed similar densities in both areas ( Figure 3D).
Thus, in an unsupervised way or after annotation based on histological features, different areas of the granuloma were defined on the frequencies and distribution of transcripts.
Distinct networks of co-expressed transcripts in lymphoid and epithelioid areas of the granulomas were apparent. In lymphoid areas, cd19 mRNA was a major node interacting with cd3e and ccr6 mRNA at 8 and 12 wpi ( Figure 4A, B). Il12, ifng, tnf and cxcr3 mRNA were also found in lymphoid networks at 12 wpi ( Figure 4A). Mpo (neutrophil myeloperoxidase) and rorg transcripts co-localized with cd19 mRNA as well ( Figure 4A). Thus, transcripts from T and B cells are in close proximity in the lymphoid areas.
On the other hand, inos, the main interacting node in the epithelioid area, co-expressed with cd68 mRNA ( Figure 4B). Co-localizing sequences in some epithelioid areas were ccr4, tnf as well as other transcripts expressed by T cells ( Figure 4B). Density plots of cd3, cd8b, cxcr3 and ifng mRNA showed a similar and wide spread location throughout the granuloma of T cell-related transcripts that only in part overlapped with ccr4 mRNA ( Figure S4A The frequencies of transcripts localized at expanding distances from the bacteria converged with those in the whole lung, validating the results ( Figure S5A, B, Figure 4E). Inos, cd68, cd11b, tnf and socs3 mRNA were enriched at shorter distances to M. tuberculosis, suggesting that activated macrophages expressing these transcripts co-localize with bacteria in the granuloma ( Figure 4E). Similar results were observed in 3 consecutive slides analyzed as depicted for a selected number of transcripts ( Figure S5C). On the other hand, transcripts like cc10, spc (expressed by type II alveolar epithelial cells), il6 and foxp3 showed an opposite trend: the relative frequency of these transcripts decreased at shorter distances to M. tuberculosis bacteria ( Figure 4E). Other transcripts including cd8, cd3e, cxcr3, ccr4 and il12p40 showed similar frequencies at different distances from M. tuberculosis, suggesting a neutral spatial association with the bacteria. Interestingly, M. tuberculosis and adjacent transcripts located in the epithelioid areas of the granuloma, in proximity to the lymphoid areas, rather than randomly spread throughout the whole lesion ( Figure 4F).

Heterogeneity of granulomas
The mRNA density in granulomas from the same lung was then compared. Heat maps and principal component analysis showed that 3 out of 4 granulomas showed similar transcript compositions at 3 wpi while a fourth diverged ( Figure 5A, C). The three lesions depicted at 12 wpi were heterogeneous ( Figure 5B, D). All lesions were different compared to unaffected areas ( Figure 5 A-D). The ratio of transcript densities in epithelioid and lymphoid areas of different granulomas within the same lung at 12 wpi also showed substantial variation ( Figure 5E).
Qualitative differences in networks in lymphoid or epithelioid areas of granulomas from the same lung were also observed ( Figure 6A, B). Despite such variation, common networks of transcripts to all epithelioid or lymphoid area determined inos-cd68 and cd19 mRNA as the main interacting nodes, respectively ( Figure 6C). Those differed substantially from the core networks of necrotic granulomas in the C3HeB/FeJ mice analysed below ( Figure 6D).   Figure 7C).

Comparison of encapsulated and non-encapsulated granulomas
An unbiased hexbin analysis also showed a clear distinction of clusters in their transcript contents that aligned in the necrotic cores (blue) and the surrounding areas of the necrotic and the cellular granuloma (green cluster, Figure 7H). The centroid expression of foxp3, il6 and il10 transcripts were highest in the blue cluster in agreement with the analysis of annotated areas ( Figure 7I).
Altogether, C3HeB/FeJ mice encapsulated granulomas contained a center enriched of transcripts associated with immune suppression while cellular granulomas contained networks associated to protective responses.

Discussion
Studies using TB granulomas in man came to an end in the 1950s with the introduction of antibiotics and the decline of lung lobectomies 16 . Research interest relocated from morphological histopathology to immunology, molecular microbiology, and genetics studied in isolated cells and animal models. While the understanding of the biology of infection in these areas has increased immensely, that of the granuloma structure has advanced at a slower pace. While the novel methods have recently improved the molecular and cellular dissection of the granuloma, this has largely been done in homogenates or isolated cell suspensions in which the histological context is lost.
Here, the localization of 34 immune transcripts in lungs during infection with M.
tuberculosis was resolved at cellular resolution and aligned with the tissue topology generating high resolution images of the immunological landscape of granulomas.

Specificity, reproducibility and performance of in situ sequencing
The in situ sequencing method displayed a high signal to noise ratio of fluorescent signals and high specificity of analyzed sequences allowing target recognition at the single nucleotide level, as previously reported 9 . One limitation of our study is the small number of independent sections examined. This is due to the extensive image acquisition and data processing. Statistics were performed by comparing frequencies of transcripts in different areas of the lung or in different regions of the granuloma but also by comparison of determinations in consecutive sections, which confirmed the specificity of the signals. The results presented were confirmed in one independent sample for each condition. T cells also located in the epithelioid areas of the granuloma. In these areas, the main clusters included cd68, tnf and inos but also cxcr3 and ccr4 mRNA expressed by T H 1 and T H 2, cells respectively 33 . This suggests both T H 1 and T H 2 cells penetrate into the epithelioid compartment. In agreement with the observed compartmentalization, CD68 + macrophages in the human tuberculous tissues do not form aggregates with B lymphoid clusters 17 .
Our data indicate the preferential presence of inos, cd68, tnf and cd11b mRNA sequences at subcellular distances (<3µm) to M. tuberculosis and strongly advocate that the lesion contains infected activated macrophages. These transcripts localized in the epithelioid areas of the C57/Bl6 mice granuloma, usually in proximity to the lymphoid regions containing cellular populations that might perpetuate the activation of these inflammatory macrophages. iNOS-derived NO plays a key role in host defense against mycobacteria 34

Heterogeneity of granulomas
Structural, microbiological and immunological heterogeneity of TB granulomas in the same lung has been described in animals and humans [44][45][46] . Both, heat maps and the coexpression network revealed significant variances in transcript localization among different lesions in the same individual. We also show a significant variance between different lymphoid and epithelioid areas of different granulomas, indicating that the differences are not only due to a diverse ratio between both areas. Thus, each granuloma probably represents a limited microenvironment that might be influenced independently from the others in the same lung by the quality of the local immune response and the level of inflammation. However, common core-networks that are unique for each region, time point and mouse strain were here defined. The presence and suppressive function of foxp3 expressing regulatory T cells (T regs ) in granulomatous responses has been previously shown [48][49][50] , and we suggest also that the localization of these cells coincides with regions of high bacterial density. Similar to our findings, granulomas from children with tuberculous lymphadenitis from showed increased numbers of CD4+FoxP3+ T cells, while CD8+ T cells surrounded the granuloma 51 .

In situ sequencing of encapsulated granulomas
IL-10 is produced at the site of active TB in humans and mice and in both an inhibitory role of protective immunity by IL-10 has been suggested [52][53][54][55] . T regs , IFN-γ-secreting CD4+ cells, neutrophils and macrophages have all been shown to produce IL-10 during mouse or human TB 53,56,57 . In association with lower bacterial levels, non-organized granulomas showed a higher frequency of tcrb, ifng, inos, cd68 and cd11b mRNA, that code for proteins of activated macrophages and T H 1 cells, as compared to the necrotic granuloma.

Immunohistochemistry
For immunofluorescent staining of mouse B cells, deparaffinized, rehydrated and demasked sections were blocked in 5% BSA for 30 min, washed and incubated with 1:50 anti-mouse with B220 antibody (clone RA3-6B2) in 1% BSA, 0.3% triton-X PBS for 1h at room temperature. After washing in PBS and incubated for 1 hour at RT with secondary donkey-anti-rat rhodamine red 1:100 (both from BD Pharmingen) in antibody buffer. Slides were then washed and mounted in Vectashield with DAPI.

In situ sequencing technique
The in situ sequencing technique was performed as previously described 9 . Briefly, 8 μm tissue sections from paraformaldehyde-fixed and paraffin-embedded lungs were obtained, mounted on microscope slides (Superfrost Plus) and stored at -80 o C until further processing.
For in situ sequencing, mRNA after partial digestion with pepsin at 37 o C for 30 min, washed in PBS, then dehydrated by passing them through 70% and 100% ethanol and air-dryed.
Gaskets (SecureSeal Hybridization chambers, Grace Bio-Labs) were glued onto the slides such that they form a sealable reaction chamber enclosing the tissue. The mRNA in such sections was in situ reverse transcribed to cDNA using random decamer primers and primers that partially overlap with the recognition sequence of the padlock probes (Table S2). After reverse transcription, an additional crosslinking step was performed (4% paraformaldehyde at room temperature for 45 min), followed by degradation of the mRNA strand and hybridization of padlock probes to the remaining cDNA strand. A ligase in the reaction mix catalyzes circularization of hybridized padlock probes. Multiple Padlock probes were designed for each of the 34 genes of interest such that they detect non-overlapping, transcriptspecific sequences (Table S3). Every set of padlock probes for a given transcript carries a unique 4-base barcode, which is used for identification 9 .
In situ sequencing substrates are generated in a rolling circle amplification reaction (RCA), which is primed by the cDNA using circularized padlock probes as template.
Resulting rolling circle amplification products (RCPs) are subjected to sequencing by ligation. An AlexaFluor750-conjugated sequencing anchor oligo is hybridized immediately adjacent to the barcode sequences of all RCPs (Table S4). Four nonamer libraries are added to the reaction, each containing random nucleotides and one specific base (A, T, G or C) at a fixed position. The libraries are designed such that every base corresponds to a specifc dye.
In an enzymatic DNA ligation reaction, the anchor primer is joined to species of the nonamer pool that best match the barcode. After performing the ligation and imaging, the RCPs are reset by stripping off anchor primer and nonamers and a new cycle is initiated. The process is repeated until all four positions of the barcodes have been interrogated. Imaging was performed in an Axio Imager Z2 epifluorescence microscope (Zeiss) by acquiring Z-stacks of overlapping tiles that together cover the tissue section (10 percent overlap). Image stacks were merged to maximum-intensity projections (Zen software).

Image analysis
A fully automated image analysis pipeline was performed using CellProfiler (v.2.1.1) calling ImageJ plugins for image registration customized to previously described 9 . Briefly, images from all four sequencing cycles were aligned utilizing their general stain for RCPs saving x and y coordinates as well as fluorescence intensities for each RCP in their four base positions to a .csv file and decoded using a Matlab script. For each RCP and hybridization step, the RCP was assigned the base with the highest intensity. A quality score was extracted from each base, defined as the maximum signal (i.e., intensity of assigned letter) divided by the sum of all signals (letters) for that base that ranges from 0.25 to 1. A value close to 0.25 means poor quality (similar signal for all letters), whereas a value close to 1 means that the signal of the assigned letter is strong above a low background. A fixed threshold of 0.45 that gave around 3% unexpected reads was applied.
For signal visualization, Matlab scripts were used to plot selected transcripts on H&E/DAPI-stained images of the analyzed section and to calculate 2log transformations of kernel density estimations for each mRNA species. The number of transcript signals were extracted from the whole tissue and from manually selected regions based on pathological features ( Figure S1) for further analysis.
For an unsupervised analysis of our spatial data, we used a Matlab script that generated k-mean clusters for a given number of clusters and size of hexbins. This iterative learning algorithm aims to find groups in the data by clustering them based on similarities in their transcripts expression levels. Transcript counts in every hexbin were normalized by their maximum counts. For each RNA species a cluster centroid (=mean normalized expression level) was computed. The minimum number of clusters that rendered differential results were analysed.

Heat map and principal component analysis
Extracted transcript reads were normalized to the area and uploaded to ClustVis, a web tool for visualizing clustering of multivariate data using heatmaps and principal component analysis (https://biit.cs.ut.ee/clustvis/) 59 . ClustVis calculates principal components using one of the methods in the pcaMethods R package and plots heatmaps using heatmap R package (version 0.7.7).

Colocalization analysis
We used two applications in Cytoscape an open source software platform (http://apps.cytoscape.org). InsituNet, a cytoscape app that converts in situ sequencing data ( above described .csv file with transcript name, x and y coordinates) into interactive networkbased visualizations, where each unique transcript is a node in the network and edges represent the spatial co-expression relationships between transcripts, was applied to identify co-expressed transcripts 12 . Co-expressed transcripts were defined in a range of <10 µm and their statistical significance was assessed by label permutation and corrected for multiple testing by the Boniferroni method. Those networks were imported into DynetAnalyzer to further compare networks and extract core-networks 60 .

Identification of bacteria and surrounding transcripts
After in situ sequencing, sections were stained for bacteria with Auramine-Rhodamine T and also hybridized with the fluorescence-labeled anchor primer for the general stain. As for individual base positions multiple focal plane images were acquired and merged to maximum-intensity projections. Bacteria images were aligned to the corresponding four base position images customizing the CellProfiler pipeline from above. In short, bacteria were 23 identified after subtraction of autofluorescence based on global Otsu thresholding, intensities of RCPs were extracted as before with the additional information whether they were located within the assigned distance (3, 10, 30, 300 and 600 μm) to identified bacteria. Custom Matlab scripts (available upon request) were used for decoding and plotting of transcript signals.     Table S1  C. The mean centroid normalized transcript counts in each hexagon was compared for the clusters. Note that the red clusters corresponding to unaffected areas contained less counts for of all specific sequences. Note also at 12 wpi that while some sequences were dominant in single clusters (i.e. cd19 mRNA in the yellow cluster) other sequences were more evenly distributed, or predominated in the blue clusters.   tuberculosis bacteria were identified. The frequency of each sequence within a given distance was determined in relation to the total transcript count for that distance. The fold increase of this frequency with respect to that observed for the total lung section is depicted. Thus, whether a certain transcript is over-or under represented within the defined distances from M. tuberculosis was determined.