Synthetic mycobacterial diacyl trehaloses reveal differential recognition by human T cell receptors and the C-type lectin Mincle

The cell wall of Mycobacterium tuberculosis is composed of diverse glycolipids which potentially interact with the human immune system. To overcome difficulties in obtaining pure compounds from bacterial extracts, we recently synthesized three forms of mycobacterial diacyltrehalose (DAT) that differ in their fatty acid composition, DAT1, DAT2, and DAT3. To study the potential recognition of DATs by human T cells, we treated the lipid-binding antigen presenting molecule CD1b with synthetic DATs and looked for T cells that bound the complex. DAT1- and DAT2-treated CD1b tetramers were recognized by T cells, but DAT3-treated CD1b tetramers were not. A T cell line derived using CD1b-DAT2 tetramers showed that there is no cross-reactivity between DATs in an IFN-γ release assay, suggesting that the chemical structure of the fatty acid at the 3-position determines recognition by T cells. In contrast with the lack of recognition of DAT3 by human T cells, DAT3, but not DAT1 or DAT2, activates Mincle. Thus, we show that the mycobacterial lipid DAT can be both an antigen for T cells and an agonist for the innate Mincle receptor, and that small chemical differences determine recognition by different parts of the immune system.

lipids, while the hydrophobic parts of the lipid that are buried deep in the CD1 cleft are not typically recognized by the T cell receptor through direct contact 14. Parts of the fatty acids that lie on the CD1 surface or sit near the antigen exit portal might contribute to T cell recognition and specificity, especially when they show distinguishing features like double bonds, hydroxylations, and methylations.
Not all Mtb (glyco)lipids have been studied as T cell antigens. We propose that if a lipid binds sufficiently to a CD1 molecule, and it has clear features that distinguish it from common self-lipids like phospholipids and sphingolipids, it may be specifically recognized by T cells. Because CD1 interacts with lipid antigens via hydrophobic interactions, any glycolipid with one, two, or three hydrophobic tails and a suitable size might bind to CD1. One of these candidate lipids is diacyl trehalose (DAT). DAT is suggested to be part of the external surface of the mycobacterial cell wall 15 and belongs to the family of trehalose-based glycolipids, which includes lipids such as Ac2SGL. Although known as a biological substance for decades, DAT had not been chemically synthesized until recently 16.
Besides functioning as lipid antigens for T cells, some mycobacterial lipids induce an innate response through pattern recognition receptors, such as the family of C-type lectin receptors (CLRs). The macrophage inducible Ca2+-dependent lectin (Mincle) receptor is one of the human CLRs. Trehalose-6,6′-dimycolate, a highly abundant glycolipid in the mycobacterial cell wall, was the first known Mtb lipid to activate the murine and human Mincle receptor 17. Since then, several natural and synthetic mycobacterial lipids have been shown to act as agonists for both the murine and human Mincle receptor, including DAT isolated from Mtb 18. There has been growing interest in using Mincle ligands as adjuvants to promote a Th1 and Th17 immune response to subunit vaccines 19,20.
Here, we took advantage of precisely defined synthetic forms of DAT to discover receptor-mediated human cellular responses. Three synthetic DATs were tested for their potential as Mincle and T cell receptor (TCR) ligands. We developed CD1b tetramers loaded with synthetic DAT to study recognition of DAT by T cells ex vivo in both healthy individuals and tuberculosis patients.

Results
Validation of synthetic diacyl trehalose. Natural DAT isolated from the cell wall of M. tuberculosis (Mtb) is a mixture of compounds that has immunomodulatory properties 21. We recently synthesized three forms of DAT (DAT1, DAT2, and DAT3) that differ in the fatty acyl unit esterified to the 3-position of the glucose moiety in trehalose: DAT1 carries mycosanoic acid, DAT2 carries mycolipanolic acid, and DAT3 carries mycolipenic acid 16. Synthetic DAT1 and DAT3 were previously demonstrated to be identical to the natural products, whereas synthetic DAT2 showed fragmentation patterns identical to those of the natural product but did not co-elute with it by HPLC, suggesting that the two molecules are stereoisomers 16. As a validation of the synthesized compound structure and quality after storage, high-performance liquid chromatography-mass spectrometry (HPLC-MS) analysis was performed (Fig. 1A-C). All three compounds yielded molecular ions with m/z values that, within experimental error, were consistent with the ammoniated synthesized target structures (m/z 948.735, 1006.776, and 988.766 for DAT1, DAT2, and DAT3, respectively). Each synthetic compound gave a single major chromatographic peak, consistent with isomeric purity. Retention times in the reversed-phase method are expected to increase with molecular size but decrease with increasing polarity of groups, such as the hydroxy group on the hydrocarbon chain, and the relative retention times of the synthetic DATs matched this prediction: DAT2 eluted before DAT1, and DAT3 showed the longest retention time. Thus, the compounds showed high purity, correct mass, and the expected retention times, allowing biological investigations of the antigenicity of DAT.
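The mass arithmetic behind this validation can be illustrated with a minimal sketch. It assumes the reported ions are singly charged ammonium adducts, [M+NH4]+, and uses standard monoisotopic atomic masses; the variable and function names are hypothetical and are not taken from the authors' analysis. The sketch recovers approximate neutral masses from the quoted m/z values and checks that DAT2 and DAT3 differ by about one water, as would be expected if the hydroxy acyl chain of DAT2 and the α,β-unsaturated chain of DAT3 are related by formal dehydration.

```python
# Standard monoisotopic masses in unified atomic mass units.
H = 1.00782503
N = 14.00307401
O = 15.99491462
ELECTRON = 0.00054858

# Mass added by a singly charged ammonium adduct: N + 4 H minus one electron.
NH4_ADDUCT = N + 4 * H - ELECTRON

# m/z values quoted in the text for the ammoniated synthetic DATs.
observed_mz = {"DAT1": 948.735, "DAT2": 1006.776, "DAT3": 988.766}


def neutral_mass(mz):
    """Neutral monoisotopic mass, assuming a singly charged [M+NH4]+ ion."""
    return mz - NH4_ADDUCT


# Difference between DAT2 and DAT3 compared with the mass of one water.
WATER = 2 * H + O
delta_dat2_dat3 = observed_mz["DAT2"] - observed_mz["DAT3"]

if __name__ == "__main__":
    for name, mz in observed_mz.items():
        print(f"{name}: m/z {mz:.3f}, neutral mass about {neutral_mass(mz):.3f}")
    print(f"DAT2 - DAT3 = {delta_dat2_dat3:.3f}; H2O = {WATER:.3f}")
```

Under these assumptions the DAT2 and DAT3 ions differ by roughly the mass of one water, which fits the structural relationship described in the text. The check is illustrative only and does not replace the HPLC-MS comparison with the synthesized target structures.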
Diacyl trehalose acts as a Mincle ligand. Both natural mixed and synthetic forms of DAT can be ligands for macrophage-inducible C-type lectin (Mincle), a receptor of the innate immune system and activator of macrophages that responds to several types of trehalose-containing glycolipids 16,18. To confirm that the stored synthetic lipids are bioactive, we tested their ability to activate Mincle by measuring the activation of NFAT-GFP in reporter cells expressing murine Mincle and its signaling subunit, the FcRγ chain. Synthetic DAT3 acted as an agonist for Mincle, while stimulation of Mincle by DAT1 and DAT2 was not much higher than background even at the highest concentrations tested (Figs. 1D and S1), consistent with the previously reported pattern 16. These results confirm that the chemical structure of DAT influences recognition by Mincle and that DAT3 is a strong activator of Mincle.

Identification of CD1b-DAT tetramer-specific T cells.
Regardless of their capacity to stimulate the innate immune system, it is possible that DAT1, DAT2, or DAT3 can function as foreign lipid antigens presented by CD1 proteins for human T cells. Among human CD1 proteins, CD1b can present lipids with the longest and most numerous alkyl chains 22. The ~C42 forms of DAT studied here were somewhat larger than most lipids presented by other CD1 isoforms, so we hypothesized that DAT could be presented by CD1b molecules to activate CD1b-reactive T cells. To test this, we enriched T cells from healthy donor peripheral blood mononuclear cells (PBMCs) by depleting non-T cells using magnetic selection and stained them with CD1b tetramers that were treated with either synthetic DAT1, synthetic DAT2, or synthetic DAT3. Some T cells recognize 'CD1b-endo' complexes, which are so named because they carry endogenous self-phospholipids from the mammalian CD1 protein expression system. Such T cells recognize CD1b-phospholipid or bind the CD1b protein itself independent of the lipid bound 12,13. Therefore, we stained the T cells simultaneously with a phycoerythrin (PE)-labeled synthetic DAT-treated CD1b tetramer and an allophycocyanin (APC)-labeled untreated CD1b tetramer (CD1b-endo). For quantification, residual non-T cells and auto-fluorescent cells were gated out, as were CD1b-endo positive cells, to determine true CD1b-DAT tetramer-binding cells (Fig. S2a). CD1b-DAT tetramer+ cells were detected in all eight donors tested (Fig. 2A). Binding of CD1b-DAT1 tetramers was the highest, with frequencies ranging from 0.031 to 0.007% of total T cells, followed by CD1b-DAT2 (0.021-0.003%); binding of CD1b-DAT3 was the lowest (0.009-0.001%). Visualization of double staining of T cells with CD1b-DAT and CD1b-endo tetramers, without gating out CD1b-endo tetramer+ cells (Fig. S2b), showed that most CD1b-DAT tetramer-binding cells fail to bind CD1b-endo, as illustrated by dot plots from donor 49 (Fig. 2B) and the other donors (Fig. S3).
Together these results suggest that synthetic DAT1 and synthetic DAT2 are T cell antigens, while we could not convincingly detect CD1b-synthetic DAT3 binding TCRs. Although we expect the three forms of DAT to load with comparable efficiency onto CD1b, we cannot formally exclude the possibility that DAT3 was less efficiently loaded. Therefore, our inability to detect CD1b-DAT3 binding T cells can be due to their absence in blood, or a failure to load tetramers with DAT3.
To enable functional studies of the T cell response, we stained PBMCs from a healthy blood bank donor (HD1) with anti-CD3 and CD1b-synthetic DAT2 tetramer. After two rounds of sorting and expansion of cells that were positive for CD3 and tetramer, we obtained an oligoclonal T cell line that, upon flow cytometric analysis, was shown to consist mainly of T cells that stained double positive for mock-treated CD1b (CD1b-mock) and CD1b-DAT2 tetramer. The approximately 3% of the cells in the cell line that stained brightly with CD1b-DAT2 tetramer but were negative for CD1b-mock tetramer were sorted and expanded further (Fig. 2C, third sort) to generate the 98% pure cell line HD1A (Fig. 2C, right panel). A 1.3% contamination of CD1b-mock+ cells was detected, which was not surprising because these cells formed the majority of the population before the third sort. We further characterized line HD1A by staining with a panel of 24 Vβ antibodies (Fig. S4) and found that it was an αβ T cell line that stained with anti-Vβ13.2, which stains the TRBV6-2 gene product (Fig. 2D); this was confirmed by a multiplex PCR approach (Fig. 2E). Thus, we were able to detect CD1b-DAT tetramer-binding T cells ex vivo and derived a TRBV6-2+ synthetic DAT2-specific αβ T cell line.

Primary CD1b-DAT recognizing T cells show functional responses to antigen.
For some CD1b-presented lipid antigens, the exact composition of the lipid tail does not matter for T cell recognition 5. However, for other CD1b antigens, such as Ac2SGL, mycolic acid, and mannosyl phosphomycoketide, the length and configuration of the acyl tail influence T cell activation 2,23-25. For these three lipids it was shown that differences in the structure of the acyl tails, such as the length of the acyl chain and the number or pattern of C-methyl branched groups, change recognition by the TCR. We wondered whether the diversity in the acyl chains of the three synthesized DATs could influence recognition by T cells. Therefore, we asked whether T cell line HD1A, which was sorted with CD1b-synthetic DAT2 tetramers, would be cross-reactive to the other synthetic DATs. Line HD1A, which stained strongly with CD1b tetramers treated with DAT2, did not stain with DAT1-treated tetramers more than the background obtained with mock-treated tetramers (Fig. 3A). DAT3-treated CD1b tetramers showed weak staining. Tetramer staining suggests that the T cells would be responsive to the lipid antigen loaded onto the tetramers, but not all tetramer-binding T cells show functional responses upon presentation of antigen by antigen-presenting cells (APCs). To investigate functional responses to cellular presentation of synthetic DATs, we tested whether HD1A cells are functionally reactive to monocyte-derived dendritic cells, which represent in vitro generated primary APCs 26, treated with synthetic DATs. DAT2 induced secretion of interferon-γ (IFN-γ) by HD1A T cells, while APCs treated with DAT1, DAT3, or medium alone did not (Fig. 3B). Production of IFN-γ was almost completely blocked by anti-CD1b antibodies, indicating that activation of HD1A by antigen requires CD1b rather than other receptors present on the APCs, such as CD1a, CD1c, or CD1d.
Of note, synthetic DAT3, which supported low CD1b tetramer staining, did not lead to IFN-γ responses, which is most likely due to low potency of DAT3 as an antigen for HD1A. Thus, the HD1A cell line shows CD1b-dependent, highly specific functional responses to synthetic DAT2 presented by APCs, which is likely caused by TCR recognition of the CD1b-DAT2 complex. The lack of functional responses to DAT1 and DAT3 suggests that the chemical differences among DATs, consisting of the differing fatty acyl units at C3 of trehalose (Fig. 1A), influence recognition by T cells and prevent cross-reactivity.

tuberculosis (TB) before the start of anti-TB drug treatment. In addition, PBMCs were isolated from 50 patients with latent TB infection and 50 household contacts with Mtb exposure, but no documented infection (uninfected), based on IFN-γ release assay results 27,28. To generate adequate numbers of T cells for the tetramer analysis, we expanded an aliquot of PBMCs by stimulation with anti-CD3 antibody and feeder cells, as previously described 28. CD1b-synthetic DAT2 binding T cells were observed in subjects across all three groups, although at low frequencies. Comparing the median tetramer staining rate among the three groups based on TB disease status, the frequencies of CD1b-DAT2 tetramer-positive T cells did not significantly differ among active TB patients, latently infected patients, and uninfected subjects as determined by the Kruskal-Wallis test (Fig. 4A). Staining patterns of expanded PBMCs with CD1b-synthetic DAT2 varied from a broad smear of tetramer-positive cells, as illustrated by three subjects with tetramer-binding T cells (subjects 115-7, 149-4, and 63-1), to smaller high-affinity populations, as seen for subject 206-0 (Fig. 4B). Thus, synthetic DAT2 is recognized by T cells in the blood, but frequencies of these cells did not increase after infection with Mtb.
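The three-group comparison by rank test can be sketched as follows. This is a minimal, standard-library illustration of the tie-corrected Kruskal-Wallis H statistic, not the authors' analysis code; the function names and the example frequencies are hypothetical placeholders and are not data from this study. For exactly three groups the chi-square approximation has two degrees of freedom, so the approximate p value reduces to exp(-H/2).

```python
import math
from itertools import chain


def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic for two or more groups."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    rank_of = {}   # value -> average rank among the pooled observations
    tie_term = 0   # sum of (t^3 - t) over runs of t tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    rank_sum_term = sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / correction


def p_value_three_groups(h):
    """Chi-square approximation with 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2)


if __name__ == "__main__":
    # Hypothetical tetramer+ frequencies (% of T cells); placeholders only.
    active = [0.004, 0.010, 0.002, 0.007]
    latent = [0.003, 0.008, 0.005, 0.006]
    uninfected = [0.009, 0.001, 0.0045, 0.0065]
    h = kruskal_wallis_h(active, latent, uninfected)
    print(f"H = {h:.3f}, approximate p = {p_value_three_groups(h):.3f}")
```

Because the test works on ranks rather than raw values, it suits low, skewed frequencies such as tetramer staining rates. The chi-square approximation is rough for very small groups, where an exact or permutation p value would be preferable.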

Discussion
Here we have characterized the antigenicity of DAT for the human immune system and show that synthetic DATs are able to act as both an innate and an adaptive agonist. Small differences in chemical structure between the three synthetic forms of DAT had strong effects on stimulation of innate versus adaptive receptors. We determined that DAT3, but not DAT1 or DAT2, behaved as a highly potent activator of the innate receptor Mincle, while DAT1 was by far the most potent compound recognized by polyclonal, ex vivo T cells across multiple donors. As predicted, DAT could be presented by CD1b and act as an antigen for T cells. Across multiple healthy donors we observed T cell binding to CD1b-DAT tetramers with frequencies similar to binding of CD1b-GMM and mycolic acid tetramers 28. The highest percentage of CD1b-DAT tetramer+ cells was observed using synthetic DAT1-treated tetramers, followed by synthetic DAT2-treated CD1b tetramers, while the percentage of CD1b-synthetic DAT3 tetramer+ T cells was extremely low. These results suggest that the composition of methyl-branched fatty acids of DAT strongly influences recognition by CD1b-reactive T cells.
Among the trehalose-based glycolipids that are made by Mtb, DAT is one of the smallest and simplest. Whereas sulfoglycolipids are sulfated on the 2′-position of the trehalose core and can carry up to four alkyl chains 9, DAT is not sulfated and by definition carries two alkyl chains 16. DAT carries an esterified unbranched saturated fatty acid on the 2-position of trehalose and a branched fatty acid on the 3-position: mycosanoic (DAT1), mycolipanolic (DAT2), or mycolipenic acid (DAT3) 16. Thus, although sulfoglycolipids can carry longer and more complex branched fatty acids at the 2- and 3-positions, Ac2SGL is the closest relative of DAT. The binding mechanism of Ac2SGL to CD1b is known and shows that CD1b presents Ac2SGL to T cells with the participation of an endogenous spacer lipid that is simultaneously bound in the cleft 29. The presence of these spacers in addition to Ac2SGL leads to rearrangement of the lipid-binding groove, allowing accommodation of bulky antigens. At the same time, this rearrangement reduces the capacity of the A' pocket of CD1b to accommodate the phthioceranoyl chain of Ac2SGL, forcing the first three methyl groups of the fatty acyl chain to remain exposed above the CD1b surface for recognition by the TCR 29. Since Ac2SGL and DAT show structural similarities, DAT might be presented by CD1b in a similar way, with the methyl-branched motif exposed on the outer surface of CD1b. If that is true, differences in the exposed residues that are available for TCR recognition, such as the presence of the extra hydroxy group in the acyl chain of DAT2 and the α,β-unsaturation in DAT3, might explain the observed lack of cross-recognition by TCRs, such as that of HD1A. In addition, the differences in the number of C-methyl groups on the fatty acid of DAT1 (2 groups) and DAT2 and DAT3 (3 groups) could play a role in the lack of cross-reactivity.
However, the opposite effect was observed for Ac2SGL: an increased number of methyl-branched carbons led to an increase in the ability of the synthetic Ac2SGLs to stimulate T cells, which was true for up to four methyl groups 24. Thus, the nature of the effect of methyl-branched fatty acids of DAT on recognition by CD1b-reactive T cells can only be fully understood by additional analyses, including protein crystallography of the trimolecular complex of CD1b-DAT-TCR.
The Peruvian TB cohort data show that synthetic DAT2 is recognized by T cells in people with active TB, latently Mtb-infected individuals, and uninfected controls. However, a difference in frequencies of CD1b-DAT binding T cells among highly exposed groups that differed in their IGRA status was not observed, similar to other CD1 tetramer studies in cohorts 28,30. Also, the range of percentages of CD1b-DAT2 tetramer+ T cells in Peruvian subjects was similar to that in the Boston healthy donors. Together, these data suggest that CD1b-DAT2-specific T cells do not expand upon Mtb exposure, or, if they do, the expansion is not detectable among T cells that circulate in the blood. Recent studies have suggested that total blood MR1-reactive T cells can stay unchanged or fall in the setting of infection or antigen stimulation 27,31-33. Thus, a more general perspective to emerge from these studies is that blood-based quantification is not a reliable measure of total body T cell dynamics. Nevertheless, these studies provide proof of principle for DAT-specific T cell responses and point to DAT1 as the T cell antigen with the highest response.
In conclusion, our results show that the mycobacterial lipid DAT is an antigen for T cells as well as a stimulating ligand for the Mincle receptor, but the structural differences in the fatty acyl chains of the different forms of DAT strongly influence the type of biological response they elicit.

T cell lines and T cell assays.
For generation of T cell lines, total PBMC or PBMC-derived T cells were stained with CD1b-DAT tetramer and anti-CD3. PBMCs were sorted for double positive staining of CD3 and tetramer. Expansion of sorted cells was performed by plating cells at 100-700 cells/well in round-bottom 96-well plates containing 2.5 × 10^5 irradiated allogeneic PBMCs, 5 × 10^4 irradiated Epstein Barr Virus transformed B cells, and 30 ng/mL anti-CD3 antibody (clone OKT3) per plate as described previously 28. The next day human IL-2 was added to the wells. After 2 weeks, the sorting and expansion procedure was repeated as needed. For ELISPOT assays, cocultures of 4 × 10^4 APCs (G4 monocytes) pre-incubated with DAT for 30 min at 37 °C and 1 × 10^3 T cells were incubated for 16 h in a Multiscreen-IP filter plate (96 wells; Millipore) coated according to the manufacturer's instructions (Mabtech). For blocking, APCs were preincubated for 1 h at 37 °C with anti-CD1b blocking antibody BCD1b3.1 or control IgG P3 (10 μg/mL) before adding T cells.

Staining protocol. T cells were enriched by depletion of non-T cells using the Pan T cell Isolation Kit (Miltenyi Biotec) according to the manufacturer's protocol. Human enriched T cells and T cell lines were stained with tetramers at 2 μg/mL in PBS containing 1% BSA and 0.01% sodium azide. Cells and tetramer were incubated for 10 min at room temperature in the dark, followed by addition of cell surface antibodies for 10 min at room temperature as described previously 27,28. Subsequently, cells were treated with unlabeled OKT3 antibody and incubated for 20 min at 4 °C. Cells were analyzed using the BD LSRFortessa flow cytometer and FlowJo software. For staining of PBMCs from Peruvian participants, ~3 × 10^6 cells were stained with a "live-dead" fixable blue cell stain (Molecular Probes), then treated with tetramer for 10 min at room temperature, followed by cell surface antibodies for 5 min. Subsequently, cells were treated with unlabeled OKT3 antibody and incubated for 5 min at room temperature, followed by 10 min at 4 °C. Cells were fixed in fresh 2% paraformaldehyde (Electron Microscopy Sciences) in PBS for 20 min. The following antibodies were used: CD3-BV421 (UCHT1; Biolegend) and CD3-FITC (SK7; BD Bioscience).
TCR sequencing. TCR sequences were determined by isolating RNA from bulk sorted T cell populations using the RNeasy kit (QIAGEN), followed by complementary DNA synthesis using the QuantiTect Reverse Transcription Kit (QIAGEN). TCR transcripts were amplified using a multiplex approach 12, followed by direct Sanger sequencing of the PCR product.