Allele-specific expression changes dynamically during T cell activation in HLA and other autoimmune loci

Abstract

Genetic studies have revealed that autoimmune susceptibility variants are over-represented in memory CD4+ T cell regulatory elements (refs. 1–3). Understanding how genetic variation affects gene expression in different T cell physiological states is essential for deciphering genetic mechanisms of autoimmunity (refs. 4,5). Here, we characterized the dynamics of genetic regulatory effects at eight time points during memory CD4+ T cell activation with high-depth RNA-seq in healthy individuals. We discovered widespread, dynamic allele-specific expression across the genome, where the balance of alleles changes over time. These genes were enriched fourfold within autoimmune loci. We found pervasive dynamic regulatory effects within six HLA genes. HLA-DQB1 alleles had one of three distinct transcriptional regulatory programs. Using CRISPR–Cas9 genomic editing, we demonstrated that a promoter variant is causal for T cell–specific control of HLA-DQB1 expression. Our study shows that genetic variation in cis-regulatory elements affects gene expression in a manner dependent on lymphocyte activation status, contributing to the interindividual complexity of immune responses.

Fig. 1: Dynamic allele-specific expression during T cell activation.
Fig. 2: Dynamic allele-specific expression patterns and enrichment in autoimmune disease loci.
Fig. 3: HLA-DQB1 dynamic allele-specific expression at mRNA and protein levels.
Fig. 4: Validation of causal variant for Late-Spike cis-regulatory program.

Data availability

The RNA-seq data supporting this publication are available at GEO, with accession number GSE140244. The flow cytometry data supporting this publication are available at ImmPort (https://www.immport.org) under study accession SDY1555. Source data for Fig. 4 are provided with the paper.

Code availability

Code for key analyses in this study is publicly available on GitHub (https://github.com/immunogenomics/dynamicASE) or upon request to the authors.

References

1. Trynka, G. et al. Chromatin marks identify critical cell types for fine mapping complex trait variants. Nat. Genet. 45, 124–130 (2013).
2. Onengut-Gumuscu, S. et al. Fine mapping of type 1 diabetes susceptibility loci and evidence for colocalization of causal variants with lymphoid gene enhancers. Nat. Genet. 47, 381–386 (2015).
3. Farh, K. K.-H. et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature 518, 337–343 (2015).
4. Simeonov, D. R. et al. Discovery of stimulation-responsive immune enhancers with CRISPR activation. Nature 549, 111–115 (2017).
5. Gutierrez-Arcelus, M., Rich, S. S. & Raychaudhuri, S. Autoimmune diseases—connecting risk alleles with molecular traits of the immune system. Nat. Rev. Genet. 17, 160–174 (2016).
6. Raj, T. et al. Polarization of the effects of autoimmune and neurodegenerative risk alleles in leukocytes. Science 344, 519–523 (2014).
7. Dimas, A. S. et al. Common regulatory variation impacts gene expression in a cell type-dependent manner. Science 325, 1246–1250 (2009).
8. Gutierrez-Arcelus, M. et al. Passive and active DNA methylation and the interplay with genetic variation in gene regulation. eLife 2, e00523 (2013).
9. Chen, L. et al. Genetic drivers of epigenetic and transcriptional variation in human immune cells. Cell 167, 1398–1414.e24 (2016).
10. Ishigaki, K. et al. Polygenic burdens on cell-specific pathways underlie the risk of rheumatoid arthritis. Nat. Genet. 49, 1120–1125 (2017).
11. Ye, C. J. et al. Intersection of population variation and autoimmunity genetics in human T cell activation. Science 345, 1254665 (2014).
12. Hu, X. et al. Regulation of gene expression in autoimmune disease loci and the genetic basis of proliferation in CD4+ effector memory T cells. PLoS Genet. 10, e1004404 (2014).
13. Buil, A. et al. Gene–gene and gene–environment interactions detected by transcriptome sequence analysis in twins. Nat. Genet. 47, 88–91 (2015).
14. Moyerbrailean, G. A. et al. High-throughput allele-specific expression across 250 environmental conditions. Genome Res. 26, 1627–1638 (2016).
15. van de Geijn, B., McVicker, G., Gilad, Y. & Pritchard, J. K. WASP: allele-specific software for robust molecular quantitative trait locus discovery. Nat. Methods 12, 1061–1063 (2015).
16. Knowles, D. A. et al. Allele-specific expression reveals interactions between genetic variation and environment. Nat. Methods 14, 699–702 (2017).
17. Hu, X. et al. Additive and interaction effects at three amino acid positions in HLA-DQ and HLA-DR molecules drive type 1 diabetes risk. Nat. Genet. 47, 898–905 (2015).
18. Sollid, L. M. et al. Evidence for a primary association of celiac disease to a particular HLA-DQ alpha/beta heterodimer. J. Exp. Med. 169, 345–350 (1989).
19. Burmester, G. R., Yu, D. T., Irani, A. M., Kunkel, H. G. & Winchester, R. J. Ia+ T cells in synovial fluid and tissues of patients with rheumatoid arthritis. Arthritis Rheumatol. 24, 1370–1376 (1981).
20. Yu, D. T. et al. Peripheral blood Ia-positive T cells. Increases in certain diseases and after immunization. J. Exp. Med. 151, 91–100 (1980).
21. Ko, H. S. Ia determinants on stimulated human T lymphocytes. Occurrence on mitogen- and antigen-activated T cells. J. Exp. Med. 150, 246–255 (1979).
22. Rao, D. A. et al. Pathologically expanded peripheral T helper cell subset drives B cells in rheumatoid arthritis. Nature 542, 110–114 (2017).
23. Fonseka, C. Y. et al. Mixed-effects association of single cells identifies an expanded effector CD4+ T cell subset in rheumatoid arthritis. Sci. Transl. Med. 10, eaaq0305 (2018).
24. Lanzavecchia, A., Roosnek, E., Gregory, T., Berman, P. & Abrignani, S. T cells can present antigens such as HIV gp120 targeted to their own surface molecules. Nature 334, 530–532 (1988).
25. LaSalle, J. M., Tolentino, P. J., Freeman, G. J., Nadler, L. M. & Hafler, D. A. Early signaling defects in human T cells anergized by T cell presentation of autoantigen. J. Exp. Med. 176, 177–186 (1992).
26. Brandes, M., Willimann, K. & Moser, B. Professional antigen-presentation function by human γδ T cells. Science 309, 264–268 (2005).
27. Guo, M. H. et al. Comprehensive population-based genome sequencing provides insight into hematopoietic regulatory mechanisms. Proc. Natl Acad. Sci. USA 114, E327–E336 (2017).
28. Bild, D. E. et al. Multi-Ethnic Study of Atherosclerosis: objectives and design. Am. J. Epidemiol. 156, 871–881 (2002).
29. Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
30. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
31. Wong, D. et al. Genomic mapping of the MHC transactivator CIITA using an integrated ChIP–seq and genetical genomics approach. Genome Biol. 15, 494 (2014).
32. GTEx Consortium et al. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).
33. Nédélec, Y. et al. Genetic ancestry and natural selection drive population differences in immune responses to pathogens. Cell 167, 657–669.e21 (2016).
34. Aguiar, V. R. C., César, J., Delaneau, O., Dermitzakis, E. T. & Meyer, D. Expression estimation and eQTL mapping for HLA genes with a personalized pipeline. PLoS Genet. 15, e1008091 (2019).
35. Javierre, B. M. et al. Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters. Cell 167, 1369–1384.e19 (2016).
36. Schofield, E. C. et al. CHiCP: a web-based tool for the integrative and interactive visualization of promoter capture Hi-C datasets. Bioinformatics 32, 2511–2513 (2016).
37. Chun, S. et al. Limited statistical evidence for shared genetic effects of eQTLs and autoimmune-disease-associated loci in three major immune-cell types. Nat. Genet. 49, 600 (2017).
38. Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
39. Castel, S. E. et al. Modified penetrance of coding variants by cis-regulatory variation contributes to disease risk. Nat. Genet. 50, 1327–1334 (2018).
40. Raychaudhuri, S. et al. Five amino acids in three HLA proteins explain most of the association between MHC and seropositive rheumatoid arthritis. Nat. Genet. 44, 291–296 (2012).
41. Raj, P. et al. Regulatory polymorphisms modulate the expression of HLA class II molecules and promote autoimmunity. eLife 5, e12089 (2016).
42. Cavalli, G. et al. MHC class II super-enhancer increases surface expression of HLA-DR and HLA-DQ and affects cytokine production in autoimmune vitiligo. Proc. Natl Acad. Sci. USA 113, 1363–1368 (2016).
43. Vandiedonck, C. et al. Pervasive haplotypic variation in the spliceo-transcriptome of the human major histocompatibility complex. Genome Res. 21, 1042–1054 (2011).
44. Pelikan, R. C. et al. Enhancer histone-QTLs are enriched on autoimmune risk haplotypes and influence gene expression within chromatin networks. Nat. Commun. 9, 2905 (2018).
45. Senju, S. et al. Allele-specific expression of the cytoplasmic exon of HLA-DQB1 gene. Immunogenetics 36, 319–325 (1992).
46. Baecher-Allan, C., Wolf, E. & Hafler, D. A. MHC class II expression identifies functionally distinct human regulatory T cells. J. Immunol. 176, 4622–4631 (2006).
47. Reinherz, E. L. et al. Ia determinants on human T-cell subsets defined by monoclonal antibody. Activation stimuli required for expression. J. Exp. Med. 150, 1472–1482 (1979).
48. Engleman, E. G., Benike, C. J. & Charron, D. J. Ia antigen on peripheral blood mononuclear leukocytes in man. II. Functional studies of HLA-DR-positive T cells activated in mixed lymphocyte reactions. J. Exp. Med. 152, 114s–126s (1980).
49. Jia, X. et al. Imputing amino acid polymorphisms in human leukocyte antigens. PLoS ONE 8, e64683 (2013).
50. GAP Registry (The Feinstein Institute for Medical Research, accessed 27 February 2019); https://www.feinsteininstitute.org/robert-s-boas-center-for-genomics-and-human-genetics/gap-registry/
51. Liao, Y., Smyth, G. K. & Shi, W. The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Res. 41, e108 (2013).
52. Frankish, A. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).
53. R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2013).
54. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
55. Liberzon, A. et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 27, 1739–1740 (2011).
56. Yu, G., Wang, L.-G., Han, Y. & He, Q.-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).
57. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
58. 1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
59. Delaneau, O., Marchini, J. & Zagury, J.-F. A linear complexity phasing method for thousands of genomes. Nat. Methods 9, 179–181 (2011).
60. Howie, B. N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).
61. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
62. Castel, S. E., Levy-Moonshine, A., Mohammadi, P., Banks, E. & Lappalainen, T. Tools and best practices for data processing in allelic expression analysis. Genome Biol. 16, 195 (2015).
63. Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).
64. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).
65. Robinson, J. et al. The IPD and IMGT/HLA database: allele variant databases. Nucleic Acids Res. 43, D423–D431 (2015).
66. Dilthey, A., Cox, C., Iqbal, Z., Nelson, M. R. & McVean, G. Improved genome inference in the MHC using a population reference graph. Nat. Genet. 47, 682–688 (2015).
67. Poplin, R. et al. Scaling accurate genetic variant discovery to tens of thousands of samples. Preprint at bioRxiv https://doi.org/10.1101/201178 (2017).
68. Corces, M. R. et al. An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 14, 959–962 (2017).
69. Buenrostro, J. D., Wu, B., Chang, H. Y. & Greenleaf, W. J. ATAC-seq: a method for assaying chromatin accessibility genome-wide. Curr. Protoc. Mol. Biol. 109, 21.29.1–21.29.9 (2015).
70. Schumann, K. et al. Generation of knock-in primary human T cells using Cas9 ribonucleoproteins. Proc. Natl Acad. Sci. USA 112, 10437–10442 (2015).
71. Richardson, C. D., Ray, G. J., DeWitt, M. A., Curie, G. L. & Corn, J. E. Enhancing homology-directed genome editing by catalytically active and inactive CRISPR–Cas9 using asymmetric donor DNA. Nat. Biotechnol. 34, 339–344 (2016).
72. Slowikowski, K., Hu, X. & Raychaudhuri, S. SNPsea: an algorithm to identify cell types, tissues and pathways affected by risk loci. Bioinformatics 30, 2496–2497 (2014).
73. Phipson, B. & Smyth, G. K. Permutation P-values should never be zero: calculating exact P-values when permutations are randomly drawn. Stat. Appl. Genet. Mol. Biol. https://doi.org/10.2202/1544-6115.1585 (2010).

Acknowledgements

We are indebted to G. Klein RN for her outstanding management of the Genotype and Phenotype (GaP) registry at the Feinstein Institute, to the Raychaudhuri laboratory members for critical discussions and feedback and to H. Long and P. Cejas for support on primary T cell ATAC-seq experiments. This work was supported by the National Institutes of Health (grant nos. U19AI111224, U01GM092691, U01HG009379 and R01AR063759 to S.R., NHGRI T32 HG002295 to T.A.), the Swiss National Science Foundation (Early Postdoc Mobility Fellowship to M.G.-A.), the Broad Institute through the SPARC mechanism (S.R.), the Estonian Research Council (PUT1660 to T.E.) and the European Union Horizon 2020 (grant no. MP1GI18418R to T.E.). Whole-genome sequencing (WGS) for the Trans-Omics in Precision Medicine (TOPMed) program was supported by the National Heart, Lung and Blood Institute (NHLBI). WGS for ‘NHLBI TOPMed: Multi-Ethnic Study of Atherosclerosis (MESA)’ (accession no. phs001416.v1.p1) was performed at the Broad Institute of MIT and Harvard (grant no. 3U54HG003067-13S1). Centralized read mapping and genotype calling, along with variant quality metrics and filtering were provided by the TOPMed Informatics Research Center (grant no. 3R01HL-117626-02S1; contract no. HHSN268201800002I). Phenotype harmonization, data management, sample identity quality control and general study coordination were provided by the TOPMed Data Coordinating Center (grant no. 3R01HL-120393-02S1; contract no. HHSN268201800001I). We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed. MESA and the MESA SHARe project are conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with MESA investigators. Support for MESA is provided by contract nos. HHSN268201500003I, N01-HC-95159, N01-HC-95160, N01-HC-95161, N01-HC-95162, N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, UL1-TR-000040, UL1-TR-001079 and UL1-TR-001420. The provision of genotyping data was supported in part by the National Center for Advancing Translational Sciences, CTSI grant no. UL1TR001881, and the National Institute of Diabetes and Digestive and Kidney Disease Diabetes Research Center (DRC) grant no. DK063491 to the Southern California Diabetes Endocrinology Research Center. The full authorship list for the NHLBI TOPMed consortium can be found in https://www.nhlbiwgs.org/topmed-banner-authorship.

Author information

Contributions

M.G.-A., Y.B. and S.R. conceived, designed and performed analyses/experiments, wrote the paper and supervised the research. J.A., S.H., Y.L., T.A. and N.T. performed analyses or experiments, interpreted data and contributed to the paper. D.A.R., J.E., A.H.J. and M.B.B. provided experimental supervision and contributed to the paper. C.N. and P.K.G. contributed to sample recruitment and results discussion. C.N. facilitated experiments. T.E., S.S.R., K.D.T. and J.I.R. contributed to data acquisition.

Corresponding author

Correspondence to Soumya Raychaudhuri.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Replication of dynamic ASE in two pilot individuals.

For two individuals, we performed full time-course replicates (from the same CD4+ memory T cell isolation batch, but with independent stimulation experiments and RNA-seq library preparations). For the dynamic ASE events called significant in replicate A at 5% FDR (as explained in the main text and Methods), we asked how the P-values and betas look in replicate B. Left plots show the distribution of P-values in replicate B, middle plots show the correlation of betas for time, and right plots show the correlation of betas for time squared. a, Individual TB03072560. b, Individual TB03073798.
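The "betas for time" and "betas for time squared" mentioned in this legend suggest a model in which the reference-allele fraction at a heterozygous SNP depends on time and time squared. The following is a minimal sketch of that idea, not the authors' model (the actual specification is in the Methods and the linked repository). It assumes a plain binomial logistic regression with no random effects or overdispersion, time rescaled to [0, 1], and a likelihood-ratio test with 2 degrees of freedom against an intercept-only model. All function names and the example time grid are hypothetical.

```python
import math


def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def design(times, quadratic=True):
    """Design matrix: intercept only, or intercept + scaled time + scaled time squared."""
    tmax = max(times)
    rows = []
    for t in times:
        s = t / tmax  # rescale so time and time squared stay on a similar scale
        rows.append([1.0, s, s * s] if quadratic else [1.0])
    return rows


def log_lik(beta, X, ref, tot):
    """Binomial log-likelihood (up to a constant) of reference-allele counts."""
    ll = 0.0
    for x, k, n in zip(X, ref, tot):
        eta = sum(b * xi for b, xi in zip(beta, x))
        ll += k * eta - n * math.log1p(math.exp(eta))
    return ll


def fit_binomial(X, ref, tot, iters=50):
    """Fit logistic-regression coefficients by Newton-Raphson."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for x, k, n in zip(X, ref, tot):
            eta = sum(b * xi for b, xi in zip(beta, x))
            mu = 1.0 / (1.0 + math.exp(-eta))
            for i in range(p):
                grad[i] += (k - n * mu) * x[i]
                for j in range(p):
                    hess[i][j] += n * mu * (1.0 - mu) * x[i] * x[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    return beta


def dynamic_ase_test(times, ref, tot):
    """Return (betas of the time model, likelihood-ratio P-value with 2 df)."""
    full, null = design(times, True), design(times, False)
    b_full = fit_binomial(full, ref, tot)
    b_null = fit_binomial(null, ref, tot)
    lr = 2.0 * (log_lik(b_full, full, ref, tot) - log_lik(b_null, null, ref, tot))
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    return b_full, math.exp(-max(0.0, lr) / 2.0)
```

Calling `dynamic_ase_test(times, ref_counts, total_counts)` on one SNP in one individual returns the three coefficients (intercept, time, time squared) and a P-value. Comparing the time and time-squared coefficients between two replicates is the kind of check the middle and right panels describe.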

Extended Data Fig. 2 Replication examples of dynamic ASE in two pilot individuals.

Examples of a dynamic ASE event significant in individual TB03072560 (a) and TB03073798 (b). Shown are allelic counts for heterozygous SNP (left) and reference fraction over time (right) for replicate A (top panels) and replicate B (bottom panels).

Extended Data Fig. 3 Reproducibility of dynASE across heterozygous individuals for the same SNP.

Here we assessed whether dynamic ASE replicates well in different heterozygous individuals for the same SNP. First, from the 561 dynASE events at 5% FDR we took the top 356 unique SNPs (ensuring one heterozygous individual per SNP), and then asked how the P-values look in other heterozygous individuals for those 356 SNPs. a, QQ plot depicting the observed P-values in the other heterozygous individuals (y-axis), compared to the expected uniform distribution of P-values (x-axis). b, Next, within all 561 significant events at 5% FDR, we evaluated the correlation of betas for time (left) and time squared (right) for all pairwise combinations of heterozygous individuals for the same SNP (het1 and het2 in the x- and y-axis labels).
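As a small illustration of the QQ comparison in panel a, the sketch below pairs sorted observed P-values with the quantiles expected under a uniform distribution, both on the -log10 scale. The plotting position i/(n+1) is an assumption made for this example, and `qq_points` is a hypothetical helper name.

```python
import math


def qq_points(pvalues):
    """Return (expected, observed) -log10 P-value pairs for a QQ plot."""
    n = len(pvalues)
    observed = sorted(pvalues)
    # Expected quantiles of n uniform draws, using i / (n + 1) plotting positions.
    expected = [(i + 1) / (n + 1) for i in range(n)]
    return [(-math.log10(e), -math.log10(o)) for e, o in zip(expected, observed)]
```

Points lying above the identity line indicate that the replication P-values are smaller than expected by chance.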

Extended Data Fig. 4 DynASE examples for SNPs with two or more heterozygous individuals.

a–c, Shown are gene expression levels across 24 individuals (left), and allele counts (SNP and individual indicated) and reference fraction (P-value and FDR for dynASE indicated) for heterozygous SNPs in the corresponding gene.

Extended Data Fig. 5 Scheme depicting HLA allelic expression quantification with HLA-personalized genome.

To robustly quantify allele-specific expression in the highly polymorphic HLA genes, we first create an HLA-personalized genome per individual. We do this by inserting into the reference genome the cDNA sequences of each HLA allele as separate sequences (12 in total given that we sequenced or typed 6 HLA genes), and masking the exonic sequences corresponding to those cDNAs in chromosome 6 of the reference genome. Next, we map the RNA-seq reads to this HLA-personalized genome, remove PCR duplicates and count the number of reads uniquely mapped to each HLA cDNA allele.
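The construction of the HLA-personalized genome can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: sequences are held in memory as a dictionary, exon coordinates are 0-based half-open intervals, masking is done by overwriting bases with `N`, and the function and argument names are hypothetical.

```python
def personalize_genome(reference, hla_alleles, exon_intervals, chrom="chr6"):
    """Build an HLA-personalized genome.

    reference      -- dict mapping sequence name to sequence string
    hla_alleles    -- dict mapping allele name to its cDNA sequence
    exon_intervals -- list of (start, end) exonic intervals on `chrom` to mask
    """
    seq = list(reference[chrom])
    for start, end in exon_intervals:
        # Mask the reference exons so reads map to the allele cDNAs instead.
        seq[start:end] = "N" * (end - start)
    genome = dict(reference)
    genome[chrom] = "".join(seq)
    for name, cdna in hla_alleles.items():
        if name in genome:
            raise ValueError(f"sequence name already present: {name}")
        # Each allele becomes its own separate sequence in the genome.
        genome[name] = cdna
    return genome


def to_fasta(genome, width=60):
    """Serialize the genome dictionary as FASTA text."""
    lines = []
    for name, seq in genome.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

With two alleles for each of six HLA genes, `hla_alleles` would hold the 12 cDNA sequences mentioned in the legend. Read mapping, duplicate removal and counting would then be done with external tools against the resulting FASTA.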

Extended Data Fig. 6 Allelic fraction replication in HLA gene quantifications.

Allelic fraction over time for the 3 HLA class II genes (a) and 3 HLA class I genes (b), for the two pilot individuals with full time course replicates. Replicate A in black, replicate B in blue.

Extended Data Fig. 7 Principal component analysis of HLA-DQB1 allelic profiles over time.

PCA was performed for 48 HLA-DQB1 allelic expression profiles of 24 individuals (log2(FPKM+1) values over time). Allelic profiles are colored by 4-digit classical HLA-DQB1 allele (a), and by the k-means cluster to which they belong (b). Average allelic expression was computed for samples with replicates. The 12-hour time point was removed because of a high number of missing values. These plots depict how 4-digit alleles group near each other (a), and how PCA also captures the three distinct cis-regulatory programs (Fluctuating, Constant-Low and Late-Spike) (b).
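To make the clustering step concrete, here is a minimal sketch of grouping allelic expression profiles into three clusters after a log2(FPKM+1) transform. It assumes plain Lloyd-style k-means with a deterministic farthest-point initialization, which may differ from the implementation and settings the authors used. The PCA step is omitted, and the function names are hypothetical.

```python
import math


def log_transform(profile):
    """Apply log2(FPKM + 1) to every time point of one allelic profile."""
    return [math.log2(v + 1) for v in profile]


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100):
    """Cluster profiles with k-means; returns (labels, centroids)."""
    # Deterministic initialization: start from the first profile, then
    # repeatedly add the profile farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Each input profile would hold one allele's expression at the seven retained time points. With `k=3`, the returned labels play the role of the cluster coloring in panel b.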

Extended Data Fig. 8 Mapping variants associated with Late-Spike haplotype.

a, r² between Late-Spike haplotype dosage and SNPs within 1 Mb of HLA-DQB1 in the Estonian cohort. Orange vertical lines indicate the location of HLA-DQB1. Pink dots are intragenic SNPs in HLA-DQB1, HLA-DRB1 and HLA-DQA1. The right plot is zoomed in on the HLA-DQB1 region to show top SNPs (reference genome hg19). b, HLA-DQB1 gene expression levels (log2(FPKM+1)) at 72 hours after stimulation for individuals separated by their rs71542466 genotype. c, Same as in a but in the European MESA cohort (reference genome GRCh38). d, r² comparison between the Estonian and European MESA cohorts, for all SNPs in the region (left) or the subset of SNPs in the region that do not overlap HLA-DQB1, HLA-DRB1 or HLA-DQA1 start-end genomic coordinates (right). The 6 intergenic SNPs with top r² in Estonians are highlighted, with 3 of them having top r² in the European MESA cohort too. Identity line marked. These results show that our top candidate SNP rs71542466 (and the other candidate SNPs) tracks well with the Late-Spike haplotype in both the Estonian cohort and the MESA cohort of individuals of European ancestry recruited in the United States.
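The r² values in panels a, c and d measure how well each SNP tracks the Late-Spike haplotype. A minimal sketch of that calculation is given below, assuming r² is the squared Pearson correlation between per-individual haplotype dosage (0, 1 or 2 copies) and SNP genotype dosage. The function names and the SNP identifiers in the usage note are placeholders, not real data.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length dosage vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when either vector has no variation
    return sxy * sxy / (sxx * syy)


def rank_snps(haplotype_dosage, snp_dosages):
    """Rank SNPs by r² with the haplotype dosage, highest first.

    snp_dosages maps a SNP identifier to its per-individual dosage list;
    monomorphic SNPs (undefined r²) are skipped.
    """
    scored = []
    for snp, dosage in snp_dosages.items():
        r2 = r_squared(haplotype_dosage, dosage)
        if r2 == r2:  # skip NaN
            scored.append((snp, r2))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

For example, `rank_snps(hap, {"snpA": [...], "snpB": [...]})` returns `(identifier, r²)` pairs sorted from the strongest to the weakest tag of the haplotype.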

Extended Data Fig. 9 Genomic location of nearest gRNAs to tested causal SNPs and representative flow cytometry plot of CRISPR-Cas9 edited HH cells.

a, Location of SNPs (red) is shown in reference to the nearest exon (blue) both upstream and downstream of HLA-DQB1. The nearest gRNA sequences used for targeting the regions are highlighted with their corresponding colors (rs71542466 - dark green, rs71542467 - light purple, rs71542468 - purple, rs72844401 - beige/orange, rs4279477 - blue, rs28451423 - light green). Alignments were plotted using SnapGene (v3.2.1). b, Representative staining of HLA-DQ on CRISPR-Cas9 modified HH cells. Cells were modified with the proximal gRNA as shown in a and labelled accordingly. Cells were stained with HLA-DQ antibodies as a bulk population 7–10 days after modification.

Extended Data Fig. 10 Sanger sequencing alignment of HH reference and base-edited clones reveals seamless editing.

Genomic DNA from expanded clones was sequenced, aligned to the reference (hg38) and visualized using SnapGene (v3.2.1). The red-colored nucleotide indicates the location of the rs71542466 SNP in the reference. Highlighted red nucleotides indicate mismatches from the reference and yellow-colored nucleotides indicate unresolved/heterozygous sequences.

Supplementary information

Supplementary Information

Supplementary Figs. 1–19, Note and unprocessed EMSAs from Supplementary Fig. 15

Reporting Summary

Supplementary Tables

Supplementary Table 1. Dynamic allele-specific expression for SNPs genome wide with FDR < 0.05. Supplementary Table 2. Reported eQTLs for HLA-DQB1 and LD with the Late-Spike regulatory SNP. Supplementary Table 3. Primers, probes and oligonucleotide sequences.

Source data

Source Data Fig. 4

Unprocessed EMSA from Fig. 4d.

About this article

Cite this article

Gutierrez-Arcelus, M., Baglaenko, Y., Arora, J. et al. Allele-specific expression changes dynamically during T cell activation in HLA and other autoimmune loci. Nat Genet 52, 247–253 (2020). https://doi.org/10.1038/s41588-020-0579-4
