High-throughput identification of noncoding functional SNPs via type IIS enzyme restriction

Abstract

Genome-wide association studies (GWAS) have identified many disease-associated noncoding variants, but cannot distinguish functional single-nucleotide polymorphisms (fSNPs) from others that reside incidentally within risk loci. To address this challenge, we developed an unbiased high-throughput screen that employs type IIS enzymatic restriction to identify fSNPs that allelically modulate the binding of regulatory proteins. We coupled this approach, termed SNP-seq, with flanking restriction enhanced pulldown (FREP) to identify regulation of CD40 by three disease-associated fSNPs via four regulatory proteins, RBPJ, RSRC2 and FUBP-1/TRAP150. Applying this approach across 27 loci associated with juvenile idiopathic arthritis, we identified 148 candidate fSNPs, including two that regulate STAT4 via the regulatory proteins SATB2 and H1.2. Together, these findings establish the utility of tandem SNP-seq/FREP to bridge the gap between GWAS and disease mechanism.

Fig. 1: Diagram of tandem SNP-seq and FREP.
Fig. 2: Screening of fSNPs within the CD40 locus.
Fig. 3: Validation of fSNPs rs4810485, rs6032664, rs6065926, rs1883832 and rs6074022 as CD40 fSNPs.
Fig. 4: Expression of CD40 in RNAi knockdown experiments with human B cells and human synovial fibroblasts.
Fig. 5: Demonstration of the binding of RBPJ to rs4810485 and TRAP150 to rs6065926.
Fig. 6: SNP-seq high-throughput screening of 608 JIA-associated SNPs.
Fig. 7: Characterization of fSNPs at the STAT4 locus.

Acknowledgements

We thank P. Y. Lee, I-C. Ho, P. Libby and R. M. Plenge for scientific discussions. This work was supported by grants from the Arthritis National Research Foundation, National Multiple Sclerosis Society, NIH R21 NS096443 and NIH R21 AR070378 (G.L.), and from the Rheumatology Research Foundation, NIH R01 AR065538, NIH P30 AR070253 and the Fundación Bechara (P.A.N.).

Author information

G.L. developed SNP-seq and FREP, designed the study, performed all the experiments and analyzed the data in the laboratory of P.A.N. G.L. and P.A.N. drafted the manuscript. M.M.-B. performed experiments and analysis and revised the manuscript. D.W. performed data analysis and revised the manuscript. J.C. performed sequencing data analyses. Y.Y. assisted with experiments. P.C. and A.L. assisted with analysis of CD40 expression by FACS. M.B. assisted with the 3C assay. H.N.N. performed siRNA experiments on human synovial fibroblasts and assisted with the ChIP assays in the laboratory of M.B.B. E.A.H. assisted with the CRISPR–Cas9 experiments. Y.O., M.M.-B. and S.R. assisted with the fine-mapping analysis at the CD40 and STAT4 loci. H.-J.W. assisted with figure formatting and data analysis. S.R., E.A.H. and R.L.M. assisted with data analysis and revised the manuscript.

Correspondence to Gang Li or Peter A. Nigrovic.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Allele-specific gel shifting at CD40 variant rs1883832 but not rs6074022.

EMSA revealing lack of allele-specific gel shifting at rs6074022 (T, non-risk/minor allele; C, risk/major allele) but confirming allele-specific gel shifting for rs1883832 (T, non-risk/minor allele; C, risk/major allele). The data reflect five replicate experiments with similar results.

Supplementary Figure 2 EMSA with gel supershift showing the enhanced binding of protein to rs6032664 and rs6065926 by anti-RSRC2 (left) and anti-FUBP1 (right) antibodies.

The data represent three biological replicates with similar results. Red brackets indicate enhanced intensity, rather than shift, upon binding of specific antibody. Ab, antibody.

Supplementary Figure 3 Algorithm for identification of candidate fSNPs by allele-specific enrichment at cycle 10.

C1 and C2, sequence counts for control (no nuclear extract) at alleles 1 and 2 for each SNP. S1.1 and S1.2, sequence counts for test (with nuclear extract) replicates 1 and 2 for allele 1 at each SNP; S2.1 and S2.2, same for allele 2.
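The caption defines the count variables but not the calculation itself. As a minimal sketch only, and not the authors' algorithm, one plausible way to combine these counts is to normalize each allele's mean test count by its control count and then compare the two alleles. The function names, the pseudocount and the 1.5-fold threshold below are all hypothetical choices made for illustration.

```python
from statistics import mean


def allelic_enrichment_ratio(c1, c2, s1, s2, pseudocount=1.0):
    """Control-normalized enrichment of allele 1 relative to allele 2.

    c1, c2: control counts (no nuclear extract) for alleles 1 and 2.
    s1, s2: lists of replicate test counts, e.g. [S1.1, S1.2] and [S2.1, S2.2].
    pseudocount: assumed smoothing term to avoid division by zero.
    """
    e1 = (mean(s1) + pseudocount) / (c1 + pseudocount)
    e2 = (mean(s2) + pseudocount) / (c2 + pseudocount)
    return e1 / e2


def is_candidate(c1, c2, s1, s2, min_fold=1.5):
    """Flag a SNP when either allele is enriched by at least min_fold.

    min_fold is an illustrative threshold, not a value from the paper.
    """
    ratio = allelic_enrichment_ratio(c1, c2, s1, s2)
    return ratio >= min_fold or ratio <= 1.0 / min_fold


# Example with made-up counts: allele 1 doubles relative to control, allele 2 does not.
print(is_candidate(99, 99, [199, 199], [99, 99]))
```

Normalizing by the control counts is intended to correct for unequal representation of the two alleles in the starting library, so that only extract-dependent differences contribute to the ratio.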

Supplementary Figure 4 Algorithm for identification of candidate fSNPs by progressive enrichment across cycles 4, 7 and 10 (denoted 1, 2 and 3, respectively).

C1 and C2, sequence counts for control (no nuclear extract) at alleles 1 and 2 for each SNP. S1.1 and S1.2, sequence counts for test (with nuclear extract) replicates 1 and 2 for allele 1 at each SNP; S2.1 and S2.2, same for allele 2.
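The caption names the criterion (progressive enrichment across three cycles) without spelling out the test. The sketch below shows one way such a criterion could be expressed, assuming that "progressive" means the control-normalized allelic ratio moves steadily in one direction from cycle 4 through cycle 10. This is an illustrative assumption rather than the published procedure, and the function names and pseudocount are hypothetical.

```python
from statistics import mean


def enrichment_ratio(c1, c2, s1, s2, pseudocount=1.0):
    """Control-normalized enrichment of allele 1 relative to allele 2."""
    e1 = (mean(s1) + pseudocount) / (c1 + pseudocount)
    e2 = (mean(s2) + pseudocount) / (c2 + pseudocount)
    return e1 / e2


def is_progressively_enriched(cycle_counts):
    """Check for a steady allelic shift across successive cycles.

    cycle_counts: list of (c1, c2, s1, s2) tuples ordered by cycle,
    e.g. cycles 4, 7 and 10. Returns True when the ratio strictly rises
    and ends above 1, or strictly falls and ends below 1.
    """
    ratios = [enrichment_ratio(*counts) for counts in cycle_counts]
    pairs = list(zip(ratios, ratios[1:]))
    rising = all(a < b for a, b in pairs) and ratios[-1] > 1.0
    falling = all(a > b for a, b in pairs) and ratios[-1] < 1.0
    return rising or falling


# Example with made-up counts for cycles 4, 7 and 10.
cycles = [
    (99, 99, [119, 119], [99, 99]),
    (99, 99, [149, 149], [99, 99]),
    (99, 99, [199, 199], [99, 99]),
]
print(is_progressively_enriched(cycles))
```

Requiring a consistent direction across cycles is one way to separate a steadily accumulating allelic preference from a fluctuation seen at a single cycle.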

Supplementary Figure 5 EMSA showing allele-imbalanced gel shifting on 14 randomly selected candidate fSNPs from 9 JIA-associated loci using nuclear extract from PBMCs.

The number in parentheses is the number of candidate fSNPs divided by the total number of SNPs within LD r² > 0.8 of the lead SNP at the locus. The data represent two repeats with similar results.

Supplementary Figure 6 EMSA for assessment of allele-imbalanced gel shifting as per SNP-seq.

a,b, EMSAs are shown for the seven negative SNPs at STAT4 (a) and the six negative SNPs at FAS (b). The FAS SNPs rs2182408 and rs2148287 show clear allele-specific shifting, while rs7574865 at STAT4 and rs1926194 at FAS are borderline. The data represent three replicates with similar results.

Supplementary Figure 7 Expression of CD40 in Notch1- and PUF60-knockdown human synovial fibroblasts.

a, CD40 expression (lower panel) is not affected by downregulation of Notch1 measured by qPCR with either actin or 18S as a control (upper panel). b, CD40 expression is downregulated (lower panel) after knockdown of PUF60 (upper panel). Expression is shown compared with two controls, actin and 18S. The data points represent two biological repeats; statistical testing was not performed.

Supplementary Figure 8 3C assay at the CD40 locus with rs4810485, rs6032664 and rs6065926.

a, PCR showing bands with predicted sizes for ligation between fSNPs in five samples. Data reflect three biological replicates. b, Sequence traces showing the ligation of five different sequences from rs4810485, rs6032664 and rs6065926 with an interposed BamHI site.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8 and Supplementary Tables 1–7

Reporting Summary

Supplementary Data

Source gels for figures
