Abstract

We report 17.6 million genetic variants from whole-genome sequencing of 2,120 Sardinians; 22% are absent from previous sequencing-based compilations and are enriched for predicted functional consequences. Furthermore, 76,000 variants common in our sample (frequency >5%) are rare elsewhere (<0.5% in the 1000 Genomes Project). We assessed the impact of these variants on circulating lipid levels and five inflammatory biomarkers. We observe 14 signals, including 2 major new loci, for lipid levels and 19 signals, including 2 new loci, for inflammatory markers. The new associations would have been missed in analyses based on 1000 Genomes Project data, underlining the advantages of large-scale sequencing in this founder population.

Acknowledgements

We thank all the volunteers who generously participated in this study and made this research possible. This research was supported by National Human Genome Research Institute grants HG005581, HG005552, HG006513, HG007022 and HG007089; by National Heart, Lung, and Blood Institute grant HL117626; by the Intramural Research Program of the US National Institutes of Health, National Institute on Aging, contracts N01-AG-1-2109 and HHSN271201100005C; by Sardinian Autonomous Region (L.R. 7/2009) grant cRP3-154; by the PB05 InterOmics MIUR Flagship Project; by grant FaReBio2011 'Farmaci e Reti Biotecnologiche di Qualità'; by a US National Institutes of Health National Research Service Award (NRSA) postdoctoral fellowship (F32GM106656) to C.W.K.C.; and by the UC MEXUS/CONACYT fellowship to V.D.O.d.V. The replication cohorts acknowledge the use of data generated by the UK10K Consortium, supported by Wellcome Trust award WT091310. The UK10K research was specifically funded by a Wellcome Trust award, '10,000 UK Genome Sequences: Accessing the Role of Rare Genetic Variants in Health and Disease' (WT091310/C/10/Z). The research of N.S. is supported by the Wellcome Trust (grants WT098051 and WT091310), the European Union's Seventh Framework Programme (EPIGENESYS grant 257082 and BLUEPRINT grant HEALTH-F5-2011-282510) and the National Institute for Health Research (NIHR) British Research Council (BRC). The ING-FVG cohort was supported by grant Ministero della Salute—Ricerca Finalizzata PE-2011-02347500 (to P.G.); the ING-VB study thanks the inhabitants of Val Borbera for participating in the study, M. Traglia, C. Sala and C. Masciullo for data management, and the funding sources Fondazione Cariplo (Italy), the Ministry of Health, Ricerca Finalizzata (Italy) 2008, 2011-2012, and the Public Health Genomics Project 2010. The HELIC cohorts are thankful to the residents of the Pomak villages and the Mylopotamos villages for participating and to their funding sources, including the Wellcome Trust (098051) and the European Research Council (ERC-2011-StG 280559-SEPI).

Author information

Author notes

    Carlo Sidore, Fabio Busonero, Andrea Maschio, Eleonora Porcu & Silvia Naitza

    These authors contributed equally to this work.

    Serena Sanna, David Schlessinger, Francesco Cucca & Gonçalo R Abecasis

    These authors jointly supervised this work.

Affiliations

  1. Istituto di Ricerca Genetica e Biomedica, Consiglio Nazionale delle Ricerche (CNR), Monserrato, Cagliari, Italy.

    Carlo Sidore, Fabio Busonero, Andrea Maschio, Eleonora Porcu, Silvia Naitza, Magdalena Zoledziewska, Antonella Mulas, Giorgio Pistis, Maristella Steri, Fabrice Danjou, Maristella Pitzalis, Andrea Angius, Serena Sanna & Francesco Cucca

  2. Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, USA.

    Carlo Sidore, Fabio Busonero, Andrea Maschio, Eleonora Porcu, Giorgio Pistis, Alan Kwong, Jennifer Bragg-Gresham, Christian Fuchsberger, Hyun M Kang & Gonçalo R Abecasis

  3. Dipartimento di Scienze Biomediche, Università degli Studi di Sassari, Sassari, Italy.

    Carlo Sidore, Eleonora Porcu, Antonella Mulas, Giorgio Pistis, Riccardo Berutti & Francesco Cucca

  4. DNA Sequencing Core, University of Michigan, Ann Arbor, Michigan, USA.

    Fabio Busonero, Andrea Maschio, Brendan Tarrier, Christine Brennan & Robert Lyons

  5. Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, California, USA.

    Vicente Diego Ortega del Vecchyo

  6. Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California, USA.

    Charleston W K Chiang

  7. Laboratory of Genetics, National Institute on Aging, US National Institutes of Health, Baltimore, Maryland, USA.

    Ramaiah Nagaraja & David Schlessinger

  8. Porto Conte Ricerche, Tramariglio, Alghero, Italy.

    Sergio Uzzau

  9. Center for Advanced Studies, Research and Development in Sardinia (CRS4), Parco Scientifico e Tecnologico della Sardegna, Pula, Italy.

    Rossano Atzeni, Frederic Reinier, Riccardo Berutti, Chris Jones & Andrea Angius

  10. Human Genetics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.

    Jie Huang, Eleftheria Zeggini & Nicole Soranzo

  11. Medical Research Council (MRC) Integrative Epidemiology Unit, University of Bristol, Bristol, UK.

    Nicholas J Timpson

  12. Division of Genetics and Cell Biology, San Raffaele Scientific Institute, Milan, Italy.

    Daniela Toniolo

  13. Dipartimento di Salute Mentale, University of Trieste and IRCCS (Istituto di Ricovero e Cura a Carattere Scientifico) Burlo Garofolo Children Hospital, Trieste, Italy.

    Paolo Gasparini

  14. Experimental Genetics Division, Sidra, Doha, Qatar.

    Paolo Gasparini

  15. Department of Life and Reproduction Sciences, University of Verona, Verona, Italy.

    Giovanni Malerba

  16. Department of Nutrition and Dietetics, Harokopio University Athens, Athens, Greece.

    George Dedoussis

  17. Department of Haematology, University of Cambridge, Cambridge, UK.

    Nicole Soranzo

  18. Department of Human Genetics, University of Chicago, Chicago, Illinois, USA.

    John Novembre

Authors

Carlo Sidore, Fabio Busonero, Andrea Maschio, Eleonora Porcu, Silvia Naitza, Magdalena Zoledziewska, Antonella Mulas, Giorgio Pistis, Maristella Steri, Fabrice Danjou, Alan Kwong, Vicente Diego Ortega del Vecchyo, Charleston W K Chiang, Jennifer Bragg-Gresham, Maristella Pitzalis, Ramaiah Nagaraja, Brendan Tarrier, Christine Brennan, Sergio Uzzau, Christian Fuchsberger, Rossano Atzeni, Frederic Reinier, Riccardo Berutti, Jie Huang, Nicholas J Timpson, Daniela Toniolo, Paolo Gasparini, Giovanni Malerba, George Dedoussis, Eleftheria Zeggini, Nicole Soranzo, Chris Jones, Robert Lyons, Andrea Angius, Hyun M Kang, John Novembre, Serena Sanna, David Schlessinger, Francesco Cucca & Gonçalo R Abecasis

Contributions

D.S., F.C. and G.R.A. conceived and supervised the study. C.S., S.N., S.S., D.S., F.C. and G.R.A. drafted the manuscript. E.P., M.Z., C.W.K.C. and J.N. revised the manuscript and wrote specific sections of it. F.B., A. Maschio, A.A., C.J. and R.L. supervised sequencing experiments. F.B., A. Maschio, B.T. and C.B. performed sequencing experiments. C.S., E.P., G.P., M.S., F.D. and S.S. carried out genetic association analyses. C.S., A.K., R.A., F.R., R.B., C.J., R.L. and H.M.K. were responsible for sequencing data processing. C.S., E.P. and G.P. analyzed DNA sequence data. M.Z., A. Mulas, F.B., S.U. and R.N. carried out SNP array genotyping. M.Z. designed the validation strategy, and M.Z., F.B. and A. Mulas verified genotypes by Sanger sequencing and TaqMan genotyping. C.S., J.B.-G., M.P., C.F. and S.S. were responsible for selection of samples for sequencing. J.N., C.W.K.C. and V.D.O.d.V. performed the allele-sharing, principal-component and FST analyses. J.H., P.G., G.M., N.J.T., E.Z., D.T., G.D. and N.S. provided replication results. All authors reviewed and approved the final manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Francesco Cucca or Gonçalo R Abecasis.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Note, Supplementary Figures 1–10 and Supplementary Tables 1–17.

About this article

DOI

https://doi.org/10.1038/ng.3368
