Large-scale whole-genome sequencing of the Icelandic population

Abstract

Here we describe the insights gained from sequencing the whole genomes of 2,636 Icelanders to a median depth of 20×. We found 20 million SNPs and 1.5 million insertions-deletions (indels). We describe the density and frequency spectra of sequence variants in relation to their functional annotation, gene position, pathway and conservation score. We demonstrate an excess of homozygosity and rare protein-coding variants in Iceland. We imputed these variants into 104,220 individuals down to a minor allele frequency of 0.1% and found a recessive frameshift mutation in MYL4 that causes early-onset atrial fibrillation, several mutations in ABCB4 that increase risk of liver diseases and an intronic variant in GNAS associating with increased thyroid-stimulating hormone levels when maternally inherited. These data provide a study design that can be used to determine how variation in the sequence of the human genome gives rise to human diversity.

Figure 1: Distribution of indel lengths inside and outside protein-coding regions.
Figure 2: FRV and variant density by impact class and OMIM disease-related gene classification.
Figure 3: Sequencing coverage, FRV and variant density by exon rank.
Figure 4: FRV and variant density for SNPs by mammalian conservation (GERP score), PANTHER subset of GO terms, chromatin state and sensitive regions.
Figure 5: The fraction of SNPs and indels identified in 2,636 Icelanders present in dbSNP and ESP by consequence.
Figure 6: Effect of geography on frequency distribution.

References

  1. Frazer, K.A. et al. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007).

  2. Hindorff, L.A. et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl. Acad. Sci. USA 106, 9362–9367 (2009).

  3. Sulem, P. et al. Identification of low-frequency variants associated with gout and serum uric acid levels. Nat. Genet. 43, 1127–1130 (2011).

  4. Jonsson, T. et al. A mutation in APP protects against Alzheimer's disease and age-related cognitive decline. Nature 488, 96–99 (2012).

  5. Rafnar, T. et al. Mutations in BRIP1 confer high risk of ovarian cancer. Nat. Genet. 43, 1104–1107 (2011).

  6. Holm, H. et al. A rare variant in MYH6 is associated with high risk of sick sinus syndrome. Nat. Genet. 43, 316–320 (2011).

  7. Styrkarsdottir, U. et al. Nonsense mutation in the LGR4 gene is associated with several human diseases and other traits. Nature 497, 517–520 (2013).

  8. Jonsson, T. et al. Variant of TREM2 associated with the risk of Alzheimer's disease. N. Engl. J. Med. 368, 107–116 (2013).

  9. Helgason, H. et al. A rare nonsynonymous sequence variant in C3 is associated with high risk of age-related macular degeneration. Nat. Genet. 45, 1371–1374 (2013).

  10. Gudmundsson, J. et al. A study based on whole-genome sequencing yields a rare variant at 8q24 associated with prostate cancer. Nat. Genet. 44, 1326–1329 (2012).

  11. Stacey, S.N. et al. A germline variant in the TP53 polyadenylation signal confers cancer susceptibility. Nat. Genet. 43, 1098–1103 (2011).

  12. Steinthorsdottir, V. et al. Identification of low-frequency and rare sequence variants associated with elevated or reduced risk of type 2 diabetes. Nat. Genet. 46, 294–298 (2014).

  13. Tennessen, J.A. et al. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science 337, 64–69 (2012).

  14. Fu, W. et al. Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants. Nature 493, 216–220 (2013).

  15. Li, Y. et al. Resequencing of 200 human exomes identifies an excess of low-frequency non-synonymous coding variants. Nat. Genet. 42, 969–972 (2010).

  16. Abecasis, G.R. et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).

  17. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).

  18. Kong, A. et al. Detection of sharing by descent, long-range phasing and haplotype imputation. Nat. Genet. 40, 1068–1075 (2008).

  19. Pruitt, K.D., Tatusova, T., Brown, G.R. & Maglott, D.R. NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy. Nucleic Acids Res. 40, D130–D135 (2012).

  20. McLaren, W. et al. Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26, 2069–2070 (2010).

  21. Eilbeck, K. et al. The Sequence Ontology: a tool for the unification of genome annotations. Genome Biol. 6, R44 (2005).

  22. Stubbs, A. et al. Huvariome: a web server resource of whole genome next-generation sequencing allelic frequencies to aid in pathological candidate gene selection. J. Clin. Bioinforma 2, 19 (2012).

  23. Chen, F.C., Chen, C.J., Li, W.H. & Chuang, T.J. Human-specific insertions and deletions inferred from mammalian genome sequences. Genome Res. 17, 16–22 (2007).

  24. Montgomery, S.B. et al. The origin, evolution, and functional impact of short insertion-deletion variants identified in 179 human genomes. Genome Res. 23, 749–761 (2013).

  25. McKusick, V.A. Mendelian Inheritance in Man and its online version, OMIM. Am. J. Hum. Genet. 80, 588–604 (2007).

  26. Khurana, E. et al. Integrative annotation of variants from 1092 humans: application to cancer genomics. Science 342, 1235587 (2013).

  27. Petrovski, S., Wang, Q., Heinzen, E.L., Allen, A.S. & Goldstein, D.B. Genic intolerance to functional variation and the interpretation of personal genomes. PLoS Genet. 9, e1003709 (2013).

  28. Flicek, P. et al. Ensembl 2013. Nucleic Acids Res. 41, D48–D55 (2013).

  29. MacArthur, D.G. et al. A systematic survey of loss-of-function variants in human protein-coding genes. Science 335, 823–828 (2012).

  30. Zavolan, M. & van Nimwegen, E. The types and prevalence of alternative splice forms. Curr. Opin. Struct. Biol. 16, 362–367 (2006).

  31. Baker, K.E. & Parker, R. Nonsense-mediated mRNA decay: terminating erroneous gene expression. Curr. Opin. Cell Biol. 16, 293–299 (2004).

  32. Keller, A., Zhuang, H., Chi, Q., Vosshall, L.B. & Matsunami, H. Genetic variation in a human odorant receptor alters odour perception. Nature 449, 468–472 (2007).

  33. Mainland, J.D. et al. The missense of smell: functional variability in the human odorant receptor repertoire. Nat. Neurosci. 17, 114–120 (2014).

  34. Cooper, G.M. et al. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 15, 901–913 (2005).

  35. Smith, N.G., Webster, M.T. & Ellegren, H. Deterministic mutation rate variation in the human genome. Genome Res. 12, 1350–1356 (2002).

  36. Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29 (2000).

  37. Mi, H., Muruganujan, A. & Thomas, P.D. PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res. 41, D377–D386 (2013).

  38. Ernst, J., Vainas, O., Harbison, C.T., Simon, I. & Bar-Joseph, Z. Reconstructing dynamic regulatory maps. Mol. Syst. Biol. 3, 74 (2007).

  39. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).

  40. Marchini, J., Howie, B., Myers, S., McVean, G. & Donnelly, P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat. Genet. 39, 906–913 (2007).

  41. Sherry, S.T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001).

  42. Mayr, E. Systematics and the Origin of Species from the Viewpoint of a Zoologist (Columbia University Press, 1942).

  43. Thorlacius, S. et al. A single BRCA2 mutation in male and female breast cancer families from Iceland with varied cancer phenotypes. Nat. Genet. 13, 117–119 (1996).

  44. Helgason, A., Yngvadottir, B., Hrafnkelsson, B., Gulcher, J. & Stefansson, K. An Icelandic example of the impact of population structure on association studies. Nat. Genet. 37, 90–95 (2005).

  45. Small, K.S. et al. Identification of an imprinted master trans regulator at the KLF14 locus related to multiple metabolic phenotypes. Nat. Genet. 43, 561–564 (2011).

  46. Kong, A. et al. Parental origin of sequence variants associated with complex diseases. Nature 462, 868–874 (2009).

  47. Wallace, C. et al. The imprinted DLK1-MEG3 gene region on chromosome 14q32.2 alters susceptibility to type 1 diabetes. Nat. Genet. 42, 68–71 (2010).

  48. Abreu, A.P. et al. Central precocious puberty caused by mutations in the imprinted gene MKRN3. N. Engl. J. Med. 368, 2467–2475 (2013).

  49. Falls, J.G., Pulford, D.J., Wylie, A.A. & Jirtle, R.L. Genomic imprinting: implications for human disease. Am. J. Pathol. 154, 635–647 (1999).

  50. Go, A.S. et al. Prevalence of diagnosed atrial fibrillation in adults: national implications for rhythm management and stroke prevention: the AnTicoagulation and Risk Factors in Atrial Fibrillation (ATRIA) Study. J. Am. Med. Assoc. 285, 2370–2375 (2001).

  51. Lloyd-Jones, D.M. et al. Lifetime risk for development of atrial fibrillation: the Framingham Heart Study. Circulation 110, 1042–1046 (2004).

  52. Strohman, R.C., Micou-Eastwood, J., Glass, C.A. & Matsuda, R. Human fetal muscle and cultured myotubes derived from it contain a fetal-specific myosin light chain. Science 221, 955–957 (1983).

  53. Cohen-Haguenauer, O. et al. Chromosomal assignment of two myosin alkali light-chain genes encoding the ventricular/slow skeletal muscle isoform and the atrial/fetal muscle isoform (MYL3, MYL4). Hum. Genet. 81, 278–282 (1989).

  54. Nicolaou, M. et al. Canalicular ABC transporters and liver disease. J. Pathol. 226, 300–315 (2012).

  55. Davit-Spraul, A., Gonzales, E., Baussan, C. & Jacquemin, E. Progressive familial intrahepatic cholestasis. Orphanet J. Rare Dis. 4, 1 (2009).

  56. Dixon, P.H. et al. Heterozygous MDR3 missense mutation associated with intrahepatic cholestasis of pregnancy: evidence for a defect in protein trafficking. Hum. Mol. Genet. 9, 1209–1217 (2000).

  57. Gudmundsson, J. et al. Discovery of common variants associated with low TSH levels and thyroid cancer risk. Nat. Genet. 44, 319–322 (2012).

  58. Sathasivam, S. Brown-Vialetto–Van Laere syndrome. Orphanet J. Rare Dis. 3, 9 (2008).

  59. Chan, W.M. et al. Expanded polyglutamine domain possesses nuclear export activity which modulates subcellular localization and toxicity of polyQ disease protein via exportin-1. Hum. Mol. Genet. 20, 1738–1750 (2011).

  60. Johnson, J.O. et al. Exome sequencing reveals riboflavin transporter mutations as a cause of motor neuron disease. Brain 135, 2875–2882 (2012).

  61. Ciccolella, M. et al. Riboflavin transporter 3 involvement in infantile Brown-Vialetto-Van Laere disease: two novel mutations. J. Med. Genet. 50, 104–107 (2013).

  62. Haack, T.B. et al. Impaired riboflavin transport due to missense mutations in SLC52A2 causes Brown-Vialetto–Van Laere syndrome. J. Inherit. Metab. Dis. 35, 943–948 (2012).

  63. Green, P. et al. Brown-Vialetto–Van Laere syndrome, a ponto-bulbar palsy with deafness, is caused by mutations in c20orf54. Am. J. Hum. Genet. 86, 485–489 (2010).

  64. Johnson, J.O., Gibbs, J.R., Van Maldergem, L., Houlden, H. & Singleton, A.B. Exome sequencing in Brown-Vialetto–van Laere syndrome. Am. J. Hum. Genet. 87, 567–569, author reply 569–570 (2010).

  65. Bosch, A.M. et al. Brown-Vialetto–Van Laere and Fazio Londe syndrome is associated with a riboflavin transporter defect mimicking mild MADD: a new inborn error of metabolism with potential treatment. J. Inherit. Metab. Dis. 34, 159–164 (2011).

  66. da Silva-Júnior, F.P., Moura Rde, D., Rosemberg, S., Marchiori, P.E. & Castro, L.H. Cor pulmonale in a patient with Brown-Vialetto–Van Laere syndrome: a case report. J. Neurol. Sci. 300, 155–156 (2011).

  67. Dakhil, F.O., Bensreiti, S.M. & Zew, M.H. Pontobulbar palsy and sensorineural deafness (Brown-Vialetto–van Laere syndrome): the first case from Libya. Amyotroph. Lateral Scler. 11, 397–398 (2010).

  68. Lombaert, A., Dom, R., Carton, H. & Bruchler, J.M. Progressive ponto-bulbar palsy with deafness. A clinico-pathological study. Acta Neurol. Belg. 76, 309–314 (1976).

  69. van Bogaert, L. & van der Broeck, J. Sclérose latérale amyotrophique ou myasthénie bulbospinal avec exaltation des réflexes tendineux et contractions fibrillaires. J. Neurol. Psychiatry 6, 380–382 (1929).

  70. Rotowski, J. & McHarg, J.F. A case of amyotrophic lateral sclerosis complicated by progressive lipodystrophy. Edin. Med. J. 60, 281–293 (1953).

  71. Gudbjartsson, D.F. et al. Sequence variants from whole genome sequencing a large group of Icelanders. Sci. Data 2, 150011 doi:10.1038/sdata.2015.11 (2015).

  72. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  73. DePristo, M.A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).

  74. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).

  75. Kent, W.J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002).

  76. Flicek, P. et al. Ensembl 2012. Nucleic Acids Res. 40, D84–D90 (2012).

  77. Paten, B., Herrero, J., Beal, K., Fitzgerald, S. & Birney, E. Enredo and Pecan: genome-wide mammalian consistency-based multiple alignment with paralogs. Genome Res. 18, 1814–1828 (2008).

  78. Paten, B. et al. Genome-wide nucleotide-level mammalian ancestor reconstruction. Genome Res. 18, 1829–1843 (2008).

  79. Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011).


Acknowledgements

We thank all the participants in this study. This study was performed in collaboration with Illumina.

Author information

Contributions

D.F.G., H. Helgason, S.A.G., F.Z., D.O.A., O.T.M., G. Masson, A.H., P.S. and K.S. wrote the initial draft of the manuscript. D.F.G., H. Helgason, S.A.G., F.Z., A.O., G. Magnusson, B.V.H., E.H., G.T.S., S.N.S., M.L.F., A.K., G. Masson and P.S. analyzed the data. D.F.G., H. Helgason, S.A.G., F.Z., A.G., S.B., H.G. and G. Masson created methods for analyzing the data. S.N.S., H. Holm, J.S., H.T.H., H.J. and O.T.M. performed the experiments. H. Holm, G.S., G.T., J.T.S., S.G., G.B.W., T.R., B.T., E.S.B., S.O., H.T., T.S., T.S.G., A.T., J.G.J., A.S., G.B., J.J.J., O.T., P.L., G.I.E., O.S., I.O. and D.O.A. collected the samples and information. D.F.G., D.O.A., G. Masson, U.T., A.H., P.S. and K.S. designed the study.

Corresponding authors

Correspondence to Daniel F Gudbjartsson or Kari Stefansson.

Ethics declarations

Competing interests

The authors affiliated with deCODE Genetics are employed by the company, which is owned by Amgen, Inc: D.F.G., H. Helgason, S.A.G., F.Z., A.O., A.G., G. Magnusson, B.V.H., E.H., G.T.S., S.N.S., M.L.F., H. Holm, J.S., H.T.H., H.J., S.G., G.B.W., T.R., A.S., G.B., H.G., O.T.M., A.K., G. Masson, U.T., A.H., P.S. and K.S.

Integrated supplementary information

Supplementary Figure 1 Sequencing depth of the 2,636 sequenced Icelanders.

Supplementary Figure 2 Overview of sequence alignment and variant calling.

Supplementary Figure 3 Overview of the process for sequence variant imputation.

Supplementary Figure 4 Distribution of the number of observed alleles in 2,636 sequenced Icelanders by impact class.

Shown are the proportions of variants for which the minor allele was seen one to six times (MAF ≤ 0.11%).
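The MAF bound quoted in this legend follows from simple allele counting. A minimal sketch in Python, assuming an autosomal site where each of the 2,636 sequenced individuals contributes two alleles (the function name is illustrative and does not come from the paper):

```python
def minor_allele_frequency(minor_allele_count: int, n_individuals: int) -> float:
    """MAF at an autosomal site, assuming two alleles per individual."""
    return minor_allele_count / (2 * n_individuals)


N_SEQUENCED = 2636  # number of sequenced Icelanders stated in the legend

# Minor allele seen one to six times among 2 * 2,636 = 5,272 chromosomes.
for count in range(1, 7):
    maf = minor_allele_frequency(count, N_SEQUENCED)
    print(f"{count} copies: MAF = {maf:.4%}")
```

Under this assumption, six copies among 5,272 chromosomes give a MAF of about 0.11%, which matches the bound in the legend.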

Supplementary Figure 5 Comparison of imputed and chip genotypes.

Shown is the fraction of the 28,204 SNPs identified in exons and splice regions and present on SNP chips that have r2 > 0.8, 0.9 and 0.99 between the imputed and chip genotypes as a function of their derived allele frequency (DAF).
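As an illustration of the comparison metric, the sketch below computes r2 for a single SNP, assuming r2 is the squared Pearson correlation between imputed allele dosages and chip genotypes coded as 0, 1 or 2 copies of an allele. This page does not give the exact definition used in the paper, and the function name and toy genotypes are hypothetical.

```python
from statistics import mean


def genotype_r2(imputed, chip):
    """Squared Pearson correlation between two equal-length genotype vectors."""
    mean_imputed, mean_chip = mean(imputed), mean(chip)
    covariance = sum((x - mean_imputed) * (y - mean_chip) for x, y in zip(imputed, chip))
    var_imputed = sum((x - mean_imputed) ** 2 for x in imputed)
    var_chip = sum((y - mean_chip) ** 2 for y in chip)
    if var_imputed == 0 or var_chip == 0:
        return float("nan")  # correlation is undefined for a monomorphic vector
    return covariance ** 2 / (var_imputed * var_chip)


# Toy data (not from the paper): imputed dosages versus chip genotypes.
imputed_dosages = [0.0, 0.1, 1.0, 0.9, 2.0, 1.1]
chip_genotypes = [0, 0, 1, 1, 2, 1]
print(f"r2 = {genotype_r2(imputed_dosages, chip_genotypes):.3f}")
```

A SNP would count toward a given curve in the figure when this value exceeds the corresponding threshold (0.8, 0.9 or 0.99).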

Supplementary Figure 6 The five pedigrees containing the eight homozygous carriers of c.234delC in MYL4.

Symbols for homozygous carriers are colored black. Symbols for deceased individuals are struck through with a forward-leaning line. Symbols for individuals who have not been genotyped directly are struck through with a backward-leaning line. Under each individual are up to five lines of information about the individual. The first line gives an identifier consisting of a pedigree name (f1–f5), the generation of the individual in roman numerals and an enumerator within the generation. The second line gives the individual’s year of birth and, if appropriate, year of death. The third line gives the individual’s c.234delC genotype, where D and W denote directly genotyped deletion and wild-type alleles, respectively, and d and w denote in silico genotypes inferred from the genotypes of relatives. The order of the alleles indicates the parent of origin: the first allele comes from the father and the second from the mother, except for the three individuals for whom parent of origin could not be assigned (f2-I:1, f2-I:2 and f3-I:2). The fourth line indicates whether the individual has been diagnosed with AF and, after the @ sign, the age at onset. The fifth line lists other relevant phenotypes: sick sinus syndrome (SSS), pacemaker implantation (PM) and sudden cardiac death (SCD).

Supplementary Figure 7 The transmission of chromosome 17 through pedigree f-2.

The transmission of the founding couple of pedigree f-2 can be reconstructed on the basis of the expected values for meiotic transmissions of chromosome 17. The horizontal red lines indicate the position of c.234delC in MYL4, and the small red square surrounding the line indicates the region around c.234delC shared identical by descent by the founding couple. The length of this interval is estimated to be 3.3 cM. The first and last 10 cM of the chromosome have been truncated. The sisters f2-II:6 and f2-II:9 are imputed to be carrying c.234delC on their paternal chromosome on the basis of the chromosomal region around the deletion having been transmitted to their children (dark blue). There is no clear transmission of either sister’s maternal chromosomal region around the deletion to one of her children (although f2-III:3 may have inherited her mother’s paternal chromosome, but a crossover occurred in the region around c.234delC where f2-III:3 is homozygous). However, for both sisters, the maternal chromosome carrying the deletion (light blue) was transmitted to an offspring at regions on both sides of MYL4 (to f2-III:2 and f2-III:4) such that, unless a double crossover occurred around MYL4, they both carry c.234delC on their maternal chromosome.

Supplementary Figure 8 The families of the BVVL cases.

Shown are birth years and genotypes at the SLC52A2 mutation, where W denotes the wild-type allele and M denotes the mutated allele. Symbols for cases are colored black, and the symbols corresponding to the two siblings of case 4 who died early are colored gray. A forward slash indicates that the individual is deceased, and a backward slash indicates that an SLC52A2 genotype is not available for that individual.

Supplementary Figure 9 The effect of the filtering steps on the number of sequence variants that are candidates for causing BVVL syndrome in the two sisters.

The occurrence of a rare syndrome such as BVVL in two sisters suggests that it is caused by a rare genotype with high penetrance. We therefore restricted our search to LoF and MODERATE-impact variants. The sisters are affected but neither parent is, which suggests an autosomal recessive mode of inheritance. Allelic frequency over 2% would dictate a homozygous frequency of over 1 in 2,500, which would be too high for BVVL syndrome. This brought the number of potential variants down to 3 from 147. This would not have been possible using non-Icelandic resources such as ESP, as 4 of the 147 variants are not present in the database. We note that crude filtering, such as removing all variants present in public databases, would result in removing the causative sequence variant. This left us with three correlated MODERATE-impact variants on chromosome 8q24.3: p.Leu339Pro (rs148234606) in SLC52A2, p.Gln931Arg in OPLAH and c.2982C>T in CPSF1. No one was imputed to be homozygous for the SLC52A2 variant in the set of additional chip-typed Icelanders, whereas 5 and 19 Icelanders were imputed to be homozygous for the OPLAH and CPSF1 SNPs, respectively. No early deaths were reported among these homozygous carriers, and the oldest homozygous carriers reached ages 77 and 89 years for OPLAH and CPSF1, respectively, which is inconsistent with diagnosis of BVVL, as only 1 of 77 reported BVVL syndrome cases has lived past 60 years (refs. 36–44).
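The 2% allele frequency cutoff and the 1 in 2,500 homozygote frequency are linked by squaring the allele frequency. A minimal sketch, assuming Hardy-Weinberg proportions (random mating), which the legend does not state explicitly; the function name is illustrative:

```python
def expected_homozygote_frequency(allele_frequency: float) -> float:
    """Expected frequency of homozygotes under Hardy-Weinberg proportions (q squared)."""
    return allele_frequency ** 2


cutoff = 0.02  # allele frequency threshold used in the filtering step
homozygote_frequency = expected_homozygote_frequency(cutoff)
print(f"Expected homozygotes: about 1 in {1 / homozygote_frequency:,.0f}")
```

Because the abstract reports an excess of homozygosity in Iceland, the true homozygote frequency would if anything be somewhat higher than this estimate, which does not weaken the filter.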

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–9, Supplementary Tables 1–15 and Supplementary Note. (PDF 1678 kb)

Cite this article

Gudbjartsson, D., Helgason, H., Gudjonsson, S. et al. Large-scale whole-genome sequencing of the Icelandic population. Nat Genet 47, 435–444 (2015). https://doi.org/10.1038/ng.3247

  • DOI: https://doi.org/10.1038/ng.3247
