Focus on Genomes of Icelanders

Large-scale whole-genome sequencing of the Icelandic population

Journal name: Nature Genetics
Volume: 47
Pages: 435–444
DOI: 10.1038/ng.3247

Abstract

Here we describe the insights gained from sequencing the whole genomes of 2,636 Icelanders to a median depth of 20×. We found 20 million SNPs and 1.5 million insertions-deletions (indels). We describe the density and frequency spectra of sequence variants in relation to their functional annotation, gene position, pathway and conservation score. We demonstrate an excess of homozygosity and rare protein-coding variants in Iceland. We imputed these variants into 104,220 individuals down to a minor allele frequency of 0.1% and found a recessive frameshift mutation in MYL4 that causes early-onset atrial fibrillation, several mutations in ABCB4 that increase risk of liver diseases and an intronic variant in GNAS associating with increased thyroid-stimulating hormone levels when maternally inherited. These data provide a study design that can be used to determine how variation in the sequence of the human genome gives rise to human diversity.
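The 0.1% frequency floor mentioned above can be related to raw allele counts in the sequenced set. The sketch below is an illustration only, not the authors' pipeline: it assumes autosomal variants, so that 2,636 individuals carry 5,272 chromosome copies, and the function names are hypothetical.

```python
# Illustration only: relates allele counts to allele frequencies in a
# sample of diploid genomes. Assumes autosomal, biallelic variants.

SEQUENCED_INDIVIDUALS = 2636  # number of sequenced Icelanders (from the abstract)


def allele_frequency(allele_count: int,
                     n_individuals: int = SEQUENCED_INDIVIDUALS) -> float:
    """Frequency of an allele observed `allele_count` times among 2N chromosomes."""
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    total_chromosomes = 2 * n_individuals
    if not 0 <= allele_count <= total_chromosomes:
        raise ValueError("allele_count must lie between 0 and 2N")
    return allele_count / total_chromosomes


def copies_at_frequency(frequency: float,
                        n_individuals: int = SEQUENCED_INDIVIDUALS) -> float:
    """Expected number of allele copies among 2N chromosomes at a given frequency."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    return frequency * 2 * n_individuals


if __name__ == "__main__":
    # Copies implied by a 0.1% frequency in the sequenced set.
    print(copies_at_frequency(0.001))
    # Frequency (in percent) of an allele seen six times.
    print(100 * allele_frequency(6))
```

Under these assumptions, a frequency of 0.1% corresponds to roughly five copies among the sequenced genomes, and six copies give about 0.11%, which agrees with the bound quoted in the caption of Supplementary Fig. 4.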

At a glance

Figures

  1. Figure 1: Distribution of indel lengths inside and outside protein-coding regions.

    (a) The 4,001 indels inside protein-coding regions. (b) The 1,437,571 indels outside protein-coding regions. Insertions have a positive length, and deletions have a negative length. Indels that are not a multiple of three are colored gray. Indels that are a multiple of three are colored black.

  2. Figure 2: FRV and variant density by impact class and OMIM disease-related gene classification.

    (a) FRV by annotation. (b) Variant density. SNPs are shown in blue, and indels are shown in red. (c) FRV by OMIM disease gene classification and impact class. (d) Variant density relative to the impact class average by OMIM disease-related gene classification and impact class. Loss-of-function, moderate-impact and low-impact variants are shown in red, blue and green, respectively. The line segments indicate the 95% confidence interval around each observed FRV or variant density. The dotted lines indicate the genomic average FRV or variant density.

  3. Figure 3: Sequencing coverage, FRV and variant density by exon rank.

    (a) Distribution of the mean coverage by position for the whole genome, intronless genes, and the first, middle and last exons of multi-exon genes among the 2,636 whole genome–sequenced Icelanders. (b) FRV by exon rank and impact class. (c) Variant density relative to the impact class average by exon rank and impact class. (d) Variant density by exon rank. Loss-of-function, moderate-impact and low-impact variants are shown in red, blue and green, respectively, in b and c. The line segments indicate the 95% confidence interval around each observed FRV or variant density. The dotted lines indicate the genomic average FRV or variant density.

  4. Figure 4: FRV and variant density for SNPs by mammalian conservation (GERP score), PANTHER subset of GO terms, chromatin state and sensitive regions.

    (a) FRV as a function of GERP score by annotation for 2,218 stop-gain or -loss and initiator codon SNPs, 1,085 splice acceptor or donor SNPs, 82,176 missense SNPs, 10,498 splice-region SNPs, 149,415 UTR SNPs, 6,782,063 intronic, upstream or downstream SNPs, 49,837 synonymous SNPs and 8,663,577 other SNPs. (b) Variant density as a function of GERP score by annotation for coding regions spanning 25.4 Mb, splice regions spanning 2.5 Mb, UTR regions spanning 24.2 Mb, intronic, upstream or downstream regions spanning 984.4 Mb and other regions spanning 1,242.7 Mb. (c) A red diamond represents the sensory perception of smell class, which contains the olfactory receptors, and black X's represent the remaining 307 GO classes with more than 50 gene members, after the removal of the olfactory receptor genes. (d) The 13 chromatin states are indicated by black X's. Txn, transcription. (e) The 15 sensitive and 4 ultra-sensitive regions are indicated by black X's and red diamonds, respectively. For reference, the 95% confidence ellipses for annotated regions are also shown.

  5. Figure 5: The fraction of SNPs and indels identified in 2,636 Icelanders present in dbSNP and ESP by consequence.

    The analysis was restricted to 16,587,813 SNPs and 1,191,089 indels for which the ancestral allele could be inferred. (a–c) Shown is the overlap with dbSNP only (a), ESP only (b) and the union of dbSNP and ESP (c) as a function of DAF by annotation and variant type. LoF, loss of function.

  6. Figure 6: Effect of geography on frequency distribution.

    (a) The ratio of the number of SNPs found in Iceland and the number of SNPs found in the European-American portion of ESP as a function of MAF. SNPs in repeat regions, with MAF under 0.1%, and in regions for which either project had low coverage (15× for Iceland and 40× for ESP) were excluded. The SNPs are divided by consequence: stop gain (n Iceland = 563, n ESP = 413), missense (n Iceland = 38,142, n ESP = 36,272) and synonymous (n Iceland = 30,148, n ESP = 31,765). All counts are shown in Supplementary Table 9. (b) The ratio of observed and expected minor allele homozygote counts as a function of MAF. The black, blue and red lines represent all chip-typed Icelanders, the chip-typed offspring of parents from the same Icelandic county and the chip-typed offspring of parents coming from different Icelandic counties, respectively. The expected homozygote counts were calculated assuming Hardy-Weinberg equilibrium. (c) The geographical distribution of the minor alleles of the risk-conferring variants in MYL4, ABCB4, GNAS and SLC52A2, in 104,220 chip-typed Icelanders. Each bar shows the allelic frequency of the variant relative to the geographical region with the highest frequency.

  7. Supplementary Fig. 1: Sequencing depth of the 2,636 sequenced Icelanders.
  8. Supplementary Fig. 2: Overview of sequence alignment and variant calling.
  9. Supplementary Fig. 3: Overview of the process for sequence variant imputation.
  10. Supplementary Fig. 4: Distribution of the number of observed alleles in 2,636 sequenced Icelanders by impact class.

    Shown are the proportions of variants for which the minor allele was seen one to six times (MAF ≤ 0.11%).

  11. Supplementary Fig. 5: Comparison of imputed and chip genotypes.

    Shown is the fraction of the 28,204 SNPs identified in exons and splice regions and present on SNP chips that have r² > 0.8, 0.9 and 0.99 between the imputed and chip genotypes as a function of their derived allele frequency (DAF).

  12. Supplementary Fig. 6: The five pedigrees containing the eight homozygous carriers of c.234delC in MYL4.

    Symbols for homozygous carriers are colored black. Symbols for deceased individuals are struck through with a forward-leaning line. Symbols for individuals who have not been genotyped directly are struck through with a backward-leaning line. Under each individual are up to five lines containing information about the individual. First appears an identifier, consisting of a pedigree name (f1–f5), the generation of the individual in roman numerals and an enumerator within the generation. Second appear the individual’s year of birth and, if appropriate, the individual’s year of death. Third appears the individual’s c.234delC genotype, where D and W denote directly genotyped deletion and wild-type alleles, respectively, and d and w denote in silico genotypes inferred from the genotypes of relatives. The order of the alleles indicates the parent of origin, where the first allele comes from the father and the second allele comes from the mother, except for the three cases for whom parent of origin could not be assigned: f2-I:1, f2-I:2 and f3-I:2. Fourth appears an indication of whether the individual has been diagnosed with AF, with the age at onset after the @ sign. Fifth appears the presence of other relevant phenotypes: sick sinus syndrome (SSS), pacemaker implantation (PM) and sudden cardiac death (SCD).

  13. Supplementary Fig. 7: The transmission of chromosome 17 through pedigree f-2.

    The transmission of the chromosomes of the founding couple of pedigree f-2 can be reconstructed on the basis of the expected values for meiotic transmissions of chromosome 17. The horizontal red lines indicate the position of c.234delC in MYL4, and the small red square surrounding the line indicates the region around c.234delC shared identical by descent by the founding couple. The length of this interval is estimated to be 3.3 cM. The first and last 10 cM of the chromosome have been truncated. The sisters f2-II:6 and f2-II:9 are imputed to be carrying c.234delC on their paternal chromosome on the basis of the chromosomal region around the deletion having been transmitted to their children (dark blue). There is no clear transmission of either sister’s maternal chromosomal region around the deletion to one of her children (although f2-III:3 may have inherited her mother’s paternal chromosome, but a crossover occurred in the region around c.234delC where f2-III:3 is homozygous). However, for both sisters, the maternal chromosome carrying the deletion (light blue) was transmitted to an offspring at regions on both sides of MYL4 (to f2-III:2 and f2-III:4) such that, unless a double crossover occurred around MYL4, they both carry c.234delC on their maternal chromosome.

  14. Supplementary Fig. 8: The families of the BVVL cases.

    Shown are birth years and genotypes at the SLC52A2 mutation, where W denotes the wild-type allele and M denotes the mutated allele. Symbols for cases are colored black, and the symbols corresponding to the two siblings of case 4 who died early are colored gray. A forward slash indicates that the individual is deceased, and a backward slash indicates that an SLC52A2 genotype is not available for that individual.

  15. Supplementary Fig. 9: The effect of the filtering steps on the number of sequence variants that are candidates for causing BVVL syndrome in the two sisters.

    The occurrence of a rare syndrome such as BVVL in two sisters suggests that it is caused by a rare genotype with high penetrance. We therefore restricted our search to LoF and MODERATE-impact variants. The sisters are affected but neither parent is, which suggests an autosomal recessive mode of inheritance. Allelic frequency over 2% would dictate a homozygous frequency of over 1 in 2,500, which would be too high for BVVL syndrome. This brought the number of potential variants down to 3 from 147. This would not have been possible using non-Icelandic resources such as ESP, as 4 of the 147 variants are not present in the database. We note that crude filtering, such as removing all variants present in public databases, would result in removing the causative sequence variant. This left us with three correlated MODERATE-impact variants on chromosome 8q24.3: p.Leu339Pro (rs148234606) in SLC52A2, p.Gln931Arg in OPLAH and c.2982C>T in CPSF1. No one was imputed to be homozygous for the SLC52A2 variant in the set of additional chip-typed Icelanders, whereas 5 and 19 Icelanders were imputed to be homozygous for the OPLAH and CPSF1 SNPs, respectively. No early deaths were reported among these homozygous carriers, and the oldest homozygous carriers reached ages 77 and 89 years for OPLAH and CPSF1, respectively, which is inconsistent with diagnosis of BVVL, as only 1 of 77 reported BVVL syndrome cases has lived past 60 years (refs. 36–44).
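The 2% cutoff in the filtering argument follows from Hardy-Weinberg proportions, the same assumption stated for the expected homozygote counts in Figure 6b. The following is a minimal sketch, assuming random mating and a biallelic autosomal variant; the function names and the example counts are hypothetical and are not taken from the study.

```python
# Illustration only: Hardy-Weinberg expectations for minor allele homozygotes.
# Assumes random mating and a biallelic autosomal variant.


def expected_homozygote_frequency(allele_frequency: float) -> float:
    """Expected fraction of individuals homozygous for an allele of frequency q (q squared)."""
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must lie between 0 and 1")
    return allele_frequency ** 2


def observed_to_expected_ratio(observed_homozygotes: int,
                               n_individuals: int,
                               allele_frequency: float) -> float:
    """Ratio of observed homozygote count to the count expected under Hardy-Weinberg."""
    expected = n_individuals * expected_homozygote_frequency(allele_frequency)
    if expected <= 0:
        raise ValueError("expected homozygote count must be positive")
    return observed_homozygotes / expected


if __name__ == "__main__":
    # An allele frequency of 2% implies about 1 homozygote per 2,500 individuals.
    q = 0.02
    print(1 / expected_homozygote_frequency(q))

    # Hypothetical counts, chosen only to show how an excess appears as a ratio above 1.
    print(observed_to_expected_ratio(observed_homozygotes=50,
                                     n_individuals=100_000,
                                     allele_frequency=q))
```

A ratio above 1 indicates more homozygotes than random mating would predict, which is the quantity plotted against MAF in Figure 6b.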

References

  1. Frazer, K.A. et al. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007).
  2. Hindorff, L.A. et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl. Acad. Sci. USA 106, 9362–9367 (2009).
  3. Sulem, P. et al. Identification of low-frequency variants associated with gout and serum uric acid levels. Nat. Genet. 43, 1127–1130 (2011).
  4. Jonsson, T. et al. A mutation in APP protects against Alzheimer's disease and age-related cognitive decline. Nature 488, 96–99 (2012).
  5. Rafnar, T. et al. Mutations in BRIP1 confer high risk of ovarian cancer. Nat. Genet. 43, 1104–1107 (2011).
  6. Holm, H. et al. A rare variant in MYH6 is associated with high risk of sick sinus syndrome. Nat. Genet. 43, 316–320 (2011).
  7. Styrkarsdottir, U. et al. Nonsense mutation in the LGR4 gene is associated with several human diseases and other traits. Nature 497, 517–520 (2013).
  8. Jonsson, T. et al. Variant of TREM2 associated with the risk of Alzheimer's disease. N. Engl. J. Med. 368, 107–116 (2013).
  9. Helgason, H. et al. A rare nonsynonymous sequence variant in C3 is associated with high risk of age-related macular degeneration. Nat. Genet. 45, 1371–1374 (2013).
  10. Gudmundsson, J. et al. A study based on whole-genome sequencing yields a rare variant at 8q24 associated with prostate cancer. Nat. Genet. 44, 1326–1329 (2012).
  11. Stacey, S.N. et al. A germline variant in the TP53 polyadenylation signal confers cancer susceptibility. Nat. Genet. 43, 1098–1103 (2011).
  12. Steinthorsdottir, V. et al. Identification of low-frequency and rare sequence variants associated with elevated or reduced risk of type 2 diabetes. Nat. Genet. 46, 294–298 (2014).
  13. Tennessen, J.A. et al. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science 337, 64–69 (2012).
  14. Fu, W. et al. Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants. Nature 493, 216–220 (2013).
  15. Li, Y. et al. Resequencing of 200 human exomes identifies an excess of low-frequency non-synonymous coding variants. Nat. Genet. 42, 969–972 (2010).
  16. Abecasis, G.R. et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
  17. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
  18. Kong, A. et al. Detection of sharing by descent, long-range phasing and haplotype imputation. Nat. Genet. 40, 1068–1075 (2008).
  19. Pruitt, K.D., Tatusova, T., Brown, G.R. & Maglott, D.R. NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy. Nucleic Acids Res. 40, D130–D135 (2012).
  20. McLaren, W. et al. Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26, 2069–2070 (2010).
  21. Eilbeck, K. et al. The Sequence Ontology: a tool for the unification of genome annotations. Genome Biol. 6, R44 (2005).
  22. Stubbs, A. et al. Huvariome: a web server resource of whole genome next-generation sequencing allelic frequencies to aid in pathological candidate gene selection. J. Clin. Bioinforma 2, 19 (2012).
  23. Chen, F.C., Chen, C.J., Li, W.H. & Chuang, T.J. Human-specific insertions and deletions inferred from mammalian genome sequences. Genome Res. 17, 16–22 (2007).
  24. Montgomery, S.B. et al. The origin, evolution, and functional impact of short insertion-deletion variants identified in 179 human genomes. Genome Res. 23, 749–761 (2013).
  25. McKusick, V.A. Mendelian Inheritance in Man and its online version, OMIM. Am. J. Hum. Genet. 80, 588–604 (2007).
  26. Khurana, E. et al. Integrative annotation of variants from 1092 humans: application to cancer genomics. Science 342, 1235587 (2013).
  27. Petrovski, S., Wang, Q., Heinzen, E.L., Allen, A.S. & Goldstein, D.B. Genic intolerance to functional variation and the interpretation of personal genomes. PLoS Genet. 9, e1003709 (2013).
  28. Flicek, P. et al. Ensembl 2013. Nucleic Acids Res. 41, D48–D55 (2013).
  29. MacArthur, D.G. et al. A systematic survey of loss-of-function variants in human protein-coding genes. Science 335, 823–828 (2012).
  30. Zavolan, M. & van Nimwegen, E. The types and prevalence of alternative splice forms. Curr. Opin. Struct. Biol. 16, 362–367 (2006).
  31. Baker, K.E. & Parker, R. Nonsense-mediated mRNA decay: terminating erroneous gene expression. Curr. Opin. Cell Biol. 16, 293–299 (2004).
  32. Keller, A., Zhuang, H., Chi, Q., Vosshall, L.B. & Matsunami, H. Genetic variation in a human odorant receptor alters odour perception. Nature 449, 468–472 (2007).
  33. Mainland, J.D. et al. The missense of smell: functional variability in the human odorant receptor repertoire. Nat. Neurosci. 17, 114–120 (2014).
  34. Cooper, G.M. et al. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 15, 901–913 (2005).
  35. Smith, N.G., Webster, M.T. & Ellegren, H. Deterministic mutation rate variation in the human genome. Genome Res. 12, 1350–1356 (2002).
  36. Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29 (2000).
  37. Mi, H., Muruganujan, A. & Thomas, P.D. PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res. 41, D377–D386 (2013).
  38. Ernst, J., Vainas, O., Harbison, C.T., Simon, I. & Bar-Joseph, Z. Reconstructing dynamic regulatory maps. Mol. Syst. Biol. 3, 74 (2007).
  39. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
  40. Marchini, J., Howie, B., Myers, S., McVean, G. & Donnelly, P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat. Genet. 39, 906–913 (2007).
  41. Sherry, S.T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001).
  42. Mayr, E. Systematics and the Origin of Species from the Viewpoint of a Zoologist (Columbia University Press, 1942).
  43. Thorlacius, S. et al. A single BRCA2 mutation in male and female breast cancer families from Iceland with varied cancer phenotypes. Nat. Genet. 13, 117–119 (1996).
  44. Helgason, A., Yngvadottir, B., Hrafnkelsson, B., Gulcher, J. & Stefansson, K. An Icelandic example of the impact of population structure on association studies. Nat. Genet. 37, 90–95 (2005).
  45. Small, K.S. et al. Identification of an imprinted master trans regulator at the KLF14 locus related to multiple metabolic phenotypes. Nat. Genet. 43, 561–564 (2011).
  46. Kong, A. et al. Parental origin of sequence variants associated with complex diseases. Nature 462, 868–874 (2009).
  47. Wallace, C. et al. The imprinted DLK1-MEG3 gene region on chromosome 14q32.2 alters susceptibility to type 1 diabetes. Nat. Genet. 42, 68–71 (2010).
  48. Abreu, A.P. et al. Central precocious puberty caused by mutations in the imprinted gene MKRN3. N. Engl. J. Med. 368, 2467–2475 (2013).
  49. Falls, J.G., Pulford, D.J., Wylie, A.A. & Jirtle, R.L. Genomic imprinting: implications for human disease. Am. J. Pathol. 154, 635–647 (1999).
  50. Go, A.S. et al. Prevalence of diagnosed atrial fibrillation in adults: national implications for rhythm management and stroke prevention: the AnTicoagulation and Risk Factors in Atrial Fibrillation (ATRIA) Study. J. Am. Med. Assoc. 285, 2370–2375 (2001).
  51. Lloyd-Jones, D.M. et al. Lifetime risk for development of atrial fibrillation: the Framingham Heart Study. Circulation 110, 1042–1046 (2004).
  52. Strohman, R.C., Micou-Eastwood, J., Glass, C.A. & Matsuda, R. Human fetal muscle and cultured myotubes derived from it contain a fetal-specific myosin light chain. Science 221, 955–957 (1983).
  53. Cohen-Haguenauer, O. et al. Chromosomal assignment of two myosin alkali light-chain genes encoding the ventricular/slow skeletal muscle isoform and the atrial/fetal muscle isoform (MYL3, MYL4). Hum. Genet. 81, 278–282 (1989).
  54. Nicolaou, M. et al. Canalicular ABC transporters and liver disease. J. Pathol. 226, 300–315 (2012).
  55. Davit-Spraul, A., Gonzales, E., Baussan, C. & Jacquemin, E. Progressive familial intrahepatic cholestasis. Orphanet J. Rare Dis. 4, 1 (2009).
  56. Dixon, P.H. et al. Heterozygous MDR3 missense mutation associated with intrahepatic cholestasis of pregnancy: evidence for a defect in protein trafficking. Hum. Mol. Genet. 9, 1209–1217 (2000).
  57. Gudmundsson, J. et al. Discovery of common variants associated with low TSH levels and thyroid cancer risk. Nat. Genet. 44, 319–322 (2012).
  58. Sathasivam, S. Brown-Vialetto–Van Laere syndrome. Orphanet J. Rare Dis. 3, 9 (2008).
  59. Chan, W.M. et al. Expanded polyglutamine domain possesses nuclear export activity which modulates subcellular localization and toxicity of polyQ disease protein via exportin-1. Hum. Mol. Genet. 20, 1738–1750 (2011).
  60. Johnson, J.O. et al. Exome sequencing reveals riboflavin transporter mutations as a cause of motor neuron disease. Brain 135, 2875–2882 (2012).
  61. Ciccolella, M. et al. Riboflavin transporter 3 involvement in infantile Brown-Vialetto-Van Laere disease: two novel mutations. J. Med. Genet. 50, 104–107 (2013).
  62. Haack, T.B. et al. Impaired riboflavin transport due to missense mutations in SLC52A2 causes Brown-Vialetto–Van Laere syndrome. J. Inherit. Metab. Dis. 35, 943–948 (2012).
  63. Green, P. et al. Brown-Vialetto–Van Laere syndrome, a ponto-bulbar palsy with deafness, is caused by mutations in c20orf54. Am. J. Hum. Genet. 86, 485–489 (2010).
  64. Johnson, J.O., Gibbs, J.R., Van Maldergem, L., Houlden, H. & Singleton, A.B. Exome sequencing in Brown-Vialetto–van Laere syndrome. Am. J. Hum. Genet. 87, 567–569, author reply 569–570 (2010).
  65. Bosch, A.M. et al. Brown-Vialetto–Van Laere and Fazio Londe syndrome is associated with a riboflavin transporter defect mimicking mild MADD: a new inborn error of metabolism with potential treatment. J. Inherit. Metab. Dis. 34, 159–164 (2011).
  66. da Silva-Júnior, F.P., Moura Rde, D., Rosemberg, S., Marchiori, P.E. & Castro, L.H. Cor pulmonale in a patient with Brown-Vialetto–Van Laere syndrome: a case report. J. Neurol. Sci. 300, 155–156 (2011).
  67. Dakhil, F.O., Bensreiti, S.M. & Zew, M.H. Pontobulbar palsy and sensorineural deafness (Brown-Vialetto–van Laere syndrome): the first case from Libya. Amyotroph. Lateral Scler. 11, 397–398 (2010).
  68. Lombaert, A., Dom, R., Carton, H. & Bruchler, J.M. Progressive ponto-bulbar palsy with deafness. A clinico-pathological study. Acta Neurol. Belg. 76, 309–314 (1976).
  69. van Bogaert, L. & van der Broeck, J. Sclérose latérale amyotrophique ou myasthénie bulbospinal avec exaltation des réflexes tendineux et contractions fibrillaires. J. Neurol. Psychiatry 6, 380–382 (1929).
  70. Rotowski, J. & McHarg, J.F. A case of amyotrophic lateral sclerosis complicated by progressive lipodystrophy. Edin. Med. J. 60, 281–293 (1953).
  71. Gudbjartsson, D.F. et al. Sequence variants from whole genome sequencing a large group of Icelanders. Sci. Data 2, 150011 doi:10.1038/sdata.2015.11 (2015).
  72. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
  73. DePristo, M.A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
  74. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).
  75. Kent, W.J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002).
  76. Flicek, P. et al. Ensembl 2012. Nucleic Acids Res. 40, D84–D90 (2012).
  77. Paten, B., Herrero, J., Beal, K., Fitzgerald, S. & Birney, E. Enredo and Pecan: genome-wide mammalian consistency-based multiple alignment with paralogs. Genome Res. 18, 1814–1828 (2008).
  78. Paten, B. et al. Genome-wide nucleotide-level mammalian ancestor reconstruction. Genome Res. 18, 1829–1843 (2008).
  79. Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011).

Author information

  1. These authors contributed equally to this work.

    • Daniel F Gudbjartsson &
    • Hannes Helgason

Affiliations

  1. deCODE Genetics/Amgen, Inc., Reykjavik, Iceland.

    • Daniel F Gudbjartsson,
    • Hannes Helgason,
    • Sigurjon A Gudjonsson,
    • Florian Zink,
    • Asmundur Oddson,
    • Arnaldur Gylfason,
    • Gisli Magnusson,
    • Bjarni V Halldorsson,
    • Eirikur Hjartarson,
    • Gunnar Th Sigurdsson,
    • Simon N Stacey,
    • Michael L Frigge,
    • Hilma Holm,
    • Jona Saemundsdottir,
    • Hafdis Th Helgadottir,
    • Hrefna Johannsdottir,
    • Solveig Gretarsdottir,
    • G Bragi Walters,
    • Thorunn Rafnar,
    • Asgeir Sigurdsson,
    • Gyda Bjornsdottir,
    • Hakon Gudbjartsson,
    • Olafur Th Magnusson,
    • Augustine Kong,
    • Gisli Masson,
    • Unnur Thorsteinsdottir,
    • Agnar Helgason,
    • Patrick Sulem &
    • Kari Stefansson
  2. School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland.

    • Daniel F Gudbjartsson,
    • Hannes Helgason,
    • Hakon Gudbjartsson &
    • Augustine Kong
  3. Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle, Aarhus, Denmark.

    • Soren Besenbacher
  4. Institute of Biomedical and Neural Engineering, Reykjavík University, Reykjavík, Iceland.

    • Bjarni V Halldorsson
  5. Division of Cardiovascular Diseases, Mayo Clinic, Rochester, Minnesota, USA.

    • Hilma Holm
  6. Children's Hospital, Landspitali University Hospital, Reykjavik, Iceland.

    • Gunnlaugur Sigfusson
  7. Department of Medicine, Landspitali University Hospital, Reykjavik, Iceland.

    • Gudmundur Thorgeirsson,
    • Bjarni Thjodleifsson &
    • David O Arnar
  8. Faculty of Medicine, University of Iceland, Reykjavik, Iceland.

    • Gudmundur Thorgeirsson,
    • Einar S Bjornsson,
    • Sigurdur Olafsson,
    • Thora Steingrimsdottir,
    • Jon G Jonasson,
    • David O Arnar,
    • Unnur Thorsteinsdottir &
    • Kari Stefansson
  9. Department of Internal Medicine, Akureyri Hospital, Akureyri, Iceland.

    • Jon Th Sverrisson
  10. Department of Internal Medicine, Division of Gastroenterology and Hepatology, Landspitali University Hospital, Reykjavik, Iceland.

    • Einar S Bjornsson,
    • Sigurdur Olafsson,
    • Hildur Thorarinsdottir &
    • Asgeir Theodors
  11. Department of Obstetrics and Gynecology, Landspitali University Hospital, Reykjavik, Iceland.

    • Thora Steingrimsdottir &
    • Thora S Gudmundsdottir
  12. Department of Pathology, Landspitali University Hospital, Reykjavik, Iceland.

    • Jon G Jonasson
  13. Icelandic Cancer Registry, Reykjavik, Iceland.

    • Jon G Jonasson
  14. Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Iceland, Reykjavik, Iceland.

    • Jon J Jonsson
  15. Department of Genetics and Molecular Medicine, Landspitali University Hospital, Reykjavik, Iceland.

    • Jon J Jonsson
  16. Department of Pediatrics, Section of Child Neurology, The Children's Hospital of Reykjavik, Landspitali University Hospital, Reykjavik, Iceland.

    • Olafur Thorarensen &
    • Petur Ludvigsson
  17. Icelandic Medical Center (Laeknasetrid), Laboratory in Mjodd (RAM), Reykjavik, Iceland.

    • Gudmundur I Eyjolfsson
  18. Department of Clinical Biochemistry, Akureyri Hospital, Akureyri, Iceland.

    • Olof Sigurdardottir
  19. Department of Clinical Biochemistry, Landspitali University Hospital, Reykjavik, Iceland.

    • Isleifur Olafsson
  20. Department of Anthropology, University of Iceland, Reykjavik, Iceland.

    • Agnar Helgason

Contributions

D.F.G., H. Helgason, S.A.G., F.Z., D.O.A., O.T.M., G. Masson, A.H., P.S. and K.S. wrote the initial draft of the manuscript. D.F.G., H. Helgason, S.A.G., F.Z., A.O., G. Magnusson, B.V.H., E.H., G.T.S., S.N.S., M.L.F., A.K., G. Masson and P.S. analyzed the data. D.F.G., H. Helgason, S.A.G., F.Z., A.G., S.B., H.G. and G. Masson created methods for analyzing the data. S.N.S., H. Holm, J.S., H.T.H., H.J. and O.T.M. performed the experiments. H. Holm, G.S., G.T., J.T.S., S.G., G.B.W., T.R., B.T., E.S.B., S.O., H.T., T.S., T.S.G., A.T., J.G.J., A.S., G.B., J.J.J., O.T., P.L., G.I.E., O.S., I.O. and D.O.A. collected the samples and information. D.F.G., D.O.A., G. Masson, U.T., A.H., P.S. and K.S. designed the study.

Competing financial interests

The authors affiliated with deCODE Genetics are employed by the company, which is owned by Amgen, Inc: D.F.G., H. Helgason, S.A.G., F.Z., A.O., A.G., G. Magnusson, B.V.H., E.H., G.T.S., S.N.S., M.L.F., H. Holm, J.S., H.T.H., H.J., S.G., G.B.W., T.R., A.S., G.B., H.G., O.T.M., A.K., G. Masson, U.T., A.H., P.S. and K.S.

Supplementary information

Supplementary Figures

  1. Supplementary Figure 1: Sequencing depth of the 2,636 sequenced Icelanders. (57 KB)
  2. Supplementary Figure 2: Overview of sequence alignment and variant calling. (82 KB)
  3. Supplementary Figure 3: Overview of the process for sequence variant imputation. (84 KB)
  4. Supplementary Figure 4: Distribution of the number of observed alleles in 2,636 sequenced Icelanders by impact class. (95 KB)

  5. Supplementary Figure 5: Comparison of imputed and chip genotypes. (56 KB)

  6. Supplementary Figure 6: The five pedigrees containing the eight homozygous carriers of c.234delC in MYL4. (98 KB)

  7. Supplementary Figure 7: The transmission of chromosome 17 through pedigree f-2. (91 KB)

  8. Supplementary Figure 8: The families of the BVVL cases. (53 KB)

  9. Supplementary Figure 9: The effect of the filtering steps on the number of sequence variants that are candidates for causing BVVL syndrome in the two sisters. (111 KB)

PDF files

  1. Supplementary Text and Figures (1,719 KB)

    Supplementary Figures 1–9, Supplementary Tables 1–15 and Supplementary Note.
