Despite the important role that monozygotic twins have played in genetics research, little is known about their genomic differences. Here we show that monozygotic twins differ on average by 5.2 early developmental mutations and that approximately 15% of monozygotic twins have a substantial number of these early developmental mutations specific to one of them. Using the parents and offspring of twins, we identified pre-twinning mutations. We observed instances where a twin was formed from a single cell lineage in the pre-twinning cell mass and instances where a twin was formed from several cell lineages. CpG>TpG mutations increased in frequency with embryonic development, coinciding with an increase in DNA methylation. Our results indicate that the allocation of cells during development shapes genomic differences between monozygotic twins.
Access to these data is controlled; the sequence data cannot be made publicly available because Icelandic law and the regulations of the Icelandic Data Protection Authority prohibit the release of individual-level and personally identifying data. Data access can be granted only at the facilities of deCODE genetics in Iceland, subject to Icelandic law regarding data usage. Anyone wanting to gain access to the data should contact Kári Stefánsson (firstname.lastname@example.org). Data access consists of the lists of mutations identified in monozygotic twins with numbered proband identifiers. The lists of mutations are provided in Supplementary Data 1–3.
The major components in our sequence data processing pipeline consist of publicly available software, notably Burrows–Wheeler Aligner-MEM for the alignment (https://github.com/lh3/bwa), Samtools for the processing of BAM files (http://samtools.github.io/), Picard for PCR duplication marking (https://broadinstitute.github.io/picard/) and GraphTyper for sequence variant calling (https://github.com/DecodeGenetics/graphtyper). The implementation of the phasing and imputation of sequence variants is described in the data descriptor (ref. 32; Jónsson et al., Sci. Data, 2017).
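As a rough illustration of how the tools named above typically chain together, the following sketch assembles one command line per step without running anything. It is not the authors' pipeline: the file names, thread count, region string and the helper `build_pipeline_commands` are hypothetical, and the options shown are only the common basic invocations of each tool.

```python
from typing import List


def build_pipeline_commands(
    ref: str, fq1: str, fq2: str, sample: str, region: str, threads: int = 4
) -> List[List[str]]:
    """Return argument lists for a generic align/sort/dedup/call workflow.

    All paths are placeholders (assumptions); nothing is executed here.
    """
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    metrics = f"{sample}.dup_metrics.txt"
    return [
        # Align paired-end reads with BWA-MEM, writing SAM to a file.
        ["bwa", "mem", "-t", str(threads), "-o", sam, ref, fq1, fq2],
        # Coordinate-sort the alignments into a BAM file.
        ["samtools", "sort", "-o", sorted_bam, sam],
        # Mark PCR duplicates with Picard (legacy KEY=VALUE argument style).
        ["picard", "MarkDuplicates", f"I={sorted_bam}", f"O={dedup_bam}", f"M={metrics}"],
        # Index the duplicate-marked BAM for region-based access.
        ["samtools", "index", dedup_bam],
        # Genotype one region with GraphTyper.
        ["graphtyper", "genotype", ref, f"--sam={dedup_bam}", f"--region={region}"],
    ]


commands = build_pipeline_commands(
    "ref.fa", "reads_1.fq.gz", "reads_2.fq.gz", "sampleA", "chr1:1-1000000"
)
for cmd in commands:
    print(" ".join(cmd))
```

Each inner list could be passed to `subprocess.run` once the placeholder paths point at real files. Keeping the steps as data makes the order of operations easy to inspect before anything is launched.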
Van Dongen, J., Slagboom, P. E., Draisma, H. H. M., Martin, N. G. & Boomsma, D. I. The continuing value of twin studies in the omics era. Nat. Rev. Genet. 13, 640–653 (2012).
Vadlamudi, L. et al. Timing of de novo mutagenesis: a twin study of sodium-channel mutations. N. Engl. J. Med. 363, 1335–1340 (2010).
Ehli, E. A. et al. De novo and inherited CNVs in MZ twin pairs selected for discordance and concordance on attention problems. Eur. J. Hum. Genet. 20, 1037–1043 (2012).
Baranzini, S. E. et al. Genome, epigenome and RNA sequences of monozygotic twins discordant for multiple sclerosis. Nature 464, 1351–1356 (2010).
Dal, G. M. et al. Early postzygotic mutations contribute to de novo variation in a healthy monozygotic twin pair. J. Med. Genet. 51, 455–459 (2014).
Zink, F. et al. Clonal hematopoiesis, with and without candidate driver mutations, is common in the elderly. Blood 130, 742–752 (2017).
Hall, J. G. Twinning. Lancet 362, 735–743 (2003).
Herranz, G. The timing of monozygotic twinning: a criticism of the common model. Zygote 23, 27–40 (2015).
McNamara, H. C., Kane, S. C., Craig, J. M., Short, R. V. & Umstad, M. P. A review of the mechanisms and evidence for typical and atypical twinning. Am. J. Obstet. Gynecol. 214, 172–191 (2016).
Tang, W. W. C., Kobayashi, T., Irie, N., Dietmann, S. & Surani, M. A. Specification and epigenetic programming of the human germ line. Nat. Rev. Genet. 17, 585–600 (2016).
D’Gama, A. M. & Walsh, C. A. Somatic mosaicism and neurodevelopmental disease. Nat. Neurosci. 21, 1504–1514 (2018).
Dou, Y., Gold, H. D., Luquette, L. J. & Park, P. J. Detecting somatic mutations in normal cells. Trends Genet. 34, 545–557 (2018).
Sasani, T. A. et al. Large, three-generation families reveal post-zygotic mosaicism and variability in germline mutation accumulation. eLife 8, e46922 (2019).
Campbell, I. M. et al. Parental somatic mosaicism is underrecognized and influences recurrence risk of genomic disorders. Am. J. Hum. Genet. 95, 173–182 (2014).
Rahbari, R. et al. Timing, rates and spectra of human germline mutation. Nat. Genet. 48, 126–133 (2016).
Scally, A. Mutation rates and the evolution of germline structure. Philos. Trans. R. Soc. Lond. B Biol. Sci. 371, 20150137 (2016).
Jónsson, H. et al. Multiple transmissions of de novo mutations in families. Nat. Genet. 50, 1674–1680 (2018).
Lindsay, S. J., Rahbari, R., Kaplanis, J., Keane, T. & Hurles, M. E. Similarities and differences in patterns of germline mutation between mice and humans. Nat. Commun. 10, 4053 (2019).
Harland, C. et al. Frequency of mosaicism points towards mutation-prone early cleavage cell divisions in cattle. Preprint at bioRxiv https://doi.org/10.1101/079863 (2016).
Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).
Seplyarskiy, V. B. et al. Population sequencing data reveal a compendium of mutational processes in human germline. Preprint at bioRxiv https://doi.org/10.1101/2020.01.10.893024 (2020).
Xia, B. et al. Widespread transcriptional scanning in the testis modulates gene evolution rates. Cell 180, 248–262.e21 (2020).
Moorjani, P., Amorim, C. E. G., Arndt, P. F. & Przeworski, M. Variation in the molecular clock of primates. Proc. Natl Acad. Sci. USA 113, 10607–10612 (2016).
Gao, Z. et al. Overlooked roles of DNA damage and maternal age in generating human germline mutations. Proc. Natl Acad. Sci. USA 116, 9491–9500 (2019).
Guo, H. et al. The DNA methylation landscape of human early embryos. Nature 511, 606–610 (2014).
Greenberg, M. V. C. & Bourc’his, D. The diverse roles of DNA methylation in mammalian development and disease. Nat. Rev. Mol. Cell Biol. 20, 590–607 (2019).
Reik, W., Dean, W. & Walter, J. Epigenetic reprogramming in mammalian development. Science 293, 1089–1093 (2001).
Coe, B. P. et al. Neurodevelopmental disease genes implicated by de novo mutation and copy number variation morbidity. Nat. Genet. 51, 106–116 (2019).
Hardy, K., Handyside, A. H. & Winston, R. M. The human blastocyst: cell number, death and allocation during late preimplantation development in vitro. Development 107, 597–604 (1989).
Tabansky, I. et al. Developmental bias in cleavage-stage mouse blastomeres. Curr. Biol. 23, 21–31 (2013).
Ju, Y. S. et al. Somatic mutations reveal asymmetric cellular dynamics in the early human embryo. Nature 543, 714–718 (2017).
Jónsson, H. et al. Whole genome characterization of sequence diversity of 15,220 Icelanders. Sci. Data 4, 170115 (2017).
Kim, S. et al. Strelka2: fast and accurate calling of germline and somatic variants. Nat. Methods 15, 591–594 (2018).
Eggertsson, H. P. et al. Graphtyper enables population-scale genotyping using pangenome graphs. Nat. Genet. 49, 1654–1660 (2017).
Halldorsson, B. V. et al. Characterizing mutagenic effects of recombinations through a sequence level genetic map. Science 363, eaau1043 (2019).
Busing, F. M. T. A., Meijer, E. & Van Der Leeden, R. Delete-m jackknife for unequal m. Stat. Comput. 9, 3–8 (1999).
We thank everyone who participated in our studies.
H.J., H.P.E., O.A.S., O.E., G.A.A., F.Z., E.A.H., I.J., A.G., Adalbjorg Jonasdottir, Aslaug Jonasdottir, D.B., G.L.N., O.T.M., G.M., B.V.H., U.T., A.H., P.S., D.F.G. and K.S. are employed by deCODE genetics/Amgen.
Peer review information Nature Genetics thanks Jeffrey Beck, Dorret Boomsma, Ziyue Gao, Brandon Johnson, and Amy Williams for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Histogram of the genome-wide sequence coverage of the twins. Note that the sequence coverage for the monozygotic twins was aggregated across several sequencing runs, and the aggregated sequence data were used for the subsequent analysis.
The genome-wide sequence coverage of the probands’ family members. The family members of the probands were used to detect pre-PGCS mutations. Note that if both twins of a pair have sequenced children, then they will appear both as ‘Proband’ and as ‘Twin’.
Number of children with a pre-PGCS mutation. a, We counted how many children have a pre-PGCS mutation with VAF higher than a cutoff. b, We restricted to children where at least one pre-PGCS mutation was detected.
The maximum VAF of pre-PGCS mutations per proband/mate pair. a, The maximum VAF of pre-PGCS mutations per proband/mate pair. b, The standard deviation of the maximum VAF per proband/mate pair against the average of the maximum VAF.
Alternative calculations of the slopes from the three-generation approach. a, Histogram of the slopes as in Fig. 5e, except that the slopes are transformed with atan. b, The slopes in the three-generation approach with swapped roles. Note that the reciprocal slopes are not defined for near-constitutional probands due to zero sample variance.
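The two alternative slope calculations in the last caption can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: `x` and `y` stand for any paired measurements (hypothetical values here), the slope is an ordinary least-squares slope, and the helper names are illustrative. The arctangent maps any slope into the bounded interval (−π/2, π/2), and swapping the roles of the two variables leaves the slope undefined when the new predictor has zero sample variance.

```python
import math
from typing import Optional, Sequence


def ols_slope(x: Sequence[float], y: Sequence[float]) -> Optional[float]:
    """Least-squares slope of y on x; None when x has zero sample variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        return None  # slope undefined: the predictor does not vary
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def atan_slope(x: Sequence[float], y: Sequence[float]) -> Optional[float]:
    """Slope mapped to an angle in (-pi/2, pi/2), which bounds steep slopes."""
    slope = ols_slope(x, y)
    return None if slope is None else math.atan(slope)


# Hypothetical paired values, for illustration only.
x = [0.1, 0.2, 0.3]
y = [0.2, 0.4, 0.6]

forward = ols_slope(x, y)                    # y regressed on x
swapped = ols_slope(y, x)                    # roles swapped: x regressed on y
angle = atan_slope(x, y)                     # bounded transform of the forward slope
constant = ols_slope([0.5, 0.5, 0.5], x)     # zero variance in the predictor
```

Returning `None` rather than raising on zero variance keeps such cases explicit, so they can be filtered out before plotting.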
Supplementary Note and Tables 1–9
A summary of the simulation results in each scenario, compared to the relevant quantities in our observed data
The mutations identified by comparing the somatic tissues of the twins.
The mutations identified by the quad approach.
The mutations identified by the three-generation approach.
Cite this article
Jonsson, H., Magnusdottir, E., Eggertsson, H.P. et al. Differences between germline genomes of monozygotic twins. Nat Genet 53, 27–34 (2021). https://doi.org/10.1038/s41588-020-00755-1