Genome sequencing of human in vitro fertilisation embryos for pathogenic variation screening


Whole-genome sequencing of preimplantation human embryos to detect and screen for genetic diseases is a technically challenging extension to preconception screening. Combining preconception genetic screening with preimplantation testing of human embryos facilitates the detection of de novo mutations and self-validates transmitted variant detection in both the reproductive couple and the embryo’s samples. Here we describe a trio testing workflow that involves whole-genome sequencing of amplified DNA from biopsied embryo trophectoderm cells and genomic DNA from both parents. Variant prediction software and annotation databases were used to assess variants of unknown significance and previously not described de novo variants in five single-gene preimplantation genetic testing couples and eleven of their embryos. Pathogenic variation, tandem repeat, copy number and structural variations were examined against variant calls for compound heterozygosity and predicted disease status was ascertained. Multiple trio testing showed complete concordance with known variants ascertained by single-nucleotide polymorphism array and uncovered de novo and transmitted pathogenic variants. This pilot study describes a method of whole-genome sequencing and analysis for embryo selection in high-risk couples to prevent early life fatal genetic conditions that adversely affect the quality of life of the individual and families.


Whole-genome sequencing in the IVF clinic

For over two decades, preimplantation genetic testing (PGT) has been available for couples who are aware they carry a genetic condition or have had a child affected by a genetic disease. In vitro fertilisation (IVF) used in conjunction with monogenic PGT is available for couples to prevent transmission of known hereditary monogenic disorders. PGT for aneuploidy screens embryos for large segmental or whole-chromosome copy number changes and is commonly used for older women (>35 years) who have a history of infertility, miscarriages or chromosomally abnormal conceptions1,2,3. The most recent developments in clinical PGT are low-coverage next-generation sequencing and Karyomapping, which uses a highly polymorphic single-nucleotide polymorphism (SNP) microarray to identify disease-causing haplotypes. Next-generation sequencing PGT for aneuploidy (typically <0.1× depth) is useful for high-throughput screening at a reasonable cost for detecting chromosomal aneuploidies, structural variations and large copy-number variations (CNVs)4,5,6. In addition to pedigree analysis for monogenic disorders, Karyomapping has been reported to identify partial chromosomal aneuploidies as small as 1.8 Mb7.

For couples seeking to ascertain their risk of having an affected child, around 6,000 diseases exist that may be genetically screened for8. A mutation or disease-causing variant in one or both copies of approximately 5,000 human genes can cause a syndromic disease or phenotype9,10,11,12,13,14. Between 0.5% and 5% of infants are born with a genetic condition or disorder15,16. The preconception genetic screening panels that are available to determine a couple’s carrier status for disease-causing genetic variants are limited to a subset of high-risk genes7. Currently, preconception screening and PGT are performed as separate, unlinked tests17. An estimated 74 de novo SNP mutations are introduced at embryogenesis; when expressed dominantly or as part of a compound heterozygote, these can result in severe pathogenic phenotypes18,19,20.

With the declining cost and increased availability of whole-genome sequencing, we sought to explore the design of combined preconception screening and embryo PGT using whole-genome sequencing to detect disease-causing genetic variants in couples and their embryos in accordance with recommended practice guidelines21,22,23. We investigated whether whole-genome sequencing of IVF-conceived embryos could screen for hereditary syndromic genetic diseases in addition to identifying the more technically challenging syndromes resulting from de novo mutations10,15. To date, whole-genome sequencing has been used in a limited number of assisted reproduction cases, principally due to the high cost of high-throughput sequencing21. We hypothesised that whole-genome sequencing of preimplantation embryos combined with sequencing genomic DNA from both parents could address the limitations associated with current PGT techniques. The aim was to use whole-genome sequencing analysis to screen embryos for pathogenic variants that would result in severe childhood-onset diseases24.

For this pilot study, we sequenced the genomes of five IVF couples and 11 of their IVF embryos that had previously undergone clinical PGT for familial diseases with Karyomapping6,25. Whole-genome-amplified trophectoderm biopsy samples and parental genomic DNA were used as template DNA for library generation for whole-genome sequencing. Each embryo’s resolved genome sequence was triangulated against its parents’ sequences by multiple trio testing to confirm variant status, and vice versa22. Multiple trio testing of the parental and embryo genomes was also used to detect clinically actionable pathogenic variations. This was followed by variant annotation using databases to grade variant pathogenicity and by the use of pathogenicity prediction algorithms for inherited and de novo mutations and variants of unknown significance26. Detecting disease-causing pathogenic variants necessitated the use of inheritance-mode filtering to exclude false positives caused by sequencing artefacts. For each of the major modes of inheritance, curatable variant filter and classification sets were generated to detect known pathogenic variants in the ClinVar archive27. Variants of unknown significance were classified using a range of pathogenicity prediction algorithms and functional annotation databases. The threshold for classifying candidate pathogenic variants was based on the pathogenic and likely pathogenic ClinVar categories9. To differentiate between type I and type II error calls for de novo mutations, we used the variant allele frequency (VAF) and quality by depth (QD) metrics in combination to filter false-positive pathogenic variants. The purpose of this was to detect inherited pathogenic and unacceptably high-risk de novo variations that would be clinically actionable and to guide personalised diagnosis and treatment28,29. To our knowledge, this study is the first to design and test a framework for determining clinically actionable pathogenic variants.
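The trio logic described above can be illustrated with a minimal sketch. It is not the authors' pipeline (which was run in commercial software); the function name, labels and genotype representation are hypothetical and assumed purely for illustration. An embryo allele absent from both parents is flagged as a de novo candidate, and otherwise the genotype is checked for Mendelian consistency.

```python
from typing import Tuple

Genotype = Tuple[str, str]  # unphased diploid genotype, e.g. ("A", "G")


def classify_trio_site(embryo: Genotype, mother: Genotype, father: Genotype) -> str:
    """Return 'transmitted', 'de_novo_candidate' or 'mendelian_error' for one site."""
    parental_alleles = set(mother) | set(father)
    # An allele seen in the embryo but in neither parent is a de novo candidate
    # (or an amplification/sequencing artefact that needs further filtering).
    if any(allele not in parental_alleles for allele in embryo):
        return "de_novo_candidate"
    first, second = embryo
    # Mendelian consistency: one allele from the mother and the other from the father.
    consistent = (first in mother and second in father) or (
        second in mother and first in father
    )
    return "transmitted" if consistent else "mendelian_error"
```

A site labelled `mendelian_error` in this sketch would be a natural candidate for allele dropout review rather than a finding in itself.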


Study participants

Couples who had PGT for single-gene disorders provided written informed consent for whole-genome sequencing of themselves and their biopsied embryos to be included in the study6. Each participant was given the option to have the results of their genomic DNA and their biopsied embryo samples reported or withheld. All participating couples consented to whole-genome sequencing and elected to receive results for themselves and their tested PGT embryos. The study and protocol were approved by the Monash Health Human Research Ethics Committee (Ref: HREC/17/MonH/286) and all experiments were performed in accordance with protocol guidelines and regulations.

Library preparation and sequencing

Genomic DNA from five couples, which had been used as reference templates for PGT using Karyomapping, was selected for whole-genome sequencing. The DNA had been extracted from whole blood using a ReliaPrep™ Blood genomic DNA Miniprep System (Promega, USA). For the isolation of embryonic DNA, embryos created by intracytoplasmic sperm injection for the five PGT couples underwent trophectoderm biopsy, using laser or mechanical techniques, on day five or six of culture to remove 4–10 trophectoderm cells. Biopsied cells were washed three times in a solution of 1× phosphate-buffered saline (Cell Signalling Technologies, USA) and 1× polyvinylpyrrolidone (Cook Medical, Australia), followed by whole-genome amplification by multiple displacement amplification with the SureMDA system (Illumina, USA) as per the manufacturer’s instructions. Samples for whole-genome sequencing were selected based on Karyomapping quality control metrics, which indicated a SNP call rate on the HumanCytoSNP-12 BeadArray of >96% and allele dropout and miscall rates of <1%. A 1 µg sample of each parental genomic DNA and embryo whole-genome amplification product was sent to BGI Genomics (Tai Po, Hong Kong) for sequencing on the BGI-SEQ500. Briefly, the DNA samples were fragmented to approximately 350 bp with an E220 Covaris (Covaris Inc., USA), followed by 3′ end-repair, adaptor ligation, amplification by ligation-mediated polymerase chain reaction, single-strand separation and cyclisation. DNA nanoballs were produced by rolling-circle amplification, loaded onto patterned nanoarrays and sequenced as 100 bp paired-end reads on a BGI-SEQ500 (ref. 30).

Read processing

Standard raw read processing through to variant call format was performed in accordance with Genome Analysis Toolkit best practices by the BGI Genomics Online portal pipeline31,32. Raw reads were mapped to the human reference genome (GRCh37/hg19) with Burrows-Wheeler Aligner33,34; polymerase chain reaction duplicates were removed using Picard tools35; local realignment was undertaken with Genome Analysis Toolkit36,37; and variants were called with HaplotypeCaller and filtered using the variant quality score recalibration method.

SNP and indel analysis

Analysis was guided by the Standards and Guidelines from the American College of Medical Genetics for interpretation of sequence variants38,39,40. Clinically actionable variants were defined as those for which screening could be justified by an accredited medical ethics committee41,42. Each parental and embryo binary alignment map (BAM) file and raw variant call format file was imported into VarSeq (GoldenHelix, USA). Variant filtering workflows were arranged for the following inheritance modes: dominant heterozygous, recessive homozygous, compound heterozygous, X-linked and de novo, together with a low-specificity, high-sensitivity failsafe filter that had a low depth threshold (read depth >1) and omitted the genotype quality filter (Supplementary Table 3). The failsafe filter therefore intentionally yielded a high number of false positives for manual curation (Fig. 1). For variants of unknown significance or conflicting variants, a stringent pathogenicity functional prediction filter was set using the following prediction algorithms: SIFT, PolyPhen-2 HVAR, MutationTaster2, MutationAssessor, FATHMM and FATHMM MKL43,44,45,46,47. If more than one of the algorithms predicted a variant as damaging, the variant was retained. Variants were then filtered by MPC scores >2, and a final Phred-scaled CADD score of >35 concluded the mutation prediction filter set48,49. Short tandem repeats were genotyped with ExpansionHunter version 2.5.5 using the default 17 tandem repeat loci to determine short tandem repeat numbers in embryos and parents50. Calculation was performed at BGI Genomics on all embryo and parental samples for the following loci: cbl proto-oncogene (CBL), atrophin 1 (ATN1), ataxin 2 (ATXN2), ataxin 3 (ATXN3), junctophilin 3 (JPH3), calcium channel, voltage-dependent, P/Q type, alpha 1A subunit (CACNA1A), dystrophia myotonica-protein kinase (DMPK), cystatin B (CSTB), ataxin 10 (ATXN10), ataxin 7 (ATXN7), huntingtin (HTT), protein phosphatase 2, regulatory subunit B beta (PPP2R2B), ataxin 1 (ATXN1), chromosome 9 open reading frame 72 (C9ORF72), frataxin (FXN), androgen receptor (AR) and fragile X mental retardation 1 (FMR1).
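The prediction filter for unclassified variants can be sketched as follows. This is an illustrative reading of the thresholds stated above (more than one algorithm predicting damage, MPC >2, Phred-scaled CADD >35), not the VarSeq workflow itself; the `VariantScores` container, its field names and the algorithm keys are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class VariantScores:
    damaging_calls: Dict[str, bool]  # per-algorithm "damaging" prediction (keys are illustrative)
    mpc: float                       # MPC score
    cadd_phred: float                # Phred-scaled CADD score


def passes_prediction_filter(
    variant: VariantScores,
    min_damaging: int = 2,   # "more than one" algorithm, i.e. at least two
    mpc_min: float = 2.0,
    cadd_min: float = 35.0,
) -> bool:
    """Apply the three successive prediction gates to one unclassified variant."""
    n_damaging = sum(variant.damaging_calls.values())
    return (
        n_damaging >= min_damaging
        and variant.mpc > mpc_min
        and variant.cadd_phred > cadd_min
    )
```

Keeping the thresholds as keyword arguments makes it easy to relax one gate at a time when reviewing variants that narrowly fail the filter.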

Figure 1
Filter sets for pathogenic variant detection from the classifications of variants: (A) variants classified as ‘likely pathogenic’ or ‘pathogenic’, (B) unclassified variants predicted to be potentially damaging and (C) copy number variant calling pipelines.

Copy number and structural variation

CNVs were called using CNVnator (v.0.2.7)51 and structural variations with Breakdancer52 and CREST53. A secondary, overlapping CNV discovery analysis was performed by binning reads into 10 kb windows, filtering by calling loss of heterozygosity (LoH) in more than 95% of variants in flagged regions54,55 and annotating using ClinGen Gene Dosage Sensitivity (27-09-2017 release). CNVnator and Breakdancer calls were imported into VarSeq, compared with the inherited CNVs from each parent and categorised as having dosage pathogenicity for either haploinsufficiency or triplosensitivity. LoH regions (>100 variants, of which >95% showed LoH) were trio-called against the parental LoH regions. Filtering was applied for the haploinsufficiency and triplosensitivity categories of ‘sufficient evidence for dosage pathogenicity’ or ‘gene associated with autosomal recessive phenotype’, and pathogenicity was called using the target copy number state for the proband in each sample. True-positive CNVs were detected by requiring a ratio of >2.0 with a Z-score of >0 for duplications and a ratio of <0.5 with a Z-score of <0 for deletions, a mean targeted depth of >5 and a lack of quality control flags (high control variation, low control depth, low Z-score or within regional interquartile range). CNVs with recessive inheritance were cross-checked against the autosomal recessive SNP and indel variants.
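As a sketch of the thresholding step, the function below applies the stated ratio, Z-score, depth and quality-control criteria to a single binned region. It assumes that the second ratio/Z-score pair (<0.5, <0) is the deletion criterion; the function name and the representation of QC flags as a list of strings are hypothetical.

```python
from typing import Optional, Sequence


def call_cnv(
    ratio: float,
    z_score: float,
    mean_depth: float,
    qc_flags: Sequence[str] = (),
) -> Optional[str]:
    """Return 'duplication', 'deletion' or None for one binned region."""
    # Regions with insufficient depth or any QC flag are not called.
    if mean_depth <= 5 or len(qc_flags) > 0:
        return None
    if ratio > 2.0 and z_score > 0:
        return "duplication"
    # Assumption: the <0.5 ratio with Z-score <0 is the deletion criterion.
    if ratio < 0.5 and z_score < 0:
        return "deletion"
    return None
```

For example, `call_cnv(0.4, -3.2, 30.0)` returns `"deletion"`, whereas the same region carrying a `"low control depth"` flag returns `None`.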


PGT variant validation

Sequencing depth was comparable between the amplified trophectoderm-biopsy DNA from embryos and the parents’ genomic DNA (mean depth of 48.2× versus 46.1×). Embryo samples were equivalent to the couples’ genomic DNA samples for raw and clean reads, bases aligned and transition-to-transversion ratios of 2.071 and 2.081 (Supplementary Table 1 and Fig. 1a). Genome coverage for embryos and couples was comparable at sequencing depths of 4× and 10×. However, at 20×, genome coverage was lower for biopsied embryos, at 87.5% compared with 96.4% for genomic DNA (Supplementary Figs. 1b, 4a,b). Therefore, with the exception of the failsafe filter, each variant filter set had its depth threshold set at >10× coverage.
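Breadth of coverage at a given depth can be computed from per-position depths as in the sketch below. It assumes that "coverage at 20×" means the fraction of positions with depth of at least 20, which the text does not state explicitly; the function name and input format are hypothetical.

```python
from typing import Dict, Sequence


def coverage_breadth(
    depths: Sequence[int], thresholds: Sequence[int] = (4, 10, 20)
) -> Dict[int, float]:
    """Fraction of positions whose read depth is at least each threshold."""
    n_positions = len(depths)
    if n_positions == 0:
        raise ValueError("no positions supplied")
    return {t: sum(d >= t for d in depths) / n_positions for t in thresholds}
```

In practice the per-position depths would come from a depth-reporting tool run over the BAM files rather than from an in-memory list.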

Assembly and mapping for the SNP and indel calls were highly concordant between embryos and couples (Supplementary Fig. 1c–f), except for novel SNPs, which averaged 85,527 (standard deviation [SD] 29,576.6) variants in embryos and 21,663 (SD 1102.4) variants in couples. This was reflected in the high number of LoH regions in embryos (5460, SD 1609 versus 3733, SD 87), which presumably indicates regions of allele dropout.

De novo mutations

As expected for the genomic DNA samples of each couple’s male and female partners, non-homozygote VAFs showed a normal distribution, with the average centred at 0.5 (indicating 50% of reads per base, Supplementary Fig. 2b). The embryos’ heterozygote VAF distribution ranged from 0.08 to 0.34 with an average peak at 0.26 and maximum at 0.12 (Supplementary Fig. 2a). This low embryo VAF is believed to represent false-positive heterozygote calls from either base misincorporation or read misalignment22. For this reason, the de novo filter included a false-positive filtering gate to remove de novo SNP variants with a VAF < 0.35, the rationale being that the failsafe filter would still shortlist potentially dangerous or clinically actionable variants for individual curation. Variations involving deletions >1 bp had a higher VAF than those involving a base change, although we did not alter the filtering on this basis as the upper limit was approximately consistent.

An additional quality by depth (QD) threshold of >12 was added to the non-dbSNP variant subfilters. This QD threshold reduced the number of de novo variants flagged for curation from 285 across all eleven embryos to 57. QD filtering was not applied to the transmitted variants, but when this stringent filter was applied to the non-dbSNP variants, 8/125 unique and pathogenic transmitted variants were removed from reporting.
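The combined VAF and QD gate for candidate de novo variants can be expressed as a short sketch using the two thresholds given above (VAF < 0.35 removed, QD > 12 required). The function names are hypothetical, and the gate is shown for de novo candidates only, since the text states that QD filtering was not applied to transmitted variants.

```python
def variant_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that support the alternate allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def passes_de_novo_gate(
    vaf: float, qd: float, vaf_min: float = 0.35, qd_min: float = 12.0
) -> bool:
    """Keep a candidate de novo variant only if VAF >= 0.35 and QD > 12."""
    return vaf >= vaf_min and qd > qd_min
```

A site with 3 alternate reads out of 21 has a VAF of about 0.143 and would be removed by this gate if it were a de novo candidate, which is why transmitted variants are handled by a separate filter set.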

Variant filters were therefore arranged, for each mode of inheritance, into two parallel sub-filter sets so that all variants would be assessed: one sub-filter for annotating variants catalogued in dbSNP and a second for variants not catalogued to date, for which pathogenicity prediction was used (Fig. 1a–c).

Variant trio-calling

Three of the five couples had undergone PGT for autosomal dominant conditions, one for an autosomal recessive condition and one for an X-linked condition (Table 1). To confirm the embryo PGT results, in three of the five couples at least one euploid embryo was available (i.e. affected, carrier or unaffected). To determine the concordance between the whole-genome sequencing results and the HumanCytoSNP-12 BeadArray platform used for the couples’ clinical Karyomapping cycles, heterozygote calls (~75,000 variants) were assessed, indicating >99.0% concordance with whole-genome sequencing calls. Comparing the results of the pathogenic variants previously diagnosed during monogenic PGT cycles using Karyomapping to those obtained through whole-genome sequencing indicated complete concordance for both couples and embryos (Table 1). One embryo’s PGT variant had a substantially lower than expected VAF (0.143; 3/21 reads), but as this was a transmitted variant, it was called by the filter pipeline.
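A concordance figure of this kind can be computed as the proportion of shared sites at which the array and sequencing genotypes agree. The sketch below is one plausible way to do so, not the authors' procedure; the function names, the two-letter genotype strings and the site identifiers in the usage note are hypothetical.

```python
from typing import Dict


def normalise(genotype: str) -> str:
    """Make allele order irrelevant, so that 'GA' and 'AG' compare equal."""
    return "".join(sorted(genotype.upper()))


def genotype_concordance(
    array_calls: Dict[str, str], wgs_calls: Dict[str, str]
) -> float:
    """Proportion of sites called on both platforms with identical genotypes."""
    shared_sites = array_calls.keys() & wgs_calls.keys()
    if not shared_sites:
        raise ValueError("no sites in common")
    matches = sum(
        normalise(array_calls[site]) == normalise(wgs_calls[site])
        for site in shared_sites
    )
    return matches / len(shared_sites)
```

For instance, with array calls `{"site1": "AG", "site2": "CT", "site3": "AG"}` and sequencing calls `{"site1": "GA", "site2": "CT", "site3": "AA"}`, two of the three shared sites agree.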

Table 1 Couples and embryo numbers by inheritance, disease status and type of variant.

Pathogenic and predicted pathogenic variant detection in embryos

For the recessive filter there was an average of 0.82 transmitted pathogenic variants found in dbSNP per embryo (build 151, ranging between 1 and 2 stars for ClinVar review status, 0 stars representing no assertion criteria or minimal evidence, up to 4 stars for clinical practice guideline). This compares with an average of 1.27 non-inherited variants per embryo that were predicted pathogenic (Fig. 2, excluding variants for which the couples had originally sought PGT). In one of the couples, both partners were heterozygote carriers of the CFTR ΔF508 mutation, which resulted in at least one heterozygote embryo.

Figure 2
Bar graphs of the filter system for determining the clinically relevant variants proposed for embryo selection for each mode of inheritance: (A) filter sets for determining clinically relevant variants classified as either likely pathogenic or pathogenic and (B) filter sets for variants not yet classified but potentially damaging or disease causative. Filters in each row are successively added to the total number of variants remaining.

For the dominant filters, 1.27 pathogenic variants per embryo were in dbSNP, compared to a mean of 0.45 non-dbSNP predicted pathogenic variants. To detect transmitted pathogenic or predicted pathogenic variants occurring in regions of allele dropout and/or low coverage in the amplified embryo DNA compared to parental sequences that used genomic DNA, LoH was used (>95% and 100 variants) for variants which had fewer than 10 reads. An average of 2.3 (SD 1.2) pathogenic or predicted pathogenic variants across all the filters were noted as expected but missing from the embryo sequencing owing to the low coverage threshold or LoH. Pathogenic variants in low-coverage regions were phased using the nearest flanking SNPs of the missing regions to determine the carrier status. A mean of 4.5 (SD 3.7) likely pathogenic or pathogenic variants were found in embryos, and a mean of 5.5 (SD 3.4) variants were deemed potentially pathogenic and required haplotype curation via LoH to account for dropout of potentially inherited but missing pathogenic variants.
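The LoH criterion can be read as a simple per-region test, sketched below. The interpretation is an assumption: a region is flagged when it contains more than 100 variants and more than 95% of them are homozygous in the embryo. The function name and the boolean per-variant representation are hypothetical.

```python
from typing import Sequence


def is_loh_region(
    is_heterozygous: Sequence[bool],
    min_variants: int = 100,
    min_hom_fraction: float = 0.95,
) -> bool:
    """Flag a region as LoH from per-variant heterozygosity calls."""
    n_variants = len(is_heterozygous)
    # Assumption: strictly more than `min_variants` calls are required.
    if n_variants <= min_variants:
        return False
    hom_fraction = sum(not het for het in is_heterozygous) / n_variants
    return hom_fraction > min_hom_fraction
```

Regions flagged in this way would then be compared with the parental LoH regions, so that only embryo-specific stretches are treated as candidate allele dropout.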

To prevent filtering of true-positive de novo mutations, the failsafe filter container was used to capture clinically relevant variants for curation. After elimination of PGT variants, 17 variants with a review status of 3 stars were detected in the 11 embryos; none of these were in clinically actionable essential or developmental delay genes, and they were removed following QD filtering. Review status classification revealed that only the failsafe filters had missing calls, with a mean of 2.36 (SD 3.86); none of the variants captured by the failsafe filter resulted in compound heterozygotes derived from transmitted variants. There were no ClinVar review status 1-star (conflicting interpretations) variants found in any of the embryo samples. Similarly, there were no compound heterozygotes, homozygous autosomal recessive or X-linked (in females) variants, or likely pathogenic or pathogenic variants among the American College of Medical Genetics incidental findings, in embryos or parental genomes. There were 109 unclassified candidate pathogenic de novo mutations across the 11 embryos, nine of which recurred across multiple embryos; all but two of these occurred in more than one family. Ten candidate de novo autosomal dominant variants in four embryos had a VAF < 0.4 and only one had a VAF > 0.5, indicating a high likelihood of false-positive calls. Adding the QD minimum threshold to the unclassified filters (removing variants with QD < 12) reduced the candidate false-positive unclassified variant calls to one de novo mutation at the ABL1 locus (rs121913459, VAF 0.63, QD = 20.9) in a single embryo56.

Tandem repeat disease loci analysis

At the 17 known disease-causing loci for which ExpansionHunter assessed the tandem repeat number, no parental samples indicated pathogenic repeat numbers. In embryo samples, most of the loci tested provided at least one concordant call in terms of transmission exactness. At three loci, both alleles were discordant: FMR1, ATXN1 and ATXN3.

Copy-number and structural variations

CNVs were assessed by direct transmission and by binning reads in 10 kb windows, and were compared against inheritance and ClinGen dosage sensitivity scores for pathogenicity. CNV calls were more numerous in the embryos than in the parental samples, except for inter-chromosomal structural variants and structural deletions, suggesting a high false-positive rate (Supplementary Fig. 1f and Supplementary Table 1). As anticipated from the Karyomapping results, no pathogenic CNVs were detected (Fig. 3). There was a mean of 2.0 deleterious autosomal recessive structural variations for both couples and embryos compared with a mean of 5.21 and 8.05 structural variations for couples and embryos, respectively, for which triplosensitivity was contributing as autosomal recessive.

Figure 3
Copy number variant charts for an embryo genome sequencing sample from chromosomes 1–22: (A) target mean depth, where the top intensity bar is the paternal depth, the central bar is the maternal depth and the lower bar is the embryo depth (black indicates no coverage and yellow indicates high coverage); (B) loss of heterozygosity proportion of the variants in the expected state of variant heterozygosity loss for the embryo (green dots); (C) ratio of coverage regions for the embryo sample (blue connector); (D) ratio of binned regions in 10 kb windows (red connector); (E) z-scores of the parents and embryo samples, where the top intensity bar is the paternal depth, the central bar is the maternal depth and the lower bar is the embryo depth (light purple indicates a low z-score and dark purple indicates a high z-score).


The purpose of this study was to develop a method of whole-genome sequencing analysis that could be used to screen human embryos for pathogenic variants. To achieve this, we first used parental genome sequences to identify the transmitted variants. Embryo biopsy samples that had undergone multiple displacement amplification and parental genomic DNA samples obtained from blood were used as templates for generating DNA libraries that were subsequently sequenced. Sequenced genomes of embryos and parents were analysed using variant annotation databases and functional prediction algorithms to detect the transmission or introduction of pathogenic mutations. Parallel filter sets were arranged to separately identify unacceptably high-risk or known pathogenic variations, CNVs and chromosomal-scale rearrangements. Multiple trio testing of each embryo against the couples’ genomes facilitated the detection of transmitted and de novo variants called as likely pathogenic or pathogenic by disorder or variant categorisation. The complete concordance between variant calls on the SNP array and whole-genome sequencing results indicated that inherited variants were confidently detected via trio testing.

De novo variant calling in embryos presented a unique challenge. A custom VAF filter was required to minimise false-positive single-base substitutions that were likely introduced by multiple displacement amplification. The VAF soft threshold of <0.35 and quality scores guided the de novo variant calling. This threshold was marginally higher than the reported de novo false-positive threshold of 0.28 to 0.33 (ref. 22). We used VAF, base quality metrics and functional interpretation of pathogenicity to differentiate between true- and false-positive calls. Strict filtering of de novo mutations and the associated risk of under-calling were offset by the failsafe filter set, which was intended to perform a high-sensitivity, low-specificity function that would pick up clinically actionable variants. Individual curation of these candidate variants indicated that they were likely to be false positives based on low VAF. To validate specific de novo variants, direct polymerase chain reaction following embryo re-biopsy or using DNA obtained from culture media is a feasible option57. The known PGT variant occurring at an extraordinarily low VAF (0.143) in one of the embryos exemplifies the necessity of having specific filter sets for each mode of transmission and variant subtype.

To avoid pathogenic variants transmitted in low- or missing-coverage regions going undetected, an untransmitted variant filter was used to manually examine the haplotypes flanking uncalled variants and confirm the result at each site. The uniform coverage exhibited by multiple displacement amplification of DNA from the embryos suggests that the likelihood of a pathogenic de novo mutation arising in a region with low coverage is remote. These type II errors are further mitigated by the failsafe low-coverage assessment filter, although LoH and VAF filtering can guide manual decision-making. We avoided imputation for regions of LoH to focus on what could be ascertained directly from the data.

For this study we performed pathogenic variant detection of known likely pathogenic and pathogenic variants in accordance with available databases of variants that have high to complete penetrance. Further work is required to stratify the outcomes of compound heterozygotes in which at least one variant is ranked likely pathogenic. Here, we used a non-exhaustive list of essential genes combined with known developmental delay genes. A list of core disease genes for embryo genome screening is necessary to avoid overcalling58.

For CNV calls, the recommended 10 kb size for the bins represents the lower limit for the annotation software, which coincides with the upper limit for variant call format file indels. For variations exceeding 10 kb, variant calls were inconsistent between the couples and the embryos, and a read-binning approach was required to confidently call CNV and structural variations. CNV detection via analysis of 10 kb bins overcomes the issue of high false-positive CNV calls, as evidenced by the concordance between partner and embryo genomes. The effective 10 kb upper size limit of indels is conveniently bridged by performing binned CNV analysis in 10 kb blocks. This addressed the issue of the limitations of multiple displacement amplification, enabling comprehensive compound CNV detection of inherited variants and de novo mutations. Short tandem repeat loci yielded inconsistent results for parental and embryo genomes, an observation not pursued further. Clinically, it would be beneficial to use preconception short tandem repeat assessment of premutations at loci responsible for short tandem repeat disorders.

There are limitations to this pilot study and areas where further work is required. Pathogenic de novo mutations occurring in a region of no or low coverage will be a challenging limitation to overcome. Further work is required to determine the likelihood of one of these highly improbable scenarios occurring. A second limitation is the VAF threshold, which complicates de novo mutation calling. The need to determine the validity of de novo mutation calls meant filtering out false-positive variants that were likely generated by polymerase base incorporation errors during multiple displacement amplification, allele dropout or misaligned reads. The advantages to embryo development and implantation rates conferred by trophectoderm biopsy of 4–8 cells serve as an additional benefit by maximising embryo genome sequencing coverage. Although the data suggested that mean VAF varies by mutation type, this was not explored in the present study. Minimising amplification and sequencing artefacts through allelic ratio and haplotype scoring reduces the number of candidate de novo mutations to a number that can, if necessary, be curated. An ethnicity-specific penetrance magnitude metric to guide the level of pathogenicity would be highly relevant for IVF-based screening.

Controversy regarding whole-genome sequencing in IVF is reflected in contemporary questions of the utility of transferring chromosomally mosaic embryos in PGT aneuploidy screening. We provide compelling evidence in favour of using whole-genome sequencing for screening embryos for pathogenic, severe disease-causing and unacceptably high-risk de novo mutations. Offering clinical genome screening of embryos in the IVF clinic, either as a standalone test or after low-coverage PGT, is based on evidence that the major classes of pathogenic variation can be reliably detected. In addition to comprehensive genomic screening, several embryo development-related aneuploidies that cannot currently be screened for via next-generation sequencing-based PGT (i.e. 69XXX and low-level mosaicism) can be directly observed and screened via this protocol because of its unlimited resolution of structural variation. Although low-coverage PGT for aneuploidy is effective for detecting large (>10 Mbp) chromosomal aneuploidies, 1–2% of conceptions carry a de novo CNV or structural aneuploidy of >100 kb, a significant gap in the detection threshold16.

The concept of applying whole-genome sequencing for PGT is contentious, the main concerns being the sensitivity and specificity of a testing system and the ethical questions that arise59,60,61,62. The ongoing emotional and psychological burden borne by the parents and the monetary cost to a healthcare system of caring for an affected individual are vastly greater than the cost of a genome sequencing test63. For IVF patients, undiagnosed reasons for a couple’s subfertility can be diagnosed and factored into the initial screening to produce a viable pregnancy. Additionally, pharmacogenetics-guided stimulation regimens for oocyte retrieval and personalised embryo culture media based on metabolomic pathway analysis could be ascertained.

The method we propose for screening embryos for pathogenic content has provided evidence of the feasibility of whole-genome sequencing to screen biopsied IVF embryos for severe disease-causing pathogenic variants. By including de novo mutations and premutation short tandem repeat disorders in preconception testing, the risk of childhood disease with known genetic aetiologies can be significantly reduced, should a couple choose to do so. The discovery of the CFTR ΔF508 mutation in one of the couples having PGT for an alternative mutation exemplifies the justification, relevance and utility of this study.

This study is the first to demonstrate the validity of using whole-genome sequencing in the IVF clinic. Further research is required to stratify variant penetrance across ethnicities and to expand the variant data to include variants of unknown significance and idiopathic disorders with polygenic risk.


  1. Sullivan-Pyke, C. & Dokras, A. Preimplantation Genetic Screening and Preimplantation Genetic Diagnosis. Obstetrics and Gynecology Clinics of North America 45, 113–125, (2018).

  2. Chen, H.-F. et al. Preimplantation genetic diagnosis and screening: Current status and future challenges. Journal of the Formosan Medical Association 117, 94–100, (2018).

  3. Munné, S. Status of preimplantation genetic testing and embryo selection. Reproductive BioMedicine Online 37, 393–396, (2018).

  4. Wells, D. et al. Clinical utilisation of a rapid low-pass whole genome sequencing technique for the diagnosis of aneuploidy in human embryos prior to implantation. Journal of Medical Genetics 51, 553 (2014).

  5. Van der Aa, N., Esteki, M. Z., Vermeesch, J. R. & Voet, T. Preimplantation genetic diagnosis guided by single-cell genomics. Genome Medicine 5, 71–71, (2013).

  6. Handyside, A. H. et al. Karyomapping: a universal method for genome wide analysis of genetic disease based on mapping crossovers between parental haplotypes. Journal of Medical Genetics 47, 651–658 (2010).

  7. Harper, J. C. et al. Recent developments in genetics and medically assisted reproduction: from research to clinical applications. European Journal of Human Genetics: EJHG 26, 12–33, (2018).

  8. Online Mendelian Inheritance in Man, O. McKusick‐Nathans Institute for Genetic Medicine, Johns Hopkins University (Baltimore, MD) and National Center for Biotechnology Information, National Library of Medicine (Bethesda, MD). (2000).

  9. Landrum, M. J. et al. ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Res. 44, (2016).

  10. Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Research 29, 308–311 (2001).

  11. UniProt: the universal protein knowledgebase. Nucleic Acids Research 45, D158–D169, (2017).

  12. Amberger, J. S., Bocchini, C. A., Schiettecatte, F., Scott, A. F. & Hamosh, A. Online Mendelian Inheritance in Man (OMIM(®)), an online catalog of human genes and genetic disorders. Nucleic Acids Research 43, D789–D798, (2015).

  13. Stenson, P. D. et al. The Human Gene Mutation Database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studies. Human Genetics 136, 665–677, (2017).

  14. Rappaport, N. et al. In Current Protocols in Bioinformatics (John Wiley & Sons, Inc., 2002).

  15. Verma, I. C. & Puri, R. D. Global burden of genetic disease and the role of genetic screening. Seminars in Fetal and Neonatal Medicine 20, 354–363, (2015).

  16. Jackson, M., Marks, L., May, G. H. W. & Wilson, J. B. The genetic basis of disease. Essays in biochemistry 62, 643–723, (2018).

  17. Haham, L. M. et al. Preimplantation genetic diagnosis versus prenatal diagnosis—decision-making among pregnant FMR1 premutation carriers. Journal of Assisted Reproduction and Genetics 35, 2071–2075, (2018).

  18. Acuna-Hidalgo, R. et al. Post-zygotic Point Mutations Are an Underrecognized Source of De Novo Genomic Variation. The American Journal of Human Genetics 97, 67–74, (2015).

  19. Kondrashov, A. S. Direct estimates of human per nucleotide mutation rates at 20 loci causing mendelian diseases. Human Mutation 21, 12–27, (2003).

  20. Acuna-Hidalgo, R., Veltman, J. A. & Hoischen, A. New insights into the generation and role of de novo mutations in health and disease. Genome Biology 17, 241, (2016).

  21. Kumar, A. et al. Whole genome prediction for preimplantation genetic diagnosis. Genome Medicine 7, 35, (2015).

  22. Peters, B. A. et al. Detection and phasing of single base de novo mutations in biopsies from human in vitro fertilized embryos by advanced whole-genome sequencing. Genome Research 25, 426–434, (2015).

  23. Dequeker, E. et al. Best practice guidelines for molecular genetic diagnosis of cystic fibrosis and CFTR-related disorders – updated European recommendations. European Journal of Human Genetics : EJHG 17, 51–65, (2009).

  24. Burke, W., Tarini, B., Press, N. A. & Evans, J. P. Genetic screening. Epidemiologic Reviews 33, 148–164, (2011).

  25. Natesan, S. A. et al. Genome-wide karyomapping accurately identifies the inheritance of single-gene defects in human preimplantation embryos in vitro. Genetics in Medicine 16, 838–845, (2014).

  26. Natarajan, P. et al. Aggregate penetrance of genomic variants for actionable disorders in European and African Americans. Science Translational Medicine 8, 364ra151–364ra151, (2016).

  27. Landrum, M. J. et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic acids research 42, D980–D985, (2014).

  28. Yan, Y. et al. Association of Follicle-Stimulating Hormone Receptor Polymorphisms with Ovarian Response in Chinese Women: A Prospective Clinical Study. PLoS One 8, e78138, (2013).

  29. Wosnitzer, M. S. Genetic evaluation of male infertility. Translational Andrology and Urology 3, 17–26, (2014).

  30. Patch, A.-M. et al. Germline and somatic variant identification using BGISEQ-500 and HiSeq X Ten whole genome sequencing. PLoS One 13, e0190264, (2018).

  31. Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Current protocols in bioinformatics/editoral board, Andreas D. Baxevanis … [et al.] 11, 11.10.11–11.10.33, (2013).

  32. Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Current Protocols in Bioinformatics 43, 11.10.11–11.10.33, (2013).

  33. Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, (2010).

  34. Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26, 589–595, (2010).

  35. Picard Tools.

  36. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics 43, 491–498, (2011).

  37. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research 20, (2010).

  38. Rehm, H. L. et al. ACMG clinical laboratory standards for next-generation sequencing. Genetics in Medicine : Official Journal of the American College of Medical Genetics 15, 733–747, (2013).

  39. Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, (2015).

  40. McDonnell, E., Strasser, K. & Tsang, A. In Fungal Genomics: Methods and Protocols (eds Ronald P. de Vries, Adrian Tsang, & Igor V. Grigoriev) 185–208 (Springer New York, 2018).

  41. Alankarage, D. et al. Identification of clinically actionable variants from genome sequencing of families with congenital heart disease. Genetics in Medicine 21, 1111–1120, (2019).

  42. Carter, T. C. & He, M. M. Challenges of Identifying Clinically Actionable Genetic Variants for Precision Medicine. Journal of Healthcare Engineering 2016, 3617572, (2016).

  43. Ng, P. C. & Henikoff, S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Research 31, 3812–3814 (2003).

  44. Adzhubei, I., Jordan, D. M. & Sunyaev, S. R. Predicting Functional Effect of Human Missense Mutations Using PolyPhen-2. Current Protocols in Human Genetics/Editorial Board, Jonathan L. Haines … [et al.] 0 7, Unit7.20-Unit27.20, (2013).

  45. Schwarz, J. M., Cooper, D. N., Schuelke, M. & Seelow, D. MutationTaster2: mutation prediction for the deep-sequencing age. Nature Methods 11, 361, (2014).

  46. Reva, B., Antipin, Y. & Sander, C. Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Research 39, e118–e118, (2011).

  47. Shihab, H. A. et al. Ranking non-synonymous single nucleotide polymorphisms based on disease concepts. Human Genomics 8, 11–11, (2014).

  48. Samocha, K. E. et al. Regional missense constraint improves variant deleteriousness prediction. bioRxiv, (2017).

  49. Kircher, M. et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nature Genetics 46, 310–315, (2014).

  50. Dolzhenko, E. et al. Detection of long repeat expansions from PCR-free whole-genome sequence data. Genome Research, (2017).

  51. Abyzov, A., Urban, A. E., Snyder, M. & Gerstein, M. CNVnator: An approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing. Genome Research 21, 974–984, (2011).

  52. Fan, X., Abbott, T. E., Larson, D. & Chen, K. BreakDancer – Identification of Genomic Structural Variation from Paired-End Read Mapping. Current Protocols in Bioinformatics/Editoral Board, Andreas D. Baxevanis ... [et al.] 2014, 10.1002/0471250953.bi0471251506s0471250945, (2014).

  53. Wang, J. et al. CREST maps somatic structural variation in cancer genomes with base-pair resolution. Nature Methods 8, 652–654, (2011).

  54. Kearney, H. M., Thorland, E. C., Brown, K. K., Quintero-Rivera, F. & South, S. T. American College of Medical Genetics standards and guidelines for interpretation and reporting of postnatal constitutional copy number variants. Genetics In Medicine 13, 680, (2011).

  55. Riggs, E. R. et al. Copy number variant discrepancy resolution using the ClinGen dosage sensitivity map results in updated clinical interpretations in ClinVar. Human Mutation 39, 1650–1659, (2018).

  56. Roche-Lestienne, C. et al. Several types of mutations of the Abl gene can be found in chronic myeloid leukemia patients resistant to STI571, and they can pre-exist to the onset of treatment. Blood 100, 1014–1018, (2002).

  57. Yang, L. et al. Presence of embryonic DNA in culture medium. Oncotarget 8, 67805–67809, (2017).

  58. Matthijs, G. et al. Guidelines for diagnostic next-generation sequencing. European Journal of Human Genetics 24, 2, (2015).

  59. Vaz-de-Macedo, C. & Harper, J. A closer look at expanded carrier screening from a PGD perspective. Human Reproduction 32, 1951–1956, (2017).

  60. Harper, J. C. Preimplantation genetic screening. Journal of Medical Screening 25, 1–5, (2018).

  61. Winand, R. et al. In vitro screening of embryos by whole-genome sequencing: now, in the future or never? Human Reproduction 29, 842–851, (2014).

  62. Chrystoja, C. C. & Diamandis, E. P. Whole Genome Sequencing as a Diagnostic Test: Challenges and Opportunities. Clinical Chemistry 60, 724, (2014).

  63. McCandless, S. E., Brunger, J. W. & Cassidy, S. B. The Burden of Genetic Disease on Inpatient Care in a Children’s Hospital. American Journal of Human Genetics 74, 121–127 (2004).

Author information

N.M.M. conceived the project, designed and performed experiments, performed bioinformatics, analysed data and wrote the paper; T.S. gave technical and writing support and conceptual advice; L.M. gave technical, conceptual and writing support and performed experiments; J.M. gave writing support and conceptual advice; L.R. gave writing support and conceptual advice.

Corresponding author

Correspondence to Nicholas M. Murphy.

Ethics declarations

Competing interests

There are competing funding interests from Monash IVF, which contributed to a research grant for this study through the Monash Research and Education Fund. The authors were employed at Monash IVF for the main duration of the study. A provisional patent based on the study method protocol has been submitted.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit

About this article

Cite this article

Murphy, N.M., Samarasekera, T.S., Macaskill, L. et al. Genome sequencing of human in vitro fertilisation embryos for pathogenic variation screening. Sci Rep 10, 3795 (2020).
