In the four decades since in vitro fertilization (IVF) celebrated its first live birth, the technology has significantly improved. Yet today, even in the best prognosis patients, only about one-third of embryos that are transferred to a woman’s uterus result in a live birth, a fact that causes significant emotional, physical, and financial hardship for patients. About a decade ago, a new test to distinguish viable from nonviable embryos by assessing some of their chromosomes was rapidly introduced into clinical care. While the clinical details of this method of testing embryos are primarily relevant for fertility providers and their patients, the story of its premature adoption can serve as a cautionary tale for everyone working in human genetics.

The first version of preimplantation genetic screening (PGS; later renamed preimplantation genetic testing for aneuploidy [PGT-A]) was developed almost 30 years ago. Initial PGT-A involved removing one cell from an embryo containing about eight cells (day 3 of development) and used fluorescent in situ hybridization (FISH) technology to assess whether that embryo had a normal number (ploidy) of the few (7–9) chromosomes tested.1 A different kind of analysis to look for specific DNA mutations, called preimplantation genetic diagnosis (PGD; renamed preimplantation genetic testing for monogenic diseases [PGT-M]), can also be performed on this single biopsied cell. It was later shown that initial PGT-A was not able to improve pregnancy or live birth rates. In fact, when it was assessed in a randomized controlled study, it became clear that embryos biopsied on day 3 of development had lower implantation and delivery rates than nonbiopsied embryos, presumably because of embryo mosaicism, because the test gathered incomplete chromosome information, or because removal of a cell damaged the embryo.1

In the early 2000s, a new PGT-A method was introduced, whereby embryos were biopsied at the blastocyst stage (day 5 or 6 of development). At the blastocyst stage, the embryo consists of a few hundred cells, including the inner cell mass, which will become the fetus, and the trophectoderm, which will become the placenta. In this later version of PGT-A, several cells (usually 5–6, but up to about 10) are biopsied from the trophectoderm and submitted for analysis of the entire set of chromosomes. Initial studies showed far higher implantation rates following biopsy of blastocyst-stage than of cleavage-stage embryos.2 In 2012, a pilot study reported higher ongoing pregnancy rates for embryos tested using the later PGT-A than for those selected using standard morphological criteria.2 That same year, a study of over 140 trophectoderm biopsies reported that just 4% of embryos designated as aneuploid led to delivery of a healthy child, suggesting very low false positive rates for the test.3 The following year, a controlled trial reported higher live birth rates in patients whose embryos were assessed using PGT-A.2 As a consequence of these early reports, top US IVF clinics quickly adopted PGT-A and advertised its use on their websites, claiming that the test was 96–99% accurate and that it increased the chance of a successful pregnancy by up to 30%.

In the rush to improve IVF effectiveness, little attention was given to the many limitations of these initial studies, including lack of randomization, the use of good prognosis patients making the results difficult to generalize, and reporting rates of live birth per embryo transferred rather than rates of live birth per cycle initiated or per egg retrieval (so-called per intention to treat), thereby excluding patients who produced embryos but had no transfers because all embryos were found to be aneuploid. The possibility of false positive test results was downplayed, even though a false positive test result can be devastating to fertility patients.
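The difference between the two denominators can be made concrete with a short calculation. The sketch below is illustrative only: the function name `live_birth_rates` and every count in it are hypothetical and are not drawn from any of the studies cited here. It assumes, for simplicity, at most one transfer per started cycle and at most one live birth per transfer.

```python
def live_birth_rates(cycles_started: int, transfers: int, live_births: int) -> dict[str, float]:
    """Return the live birth rate per transfer and per cycle started.

    Simplifying assumption (not from the article): at most one transfer
    per started cycle and at most one live birth per transfer.
    """
    if not (0 <= live_births <= transfers <= cycles_started and transfers > 0):
        raise ValueError("expected 0 <= live_births <= transfers <= cycles_started, transfers > 0")
    return {
        "per_transfer": live_births / transfers,
        "per_cycle_started": live_births / cycles_started,  # intention to treat
    }


# Hypothetical counts, chosen only to show how the two rates can diverge.
# Tested arm: 40 of 100 patients have no transfer because every embryo
# was labeled aneuploid, so they drop out of the per-transfer denominator.
tested = live_birth_rates(cycles_started=100, transfers=60, live_births=30)
untested = live_birth_rates(cycles_started=100, transfers=95, live_births=35)

for label, rates in (("tested", tested), ("untested", untested)):
    print(f"{label}: per transfer {rates['per_transfer']:.1%}, "
          f"per cycle started {rates['per_cycle_started']:.1%}")
```

With these made-up counts, the tested arm looks better per transfer (50% versus about 37%) but worse per cycle started (30% versus 35%), which is the pattern that per-transfer reporting can hide.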

Recently, it has become clear that PGT-A had false positive rates higher than the 4% originally claimed and that the test was unable to differentiate between viable and nonviable embryos.4 As with the initial PGT-A, the latest PGT-A was prematurely introduced into clinical care without proper validation studies. Evidence of its inaccuracy started to emerge in 2015 when clinics in both New York and Rome reported births of healthy newborns following transfer of aneuploid embryos in patients who had no “normal, euploid” embryos available for transfer.5 Since then, similar reports have emerged from other clinics, confirming that embryos labeled “abnormal” or aneuploid were incorrectly classified.4 These reports of healthy live births after transfer of supposedly aneuploid embryos strongly suggest that fertility clinics offering the PGT-A test have discarded thousands of embryos that could have led to normal births. Patients may have suffered the financial, physical, and psychological burden of undergoing unnecessary additional IVF cycles or have forgone having children. One reproductive endocrinologist recently estimated that in the past 6 years, up to 40% of viable embryos may have been discarded on the basis of incorrect results from PGT-A.4

Why so many false positive results? The likely explanation is the presence of mosaicism (i.e., a mixture of cells with different ploidy) in the early embryo, which appears to be quite common.6 We are also learning that the embryo is a very dynamic entity, sometimes able to self-repair as it grows. The chromosomal errors leading to embryo mosaicism can occur in the gametes (meiotic errors in sperm or egg) or after fertilization during the various cleavage steps (mitotic errors).6 Embryos originating from aneuploid gametes become more frequent with advanced maternal age and can self-correct by “trisomy rescue,” whereby aneuploid cells move out of the inner cell mass and into the trophectoderm (see Figure S1) (ref. 6). Mosaicism from mitotic errors is more frequent (about 25% of the cases), is independent of maternal age, and can result in confined placental mosaicism (Figure S2) (ref. 6). In both of these scenarios, the true composition of the inner cell mass cannot be determined by removing and testing a few cells from the trophectoderm; it can only be inferred.

In response to reports of healthy live births from supposedly aneuploid embryos, some US clinics are suspending or radically reducing their use of PGT-A. Supporters of PGT-A have introduced the concept of “degrees of mosaicism” to guide embryo transfer decisions. Basically, if a biopsy shows 80% mosaicism, meaning that four of the five cells analyzed are aneuploid, then transfer of this embryo is not recommended.7 However, if the biopsy shows 40% mosaicism or less, the embryo could be transferred.7 In our view, such designations are quite arbitrary, given the limited number of cells analyzed, and could vary significantly based simply on the site of the biopsy. They also risk creating confusion and uncertainty for professionals and patients.
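The effect of analyzing so few cells can be sketched numerically. The example below is a simplification under stated assumptions, not a model taken from the cited literature: it treats each biopsied cell as an independent draw from a trophectoderm with a fixed true fraction of aneuploid cells (a binomial model), which ignores the spatial clustering that makes the biopsy site matter. The function name `biopsy_distribution` and the 50% true fraction are hypothetical; the 40% and 80% cut-offs are the ones described above.

```python
from math import comb


def biopsy_distribution(n_cells: int, true_fraction: float) -> dict[int, float]:
    """Probability of seeing k aneuploid cells in a biopsy of n_cells.

    Assumes independent (binomial) sampling of cells, which is a
    simplification: real biopsies take neighboring cells from one site.
    """
    return {
        k: comb(n_cells, k) * true_fraction**k * (1 - true_fraction) ** (n_cells - k)
        for k in range(n_cells + 1)
    }


N_CELLS = 5            # biopsy size used in the example in the text
TRUE_FRACTION = 0.5    # hypothetical true share of aneuploid cells

dist = biopsy_distribution(N_CELLS, TRUE_FRACTION)

# Integer comparisons avoid floating-point edge cases at the cut-offs.
p_read_40_or_less = sum(p for k, p in dist.items() if k * 100 <= 40 * N_CELLS)
p_read_80_or_more = sum(p for k, p in dist.items() if k * 100 >= 80 * N_CELLS)

print(f"P(biopsy reads <= 40% mosaic): {p_read_40_or_less:.4f}")
print(f"P(biopsy reads >= 80% mosaic): {p_read_80_or_more:.4f}")
```

Under these assumptions, the same hypothetical embryo would read as 40% mosaic or less in about half of biopsies and as 80% or more in roughly 19% of them. A five-cell biopsy can also only ever report 0%, 20%, 40%, 60%, 80%, or 100%.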

A randomized controlled trial (RCT) has yet to be performed using appropriate populations. A hypothetical RCT constructed from published data showed an approximately 2–5 times higher cumulative live birth rate without PGT-A.8 In March 2018, 10 months after the American Society for Reproductive Medicine (ASRM)’s ethics committee issued an opinion describing PGT-A results as clinically accurate,9 the organization’s practice committee noted “important limitations” in past studies and stated that the “value of PGT-A has yet to be determined”.10 The practice committee did not comment on the premature adoption of the test or on ASRM’s past support of PGT-A.

Worldwide, PGT-A has attracted much attention, with proponents of the technology arguing that institutions and countries that restrict PGT-A are unethical, violating the right of patients to select a viable and healthy embryo. In Japan, where one of the authors (S.T.) practices medicine, PGT-A had until recently been prohibited. Then in 2016, a Japanese physician publicly performed PGT-A, and successfully argued for its adoption, because of—in the words of the Japanese Obstetrics and Gynecology chairman—“its widespread use in the US and Europe.”

At this point, patients and clinicians might understandably be confused about the reliability and interpretation of PGT-A results. There is some evidence that the test can be helpful to certain cohorts of patients. But because it is unable to definitively determine embryo viability, PGT-A is unlikely to benefit the many patients who have limited numbers of embryos to transfer. Additional studies are required to confirm the extent of false positive results and PGT-A needs strict validation studies. In the meantime, clinicians must not underestimate its potential to cause emotional harm or economic loss to patients, and patients must be helped to understand that the test might identify as aneuploid or mosaic embryos that are in fact viable. As we are seeing with the rapid uptake of genetic and genomic tests in prenatal, general medical, and direct-to-consumer contexts, seemingly pathogenic or abnormal test results could just be variations of a potential norm. The fertility industry’s recent experience with the rapid clinical uptake of PGT-A also illustrates how challenging it can be to slow down or cease use of an unvalidated and costly technology once it is in the marketplace.