Introduction

Preimplantation genetic diagnosis (PGD) is a widely established reproductive alternative for couples with a high risk of transmitting an inherited disorder. With respect to monogenic diseases, PGD can theoretically be applied for any genetic disease with a definitive molecular diagnosis and/or defined marker linkage within a family.1, 2 According to 10 years of data collection by the ESHRE PGD Consortium, PGD has been applied to over 190 different monogenic disorders.3 Currently all methods for PGD to exclude monogenic diseases are based on the polymerase chain reaction (PCR), with or without prior whole-genome amplification. Genetic analysis may be performed at various stages post fertilization: on the oocyte/zygote biopsied on the first day post-insemination (polar body analysis), on 1–2 blastomeres from cleavage-stage embryos biopsied on the third day post-insemination (blastomere biopsy) or on 5–10 trophectoderm cells biopsied from blastocysts on the fifth day post-insemination (blastocyst biopsy). To date, most PGD cycles have used blastomere biopsy, as in this study.

A characteristic common to all biopsy stages is the limited quantity of sample available for genetic analysis, usually a single cell only. It is this aspect of PGD that has been the most technically challenging, potentially compounded by the often sub-optimal quality of the embryo and/or embryo cell biopsied. The innate limitations of single-cell PCR include total PCR failure, allelic drop-out (ADO), and sample contamination. Overall, PCR protocols have to fulfil many conditions: they should have a rapid turnaround time and be robust, sensitive and, above all, absolutely accurate, to preclude misdiagnosis.

Misdiagnosis can be classified as adverse or benign.4 Adverse misdiagnosis includes cases in which there is the initiation of an affected pregnancy or the birth of an affected child. Benign misdiagnoses generally include a normal diagnosis in an embryo that is in fact heterozygous for a recessive disorder. In addition, healthy embryos that are misdiagnosed as affected can be treated as a category of adverse misdiagnosis, as this type of misdiagnosis reduces the number of embryos available for transfer, potentially decreasing the success of the IVF-PGD treatment. During 10 years of data collection recording the outcome of over 4700 cycles for monogenic PGD, 12 adverse misdiagnoses were reported.3, 4 However, this is probably an underestimate, as many embryo transfers have no follow-up (no pregnancy or birth). Furthermore, based on a survey of ESHRE PGD Consortium members in 2008 (unpublished results), only a minority of centres perform reanalysis of untransferred supernumerary embryos. The reasons are mainly limited staff and/or funds, no access to untransferred spare embryos (relevant for most PGD centres offering ‘transport’ PGD), and in some countries, legislation which forbids biopsy of the embryo (eg in Germany).5, 6

As audit is invaluable, the primary objective of this multi-centre study was to assess the validity of PCR-based PGD by comparing results at the time of PGD with the results of the embryo follow-up analysis in a large cohort of samples. The secondary objective was to identify factors which may influence the validity of PCR-based PGD, including the embryo biology, the PCR-PGD genotyping strategies, the number of cells used in the PGD analysis, and the category of monogenic disease for which the PGD was applied.

Materials and methods

Study design and study population

The Embryo-Follow-up study was facilitated by the ESHRE PGD Consortium as a multi-centre study, which aimed to retrospectively evaluate the validity of PCR-based PGD. The study was conducted between October 2009 and May 2010. Specifically, an invitation was sent out to all ESHRE PGD Consortium centres which had contributed data to the annual data collections (>50). Of these, 23 initially responded and finally 6 centres submitted data that met the inclusion and data integrity criteria. The inclusion criteria for centres were defined as the submission, by the set deadline, of information on at least 50 reanalysed embryos through the mandatory completion of all fields in the database (Table 1) in the coded format requested. The parameters in the database included general information (centre, disease category), embryo information for the PGD cycle, genotype analysis for PGD and reanalysis, conditions of embryo reanalysis, genotype assay type, and outcome(s) (Table 1). All data were evaluated for integrity, leading to complete information on 940 reanalysed embryos.

Table 1 List of fields included in database

According to regional regulations, each participating centre obtained ethical approval as well as informed consent from all couples donating embryos included in the study. All couples had undergone PGD as they were at risk of transmitting a monogenic disorder to their offspring.

Embryos considered eligible for inclusion in this study were those that had been genotyped during a clinical PGD cycle but were not suitable or required for transfer or cryopreservation based on: (a) genetic unsuitability based on the PGD-derived genotype (affected), (b) poor developmental capacity and morphology (assessed by the embryologist), and (c) a couple's decision that their supernumerary embryos were not required for further reproductive treatment cycles. Thus, reanalysis occurred on day 4–5 post fertilization and all centres ensured randomized selection of embryos included for reanalysis to minimize bias in the study cohort. None of the embryos in this study had been previously reanalysed. To eliminate intra-centre variation, supernumerary whole embryos or cell fractions of embryos were reanalysed using the same PCR-based PGD protocols applied to genotype blastomeres at PGD. Moreover, each PGD centre applied its standard assisted reproduction technology (ART) and PGD procedure according to the recommendations outlined in the ESHRE Best Practice Guidelines.7, 8, 9

Data analysis and statistics

Reanalysis genotypes in whole embryos or cell fractions (>2 cells) of embryos were defined as the ‘true’ genotype and hence ‘true’ embryo status, on the assumption that the analysis of multiple cells is more accurate. Based on PGD analysis and embryo reanalysis genotypes, blastomeres and reanalysed embryos were classified as affected, unaffected and aberrant. Blastomeres and embryos could be potentially scored with aberrant genotypes if the PCR protocol applied for PGD and follow-up involved linkage marker analysis (two or more markers), which demonstrated abnormal ploidy (eg monosomy, trisomy).

To determine the validity of PCR-based PGD, the sensitivity (Se), specificity (Sp) and diagnostic accuracy were calculated following the categories presented in Table 2, all with a 95% confidence interval (95% CI). The Se was defined as the proportion of affected/aberrant embryos diagnosed correctly by PGD (true positive), whereas the Sp was defined as the proportion of unaffected embryos diagnosed correctly by PGD (true negative). The diagnostic accuracy of PCR-based PGD was calculated as the proportion of embryos whose genotype results at reanalysis were in agreement with the results at PGD (ie true positive and true negative results). For the embryos whose results at PGD did not match the results at reanalysis (ie false-positive, FP, or false-negative, FN, results), the possible cause of discordance at PGD was evaluated from the genotype result and recorded. The potential causes were: (i) ADO, (ii) mosaicism, (iii) contamination, and (iv) other.
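The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the study's own procedure: the function names are hypothetical, the counts in the example are invented, and the Wilson score interval is an assumption, since the text does not state which method was used for the 95% CI.

```python
import math

def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed CI method, not from the source)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

def diagnostic_performance(tp, fn, tn, fp):
    """Return {name: (estimate, (lower, upper))} for Se, Sp and diagnostic accuracy."""
    pairs = {
        "sensitivity": (tp, tp + fn),            # affected/aberrant correctly called
        "specificity": (tn, tn + fp),            # unaffected correctly called
        "accuracy": (tp + tn, tp + fn + tn + fp),  # all concordant calls
    }
    return {name: (k / n, wilson_ci(k, n)) for name, (k, n) in pairs.items()}

# Hypothetical counts, for illustration only (not study data).
result = diagnostic_performance(tp=90, fn=10, tn=80, fp=20)
for name, (estimate, (lower, upper)) in result.items():
    print(f"{name}: {estimate:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```

Other interval methods (for example an exact binomial interval) would give slightly different bounds, particularly for proportions close to 100%.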

Table 2 Embryo status PGD versus embryo status reanalysis

To assess the effect of various parameters on the validity of PCR-based PGD, Se, Sp and diagnostic accuracy (each with 95% CI) were also calculated within subgroups. The parameters included the impact of embryo biology, the PCR-PGD protocol strategies followed, the number of cells analysed during the PGD, and the category of disease transmission (autosomal recessive (AR), autosomal dominant (AD), X-linked dominant (XL-D) or recessive (XL-R)) for which the PGD was applied. To this end, the data were divided into the following subgroups: (i) multiplex PCR method, (ii) singleplex PCR method, (iii) 1 cell biopsy, (iv) 2 cell biopsy, (v) good morphology, (vi) poor morphology, (vii) AR, (viii) AD, (ix) XL-R and (x) XL-D.

The impact of embryo biology was assessed by grouping the embryos according to embryo morphology scoring. All centres scored the embryo morphology according to the same criteria10 assigning embryos to good, intermediate and poor groups. For comparisons within the morphology subgroups, only the best and the poorest morphology subgroups were compared in order to minimize overlap. Although the participating centres used the same criteria, embryo scoring is subjective and could thus lead to overlap between the intermediate morphology groups.10

To further assess the diagnostic performance parameters of multiplex versus singleplex PCR-based PGD methods along with one versus two cell biopsy the following subgroups were created and compared: Singleplex PCR on 1 cell or 2 cell versus Multiplex PCR on 1 cell or 2 cell biopsy. Additionally, performance parameters were calculated for each centre to identify potential differences among centres.

Moreover, as genotype aberrations in blastomeres and embryos are attributed to a biological phenomenon and not a technical limitation of the PGD-PCR protocol (such as ADO or contamination), a further analysis was conducted after excluding the aberrant embryos (808 entries remained). The existence of significant differences in Se, Sp and accuracy among subgroups was examined using the Fisher exact test. Because multiple comparisons were made, the Bonferroni correction was applied to account for the increase in Type I error.
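As an illustration of the testing approach described above, the following pure-Python sketch computes a two-sided Fisher exact P value for a 2×2 table and applies a Bonferroni adjustment. The function names and the example counts are hypothetical (subgroup counts are not given in the text), and the two-sided rule used here, summing all tables no more probable than the observed one, is one common convention that the statistical package used in the study may or may not share.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    total = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

def bonferroni(p_values):
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical table: rows are two subgroups, columns are concordant/discordant.
p_raw = fisher_exact_two_sided(8, 2, 1, 5)
# Suppose this was one of three pairwise comparisons (other values invented).
p_adjusted = bonferroni([p_raw, 0.20, 0.60])
print(f"raw P = {p_raw:.4f}, Bonferroni-adjusted P = {p_adjusted[0]:.4f}")
```

Equivalently, the unadjusted P values can be compared against a threshold of 0.05 divided by the number of comparisons.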

Finally, simple and multiple logistic regressions were used to evaluate the association of various characteristics with the diagnostic accuracy of PGD-PCR. Because of the cluster design of the current study (more than one embryo was enrolled in the study by each centre), the centre was considered in these analyses as a cluster variable. The results are presented as odds ratios (OR) with 95% CI, and a P value below 0.05 was considered statistically significant. STATA software was used for all the statistical calculations (version 8; 2003; Stata Corp, College Station, TX, USA).
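For readers less familiar with odds ratios, the sketch below shows how an unadjusted OR and its 95% CI can be obtained from a 2×2 table using the log (Woolf) method. It is deliberately simpler than the analysis described above: it does not fit a logistic regression and does not account for clustering by centre. The function name and the counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted OR and Woolf 95% CI for [[a, b], [c, d]]; every cell must be > 0."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, (lower, upper)

# Hypothetical counts: concordant/discordant results in two subgroups.
or_value, (or_lower, or_upper) = odds_ratio_ci(a=95, b=5, c=90, d=10)
print(f"OR = {or_value:.2f} (95% CI {or_lower:.2f}-{or_upper:.2f})")
```

An interval that includes 1 indicates that the data are compatible with no association; a cluster-adjusted model would typically give a different interval than this unadjusted calculation.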

Results

The study sample included results from 940 reanalysed PGD cases for 53 different genetic disorders (15 AR, 24 AD, 10 XL-R and 4 XL-D).

Data submitted by the participants showed that 56.8% of the PCR-based PGD protocols were performed on one biopsied cell, 72.9% of the embryo genotypes were achieved by application of a multiplex PCR-based protocol and that 25.2% of the embryos were assigned to the ‘good morphology’ category at PGD.

Overall diagnostic performance parameters of PCR-PGD

Among the 940 reanalysed embryos, results for the genetic status at the time of PGD showed that 234 blastomere-based embryo analyses were classified as unaffected, 590 as affected and 116 as aberrant. Embryo reanalysis showed that 283 embryos were unaffected, 578 were affected and 79 were aberrant. Hence, in 881 (out of 940) reanalysed embryos, the status at the time of PGD was concordant with the reanalysis embryo status (diagnostic accuracy of 93.7%). The overall Se and Sp of the PCR-PGD methods were 99.2 and 80.9%, respectively (Table 3). Fifty-nine embryos were misclassified at PGD (5 were classified as FN and 54 as FP). With regard to the cause of discordant results, all five FN entries were attributed to mosaicism. The majority of FP results were attributed to mosaicism (29), followed by ADO (17), contamination (7), and ‘other causes’ (1).
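The reported percentages can be checked from the counts given in this paragraph. The sketch below infers the four cells of the 2×2 table, pooling affected and aberrant embryos as 'positive' in line with the definition of Se in the Methods; the true-positive and true-negative counts are derived here and are not stated explicitly in the text.

```python
# Counts reported in the text.
total = 940
unaffected_at_reanalysis = 283
affected_at_reanalysis = 578
aberrant_at_reanalysis = 79
false_negatives = 5
false_positives = 54

# Inferred cells (affected and aberrant pooled as "positive").
positives = affected_at_reanalysis + aberrant_at_reanalysis      # 657
true_positives = positives - false_negatives                     # 652
true_negatives = unaffected_at_reanalysis - false_positives      # 229

sensitivity = true_positives / positives
specificity = true_negatives / unaffected_at_reanalysis
accuracy = (true_positives + true_negatives) / total

print(f"Se = {sensitivity:.1%}, Sp = {specificity:.1%}, accuracy = {accuracy:.1%}")
# prints "Se = 99.2%, Sp = 80.9%, accuracy = 93.7%"
```

As a further consistency check, the 229 inferred true negatives plus the 5 false negatives give the 234 analyses classified as unaffected at PGD.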

Table 3 Validity of PCR-PGD analysis compared with embryo reanalysis (n=940 embryos, including aberrant)

Subgroup data analysis

Stratified analysis by centre revealed no significant difference among centres in terms of Se (P=0.700, Table 3). However, statistically significant differences were detected in Sp and diagnostic accuracy (P=0.002 and P<0.001, respectively). In particular, Sp was found to range between 69% and 94%, with the exception of one centre where only three unaffected embryos were included, all of which were FP (Sp: 0%). In terms of diagnostic accuracy, a range of 84–98% was detected across all participating centres.

Moreover, stratified analysis of the PCR-PGD protocols applied (multiplex versus singleplex) showed that multiplex protocols perform statistically significantly better than singleplex protocols in terms of Se (99.8 versus 97.9%, P=0.03), whereas no significant difference was detected in Sp and diagnostic accuracy (P=0.352 and P=0.547, respectively) (Table 3). Concerning the number of biopsied cells that underwent PCR-based PGD, the analysis showed that two-cell biopsy exhibits a significant advantage in terms of diagnostic accuracy compared with one-cell biopsy (96.7 versus 91.6%, P=0.001). Seventeen embryos whose PGD was based on three or four cells were excluded from the statistical analysis (only descriptive results are presented) due to the extremely small sample size.

The combined diagnostic efficiency of PCR-based PGD strategies (molecular method and biopsy protocol) was investigated by comparing the following subgroups: Singleplex 1 cell (S1cell), Singleplex 2 cells (S2cell), Multiplex 1 cell (M1cell), and Multiplex 2 cells (M2cell) biopsy. A statistically significant difference was observed between S1cell and M1cell for the Se (P=0.048), whereas there was no significant difference detected for Se in the remaining pairwise comparisons (Table 3). In terms of diagnostic accuracy, multiplex PGD with two cells identified the status of embryos with significantly greater accuracy than singleplex PGD with one cell (97.1 versus 88.9%, P=0.024), whereas diagnostic accuracy was only marginally higher, and not significantly so, for multiplex PGD with two cells compared with multiplex PGD with one cell (97.1 versus 92.1%, P=0.066).

To investigate the influence of embryo morphology on the diagnostic outcomes of PCR-based PGD, data from embryos in the two extreme categories, good (class 1) and poor (class 4) morphology (see ‘Methods’ section), were compared. The Se of PCR-based PGD was statistically significantly higher among embryos with ‘good’ morphology (100%) compared with those with ‘poor’ morphology (94.9%, P=0.032). On the other hand, no significant effect of embryo morphology on Sp and diagnostic accuracy of PCR-based PGD was detected (P=0.057 and P=0.999, respectively) (Table 3).

Finally, no statistically significant difference was demonstrated in Se and Sp among the mode-of-inheritance subgroups, although Sp was numerically lower in the AR subgroup (78.7%) than in the XL-D subgroup (90.0%). However, diagnostic accuracy was statistically significantly higher in the AD group than in the AR group (95.1 versus 89.0%, P=0.002) (Table 3).

Multiple logistic regression, after taking into account the potential clustering effect of the centre, revealed that the probability of agreement between PGD and reanalysis results (either true positive or true negative) is statistically significantly higher when two cells as opposed to one cell are used in PGD (P<0.001), when multiplex instead of singleplex protocols are applied (P=0.001) and when the type of disease for which the PGD is applied is AD and XL-D versus AR (P<0.001) (Table 4).

Table 4 Factors affecting the diagnostic accuracy of PGD; Results from simple and multiple logistic regression by using centre as a cluster factor (N=923a)

Exclusion of all aberrant embryos

The analysis of the data following exclusion of all aberrant genotype results demonstrated that 785 out of the 808 blastomeres genotyped at PGD had concordant embryo status with the corresponding reanalysed embryos, giving Se, Sp, and diagnostic accuracy of 99.6, 91.6, and 97.2%, respectively (Table 5). The FN group contained two samples, which were scored discordant due to mosaicism. In the FP group (21 samples), a variety of phenomena underlay the discordant diagnostic outcomes. ADO (9/21) was the most frequent cause of discrepancy between PGD and reanalysis embryo status and underlying genotypes. Other causes of FP results were mosaicism (7/21) and contamination (4/21), whereas 1 could not be explained by the above.

Table 5 Validity of PCR-PGD analysis compared with embryo reanalysis (n=808 embryos, excluding aberrant)

The results of stratified analyses are also presented in Table 5. These results indicate that, after excluding aberrant embryos, the only comparison that showed a statistically significant difference was the diagnostic accuracy between AR and AD (P=0.006, Table 5).

Multiple logistic regression confirmed the absence of statistically significant differences for all subgroups (see Materials and Methods), with the exception of the disease mode of inheritance for which the PGD was applied, whereby a statistically significant difference in the probability of agreement between PGD and reanalysis results was detected between AR and XL-D (P<0.001, Table 6).

Table 6 Factors affecting the diagnostic accuracy of PGD: results from simple and multiple logistic regression by using centre as a cluster factor (N=792a)

Discussion

The primary objective of the present study was the investigation of the diagnostic accuracy and validity of PCR-based PGD protocols that are routinely applied to diagnose single gene diseases. It is the first multicentre study to collectively quantify diagnostic performance parameters, based on data analysis methodology previously used in a single centre to internally audit the accuracy and validity of PCR-based PGD protocols.11 In general, the results are expected to highlight pitfalls of PCR-based PGD (technical or biological), which can then be addressed to optimize clinical PGD results.

Overall, data analysis demonstrates the validity, robustness, and high diagnostic performance of the PCR-based PGD protocols. The Se, Sp and accuracy of PCR-based PGD protocols applied to diagnose single gene disorders are high (99.2, 80.9, and 93.7%, respectively). The high Se reflects the low risk of adverse misdiagnosis. This observation is very important as adverse misdiagnosis may have severe consequences for the couples, such as the initiation and subsequent termination of an affected pregnancy or the birth of an affected child.4 The Se of PGD results did not differ between centres. However, Sp was significantly different amongst the participating centres owing to the differences in the FP rate. This could be attributed to the general trend to overestimate affected status when interpreting the PCR-based PGD results, in order to preclude transfer of any affected embryos.

The calculated FN rate stems from five FN diagnostic outcomes, which were all attributed to mosaicism. This provides a strong indication that the protocols used to date are efficiently designed to detect the genetic change that causes the disease but that adverse misdiagnosis could stem from testing a cell which does not represent the genetic make-up of the entire embryo. It has been previously shown that almost 50% of the embryos created by ARTs are mosaic, containing both normal (diploid) and abnormal (non-diploid) cells.12, 13, 14, 15 With respect to FP results, mosaicism accounts for 54% (29 of 54) of the FP results in the overall data (FP rate=19.1%). FP diagnosis is characterized as more benign as it does not lead to the initiation of an affected pregnancy, although it may decrease the number of genetically transferable embryos, hence reducing the probability of achieving a pregnancy. Data showed that the second most frequent cause of FP diagnosis was ADO, an inherent pitfall of single-cell PCR. Strategies to overcome the technical limitations of single-cell PCR have been previously described4, 16, 17, 18 and incorporated in the ESHRE best practice guidelines for monogenic PGD.7 To preclude adverse misdiagnosis due to ADO and contamination, the co-amplification of linked polymorphic markers across the locus of interest is highly recommended and the advantages of the approach have been extensively described.19, 20 In addition, the analogous effectiveness of multiple displacement amplification methods in preimplantation genetic haplotyping has been reported.20, 21 The technical superiority of multiplex over singleplex approaches is supported by the results of the present study.

With respect to biopsy strategies, the analysis of two biopsied cells from a single embryo showed higher OR compared with one-cell biopsy. However, the Se and hence the FN rate for both strategies remains the same. In addition, two-cell biopsy strategies appear to be more efficient than one-cell biopsy in terms of lower FP diagnosis. Overall, multiplex protocols on one cell (M1cell) present a significantly better diagnostic approach than singleplex one cell (S1cell). As could be expected, the latter demonstrated the poorest diagnostic performance parameters. S1cell methods had the highest FN and FP rates, thus having the greatest risk of transferring an affected embryo or significantly decreasing the number of embryos available for transfer (50% of FPs caused by ADO).

However, it must be emphasized that multiplex PCR-based methods applied on 1 cell (M1cell) are as robust as those on two cells (M2cell) regarding FN results, which is of fundamental importance for clinical PGD. The higher FP rate when analysing a single cell rather than two cells is likely due to the inability in the former to double-check results and resolve ambiguous genotyping results on a second cell. Although the FP rate is higher in the M1cell subgroup, in practice a balance has to be found between genotype accuracy and good embryo integrity. It has been reported that two-cell biopsy can be detrimental to both the embryo development and clinical outcome,22, 23 and as the ultimate desired outcome of a PGD cycle is the initiation and delivery of an unaffected pregnancy, M1cell protocols fulfil this balance. Two-cell biopsy should probably be considered when a reliable multiplex PCR-based method is hard to develop, such as for PGD cases with de novo mutations when informative markers cannot be found.7

The present study also investigated whether embryo morphology influences PCR-based PGD diagnostic validity. Although the Se between good and poor morphology embryos was found to be statistically different, further statistical analysis (simple and multiple logistic regression) showed that morphology does not affect the diagnostic accuracy of PGD. Morphology subgroup comparisons should be treated with caution owing to the small number of embryos analysed.

Finally, this study quantified the impact of embryos scored with aberrant genotypes on the diagnostic efficiency of PCR-based PGD protocols. In general, Sp and accuracy differed significantly between the data sets with and without aberrant embryos (P<0.001 for both). FP entries attributed to mosaicism and ADO decreased dramatically with the removal of aberrant embryos. More specifically, when aberrant embryos were included the main causes of FP genotypes were mosaicism (29) and ADO (17), whereas after their exclusion FP genotypes were accounted for by 7 cases of mosaicism and 9 of ADO. On the other hand, no significant difference was demonstrated in the Se (P=0.375) between these two data sets, indicating that, despite the presence of aberrant embryos, current PGD protocols are well suited to detect embryos affected by a monogenic disease.
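The shift in the causes of FP results can be made explicit by expressing the counts reported in the Results as shares of all FP results in each data set. The short sketch below uses only those reported counts; the dictionary and function names are illustrative.

```python
# FP causes reported in the Results, with and without aberrant embryos.
fp_causes_all = {"mosaicism": 29, "ADO": 17, "contamination": 7, "other": 1}
fp_causes_non_aberrant = {"mosaicism": 7, "ADO": 9, "contamination": 4, "other": 1}

def shares(causes):
    """Return each cause as a fraction of all FP results in the data set."""
    total = sum(causes.values())
    return {cause: count / total for cause, count in causes.items()}

shares_all = shares(fp_causes_all)
shares_non_aberrant = shares(fp_causes_non_aberrant)

for cause in fp_causes_all:
    print(f"{cause}: {shares_all[cause]:.1%} of 54 FP overall, "
          f"{shares_non_aberrant[cause]:.1%} of 21 FP after exclusion")
```

On these counts, mosaicism falls from roughly 54% to 33% of FP results once aberrant embryos are excluded, while ADO rises from roughly 31% to 43% and becomes the most frequent cause.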

Furthermore, the exclusion of aberrant embryos led to the abolition of significant differences between subgroups, with the exception of the accuracy between AD versus AR mode of inheritance subgroups whereby the FP rate was higher in the AR subgroup. In the case of PGD for an AR disease, in which one pathological allele is definitively identified but there is no result for the trans allele (either due to biological or technical reasons), there is only a 50% chance that the embryo is unaffected, and thus the tendency is to score the embryo as genetically affected. On the other hand, when an AD disease is tested, if the only allele detected is the mutant allele, then the embryo will be correctly scored as affected.

The present study is the first multicentre evaluation of the degree of clinical validity of PCR-based PGD methods. The statistical analysis took into consideration limitations associated with the study, such as the relatively subjective nature of embryo morphology scoring. It is also acknowledged that the response rate, that is, the proportion of invited centres that ultimately contributed data (11%), was slightly lower than expected.

In conclusion, this study demonstrates the validity, robustness, and high diagnostic value and performance of a wide range of PCR-based methods currently used in clinical PGD. In addition, it provides substantial evidence that the embryo biology has a significant impact on the diagnostic accuracy of PCR-based PGD methods and this should always be taken into account when designing the strategy of methodologies and evaluating genotype results for monogenic PGD.