Introduction

Male infertility is considered a complex multifactorial phenotype affecting 10 to 15% of couples,1 and these levels remained constant from 1990 to 2010.2 However, there is evidence of a reduction in semen quality throughout the second half of the 20th century that might reflect a reduction in male fertility.3 The nature of the spermatozoa (their vitality, motility and morphology) and the composition of the seminal fluid are important factors for sperm function. Thus, different semen anomalies can be defined depending on which semen parameter(s) fall below a lower reference limit (www.who.int): asthenozoospermia (low percentage of progressively motile spermatozoa), oligozoospermia (low total number of spermatozoa), teratozoospermia (low percentage of morphologically normal spermatozoa) and the combination of two or more of them (that is, oligoasthenozoospermia and oligoasthenoteratozoospermia). The most critical phenotype is azoospermia, a condition where there are no spermatozoa in the ejaculate.

It has been reported that genetic factors might be behind infertility.4 According to some authors, genetics would contribute to infertility by affecting many physiological processes including hormonal homeostasis, spermatogenesis and sperm quality.5 The central role played by the mitochondrion in cellular energy production has led to the study of mitochondrial DNA (mtDNA) variation in a number of diseases, either rare or complex (multifactorial),6, 7, 8, 9, 10 including infertility.11, 12, 13 The mtDNA is a circular molecule of 16 569 bp that encodes two ribosomal RNAs (12S and 16S), 22 transfer RNAs (tRNAs) and 13 protein subunits involved in the electron transport chain. It is inherited exclusively in a matrilineal manner, from the mother to the offspring, and does not recombine.14

There are several published studies investigating the role of mtDNA variation in male infertility. Many of these focused on the analysis of mtDNA variation in spermatozoa and, in particular, on the effect of large-scale deletions on sperm motility. It could be hypothesized that these deletions in spermatozoa mtDNA would reduce adenosine triphosphate production because of an incomplete respiratory chain, affecting sperm motility and, consequently, male fertility. However, the results obtained were uneven. Whereas some authors pointed to a correlation between the proportion of the ‘common’ 4977 bp and 7.4 kb mtDNA deletions and poorly motile sperm,15 others could not replicate the same findings.16, 17 Despite this apparent contradiction, some authors seem to agree that higher numbers of multiple macrodeletions are present in samples with poor sperm motility.16, 18

Other studies have investigated the potential role of inherited mtDNA variability in male infertility through the analysis of mtDNA haplogroups or individual mtDNA single-nucleotide polymorphism (mtSNP) variation. Thus, a study performed by Holyoake et al.19 showed that carriers of mtDNA mutations G9055A and G11719A could have compromised sperm motility and/or quantity. This study was however critically questioned by others12, 20 because of the identification of many common problems affecting case–control mtDNA studies that are much better known today.6, 21, 22 Thangaraj et al.23 analyzed the mtDNA variants in both semen and blood samples of a single case of oligoasthenoteratozoospermia. They found an atypical number of mutations (9 missense and 27 silent mutations as well as a 2 bp deletion in the mtDNA region from 6241 to 9167) in a semen sample, but not in blood, suggesting that this variation (and particularly the 2 bp deletion) may be responsible for low sperm motility. However, the analysis by Bravi et al.24 showed that this mutation pattern was consistent with an artifact provoked by the unintended amplification of a nuclear mitochondrial DNA sequence (NUMT).25 Selvi Rani et al.26 reported the presence of the novel mitochondrial variant C11994T in 34 oligoasthenospermic males from India compared with none in 150 normozoospermic controls. These authors thus suggested that this variant may cause low sperm motility in Indian patients, and proposed its systematic screening in individuals with moderate oligoasthenospermia. Nevertheless, subsequent studies have shown that this mutation was completely absent in infertile male samples from Portugal.12, 27 Despite the lack of additional mtDNA polymorphism data in the study by Selvi Rani et al.,26 Pereira et al.28 indicated that this mutation might be a diagnostic position of a minor haplogroup to which all the samples in the cohort of Selvi Rani et al.26 belonged, the result of a NUMT amplification or the result of methodological errors.29

There are many other instances in the literature claiming an association between different mtDNA SNPs and low sperm motility, including A3234G,30 A6375G, G9588A, G9387A, A6307G, A8021G and C12187A (all found in Tunisians31, 32, 33, 34, 35). The series of articles by Baklouti-Gargouri et al.31, 32, 33, 34, 35 has recently received attention from Salas et al.,36 who exposed crucial deficiencies. On the other hand, Güney et al.37 did not find statistically significant differences when analyzing nucleotide substitutions in samples from 30 infertile versus 30 fertile men.

Other studies focused on a potential role for mtDNA haplogroups in male infertility. In this regard, Ruiz-Pesini et al.38 reported an association between asthenospermia and haplogroup T. Pereira et al.39 did not find any haplogroup association in 101 oligozoospermic males from southern Portugal using geographically well-matched controls; these authors also suggested population stratification problems in the study of Ruiz-Pesini et al.,38 and pointed to the importance of using adequately geographically matched controls to avoid spurious associations. In a later study, Montiel-Sosa et al.11 pointed to an association between several sublineages of macrohaplogroup U and sperm motility/vitality. Another study carried out in a Han Chinese population highlighted the protective role of haplogroup R versus non-R with regard to asthenozoospermia (2.97-fold decreased chance of asthenozoospermia).13

The study by Pereira et al.27 represents the only attempt so far to analyze full mitochondrial variation in subfertile males. They sequenced the mitogenomes of 43 asthenozoospermic males from Portugal and analyzed the data using a phylogenetic approach. These authors did not find a shared specific polymorphism or haplogroup potentially related to infertility. This approach was successfully used for the study of other disorders.9, 40, 41, 42

In a continuation of the study of Pereira et al.,27 the present article focuses on sperm motility and represents the largest cohort analyzed to date for their mitogenomes (n=52). The data have been meta-analyzed with the data previously genotyped by Pereira et al.27 from a phylogenetic perspective, bringing the total sample size of affected males to 96 mitogenomes.

Materials and methods

Sample collection and extraction

Saliva samples were collected from a total of 53 infertile Galician males (northwest Spain). Table 1 summarizes clinical characteristics of the patients. The sample labeled as O01 had been previously analyzed and reported by Cerezo et al.43 We obtained informed consent from all donors before the research. Rights of participants were safeguarded during the research and their identity was protected. The study complies with all relevant Spanish regulations on biomedical research, namely, the Biomedical Research Act (14/2007-3 of July), the Autonomy of the Patient Act (41/2002), Decree SAS/3470/2009 for observational studies and the Data Protection Act (15/1999).

Table 1 Semen and sperm characteristics of the 53 Spanish infertile samples

The DNA was extracted following standard phenol–chloroform protocols.

Complete mtDNA sequencing

All the samples collected in the present study were sequenced for the entire mitochondrial genome following the protocols used in Gómez-Carballa et al.40 and Álvarez-Iglesias et al.44 In brief, we used the PCR primers for amplification and sequencing reported previously by Torroni et al.45 PCR was performed using 10 μl of the reaction mixture, containing 4 μl of PCR Master Mix (Qiagen, Hilden, Germany), 0.5 μl of each primer (1 μM), 1 μl of sample template and 4 μl of water. This PCR was carried out in a 9700 Thermocycler (Applied Biosystems, Foster City, CA, USA) using the following conditions: one cycle of 95 °C for 15 min and then 35 cycles of 94 °C for 30 s, 58 °C for 90 s and 72 °C for 90 s, followed by a final extension at 72 °C for 10 min. The sequencing procedures were undertaken using 11.5 μl of the reaction mixture; the mixture contained 2.5 μl of sequencing buffer (5×), 0.5 μl of BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems), 1 μl of the primer (to a final concentration of 1 μM), 3 μl of the purified PCR product and water up to 11.5 μl. The sequencing was carried out by capillary electrophoresis on an ABI3730 (Applied Biosystems).

GenBank accession numbers of the mitogenomes analyzed in the present study are from KU867578 to KU867629.

Nomenclature and quality control

We used the revised Cambridge Reference Sequence or rCRS46 as a reference for mtDNA variations.47 Haplogroup nomenclature was based on previous studies;44, 48, 49, 50, 51, 52 the reference phylogeny is being updated by the project Phylotree;53 see mtDNA tree Build 17 (February 2015) (http://www.phylotree.org/). In order to reduce the impact of potential sequencing artifacts we followed the phylogenetic procedures described in previous studies.29, 41, 54 HaploGrep 255 was additionally used to confirm haplogroup assignment and for sequence quality control.

Phylogenetic analysis

Maximum parsimony trees were built using the genetic information from the entire mtDNA molecule (excluding well-known hot spots, namely, T16519C, homopolymeric tracts in the control region, and the dinucleotide variation around positions 523–524). Relative positional mutation rates were taken from Soares et al.56 and Phylotree Build 17.

Statistical analysis

Unless otherwise indicated, the mitogenomes obtained in the present study were meta-analyzed with those obtained by Pereira et al.27 In order to detect differences between the two cohorts, counts of different types of mutational changes were carried out as in Elson et al.;57 Pearson’s χ2 test was applied to 2 × 2 contingency tables. The counts of the different types of mutations (nonsynonymous, synonymous, tRNA, private and/or recurrent) in the mitogenomes of patients reported by Pereira et al.27 did not differ significantly from those in our cohort of patients (Pearson’s χ2 test, P>0.05). Therefore, given that mutational patterns were almost identical for both sets of patients, the figures and tables presented in the main text incorporate the two data sets together.
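As a rough illustration of the 2 × 2 comparison described above, the following Python sketch computes Pearson’s χ2 statistic and its P-value (1 degree of freedom) for a single contingency table. The function name and the optional Yates correction flag are hypothetical choices made here for illustration; the source does not state whether a continuity correction was applied.

```python
import math

def pearson_chi2_2x2(a, b, c, d, yates=False):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Rows could be the two cohorts and columns two mutation classes
    (an assumed layout). Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction, floored at zero
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A call such as `pearson_chi2_2x2(10, 10, 5, 15)` uses toy counts only; a P-value above 0.05 would be read as no detectable difference between the two rows.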

We additionally computed diversity indexes and differentiation tests (analysis of molecular variance (AMOVA) and FST) in the Portuguese and the Spanish cohort, as well as in the Iberian control samples, in order to further investigate differences between these two cohorts. The genetic distance (FST) and AMOVA were calculated from haplogroup frequencies and taking into account the whole mtDNA molecule as well as only the coding region. All these computations were carried out using Arlequin 3.5.58 A multidimensional scaling plot was built on pairwise FST distances using the cmdscale function of the stats library from the statistical software R v3.0.2 (http://www.r-project.org/) and the data from the Spanish and the Portuguese cohorts plus reference populations from the 1000 Genomes Project (www.1000Genomes.org; here onwards 1000G).
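The classical (Torgerson) scaling that underlies a function such as cmdscale can be sketched in a few lines of Python. This is a minimal illustration, not the R implementation: it uses plain power iteration with deflation to obtain the leading eigenvectors, and it assumes that the input distances are close to Euclidean so that the leading eigenvalues of the centered matrix are positive. The function name and parameters are hypothetical.

```python
import math
import random

def classical_mds(dist, dims=2, iters=500, seed=0):
    """Embed n items in `dims` dimensions from a symmetric distance matrix."""
    n = len(dist)
    # Double-center the squared distances: B = -1/2 * J * D^2 * J
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    def matvec(m, v):
        return [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = [rng.random() - 0.5 for _ in range(n)]
        for _ in range(iters):
            w = matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break  # nothing left to extract
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate
        bv = matvec(b, v)
        lam = sum(v[i] * bv[i] for i in range(n))
        if lam > 1e-12:
            scale = math.sqrt(lam)
            for i in range(n):
                coords[i][d] = scale * v[i]
            # Deflate so the next pass finds the next eigenvector
            for i in range(n):
                for j in range(n):
                    b[i][j] -= lam * v[i] * v[j]
        # Non-positive eigenvalues leave this axis at zero
    return coords
```

Feeding a matrix of pairwise FST values to `classical_mds` would return one coordinate pair per population; as with any such plot, the axes are defined only up to rotation and reflection.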

Besides the different phylogenetic-based analyses carried out with the whole set of mitogenomes, we undertook association tests for mtSNPs and haplogroups. As controls for mtSNP analysis, we used the coding variants contained in 107 mitogenomes of Iberian origin reported in 1000G; the data were retrieved as indicated in Gómez-Carballa et al.59 We only took into account common coding variants in controls (frequency >10%) for the comparisons. Association tests carried out on haplogroups were performed using geographically well-matched controls available in the literature from Galicia (n=620)6, 60 and from Central-South Portugal (n=362).61 Association analyses for mtSNPs and haplogroups were performed using Fisher’s exact test. A P-value of 0.05 was used as the nominal threshold of statistical significance, and a Bonferroni correction was applied in order to account for multiple hypotheses. In addition, the Mantel–Haenszel χ2 test was employed to combine samples from the two cohorts analyzed in the present study; this test also accounts for potential differences between cohorts. All statistical computations were carried out using R v3.0.2.
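The per-haplogroup or per-mtSNP testing described above can be sketched as follows. This is a minimal Python illustration, not the R code used in the study: the two-sided P-value sums the probabilities of all tables that are no more likely than the observed one, and the Bonferroni step multiplies each P-value by the number of tests. Function names and the table layout (carriers versus non-carriers in cases and controls) are assumptions made for the example.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout: a/b = carriers/non-carriers among cases,
    c/d = carriers/non-carriers among controls.
    """
    r1, r2, c1 = a + b, c + d, a + c
    denom = comb(r1 + r2, c1)

    def prob(x):
        # Hypergeometric probability of x carriers among cases
        return comb(r1, x) * comb(r2, c1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Small relative tolerance guards against floating-point ties
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)

def bonferroni(p_values):
    """Bonferroni-adjusted P-values: each P multiplied by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

In use, one P-value would be computed per haplogroup or variant, and only adjusted values below 0.05 would be treated as significant.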

Statistical power was calculated using mitPower (http://bioinformatics.cesga.es/mitpower),62 which allows adjusting for multiple tests. Statistical power was estimated considering 2 × 2 tables and the default mitPower parameters.
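To give a feel for how power depends on control frequency, odds ratio, sample size and the number of tests, the sketch below uses a standard normal approximation for comparing two proportions. It is explicitly not the mitPower procedure, whose internals are not described here, and its output should not be expected to reproduce the power values reported in this study; all function and parameter names are hypothetical.

```python
from statistics import NormalDist

def case_frequency(control_freq, odds_ratio):
    """Carrier frequency in cases implied by a control frequency and an odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)

def approx_power(control_freq, odds_ratio, n_cases, n_controls,
                 alpha=0.05, n_tests=1):
    """Approximate power of a two-sided two-proportion z-test.

    alpha is divided by n_tests (Bonferroni) before the critical
    value is computed. Only the tail in the direction of the true
    effect is counted.
    """
    nd = NormalDist()
    p0 = control_freq
    p1 = case_frequency(p0, odds_ratio)
    z_crit = nd.inv_cdf(1 - (alpha / n_tests) / 2)
    pooled = (n_cases * p1 + n_controls * p0) / (n_cases + n_controls)
    se_null = (pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls)) ** 0.5
    se_alt = (p1 * (1 - p1) / n_cases + p0 * (1 - p0) / n_controls) ** 0.5
    return nd.cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)
```

For example, `approx_power(0.10, 2.4, n_cases=96, n_controls=982, n_tests=10)` would explore a variant at 10% frequency in controls; the number of tests in that call is an arbitrary placeholder.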

Pathogenicity prediction

We evaluated the possible theoretical impact on pathogenicity of the nonsynonymous private variants (‘novel’ or reported as pathogenic in Mitomap) using the MitImpact tool.63 This is a web-based interface that provides precomputed pathogenicity scores for all the possible mitochondrial amino-acid changes. It covers 24 115 amino-acid variations in all 13 coding genes of the human mtDNA.

Results

Clinical characteristics

Several patient semen parameters were retrieved and are summarized in Table 1. The phenotypic features of our cohort together with the samples analyzed by Pereira et al.39 include: 88 asthenozoospermic (91.6%), 6 azoospermic (6.2%) and 2 oligozoospermic (2%) males. The 88 asthenozoospermic patients were classified as follows: 39 oligoasthenozoospermic (44.3%), 23 teratoasthenozoospermic (26.1%) and 26 (29.5%) pure asthenozoospermic.

Phylogeny of infertile male mitogenomes

Diversity values computed on the two cohorts (Galicia and Portugal) of male patients and controls were comparable. This high genetic homogeneity between samples was also confirmed by AMOVA and FST values that could not detect significant differences between them (Supplementary Table S1). AMOVA analysis revealed that almost all of the variation occurs within populations when computed by geographical regions (99.98% for the whole mtDNA; 100% for the coding region) or when computed by asthenospermic subgroups (99.95% for the whole mtDNA; 99.96% for the coding region). In addition, a multidimensional scaling plot (Figure 1) based on nucleotide pairwise distances indicates that the two cohorts of patients are very closely related.

Figure 1

Multidimensional scaling (MDS) plot based on pairwise FST distances and considering data from the Portuguese and the Spanish cohorts and several data sets from the 1000 Genomes Project (1000G). Population codes are as follows: (SPA): Spanish cohort (present study); (POR): Portuguese cohort (from Pereira et al.39); (IBS): Iberian population in Spain; (CEU): Utah Residents (CEPH) with Northern and Western Ancestry; (STU): Sri Lankan Tamil from the UK; and (JPT): Japanese in Tokyo (Japan), all from 1000G. A full color version of this figure is available at the Journal of Human Genetics journal online.

The 96 entire mtDNA genomes were then analyzed phylogenetically (Supplementary Figures S1–S4). Almost all the patients were of Iberian origin (most of them from northwest Spain (Galicia) and Portugal; Supplementary Table S2), with the exception of two patients of American origin (one from Argentina; no O50; and another from Mexico; no O28) and two from Sub-Saharan Africa (Senegal; nos O06 and O28). One of the donors was of unknown origin (no O24).

Most of the haplogroups observed in the Spanish cohort are of European ancestry, including haplogroups H, J, T, U and so on. Two different mitogenomes (no O02 from Spain and no. AT09 from Portugal) share a transition at position 5554 within haplogroup H, defining a new branch of the phylogeny, here named as H107 (Supplementary Figure S1). Two other haplotypes could be classified as U2e1c (Supplementary Figure S3); they share two transversions (C14794A and A15817C) that allow characterizing a new minor sub-haplogroup, named here as U2e1c2. The haplogroup composition of the Galician cohort fits well with the typical pool of western European populations.64, 65 There are four samples belonging to sub-Saharan L haplogroups66, 67 and two haplotypes ascribed to M1, a haplogroup that is mainly found in North Africa, the Mediterranean coast and the Middle East.68 One of the samples is of Native American origin (Argentina; no O50) and belongs to the C1b haplogroup. This haplogroup exclusively appears in Meso-South American territories.69 There is another sample (no. AT10) belonging to I1a1 that is mainly located in North-Eastern European territories.70

The transition/transversion ratio for the coding region was 17.9:1; this ratio was slightly lower when analyzing the whole molecule (14.8:1). This pattern fits well with ratios calculated on controls when considering polymorphisms with frequency >0.01% (22.1:1 calculated from >5100 mitogenomes,71 and 20.5:1 calculated from >30 000 mitogenomes, as reported in Mitomap; http://www.mitomap.org). We found a transversion substitution bias toward a higher frequency of A and very low frequencies of G, C and T (12:1:5:4 ratio respectively for the coding region; 17:1:12:5 for the whole molecule).
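For readers less familiar with the classification behind this ratio, the following Python sketch labels a substitution as a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion (purine to pyrimidine or the reverse) and counts both classes. The function names are hypothetical, and variants are assumed to be written in the reference-position-alternate form used in this article (for example, A12308G).

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(variant):
    """Return 'transition' or 'transversion' for a variant such as 'A12308G'."""
    ref, alt = variant[0].upper(), variant[-1].upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a simple substitution: {variant!r}")
    # A transition keeps the base within the same chemical class
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

def ts_tv_counts(variants):
    """Count transitions and transversions in a list of substitutions."""
    labels = [classify_substitution(v) for v in variants]
    return labels.count("transition"), labels.count("transversion")
```

Dividing the first count by the second gives the transition/transversion ratio for whichever region the variant list covers.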

It is well known that there is evolutionary pressure against mutations in second and first codon positions, whereas weaker selective forces act against mutations in third codon position. This fits well with the mutation pattern observed in our data set that shows a 2.2:1:5.8 ratio for first, second and third codon positions respectively (corresponding to 24.6, 11.1 and 64.3% of the variations) and is in full agreement with the results obtained by Pereira et al.71

Mutational changes in protein mtDNA genes

We found a total of 417 different mutation points in the mtDNA coding region of our data set; 350 of these are evenly distributed along the 13 mitochondrial genes, showing a very good correlation between gene size and observed variation (Figure 2a, r2=0.9). Of the 350 mutations, 113 (32%) were nonsynonymous and 238 (68%) were synonymous changes (Figure 2b). These values result in a nonsynonymous/synonymous ratio of 1:2.1. There is also a strong correlation between observed synonymous changes and maximum possible changes per gene (Figure 2c, r2=0.9). This linear pattern suggests that synonymous changes are produced at about equal rates throughout the mitochondrial genes. However, the nonsynonymous changes do not follow the same trend (Figure 2d, r2=0.4). The same weak correlation was nevertheless observed previously in a data set of >5100 mitogenomes from healthy individuals from across the world (r2=0.4 in Pereira et al.71).
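The r2 values quoted above are squared Pearson correlation coefficients between a per-gene predictor (gene size, or the maximum number of possible changes) and the observed number of changes. A minimal Python sketch of that calculation is given below; the function name is hypothetical and no per-gene counts from the study are reproduced.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return sxy ** 2 / (sxx * syy)
```

With one entry per gene in each list, a value near 1 indicates that variation scales linearly with the predictor, whereas a value near 0.4 indicates a much looser relationship.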

Figure 2

(a) Accumulation of mitochondrial DNA (mtDNA) changes in the protein genes of infertile samples versus size of the different mtDNA genes. (b) Distribution of synonymous and nonsynonymous changes in the mtDNA protein genes of all infertile samples. (c) Accumulation of synonymous mtDNA changes in the protein genes of infertile samples versus the maximum number of possible changes per gene. (d) Accumulation of nonsynonymous mtDNA changes in the protein genes of infertile samples versus the maximum number of possible changes per gene. (e) Correlation between observed and maximum possible variations for amino acids originated from nonsynonymous mutations. Changes in amino-acid group comprising V, I, A, M and T are represented in orange. A full color version of this figure is available at the Journal of Human Genetics journal online.

If we compare the ratio of synonymous to nonsynonymous changes in the haplogroup-defining and non-haplogroup-defining branches of the whole cohort, no statistically significant differences are found (two-tailed Fisher’s exact test P-value=0.1011), despite the excess of nonsynonymous changes observed at the tips of the phylogeny (see private variants section below).

We found that the majority of nonsynonymous changes occur between V, I, A, T and M amino acids (Figure 2e; see also Supplementary Figure S5A) as previously shown by Pereira et al.71 It seems that these types of replacements are better tolerated than others, appearing more frequently than expected, and thus they are less likely to be pathogenic alterations in a patient.71 Most of the amino-acid replacements are to neutral apolar and to neutral polar amino acids (94%), and only a few are to basic and acid amino acids (4% and 2%, respectively); Supplementary Figure S5B. Again, this bias was previously highlighted by Pereira et al.71 and therefore represents a common feature in mtDNA data sets of healthy individuals.

Mutational changes in tRNA mtDNA genes

Mutations in tRNA genes are known to play an important role in mtDNA disease, potentially resulting in transcriptional and translational defects and consequently in mitochondrial respiratory chain dysfunction.

The tRNA gene most affected by mutational changes in our data is the one for threonine (26%; Table 2). This feature was previously reported by Kivisild et al.72 in healthy individuals. The higher tendency for changes in the threonine tRNA gene (data not shown) is also in good agreement with observations obtained from the analysis of >30 000 mitogenomes (as compiled from Mitomap). Several mitochondrial tRNA mutations have been related to many different diseases; however, none of them has ‘confirmed’ pathogenic status in, for example, Mitomap (Table 2). Moreover, none of the tRNA mutations in our data have been cataloged as ‘definitely pathogenic’ when using the weighted scoring system developed by McFarland et al.73 (further reviewed and adjusted by Yarham et al.74). Following this classification system, all of the tRNA mutations except two were classified as neutral. The two exceptions were T4336C and A4435G, which nevertheless received low pathogenicity scores, namely 8/20 and 7/20, respectively.

Table 2 Variation observed in transfer RNA (tRNA) genes in infertile men

There are three positions that have more occurrences than others (A12308G, T10463C and G15928A) (Table 2). The first is a diagnostic site for the frequent Eurasian haplogroup U, whereas the other two are diagnostic positions for haplogroup T.

Homoplasmic mutations in mtDNA genomes

A common feature of mtDNA variation is that the same variant can appear in different branches of the mtDNA phylogeny through recurrent mutation. Multiple occurrences of a variant in the mtDNA phylogeny strongly depend on point mutation rate and the effect of the purifying selection acting against deleterious changes.75 It has been reported that the mutations that have persisted multiple times in the mtDNA phylogeny have significantly lower predicted pathogenicity scores.75

Most of the homoplasmic variants observed in the mitogenomes of our patients (Table 3) are well-known hot spots in the mtDNA phylogeny. Three recurrent events appear only at the tips of the phylogenies (Supplementary Figures S1–S4): A7055G (samples O01, TA09), A12172G (samples O08, TA18) and G12651A (samples TA15, AT08). Not only do these variants appear at high frequency in GenBank (Table 3) but also all of them are diagnostic positions of several haplogroups along the mtDNA phylogeny (www.phylotree.org). The high level of nonsynonymous and tRNA changes in recurrent events observed was previously reported by Kivisild et al.72 in healthy individuals.

Table 3 Homoplasmic events observed in the mtDNA coding region of the infertile samples

Private mutations in mtDNA genomes

A total of 39% of the variants observed in our cohort fell at the tips of the phylogeny. There is an accumulation of nonsynonymous changes among private mutations (nonsynonymous/synonymous ratio of 1:1.69) when compared with haplogroup-defining mutations (1:2.49). This fits well with the fact that selective pressure has had more time to act against deleterious mutations in the older, haplogroup-defining branches than in the recent private ones. Moreover, it has been reported that the power of purifying selection is strong enough to overcome important demographic changes (for example, because of genetic drift) occurring in main population groups during the past.56, 76

Most of the private variants observed in infertile males had already been reported in the literature for healthy individuals. Some private variants were even reported as possibly pathogenic in Mitomap (Table 4), but these were never confirmed, with the exception of transition A1555G in ribosomal RNA 12S (sample O36) that has been identified as a causal mutation in deafness.

Table 4 Private variants observed in the coding region of infertile samples that are ‘novel’ or recorded in Mitomap as disease associated

Nine private mutations could be classified as ‘novel’ (Table 4). This qualifier is applied here to those variants that could not be found in the main mtDNA databases and did not show up on Google searches.77, 78 This proportion of ‘novel’ variants is however expected given that only a minor proportion of the existing variation in human populations is covered in databases.

No common private mutations appeared in more than one haplotype, with two exceptions only (Table 4): transitions A1815G (samples O04 and O32; haplogroup V; Supplementary Figure S1) and C13586T (samples O06 and O28; haplogroup L3b1a9a; Supplementary Figure S4). However, these sample pairs share the same haplotypes, suggesting a common and recent maternal ancestry.

There are multiple in silico tools for predicting the pathogenicity of nonsynonymous changes through the use of different approaches. A total of 12 nonsynonymous private variants (Table 4) were evaluated; 6 of them were classified as ‘neutral’ or ‘benign’ using several pathogenicity predictors (Supplementary Table S3). The rest of the variants, namely T3335A, C13586T and A11084G, received uneven predictions depending on the pathogenicity scores used. The inconsistencies resulting from the use of different pathogenicity prediction software have been previously reported by Castellana et al.63

Heteroplasmic variation in mtDNA genomes

We found four heteroplasmic positions in our cohort of patients (Table 5), distributed in different haplotypes and genes. The haplotypes carrying these heteroplasmies belong to different haplogroup backgrounds and their carriers have different phenotypic features related to the semen parameters. Furthermore, these four heteroplasmies are common polymorphisms in populations,56 and hence they appear in many branches of the global mtDNA phylogeny. Three of these heteroplasmies represent synonymous changes. In addition, the missense G9163A variant (valine to isoleucine amino-acid change) is not reported in Phylotree, but it appears in many healthy individuals belonging to different mtDNA haplogroups (for example, GenBank accession numbers KF540564, EF556158 and KP300791; haplogroups M7b1a2a1, U3b1a and A2, respectively).

Table 5 Characteristics of heteroplasmic variants found in infertile samples

Association study

Association tests were carried out individually for mtDNA haplogroups (Table 6) as well as mtSNPs (Table 7). The statistical power (considering the merged cohorts from Portugal and Spain) to detect an odds ratio of >2.4 for haplogroups/variants of frequency >10%, or >2 for haplogroups/variants of frequency of >20%, was at least 80%.

Table 6 Association test of haplogroups between cases and healthy controls
Table 7 Association test of mtSNPs between cases and healthy controls

After Bonferroni correction for multiple tests (considering as many hypotheses as haplogroups or mtSNP tested), none of the mtDNA haplogroups and mtSNPs observed in our cohorts of patients showed statistical association when analyzed independently. In addition, an association test computed on haplogroups for the two combined cohorts was not statistically significant based on the Mantel–Haenszel χ2 test.
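The combined test mentioned above pools the evidence from several 2 × 2 tables (here, one per cohort) while keeping each cohort compared only against its own controls. The Python sketch below computes a Mantel–Haenszel χ2 statistic with 1 degree of freedom; it is an illustrative implementation with a hypothetical function name, and its continuity-correction handling is a simplifying assumption rather than a copy of any particular software.

```python
import math

def mantel_haenszel_chi2(tables, correct=True):
    """Mantel-Haenszel chi-square across strata of 2x2 tables.

    tables: iterable of (a, b, c, d), one tuple per stratum, where
    a/b are carriers/non-carriers in cases and c/d in controls
    (an assumed layout). Returns (chi2, p_value).
    """
    diff_sum = 0.0
    var_sum = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            raise ValueError("each stratum needs at least two observations")
        r1, r2 = a + b, c + d
        c1, c2 = a + c, b + d
        # Observed minus expected count in the top-left cell
        diff_sum += a - r1 * c1 / n
        # Hypergeometric variance of that cell
        var_sum += r1 * r2 * c1 * c2 / (n * n * (n - 1))
    if var_sum == 0:
        raise ValueError("no stratum is informative")
    delta = abs(diff_sum)
    if correct:
        delta = max(0.0, delta - 0.5)  # continuity correction
    chi2 = delta ** 2 / var_sum
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

Each stratum would hold the counts for one cohort and its geographically matched controls, so that frequency differences between regions do not masquerade as an association.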

Discussion

The present study used both a phylogenetic and a population-based approach to explore patterns of mtDNA variation in infertility. The phylogenetic methodology has been successfully employed in previous studies dealing with sperm motility27 and other pathogenic conditions.40 Our data do not show common variations in the mtDNA molecule of patients that could explain the infertile phenotype. Patients have different mtDNA backgrounds, ruling out the implication of any basal mutation in the phenotypic trait. In addition, we did not observe any correlation between infertility subtypes and mitochondrial lineages. Moreover, the novel variations observed in our patients do not seem to show any peculiarity with regard to infertility. Novel variations are commonly encountered when analyzing new mitogenomes in populations. For instance, when analyzing the variability contained in >30 000 mitogenomes (www.genbank.com), 59% of the transversions and 19% of the transitions are private variants (only one mutational hit).

At the same time, there is a common trend in the clinical setting to consider nonsynonymous changes as pathogenic by default. Changes in protein mtDNA genes are however not necessarily pathogenic; in fact, most of them are neutral or do not have obvious phenotypic manifestation. Mutational changes observed in our patients show a pattern comparable to expectations in healthy individuals.

We also carried out a statistical association test for common variants in mitogenomes from our patients and matched controls. The tests failed to detect positive associations in mtSNPs and haplogroups with regard to the infertile phenotype.

Overall, our results do not support an association of mtDNA variants with male infertility and are therefore in agreement with previous findings reported by Pereira et al.27

There is contradictory evidence in the literature pointing in opposite directions with regard to the role of mtDNA in sperm motility. There are several reasons that could explain these contradictory findings. First, studies arguing for positive evidence based on case–control association studies do not fully meet the necessary statistical standards. Some case–control association studies showing positive results have previously been questioned by others; for example, see Pereira et al.28 with regard to Ruiz-Pesini et al.38 or Bandelt12 with regard to Holyoake et al.19 A few other studies showing positive findings had not been examined in depth previously, but these studies are questionable too. Two relevant examples are Montiel-Sosa et al.11 and Feng et al.13

Montiel-Sosa et al.11 reported differences of sperm motility associated to haplogroup U; this study however might be affected by problems of different nature.21 In short, there was no control for population stratification, and the results were based only on a discovered cohort but never replicated in an independent cohort. However, there are other more intriguing problems in this study.11 Figure 3 of Montiel-Sosa et al.11 shows percentage of sperm motility and vitality depending on phylogenetic (haplogroup) background, and these percentages were arranged from low to high values; this decision was arbitrary and it has no phylogenetic foundation. In fact, any combination of haplogroups and sperm motility and vitality values could be arranged in the same way, from low to high. Therefore, the progression observed in Figure 3 of Montiel-Sosa et al.11 has no apparent biological foundation. Their methods section reported the use of different statistical tests, such as Fisher’s exact text, χ2 (that was never used), t-tests and analysis of variance, but the rationale for these procedures was not exposed. This renders the whole statistical analysis quite confusing. For instance, the authors mention that ‘comparison of the sperm motility and vitality between the different U sublineages revealed statistically significant differences in both motility…’ but only a single P-value from Fisher’s exact test was reported as ‘(F=3.37, P=0.013)’, when we should expect as many as different sublineages of U. Furthermore, multiple test correction should be applied when considering several sublineages (hypothesis), and this could turn the P-values reported into nonsignificant results; the same reasoning could be applied to other tests carried out in this study that would need adjustments. 
The authors did not estimate the statistical power of their study, but the absolute numbers of the different sub-haplogroups were very limited (considering a hypothetical scenario of a weak genetic effect associated with mtDNA variation); this would render any statistically significant value a possible false positive. Moreover, the data by Montiel-Sosa et al.11 might be affected by some methodological errors related to NUMTs as reported by Yao et al.,25 although it is not possible to determine the impact of these errors on the data. Last but not least, their hypothesis linking sublineages of haplogroup U to adaptation to the lower temperatures of the northern latitudes and to sperm motility could be challenged if other, nonselective demographic scenarios were considered79 based on the recolonization of Europe by Franco-Cantabrian refugees (for example, see previous studies80, 81).
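The effect of multiple test correction on the single reported P-value can be illustrated with a minimal sketch. It uses the Bonferroni procedure, which is only one of several possible corrections and is chosen here for simplicity; the number of tests is a hypothetical parameter, because the number of sublineage comparisons actually performed by Montiel-Sosa et al.11 is not stated. The function name `bonferroni_adjust` is introduced for illustration only.

```python
def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-adjusted P-value: multiply by the number of tests, cap at 1."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


REPORTED_P = 0.013  # the single P-value quoted from Montiel-Sosa et al.
ALPHA = 0.05        # conventional significance threshold (assumption)

if __name__ == "__main__":
    # n_tests is hypothetical: the real number of comparisons is not reported.
    for n_tests in range(1, 7):
        adjusted = bonferroni_adjust(REPORTED_P, n_tests)
        verdict = "significant" if adjusted < ALPHA else "not significant"
        print(f"{n_tests} test(s): adjusted P = {adjusted:.3f} ({verdict})")
```

Under this particular correction, the quoted value would no longer be significant at the 0.05 level once four or more tests are taken into account (0.013 × 4 = 0.052).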

The article by Feng et al.13 is also problematic for similar reasons. For instance, multiple-test adjustments were not applied, despite the fact that many different haplogroups and clinical variables were considered. Most striking are the main results in their Table 3. The authors reported the odd situation where statistical association tests were carried out between different haplogroups of a subset of patients (asthenozoospermic phenotype (AP)) against the total number of individuals and phenotypes (therefore including their AP men). By excluding their AP patients from the ‘Total’ category, the frequency of haplogroup R in their ‘control group’ (non-AP men) would have increased to 51%, a frequency that is notably higher than in any healthy Han Chinese population.82 This therefore denotes some methodological problem in their data. A haplogroup R frequency of 51% in non-asthenozoospermic men would yield an implausibly significant value when compared with the frequency of R in asthenozoospermic patients (Supplementary Table S4). In contrast, Yang et al.83 reported a frequency of haplogroup R in Han controls of 37%. If we recompute Fisher’s exact test for haplogroup R asthenozoospermic carriers from Feng et al.13 against the control group in Yang et al.,83 the nominal P-value would still be significant (Supplementary Table S4), but it would be nonsignificant when compared with the frequency of haplogroup R reported by Yao et al.82 Therefore, apart from the messy scenario presented by Feng et al.,13 their findings are inconsistent given the important differences existing between Han Chinese haplogroup frequencies from Feng et al.13 and Yao et al.82 (Supplementary Table S4).
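The consequence of testing a patient subgroup against a ‘Total’ category that contains the same patients can be sketched as follows. All counts below are hypothetical and are not taken from Feng et al.13 or from Supplementary Table S4; the function `fisher_exact_two_sided` is a plain standard-library implementation of the two-sided Fisher’s exact test, written for illustration rather than as the procedure used in any of the studies discussed.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical counts (carriers, non-carriers) of a haplogroup.
cases = (30, 70)        # patient subgroup: 30% carriers
non_cases = (80, 120)   # everyone else: 40% carriers
total = (cases[0] + non_cases[0], cases[1] + non_cases[1])  # includes the cases

p_vs_total = fisher_exact_two_sided(*cases, *total)
p_vs_non_cases = fisher_exact_two_sided(*cases, *non_cases)

if __name__ == "__main__":
    print(f"carrier frequency in 'Total':   {total[0] / sum(total):.3f}")
    print(f"carrier frequency in non-cases: {non_cases[0] / sum(non_cases):.3f}")
    print(f"P (cases vs Total, overlapping groups): {p_vs_total:.4f}")
    print(f"P (cases vs non-cases, disjoint groups): {p_vs_non_cases:.4f}")
```

In this toy example the carrier frequency of the comparison group shifts once the patients are removed from it, and the two designs give different P-values; only the second comparison involves independent groups, which the test assumes.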

Problems related to association studies are quite common in the literature on mtDNA studies of different disease conditions.6, 22, 60 An urgent call for higher statistical standards has recently been issued for case–control mtDNA association studies.21

In addition to case–control studies in infertile men, there are some studies reporting an alleged role of particular variants in sperm motility, arguing that the candidate variants are novel and/or nonsynonymous. Unfortunately, these studies did not consider the likelihood that these variants could simply be private variants of the kind commonly found in population studies of healthy individuals too.22 If we bear in mind that a proportion of mtDNA variation still remains unknown, we should be very cautious when assigning a pathogenic role to any new mtDNA variant. In this context, the studies by Baklouti-Gargouri et al.31, 32, 33, 34 on Tunisian patients are particularly problematic.36

The main limitation of our study is the sample size, although this constitutes the largest cohort of mitogenomes analyzed to date on infertility. To overcome this limitation we meta-analyzed our data with the data published by Pereira et al.27 By merging the two cohorts, we improved the phylogenetic analysis and gained statistical power to carry out association tests, although only considering the most frequent variation and assuming a relatively high risk of the variants (see Materials and methods). Despite this limitation, the power of the present study is comparable to that of the other case–control association studies reviewed, with the advantage that the present study targeted the complete mtDNA molecule instead of preselected variants or haplogroups. More powerful studies are needed in order to assess positive associations between mtDNA variation and infertility. Ideally, these studies should also include replication and confirmatory cohorts in order to avoid spurious positive findings.
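The dependence of statistical power on sample size, variant frequency and assumed risk can be illustrated with a rough sketch. It uses a normal approximation for a two-sided comparison of two proportions, which is not necessarily the procedure described in Materials and methods; the carrier frequency, odds ratio and cohort sizes below are hypothetical values, and the function names are introduced for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def case_frequency(control_freq, odds_ratio):
    """Carrier frequency in cases implied by the control frequency and an odds ratio.

    Assumes 0 < control_freq < 1.
    """
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def approx_power(control_freq, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test."""
    p0 = control_freq
    p1 = case_frequency(p0, odds_ratio)
    pooled = (p1 * n_cases + p0 * n_controls) / (n_cases + n_controls)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    se_alt = sqrt(p1 * (1 - p1) / n_cases + p0 * (1 - p0) / n_controls)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)


if __name__ == "__main__":
    # Hypothetical scenario: variant carried by 10% of controls, odds ratio of 2.
    for n in (100, 250, 500, 1000):
        power = approx_power(control_freq=0.10, odds_ratio=2.0,
                             n_cases=n, n_controls=n)
        print(f"{n} cases / {n} controls: approximate power = {power:.2f}")
```

Under these assumptions, power falls quickly for rarer variants or weaker effects, which is why only the most frequent variation can be tested with small cohorts.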

The present study reports a lack of evidence for association between mtDNA variation and infertility phenotypes, based mainly on a phylogenetic approach but also on a complementary case–control association study. We have also contributed a critical review of previous findings in the literature. Overall, and taking into account the limitations of the present study, we did not find convincing evidence for the involvement of mtDNA variation in fertility problems in our cohorts of patients or in the literature reviewed. There is, however, overwhelming evidence showing that mtDNA studies on diseases are still problematic, and that special care should be taken in order to avoid unfounded claims of association between mtDNA variants and different disease conditions.