Letter

European Journal of Human Genetics (2015) 23, 567–568; doi:10.1038/ejhg.2014.227; published online 15 October 2014

Reply to Mendez et al: the ‘extremely ancient’ chromosome that still isn’t

Eran Elhaik1, Tatiana V Tatarinova2, Anatole A Klyosov3 and Dan Graur4

  1. Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
  2. Department of Pediatrics, Children's Hospital Los Angeles and Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
  3. The Academy of DNA Genealogy, Newton, MA, USA
  4. Department of Biology & Biochemistry, University of Houston, Houston, TX, USA

Correspondence: Eran Elhaik, E-mail: e.elhaik@sheffield.ac.uk

Earlier this year, we discovered that an extreme age estimate for a Y chromosomal haplotype (237000–581000 years ago) by Mendez et al1 was based on analytical choices that consistently inflated its value.2

As stated in our original criticism,2 estimating divergence time is not different, in principle, from estimating the time it takes two cars traveling in opposite directions at known speeds to reach a certain distance from each other. The time inferences will be overestimated if the distance between the two cars is overestimated, or if the speed of either car is underestimated. Similarly, a divergence time estimate will seem larger than the actual divergence time if the genetic distances between sequences are overestimated and/or the rates of substitution are underestimated.

Let us consider a very simple estimation model for the time of divergence,

$$t = \frac{d}{r}$$

where t is the divergence time, d is the genetic distance, and r is the substitution rate per unit time. To overestimate t, one needs to overestimate d and/or underestimate r. d is usually estimated by dividing the number of differences between two sequences, n, by the length of the aligned sequences, l, and correcting for multiple hits and the like

$$d = \frac{n}{l}$$

d can, thus, be overestimated by either overestimating n or underestimating l. The unit time for r is years. However, r is often derived from data on the number of substitutions per generation, in which case the per-year rate is the per-generation rate divided by the generation time, tg. r can, thus, be underestimated by assuming that tg is larger than it really is.
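The relationships above can be sketched numerically. The following Python snippet is a minimal illustration under stated assumptions, not code from either group: it assumes the simple model t = d/r with d = n/l and no multiple-hit correction, and the function names and all input values (40 differences, 200000 aligned bases, a per-generation rate of 3 × 10^-8, and generation times of 30 and 40 years) are hypothetical, chosen only to give round numbers.

```python
def rate_per_year(rate_per_generation: float, generation_time_years: float) -> float:
    """Convert a per-generation substitution rate into a per-year rate."""
    return rate_per_generation / generation_time_years


def divergence_time(n: int, l: int, r_per_year: float) -> float:
    """Simple estimate t = d / r, with d = n / l (no multiple-hit correction)."""
    d = n / l
    return d / r_per_year


# Hypothetical inputs, for illustration only (not taken from any study).
N_DIFFERENCES = 40
ALIGNED_LENGTH = 200_000
RATE_PER_GENERATION = 3e-8

# Same data and per-generation rate, two assumed generation times.
t_short = divergence_time(N_DIFFERENCES, ALIGNED_LENGTH,
                          rate_per_year(RATE_PER_GENERATION, 30.0))
t_long = divergence_time(N_DIFFERENCES, ALIGNED_LENGTH,
                         rate_per_year(RATE_PER_GENERATION, 40.0))

print(f"t with 30-year generations: {t_short:,.0f} years")
print(f"t with 40-year generations: {t_long:,.0f} years")
print(f"inflation factor: {t_long / t_short:.3f}")
```

Under these assumptions, the longer generation time lowers the per-year rate and inflates t by the ratio of the two generation times (40/30, about 1.33), with no change in the underlying data.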

In selecting values for d, r, n, l, and tg, Mendez et al1 consistently and without exception chose values that led to overestimating the time of divergence.

In Elhaik et al,2 we discussed many such choices. In the following we will focus on two choices left unexplained by Mendez et al.3 The first choice concerns the substitution rate used in the calculation of the TMRCA. Using an estimate based on the Y-chromosome substitution rate ($1 \times 10^{-9}$ substitutions per nucleotide per year),4 we can calculate divergence times of $43/240000/10^{-9} \approx 179000$ years and $45/180000/10^{-9} = 250000$ years, for an average of 214500 years, very similar to the TMRCA obtained using a likelihood-based method: 209500 (95% CI: 168000–257400) years.2 Not surprisingly, by employing an autosomally derived value of $0.617 \times 10^{-9}$ as the mutation rate constant, which is 1.6 times smaller, Mendez et al1 obtained divergence times 1.6 times higher than these estimates: 290000–404000 years, with an average value of 347000 years. More appropriate choices would have resulted in a much lower estimate. Mendez et al's1 other choices, such as the unprecedented 40 years for human generation time, resulted in overestimating the time of divergence by 20–130%.
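The arithmetic in the preceding paragraph can be checked with a few lines of Python. This is only a sketch of the simple d/r calculation described in the text, not the likelihood-based method; the variable and function names are illustrative, while the mutation counts, segment lengths and rates are those quoted above.

```python
# Rates quoted in the text (substitutions per nucleotide per year).
Y_RATE = 1e-9              # Y-chromosome-based estimate
AUTOSOMAL_RATE = 0.617e-9  # autosomally derived value

# (number of mutations, segment length in bases), as quoted in the text.
SEGMENTS = {
    "A00": (43, 240_000),
    "A0": (45, 180_000),
}


def divergence_time(n: int, l: int, r: float) -> float:
    """Simple estimate t = (n / l) / r, without multiple-hit correction."""
    return (n / l) / r


times_y = {name: divergence_time(n, l, Y_RATE) for name, (n, l) in SEGMENTS.items()}
times_auto = {name: divergence_time(n, l, AUTOSOMAL_RATE) for name, (n, l) in SEGMENTS.items()}

# How much smaller the autosomal rate is, and hence how much larger t becomes.
rate_ratio = Y_RATE / AUTOSOMAL_RATE

for name in SEGMENTS:
    print(f"{name}: {times_y[name]:,.0f} years (Y rate), "
          f"{times_auto[name]:,.0f} years (autosomal rate)")
print(f"mean with Y rate: {sum(times_y.values()) / len(times_y):,.0f} years")
print(f"rate ratio: {rate_ratio:.2f}")
```

Because t is inversely proportional to r in this model, every estimate is multiplied by the same rate ratio when the smaller rate is used. The rescaled values come out close to the 290000–404000 year range quoted above; the small differences presumably reflect rounding.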

The second choice concerns the irregular and questionable comparison of mutation numbers based on sequences of unequal lengths. Mendez et al3 compared 240000 bases of the A00 Y-chromosome that contained 43 mutations with 180000 bases of the A0 Y-chromosome that contained 45 mutations. In other words, they used data from two segments, in which one segment was smaller than the other by about 25%. In response to Mendez et al’s3 allegations of ‘misunderstanding of population genetic theory,’ we challenge the authors to come up with one example in the evolutionary literature in which the branches on a phylogenetic tree were estimated by using pairwise distances based on alignments of different lengths. We note that textbooks in molecular evolution (for example, Graur and Li5) specifically caution against such practices.

Conflict of interest

The authors declare no conflict of interest.

References

  1. Mendez FL, Krahn T, Schrack B et al: An African American paternal lineage adds an extremely ancient root to the human Y chromosome phylogenetic tree. Am J Hum Genet 2013; 92: 454–459.
  2. Elhaik E, Tatarinova TV, Klyosov AA, Graur D: The ‘extremely ancient’ chromosome that isn’t: a forensic bioinformatic investigation of Albert Perry’s X-degenerate portion of the Y chromosome. Eur J Hum Genet 2014; 22: 1111–1116.
  3. Mendez FL, Veeramah KR, Thomas MG, Karafet TM, Hammer MF: Reply to “The ‘extremely ancient’ chromosome that isn’t” by Elhaik et al. Am J Hum Genet 2014.
  4. Xue Y, Wang Q, Long Q et al: Human Y chromosome base-substitution mutation rate measured by direct sequencing in a deep-rooting pedigree. Curr Biol 2009; 19: 1453–1457.
  5. Graur D, Li W-H: Fundamentals of Molecular Evolution. Sinauer Associates: Sunderland, MA, USA, 2000.

Acknowledgements

We thank Thomas Krahn for his comments.