Dear Editor,
Canine transmissible venereal tumor (CTVT), the oldest known somatic cell line, is a living fossil of its original founder, transmitted as living cancer cells from one host to other canids during mating.1 Since it was shown ten years ago that living cells from an ancient host could be transmitted among canids, the origin of CTVT has been studied continuously.2 A recent comparison of CTVT genetic data with a more comprehensive canine reference panel, including pre-contact dogs (PCDs) from North America, argued that the CTVT founder (the original canid infected with CTVT) is the closest detectable lineage to PCDs, and that this clade underwent introgression from wild canids in North America.3
However, previous studies may not have taken into account several potential biases in the genotyping methods for CTVT samples and in the strategy for selecting loci (Supplementary information, Note), and the genetic ancestry of the ancient founder of CTVT remains unknown. To address these biases and clarify the founder’s ancestry, we collected new CTVT samples and modern canids, and applied a newly developed tool and a refined analysis strategy.
We generated a high-quality reference panel representing canine genetic diversity with the highest spatial and temporal resolution to date, including whole-genome sequencing (WGS) data of 22 newly collected canids and 81 published ancient and modern worldwide dogs and wild canids (Supplementary information, Table S1 and Note). We then called 24.1 M single nucleotide polymorphisms (SNPs) from these samples (Supplementary information, Methods), forming a dense reference set of markers.
We sequenced two new CTVT samples collected in Kunming, China (Supplementary information, Fig. S1, Methods and Note), and included WGS data of three previously published CTVT samples from Australia, Brazil,1 and Gambia3 (Supplementary information, Fig. S2 and Table S2). Together, these five CTVTs from four continents allow us to exclude lineage-specific somatic mutations.
As chromosomal instability is considered the predominant somatic mutational type in the tumorigenesis of CTVT,4 the copy number variation (CNV) profile is necessary to determine the genotype at local sites. Thus, we developed a transmissible tumor genotyper (ttgeno), the first genotyping tool designed specifically to analyze WGS data from paired transmissible tumors and their hosts, to obtain per-site allelic copy number of the tumor (Supplementary information, Methods). This tool simultaneously takes into account the ploidy, contamination, local copy number states of both host and tumor, and small indels in the tumor, and removes the subclonal factor, as previous studies have shown that CTVT is almost homogeneous.1,4 We genotyped each CTVT using this tool, and obtained successful genotyping rates of 95.5%–97.4%.
The genotyped CTVT genome is composed of a mix of different variants, including systematic errors, alleles inherited from the founder, lineage-specific somatic mutations, and earlier somatic mutations. Assuming a single origin for CTVT,1 lineage-specific somatic mutations can be identified as genotype-polymorphic sites across multiple worldwide CTVT samples, whereas alleles inherited from the founder and earlier somatic mutations should be genotype-monomorphic among CTVT samples. We found ~1.7 G genotype-monomorphic sites, allowing one missing CTVT sample at each site. Another 2.9 M sites were genotype-polymorphic among the five CTVTs, allowing two missing CTVT samples at each site. We used the genotype-polymorphic sites to assess the relationship between these five CTVTs (Supplementary information, Fig. S3) and excluded them from subsequent analyses. Of the ~1.7 G sites that were genotype-monomorphic in the five CTVT samples, 17.4 M sites (2 M non-reference alleles) were biallelic and polymorphic in the reference panel, while 1.5 M sites were private to CTVT samples. Using an assessment strategy based on mutation signatures, we demonstrated that the 17.4 M sites are inherited germline SNPs (Fig. 1a–c; Supplementary information, Figs. S4–7 and Note). Thus, we treated the 17.4 M sites as directly inherited from the hypothetical ancient canid, “the CTVT founder”, and used these sites for subsequent population genetic analyses.
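The partition of sites described above can be illustrated with a minimal Python sketch. It is not the ttgeno implementation: the genotype encoding (strings such as "A/G", with None for a failed call) and the function name classify_site are assumptions made for illustration. Only the missing-sample thresholds (one for monomorphic sites, two for polymorphic sites) come from the text.

```python
from typing import Optional, Sequence


def classify_site(
    genotypes: Sequence[Optional[str]],
    max_missing_mono: int = 1,
    max_missing_poly: int = 2,
) -> str:
    """Classify one site across tumor samples.

    genotypes holds one call per CTVT sample; None marks a missing call.
    Returns 'monomorphic', 'polymorphic', or 'unresolved'.
    """
    called = [g for g in genotypes if g is not None]
    n_missing = len(genotypes) - len(called)
    distinct = set(called)

    if len(distinct) > 1:
        # Samples disagree: candidate lineage-specific somatic mutation.
        return "polymorphic" if n_missing <= max_missing_poly else "unresolved"
    if len(distinct) == 1 and n_missing <= max_missing_mono:
        # All called samples agree: founder allele or early somatic mutation.
        return "monomorphic"
    return "unresolved"
```

Under these assumptions, a site called identically in four of five tumors counts as monomorphic, while a site with two missing calls is retained only if the remaining three calls disagree.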
We used population phylogeny analysis (Fig. 1d and Supplementary information, Figs. S8–12), principal component analysis (PCA, Fig. 1e and Supplementary information, Fig. S13) and outgroup f3(CTVT founder, Pop2; Coyote) statistics5 (Fig. 1f) to assess the genetic relationship between the CTVT founder and samples in the reference panel (Supplementary information, Note). Our results reveal that the CTVT founder was more closely related to PCDs than to any other population (Fig. 1d–f), consistent with Ní Leathlobhair et al.,3 although our results differ in some details (Supplementary information, Note). Meanwhile, the topology of the phylogeny (Fig. 1d) and the spatial position of the PCD/CTVT founder cluster in the PCA (Fig. 1e) both suggest that introgression from wild canids may be present in this PCD clade. ADMIXTURE6 analysis shows that the CTVT founder also possessed ancestral components found predominantly in wild canids (Supplementary information, Fig. S14 and Note).
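For readers unfamiliar with the outgroup f3 statistic, the following Python sketch shows the quantity being estimated: the average product of allele-frequency differences between the outgroup and each of the two test populations, which grows with the genetic drift the two populations share. It is a simplified illustration rather than the procedure used here; the function name and the input format (per-site frequencies of one allele) are assumptions, and the finite-sample bias correction and standard-error estimation used in practice are omitted.

```python
from typing import Sequence


def outgroup_f3(
    p_a: Sequence[float],
    p_b: Sequence[float],
    p_out: Sequence[float],
) -> float:
    """Naive outgroup f3(A, B; O): mean of (o - a) * (o - b) over sites.

    Each argument lists the frequency of the same allele at each site.
    Larger values indicate more drift shared by A and B since their
    split from the outgroup O.
    """
    if not (len(p_a) == len(p_b) == len(p_out)) or len(p_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((o - a) * (o - b) for a, b, o in zip(p_a, p_b, p_out))
    return total / len(p_a)
```

Ranking every Pop2 by its value against a fixed first population, as in Fig. 1f, then orders populations by shared drift with that population.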
To further investigate whether the CTVT founder and PCDs experienced introgression from a population distantly related to dogs, we calculated D-statistics5 to test whether significant asymmetry (positive D value, Z > 3) exists between Pop1 and Pop2 using the form D(Pop1, Pop2; Candidate Introgressor, Andean Fox) (Supplementary information, Table S3 and Note). We tested every non-dog group as a candidate introgressor for the CTVT founder using D(CTVT founder, Pop2; Introgressor, Andean Fox), where Pop2 was each canid population in turn (Supplementary information, Fig. S15). Only coyotes were found to be a robust candidate introgressor. Coyotes from Monterey showed significantly positive D-statistics for most Pop2 populations except the other coyotes, New World wolves, and PCDs (Z > 3.7). Similar to a previous analysis,3 two PCDs (i.e., Port au Choix, Weyanoke Old Town) showed significantly positive D(CTVT founder, Pop2; PCD, Andean Fox) statistics for all Pop2 populations (Z > 46), indicating the close relationship between the CTVT founder and PCDs in our panel. Taken together, the CTVT founder is likely an ancient American dog with introgression from populations carrying ancestry related to coyotes from the Monterey area, California, and Alabama. We also tested whether other dogs (Pop1) underwent introgression from coyotes by using D(Pop1, Pop2; Coyote, Andean Fox), where Pop2 was each of the other groups in turn (Supplementary information, Fig. S16). We found no evidence of introgression from coyotes in any dog population except PCDs and the CTVT founder. Owing to the CTVT founder’s high coverage, we used it as a surrogate for PCDs to test whether any other canids carry ancestry from PCDs (Supplementary information, Fig. S17). Only Arctic sled dogs (ASDs) in North America showed greater similarity to PCDs, followed by Siberian and Alaskan huskies. However, whether asymmetric D-statistics indicate introgression from closely related populations or an inheritance relationship cannot be determined without high-density sampling of ancient and modern PCDs and ASDs over a broad geographical region and time frame.
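The D-statistic test used above can be sketched in Python as follows. The sketch uses the allele-frequency form of D from Patterson et al.5 and a simple delete-one-block jackknife for the standard error. The function names, the input layout, and the use of near-equal contiguous blocks of sites (rather than blocks defined by genomic distance and weighted by size) are simplifying assumptions for illustration, not the pipeline used in this study.

```python
import math
from typing import List, Sequence, Tuple

# Frequencies of one allele in (Pop1, Pop2, candidate introgressor, outgroup).
Site = Tuple[float, float, float, float]


def d_terms(site: Site) -> Tuple[float, float]:
    """Per-site numerator and denominator of D."""
    p1, p2, p3, p4 = site
    num = (p1 - p2) * (p3 - p4)
    den = (p1 + p2 - 2 * p1 * p2) * (p3 + p4 - 2 * p3 * p4)
    return num, den


def d_statistic(sites: Sequence[Site], n_blocks: int = 10) -> Tuple[float, float, float]:
    """Return (D, jackknife standard error, Z).

    Positive D means Pop1 shares more alleles with the candidate
    introgressor than Pop2 does. Assumes informative sites are spread
    over more than one block.
    """
    terms = [d_terms(s) for s in sites]
    n = len(terms)
    if n_blocks < 2 or n < n_blocks:
        raise ValueError("need at least two blocks and one site per block")
    total_num = sum(t[0] for t in terms)
    total_den = sum(t[1] for t in terms)
    if total_den == 0:
        raise ValueError("no informative sites")
    d_all = total_num / total_den

    # Sum numerator and denominator within contiguous blocks of sites.
    block_num = [0.0] * n_blocks
    block_den = [0.0] * n_blocks
    for i, (num, den) in enumerate(terms):
        b = i * n_blocks // n
        block_num[b] += num
        block_den[b] += den

    # Recompute D with each block left out in turn.
    loo: List[float] = [
        (total_num - bn) / (total_den - bd)
        for bn, bd in zip(block_num, block_den)
    ]
    mean_loo = sum(loo) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((x - mean_loo) ** 2 for x in loo)
    se = math.sqrt(var)
    z = d_all / se if se > 0 else math.nan
    return d_all, se, z
```

With inputs ordered as (CTVT founder, Pop2, Introgressor, Andean Fox), a result with Z > 3 would correspond to the significance threshold stated in the text.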
To confirm the introgression from coyotes to the CTVT founder indicated by the D-statistics analyses, we used coyote-specific diagnostic alleles,7 fd-statistics,8 and fdM-statistics9 in sliding windows, as well as RFMix10, to infer local ancestry in the genome of the CTVT founder (Fig. 1g and Supplementary information, Note). The results were consistent across these methods and identified several regions introgressed from coyotes. The estimated introgression rates ranged from 0.9% to 2.6%, depending on the method (Supplementary information, Tables S4–5 and Note).
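As an illustration of the window-based approach, the Python sketch below computes the fd statistic of Martin et al.8 for a single window. The function names and input layout are assumptions for this example: each site is given as derived-allele frequencies in (P1, P2, P3, outgroup), where P2 is the putative recipient (here the CTVT founder), P3 the putative donor (coyote), and P1 a population assumed to lack the introgression. Sliding the window along the genome and filtering windows, as done in practice, is left out.

```python
import math
from typing import Sequence, Tuple

# Derived-allele frequencies in (P1, P2, P3, outgroup) at one site.
Site = Tuple[float, float, float, float]


def abba_minus_baba(p1: float, p2: float, p3: float, po: float) -> float:
    """Expected ABBA count minus expected BABA count at one site."""
    abba = (1 - p1) * p2 * p3 * (1 - po)
    baba = p1 * (1 - p2) * p3 * (1 - po)
    return abba - baba


def fd_window(sites: Sequence[Site]) -> float:
    """fd for one window: observed excess of ABBA over BABA, scaled by
    the excess expected under complete introgression between P3 and P2.

    Returns NaN when the window has no informative sites.
    """
    observed = 0.0
    maximal = 0.0
    for p1, p2, p3, po in sites:
        observed += abba_minus_baba(p1, p2, p3, po)
        # Donor population: whichever of P2 and P3 has the higher
        # derived-allele frequency at this site.
        pd = max(p2, p3)
        maximal += abba_minus_baba(p1, pd, pd, po)
    return observed / maximal if maximal > 0 else math.nan
```

A window in which P2 matches P3 at every informative site gives a value of 1, and a window in which P2 is indistinguishable from P1 gives 0.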
We used TreeMix11 to investigate the genetic relationships among the CTVT founder, PCDs, and other ancient and present-day canids (Supplementary information, Figs. S18–21 and Note). We visualized the matrix of residuals (Supplementary information, Fig. S18b) to determine how well the estimated genetic relationship between each pair of canids fits the model. We found three candidate admixture events: (1) between coyotes and the PCD/CTVT founder, (2) between Siberian and Alaskan huskies, and (3) between Indian and African village dogs. The reticulate maximum-likelihood graph allowing three admixture events includes a migration event from the coyote lineage to the PCD/CTVT founder clade (Fig. 1h; matrix of residuals in Supplementary information, Fig. S21b). Thus, several methods support the presence of gene flow from coyotes to the ancient native dog population represented by the CTVT founder and PCDs. This reticulate graph is also concordant with the Out of Southern East Asia hypothesis for living dogs proposed in a previous study12 (Fig. 1h), which holds that East Asian dogs form the basal clade of all dogs and that two major superclades in the dog phylogeny represent two migration routes, one to the Far East-America region and one to the Indian Peninsula-West Eurasia region.12
The CTVT founder, inferred from the geographically dispersed CTVT samples, is a useful high-quality proxy for PCDs. The CTVT-private genotype-monomorphic sites will greatly aid cancer evolution studies,13 and more importantly, the extraction of the CTVT founder genome from genotype-monomorphic sites in CTVT samples is invaluable to canine population studies. Thus, we provide the genotype-monomorphic diploidized sites of the five geographically dispersed CTVTs in the DogDG database of the iDog14 platform for researchers to conveniently use in future studies.
References
1. Murchison, E. P. et al. Science 343, 437–440 (2014).
2. Ostrander, E. A., Davis, B. W. & Ostrander, G. K. Trends Genet. 32, 1–15 (2016).
3. Ní Leathlobhair, M. et al. Science 361, 81–85 (2018).
4. Ujvari, B., Papenfuss, A. T. & Belov, K. Bioessays 38, S14–S23 (2016).
5. Patterson, N. et al. Genetics 192, 1065–1093 (2012).
6. Alexander, D. H., Novembre, J. & Lange, K. Genome Res. 19, 1655–1664 (2009).
7. Monzon, J., Kays, R. & Dykhuizen, D. E. Mol. Ecol. 23, 182–197 (2014).
8. Martin, S. H., Davey, J. W. & Jiggins, C. D. Mol. Biol. Evol. 32, 244–257 (2015).
9. Malinsky, M. et al. Science 350, 1493–1498 (2015).
10. Maples, B. K., Gravel, S., Kenny, E. E. & Bustamante, C. D. Am. J. Hum. Genet. 93, 278–288 (2013).
11. Pickrell, J. K. & Pritchard, J. K. PLoS Genet. 8, e1002967 (2012).
12. Wang, G.-D. et al. Cell Res. 26, 21–33 (2015).
13. Ostrander, E. A., Dreger, D. L. & Evans, J. M. Annu. Rev. Anim. Biosci. 7, 449–472 (2019).
14. Tang, B. et al. Nucleic Acids Res. 47, D793–D800 (2019).
15. Krzywinski, M. et al. Genome Res. 19, 1639–1645 (2009).
Acknowledgements
We thank Krishna R. Veeramah and Yong Zhou (BGI, Shenzhen, China) for providing valuable suggestions, Qi-Jun Zhou for collecting CTVT samples and laboratory work, Newton O. Otecko for providing dog samples from Kenya, Ming-Shan Wang for collecting dog samples from Sri Lanka, Shi-Fang Wu for collecting other dog samples, Shao-Jie Zhang for managing re-sequencing data, and the BIG Data Center in Beijing Institute of Genomics, Chinese Academy of Sciences (CAS) for usage of their high-performance computing platform. This work was supported by the Strategic Priority Research Program (B) (XDB13000000) of the CAS, the National Natural Science Foundation of China (31621062, 91731304, 41672021 and 41630102), and the Animal Branch of the Germplasm Bank of Wild Species, CAS (the Large Research Infrastructure Funding). Q.F. is supported by CAS (XDA19050102, QYZDB-SSW-DQC003, and XDPB05), and the Howard Hughes Medical Institute (55008731). G.-D.W. is supported by the Youth Innovation Promotion Association, CAS.
Contributions
Y.-P.Z. initiated the project. G.-D.W., Q.F. and Y.-P.Z. managed the project. F.-L.C., S.C.O., A.E., M.M.T., A.D.P. and P.S. collected and provided the samples. T.-T.Y. performed laboratory experiments. T.-T.Y. and X.W. managed the sequencing data. X.W. and B.-W.Z. analyzed and interpreted the data. X.W., M.A.Y., and B.-W.Z. wrote the original manuscript. G.-D.W., Q.F., and Y.-P.Z. reviewed and edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Cite this article
Wang, X., Zhou, BW., Yang, M.A. et al. Canine transmissible venereal tumor genome reveals ancient introgression from coyotes to pre-contact dogs in North America. Cell Res 29, 592–595 (2019). https://doi.org/10.1038/s41422-019-0183-2