Analysis of horse genomes provides insight into the diversification and adaptive evolution of karyotype

Huang, Jinlong; Zhao, Yiping; Shiraigol, Wunierfu; Li, Bei; Bai, Dongyi; Ye, Weixing; Daidiikhuu, Dorjsuren; Yang, Lihua; Jin, Burenqiqige; Zhao, Qinan; Gao, Yahan; Wu, Jing; Bao, Wuyundalai; Li, Anaer; Zhang, Yuhong; Han, Haige; Bai, Haitang; Bao, Yanqing; Zhao, Lele; Zhai, Zhengxiao; Zhao, Wenjing; Sun, Zikui; Zhang, Yan; Meng, He; Dugarjaviin, Manglai

doi:10.1038/srep04958

Download PDF

Article
Open access
Published: 14 May 2014

Analysis of horse genomes provides insight into the diversification and adaptive evolution of karyotype

Jinlong Huang¹^na1,
Yiping Zhao¹^na1,
Wunierfu Shiraigol¹^na1,
Bei Li¹^na1,
Dongyi Bai¹^na1,
Weixing Ye²^na1,
Dorjsuren Daidiikhuu¹^na1,
Lihua Yang¹^na1,
Burenqiqige Jin¹^na1,
Qinan Zhao¹^na1,
Yahan Gao¹^na1,
Jing Wu¹^na1,
Wuyundalai Bao¹^na1,
Anaer Li¹^na1,
Yuhong Zhang¹^na1,
Haige Han¹^na1,
Haitang Bai¹^na1,
Yanqing Bao¹^na1,
Lele Zhao^3,4^na1,
Zhengxiao Zhai^3,4^na1,
Wenjing Zhao^3,4^na1,
Zikui Sun²^na1,
Yan Zhang⁵^na1,
He Meng^3,4^na1 &
…
Manglai Dugarjaviin¹^na1

Scientific Reports volume 4, Article number: 4958 (2014) Cite this article

7832 Accesses
81 Citations
23 Altmetric
Metrics details

Subjects

Abstract

Karyotypic diversification is more prominent in Equus species than in other mammals. Here, using next generation sequencing technology, we generated and de novo assembled quality genomes sequences for a male wild horse (Przewalski's horse) and a male domestic horse (Mongolian horse), with about 93-fold and 91-fold coverage, respectively. Portion of Y chromosome from wild horse assemblies (3 M bp) and Mongolian horse (2 M bp) were also sequenced and de novo assembled. We confirmed a Robertsonian translocation event through the wild horse's chromosomes 23 and 24, which contained sequences that were highly homologous with those on the domestic horse's chromosome 5. The four main types of rearrangement, insertion of unknown origin, inserted duplication, inversion and relocation, are not evenly distributed on all the chromosomes and some chromosomes, such as the X chromosome, contain more rearrangements than others and the number of inversions is far less than the number of insertions and relocations in the horse genome. Furthermore, we discovered the percentages of LINE_L1 and LTR_ERV1 are significantly increased in rearrangement regions. The analysis results of the two representative Equus species genomes improved our knowledge of Equus chromosome rearrangement and karyotype evolution.

Refining the evolutionary tree of the horse Y chromosome

Article Open access 02 June 2023

Trio-binning of a hinny refines the comparative organization of the horse and donkey X chromosomes and reveals novel species-specific features

Article Open access 17 November 2023

Recurrent chromosome reshuffling and the evolution of neo-sex chromosomes in parrots

Article Open access 17 February 2022

Introduction

Horses are recognized as extremely successful domestic animals. Humans in many parts of the world have relied on them for thousands of years¹. The genus Equus originated on the North American continent and migrated 2.6 million years ago over the Bering Strait during the Ice Age². Horses, donkeys and zebras evolved from the same ancestor. The speciation events were accomplished through acute chromosomal rearrangements, with the rearrangement rate ranging from 2.9 to 22.2 per million years, which is significantly higher than in other mammals^{3,4,5,6,7,8,9}. Equus species possess widely varying diploid chromosome numbers, from 2n = 32 (Mountain zebra) to 66 (Przewalski's horse). Przewalski's horse has a different chromosome number than domestic horses because of a Robertsonian translocation, resulting in one pair of metacentric chromosomes (ECA5) split into two pairs of acrocentric chromosomes^10,11,12 (EPR23 and EPR24). Although the offspring produced from a cross between Przewalski's horse and a domestic horse had 65 chromosomes, it was fertile^11,13, unlike the mule (2n = 63, offspring of male donkey and female horse) and the hinny (2n = 63, offspring of male horse and female donkey), which are sterile.

Przewalski's horse (“wild horse” hereafter) is the only wild horse species surviving in the world today¹⁴. Because of environmental change and human activities, this species dropped to only 12 individuals in the middle of the last century. Today, the number has increased to approximately 2000, located in the field or in zoos, but all of them are descendants of those 12 ancestors¹⁵. This event dramatically reduced the genetic variation of the wild horse, which could reduce the ability of the species to adapt to environment change. Severe genetic bottlenecks have also occurred with European bison¹⁶, northern elephant seals¹⁷ and cheetahs¹⁸. Therefore, the wild horse is not only a valuable wildlife resource but also a promising model for the study of population genetics. The Mongolian horse is an ancient horse breed that has been an integral part of the culture of nomadic pastoralists in North Asia. The Mongolian horse has a large population with abundant genetic diversity. This ancient breed has influenced other Northern European horse breeds¹⁹. It has acquired many special abilities and attributes, such as endurance and disease resistance and is well adapted to its harsh conditions—a cold, arid climate and poor grazing opportunities²⁰. Dramatic chromosomal rearrangement in the horse is a notable feature in comparison to other mammals and this makes the horse an ideal model for studying chromosomal evolution.

In this study, we obtained quality whole-genome sequences of a male wild horse and a male Mongolian horse using next-generation sequencing technology. The genome sequences of the two representative Equus species would improve the genomic maps of the horse. Importantly, based on this, we will focus on karyotypic diversification and explore the genetic mechanisms and evolution rules through analysis of comparative genomics, further uncovering the genetic mechanisms of chromosomal evolution for Equus species.

Results

Genome sequencing and assembly

The wild horse and Mongolian horse genomes were sequenced using the Illumina Hiseq platform. A paired-end library (500 bp) and two mate-paired libraries (3 kb and 8 kb) were constructed for both the wild horse and Mongolian horse. In total, we generated 231.21 Gb and 224.17 Gb of usable sequences for the wild horse and Mongolian horse. The sequence depth was 93× and 91×, respectively (Supplementary Table S1). The sequencing error rates were 0.000575 and 0.000507 for the wild horse and Mongolian horse, respectively (Supplementary Table S2). After assembly, both the wild horse and Mongolian horse generated the same length of genome sequences (2.38 Gb) (Supplementary Table S3).

We checked 248 core eukaryotic genes²¹ in our two assemblies and found the completeness was comparable with that of published genomes sequences assemblies^{22,23,24,25,26} (Supplementary Table S4). We did not detect any misassemblies²⁷ when comparing our genome assemblies with the wild horse and Mongolian horse sequences available in Genbank. (Supplementary Table S5).

We assembled Y chromosome of wild horse and Mongolian horse (Fig. 1). In previous studies²⁸, 127 markers on horse Y chromosome were reported and in this study, 87 markers and 103 markers could be detected in wild horse and Mongolian horse assemblies, respectively (Supplementary Table S6, 7). Thus, 34 scaffolds (3,018,288 bp) of wild horse and 48 scaffolds (1,971,029 bp) of Mongolian horse were identified originated from Y chromosome. The length of collinearity regions between wild horse and Mongolian horse was around 1.74 Mbp.

To improve gene prediction accuracy, eight types of tissue samples (heart, liver, spleen, lung, kidney, brain, spinal cord and muscle) from a female Mongolian horse were used to construct cDNA libraries. The RNA-seq was performed using the 454 FLX+ platform and 853,978 reads were obtained with an average length of 458 bp (Supplementary Fig. S1). From these transcriptome data, aided by homology-based gene prediction methods, we estimated that the horse genome contained 20,000 to 21,000 protein-coding genes.

Synteny analysis

Robertsonian translocation, which is also called whole-arm translocation or centric-fusion translocation, is a common form of chromosomal rearrangement. Previous studies based on fluorescence in situ hybridization (FISH) results indicate that chromosomes 23 and 24 of the wild horse are homologous with chromosome 5 of the domestic horse. After assembling the wild horse and Mongolian horse genome, we masked out all repetitive sequences and found that EPR23 and 24 of the wild horse and ECA5 of the Mongolian horse could be aligned to the chromosome 5 of the reference genome²⁹. The five probes (LAMC2, LAMB3, VCAM1, UOX and DIA1), which were used in FISH mapping in previous research to confirm that EPR23, 24 is homologous with ECA5¹², were also identified in both the wild horse and domestic horse genome (Fig. 2).

To study the relationship between Robertsonian translocation and local rearrangement, we performed whole genome synteny analysis. We compared wild horse genome and Mongolian horse genome to the Thoroughbred horse genome, respectively. Collinearity region between Mongolian horse and Thoroughbred horse (2.25 Gbp) was slightly longer than that between wild horse and Thoroughbred horse (2.23 Gbp). 124 Mbp (5.51%) of wild horse genome and 76 Mbp (3.34%) of Mongolian horse genome could not align to Thoroughbred horse genome. Four types of rearrangement, BRK (insertion of unknown origin), DUP (inserted duplication), INV (inversion) and JMP (relocation), were identified (Supplementary Table S8, 9).

Since artifactual mis-joins of assemblies could be counted as rearrangements, we attempted to estimate the correct rate of these rearrangements breakpoints. We remapped the usable reads to the genomes assemblies of wild horse and Mongolian horse, respectively. Then we checked the number of mapped reads in the breakpoint of each type of rearrangements. If the number was less than three, we considered the assembly was incorrect (Supplementary Fig. S2), otherwise correct (Supplementary Fig. S3). We counted 100 breakpoints for each type of rearrangement and calculated the correct rate. The correct rates of INV (92%) and JMP (82%) were higher than those for BRK (76%) and DUP (58%) in assemblies of wild horse. In Mongolian horse, the correct rates were similar with those of wild horse (Supplementary Table S10).

The potential rearrangement sites were investigated for potential synapomorphies. We counted the rearrangement events in two situations: (1) Assume genome sequences of Thoroughbred horse and Mongolian horse were consensus and identify rearrangements in wild horse; (2) Assume genome sequences of Thoroughbred horse and wild horse were consensus and identify rearrangements in Mongolian horse. We found that rearrangement events in the first situation were dramatically more than that in the second (Supplementary Fig. S4). This result was consistent with phylogeny.

The numbers of rearrangements on each chromosome were counted (Supplementary Table S11). Chromosome 5 does not have a greater number of local rearrangements compared with the other chromosomes, although chromosome 5 had undergone Robertsonian translocation (Fig. 3). We noticed that the number of inversions is far less than that of insertions and relocations in the horse genome. Some chromosomes, including the X chromosome, contain more local rearrangements than others. Local rearrangements in the genome of wild horse are more numerous than in that of the Mongolian horse.

Repetitive sequences

Repetitive sequences comprise approximately 50% of the mammal genomes³⁰ and are associated with syntenic breakpoints and chromosomal fragility^31,32,33. Repetitive sequences of six species of mammals (horses²⁹, humans³⁰, mouse³⁴, dogs³⁵, cattle³⁶ and pigs³⁷) were examined in this study (Fig. 4a). Seven common repetitive sequences were identified: short interspersed repeated sequences (SINE), long interspersed repeated sequences (LINE), long terminal repeated (LTR), DNA elements, satellites, simple repeats and low complexity. Broadly, the analysis of these sequences indicated that 41.4% of the horse genome sequences are repetitive sequences, which is comparable to the percentages in humans (46.8%), mouse (42.5%), dogs (40.0%), cattle (47.1%) and pigs (39.1%). LINEs comprise 22.6% of the horse genome, which is more than in humans (19.7%), mouse (19.1%), dogs (19.8%), cattle (21.9%) and pigs (18.4%). SINEs can be found in 7.3% of the horse genome, less than in the human (13.4%), mouse (7.5%), dog (10.6%), cattle (17.0%) and pig (13.0%) genomes (Supplementary Fig. S5).

The distribution of repetitive sequences in each chromosome was also examined. The results indicated that each chromosome contains a similar proportion of repetitive sequences, except the X chromosome, which contains a higher proportion of repetitive sequences than autosomes in the six species.

Using those rearrangement regions of the wild horse genome, we studied the association between rearrangement and repetitive sequences. We found some types of repetitive sequences were significantly increased in the rearrangement regions (Fig. 4b, Supplementary Table S12). This result is consistent with previous findings^31,32. Interestingly, the proportions of LINE_L1 and LTR_ERV1 increased, but the proportions of LINE_L2 and several other repetitive sequences decreased (Fig. 4c, Supplementary Table S13 to S16). This result suggests that LINE_L1 and LTR_ERV1 may play a more important role in chromosome rearrangement.

Heterozygosity analysis

We identified 1,280,203 and 2,203,945 heterozygous SNPs (within an individual) in the genomes of the wild horse and Mongolian horse (Supplementary Table S17 to S19). Small indels were also identified in the genomes of the wild horse and Mongolian horse (Supplementary Table S20). The heterozygosity rates were 0.52 × 10⁻³ and 0.89 × 10⁻³ in the wild horse and Mongolian horse, respectively. The heterozygosity of the wild horse is considerably lower than that of the Mongolian horse.

SNPs were not evenly distributed among the wild horse chromosomes but were evenly distributed in the Mongolian horse chromosomes (Fig. 5a). We explored the heterozygosity rates of different regions using sliding windows of 50 kb with a step size of 10 kb. The number of sliding windows with high heterozygosity rate in Mongolian horse genome was considerable greater than that in wild horse genome (Supplementary Fig. S6). Another interesting phenomenon found in the genome of the wild horse was that heterozygous SNPs were completely excluded in many large regions (Fig. 5b). The sequence coverage of those regions in the wild horse was the same as in the Mongolian horse (Fig. 5c, Supplementary Fig. S7).

In the wild horse, there is a total length of 1287 M of homozygous regions (there are no SNP in wild horse and there are more than 0.8SNP/Kbp in Mongolian horse) and 58 homozygous regions were larger than 1 Mbp (Supplementary Table S21). A total of 4508 genes were located in those homozygous regions. Enrichment analysis indicated that these genes were enriched for specific functional categories of olfactory transduction (n = 118, p_value = 8.90e-12), regulation of cell proliferation (n = 11, p_value = 1.70e-02), calcium ion binding (n = 9, p_value = 1.70e-02) and others.

Discussion

In the past decades, researchers have studied chromosomal rearrangement using different conventional methods such as chromosome banding and FISH. However, many local rearrangements are extremely difficult to detect. Here, we sequenced and de novo assembled the homologous chromosomes that had undergone Robertsonian translocation. Our study indicated that Robertsonian translocation did not increase local rearrangements. These findings indicated that Robertsonian translocation and local rearrangements may be caused by different mechanisms. From our results, inversions are rarer than insertions and relocation, suggesting that insertions and relocation may play a more important role in shaping the genome.

Some studies have demonstrated that repetitive sequences are associated with syntenic breakpoints and chromosomal fragility^31,32. This study did not reveal significant differences in repetitive sequences among different species and different chromosomes (except the X chromosome). Different strategies of genome sequencing (clone by clone and whole genome shot-gun) may impact the actual content of repetitive sequences in the genomes. Our results suggest that chromosomal local rearrangements are highly associated with repetitive sequences. However, these repetitive sequences did not contribute equally to rearrangement. LINE_L1 and LTR_ERV1 may play a more important role than other repetitive sequences.

In the middle of the last century, the population of the wild horse dropped to only 12 individuals. The genetic bottleneck and inbreeding caused by this event may be the reason for the many more homozygous regions in the wild horse genome. One interesting result is that heterozygous SNPs are completely excluded from chromosome 26 of the wild horse. It was the largest fragment without heterozygous SNPs. As sequencing coverage of EPR26 is similar to other autosomes, we confirmed that there was a pair of chromosome 26 in this individual (Supplementary Fig. S8). Another explanation could be that this pair of chromosome 26 was present because of uniparental isodisomy^38,39,40. We also sequenced a short region (~700 bp) in chromosome 26 of several other wild horse samples using the Sanger method and found 6 SNPs, indicating that this region is heterozygous in some other wild horses.

The analysis results of the two representative Equus species improved the genomic maps of the horse. It also revealed the unique aspects of the chromosomal rearrangement and improved our understanding of chromosomal evolution in mammals implicating Equus is thus a promising model to explore the Karyotypic instability. These analysis and discoveries would benefit studies of mammal karyotypic evolution and chromosomal rearrangement and studies of human disease caused by chromosome aberration.

Methods

Sampling and genome sequencing

Protocols used for this experiment were consistent with those approved by the Institutional Animal Care and Use Committee at Inner Mongolia Agricultural University. For sequencing, a male wild horse was selected from the “YE MA International Group” of Xinjiang, China and a Mongolian horse was selected from the Xilingol League of Inner Mongolia, China. DNA was extracted from ear tissue and peripheral blood cells. Illumina HiSeq 2000 was used to sequence the genomes of wild horse and Mongolian horse using a shotgun strategy. A pair-end library (500 bp, standard genomic library and sequenced using paired-end reads) and two mate-pair libraries (3 kb and 8 kb) were constructed for each horse. The length of reads was 101 bp for pair-end library and two mate-pair libraries. Library preparation and sequencing followed the manufacturer's instructions and sequence reads were collected from the Illumina data processing pipeline.

Data filtering

The following types of reads were filtered out: (1) reads with more than 3 unidentified nucleotides, (2) reads with average phred quality below Q30 and (3) reads with unidentified nucleotides in the first 50 nucleotides.

Genome assembly

The genome sequences of the wild horse and Mongolian horse were assembled with short reads using SOAPdenovo⁴¹. We first assembled the short reads of the pair-end library (500 bp) into contigs using sequence overlap information. Then, we used the information of the mate-pair libraries (3 kb and 8 kb) to join the contigs into scaffolds. Finally, “Gapcloser” (http://soap.genomics.org.cn/soapdenovo.html) was used to close the gaps inside the scaffolds.

Genome annotation

For protein-coding gene annotation, ab initio prediction was performed by MAKER⁴². We generated cDNA data from multiple RNA sources. Using an oligodT-based approach, cDNA libraries were constructed from eight types of tissue samples (heart, liver, spleen, lung, kidney, brain, spinal cord and muscle) from a female Mongolian horse. The library was sequenced using a Roche 454 FLX+ platform.

Estimated sequencing error rate

Data from the X chromosome was used to estimate the sequencing error rate, which is hemizygotic in males. The calculations were not influenced by heterozygous SNPs⁴³. All qualified reads from the wild horse and Mongolian horse were mapped to the X chromosome of Equus caballus and all repetitive and low complexity regions were excluded. At each nucleotide position, the predominant call was assumed to be true and all others were considered to be errors.

Synteny analysis

We used the MAUVE program⁴⁴ to construct the synteny map for chromosome 5 of domestic horses (Thoroughbred and Mongolian horse) and chromosomes 23 and 24 of the wild horse. We masked out all repetitive sequences and unique sequences were preserved. Then, we used the Mauve Contig Mover (MCM) to order the draft genomes of the wild horse and Mongolian horse relative to the Thoroughbred horse genome (Equcab 2.0). The synteny analysis used progressiveMAUVE.

We used MUMmer⁴⁵ to perform the synteny analysis for the whole genomes of the wild horse and Mongolian horse, in addition to the reference genome. Four types of rearrangements (BRK, DUP, INV, JMP) were identified using the “nucmer” module. The parameter was Options “-c 800 -g300 –l 100”.

Repetitive sequence analysis

We screened DNA sequences for interspersed repeats and low complexity DNA sequences using RepeatMasker (http://www.repeatmasker.org/) in the collinearity and rearrangement regions. Collinearity regions, which were larger than 100 kb, were used for following analysis and rearrangement loci plus 2 kb extended flanking regions were treated as rearrangement regions. The “T-test” was performed using R software.

SNP calling and heterozygosity rate estimation

We utilized the BWA program⁴⁶ to map the usable reads from the pair-end libraries (500 bp) of the wild horse and Mongolian horse to the genome sequences of Thoroughbred horse (Equcab 2.0). The parameters chosen for mapping were as follows: seed length of 32 and the maximum occurrences for extending a long deletion of 10. Duplicated reads were removed by SAMtools⁴⁷. SNPs and InDels were called using the Genome Analysis Toolkit⁴⁸ according to the guidelines as described.

The heterozygosity rate was estimated as the density of heterozygous SNPs for the whole genome. For the estimation of local heterozygosity rate, sliding windows of 50 kb with 80% overlap between adjacent windows were used to scan the genome.

Additional information

Data Access The Whole Genome Shotgun project has been deposited in DDBJ/EMBL/GenBank as project accession PRJNA200657 and PRJNA200654 of wild horse and Mongolian horse, respectively. The genome assembly of wild horse has been deposited at DDBJ/EMBL/GenBank under the accession ATBW00000000 and this version described in this paper is version ATBW01000000. The genome assembly of Mongolian horse has been deposited at DDBJ/EMBL/GenBank under the accession ATDM00000000 and the version described in this paper is version ATDM01000000. Transcript sequencing data have been deposited under Short Read Archive (SRA) accession SRR1014663.

References

Outram, A. K. et al. The earliest horse harnessing and milking. Science 323, 1332–1335 (2009).
Article ADS CAS PubMed Google Scholar
Lindsay, E. H., Opdyke, N. D. & Johnson, N. M. Pliocene dispersal of the horse Equus and late Cenozoic mammalian dispersal events. Nature 287, 135–138 (1980).
Article ADS Google Scholar
Bush, G. L., Case, S. M., Wilson, A. C. & Patton, J. L. Rapid speciation and chromosomal evolution in mammals. Proc Natl Acad Sci U S A 74, 3942–3946 (1977).
Article ADS CAS PubMed PubMed Central Google Scholar
Dobigny, G., Aniskin, V. & Volobouev, V. Explosive chromosome evolution and speciation in the gerbil genus Taterillus (Rodentia, Gerbillinae): a case of two new cryptic species. Cytogenet Genome Res 96, 117–124 (2002).
Article CAS PubMed Google Scholar
Koehler, U., Bigoni, F., Wienberg, J. & Stanyon, R. Genomic reorganization in the concolor gibbon (Hylobates concolor) revealed by chromosome painting. Genomics 30, 287–292 (1995).
Article CAS PubMed Google Scholar
Murphy, W. J. et al. Dynamics of mammalian chromosome evolution inferred from multispecies comparative maps. Science 309, 613–617 (2005).
Article ADS CAS PubMed Google Scholar
Muller, S., Hollatz, M. & Wienberg, J. Chromosomal phylogeny and evolution of gibbons (Hylobatidae). Hum Genet 113, 493–501 (2003).
Article PubMed Google Scholar
Trifonov, V. A. et al. Multidirectional cross-species painting illuminates the history of karyotypic evolution in Perissodactyla. Chromosome Res 16, 89–107 (2008).
Article CAS PubMed Google Scholar
Piras, F. M. et al. Uncoupling of satellite DNA and centromeric function in the genus Equus. PLoS Genet 6, e1000845 (2010).
Article PubMed PubMed Central CAS Google Scholar
Benirschke, K., Malouf, N., Low, R. J. & Heck, H. Chromosome Complement: Differences between Equus caballus and Equus przewalskii, poliakoff. Science 148, 382–383 (1965).
Article ADS CAS PubMed Google Scholar
Koulischer, L. & Frechkop, S. Chromosome Complement: A Fertile Hybrid between Equus priewalskii and Equus caballus. Science 151, 93–95 (1966).
Article ADS CAS PubMed Google Scholar
Myka, J. L., Lear, T. L., Houck, M. L., Ryder, O. A. & Bailey, E. FISH analysis comparing genome organization in the domestic horse (Equus caballus) to that of the Mongolian wild horse (E. przewalskii). Cytogenet Genome Res 102, 222–225 (2003).
Article CAS PubMed Google Scholar
Ahrens, E. & Stranzinger, G. Comparative chromosomal studies of E. caballus (ECA) and E. przewalskii (EPR) in a female F1 hybrid. J Anim Breed Genet 122 Suppl 1, 97–102 (2005).
Article CAS PubMed Google Scholar
Orlando, L. et al. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499, 74–78 (2013).
Article ADS CAS PubMed Google Scholar
Goto, H. et al. A massively parallel sequencing approach uncovers ancient origins and high genetic variability of endangered Przewalski's horses. Genome Biol Evol 3, 1096–1106 (2011).
Article PubMed PubMed Central Google Scholar
Hartl, G. B. & Pucek, Z. Genetic depletion in the European bison (Bison bonasus) and the significance of electrophoretic heterozygosity for conservation. Conservation biology 8, 167–174 (1994).
Article Google Scholar
Hoelzel, A. R. et al. Elephant seal genetic variation and the use of simulation models to investigate historical population bottlenecks. J Hered 84, 443–449 (1993).
Article CAS PubMed Google Scholar
Menotti-Raymond, M. & O'Brien, S. J. Dating the genetic bottleneck of the African cheetah. Proc Natl Acad Sci U S A 90, 3172–3176 (1993).
Article ADS CAS PubMed PubMed Central Google Scholar
Bjornstad, G., Nilsen, N. O. & Roed, K. H. Genetic relationship between Mongolian and Norwegian horses? Anim Genet 34, 55–58 (2003).
Article CAS PubMed Google Scholar
Li, L. F., Guan, W. J., Hua, Y., Bai, X. J. & Ma, Y. H. Establishment and characterization of a fibroblast cell line from the Mongolian horse. In Vitro Cell Dev Biol Anim 45, 311–316 (2009).
Article CAS PubMed Google Scholar
Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067 (2007).
Article CAS PubMed Google Scholar
Jirimutu et al. Genome sequences of wild and domestic bactrian camels. Nat Commun 3, 1202 (2012).
Article ADS CAS PubMed Google Scholar
Wan, Q. H. et al. Genome analysis and signature discovery for diving and sensory properties of the endangered Chinese alligator. Cell Res 23, 1091–1105 (2013).
Article CAS PubMed PubMed Central Google Scholar
Wang, Z. et al. The draft genomes of soft-shell turtle and green sea turtle yield insights into the development and evolution of the turtle-specific body plan. Nat Genet 45, 701–706 (2013).
Article CAS PubMed PubMed Central Google Scholar
Cho, Y. S. et al. The tiger genome and comparative analysis with lion and snow leopard genomes. Nat Commun 4, 2433 (2013).
Article ADS PubMed CAS Google Scholar
Dong, Y. et al. Sequencing and automated whole-genome optical mapping of the genome of a domestic goat (Capra hircus). Nat Biotechnol 31, 135–141 (2013).
Article CAS PubMed Google Scholar
Gurevich, A., Saveliev, V., Vyahhi, N. & Tesler, G. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072–1075 (2013).
Article CAS PubMed PubMed Central Google Scholar
Raudsepp, T. et al. A detailed physical map of the horse Y chromosome. Proc Natl Acad Sci U S A 101, 9321–9326 (2004).
Article ADS CAS PubMed PubMed Central Google Scholar
Wade, C. M. et al. Genome sequence, comparative analysis and population genetics of the domestic horse. Science 326, 865–867 (2009).
Article ADS CAS PubMed PubMed Central Google Scholar
Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).
Article ADS CAS PubMed Google Scholar
Mikkelsen, T. S. et al. Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. Nature 447, 167–177 (2007).
Article ADS CAS PubMed Google Scholar
Webber, C. & Ponting, C. P. Hotspots of mutation and breakage in dog and human chromosomes. Genome Res 15, 1787–1797 (2005).
Article CAS PubMed PubMed Central Google Scholar
Hu, L. et al. Two replication fork maintenance pathways fuse inverted repeats to rearrange chromosomes. Nature 501, 569–572 (2013).
Article ADS CAS PubMed PubMed Central Google Scholar
Waterston, R. H. et al. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562 (2002).
Article ADS CAS PubMed Google Scholar
Kirkness, E. F. et al. The dog genome: survey sequencing and comparative analysis. Science 301, 1898–1903 (2003).
Article ADS PubMed Google Scholar
Elsik, C. G. et al. The genome sequence of taurine cattle: a window to ruminant biology and evolution. Science 324, 522–528 (2009).
Article ADS PubMed PubMed Central CAS Google Scholar
Groenen, M. A. et al. Analyses of pig genomes provide insight into porcine demography and evolution. Nature 491, 393–398 (2012).
Article ADS CAS PubMed PubMed Central Google Scholar
Engel, E. A new genetic concept: uniparental disomy and its potential effect, isodisomy. Am J Med Genet 6, 137–143 (1980).
Article CAS PubMed Google Scholar
Engel, E. Uniparental disomies in unselected populations. Am J Hum Genet 63, 962–966 (1998).
Article CAS PubMed PubMed Central Google Scholar
Kotzot, D. Complex and segmental uniparental disomy (UPD): review and lessons from rare chromosomal complements. J Med Genet 38, 497–507 (2001).
Article CAS PubMed PubMed Central Google Scholar
Li, R. et al. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res 20, 265–272 (2010).
Article CAS PubMed PubMed Central Google Scholar
Cantarel, B. L. et al. MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res 18, 188–196 (2008).
Article CAS PubMed PubMed Central Google Scholar
Kim, J. I. et al. A highly annotated whole-genome sequence of a Korean individual. Nature 460, 1011–1015 (2009).
Article ADS CAS PubMed PubMed Central Google Scholar
Darling, A. E., Mau, B. & Perna, N. T. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PloS one 5, e11147 (2010).
Article ADS PubMed PubMed Central CAS Google Scholar
Delcher, A. L. et al. Alignment of whole genomes. Nucleic Acids Res 27, 2369–2376 (1999).
Article CAS PubMed PubMed Central Google Scholar
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Article CAS PubMed PubMed Central Google Scholar
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Article PubMed PubMed Central CAS Google Scholar
McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20, 1297–1303 (2010).
Article CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

This work was supported by Ministry of Science and Technology of the People's Republic of China specific scientific and technological cooperation with Russia (2011DFR30860), Inner Mongolia major scientific and technological projects (NK-20120395), National Natural Science Foundation of China (31360538), Inner Mongolia key laboratory project (20130902), National Natural Science Foundation of China (31160446), Inner Mongolia Agricultural University Equine Science and Industrialization Innovation Team Support Program (NDTD2010-12) and Ministry of Agriculture of the People's Republic of China public welfare specific scientific and technological projects (201003075). We thank Dr. Liqing Zhang (Department of Computer Science, Virginia Tech, USA) for valuable comments to this manuscript.

Author information

Huang Jinlong, Zhao Yiping, Shiraigol Wunierfu, Li Bei, Bai Dongyi and Ye Weixing contributed equally to this work.

Authors and Affiliations

College of Animal Science, Inner Mongolia Agricultural University, Hohhot, 010018, P.R. China
Jinlong Huang, Yiping Zhao, Wunierfu Shiraigol, Bei Li, Dongyi Bai, Dorjsuren Daidiikhuu, Lihua Yang, Burenqiqige Jin, Qinan Zhao, Yahan Gao, Jing Wu, Wuyundalai Bao, Anaer Li, Yuhong Zhang, Haige Han, Haitang Bai, Yanqing Bao & Manglai Dugarjaviin
Shanghai Personal Biotechnology Limited Company, 777 Longwu Road, Shanghai, 200236, P.R. China
Weixing Ye & Zikui Sun
School of Agriculture and Biology, Shanghai Jiaotong University,
Lele Zhao, Zhengxiao Zhai, Wenjing Zhao & He Meng
Shanghai Key Laboratory of Veterinary Biotechnology, 800 Dongchuan Road, Shanghai, 200240, P.R. China
Lele Zhao, Zhengxiao Zhai, Wenjing Zhao & He Meng
Virginia Bioinformatics Institute, Virginia Tech, Washington Street, MC0477, Blacksburg, Virginia, 24061, USA
Yan Zhang

Authors

Jinlong Huang
View author publications
You can also search for this author in PubMed Google Scholar
Yiping Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Wunierfu Shiraigol
View author publications
You can also search for this author in PubMed Google Scholar
Bei Li
View author publications
You can also search for this author in PubMed Google Scholar
Dongyi Bai
View author publications
You can also search for this author in PubMed Google Scholar
Weixing Ye
View author publications
You can also search for this author in PubMed Google Scholar
Dorjsuren Daidiikhuu
View author publications
You can also search for this author in PubMed Google Scholar
Lihua Yang
View author publications
You can also search for this author in PubMed Google Scholar
Burenqiqige Jin
View author publications
You can also search for this author in PubMed Google Scholar
Qinan Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Yahan Gao
View author publications
You can also search for this author in PubMed Google Scholar
Jing Wu
View author publications
You can also search for this author in PubMed Google Scholar
Wuyundalai Bao
View author publications
You can also search for this author in PubMed Google Scholar
Anaer Li
View author publications
You can also search for this author in PubMed Google Scholar
Yuhong Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Haige Han
View author publications
You can also search for this author in PubMed Google Scholar
Haitang Bai
View author publications
You can also search for this author in PubMed Google Scholar
Yanqing Bao
View author publications
You can also search for this author in PubMed Google Scholar
Lele Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Zhengxiao Zhai
View author publications
You can also search for this author in PubMed Google Scholar
Wenjing Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Zikui Sun
View author publications
You can also search for this author in PubMed Google Scholar
Yan Zhang
View author publications
You can also search for this author in PubMed Google Scholar
He Meng
View author publications
You can also search for this author in PubMed Google Scholar
Manglai Dugarjaviin
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

M.D., H.M., Ya.Z., Z.S. designed and managed the project. J.H., W.Y., Ya.Z., H.M. designed and performed the genome assembly and analyses. J.H., H.M., Ya.Z., M.D. wrote the paper. D.D., D.B., J.H., Yi.Z., H.H., H.B., Y.B. collected samples and prepared the nucleic acid samples. Yi.Z., W.S., B.L., D.B., L.Y., B.J., Q.Z., Y.G., J.W., W.B., A.L., Yu.Z., L.Z., Z.Z., W.Z. performed the genomes sequencing.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Electronic supplementary material

Supplementary Information

Supplementary material

Supplementary Information

Supplementary table S8

Supplementary Information

Supplementary table S9

Rights and permissions

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. The images in this article are included in the article's Creative Commons license, unless indicated otherwise in the image credit; if the image is not included under the Creative Commons license, users will need to obtain permission from the license holder in order to reproduce the image. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/

Reprints and permissions

About this article

Cite this article

Huang, J., Zhao, Y., Shiraigol, W. et al. Analysis of horse genomes provides insight into the diversification and adaptive evolution of karyotype. Sci Rep 4, 4958 (2014). https://doi.org/10.1038/srep04958

Download citation

Received: 27 December 2013
Accepted: 22 April 2014
Published: 14 May 2014
DOI: https://doi.org/10.1038/srep04958

This article is cited by

SimVec: predicting polypharmacy side effects for new drugs
- Nina Lukashina
- Elena Kartysheva
- Aleksei Shpilman
Journal of Cheminformatics (2022)
Hybrid architecture based on two-dimensional memristor crossbar array and CMOS integrated circuit for edge computing
- Pratik Kumar
- Kaichen Zhu
- Chetan Singh Thakur
npj 2D Materials and Applications (2022)
SPIN enables high throughput species identification of archaeological bone by proteomics
- Patrick Leopold Rüther
- Immanuel Mirnes Husic
- Jesper Velgaard Olsen
Nature Communications (2022)
Evaluation of nutrient characteristics and bacterial community in agricultural soil groups for sustainable land management
- Sumeth Wongkiew
- Pasicha Chaikaew
- Thanachanok Khamkajorn
Scientific Reports (2022)
Topography-driven soil properties modulate effects of nitrogen deposition on soil nitrous oxide sources in a subtropical forest
- Pengpeng Duan
- Xinyi Yang
- Dejun Li
Biology and Fertility of Soils (2022)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Introduction

Results

Genome sequencing and assembly

Synteny analysis

Repetitive sequences

Heterozygosity analysis

Discussion

Methods

Sampling and genome sequencing

Data filtering

Genome assembly

Genome annotation

Estimated sequencing error rate

Synteny analysis

Repetitive sequence analysis

SNP calling and heterozygosity rate estimation

Additional information

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Ethics declarations

Competing interests

Electronic supplementary material

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Comments

Search

Quick links