The genetic map of goldfish (Carassius auratus) provided insights to the divergent genome evolutions in the Cyprinidae family


A high-density linkage map of goldfish (Carassius auratus) was constructed using RNA-sequencing. This map consists of 50 linkage groups with 8,521 SNP markers and an average resolution of 0.62 cM. Approximately 84% of markers are in protein-coding genes orthologous to zebrafish proteins. We performed comparative genome analysis between zebrafish and medaka, common carp, grass carp, and goldfish to study the genome evolution events in the Cyprinidae family. The comparison revealed large synteny blocks among Cyprinidae fish, and we hypothesized that the Cyprinidae ancestor underwent many inter-chromosome rearrangements after speciation from the teleost ancestor. The study also showed that the goldfish genome had one more round of whole genome duplication (WGD) than zebrafish. Our results illustrated that most goldfish markers were orthologous to genes in common carp, which had four rounds of WGD. Growth-related regions and genes were identified by QTL analysis and association study. Functional annotations of the associated genes suggested that they might regulate development and growth in goldfish. This first genetic map enables us to study goldfish genome evolution and provides an important resource for selective breeding of goldfish.


The Cyprinidae family (Teleostei) includes about 3,000 species and is the largest vertebrate family. Chromosome numbers vary greatly within this family. For instance, bighead carp has 48 chromosomes1, while gibel carp has up to 156 chromosomes2. Among the members of this family, genome sequences of zebrafish (Danio rerio)3, grass carp (Ctenopharyngodon idella)4 and common carp (Cyprinus carpio)5 have been published. Cross-species comparisons revealed that grass carp had a chromosome fusion relative to zebrafish and that common carp had one more round of whole genome duplication (WGD) than zebrafish. Whether other species in the Cyprinidae family had WGDs or genome rearrangements is still unknown. Comparing more genomes in this family will help us understand the genome evolution and the genetic basis of post-speciation expansion in Cyprinidae.

Goldfish (Carassius auratus), a member of the Cyprinidae family, has twice as many chromosomes (n = 50) as zebrafish (n = 25). It is speculated to have undergone one more round of WGD than other teleosts6. Teleosts are widely believed to have experienced a “fish-specific” third-round (3R) WGD event. This additional round is therefore considered the fourth round (4R) of WGD, and goldfish would be a valuable model to study the consequences of genome duplication7. Goldfish is unique among vertebrates in that it can survive for an extended period in the complete absence of molecular oxygen8. As one of the earliest domesticated fish9, it has highly varied morphologies, and some variants are important aquarium fish. It is therefore also a useful model to study genetic variation resulting from domestication. These special genomic and phenotypic characteristics argue that goldfish is a suitable model to study genome duplication and physiological adaptation. Hence, establishing goldfish genomic resources will facilitate applications of this system in many fields of study, such as genome evolution, physiology, and neurobiology10. Although a goldfish transcriptome has been reported and provides a resource for functional and comparative genomic analyses11, additional genomic resources are much needed for comprehensive genetic and genomic analysis.

In this study, we performed transcriptome sequencing of all progenies from a mapping family of goldfish. After genotyping, we constructed a high-density genetic map. The markers on the map were mainly located in protein-coding genes. We then examined gene syntenies in goldfish, three Cyprinidae fish (zebrafish, grass carp and common carp), and medaka, to study the genome evolution events in this family. Finally, we identified candidate regions and genes related to goldfish growth traits on this map using QTL analysis and association study.


Genotype calling using RNA-Seq data

We obtained 2.1 Gb of RNA-seq reads for the male parent. After filtering out the low-quality reads, de novo assembly and scaffolding, we obtained 80,298 transcripts with an N50 length of 1,245 bp and total bases of 61.7 Mb (Supplementary Table S1). The length distribution of all transcripts is shown in Supplementary Fig. S1. These transcripts were used as a reference transcriptome for genotyping.

We also sequenced 172 Gb of reads for the female parent and 79 F2 offspring. The raw RNA-sequencing reads of all samples were deposited in the European Nucleotide Archive (ENA) under project accession PRJEB12518. We aligned cleaned RNA-seq reads to the reference transcripts using BWA12. The overall mapping ratio to the reference transcripts was about 70% (Supplementary Table S2). We then used stringent criteria to identify high-quality SNPs in all individuals. Finally, a total of 488,245 loci were genotyped, each of which was covered in at least 90% of individuals and heterozygous in at least one parent.

High-density genetic map

A chi-square test identified 50,353 markers that conformed to the expected Mendelian ratios (p value > 0.01). These markers were located on 21,486 transcripts. Among them, 8,344 transcripts had a minimum of two polymorphic markers. After the number of markers was reduced by selecting representative SNPs, JoinMap assigned 8,521 markers to 50 linkage groups at a LOD threshold of 6. The group number is consistent with the haploid chromosome number of the goldfish13 (Supplementary Fig. S2 and Table S3). The mapped markers were sequenced at high depth in each offspring (31.6 × on average; Supplementary Fig. S3), suggesting that these polymorphic markers were of high quality. The total map length was 5,252 cM with an average marker interval of 0.62 cM (Table 1). LG5, the largest goldfish linkage group, comprised 185 markers spanning 153.5 cM, and LG39, the smallest linkage group, comprised 119 markers spanning 52.3 cM. The 8,521 markers were distributed on 5,202 reference transcripts. All these marker transcripts were longer than 200 bp, and 90% of them were longer than 560 bp (Supplementary Fig. S4). These long sequences are suitable for designing PCR primers, which could be applied in future selective breeding.

Table 1 Characteristics of the goldfish linkage map.

We performed blastx searches of goldfish reference transcripts against zebrafish proteins and found that 33,895 goldfish reference transcripts had significant sequence homology to 17,551 zebrafish genes, an approximately 2:1 mapping ratio. Among the 5,202 marker transcripts on the linkage map, 4,279 (covering 7,170 markers) had orthologous zebrafish protein-coding genes (Supplementary Table S4). This large proportion of protein-coding markers would help us study the genome structure and evolution of goldfish.
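The summary figures quoted for the map can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; counting the number of marker intervals as markers minus linkage groups is our assumption about how the average interval was derived.

```python
# Worked arithmetic on the map summary statistics reported in the text.
total_length_cM = 5252
n_markers = 8521
n_groups = 50

# Assumption: each linkage group with k markers contributes k - 1 intervals.
n_intervals = n_markers - n_groups
avg_interval = total_length_cM / n_intervals

# Share of mapped markers that lie in transcripts with a zebrafish ortholog.
coding_fraction = 7170 / n_markers

# Goldfish transcripts per matched zebrafish gene (the "approximately 2:1" ratio).
transcripts_per_gene = 33895 / 17551

print(f"average interval: {avg_interval:.2f} cM")
print(f"protein-coding markers: {coding_fraction:.0%}")
print(f"transcripts per zebrafish gene: {transcripts_per_gene:.2f}")
```

Dividing by the number of markers instead of the number of intervals gives the same value after rounding to two decimals.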

Discovery of different genome evolution events among Cyprinidae

Comparative analysis among four Cyprinidae fish (zebrafish, grass carp, common carp and goldfish) and medaka was performed to investigate genome rearrangements and duplication events in the Cyprinidae family. Firstly, the orthologous gene pairs between goldfish and zebrafish were used in the following analysis (Supplementary Table S4). The synteny analysis revealed that there was a 2:1 gene-synteny mapping between 50 goldfish linkage groups and 25 zebrafish chromosomes. In general, two duplicated goldfish linkage groups were homologous to one zebrafish chromosome. The 2:1 genome-wide orthologous mapping and the 2:1 chromosome-synteny between goldfish and zebrafish supported the hypothesis that there existed one more round of whole genome duplication in goldfish than in zebrafish (Fig. 1).

Figure 1

Chromosome-synteny analysis between goldfish linkage map and zebrafish genome.

An orthologous gene pair between two species is represented with a red dot. Synteny comparison reveals 2:1 syntenic correspondences between the two species.

Secondly, we identified 13,744 common carp proteins on 50 chromosomes that had orthologous zebrafish proteins. The comparison between the common carp genome and the zebrafish genome revealed 2:1 synteny between 50 common carp chromosomes and 25 zebrafish chromosomes (Fig. 2). Although there were a number of minor chromosome rearrangements on common carp chromosomes, in general two common carp chromosomes were homologous to one zebrafish chromosome. This observation was consistent with previous reports that common carp had one more round of WGD than zebrafish5,14,15.

Figure 2

Comparative genomic analysis of common carp and zebrafish.

Each red dot in the Oxford grid represents the position of an orthologous gene pair in the corresponding common carp and zebrafish genomes. In general, common carp and zebrafish chromosomes exhibit 2:1 correspondence.

Thirdly, we examined gene homology and synteny between goldfish and common carp. In the above two analyses, both genomes had an additional round of WGD compared to zebrafish genome. In each species two linkage groups or chromosomes were homologous to one zebrafish chromosome with 2:1 synteny. A comparison between goldfish linkage groups and common carp chromosomes would reveal distinct genome rearrangements after their WGDs. The best-match reciprocal homolog searches identified 23,001 gene pairs between goldfish and common carp. A total of 3,520 goldfish marker transcripts (with 6,022 markers) had orthologous common carp genes. A 2:2 synteny relationship was observed between goldfish and common carp (Fig. 3), where two duplicated goldfish linkage groups were primarily syntenic to two duplicated common carp chromosomes. For instance, among 157 markers on two goldfish duplicated linkage groups (LG5 and LG6), 95 markers (61%) had orthologs on two common carp duplicated chromosomes (LG5 and LG6). The remaining markers (62) had orthologs on other common carp chromosomes. Six markers on goldfish LG5 and LG6 had orthologs on common carp LG44, indicating many minor inter-chromosome rearrangements between common carp and goldfish after their WGDs.

Figure 3

Genome-wide synteny comparisons between goldfish and common carp.

Overall, two duplicated goldfish linkage groups were orthologous to two duplicated common carp chromosomes, giving a genome-wide 2:2 syntenic correspondence.

Fourthly, 13,445 one-to-one orthologous gene pairs were identified between grass carp and zebrafish on the basis of syntenic blocks found by MCScanX. The Oxford grid between grass carp and zebrafish (Fig. 4) revealed that 21 of 24 grass carp chromosomes had a high degree of conserved synteny with zebrafish chromosomes. Each of the other three grass carp chromosomes (chr13, chr19 and chr24) was syntenic to two zebrafish chromosomes. A major chromosome fusion occurred in grass carp chr24, consistent with the observation of Wang et al.4. This chromosome was orthologous to two zebrafish chromosomes, chr10 and chr22. We observed two additional minor chromosome fusions on grass carp chr13 and chr19. Although most markers on grass carp chr13 were orthologous to genes on zebrafish chr14, a small number of genes on this chromosome had orthologs on zebrafish chr3. Likewise, grass carp chr19 had co-orthology to zebrafish chr16 and chr24. The 1:2 synteny between three grass carp chromosomes and six zebrafish chromosomes indicated that inter-chromosome fusion events had occurred in grass carp.

Figure 4

The comparison between grass carp and zebrafish genomes.

The comparison reveals that three grass carp chromosomes exhibited 1:2 syntenic correspondences to six zebrafish chromosomes, indicating three putative fusion events in grass carp.

The comparisons among four Cyprinidae fish revealed that goldfish and common carp had an additional round of WGD compared to zebrafish and grass carp. To further demonstrate the additional round of WGD in goldfish, the genome sizes of grass carp, zebrafish, common carp and goldfish were estimated using real-time PCR16. The genome sizes were estimated to be 1.03 pg, 1.58 pg, 1.80 pg and 1.78 pg, respectively (Supplementary Table S5, Supplementary Fig. S5 and S6), similar to the published data17. The goldfish genome size is equivalent to that of common carp and approximately 1.7 times that of grass carp. Considering that grass carp is diploid and common carp is tetraploid, the result provided evidence that goldfish underwent an additional WGD compared to grass carp. An ancient intron size expansion led to a larger genome size in zebrafish than in grass carp18. Nevertheless, since zebrafish is diploid like grass carp, it has a smaller genome than goldfish and common carp.
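The size ratios implied by these estimates can be computed directly. The sketch below uses only the four genome sizes reported above; the dictionary and function names are ours.

```python
# Genome size estimates (pg) as reported in the text.
genome_size_pg = {
    "grass carp": 1.03,
    "zebrafish": 1.58,
    "common carp": 1.80,
    "goldfish": 1.78,
}

def size_ratio(species_a, species_b):
    """Return the genome size of species_a divided by that of species_b."""
    return genome_size_pg[species_a] / genome_size_pg[species_b]

for a, b in [("goldfish", "grass carp"),
             ("common carp", "grass carp"),
             ("goldfish", "common carp")]:
    print(f"{a} / {b}: {size_ratio(a, b):.2f}")
```

A ratio below 2 between a putative tetraploid and a diploid relative is not unexpected, since duplicated genomes commonly lose sequence after a WGD; this is a general observation rather than a result of the study.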

Finally, the comparative analysis between zebrafish and medaka showed that complex genome rearrangements took place in the zebrafish genome after its separation from the teleost ancestor (Fig. 5). We retained 9,492 one-to-one orthologous pairs for plotting. Although many zebrafish chromosomes tended to be represented on only one medaka chromosome, we observed both 1:2 and 1:3 synteny relationships between zebrafish and medaka. For instance, the majority of genes on each of zebrafish chromosomes 4, 6, 7, 13, 14, 20 and 21 were homologous to genes on two medaka chromosomes. Furthermore, genes on zebrafish chromosomes 5, 8, 10, 17 and 18 had orthologs that were distributed broadly among three medaka chromosomes.

Figure 5

Genomic comparisons between medaka and zebrafish.

Some zebrafish chromosomes exhibit 1:1, 1:2 and 1:3 correspondences to medaka chromosomes, suggesting complex genome rearrangements during speciation.

Taken together, our genome-wide comparative analysis revealed a high degree of gene synteny among Cyprinidae fish after the speciation of this family from teleost ancestor. We also observed different genome evolution events in the Cyprinidae family. These events might explain diverse phenotypes of members in this largest vertebrate family.

QTL analysis of growth traits

Pearson’s correlation analysis showed that BW and SL were significantly correlated (correlation coefficient of 0.83, p value of 2.2 × 10−16), indicating that these traits might be regulated by a common set of genes. We performed QTL mapping to identify trait-related markers and genes. Firstly, the F values of the dam-based and sire-based analyses showed similar distributions (Table 2, Fig. 6). For BW, the dam-based analysis identified nine significant QTLs. One QTL at 36 cM on LG39 had the highest F value of 18.91, with a 95% CI from 3 cM to 40 cM. The sire-based mapping identified six significant loci, four of which had 95% CIs overlapping those of dam-based loci. For SL, the dam-based analysis identified five significant loci (Table 2 and Fig. 7). One QTL at 40 cM on LG50 had the highest F value of 16.03, with a 95% CI between 37 cM and 82 cM. The sire-based QTL mapping identified three significant QTL regions, two of which had 95% CIs overlapping the dam-based CIs.

Table 2 QTL mapping of BW and SL in goldfish.

Figure 6

BW-related regions in goldfish.

Association analysis (blue points) and QTL mapping were carried out for BW. In QTL scanning, green dots represent sire-based QTL F values and yellow dots are dam-based QTL F values.

Figure 7

SL-related regions in goldfish.

Association analysis (blue points) and QTL mapping were carried out for SL. In QTL scanning, green dots and yellow dots represent sire-based QTL F values and dam-based QTL F values, respectively.

Secondly, comparing the QTL regions for BW and SL revealed a high proportion of overlapping QTLs between the two traits (six out of eight QTL regions for SL, Table 2). In the dam-based QTL genome scanning, the 95% CIs of three QTLs for SL were significantly related to BW. In the sire-based QTL mapping, all QTLs for SL were consistent with those for BW. Furthermore, we identified 180 transcripts (with 237 markers) in the flanking regions of BW-related QTL regions (Supplementary Fig. S7). A total of 135 transcripts (with 167 markers) were located in the flanking regions of SL-related QTL regions (Supplementary Fig. S7). Among the 135 SL-related transcripts, 101 transcripts (with 127 markers) fell within the flanking regions of BW-related QTLs (Supplementary Table S6 and Fig. S7). The high proportion of overlapping genes and markers was consistent with the correlation analysis, suggesting that BW and SL might be regulated by a common set of genes.

The association tests using PLINK also showed similar distributions of p values for BW and SL (Figs 6 and 7, and Supplementary Table S7). We identified 131 SNPs significantly associated with BW and/or SL. Most of these SNPs (71%, 93 out of 131) fell within the 95% CIs of QTL regions for the two traits, generally supporting the QTL mapping results.

Further, we studied the functions of genes in QTL regions. Among the transcripts in the flanking regions of BW-related and SL-related QTL regions, Blast2GO assigned GO annotations to 123 BW-related transcripts and 93 SL-related transcripts (Supplementary Table S8). The GO annotations of these genes included developmental process, anatomical structure formation and growth (Supplementary Fig. S7), indicating that these genes might be involved in growth regulation. The functions of orthologs in other species would help us understand the mechanisms of growth-related traits in goldfish. For instance, connective tissue growth factor (CTGF, marker accession 000014952_1991) was found to be significantly associated with goldfish BW and SL by QTL mapping. Previous studies showed that CTGF down-regulated bone development19,20,21. In CTGF-null mice, multiple skeletal dysmorphisms resulted from impaired growth plate chondrogenesis, angiogenesis, and bone formation, suggesting the physiological significance of CTGF in development22. The GO annotations of these transcripts and the participation of their orthologs in growth and development supported the association between these genes and goldfish growth traits. The identified growth-related genes and markers may be applied in future selective breeding of goldfish.


Genetic maps are widely applied in genetics and genomics studies23. In this study, we generated the first high-density linkage map of goldfish using SNPs. This genetic map consists of 8,521 markers, with an average resolution of 0.62 cM, which provided sufficient resolution for QTL mapping. The other feature of this map is the high proportion (84%) of markers in protein-coding genes, which enabled us to perform a genome-wide comparative analysis and discover trait-related genes.

These features of the genetic map allowed us to perform chromosome-level comparative analysis in the Cyprinidae family and provided evidence for an additional round of WGD in goldfish. Firstly, the large-scale synteny among members of the Cyprinidae family and the presence of 1:1, 1:2 and 1:3 synteny correspondences between zebrafish and medaka indicated that complex genome rearrangements took place in the Cyprinidae ancestor after its separation from other teleosts. Secondly, the comparison revealed that different types of genome evolution events, including fusion, fission, and duplication, occurred in the family, which might be responsible for the species expansion in Cyprinidae and the substantial phenotypic diversity among its members. Thirdly, goldfish chromosomes showed 2:1 synteny to zebrafish chromosomes, providing evidence for the fourth round of WGD in goldfish. Fourthly, comparison between the common carp genome and the goldfish map revealed a 2:2 correspondence. Li et al. proposed that the fourth round of WGD occurred in a common ancestor of common carp and goldfish6. On the basis of our results, we propose that after separating from this ancestor, common carp and goldfish experienced different genome rearrangement processes but in general both kept the duplicated genome components.

Furthermore, we used this high-resolution genetic map to detect growth-related markers and genes. QTL mapping and association study were performed together to identify genetic loci responsible for these traits. GO annotations provide hints that the identified genes might regulate growth. The markers form a valuable resource for future marker-assisted selection. This map can potentially be used to analyze other important traits, including hypoxia tolerance and other environmental adaptabilities.

In conclusion, we identified SNPs in a mapping family using RNA-seq and constructed a high-resolution genetic map of goldfish. To our knowledge, this is the most comprehensive genetic map to date for this important species. Using this map, we revealed divergent genome evolutions in the Cyprinidae family. Through QTL mapping analysis based on this map, we identified QTL regions and markers significantly related to BW and SL. The markers can potentially aid growth breeding of goldfish. The high-density linkage map will facilitate the genome-wide comparative genomics analysis and the mapping of phenotypes in goldfish.


Ethics Statement

All experiments involving the handling and treatment of fish in this study were approved by the Animal Care and Use committee of Heilongjiang River Fisheries Research Institute of Chinese Academy of Fishery Sciences (HRFRI). The methods were carried out in accordance with approved guidelines.

Mapping family and sequencing

Two parents and 79 F2 individuals of one full-sib goldfish family were selected for study. This family was constructed at Hulan Experiment Station of HRFRI, Harbin, Heilongjiang Province, China. Growth-related traits including body weight (BW) and standard length (SL) were measured for all progenies.

For each individual, twelve organs including scale, skin, muscle, eye, brain, liver, kidney, spleen, heart, blood, intestine and gonad were collected. Total RNAs were extracted from each tissue separately using Trizol reagent (Invitrogen, CA, USA), and then treated with RNase-free DNaseI (NEB, MA, USA) to remove genomic DNA. Equal quantities of total RNAs from each tissue were mixed. The integrity of pooled RNAs was analyzed on a Bioanalyzer 2100 machine (Agilent, CA, USA). About 3 μg of pooled RNA from each individual was used for RNA-seq library construction.

For each sample, a sequencing library with an insert size about 300 bp was constructed with TruSeq RNA Sample Preparation Kit (Illumina, CA, USA). The library was sequenced on the HiSeq2000 platform with the 2 × 100 bp mode at Berry Genomics Co., Ltd (Beijing, China).

Constructing reference gene set and calling SNPs

For each individual, the raw transcriptome reads were processed using SolexaQA24 to filter low-quality reads. The high-quality reads of male parent were assembled using Trinity25 with default parameters. We selected the longest transcript to represent each gene and then subjected the selected transcripts to SSPACE26 for scaffolding. The scaffolding would elongate the transcripts. The scaffolded transcripts were used as the reference set for further genotyping.

The cleaned RNA-seq reads of the two parents and 79 F2 progenies were aligned to the reference transcripts using BWA12 with default parameters. Because the high sequence similarity of duplicated genes might cause sequencing reads to align to multiple locations, we identified SNPs based on the subset of uniquely aligned reads. SAMtools27 was used to call variants. We retained high-quality SNPs that had a minimal sequencing depth of five and a minimal mapping quality of 20. SNPs that were homozygous in both parents or absent in more than 10% of the offspring were removed.
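The filtering rules above can be expressed as a small predicate. This is a minimal sketch, not the pipeline used in the study: the `SnpCall` record and the function names are hypothetical, and applying the depth and mapping-quality thresholds per individual call is our assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class SnpCall:
    """Hypothetical per-individual call at one SNP site."""
    genotype: Optional[str]  # two-letter genotype such as "AG"; None if no call
    depth: int               # number of reads covering the site
    mapq: int                # mapping quality supporting the call

def is_reliable(call: SnpCall, min_depth: int = 5, min_mapq: int = 20) -> bool:
    """A call counts as present only if it meets both quality thresholds."""
    return (call.genotype is not None
            and call.depth >= min_depth
            and call.mapq >= min_mapq)

def is_heterozygous(call: SnpCall) -> bool:
    return call.genotype[0] != call.genotype[1]

def keep_snp(sire: SnpCall, dam: SnpCall, offspring: Sequence[SnpCall],
             max_missing: float = 0.10) -> bool:
    """Keep a site if at least one parent is heterozygous and at most
    max_missing of the offspring lack a reliable call."""
    if not (is_reliable(sire) and is_reliable(dam)):
        return False
    if not (is_heterozygous(sire) or is_heterozygous(dam)):
        return False
    n_missing = sum(1 for call in offspring if not is_reliable(call))
    return n_missing / len(offspring) <= max_missing
```

With 79 offspring, the 10% rule tolerates at most seven individuals without a reliable call at a site.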

Constructing linkage map and annotating the coding marker transcripts

The retained markers were considered to be of high quality and included for further analysis. We performed a chi-square test to identify markers conforming to the expected Mendelian ratio (p value > 0.01). The markers that satisfied Mendelian segregation were subjected to JoinMap28 to construct a linkage map. Because JoinMap could only process fewer than 5,500 markers at a time29, we adopted three strategies to reduce the computational time. Firstly, if multiple markers in a transcript had identical genotypes, we selected the marker genotyped in the most individuals to represent these markers. Secondly, for a short transcript (<500 bp in length) that harbored multiple markers, only the marker sequenced in the most individuals was selected to represent this locus. Thirdly, we assigned markers to different linkage groups (LGs) using a single-linkage clustering algorithm with a pair-wise modified independent logarithm of odds (LOD) score30. The LOD threshold for grouping was set to 6. In each LG, the recombination rates and map distances between markers were calculated using JoinMap with the CP population type (cross pollinator, or full-sib family), the Kosambi mapping function31 and the regression mapping algorithm.
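Two of the calculations named above can be sketched compactly: the chi-square test for Mendelian segregation and the Kosambi mapping function. The sketch is illustrative and is not JoinMap's implementation. The expected ratios (1:1 when one parent is heterozygous, 1:2:1 when both are) are the standard expectations for a full-sib family, and the closed-form p values used here cover only 1 and 2 degrees of freedom.

```python
import math

def chi2_pvalue(statistic, df):
    """Upper-tail chi-square p value; closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        return math.exp(-statistic / 2.0)
    raise ValueError("this sketch supports df = 1 or df = 2 only")

def mendelian_pvalue(observed, expected_ratio):
    """Chi-square goodness-of-fit p value for genotype class counts.

    observed       -- counts per genotype class, e.g. (40, 39)
    expected_ratio -- Mendelian ratio, e.g. (1, 1) or (1, 2, 1)
    """
    n = sum(observed)
    ratio_total = sum(expected_ratio)
    statistic = 0.0
    for count, part in zip(observed, expected_ratio):
        expected = n * part / ratio_total
        statistic += (count - expected) ** 2 / expected
    return chi2_pvalue(statistic, len(observed) - 1)

def kosambi_cM(r):
    """Kosambi map distance in cM for a recombination fraction 0 <= r < 0.5."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

# A marker is retained when its p value exceeds 0.01.
print(mendelian_pvalue((40, 39), (1, 1)) > 0.01)   # balanced 1:1 segregation
print(mendelian_pvalue((60, 19), (1, 1)) > 0.01)   # strongly distorted
print(round(kosambi_cM(0.10), 2))
```

The Kosambi function converts a recombination fraction into an additive distance while allowing for crossover interference, which is why a recombination fraction of 0.10 maps to slightly more than 10 cM.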

Because the markers were developed from RNA-seq data, a significant proportion of them were expected to lie in protein-coding genes. Identifying the protein-coding markers would help us perform the comparative analysis and detect the trait-associated genes. All goldfish reference transcripts were aligned against zebrafish proteins from the Ensembl database32 using blastx33 with an e-value cutoff of 1 × 10−5. For each marker transcript, we selected the best-aligned zebrafish gene.

Comparative genomic analysis among Cyprinidae fish and medaka

To study the genome evolution events among Cyprinidae fish, we performed comparative analysis among four Cyprinidae fish (zebrafish, grass carp, common carp and goldfish) and medaka. The zebrafish genome was selected as the reference since it has been well annotated and is almost finished3. Firstly, the goldfish genetic map was compared with the zebrafish genome. To facilitate further comparative analysis with the zebrafish genome, we ordered goldfish linkage groups based on the marker similarity to zebrafish proteins. Owing to one more round of WGD, goldfish has twice as many chromosomes (n = 50) as zebrafish (n = 25). If a majority of markers on one goldfish LG were orthologous to genes on a zebrafish chromosome, this LG was named either (2n−1) or (2n), where n was the zebrafish chromosome number. Based on the association of goldfish transcripts with zebrafish orthologs, we constructed an Oxford grid34 by placing all goldfish coding markers according to their order in the genetic map on the horizontal axis and plotting zebrafish orthologs on the vertical axis.
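The naming rule for linkage groups can be illustrated as follows. This is a sketch under stated assumptions: the input format and function name are hypothetical, "majority" is taken to mean more than half of the markers, and the text does not say which of two duplicated groups receives (2n−1) and which (2n), so the sketch simply uses sorted order.

```python
from collections import Counter, defaultdict

def name_linkage_groups(lg_to_zf_chroms):
    """Assign each provisional linkage group the name 2n-1 or 2n.

    lg_to_zf_chroms -- {provisional LG id: [zebrafish chromosome number of
                        each marker's ortholog]}
    Groups without a majority chromosome, and any third group mapping to the
    same zebrafish chromosome, are left unnamed.
    """
    groups_by_chrom = defaultdict(list)
    for lg, chroms in lg_to_zf_chroms.items():
        chrom, count = Counter(chroms).most_common(1)[0]
        if 2 * count > len(chroms):          # strict majority of markers
            groups_by_chrom[chrom].append(lg)

    names = {}
    for n, lgs in groups_by_chrom.items():
        # Assumption: first group in sorted order gets 2n-1, the second 2n.
        for offset, lg in zip((1, 0), sorted(lgs)):
            names[lg] = 2 * n - offset
    return names

example = {
    "groupA": [5, 5, 5, 3],   # mostly zebrafish chr5
    "groupB": [5, 5, 12],     # mostly zebrafish chr5
    "groupC": [1, 1, 1],      # zebrafish chr1
}
print(name_linkage_groups(example))
```

Under this scheme the two goldfish groups homologous to zebrafish chromosome 5 become LG9 and LG10, so duplicated groups sit next to each other on the horizontal axis of the Oxford grid.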

Secondly, we compared the common carp genome against the zebrafish genome. The updated common carp genes were downloaded from CarpBase. All-against-all protein alignments were performed using blastp, with an e-value cutoff of 1 × 10−5. The alignments were subjected to MCScanX35 to identify syntenic blocks between the two species. Two chromosome regions were considered syntenic if they shared at least five genes, with the gap size set to 15 genes5. For each common carp gene, we selected the best-aligned zebrafish gene. Common carp genes and zebrafish genes were plotted on the horizontal axis and the vertical axis of a grid, respectively.

Thirdly, we investigated the homologous and syntenic relationships between common carp and goldfish. Reciprocal blastx searches were performed using goldfish reference transcripts as queries against common carp proteins with an e-value cutoff of 1 × 10−5. Two sequences were defined as orthologs if each of them was the best hit of the other. The pairs between goldfish marker transcripts and the corresponding common carp orthologs were then plotted on a grid based on the position of each gene.
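The reciprocal-best-hit rule can be sketched as below. The input is assumed to be two lists of (query, subject, score) tuples parsed from the two search directions; ranking hits by bit score is our assumption, since the text does not state the ranking criterion, and the identifiers in the example are made up.

```python
def best_hits(hits):
    """Return {query: best subject} from (query, subject, score) tuples."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)

# Made-up identifiers for illustration only.
goldfish_vs_carp = [
    ("gf_t1", "cc_p1", 250.0),
    ("gf_t1", "cc_p2", 180.0),
    ("gf_t2", "cc_p2", 300.0),
    ("gf_t3", "cc_p2", 120.0),
]
carp_vs_goldfish = [
    ("cc_p1", "gf_t1", 245.0),
    ("cc_p2", "gf_t2", 295.0),
    ("cc_p2", "gf_t3", 110.0),
]
print(reciprocal_best_hits(goldfish_vs_carp, carp_vs_goldfish))
```

Here gf_t3 is dropped because its best hit, cc_p2, prefers gf_t2. This strictness is why the rule yields one-to-one pairs even between two duplicated genomes.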

Fourthly, we compared the grass carp genome against the zebrafish genome. Grass carp proteins and their locations were downloaded from the National Center for Gene Research website. The blastp alignment and MCScanX searches were performed following the above criteria to build syntenic blocks. Both grass carp and zebrafish had species-specific gene duplications3,4, which might result in one-to-many orthologous pairs. We retained the reciprocal best-match ortholog pairs for each grass carp gene and zebrafish gene from the syntenic blocks, to ensure that each gene of a species had only one best-aligned hit in the other species. On an Oxford grid, grass carp genes were plotted on the horizontal axis based on their order in the genome and zebrafish orthologs were plotted on the vertical axis.

Finally, we examined the syntenic relationship between zebrafish and medaka. Since the medaka genome is considered to preserve the teleost ancestor karyotype36, comparison between the medaka and zebrafish genomes will reveal the genome rearrangements in zebrafish after its speciation from the teleost ancestor. Medaka proteins were obtained from the Ensembl database. After syntenic blocks were identified using MCScanX, only one-to-one orthologous pairs were retained for the following analysis. Medaka genes were plotted on the horizontal axis of an Oxford grid and the zebrafish orthologs were plotted on the vertical axis.

Estimation of genome sizes

We adopted the strategy of Wilhelm et al.16 to estimate the genome sizes of grass carp, zebrafish, common carp, and goldfish. In brief, we designed two sets of gene-specific primers, the outer primers and the inner primers (Supplementary Table S9, Supplementary Fig. S8). The outer primers were used in a first round of PCR to prepare the standard templates for the second round of real-time PCR. During the following round of real-time PCR with the inner primers, standard curves were drawn using diluted standard templates of different concentrations. The genomic DNA of each target species was then used as template in real-time PCR with the same volume and protocol. We compared its amplification curve with the standard curves to estimate the copy number, and then calculated the genome size from the quantified DNA concentration and copy number. Details of the procedure are described in the Supplementary Methods.
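The calculation behind this approach can be sketched as a standard-curve fit followed by a division of DNA mass by copy number. This is a simplified illustration with made-up numbers, not the protocol in the Supplementary Methods; it assumes the target gene is present once per haploid genome and that threshold cycle (Ct) is linear in log10 of the template copy number. All function names are ours.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to get the template copy number."""
    return 10 ** ((ct - intercept) / slope)

def genome_size_pg(template_mass_pg, ct, slope, intercept):
    """Haploid genome mass = genomic DNA mass / number of genome copies."""
    return template_mass_pg / copies_from_ct(ct, slope, intercept)

# Made-up dilution series of a standard template and its Ct values.
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [30.1, 26.8, 23.5, 20.2, 16.9]
slope, intercept = fit_standard_curve(standard_copies, standard_ct)

# Made-up sample: 178 ng (178,000 pg) of genomic DNA amplifying at Ct 23.5.
print(genome_size_pg(178000.0, 23.5, slope, intercept))
```

In this toy example the sample Ct corresponds to 100,000 genome copies, so the estimate is the input mass divided by 100,000.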

QTL mapping and association analyses of growth traits

QTL mapping analysis was performed for two growth traits, BW and SL, using GridQTL with the regression-interval mapping method37. Considering the different recombination frequencies between sire and dam, we identified QTL regions using sire-based and dam-based half-sib analyses, respectively. For each analysis, the F value was calculated at 1-centimorgan (cM) intervals on each LG. To identify the significant QTL regions, the chromosome-wide threshold was determined using an empirical permutation method38 with 10,000 permutations. If the F value of a region exceeded the chromosome-wide threshold at p value < 0.05, we considered it a significant QTL region. The 95% confidence interval (CI) of each QTL region was then calculated using a bootstrapping algorithm39 with 1,000 resamplings. The marker transcripts in the two 5-cM flanking regions of each QTL peak were selected to study the functions of trait-related genes.
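The idea of an empirical chromosome-wide threshold can be sketched as follows. This is not GridQTL's regression-interval mapping: for brevity it swaps in a single-marker one-way ANOVA F value with two genotype classes, and all names and data are hypothetical. The permutation logic is the part being illustrated: phenotypes are shuffled, the largest F value along the linkage group is recorded for each shuffle, and the 95th percentile of those maxima is the threshold.

```python
import random

def f_statistic(phenotypes, genotypes):
    """One-way ANOVA F value for a marker with genotype classes 0 and 1.

    Assumes both classes are present and within-class variance is non-zero.
    """
    classes = {0: [], 1: []}
    for value, genotype in zip(phenotypes, genotypes):
        classes[genotype].append(value)
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    ss_between = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2
                     for v in classes.values())
    ss_within = sum((y - sum(v) / len(v)) ** 2
                    for v in classes.values() for y in v)
    return ss_between / (ss_within / (n - 2))

def scan(phenotypes, markers):
    """F value at every marker of one linkage group."""
    return [f_statistic(phenotypes, marker) for marker in markers]

def permutation_threshold(phenotypes, markers, n_perm=10000, alpha=0.05, seed=0):
    """Chromosome-wide threshold: (1 - alpha) quantile of permuted maxima."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(scan(shuffled, markers)))
    maxima.sort()
    return maxima[min(n_perm - 1, int((1 - alpha) * n_perm))]

# Toy linkage group with two markers and 20 individuals (made-up data).
marker_1 = [0] * 10 + [1] * 10
marker_2 = [0, 1] * 10
body_weight = [10 + 5 * g + 0.1 * (i % 5) for i, g in enumerate(marker_1)]

observed = scan(body_weight, [marker_1, marker_2])
threshold = permutation_threshold(body_weight, [marker_1, marker_2], n_perm=200)
print([f > threshold for f in observed])
```

Taking the maximum over the whole linkage group in each permutation is what makes the threshold chromosome-wide rather than per-marker.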

The association analyses were performed between genotypes and two traits using PLINK40, as a complementary approach to QTL mapping. To determine the threshold to identify the significantly associated SNPs, we firstly detected the independent SNPs41. We calculated linkage disequilibrium for the F2 population according to r2 value with the window width of 50 SNPs and the stepwise distance of five SNPs. An r2 threshold of 0.5 was set to detect independent SNPs42. We identified 171 independent SNPs. Secondly, the suggestive significance threshold was set as 5.8 × 10−3 (1/171). Markers with p values ≤ 5.8 × 10−3 were considered to be significantly associated with growth traits.
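The threshold arithmetic and the idea of selecting independent SNPs can be sketched as below. The pruning routine is a simplified greedy stand-in written for illustration, not PLINK's algorithm: it slides a window along the map and, for each pair with r2 above the cutoff, drops the later SNP. Genotypes are assumed to be coded as allele dosages (0, 1 or 2), and the example data are made up.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)

def prune_by_ld(genotypes, window=50, step=5, r2_max=0.5):
    """Return indices of SNPs kept after greedy pairwise LD pruning.

    genotypes -- list of per-SNP dosage vectors, in map order
    """
    removed = set()
    n = len(genotypes)
    start = 0
    while start < n:
        in_window = [i for i in range(start, min(start + window, n))
                     if i not in removed]
        for pos, i in enumerate(in_window):
            if i in removed:
                continue
            for j in in_window[pos + 1:]:
                if j not in removed and r_squared(genotypes[i], genotypes[j]) > r2_max:
                    removed.add(j)
        start += step
    return [i for i in range(n) if i not in removed]

def suggestive_threshold(n_independent):
    """One expected false positive per scan: p = 1 / number of independent SNPs."""
    return 1.0 / n_independent

toy_genotypes = [
    [0, 1, 2, 1, 0, 2],
    [0, 1, 2, 1, 0, 2],   # identical to the first SNP, so r2 = 1
    [2, 0, 1, 0, 2, 1],   # r2 = 0.25 with the first SNP
]
print(prune_by_ld(toy_genotypes))
print(f"{suggestive_threshold(171):.1e}")
```

Setting the threshold to 1/M, with M the number of independent SNPs, corresponds to expecting about one false positive per genome scan, which is less stringent than a Bonferroni correction at 0.05/M.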

To annotate the functions of the growth-associated genes, we searched for their orthologs using blastx against the NCBI NR database with an e-value threshold of 10⁻⁵. We then used Blast2GO43 with default parameters to assign Gene Ontology (GO) annotations to the goldfish transcripts. The GO distributions were plotted using WEGO44.
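Applying the e-value cut-off to search results can be sketched as follows. The sketch assumes the standard 12-column tabular BLAST output (query, subject, identity, length, mismatches, gap opens, query start and end, subject start and end, e-value, bit score), treats the threshold as inclusive, and keeps the highest-scoring hit per transcript. The function name and the example rows are hypothetical.

```python
import csv
import io


def best_hits(tabular_text, max_evalue=1e-5):
    """Return the best hit per query among hits with e-value <= max_evalue."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical blastx rows: transcript t2 has no hit passing the threshold.
example = (
    "t1\tprotA\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-30\t200\n"
    "t1\tprotB\t95.0\t100\t5\t0\t1\t300\t1\t100\t1e-40\t250\n"
    "t2\tprotC\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.01\t30\n"
)
hits = best_hits(example)
print(hits)
```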

Additional Information

How to cite this article: Kuang, Y.-Y. et al. The genetic map of goldfish (Carassius auratus) provided insights to the divergent genome evolutions in the Cyprinidae family. Sci. Rep. 6, 34849; doi: 10.1038/srep34849 (2016).


  1. Zhu, C. et al. A second-generation genetic linkage map for bighead carp (Aristichthys nobilis) based on microsatellite markers. Anim. Genet. 45, 699–708 (2014).

  2. Zhou, L. & Gui, J. F. Karyotypic Diversity in Polyploid Gibel Carp, Carassius auratus gibelio Bloch. Genetica 115, 223–232 (2002).

  3. Howe, K. et al. The zebrafish reference genome sequence and its relationship to the human genome. Nature 496, 498–503 (2013).

  4. Wang, Y. et al. The draft genome of the grass carp (Ctenopharyngodon idellus) provides insights into its evolution and vegetarian adaptation. Nat. Genet. 47, 625–631 (2015).

  5. Xu, P. et al. Genome sequence and genetic diversity of the common carp, Cyprinus carpio. Nat. Genet. 46, 1212–1219 (2014).

  6. Li, X. Y. et al. Evolutionary history of two divergent Dmrt1 genes reveals two rounds of polyploidy origins in gibel carp. Mol. Phylogenet. Evol. 78, 96–104 (2014).

  7. Yuan, J. et al. Speciation of polyploid Cyprinidae fish of common carp, crucian carp, and silver crucian carp derived from duplicated Hox genes. J. Exp. Zool. 314, 445–456 (2010).

  8. Stensløkken, K. O. et al. Life without oxygen: gene regulatory responses of the crucian carp (Carassius carassius) heart subjected to chronic anoxia. PLoS One 9, e109978 (2014).

  9. Komiyama, T. et al. An evolutionary origin and selection process of goldfish. Gene 430, 5–11 (2009).

  10. Popesku, J. T. et al. The goldfish (Carassius auratus) as a model for neuroendocrine signaling. Mol. Cell. Endocrinol. 293, 43–56 (2008).

  11. Li, C. Y. et al. The transcriptomes of the crucian carp complex (Carassius auratus) provide insights into the distinction between unisexual triploids and sexual diploids. Int. J. Mol. Sci. 15, 9386–9406 (2014).

  12. Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).

  13. Xiao, J. et al. Coexistence of diploid, triploid and tetraploid crucian carp (Carassius auratus) in natural waters. BMC Genet. 12, 20 (2011).

  14. Zheng, X. et al. A genetic linkage map and comparative genome analysis of common carp (Cyprinus carpio L.) using microsatellites and SNPs. Mol. Genet. Genomics 286, 261–277 (2011).

  15. Zhang, X. et al. A consensus linkage map provides insights on genome character and evolution in common carp (Cyprinus carpio L.). Mar. Biotechnol. 15, 275–312 (2013).

  16. Wilhelm, J. Real-time PCR-based method for the estimation of genome sizes. Nucleic Acids Res. 31, e56 (2003).

  17. Gregory, T. R. Animal genome size database. (2016).

  18. Moss, S. P., Joyce, D. A., Humphries, S., Tindall, K. J. & Lunt, D. H. Comparative Analysis of Teleost Genome Sequences Reveals an Ancient Intron Size Expansion in the Zebrafish Lineage. Genome Biol. Evol. 3, 1187–1196 (2011).

  19. Nakanishi, T. et al. Overexpression of Connective Tissue Growth Factor/Hypertrophic Chondrocyte-Specific Gene Product 24 Decreases Bone Density in Adult Mice and Induces Dwarfism. Biochem. Biophys. Res. Commun. 281, 678–681 (2001).

  20. Canalis, E., Zanotti, S. & Smerdel-Ramoya, A. Connective tissue growth factor is a target of notch signaling in cells of the osteoblastic lineage. Bone 64, 273–280 (2014).

  21. Mundy, C., Gannon, M. & Popoff, S. N. Connective tissue growth factor (CTGF/CCN2) negatively regulates BMP-2 induced osteoblast differentiation and signaling. J. Cell. Physiol. 229, 672–681 (2014).

  22. Arnott, J. A. et al. The role of connective tissue growth factor (CTGF/CCN2) in skeletogenesis. Crit. Rev. Eukaryot. Gene Expr. 21, 43–69 (2011).

  23. Fierst, J. L. Using linkage maps to correct and scaffold de novo genome assemblies: methods, challenges, and computational tools. Front. Genet. 6, 220 (2015).

  24. Cox, M. P., Peterson, D. A. & Biggs, P. J. SolexaQA: At-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinformatics 11, 485 (2010).

  25. Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).

  26. Boetzer, M., Henkel, C. V., Jansen, H. J., Butler, D. & Pirovano, W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 27, 578–579 (2011).

  27. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  28. van Ooijen, J. W. Multipoint maximum likelihood mapping in a full-sib family of an outbreeding species. Genet. Res. 93, 343–349 (2011).

  29. Amores, A., Catchen, J., Ferrara, A. & Fontenot, Q. Genome evolution and meiotic maps by massively parallel DNA sequencing: spotted gar, an outgroup for the teleost genome duplication. Genetics 188, 799–808 (2011).

  30. Liu, D. et al. Construction and analysis of high-density linkage map using high-throughput sequencing data. PLoS One 9, e98855 (2014).

  31. Kosambi, D. D. The estimation of map distances from recombination values. Ann. Eugenic. 12, 172–175 (1943).

  32. Cunningham, F. et al. Ensembl 2015. Nucleic Acids Res. 43, D662–D669 (2015).

  33. Mount, D. W. Using the Basic Local Alignment Search Tool (BLAST). Cold Spring Harb. Protoc. 2007, pdb.top17 (2007).

  34. Edwards, J. H. The Oxford Grid. Ann. Hum. Genet. 55, 17–31 (1991).

  35. Wang, Y. et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 40, e49 (2012).

  36. Kasahara, M. et al. The medaka draft genome and insights into vertebrate genome evolution. Nature 447, 714–719 (2007).

  37. Knott, S. A., Elsen, J. M. & Haley, C. S. Methods for multiple-marker mapping of quantitative trait loci in half-sib populations. Theor. Appl. Genet. 93, 71–80 (1996).

  38. Churchill, G. A. & Doerge, R. W. Empirical threshold values for quantitative trait mapping. Genetics 138, 963–971 (1994).

  39. Visscher, P. M., Thompson, R. & Haley, C. S. Confidence intervals in QTL mapping by bootstrapping. Genetics 143, 1013–1020 (1996).

  40. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  41. Duggal, P., Gillanders, E. M., Holmes, T. N. & Bailey-Wilson, J. E. Establishing an adjusted p-value threshold to control the family-wide type 1 error in genome wide association studies. BMC Genomics 9, 516–518 (2008).

  42. Geng, X. et al. A genome-wide association study in catfish reveals the presence of functional hubs of related genes within QTLs for columnaris disease resistance. BMC Genomics 16, 196 (2015).

  43. Conesa, A. et al. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21, 3674–3676 (2005).

  44. Ye, J. et al. WEGO: a web tool for plotting GO annotations. Nucleic Acids Res. 34, W293–W297 (2006).



This study was supported by the Special Scientific Research Funds for Central Non-profit Institutes Special Funds, Chinese Academy of Fishery Sciences (2015B03XK01), National Natural Science Foundation of China (31402353) and China Ministry of Science and Technology 863 Hi-Tech Research and Development Program (2011AA100402). The authors would like to express their sincere appreciation to the Deanship of Scientific Research at King Saud University for its funding of this research through the Research Group Project No. RG-1435-012. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information




Y.-Y.K., J.-T.L., X.-W.S. and Y.Z. conceived the studies. J.-T.L., Y.-Y.K. and X.-H.Z. wrote the manuscript. Y.Z., X.-W.S., W.X., S.M. and K.A.-G. revised the manuscript. Y.-Y.K., X.-H.Z., D.-C.C., W.X. and G.-X.T. constructed the mapping family. W.X. contributed the goldfish variety. Y.-Y.K., X.-H.Z., C.-Y.L., D.-C.C., G.-X.T., W.-H.L., Z.-P.S. and X.-F.Z. collected tissue samples and extracted RNA. Y.-Y.K., J.-T.L., C.-Y.L. and X.-M.L. performed the RNA-Seq analysis. Y.-Y.K., J.-T.L., X.-H.Z., X.-F.Z., C.-Y.L., G.-X.T. and W.-H.L. constructed the genetic map and performed the QTL and association studies. J.-T.L., C.-Y.L. and X.-M.L. carried out the comparative genomic studies. X.-H.Z., W.-H.L. and Y.-Y.K. carried out the genome size estimation.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit
