Development of a 690 K SNP array in catfish and its application for genetic mapping and validation of the reference genome sequence

Zeng, Qifan; Fu, Qiang; Li, Yun; Waldbieser, Geoff; Bosworth, Brian; Liu, Shikai; Yang, Yujia; Bao, Lisui; Yuan, Zihao; Li, Ning; Liu, Zhanjiang

doi:10.1038/srep40347


Article
Open access
Published: 12 January 2017


Scientific Reports volume 7, Article number: 40347 (2017)


Abstract

Single nucleotide polymorphisms (SNPs) are capable of providing the highest level of genome coverage for genomic and genetic analysis because of their abundance and relatively even distribution in the genome. Such a capacity, however, cannot be achieved without an efficient genotyping platform such as SNP arrays. In this work, we developed a high-density SNP array with 690,662 unique SNPs (herein the 690 K array) that were relatively evenly distributed across the entire genome and covered 98.6% of the reference genome sequence. Here we also report linkage mapping using the 690 K array, which allowed mapping of over 250,000 SNPs on the linkage map, the highest marker density among all the constructed linkage maps. These markers were mapped to 29 linkage groups (LGs) with 30,591 unique marker positions. This linkage map anchored 1,602 scaffolds of the reference genome sequence to LGs, accounting for over 97% of the total genome assembly. A total of 1,007 previously unmapped scaffolds were placed on LGs, allowing validation and, in a few instances, correction of the reference genome sequence assembly. This linkage map should serve as a valuable resource for various genetic and genomic analyses, especially for GWAS and QTL mapping for genes associated with economically important traits.


Introduction

Catfish is the primary aquaculture species in the US, accounting for approximately 60% of US aquaculture production. Channel catfish (Ictalurus punctatus) and blue catfish (I. furcatus) are the two most important species in the catfish farming industry because of their tolerance of a wide range of environmental conditions. They have also been used as models for comparative immunology and environmental toxicology¹. In the past decades, various genetic and genomic resources have been produced to facilitate catfish genetic improvement and breeding programs, including molecular markers^2,3,4,5, BAC-based physical maps^6,7,8, expressed sequence tags (ESTs)^9,10,11,12,13,14,15, transcriptome sequences generated by RNA-Seq^16,17,18,19,20,21,22,23, genetic linkage maps^24,25,26,27,28,29, and whole genome reference sequences¹.

However, rapid progress in catfish genetics research has been hindered by the lack of effective and efficient genotyping platforms. Recently, we developed the catfish 250 K SNP array using Affymetrix Axiom technology³⁰. This array has been of great use for various genetic analyses, including genetic linkage mapping^26,28 and analysis of performance traits using GWAS^31,32,33. However, the 250 K SNP array was designed without a reference genome. As a result, it was limited in the evenness of marker spacing and in the completeness of genome coverage. The recent catfish reference genome assembly¹ provided a platform for the construction of a high-density SNP array with improved genome coverage. In addition, independent genetic analyses using linkage mapping should allow greater levels of integration of the reference genome sequence with linkage maps, validation of the reference genome sequence, and correction of potential mistakes in the reference genome sequence.

Linkage maps are important for genetic analysis because they serve as a chromosomal-level framework to trace inheritance of various traits. However, for several decades, linkage maps were limited to relatively low marker densities because the available marker systems were mostly microsatellites and AFLPs, for which highly efficient automated genotyping systems were not available. Only recently have linkage maps carrying thousands of SNP markers been constructed in several aquaculture species, including Atlantic salmon (Salmo salar)^34,35, rainbow trout (Oncorhynchus mykiss)³⁶, Japanese flounder (Paralichthys olivaceus)^37,38, Asian seabass (Lates calcarifer)^39,40, sea bream (Sparus auratus)⁴¹, bighead carp (Hypophthalmichthys nobilis)⁴², yellowtail (Seriola quinqueradiata)⁴³, Pacific oyster (Crassostrea gigas)^44,45, tiger shrimp (Penaeus monodon)⁴⁶, kuruma prawn (Marsupenaeus japonicus)⁴⁷, Pacific white shrimp (Litopenaeus vannamei)^48,49, silver-lipped pearl oyster (Pinctada maxima)⁵⁰, common carp (Cyprinus carpio)⁵¹, gudgeons of the genus Gnathopogon⁵², European sea bass (Dicentrarchus labrax)⁵³, sea urchins (Strongylocentrotus nudus and S. intermedius)⁵⁴, sea cucumber (Apostichopus japonicus)⁵⁵, turbot (Scophthalmus maximus)⁵⁶, large yellow croaker (Larimichthys crocea)^57,58, Atlantic halibut (Hippoglossus hippoglossus)⁵⁹, zhikong scallop (Chlamys farreri)⁶⁰, chinook salmon (Oncorhynchus tshawytscha)⁶¹, triangle sail mussel (Hyriopsis cumingii)⁶², coho salmon (Oncorhynchus kisutch)⁶³, Japanese eel (Anguilla japonica)⁶⁴, and pearl oyster (Pinctada fucata)⁶⁵. Similarly, for many years, catfish genetic maps were limited to relatively low densities^24,25,27,66. It was only after the application of automated SNP array genotyping systems that high-density linkage maps were produced^26,28.
Although the latest channel catfish linkage map carried over 50,000 SNP markers, a large number of scaffolds of the reference genome sequence were still not anchored to chromosomes. In addition, about half of the mapped markers were completely linked because of the relatively low resolution, hindering the validation of the reference genome sequence assembly²⁸. A linkage map with a very high marker density and high resolution is still needed to provide a high level of integration of the linkage map with the reference genome sequence. In this work, we developed a high-density SNP array with 690,662 unique markers, and used this array to construct a catfish genetic linkage map with the highest marker density among any of the constructed linkage maps. The use of four resource families with a total of 465 individuals allowed a greater resolution, with 30,591 unique mapped marker positions. Comparative analysis of marker orders on the genetic map and on the reference genome sequence allowed validation and correction of the reference genome sequence.

Results

SNPs identification and selection

SNPs were identified from a total of over 6.4 billion sequencing reads from 1,213 catfish (Table 1). Initially, over 9.6 million putative SNPs were identified, including 8.6 million SNPs from channel catfish and 3.8 million SNPs from blue catfish. To select high-quality SNPs, the 70 bp flanking sequences (35 bp upstream and 35 bp downstream of the SNP site) of all putative SNPs were mapped onto the catfish reference genome sequence, and SNPs whose flanking sequences matched more than one locus were removed, resulting in approximately 5.7 million uniquely mapped SNPs from channel catfish and 3.1 million uniquely mapped SNPs from blue catfish. A total of ~1.8 million SNPs from channel catfish and ~300,000 SNPs from blue catfish were retained after additional screening steps, including elimination of SNPs surrounded by simple sequence repeats, removal of tri-allelic SNPs, and removal of markers with low probability of conversion based on Affymetrix algorithms (Table 2).

Table 1 Sample and data size for development of 690 K SNP array.


Table 2 Summary of SNPs identified in channel and blue catfish.


SNPs included on the 690 K array

The final SNPs included on the 690 K SNP array are summarized in Table 3. A total of 690,662 SNPs were selected, including 238,484 genic SNPs and 452,178 intergenic SNPs. The 238,484 genic SNPs were from 24,186 genes annotated from the channel catfish reference genome sequence or from the transcripts assembled from RNA-Seq datasets. For most SNPs with unique flanking sequences, only one probe was designed on the array. However, two probes were designed on the array for 2,905 SNPs whose flanking sequences were less unique, leading to a total of 693,567 SNP probes on the 690 K SNP array.

Table 3 Summary of the catfish 690 K SNP array.


The 690 K catfish SNP array also included species-specific and strain-specific SNPs (Table 3), which should be useful for genetic analysis of the interspecific hybrid system and intraspecific crossbreeding. Of the total SNPs on the array, 581,002 were specifically identified from channel catfish, 44,694 were exclusively observed in blue catfish, 19,124 were interspecific SNPs that were homozygous within but differed between species, and 45,842 were heterozygous within and between species.

For strain identification, 48,434 SNPs from four catfish aquaculture strains that originate from different geographic locations were also included on the array. These four strains possess different production traits such as growth rates, disease resistance, and adaptation to environmental stresses. A total of 6,622 SNPs were specifically from a wild channel catfish population of the Coosa River, Alabama, which may be useful for genetic analysis of genomic regions with selective signatures. In addition to SNP probes, 2,000 data quality control (QC) probes were also included on the SNP array serving as negative controls.

The SNPs included on the 690 K SNP array were of high quality, with an average p-convert value of 0.71; 97% of SNPs had a p-convert value greater than 0.65 (Fig. 1A). In addition, 89.6% of SNPs had a MAF greater than 0.1 (Fig. 1B). This distribution of MAF should provide the SNPs with a relatively high level of polymorphic content, a desired characteristic for linkage mapping, GWAS, and other genetic analyses.

The spacing among SNPs was evaluated using their physical positions on the reference genome or on reference sequence contigs that were not mapped to chromosomes. As shown in Fig. 1C, over 86% of SNPs had an inter-SNP spacing of less than 2,000 bp.

Distribution of SNPs across the genome

One of the most important goals of the SNP array development is to have a good coverage of the entire genome with relatively even distribution of SNPs. The SNP locations on the reference genome sequence were used as the coordinates to assess their overall distribution. A total of 659,912 (95.5% of all the SNPs on the array) SNPs were developed from the reference genome sequence, which span a total of 778 Mb, approximately 99.4% of the assembled reference genome sequence. As shown in Fig. 2, almost all of the genomic regions were covered by SNPs, except a few highly repetitive regions from which convertible SNP probes could not be designed. The remaining 30,750 (4.5%) SNPs were developed from other genomic resources, including 15,108 SNPs from unmapped scaffolds and contigs, 521 SNPs from transcript sequences via de novo transcriptome assembly, and 15,121 SNPs from bacterial artificial chromosome (BAC) end sequences (BES). Taken together, all the SNPs on the array covered 98.6% of the sequences in the reference genome scaffolds and 93.9% of the BAC based physical map contigs.

**Figure 2: Genome distribution of SNPs on the 690 K SNP array.**

Performance of the catfish 690 K SNP array

Performance of the SNP array was examined by genotyping catfish DNA samples from hybrid backcross families and channel catfish domesticated families. As summarized in Table 4, 473 of 480 catfish samples (98.5%) were successfully genotyped after sample quality control. In backcross hybrid samples, a total of 597,323 (86.1%) SNPs were successfully genotyped, and 504,265 (72.7%) were polymorphic in these 84 individuals. The average call rate of samples that passed dish quality control (DQC) was greater than 99.2%. In channel catfish samples, a total of 578,868 (83.5%) SNPs were converted, of which 467,821 (67.5%) were polymorphic among the 396 tested fish of the Delta Select strain.
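The conversion and polymorphism rates above can be reproduced with a short calculation. The sketch below assumes that the denominator is the total number of SNP probes on the array (693,567) rather than the number of unique SNPs (690,662); this is an inference from the reported percentages, not something the text states explicitly.

```python
# Counts reported in the text. The denominator is an assumption:
# 690,662 unique SNPs plus 2,905 SNPs that carry a second probe.
TOTAL_PROBES = 693_567

reported = {
    "backcross converted": 597_323,
    "backcross polymorphic": 504_265,
    "channel converted": 578_868,
    "channel polymorphic": 467_821,
}


def percent(count: int, total: int = TOTAL_PROBES) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


rates = {name: percent(count) for name, count in reported.items()}

for name, rate in rates.items():
    print(f"{name}: {rate}%")
```

With this denominator the four rates round to the percentages quoted in the paragraph above.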

Table 4 SNP metrics summary.


Construction of channel catfish linkage map

A total of 478 channel catfish samples from four mapping families were used for linkage mapping. After applying the criteria of DQC score greater than 0.82 and call rate greater than 97%, five samples of poor quality were eliminated. Genotyping data of the remaining 473 samples were imported into Plink for a pedigree information test (Fig. 3). The vast majority of samples fell into three clusters, with two families clustering together because they were generated from one sire. Eight individual outliers were identified and excluded from linkage analysis. The genotyping data of the remaining 465 samples were imported into Lep-map2 for SNP filtering prior to linkage group assignment. According to the assessment of genotyping quality and polymorphism in all samples from the four reference families, a total of 287,583 SNPs were informative in at least two families.

**Figure 3: Sample structure identified by multidimensional scaling analysis of IBS distances.**

A total of 287,370 qualified SNPs were successfully assigned to 29 linkage groups, in concordance with the number of chromosomes in the catfish haploid genome. A two-round marker ordering procedure was carried out with the four families simultaneously. The first round of marker ordering identified 116,864 representative markers. Using a hidden Markov model (HMM), the OrderMarkers module modeled recombinant haplotypes and identified duplicated markers. After filtering these duplicated markers, a second round of marker ordering was performed to improve the order of representative markers. Finally, the previously excluded duplicate and stacked markers were inserted back into the marker order to calculate the genetic distance. A total of 253,087 markers were placed onto the linkage map.

The sex-average genetic distances were calculated by taking into account the recombination probabilities in both sexes. As summarized in Table 5, the sex-average map consists of 253,087 markers at 30,591 unique positions, with a total genetic length of 3,004.7 cM. The marker intervals estimated based on the unique marker positions ranged from 0.08 cM/marker pair in LG12 and LG13 to 0.13 cM/marker pair in LG22, with an average marker interval of 0.1 cM/marker pair in the sex-average genetic map. As illustrated in Fig. 4, there were no abnormally large gaps on the genetic map. The detailed information on marker position is provided in Supplemental Table S1.

Table 5 Summary of the sex-average linkage map of channel catfish.


The sex-specific genetic distances were calculated by taking into account the recombination rates in only one sex. The female genetic map consisted of 23,610 unique positions, with a total genetic length of 3,582.3 cM (Table 6, Supplemental Fig. 1). The marker intervals estimated based on the unique marker positions ranged from 0.13 cM/marker pair in LG5, LG13, LG19 and LG21 to 0.18 cM/marker pair in LG22 and LG29, with an average marker interval of 0.15 cM/marker pair in the female genetic map. The male genetic map consisted of 18,339 unique markers, with a total genetic length of 2,545.6 cM (Table 6, Supplemental Fig. 2). The marker intervals estimated based on the unique marker positions ranged from 0.11 cM/marker pair in LG12 to 0.17 cM/marker pair in LG22, with an average marker interval of 0.14 cM/marker pair in the male genetic map.

Table 6 Summary of the sex-specific linkage map of channel catfish.


The difference between female and male linkage maps was assessed using a contingency G-test⁶⁷. Significantly higher recombination rates were observed in the female genetic map than in the male genetic map for the majority of the linkage groups (p < 0.01). The female genetic map was 1,036.7 cM longer than the male genetic map, with an average female-to-male ratio of 1.4:1. The ratio varied by linkage group, ranging from 0.96 in LG6 to 2.06 in LG18 (Table 6). Large differences in recombination rate were also observed in LG2 and LG23, with female-to-male ratios greater than 1.6.
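The summary statistics for the three maps follow from the reported lengths and numbers of unique positions. The sketch below is illustrative only: it assumes that the average interval is the total genetic length divided by the number of unique positions, which reproduces the reported values but is not stated as the authors' exact formula.

```python
# Genome-wide totals reported in the text (lengths in cM).
maps = {
    "sex-average": {"length_cm": 3004.7, "unique_positions": 30_591},
    "female": {"length_cm": 3582.3, "unique_positions": 23_610},
    "male": {"length_cm": 2545.6, "unique_positions": 18_339},
}


def mean_interval(length_cm: float, unique_positions: int) -> float:
    """Average spacing in cM per unique position (assumed definition)."""
    return length_cm / unique_positions


intervals = {name: round(mean_interval(**m), 2) for name, m in maps.items()}

# Difference in total length and female-to-male ratio of map lengths.
length_diff = round(maps["female"]["length_cm"] - maps["male"]["length_cm"], 1)
female_to_male = round(maps["female"]["length_cm"] / maps["male"]["length_cm"], 1)

print(intervals, length_diff, female_to_male)
```

The per-linkage-group ratios in Table 6 would be obtained the same way from the per-group lengths, which are not reproduced here.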

Integration and validation with reference genome sequence

The genetic map anchored 1,602 scaffolds of the reference genome sequence, corresponding to 766 Mb (97.8%) of the total 783 Mb. Compared with the results obtained using the catfish 250 K SNP array, 1,007 previously unmapped scaffolds were anchored to chromosomes. However, these scaffolds are small, so only an additional 5.7 Mb of the reference genome sequence was anchored to chromosomes. Additionally, 15 transcripts generated from de novo transcriptome assembly and 244 previously unmapped BAC-based physical contigs were placed onto the linkage map. The positions of previously unmapped markers are illustrated in Fig. 5; their placement makes the whole genome reference sequence more ordered, continuous and connected.

**Figure 5: Concordance of SNP marker positions on reference sequence with those on genetic linkage map.**

Mapping of over 250,000 SNPs across the genome validated the accuracy of the reference genome sequence assembly. As shown in Fig. 5, the marker orders from the linkage map and the reference genome are in concordance. A few differences in marker order between the linkage map and the reference genome sequence were observed in scf00172 (LG8), scf00249 (LG11), scf00274 (LG12), scf00439 and scf00436 (LG22) (Fig. 6), suggesting that these regions may be mis-assembled.

Discussion

In this study, we developed the catfish 690 K SNP array using Affymetrix Axiom technology. Compared to the catfish 250 K SNP array³⁰, the 690 K array has significantly more markers, a more even genomic distribution, higher SNP quality based on p-convert values, a greater level of polymorphic content, and a greater level of coverage of the whole genome. The final set of SNPs on the array consisted of channel catfish-specific, blue catfish-specific, and inter-species SNPs. A subset of strain-specific markers from domestic and wild channel catfish populations was also included on the array, which should be useful for detecting genomic regions with selective signatures. The 690 K SNP array covered 24,186 of the 27,618 predicted genes in catfish, including 23,876 genes identified from the reference genome sequence and 310 genes predicted from RNA-seq datasets. The remaining genes were not covered because most of them have more than one copy in the genome; with duplicated genes, it is difficult to design reliable SNP probes. The selected SNPs were relatively evenly distributed on the reference genome sequence, which should benefit the detection of linkage disequilibrium in future genome-wide association studies and fine-scale QTL mapping. The distribution of MAF indicated that most of the SNPs included on the array were common variants (89.6% of SNPs had a MAF >0.1).

The evaluation of SNP array performance by genotyping samples from catfish backcross hybrid families and channel catfish families indicated that 72.7% of the SNPs were polymorphic in backcross hybrid catfish and 67.5% were polymorphic in channel catfish. Although more channel catfish samples were processed, higher percentages of SNPs were converted and polymorphic in backcross hybrid samples. This may be caused by the interspecific and blue catfish-specific SNPs included on the array. Alleles from blue catfish could be detected as they segregated in the hybrid backcross genomes.

Taking advantage of the catfish 690 K SNP array, we constructed a genetic linkage map with a very high density of markers for channel catfish. Not only was the total number of mapped markers increased from 50,000 to 250,000, but the resolution of the map was also increased by using four reference families with a total of 465 individuals. However, because of the large number of markers and the limited recombination along the chromosomes, a large number of stacked markers were still observed. The only way to increase resolution is to increase the size of the reference population (the number of individuals and the number of reference families). While that can be done, the primary limitation is the cost. Because of genotyping errors and missing values, genetic mapping with these closely linked markers greatly increases the computational burden and usually leads to overestimated genetic sizes. In our study, the total genetic distance and error estimate were dramatically reduced when we filtered the stacked SNPs. Genotyping errors in duplicate markers could also distort the marker order, which can cause more severe problems than the inflation of genetic distances. To reduce the effect of clustered markers on map construction and to reduce the computing burden, we selected one representative marker to anchor each cluster. After the marker order was settled, the duplicated markers were rejoined to the linkage group. This procedure rescued the informative markers onto the linkage maps based on the positions of their representative anchor markers. Such clustered markers were still valuable for assigning scaffolds to chromosomes using the reference genome sequences.

By integrating the genetic map with the reference genome sequence, the patterns of recombination across whole chromosomes could be determined. As shown in Fig. 5, mild to strong localized recombination patterns were observed in each linkage group. Most often, the recombination rates were elevated towards the ends and decreased in the middle of the chromosome. Stacked markers located in regions of strong linkage disequilibrium were observed in each linkage group, especially in regions close to the centromeres. The observed overall recombination patterns were in concordance with previous research in catfish^26,28 and other aquaculture species such as tilapia⁶⁸, medaka⁶⁹, Atlantic salmon⁷⁰, and rainbow trout⁷¹. Although the ultimate mechanism is still elusive, increasing evidence supports the hypothesis that multiple functional sequence motifs are involved in recombination regulation⁷². With the availability of the channel catfish reference genome, the next step of our study is to identify the sequence features within the recombination “hot zones”.

A higher recombination rate was observed in females than in males, which is consistent with previous studies in channel catfish. This phenomenon has also been observed in species with dissimilar sex chromosomes, with a higher recombination rate in the homogametic sex than in the heterogametic sex. Examples include mice⁷³, zebrafish⁷⁴, Atlantic salmon³⁵, rainbow trout⁷¹, European seabass⁷⁵, silver carp⁷⁶, and grass carp⁷⁷. One hypothesis for sexual dimorphism in recombination fraction is that sex chromosomes in the homogametic sex are equal in size and recombination is therefore more likely to occur. However, this does not seem to be true for channel catfish because the X and Y chromosomes are cytologically indistinguishable⁷⁸. Interestingly, our results showed that the recombination rate of LG4, which corresponds to the sex chromosome, is similar to that of other linkage groups, perhaps with the exception of the sex determination region. This suggests that other factors such as male-specific selection⁷⁹ or chromatin differences may account for this difference⁸⁰.

The high-density linkage map allowed validation and correction of the reference genome. In LG8, a 1.2 Mb stretch (from 9,958,308 bp to 11,156,616 bp) of scf00172 (NCBI accession KV452904.1) appears to belong, in reverse orientation, between scf00174 and scf00637. In LG11, a 2.5 Mb subsequence (from 5,088,016 bp to 7,539,243 bp) of scf00249 (NCBI accession KV452863.1) appears to have been assembled in the opposite orientation. In LG12, a 1.8 Mb subsequence (from 3,480,035 bp to 5,255,031 bp) of scf00274 (NCBI accession KV453011.1) likewise appears to have been assembled in the opposite orientation. The whole sequence of scf00439 (NCBI accession KV453114.1) also appears to be reversed in LG22, and scf00436 (NCBI accession KV453115.1) appears to belong between scf00437 (NCBI accession KV453116.1) and scf00435 (NCBI accession KV453121.1). By integrating the linkage map with the reference genome assembly, over 5.6 Mb from 1,007 previously unmapped scaffolds were successfully anchored to their corresponding positions. Additionally, 15 transcripts generated from de novo transcriptome assembly and 244 previously unmapped BAC-based physical contigs were also placed onto the linkage map. This is a significant improvement in the integration of the linkage map and the reference genome sequence, which is useful for further genomic studies, QTL analysis, and whole genome-based selection.

Materials and Methods

Ethics statement

This study was approved by the Institutional Animal Care and Use Committee (IACUC) at Auburn University. All experiments involving the handling and treatment of fish were carried out in accordance with approved guidelines.

SNP identification and SNP array development

In order to identify SNPs from the whole genome, Illumina sequencing data from various studies involving fish with diverse genetic background were collected. For channel catfish, a total of 3.3 billion reads from RNA-seq of 824 fish and 2.4 billion reads from whole genome sequencing of 150 fish were collected for SNP identification, with an average genome coverage of 500X (Table 1). For blue catfish, data used for SNP identification included over 359 million reads of GBS data from 190 individuals and 478 million reads of RNA-seq data from 49 individuals.

To reduce sequencing artifacts and improve SNP quality, raw sequencing reads from all the studies were first subjected to quality control with Trimmomatic (version 0.33). Adaptor sequences, ambiguous nucleotides (N’s), and extremely short reads (<25 bp) were removed. Low-quality bases were identified and trimmed with a sliding window method: using a window size of 4 bases, reads were cut once the average quality within the window fell below 15. For reads generated from GBS, chimeric sequences were eliminated by trimming the sequence at the corresponding restriction enzyme site. The clean reads generated by WGS and GBS were then aligned to the reference genome sequence with BWA-MEM (version 0.7.12). To achieve high sensitivity and accuracy for RNA read alignment, a 2-pass alignment method was performed using the STAR aligner (version 2.4.0j). The results of the alignment were exported in BAM format for subsequent analysis⁸¹.
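The sliding window rule can be illustrated with a simplified sketch. This is not Trimmomatic's implementation; it only mirrors the parameters stated above (window of 4 bases, mean quality threshold of 15, minimum read length of 25 bp), and the function name and the exact cut position are assumptions made for illustration.

```python
from typing import List, Optional

WINDOW = 4             # window size in bases, from the text
MIN_MEAN_QUALITY = 15  # mean Phred quality threshold, from the text
MIN_LENGTH = 25        # reads shorter than this are discarded, from the text


def sliding_window_trim(seq: str, quals: List[int]) -> Optional[str]:
    """Cut the read at the first window whose mean quality is too low.

    Returns the trimmed sequence, or None if it is shorter than MIN_LENGTH.
    Simplification: the read is cut at the start of the failing window.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must have equal length")
    keep = len(seq)
    for start in range(len(seq) - WINDOW + 1):
        window = quals[start:start + WINDOW]
        if sum(window) / WINDOW < MIN_MEAN_QUALITY:
            keep = start
            break
    trimmed = seq[:keep]
    return trimmed if len(trimmed) >= MIN_LENGTH else None


# A 40-base read whose last 10 bases have very low quality.
read = "A" * 40
qualities = [30] * 30 + [2] * 10
print(sliding_window_trim(read, qualities))
```

In this toy read, the first failing window starts at index 29, so 29 bases are kept and the read passes the length filter.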

Prior to SNP identification, the BAM files were processed with Picard tools (version 1.119) to identify and remove duplicate reads. BAM files of WGS reads were subjected to local realignment of regions near INDELs with GATK (version 3.3) to improve the accuracy of variant calling. For BAM files of RNA-seq reads, the “SplitNCigarReads” command of GATK was used to split reads that spanned multiple exons and to trim overhangs. Ambiguously mapped reads were removed using SAMtools (version 0.1.19) with a minimum MAPQ score of 20. For paired-end reads, only those mapped in a proper pair were kept for further analysis. Subsequently, all alignment files were integrated for variant calling using VarScan (version 2.3.7). The putative SNPs were identified with the thresholds of minor allele frequency greater than 0.05, minimum read base quality of 20, strand filter of 90%, and minimum read depth of 10. Sequences of 71 bp spanning each SNP were extracted, comprising the SNP base with 35 bp upstream and 35 bp downstream. To avoid false positive SNPs caused by ambiguous mapping of duplicated regions, the 71-bp fragments were aligned to the reference genome sequence with BLAST+ (version 2.2.29). SNPs with flanking regions that mapped to multiple sites or to low-complexity and repetitive regions of the reference genome sequence were removed.
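Two of the steps above lend themselves to a small illustration: applying the site-level thresholds and extracting the 71-bp window around each SNP. The sketch below is a hypothetical stand-in, not the VarScan or BLAST+ workflow itself; the `Candidate` record and function names are invented for the example, and the read-level filters (base quality, strand filter) are omitted.

```python
from dataclasses import dataclass
from typing import Optional

FLANK = 35       # bases on each side of the SNP, from the text
MIN_MAF = 0.05   # minor allele frequency must exceed this, from the text
MIN_DEPTH = 10   # minimum read depth, from the text


@dataclass
class Candidate:
    """Hypothetical record for one putative SNP."""
    position: int  # 0-based index of the SNP base on its scaffold
    maf: float
    depth: int


def passes_thresholds(c: Candidate) -> bool:
    """Apply the site-level MAF and depth thresholds."""
    return c.maf > MIN_MAF and c.depth >= MIN_DEPTH


def flanking_window(scaffold: str, position: int) -> Optional[str]:
    """Return the 71-bp window centred on the SNP, or None near scaffold ends."""
    start = position - FLANK
    end = position + FLANK + 1
    if start < 0 or end > len(scaffold):
        return None
    return scaffold[start:end]


scaffold = "ACGT" * 25  # toy 100-bp scaffold
window = flanking_window(scaffold, 50)
print(len(window), passes_thresholds(Candidate(50, 0.06, 10)))
```

Whether SNPs too close to a scaffold end were dropped or handled differently is not stated in the text; returning `None` is simply a choice made for this sketch.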

SNP selection was performed in multiple steps using different criteria for different SNP types. All the filtration parameters were set with the aims of achieving evenly spaced coverage across the entire genome and removing false positive sites. All the original SNPs were classified into different groups and selected in a fixed order: SNPs within genes were selected first, then SNPs from intergenic regions were added, and finally species-specific and strain-specific SNPs were included in the pool of candidate SNPs. Custom scripts were used to perform the selection according to the following criteria: (1) To avoid non-specific hybridization, the flanking sequences of selected SNPs should not contain other SNPs or simple sequence repeats; (2) For practical application in SNP genotyping assays, only bi-allelic SNPs were selected; (3) To achieve a high polymorphic rate, SNPs with a minor allele frequency greater than 0.1 were preferentially selected; (4) To achieve a high array capacity, A/T or C/G SNPs were not selected unless absolutely needed; (5) To minimize the effects of GC content on signal intensity, the GC percentage of the SNP flanking sequences should be between 30% and 70%. For SNPs within genes, once a new SNP was included in the candidate pool, SNPs within its 200 bp flanking regions were not added. For SNPs from intergenic regions, once a new SNP was included in the candidate pool, SNPs within its 350 bp flanking regions were not added. SNPs from unmapped scaffolds and contigs outside the reference genome sequence were also included regardless of their distances.
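The selection logic described above can be sketched as a greedy filter. The authors' custom scripts are not available in the text, so everything below is a hypothetical reconstruction: the `Snp` record, the function names, and the choice to apply the candidate's own spacing rule (200 bp for genic, 350 bp for intergenic) against all previously chosen SNPs are assumptions. Criteria (1) and (3) are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple

GENIC_GAP = 200       # bp, from the text
INTERGENIC_GAP = 350  # bp, from the text


@dataclass
class Snp:
    """Hypothetical candidate SNP record."""
    scaffold: str
    position: int
    alleles: Tuple[str, ...]
    flank: str
    genic: bool


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def passes_basic(snp: Snp) -> bool:
    """Criteria (2), (4) and (5): bi-allelic, not A/T or C/G, GC 30-70%."""
    if len(snp.alleles) != 2:
        return False
    if set(snp.alleles) in ({"A", "T"}, {"C", "G"}):
        return False
    return 0.30 <= gc_fraction(snp.flank) <= 0.70


def select_spaced(snps: List[Snp]) -> List[Snp]:
    """Greedy selection: genic SNPs first, then intergenic, with spacing."""
    chosen: List[Snp] = []
    ordered = sorted(snps, key=lambda s: (not s.genic, s.scaffold, s.position))
    for snp in ordered:
        if not passes_basic(snp):
            continue
        gap = GENIC_GAP if snp.genic else INTERGENIC_GAP
        too_close = any(
            c.scaffold == snp.scaffold and abs(c.position - snp.position) <= gap
            for c in chosen
        )
        if not too_close:
            chosen.append(snp)
    return chosen


flank = "ACGT" * 10  # toy flank with 50% GC
candidates = [
    Snp("scf1", 1000, ("A", "G"), flank, True),
    Snp("scf1", 1150, ("A", "G"), flank, True),    # within 200 bp of 1000
    Snp("scf1", 1300, ("C", "T"), flank, True),
    Snp("scf1", 1500, ("A", "C"), flank, False),   # within 350 bp of 1300
    Snp("scf1", 2000, ("G", "T"), flank, False),
    Snp("scf1", 5000, ("A", "T"), flank, False),   # A/T SNP, skipped
]
selected_positions = [s.position for s in select_spaced(candidates)]
print(selected_positions)
```

The linear scan over `chosen` keeps the sketch short; a genome-scale script would track the last selected position per scaffold instead.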

All catfish SNPs that passed this filter were submitted to Affymetrix Bioinformatics Services for in silico probe conversion testing, in which the expected performance of the SNPs was evaluated. The upstream and downstream probes flanking each SNP were each assigned a p-convert value (0.0 to 1.0). Probes with high p-convert values were more likely to be successfully genotyped. A threshold for the p-convert value was set to remove the lowest performing probes and to facilitate selection of a high-quality SNP list. Markers with at least one probe that passed the p-convert value threshold were retained. For SNP markers with both probes passing the p-convert value threshold, the probe with the greater p-convert value was selected. For SNPs with only probes of low p-convert values, both probes were included to cover the genome region.
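The probe choice rules can be written as a small decision function. This is a sketch under stated assumptions: the threshold value is not given in the text and is passed in as a parameter, and the handling of SNPs with two low-scoring probes is exposed as an optional flag (`keep_low`, a hypothetical name) because the text describes both removing the lowest performing probes and including both probes in such cases.

```python
from typing import List


def choose_probes(
    up: float, down: float, threshold: float, keep_low: bool = False
) -> List[str]:
    """Pick which probe(s) to tile for one SNP from its two p-convert values.

    up, down  : p-convert values of the upstream and downstream probes
    threshold : cut-off for an acceptable probe (value not given in the text)
    keep_low  : if True, keep both probes when neither passes (assumption)
    """
    scored = (("upstream", up), ("downstream", down))
    passing = [(name, p) for name, p in scored if p >= threshold]
    if len(passing) == 2:
        # Both pass: keep the probe with the greater p-convert value.
        return [max(passing, key=lambda item: item[1])[0]]
    if len(passing) == 1:
        return [passing[0][0]]
    return ["upstream", "downstream"] if keep_low else []


print(choose_probes(0.8, 0.7, threshold=0.6))
print(choose_probes(0.3, 0.4, threshold=0.6, keep_low=True))
```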

In addition to the polymorphic SNPs, 2,000 probes generated from non-polymorphic genomic regions were also introduced as QC probes, of which 1,000 probes were selected with A or T at the 31st base and 1,000 probes were selected with G or C at the 31st base. The QC probes along with the final list of SNPs were submitted to Affymetrix for fabrication of the Axiom GW genotyping array.

SNP array performance evaluation

A total of 480 catfish were genotyped to assess the performance of SNP array, including 396 purebred channel catfish of the Delta Select line provided by USDA-ARS Warmwater Aquaculture Research Unit, and 84 hybrids generated by backcrossing channel x blue interspecific hybrid females with male channel catfish.

DNA samples were prepared following the procedures previously described²⁷. DNA samples were diluted to 50 ng/μl and genotyped with the catfish 690 K SNP array (GeneSeek, Lincoln, Nebraska, USA). The signal intensity data of each probe on the array were reported in CEL files, which were analyzed with Axiom Analysis Suite (version 1.1.0.616) for quality control and genotype calling. Samples with a DQC value greater than 0.85 and SNP call rates greater than 95% were retained for subsequent analysis.
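The sample-level filter amounts to two threshold checks per sample. The sketch below uses the thresholds stated in this paragraph as defaults; the Results section quotes slightly different cut-offs (0.82 and 97%), so both values are left as parameters. The sample identifiers and metric values are made up for illustration.

```python
from typing import Dict, List, Tuple


def passes_sample_qc(
    dqc: float,
    call_rate: float,
    min_dqc: float = 0.85,
    min_call_rate: float = 0.95,
) -> bool:
    """Keep a sample only if both DQC and call rate exceed their thresholds."""
    return dqc > min_dqc and call_rate > min_call_rate


def filter_samples(metrics: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the IDs of samples that pass QC, in input order."""
    return [
        sample_id
        for sample_id, (dqc, call_rate) in metrics.items()
        if passes_sample_qc(dqc, call_rate)
    ]


# Hypothetical (DQC, call rate) values for three samples.
example = {
    "fish_001": (0.97, 0.995),
    "fish_002": (0.80, 0.990),  # fails DQC
    "fish_003": (0.93, 0.940),  # fails call rate
}
retained = filter_samples(example)
print(retained)
```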

Following the genotyping step, SNPolisher (Affymetrix) generated quality metrics and classified all SNPs into six categories. Briefly, “PolyHighResolution” indicates SNPs for which both alleles were detected and the signal data of all samples formed three distinct, well-resolved clusters; “NoMinorHom” indicates SNPs with two clusters of signal data and no samples of the minor homozygous genotype; “MonoHighResolution” includes SNPs with only one cluster identified; “OTV” refers to off-target variants, indicating SNPs with an OTV cluster caused by sequence dissimilarity between probes and target genome regions⁸²; “CallRateBelowThreshold” denotes SNPs with call rates below the threshold but other cluster properties above the threshold; and “Other” denotes SNPs with one or more cluster properties below the threshold. In most cases, SNPs classified as “PolyHighResolution”, “NoMinorHom”, or “MonoHighResolution” were considered converted SNPs. SNPs classified as “PolyHighResolution” and “NoMinorHom” were considered polymorphic SNPs.
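The mapping from the six categories to the "converted" and "polymorphic" sets can be written down directly. The sketch below assumes the per-SNP category is available as a plain dictionary; the SNP identifiers and the function name are hypothetical.

```python
from typing import Dict, List, Tuple

# Category groupings as stated in the text.
CONVERTED = {"PolyHighResolution", "NoMinorHom", "MonoHighResolution"}
POLYMORPHIC = {"PolyHighResolution", "NoMinorHom"}


def split_by_category(categories: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (converted SNP ids, polymorphic SNP ids) in input order."""
    converted = [snp for snp, cat in categories.items() if cat in CONVERTED]
    polymorphic = [snp for snp, cat in categories.items() if cat in POLYMORPHIC]
    return converted, polymorphic


# Hypothetical input: SNP id -> SNPolisher category
example = {
    "snp_a": "PolyHighResolution",
    "snp_b": "MonoHighResolution",
    "snp_c": "OTV",
    "snp_d": "NoMinorHom",
}
converted, polymorphic = split_by_category(example)
```

Every polymorphic SNP is also a converted SNP, so the second list is always a subset of the first.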

Linkage map construction

A total of 478 individuals from four full-sib families of the Delta Select strain of channel catfish were used for linkage mapping. SNPs classified as “PolyHighResolution” and “NoMinorHom” were retained for further analysis. The genotyping results were exported in the pre-MAKEPED LINKAGE pedigree format⁸³.

The genotyping data were imported into PLINK (version 1.9) to verify the pedigree information. A complete-linkage agglomerative clustering procedure based on pairwise identity-by-state (IBS) distance was carried out. Multidimensional scaling analysis was performed on the resulting pairwise IBS distance matrix. Putative outliers were identified and their pedigree information was checked by subsequent outlier detection diagnostics, in which a Z score was assigned to measure the distance between each outlier and the rest of the samples. Samples were discarded when their distances were significantly larger than the normal level.
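To make the idea of an IBS-based outlier check concrete, here is a simplified stand-in written in plain Python. It is not PLINK's implementation: it assumes genotypes coded as allele counts (0, 1, 2, or `None` for missing), defines the IBS distance as one minus the mean proportion of shared alleles, and standardizes each sample's mean distance to all other samples. The function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Optional, Sequence

Genotype = Optional[int]  # 0, 1 or 2 copies of one allele; None = missing


def ibs_distance(a: Sequence[Genotype], b: Sequence[Genotype]) -> float:
    """1 - mean proportion of alleles shared IBS over loci typed in both samples."""
    shared = [1 - abs(x - y) / 2
              for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no loci genotyped in both samples")
    return 1 - mean(shared)


def mean_distance_z_scores(genos: List[Sequence[Genotype]]) -> List[float]:
    """Z score of each sample's average IBS distance to every other sample."""
    n = len(genos)
    avg = [mean(ibs_distance(genos[i], genos[j]) for j in range(n) if j != i)
           for i in range(n)]
    mu, sd = mean(avg), pstdev(avg)
    return [(d - mu) / sd if sd > 0 else 0.0 for d in avg]
```

A sample whose Z score is much larger than the others sits far from the rest of its putative family and would be a candidate for a pedigree check.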

The linkage map was constructed using Lep-MAP2 (version 0.2)⁸⁴. First, the Filtering module was executed to filter out low-quality and uninformative markers. Markers with more than 12 missing values (about 10%) or MAF less than 6 (about 5%) in each family were discarded. Only markers informative in two or more families were retained. A segregation distortion test was also performed to compare the offspring genotype distribution with the expected Mendelian proportions. Markers with significant segregation distortion were eliminated from linkage analysis (χ² test, P < 0.005). SNP markers were assigned to linkage groups (LGs) using the SeparateChromosomes module. LGs were formed using a logarithm of the odds (LOD) score limit of 35 and a minimum LG size of 10. Single markers were then added to the established LGs using the JoinSingles module with a LOD score limit of 10 and a minimum difference of 3 between the best LG and the second-best LG of each joined marker.
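The per-family filtering logic can be illustrated with a small, self-contained sketch. It does not reproduce Lep-MAP2's Filtering module. It assumes that the "12" and "6" in the text act as count thresholds within a family, and it handles only two or three genotype classes so that the χ² tail probability has a closed form (1 or 2 degrees of freedom). All function and parameter names are hypothetical.

```python
import math
from typing import Sequence


def chi_square_p(observed: Sequence[int], expected_ratio: Sequence[float]) -> float:
    """P value of a goodness-of-fit chi-square test against a Mendelian ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))  # exact upper tail for 1 df
    if df == 2:
        return math.exp(-stat / 2)             # exact upper tail for 2 df
    raise ValueError("this sketch handles only 2 or 3 genotype classes")


def keep_marker_in_family(n_missing: int, minor_count: int,
                          observed: Sequence[int],
                          expected_ratio: Sequence[float],
                          max_missing: int = 12, min_minor: int = 6,
                          alpha: float = 0.005) -> bool:
    """Apply the missing-data, minor-allele and segregation-distortion filters."""
    if n_missing > max_missing or minor_count < min_minor:
        return False
    return chi_square_p(observed, expected_ratio) >= alpha
```

For a marker heterozygous in both parents the expected offspring ratio is 1:2:1, so `keep_marker_in_family(3, 40, (30, 60, 30), (1, 2, 1))` keeps the marker, whereas offspring counts of `(60, 40, 20)` would be rejected as distorted.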

The marker order within each LG was determined while allowing different recombination probabilities in the two sexes. Genotype data of markers from the four families were analyzed simultaneously. Two rounds of marker ordering were carried out for better performance. In the first round, ten iterations were performed to obtain the best marker order. To reduce the computational burden, a missing rate of 5% was allowed when determining whether two markers were duplicates. Because the number of markers may exceed the resolution afforded by recombination, many markers stacked up at the same locus on the genetic map, which may distort recombination rate estimates. Therefore, the stacked markers as well as the duplicated markers were clustered and filtered. The marker with the most informative meioses in each cluster was selected as the representative marker and retained for a new round of marker ordering. After the second round of ordering, all previously identified duplicate and stacked markers were added to the adjacent locus and included in the genetic distance calculation with the Kosambi mapping function, taking into account both male and female meioses. Sex-specific recombination rates were then calculated with the same marker order. MapChart (version 2.3) was used to graphically present the genetic linkage map.
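For reference, the Kosambi mapping function converts a recombination fraction r into a map distance d = ¼ ln[(1 + 2r)/(1 − 2r)] Morgans. The short sketch below evaluates it in centimorgans together with its inverse; the function names are illustrative, and it is not meant to mirror how Lep-MAP2 computes distances internally.

```python
import math


def kosambi_cm(r: float) -> float:
    """Map distance in centimorgans for a recombination fraction 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    # d (Morgans) = 0.25 * ln((1 + 2r) / (1 - 2r)); multiply by 100 for cM.
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse mapping: recombination fraction for a distance in centimorgans."""
    return 0.5 * math.tanh(2 * d_cm / 100.0)
```

At small r the distance is close to 100·r cM (for example, r = 0.01 gives about 1.0 cM), and it grows faster than r as r approaches 0.5, reflecting the function's allowance for crossover interference.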

Additional Information

How to cite this article: Zeng, Q. et al. Development of a 690 K SNP array in catfish and its application for genetic mapping and validation of the reference genome sequence. Sci. Rep. 7, 40347; doi: 10.1038/srep40347 (2017).

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

Liu, Z. et al. The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts. Nature communications 7, 11757 (2016).
Serapion, J., Kucuktas, H., Feng, J. & Liu, Z. Bioinformatic mining of type I microsatellites from expressed sequence tags of channel catfish (Ictalurus punctatus). Marine biotechnology 6, 364–377 (2004).
Liu, Z., Nichols, A., Li, P. & Dunham, R. Inheritance and usefulness of AFLP markers in channel catfish (Ictalurus punctatus), blue catfish (I. furcatus), and their F1, F2, and backcross hybrids. Molecular and General Genetics MGG 258, 260–268 (1998).
Liu, S. et al. Generation of genome-scale gene-associated SNPs in catfish for the construction of a high-density SNP array. Bmc Genomics 12, 53 (2011).
Sun, L. et al. Identification and analysis of genome-wide SNPs provide insight into signatures of selection and domestication in channel catfish (Ictalurus punctatus). Plos One 9, e109666 (2014).
Quiniou, S. M., Waldbieser, G. C. & Duke, M. V. A first generation BAC-based physical map of the channel catfish genome. Bmc Genomics 8, 40 (2007).
Xu, P. et al. A BAC-based physical map of the channel catfish genome. Genomics 90, 380–388 (2007).
Wang, S. et al. Characterization of a BAC library from channel catfish Ictalurus punctatus: indications of high levels of chromosomal reshuffling among teleost genomes. Marine biotechnology 9, 701–711 (2007).
Karsi, A., Li, P., Dunham, R. & Liu, Z. Transcriptional activities in the pituitaries of channel catfish before and after induced ovulation by injection of carp pituitary extract as revealed by expressed sequence tag analysis. Journal of molecular endocrinology 21, 121–129 (1998).
Ju, Z. L. et al. Transcriptome analysis of channel catfish (Ictalurus punctatus): genes and expression profile from the brain. Gene 261, 373–382 (2000).
Cao, D. et al. Transcriptome of channel catfish (Ictalurus punctatus): initial analysis of genes and expression profiles of the head kidney. Animal genetics 32, 169–188 (2001).
Karsi, A. et al. Transcriptome analysis of channel catfish (Ictalurus punctatus): initial analysis of gene expression and microsatellite-containing cDNAs in the skin. Gene 285, 157–168 (2002).
Kocabas, A. M. et al. Expression profile of the channel catfish spleen: Analysis of genes involved in immune functions. Marine Biotechnology 4, 526–536 (2002).
Li, P. et al. Towards the ictalurid catfish transcriptome: generation and analysis of 31,215 catfish ESTs. Bmc Genomics 8, 177 (2007).
Wang, S. et al. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies. Genome biology 11, 1–14 (2010).
Liu, S. et al. Efficient assembly and annotation of the transcriptome of catfish by RNA-Seq analysis of a doubled haploid homozygote. Bmc Genomics 13, 595 (2012).
Sun, F. Y. et al. Male-Biased Genes in Catfish as Revealed by RNA-Seq Analysis of the Testis Transcriptome. Plos One 8, e68452 (2013).
Li, C. et al. RNA-seq analysis of mucosal immune responses reveals signatures of intestinal barrier disruption and pathogen entry following Edwardsiella ictaluri infection in channel catfish, Ictalurus punctatus. Fish Shellfish Immun 32, 816–827 (2012).
Zeng, Q. et al. Transcriptome Display During Testicular Differentiation of Channel Catfish (Ictalurus punctatus) as Revealed by RNA-Seq Analysis. Biology of reproduction, 95:19, 1–17 (2016).
Liu, S. K. et al. RNA-Seq reveals expression signatures of genes involved in oxygen transport, protein synthesis, folding, and degradation in response to heat stress in catfish. Physiol Genomics 45, 462–476 (2013).
Sun, F. Y. et al. Transcriptomic signatures of attachment, NF-kappa B suppression and IFN stimulation in the catfish gill following columnaris bacterial infection. Dev Comp Immunol 38, 169–180 (2012).
Peatman, E., Lange, M., Zhao, H. & Beck, B. H. Physiology and immunology of mucosal barriers in catfish (Ictalurus spp.). Tissue barriers 3, e1068907 (2015).
Wang, R. J. et al. Bulk segregant RNA-seq reveals expression and positional candidate genes and allele-specific expression for disease resistance against enteric septicemia of catfish. Bmc Genomics 14, 929 (2013).
Liu, Z., Karsi, A., Li, P., Cao, D. & Dunham, R. An AFLP-based genetic linkage map of channel catfish (Ictalurus punctatus) constructed by using an interspecific hybrid resource family. Genetics 165, 687–694 (2003).
Kucuktas, H. et al. Construction of genetic linkage maps and comparative genome analysis of catfish using gene-associated markers. Genetics 181, 1649–1660 (2009).
Liu, S. et al. High-density interspecific genetic linkage mapping provides insights into genomic incompatibility between channel catfish and blue catfish. Animal genetics 47, 81–90 (2016).
Waldbieser, G. C., Bosworth, B. G., Nonneman, D. J. & Wolters, W. R. A microsatellite-based genetic linkage map for channel catfish, Ictalurus punctatus. Genetics 158, 727–734 (2001).
Li, Y. et al. Construction of a high-density, high-resolution genetic map and its integration with BAC-based physical map in channel catfish. DNA Research 22, 39–52 (2015).
Ninwichian, P. et al. Second-generation genetic linkage map of catfish and its integration with the BAC-based physical map. G3: Genes| Genomes| Genetics 2, 1233–1241 (2012).
Liu, S. et al. Development of the catfish 250K SNP array for genome-wide association studies. BMC research notes 7, 135 (2014).
Geng, X. et al. A genome-wide association study in catfish reveals the presence of functional hubs of related genes within QTLs for columnaris disease resistance. Bmc Genomics 16, 196 (2015).
Jin, Y. et al. A genome‐wide association study of heat stress‐associated SNPs in catfish. Anim. Genet. 10.1111/age.12482 (2016).
Geng, X. et al. A Genome Wide Association Study Identifies Multiple Regions Associated with Head Size in Catfish. G3 6, 3389–3398 (2016).
Gonen, S. et al. Linkage maps of the Atlantic salmon (Salmo salar) genome derived from RAD sequencing. Bmc Genomics 15, 166 (2014).
Lien, S. et al. A dense SNP-based linkage map for Atlantic salmon (Salmo salar) reveals extended chromosome homeologies and striking differences in sex-specific recombination patterns. Bmc Genomics 12, 615 (2011).
Guyomard, R., Boussaha, M., Krieg, F., Hervet, C. & Quillet, E. A synthetic rainbow trout linkage map provides new insights into the salmonid whole genome duplication and the conservation of synteny among teleosts. Bmc Genet 13, 15 (2012).
Castano-Sanchez, C. et al. A second generation genetic linkage map of Japanese flounder (Paralichthys olivaceus). Bmc Genomics 11, 554 (2010).
Shao, C. W. et al. Genome-wide SNP identification for the construction of a high-resolution genetic map of Japanese flounder (Paralichthys olivaceus): applications to QTL mapping of Vibrio anguillarum disease resistance and comparative genomic analysis. DNA Research 22, 161–170 (2015).
Wang, C. M. et al. A high-resolution linkage map for comparative genome analysis and QTL fine mapping in Asian seabass, Lates calcarifer. Bmc Genomics 12, 174 (2011).
Wang, L. et al. Construction of a high-density linkage map and fine mapping of QTL for growth in Asian seabass. Scientific reports 5, 16358 (2015).
Tsigenopoulos, C. S. et al. Second generation genetic linkage map for the gilthead sea bream Sparus aurata L. Mar Genom 18, 77–82 (2014).
Fu, B. D., Liu, H. Y., Yu, X. M. & Tong, J. G. A high-density genetic map and growth related QTL mapping in bighead carp (Hypophthalmichthys nobilis). Scientific reports 6, 28679 (2016).
Aoki, J. et al. Second generation physical and linkage maps of yellowtail (Seriola quinqueradiata) and comparison of synteny with four model fish. Bmc Genomics 16, 406 (2015).
Hedgecock, D., Shin, G., Gracey, A. Y., Van Den Berg, D. & Samanta, M. P. Second-Generation Linkage Maps for the Pacific Oyster Crassostrea gigas Reveal Errors in Assembly of Genome Scaffolds. G3-Genes Genom Genet 5, 2007–2019 (2015).
Wang, J. P., Li, L. & Zhang, G. F. A High-Density SNP Genetic Linkage Map and QTL Analysis of Growth-Related Traits in a Hybrid Family of Oysters (Crassostrea gigas x Crassostrea angulata) Using Genotyping-by-Sequencing. G3-Genes Genom Genet 6, 1417–1426 (2016).
Baranski, M. et al. The Development of a High Density Linkage Map for Black Tiger Shrimp (Penaeus monodon) Based on cSNPs. Plos One 9, e85413 (2014).
Lu, X. et al. High-resolution genetic linkage mapping, high-temperature tolerance and growth-related quantitative trait locus (QTL) identification in Marsupenaeus japonicus. Mol Genet Genomics 291, 1391–1405 (2016).
Yu, Y. et al. Genome survey and high-density genetic map construction provide genomic and genetic resources for the Pacific White Shrimp Litopenaeus vannamei. Scientific reports 5 (2015).
Du, Z. Q. et al. A gene-based SNP linkage map for pacific white shrimp, Litopenaeus vannamei. Animal genetics 41, 286–294 (2010).
Jones, D. B., Jerry, D. R., Khatkar, M. S., Raadsma, H. W. & Zenger, K. R. A high-density SNP genetic linkage map for the silver-lipped pearl oyster, Pinctada maxima: a valuable resource for gene localisation and marker-assisted selection. Bmc Genomics 14, 810 (2013).
Peng, W. et al. An ultra-high density linkage map and QTL mapping for sex and growth-related traits of common carp (Cyprinus carpio). Scientific reports 6, 26693 (2016).
Kakioka, R., Kokita, T., Kumada, H., Watanabe, K. & Okuda, N. A RAD-based linkage map and comparative genomics in the gudgeons (genus Gnathopogon, Cyprinidae). Bmc Genomics 14, 32 (2013).
Palaiokostas, C. et al. A new SNP-based vision of the genetics of sex determination in European sea bass (Dicentrarchus labrax). Genetics, selection, evolution: GSE 47, 68 (2015).
Zhou, Z. et al. High-Density Genetic Mapping with Interspecific Hybrids of Two Sea Urchins, Strongylocentrotus nudus and S. intermedius, by RAD Sequencing. Plos One 10, e0138585 (2015).
Tian, M. et al. Construction of a High-Density Genetic Map and Quantitative Trait Locus Mapping in the Sea Cucumber Apostichopus japonicus. Scientific reports 5, 14852 (2015).
Wang, W. et al. High-density genetic linkage mapping in turbot (Scophthalmus maximus L.) based on SNP markers and major sex- and growth-related regions detection. Plos One 10, e0120410 (2015).
Xiao, S. et al. Gene map of large yellow croaker (Larimichthys crocea) provides insights into teleost genome evolution and conserved regions associated with growth. Scientific reports 5, 18661 (2015).
Ao, J. et al. Construction of the High-Density Genetic Linkage Map and Chromosome Map of Large Yellow Croaker (Larimichthys crocea). Int J Mol Sci 16, 26237–26248 (2015).
Palaiokostas, C. et al. Mapping the sex determination locus in the Atlantic halibut (Hippoglossus hippoglossus) using RAD sequencing. Bmc Genomics 14, 566 (2013).
Jiao, W. et al. High-resolution linkage and quantitative trait locus mapping aided by genome survey sequencing: building up an integrative genomic framework for a bivalve mollusc. DNA Res. 21, 85–101 (2014).
Brieuc, M. S., Waters, C. D., Seeb, J. E. & Naish, K. A. A dense linkage map for Chinook salmon (Oncorhynchus tshawytscha) reveals variable chromosomal divergence after an ancestral whole genome duplication event. G3 4, 447–460 (2014).
Bai, Z. Y., Han, X. K., Liu, X. J., Li, Q. Q. & Li, J. L. Construction of a high-density genetic map and QTL mapping for pearl quality-related traits in Hyriopsis cumingii. Scientific reports 6, 32608 (2016).
Kodama, M., Brieuc, M. S., Devlin, R. H., Hard, J. J. & Naish, K. A. Comparative mapping between Coho Salmon (Oncorhynchus kisutch) and three other salmonids suggests a role for chromosomal rearrangements in the retention of duplicated regions following a whole genome duplication event. G3: Genes| Genomes| Genetics 4, 1717–1730 (2014).
Kai, W. et al. A ddRAD-based genetic map and its integration with the genome assembly of Japanese eel (Anguilla japonica) provides insights into genome evolution after the teleost-specific genome duplication. Bmc Genomics 15, 233 (2014).
Li, Y. & He, M. Genetic mapping and QTL analysis of growth-related traits in Pinctada fucata using restriction-site associated DNA sequencing. Plos One 9, e111707 (2014).
Liu, Z., Karsi, A. & Dunham, R. A. Development of polymorphic EST markers suitable for genetic linkage mapping of catfish. Marine Biotechnology 1, 437–447 (1999).
Garciadorado, A. & Gallego, A. On the Use of the Classical Tests for Detecting Linkage. J Hered 83, 143–146 (1992).
Shirak, A. et al. Amh and Dmrta2 genes map to tilapia (Oreochromis spp.) linkage group 23 within quantitative trait locus regions for sex determination. Genetics 174, 1573–1581 (2006).
Naruse, K. et al. A detailed linkage map of medaka, Oryzias latipes: comparative genomics and genome evolution. Genetics 154, 1773–1784 (2000).
Lorenz, S. et al. BAC-based upgrading and physical integration of a genetic SNP map in Atlantic salmon. Animal genetics 41, 48–54 (2010).
Sakamoto, T. et al. A microsatellite linkage map of rainbow trout (Oncorhynchus mykiss) characterized by large sex-specific differences in recombination rates. Genetics 155, 1331–1345 (2000).
Ross, C. R. et al. Genomic correlates of recombination rate and its variability across eight recombination maps in the western honey bee (Apis mellifera L.). Bmc Genomics 16, 107 (2015).
Lynn, A., Schrump, S., Cherry, J., Hassold, T. & Hunt, P. Sex, not genotype, determines recombination levels in mice. The American Journal of Human Genetics 77, 670–675 (2005).
Singer, A. et al. Sex-specific recombination rates in zebrafish (Danio rerio). Genetics 160, 649–657 (2002).
Chistiakov, D. A. et al. A microsatellite linkage map of the European sea bass Dicentrarchus labrax L. Genetics 170, 1821–1826 (2005).
Guo, W. J. et al. A second generation genetic linkage map for silver carp (Hypophthalmichehys molitrix) using microsatellite markers. Aquaculture 412, 97–106 (2013).
Xia, J. H. et al. A consensus linkage map of the grass carp (Ctenopharyngodon idella) based on microsatellites and SNPs. Bmc Genomics 11, 135 (2010).
LeGrande, W. H., Dunham, R. A. & Smitherman, R. Karyology of three species of catfishes (Ictaluridae: Ictalurus) and four hybrid combinations. Copeia 1984, 873–878 (1984).
Mank, J. E. The evolution of heterochiasmy: the role of sexual selection and sperm competition in determining sex-specific recombination rates in eutherian mammals. Genet Res 91, 355–363 (2009).
Makova, K. D. & Hardison, R. C. The effects of chromatin organization on variation in mutation rates in the genome. Nat Rev Genet 16, 213–223 (2015).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Didion, J. P. et al. Discovery of novel variants in genotyping arrays improves genotype retention and reduces ascertainment bias. Bmc Genomics 13, 34 (2012).
Lathrop, G. M., Lalouel, J. M., Julier, C. & Ott, J. Strategies for multilocus linkage analysis in humans. Proc. Natl. Acad. Sci. 81, 3443–3446 (1984).
Rastas, P., Calboli, F. C., Guo, B., Shikano, T. & Merilä, J. Construction of Ultradense Linkage Maps with Lep-MAP2: Stickleback F2 Recombinant Crosses as an Example. Genome biology and evolution 8, 78–93 (2016).

Acknowledgements

We acknowledge grant support from the Animal Genomics, Genetics and Breeding Program of the USDA National Institute of Food and Agriculture (#2015-67015-22907).

Author information

Authors and Affiliations

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture, and Aquatic Sciences, and Program of Cell and Molecular Biosciences, Auburn University, Auburn, 36849, Alabama, United States of America
Qifan Zeng, Qiang Fu, Yun Li, Shikai Liu, Yujia Yang, Lisui Bao, Zihao Yuan, Ning Li & Zhanjiang Liu
USDA-ARS Warmwater Aquaculture Research Unit, Stoneville, 38776, Mississippi, United States of America
Geoff Waldbieser & Brian Bosworth

Contributions

Z.L. conceived the study. Z.L. and Q.Z. wrote the manuscript. Q.Z., Q.F., and Y.L. performed linkage mapping and bioinformatics analysis. G.W. and B.B. conducted family construction. Q.Z., Q.F., S.L., Y.Y., L.B., Z.Y. and N.L. conducted sample collection and DNA extraction. All authors reviewed and approved the final version of the manuscript. The authors declare that there is no conflict of interest.

Corresponding author

Correspondence to Zhanjiang Liu.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information (PDF 7219 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Zeng, Q., Fu, Q., Li, Y. et al. Development of a 690 K SNP array in catfish and its application for genetic mapping and validation of the reference genome sequence. Sci Rep 7, 40347 (2017). https://doi.org/10.1038/srep40347

Received: 06 October 2016
Accepted: 05 December 2016
Published: 12 January 2017
DOI: https://doi.org/10.1038/srep40347
