Abstract
The Pacific saury (Cololabis saira) is a pelagic fish commonly found in the North Pacific Ocean. Its population diversity and migratory lifestyle have long captured global attention. Despite the inherent complexity of the C. saira genome, characterized by extremely high heterozygosity, we successfully assembled a phased chromosome-level genome. The genome analysis revealed the expansion and natural selection of numerous functional genes, likely contributing to its enduring and extensive migratory lifestyle. Notably, gpr35 and igh genes showed significant expansion in the C. saira genome, potentially associated with regulating the immune response against environmental parasites and pathogens. Moreover, genes involved in DNA repair/replication and peroxisome function, including atm, ercc6, pex14, and pex16, displayed evidence of positive selection. Based on genome-sequencing of 80 individuals from eight sampling sites, we demonstrated that the genomic divergence among C. saira populations is relatively low. However, the sampling sites could be grouped into two distinct clusters, roughly corresponding to the migratory route of C. saira. This suggests a possible genome-wide divergence for C. saira within the open ocean region. Furthermore, the trmu gene, responsible for controlling otolith development and sharpness, exhibited differentiation between the two groups, consistent with previously reported differences in otolith morphology. This study has provided a reference genome and insights into the evolution, ecology, and conservation of Pacific saury and closely-related species.
Introduction
The Pacific saury (Cololabis saira) is a small pelagic fish with a wide distribution across the North Pacific Ocean, spanning from Japan’s east coast to the United States’ west coast1,2. This species holds significant economic importance in countries and regions bordering the Northwest Pacific, including Japan, Korea, Russia, Vanuatu, China, and Chinese Taiwan3. Small pelagic fishes are crucial components of marine ecosystems and important links in the food chain4,5. Previous research efforts have focused on various aspects of the Pacific saury, including its life history, migratory patterns, population dynamics, distribution in fisheries, responses to environmental changes, and mitochondrial genes6,7,8,9,10. Despite this, genomic resources for the Pacific saury remain limited. The absence of systematic genome resources has impeded our understanding of the species’ evolutionary history, potential adaptive traits, and the genetic diversity within its populations.
Previous studies have suggested that immune regulation may play a pivotal role in the migratory behaviors of fish11,12. The Pacific saury is renowned for its extensive, seasonal, and wide-range migrations13. During these migrations, the species follows a significant route that takes it from subtropical waters through the complex and ever-changing environment of the Kuroshio-Oyashio transition zones, allowing it to reach the subarctic waters. In autumn, the Pacific saury migrates from the subarctic back to the subtropical waters1. This extended migration exposes the fish to diverse marine viruses and parasites14. Studies have reported that parasites can easily infect Pacific saury during this migration7,15. Moreover, the prolonged movement challenges the species’ antioxidant capacity16. However, the precise mechanisms by which the Pacific saury adapts to its long-distance migration remain unclear. Studying the critical genes associated with migration adaptations in Pacific saury is crucial for understanding the species’ evolutionary process17.
High variability in the biology of Pacific saury has been observed across different geographic regions18,19. For example, Suyama, et al.18 identified two spatially separated groups from the east and west using the first otolith annual ring. Li, et al.19 applied otolith shape analysis to differentiate two otolith morphologies in eastern and western Pacific saury through cluster analysis, suggesting the presence of two distinct geographic groups. By contrast, the genetic diversity of Pacific saury is generally low, and there is no significant differentiation at the population level based on mitochondrial genes9,20. However, the mitochondrial genome has a limited number of genes and is maternally inherited, which gives it insufficient statistical power for studying population differentiation21. Genome-wide SNPs are widely distributed, genetically stable, and highly representative22. They provide more informative loci that can accurately delineate the population genetic structure of Pacific saury23. Nevertheless, population genetic studies of Pacific saury based on whole-genome variations are still in their early stages.
To explore the mechanisms underlying the adaptation of Pacific saury to its migratory lifestyle and to examine its genetic diversity, we performed genome and population analyses of the species. First, we sequenced the genome of Pacific saury with the PacBio HiFi and Hi-C technologies, resulting in a phased and near-complete genome assembly, which allowed us to investigate the species’ phylogenetic placement and identify potential functional genes that contribute to its migratory adaptability. Second, we generated deep whole-genome resequencing data for 80 Pacific saury individuals from eight sites in the North Pacific Ocean. By leveraging whole-genome variations, we explored Pacific saury’s genetic diversity and population structure.
Results
Genome assembly, phasing and annotation
We initiated the genome assessment and de novo assembly using a Pacific saury specimen (Supplementary Table 1). Our K-mer analysis estimated the genome size of Pacific saury to be 1296 Mb, with 3.19% heterozygosity and 60.42% repeat content (Supplementary Table 2 and Supplementary Fig. 1). PacBio sequencing generated approximately 49.4 Gb of Circular Consensus Sequencing (CCS) HiFi reads (Supplementary Table 3), corresponding to a sequencing depth of roughly 38x. Using HiFiasm and the CCS reads, we assembled one draft genome comprising 3833 contigs with a total size of 2152 Mb (Supplementary Tables 4 and 5). The assembly size is roughly twice the K-mer estimate, suggesting the successful assembly of two haploid genomes for Pacific saury. This observation was confirmed by the genome-wide BUSCO analysis, which revealed that 94.95% of BUSCOs were duplicated (Supplementary Tables 5 and 6). When the whole-genome sequencing (WGS) short reads and the HiFi CCS reads were mapped back to our genome assembly, 99.81% of the WGS short reads and 100% of the HiFi CCS reads mapped, covering 99.5% and 99.98% of the total assembly, respectively (Supplementary Tables 7 and 8).
To further enhance the assembly quality and assemble the contigs into chromosome-level structures, we employed 130.79 Gb of Hi-C data to assess contact frequencies among contigs. This effort enabled us to assign the majority of contigs to 48 chromosomes, resulting in a chromosome anchoring rate of 96.39% (Supplementary Tables 9 and 10, Supplementary Fig. 2). Consequently, we generated two chromosome-level haploid genomes for Pacific saury by arbitrarily dividing the chromosomes into two haploid sets, with sizes of 1103 Mb and 1072 Mb, respectively (Fig. 1, Supplementary Table 11). For subsequent analyses, we refer to the two haploid genomes of Pacific saury as CSA (1103 Mb) and CSa (1072 Mb). Out of 3,565 single-copy orthologues, 97.94% and 97.64% were identified as complete and single-copy BUSCOs in the CSA and CSa genomes, respectively (Supplementary Tables 5 and 12). The estimated genome consensus quality (QV) for both haploid genomes was above 40 (Supplementary Table 5).
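The depth and size figures above follow from simple arithmetic, which the minimal sketch below reproduces. It uses only the values reported in the text; the function and variable names are illustrative and do not come from the authors' pipeline.

```python
def sequencing_depth(total_bases_gb: float, genome_size_mb: float) -> float:
    """Average depth (x) from read yield in Gb and genome size in Mb."""
    return total_bases_gb * 1000.0 / genome_size_mb


# Values reported in the text (names are illustrative).
hifi_yield_gb = 49.4      # HiFi CCS read yield
kmer_genome_mb = 1296.0   # K-mer based genome size estimate
assembly_mb = 2152.0      # size of the HiFiasm draft assembly

depth = sequencing_depth(hifi_yield_gb, kmer_genome_mb)
ratio = assembly_mb / kmer_genome_mb

print(f"HiFi depth relative to the K-mer estimate: {depth:.1f}x")
print(f"Draft assembly size / K-mer estimate: {ratio:.2f}")
```

The depth comes out at about 38x, matching the text. The size ratio comes out at about 1.66, which is the quantity the text describes as roughly twice the K-mer estimate.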
A GC content distribution, measured by the GC proportion of each 1Mb-window. B gene distribution, measured by the proportion of gene sequences in each 1Mb-window. C Repeat content, measured by the proportion of repeat regions in each 1Mb-window. D LINE distribution, measured by the proportion of LINE in each 1Mb-window. E Collinearity of two haploid genomes.
The repeat element prediction revealed that approximately 63.1% of the content in the two haploid genomes consisted of repetitive sequences, primarily comprising DNA, LINE, and LTR repeat elements (Supplementary Fig. 3, Supplementary Tables 13 and 14). Our gene annotation, using homolog-based, de novo-based, and RNA-seq-based approaches, predicted 44,823 protein-coding genes within the two haploid genomes (Supplementary Table 5). Notably, of all protein-coding genes, 43,635 were functionally annotated through homologous searches against publicly available databases, accounting for 97.35% of the total genes (Supplementary Table 15).
Gene family, phylogenetic and collinearity analysis, and gene evolution
For the comparative genomic analysis, we focused on CSA of Pacific saury, along with Zebrafish (Danio rerio), Chinook salmon (Oncorhynchus tshawytscha), Medaka (Oryzias latipes), Yellowtail kingfish (Seriola lalandi) and Yellowfin tuna (Thunnus albacares). Genes from these six species were categorized into 14,469 gene families, and 11,825 gene families were shared among all species. Additionally, we identified 110 gene families exclusive to Pacific saury (Supplementary Table 16). The phylogenetic relationship among these species was investigated using 3068 single-copy genes, revealing that Pacific saury diverged from its common ancestor with Medaka ~64.9 Ma (Fig. 2A). Furthermore, the collinearity analysis of protein-coding genes along chromosomes revealed strict chromosome karyotype conservation between Pacific saury and Medaka (Fig. 2B).
A phylogenetic trees and gene family contractions and expansions. Green numbers indicate the number of expanding gene families, and red numbers indicate the number of contracting gene families. Blue numbers indicate the divergence time between branches, and the numbers in parentheses indicate the divergence time supported by 95% of HPD (highest posterior density). B diagrams showing Pacific saury (Cololabis saira) chromosome synteny relations with Medaka (Oryzias latipes).
Gene family analysis showed that Pacific saury exhibited expansion in 223 gene families, while 2131 gene families showed contraction (Fig. 2A). The significantly expanded gene families were primarily associated with functions related to zinc ion binding and DNA repair (Supplementary Fig. 4 and Data 1). Notably, those expanded gene families were also enriched in immunity-related pathways, including phagosomes and programmed necrosis (Supplementary Fig. 4 and Data 2). Gene families for igh, hla-a, and gpr35 genes showed a significant expansion in the Pacific saury genome (Supplementary Table 17).
Within the expanded gene families, the gpr35 gene stood out with seven copies located on Chr9 of the CSA genome. In addition, we identified another four copies in the homologous region of the CSa genome, further confirming the expansion of the gpr35 gene family in the Pacific saury genome. The expanded gpr35 genes in both CSA and CSa exhibited a high degree of homology with those of closely related fish species (Fig. 3A). Furthermore, our analyses revealed that gpr35 is a single-exon gene residing within a LINE transposon. This finding suggests that the expansion of gpr35 genes in the Pacific saury genome possibly resulted from LINE-mediated tandem duplication. Phylogenetic analysis of gpr35 genes from CSA/CSa of Pacific saury and other fish species demonstrated that all Pacific saury gpr35 genes form a single clade, indicating that the expansion of the gpr35 gene family occurred after the species’ divergence from other related fish species (Fig. 3B).
From a pool of 3068 single-copy genes, we identified 610 positively selected genes with significant false discovery rate (FDR)-corrected p values (<0.05) in Pacific saury (Supplementary Data 3), which were predominantly associated with key biological pathways such as DNA repair, homologous recombination, Fanconi anemia, peroxisome, and mismatch repair (Fig. 4, Supplementary Table 18 and Data 4). In addition, these genes were found to be closely linked to processes such as RNA polymerase I formation, monosaccharide catabolic processes, and DNA replication (Supplementary Fig. 5 and Data 5).
Population genetic analysis
We used 80 samples collected from eight sites for whole-genome resequencing (Fig. 5 and Fig. 6A, Supplementary Table 1). The mapping rate for each sample ranged from 85.72% to 88.80%, and the mean mapping depth was ∼8-fold (Supplementary Table 19). Whole-genome SNPs and InDels were detected and filtered according to details in the Materials and Methods section. This process yielded 1,633,773 SNPs, which were subsequently used in our analyses (Supplementary Table 20). First, we calculated the genome-wide FST among sites and found that Pacific saury has very little population structure (FST < 0.005) (Fig. 6B). Combined with Pacific saury migration routes (Fig. 6A), clustering using site-wise FST allowed us to categorize the eight sites into two distinct groups (Fig. 6B, C). This suggests the presence of underlying genetic differentiation among populations from the various sampling sites. Based on this clustering analysis, we refer to the populations at sites S1, S2, S3 and S5 as the ‘west’ group, and the populations at sites S4, S6, S7 and S8 as the ‘east’ group.
A Distribution of eight sampling sites for Pacific saury. The black dashed line indicates the presumed different migratory routes of Pacific saury13. B FST values among sites based on whole-genome SNPs. Eight sites were clustered using the “ward.D2” method in the “pheatmap” function based on the Euclidean distance. The eastern group (I) is shown in blue, and the western group (II) is shown in red. C PCA within eight sites from FST values. PC, principal component.
We also performed selective sweep analysis to reveal genomic regions that were differentiated between the east (I) and west (II) groups. We found large genomic regions characterized by marked genetic differentiation between these two groups, especially on Chr2 and Chr11 (Fig. 7A). A closer examination concentrated on the region around 18 Mb of Chr2 and another region around 22 Mb of Chr11 (Fig. 7B). Within these regions, we identified 181 genes exhibiting a high degree of differentiation (FST > 0.05) and carrying non-synonymous mutations (Supplementary Table 21). These genes were significantly enriched in functions related to tRNA threonylcarbamoyladenosine metabolism and DNA replication/repair (Fig. 8A and Supplementary Table 22). Remarkably, we identified a non-synonymous mutation at amino acid 188 within TRMU (Fig. 8C), a gene associated with the tRNA threonylcarbamoyladenosine metabolic process (Supplementary Table 22). We found that this mutation is specific to Pacific saury and that its frequency in the trmu gene differs significantly (p < 0.05) between the east and west groups (Fig. 8B).
A patterns of genomic differentiation between east and west groups. FST values correspond to the weighted mean per 50-kb window with 5-kb increments. Different colors represent different chromosomes. The red box indicates the chromosomes where differentiation is more concentrated. B patterns of genomic differentiation between east and west groups in Chr2 and Chr11. FST values correspond to the weighted mean per 10-kb window with 1-kb increments. The red box indicates the areas where differentiation is more concentrated.
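The sliding-window scheme in this caption (50-kb windows moved in 5-kb increments, weighted mean FST per window) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes each SNP already carries a per-site FST numerator and denominator (for example, variance components from a Weir and Cockerham style estimator), and it takes "weighted" to mean the ratio of summed numerators to summed denominators. The function name and the toy input are hypothetical.

```python
from typing import List, Tuple


def windowed_weighted_fst(
    snps: List[Tuple[int, float, float]],
    chrom_len: int,
    window: int = 50_000,
    step: int = 5_000,
) -> List[Tuple[int, int, float]]:
    """Weighted FST per sliding window.

    snps: (1-based position, FST numerator, FST denominator) for each SNP.
    Returns (start, end, weighted FST) for every window whose summed
    denominator is positive; windows at the chromosome end are truncated.
    """
    results = []
    start = 1
    while start <= chrom_len:
        end = min(start + window - 1, chrom_len)
        in_window = [(n, d) for p, n, d in snps if start <= p <= end]
        num = sum(n for n, _ in in_window)
        den = sum(d for _, d in in_window)
        if den > 0:
            # Ratio of sums, not the mean of per-SNP ratios.
            results.append((start, end, num / den))
        start += step
    return results


# Hypothetical toy data: three SNPs on a 100-kb chromosome.
toy_snps = [(1_000, 0.01, 0.5), (4_000, 0.03, 0.5), (60_000, 0.2, 0.4)]
windows = windowed_weighted_fst(toy_snps, chrom_len=100_000)
print(windows[0])
```

The linear scan per window keeps the sketch short; for genome-scale data, sorted positions with a moving index would avoid rescanning all SNPs for every window.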
A GO-enriched pathways of highly differentiated (FST > 0.05) genes containing non-synonymous mutations within a concentrated differentiation region on Chr2 and Chr11, including 181 genes. Only significantly enriched GO pathways (p < 0.05) are shown. B Types of trmu genes and distribution in the east and west groups. C Three-dimensional views of TRMU protein. The three-dimensional protein model was generated by Phyre298. Pacific saury mutated amino acids are highlighted in red.
Discussion
Besides being an economically important fish species, Pacific saury is also a pelagic fish of the North Pacific Ocean, making it a species of substantial research interest1,4. A reference genome could greatly promote reliable studies of the population structure and evolution of Pacific saury. Previous studies have shown that Pacific saury is a diploid species with high heterozygosity24. K-mer analysis revealed high heterozygosity (3.19%) in the Pacific saury genome, posing a significant challenge for genome assembly25. Meanwhile, most previous diploid genome assemblies resulted in a single mosaic reference genome consisting of portions of the parental alleles26. To overcome these challenges and to obtain a phased genome, we employed PacBio HiFi and Hi-C technologies to assemble the first phased chromosome-level genomes of Pacific saury. We assembled two haploid genomes with sizes of 1.10 Gb (CSA) and 1.07 Gb (CSa). The numbers of protein-coding genes were 22,206 (CSA) and 22,617 (CSa), similar to previously published genome size and annotation results, which further supports the reliability of our assembly24,27. Unfortunately, the collection of biological information for Pacific saury in this study was incomplete, and detailed bioinformatic analyses will be conducted in subsequent studies.
The genome analysis revealed the natural selection of numerous immunity-related functional genes. Immunoglobulin (Ig) is a glycoprotein that plays an important role in adaptive immunity and is produced by B lymphocytes upon exposure to antigens28,29. Among the essential components of immunoglobulins, IGH holds particular significance as it contributes to antigen recognition and signaling30. Furthermore, genes such as hla-a, ifn3 and ifng, which were highly expressed in the kidney of Pacific saury, are also involved in immune regulation27,31. Previous studies have shown the high prevalence of large parasitic copepods infecting Pacific saury7,15,32. Moreover, the migratory route of Pacific saury traverses complex and expansive sea areas, such as the Kuroshio-Oyashio transition zone33, which potentially harbors various marine viruses14. The expansion of immunity-related gene families observed in Pacific saury may be a response to the elevated risk of viral and parasitic infections associated with its habitat and migratory patterns.
The intestine plays an essential role in the response to pathogen infections, given that intestinal epithelial cells serve as a crucial physical barrier against bacterial infection in the gut34. Within the intestinal immune system, numerous immune cells are at work, and the gut-associated lymphoid tissue (GALT) constitutes about 70% of the entire immune system in teleost fish34. Gpr35 is abundantly expressed in the intestine35, and research has demonstrated its critical role in macrophages during intestinal inflammation35, contributing to mucosal repair via migration of colonic epithelial cells36. Within the Pacific saury genome, the expansion of the gpr35 gene, alongside other immunity-related genes, is possibly a critical factor in enhancing resistance to marine viral and parasitic infections. This expansion underscores the species’ adaptation to its diverse and challenging habitats.
Pacific saury shows robust motility as an adaptation to its migratory lifestyle, with a substantial portion of its life history dedicated to migration13. Physical activity is closely linked to the generation of free radicals, particularly oxygen radicals, and prolonged endurance exercise can escalate free radical production16. Reactive oxygen species, such as hydroxyl radicals, produced during oxidative cellular respiration, can damage DNA bases and cause DNA strand breaks37. Repair pathways associated with DNA cross-linking damage, including Fanconi anemia (FA), nucleotide excision repair, and homologous recombination repair, play vital roles in mitigating these types of DNA damage38,39. ATM, a key regulator of the DNA damage response, is involved in various processes, such as cell cycle checkpoints, DNA damage repair, and the maintenance of telomeres40. ERCC6, on the other hand, functions in DNA repair, the preservation of chromosome stability, and as a co-factor in base excision repair41. POLD3 participates in various DNA repair processes42,43. It is notable that genes associated with DNA repair show signatures of positive selection, which could be indicative of a specialized antioxidant mechanism that has evolved in Pacific saury in response to its demanding lifestyle.
The extensive migratory range and prolonged migration periods of Pacific saury necessitate a continuous cellular energy supply. Lipids serve as a crucial means to store and generate energy, and Pacific saury is distinguished by its elevated fatty acid content compared to other fish species44. Peroxisomes are closely associated with cellular redox metabolism, fatty acid oxidation, and detoxification of free radicals45,46,47. Key proteins in this context include PEX14, which is a central component of the peroxisomal matrix protein transport system48, PEX6, which promotes fatty acid oxidation49, and HPCL2 and PHYH, which are essential for fatty acid alpha oxidation50,51. These positively selected genes are possibly linked to the high energy requirements associated with Pacific saury’s long migrations and may also play a role in mitigating the accumulation of metabolically generated free radicals.
To further understand the molecular adaptations to long-duration and long-distance migrations in fish, we analyzed the positively selected genes shared by two highly migratory fish species, Pacific saury and Yellowfin tuna. The identified genes under positive selection, including efnb2, crlf3, and gpx3, are primarily associated with hematopoietic cells, hypoxic response, and antioxidants (Fig. 9 and Supplementary Table 23). EFNB2 (ephrin B2) is a member of the ephrin (EPH) family and plays an essential role in the development of the nervous system and in erythropoiesis. It can promote erythroid differentiation under hypoxic conditions52. CRLF3, a neuroprotective erythropoietin receptor, has been suggested by zebrafish studies to affect primitive hematopoiesis and downstream hematopoietic progenitors53. GPX3 belongs to the glutathione peroxidase family, safeguarding cells from oxidative damage and serving as a crucial antioxidant enzyme54. These shared positively selected genes in Pacific saury and Yellowfin tuna possibly reflect a typical evolutionary pattern of oceanic migratory fish, particularly concerning their oxygen-carrying and antioxidant capacities. These possible adaptations are common responses to the challenges posed by sustained or rapid movements in aquatic environments.
The main results are outlined as follows. The igh, gpr35, hla-a, ifn3 and ifng genes were expanded and are associated with the immune system. The atm, pold3 and ercc6 genes were under positive selection and may be associated with the repair of DNA damage caused by metabolic free radicals. The pex6, hpcl2 and phyh genes were under positive selection and may be associated with energy metabolism. The efnb2 and crlf3 genes were under positive selection and may be associated with oxygen supply.
Population analysis based on mitochondrial genes previously suggested the presence of a single Pacific saury population9. However, recent research has proposed the existence of distinct migratory routes for Pacific saury in the North Pacific region13. These differing migratory routes could potentially lead to differentiation of Pacific saury populations. Furthermore, otolith studies have revealed morphological variations in the first otolith annual ring and the sharpness of otoliths between ‘east’ and ‘west’ Pacific saury groups. Our population genomic analysis has provided additional evidence of genetic differentiation among sampling sites, and the clustering of sites into two groups appears to be consistent with the two approximate migratory routes of Pacific saury (Fig. 6A, B). Notably, we have identified substantial differentiation in specific genomic regions on Chr2 and Chr11 (Fig. 7B) between the two groups. A similar phenomenon was previously found in studies of different ecotypes of Chinook salmon55. Also, studies on Atlantic cod have shown that in the presence of gene flow, ecological divergence still exists and affects specific genomic regions56. Accordingly, we postulate that the differentiated regions on Chr11 and Chr2 may also be linked to the differentiation between migratory groups within the Pacific saury population. However, confirming this hypothesis requires further in-depth investigation. We also found that the trmu gene exhibited significantly different allele frequencies in the two groups. This gene has been associated with inner ear hair cell development in zebrafish studies, influencing otolith size and shape57. These findings provide insights into the potential genetic basis of morphological variations in Pacific saury populations.
In conclusion, our data indicate that expanded genes and genes under selection may be associated with persistent movements, corresponding to the migratory characteristic of Pacific saury. Based on the whole-genome resequencing of 80 individuals, the Pacific saury population was divided into two groups, with genomic differences between the two groups concentrated on two chromosomes. Furthermore, the trmu gene on chromosome 2, which is associated with otoliths, exhibited significantly different allele frequencies in the two groups. These findings could help identify Pacific saury populations at the genetic level.
Materials and Methods
Sample collection
A female Pacific saury collected in the North Pacific Ocean (153°08’E, 43°00’N) on 22 October 2019 was used for genome sequencing and assembly. After euthanasia, the fish was immediately dissected to extract muscle, gills, liver, intestine, and skin. All samples were frozen in liquid nitrogen and stored in a −80°C freezer. Muscle tissues were used for DNA extraction, genome sequencing and assembly, and Hi-C library construction. We used a modified CTAB method58 to extract genomic DNA (gDNA) from the tissues for Illumina short-read and PacBio long-read genome sequencing. The concentration of gDNA was measured with a NanoDrop 2000 (NanoDrop Technologies, Wilmington, DE, USA) and a Qubit fluorometer (ThermoFisher, MA, USA), and the quality of gDNA was assessed by 0.8% agarose gel electrophoresis. Gill, liver, intestine, and skin were used for RNA sequencing. A total of 80 samples were collected at eight stations in the North Pacific Ocean from June 2019 to November 2019 for resequencing analysis (Fig. 5 and Supplementary Table 1). The same modified CTAB method was used for DNA extraction of resequencing samples. During the course of this experiment, the operators strictly adhered to the Code of Ethics of the Ethics Committee for Laboratory Animal Management and Use of Ocean University of China and followed the rules and regulations of the Special Committee on Scientific Ethics of the Academic Committee of Ocean University of China.
Genome size estimation
We used 1 μg of DNA to construct the whole-genome sequencing short-read library with an insert size of 300–350 bp for the Illumina NovaSeq 6000 platform. The sequencing library was constructed strictly according to the manufacturer’s recommendations. The raw sequencing data were filtered with HTQC v1.92.31059. Finally, a total of 65.92 Gb of WGS short reads (CRA015908) was generated. Clean reads were used for K-mer analysis with the GCE60 software based on 17-mer frequencies.
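The idea behind K-mer based genome size estimation can be illustrated with a minimal sketch: count all K-mers in the reads, build the depth histogram, and divide the total number of K-mer instances by the depth of the main peak. This is a deliberate simplification and not the GCE algorithm itself, which additionally models heterozygosity and repeats; the sketch also ignores reverse complements and would be misled by the separate heterozygous peak expected in a genome as heterozygous as this one. All function names and the toy data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def kmer_spectrum(reads: Iterable[str], k: int = 17) -> Counter:
    """Return {depth: number of distinct k-mers observed at that depth}."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return Counter(counts.values())


def estimate_genome_size(spectrum: Dict[int, int], min_depth: int = 2) -> float:
    """Genome size ~ total k-mer instances / depth of the main peak.

    K-mers seen fewer than min_depth times are treated as sequencing
    errors and ignored (an assumption of this sketch).
    """
    usable = {d: n for d, n in spectrum.items() if d >= min_depth}
    peak_depth = max(usable, key=usable.get)
    total_instances = sum(d * n for d, n in usable.items())
    return total_instances / peak_depth


# Toy spectrum: an error spike at depth 1 and a main peak at depth 20.
toy_spectrum = {1: 1000, 19: 50, 20: 100, 21: 50}
print(estimate_genome_size(toy_spectrum))
```

Real data sets at this scale are counted with dedicated tools rather than an in-memory dictionary; the sketch only shows the relationship between the histogram peak and the size estimate.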
PacBio HiFi-CCS sequencing and de novo genome assembly
We used g-TUBE (Covaris) to randomly shear the genomic DNA into fragments of about 15 kb. SMRTbell Express Template Prep kit 2.0 reagent (Pacific Biosciences) was used to construct the SMRTbell HiFi library. The FEMTO Pulse and Qubit dsDNA HS kits were used to assess the library size and quality. Finally, the primer and Sequel II DNA polymerase were annealed and combined with the SMRTbell templates. The constructed libraries were sequenced using PacBio Sequel II in CCS mode for 30 h. CCS workflow61, with “-minPasses 3” setting, was used to generate HiFi reads from raw subreads (https://github.com/pacificbiosciences/ccs). A total of 49 Gb of HiFi CCS reads (CRA015952) was generated. The coverage was sufficient for de novo assembly according to the recommendation61. Subsequently, HiFiasm62 was employed for genome assembly.
We evaluated the completeness and accuracy of the assembled genome using four methods. In the first method, Minimap2 (v2.5 default parameter63) was used to align the HiFi CCS data to the assembled genome, counting the ratio of mapped reads, the coverage of the genome, and the distribution of sequencing depths. In the second method, the WGS short reads were aligned to the assembled genome using BWA64 to count the ratio of mapped reads. In the third method, BUSCO v5.7.065 was used to evaluate genome completeness with the actinopterygii_odb10 database. In the fourth method, Merqury (v1.3 default parameter66) was used to evaluate genome consensus quality (QV) based on the WGS short reads.
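For readers unfamiliar with consensus quality values, the sketch below shows how a QV relates to a per-base error rate, assuming the standard Phred scaling (QV = -10 log10 of the error probability). It illustrates only the scale, not how Merqury derives the error rate from K-mers, and the function names are illustrative.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Per-base error probability implied by a Phred-scaled QV."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Phred-scaled QV implied by a per-base error probability."""
    return -10 * math.log10(error_rate)


# A QV of 40 corresponds to about one error per 10,000 bases.
print(qv_to_error_rate(40))
print(error_rate_to_qv(1e-4))
```

Under this scaling, a QV above 40 means fewer than one expected consensus error per 10 kb of assembled sequence.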
Chromosome assembly using Hi-C technology
Muscle tissue was taken for Hi-C assisted assembly. For Hi-C library construction, the DNA was fragmented into 300–500 bp and purified using magnetic beads. Subsequently, the Hi-C library was sequenced with 150-bp paired-end reads on the Illumina NovaSeq 6000 platform. Hi-C raw data were filtered using HTQC v1.92.310. A total of 138 Gb of Hi-C data (CRA015885) was generated. After filtering, the data were aligned to the assembly using BWA v0.7.16a-r1181, and ALLHiC67 was used to remove reads with single-end alignments and reads located more than 500 bp from the restriction sites. The contigs were clustered, sorted, and oriented using ALLHiC to obtain chromosome-level genomes. Interaction maps for the assembled genomes were constructed using Juicer68 and visualized for error correction using JuiceBox69.
RNA sequencing of short reads
A total of 2 μg of RNA was used as input material for the RNA sample preparations. Sequencing libraries were generated using the NEBNext® Ultra™ RNA Library Prep Kit for Illumina (#E7530L, NEB, USA) following the manufacturer’s recommendations, and index codes were added to attribute sequences to each sample. The target products were recovered and amplified by PCR to complete the library. The clustering of the index-coded samples was performed on a cBot cluster generation system using HiSeq PE Cluster Kit v4-cBot-HS (Illumina) according to the manufacturer’s instructions. After cluster generation, the libraries were sequenced on an Illumina NovaSeq 6000 platform and a total of 11.68 Gb of RNA short reads (CRA015982) was generated.
Genome annotation
To identify repeat elements, we used both de novo and homology-based approaches. For the homology-based approach, RepeatMasker (open-4.09)70 and RepeatProteinMask (open-4.09) were used to search for transposable elements (TEs) by aligning the Pacific saury genome against RepBase (release 21.01)71. For the de novo approach, RepeatModeler v272 and LTR-FINDER v1.0.573 were used to construct a de novo repeat library. Then RepeatMasker was used to identify the repeat sequences with RepBase. Tandem Repeat Finder (TRF)74 was used to identify tandem repeats.
We combined homology annotation, de novo annotation, and transcriptome-based annotation approaches to predict gene structure and function. For the homology annotation, we downloaded protein sequences of Sheepshead minnow (Cyprinodon variegatus), Medaka (Oryzias latipes), Marine Medaka (Oryzias melastigma), Mummichog (Fundulus heteroclitus), and Annual killifish (Austrofundulus limnaeus) from NCBI. These protein sequences were aligned to the Pacific saury genome using TblastN75. We used Exonerate76 to predict the protein-coding gene structures based on the aligned data. For the de novo annotation, Augustus v3.377 and Genscan v3.0.478 were used to predict protein-coding genes. For the transcriptome-based annotation, we sequenced RNA extracted from the gill, liver, intestine, and skin. TopHat (default parameters)79 was used to map the reads to the Pacific saury reference genome. Cufflinks (default parameters)80 was used to assemble the mapped reads into transcripts to obtain the structure of the protein-coding genes. MAKER v3.0081 was used to integrate the gene sets predicted by the various methods into a non-redundant, more complete, and reliable gene set. Finally, the proteins were functionally annotated against external protein databases (SwissProt82, TrEMBL83, KEGG84, InterPro85, GO86 and NR (https://www.ncbi.nlm.nih.gov/)).
Gene family clustering analysis
We selected Zebrafish (Danio rerio), Chinook salmon (Oncorhynchus tshawytscha), Medaka (Oryzias latipes), Yellowtail kingfish (Seriola lalandi), Yellowfin tuna (Thunnus albacares), and Pacific saury (Cololabis saira) for gene family analysis. The filtered dataset was compared all-versus-all using BLASTP v2.11.0 (-evalue 1e-5) to obtain similarity relationships of protein sequences. The genes were clustered into families using OrthoMCL v2.0.9 (-l 1.5)87.
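The single-copy families that the phylogenetic analysis relies on can be picked out of a clustering result with a simple count per species. The following is a minimal sketch, not the authors' pipeline: the function name, the input layout (a dictionary of family IDs to species and gene pairs), and the toy family and gene names are all assumptions made for illustration.

```python
from collections import Counter


def single_copy_families(families, species):
    """Return IDs of families with exactly one gene from every listed species.

    families: dict mapping family ID -> list of (species, gene_id) pairs
              (hypothetical layout, not the OrthoMCL output format).
    species:  iterable of species labels expected in the comparison.
    """
    wanted = set(species)
    selected = []
    for fam_id, members in families.items():
        counts = Counter(sp for sp, _ in members)
        # Keep the family only if every species is present exactly once.
        if set(counts) == wanted and all(c == 1 for c in counts.values()):
            selected.append(fam_id)
    return selected


# Toy input with made-up family and gene names (three species for brevity).
toy_species = ["saury", "medaka", "zebrafish"]
toy_families = {
    "fam1": [("saury", "g1"), ("medaka", "g2"), ("zebrafish", "g3")],
    "fam2": [("saury", "g4"), ("saury", "g5"), ("medaka", "g6"), ("zebrafish", "g7")],
    "fam3": [("saury", "g8"), ("medaka", "g9")],
}
single_copy = single_copy_families(toy_families, toy_species)
```

In the toy input, only the first family qualifies: the second has two saury genes and the third lacks a zebrafish member.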
Phylogenetic analysis using whole-genome information
Phylogenetic trees were constructed using the single-copy genes of the six species identified by OrthoMCL. Multiple sequence alignment was performed for each single-copy gene family using MAFFT v7.48788 with default parameters. The alignments of all single-copy genes were concatenated into a super-alignment matrix, and conserved sequences were extracted using Gblocks v0.91b (-t=c)89. Data sets of all loci, phase 1 loci, and 4D loci were obtained. RAxML v8.2.12 (-f a -N 100 -m GTRGAMMA)90 was used to construct phylogenetic trees of the six species using the maximum-likelihood method with 1,000 bootstraps. The final species phylogenetic relationships were determined based on known species relationships and the degree of agreement among the phylogenetic trees. Divergence times were estimated using MCMCtree in the PAML91 software and were calibrated using TimeTree (http://www.timetree.org/).
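The concatenation step can be illustrated as follows. This is a minimal sketch under stated assumptions: each per-gene alignment is represented as a dictionary from species label to aligned sequence, and the function name and toy sequences are hypothetical. It does not reproduce the Gblocks filtering.

```python
def concatenate_alignments(alignments, species):
    """Join per-gene alignments into one supermatrix, one row per species.

    alignments: list of dicts mapping species label -> aligned sequence.
    species:    list of species labels; every alignment must contain all of them.
    """
    parts = {sp: [] for sp in species}
    for index, aln in enumerate(alignments):
        if set(aln) != set(species):
            raise ValueError(f"alignment {index} does not cover exactly the listed species")
        # All rows of one alignment must have the same number of columns.
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError(f"alignment {index} has rows of unequal length")
        for sp in species:
            parts[sp].append(aln[sp])
    return {sp: "".join(chunks) for sp, chunks in parts.items()}


# Toy alignments with made-up sequences for two species.
toy_alignments = [
    {"saury": "ATG-", "medaka": "ATGC"},
    {"saury": "GGA", "medaka": "GCA"},
]
supermatrix = concatenate_alignments(toy_alignments, ["saury", "medaka"])
```

Checking row lengths before joining keeps a single misaligned gene from shifting every downstream column of the matrix.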
Gene family contraction and expansion and positive selection analysis
Expansion and contraction of each gene family were identified using CAFE v5.0 (P < 0.05)92. We used the CodeML v4.9 module in PAML to detect positive selection in single-copy genes. The multiple-protein alignments of the single-copy genes were generated by MAFFT v7.487 and used to estimate the dN/dS ratio (ω). Likelihood values were calculated separately under Model A (model=2, NSsites=2, fix_omega=0) and the null model (model=2, NSsites=2, fix_omega=1, omega=1.0) based on the multiple-protein alignments. A likelihood ratio test was performed on these likelihood values with the chi2 program from PAML, and genes with a p value less than 0.05 were treated as candidates that had undergone positive selection. The posterior probability of each site being under positive selection was obtained using the Bayes empirical Bayes method. Finally, KEGG and GO enrichment analyses of the positively selected genes were used to identify functional categories and pathways.
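The likelihood ratio test described above compares twice the difference in log-likelihood between Model A and the null model against a chi-square distribution. The sketch below assumes one degree of freedom (the two models differ by one free parameter, ω) and uses the closed form of the chi-square tail for that case; the function name and the log-likelihood values are hypothetical and are not taken from the study.

```python
import math


def lrt_pvalue(lnl_alt, lnl_null):
    """Likelihood ratio test with 1 degree of freedom (assumed).

    Returns (statistic, p_value), where statistic = 2 * (lnL_alt - lnL_null).
    For 1 d.f. the chi-square upper tail equals erfc(sqrt(statistic / 2)).
    """
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods for one gene under Model A and the null model.
stat, p = lrt_pvalue(lnl_alt=-1000.00, lnl_null=-1001.92)
is_candidate = p < 0.05
```

With these example values the statistic is 3.84, which sits almost exactly at the 0.05 boundary for one degree of freedom, so small changes in either log-likelihood decide whether a gene is reported as a candidate.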
Conserved syntenies
JCVI93 was used to identify and visualize regions of conserved synteny between Pacific saury and Medaka based on protein-coding gene regions. MCScanX94 was used to identify the locations of orthologous and paralogous genes between Pacific saury and the other species.
Population genetics
Genomic DNA from the 80 individuals was sequenced on an Illumina NovaSeq 6000. The sequencing data were filtered and aligned to the CSA genome using BWA. GATK was used to perform SNP calling, with the hard-filtering expression set to “QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < −12.5 || ReadPosRankSum < −8.0”. SnpEff95 was used to annotate the genetic variants. Whole-genome SNPs were further filtered (-maf 0.2, -geno 0.1, -hwe 0.0001) using PLINK96, leaving a total of 1,633,773 SNP loci. FST values among sites were calculated in sliding windows along the genome (window size of 50 kb with 5-kb increments) using VCFtools97. A window size of 10 kb with 1-kb increments was applied for finer-scale analysis. The pheatmap package was used to cluster the eight sites based on FST. The clusterProfiler package was used to perform enrichment analysis with the Zebrafish database.
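To make the hard-filtering expression and the window layout concrete, the sketch below restates them in plain Python. It is an illustration only, not the GATK or VCFtools implementation: the function names are hypothetical, missing annotations are assumed not to trigger a filter, and the per-window value is a simple unweighted mean of per-SNP values rather than the weighted estimator a dedicated tool may report.

```python
def fails_hard_filter(ann):
    """Return True if any clause of the quoted filter expression is met.

    ann: dict of annotation name -> float; absent annotations are skipped (assumption).
    """
    rules = [
        ("QD", lambda v: v < 2.0),
        ("FS", lambda v: v > 60.0),
        ("MQ", lambda v: v < 40.0),
        ("MQRankSum", lambda v: v < -12.5),
        ("ReadPosRankSum", lambda v: v < -8.0),
    ]
    return any(name in ann and test(ann[name]) for name, test in rules)


def sliding_windows(chrom_length, size=50_000, step=5_000):
    """Yield 1-based inclusive (start, end) windows along a chromosome."""
    start = 1
    while start <= chrom_length:
        yield start, min(start + size - 1, chrom_length)
        start += step


def windowed_mean(values_by_pos, chrom_length, size=50_000, step=5_000):
    """Average per-SNP values in each window; windows without SNPs are skipped."""
    result = []
    for start, end in sliding_windows(chrom_length, size, step):
        vals = [v for pos, v in values_by_pos if start <= pos <= end]
        if vals:
            result.append((start, end, sum(vals) / len(vals)))
    return result


# Toy per-SNP values (position, value) on a hypothetical 2-kb sequence.
toy = windowed_mean([(500, 0.1), (1500, 0.3)], 2000, size=1000, step=500)
```

Because the step is smaller than the window size, neighbouring windows overlap, so a single strongly differentiated SNP contributes to several consecutive windows.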
Statistics and reproducibility
The statistical significance of GO and KEGG terms was evaluated using Fisher’s exact test in combination with FDR correction for multiple testing (P < 0.05). For whole-genome resequencing, we analyzed 10 samples from each of the eight sites under the same conditions to ensure comprehensive and accurate detection of variation. *p-value < 0.05 and **p-value < 0.01 were considered significant and extremely significant differences, respectively.
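The enrichment statistics can be sketched with the standard library alone. The example below assumes a one-sided Fisher's exact test (the upper tail of the hypergeometric distribution) and the Benjamini-Hochberg procedure for the FDR correction; the text does not name the specific FDR method, so that choice, the function names, and the example p values are assumptions.

```python
from math import comb


def enrichment_pvalue(k, n, K, N):
    """One-sided Fisher's exact test: P(X >= k) under the hypergeometric model.

    k: selected genes carrying the term, n: selected genes,
    K: background genes carrying the term, N: background genes.
    """
    total = comb(N, n)
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values for four terms.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [q < 0.05 for q in adj]
```

Terms are then reported when their adjusted value falls below the chosen threshold, which is 0.05 in the text.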
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive (Genomics, Proteomics & Bioinformatics 2021) in National Genomics Data Center (Nucleic Acids Res 2022), China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences (BioProject ID PRJCA025192) and are publicly accessible at https://ngdc.cncb.ac.cn/gsa.
References
Fukushima, S. Synoptic analysis of migration and fishing conditions of saury in the northwest Pacific Ocean. Bulletin of Tohoku Regional Fisheries Research Laboratory (Japan) (1979).
Hubbs, C. L. & Wisner, R. L. Revision of the sauries (Pisces, Scomberesocidae) with descriptions of two new genera and one new species. Fish. B-noaa. 77, 521–566 (1980).
Chang, Y.-J., Lan, K.-W., Walsh, W. A., Hsu, J. & Hsieh, C.-H. Modelling the impacts of environmental variation on habitat suitability for Pacific saury in the Northwestern Pacific Ocean. Fish. Oceanogr. 28, 291–304 (2019).
Bakun, A. & Cury, P. The “school trap”: a mechanism promoting large-amplitude out-of-phase population oscillations of small pelagic fish species. Ecol. Lett. 2, 349–351 (1999).
Ma, S. et al. Interannual to decadal variability in the catches of small pelagic fishes from China Seas and its responses to climatic regime shifts. Deep Sea Res. Part II: Topical Stud. Oceanogr. 159, 112–129 (2019).
Suyama, S., Kurita, Y. & Ueno, Y. Age structure of Pacific saury Cololabis saira based on observations of the hyaline zones in the otolith and length frequency distributions. Fish. Sci. 72, 742–749 (2006).
Suyama, S. et al. Geographical variation in spawning histories of age-1 Pacific saury Cololabis saira in the North Pacific Ocean during June and July. Fish. Sci. 85, 495–507 (2019).
Watanabe, Y., Kurita, Y., Noto, M., Oozeki, Y. & Kitagawa, D. Growth and Survival of Pacific Saury Cololabis saira in the Kuroshio-Oyashio Transitional Waters. J. Oceanogr. 59, 403–414 (2003).
Chow, S., Suzuki, N., Brodeur, R. D. & Ueno, Y. Little population structuring and recent evolution of the Pacific saury (Cololabis saira) as indicated by mitochondrial and nuclear DNA sequence data. J. Exp. Mar. Biol. Ecol. 369, 17–21 (2009).
Tian, Y., Akamine, T. & Suda, M. Long-term variability in the abundance of Pacific saury in the Northwestern Pacific Ocean and climate changes during the last century. Bull. Jpn. Soc. Fish. Oceanogr. 66, 16–25 (2002).
Xu, G. et al. Genome and population sequencing of a chromosome-level genome assembly of the Chinese tapertail anchovy (Coilia nasus) provides novel insights into migratory adaptation. GigaScience 9 https://doi.org/10.1093/gigascience/giz157 (2020).
Harder, A. M. & Christie, M. R. Genomic signatures of adaptation to novel environments: hatchery and life history-associated loci in landlocked and anadromous Atlantic salmon (Salmo salar). Can. J. Fish. Aquat. Sci. 79, 761–770 (2022).
Fuji, T., Suyama, S., Nakayama, S., Hashimoto, M. & Oshima, K. A review of the biology for Pacific saury, Cololabis saira in the North Pacific Ocean. NPFC-2019-SSC PSSA05-WP13 (Rev. 1) (2019).
Sime-Ngando, T. L. & Colombet, J. Virus and prophages in aquatic ecosystems. Can. J. Microbiol. 55, 95–109 (2009).
Yamaguchi, M. & Honma, T. Parasitological study of the migration route of the Pacific saury, Cololabis saira, to the Okhotsk Sea. Scientific Reports of Hokkaido Fisheries Experiment Station, 35–44 (1992).
Davies, K. J. A., Quintanilha, A. T., Brooks, G. A. & Packer, L. Free radicals and tissue damage produced by exercise. Biochem. Biophys. Res. Commun. 107, 1198–1205 (1982).
Chapman, B. B. et al. Partial migration in fishes: causes and consequences. J. Fish. Biol. 81, 456–478 (2012).
Suyama, S., Nakagami, M., Naya, M. & Ueno, Y. Migration route of Pacific saury Cololabis saira inferred from the otolith hyaline zone. Fish. Sci. 78, 1179–1186 (2012).
Li, W. et al. Otolith Shape Analysis as a Tool to Identify Two Pacific Saury (Cololabis saira) Groups from a Mixed Stock in the High-Seas Fishing Ground. J. Ocean Univ. China 20, 402–408 (2021).
Zhao, L., Zhu, Q. & Hua, C. Genetic structure of saury population based on mitochondrial cytochrome b sequence analysis. Haiyang Tongbao 38, 312–318 (2019).
Zhang, B., Li, Y., Xue, D. & Liu, J. Population genomic evidence for high genetic connectivity among populations of small yellow croaker (Larimichthys polyactis) in inshore waters of China. Fish. Res. 225, 105505 (2020).
Zhao, J. et al. Review on application of SNP detection methods in animal research. dbkxxb 34, 299–305 (2018).
Zheng, J., Zhao, L., Zhao, X., Gao, T. & Song, N. High genetic connectivity inferred from whole-genome resequencing provides insight into the phylogeographic pattern of Larimichthys polyactis. Mar. Biotechnol. 24, 671–680 (2022).
Sato, M. et al. Chromosomal DNA sequences of the Pacific saury genome: versatile resources for fishery science and comparative biology. DNA Res. 31, dsae004 (2024).
Pryszcz, L. P. & Gabaldón, T. Redundans: an assembly pipeline for highly heterozygous genomes. Nucleic Acids Res. 44, e113–e113 (2016).
Chin, C.-S. et al. Phased diploid genome assembly with single-molecule real-time sequencing. Nat. Methods 13, 1050–1054 (2016).
Nakamura, Y., Yasuike, M., Fuji, T., Suyama, S. & Mekuchi, M. Draft genome sequence and tissue expression panel of Pacific saury (Cololabis saira). DNA Res. 31, dsae010 (2024).
Warr, G. W. The immunoglobulin genes of fish. Developmental Comp. Immunol. 19, 1–12 (1995).
Qu, B., Zhang, S., Ma, Z. & Gao, Z. Hepatic cecum: a key integrator of immunity in amphioxus. Mar. Life Sci. Technol. 3, 279–292 (2021).
Murphy, K. & Weaver, C. Janeway’s immunobiology. (Garland science, 2016).
Secombes, C. J., Hardie, L. J. & Daniels, G. Cytokines in fish: an update. Fish. Shellfish Immunol. 6, 291–304 (1996).
Nagasawa, K., Imai, Y. & Ishida, K. Long-term changes in the population size and geographical distribution of Pennella sp. (Copepoda) on the saury, Cololabis saira, in the western North Pacific Ocean and adjacent seas. Hydrobiologia 167, 571–577 (1988).
Ito, S. et al. Initial design for a fish bioenergetics model of Pacific saury coupled to a lower trophic ecosystem model. Fish. Oceanogr. 13, 111–124 (2004).
Vighi, G., Marcucci, F., Sensi, L., Di Cara, G. & Frati, F. Allergy and the gastrointestinal system. Clin. Exp. Immunol. 153, 3–6 (2008).
Kaya, B. et al. Lysophosphatidic Acid-Mediated GPR35 Signaling in CX3CR1+ Macrophages Regulates Intestinal Homeostasis. Cell Rep. 32, 107979 (2020).
Tsukahara, T. et al. G protein-coupled receptor 35 contributes to mucosal repair in mice via migration of colonic epithelial cells. Pharmacol. Res. 123, 27–39 (2017).
Slupphaug, G. The interacting pathways for prevention and repair of oxidative DNA damage. Mutat. Res. /Fundamental Mol. Mechanisms Mutagenesis 531, 231–251 (2003).
Kim, H. & D’Andrea, A. D. Regulation of DNA cross-link repair by the Fanconi anemia/BRCA pathway. Genes Dev. 26, 1393–1408 (2012).
Thompson, L. H., Hinz, J. M., Yamada, N. A. & Jones, N. J. How Fanconi anemia proteins promote the four Rs: Replication, recombination, repair, and recovery. Environ. Mol. Mutagenesis 45, 128–142 (2005).
Lee, J. H. & Paull, T. T. Activation and regulation of ATM kinase activity in response to DNA double-strand breaks. Oncogene 26, 7741–7748 (2007).
Tuo, J., Chen, C., Zeng, X., Christiansen, M. & Bohr, V. A. Functional crosstalk between hOgg1 and the helicase domain of Cockayne syndrome group B protein. DNA Repair 1, 913–927 (2002).
Kadyrov, F. A. et al. A possible mechanism for exonuclease 1-independent eukaryotic mismatch repair. Proc. Natl. Acad. Sci. USA 106, 8495–8500 (2009).
Spivak, G. Nucleotide excision repair in humans. DNA Repair 36, 13–18 (2015).
Zhu, X. et al. Analysis of the fatty acid contents and composition in 5 species of economic fish guts. Food Res. Dev. 42, 22–27 (2021).
Ding, T. et al. Optimal amounts of coconut oil in diets improve the growth, antioxidant capacity and lipid metabolism of large yellow croaker (Larimichthys crocea). Mar. Life Sci. Technol. 2, 376–385 (2020).
Nordgren, M. & Fransen, M. Peroxisomal metabolism and oxidative stress. Biochimie 98, 56–62 (2014).
Waterham, H. R., Ferdinandusse, S. & Wanders, R. J. A. Human disorders of peroxisome metabolism and biogenesis. Biochimica et. Biophysica Acta (BBA) - Mol. Cell Res. 1863, 922–933 (2016).
Grant, P. et al. The biogenesis protein PEX14 is an optimal marker for the identification and localization of peroxisomes in different cell types, tissues, and species in morphological studies. Histochem. Cell Biol. 140, 423–442 (2013).
Wang, Z. Y., Soanes, D. M., Kershaw, M. J. & Talbot, N. J. Functional analysis of lipid metabolism in Magnaporthe grisea reveals a requirement for peroxisomal fatty acid β-oxidation during appressorium-mediated plant infection. Mol. Plant-Microbe Interact. 20, 475–491 (2007).
Croes, K., Casteels, M., De Hoffmann, E., Mannaerts, G. P. & Van Veldhoven, P. P. Alpha-Oxidation of 3-methyl-substituted fatty acids in rat liver. Production of formic acid instead of CO2, cofactor requirements, subcellular localization and formation of a 2-hydroxy-3-methylacyl-CoA intermediate. Eur. J. Biochem. 240, 674–683 (1996).
Wierzbicki, A. S. et al. Identification of genetic heterogeneity in Refsum’s disease. Eur. J. Hum. Genet. 8, 649–651 (2000).
Suenobu, S. et al. A role of EphB4 receptor and its ligand, ephrin-B2, in erythropoiesis. Biochem. Biophys. Res. Commun. 293, 1124–1131 (2002).
Taznin, T., Perera, K., Gibert, Y., Ward, A. C. & Liongue, C. Cytokine Receptor-Like Factor 3 (CRLF3) Contributes to Early Zebrafish Hematopoiesis. Front. Immunol. 13, 910428 (2022).
Azhdari, A. et al. Antioxidant effect of high intensity interval training on cadmium-induced cardiotoxicity in rats. Gene Cell Tissue 6 https://doi.org/10.5812/gct.94671 (2019).
Thompson, N. F. et al. A complex phenotype in salmon controlled by a simple change in migratory timing. Science 370, 609–613 (2020).
Bradbury, I. R. et al. Genomic islands of divergence and their consequences for the resolution of spatial structure in an exploited marine fish. Evol. Appl. 6, 450–461 (2013).
Zhang, Q. et al. Deletion of Mtu1 (Trmu) in zebrafish revealed the essential role of tRNA modification in mitochondrial biogenesis and hearing function. Nucleic Acids Res. 46, 10930–10945 (2018).
Porebski, S., Bailey, L. G. & Baum, B. R. Modification of a CTAB DNA extraction protocol for plants containing high polysaccharide and polyphenol components. Plant Mol. Biol. Report. 15, 8–15 (1997).
Yang, X. et al. HTQC: a fast quality control toolkit for Illumina sequencing data. BMC Bioinforma. 14, 33 (2013).
Liu, B. et al. Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects. arXiv: Genomics https://doi.org/10.48550/arXiv.1308.2012 (2013).
Wenger, A. M. et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 37, 1155–1162 (2019).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
Li, H. New strategies to improve minimap2 alignment accuracy. Bioinformatics 37, 4572–4574 (2021).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).
Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A. & Zdobnov, E. M. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes. Mol. Biol. Evol. 38, 4647–4654 (2021).
Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245 (2020).
Zhang, X., Zhang, S., Zhao, Q., Ming, R. & Tang, H. Assembly of allele-aware, chromosomal-scale autopolyploid genomes based on Hi-C data. Nat. Plants 5, 833–845 (2019).
Durand, N. C. et al. Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell Syst. 3, 95–98 (2016).
Durand, N. C. et al. Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Syst. 3, 99–101 (2016).
Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. CP in Bioinformatics 5 https://doi.org/10.1002/0471250953.bi0410s05 (2004).
Jurka, J. et al. Repbase Update, a database of eukaryotic repetitive elements. Cytogenetic Genome Res. 110, 462–467 (2005).
Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl Acad. Sci. USA 117, 9451–9457 (2020).
Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).
Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).
Gertz, E. M., Yu, Y.-K., Agarwala, R., Schäffer, A. A. & Altschul, S. F. Composition-based statistics and translated nucleotide searches: Improving the TBLASTN module of BLAST. BMC Biol. 4, 41 (2006).
Slater, G. S. C. & Birney, E. Automated generation of heuristics for biological sequence comparison. BMC Bioinforma. 6, 31 (2005).
Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).
Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997).
Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009).
Trapnell, C. et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 7, 562–578 (2012).
Campbell, M. S., Holt, C., Moore, B. & Yandell, M. Genome annotation and curation using MAKER and MAKER‐P. CP Bioinforma. 48, 4.11.11–14.11.39 (2014).
Bairoch, A. & Apweiler, R. The SWISS-PROT protein sequence data bank and its supplement TrEMBL. Nucleic Acids Res. 25, 31–36 (1997).
Bairoch, A. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28, 45–48 (2000).
Kanehisa, M. & Goto, S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
Zdobnov, E. M. & Apweiler, R. InterProScan – an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17, 847–848 (2001).
Ashburner, M. et al. Gene Ontology: tool for the unification of biology. Nat. Genet. 25, 25–29 (2000).
Li, L., Stoeckert, C. J. & Roos, D. S. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 13, 2178–2189 (2003).
Nakamura, T., Yamada, K. D., Tomii, K. & Katoh, K. Parallelization of MAFFT for large-scale multiple sequence alignments. Bioinformatics 34, 2490–2492 (2018).
Talavera, G. & Castresana, J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst. Biol. 56, 564–577 (2007).
Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).
Mendes, F. K., Vanderpool, D., Fulton, B. & Hahn, M. W. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics 36, 5516–5518 (2020).
Tang, H. et al. Synteny and collinearity in plant genomes. Science 320, 486–488 (2008).
Wang, Y. et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 40, e49–e49 (2012).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 6, 80–92 (2012).
Purcell, S. et al. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
Kelley, L. A., Mezulis, S., Yates, C. M., Wass, M. N. & Sternberg, M. J. E. The Phyre2 web portal for protein modeling, prediction and analysis. Nat. Protoc. 10, 845–858 (2015).
Acknowledgements
This work was supported by the “R&D and Industrialization Project of Key Technologies for Smart Detection System of Digital Marine Fisheries Satellite” by China’s Ministry of Education (202482230112900600), and the National Natural Science Foundation of China (NSFC) (grant numbers 42376100, 32072980, 42206085).
Author information
Contributions
Y.L. (Yang Liu): investigation, conceptualization, writing (original draft), writing (review and editing). Y.L. (Yanping Luo): data curation, writing (review and editing). P.W. (Penghao Wang): methodology, data analysis, writing (original draft). W.L. (Wenjia Li): methodology design. H.T. (Hao Tian): data collection. C.C. (Chang Cao): data collection. Z.Y. (Zhiqiang Ye): investigation, conceptualization. H.L. (Hongan Long): investigation. T.L. (Tongtong Lin): writing (editing). S.W. (Shengjun Wang): conceptualization. X.Y. (Xiaohui Yuan): investigation. S.X. (Shijun Xiao): investigation, conceptualization, writing (review and editing). Y.W. (Yoshiro Watanabe): investigation, conceptualization. Y.T. (Yongjun Tian): investigation, conceptualization, writing (review and editing). All authors contributed to the manuscript and approved the submitted version. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Communications Biology thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editors: John Mulley, Luke Grinham and Kaliya Georgieva. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Liu, Y., Luo, Y., Wang, P. et al. Phased chromosome-level genome provides insights into the molecular adaptation for migratory lifestyle and population diversity for Pacific saury, Cololabis saira. Commun Biol 7, 1513 (2024). https://doi.org/10.1038/s42003-024-07126-0
DOI: https://doi.org/10.1038/s42003-024-07126-0