Background

The domestic silkworm, Bombyx mori (Lepidoptera: Bombycidae), was domesticated more than 5,000 years ago1. It is the key insect of the sericulture industry and an economically important animal that supports the livelihood of many farmers worldwide. Sericulture, the rearing of silkworms to obtain silk, is a highly labor-intensive primary industry, and global production continues to decrease because of declining production in China, which together with India accounts for the majority of the world’s raw silk production (https://inserco.org/en/statistics). Nevertheless, the silkworm remains one of the most important economic animals and is being adopted as a new source of income in some developing countries. Beyond its use as a source of silk for the textile industry, the use of silkworms and silkworm by-products is expanding into drugs, tissue engineering, medical textiles, drug delivery systems, cosmeceuticals, food additives, and the manufacture of valuable biomaterials. The importance of B. mori as an animal resource is therefore increasing2,3.

Over the long domestication period of 5,000 years, silkworms have been bred through strong selection to have phenotypes suited to specific uses. Domesticated silkworms produce large amounts of silk, and some are known to produce 10 times more silk than Bombyx mandarina, which is known as the wild counterpart of B. mori4,5. However, as the sericulture environment changes and the uses of B. mori expand beyond simple silk production, strains with various phenotypes have the potential to serve as important biological resources for various purposes. For this reason, even though silk production on general farms is decreasing in South Korea, national research institutes have continuously worked to secure useful genetic resources by constructing breeding lines for various strains of B. mori. The National Institute of Agricultural Sciences of the Rural Development Administration of Korea (NIAS, RDA, Korea) has been collecting silkworm resources with various expressed traits since the 1960s and has established breeding lines to use them as genetic resources for F1 hybrids. Strains with various phenotypes can be used to enhance specific phenotypes, depending on the intended use, through additional selective breeding and crossbreeding. They are also valuable biological resources for preparing for unexpected environmental changes, such as changes in feeding conditions. In addition, the whole-genome sequences of these strains, linked to their phenotypes, can serve as a major research resource for expanding our knowledge of the molecular background of B. mori.

In this study, we report the whole-genome sequences of 37 B. mori breeding line strains established over the past 60 years, along with descriptions of their phenotypic characteristics and photographs. These whole-genome sequences, linked to the phenotypic characteristics of the established breeding lines, could be valuable resources for understanding the B. mori genome and could provide more insight into the molecular background of various phenotypes.

Methods

Construction and maintenance of breeding lines

For the 37 breeding line strains reported in this study, individuals with distinctive phenotypes were first produced through two-way or three-way hybridization of B. mori strains collected locally after the Korean War. All 37 strains were fixed as breeding lines for F1 hybrid production through selective self-crossing for a minimum of 10 generations so that each strain stably maintains its specific phenotype. Each established breeding line strain produces one generation per year: eggs are hatched and reared from the spring, and the eggs obtained through self-breeding are preserved. Egg incubation is carried out under a 16 h light condition at 15–26°C and 75–80% humidity. After hatching, 1st–3rd instar larvae are reared at 25–26°C and 75–80% humidity, and 4th–5th instar larvae are reared at 23–24°C and 65–75% humidity. At all instar stages, the larvae are fed mulberry leaves 3 times a day.

Library construction and data generation

For whole-genome sequencing of the 37 breeding line strains, representative male individuals of each strain were randomly selected at the pupal stage. Epidermal tissue was isolated from the pupa, and DNA was extracted using the QIAGEN DNeasy Blood & Tissue Kit. The extracted DNA was subjected to gel electrophoresis to check for DNA fragmentation, and Trinean, PicoGreen, and Bioanalyzer assays were used to check DNA quality. For the five tri-molt mutant strains (KRSM, SH, HS, S7 and SD), sequencing libraries were constructed using the MGIEasy DNA Library Prep Kit according to the manufacturer’s protocol with a target library size of 500 bp, and 150 bp paired-end data were generated on the MGISEQ-2000 sequencing platform. Libraries for the remaining 32 strains were constructed using the Illumina TruSeq Nano DNA LT Kit with a target library size of 700 bp, and 150 bp paired-end data were generated on the Illumina NextSeq 500.

Genomic variants and phylogenetic relationships using the p50T reference strain

Adapter sequences and low-quality bases were removed using Trimmomatic6, and the filtered reads were mapped to the p50T reference genome7 from NCBI RefSeq using bwa-mem28 with default parameters. PCR duplicate removal and variant calling were performed using samtools9, and only biallelic single nucleotide variant (SNV) loci with no missing genotypes across the 38 samples, including the p50T strain, were extracted using VCFtools10. InDels and structural variants for each strain were identified using SvABA11. All identified variant information is available online (samtools: https://drive.google.com/file/d/1U3VVh_Q5ER-I6OtcpuqAunHZFtnbaQjG/view?usp=sharing; SvABA: https://github.com/asleofn/B_Mori/). Identified SNVs were annotated using SnpEff with a custom database built from the RefSeq annotation. The cladogram was constructed with the Neighbor-Joining algorithm using Tassel 512.
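
The site filter described above (biallelic SNVs with no missing genotypes) can be illustrated with a minimal Python sketch. This is not the command used in the study, which relied on VCFtools; the function names and the example records are hypothetical, and the sketch assumes that GT is the first FORMAT subfield of each sample column, as the VCF specification requires when GT is present.

```python
def is_biallelic_snv(ref: str, alt: str) -> bool:
    """True when REF and ALT are each exactly one nucleotide (one ALT allele, no indel)."""
    return len(ref) == 1 and len(alt) == 1 and ref in "ACGT" and alt in "ACGT"


def has_missing_genotype(sample_fields: list[str]) -> bool:
    """True when the GT subfield of any sample contains a missing allele ('.')."""
    for field in sample_fields:
        genotype = field.split(":")[0]  # assumption: GT is the first subfield
        if "." in genotype:
            return True
    return False


def filter_vcf_lines(lines: list[str]) -> list[str]:
    """Keep header lines plus records that are biallelic SNVs genotyped in every sample."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        cols = line.rstrip("\n").split("\t")
        ref, alt = cols[3], cols[4]
        samples = cols[9:]  # sample columns follow the FORMAT column
        if is_biallelic_snv(ref, alt) and not has_missing_genotype(samples):
            kept.append(line)
    return kept


# Hypothetical records for two samples; only the first record passes the filter.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tstrainA\tstrainB",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/0\t1/1",    # biallelic SNV, complete
    "chr1\t200\t.\tA\tG,T\t50\tPASS\t.\tGT\t0/1\t1/2",  # multiallelic
    "chr1\t300\t.\tAT\tA\t50\tPASS\t.\tGT\t0/0\t0/1",   # deletion, not an SNV
    "chr1\t400\t.\tC\tT\t50\tPASS\t.\tGT\t./.\t0/1",    # missing genotype
]
filtered = filter_vcf_lines(example)
```

Requiring complete genotypes at every retained site means that all 38 samples contribute to each pairwise comparison, which keeps the distances used for tree construction comparable across strains.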

Data Records

The entire data set described in this study is deposited under NCBI BioProject accession PRJNA75138713 and NCBI SRA accession SRP33103413; the accession number for each sample can be found in Tables 1 and 2.

Table 1 Summary information of the whole-genome sequences generated for the 37 B. mori breeding line strains.
Table 2 Phenotypes, silk production statistics and sequence accession information for the 37 B. mori breeding lines.

Technical Validation

Phenotypes and genome sequences of 37 breeding line strains of B. mori

As in other countries where B. mori is managed as an important economic animal, the NIAS, RDA, Korea has collected various B. mori strains found in South Korea since the 1960s and established breeding lines of these strains as genomic resources. In the early 1970s and 1980s, breeding centered on hardy, high silk-producing strains to increase silk production. From the 1990s, however, after Korea’s rapid industrialization, the focus shifted to strains that can be reared on artificial feed, require less labor, and are easily sexed using larval markings and cocoon colors, in order to cope with labor shortages and environmental changes. The 37 strains reported in this study are valuable as seed strains for developing customized hybrid strains that respond to changes in the sericulture environment and to requests from local farmers. Fig. 1 shows pictures of the egg, larva, cocoon, pupa, and adult of each of the 37 B. mori strains. Table 1 summarizes the whole-genome sequencing data generated for each strain, and Table 2 summarizes the phenotypic characteristics and breeding performance of the 37 breeding line strains. The minimum depth of coverage of the generated data was over 30X, based on a B. mori genome size of about 450 Mb.
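
The depth-of-coverage figure follows from the sequenced bases divided by the genome size. The sketch below shows that arithmetic for 150 bp paired-end reads and a genome of about 450 Mb, both taken from the text; the function names are hypothetical and the read-pair count is illustrative, not a value from Table 1.

```python
import math

GENOME_SIZE_BP = 450_000_000  # approximate B. mori genome size stated in the text
READ_LENGTH_BP = 150          # paired-end read length stated in the Methods


def mean_depth(read_pairs: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Mean depth = total sequenced bases / genome size (two reads per pair)."""
    return read_pairs * 2 * read_length_bp / genome_size_bp


def min_read_pairs(target_depth: int, read_length_bp: int, genome_size_bp: int) -> int:
    """Smallest number of read pairs that reaches the target mean depth."""
    return math.ceil(target_depth * genome_size_bp / (2 * read_length_bp))


# Read pairs needed for 30X, and the depth those pairs would give.
pairs_for_30x = min_read_pairs(30, READ_LENGTH_BP, GENOME_SIZE_BP)
depth_check = mean_depth(pairs_for_30x, READ_LENGTH_BP, GENOME_SIZE_BP)
```

This is a raw-yield estimate; depth after trimming, mapping, and duplicate removal would be somewhat lower.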

Fig. 1

Pictures of egg, larva, cocoon, pupa, and adult of 37 breeding line strains of B. mori.

Genomic variants for each strain were identified using samtools and SvABA. A total of 23,478,741 SNVs were identified by samtools, of which 1,506,850 SNVs (variant quality under Q30 or multiallelic loci) were filtered out. Among the 21,971,891 SNVs remaining after filtering, 1,327,196 were located in CDS regions; 1,002,715 (75.551%) of these were synonymous variants and 324,481 (24.449%) were non-synonymous variants. InDel and structural variant calling with SvABA, performed on individual strains, identified an average of 622,531 InDels and 41,348 structural variants per strain. All variant calling information is available through the links given in the Methods section. To examine the evolutionary relationships among the 37 breeding line strains and p50T, phylogenetic analysis was performed using whole-genome variants from the generated sequencing data. Fig. 2 shows the phylogenetic relationships among the 37 B. mori strains reported in this study and the p50T reference strain. Of the five strains showing tri-molt characteristics, four (all except SH) showed a close evolutionary relationship, and some strains were closely related despite differences in external appearance. This suggests that externally visible characteristics are regulated by a small portion of the total genomic variants, and more research will be needed to clarify the detailed associations between genomic variants and characteristics. There have been several previous studies on the phenotypes, genetic content, and regional populations of Bombyx mori14,15. However, this is the first population-level whole-genome data set released from South Korea, and the first data set to include details of the breeding performance and phenotypic characteristics of each individual strain. Combined with the existing data sets from previous studies, it can be used to build a more extensive data resource for understanding the genetic background of silkworm phenotypes. 
The data reported in this study can also serve as a useful resource for marker development and are expected to help develop silkworm strains with desired traits in a short time through genomic breeding or genetic engineering.
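
The SNV tallies reported above are internally consistent, which can be checked with a few lines of Python. All counts are copied from the text; the variable names are hypothetical.

```python
# Counts as reported in the text.
total_called = 23_478_741   # SNVs identified by samtools
filtered_out = 1_506_850    # removed for low quality or multiallelic status
cds_snvs = 1_327_196        # retained SNVs located in CDS regions
synonymous = 1_002_715
non_synonymous = 324_481

# SNVs remaining after filtering.
retained = total_called - filtered_out

# Share of each class among CDS SNVs, as percentages.
pct_synonymous = 100 * synonymous / cds_snvs
pct_non_synonymous = 100 * non_synonymous / cds_snvs

print(f"retained SNVs: {retained:,}")
print(f"synonymous: {pct_synonymous:.3f}%  non-synonymous: {pct_non_synonymous:.3f}%")
```

The subtraction reproduces the 21,971,891 retained SNVs, and the two CDS classes sum to the CDS total with the percentages given in the text.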

Fig. 2

Cladogram of the 37 B. mori breeding line strains and the p50T reference strain, constructed in Tassel with the Neighbor-Joining method.

F1 hybrid strains obtained from 37 breeding line strains

The NIAS, RDA, Korea has produced F1 hybrids with the required phenotypes using the 37 seed strains reported in this study, and the resulting F1 hybrid strains are provided annually to local farmers. These hybrid strains are selected from several hybrid combinations and have various characteristics that respond to changes in the breeding environment or the purpose of use. Table 3 shows the breeding performance and characteristics of representative F1 hybrid strains constructed using the 37 breeding line strains. These strains have several important characteristics, the first of which is whether artificial feed can be used. The silkworm is a monophagous insect whose main diet is mulberry leaves, and producing, storing, and supplying mulberry leaves requires a great deal of labor. Because sericulture follows the production season of mulberry leaves, the breeding period within the year is limited. If artificial feed can be used, the harvested mulberry leaves can be utilized for longer and the labor required to prepare mulberry leaves is reduced. Increased production through year-round rearing can also be expected. Such strains are also very important in light of recent rapid climate change, because strains that can be reared on artificial feed can cope flexibly with changes in mulberry leaf productivity. The second is a sex-limited inheritance strain, in which sex can be distinguished by larval pattern or cocoon color. Silkworms can be sexed by examining the tail region during the 5th instar or the shape of the pupa, but sexing by larval pattern or color greatly reduces the labor required. The third is a hybrid strain that produces colored silk. 
Among the 37 breeding line strains, those producing yellow or light green cocoons have a smaller cocoon size than the general silk production strains. The hybrid strain therefore effectively improves on the low colored-silk production of the existing strains. In addition to the direct use of the colored silk itself, these strains can be used as functional strains for the carotenoids or flavonoids required for colored silk formation. The fourth is a strain that does not produce a cocoon. The breeding line strain Jam307 in this study produces very few cocoons. Only about 1.2% of individuals produce fibroin-free, sericin-only nets. Dissection of the silk gland of this strain shows that the posterior silk gland, which is important for fibroin-based filament formation, is degenerated. In the Jam307 x Jam126 hybrid strain, which produces relatively large larvae and pupae compared with Jam307, most individuals form sericin nets, and normal fibroin-containing silk is not produced. This suggests that the Jam307 characteristic of producing silk composed only of sericin, due to degeneration of the posterior silk gland, is a dominant trait. This cocoon-free hybrid strain is mainly used in applications that utilize the silkworm itself, such as cordyceps production and silkworm powder for food additives. Lastly, the most recently developed strain is a hybrid of KRSM and Jam124. Its phenotypic results are not included in Table 3 because the breeding performance evaluation has not yet been completed, but the KRSM x Jam124 hybrid strain has the following characteristics. Like B. mori KRSM, the hybrid produces light green silk and shows tri-molt characteristics, but its silk production is similar to that of general silk production strains. Fig. 3 shows the cocoons of the KRSM, Jam124, and KRSM x Jam124 hybrid strains. The cocoon size of the hybrid strain is nearly the same as that of the silk production strain Jam124. 
In addition to the increased cocoon size, the total larval period was notably shortened. Whereas KRSM and Jam124 have larval periods of 25.06 and 25.04 days.hrs, respectively, the total larval period of this hybrid strain was 20.04 days.hrs, about 20% shorter than that of the parental strains. Because a 20% reduction in production time can increase silk production as well as reduce production cost, the hybrid strain is being developed as a useful resource that can contribute to productivity improvement. In addition, the whole-genome sequences reported in this study can provide more insight into the genetic background of B. mori phenotypes and help develop modified strains for specific uses through genetic engineering.
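
The roughly 20% reduction can be reproduced by converting the days.hrs values to hours. The sketch below assumes that the notation means whole days followed by a two-digit hour count (so 25.06 is read as 25 days 6 hours); the function names are hypothetical. Values are parsed as strings so that the hour digits are not distorted by floating-point handling.

```python
def days_hrs_to_hours(value: str) -> int:
    """Convert a 'days.hrs' string such as '25.06' (25 days, 6 hours) to total hours."""
    days, hours = value.split(".")
    return int(days) * 24 + int(hours)


def percent_reduction(parent_hours: int, hybrid_hours: int) -> float:
    """Percentage by which the hybrid larval period is shorter than the parental one."""
    return 100 * (parent_hours - hybrid_hours) / parent_hours


krsm_hours = days_hrs_to_hours("25.06")    # KRSM
jam124_hours = days_hrs_to_hours("25.04")  # Jam124
hybrid_hours = days_hrs_to_hours("20.04")  # KRSM x Jam124

reduction_vs_krsm = percent_reduction(krsm_hours, hybrid_hours)
reduction_vs_jam124 = percent_reduction(jam124_hours, hybrid_hours)
print(f"vs KRSM: {reduction_vs_krsm:.1f}%  vs Jam124: {reduction_vs_jam124:.1f}%")
```

Under this reading, the hybrid larval period is about 122 and 120 hours shorter than that of KRSM and Jam124, respectively, which corresponds to the reduction of about 20% stated in the text.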

Table 3 Summary of phenotypic characteristics and breeding performance of F1 hybrid strains.
Fig. 3

Cocoons of F1 hybrid offspring of male KRSM and female Jam124. All F1 hybrid offspring were tri-molt mutants with a short larval period, and their LYG-colored cocoons were similar in size to those of normal B. mori.