Comparative genome analysis of Bacillus thuringiensis strain HD521 and HS18-1

Bacillus thuringiensis (Bt) is an important biological insecticide used to manage a range of agricultural pests through the production of toxic parasporal crystal proteins. Strain HD521 has an antagonistic effect against Rhizoctonia solani AG1IA, the causal agent of rice sheath blight. This strain carries three cry7 genes and forms bipyramidal parasporal crystals (BPCs), which have insecticidal activity against Henosepilachna vigintioctomaculata larvae (Coleoptera). Strain HS18-1 contains different types of BPC-encoding genes and is toxic to Lepidoptera and Diptera insects. Here we report the whole-genome sequencing and assembly of strains HD521 and HS18-1 and analyze their genome composition, covering virulence factors, plasmid types, insertion sequences, and prophage sequences. The genome of strain HD521 contains one circular chromosome and six circular plasmids; it encodes eight types of virulence protein factors [Immune Inhibitor A, Hemolytic Enterotoxin, S-layer protein, Phospholipase C, Zwittermicin A-resistance protein, Metalloprotease, Chitinase, and N-acyl homoserine lactonase (AiiA)], four families of insertion sequences, and six prophage sequences. The genome of strain HS18-1 contains one circular chromosome and nine circular plasmids; it encodes five types of virulence protein factors [Hemolytic Enterotoxin, S-layer protein, Phospholipase C, Chitinase, and N-acyl homoserine lactonase (AiiA)], four families of insertion sequences, and three prophage sequences. These results contribute to a deeper understanding of B. thuringiensis strains HD521 and HS18-1 at the genomic level.

were collected, and a virulence factor gene set (protein sequence set) was constructed. The annotation results for the HD521 and HS18-1 genomes and plasmids were then compared against this set by homology search to obtain a set of predicted virulence genes, and their sequences were analyzed using BLAST. Finally, the virulence factor sequence information for HD521 and HS18-1 was obtained.
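The homology screen described above can be sketched as a filter over BLAST tabular output. This is a minimal illustration, not the authors' pipeline: it assumes the standard 12-column tabular format (`-outfmt 6`), and the identity and E-value thresholds, the gene IDs and the reference names are all hypothetical placeholders, since the text does not state the cut-offs that were used.

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    query: str      # annotated gene from the strain (hypothetical ID)
    subject: str    # entry in the virulence factor protein set (hypothetical name)
    pident: float   # percent identity
    evalue: float
    bitscore: float


def parse_blast_tabular(text: str) -> List[Hit]:
    """Parse standard 12-column BLAST tabular output (-outfmt 6)."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        # Columns: 0 qseqid, 1 sseqid, 2 pident, 10 evalue, 11 bitscore
        hits.append(Hit(f[0], f[1], float(f[2]), float(f[10]), float(f[11])))
    return hits


def best_virulence_hits(hits: List[Hit],
                        min_identity: float = 40.0,   # assumed threshold
                        max_evalue: float = 1e-5      # assumed threshold
                        ) -> Dict[str, Hit]:
    """Keep, for each query gene, the passing hit with the highest bit score."""
    best: Dict[str, Hit] = {}
    for h in hits:
        if h.pident < min_identity or h.evalue > max_evalue:
            continue
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h
    return best


# Made-up rows, for illustration only.
ROWS = [
    ("gene_0001", "InhA_ref", 92.5, 790, 59, 0, 1, 790, 1, 790, 0.0, 1500.0),
    ("gene_0001", "Metalloprotease_ref", 45.0, 300, 165, 2, 10, 309, 5, 304, 1e-30, 210.0),
    ("gene_0002", "ChiA_ref", 30.0, 120, 84, 1, 1, 120, 1, 120, 1e-8, 60.0),
    ("gene_0003", "PlcA_ref", 88.0, 280, 33, 0, 1, 280, 1, 280, 2e-120, 400.0),
]
EXAMPLE = "\n".join("\t".join(str(v) for v in row) for row in ROWS)
BEST = best_virulence_hits(parse_blast_tabular(EXAMPLE))

for gene, hit in sorted(BEST.items()):
    print(gene, hit.subject, hit.pident)
```

Keeping only the best-scoring hit per gene is one simple way to assign each predicted gene to a single virulence factor type; a real analysis might also require a minimum alignment coverage.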

Results and analysis
Features of strain HD521 and HS18-1. Bt strain HD521 was first isolated from a soil sample in the United States 31. It was obtained from the Bacillus Genetic Stock Center (BGSC). Like the majority of Bt strains, strain HD521 has Gram-positive, rod-shaped cells 1. It forms maroon colonies and produces bipyramidal parasporal crystals (BPCs) during the stationary phase of its growth cycle. Unlike most Bt strains, however, its colonies produce a brown-red pigment that turns the entire colony brownish. SDS-PAGE analysis of spore-crystal mixtures showed that strain HD521 expresses a major protein band of 130 kDa, which is consistent with the analysis of its parasporal crystal genes below 32. Strain HS18-1, in contrast, was isolated from the Sichuan basin of China and is typically toxic to Lepidoptera and Diptera 33. It produces spherical parasporal crystals during the stationary phase of its growth cycle. SDS-PAGE analysis of spore-crystal mixtures showed that HS18-1 expresses two major protein bands of 130 kDa and 75 kDa 34. Identification of its insecticidal genes indicated that strain HS18-1 is rich in cry-type insecticidal crystal protein genes, including cry4Cb1, cry30Ga1, cry30Ea1, cry56Aa3, cry50Aa, cry69Ab1, cry70Aa, cry71Aa, and cry72Aa 33,34.
Virulence factors. The insecticidal active ingredients of strains HD521 and HS18-1 are mainly the insecticidal crystal proteins encoded on the plasmids. In addition, the chromosomes encode a large number of insecticidal active ingredients (Table 2), whose insecticidal mechanisms differ. Moreover, these active ingredients act synergistically.
www.nature.com/scientificreports/
Strain HD521 encodes a broad set of virulence factors: Immune Inhibitor A, Hemolytic Enterotoxin, S-layer protein, Phospholipase C, Zwittermicin A-resistance protein, Metalloprotease, Chitinase, and N-acyl homoserine lactonase (AiiA). Immune Inhibitor A is a metalloenzyme present in three copies on the chromosome of strain HD521; it can enhance toxicity to insects by inhibiting insect immune factors and hydrolyzing some insect antibacterial proteins. The chromosome of strain HD521 encodes 9 enterotoxin genes; enterotoxin is a virulence factor of Bacillus cereus that causes vomiting and diarrhea in humans. We further infer that strain HD521 may have hemolytic properties, because its chromosome encodes three subunit genes: Gamma-hemolysin component B, Tripartite hemolysin BL component L1, and Hemolysin BL lytic component L2.
The S-layer protein forms an ordered crystalline array on the surface of pathogenic bacteria that maintains cell morphology and integrity; it is a class of surface protein widely distributed in Bacillus species. The S-layer protein of Bt acts synergistically with insecticidal crystal proteins. Phospholipase C can hydrolyze phosphatidyl alcohol and phosphatidylcholine, which can damage the insect intestinal tract and, to some extent, promote the activity of insecticidal crystal proteins 35. Zwittermicin is a new broad-spectrum antibiotic that inhibits the growth of a variety of microorganisms, especially oomycetes and their related bacteria. The chromosome of Bt HD521 contains two Zwittermicin resistance genes, Zwittermicin A-resistance protein and Zwittermicin A resistance protein zmaR. We therefore infer that Bt HD521 has some resistance to Zwittermicin. Chitin, also known as shell polysaccharide, is widely found in insect exoskeletons, crustacean shells and fungal cell walls, where it acts as a protective structural skeleton. Chitinase is an enzyme that hydrolyzes chitin, and its main role here is to act synergistically with pesticides. Chitin is one of the main components of the insect midgut peritrophic membrane, which is the insect's barrier against bacteria and viruses; when chitinase destroys the peritrophic membrane, the activity of insecticidal proteins increases [36][37][38]. The chromosome of strain HD521 encodes two types of chitinase, chitinase A and chitinase D, which can hydrolyze the outer wall of insects and cause insect death.
N-Acyl homoserine lactonase (AiiA) is an enzyme that degrades N-acyl homoserine lactones (AHLs), the signaling molecules of the Gram-negative bacterial quorum sensing system, which participate in regulating the expression of pathogenic genes 39. The AiiA protein lowers the concentration of AHLs by hydrolyzing their lactone bond, which reduces the harm caused by pathogens. Previous studies have shown that AiiA enhances the resistance of Zwittermicin to soft rot 40. Therefore, the AiiA gene of strain HD521 may have a synergistic effect against Rhizoctonia solani AG1 IB. The chromosome of strain HS18-1 encodes five virulence factors: Hemolytic Enterotoxin, S-layer protein, Phospholipase C, Chitinase, and N-acyl homoserine lactonase (AiiA). However, the HS18-1 chromosome lacks subunit B, which is necessary for the hemolytic activity of the enterotoxin; it contains only subunits L1 and L2. The chromosome of strain HS18-1 encodes 8 hemolytic enterotoxins, 5 S-layer proteins, 5 phospholipases, 2 chitinases, and 5 N-acyl homoserine lactonases (AiiA). Compared with HD521, strain HS18-1 does not encode Immune Inhibitor A, Zwittermicin A-resistance protein or metalloprotease.
Plasmid analysis. Bt strain HD521 contains 6 plasmids that together carry 772 predicted genes. The smallest plasmid, pBTHD521-1, is 7042 bp long and encodes 11 functional genes; the largest, pBTHD521-6, is 314,883 bp long and encodes 256 functional genes. Two plasmids, pBTHD521-5 and pBTHD521-6, carry insecticidal crystal protein genes, whereas pBTHD521-1, pBTHD521-2, pBTHD521-3, and pBTHD521-4 carry none.
The whole length of plasmid pBTHD521-5 is 253,580 bp, and it encodes three cry7 genes, cry7Da1, cry7Ga2 and cry7Fb3 (Fig. 2A); cry7Ga2 is located on the sense strand, whereas cry7Fb3 and cry7Da1 are located on the antisense strand. An IS6 family insertion sequence is located downstream of cry7Fb3 and cry7Da1, and an IS231B family insertion sequence is located upstream of cry7Ga2. Plasmid pBTHD521-5 also encodes the plasmid replication protein RepX and the conjugal transfer protein TraG that means  2B).
As in HD521, the insecticidal genes of HS18-1 are flanked by mobile elements: an IS4 family insertion sequence is located downstream of cry30Ga1 + orf2, an IS4 sequence is located upstream of cry71Aa1 + orf2, and Tn3 and IS231C are located upstream and downstream of cry30Ea1 + orf2, respectively. IS231C is also reported in both the upstream and downstream sequences of cry30Ea1 + orf2, and an IS231 sequence is located downstream of cry4Cb1. This suggests that the cry genes combine with mobile elements to form genomic islands in the genome.
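The upstream and downstream relationships described above depend on the strand of each gene. The following minimal sketch shows one way such flanking elements could be assigned from annotation coordinates. It is an illustration only: the `Feature` class, the 5 kbp window and all coordinates are hypothetical and are not taken from the HD521 or HS18-1 annotations.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Feature:
    name: str
    start: int          # 1-based, inclusive (assumed convention)
    end: int
    strand: str = "+"


def flanking_elements(gene: Feature,
                      elements: List[Feature],
                      window: int = 5000) -> Dict[str, List[str]]:
    """Classify mobile elements within `window` bp of a gene as upstream or downstream.

    "Upstream" is relative to the gene's own strand, so the sides are
    swapped for a gene on the antisense ("-") strand.
    """
    result: Dict[str, List[str]] = {"upstream": [], "downstream": []}
    for el in elements:
        if el.end < gene.start and gene.start - el.end <= window:
            side = "upstream" if gene.strand == "+" else "downstream"
        elif el.start > gene.end and el.start - gene.end <= window:
            side = "downstream" if gene.strand == "+" else "upstream"
        else:
            continue  # overlapping the gene or too far away
        result[side].append(el.name)
    return result


# Hypothetical coordinates, for illustration only.
GENE = Feature("cry_example", 10_000, 13_500, "+")
ELEMENTS = [
    Feature("IS_a", 8_000, 9_500),
    Feature("IS_b", 14_000, 15_600),
    Feature("IS_c", 40_000, 41_000),
]
FLANKS = flanking_elements(GENE, ELEMENTS)
print(FLANKS)
```

For a gene on the antisense strand, the same two elements would swap sides, which matters when comparing genes such as cry7Ga2 (sense strand) with cry7Fb3 and cry7Da1 (antisense strand).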

Insertion sequence analysis. An insertion sequence (IS) is a mobile element that contributes to genomic plasticity; its main feature is transposition between different sites within the genome. A basic IS element comprises a site-specific recombinase (transposase) and flanking repetitive DNA sequences 41. Different IS elements differ greatly in transposition mechanism and target-site specificity, for example IS7 and IS30 42,43. Comparing the IS in the genomes of the two strains, HD521 and HS18-1 (Table 3), we found that the families and numbers of IS differ markedly in their genomic distribution. Strain HS18-1 has the larger number of IS and the more diverse set of IS families.
IS sequence analysis identified four IS families in the genome of HD521. The IS6 and IS110 families each have one copy, whereas the IS200_IS605_ssgr_IS1341 family has two copies (Table 3). Bt HD521 contains 6 plasmids and 4 types of IS sequences. The plasmids pBTHD521-1, pBTHD521-2, pBTHD521-3 and pBTHD521-4 contain no IS; these are small plasmids of 7 kbp, 50 kbp, 71 kbp and 71 kbp, respectively. Analysis of the plasmid-encoded transposase genes revealed that pBTHD521-5 and pBTHD521-6 contain 4 and 3 IS families, respectively, mainly the IS200_IS605_ssgr_IS1341, IS6, IS4_ssgr_IS231, and IS200_IS605 families (Table 4). This distribution shows that the IS of Bt tend to occur on the larger plasmids, which suggests that the endogenous large plasmids of Bt evolve faster than the small ones. Overall, HD521 carries 4 IS on the chromosome and 17 IS on the plasmids. The families shared by the plasmids and the chromosome are IS200_IS605_ssgr_IS1341 and IS6. In addition, the chromosome contains two specific families, IS607 and IS110, and the plasmids contain a specific IS231 family with two subfamilies, IS231B and IS231E. This suggests that the plasmids evolve faster than the chromosome.
By analysing the HD521 whole-genome annotation results together with ISfinder data, we found a complete IS607 family IS in the HD521 genome. It does not contain an inverted repeat (IR) region, but it encodes two ORFs: one is a 1131 bp transposase gene, and the other encodes a haloacid dehalogenase, which carries out dehalogenation, phosphoryl transfer and hydrolysis of phosphate salts by forming covalent enzyme intermediates.
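The comparison of IS families between the chromosome and the plasmids of HD521 amounts to simple set operations. The sketch below transcribes the family names mentioned in the paragraph above (not the full contents of Tables 3 and 4, which are not reproduced here), so the resulting lists are only as complete as that transcription; the variable and function names are illustrative.

```python
from typing import Dict, List, Set

# Family names as mentioned in the text for HD521 (transcribed, not from the tables).
CHROMOSOME: Set[str] = {"IS200_IS605_ssgr_IS1341", "IS6", "IS607", "IS110"}
PLASMIDS: Set[str] = {"IS200_IS605_ssgr_IS1341", "IS6", "IS4_ssgr_IS231", "IS200_IS605"}


def compare_families(chromosome: Set[str], plasmids: Set[str]) -> Dict[str, List[str]]:
    """Split IS families into shared, chromosome-only and plasmid-only groups."""
    return {
        "shared": sorted(chromosome & plasmids),
        "chromosome_only": sorted(chromosome - plasmids),
        "plasmid_only": sorted(plasmids - chromosome),
    }


COMPARISON = compare_families(CHROMOSOME, PLASMIDS)
for group, families in COMPARISON.items():
    print(group, families)
```

With these inputs the shared group contains IS200_IS605_ssgr_IS1341 and IS6, and the chromosome-only group contains IS607 and IS110, matching the text. The plasmid-only group contains the IS231 family and also IS200_IS605, because that family is listed for the plasmids but not for the chromosome.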
We also found a complete IS110 family IS in the HD521 genome; like the IS200_IS605 and IS607 families, it does not contain an IR region. Unlike them, however, the IS110 family contains a conserved amino-terminal region of the pilin gene inverting protein (PIVML). Through analysis we found that IS110 family
The first type of IS, IS200_IS605_ssgr_IS1341, has 3 copies on plasmid pBTHD521-5, with an ORFB structure similar to that of the IS110 family; the C terminus of ORFB has four typical cysteine residues that can bind zinc. The C terminus therefore serves mainly as a DNA binding site during transposition 44. By comparing the genes upstream and downstream of the IS, we found that the first IS1341 was inserted into C1qtnf9 (C1q and tumor necrosis factor related protein 9, a protein of the house mouse). The 3′ end of this IS1341 sequence can form a complete ORF with the downstream C1qtnf9, so the insertion of the IS1341 sequence brings an exogenous gene into the HD521 genome. In contrast, the insertion of the second IS neither changed a functional gene nor introduced a new heterologous gene.
According to the transposase gene and ISsaga results, plasmid pBTHD521-6 contains 2 copies of the IS200_IS605_ssgr_IS1341 family, 1 copy of the IS200_IS605 family, and 5 copies of the IS4_ssgr_IS231 family. The IS4_ssgr_IS231 family includes the subfamily insertion sequences IS231B (238,504-235,158 bp) and IS231E (267,744-269,823 bp); these generally carry a transposase gene whose ORF contains an integrase binding domain, a helix-turn-helix (HTH) DNA binding site, and a DDE motif. The DDE motif has three carboxylic acid residues that bind the metal ions involved in catalyzing DNA cleavage and in transcriptional regulation. IS231 family elements also carry some enzymes involved in amino acid and nucleic acid metabolism, such as proline dehydrogenase and ribose triphosphate deoxyribonucleoside reductase.
The HS18-1 genome contains 42 copies of transposase genes: the IS3_ssgr_IS150 family accounts for 36 copies, and IS200_IS605_ssgr_IS1341, IS607 and IS110 have two copies each (Table 3). The transposase gene of the IS607 family contains a helix-turn-helix DNA binding domain at the N terminus and four conserved cysteine residues at the C terminus, and the IS110 family has a typical DEDD motif. IS150 of the IS3 family resembles IS607 in having a helix-turn-helix DNA binding domain at the N terminus, but it contains an integrase core binding domain at its C terminus. The pHS18-2 plasmid encodes 62 transposase genes and contains eight IS families (Table 5). The IS200_IS605_ssgr_IS1341 family contains a DNA-binding zinc finger at the C terminus of the transposase; its transposase gene shares only 82% homology with other sequences in the NCBI database, which suggests that it may be a new type of IS. The IS3_ssgr_IS3 family shares only 84% homology with the transposase gene of Bacillus cereus; its DNA binding domain is a helix-turn-helix structure, and it contains a DNA integrase gene, so it may also be a new member of the IS3 family. The IS4_ssgr_IS231 family has 28 copies, IS6 has 13 copies, IS3_ssgr_IS150 has 14 copies, and IS1182 and IS110 have two copies each. The pHS18-4 plasmid encodes five copies of the IS4_ssgr_IS231 family and contains a cry56Aa3 + orf2 gene, with an IS231 insertion sequence both upstream and downstream of it. The pHS18-1 plasmid contains 15 transposase genes and encodes three IS families, of these three families
5 has homology with the prophage sequence of 11 species; 17 of the 24 CDS sequences participate in the coding of phage functional proteins, and 7 CDS sequences participate in the coding of hypothetical proteins.
Functional analysis of the CDS encoded by the prophage sequences showed that sequence 1 and sequence 2 contain an attachment site left (attL) and an attachment site right (attR); these sites are specific for integration of the phage DNA into, or its excision from, the Bt HD521 genome, and the prophage sequence lies between them. The CDS encode components needed for the phage to integrate into or excise from the bacterial genome: sequence 1 encodes Endolysin, a DNA recombination and exonuclease gene, Site-specific recombinase, Exonuclease, and DNA polymerase I. Sequence 2 encodes an Integrase, Resolvase, Site-specific recombinase, Cytokine tail protein, head-tail adaptor, Capsid protein, and bacterial proteins, among others. Sequence 3 mainly encodes phage-related proteins, such as Tail fiber protein, Calcineurin phosphoesterase, Glycosyltransferase, Collagen triple helix repeat protein, bacterial proteins, and some incomplete phage proteins. Sequence 4 mainly encodes DNA integration and recombination proteins and a phage tail protein. Sequence 5 has relatively few CDS that encode phage functional proteins and contains no recombination-related enzymes or phage structural proteins. Sequence 6 has even fewer phage-related CDS, with only one phage minor tail protein. Sequences 3, 4, 5, and 6 also encode some bacterial-type proteins.
Analysis of the HS18-1 chromosome with PHASTER showed that it contains 3 lysogenic phage genome regions (Table 7). Sequence 1 is complete; it is 27.1 kbp long (2,506,943-2,534,097 bp), has a GC content of 35.77%, and encodes 37 protein sequences. Sequence 2 is incomplete; it is 20.6 kbp long (2,489,098-2,509,752 bp), has a GC content of 32.54%, and encodes 24 protein sequences. Sequence 3 is 43.2 kbp long (1,376,234-1,419,448 bp), has a GC content of 34.38%, and encodes 54 protein sequences. Sequence 1 has homology with the prophage sequences of 38 species, and 54.28% of its protein sequences can be aligned with PHAGE_Bacill_phIS3502_NC_019502. Sequence 1 encodes 35 ORFs and the phage-specific attachment sites attL and attR. Of these ORFs, 25 encode prophage proteins, such as transcription regulators of the phage ArpU family, site-specific integrases, phage capsid proteins and phage tail assembly proteins; 5 encode phage hypothetical proteins; and 5 encode non-phage hypothetical proteins. Sequence 2 has homology with the prophage sequences of 9 species, and 45.45% of its protein sequences can be aligned with PHAGE_Bacill_phBC6A52_NC_004821. Sequence 2 encodes 23 ORFs and the phage-specific attachment sites attL and attR. Of these ORFs, 9 encode prophage proteins, such as DNA integration/recombination/insertion protein, DEAD/DEAH box helicase protein, Helix-turn-helix
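The lengths of the three HS18-1 prophage regions can be checked directly from the coordinates reported above. The sketch below assumes 1-based inclusive coordinates, which is an assumption about the reporting convention rather than something stated in the text; the function names are illustrative.

```python
from itertools import combinations
from typing import Dict, Tuple

Region = Tuple[int, int]

# Coordinates of the HS18-1 prophage regions as reported in the text.
REGIONS: Dict[str, Region] = {
    "sequence 1": (2_506_943, 2_534_097),
    "sequence 2": (2_489_098, 2_509_752),
    "sequence 3": (1_376_234, 1_419_448),
}


def length_bp(region: Region) -> int:
    """Length of a region, assuming 1-based inclusive coordinates."""
    start, end = region
    return end - start + 1


def overlap_bp(a: Region, b: Region) -> int:
    """Number of positions shared by two regions (0 if they do not overlap)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


LENGTHS = {name: length_bp(region) for name, region in REGIONS.items()}
OVERLAPS = {
    (a, b): overlap_bp(REGIONS[a], REGIONS[b])
    for a, b in combinations(REGIONS, 2)
}

for name, n in LENGTHS.items():
    print(f"{name}: {n} bp ({n / 1000:.3f} kbp)")
for pair, n in OVERLAPS.items():
    print(pair, n)
```

Under this convention the computed lengths (27,155 bp, 20,655 bp and 43,215 bp) are close to the reported 27.1, 20.6 and 43.2 kbp. Taken at face value, the reported coordinates of sequences 1 and 2 also share about 2.8 kbp, which would be worth confirming against Table 7.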

Discussion
By analysing the genomes of strains HD521 and HS18-1, we found that both encode a rich set of virulence factors, such as S-layer protein, enterotoxin, phospholipase, chitinase, and AiiA. These factors are important for the insecticidal activity and environmental adaptability of Bt strains. Immune Inhibitor A is a metalloproteinase secreted by Bt that degrades antibacterial peptides produced by insects, allowing the bacterium to escape the host's immune system 45,46. AiiA hydrolyzes AHLs (acylated homoserine lactones), which are bacterial quorum sensing signaling molecules; its role is to inhibit a variety of bacteria and enhance Bt's competitive advantage in the insect gut 47,48. Chitinase is a soluble extracellular protein and an insecticidal active substance that helps Bt strains degrade chitin in the peritrophic membrane of the insect intestine, so that they can enter the blood cavity through the perforated intestinal tract and cause insect septicaemia, which further enhances the effect of the insecticidal proteins of Bt strains 38,49. The genomes of strains HD521 and HS18-1 also include numerous plasmids: HD521 contains 6 plasmids and HS18-1 contains 9. For instance, plasmid pBTHD521-5 of strain HD521 encodes three cry7-like insecticidal crystal protein genes, cry7Fb3, cry7Ga2 and cry7Da1.
The plasmids of strain HS18-1 encode 10 insecticidal crystal protein genes, which are distributed across plasmids pHS18-2, pHS18-4 and pHS18-9. These plasmids carry a large number of transcriptional regulatory factors and genes related to the ABC (ATP-binding cassette) protein transport system. The presence of these genes provides an important theoretical basis for understanding how the parasporal crystal genes are positively and negatively regulated. We also found that the insecticidal genes carried by Bt are almost entirely located on large plasmids; an exception is plasmid pHS18-9 of HS18-1, which is only 7386 bp long yet encodes a cry54Ba gene.
Analysis of horizontal gene transfer elements in the genomes of HD521 and HS18-1 showed that the two genomes have abundant IS, including IS200_IS605, IS3, IS4, IS6, IS110 and IS1182, as well as the complex transposon Tn3 and conjugal transfer system proteins. IS605 belongs to the IS200/IS605 family and is widely distributed in Helicobacter pylori; its termini are direct repeats rather than inverted repeats. IS605 often forms a complex with IS200. IS200 was originally found in Salmonella typhimurium, and its terminal inverted repeat sequence contains a transposase terminator and blocks ORF transcription 50. Moreover, these IS flank the insecticidal crystal protein genes: an IS3 family insertion sequence lies downstream of cry30Ga1 + orf2, IS4 and IS6 family insertion sequences lie upstream and downstream of cry71Aa1 + orf2, and an IS1182 sequence lies downstream of cry72Aa1 + orf2. A transposition unit in which insertion sequences and a transfer system are linked with insecticidal crystal protein genes on a plasmid favours horizontal transfer of these genes between the plasmids of different strains and between different plasmids of the same strain, which ultimately plays an important role in the exchange of genetic material and the population evolution of B. thuringiensis.
The IS families and copy numbers of HD521 and HS18-1 differ markedly in their genomic distribution, which may reflect the different living environments and population evolution of the two strains. In order to survive and multiply, some strains have enhanced their adaptability to the environment through millions of years of evolution. Insertion sequences have reached a dynamic balance between adverse and beneficial effects on the host bacteria. Their transposition mediates genome rearrangement and activates or silences the expression of functional genes, which may cause fatal harm to the host bacteria but may also enable them to acquire new functions and adapt better to the external environment 51. For example, the number of IS4 family insertion sequences differs hugely between genomes, which may be related mainly to the living environment and the needs of evolution; these observations indicate that insertion sequences play a very important role in the flexibility and evolution of the genome, rather than being simple "selfish genes" 52.
Differences in genome distribution are also accompanied by differences in function. Strain HD521 inhibits the hyphal growth of the rice sheath blight pathogen Rhizoctonia solani AG1 IB. In addition, colonies of HD521 produce a brown-red pigment that causes colonies of AG1 IB to appear brown-red, which may be due to antagonism between the microorganisms. It has been reported that Bacillus thuringiensis can inhibit a variety of plant diseases caused by filamentous fungi and other plant pathogens. When multiple microorganisms grow together, one microorganism may produce one or several specific secondary metabolites during assimilation that change its microenvironment, thereby inhibiting or even killing another microorganism 53.
Strain HS18-1 is highly toxic to lepidopteran and dipteran pests. Analysis of the insertion sequences on its plasmids showed that plasmids containing insecticidal crystal protein genes often contain abundant insertion sequences, and that these insertion sequences often form a transposable unit with the insecticidal crystal protein gene. This suggests that the evolution and transfer of insecticidal crystal protein genes between plasmids of different strains, or between different plasmids of the same strain, is mediated by insertion sequences.
Scientific Reports | (2021) 11:16590 | https://doi.org/10.1038/s41598-021-96133-w
Plasmid analysis showed that pHS18-2 is the plasmid with the most insecticidal crystal protein genes: cry30Ga1 + orf2, cry71Aa1 + orf2, cry72Aa1 + orf2, cry70Aa1, cry30Ea1 + orf2, cry69Ab1, cry50Aa1 + orf2, and cry4Cb1. Among them, an IS4 family insertion sequence is located upstream of cry71Aa1 + orf2, IS6 is located downstream of cry71Aa1 + orf2 and upstream of cry72Aa1 + orf2, and IS1182 is located downstream of cry72Aa1 + orf2. We analyzed the expression of cry71Aa1 and cry72Aa1 and found that these two genes can produce diamond-shaped insecticidal crystal proteins. Insecticidal bioassays showed that their crystal proteins have good activity against larvae of the lepidopteran pests cotton bollworm, beet armyworm, and diamondback armyworm. The insecticidal crystal proteins produced by Bacillus thuringiensis are encoded by genes of different sizes; large-molecular-weight Cry proteins are generally encoded by genes longer than 2 kbp. In this study, cry71Aa1 and cry72Aa1 belong to this type of insecticidal crystal gene. When expressed, these large-molecular-weight cry genes produce proteins that form independent crystal structures; for example, cryIVD genes produce irregular hexagonal crystals, and cry8 genes produce spherical crystals 54,55.
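The relationship between protein size and gene length mentioned above can be illustrated with a rough calculation. The sketch below uses the common rule of thumb of about 110 Da per amino acid residue, which is an assumption and not a value from this study, and applies it to the 130 kDa and 75 kDa bands reported for HS18-1 by SDS-PAGE. It ignores any post-translational processing, so it gives only an order-of-magnitude estimate of the coding sequence length.

```python
# Assumption: average mass of one amino acid residue in a protein (rule of thumb).
AVG_RESIDUE_MASS_DA = 110.0


def estimated_cds_length_bp(protein_mass_kda: float,
                            avg_residue_da: float = AVG_RESIDUE_MASS_DA) -> int:
    """Estimate coding sequence length (bp) from protein mass (kDa).

    residues ~ mass / average residue mass; each residue needs one codon
    (3 bp), plus 3 bp for the stop codon.
    """
    residues = round(protein_mass_kda * 1000.0 / avg_residue_da)
    return residues * 3 + 3


# Band sizes reported for HS18-1 in the Results (kDa).
ESTIMATES = {mass: estimated_cds_length_bp(mass) for mass in (130.0, 75.0)}

for mass, bp in ESTIMATES.items():
    print(f"{mass:.0f} kDa -> about {bp} bp")
```

Under these assumptions both bands correspond to coding sequences longer than 2 kbp (roughly 3.5 kbp and 2.0 kbp), which is consistent with the statement that large-molecular-weight Cry proteins are encoded by genes above 2 kbp.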
To date, thousands of Bt strains have been identified and isolated, but only 24 of them have been fully sequenced. The availability and analysis of the complete genome sequences of strains HD521 and HS18-1 will lay a foundation in the Bt genome database for further study of the generation and regulatory mechanisms of cry genes. In summary, whole-genome sequencing and comparative analysis of both strains (HD521 and HS18-1) provide a comprehensive view of their genomic diversity and can serve as genomic data support for further strain improvement.