Introduction

The growth of most microorganisms is influenced by physical factors such as temperature, water activity, pH, pressure, salinity, and oxygen concentration, as well as by chemical factors such as the availability of nutrients (e.g., carbon and nitrogen)1,2,3,4. Microorganisms are usually classified by optimal growth temperature as psychrophiles, mesophiles, thermophiles, and hyperthermophiles, which grow best at temperatures of ≤15 °C, 15 °C–45 °C, >45 °C, and ≥80 °C, respectively5. These classes also differ in the amino acid composition, structure, and thermostability of their proteins6. Growth temperature seems to be related to genomic features; one study showed that proteins in thermophiles (growing best at temperatures of >45 °C) are shorter on average than their homologs in mesophiles (15 °C–45 °C), whereas the proportion of purine bases in the coding strand is higher in the former than in the latter7. Environmental factors other than temperature also affect genome size: for example, the small genomes of prokaryotes are thought to reflect adaptation to strong selective pressures in large microbial populations, while genome size in geophytes was found to be positively correlated with early flowering and with a tendency to grow under humid conditions8,9.
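The temperature classes above can be expressed as a simple lookup. The following is a minimal sketch, not part of the original analysis; the function name is hypothetical, and the handling of the boundaries (15 °C assigned to psychrophiles, 80 °C and above to hyperthermophiles) is an assumption made to keep the classes non-overlapping.

```python
def classify_by_optimum(t_opt_c: float) -> str:
    """Return a thermal class for an optimal growth temperature in °C.

    Boundary handling is an assumption: exactly 15 °C counts as
    psychrophile and 80 °C or more counts as hyperthermophile.
    """
    if t_opt_c >= 80:
        return "hyperthermophile"
    if t_opt_c > 45:
        return "thermophile"
    if t_opt_c > 15:
        return "mesophile"
    return "psychrophile"


# Example: the midpoint of a 50-55 °C optimum falls in the thermophile class.
example_class = classify_by_optimum(52.5)
```

Under these assumptions, an organism whose optimum lies at 50 °C–55 °C is classified as a thermophile.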

Extremely halophilic archaea (haloarchaea) are usually found in hypersaline environments such as salt lakes and the crystallizer ponds of artificial marine solar salterns, in salty fermented foods and salted hides10,11, and on avian feathers12. The growth temperature of haloarchaeal type strains ranges from −1 °C to 62 °C, with few growing at temperatures >60 °C (see Supplementary Information). The genus Natrinema in the family Natrialbaceae includes eight known species of haloarchaea: Natrinema altunense, Nnm. ejinorense, Nnm. gari, Nnm. pallidum, Nnm. pellirubrum, Nnm. salaciae, Nnm. soli, and Nnm. versiforme13,14,15,16,17,18,19. In this study we describe strain CBA1119T, isolated from solar salt, which has the highest growth temperature and one of the largest genomes among all haloarchaea. We identified and characterized thermophilic strain CBA1119T and investigated the relationship between two strain-specific features, namely growth temperature and genome size.

Results and Discussion

Polyphasic taxonomic analysis (see Supplementary Information) revealed that strain CBA1119T is a novel member of the genus Natrinema. Interestingly, strain CBA1119T grew at temperatures of 20 °C–66 °C, with optimal growth at 50 °C–55 °C. Of the four strains with an optimal growth temperature >50 °C, three belonged to the family Haloferacaceae and one was strain CBA1119T, which belongs to the family Natrialbaceae (Fig. 1a). The maximum growth temperatures of haloarchaea varied within each family (Fig. 1b). Notably, no strains belonging to the family Halococcaceae grew at temperatures >50 °C, and only strains belonging to the family Natrialbaceae, including strain CBA1119T, had a maximum growth temperature >60 °C. The maximum and optimal growth temperatures of strain CBA1119T were the highest recorded to date among haloarchaea. Environmental temperature underlies the evolution of various biological phenomena such as the density of hydrogen bonds in nucleic acids20.

Figure 1

Comparison of the highest optimal (a) and maximum (b) growth temperatures and genome sizes (c) among haloarchaeal species. Strain CBA1119T has the highest optimal and maximum growth temperatures and the third largest genome among haloarchaeal type strains. Red circles indicate strain CBA1119T.

General genomic features of strain CBA1119T are described in the Supplementary Information (Supplementary Tables S2 and S3; Supplementary Fig. S2b). The ways in which the microbial genome is affected by environmental factors can be understood through pan-genome comparisons21. The numbers of pan-genome orthologous groups (POGs) and strain-specific POGs (singletons) were compared among strain CBA1119T and seven species of the genus Natrinema (Fig. 2). The flower plot showed that strain CBA1119T had the largest number of singletons among Natrinema species. The number of singletons in strain CBA1119T was 1.4 times that in Nnm. salaciae JCM 17869T (which had the second largest number) and four times that in Nnm. altunense AJ2T (which had the smallest number). The heat map based on gene content also showed that strain CBA1119T had more exclusive POGs than the related species (Fig. 3). Additionally, each genome within the genus Natrinema had a distinct KEGG pathway profile based on POGs (Table 1). Strain-specific enzymes of strain CBA1119T mapped to the KEGG pathways for propanoate metabolism, geraniol degradation, fatty acid biosynthesis, metabolism and degradation, and valine, leucine and isoleucine degradation, with P values of zero.
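The core and strain-specific POG counts summarized in a flower plot can be derived from per-strain sets of POG identifiers. The sketch below is illustrative only: the function name, strain labels, and POG identifiers are hypothetical toy values, and it is not the procedure used by the comparative genomics platform cited in the Methods.

```python
from collections import Counter


def core_and_singletons(pogs_by_strain):
    """Split POGs into the core set and per-strain singletons.

    pogs_by_strain maps a strain label to the set of POG identifiers
    detected in that genome.
    """
    # Count in how many genomes each POG occurs.
    occurrence = Counter(
        pog for pogs in pogs_by_strain.values() for pog in set(pogs)
    )
    n_strains = len(pogs_by_strain)
    # Core POGs are present in every genome.
    core = {pog for pog, n in occurrence.items() if n == n_strains}
    # Singletons are present in exactly one genome.
    singletons = {
        strain: {pog for pog in set(pogs) if occurrence[pog] == 1}
        for strain, pogs in pogs_by_strain.items()
    }
    return core, singletons


# Hypothetical toy input: three strains and five POG identifiers.
toy_pogs = {
    "strain_A": {"pog1", "pog2", "pog3", "pog4"},
    "strain_B": {"pog1", "pog2", "pog5"},
    "strain_C": {"pog1", "pog3"},
}
toy_core, toy_singletons = core_and_singletons(toy_pogs)
```

The size of each singleton set corresponds to one petal of the flower plot, and the size of the core set to its center.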

Figure 2

Flower plot showing strain-specific and core POGs of eight Natrinema species.

Figure 3

Heat map based on gene content. A dendrogram was generated using Jaccard coefficients and unweighted pair-group method with arithmetic mean clustering. Blue and red indicate present and absent genes, respectively. Values in brackets indicate the number of POGs in each strain.
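The clustering behind such a dendrogram can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the software actually used: it computes Jaccard distances (one minus the Jaccard coefficient) between gene-content sets and merges clusters by UPGMA, returning the tree as nested tuples. All names and the toy gene sets are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a ∩ b| / |a ∪ b| for two sets of gene identifiers."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def upgma(gene_sets):
    """Cluster strains by UPGMA on Jaccard distances.

    gene_sets maps a strain label to its set of gene (POG) identifiers.
    The tree is returned as nested tuples of strain labels.
    """
    sizes = {name: 1 for name in gene_sets}
    dist = {
        frozenset(pair): jaccard_distance(gene_sets[pair[0]], gene_sets[pair[1]])
        for pair in combinations(gene_sets, 2)
    }
    while len(sizes) > 1:
        # Pick the closest pair of clusters and merge them.
        closest = min(dist, key=dist.get)
        x, y = sorted(closest, key=str)
        merged = (x, y)
        nx, ny = sizes.pop(x), sizes.pop(y)
        # The distance to every other cluster is the size-weighted mean.
        for other in sizes:
            dx = dist.pop(frozenset((x, other)))
            dy = dist.pop(frozenset((y, other)))
            dist[frozenset((merged, other))] = (nx * dx + ny * dy) / (nx + ny)
        del dist[closest]
        sizes[merged] = nx + ny
    return next(iter(sizes))


# Hypothetical toy input: A and B share most genes, C is divergent.
toy_sets = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 5}, "C": {1, 6, 7}}
toy_tree = upgma(toy_sets)
```

For real data, an established implementation of average-linkage hierarchical clustering would normally be preferred; the pure-Python version here only makes the averaging step explicit.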

Table 1 Strain-specific POGs listed on the KEGG pathway (P < 0.05).

For genome size and growth temperature comparisons among haloarchaeal type strains, information on the strains was obtained from the GenBank database and previous studies, and is shown in Supplementary Table S4. Genome size comparison at the family level revealed that most haloarchaea (104/128 species) had a genome between 3.0 and 4.5 Mb in size, with the family Natrialbaceae having the largest average genome size (Fig. 1c). Interestingly, only three strains had a genome size >5 Mb, including strain CBA1119T. Thus, in addition to a high growth temperature, strain CBA1119T had an unusually large genome. Haloarchaeal species with a genome >5 Mb are uncommon; only two such type strains (and three in total) are found in the GenBank database. Strain CBA1119T had the third largest genome among haloarchaeal type strains (and the fourth largest among all haloarchaeal strains). Genome size was shown to be related to COG categories and pathways in bacteria; COG categories related to secondary metabolism and energy conversion were more highly represented in larger genomes, as were KEGG categories related to various cellular processes and metabolism, with the exception of nucleotide metabolism22. Free-living bacteria with a genome size >6 Mb, such as Bacteroides thetaiotaomicron and Streptomyces avermitilis, can grow in various environments and use a wide range of substrates for energy production. Thus, strain CBA1119T, with its large genome, may be capable of growing under different conditions and can potentially utilize different substrates to produce energy. Genome size increases with the level of environmental instability; that is, large genomes are also more resistant to environmental perturbations than smaller ones23. It remains to be determined whether this applies to strain CBA1119T.
Clarifying the genomic and environmental factors that affect growth temperature and genome size can provide insight into environment-microbe interactions and evolutionary adaptations of various microorganisms, while additional studies on the enzymes of strain CBA1119T can reveal new tools for industrial biotechnology applications.

Materials and Methods

Isolation of archaeal strain

Strain CBA1119T was isolated from unrefined solar salt obtained from a salt field (34.587738 N, 126.105372 E) in the Republic of Korea and aerobically cultured in DBCM2 medium (JCM medium no. 574; 833 ml MDS salt water [240 g NaCl, 30 g MgCl2∙6H2O, 35 g MgSO4∙7H2O, 7 g KCl, 5 ml 1 M CaCl2 solution per liter], 1 ml FeCl2 solution [10 ml 25% HCl, 1.5 g FeCl2∙4H2O per liter], 1 ml trace element solution [70 mg ZnCl2, 100 mg MnCl2∙4H2O, 6 mg H3BO3, 190 mg CoCl2∙6H2O, 2 mg CuCl2∙2H2O, 24 mg NiCl2∙6H2O, 36 mg Na2MoO4∙2H2O per liter], 0.25 g peptone [Oxoid, Cheshire, UK], 0.05 g yeast extract [BD Biosciences, Franklin Lakes, NJ, USA], 5 ml 1 M NH4Cl, 3 ml vitamin solution [3 mg biotin, 4 mg folic acid, 50 mg pyridoxine·HCl, 33 mg thiamine·HCl, 10 mg riboflavin, 33 mg nicotinic acid, 17 mg DL-calcium pantothenate, 17 mg vitamin B12, 13 mg para-aminobenzoic acid, 10 mg lipoic acid per liter], 10 ml of 1 M sodium pyruvate solution, 2 ml potassium phosphate buffer [417 ml 1 M K2HPO4 and 83 ml 1 M KH2PO4 per liter], and 50 ml 1 M Tris-HCl, pH 7.5 per liter) at 37 °C for 4 weeks. To obtain a pure culture, a single colony was transferred repeatedly to the agar medium.

Phenotypic, chemotaxonomic, and phylogenetic analyses

Phenotypic tests were performed according to the minimal standards for description of new taxa in the order Halobacteriales24. Cell morphology and size were examined by field emission transmission electron microscopy (Chuncheon Center, Korea Basic Science Institute, Korea). Gram staining was performed as previously described25. For comparative phenotypic analyses, reference strains were selected based on the relatedness of 16S rRNA gene sequences (>97%). For this purpose, Nnm. soli LMG 29247T18, Nnm. salaciae JCM 17869T17, and Nnm. ejinorense JCM 13890T14 obtained from the Japan Collection of Microorganisms (JCM) or Belgian Coordinated Collections of Microorganisms (BCCM) were cultured at 37 °C in DBCM2 medium. Growth at different temperatures (4 °C, 15 °C–60 °C at intervals of 5 °C, and 61 °C–70 °C at intervals of 1 °C), NaCl concentrations (0–30% [w/v] at intervals of 5%), pH values (5.0–11.0 at intervals of 1.0), and Mg2+ concentrations (0, 5, 10, 20, 50, 100, 200, and 500 mM) was tested for 4 weeks using DBCM2 medium as the basal medium. pH was adjusted by adding the following buffers: 10 mM 2-(N-morpholino)-ethanesulfonic acid (MES) (pH 5–6), 1,3-bis[tris(hydroxymethyl)methylamino]propane (Bis-TRIS propane) (pH 7–9), or N-cyclohexyl-3-aminopropanesulfonic acid (CAPS) (pH 10–11). Anaerobic growth in the presence of 0.5% L-arginine, trimethylamine-N-oxide (TMAO), dimethyl sulfoxide (DMSO), or 30 mM nitrate was evaluated on DBCM2 medium at 37 °C in an anaerobic chamber (Coy Laboratory Products, Grass Lake, MI, USA) with an N2:CO2:H2 (90:5:5, v/v/v) atmosphere. Catalase and oxidase activities26 as well as the hydrolysis of starch and casein27 and of Tween 40 and Tween 8028 were evaluated according to established protocols. Antibiotic susceptibility was tested on DBCM2 medium using antibiotic discs with ampicillin (10 μg per disc), erythromycin (15 μg), gentamicin (10 μg), kanamycin (30 μg), nalidixic acid (30 μg), rifampicin (10 μg), and streptomycin (10 μg).
The ability to use various substrates as a sole carbon and energy source, and acid production from them, were determined in HMD medium29. A total of 20 carbon sources were tested: D-fructose, D-galactose, D-mannitol, D-mannose, D-sorbitol, D-xylose, fumarate, glycerol, maltose, pyruvate, starch, succinate, sucrose, L-alanine, L-arginine, L-aspartate, L-glutamate, L-lysine, L-malate, and L-sorbose. Polar lipids from strain CBA1119T were extracted, analyzed, and compared with those of the three reference strains as previously described30. DNA-DNA hybridization (DDH)31 was performed to determine the genetic relationship between strain CBA1119T and the three reference strains. To determine the taxonomic identity based on the 16S rRNA gene sequence, chromosomal DNA was extracted using a commercial DNA extraction kit (iNtRON Biotechnology, Sungnam, Korea) and the 16S rRNA gene was amplified using PCR PreMix (iNtRON Biotechnology) with universal primers 0018F and 1518R32. Amplified 16S rRNA PCR products were sequenced and assembled as previously described33, and 16S rRNA sequences were compared using EzTaxon-e34 or NCBI BLAST35. Phylogenetic trees were constructed based on the three 16S rRNA gene sequences of strain CBA1119T obtained from the genome sequencing data (see below) and those of related species using MEGA6 software36. Phylogenetic trees were generated with the neighbor-joining (NJ)37, maximum likelihood (ML)38, and maximum parsimony (MP)39 methods with 1,000 bootstrap replications based on the NJ tree.

Library preparation, sequencing, genome assembly, and annotation

To clarify the relationship between physiological characteristics (especially the capacity for growth at high temperatures) and genomic features, we performed genome sequencing of strain CBA1119T and Nnm. ejinorense JCM 13890T as previously described40. Briefly, genomic DNA shearing and SMRTbell library preparation were carried out according to the standard PacBio 20-kb Template Preparation Using BluePippin Size-Selection System protocol with P6-C4 chemistry (Pacific Biosciences, Menlo Park, CA, USA). The genome sequences of strain CBA1119T and Nnm. ejinorense JCM 13890T were determined using the PacBio RS II system (Pacific Biosciences). De novo assembly of each genome was performed using Hierarchical Genome Assembly Process v.2 software with default parameters supported by PacBio SMRT Analysis v.2.3.041. rRNA and tRNA prediction was carried out using RNAmmer v.1.242 and tRNAscan-SE v.1.2143, respectively. Genes were predicted using Glimmer3 in the Rapid Annotation using Subsystem Technology server (http://rast.nmpdr.org), and functional gene annotations were performed based on the SEED, COG (http://www.ncbi.nlm.nih.gov/COG), and KEGG (http://www.genome.jp/kegg/) databases. The GenBank/EMBL/DDBJ accession numbers for Natrinema thermophila CBA1119T and Natrinema ejinorense JCM 13890T are PDBS00000000 and NXNI00000000, respectively.

Comparative genomic analysis

For genomic comparisons, Natrinema species genomes were obtained from the NCBI genome database, except those of strains CBA1119T and JCM 13890T, which were sequenced as described above. The OrthoANI algorithm was used to analyze the genomic relatedness between strain CBA1119T and other species. OrthoANI percentages were calculated and a phylogenetic tree was constructed44. Orthologs in strain CBA1119T and the reference strains were predicted and mapped using the reciprocal best hit method in UBLAST45. Pan-genome orthologous groups (POGs) were estimated using the EzBioCloud Comparative Genomics Database (http://cg.ezbiocloud.net/)46, and the similarity of their presence/absence profiles between strains was calculated using the Jaccard coefficient. Unweighted pair-group method with arithmetic mean (UPGMA) clustering was then used to construct a dendrogram of strain CBA1119T and the reference strains based on the presence or absence of genes. Haloarchaeal genomes for comparisons were obtained from the NCBI genome database according to the following criteria: genomes with optimal or maximum growth temperature information were selected for comparisons of optimal and maximum growth temperature, respectively; genomes of unclassified strains47 were excluded; and when multiple genomes were available for a single strain, the one with fewer contigs and greater completeness was selected.
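The genome selection criteria for the optimal growth temperature comparison can be written as a small filter. This is a hedged sketch only: the record fields, the completeness measure, and the tie-breaking order (fewer contigs first, then higher completeness) are assumptions introduced for illustration and are not specified in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assembly:
    """Hypothetical record for one genome assembly."""
    strain: str
    classified: bool              # False for unclassified strains
    n_contigs: int
    completeness: float           # percent; assumed to be available
    t_opt_c: Optional[float] = None  # optimal growth temperature, if known


def select_assemblies(assemblies):
    """Keep one assembly per strain according to the stated criteria."""
    best = {}
    for asm in assemblies:
        # Exclude unclassified strains and genomes lacking temperature data.
        if not asm.classified or asm.t_opt_c is None:
            continue
        current = best.get(asm.strain)
        # Assumed ordering: fewer contigs first, then higher completeness.
        rank = (asm.n_contigs, -asm.completeness)
        if current is None or rank < (current.n_contigs, -current.completeness):
            best[asm.strain] = asm
    return best


# Hypothetical toy records.
toy_assemblies = [
    Assembly("strain_X", True, 12, 98.0, 45.0),
    Assembly("strain_X", True, 3, 99.5, 45.0),
    Assembly("strain_Y", False, 1, 100.0, 40.0),
    Assembly("strain_Z", True, 5, 97.0, None),
]
toy_selected = select_assemblies(toy_assemblies)
```

The same filter could be applied to the maximum growth temperature comparison by substituting the corresponding field.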