Members of the genus Salinispora are obligate marine actinomycetes of the family Micromonosporaceae and require sodium-enriched medium for growth.1 This genus has attracted attention from researchers seeking novel secondary metabolites because it is a promising source of such molecules.2, 3 However, novel strains within or closely related to the genus Salinispora are difficult to isolate; only three species, Salinispora arenicola, Salinispora pacifica and Salinispora tropica, have been discovered during the past decade.

In our survey of actinomycetal inhabitants of mangrove forests, we isolated a novel strain, designated NBRC 107566, from the rhizosphere of a mangrove growing on Iriomote Island, Okinawa, Japan. The 16S rRNA gene of the strain showed high sequence similarities to those of S. arenicola CNH-643T (98.28%), Micromonospora pattaloongensis TJ2-2T (98.08%), S. pacifica CNR-114T (98.07%) and S. tropica CNB-440T (98.07%). Thus, to assess the potential of NBRC 107566 as a secondary metabolite producer, we performed whole-genome shotgun sequencing and examined type-I polyketide synthase (PKS) and nonribosomal peptide synthetase (NRPS) gene clusters, which are involved in the major secondary metabolite-synthetic pathways in actinomycetes. The taxonomic study of this strain will be reported elsewhere.

The whole-genome sequence was determined by a shotgun sequencing strategy with paired-end sequencing using MiSeq (Illumina, San Diego, CA, USA; 652 Mb, 97-fold coverage). These reads were assembled using Newbler v2.6 (Roche, Basel, Switzerland) and subsequently finished using GenoFinisher,4 which enabled a final assembly of 182 scaffold sequences of >500 bp each. The sequences have been deposited at DDBJ under accession numbers BBQH01000001–BBQH01000182. The total size of the assembly was 6 704 564 bp, with a G+C content of 72.4%. Coding sequences were predicted by Prodigal v2.6 (http://prodigal.ornl.gov/downloads.php),5 and domains related to PKS and NRPS were searched using the SMART and PFAM domain databases. PKS and NRPS gene clusters, and their domain organizations, were identified according to a previously reported procedure.6
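The reported coverage can be checked against the read yield and assembly size given above. The following is a minimal sketch, assuming that fold coverage is simply total sequenced bases divided by assembly size and that "652 Mb" means 652 × 10^6 bases; the function name is hypothetical and is not part of any tool used in this study.

```python
def fold_coverage(total_read_bases: int, assembly_size_bp: int) -> float:
    """Return average sequencing depth as read bases per assembled base."""
    return total_read_bases / assembly_size_bp


# Values taken from the text; the unit conversion (Mb = 10^6 bases) is an assumption.
READ_BASES = 652_000_000
ASSEMBLY_SIZE_BP = 6_704_564

coverage = fold_coverage(READ_BASES, ASSEMBLY_SIZE_BP)
print(f"{coverage:.1f}-fold coverage")
```

Under these assumptions the result rounds to the 97-fold coverage stated in the text.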

Complete gene cluster sequences were obtained for two type-I PKS (pks2 and pks3) and five NRPS (nrps1 to nrps5) gene clusters. One further PKS gene cluster (pks1) appears to be split across 16 scaffolds/contigs in the draft genome sequence. Genes encoding PKSs and NRPSs in these clusters are listed in Table 1. Although the pks1 gene cluster could not be completely sequenced, it likely contains at least 12 modules, because 12 ketosynthase domains and 12 acyl carrier protein domains were detected. Hence, its products are expected to be large compounds containing a polyketide chain of at least C24. We hypothesized that the pks2 gene cluster synthesizes an enediyne-type polyketide, because its PKS (open reading frame (Orf) 5–114) showed high sequence similarity to iterative type-I PKSs for enediyne core synthesis, termed enediyne PKS (PksE),7 and its domain organization was the same as that of Salinispora PKSs for enediyne compounds, such as Sare_0551 and Strop_2697.2 The pks3 gene cluster is likely novel because its PKS genes showed low sequence identities to the closest homologs. Its domain organization is likely ketosynthase–acyltransferase–ketoreductase–dehydratase, which is specific to PksE, although the module is split across three Orfs. Hence, its products may also be enediyne compounds. On the basis of the number of modules and the adenylation domain substrates predicted by antiSMASH,8 we inferred that gene clusters nrps1 to nrps5 synthesize, respectively, hexapeptidic compounds containing three valine molecules; small compounds derived from single amino-acid molecules; tripeptidic compounds comprising glycine, lysine and serine molecules; compounds derived from cysteine; and tetrapeptidic compounds containing a dihydroxybenzoate and two cysteine molecules.
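The chain-length estimate for pks1 can be written out as simple arithmetic. This is a sketch under two stated assumptions that the text implies but does not spell out: each module contributes two carbons to the polyketide backbone, and the number of modules is bounded by the smaller of the ketosynthase and acyl carrier protein domain counts. The function and constant names are hypothetical.

```python
# Assumption: each extension module adds two backbone carbons.
CARBONS_PER_MODULE = 2


def min_backbone_carbons(ks_domains: int, acp_domains: int) -> int:
    """Estimate a lower bound on backbone carbons from domain counts."""
    # Assumption: a module needs one KS and one ACP, so the smaller count limits it.
    modules = min(ks_domains, acp_domains)
    return modules * CARBONS_PER_MODULE


# Domain counts for pks1 as reported in the text.
pks1_min_carbons = min_backbone_carbons(ks_domains=12, acp_domains=12)
print(f"pks1 product: at least C{pks1_min_carbons}")
```

Because pks1 is fragmented across scaffolds, further modules may remain undetected, so the value is a lower bound only.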

Table 1 PKSs and NRPSs of type-I PKS and NRPS gene clusters in Micromonosporaceae strain NBRC 107566

Among the eight gene clusters, five (pks1, pks3, nrps3, nrps4 and nrps5) appear to be new and specific to strain NBRC 107566, because each PKS and NRPS showed low sequence similarity to, and a domain organization distinct from, its closest homolog. In contrast, homologs of the remaining three gene clusters (pks2, nrps1 and nrps2) were present in other strains (Table 1). PksE genes such as pks2 are commonly present in the genus Salinispora,3 and also in the closely related genus Micromonospora.9 In the nrps1 gene cluster, six genes showed high sequence similarities (57–64% identities) to genes of S. arenicola that are specific to S. arenicola strain CNX891 and are not present in the other Salinispora strains whose genome sequences are available. However, the products of strain NBRC 107566 and S. arenicola CNX891 probably differ, because the nrps1 gene cluster encodes two NRPSs in addition to those found in S. arenicola CNX891, and some domain organizations differ between the two strains (Supplementary Figure S1). The nrps2 cluster showed high sequence similarities to a Catenulispora acidiphila gene, whose products have not yet been identified, and the domain organizations were the same, suggesting that the products of the two strains are similar. A putative acetyl-lysine deacetylase and a lysine biosynthesis enzyme LysX were also encoded upstream of both nrps2 and the C. acidiphila gene cluster (data not shown). However, we could not predict the products from genome information alone; further studies are needed to identify them.

We compared the type-I PKS and NRPS gene clusters of strain NBRC 107566 with those of three Salinispora species (Table 2). Orthologous gene clusters are aligned in the same row of the table. Strain NBRC 107566, S. arenicola CNH-205, S. tropica CNB-440 and S. pacifica DSM 45543 possess 8, 13, 10 and 10 PKS and NRPS gene clusters, respectively, suggesting that NBRC 107566 harbors a number and diversity of gene clusters comparable to those of Salinispora strains. In the genus Salinispora, three PKS and NRPS gene clusters (iterative enediyne PKS, yersiniabactin-like siderophore NRPS and tetrapeptide NRPS, shown as clusters 1–3 in Table 2) are well conserved in all the species, and four gene clusters (clusters 4–7) are conserved between at least two species.3, 10 In contrast, strain NBRC 107566 does not possess a yersiniabactin-like siderophore NRPS gene cluster, a tetrapeptide NRPS gene cluster or any of the four gene clusters 4–7, but does have an iterative enediyne PKS gene cluster (pks2). According to the report on the ancestry of secondary metabolite gene clusters in the genus Salinispora, the well-conserved gene clusters are derived from common ancestors, but only the iterative enediyne PKS gene cluster among them is shared with the closely related genus Micromonospora; the remaining clusters are considered to have been acquired by horizontal gene transfer at the beginning of the evolution of this genus or of each species.3 Strain NBRC 107566 is phylogenetically related to the Salinispora species, but its taxonomic position is outside the clade of the genus Salinispora in the phylogenetic tree based on 16S rRNA gene sequences (Tamura et al., unpublished). This strain may therefore have evolved without acquiring the gene clusters conserved in the genus Salinispora, except for pksE (pks2).
Interestingly, no homologs of the five NBRC 107566-specific gene clusters (pks1, pks3, nrps3, nrps4, nrps5) were observed, even in genome-sequenced strains belonging to the related genus Micromonospora (Table 1). To date, only 10 Micromonospora strains have been genome sequenced; therefore, it is not clear whether these five clusters were transmitted vertically or acquired by horizontal gene transfer during evolution of the family Micromonosporaceae. This study showed that isolation of phylogenetically novel strains, such as strain NBRC 107566, in this family could aid the search for attractive, novel and diverse secondary metabolites.

Table 2 Type-I PKS and NRPS gene clusters conserved or specific in NBRC 107566 and Salinispora species