A preliminary examination of bacterial, archaeal, and fungal communities inhabiting different rhizocompartments of tomato plants under real-world environments

Plant microbiota is a key determinant of plant health and productivity. The composition and structure of plant microbiota varies according to plant tissue and compartment, which are specific habitats for microbial colonization. To investigate the structural composition of the microbiome associated with tomato roots under natural systems, we characterized the bacterial, archaeal, and fungal communities of three belowground compartments (rhizosphere, endosphere, and bulk soil) of tomato plants collected from 23 greenhouses in 7 geographic locations of South Korea. The microbial diversity and structure varied by rhizocompartment, with the most distinctive community features found in the endosphere. The bacterial and fungal communities in the bulk soil and rhizosphere were correlated with soil physicochemical properties, such as pH, electrical conductivity, and exchangeable cation levels, while this trend was not evident in the endosphere samples. A small number of core bacterial operational taxonomic units (OTUs) present in all samples from the rhizosphere and endosphere represented more than 60% of the total relative abundance. Among these core microbes, OTUs belonging to the genera Acidovorax, Enterobacter, Pseudomonas, Rhizobium, Streptomyces, and Variovorax, members of which are known to have beneficial effects on plant growth, were more relatively abundant in the endosphere samples. A co-occurrence network analysis indicated that the microbial community in the rhizosphere had a larger and more complex network than those in the bulk soil and endosphere. The analysis also identified keystone taxa that might play important roles in microbe-microbe interactions in the community. Additionally, profiling of predicted gene functions identified many genes associated with membrane transport in the endospheric and rhizospheric communities. Overall, the data presented here provide preliminary insight into bacterial, archaeal, and fungal phylogeny, functionality, and interactions in the rhizocompartments of tomato roots under real-world environments.


Microbial richness and diversity of the different rhizocompartments of tomatoes. To inves-
tigate the features of the microbial communities associated with different rhizocompartments of tomatoes, tomato plants were collected from 23 greenhouses located in 7 different geographic locations across South Korea (Supplementary Fig. S1 and Supplementary Table S1). Roots of collected tomato plants were separated according to bulk soil, rhizosphere, and endosphere 8,9 . A total of 69 samples (23 greenhouses × 3 compartments) were used for Illumina MiSeq sequencing with bacteria-, fungi-, and archaea-specific primers. The number of high-quality sequences generated from high-throughput sequencing was 7,985,514 bacterial 16S rRNA gene sequences, 4,637,772 archaeal 16S rRNA gene sequences, and 6,979,207 fungal ITS region sequences. After removing operational taxonomic units (OTUs) assigned to non-target kingdoms or chloroplasts, 4,387,870 bacterial, 4,184,774 archaeal, and 2,582,195 fungal sequences were clustered into 3,511 bacterial, 678 archaeal, and 4,608 fungal OTUs. Although the number of generated sequences and OTUs was different among the kingdoms and rhizocompartments, the rarefaction curves showing the observed number of OTUs of those samples were sufficiently saturated ( Supplementary Fig. S2). Bacterial OTU richness was the highest and archaeal OTU richness was the lowest among the three kingdoms. After eliminating sequences assigned to the chloroplasts of plants, the number of sequences and OTUs of the endosphere samples were exceedingly low in comparison to those of the bulk soil and rhizosphere samples. The OTU datasets of endosphere samples were rarefied to 4,000 reads while those of bulk soil and rhizosphere were rarefied to 20,000 reads, both of which are close to the depth of saturation with more than 98% coverage (Supplementary Table S2). To compare significant differences in species diversity between samples, α-diversity was represented by the Chao1 richness estimate and Shannon's diversity index based on the rarefied OTU datasets ( Fig. 1 and Supplementary Table S2). The communities belonging to the three kingdoms, particularly bacteria and archaea, had similar levels of α-diversity in the bulk soil and rhizosphere, with the highest values observed for bacteria, followed by fungi and archaea. Of the rhizocompartments, the microbial communities of the endosphere exhibited the lowest Chao1 richness estimate and Shannon's diversity index in comparison to those of the bulk soil and rhizosphere across all three kingdoms. Similarly, the Faith's phylogenetic diversity (Faith's PD) was found to be the lowest in the endosphere compared to the bulk soil and rhizosphere (Fig. 1c).
Differences in microbial community structure between rhizocompartments. We examined the similarities in the microbial communities between samples via a non-metric multidimensional scaling (NMDS) biplot based on Bray-Curtis dissimilarity (Fig. 2a-c) and Principal Coordinates Analysis (PCoA) of weighted UniFrac distances incorporating phylogenetic relatedness ( Supplementary Fig. S3). Despite the different soil types that the tomatoes were grown in, the microbial communities were separated by rhizocompartments. In particular, the bacterial communities were clearly separated by three compartments, and those in the endosphere were found to retain the most distinguishable bacterial communities. The archaeal communities were separated between the bulk soil and rhizosphere, while the endosphere exhibited a community signature that was distinct from those of the other two compartments. The fungal communities exhibited similar community structures as the bacterial communities, with the endosphere harbouring the most distinct communities. However, the fungal communities were not differentiated between the bulk soil and rhizosphere. The ordination results were further supported by a non-parametric analysis of similarities (ANOSIM) and permutational multivariate analysis of variance (PERMANOVA) based on Bray-Curtis dissimilarity. ANOSIM revealed significant separation of the bacterial (R = 0.865, P < 0.001) and fungal (R = 0.756, P < 0.001) communities by rhizocompartment, while the archaeal communities exhibited low values (R = 0.308, P < 0.001). PERMANOVA also demonstrated that the bacterial (R 2 = 0.477, P < 0.001), fungal (R 2 = 0.260, P < 0.001), and archaeal (R 2 = 0.179, P < 0.001) communities were significantly differentiated by rhizocompartment.
Next, the homogeneity of communities within the same compartment was examined by measuring the distance between the centroid and each sample of the group. The bacterial, archaeal, and fungal communities of the rhizosphere exhibited the lowest dispersion, while those of the endosphere exhibited higher dispersion than those of the other compartments ( Fig. 2d-f). Taken together, the results show that the endosphere harbours much fewer species but exhibits greater variation in microbial community structure than the rhizosphere and bulk soil.
Influence of soil characteristics on the microbial communities of tomato roots. Physicochemical analyses of the collected bulk soils provided a wide range of values across the samples (Supplementary Table S3). The soil pH and electrical conductivity (EC) values varied from 4.8 to 7.8 and 0.1 to 4.0 dS/m, respectively. The organic matter (OM) content ranged from 12.1 to 72.6 g/kg. The total nitrogen (TN) content varied from 0.07 to 0.55%. The available phosphate content varied from 98.9 to 1319.0 mg/kg. The exchangeable cations, such as K + , Ca 2+ , Mg 2+ , and Na + , varied from 0.1 to 17.6 cmol c /kg.
We investigated whether the microbial communities inhabiting the bulk soil, rhizosphere, and endosphere are influenced by edaphic factors using the Mantel test and canonical correspondence analysis (CCA). Overall, the bacterial and fungal communities within the bulk soil and rhizosphere exhibited significant correlations with the edaphic factors measured in this study (Table 1 and Supplementary Fig. S4). Interestingly, the microbial communities within the endosphere had no significant correlation with any edaphic factors. Among the three kingdoms, relatively few edaphic factors were associated with archaeal communities. The communities of bacteria and fungi in the rhizosphere were more greatly influenced by the edaphic factors than those in the bulk soil. The Mantel test showed that soil pH, TN, K + , Ca 2+ , and Mg 2+ levels were significantly correlated with the bacterial communities in the bulk soil. In addition to these edaphic factors, the bacterial communities in the rhizosphere exhibited a www.nature.com/scientificreports www.nature.com/scientificreports/ significant relationship with EC, OM, and Na + levels. The fungal communities within the bulk soil had significant relationships with pH, EC, Mg 2+ , and Na + levels, and those in the rhizosphere were additionally correlated with TN and K + levels. The archaeal communities in the bulk soil were correlated with pH, TN, and Mg 2+ levels, while the rhizospheric archaeal communities were correlated with only Mg 2+ levels. Similar to the Mantel test, the results of CCA showed that pH was the main factor that shaped the bacterial and fungal communities in the bulk soil and rhizosphere ( Supplementary Fig. S4).
Furthermore, forward selection based on redundancy analysis (RDA) was implemented to quantify the relative contributions of the selected soil physicochemical properties to the structures of the microbial communities 36 . Microbial communities in the bulk soil and rhizosphere were highly influenced by soil edaphic factors, but this trend was not observed in the endosphere, as demonstrated by the Mantel test. Specifically, a combination  www.nature.com/scientificreports www.nature.com/scientificreports/ of the selected soil parameters, including pH, EC, K + , and Ca 2+ , explained 27.9% of the bacterial community variance in the bulk soil, and K + , OM, pH, and Na + explained 21.1% of the community variance in the rhizosphere (Supplementary Table S4). A combination of pH and Mg 2+ explained 18.4% of the archaeal community variance in the bulk soil, and pH alone explained 12.0% of the community variance in the rhizosphere. In the fungal communities, a combination of EC, Na + , and Ca 2+ explained 16.3% of the community variance in the bulk soil, while Na + and pH explained only 7.8% of the community variance in the rhizosphere. However, the selected soil parameters of the bacterial, archaeal, and fungal communities in the endosphere either did not explain the microbial community variance (no parameter explained the variance in the archaeal communities) or the explanation provided was unsatisfactory (cumulative variance of 7.1% and 7.6% in bacterial and fungal communities, respectively).

Microbial taxonomic distribution in different rhizocompartments of tomatoes.
We detected a total of 25 bacterial, 7 archaeal, and 8 fungal phyla from the dataset, and the taxonomic distributions were presented according to rhizocompartment with average relative abundances (Fig. 3). The communities of the three kingdoms exhibited distinct taxonomic distribution patterns between the bulk soil, rhizosphere, and endosphere at the phylum level. Among the bacterial communities, the most dominant phylum was Proteobacteria, with relative abundances of 43.2 ± 6.2%, 54.7 ± 6.5%, and 61.2 ± 14.7% in the bulk soil, rhizosphere, and endosphere, respectively. At the class level, the relative abundances of Alphaproteobacteria (11.8 ± 2.8%), Betaproteobacteria (11.2 ± 5.5%), and Gammaproteobacteria (15.0 ± 10.4%) were similar in the bulk soil. In the rhizosphere, the relative abundance of Alphaproteobacteria (34.4 ± 9.3%) was the highest, while the relative abundances of Betaproteobacteria (21.3 ± 12.1%) and Gammaproteobacteria (21.8 ± 11.1%) were higher than that of Alphaproteobacteria (11.1 ± 6.6%) in the endosphere. Actinobacteria was the second most abundant phylum, with relative abundances of 14.1 ± 6.2%, 19.5 ± 4.4%, and 13.8 ± 14.9% in the bulk soil, rhizosphere, and endosphere, respectively. The relative abundances of Bacteroidetes (2.9 ± 1.6%) and Firmicutes (2.9 ± 3.5%) in the endosphere were 2 to 3 times lower than those in the bulk soil and rhizosphere. Unlike the bacterial communities, the archaeal communities were predominated by a few dominant phyla. The most dominant archaeal phylum was Thaumarchaeota, with relative abundances of 94.7 ± 4.8%, 96.7 ± 4.5%, and 87.4 ± 22.0% in the bulk soil, rhizosphere, and endosphere, respectively. Euryarchaeota was the second most abundant archaeal phylum, The phyla with relative abundances less than 1% were classified into "Others".

Identification of representative OTUs in each rhizocompartment.
We took a closer look at the individual OTUs abundant in each rhizocompartment of the tomato roots. To identify the taxa that were significantly associated with particular ecological niches, a species indicator analysis was performed. Six OTUs, 14 OTUs, and 19 OTUs with high indicator values (>0.5) and dominance (>1% average relative abundance) were identified in the bulk soil, rhizosphere, and endosphere, respectively ( Table 2). Bacterial OTUs were found mainly as indicator taxa across all compartments (23 out of 39 OTUs) and were mostly affiliated with Actinobacteria  www.nature.com/scientificreports www.nature.com/scientificreports/ and Proteobacteria. The fungal indicator taxa in the three compartments (15 out of 39 OTUs) were affiliated with Ascomycota. Only one archaeal OTU, belonging to Thaumarchaeota, was detected in the bulk soil; however, no significant archaeal indicator taxa were identified in the rhizosphere and endosphere. At the genus level, the indicator OTUs of the rhizosphere were assigned to Microbacterium, Arthrobacter, Sphingobium, Sphingomonas, Afipia, Leifsonia, Luteimonas, Hyphodiscus, Aspergillus, Trichoderma, Chrysosporium, and Oidiodendron, and those of the endosphere were affiliated with Enterobacter, Acidovorax, Variovorax, Pseudomonas, Rhizobium, Streptomyces, Alternaria, and Colletotrichum. Linear discriminant analysis (LDA) Effect Size (LEfSe) is a statistical method that determines the features (organisms, genes, and functions that show a significant difference between groups. We conducted an LEfSe analysis to identify specific taxa in different rhizocompartments that showed an LDA score > 3.5 and p < 0.05. Eight, 6, and 13 OTUs were identified in the bulk soil, rhizosphere, and endosphere, respectively ( Supplementary Fig. S5). Most of the OTUs identified in the endosphere were shared with the OTUs identified by the indicator species analysis.
To identify the persistent members of microbial communities that inhabit each rhizocompartment, we identified core OTUs that were detected in all the bulk soil, rhizosphere and endosphere samples (Fig. 4). The bacterial core microbiomes comprised 183, 334, and 26 OTUs in the bulk soil, rhizosphere, and endosphere, respectively, accounting for relative abundances of 38%, 77%, and 68%, respectively (Fig. 4a). The archaeal core microbiomes comprised 1-11 OTUs with relative abundances of 34.9-98.8% (Fig. 4b). Archaeal OTU1, which was a core OTU in all rhizocompartments, was abundant exclusively in the rhizosphere (50%), endosphere (35%), and bulk soil (18%). In contrast to bacteria and archaea, the number of fungal core OTUs was small (1-12 OTUs) despite the large total OTU number (Fig. 4c). Next, core OTUs in the rhizosphere and endosphere that might be closely associated with plants were phylogenetically classified. Nineteen bacterial core OTUs were common to both the rhizosphere and endosphere, belonging to Pseudomonas, Variovorax, Enterobacter, Rhizobium, Shinella, Acidovorax, Rhizobium, Propionibacteriaum, Sphingobium, Luteimonas, Rhodococcus, and Streptomyces ( Fig. 4d and Supplementary Data S1). An archaeal core OTU that was common to all the compartments was affiliated with Nitrososphaera, and in addition, the rhizospheric core OTUs were assigned to Methanobacterium and Methanosarcina ( Fig. 4e and Supplementary Data S1). The fungal core OTUs in the bulk soil and rhizosphere were assigned to Cladosporium, Fusarium, Chrysosporium, Trichocladium, Monographella, and Gibellulopsis ( Fig. 4f and Supplementary Data S1). The one fungal core OTU in the endosphere belonged to Alternaria.
Complexity of the microbial network among rhizocompartments of tomatoes. Potential interactions between microbial taxa in each rhizocompartment were investigated using a random matrix theory (RMT)-based network analysis by combining bacterial, fungal, and archaeal OTUs. For network construction, we used 1,619, 1,633, and 160 OTUs for bulk soil, rhizosphere, and endosphere, respectively, which were detected www.nature.com/scientificreports www.nature.com/scientificreports/ in more than half of the samples as recommended by the network analyses pipeline ( Table 3). The network connectivity in all compartments was well fitted by the power law, with an R 2 value greater than 0.7, indicating scale-free properties. The values for the average clustering coefficient (avgCC) and average path distance (GD) of the empirical networks were higher than those of the random networks, indicating the small-world behaviour of the networks. The number of nodes in the rhizospheric network was lower, but the average degree (avgK) was higher, than that in the bulk soil network. This finding suggested that microbial communities in the rhizosphere formed more complex networks than those in bulk soil (Table 3 and Supplementary Fig. S6). Compared to the rhizosphere, the endosphere exhibited low and simple network connectivity, as indicated by the small number of nodes and low avgK value.
Next, the topological roles of the taxa in the network were classified into the following four categories based on the values of within-module connectivity (Zi) and among-module connectivity (Pi): peripherals, few interactions with other nodes (Zi < 2.5 and Pi < 0.62); connectors, many links with other modules (Zi < 2.5 and Pi > 0.62); module hubs, many interactions within the module (Zi > 2.5 and Pi < 0.62); and network hubs, many interactions within and among modules (Zi > 2.5 and Pi > 0.62) (Supplementary Fig. S7) 30,37 . Most of the nodes (99%, 98%, and 88% in bulk soil, rhizosphere, and endosphere, respectively) were categorized as peripherals with low connectivity. None of the nodes fell into the network hub category. Notably, 9 module hubs and 16 connectors were identified from the dataset and are considered to be keystone taxa, with potential roles in shaping distinct microbial communities in the bulk soil, rhizosphere, and endosphere (Supplementary Table S5). Among the 9 module hubs, 3 module hubs in the bulk soil network were assigned to Gemmatimonadetes, Firmicutes, and Proteobacteria; 5 module hubs in the rhizospheric network were assigned to Actinobacteria, Bacteroidetes, and Alphaproteobacteria; and one module hub in the endospheric network was assigned to Firmicutes. Of the 15 connectors, 12 connectors in the bulk soil network were assigned to Planctomycetes, Actinobacteria, Proteobacteria, Elusimicrobia, and Firmicutes; 3 connectors in the rhizospheric network were assigned to Proteobacteria and Ignavibacteriae; and there was no connector identified in the endosphere.
Metagenome prediction profiles of each rhizocompartment of tomatoes. Functional gene profiles of each compartment were predicted using the PICRUSt program with the bacterial 16S rRNA gene sequence dataset. Based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway hierarchy level 2, 41 KEGG orthology (KO) groups were identified (Fig. 5). Functional genes belonging to carbohydrate metabolism, amino acid metabolism, and membrane transport were markedly abundant in the dataset. Among these genes, the relative abundances of genes predicted to be associated with membrane transport exhibited high variability among rhizocompartments, with the highest abundance observed in the endosphere followed by the rhizosphere and bulk soil. This finding was further supported by a higher abundance of genes associated with the sublevel of membrane transport (ABC transporter, secretion system, transporters, and phosphotransferase system) at the third hierarchical level in the endosphere than in the other two compartments (Supplementary Fig. S8).

Discussion
We collected tomato plants cultivated in 23 greenhouses in 7 different geographic locations across South Korea and performed in-depth characterization of the microbial communities harboured in different rhizocompartments using the Illumina MiSeq platform. Among the three rhizocompartments, both taxonomic and phylogenetic diversity of the endosphere was considerably lower than that of the rhizosphere and bulk soil (Fig. 1). The microbial alpha-diversity, estimated by the Chao1 richness index and the Shannon's diversity index, was highest among bacterial communities, followed by fungal and archaeal communities. Despite the differences in soil characteristics, analysis of the microbial community structure revealed a clear differentiation between  www.nature.com/scientificreports www.nature.com/scientificreports/ bulk soil, rhizosphere, and endosphere, as demonstrated by the beta-diversity analysis (NMDS, ANOSIM, and PERMANOVA based on Bray-Curtis dissimilarity and PCoA based on weighted UniFrac distances) ( Fig. 2 and Supplementary Fig. S3). Soil microorganisms attracted to root exudates are reassembled in the rhizosphere. A subset of rhizospheric microorganisms that have endophytic competence and interact with the plant innate immune system colonize the endosphere 5,6,10,38 . Intriguingly, the ordination results analysed by NMDS showed that the variation in the microbial community structure of the endosphere was considerably higher than that of the bulk soil and rhizosphere (Fig. 2). This finding was consistent with those of previous microbiome studies of poplar trees, which showed high structural variation among microbial communities within the endosphere in comparison to those in the rhizosphere 7,10 . The high variance in microbial communities in the endosphere might be dependent on the plant innate immune system, while rhizospheric microbial communities are less influenced by host-dependent selection 10 . Taken together, the results show that microbial communities of a tomato's endosphere are characterized by a low species richness and diversity and a high vulnerability to plant responses, resulting in a high degree of variation in the microbial community structure.
We examined soil samples that had different physiochemical properties to evaluate whether edaphic factors affect the composition and diversity of microbial communities, not only in the soil but also in the rhizosphere and endosphere. The Mantel test and CCA results showed that soil physicochemical properties were correlated with microbial communities in the rhizosphere and bulk soil but not in the endosphere (Table 1 and Supplementary  Fig. S4). While the microbial communities in the rhizosphere could be affected by edaphic factors of the surrounding soil, the endosphere is isolated from the soil by the plant's epidermal cells and may be influenced by the plant's innate immune system rather than by soil properties. Thus, edaphic factors are among the determinants of the structures of microbial assemblages found in the bulk soil and rhizosphere, but not in the endosphere. Among the edaphic factors measured, pH was a major factor influencing bacterial and fungal communities in both the bulk soil and rhizosphere, similar to what has been previously reported [18][19][20][21][22]39,40 . Archaeal communities were also influenced by soil pH in the bulk soil and rhizosphere. Moreover, one of the cations, namely Mg 2+ , which is an important cofactor of ATP and is involved in the regulation of metabolic reactions and energy balance of organisms 41 , was significantly correlated with the bacterial, archaeal, and fungal communities. While EC and other cations, including K + and Na + , were correlated with bacterial and fungal communities, the archaeal communities were not affected by other soil characteristics, suggesting that archaeal communities are less influenced by environmental factors than bacterial and fungal communities.
We identified specific microbial taxa that were significantly more relatively abundant in each rhizocompartment under various environmental conditions. The indicator species and LEfSe analysis identified representative OTUs that were significantly more relatively abundant in bulk soil, rhizosphere, and endosphere (Table 2 www.nature.com/scientificreports www.nature.com/scientificreports/ and Supplementary Fig. S5). Bacterial OTUs that were specifically more relatively abundant in the endosphere were affiliated with Enterobacter (B_OTU18), Acidovorax (B_OTU26), Variovorax (B_OTU6), Pseudomonas (B_OTU3), Rhizobium (B_OTU17), and Streptomyces (B_OTU23). Notably, these genera have been previously reported to promote plant growth, biocontrol, and nutrient availability [42][43][44][45][46] . Interestingly, the other OTUs abundant in the endosphere were affiliated with the fungal genera Alternaria (F_OTU42, 53, 121, 196, 295, 464, and 477) and Colletotrichum (F_OTU2) within the phylum Ascomycota. Some Alternaria species, such as A. alternata and A. solani, are known tomato pathogens that cause leaf spots, rots, and blights 47 . Species belonging to Colletotrichum, such as C. acutatum, C. gloeosporioides, C dematium, and C. coccodes, are known causes of anthracnose in tomato plants 48 . However, no symptom of anthracnose was evident on the tomato plants used in this study. Apparently, the OTUs belonging to the two genera may be non-virulent in tomato plants but are effective in endophytic colonization. Among the genera associated with indicator OTUs in the rhizosphere, members of Sphingobium (B_OTU2), Arthrobacter (B_OTU4), and Sphingomonas (B_OTU50) are known to promote plant growth 5,49 . Moreover, in the rhizosphere, species of the fungal indicator taxa, which included Aspergillus (F_OTU364), Trichoderma (F_OTU4), and Chrysosporium (F_OTU31), have also been reported to have plant growth-promoting activity, mainly by producing plant hormones such as gibberellin and auxin [50][51][52] . In general, a greater number of taxa that positively affect plant growth and health were found in the rhizosphere and endosphere in comparison to bulk soil samples.
We identified the core OTUs that were present in all 23 tomato samples. Given that the core microbiome is a stable and consistent component associated with a particular habitat, these microbes are thought to play important roles in the ecosystem 53 . The rhizosphere harboured more core OTUs (357 OTUs) than the bulk soil (190 OTUs) in the bacterial, archaeal, and fungal communities, although the species richness values in the bulk soil and rhizosphere were similar (Fig. 4). Furthermore, while the number of core OTUs in the rhizosphere and endosphere was small, these OTUs accounted for a large fraction of the OTU abundances in bacterial and archaeal communities (Fig. 4). The bacterial core OTUs that were shared by the rhizosphere and endosphere contained the aforementioned indicator taxa that might exert beneficial effects on plant growth. These results suggest that plants recruit beneficial microorganisms that are abundant in the rhizosphere 5 , which is also supported by the higher homogeneity of microbial communities in the rhizosphere than of those in the other compartments (Fig. 2). In the fungal communities, the number of core OTUs was low (12 OTUs) compared to the number of bacterial core OTUs (374 OTUs), even though the total number of fungal OTUs (4,940 OTUs) was greater than that of bacterial OTUs (3,511 OTUs). This finding indicates that the composition of fungal OTUs was very different between each sample, which is consistent with a previous report in which the similarity of fungal communities between tropical soils was seen to be lower than that of bacterial communities 54 . It is likely that fungal communities are strongly influenced by environmental factors and stochastic assembly processes compared to bacterial communities 55 .
By combining the bacterial, fungal, and archaeal community datasets, co-occurrence network analysis was performed to gain insights into potential relationships between the taxa in the different rhizocompartments. The rhizospheric network exhibited a higher complexity of microbial communities than the bulk soil network, as shown by the greater number of links and higher avgK value (Table 3). This finding is consistent with previous results obtained with the rhizosphere of oat and Jacobaea vulgaris, which were analysed by different algorithms, namely, the molecular ecological networks (MENs) based on RMT and SparCC, respectively 30,56 . The complexity of the network found in the rhizosphere is most likely due to continuous changes in the local environment of roots and surrounding soil, such as changes in nutrient availability, pH, moisture, oxygen content, and carbon dioxide levels 16,30 . In contrast to the rhizosphere, microbial communities in the endosphere presented very low species richness, diversity, and network complexity, indicating that the limited number of species were physically separated by plant tissue and/or that the number of microbial interactions with the host plant was much greater than the number of microbe-microbe interactions. Taken together, the results show that the rhizosphere is the most active site of microbe-microbe and plant-microbe interactions. We identified several OTUs as module hubs and connectors that play potential roles in the network structure and suggested that these OTUs are keystone taxa. All potential keystone OTUs detected in the bulk soil, rhizosphere, and endosphere were species of bacteria, even though the microbial networks included bacterial, archaeal, and fungal taxa, which suggests that bacteria might play a key role in linking members of archaea and fungi in the heterogenous microbial communities. As previously reported, we observed that most of the keystone OTUs had relatively low abundances (<0.1%), and these taxa might be important players in the structure and functioning of microbial communities in each rhizocompartment 30,57,58 . To reveal their specific roles, experimental evidences complementing the computational inference carried out is needed. The manipulation of unculturable microorganisms in an experiment is the biggest obstacle we must overcome in the future.
Prediction of gene function profiles using PICRUSt with bacterial 16S rRNA gene sequence data showed that genes associated with membrane transport were the most abundant across the functional categories and were significantly more relatively abundant in the endosphere and rhizosphere, than in bulk soil (Fig. 5). In previous studies, metagenomic analyses of soybean and J. vulgaris rhizospheres have shown that functional genes associated with membrane transport were more abundant in the rhizosphere than in bulk soil 56,59 . Considering the fact that the rhizosphere and endosphere are regions where microorganisms and plants interact via the uptake and export of various substrates, genes involved in membrane transport potentially play critical roles in interactions with the host plants in these regions. Rhizospheric and endospheric microorganisms take up carbon exuded from plant roots and secrete diverse compounds that have beneficial effects on plant growth and health. Examples of well-known molecules that promote plant growth include indole-3-acetic acid (IAA), which is a phytohormone that regulates plant development; 1-aminocyclopropane-1-carboxylic acid (ACC) deaminase, which reduces the ethylene content by degrading an ethylene precursor; and 2,3-butanediol, which is a volatile organic compound that induces the plant defence system 60,61 . Secondary metabolites with antibiotic activity, including siderophores www.nature.com/scientificreports www.nature.com/scientificreports/ and lytic enzymes such as chitinase, have been shown to inhibit plant pathogens and cause disease suppression in plants 62,63 . Further metatranscriptomic and metaproteomic research is needed to identify the function of microbiomes associated with plant growth and health.

Conclusions, limitations, and future directions.
We investigated the differences in the microbial communities found in the bulk soil, rhizosphere, and endosphere of tomato plants under natural systems, analysing the diversity, composition, structure, network, and metagenome prediction of the microbial communities. The findings of the present study have several implications. First, the microbial communities in the endosphere, with low species richness and diversity, were highly variable and less influenced by edaphic factors than those in the other rhizocompartments. Second, the taxa that are significantly more relatively abundant in the rhizosphere and endosphere have potentially beneficial effects on plant growth and health. Third, the microbial communities in the rhizosphere exhibited a complex network, while those in the endosphere had a smaller and simpler network consisting of fewer keystone taxa. Lastly, the microbial communities inhabiting the endosphere and rhizosphere possessed more genes involved in membrane transport than those found in bulk soil samples, indicating active exchange of diverse compounds between microorganisms and plants.
Conducting a microbiome study, particularly under natural systems, is challenging because microbiomes are sensitive to surrounding environmental conditions. Although the primary objective of this study was to screen three kingdoms of organisms found in microbial communities of tomato plants under natural conditions and to reveal clear niche differentiation of microbial communities, other factors, such as host genotype, phenotype (e.g., enhanced growth, disease susceptibility/tolerance), developmental stage of plants (sample collection timing), and climatic condition, might have contributed to this differentiation. Thus, future work should put more effort into defining and minimizing variables across samples in order to provide a complementary and more comprehensive picture of microbial communities and contribute to the understanding of the factors that drive the structure of these microbial communities. Furthermore, predicting functional profiles of microbial communities based on 16S rRNA amplicon sequencing data does not provide adequate genomic and functional details of the microbiome, although it is widely used for screening purposes. On the other hand, a metagenomic approach would help determine the functional mechanisms mediating plant-microbe interactions and define the core microbiome of plants. Such information is needed to understand and manage microbial functions in each compartment of the tomato roots.
Looking forward, identifying the distinct features of microbial communities in each compartment of tomato roots under real-world environments will improve the understanding of plant-microbiome interactions, thus contributing to the progress of microbiome engineering and to increased plant productivity in the future.

Materials and Methods
study sites and sample preparation. Tomato plants were sampled from 23 different greenhouses in 7 provinces across South Korea, which has a temperate climate (Supplementary Fig. S1). Information regarding the study sites and tomato cultivars is listed in Supplementary Table S1. The tomatoes were planted under the ground between November and December 2014, and tomato plants in the ripening stage were collected between May and June 2015 in the middle of the harvest season. From each greenhouse, three individual tomato plants (spaced 15 m apart) were randomly collected along with soil from a 30-cm radius and pooled. The tomato samples were brought to a laboratory and separated into bulk soil, rhizospheric soil, and endosphere 8,9 . Loose soil without tomato roots was carefully collected as the bulk soil. Subsequently, the tomato roots were vigorously shaken by hand to remove adherent soil particles. The roots with firmly attached soil were placed into 50-ml tubes containing 0.85% NaCl and shaken vigorously using a shaker (CUTE MIXER CM-1000, EYELA, Japan) for 30 min. After the roots were removed from the tubes, the tubes containing soil and saline water were centrifuged at 8,000 rpm for 15 min. The supernatants were removed, and the remaining soil samples were used as the rhizosphere samples. The roots removed during rhizosphere sample preparation were used as the endosphere samples after sterilization of the root surfaces following a series of washing steps: 70% (v/v) ethanol for 1 min, 3% (v/v) sodium hypochlorite solution for 3 min, 2.5% (w/v) sodium thiosulfate for 5 min, and rinsing the samples five times with sterile water. The sterile root samples were homogenized with liquid nitrogen (LN 2 ) and stored with the bulk soil and rhizosphere samples at −80 °C until DNA extraction. soil physicochemical analyses. Prior to analysis of the soil physiochemical properties, the collected bulk soil samples were air dried at room temperature in the shade for three days and passed through a 2-mm sieve. Soil pH and EC values were measured using a pH meter (CyberScan pH 1500, EUTECH, USA) and an EC meter (D-54, Horiba, Japan), respectively, after shaking the soil:water (1:5 w/v) mixture for 30 min at 200 rpm. The OM content was measured using the Wakely and Black method 64 . The TN content was measured by the Kjeldahl method 65 . The available phosphate content was measured by the Bray No. 1 method 66 . The exchangeable Ca 2+ , Mg 2+ , Na + and K + content was determined by extracting the soil samples by the ammonium-acetate (1 N, pH 7.0) infusion method and analyzing the extracts by inductively coupled plasma (ICP) analysis (GBC Integra XL, Australia) 67 .
sequence data processing. The sequences obtained from the MiSeq platform were processed using the UPARSE pipeline (ver.9.1.13_i86linux64, www.drive5.com/usearch) 71 . The paired-end reads were merged when the number and ratio of mismatches in the overlap region were less than 10 and 10%, respectively. Low-quality reads that were above the expected error threshold (>1) and short reads (<300 bp) were removed. To minimize the impact of sequencing artefacts, singletons were removed from the datasets 72 . Chimeric sequences were removed using the UCHIME de novo algorithm. The remaining high-quality sequences were clustered into OTUs with 97% identity by the UPARSE algorithm. Representative sequences of bacterial and archaeal OTUs were classified using the naïve Bayesian classifier 73 based on the Ribosomal Database Project (RDP) database 74 with a 60% confidence threshold. The OTUs affiliated with chloroplasts and archaea were subsequently removed from the bacterial OTU table, and those affiliated with bacteria were removed from the archaeal OTU table. The fungal OTUs were classified with the UNITE database 75 with a 60% confidence threshold, and OTUs that were assigned to non-fungi including plant and protozoa were removed from the fungal OTU table. Rarefaction curves were generated by Mothur (version 1.29.1; www.mothur.org) 76 . For assessing alpha-diversity indices, sequence reads of bulk soil and rhizosphere samples were rarefied to 20,000 and those of endosphere samples to 4,000, and six indices including coverage, number of OTUs, Chao1, ACE, Shannon, and Inverse Simpson were subsequently calculated using Mothur. Faith's PD was calculated using the picante package of R 3.3.1 (R Development Core Team, 2014) with a UPGMA phylogenetic tree that was constructed after MUSCLE alignment 77 of the OTU sequences. statistical analyses. Statistical analyses in this study were performed using R 3.3.1 (R Development Core Team, 2014). The OTU abundances in the dataset were normalized with Hellinger transformation 78 using the 'decostand' function of the vegan package 79 in R. Subsequently, NMDS analysis was performed based on the Bray-Curtis dissimilarity matrix using the "metaMDS" function of the vegan package. PCoA of weighted UniFrac distances of microbial communities was performed using the "beta_diversity.py" script in QIIME with a UPGMA phylogenetic tree that was constructed after MUSCLE alignment 77 . Differences in community structure between rhizocompartments were tested by ANOSIM 80 using the 'anosim' function and PERMANOVA 81 using the 'adonis' function based on the Bray-Curtis dissimilarity matrix calculated using the "vegdist" function within the vegan package. The dispersion of microbial communities within each compartment was analysed using the 'betadisper' function in the vegan package. The Mantel test was performed to analyse the correlation between microbial community compositions and soil physicochemical properties. All soil chemical variables except for pH were log-transformed for a normal distribution. The correlations between the Euclidean dissimilarity of the soil physicochemical properties and the Bray-Curtis distance of the microbial community composition were analysed with the 'mantel' function in the vegan package. CCA analysis was conducted using the 'cca' function of the vegan package, and forward selection of environmental variables based on RDA was examined using the 'forward.sel' function of the packfor package. To identify the OTUs that were specifically abundant in each compartment, indicator species analysis was conducted using the 'multipatt' function with the 'r.g' option in the indispecies package 82 and the OTUs with high indicator values (>0.5) and high abundances (>1% relative abundance) were screened from the results. Linear discriminant analysis (LDA) effect size (LEfSe) was performed online (http://huttenhower.sph.harvard.edu/galaxy) 83 . A logarithmic LDA score was set to 3.5 with statistical significance (p < 0.05). Core OTUs that were present in all samples of each rhizocompartment were detected using "get. sharedseqs" function of Mothur program.

Network analysis. Network analysis was conducted by following the Molecular Ecological Network
Analyses (MENA) Pipeline based on RMT at the University of Oklahoma's Institute for Environmental Genomics web server (http://ieg2.ou.edu/MENA/) 37 . The details of the process are provided in Shi et al. (2017). The input datasets contained the relative abundances of OTUs from the 23 samples of each rhizocompartment (bulk soil, rhizosphere, and endosphere). For network construction, the following options were used: OTUs detected in more than half of the samples were used; 0.01 was filled in the blanks with paired valid values; logarithm values