Environmental factors shape the epiphytic bacterial communities of Gracilariopsis lemaneiformis

Macroalgae host various symbionts on their surface, which play a critical role in their growth and development processes. However, there is still incomplete understanding of this epiphytic bacteria-host algae interactions. This study comprehensively analysed variation of the epiphytic bacterial communities (EBC) composition of red macroalga Gracilariopsis lemaneiformis at different geographic locations and environmental factors (i.e., nitrogen and phosphorus), which shape the EBC composition of G. lemaneiformis. The composition and structure of EBC were characterized using high throughput sequencing of the V3-V4 hypervariable region of the 16S rRNA gene. The results revealed that epiphytic bacteria varied significantly among three different geographic locations in China, i.e., Nan’ao Island (NA), Lianjiang County (LJ), and Nanri Island (NR). Redundancy analysis (RDA) showed that the relative abundance of Bacteroidetes, Firmicutes, Verrucomicrobia, and Epsilonbacteraeota at NR were strongly positively correlated with total nitrogen (TN), total phosphorus (TP), nitrate nitrogen (NO3-N), and dissolved inorganic nitrogen (DIN), but negatively correlated with nitrite nitrogen (NO2-N). The relative abundance of Cyanobacteria at NA and LJ were strongly positively correlated with NO2-N, but negatively correlated with TN, TP, NO3-N, and DIN. Besides, the Mantel test results indicated that the EBC composition was significantly correlated with these environmental factors, which was also confirmed by Spearman correlation analysis. Thus, environmental factors such as NO3-N and DIN play a key role in the community composition of epiphytic bacteria on G. lemaneiformis. This study provides important baseline knowledge on the community composition of epiphytic bacteria on G. lemaneiformis and shows correlation between different epiphytic bacteria and their surrounding environmental factors.

Macroalgae are major habitat-forming organisms in marine ecosystems that play a critical role in building the physical structure of habitats 1,2 . They produce nutrients 3 by releasing extracellular products into the surrounding environment during their growth process 4 , forming a unique microenvironment on the surface of the algal body 5 . Epiphytic bacteria release nutrient supplements 6 such as vitamin B12 7 and fatty acids 8 , regulate the growth of the macroalgae 4 , while some also produce antimicrobial compounds 9 . Besides, epiphytic bacteria regulate and prevent the settlement of other marine organisms. For example, the epiphytic bacteria of green alga Ulva reticulata produce bioactive compounds that inhibit the settlement of polychaete Hydroides elegans 10 on macroalgal surfaces 8,11 .
Studies on biodiversity and community composition of epiphytic bacteria of macroalgae have been scientists' focus in the last decade using culture dependent analysis. For instance, Burke et al. 12 revealed that Alphaproteobacteria and Bacteroidetes were the dominant groups, while Rhodobacteriaceae, Sphingomonadaceae, Flavobacteriaceae, and Sapropiraceae were the dominant families of bacteria on the surface of green macroalgae collected from Bare Island, La Perouse. Similarly, Tujula et al. 13 reported that Alphaproteobacteria and Bacteroidetes were the dominant epiphytic microbial communities on the surface of Ulva australis collected from Shark Point, Clovelly, NSW, Australia. Moreover, the epiphytic bacteria on Ulva australis were taxonomically and functionally

Results
Physicochemical factors of the environment. The physicochemical parameters of seawater from the study areas including temperature (Temp), pH, salinity (Sal), dissolved oxygen (DO), electrical conductance (EC), total dissolved solids (TDS), total nitrogen (TN), total phosphorus (TP), ammonia (NH 4 -N), nitrate (NO 3 -N), nitrite (NO 2 -N), and dissolved inorganic nitrogen (DIN) levels are shown in Fig. 2 & S. Tab 1. The temperatures in NR and NA were significantly higher (p < 0.05) than that in LJ (Fig. 2a), whereas DO concentration in NR and NA was significantly lower (p < 0.05) than that in LJ (Fig. 2b). There were no significant differences (p > 0.05) in pH between NR and NA, or between NR and LJ (Fig. 2a). The salinity and electrical conductance in NR were significantly higher (p < 0.05) than in LJ (Fig. 2b,c). There were no significant differences (p > 0.05) in salinity and electrical conductance between NR and NA, or between NA and LJ (Fig. 2b,c). Similarly, no significant differences in TDS were found among NR, NA, and LJ (Fig. 2c). The concentrations of TN and TP in NR were significantly higher (p < 0.05) than NA and LJ (Fig. 2d). On the other hand, there was no significant difference (p > 0.05) in NH 4 -N concentration between NR and NA, but was significantly higher (p < 0.05) than LJ (Fig. 2e). The levels of NO 3 -N and DIN in NR were significantly higher (p < 0.05) than NA and LJ (Fig. 2e, www.nature.com/scientificreports/ whereas the concentration of NO 2 -N in NR was significantly lower (p < 0.05) than LJ but not (p > 0.05) NA (Fig. 2f).
Diversity of epiphytic bacterial communities on G. lemaneiformis. Twenty Geographic comparison of EBC on the surface of G. lemaneiformis. Based on the analysis of OTUs, an NMD (Non-metric multidimensional) ordination biplot indicated clear clustering of EBC. The stress and RSQ (Squared correlation) values were 0.14391 and 0.91376, respectively (Fig. 4). The G. lemaneiformis samples at LJ generally clustered together, with a similar phenomenon also observed at NA and NR. Notably, the composition of EBC at NA and LJ were more similar compared with that at NR based on Dimension 1. The EBC on the surface of G. lemaneiformis were considerably different in the three geographic locations (Fig. 5). Bacteroidetes was the most predominant phylum on the surface of G. lemaneiformis in NR, with a relative abundance of 44.982%, followed by Proteobacteria (28.941%) and Firmicutes (20.565%). Samples (G. lemaneiformis) collected from NR exhibited significantly (p < 0.05) diverse EBC on their surfaces compare with samples from NA and LJ. For examples, Proteobacteria was the most predominant phylum on the surface of G. lemaneiformis in NA and LJ, with relative abundances of 61.727% and 64.437%, respectively, which was significantly higher than NR (p < 0.05). The second dominant phylum was Bacteroidetes, with relative abundances of 30.140% at NA and 27.008% at LJ, which were significantly lower than that in NR (p < 0.05). On the other hand, the relative abundance of Firmicutes in NR was significantly higher than NA and LJ (p < 0.05), while the abundance of Deinococcus-Thermus in NA was significantly higher than NR and LJ (p < 0.05).
Redundancy analysis and heat map. Redundancy analysis (RDA) was used to explore the correlations between epiphytic bacteria on G. lemaneiformis at the phylum level and environmental factors (Fig. 6). The RDA results showed that epiphytic bacteria at NR, NA, and LJ clustered in groups. RDA1 and RDA2 explained 47.58% and 29.16% of the total variance, respectively. At NR, NO 3 -N, DIN, TN, and TP were strongly positively correlated with the relative abundance of Bacteroidetes, Firmicutes, Verrucomicrobia, and Epsilonbacteraeota, whereas NO 2 -N was negatively correlated with the EBC. Similarly, NO 2 -N was strongly positively correlated with the relative abundance of Cyanobacteria, whereas NO 3 -N, DIN, TN and TP showed negative correlations with the relative abundance of Cyanobacteria at NA and LJ. The RDA results together with Pearson correlation analysis (Fig. 7) revealed that NO 3 -N and DIN had a significant influence (p < 0.05) on macroalgal and microorganisms distribution. When the Mantel tests was used to examine the contribution of environmental factors to the assembly of community composition of epiphytic bacteria on G. lemaneiformis, the results (at both phylum and genus level) showed that community composition correlated significantly with NO 3 -N, NO 2 -N, DIN, TN, www.nature.com/scientificreports/ and TP (Table 1). There was also significant correlations between temperature, pH, DO and the genus communities composition. Spearman correlation analysis was used to investigate how environmental factors affect the EBC compositions on G. lemaneiformis. The results revealed a relation between the composition of EBC on G. lemaneiformis and environmental factors of the surrounding seawater, as shown by the heat map based on Spearman correlations (Fig. 7). The variation of EBC at the genus level showed that all epiphytic bacterial composition, except Dokdonia and Maribacter at NR, exhibited a positive correlation with NO 3 -N, DIN, TN, and TP (0.31 < r < 0.81), but a negative correlation with NO 2 -N (-0.77 < r < -0.35). Notably, the abundance of Escherichia-Shigella in samples at NR2 was significantly positively correlated with NO 3 -N, DIN, TN, and TP. The abundance of all epiphytic bacterial composition except Flavobacteriaceae_unclassified, Hellea, Arenicella, Leucothrix, and Caulobacteraceae_unclassified at NA showed positive correlation with NO 2 -N (0.10 < r < 0.70), but negative correlation with NO 3 -N,

Functional prediction and comparative analysis.
To determine the functional profile of the epiphytic bacteria communities on G. lemaneiformis from the three geographic locations, phylogenetic investigation by reconstruction of unobserved states (PICRUSt) was used to analyze and predict the functional capabilities of EBC on G. lemaneiformis. The KEGG pathway analysis from all the predicted metagenomes revealed that metabolism was the most enriched functional composition of level 1 in all groups i.e., NR: 50.060%, NA: 51.384%, and LJ: 52.717%, respectively (S. Figure 3). Within the metabolism assignments for all groups, the enrichment of amino acid, carbohydrate, and energy metabolisms were the highest ( Fig. 8 and S. Tab. 4). The metabolisms of amino acid showed 'amino acid related enzymes' , while 'arginine and proline' were found as most and 'phenylalanine' as least predominant in all groups (S. Figure 4). ' Amino sugar, nucleotide sugar' , 'glycolysis/gluconeogenesis' , 'pyruvate' in carbohydrate and 'oxidative phosphorylation' except 'photosynthesis-antenna proteins' in energy metabolisms were found as the most abundant in all groups (S. Figures 5 and 6).

Discussion
To the best of our knowledge, this is the first study that used next generation sequencing to evaluate geographic and environmental specificity of epiphytic bacterial communities (EBC) on cultivated red macroalgae G. lemaneiformis. The focus of previous studies have been on spatial and temporal diversity of EBC on macroalgae 13,17,20 , while few reports have attributed differences in the EBC composition on macroalgae surface from different geographic locations to environmental selections 17,32 . Although environmental selection can produce substantial biogeographic patterns in the global microbe population 33 , there are selective mechanisms that determine the assemblage of species in one environment or the other 12 . Thus, to better understand this phenomenon, we investigated the EBC composition, their correlation at the spatial level, and the role of different environmental factors (e.g., nitrogen and phosphorous) in shaping EBC composition. We demosndtrated that the similarity among EBC was higher in G. lemaneiformis from sampling sites that are 300 m apart. For instance, at NR 17.5%-66.92% www.nature.com/scientificreports/ similarity was found, with 12.5%-64.77% similarity at NA, and 50%-75.07% similarity at LJ, when compared with G. lemaneiformis samples from NA and LJ which are 500 km apart, and exhibit only 7.5% similarity. The similarity of the EBC at different sampling sites in any of the three locations was much higher than that between the two most distant locations (NA and LJ, 500 km apart). This observation fits into the distance-decay of similarity model, where decrease in the similarity of microbial communities is related to an increase in geographic distance 28 . Our findings are consistent with the study by Roth-Schulze et al. 1 , where variations in the similarity of EBC correlated with distance in marine, soil, sediment, and plant-associated ecosystems 1 .
Our study revealed considerable variations of EBC composition on G. lemaneiformis among the three geographic locations (NR, NA, and LJ), where the EBC were mainly composed of Flavobacteriaceae, Saprospiraceae,   Figure 1 and S. Tab. 3). These observations are consistent with previous studies of marine algal-associated bacterial communities. For example, the EBC on the surface of the red algae, Delisea pulchra, comprised of Rhodobacteraceae, Sphingomonadaceae, Flavobacteriaceae, Planctomycetaceae, and unclassified Grammaproteobacteria 34 , whereas the green alga U. australis hosts Alphaproteobacteria, Bacteroidetes, Planctomycetes, and unclassified Grammaproteobacteria 12 . Fucus vesiculosus is reported to be associated with high proportions of Alphaproteobacteria, Bacteroidetes, Verrucomicrobia, and Cyanobacteria in summer, but mainly host Grammaproteobacteria in winter 35 . Low abundance of epiphytic bacteria in the phyla Cyanobacteria and Verrucomicrobia were found in this study, which have also previously been reported to be widespread in different seaweeds 16,35 , suggesting that members of these phyla play an important role in algae-bacteria interaction. Cyanobacteria are widely dispersed in rivers, lakes, and the ocean, with the ability to withstand harsh environmental conditions 36 . The core microbial communities are often defined as the suite of members shared among microbial communities from similar habitats 37 . Thus, discovering the core microbial communities is important for understanding stable and consistent components across complex microbial assemblages. In this study, significant differences in the core EBC were found across the three geographic locations (S. Figure 2 38 . Although few studies have revealed that dimethylsulfoniopropionate (DMSP) play a key role in macroalgal-bacterial interactions 39 , Alphaproteobacteria, which are morphologically and metabolically diverse 13 , have a critical role in the assimilation of DMSP in oceans and contribute significantly to global sulphur cycling 40 . While DMSP produced by macroalgae including Ulva sp. that usually attract some bacteria 39 , some members of the family Hyphomonadaceae are widely dispersed in marine environment and play a vital role in mineralizing dissolved organic matters in oligotrophic waters 17 . Based on our findings, it seems the ability of G. www.nature.com/scientificreports/ lemaneiformis to grow in the three different oligotrophic waters may partly benefit from the mineralization by these microorganisms on its surface. After all, a previous study by Holmström et al. 41,42 had shown that many bioactive compounds produced by the genus Pseudoalteromonas presented on the surface of Ulva lactuca play an important role in chemical defense against biofouling in marine environment. Several studies on holobionts have shown host-specificity of EBC within different species or geographic differences of EBC between different locations 1,15,17,20,32 . For instance, the EBC on different host species were taxonomically and functionally distinct, which is not due to the phylogeny of the host but to the physicochemical properties of the host 15 . Thus, Lachnit et al. 43 have suggested that the difference in the EBC on different algal hosts are due to the physiochemical properties of the macroalgae surface, which allows the settlement and colonization by specific bacteria. In terms of geographic diversity, Roth-Schulze et al. 1 demonstrated that the same algal-genus www.nature.com/scientificreports/ across different regions can harbor different microbial communities at the taxonomic and functional level, due to local geographical conditions and host specificity. When the specificity of G. lemaneiformis associated EBC and their correlation with different environmental factors was examined, the concentrations of TN, TP, NO 3 -N, and DIN at NR were found to be significantly higher compared to NA and LJ (p < 0.05), while NO 2 -N concentration at NR was significantly lower compared to LJ (p < 0.05) and NA (p > 0.05), although all sites had similar environmental conditions i.e., temperature, pH, salinity, dissolved oxygen, electrical conductance, and total dissolved solids. Changes in environmental conditions, such as nutrient concentration, nutrient ratio, and temperature, can affect the physicochemical properties of macroalgae 44 . For instance, Van Alstyne 45 reported that Ulva lactuca and Ulva obscura grown in high nitrogen concentration have higher DMSP content than in low nitrogen concentration, suggesting that different environmental conditions can affect the content of algal-associated compounds. Increasing evidence shows that compounds associated with algal surface can mediate epiphytic bacterial colonization, abundance, and community composition of macroalgae 18,19,46 . Given that these studies suggest that the differences in EBC composition of macroalgae due to changes in environmental factors can be attributed to variations in physicochemical properties on macroalgae surface, we speculate that the variation of the EBC on G. lemaneiformis was probably due to differences in algal-associated compounds caused by environmental conditions. Generally, environmental factors that affect the diversity, composition, and structure of microbial communities include temperature, pH, salinity, and DO 23,24,30 . Since many marine microbes require oxygen for growth and survival, DO is crucial to microbial communities in coastal ecosystems 30 . Thus, a higher DO concentration could relate to an increased taxonomic and functional diversity of microbial communities. In our current study, the concentration of DO at NR and NA was significantly lower (p < 0.05) compared to LJ, although the EBC diversity in NR was higher than that in NA and LJ. These results suggest that DO might not be the main factor that affects the diversity of www.nature.com/scientificreports/ EBC on G. lemaneiformis. Even though temperature is one of the important environmental factors that affect microbial communities in many habitats 23 , no significant differences in ocean water temperature was observed among the three sampling sites (NR, NA, and LJ). This indicates that temperature is not the main factor that affects the diversity of EBC on G. lemaneiformis, which is consistent with the report by Zozaya-Valdés et al. 27 that environmental factors other than temperature seems to impact on alga-associated microbiome. Considerable changes in EBC on the surface of G. lemaneiformis was found at the three geographic locations, regardless of the taxonomic level (Fig. 5, S. Figure 1 and S. Tab. 3). At phyla level (Fig. 5), Bacteroidetes was the most predominant phylum at NR, whereas Proteobacteria was dominant at NA and LJ. Our results are similar to a previous study on surface-associated bacterial communities on macroalgae, which revealed that epiphytic bacteria of U. australis are dominated by Proteobacteria and Bacteroidetes 12 . Although at the family level (S. Tab. 4), the relative abundance of Muribaculaceae, Ruminococcaceae, and Lachnospiraceae at NR was significantly higher (p < 0.05) than NA and LJ, no significant differences (p > 0.05) were observed between NA and LJ. This indicates that there are significant differences in the EBC composition on G. lemaneiformis between NR and NA and between NR and LJ, but not between NA and LJ, which is contrary to previous reports that algal EBC varies between different locations 1,32 . Probably the explanation for this observation could be due to secretions of some compounds on the surface of G. lemaneiformis that regulates EBC composition.
When RDA 47 and Spearman correlation analysis 30 were employed, the EBC on G. lemaneiformis at NR exhibited positive correlation with TN, TP, NO 3 -N, and DIN, and negative correlation with NO 2 -N ( Figs. 6 and 7). A clinically important bacterial genus, Escherichia-Shigella, was found on the G. lemaneiformis at NR2, and was significantly correlated with NO 3 -N, DIN, TP, and TN (Fig. 7). Escherichia-Shigella is an enteric pathogen that seeps from waste water treatment plants 48 , and can secrete toxins to the surrounding environment 49 . The correlation between Escherichia-Shigella abundance and environmental factors could be of ecological or health concerns 50 . The EBC on G. lemaneiformis at NA and LJ correlated positively with NO 2 -N and negatively with TN, TP, NO 3 -N, and DIN. Thus, the differences in G. lemaneiformis associated EBC from the different geographic locations is due to environmental factors but not geographical locations. These findings are consistent with a previous study in which Asparagopsis-associated bacterial communities were observed to be modulated by environmental conditions 17 . This is further supported by Roth-Schulze et al. 1 , who reported that most Ulvaassociated bacterial communities are horizontally derived from the environment because macroalgae U. australis isolated from distinct geographical locations are found to share only two low-abundance OTUs. However, in the current study, G. lemaneiformis from NA and LJ (which are 500 km apart) only shared 7.5% similarity. Intriguingly, environmental factors but not geographical locations account for the differences in the EBC on G. lemaneiformis, which is contrary to the previous report by Roth-Schulze et al. 1 . Therefore, we assume that the EBC on G. lemaneiformis is regulated by environmental factors such as nitrogen and phosphorus rather than geographical locations.
Interestingly, replicate samples of Gracilaria species from the same location showed some variability in their EBC composition, as indicated by Bray-Curtis similarity of 17.5-66.92% at NR, 12.5-64.77% at NA, and 50-75.07% at LJ. This observation is similar to previous studies, where a high level of intraspecies variability of EBC was found associated with macroalgae 1,14 . The depth of sequencing 13 , or the lottery model in which random variation can be seen in the recruitment of the EBC on algal surface 21 might account for these observations. The lottery model proposes that recruitment of species with the same tropic abilities in any ecosystem follows a stochastic fashion i.e., who ever gets there first wins the space, but they must share similar ecologies 21,51,52 . Thus, to further ascertain that environmental factors indeed play a role in the selection of the EBC on macroalgae, future studies would explore comparative EBC composition of other algae in the presence of similar environmental factors.
Although considerable differences in the composition of EBC on G. lemaneiformis was observed at the three geographic locations, the functional composition of EBC were similar. There is an emerging consensus that the bacterial community composition on macroalgae is mainly driven by functional genes rather than taxonomic composition 14,16 . In the current study, the functional capabilities of EBC on G. lemaneiformis at the different locations were similar, which indicated similar functions of these bacterial communities at these locations. The fundamental factor responsible for the differences in the composition of bacterial communities was the microenvironment, which is established by the physiological and biochemical properties of the algal host 3,16 . We found bacterial genes associated with the amino acids glycine, alanine, arginine, proline, glutamic, and aspartic acids 53 (S. Figure 4). Given that G. lemaneiformis is rich in proteins and a source of phycoerythrin production 54 , algal proteins can be selectively used by the bacteria on the surface of G. lemaneiformis, which might explain the higher percentage of bacterial genes assigned to amino acid metabolism (Fig. 8). The presence of the amino acid metabolism associated genes indicates an adaptive mechanism of epiphytic bacteria to use organic matter from macroalgae for growth. Besides, the abundant functional genes related to carbohydrate metabolism suggest an involvement in the mineralization of dissolved organic matter under the oligotrophic environment of coastal water (Fig. 8). Similar findings have previously been reported by Selvarajan et al. 16 , where a higher abundance of bacterial functional genes associated with carbohydrate metabolism were found in all seaweeds at intertidal zones of Mission Rocks, Cape Vidal, Leven Point, South Africa. Members of the Hyphomonadaceae family are reported to be the main contributors to mineralization of dissolved organic matter in oligotrophic conditions 17 . In the current study, Hyphomonadaceae were also found to be highly abundant in the EBC on G. lemaneiformis (S. Tab. 3). Thus, we speculate that Hyphomonadaceae harbor significant number of functional genes related to carbohydrate metabolism, which might be the reason for pyruvate metabolism being the most abundant in all groups (S. Figure 5), because energy from pyruvate decomposition is stored in the form of high-energy phosphates for microbial growth and reproduction 55 . From an ecological perspective, macroalgae may also exploit QS inhibitors and antimicrobial compounds produced by numerous epiphytic bacteria to protect their surfaces from pathogens, herbivores and fouling organisms 56  www.nature.com/scientificreports/ of other secondary metabolites showed the lowest abundance in 'Metabolism' (Fig. 8), these functions cannot be ignored in the chemical defense process of algae. The limitation of the current study is that all the inferences are based on predicted functions by PICRUSt annotation, which cannot completely substitute metagenomic research, hence, there would be inherent inaccuracies in interpreting functional biogeography in some ecosystems.

Conclusion
The current study demonstrates that EBC associated with G. lemaneiformis varied significantly at three geographic locations. The relative abundance of EBC at NR was strongly positively correlated with total nitrogen, total phosphorus, nitrate, and dissolved inorganic nitrogen, whereas the EBC at NA and LJ only strongly correlated positively with nitrite. Therefore, environmental factors such as nitrate and dissolved inorganic nitrogen are the key factors that shape the EBC composition on G. lemaneiformis but not geographic locations. This is the first study that provides insight into the EBC composition on G. lemaneiformis at three geographic locations and shows that environmental factors influence their composition.

Materials and methods
Collection location and sampling. Individual thalli from G. lemaneiformis were collected from three Samples preprocessing and elution of epiphytic bacteria. The elution of epiphytic bacteria from G. lemaneiformis followed the methods described by Burke 58 . Briefly, before extraction, algae samples were washed three times with filter-sterilized seawater to remove loosely associated microorganisms, microalgae, and phytoplankton. Five grams wet weight of G. lemaneiformis were placed into 250 mL conical flask containing 100 mL of calcium-and magnesium-free artificial seawater (CMFSW, 0.45 M NaCl, 10 mM KCl, 7 mM Na 2 SO 4 , 0.5 mM NaHCO 3, and 10 mM EDTA) and 1 mL filter-sterilized rapid multienzyme cleaner (3 M, Shanghai, China). Samples were incubated in an oscillating concentrator (DHZB-500, Jintan Science Analysis Instrument Co., Ltd.) for 3 h at 25 °C and 180 rpm. To completely elute the surface epiphytic bacteria on G. lemaneiformis, samples were sonicated (JY92-IIN, Ningbo Scientz Biotechnology Co., Ltd.) for 10 min (50 W, 5 s/5 s) 59 . Algal material was removed by sterilized tweezers and the remaining solution centrifuged at 8000 rpm for 20 min. The supernatant was removed from the upper layer, and about 1.0 mL of liquid was left at the bottom of the centrifuge tube to be mixed with the bacterial precipitate and transferred to new centrifuge tube before being centrifuged at 12,000 rpm for 5 min to completely remove the supernatant. The obtained bacterial precipitate was thoroughly washed once with 1 × TE buffer, followed by dissolving the bacterial precipitates in 1 mL 1 × TE buffer and stored at -20 °C for DNA extraction.
Detailed DNA extraction. The DNA from epiphytic bacteria on G. lemaneiformis was extraction by the procedure described previously 60 . Briefly, bacterial precipitate samples were centrifuged at 12,000 rpm for 5 min to remove the supernatant. Next, 100 μL TE buffer was added to each centrifuge tube to thoroughly suspended the bacterial precipitate, followed by the addition of 60 μL 10% SDS solution and 10 μl 20 mg·ml −1 proteinase K, and then mixed gently before being incubated at 37 °C for 1 h. After incubation, 100 μl 5 mol·L −1 NaCl solution was added to each sample and the tubes inverted repeatedly at least 10 times to mix, after which 80 μl CTAB/ NaCl solution was added to each tube followed by gentle mixing and incubation at 65°C for 10 min. Next, 700 μl Chloroform: Isoamyl Alcohol (24:1) was added and the samples mixed gently before being centrifuged at 12,000 rpm for 2 min. The supernatant was collected and transferred to new sterile 1.5 mL polypropylene centrifuge tubes, followed by the addition of an equal volume of Phenol Denoising of sequence data. Perl script was used to split the offline data into different sample data according to the barcode sequence, while the barcode and PCR primer sequences were cut off 62 . Splicing of PE Reads followed the following steps 63 : (1) Bases below the mass value of 20 at the end of the read were filtered and a 50 bp window was set. When the average mass value in the window was less than 20, the back-end base is cut off from the window and the read below 50 bp after quality control are filtered. (2) According to the overlapping relationship between PE reads, pair of reads are merged to form a sequence, with a minimum overlap length of 10 bp. (3) The allowable maximum mismatch ratio of the overlap region of the splice sequence was 0.2 and nonconforming sequences were screened. (4) Samples were distinguished according to the barcode at both ends of the sequence and the primer, and the sequence direction was adjusted. The mismatch number allowed by the barcode was 0, and the maximum primer mismatch number was 2. The Tags sequences obtained after the above processing are compared with the database (Gold database, http:// drive5. com/ uchime/ uchime_ downl oad. html) to detect chimera sequences, and finally, the chimera sequences were removed to obtain the final Effective Tags 64 .
OTU and species communities analysis. The Uparse software (Uparse v7.0.1001, http:// drive5. com/ uparse/) was used to perform 97% identify clustering of all Effective Tags sequences of all samples to form OTU 65 . The representative OTU sequences were selected and compared with GreenGene (16S chloroplast, mitochondria, http:// green genes. secon dgeno me. com/), RDP (16S, http:// rdp. cme. msu. edu/), Silva (18S, http:// www. arb-silva. de) and Unite (ITS, http:// unite. ut. ee/ index. php) database using the assign_taxonomy.py (http:// qiime. org/ scrip ts/ assign_ taxon omy. html) script and RDP Classifier method 66 in Qiime to obtain species annotation information (the default confidence threshold was above 0.8). OTU and its Tags annotated as chloroplast or mitochondria and not annotated to the boundary level were removed. Perl script was used for statistics of Effective Tags data of each sample, low-frequency Tags data, Tags annotation data, data annotated by chloroplast and mitochondria, and the number of OTU obtained for each sample 67 . The R software was also used to carry out statistical analysis of the annotation proportion of each classification level of OTU and the relative abundance of species in each classification level. The relative abundance heatmaps of OTU level and genus level, and cluster analysis between samples and species were performed R software.
Alpha and beta diversity analysis. Alpha_diversity.py (http:// qiime. org/ scrip ts/ alpha_ diver sity. html) script in the QIIME software package was used to calculate four diversity indexes (Observed species, Chao, Shannon, phylogenetic diversity) 68 based on the uniform OTU abundance table. The calculation of phylogenetic diversity required the phylogenetic tree data of OTU. Jackknifed_beta_diversity.py (http:// qiime. org/ scrip ts/ jackk nifed_ beta_ diver sity. html) script in the QIIME software package was used to calculate three beta diversity distances (Bray Curtis, Weighted Unifrac, Unweighted Unifrac) based on the homogeneous OTU abundance table 69 . For nMDS, R software was used for nMDS analysis, and the drawing was based on the uniform OTU abundance table 70 .

Correlation analysis of environmental factors.
For redundancy analysis (RDA), the RDA function in the vegan package was used for ranking analysis 47 . Through envfit function, r 2 and P values of the influence of each environmental factor on species distribution was calculated, followed by RDA analysis to screen out environmental factors with significant influence. Applying the 'bioenv' function in the vegan package, the environmental factor or combination with the largest correlation with the species matrix (Spearman) were screened out, and the environmental factor obtained by screening was analyzed by targeted RDA. For Spearman correlation, Spearman correlation values of species and environmental factors were calculated using the 'Corr. test' function of psych package in R and its significance tested 30 . The pheatmap function of the pheatmap package was used for visualization. The vegan package in R was used for the Mantel test 71 . Vegdist function was used to transform the distance matrix for 'species matrix' and 'environmental factor data matrix' . The Spearman correlation analysis was performed for the two types of matrices with the Mantel function to obtain r and P values. www.nature.com/scientificreports/