Evaluation of influent microbial immigration to activated sludge is affected by different-sized community segregation

Activated sludge (AS) microbial communities were analyzed for seasonal variation, a disturbance-recovery event, and separated small aggregates (SAG) to study the influent immigration effect using both neutral immigration model and mass-balance model with operational parameters. SAG differed with AS, and higher immigration impact on SAG was confirmed by both models. Adding the SAG community segregation in the latter model to evaluate the contribution of influent immigration to community disturbance-recovery showed increased impact of immigration.


INTRODUCTION
The activated sludge (AS) wastewater treatment plant provides a well-defined model system to study microbial ecology. The AS community is shaped by deterministic (Niche) factors and stochastic (Neutral) factors (e.g., neutral immigration from wastewater influent) 1 . Two types of numerical models have been used to quantify the influent immigration effect. One is the neutral community (immigration) model (NIM) 2 , employing the abundance and frequency of operational taxonomic units (OTUs) in metacommunities to calculate the community-level migration probability 1,3-6 . The other is mass-balance model (MBM) that defines the immigration as the proportion of reactor biomass derived from influent biomass influent biomass influent biomassþlocal growth , using abundances of specific OTUs to calculate the immigration rates and net growth rates 7 . The NIM immigration describes the probability that a dead individual in the reactor is replaced by an immigrant from the influent, differentiated from the reproduction of local community. Intrinsically, the MBM bases on the fractionation of influent biomass and local growth, which also aims to differentiate the influent immigration and local community. The MBM has been increasingly used because of the quantitative measurements of the OTU growth activity 8 . Although steady-state metacommunities demonstrated that operational parameters (e.g., solid retention time, SRT) are important for the evaluation of immigration 4,9 , the operation parameters were not directly employed in the original MBM. Recently, Frigon and Wells 10 and Guo 11 developed the mass-flow immigration model, which is a MBM incorporating operational parameters (MBM-OP), to calculate the immigration rate of specific OTUs. By using operational parameters, this model is comprehensible for both wastewater engineers and microbiologists. In wastewater treatment bioreactors, community variation has been shown in relation to the size and status (settled and unsettled) of the aggregates 12,13 . In conventional AS, the relative importance of influent immigration on different-sized aggregate communities (i.e., easily settleable large-size flocs and lowsettleability small-size particles) has not been explored. Besides steady-state modeling, the bioreactors often experience disturbances 9,14 , such as sudden high flow rates which lead to a biomass washout. The non-steady-state community serves as a good example to study the community recovery and to compare the contributions of local community growth and influent immigration.
Our study aimed at monitoring the long-term variation of AS communities in different seasons and comparing the influent immigration effect on different-sized AS communities using NIM and MBM-OP models. In addition, the short-term non-steady-state (under operational high-flow disturbance) AS community's recovery attributed to local community growth and influent immigration was evaluated.

AS community variation
The AS communities (between 2013 and 2019) showed that abundant microorganisms were relatively consistent during the long-term observation, and consisted of a core group at the order level ( Fig. 1b) and at genus level ( Supplementary Fig. S1). Seasonal changes and disturbance did not alter the core group 14 . The variation between small aggregates (SAG) and AS was clearly shown (Fig. 1b), being consistent in different seasons. The most abundant members in SAG were different from AS communities, but similar with the influent community's most abundant members, indicating that influent microbiome is an important origin for the SAG. The community richness (number of observed OTUs) was slightly higher in SAG than in AS ( Supplementary Fig.  S2), inferring that the SAG community was a union of AS and influent communities. The shared numbers and abundances of OTUs are indicators of immigration effect 7,9,15 . A large number of OTUs were shared among the influent, AS, and SAG (Fig. 1c). The shared OTUs between influent and AS took up 89.1 ± 0.1% of the influent community total number of reads and 54.0 ± 4.3% of the AS community total number of reads, which increased between influent and SAG (97.0 ± 0.1% and 73.8 ± 3.0%, respectively), indicating higher immigration effect for influent-SAG than for influent-AS. The Bray-Curtis distances also showed higher similarity between influent and SAG than between influent and AS (Fig. 1d). Permutational multivariate analysis of variance (Bray-Curtis distance, permutation = 999) was performed to test the two-way factor effect of sampling time and aggregate size. Our results showed significant effect of aggregated size (p = 0.001), but not sampling time (p = 0.911) or their interaction (p = 0.866).
This phenomenon of community segregation was observed in different forms of biomass in biofilm and granular sludge reactors 13,16,17 . Less work has been reported for conventional AS. SAG have higher accessibility to substrates and lower retention time and are more subjected to influent immigration effect as revealed in our results. An ecological growth trait of community, weighted rrn gene copy number per genome, showed that SAG has higher average rrn than AS, implying possibly higher growth activity ( Supplementary Fig. S3) 14,18,19 .
The influent immigration impact was quantified using the NIM 2,3 and MBM-OP (Eq. 14) models. The migration probability (m NIM ) was higher for influent-SAG (0.40) than for influent-AS (0.33) (Fig. 1e). The distribution of immigration rates for each OTU based on MBM-OP (m i,MBM-OP ) showed that influent-SAG shifted towards higher immigration rate compared to influent-AS ( Fig.  1e), in accordance with the observation of m NIM . These observations suggest a neglected fact that the influent immigration to bioreactors is associated with the aggregate size-differentiated community. The SAG community is less settleable compared to flocs and more subjected to washing out from the reactor, leading to higher dependency on the influent immigration. Moreover, the influent community that enters the reactor and the SAG community share a more similar suspension status compared to flocs, which may attribute to the higher community similarity (Fig.  1d) and immigration (Fig. 1e).

Immigration contribution to community recovery
The AS community was disturbed by a high-flow event which led to a drop in biomass density, shifting away from the steady-state community and recovered after 3 days (Fig. 1d). The recovery could be attributed to different mechanisms, including growth of the indigenous microorganisms 9,15 and influent immigration 1,2,9 . Using the MBM-OP immigration model, the influent community can be partitioned to growth and neutral immigration (nongrowth) (Fig. 2a). A conceptual model MBM-OP-SAG was constructed (Fig. 2c). In both models, OTUs were classified into three groups based on their relative abundances in the influent and in the AS: (i) not in influent but in reactor community, (ii) growth group (calculated net growth rate u i,MBM-OP > 0, Eq. 16), (iii) complete immigration group (calculated net growth rate u i, MBM-OP ≤ 0, Eq. 16). The relative abundances (Fig. 2b, d) showed that the immigration group was using the MOM-OP-SAG model more ( Fig. 2d) than using the MBM-OP-AS model (Fig. 2b). For the disturbed and recovered AS communities, the MBM-OP-AS model showed similar abundances of immigration group (Fig.  2b). The MBM-OP-SAG model estimated higher contribution of immigration in recovered than in disturbed community (Fig. 2d). The two models differed in certain taxa's growth rates and immigration rates, resulting in a list of genera classified into growth group by MBM-OP-AS model while into immigration group by MBM-OP-SAG model (Supplementary Table S1). Microorganisms such as Methanobacterium, Methanobrevibacter, Clostridiales, Synergistales, and Bifidobacterium are known anaerobic microorganisms and likely to be carried by wastewater from sewer 20 and not to grow in AS. The MBM-OP-SAG model was successful to identify these genera as influent immigration microorganisms.
Our results demonstrated AS community variation in differentsized aggregates. NIM and MBM-OP models were compared, both resulted in higher immigration impact for the small aggregate (SAG) community than for the AS community. The different-sized community variation led to an approach (MBM-OP-SAG model) of modeling influent and segregated communities for immigration effects, which was applied to investigate the community recovery. The MBM-OP-SAG model showed higher immigration contribution to recovery than the MBM-OP-AS model. This approach could be applied to other bioreactor systems to increase modeling accuracy of microbial growth and immigration, compared to the conventional NIM and MBM-OP models. The raw sequences (2 × 250 nt) were paired, quality-filtered using DADA2 23 in Qiime2 pipelines 24 . The OTUs were defined using the default 100% similarity. The taxonomic identification was assigned using Green-Genes (version gg_13_8) reference database at 99% similarity 25,26 . Microbial community diversities (alpha-diversity, beta-diversity) were analyzed using R "vegan" package 27 . The weighted rrn copy number was estimated by the normalized OTU table using PICRUSt 28 with metagenome references in Integrated Microbial Genomes (IMG) system 29 .
Weighted rrn copy number ¼ where f 16S_norm,i is the relative abundance of the ith OTU in total 16S rRNA gene amplicon sequence reads normalized by PICRUSt; N rrn,i is the rrn copy number of the ith OTU in the IMG database.

Neutral immigration model
The probability of influent immigration was defined by Sloan et al 2 . and calculated using the neutral immigration model. The model described that in a microbial community with N T individuals, an individual die or leave the system and is immediately replaced by an immigrant with probability of m, or by reproduction of a local member of the community with probability of 1-m. For the i th species (initial number N i ), the probability (Pr) of increase by one, stay the same, or decrease by one is expressed, respectively, by the following three equations: where p i is the relative abundance of the ith species, α i is the advantage parameter of the ith species over the other species. The relative abundance can be expressed as x i ¼ Ni NT . Assume that N T is large enough and x i becomes continuous, and α i = 0 so the species has no advantages, then the probability density function of x i ,ϕ i =(x i ,t) can be expressed as (for more details please see Sloan et al 2 .): where c is a constant so that The probability of the ith species observed in any local community is: Pr species i is present with a relative aubndance > d ð Þ ¼ where d is the detection threshold. This equation can be used to calibrate community data (frequency and mean relative abundance plot) and solve for m.
AS and SAG communities included six replicates each. The immigration rate m was estimated for influent and AS communities and influent and SAG communities, using the R codes from Burns et al 3 . with 100 permutations (R code in Supplementary Information).

Mass-balance model with operational parameters
The MBM with operational parameters (mass-flow immigration model) calculation for net growth rates was described by Guo 11 .
The mixed liquor heterotrophic biomass is contributed from the influent biomass and the microbial growth from the consumption of substrate resources. The immigration is defined as the fraction of influent biomass contribution: where m is the mass-flow immigration rate; X OHO,Inf and X OHO,ML are the active heterotrophic biomass in influent and mixed liquor; f OHO,Capt is the fraction of influent biomass captured by the mixed liquor which is default 1;Y OHO is the heterotrophic growth yield; S consumed is the consumed substrates.
For continuous stirred tank reactors, the heterotrophic biomass based on mass-balance calculation 30 is: Rearrangement gives: Under the assumption that f OHO;Capt Ã X OHO;Inf ( Y OHO Ã S consumed , substitution of Eq. 10 to Eq. 8 gives: The immigration rate of a specific ith OTU is: