Microbial co-occurrence network topological properties link with reactor parameters and reveal importance of low-abundance genera

Operational factors and microbial interactions affect the ecology in anaerobic digestion systems. From 12 lab-scale reactors operated under distinct engineering conditions, bacterial communities were found driven by temperature, while archaeal communities by both temperature and substrate properties. Combining the bacterial and archaeal community clustering patterns led to five sample groups (ambient, mesophilic low-solid-substrate, mesophilic, mesophilic co-digestion and thermophilic) for co-occurrence network analysis. Network topological properties were associated with substrate characteristics and hydrolysis-methanogenesis balance. The hydrolysis efficiency correlated (p < 0.05) with clustering coefficient positively and with normalized betweenness negatively. The influent particulate COD ratio and the relative differential hydrolysis-methanogenesis efficiency (Defficiency) correlated negatively with the average path length (p < 0.05). Individual genera’s topological properties showed more connector genera in thermophilic network, representing stronger inter-module communication. Individual genera’s normalized degree and betweenness revealed that lower-abundance genera (as low as 0.1%) could perform central hub roles and communication roles, maintaining the stability and functionality of the microbial community.


INTRODUCTION
Anaerobic digestion (AD) is a promising technology for organic waste/wastewater treatment to recover biogas as energy and reduce the risk to environmental and human health [1][2][3] . The successful AD system depends on robust microbial community, effective microbial symbiosis and balanced hydrolysis, acidogenesis, acetogenesis and methanogenesis, which involve different functional groups. To uncover the AD microbiome "blackbox" and to manage the AD systems, the first task has been deciphering the links between operational factors and microbial community assembly and functionality 4,5 . Previous studies have used the strategy of identifying the most abundant or core microorganisms and correlating them with operational factors and performance to illustrate the key factors and functional microorganisms 1,2,4,6,7 . This approach reflects the selection force on the dominant microorganisms (e.g., top 10, top 20, or >1%) and narrows down the target microorganisms for in-depth genomic and functional analysis. Whereas, the importance and functions of lowerabundance (e.g., 0.1%) members are not fully revealed, and the microbial interactions and ecological roles in the communities remain unclear. Due to the complexity nature of microbiome structure and function, unraveling the microbial interactions within a functional group and between different groups and revealing the microorganisms' ecological roles and importance in predicting system performances are challenging 8 .
Based on network theory, the co-occurrence of microorganisms can be modeled using network analysis to illustrate microbial relationships and responses to variations of operational factors and suggest clustering of sub-communities. Meanwhile, the network's topological properties, e.g., modularity and connectivity, may associate with operational and performance parameters and imply the general ecological interaction of the system [9][10][11][12][13][14] . The topological characteristics of individual microorganisms can be used to infer their ecological roles in the network. Previous studies used microbial communities from different AD systems to construct one global network and link the operational factors to demonstrate their impacts on the microbial eco-system 10,13 . Different feedstock substrates (e.g., waste activated sludge, food waste, manure) have been studied as one important factor affecting the microbial co-occurrence networks for full-scale AD systems 15 . However, operational factors for full-scale systems may vary within and between systems, and pooling all microbial communities fed with different substrates to perform one network analysis may not reveal the correlations between microorganisms properly. Rather, it is rational to build separate networks for each substrate type or each reactor, instead of one network 9,12,13,16 . Very few studies have investigated separated networks to compare different systems. In one study, four separate networks were built for AD reactors treating dairy manure or co-digestion of poultry waste at different organic loading rates and demonstrated that the network topological features and substrate availability were associated 9 . Another study compared four AD networks at different temperatures and found that the topological features were correlated with methane production rate but not temperature 12 .
Multi-network topological characteristic comparison was conducted in other wastewater treatment systems. For example, in activated sludge, the network topological features were mainly associated with substrate characteristics, i.e., the ratio of biological oxygen demand and chemical oxygen demand 17 . In activated carbon biofilter, the seasonal variation and ozonation treatment affected the network properties 18 . While the number of studies on multi-network comparison is limited, there is a need to explore the AD microbial network with appropriate operational parameter 1 constrains to better understand the links between operation, microbial community and system performance 9,18 . This study aims at investigating the impacts of operational parameters on AD performances and microbial community ecology using blackwater as a feedstock under various conditions. Temperature, reactor design, organic loading rate and substrate characteristics were control variables. In total, 52 microbial samples from 12 reactors were used to analyze microbial community diversity and composition and to construct cooccurrence networks. Separate networks were built for different groups based on the community similarity and the network topological properties were associated with reactor operational and performance variables.
The specific methanogenic activity (SMA) was measured for hydrogenotrophic (SMA_H 2 ) and acetoclastic methanogenesis (SMA_Acetate). All reactors showed higher SMA_H 2 than SMA_Acetate, suggesting that blackwater AD is dominated by hydrogenotrophic methanogenesis rather than acetoclastic methanogenesis, which agrees with the archaeal community composition that hydrogenotrophic methanogens dominated the blackwater AD (Fig. 2B). It can be due to the high content of nitrogen in blackwater and the resulted high total and free ammonia concentrations (93-1410 mg-N/L, free ammonia 4-259 mg-N/L), which leads to inhibition to acetoclastic methanogenesis 1 . The ambient temperature condition resulted in the lowest SMA and the smallest difference between SMA_H 2 and SMA_Acetate among all reactors due to the lowest temperature and microbial activities.
The effluent total COD concentrations were highest in the thermophilic and mesophilic co-digestion reactors (11.4 g/L and 8.4 g/L, respectively) since their influent COD concentrations were the highest. The thermophilic reactors showed the third highest effluent COD at 4.9 g/L, also due to the high influent COD. The soluble COD and total volatile fatty acids (VFAs) had similar trends with effluent COD. Effluent TAN and free ammonia concentrations were below the inhibitory level 19 .

Microbial community diversity and composition
To compare the microbial community beta-diversity, the Bray-Curtis distance was calculated and ordinated with the Principal Coordinate Analysis (PCoA) for bacterial and archaeal communities separately ( Fig. 1) because that archaeal community has been reported to be influenced differently with bacterial community 15,20 . The bacterial communities were clustered into three groups that associated with reactor temperature (ambient, mesophilic and thermophilic, Fig. 1A). This trend reveals that temperature is the key factor that determines the bacterial community composition. ANOSIM analysis showed that both temperature and substrate conditions had a significant effect (p < 0.001) on bacterial communities. The PERMANOVA results showed that temperature can explain 43% and substrate can explain 22% of the variance for bacterial community (Table S1). The archaeal communities showed different clustering patterns with the bacterial communities (Fig. 1B). The ambient and thermophilic reactors still formed their own clusters with a far distance, whereas the mesophilic communities divided into three sub-clusters. The first sub-cluster, consisting of mesophilic codigestion archaeal communities, was adjacent to the thermophilic archaeal communities. The second sub-cluster, mesophilic lowsolid-substrate communities, was closer to the ambient reactor communities. The third sub-cluster includes the regular-substrate and concentrated-substrate reactor samples. Although temperature was an important factor affecting the archaeal community, there were other factors impacting the archaeal community, especially under the mesophilic condition. Temperature and substrate conditions showed significant effects (p < 0.001) on archaeal communities (using ANOSIM and PERMANOVA analyses) and can explain 44% and 35% of the archaeal community variance respectively (Table S1).
The reactor operational parameters and feeding substrate characteristics (independent of temperature) were used to conduct Canonical Correspondence Analysis (CCA) to explain the variation in communities ( Supplementary Fig. S1). It showed that temperature, influent substrate concentration (total COD, soluble COD, pCOD/tCOD), organic loading rate, and hydrolysis rate constant (K H ) had significant correlations (p < 0.05, Supplementary Table S2) with bacterial community variation but not D rate (relative difference between K H and total SMA). The hydrolysis potential of the substrate (K H ) is determined by both the substrate type and reactor temperature, showing significant correlation with the bacterial community variance (p < 0.05, Supplementary Table S2). The D rate showed a significant association with archaeal community variance (p < 0.05, Supplementary Table S2), indicating that archaeal community may be affected more by the unbalanced substrate hydrolysis potential and methanogenic potential.
The alpha diversity indexes (Shannon diversity, Number of genera and Evenness) showed general decreasing trends with deceasing temperature (p < 0.05, Supplementary Table S3). Different substrate conditions also showed significant differences in all alpha diversity indexes (p < 0.05, Supplementary Table S3). Thermophilic co-digestion showed the lowest Shannon diversity and Evenness in both bacterial communities (1.42 ± 0.35, Fig. 1C) and archaeal communities (0.46 ± 0.06, Fig. 1D), and the lowest number of archaeal genera (3.25 ± 0.50). Less diverse community predominated the thermophilic co-digestion reactor with lower alpha-diversity indexes, indicating stronger selective effect and microbial competition.
Under the mesophilic conditions, the bacterial community of the low-solid-substrate reactor showed the highest Shannon diversity (3.98 ± 0.38) and number of genera (257 ± 37) (Fig. 1C). The mesophilic regular-substrate reactor showed the highest     evenness (0.80 ± 0.04). For the archaeal community, the mesophilic low-solid-substrate reactor showed the highest Shannon diversity, number of genera and evenness (1.85 ± 0.33, 14 ± 1, 0.69 ± 0.12, respectively) ( Fig. 1D). The differences within the mesophilic condition indicate that other factors besides temperature acted as driving forces for community diversity. The ambient reactor showed similar alpha-diversity indexes with the mesophilic low-solid-substrate reactors except for a lower number of archaeal genera (10 ± 2). The top 10 most abundant bacterial genera from each sample were compared, as shown in Fig. 2A. A clear clustering pattern associated with temperature was observed at the genus level and also at the phylum level ( Supplementary Fig. S2). At ambient temperature, the predominant bacteria included unidentified genera from the order Bacteroidales (5-36%), the family SB-1 (phylum Bacteroidetes) (1-25%), the family Helicobacteraceae (phylum Proteobacteria) (1-27%), the family Christensenellaceae (phylum Firmicutes) (1-4%), and Desulfomicrobium (phylum Proteobacteria) (1-17%). These bacteria were observed in the mesophilic reactors at lower relative abundances, while they disappeared or had very low abundances in the thermophilic reactors. At thermophilic conditions, the genus S1 (phylum Thermotogae) showed significant abundances (20-80%) in both blackwater and co-digestion reactors.
Comparing the clustering patterns of bacterial and archaeal communities by PCoA (Fig. 1A, B) and by the dominant genera (Fig. 2), it is suggested that five community groups can be classified: ambient, mesophilic low-solid-substrate, mesophilic (concentrated-substrate and regular-substrate), mesophilic codigestion, and thermophilic. These five groups were used for cooccurrence network analysis in the following sections.

Co-occurrence network analysis
The clustering pattern of the five groups led to construction of five co-occurrence networks using the significant correlations among genera (Spearman's correlation coefficient r > 0.6, p < 0.01). As shown in Fig. 3 and Table 2, the network characteristics varied for each group. The clustering coefficient represents the complexity of the network and strong interactions among microorganisms. The mesophilic co-digestion network showed the highest clustering coefficient (0.85), followed by the thermophilic network (0.70), indicating that microbial interactions are strongest in these two groups. A previous study showed that higher clustering coefficient corresponds to more dynamic and active community 21 . The average path length values were the lowest in these two groups (3.04 and 2.63 for mesophilic co-digestion and thermophilic condition, respectively), indicating compact network property and strong microbial interactions. These network properties can be associated to the high-hydrolysis efficiency (51.8-80.5%) and methanogenesis efficiency (51.6-77.1%) ( Table 1), which were remarkably higher than at other conditions. The mesophilic low-solid-substrate network presented the lowest clustering coefficient (0.56) and highest average path length (7.52). Meanwhile, it showed the highest modularity (0.81) and lowest average normalized degree (0.02), indicating less interdependence among the microorganisms and more divergent functional groups. This observation correlates to the substrate property with lower solids content and reduced requirement for hydrolysis (16%), leading to weaker dependence on hydrolytic bacteria by the acidogens, acetogens and methanogens compared to the other networks.

Linking network characteristics and reactor parameters
Comparison of the co-occurrence network characteristics inferred relationships with reactor hydrolysis and methanogenesis performances and activities. Therefore, the variables related to hydrolysis and methanogenesis were tested for correlation with network characteristics among all variables (Fig. 4).
The clustering coefficient was significantly (p < 0.05) positively correlated with hydrolysis efficiency, while the average normalized betweenness was significantly (p < 0.05) negatively correlated with hydrolysis efficiency, indicating that higher hydrolysis may depend on stronger microbial interactions. The average path length was significantly (p < 0.05) negatively correlated with two variables, the pCOD percentage, and the D efficiency . When the pCOD percentage is high, it is required for higher hydrolytic activity and fast consumption of the hydrolyzed products, which may lead to the short network path length. The relative differential efficiency between hydrolysis and methanogenesis (D efficiency ) may indicate the dependence on transportation or syntrophic production and consumption of intermediates between different functional microorganisms. Higher D efficiency resulted in more complex network (higher clustering coefficient) and shorter average path length. Modularity showed similar patterns with average path length, but not significant for any correlation. The Table 2. Key characteristics of co-occurrence networks of

Keystone OTUs
Besides the general co-occurrence network characteristics, the topological features of the individual OTUs are informative to indicate their ecological roles. As shown in Fig. 5, OTUs were classified into four groups based on the within-module connectivity (Z i ) and among-module connectivity (P i ) 22 . The peripherals (low Z i , low P i ) are OTUs that have a few links to other OTUs within their modules, which included the majority of OTUs in all groups. The module hubs (high Z i , low P i ) present OTUs with high number of links in their own modules. Only the mesophilic lowsolid-substrate network had one module hub, Roseburia (order Clostridiales).
The connectors (low Z i , high P i ) describe OTUs with links to several modules, inferring important roles for inter-module communication. The mesophilic and thermophilic networks showed a few connectors. In the mesophilic network, Streptococcus (order Lactobacillales) and an unclassified genus in the family Methanospirillaceae were classified as connectors. In the thermophilic network, five genera were classified as connectors, Ruminofilibacter (order Bacteroidales), Paludibacter (order Bacteroidales), Fibrobacter (order Fibrobacterales), Lachnospira (order Clostridiales), and an unclassified genus in the family Anaerobaculaceae (order Synergistales).
The normalized betweenness and degree for individual OTUs are shown in Fig. 6. The degree of an individual OTU indicates its level of interaction with other OTUs. High degree suggests a central hub role in the network. The high betweenness reveals the role for connecting other OTUs, referred as gatekeepers, suggesting the importance of maintaining communication, integrity and function of microbial communities 23 . Generally, the distribution of normalized betweenness and normalized degree showed positive relationships in all networks 21 . The mesophilic low-solid-substrate network showed lower normalized degree distribution than other networks, which is also reflected in the lowest average normalized degree (Fig.  3). The thermophilic network showed a trend of high normalized degree distribution but lower normalized betweenness than other networks, suggesting high interaction levels but lower dependence on connector OTUs, which can be reflected by the shortest average path length (Fig. 3). The relative abundances of OTUs are represented by the symbol size and showed no correlation with the distribution of normalized betweenness or degree.
The 10 highest normalized betweenness and normalized degree OTUs in each network are shown in Table 3. It can be seen that among different networks, the OTUs functioning as central hubs (highest normalized degree) were different, so were the connecting OTUs (highest normalized betweenness). Only a few OTUs appeared concurrently in different networks. The genus T78 (phylum Chloroflexi) had high normalized degree in both ambient and mesophilic co-digestion networks. An unidentified genus in the order GCA004 (phylum Chloroflexi) was shown in both mesophilic and mesophilic co-digestion networks. Fibrobacter (phylum Fibrobacteres) was in mesophilic, mesophilic codigestion and thermophilic networks as high normalized betweenness genus. Desulfobulbus (phylum Proteobacteria) was common in mesophilic low-solid-substrate and mesophilic networks to have high normalized betweenness.
Some methanogens exhibited important roles in the networks (Table 3), i.e., f_Methanoregulaceae, and Methanobacterium showed high normalized degree while Methanosaeta and f_Methanomassiliicoccaceae showed high normalized betweenness. Methanosaeta is an obligate acetoclastic methanogen while the others are hydrogenotrophic. They are syntrophic partners of many acetogens and acidogenic (H 2 -producing) bacteria 5 , exhibiting many links in the network.
Within each network, the central hubs and the connectors only shared 1-2 same OTUs. Fusibacter in the ambient network, Macellibacteroides in the mesophilic low-solid-substrate network, Acidovorax and Anaerovorax in the mesophilic network, Petrimonas in the meso co-digestion network, and Sedimentibacter in the thermophilic network had dual network functions.

DISCUSSION
Operational conditions have been demonstrated to affect microbial communities in AD systems but whether the dynamics of bacterial and archaeal communities synchronize or differ is not clear. Previous full-scale AD survey studies have included multiple factors and demonstrated that substrate type (e.g., sewage sludge, manure, agricultural wastes) and temperature drove microbial community diversity, and demonstrated covariation of bacteria and archaeal communities 15,20,24 . Given the complexity of the substrate composition and large differences between substrates, pooling different types of substrates in one analysis may underestimate other factors, and conceal the variation between bacterial and archaeal responses to operational factors. Our analysis focused mainly on blackwater, while covered a wide range of factors for inter-reactor comparison. Even though it has been recognized that bacterial and archaeal communities are linked through syntrophic partnerships, we observed that their composition and diversity rely on different factors based on the Variation Partition Analysis (VPA, Supplementary Fig. S3).
Temperature is the most significant factor affecting the bacterial community based on VPA (27% of total variance, Supplementary  Fig. S3), which has been reported in other studies 15,20 . Other factors such as substrate type 15,20 , organic loading rate, solids content 5,11 , and free ammonia 1,20 can affect the bacterial community. While it may be simple and reliable to focus on one operational factor at a time, the practical application of AD often faces multiple influencing factors. For single-factor analysis, the bacterial and archaeal communities may exhibit synchronized changes along the axis of the factor, and the enriched bacteria and archaea can indicate their selection advantages and covariant dynamics. Whereas under multiple factor variations, the selection mechanism and bacteria-archaea interaction are more complex. Our study evaluated multiple factors and underlined the remarkable influence of temperature compared to all other factors. It was interesting that with the same type of substrate feed (i.e., blackwater), other factors other than temperature exhibited higher influence on archaeal communities compared to on bacterial communities as shown in Figs. 1 and 2, and VPA analysis ( Supplementary Fig. S3). Influent total COD contributed to slightly higher level of variance than temperature (27.2% and 26.6% respectively in VPA). Especially, the D rate could explain 21% of archaeal community variance and only 10% of bacterial community variance. This phenomenon has not been emphasized or investigated although some hints have been observed in a fullscale study with complex substrates 15 . Another full-scale AD study on primary and secondary sludge showed contrary conclusion 25 , where the archaeal community was more affected by temperature than the bacterial community; however, it should be noticed that the substrate feed was mainly mixed primary sludge and waste activated sludge which can vary considerably among different reactors. Our results showed that different temperatures (thermophilic and mesophilic co-digestion) may enrich the same methanogen (Methanosarcina), which probably relates to the easy-to-hydrolyze conditions for the substrate. Using co-occurrence network analysis on groups divided based on the community clustering pattern, different topological characteristics can be observed which may imply links to bioreactor performances and microbial ecological features. Especially, some characteristics of the separated networks (i.e., clustering coefficient, average path length, normalized betweenness) were correlated with rector performances (Fig. 4) and have the potential to link and predict system function or efficiency. In our study, higher clustering coefficients and shorter average path lengths were features of the high-hydrolysis-rate and highmethanogenesis-rate groups, i.e., thermophilic and mesophilic co-digestion (Table 2). Although the number of separated network studies is limited, this approach is promising to show that the topological characteristics of co-occurrence networks are indicative of the ecological features of microbiomes inferred by our results and other microbiome network studies. In comparison to soil microbiome networks, the clustering coefficient and connectedness showed increases in early recovery and decreases in late recovery after disturbance 21 , indicating that weaker interaction (smaller clustering coefficient) predicts higher community stability 26 . In a co-occurrence network study of steady-state activated sludge 17 , small clustering coefficient and short average geodesic distance (average path length) were linked with the optimal substrate biodegradability condition and the BOD removal load, indicating generally higher microbial activity. It should be noted that AD systems contain more microbial interactions for cooperation and syntrophic partnership compared with aerobic systems; and the average clustering coefficient (0.59-0.85) in our study was much higher than the activated sludge networks (0.059-0.402) 17 and soil bacterial networks (0.08-0.24) 21 . Comparison of the AD networks in our study demonstrated that stronger interaction (higher clustering coefficient and shorter average path) is linked to higher rates and activities of microbial communities under steady-state conditions. It should be noted that the implication to stability is not included in the present study, which can be tested using non-steady-state bioreactors in future studies.
The high modularity indicates the formation of sub-communities, representing high niche differentiation in other ecosystems 17 . In activated sludge networks, the modularity was indicated to reflect parallel functional groups formed under desirable (non-stressful, less limitation) and less-disturbance conditions, while interaction and dependency between microbes are less required compared to highdisturbance conditions 17 . Cooperative niches are essential for AD system performance rather than parallel niches. In reported AD  Fig. 6 Normalized betweenness and normalized degree of individual genera. The size of symbols represents the relative abundances of genera. Five groups: Ambient, Mesophilic Low-solid-substrate, Mesophilic (regular-substrate and concentrated-substrate), Mesophilic Codigestion, Thermophilic (sole-blackwater and co-digestion). systems, the modularity was positively correlated with methane production 12 , while the clustering coefficient was not reported, thus not able for comparison with our study. Another AD study demonstrated multiple networks for sewage sludge and food waste co-digestion without comparing the network topological characteristics 16 . Nevertheless, it showed network differences from non-stable to stable status, inferring that the network characteristics may indicate system stability. Even though the clustering coefficient and the modularity showed different meanings for different types of microbiomes (soil, activated sludge and AD), the average path length seems to have a global meaning. Shorter path length is correlated with higher efficiencies of BOD and nutrient removal in activated sludge 17 , and negatively with particulate COD percentage and relative differential efficiency between hydrolysis and methanogenesis (D efficiency ) in AD in our study. The ecological implication of shorter path length can link to better microbial cooperation, more communication and intermediate transport, which are crucial for AD microbiome.
The topological characteristics of individual nodes (OTUs) are indicative of the ecological roles of the OTUs. For example, Roseburia, the module hub in mesophilic low-solid-substrate network, is a butyrate-producing anaerobic bacterial genus 27 and may perform acidogenesis that connects hydrolysis and downstream acetogenesis and methanogenesis. The approach of identifying keystone OTUs from network analysis is therefore useful to bring insights to understand the biochemical functions and microbial interactions. Identifying keystone OTUs using Z i -P i (within-module connectivity and among-module connectivity) plot has been used in many studies 17,22,28 , although it is also recognized that the majority of OTUs were peripherals 17 and lacked ecological implication. The betweenness and degree of nodes are informative features to indicate OTUs' ecological roles as communication connectors and central hubs 21 . Noticeably, these important roles are not correlated with the relative abundance of the OTUs (Fig. 6). The high-abundance OTUs are functionally important for reactors' performances but not always perform central roles in the network 17,22,28 .
Lower-abundance OTUs (as low as 0.1%) can be found to have high normalized betweenness and degree, indicating their important roles in maintain the system's ecological stability and functionality. These OTUs differed in the five AD networks (Table 3), underlining the ecological importance of OTUs with low abundance and high betweenness and degree in different systems and potential to use them as ecological indicators. T78 (ambient and mesophilic co-digestion networks) was reported in anaerobic digesters and suggested to produce acetic acid (acetogenesis) from long-chain fatty acids 29 , which explains its central hub role linking with many bacteria and archaea since acetic acid can be utilized widely. Fibrobacter (mesophilic, mesophilic co-digestion and thermophilic networks) is a common cellulose-degrading genus in AD 30 , which is likely to perform cellulose hydrolyzation from the feedstock and supply metabolites to acidogenetic and acetogenetic bacteria with high efficiency. Desulfobulbus (mesophilic low-solid-substrate and mesophilic networks) can oxidize propionate with sulfate to produce acetate 31 , linking the propionate-producing bacteria and acetate-consuming bacteria in mild conditions. Among the operational factors, influent substrate characteristics influenced archaeal communities in addition to temperature which affected both bacterial and archaeal communities. Moreover, the network topological properties were correlated with influent substrate characteristics. Higher hydrolysis efficiency of the substrate associated with higher clustering coefficient and lower normalized betweenness of the network, indicating higher complexity and microbial interactions. One of the network's topological characteristics, average path length, can be an indicator of system performance, linking the network model to engineering operations. Individual OTUs' topological characteristics showed that (i) high-rate thermophilic reactor group had more connector OTUs (e.g., Ruminofilibacter, Paludibacter), inferring stronger inter-module communication, and (ii) many lowerabundance OTUs (as low as 0.1%) may be important members to maintain the stability and functionality of the microbial community. This study differentiates bacterial and archaeal communities' driving factors, reveals the links between system performance and network features, and raises our attention on ecological importance of lowerabundance OTUs (as low as 0.1%) besides the top ranked OTUs. Future work is required to expand the sampling size of full-scale systems if to apply this approach to engineering systems, e.g., realtime data acquisition and analysis could be implemented to predict, monitor and troubleshoot the system performance. Low-abundance taxa's ecological roles should be investigated using integrated approaches on top of network topology.

METHODS Reactors
Anaerobic digestion of blackwater was conducted under temperature ranging from 22 to 55°C, including ambient (22°C, 5 reactors), mesophilic (35°C, 5 reactors) and thermophilic (52-55°C, 2 reactors) conditions. Details of the feedstock preparation are provided in Supplementary information. Table 4 shows the reactor processes and operational parameters of 12 reactors all fed with blackwater at different concentrations (influent total chemical oxygen demand [COD]
The specific methanogenic activity (SMA) of the reactor sludge at the end of each operation stage was measured in batch bottles fed with H 2 &CO 2 (80%/20%) or sodium acetate, to evaluate the maximum activity according to the methods described previously 2,4 . The observed specific methanogenic activity was calculated based on the real CH 4 production in the reactor.

Derived parameter calculations
Free ammonia (FA) was calculated using TAN, pH and temperature (unit K).
Hydrolysis efficiency, methanogenesis efficiency and hydrolysis rate constant (K H ) were calculated using the following equations: Methanogenesis efficiency % ð Þ ¼ COD CH4 tCOD inf À tCOD eff (3) To describe the difference between hydrolysis and methanogenesis, two variables were derived. The relative differential efficiency (D efficiency ) calculates the relative difference between hydrolysis and methanogenesis efficiency. The relative differential rate (D rate ) calculates the relative difference between hydrolysis rate constant (which reflects the easiness of the substrate to be hydrolyzed) and total SMA. D efficiency ¼ Hydrolysis efficiency % ð Þ À Methnogenesis efficiency ð%Þ Hydrolysis efficiency % ð Þ (5) DNA extraction and 16S rRNA gene sequencing Genomic DNA was extracted from each sludge sample using DNeasy PowerSoil Kit (QIAGEN, Canada) following the manufacturer's protocol. The quality and concentration of the extracted DNA were examined using gel electrophoresis and NanoDrop One (ThermoFisher, USA). PCR was performed to amplify the 16S rRNA genes using the universal primer-

Data analysis
The demultiplexed paired raw sequences were quality-filtered and aligned using the Qiime2 pipelines 32 "DADA2" algorithm 33 with 99 % similarity in reference to the Greengenes (13_8). Microbial community Alpha and Beta diversities, Principal Coordinates Analysis (PCoA), Canonical Correspondence Analysis (CCA), environmental factor fitting to ordination (envfit), Variation Partition Analysis (VPA), ANOSIM and PERMANOVA were analyzed using the "vegan" package 34 in RStudio version 3.4.1 35 . The PCoA was performed using Bray-Curtis distance. The top 10 highest relative abundance bacterial genera and all archaeal genera were plotted to heatmap using "heatmap.2" function in "gplots".

Co-occurrence network analysis
Co-occurrence network analysis was conducted using R "igraph" 36 and "psych" packages 37 . Grouping of the samples into five groups resulted in different number of samples (Table S5) and the highest sample number was 20 in the Ambient group which was reduced to 8 selecting those at AD steady state at the Ambient condition for network analysis to reduce bias of large sample number. Only OTUs with relative abundance higher than 0.1% and occurred in more than two samples were included for analysis. The Spearman's correlations at r > 0.6 and p < 0.01 were used for network construction. Although the sample number varied, it did not show significant correlation with network properties except for the size of network (Fig. S4). The size of network was not significantly correlated with other properties of the networks or reactor parameters (Figs. 4 and S4). We used the same cut-off value of correlation coefficient for all networks (r > 0.6) instead of varied cut-off values 38 to be consistent among networks and to be comparable with literature 21 . The power-law fit of degree showed significance level of p < 0.05 for all networks except for the thermophilic network (Supplementary Table S4). Random Networks showed much lower clustering coefficients than the constructed networks (Table 2), including the thermophilic network (0.13 vs. 0.70). The network properties, clustering coefficient, modularity, average path length, average normalized degree, and positive ratio, were analyzed using the "igraph" package. The key characteristics are shown in Table 2 and additional information in Table S5. Although the node numbers (Order , Table S5) varied, the correlation matrix did not show any significant correlations between the network properties and the node number (Fig. S4).
The reactor variables include influent substrate characteristics (tCOD and sCOD concentrations), the ratio of particulate COD in total influent COD (pCOD %, which reflects the demand for substrate hydrolysis), the hydrolysis potential of the substrate (hydrolysis rate constant, K H ), SMA, observed SMA (methanogenesis activity), hydrolysis efficiency, methanogenesis efficiency, and the derived variables relative differential efficiency (D efficiency ) and relative differential rate (D rate ). Two connectivity types were calculated and compared, within-module connectivity (Z i ), and among-module connectivity (P i ) 22 . The P i Table 4. Reactor processes and operational parameters.