Parallel-META 3: Comprehensive taxonomical and functional analysis platform for efficient comparison of microbial communities

The number of metagenomes is increasing rapidly. However, current methods for metagenomic analysis are limited by their capability for in-depth data mining among a large number of microbiome each of which carries a complex community structure. Moreover, the complexity of configuring and operating computational pipeline also hinders efficient data processing for the end users. In this work we introduce Parallel-META 3, a comprehensive and fully automatic computational toolkit for rapid data mining among metagenomic datasets, with advanced features including 16S rRNA extraction for shotgun sequences, 16S rRNA copy number calibration, 16S rRNA based functional prediction, diversity statistics, bio-marker selection, interaction network construction, vector-graph-based visualization and parallel computing. Application of Parallel-META 3 on 5,337 samples with 1,117,555,208 sequences from diverse studies and platforms showed it could produce similar results as QIIME and PICRUSt with much faster speed and lower memory usage, which demonstrates its ability to unravel the taxonomical and functional dynamics patterns across large datasets and elucidate ecological links between microbiome and the environment. Parallel-META 3 is implemented in C/C++ and R, and integrated into an executive package for rapid installation and easy access under Linux and Mac OS X. Both binary and source code packages are available at http://bioinfo.single-cell.cn/parallel-meta.html.

pipeline for easy installation and friendly user experience. Tests on 5,337 samples with 1,117,555,208 16S amplicon sequences showed that Parallel-META 3 is able to uncover dynamics of microbiomic features and patterns in both taxonomical and functional aspects with higher efficiency, optimized memory usage and uncompromised precision from large-scale metagenome datasets.

Results
To evaluate the performance of Parallel-META 3, we prepared four test datasets collected from different studies and produced by different platforms (Table 1). Dataset 1 and 2 contained 4,799 16S rRNA amplicon samples in total from healthy adults at five body sites (nasal, gastrointestinal, oral, skin and urogenital) produced by Human Microbiome Project 10 . Dataset 3 contained 528 human gut 16S rRNA amplicon samples from healthy adults, children less than 3 years old (C-3) and children of 3-17 years old (C-17) from three countries (Malawi, USA and Venezuela) produced by Yatsunenko, et al. 11 These three datasets were employed as representatives of 16S rRNA amplicon based microbiome data.
On the other hand, in order to test the performance on shotgun metagenomics data, Datasets 4 was included, which contained 10 artificial metagenomic shotgun samples that were simulated with nine organisms using Dwgsim 12 (version 0.1.8). These 10 samples were divided into two groups based on proportion of organisms to test the sensitivity of Parallel-META 3 in distinguishing samples with different community patterns (detailed design of simulation in Table S1 of Supplementary file S1).
Results of 16S rRNA amplicon samples from the first three datasets were generated via Parallel-META 3 (version 3.3.2), then they were compared to the taxonomical structure reported by QIIME 7 (version 1.9.0) and functional profiles predicted by PICRUSt 13 (version 1.0) that worked the benchmark. For the artificial metagenomic All analysis steps were implemented in C/C+ + and/or R with optimized parallel computing, and well configured in to a fully automatic pipeline package.    Figure S1 in Supplementary file S1. Higher efficiency of Parallel-META 3 in computing time and memory usage than the benchmark. We illustrated the efficiency advantages of Parallel-META 3 by comparing the running time and memory usage of all analysis processes with the three 16S rRNA amplicon datasets to those of the benchmark by QIIME and PICRUSt. All tool packages were invocated in parallel computing model using 1, 16, 32 and 64 threads, respectively. Benefited by the whole-process parallel scheduling, multi-thread memory sharing and C+ + programming, Parallel-META 3 achieved a ~5× faster speed (Fig. 2) than the benchmark scores, yet with remarkably lower RAM usage on identical hardware configurations, such as for Dataset 1 the benchmark software used over 48 GB but Parallel-META 3 only took 3.3 GB. In addition, the speedup ratio was independent of either the source or the sequence type of the input samples. This acceleration demonstrated the capability of Parallel-META 3 on rapid evaluation of input samples, which is essential for in-depth data mining with massive amount of samples and complex community structures.
Identification of microbiota structure pattern from 16S rRNA amplicon samples. Community structure is the basis for in-depth interpretation of microbiota. We verified the overall taxonomical patterns and predicted metabolic profiles of three 16S rRNA amplicon datasets generated by Parallel-META 3 with the benchmark from both α diversity and β diversity aspects. The α diversity was calculated by Shannon index and    Simpson index at the genus level and the KEGG pathway level; the β diversity was examined based on pair-wised Meta-Storms 15 distance of OTUs and KOs. Since the 16S rRNA copy number calibration (refer to Methods section for details) was not supported in QIIME, we disabled this option in taxonomy analysis of Parallel-META 3 for consensus purpose. No significant difference was observed between the community structures reported by Parallel-META 3 and the benchmark on α diversity (Wilcoxon rank-sum test; p-value > 0.01; Fig. 3). Moreover, the two methods generated similar β diversity patterns in PCoA as verified by Monte-Carlo test (10,000 permutations, p-value < 0.001; Fig. 4). Furthermore, the reliability of metabolic prediction was examined by NSTI (Nearest Sequenced Taxa Index) values of PICRUSt 13 . The strong Pearson correlation (R > 0.9) also confirmed the uncompromised precision in functional profiling for Parallel-META 3 (Fig. 5).
Interpreting the diversity of 16S rRNA amplicon samples. Since change in microbial diversity in microbiota samples has been linked to several human diseases, we assessed the capability of Parallel-META 3 in α diversity and β diversity statistics via both taxonomical structure and predicted functional profiles. For human-associated microbiota samples from Dataset 1 and Dataset 2, Parallel-META 3 differentiated samples from each body site via β diversity by their taxonomical and functional profiles ( Fig. 6A and B). Moreover, the distinct degree of microbiomic complexity in each body site was revealed via α diversity ( Fig. 6C and D, Wilcoxon rank-sum test p-value < 0.01): oral communities feature especially diverse taxonomical membership, vaginal sites harbor particularly simple communities, while gut samples had the most complex metabolic functions. This significance was also confirmed by previous works of The Human Microbiome Project Consortium, Nature 16 . Furthermore, biomarker analysis at the genus level suggested that each body site could be characterized and distinguished by the relative abundance of just a few taxa (Fig. 7): for example, skin communities were specifically dominated by Propionibacterium and Staphylococcus, while oral samples contained abundant Streptococcus and Prevotella. In addition, it was apparent that whether 16S rRNA reads were from V1-V3 and V3-V5 regions did not affect the diversity analysis results (Fig. 6E and F).
Using the human gut microbiome samples of Dataset 3, Parallel-META 3 generated results that described the global patterns of age-associated changes and partitions among populations and countries. In this case we observed that gut samples from children under three-year old had a much lower α diversity, and interpersonal variation as evaluated by β diversity on both taxonomical and functional levels was significantly greater among children under three-year old than among 3-7-year old children or adults; such a pattern was robust to change in geography (Fig. 8). On the other hand, functional analysis uncovered the specific pathways that discriminated the microbiomes of different ages (Fig. 9): methane, arginine, proline, energy and glutamine metabolisms were higher in the adult microbiomes, while nitrogen, carbohydrate, tyrosine and glycerolipid metabolisms were enriched in baby microbiomes. In addition, notable differences in functional profile of microbiota were found among individuals that lived in different countries, with especially pronounced separation observed between the Malawian & Venezuelans and the US gut communities across all age stages (Fig. 10).

Identification of microbiota structure pattern by simulated metagenomic shotgun sequences.
Using Dataset 4, the ability of Parallel-META 3 to process shotgun metagenome data was examined by comparing the community structures reported by our software to the simulation design with 10 samples. Parallel-META 3 was used to profile shotgun sequences at the genus level, and the relative abundance of all organisms then estimated with 16S rRNA copy number calibration. Parallel-META 3 finished all processes on the 10 simulated shotgun samples in 15 minutes (892.06 s) with maximum RAM usage of 1.6 GB. The microbiota patterns were presented in bar charts ( Fig. 11A and B) and then compared to the simulation design in PCA (Fig. 11C) on β diversity by Monte-Carlo test with 10,000 permutations (p-value < 0.001). Based on the calculated organism proportions, the 10 samples were separated into the two groups that were consistent with the simulation design, thus confirming the reliability of Parallel-META 3 in handling metagenomic shotgun sequences. In addition, had the analysis been performed without 16S rRNA copy number calibration in PCA (Fig. 11C), a noticeable deviation of the results from the true answer in the simulation design was observed (p-value = 0.003). This indicated that the new feature of 16S rRNA copy number calibration in Parallel-META 3 could improve the accuracy of estimating relative abundances of organisms in microbiota.

Discussion
With the rapid expansion of microbiota research, it is becoming increasingly important to perform in-depth data mining of large microbiome datasets or compare datasets produced from different studies. However, the challenge associated with big-data computing has limited the data size-range and samples number that can be effectively analysed. On one hand, such analysis usually cannot be finished in only single round, instead requires several iterations for parameter adjustment, meta-data (sampling information) arrangement, samples insertion or removal, or result optimization; computing-intensiveness of such processes frequently causes an unacceptable amount of wait time, especially when the number of samples reaches thousands. On the other hand, hundreds of Giga-bytes of RAM consumption as well as larger amount of computing resources are usually required for such analyses. The Moreover, the complexity of hierarchical structures and dependency packages associated with current metagenome analysis pipelines can become a practical barrier for installations and operations. Targeting this issue, we have enhanced the users experience of Parallel-META 3 in both installation and access by: (i) encapsulating software components into a single package to allow installation with little user configurations, (ii) organizing all steps for automatic execution by a single command line with user-defined parameters, and (iii) simplifying the visualization for users via high-quality vector graphs to allow nimble and flexible interrogation of the data.
In summary, Parallel-META 3 enables high-throughput and in-depth data mining from a large number of microbial community samples with high efficiency and out-standing experience. Further development of this and related tools that tackle the challenges associated with comparison, searching and mining of large-scale and heterogeneous microbiome datasets should help to fulfil the promise of microbiome research in various scientific disciplines and application areas.  Meta-Storms distances between the Malawi samples and the Venezuela samples were smaller than that between other pairs. Functional profiles exhibited a similar pattern (refer to Figure S4 in Supplementary file S1 for details).

Methods
Scientific RepoRts | 7:40371 | DOI: 10.1038/srep40371 metagenomic shotgun sequences for profiling. All 16S rRNA gene sequences (either extracted from shotgun sequences or 16S rRNA amplicon reads) are aligned to the Parallel-META 3 reference database by Bowtie2 19 for OTU picking, taxonomical annotation and phylogeny construction. The reference 16S rRNA sequences are from a customized database that integrates GreenGenes 20 (version 13-8, sequence similarity on 97% level) with RDP and SILVA consensus taxonomy annotation (assigned by BLASTN with e-value < 1e-30 and similarity > 97%), which raised the proportion of annotated sequences at the genus level from 35.8% to 81.5% (refer to Figure S5 in Supplementary file S1 for details). The phylogenetic architecture of all reference sequences is built by FastTree 21 . Since 16S rRNA gene copy number varies greatly among different bacterial species, Parallel-META 3 also calculates the precise relative abundance of each organism by 16S rRNA copy number calibration using IMG database 22 . In addition, considering that the uneven sequencing depth (number of sequences) among multiple samples may introduce bias in detecting diversity patterns 23 , an optional sequence rarefaction for sequencing depth normalization at the OTU level is provided after the taxonomic profiling.
For prediction and annotation of functional profile, Parallel-META 3 re-implements the PICRUSt 13 algorithm using KEGG database 14 to estimate all the functional genes harboured in a microbiota using 16S rRNA gene OTUs. The functional genes are annotated by KO (KEGG Ontology) and KEGG pathway. Parallel-META 3 also measures the prediction accuracy by the NSTI (Nearest Sequenced Taxon Index) value 13 , which is calculated by the sum of distances between OTUs and their nearest individually sequenced relatives in the phylogenetic architecture.
After taxonomical and functional profiling, Parallel-META 3 parses out the sequence counts and relative abundances (normalized into 0-100%) for all OTUs, and estimates the same information for annotated taxa from the phylum level to the genus level, as well as the genes and the pathways. Such data is framed into tables that are compatible for further analysis in Parallel-META 3 and also suitable for manual examination by users.
The α diversity evaluation and statistics. Parallel-META 3 evaluates α diversity that describes the inner complexity of each individual microbiota sample. This process generates rarefaction curves of α diversity based on observed OTU number and Shannon index to determine the adequacy of the sequencing depth. The rarefaction performs a series of random sequence selection on different sequencing depth with bootstrap (default is 20, and can also be defined by users), and the α diversity in the curves is calculated by the mean sequence count of each OTU among the bootstrapping procedures. Then the influence of each environmental factor on α diversity is quantitatively evaluated by multivariate statistical analysis with Shannon index, Simpson index and CHAO1 index.
The β diversity evaluation and statistics. Parallel-META 3 examines β diversity of multiple microbiota samples based on their pair-wise similarity matrix to discover the patterns of organism/gene sharing and variation among samples. The quantitative similarity between each sample pair is computed by Meta-Storms 15 algorithm, which considers both the relative abundance of OTUs existent in two samples and the distances among OTUs in the phylogenetic architecture. The β diversity evaluation includes unsupervised hierarchical clustering, supervises clustering using PCA (Principle Component Analysis), PCoA (Principle Co-ordinate Analysis) and multivariate statistical analysis that quantitatively evaluates the correlation between environmental factors and the sample similarities. Biomarker discovery. Parallel-META 3 can also identify key organisms or functional genes that are highly correlated with the variations of the habitats or other types of metadata. Organisms or genes with significant differences among microbial community samples were firstly chosen using Kruskal-Wallis or Wilcoxon rank-sum test as candidate makers, and these candidate makers are then ranked based on their contribution to the differentiation among samples using the Random Forest algorithm.

Construction of microbial interaction network.
The microbial interaction network is constructed to explore co-occurrence and co-exclusion patterns of organisms or functional genes across microbial community samples. In the interaction network, each node represents a single organism (or gene), and nodes are connected by links that represent their correlation coefficient of abundance variation among multiple samples 24 . Then Parallel-META 3 illustrates the global pattern among multiple samples by the network's topological characters such as nodes number, isolated island number, density, diameter, radius and centralization (an example is provided in Figure S6 in supplementary file S1).
Parallel computing and optimization. Parallel-META 3 was designed and constructed by parallel computing based on C/C+ + OpenMP library. All process steps are allocated to independent threads and in parallel invocated in multiple CPUs or CPU cores, with a dynamic threads scheduling strategy for optimized load balancing, and shared memory spaces for lower RAM usage. The thread number is automatically assigned based on hardware detection, and can also be defined or customized by the users.