The Inflammatory Bowel Disease Transcriptome and Metatranscriptome Meta-Analysis (IBD TaMMA) framework

Massimino, Luca; Lamparelli, Luigi Antonio; Houshyar, Yashar; D’Alessio, Silvia; Peyrin-Biroulet, Laurent; Vetrano, Stefania; Danese, Silvio; Ungaro, Federica

doi:10.1038/s43588-021-00114-y

Download PDF

Brief Communication
Open access
Published: 20 August 2021

The Inflammatory Bowel Disease Transcriptome and Metatranscriptome Meta-Analysis (IBD TaMMA) framework

Luca Massimino ORCID: orcid.org/0000-0003-3975-9148^1,2^na1,
Luigi Antonio Lamparelli ORCID: orcid.org/0000-0002-9571-2582²^na1,
Yashar Houshyar²,
Silvia D’Alessio³,
Laurent Peyrin-Biroulet^4,5,
Stefania Vetrano^1,2^na2,
Silvio Danese^1,2^na2 &
…
Federica Ungaro ORCID: orcid.org/0000-0001-5395-7795^1,2^na2

Nature Computational Science volume 1, pages 511–515 (2021)Cite this article

15k Accesses
30 Citations
9 Altmetric
Metrics details

Subjects

An Author Correction to this article was published on 29 September 2021

This article has been updated

A preprint version of the article is available at Research Square.

Abstract

Inflammatory bowel disease (IBD) is a class of chronic disorders whose etiogenesis is still unknown. Despite the high number of IBD-related omics studies, the RNA-sequencing data produced results that are hard to compare because of the experimental variability and different data analysis approaches. We here introduce the IBD Transcriptome and Metatranscriptome Meta-Analysis (TaMMA) framework, a comprehensive survey of publicly available IBD RNA-sequencing datasets. IBD TaMMA is an open-source platform where scientists can explore simultaneously the freely available IBD-associated transcriptomics and microbial profiles thanks to its interactive interface, resulting in a useful tool to the IBD community.

Results and lessons learned from the sbv IMPROVER metagenomics diagnostics for inflammatory bowel disease challenge

Article Open access 18 April 2023

The gut microbiome-metabolome dataset collection: a curated resource for integrative meta-analysis

Article Open access 15 October 2022

Single-cell meta-analysis of inflammatory bowel disease with scIBD

Article 15 June 2023

Main

Inflammatory bowel disease (IBD), including ulcerative colitis (UC) and Crohn’s disease (CD), is a class of multifaceted chronic inflammatory gut disorders characterized by an uncontrolled, resolution-failing inflammation that leads to bowel damage¹. In recent decades, numerous omics studies have focused on understanding IBD pathogenesis. Despite the high amount of RNA-sequencing (RNA-seq) data produced to help preclinical and clinical research, such studies are hard to integrate due to their experimental variability and different analytic approaches². To exploit the efforts made over the years by IBD experts in the field of next-generation sequencing (NGS), we here introduce a meta-analysis web app, the IBD Transcriptome and Metatranscriptome Meta-Analysis (TaMMA) platform. IBD TaMMA is a comprehensive survey of publicly available RNA-seq datasets from IBD-derived and control samples across different tissues, all analyzed with the same pipeline and batch-corrected for data harmonization and simultaneous comparison among the different studies. This tool, featuring increased statistical power due to its augmented sample size, provides to the scientific community a user-friendly, open-source platform where data-mining of the IBD-associated transcriptome and metatranscriptome can be faster and statistically more powerful than each single study alone, resulting in a useful tool for the IBD community.

Results

Various meta-analyses of gene expression profiles of patients with IBD from microarray datasets have already identified dysregulation of the expression of genes encoding for several inflammatory factors and RNA-binding proteins^3,4,5. However, these studies focused on a limited number of genes and lacked not only whole-transcriptome but also metatranscriptome profiling. The latter has recently emerged as a successful approach to uncover novel gut-populating microbial entities⁶.

To provide a wider picture of the whole transcriptome and metatranscriptome at different tissue and cell levels in patients with both UC and CD, we collected and analyzed publicly available RNA-seq datasets. As this involved 26 independent studies, we predicted an experiment-dependent bias, which we counteracted with ComBat⁷, a batch-correction algorithm that is well established in transcriptomics for adjusting unwanted sources of variation in the context of high-throughput experiments⁸, following the source and tissue of origin. We also tried to batch-correct the different library construction strategies, but their variance was already fully explained by the source study. The meta-analysis performed was used as the core to design the IBD TaMMA web app (Supplementary Fig. 1a), intended as a set of analyses displayed in a web browser, which allows quick access to differential gene expression and Gene Ontology functional enrichment results for the different conditions. Sample dispersion within the Uniform Manifold Approximation and Projection (UMAP), easily accessible through the IBD TaMMA platform, shows clustering following the tissue of origin but not the source study (Fig. 1a,b), indicating successful data harmonization. Consistently, housekeeping gene expression levels were found to be comparable across the different tissues and conditions (Supplementary Fig. 1b).

Notably, owing to the lack of patient metadata in the majority of studies integrated into IBD TaMMA, we could perform neither stratification nor clustering based on patient characteristics (age and gender). Similar limitations are found in previously published studies, although they offered important contributions for the development of innovative IBD-counteracting therapies. To mitigate limitations related to the absence of patient metadata, we first assessed whether this platform confirmed previously established IBD-specific features. Our platform indeed pinpoints strong differential gene expression among UC, CD and healthy (control) groups in the ileum, colon and rectum, as shown in Fig. 1c. Of note, IBD-specific proinflammatory signatures were confirmed. Specifically, by comparison with healthy tissues, IBD-derived intestinal tissues displayed increased expression levels of tumor necrosis factor-α (TNF), interferon-γ (IFNG), interleukin-12B (IL12B), integrin-α4 (ITGA4) and integrin-β7 (ITGB7), encoding for proteins known to be drivers of chronic inflammation and thus exploited as therapeutic targets for patients with IBD⁹ (Fig. 1d, Supplementary Fig. 1c and the ‘Old evidence from the literature’ tab at IBD TaMMA). Similarly, S100 calcium-binding protein A8 and A9 (S100A8 and A9) transcripts encoding for the two subunits of the fecal biomarker calprotectin¹⁰ and the recently emerged S100A12¹⁰ were increased in intestinal samples from CD and UC as compared to the healthy (Fig. 1e,f) groups. These results are well in line with most of the studies reporting these molecules as biomarkers of inflammation in patients with IBD¹⁰. Additionally, epithelium- and pro-angiogenic-related biological processes, known to be altered during gut chronic inflammatory disorders^11,12,13,14, were also found to be dysregulated in both UC and CD colon versus control colon, as well as in CD ileum versus control ileum (Supplementary Fig. 1d and ‘Old evidence from the literature’ tab at IBD TaMMA).

Metatranscriptomics performed on IBD and healthy stools¹⁵ paralleled previous metagenomics analysis, confirming the Bacteroidetes and Firmicutes phyla, followed by Actinobacteria and Proteobacteria, as the main colonizers of the fecal microbiota¹⁶ (Fig. 1g, upper bars). IBD TaMMA also highlighted IBD and healthy intestinal samples to be colonized by the same phyla, although with different proportions (Fig. 1g, lower bars). Moreover, decreased intestinal microbiota diversity, a well-known feature of IBD pathogenesis¹, was confirmed in IBD stools as compared to healthy stools (Fig. 1h), paralleled by decreased diversity also in the colon and ileum from the UC group and in the colon from the CD group (Fig. 1i,j). Interestingly, CD ileum showed increased microbiota diversity compared to the other groups (Fig. 1j), providing an insight into the disease location-dependent microbiota composition in patients with CD. Of note, IBD TaMMA also confirmed virome dysbiosis, with the expansion of Caudovirales in both pediatric IBD and UC samples^17,18 (Fig. 1k), as well as increased levels of the Herpesviridae family in IBD-derived samples and of the Hepadnaviridae family in UC ileum as compared to the healthy, as previously reported^19,20 (Supplementary Fig. 1e,f). Among the Herpesviridae-belonging viruses, cytomegalovirus (CMV) genus infection was previously associated with complicating UC, and its presence correlated with increased colectomy and mortality rates in patients with UC²¹. CMV includes different species of human beta herpesvirus species whose transcripts from IBD TaMMA were found to be upregulated in patients with IBD. Specifically, human beta herpesvirus 5 was highly abundant in UC and CD colon and CD ileum as compared to healthy tissues (Supplementary Fig. 1g and ‘Old evidence from literature’ tab at IBD TaMMA). These data support previously published evidence and indicate that high levels of the CMV genus-belonging beta herpesvirus 5 are associated with intestinal inflammation.

We then attempted to uncover novel aspects of the IBD-derived samples. We present evidence generated by meta-analysis of the archaeome and mycome composition. Archaeome analysis by TaMMA revealed that the archaea Nitrosophaerales, Haloferacales, Natrialbales and Thermococcales were among the most abundant archaea orders in the CD ileum, whereas the most abundant orders in UC ileum were Methanococcales, Methanobacteriales, Methanosarcinales and Methanomicrobiales, evidencing the differences between the two diseases in the ileal part. Interestingly, members of Methanomicrobiales were also found higher in UC colons, where it was the sole archaeal order to be statistically significant, while no differences were observed in colons from patients with CD. From these insights, we can conclude that each intestinal tract may display differential abundances of archaea, not only reflecting the specific gut tract, but also specific disease conditions (‘New evidence by TaMMA’ tab).

Regarding the mycome profile, CD and UC ileum both feature an increased abundance of Glomerellales, Tremellales and Hypocreales while also featuring a decreased abundance of Schizosaccharomycetales. Some orders were differentially abundant in the two conditions. Saccharomycetales, Ustilaginales, Malasseziales, Eurotiales, Mycosphaerellales and Magnaporthales were found to be differentially abundant exclusively in UC ileum, while Saccharomycetales, Ustilaginales and Sordariales were dysregulated only in CD ileum. Interestingly, different from the ileum, very few orders were found to be differentially abundant in the colon, perhaps resembling the different immune competence of the two tissues²⁰. In conclusion, the mycobiome composition is gut tract- and disease-specific, opening additional horizons for IBD-associated microbiota diversity (‘New evidence by TaMMA’ tab).

It is noteworthy that, during the analysis, most of the human unmapped reads failed to be classified by metatranscriptomics profiling and therefore were considered as NGS dark matter. These data have been submitted to a data repository (given in the ‘Data availability’ statement) as we believe these data can also contribute to the understanding of gene and microbial entities not yet known but that may be the aim of future investigations (to discover new microbial entities).

In conclusion, together, these pieces of evidence propose IBD TaMMA as a platform that can confirm well-known features of IBD pathogenesis, hopefully resulting in a useful open-source tool for uncovering further insights into personalized diagnosis and prognosis upon treatment.

Discussion

Numerous whole-transcriptome analyses of IBD samples have been performed, but a platform where these data can be browsed and compared is currently lacking. The IBD TaMMA web app introduces an integrative analysis of all IBD-related publicly available RNA-seq datasets and has been designed to have a graphical interface that allows users to interact with it, helped by a guide, icons and dropdowns selecting specific analysis and comparisons. However, owing to the lack of patient metadata, many clinically relevant aspects cannot be confirmed, such as the correlation between CMV infection and increased risk of colectomy in patients with UC²¹. Of course, including other characteristics will reinforce the web app, and we propose to update it soon as they are available in other datasets. We thus hope that more clinical metadata will be added over time, finally providing more insights into the different aspects of chronic intestinal inflammation.

As IBD TaMMA is a visualization rather than an analysis web app, users cannot upload or analyze their data independently. However, although future versions of IBD TaMMA may be developed as analysis web apps, in the meantime we encourage users to ask for the analysis of new datasets through the dedicated link https://github.com/Humanitas-Danese-s-omics/ibd-meta-analysis-data/issues.

As future steps for the IBD TaMMA, additional advantages will include the analysis of samples from IBD patients with extraintestinal manifestations (https://www.crohnscolitisfoundation.org/what-is-ibd/extraintestinal-complications-ibd). This may help understand the mechanism through which some patients may experience IBD-related complications.

Updated versions of IBD TaMMA will include the analysis of differential gene abundances for the microbiota, adding valuable information, as well as the realization of a dedicated TaMMA-like platform comprehensively analyzing IBD along with other immune-mediated disease-derived samples, which may expedite the discovery of shared features.

IBD TaMMA will also be implemented with other omics analyses, such as genomics and proteomics, and, thanks to the multi-omics analysis approach²², we will show the real variance explained by each omics, instead of using a single dataset at a time (for example, archaea only), which would consider only a part of the total picture. This is indeed the direction in which we are proceeding for future releases of the updated versions of IBD TaMMA, offering the opportunity to propose new hypotheses and insights for a better comprehension of IBD pathogenesis and the development of personalized treatments.

Methods

RNA-sequencing data

The pipelines for RNA-seq data download and analysis were designed with Snakemake v6.4.1²³. FASTQ reads from 3,853 RNA-seq data (6.5 terabytes, 13 tera base pairs) were searched in NCBI GEO/SRA using the following search query: ‘(crohn OR colitis OR ibd) AND expression AND sequencing AND sapiens’, last queried on 6 March 2021. All freely available IBD-related datasets were included. Only low-quality reads within samples were discarded before the analysis pipeline.

FASTQ file download from NCBI SRA and initial quality control (QC) filtering were performed with fastq-dump v2.11.0 (https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc&f=fastq-dump). Additional quality checks and adaptor trimming were performed with FastQC v0.11.9 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and Trimmomatic v0.39²⁴.

Read mapping to the human reference genome (GRCh38 primary genome assembly), finalized with GENCODE v35 gene annotations²⁵, and gene quantification were performed with STAR v2.7.9a²⁶. Post-mapping QC was performed with RSeQC v4.0.0²⁷ and MultiQC v1.10.1²⁸.

Meta-analysis

Because these data came from 26 different studies from different laboratories, we counteracted the presumptive bias by batch correction in accordance with source (batch covariate) and tissue of origin (explaining other possible covariance). Batch-effect detection and correction were performed with ComBat⁷, within the Surrogate Variable Analysis v1.8 R package (https://bioconductor.org/packages/release/bioc/html/sva.html). Correction was performed with the following parameters: sva::ComBat(dat=raw_counts, batch=source, mod=tissue), where raw_counts is the merged count matrix, and the source of origin was used as a batch covariate and the tissue type as a possible other covariate.

Once the gene counts were adjusted, samples were divided into groups in accordance with the tissue of origin and patient condition before differential expression analysis and Gene Ontology functional enrichment. Differential human gene expression and differential species/family/order abundance analyses were performed with DESeq2 v1.32.08²⁹. Multi-core parallelization was achieved with BiocParallel v1.26.0 (https://github.com/Bioconductor/BiocParallel). Gene-level annotations were managed by ensembldb v2.16.09 and EnsDb.Hsapiens.v86 v2.99.0 (http://bioconductor.org/packages/release/data/annotation/html/EnsDb.Hsapiens.v86.html).

The statistics for human differential gene expression, Shannon diversity and viral entity relative abundance are shown in Supplementary Table 2. Comparisons with FDR < 1 × 10⁻¹⁰ are considered statistically significant.

Functional enrichment analysis of the Gene Ontology biological process was performed with GeneSCF v1.110³⁰, using differentially expressed genes with FDR < 1 × 10⁻¹⁰.

Finally, the reads failing to map to the human genome were subjected to metatranscriptomics profiling by taxonomic classification of archaeal, bacterial, eukaryotic or viral genes. Taxonomic classification of human genome-unmapped reads by exact k-mer matching against archaeal, bacterial, eukaryotic or viral genomes was performed with Kraken2 v2.1.211³¹. Bacterial species Spearman diversity and Simpson dominance indices were calculated with vegan v2.5 (https://github.com/vegandevs/vegan).

Low-dimensional embedding of high-dimensional data was performed with umap v0.2.7.0 (https://cran.r-project.org/web/packages/umap/vignettes/umap.html).

Data carpentry was performed in R with tidyverse v1.3.1 (https://www.tidyverse.org/) and Python with NumPy v1.20.3 (https://numpy.org/) and pandas v1.2.4 (https://pandas.pydata.org/). Plots in the figures were created with ggplot2 v3.3.312.

Clinical metadata

Even if of valuable importance, clinical data are often missing from the analyzed datasets. However, because we recognize that, even if partial, this information could be useful, we take into account disease stage (n = 21), Mayo score (n = 206), Crohn’s disease activity index (CDAI, n = 155) and pediatric ulcerative colitis activity index (PUCAI, n = 206) as information deserving annotation. This can be retrieved in the metadata table and the ‘Color by’ dropdown.

Web app design

The web app was developed with Dash v1.20.0 (https://dash.plotly.com/) and Plotly v4.14.3 (https://plotly.com/), stored in GitHub and run in Heroku. Programmatic access to the data tables is performed by Requests v2.25.1 (https://docs.python-requests.org/). Programmatic access to the data tables is performed by Requests v2.25.1 (https://docs.python-requests.org/).

The web app will be updated as new datasets become available. Users can suggest missing studies by clicking on the ‘Suggestions’ link in the TaMMA footnotes.

The complete guide on how to use the TaMMA web app is available at https://ibd-tamma.readthedocs.io/.

Statistics and reproducibility

Statistical differences between groups in Fig. 1d–k and Supplementary Fig. 1b–c,e–g were calculated by analysis of variance with Tukey’s honestly significant difference post hoc test for multiple comparisons. Differences with adjusted P ≤ 0.05 were considered significant.

Ethics statement

Please refer to the original articles for the ethical approval of the human studies mentioned in this paper.

Data availability

FASTQ reads were mined from the NCBI GEO/SRA data repositories. Study IDs with their respective links and references are provided in Supplementary Table 1. The relevant datasets mentioned in this paper are available in the summary and metadata tabs within the IBD TaMMA web app, and the described results are available in the literature tab within the app. The underlying data for the web app are available at https://github.com/Humanitas-Danese-s-omics/ibd-meta-analysis-data and in the Open Science Framework repository³². Human unmapped FASTQ reads that failed to be classified by metatranscriptomics profiling have been considered as NGS dark matter. The IBD TaMMA NGS dark matter is available at https://dataverse.harvard.edu/dataverse/tamma-dark-matter. Source data are provided with this paper and in the Open Science Framework repository³².

Code availability

The code is stored in GitHub at https://github.com/Humanitas-Danese-s-omics/ibd-meta-analysis and in the Open Science Framework repository³², and running at https://ibd-meta-analysis.herokuapp.com/.

Change history

29 September 2021
A Correction to this paper has been published: https://doi.org/10.1038/s43588-021-00146-4

References

Aldars-García, L., Marin, A. C., Chaparro, M. & Gisbert, J. P. The interplay between immune system and microbiota in inflammatory bowel disease: a narrative review. Int. J. Mol. Sci. 22, 3076 (2021).
Article Google Scholar
Seyed Tabib, N. S. et al. Big data in IBD: big progress for clinical practice. Gut 69, 1520–1532 (2020).
Article Google Scholar
Li, X. et al. Meta-analysis of expression profiling data indicates need for combinatorial biomarkers in pediatric ulcerative colitis. J. Immunol. Res. 2020, 8279619 (2020).
Article Google Scholar
Naz, S. et al. Transcriptome meta-analysis identifies immune signature comprising of RNA binding proteins in ulcerative colitis patients. Cell. Immunol. 334, 42–48 (2018).
Article Google Scholar
Vennou, K. E., Piovani, D., Kontou, P. I., Bonovas, S. & Bagos, P. G. Multiple outcome meta-analysis of gene-expression data in inflammatory bowel disease. Genomics 112, 1761–1767 (2020).
Article Google Scholar
Ungaro, F., Massimino, L., D’Alessio, S. & Danese, S. The gut virome in inflammatory bowel disease pathogenesis: from metagenomics to novel therapeutic approaches. United Eur. Gastroenterol. J. 7, 999–1007 (2019).
Article Google Scholar
Stein, C. K. et al. Removing batch effects from purified plasma cell gene expression microarrays with modified ComBat. BMC Bioinformatics 16, 63 (2015).
Article Google Scholar
Leek, J. T., Johnson, W. E., Parker, H. S., Jaffe, A. E. & Storey, J. D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).
Article Google Scholar
Argollo, M., Kotze, P. G., Kakkadasam, P. & D’Haens, G. Optimizing biologic therapy in IBD: how essential is therapeutic drug monitoring? Nat. Rev. Gastroenterol. Hepatol. 17, 702–710 (2020).
Article Google Scholar
Liu, F., Lee, S. A., Riordan, S. M., Zhang, L. & Zhu, L. Global studies of using fecal biomarkers in predicting relapse in inflammatory bowel disease. Front. Med. (Lausanne) 7, 580803 (2020).
Article Google Scholar
Martini, E., Krug, S. M., Siegmund, B., Neurath, M. F. & Becker, C. Mend your fences: the epithelial barrier and its relationship with mucosal immunity in inflammatory bowel disease. Cell. Mol. Gastroenterol. Hepatol. 4, 33–46 (2017).
Article Google Scholar
Danese, S. Role of the vascular and lymphatic endothelium in the pathogenesis of inflammatory bowel disease: ‘brothers in arms’. Gut 60, 998–1008 (2011).
Article Google Scholar
Schulz-Kuhnt, A., Neurath, M. F., Wirtz, S. & Atreya, I. Innate lymphoid cells as regulators of epithelial integrity: therapeutic implications for inflammatory bowel diseases. Front. Med. (Lausanne) 8, 656745 (2021).
Article Google Scholar
Alkim, C., Alkim, H., Koksal, A. R., Boga, S. & Sen, I. Angiogenesis in inflammatory bowel disease. Int. J. Inflam. 2015, 970890 (2015).
Google Scholar
Lloyd-Price, J. et al. Multi-omics of the gut microbial ecosystem in inflammatory bowel diseases. Nature 569, 655–662 (2019).
Article Google Scholar
Lozupone, C. A., Stombaugh, J. I., Gordon, J. I., Jansson, J. K. & Knight, R. Diversity, stability and resilience of the human gut microbiota. Nature 489, 220–230 (2012).
Article Google Scholar
Fernandes, M. A. et al. Enteric virome and bacterial microbiota in children with ulcerative colitis and Crohn disease. J. Pediatr. Gastroenterol. Nutr. 68, 30–36 (2019).
Article Google Scholar
Zuo, T. et al. Gut mucosal virome alterations in ulcerative colitis. Gut 68, 1169–1179 (2019).
Article Google Scholar
Wang, W. et al. Metagenomic analysis of microbiome in colon tissue from subjects with inflammatory bowel diseases reveals interplay of viruses and bacteria. Inflamm. Bowel Dis. 21, 1419–1427 (2015).
Google Scholar
Ungaro, F. et al. Metagenomic analysis of intestinal mucosa revealed a specific eukaryotic gut virome signature in early-diagnosed inflammatory bowel disease. Gut Microbes 10, 149–158 (2019).
Article Google Scholar
Nguyen, M., Bradford, K., Zhang, X. & Shih, D. Q. Cytomegalovirus reactivation in ulcerative colitis patients. Ulcers 2011, 1–7 (2011).
Article Google Scholar
Argelaguet, R. et al. Multi-omics factor analysis—a framework for unsupervised integration of multi-omics data sets. Mol. Syst. Biol. 14, e8124 (2018).
Article Google Scholar
Köster, J. & Rahmann, S. Snakemake—a scalable bioinformatics workflow engine. Bioinformatics 28, 2520–2522 (2012).
Article Google Scholar
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Article Google Scholar
Frankish, A. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).
Article Google Scholar
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Article Google Scholar
Wang, L., Wang, S. & Li, W. RSeQC: quality control of RNA-seq experiments. Bioinformatics 28, 2184–2185 (2012).
Article Google Scholar
Ewels, P., Magnusson, M., Lundin, S. & Käller, M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32, 3047–3048 (2016).
Article Google Scholar
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Article Google Scholar
Subhash, S. & Kanduri, C. GeneSCF: a real-time based functional enrichment tool with support for multiple organisms. BMC Bioinformatics 17, 365 (2016).
Article Google Scholar
Wood, D. E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol. 20, 257 (2019).
Article Google Scholar
Massimino, L., Ungaro, F. & Lamparelli, L. A. IBD TaMMA data (Open Science Framework, 2021); https://doi.org/10.17605/osf.io/yrxa7

Download references

Acknowledgements

We acknowledge support from the ‘Fondazione Cariplo per la ricerca Biomedica’ under grant agreement #2018-0112 to F.U., ‘Fondazione AMICI ONLUS’ for Research Prize 2020 to F.U. and the Innovative Medicines Initiative (IMI) 2020–2024 award #853995 (ImmUniverse) to S.D. and S.V.

Author information

These authors contributed equally: Luca Massimino, Luigi Antonio Lamparelli.
These authors jointly supervised this work: Stefania Vetrano, Silvio Danese, Federica Ungaro.

Authors and Affiliations

Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, Milan, Italy
Luca Massimino, Stefania Vetrano, Silvio Danese & Federica Ungaro
IBD Center, IRCCS Humanitas Research Hospital, Rozzano, Milan, Italy
Luca Massimino, Luigi Antonio Lamparelli, Yashar Houshyar, Stefania Vetrano, Silvio Danese & Federica Ungaro
PhoenixLAB, Lodi, Italy
Silvia D’Alessio
Inserm NGERE, University of Lorraine, Vandoeuvre-les-Nancy, France
Laurent Peyrin-Biroulet
Nancy University Hospital, Vandoeuvre-les-Nancy, France
Laurent Peyrin-Biroulet

Authors

Luca Massimino
View author publications
You can also search for this author in PubMed Google Scholar
Luigi Antonio Lamparelli
View author publications
You can also search for this author in PubMed Google Scholar
Yashar Houshyar
View author publications
You can also search for this author in PubMed Google Scholar
Silvia D’Alessio
View author publications
You can also search for this author in PubMed Google Scholar
Laurent Peyrin-Biroulet
View author publications
You can also search for this author in PubMed Google Scholar
Stefania Vetrano
View author publications
You can also search for this author in PubMed Google Scholar
Silvio Danese
View author publications
You can also search for this author in PubMed Google Scholar
Federica Ungaro
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

L.M. and F.U. provided conceptualization and wrote the original draft. L.M., L.A.L. and Y.H. carried out the data search and acquisition. L.M. and L.A.L. contributed bioinformatics analysis and app development and wrote the guide. S.D., L.P.-B., S.V., S.D. and F.U. reviewed and edited the manuscript. S.V., S.D. and F.U. provided supervision and resources and carried out funding acquisition.

Corresponding authors

Correspondence to Luca Massimino or Federica Ungaro.

Ethics declarations

Competing interests

S.D. has served as a speaker, consultant and advisory board member for Schering-Plough, Abbott (AbbVie) Laboratories, Merck and Co, UCB Pharma, Ferring, Cellerix, Millenium Takeda, Nycomed, Pharmacosmos, Actelion, Alfa Wasserman, Genentech, Grunenthal, Pfizer, AstraZeneca, Novo Nordisk, Vifor and Johnson & Johnson. The other authors declare no competing interests.

Additional information

Peer review information Nature Computational Science thanks Tamás Korcsmáros, Joana Pimenta Bernardes and Gholamali Rahnavard for their contribution to the peer review of this work. Handling editor: Fernando Chirigati, in collaboration with the Nature Computational Science team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary text and Fig. 1.

Supplementary Table 1

Table describing the input datasets and their references.

Supplementary Table 2

Statistics of differential gene expression analyses and relative abundance comparisons mentioned in the manuscript.

Supplementary Data 1

Normalized gene counts, endothelium-related gene ontologies categories and virus relative abundances.

Source data

Source Data Fig. 1

Fully annotated UMAP, differential gene expression results, bacteria order relative abundance, Shannon diversity indices, normalized gene counts and Caudovirales relative abundance.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Massimino, L., Lamparelli, L.A., Houshyar, Y. et al. The Inflammatory Bowel Disease Transcriptome and Metatranscriptome Meta-Analysis (IBD TaMMA) framework. Nat Comput Sci 1, 511–515 (2021). https://doi.org/10.1038/s43588-021-00114-y

Download citation

Received: 30 April 2021
Accepted: 16 July 2021
Published: 20 August 2021
Issue Date: August 2021
DOI: https://doi.org/10.1038/s43588-021-00114-y

This article is cited by

A multi-omic analysis reveals the esophageal dysbiosis as the predominant trait of eosinophilic esophagitis
- Luca Massimino
- Alberto Barchi
- Silvio Danese
Journal of Translational Medicine (2023)
Impact of diet and host genetics on the murine intestinal mycobiome
- Yask Gupta
- Anna Lara Ernst
- Tanya Sezin
Nature Communications (2023)
Single-cell meta-analysis of inflammatory bowel disease with scIBD
- Hu Nie
- Peilu Lin
- Lei Zhang
Nature Computational Science (2023)
Genetics of circulating inflammatory proteins identifies drivers of immune-mediated disease risk and therapeutic targets
- Jing Hua Zhao
- David Stacey
- James E. Peters
Nature Immunology (2023)
Genome-wide DNA methylation profiling in differentiating Crohn’s disease from intestinal tuberculosis
- Hao Wu
- Hongchun Liu
- Lili Liu
Genes & Genomics (2022)