Featured
-
Article
| Open Access
rworkflows: automating reproducible practices for the R community
Reproducibility is essential for the progress of research, yet achieving it remains elusive even in computational fields. Here, the authors develop the rworkflows suite, making robust CI/CD workflows easy and freely accessible to all R package developers.
- Brian M. Schilder
- , Alan E. Murphy
- & Nathan G. Skene
-
Article
| Open Access
MetaCC allows scalable and integrative analyses of both long-read and short-read metagenomic Hi-C data
The authors develop an integrative and scalable framework to eliminate systematic biases and retrieve high-quality metagenome-assembled genomes using either long-read or short-read metagenomic Hi-C data.
- Yuxuan Du
- & Fengzhu Sun
-
Article
| Open Access
HypoRiPPAtlas as an Atlas of hypothetical natural products for mass spectrometry database search
A gap exists between large-scale genome mining and mass spectral datasets for natural product discovery. Here the authors bridge the gap by developing HypoRiPPAtlas, an Atlas of hypothetical natural product structures, which is ready-to-use for in silico database search of tandem mass spectra.
- Yi-Yuan Lee
- , Mustafa Guler
- & Hosein Mohimani
-
Article
| Open Access
DNA 5-methylcytosine detection and methylation phasing using PacBio circular consensus sequencing
Existing methods for detecting DNA methylation (5mC) are less accurate and robust. Here, the authors develop a deep learning tool, ccsmeth, and a Nextflow pipeline, ccsmethphase, for genome-wide 5mCpG detection and phasing with high accuracy from CCS reads in humans.
- Peng Ni
- , Fan Nie
- & Jianxin Wang
-
Article
| Open Access
ExpressAnalyst: A unified platform for RNA-sequencing analysis in non-model species
RNA-sequencing data analysis is difficult for non-model species that have no reference genome. ExpressAnalyst enables RNA-sequencing analysis for any eukaryotic species in less than 24 h, on a laptop, and without any programming.
- Peng Liu
- , Jessica Ewald
- & Jianguo Xia
-
Article
| Open Access
A randomized clinical trial assessing the effect of automated medication-targeted alerts on acute kidney injury outcomes
In a multicenter randomized trial, researchers found that electronic alerts increased the rate of discontinuation of potential nephrotoxins. This did not translate into improved clinical outcomes, except among those exposed to proton-pump inhibitors.
- F. Perry Wilson
- , Yu Yamamoto
- & Ugochukwu Ugwuowo
-
Article
| Open Access
Skin basal cell carcinomas assemble a pro-tumorigenic spatially organized and self-propagating Trem2+ myeloid niche
Tumor microenvironment elements can influence tumor state, including in skin basal cell carcinomas. Here the authors show that spatially organized and self-propagating TREM2+ tumor-associated macrophages promote Ly6D- tumor cell proliferation via secretion of oncostatin M.
- Daniel Haensel
- , Bence Daniel
- & Anthony E. Oro
-
Article
| Open Access
Spatial probabilistic mapping of metabolite ensembles in mass spectrometry imaging
Spatial visualization of metabolites in tissues via mass spectrometry imaging can be prone to user perception bias. Here, the authors report the computational framework moleculaR that introduces probabilistic data-dependent molecular mapping of nonrandom spatial patterns of metabolite signals.
- Denis Abu Sammour
- , James L. Cairns
- & Carsten Hopf
-
Article
| Open Access
A comprehensive platform for analyzing longitudinal multi-omics data
The analysis of longitudinal bulk and single-cell multi-omics data is a highly complex task. Here, the authors introduce PALMO, a software platform with five modules to analyse longitudinal bulk and single-cell multi-omics data, which is extensively tested in external datasets that include multiple omics modalities.
- Suhas V. Vasaikar
- , Adam K. Savage
- & Xiao-jun Li
-
Article
| Open Access
DNA-Aeon provides flexible arithmetic coding for constraint adherence and error correction in DNA storage
The extensive information capacity of DNA makes it an attractive alternative to traditional data storage. DNA-Aeon is a DNA data storage solution that can correct all error types commonly observed in DNA storage, while encoding data into sequences that meet user-defined constraints such as GC content, homopolymer length, and no undesired motifs.
- Marius Welzel
- , Peter Michael Schwarz
- & Dominik Heider
-
Article
| Open Access
AlphaPeptDeep: a modular deep learning framework to predict peptide properties for proteomics
Deep learning (DL) has been frequently used in mass spectrometry-based proteomics but there is still a lot of potential. Here, the authors develop a framework that enables building DL models to predict arbitrary peptide properties with only a few lines of code.
- Wen-Feng Zeng
- , Xie-Xuan Zhou
- & Matthias Mann
-
Article
| Open Access
Deep autoencoder for interpretable tissue-adaptive deconvolution and cell-type-specific gene analysis
Traditional bulk sequencing data lack information about cell-type-specific gene expression. Here, the authors develop a Tissue-AdaPtive autoEncoder (TAPE), a deep learning method connecting bulk RNA-seq and single-cell RNA-seq, and apply it to analyze the cell type fractions and cell-type-specific gene expression in clinical data.
- Yanshuo Chen
- , Yixuan Wang
- & Yu Li
-
Article
| Open Access
Integrating and formatting biomedical data as pre-calculated knowledge graph embeddings in the Bioteque
Biomedical data is accumulating at a fast pace and integrating it into a unified framework is a major challenge. Here, the authors present a resource that contains pre-calculated biomedical descriptors derived from a very large knowledge graph.
- Adrià Fernández-Torras
- , Miquel Duran-Frigola
- & Patrick Aloy
-
Article
| Open Access
The automated Galaxy-SynBioCAD pipeline for synthetic biology design and engineering
Automated design and build processes can rapidly accelerate work in synthetic biology and metabolic engineering. Here the authors present Galaxy-SynBioCAD, a toolshed for synthetic biology, metabolic engineering, and industrial biotechnology that they use to build and execute Galaxy scientific workflows from pathway design to strain engineering through the automated generation of scripts driving robotic workstations.
- Joan Hérisson
- , Thomas Duigou
- & Jean-Loup Faulon
-
Article
| Open Access
Connecting omics signatures and revealing biological mechanisms with iLINCS
There are only a few platforms that integrate multiple omics data types, bioinformatics tools, and interfaces for integrative analyses and visualization that do not require programming skills. Here the authors present an integrative web-based platform for analysis of omics data and signatures of cellular perturbations.
- Marcin Pilarczyk
- , Mehdi Fazel-Najafabadi
- & Mario Medvedovic
-
Comment
| Open Access
Community voices: policy proposals to promote inclusion in academia through the lens of women in science
Diversity is a creative force that broadens views and enhances ideas; it increases productivity as well as the impact of our science, making our respective organisations more agile and timely. Equality of opportunity is a key to success for any research organisation. Here we argue that every research organisation, whether in academia or in industry, needs to have better inclusion policies to harness the benefits of diversity in research. Drawing from our personal experiences and perspectives as women in science, we share our suggestions on how to promote inclusion in academia and create a better research culture for all. Our shared experiences highlight the many hurdles women in science face on a daily basis. We stress that rules and regulations, as well as education for awareness, will play a critical role in this much-needed shift from a male-dominated scientific culture that dates from Victorian times to a modern focus on gender equality in science. The key ingredients of this new culture will be flexibility, transparency, fairness and thoughtfulness.
- Sarah A. Teichmann
- , Muzlifah Haniffa
- & Jasmin Fisher
-
Article
| Open Access
Context-aware deconvolution of cell–cell communication with Tensor-cell2cell
Cellular contexts such as disease state, organismal life stage and tissue microenvironment, shape intercellular communication, and ultimately affect an organism’s phenotypes. Here, the authors present Tensor-cell2cell, an unsupervised method for deciphering context-driven intercellular communication.
- Erick Armingol
- , Hratch M. Baghdassarian
- & Nathan E. Lewis
-
Article
| Open Access
A microfluidic optimal experimental design platform for forward design of cell-free genetic networks
Characterization of cell-free genetic networks is inherently difficult. Here the authors use optimal experimental design and microfluidics to improve characterization, demonstrating modularity and predictability of parts in applied test cases.
- Bob van Sluijs
- , Roel J. M. Maas
- & Wilhelm T. S. Huck
-
Article
| Open Access
Enhancing bioreactor arrays for automated measurements and reactive control with ReacSight
Small-scale bioreactors are increasingly used in quantitative biology. Here, the authors report ReacSight, a software solution to connect reactor arrays with sensitive measurement devices using low-cost pipetting robots and provide applications leveraging optogenetic control in yeast.
- François Bertaux
- , Sebastián Sosa-Carrillo
- & Gregory Batt
-
Article
| Open Access
Exploring the cellular landscape of circular RNAs using full-length single-cell RNA sequencing
Studies of circular RNAs have often been limited to the tissue or organism level. Here, the authors investigate the comprehensive expression landscape of circRNAs in human and mouse at single-cell resolution, revealing highly specific and dynamic changes of circRNAs during multiple biological processes.
- Wanying Wu
- , Jinyang Zhang
- & Fangqing Zhao
-
Article
| Open Access
Comparison of methods and resources for cell-cell communication inference from single-cell RNA-Seq data
Multiple methods to infer cell-cell communication (CCC) from single cell data are currently available. Here, the authors systematically compare 16 CCC inference resources and 7 methods, and develop the LIANA framework as an interface to use and compare all these approaches.
- Daniel Dimitrov
- , Dénes Türei
- & Julio Saez-Rodriguez
-
Article
| Open Access
A streamlined platform for analyzing tera-scale DDA and DIA mass spectrometry data enables highly sensitive immunopeptidomics
Immunopeptidomics benefits from highly sensitive mass spectrometry (MS). Here, the authors present a computational platform for integrating data-dependent and -independent acquisition MS approaches, and demonstrate its utility for deeper immunopeptidome profiling.
- Lei Xin
- , Rui Qiao
- & Ming Li
-
Article
| Open Access
The 4D Nucleome Data Portal as a resource for searching and visualizing curated nucleomics data
This paper describes the ‘4DN Data Portal’ that hosts data generated by the 4D Nucleome network, including Hi-C and other chromatin conformation capture assays, as well as various sequencing-based and imaging-based assays. Raw data have been uniformly processed to increase comparability and the portal is implemented with visualization tools to browse the data without download.
- Sarah B. Reiff
- , Andrew J. Schroeder
- & Peter J. Park
-
Article
| Open Access
Enabling reactive microscopy with MicroMator
In microscopy, applications in which reactiveness is needed are multifarious. Here the authors report MicroMator, a Python software package for reactive experiments, which they use for applications requiring real-time tracking and light-targeting at the single-cell level.
- Zachary R. Fox
- , Steven Fletcher
- & Gregory Batt
-
Article
| Open Access
Interactive single-cell data analysis using Cellar
Here the authors introduce Cellar, an interactive webserver for analyzing single-cell omics data. They show that Cellar supports all aspects of the analysis and modeling process and can be used to integrate different types of single cell omics and spatial data.
- Euxhen Hasanaj
- , Jingtao Wang
- & Ziv Bar-Joseph
-
Article
| Open Access
Expanding biochemical knowledge and illuminating metabolic dark matter with ATLASx
Mapping the dark matter of metabolism remains an open challenge that can be addressed globally and systematically by existing computational solutions. Here the authors present ATLASx, a repository of known and predicted enzymatic reactions connecting millions of compounds, to help synthetic biologists and metabolic engineers design and explore metabolic pathways.
- Homa MohammadiPeyhani
- , Jasmin Hafner
- & Vassily Hatzimanikatis
-
Article
| Open Access
Fully-automated and ultra-fast cell-type identification using specific marker combinations from single-cell transcriptomic data
Cell types are typically identified in single cell transcriptomic data by manual annotation of cell clusters using established marker genes. Here the authors present a fully-automated computational platform that can quickly and accurately distinguish between cell types.
- Aleksandr Ianevski
- , Anil K. Giri
- & Tero Aittokallio
-
Article
| Open Access
A platform for oncogenomic reporting and interpretation
The interpretation of somatic variants in cancer is challenging due to the scale and complexity of sequencing data. Here, the authors present PORI, an open-source framework for interpreting somatic variants in cancer using graph knowledge base tools, automated reporting, and manual curation.
- Caralyn Reisle
- , Laura M. Williamson
- & Steven J. M. Jones
-
Article
| Open Access
Emulator-based Bayesian optimization for efficient multi-objective calibration of an individual-based model of malaria
Individual-based models have become important tools in the global battle against infectious diseases, yet model complexity can make calibration challenging. Here, the authors propose a Bayesian optimization framework to calibrate a complex malaria transmission simulator.
- Theresa Reiker
- , Monica Golumbeanu
- & Melissa A. Penny
-
Article
| Open Access
Network medicine for disease module identification and drug repurposing with the NeDRex platform
There is an unmet need for adaptable tools allowing biomedical researchers to employ network-based drug repurposing approaches for their individual use cases. Here, the authors close this gap with NeDRex, an integrative and interactive platform.
- Sepideh Sadegh
- , James Skelton
- & Tim Kacprowski
-
Article
| Open Access
Orchestrating and sharing large multimodal data for transparent and reproducible research
It is no secret that a significant part of scientific research is difficult to reproduce. Here, the authors present a cloud-computing platform called ORCESTRA that facilitates reproducible processing of multimodal biomedical data using customizable pipelines and well-documented data objects.
- Anthony Mammoliti
- , Petr Smirnov
- & Benjamin Haibe-Kains
-
Article
| Open Access
A scalable, secure, and interoperable platform for deep data-driven health management
The increasing scale and scope of biomedical data is generating tremendous opportunities for improving health outcomes, but also raises new challenges ranging from data acquisition and storage to data analysis and utilization. To meet these challenges, the authors develop the Personal Health Dashboard, which provides an end-to-end solution for deep biomedical data analytics.
- Amir Bahmani
- , Arash Alavi
- & Michael P. Snyder
-
Article
| Open Access
flDPnn: Accurate intrinsic disorder prediction with putative propensities of disorder functions
The authors present flDPnn, a computational tool for disorder and disorder function predictions from protein sequences. flDPnn was assessed with the data from the “Critical Assessment of Protein Intrinsic Disorder Prediction” experiment and on an independent and low-similarity test dataset, which show that flDPnn offers accurate predictions of disorder, fully disordered proteins and four common disorder functions.
- Gang Hu
- , Akila Katuwawala
- & Lukasz Kurgan
-
Article
| Open Access
A clinical deep learning framework for continually learning from cardiac signals across diseases, time, modalities, and institutions
Deep learning algorithms trained on data streamed temporally from different clinical sites and from a multitude of physiological sensors are generally affected by a degradation in performance. To mitigate this, the authors propose a continual learning strategy that employs a replay buffer.
- Dani Kiyasseh
- , Tingting Zhu
- & David Clifton
-
Article
| Open Access
Ion identity molecular networking for mass spectrometry-based metabolomics in the GNPS environment
Molecular networking connects molecules based on their fragment ion mass spectra (MS2), but may leave adduct species from the same molecular family separate. To address this issue, the authors develop a networking approach that fuses MS1- and MS2-based networks and integrate it into the GNPS environment.
- Robin Schmid
- , Daniel Petras
- & Pieter C. Dorrestein
-
Article
| Open Access
A global resource for genomic predictions of antimicrobial resistance and surveillance of Salmonella Typhi at pathogenwatch
Whole genome sequencing data are increasingly becoming routinely available but generating actionable insights is challenging. Here, the authors describe Pathogenwatch, a web tool for genomic surveillance of S. Typhi, and demonstrate its use for antimicrobial resistance assignment and strain risk assessment.
- Silvia Argimón
- , Corin A. Yeats
- & David M. Aanensen
-
Article
| Open Access
The VRNetzer platform enables interactive network analysis in Virtual Reality
Data-rich networks can be difficult to interpret beyond a certain size. Here, the authors introduce a platform that uses virtual reality to allow the visual exploration of large networks, while interfacing with data repositories and other analytical methods to improve the interpretation of big data.
- Sebastian Pirch
- , Felix Müller
- & Jörg Menche
-
Article
| Open Access
Disease trajectory browser for exploring temporal, population-wide disease progression patterns in 7.2 million Danish patients
The Danish health system has been collecting health-related data on the entire Danish population for years. Here the authors present the Danish Disease Trajectory Browser (DTB), which allows users to explore population-wide disease progression patterns from data collected between 1994 and 2018.
- Troels Siggaard
- , Roc Reguant
- & Søren Brunak
-
Article
| Open Access
Mapping allosteric communications within individual proteins
The computational prediction of protein allostery can guide experimental studies of protein function and cellular activity. Here, the authors develop a network-based method to detect allosteric coupling within proteins solely based on their structures, and set up a webserver for allostery prediction.
- Jian Wang
- , Abha Jain
- & Nikolay V. Dokholyan
-
Article
| Open Access
Deep learning for genomics using Janggu
Deep learning is becoming a popular approach for understanding biological processes but can be hard to adapt to new questions. Here, the authors develop Janggu, a Python library that aims to ease data acquisition and model evaluation and facilitate deep learning applications in genomics.
- Wolfgang Kopp
- , Remo Monti
- & Altuna Akalin
-
Article
| Open Access
Multimodal image registration and connectivity analysis for integration of connectomic data from microscopy to MRI
Many approaches exist to process data from individual imaging modalities, but integrating them is challenging. The authors develop an automated resource that enables co-registered network- and tract-level analysis of macroscopic in-vivo imaging and microscopic imaging of cleared tissue.
- Maged Goubran
- , Christoph Leuze
- & Michael Zeineh
-
Article
| Open Access
Species abundance information improves sequence taxonomy classification accuracy
Taxonomy classification of amplicon sequences is an important step in investigating microbial communities in microbiome analysis. Here, the authors show that incorporating environment-specific taxonomic abundance information can lead to improved species-level classification accuracy across common sample types.
- Benjamin D. Kaehler
- , Nicholas A. Bokulich
- & Gavin A. Huttley
-
Article
| Open Access
Integrating biomedical research and electronic health records to create knowledge-based biologically meaningful machine-readable embeddings
The Scalable Precision Medicine Oriented Knowledge Engine (SPOKE) is a heterogeneous knowledge network that integrates information from 29 public databases. Here, Nelson et al. extend SPOKE to embed clinical data from electronic health records to create medically meaningful barcodes for each medical variable.
- Charlotte A. Nelson
- , Atul J. Butte
- & Sergio E. Baranzini
-
Article
| Open Access
MorphoNet: an interactive online morphological browser to explore complex multi-scale data
Most morphological visualization platforms are not designed to share research data, or are limited to data visualization. Here the authors present MorphoNet, an open-source, web-based tool for interactive visualization and sharing of complex morphodynamic datasets, onto which users can project their own data.
- Bruno Leggio
- , Julien Laussu
- & Emmanuel Faure
-
Article
| Open Access
Characterizing pre-transplant and post-transplant kidney rejection risk by B cell immune repertoire sequencing
Adaptive immunity from both B and T cells critically controls the rejection or survival of transplanted organs. Here the authors show, by analyzing human B cell receptor repertoire in longitudinal studies of patients receiving kidney transplants, that repertoire diversity is positively associated with the incidence of kidney rejection.
- Silvia Pineda
- , Tara K. Sigdel
- & Minnie M. Sarwal
-
Article
| Open Access
An imputation platform to enhance integration of rice genetic resources
Imputation can effectively augment marker density in existing genetic datasets and enable integration across germplasm resources. Here Wang et al. present a public imputation server for rice using a diverse reference panel to facilitate imputation in the rice genetics community.
- Diane R. Wang
- , Francisco J. Agosto-Pérez
- & Susan R. McCouch
-
Article
| Open Access
Tracking HIV-1 recombination to resolve its contribution to HIV-1 evolution in natural infection
Recombination contributes to HIV evolution in patients, but its identification can be difficult. Here, the authors develop a computational tool called RAPR to track recombination in patients, identify recombination hot spots, and show the contribution of recombination to antibody escape.
- Hongshuo Song
- , Elena E. Giorgi
- & Feng Gao
-
Article
| Open Access
Shaping bacterial population behavior through computer-interfaced control of individual cells
Individual bacteria interact with each other and their environment to produce population-level patterns of gene expression. Here the authors use an automated platform combined with optogenetic feedback to manipulate population behaviors through dynamic control of individual cells.
- Remy Chait
- , Jakob Ruess
- & Călin C. Guet
-
Article
| Open Access
A new tool called DISSECT for analysing large genomic data sets using a Big Data approach
Availability of computing power can limit computational analysis of large genetic and genomic datasets. Here, Canela-Xandri et al. describe software called DISSECT that is capable of analyzing large-scale genetic data by distributing the work across thousands of networked computers.
- Oriol Canela-Xandri
- , Andy Law
- & Albert Tenesa