Software

  • Article
    | Open Access

    The authors present epiScanpy, a computational framework for the analysis of single-cell epigenomic data (both ATAC-seq and DNA methylation data), with examples for clustering, cell type identification, trajectory learning and atlas integration, and show its performance in distinguishing cell types.

    • Anna Danese
    • , Maria L. Richter
    •  & Maria Colomé-Tatché
  • Article
    | Open Access

    Mass spectrometry-based metabolomics is a powerful method for profiling large clinical cohorts but batch variations can obscure biologically meaningful differences. Here, the authors develop a computational workflow that removes unwanted data variation while preserving biologically relevant information.

    • Taiyun Kim
    • , Owen Tang
    •  & Jean Yee Hwa Yang
  • Article
    | Open Access

    West and colleagues develop the Variant Database software tool for examination of changing Spike mutations in SARS-CoV-2 genomes. The authors use this to detect emerging lineages of SARS-CoV-2 in New York and report the rapid spread of the B.1.526 lineage in the city.

    • Anthony P. West Jr.
    • , Joel O. Wertheim
    •  & Pamela J. Bjorkman
  • Article
    | Open Access

    Existing long-read de novo assembly methods can partially, but not completely, separate strains. Here, the authors develop Strainberry, a metagenome assembly bioinformatic pipeline that exclusively uses long-read data to accurately separate and reconstruct strain genomes from single-sample low-complexity microbiomes.

    • Riccardo Vicedomini
    • , Christopher Quince
    •  & Rayan Chikhi
  • Article
    | Open Access

    Non-coding RNA function is poorly understood, partly due to the challenge of determining RNA secondary (2D) structure. Here, the authors present a framework for the reproducible prediction and visualization of the 2D structure of a wide array of RNAs, which enables linking RNA sequence to function.

    • Blake A. Sweeney
    • , David Hoksza
    •  & Anton I. Petrov
  • Article
    | Open Access

    Several existing algorithms predict the methylation of DNA using Nanopore sequencing signals, but it is unclear how they compare in performance. Here, the authors benchmark the performance of several such tools, and propose METEORE, a consensus tool that improves prediction accuracy.

    • Zaka Wing-Sze Yuen
    • , Akanksha Srivastava
    •  & Eduardo Eyras
  • Article
    | Open Access

    The genome-wide investigation of chromatin organization enables insights into global gene expression control. Here, the authors present a computationally efficient method for the analysis of chromatin organization data and use it to recover principles of 3D organization across conditions.

    • Merve Sahin
    • , Wilfred Wong
    •  & Christina S. Leslie
  • Article
    | Open Access

    Liquid biopsies enable minimally invasive applications for diagnosis and treatment monitoring. Here the authors analyse fragmentation patterns of circulating tumour DNA on multiple levels and develop a bioinformatic tool, LIQUORICE, to accurately detect and classify paediatric cancers with low mutational burden.

    • Peter Peneder
    • , Adrian M. Stütz
    •  & Eleni M. Tomazou
  • Article
    | Open Access

    Current genome mining methods predict many putative non-ribosomal peptides (NRPs) from their corresponding biosynthetic gene clusters, but it remains unclear which of those exist in nature and how to identify their post-assembly modifications. Here, the authors develop NRPminer, a modification-tolerant tool for the discovery of NRPs from large genomic and mass spectrometry datasets, and use it to find 180 NRPs from different environments.

    • Bahar Behsaz
    • , Edna Bode
    •  & Hosein Mohimani
  • Article
    | Open Access

    Gene regulatory networks are a useful means of inferring functional interactions from large-scale genomic data. Here, the authors develop a Bayesian framework integrating GWAS summary statistics with gene regulatory networks to identify genetic enrichments and associations simultaneously.

    • Xiang Zhu
    • , Zhana Duren
    •  & Wing Hung Wong
  • Article
    | Open Access

    Methods to produce haplotype-resolved genome assemblies often rely on access to family trios. The authors present FALCON-Phase, a tool that combines ultra-long-range Hi-C chromatin interaction data with a long-read de novo assembly to extend haplotype phasing to the contig or scaffold level.

    • Zev N. Kronenberg
    • , Arang Rhie
    •  & Sarah B. Kingan
  • Article
    | Open Access

    Data-rich networks can be difficult to interpret beyond a certain size. Here, the authors introduce a platform that uses virtual reality to allow the visual exploration of large networks, while interfacing with data repositories and other analytical methods to improve the interpretation of big data.

    • Sebastian Pirch
    • , Felix Müller
    •  & Jörg Menche
  • Article
    | Open Access

    In genome-wide association meta-analysis, it is often difficult to find an independent dataset of sufficient size to replicate associations. Here, the authors have developed MAMBA to calculate the probability of replicability based on consistency between datasets within the meta-analysis.

    • Daniel McGuire
    • , Yu Jiang
    •  & Dajiang J. Liu
  • Article
    | Open Access

    Single-cell RNA-Seq suffers from heterogeneity in sequencing sparsity and complex differential patterns in gene expression. Here, the authors introduce a graph neural network based on a hypothesis-free deep learning framework as an effective representation of gene expression and cell–cell relationships.

    • Juexin Wang
    • , Anjun Ma
    •  & Dong Xu
  • Article
    | Open Access

    Here the authors present DeepAccNet, a deep learning framework that estimates per-residue accuracy and residue-residue distance signed error in protein models, which are used to guide Rosetta protein structure refinement. Benchmarking suggests an improvement in accuracy prediction and refinement compared to other related state-of-the-art methods.

    • Naozumi Hiranuma
    • , Hahnbeom Park
    •  & David Baker
  • Article
    | Open Access

    Here, the authors present two local methods for analyzing cryo-EM maps, LocSpiral and LocBSharpen, which enhance high-resolution features of cryo-EM maps while preventing map distortions. They also introduce LocBFactor and LocOccupancy, which allow obtaining local B-factors and electron density occupancy maps from cryo-EM reconstructions. The authors demonstrate that these methods improve the interpretability and analysis of cryo-EM maps using different test cases, among them recent SARS-CoV-2 spike glycoprotein structures.

    • Satinder Kaur
    • , Josue Gomez-Blanco
    •  & Javier Vargas
  • Article
    | Open Access

    Antimicrobial resistance is a major global health threat and its development is promoted by antibiotic misuse. Here, the authors present an offline smartphone application for automated and standardized antibiotic susceptibility testing, to be deployed in resource-limited settings.

    • Marco Pascucci
    • , Guilhem Royer
    •  & Mohammed-Amin Madoui
  • Article
    | Open Access

    Kinases drive fundamental changes in cell state, but predicting kinase activity based on substrate-level changes can be challenging. Here the authors introduce a computational framework that utilizes similarities between substrates to robustly infer kinase activity.

    • Serhan Yılmaz
    • , Marzieh Ayati
    •  & Mehmet Koyutürk
  • Article
    | Open Access

    Accurate analysis of single-cell RNA sequencing (scRNA-seq) data is affected by issues including technical noise and high dropout rate. Here, the authors develop a hierarchical autoencoder, scDHA, which outperforms existing methods in scRNA-seq analyses such as cell segregation and classification.

    • Duc Tran
    • , Hung Nguyen
    •  & Tin Nguyen
  • Article
    | Open Access

    Statistical colocalisation is a method to identify causal genes and shared genetic aetiology across traits. Here, the authors describe HyPrColoc, an efficient Bayesian divisive clustering algorithm which integrates summary statistics from genome-wide association studies to detect clusters of colocalised traits from large numbers of traits.

    • Christopher N. Foley
    • , James R. Staley
    •  & Joanna M. M. Howson
  • Article
    | Open Access

    Here, the authors present Methyl Assignments Using Satisfiability (MAUS), a method for the assignment of methyl groups using raw NOE data. They use eight proteins in the 10–45 kDa size range as test cases and show that MAUS yields 100% accurate assignments at high completeness levels.

    • Santrupti Nerli
    • , Viviane S. De Paula
    •  & Nikolaos G. Sgourakis
  • Article
    | Open Access

    A diverse array of antigens can trigger allergic reactions. Here the authors present the ‘AllerScan’ programmable phage display library, which is an efficient and unbiased approach for profiling anti-allergen antibody reactivities at cohort scale, with which a key wheat epitope is found to distinguish between wheat allergy and tolerance.

    • Daniel R. Monaco
    • , Brandon M. Sie
    •  & H. Benjamin Larman
  • Article
    | Open Access

    Devices for droplet generation are at the heart of many microfluidic applications but are difficult to tailor for specific cases. Lashkaripour et al. show how design customization can be greatly simplified by combining rapid prototyping with data-driven machine learning strategies.

    • Ali Lashkaripour
    • , Christopher Rodriguez
    •  & Douglas Densmore
  • Article
    | Open Access

    Standard benchmarking of single-molecule localization microscopy cannot quantify nanoscale accuracy of arbitrary datasets. Here, the authors present Wasserstein-induced flux, a method using a chosen perturbation and knowledge of the imaging system to measure confidence of individual localizations.

    • Hesam Mazidi
    • , Tianben Ding
    •  & Matthew D. Lew
  • Article
    | Open Access

    Accurate cell detection in dense bacterial biofilms is challenging. Here, the authors report an image analysis pipeline that is able to accurately segment and classify single bacterial cells in 3D fluorescence images: Bacterial Cell Morphometry 3D (BCM3D).

    • Mingxing Zhang
    • , Ji Zhang
    •  & Andreas Gahlmann
  • Article
    | Open Access

    Traces from single-molecule fluorescence microscopy (SMFM) experiments exhibit photophysical artifacts that typically make analysis time-consuming. Here, the authors have developed AutoSiM, an easily accessible software tool, for two distinct applications of deep learning to the efficient processing of SMFM time traces.

    • Jieming Li
    • , Leyou Zhang
    •  & Nils G. Walter
  • Article
    | Open Access

    Current cell segmentation methods for Saccharomyces cerevisiae face challenges under a variety of standard experimental and imaging conditions. Here the authors develop a convolutional neural network for accurate, label-free cell segmentation.

    • Nicola Dietler
    • , Matthias Minder
    •  & Sahand Jamal Rahi
  • Article
    | Open Access

    Organ segmentation of whole-body mouse images is essential for quantitative analysis, but is tedious and error-prone. Here the authors develop a deep learning pipeline to segment major organs and the skeleton in volumetric whole-body scans in less than a second, and present probability maps and uncertainty estimates.

    • Oliver Schoppe
    • , Chenchen Pan
    •  & Bjoern H. Menze
  • Perspective
    | Open Access

    The accurate representation of data is essential in science communication; however, colour maps that visually distort data through uneven colour gradients or are unreadable to those with colour vision deficiency remain prevalent. Here, the authors present a simple guide for the scientific use of colour and highlight ways for the scientific community to identify and prevent the misuse of colour in science.

    • Fabio Crameri
    • , Grace E. Shephard
    •  & Philip J. Heron
  • Article
    | Open Access

    Dissecting the cellular heterogeneity embedded in single-cell transcriptomic data is challenging. Here, the authors introduce the concept of multiresolution cell-state decomposition as a practical approach to simultaneously capture both fine- and coarse-grain patterns of variability.

    • Shahin Mohammadi
    • , Jose Davila-Velderrain
    •  & Manolis Kellis
  • Article
    | Open Access

    The Danish health system has been collecting health-related data on the entire Danish population for years. Here the authors present the Danish Disease Trajectory Browser (DTB), which allows users to explore population-wide disease progression patterns from data collected between 1994 and 2018.

    • Troels Siggaard
    • , Roc Reguant
    •  & Søren Brunak
  • Article
    | Open Access

    Electron microscopy (EM) is the gold standard for biological ultrastructure but acquisition speed is slow, making it unsuitable for large volumes. Here the authors present a parallel imaging pipeline for continuous autonomous imaging with six transmission EMs to image 1 mm³ of mouse cortex in less than 6 months.

    • Wenjing Yin
    • , Derrick Brittain
    •  & Nuno Macarico da Costa
  • Article
    | Open Access

    The quality of human language translation has been thought to be unattainable by computer translation systems. Here the authors present CUBBITT, a deep learning system that outperforms professional human translators in retaining text meaning in English-to-Czech news translation, and validate the system on English-French and English-Polish language pairs.

    • Martin Popel
    • , Marketa Tomkova
    •  & Zdeněk Žabokrtský
  • Article
    | Open Access

    How cell clusters are defined in single-cell sequencing data has important consequences for downstream analyses and the interpretation of results, but is often not straightforward. Here, the authors present a new approach that enables the prediction of differentially expressed genes without relying on explicit clustering of cells.

    • Alexis Vandenbon
    •  & Diego Diez