Software articles within Nature Communications

Featured

  • Article
    | Open Access

    Advanced computer vision technology can provide near real-time home monitoring to support “aging in place” by detecting falls and symptoms related to seizures and stroke. In this paper, the authors propose a strategy that uses homomorphic encryption, which guarantees information confidentiality while retaining action detection.

    • Miran Kim, Xiaoqian Jiang & Shayan Shams
  • Article
    | Open Access

    As the scale of single-cell genomics experiments grows into the millions, the computational requirements to process these data are beyond the reach of many. Here the authors present Scarf, a modularly designed Python package that makes the analysis workflow highly memory efficient such that even the largest existing datasets can be analyzed on an average modern laptop.

    • Parashar Dhapola, Johan Rodhe & Göran Karlsson
  • Article
    | Open Access

    Climatic variables have played a significant role in plant evolution across the Phanerozoic. Here, the authors link climate with a new dynamic vegetation model to identify two windows of opportunity for plant biomass expansion, corresponding with the expansion of land plants and the angiosperm radiation.

    • Khushboo Gurung, Katie J. Field & Benjamin J. W. Mills
  • Article
    | Open Access

    Graph-based genome reference representations have seen significant development, motivated by the inadequacy of the current human genome reference to represent the diverse genetic information from different human populations and its inability to maintain the same level of accuracy for non-European ancestries. Here the authors present the case for iteratively augmenting tailored genome graphs for targeted populations and demonstrate this approach on whole-genome samples of African ancestry.

    • H. Serhat Tetikol, Deniz Turgut & Brandi N. Davis-Dusenbery
  • Article
    | Open Access

    High-throughput electron tomography has been challenging due to time-consuming alignment and reconstruction. Here, the authors demonstrate real-time tomography with dynamic 3D tomographic visualization integrated in tomviz, an open-source 3D data analysis tool.

    • Jonathan Schwartz, Chris Harris & Robert Hovden
  • Article
    | Open Access

    Reproducibility, traceability, and transparency have been long-standing issues in metabolomics data analysis. Here, the authors present tidyMass, an R-based computational framework for designing traceable, shareable, and reproducible data processing and analysis workflows for untargeted metabolomics.

    • Xiaotao Shen, Hong Yan & Michael P. Snyder
  • Article
    | Open Access

    Deep learning could be applied to the challenge of somatic variant calling in cancer by making use of large-scale genomic data. Here, the authors develop VarNet, a weakly supervised deep learning model for somatic variant calling in cancer with robust performance across multiple cancer genomics datasets.

    • Kiran Krishnamachari, Dylan Lu & Anders Jacobsen Skanderup
  • Article
    | Open Access

    The dia-PASEF technology uses ion mobility separation to reduce signal interferences and increase sensitivity of mass spectrometry-based proteomics. The authors present algorithms and a software solution, which boost proteomic depth in dia-PASEF experiments by up to 83% compared to previous work, and are specifically beneficial for fast proteomic experiments and those with low sample amounts.

    • Vadim Demichev, Lukasz Szyrwiel & Markus Ralser
  • Article
    | Open Access

    Genome-scale metabolic models have been widely used for quantitative exploration of the relation between genotype and phenotype. Here the authors present GECKO 2, an automated framework for continuous, version-controlled updating of enzyme-constrained models of metabolism. It produces a catalogue of high-quality models for diverse yeasts, bacteria and human metabolism, intended to facilitate their use in basic science, metabolic engineering and synthetic biology.

    • Iván Domenzain, Benjamín Sánchez & Jens Nielsen
  • Article
    | Open Access

    Correct interpretation of computed tomography (CT) scans is important for the correct assessment of a patient’s disease but can be subjective and time-consuming. Here, the authors develop a system that automatically segments non-small cell lung cancer on CT images of patients and show in an in silico trial that the method was faster and more reproducible than clinicians.

    • Sergey P. Primakov, Abdalla Ibrahim & Philippe Lambin
  • Article
    | Open Access

    Multi-channel single-molecule localisation microscopy (SMLM) imaging is powerful. Here the authors report globLoc, a GPU-based global fitting algorithm that extracts maximum information from multi-channel single-molecule data; this improves localisation precision for biplane and 4Pi-SMLM and colour assignment in multi-colour astigmatic SMLM.

    • Yiming Li, Wei Shi & Jonas Ries
  • Article
    | Open Access

    Spatial transcriptomics experiments profile genome-wide gene expression at localized spots across a tissue. Here, the authors identify spot swapping, an artifact where RNA expressed at one tissue spot binds probes at another, and they propose SpotClean to adjust for it.

    • Zijian Ni, Aman Prasad & Christina Kendziorski
  • Article
    | Open Access

    RNA modifications represent a critical aspect of RNA biology that is not well suited to sequencing methods. Here, the authors provide a software tool for automated analysis of RNA tandem mass spectra with full support of modifications, isotope labelling, and control of false discovery rate.

    • Luigi D’Ascenzo, Anna M. Popova & James R. Williamson
  • Article
    | Open Access

    A scalable approach to explore DNA replication in single cells reveals that although aneuploidy does not have a major impact on the pattern of replication, different cell types and sub-populations display distinct replication paths.

    • Stefano Gnan, Joseph M. Josephides & Chun-Long Chen
  • Article
    | Open Access

    The design of highly multiplex PCR primers to amplify and enrich many different DNA sequences is increasing in biomedical importance as new mutations and pathogens are identified. The authors present and experimentally validate Simulated Annealing Design using Dimer Likelihood Estimation (SADDLE), a stochastic algorithm for design of highly multiplex PCR primer sets that minimize primer dimer formation.

    • Nina G. Xie, Michael X. Wang & David Yu Zhang
  • Article
    | Open Access

    SHAPEwarp is a method that identifies structurally similar RNAs by direct comparison of reactivity profiles derived from chemical probing experiments. Its application to viral genomes identified conserved RNA structure elements.

    • Edoardo Morandi, Martijn J. van Hemert & Danny Incarnato
  • Article
    | Open Access

    Intra-tumor heterogeneity is often associated with resistance to targeted therapy, requiring the design of combinatorial therapies. Here, based on tumor single-cell transcriptomic datasets, the authors develop a computational approach to identify optimal combinatorial treatments targeting membrane receptors for cancer therapy.

    • Saba Ahmadi, Pattara Sukprasert & Eytan Ruppin
  • Article
    | Open Access

    Chromosome conformation capture techniques have recently revealed features beyond chromatin loops such as architectural stripes. Here the authors present their stripe detection tool ‘Stripenn’ to detect and quantify stripes from any type of chromatin conformation capture data. They show that architectural stripes are enriched at transcriptionally active and accessible genomic regions.

    • Sora Yoon, Aditi Chandra & Golnaz Vahedi
  • Article
    | Open Access

    The extraction of meaningful biological knowledge from high-throughput mass spectrometry data relies on limiting false discoveries to a manageable number. Here the authors establish an automated, false discovery rate (FDR)-controlled targeted analysis workflow for data-independent acquisition that enables robust FDR estimation, improving the comparability of results in the metabolomics field.

    • Oliver Alka, Premy Shanthamoorthy & Hannes L. Röst
  • Article
    | Open Access

    Gene fusions are an important class of mutations in tumor genomes. Here, the authors develop a single-cell gene fusion detection method, scFusion, and demonstrate its applications in cancer single-cell studies.

    • Zijie Jin, Wenjian Huang & Ruibin Xi
  • Article
    | Open Access

    In this work, the authors demonstrate the application of multi-parameter photon-by-photon hidden Markov modeling (mpH2MM) on alternating laser excitation (ALEX)-based smFRET measurements. The utility of mpH2MM in identifying and quantifying dynamic biomolecular sub-populations is demonstrated in three different systems.

    • Paul David Harris, Alessandra Narducci & Eitan Lerner
  • Article
    | Open Access

    Challenges in batch normalization and data integration limit the comparison of existing mass cytometry datasets. Here, the authors report CytofIn, which can integrate mass cytometry datasets from the public domain, and reveal cellular features associated with immune oncology by analyzing five public cancer datasets.

    • Yu-Chen Lo, Timothy J. Keyes & Kara L. Davis
  • Article
    | Open Access

    Current high-dimensional imaging data analysis methods are technology-specific and require multiple tools, restricting analytical scalability and result reproducibility. Here the authors present SIMPLI, a software tool that overcomes these limitations for single-cell and pixel analysis of multiplexed images at spatial resolution.

    • Michele Bortolomeazzi, Lucia Montorsi & Francesca D. Ciccarelli
  • Article
    | Open Access

    Sample mix-up is a potential problem in large-scale omic studies due to the complexity of sample processing. Here, the authors present a pipeline for sample matching in proteogenomics to verify sample identity and ensure data integrity.

    • Ling Li, Mingming Niu & Xusheng Wang
  • Article
    | Open Access

    To facilitate the rational design of (nano)-materials and biomacromolecules by MD simulations, the authors present the polyply suite, featuring a graph matching algorithm and a random walk protocol for generating multi-scale polymeric topologies and initial coordinates.

    • Fabian Grünewald, Riccardo Alessandri & Siewert J. Marrink
  • Article
    | Open Access

    In cancer, associations between mutational signatures and driver mutations have been proposed but not fully explored. Here, the authors develop sigDriver to find associations between mutational signatures and mutation hotspots in order to predict coding and non-coding driver mutations in pan-cancer genomics data.

    • John K. L. Wong, Christian Aichmüller & Marc Zapatka
  • Article
    | Open Access

    Ordinary differential equation (ODE) models are widely used to understand multiple processes. Here the authors show how the concept of mini-batch optimization can be transferred from the field of Deep Learning to ODE modelling.

    • Paul Stapor, Leonard Schmiester & Jan Hasenauer
  • Article
    | Open Access

    Nanopore direct RNA Sequencing data contain information about the presence of RNA modifications, but their detection poses substantial challenges. Here the authors introduce Nanocompore, a new methodology for modification detection from Nanopore data.

    • Adrien Leger, Paulo P. Amaral & Tony Kouzarides
  • Article
    | Open Access

    Analyses of summary statistics from GWAS are subject to biases due to errors in the discovery GWAS or linkage disequilibrium reference data set or heterogeneity between data sets. Here, the authors propose a quality control method to be added to analysis of GWAS summary data that can reduce such biases.

    • Wenhan Chen, Yang Wu & Jian Yang
  • Article
    | Open Access

    The authors present DeepRank, a deep learning framework for the data mining of large sets of 3D protein-protein interfaces (PPIs). They use DeepRank to address two challenges in structural biology: distinguishing biological from crystallographic PPIs in crystal structures, and ranking docking models.

    • Nicolas Renaud, Cunliang Geng & Li C. Xue
  • Article
    | Open Access

    Improving inference in large-scale genetic data linked to electronic medical record data requires the development of novel computationally efficient regression methods. Here, the authors develop a Bayesian approach for association analyses to improve SNP-heritability estimation, discovery, fine-mapping and genomic prediction.

    • Marion Patxot, Daniel Trejo Banos & Matthew R. Robinson
  • Article
    | Open Access

    Here, the authors develop a graph-based method for the assembly of discontinuous transcripts produced in coronaviruses and other Nidovirales, enabling the discovery of transcriptional changes missed by existing methods.

    • Palash Sashittal, Chuanyi Zhang & Mohammed El-Kebir
  • Article
    | Open Access

    Obtaining accurate variant calls from multiple displacement amplified single-cell DNA sequencing data requires dedicated models that account for amplification bias and copy errors. Here, the authors describe ProSolo, a model for calling single nucleotide variants with control over the false discovery rate.

    • David Lähnemann, Johannes Köster & Alexander Schönhuth
  • Article
    | Open Access

    As studies continue sequencing with deeper coverage, computational processing of these profiles has become increasingly resource-intensive. Here the authors design an efficient computational method called Chromap to align and preprocess high-throughput sequencing data from chromatin profiling techniques, including ChIP-seq, Hi-C, and scATAC-seq, with a major decrease in runtime.

    • Haowen Zhang, Li Song & Heng Li
  • Article
    | Open Access

    Existing methods to identify the presence of DNA from other hominin species can be limited in their ability to accurately estimate introgression waves, or can only be applied to specific populations. Here, the authors develop a generalizable method to identify introgression in multi-wave situations.

    • Kai Yuan, Xumin Ni & Shuhua Xu