Software articles within Nature Communications

Featured

  • Article
    | Open Access

    Here, the authors present RecombinHunt, a computational method based on big data analysis that enhances community-based detection of recombinant viral lineages.

    • Tommaso Alfonsi, Anna Bernasconi & Stefano Ceri
  • Article
    | Open Access

    DNA methylation from cell-free DNA (cfDNA) can be profiled using whole-genome bisulfite sequencing (WGBS). Here, the authors develop a computational method, FinaleMe, that predicts DNA methylation and tissues-of-origin in cfDNA and validate its performance using paired deep- and shallow-coverage whole-genome sequencing (WGS) and WGBS data.

    • Yaping Liu, Sarah C. Reed & Manolis Kellis
  • Article
    | Open Access

    Long-read sequencing can greatly improve detection of genomic structural variants (SVs), and numerous methods have been developed to identify SVs using long-read data. Here the authors compare the performance of these methods and provide guidelines to aid users in selecting the most suitable tools for various scenarios.

    • Yichen Henry Liu, Can Luo & Xin Maizie Zhou
  • Article
    | Open Access

    Binning is an essential step in genome-resolved metagenomic analysis in which assembled contigs originating from the same source population are clustered. However, it is challenging, especially for low-abundance microbial species. Here the authors introduce a toolkit that integrates multiple prominent binning tools and AI for efficient and high-resolution recovery of non-redundant bins from short- and long-read metagenomic sequencing datasets.

    • Zhiguang Qiu, Li Yuan & Ke Yu
  • Article
    | Open Access

    Understanding the timing and fitness of somatic copy number alterations (SCNAs) in cancer would shed light on cancer progression and evolution. Here, the authors develop Butte, a computational framework to estimate the timing of clonal SCNAs that encompass multiple gains, and apply it to whole-genome sequencing data from 184 samples.

    • Zicheng Wang, Yunong Xia & Ruping Sun
  • Article
    | Open Access

    Manual processes to produce ocular prostheses are time-consuming and yield varying quality. Here, the authors present an automatic digital end-to-end process for custom ocular prostheses. The process derives shape and appearance from image data of an OCT device and produces the prostheses using a full-colour 3D printer.

    • Johann Reinhard, Philipp Urban & Mandeep S. Sagoo
  • Article
    | Open Access

    Extrachromosomal DNA has been previously linked to tumour progression and heterogeneity, but its potential as a cancer biomarker has not been fully explored. Here, the authors develop a computational framework to refine genomic subtypes and predict response to immunotherapy in gastrointestinal cancer.

    • Shixiang Wang, Chen-Yi Wu & Qi Zhao
  • Article
    | Open Access

    Advancements in spatial transcriptomics technologies have enabled the analysis of gene expression at cellular resolution in situ. The authors applied direct RNA hybridization-based in situ sequencing (dRNA HybISS) and developed a computational tool, CellScopes, to study gene expression in mouse kidneys, identifying cellular changes and interactions during injury and repair.

    • Haojia Wu, Eryn E. Dixon & Benjamin D. Humphreys
  • Article
    | Open Access

    Link prediction in temporal networks is relevant for many real-world systems; however, current approaches are usually characterized by high computational costs. The authors propose a temporal link prediction framework based on the sequential stacking of static network features, which improves computational speed and suits temporal networks with completely unobserved or partially observed target layers.

    • Xie He, Amir Ghasemian & Peter J. Mucha
  • Article
    | Open Access

    Using All of Us pilot data, the authors compare short- and long-read performance across medically relevant genes and showcase the utility of long reads to improve variant detection and phasing in both easy- and hard-to-resolve medically relevant genes.

    • M. Mahmoud, Y. Huang & F. J. Sedlazeck
  • Article
    | Open Access

    Global challenges demand global solutions. Here, the authors show a distributed self-driving lab architecture in The World Avatar, linking robots in Cambridge and Singapore for asynchronous multi-objective reaction optimisation.

    • Jiaru Bai, Sebastian Mosbach & Markus Kraft
  • Article
    | Open Access

    Assessing tumour contamination in normal samples is critical for accurate variant calling in cancer samples. Here, the authors develop TINC, a computational method to determine the level of tumour-in-normal contamination, and demonstrate its application in the Genomics England 100,000 Genomes Project dataset.

    • Jonathan Mitchell, Salvatore Milite & Giulio Caravagna
  • Article
    | Open Access

    In this work, the authors report the FreeDTS software to simulate biomembranes at the mesoscale. The software provides various membrane simulations, focusing on protein organization and shape remodeling. It is a versatile tool for realistic membrane studies and diverse applications.

    • Weria Pezeshkian & John H. Ipsen
  • Article
    | Open Access

    Cryo-EM is the go-to method for visualizing large, flexible biomolecules. Here, authors introduce a new Gaussian mixture modelling method for cryo-EM modelling tasks, including refinement, composite map generation and ensemble representation.

    • Joseph G. Beton, Thomas Mulvaney & Maya Topf
  • Article
    | Open Access

    Identifying tissue structure in large-scale spatial omics datasets from multiple slices is challenging. Here, authors present MENDER, an optimisation-free spatial clustering method that can scale to million-level spatial data, enabling efficient analysis of spatial cell atlases.

    • Zhiyuan Yuan
  • Article
    | Open Access

    Copy number variants (CNVs) have been shown to contribute to the etiology of various genetic disorders. Here, the authors present ECOLE, a deep learning-based somatic and germline CNV caller for WES data. Utilising a variant of the transformer architecture, the model is trained to call CNVs per exon.

    • Berk Mandiracioglu, Furkan Ozden & A. Ercument Cicek
  • Article
    | Open Access

    Reproducibility is essential for the progress of research, yet achieving it remains elusive even in computational fields. Here, authors develop the rworkflows suite, making robust CI/CD workflows easy and freely accessible to all R package developers.

    • Brian M. Schilder, Alan E. Murphy & Nathan G. Skene
  • Article
    | Open Access

    Generating microfluidic droplets with application-specific desired characteristics is hard. Here the authors report fluid-agnostic machine learning models capable of accurately predicting device geometries and flow conditions required to generate stable single and double emulsions.

    • Ali Lashkaripour, David P. McIntyre & Polly M. Fordyce
  • Article
    | Open Access

    Screening mutated proteins is a versatile strategy in protein research, producing massive datasets when combined with NGS. Here, authors present ACIDES to estimate mutated protein fitness and aid protein engineering pipelines in a range of applications, including gene therapy.

    • Takahiro Nemoto, Tommaso Ocari & Ulisse Ferrari
  • Article
    | Open Access

    Batch integration is a critical yet challenging step in many single-cell RNA-seq analysis workflows. Here, authors present JOINTLY, a hybrid linear and non-linear NMF-based algorithm, providing interpretable and robust cell clustering against over-integration.

    • Andreas Fønss Møller & Jesper Grud Skat Madsen
  • Article
    | Open Access

    Accurately benchmarking small variant calling accuracy is critical for the continued improvement of human genome sequencing. Here, the authors show that current approaches are biased towards certain variant representations and develop a new approach to ensure consistent and accurate benchmarking, regardless of the original variant representations.

    • Tim Dunn & Satish Narayanasamy
  • Article
    | Open Access

    Computational deconvolution with single-cell RNA sequencing data as a reference is pivotal for interpreting spatial transcriptomics data. Here, authors present Redeconve, which improves the resolution by more than 100-fold with higher accuracy and speed.

    • Zixiang Zhou, Yunshan Zhong & Xianwen Ren
  • Article
    | Open Access

    There is a need for dataset-dependent MS2 acquisition in trapped ion mobility spectrometry imaging. Here the authors report spatial ion mobility-scheduled exhaustive fragmentation (SIMSEF) which enables on-tissue metabolite and lipid annotation in mass spectrometry bioimaging studies, and use this to visualise the chemical space in rat brains.

    • Steffen Heuckeroth, Arne Behrens & Robin Schmid
  • Article
    | Open Access

    Benchmarking computational tools for analysis of single-cell sequencing data demands simulation of realistic sequencing reads. However, none of the few existing read simulators aim to mimic real data. Here, the authors introduce scReadSim, a single-cell RNA-seq and ATAC-seq read simulator that works by mimicking real data.

    • Guanao Yan, Dongyuan Song & Jingyi Jessica Li
  • Article
    | Open Access

    Pseudotime analysis is prevalent in single-cell RNA-seq, but it remains challenging to perform it across multiple samples and experimental conditions. Here, the authors develop Lamian, a computational framework for multi-sample pseudotime analysis that adjusts for biological and technical variation to detect gene program changes along cell trajectories and across conditions.

    • Wenpin Hou, Zhicheng Ji & Hongkai Ji
  • Article
    | Open Access

    Spatial omics technologies reveal the organisation of cells in various biological systems. Here, the authors propose SLAT, a graph-based algorithm for aligning heterogeneous data across technologies, modalities and timepoints, enabling spatiotemporal reconstruction of complex developmental processes.

    • Chen-Rui Xia, Zhi-Jie Cao & Ge Gao
  • Article
    | Open Access

    Five-prime single-cell RNA-seq, especially read 1, precisely captures transcription start sites (TSSs), but this information is often overlooked. Here, the authors present a computational method suite, CamoTSS, to precisely identify TSSs and quantify their expression, enabling effective detection of alternative TSS usage in different biological processes.

    • Ruiyan Hou, Chung-Chau Hon & Yuanhua Huang
  • Article
    | Open Access

    The study reveals limitations in widely used RNA-seq aligners, which create 'phantom' introns in reference databases. The authors introduce EASTR, a computational tool that not only enhances alignment accuracy but also uncovers existing annotation errors. This improvement bolsters the dependability of subsequent RNA-seq analyses.

    • Ida Shinder, Richard Hu & Mihaela Pertea
  • Article
    | Open Access

    The growing number of available single-cell RNA-sequencing datasets from different species creates opportunities to explore evolutionary relationships between cell types across species. Here, the authors compare different strategies for cross-species integration of these data and offer guidelines for effective integration.

    • Yuyao Song, Zhichao Miao & Irene Papatheodorou
  • Article
    | Open Access

    Critical transitions and qualitative changes of dynamics in cardiac, ecological, and economic systems can be characterized by discrete-time bifurcations. The authors propose a deep learning framework that provides early warning signals for critical transitions in discrete-time experimental data.

    • Thomas M. Bury, Daniel Dylewsky & Gil Bub