Computational biology and bioinformatics

  • Article
    | Open Access

    Multiregion sequencing is needed to better capture the heterogeneity of hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (iCCA). Here, the authors analyse HCC and iCCA tumours with multiregion single-cell RNA-seq, revealing cellular dynamics and communication networks with immune cells.

    • Lichun Ma, Sophia Heinrich & Xin Wei Wang
  • Article
    | Open Access

    Off-target binding hinders the development of therapeutic antibodies and reproducibility in basic research settings. Here the authors develop a method to quantify and reduce the polyreactivity of antibody fragments based on protein sequence alone.

    • Edward P. Harvey, Jung-Eun Shin & Andrew C. Kruse
  • Article
    | Open Access

    Object detection using machine learning typically requires vast amounts of training data. Midtvedt et al. propose a deep-learning method that detects microscopic objects with sub-pixel accuracy from a single unlabeled image by exploiting the roto-translational symmetries of the problem.

    • Benjamin Midtvedt, Jesús Pineda & Giovanni Volpe
  • Article
    | Open Access

    Methods for jointly analysing the different spatial data modalities in 3D are lacking. Here the authors report the computational framework STACI (Spatial Transcriptomic data using over-parameterized graph-based Autoencoders with Chromatin Imaging data) which they apply to an Alzheimer’s disease mouse model.

    • Xinyi Zhang, Xiao Wang & Caroline Uhler
  • Article
    | Open Access

    Nucleosome profiling from cell-free DNA (cfDNA) represents a potential approach for cancer detection and classification. Here, the authors develop Griffin, a computational framework for tumour subtype classification based on cfDNA nucleosome profiling that can work with ultra-low pass sequencing data.

    • Anna-Lisa Doebley, Minjeong Ko & Gavin Ha
  • Article
    | Open Access

    Predicting topological structures from Hi-C data provides insight into gene expression and regulation. Here, the authors present RefHiC, an attention-based deep learning framework that leverages a reference panel of Hi-C datasets to assist topological structure annotation from a given study sample.

    • Yanlin Zhang & Mathieu Blanchette
  • Article
    | Open Access

    G protein-coupled receptors (GPCRs) can couple to different Gα protein subfamilies either selectively or promiscuously. Here, the authors use a computational approach to show that selectivity determinants are at the periphery of the GPCR–G protein interface and that promiscuous GPCRs more frequently sample the common rather than selective contacts.

    • Manbir Sandhu, Aaron Cho & Nagarajan Vaidehi
  • Article
    | Open Access

    Metabolic labeling is often used to measure protein turnover. Here the authors show that, for interconvertible protein species such as phosphoforms, metabolic labeling does not provide information on turnover differences, but that the relative order of modification can determine the observed dynamics.

    • Henrik M. Hammarén, Eva-Maria Geissen & Mikhail M. Savitski
  • Article
    | Open Access

    The presented Mean-Shift Super Resolution (MSSR) algorithm can extend spatial resolution within a single microscopy image. Its applicability extends across a wide range of experimental and instrumental configurations and it is compatible with other super-resolution microscopy approaches.

    • Esley Torres-García, Raúl Pinto-Cámara & Adán Guerrero
  • Article
    | Open Access

    Identifying the designers of engineered biological sequences would help promote biotechnological innovation while holding designers accountable. Here the authors present the winners of a 2020 data-science competition which improved on previous attempts to attribute plasmid sequences.

    • Oliver M. Crook, Kelsey Lane Warmbrod & William J. Bradshaw
  • Article
    | Open Access

    Methods that analyse heterogeneity and compare tissue microenvironments using spatial omics data are challenging to develop. Here, the authors present SOTIP, a method that can perform spatial heterogeneity, spatial domain, and differential microenvironment analyses across multiple spatial omics modalities.

    • Zhiyuan Yuan, Yisi Li & Michael Q. Zhang
  • Article
    | Open Access

    Sinonasal tumour diagnosis can be complicated by the heterogeneity of disease and classification systems. Here, the authors use machine learning to classify sinonasal undifferentiated carcinomas into four molecular classes with differences in differentiation state and clinical outcome.

    • Philipp Jurmeister, Stefanie Glöß & David Capper
  • Article
    | Open Access

    Studying the cell composition of acral melanoma at the single-cell level could provide clues about its poor response to immunotherapy. Here, the authors analyse acral and cutaneous melanoma patient samples using single-cell RNA-sequencing and reveal a severe immunosuppressive state in acral melanomas.

    • Chao Zhang, Hongru Shen & Jilong Yang
  • Article
    | Open Access

    Modelling how endogenous mutations accumulate in tissues is valuable to understand how cancers develop and evolve. Here, the authors establish a mathematical model that can predict the number of endogenous somatic mutations in the lifetime of tissues and approximate the time to cancer development.

    • Sophie Pénisson, Amaury Lambert & Cristian Tomasetti
  • Article
    | Open Access

    The biological underpinnings of the increased mortality and morbidity in adolescents and young adults (AYA) remain poorly understood. Here, the authors investigate the clinical and genomic disparities between AYA and older adults in a cohort of more than 100,000 cancer patients.

    • Xiaojing Wang, Anne-Marie Langevin & Siyuan Zheng
  • Article
    | Open Access

    Division of labour, where members of a group specialise in different tasks, is a central feature of many social organisms. Using a theoretical model, the authors demonstrate that division of labour can emerge spontaneously within a group of entirely identical individuals.

    • Jan J. Kreider, Thijs Janzen & Franz J. Weissing
  • Article
    | Open Access

    Spatial transcriptomics analyses can be affected by noise and spatial correlation across tissue locations. Here, the authors develop SpatialPCA, a spatially-aware dimensionality reduction method that explicitly models spatial correlation structures, and show its application to the analysis of healthy and tumour tissues.

    • Lulu Shang & Xiang Zhou
  • Article
    | Open Access

    Mutations in BRCA1/2 are associated with a homologous recombination deficiency phenotype in BRCA-associated cancers. Reversion mutations can restore BRCA1/2 function and result in treatment resistance in these cancer types. Here, the authors show that, in select cases, reversion mutations in BRCA1/2 can indicate prior BRCA-mediated tumorigenesis in non-canonical histologies.

    • Yonina R. Murciano-Goroff, Alison M. Schram & Alexander Drilon
  • Article
    | Open Access

    In this work the authors provide a computational workflow for the parallel, from scratch, design of proteins to rapidly explore the shape diversity of protein folds.

    • Thomas W. Linsky, Kyle Noble & Eva-Maria Strauch
  • Article
    | Open Access

    Recovering dropout-affected gene expression values is a challenging problem in bioinformatics. Here, the authors propose a data-driven framework that first learns the underlying data distribution and then recovers the expression values by imposing self-consistency on the expression matrix.

    • Md Tauhidul Islam, Jen-Yeu Wang & Lei Xing
  • Article
    | Open Access

    Modeling the dynamics of large proteins reveals a fundamental scaling problem. Here, the authors tackle this challenge by decomposing a large system into smaller independent subsystems, simultaneously modeling each subsystem’s kinetics and ensuring their mutual independence.

    • Andreas Mardt, Tim Hempel & Frank Noé
  • Article
    | Open Access

    Here the authors show that transposable element-mediated rearrangements impact more than 500 kbp of an average human genome, are a source of individual variation, a substrate for evolutionary change, and can occur through diverse mechanisms.

    • Parithi Balachandran, Isha A. Walawalkar & Christine R. Beck
  • Article
    | Open Access

    By comprehensively mapping the impact that different classes of mutations (substitutions, insertions, deletions) have on the ability of the amyloid beta peptide to nucleate amyloids, the authors identify a large set of likely pathogenic variants of amyloid beta that are specifically enriched at its polar N-terminal region.

    • Mireia Seuma, Ben Lehner & Benedetta Bolognesi
  • Article
    | Open Access

    Agonists selectively targeting GPR119 hold promise for treating metabolic disorders. Here, the authors reveal that GPR119 adopts a non-canonical consensus structural scaffold with an extended ligand-binding pocket for chemically different agonists.

    • Yuxia Qian, Jiening Wang & Anna Qiao
  • Article
    | Open Access

    Evolutionary principles could help distinguish driver from passenger mutations in cancer. Here, the authors develop SEISMIC, a method to identify cancer driver genes based on their deviation from expected mutation status patterns across a cohort under neutral evolution, and find potential drivers in melanoma and other cancer types.

    • Martin Boström & Erik Larsson
  • Article
    | Open Access

    Spatially resolved transcriptomics is a relatively new technique that maps transcriptional information within a tissue. Here the authors present MIST, which detects molecular regions from spatially resolved transcriptomics and denoises the missing gene expression values by region-specific imputation.

    • Linhua Wang, Mirjana Maletic-Savatic & Zhandong Liu
  • Article
    | Open Access

    Current treatment guidelines for type 2 diabetes endorse a large number of potential anti-hyperglycemic treatment options in various permutations and combinations. Here, the authors present a causal deep learning approach for more personalized treatment recommendations.

    • Chinmay Belthangady, Stefanos Giampanis & Beau Norgeot
  • Article
    | Open Access

    Comparisons among experimental results with large amounts of data can be more precise and meaningful when done across multiple different conditions simultaneously. Koch et al. introduce a method, called CLIMB, that does this, and captures interpretable and biologically meaningful information.

    • Hillary Koch, Cheryl A. Keller & Qunhua Li
  • Article
    | Open Access

    Seasonal influenza vaccination is an important strategy to prevent serious disease in the elderly, but individual responsiveness to vaccination varies widely. Here the authors establish, with an array of state-of-the-art methods, the major immunological parameters that distinguish vaccine recipients who develop a robust antibody response from non-responders.

    • Peggy Riese, Stephanie Trittel & Carlos A. Guzmán
  • Article
    | Open Access

    As the throughput of single-cell RNA-seq studies increases, there is a need for tools that can make the data analysis steps more streamlined and convenient. Here, the authors develop UniverSC, a tool that unifies single-cell RNA-seq analysis workflows and also facilitates their use for non-experts.

    • Kai Battenberg, S. Thomas Kelly & Aki Minoda
  • Article
    | Open Access

    The organisation of mammalian genomes plays a role in many biological processes. Here the authors report dcHiC, a tool which uses a multivariate distance measure to identify changes in compartmentalisation among multiple genome-wide chromatin contact maps, and apply this to different human and mouse datasets.

    • Abhijit Chakraborty, Jeffrey G. Wang & Ferhat Ay
  • Article
    | Open Access

    Contaminant sequences in metagenomic samples can potentially impact the interpretation of findings reported in microbiome studies, especially in low biomass environments. Here the authors describe Squeegee, a computational approach designed to detect microbial contamination within low microbial biomass microbiomes and identify microbial contaminants in publicly available datasets that lack negative controls.

    • Yunxi Liu, R. A. Leo Elworth & Todd J. Treangen
  • Article
    | Open Access

    Modification of transcribed mRNAs enables regulation of transcription, but its extent in cancer cells is incompletely understood. Here, the authors analyse transcript assembly in over 1000 cancer cell lines and find that unannotated transcripts are common and are associated with drug sensitivity.

    • Wei Hu, Yangjun Wu & Shengli Li
  • Article
    | Open Access

    Molecular-level control is required to capture the folding and supramolecular assembly of collagen in mimetic materials. Here, the authors report the creation of a synthetic collagen that assembles into banded fibers, recaptures structural properties of natural collagen, and can act as a testbed for design and experimentation.

    • Jinyuan Hu, Junhui Li & Fei Xu
  • Article
    | Open Access

    The more than one million publicly available human omics samples remain acutely underused. Here the authors present an approach combining natural language processing and machine learning to infer the source tissue of public genomics samples based on their plain text descriptions, making these samples easy to discover and reuse.

    • Nathaniel T. Hawkins, Marc Maldaver & Arjun Krishnan