Protein sequence analyses articles within Nature Communications

Featured

  • Article | Open Access

    Many fusion oncoproteins (FOs) form condensates: some form in the nucleus and regulate gene expression, while others form in the cytoplasm and promote cell signaling. In this work, the authors analyze physicochemical features to enable prediction of FO condensation behavior.

    • Swarnendu Tripathi, Hazheen K. Shirnekhi & Richard W. Kriwacki
  • Article | Open Access

    Protein language models taking multiple sequence alignments as inputs capture protein structure and mutational effects. Here, the authors show that these models also encode phylogenetic relationships, and can disentangle correlations due to structural constraints from those due to phylogeny.

    • Umberto Lupo, Damiano Sgarbossa & Anne-Florence Bitbol
  • Article | Open Access

    Glycosyltransferases (GTs) are proteins that display extensive sequence and functional variation on a subset of 3D folds. Here, the authors use interpretable deep learning to predict 3D folds from sequence without the need for sequence alignment, which also enables the prediction of GTs with new folds.

    • Rahil Taujale, Zhongliang Zhou & Natarajan Kannan
  • Article | Open Access

    The authors present flDPnn, a computational tool for predicting disorder and disorder functions from protein sequences. flDPnn was assessed with data from the “Critical Assessment of Protein Intrinsic Disorder Prediction” experiment and on an independent, low-similarity test dataset; both assessments show that flDPnn offers accurate predictions of disorder, fully disordered proteins and four common disorder functions.

    • Gang Hu, Akila Katuwawala & Lukasz Kurgan
  • Article | Open Access

    Our understanding of the residue-level details of protein interactions remains incomplete. Here, the authors show sequence coevolution can be used to infer interacting proteins with residue-level details, including predicting 467 interactions de novo in the Escherichia coli cell envelope proteome.

    • Anna G. Green, Hadeer Elhabashy & Debora S. Marks
  • Article | Open Access

    Dpr (Defective proboscis extension response) and DIP (Dpr Interacting Proteins) are immunoglobulin-like cell-cell adhesion proteins that form highly specific pairwise interactions, which control synaptic connectivity during Drosophila development. Here, the authors combine a computational approach with binding affinity measurements and find that DIP/Dpr binding specificity is controlled by negative constraints that interfere with non-cognate binding.

    • Alina P. Sergeeva, Phinikoula S. Katsamba & Barry Honig
  • Article | Open Access

    Billions of metagenomic and genomic sequences fill up public datasets, which makes similarity clustering an important and time-critical analysis step. Here, the authors develop Linclust, an algorithm with linear time complexity that can cluster over a billion sequences within hours on a single server.

    • Martin Steinegger & Johannes Söding