-
Article | Open Access
Benchmarking tools for detecting longitudinal differential expression in proteomics data allows establishing a robust reproducibility optimization regression approach
Longitudinal proteomics holds great promise for biomarker discovery, but the data interpretation has remained a challenge. Here, the authors evaluate several tools to detect longitudinal differential expression in proteomics data and introduce RolDE, a robust reproducibility optimization approach.
- Tommi Välikangas, Tomi Suomi & Laura L. Elo
-
Article | Open Access
pGlycoQuant with a deep residual network for quantitative glycoproteomics at intact glycopeptide level
Software tools for larger-scale intact glycopeptide quantification lag far behind, which hinders the exploration of differential site-specific glycosylation. Here, the authors report pGlycoQuant, a generic tool with a deep learning model for quantitative glycoproteomics at intact glycopeptide level.
- Siyuan Kong, Pengyun Gong & Weiqian Cao
-
Article | Open Access
RG/RGG repeats in the C. elegans homologs of Nucleolin and GAR1 contribute to sub-nucleolar phase separation
Spaulding et al. survey RG/RGG repeats in C. elegans and identify the homologs of Nucleolin (NUCL-1) and GAR1 (GARR-1). RG/RGG repeats are dispensable for nucleolar accumulation but critical for sub-nucleolar phase separation.
- Emily L. Spaulding, Alexis M. Feidler & Dustin L. Updike
-
Article | Open Access
FLASHIda enables intelligent data acquisition for top–down proteomics to boost proteoform identification counts
Data acquisition suitable for top-down proteomics (TDP) has the potential to significantly improve proteoform analysis. Here, the authors present FLASHIda, an intelligent online data acquisition algorithm for TDP that nearly doubles the number of proteoform-level identifications in complex samples.
- Kyowon Jeong, Maša Babović & Oliver Kohlbacher
-
Article | Open Access
A streamlined platform for analyzing tera-scale DDA and DIA mass spectrometry data enables highly sensitive immunopeptidomics
Immunopeptidomics benefits from highly sensitive mass spectrometry (MS). Here, the authors present a computational platform for integrating data-dependent and -independent acquisition MS approaches, and demonstrate its utility for deeper immunopeptidome profiling.
- Lei Xin, Rui Qiao & Ming Li
-
Article | Open Access
Proteogenomic characterization of 2002 human cancers reveals pan-cancer molecular subtypes and associated pathways
Pan-cancer proteomics analysis enables the analysis of protein expression across multiple cancer types. Here, the authors compare proteomics from 14 cancer types and show 11 distinct subtypes across multiple cancer types. Proteome data could link higher pathway activity levels with somatic alteration of specific genes in the pathway.
- Yiqun Zhang, Fengju Chen & Chad J. Creighton
-
Article | Open Access
Glyco-Decipher enables glycan database-independent peptide matching and in-depth characterization of site-specific N-glycosylation
Poor peptide fragmentation and unusual glycan structures limit mass spectrometry-based analysis of intact N-glycopeptides. Here, the authors develop Glyco-Decipher, a glycan-independent peptide search tool, to tackle these issues and improve the coverage of site-specific glycan analysis.
- Zheng Fang, Hongqiang Qin & Mingliang Ye
-
Article | Open Access
cyCombine allows for robust integration of single-cell cytometry datasets within and across technologies
Combining single-cell cytometry datasets increases the analytical flexibility and the statistical power of data analyses. Here, the authors present a method to robustly integrate cytometry data from different batches, experiments, or even different experimental techniques.
- Christina Bligaard Pedersen, Søren Helweg Dam & Lars Rønn Olsen
-
Article | Open Access
Improved prediction of protein-protein interactions using AlphaFold2
Predicting the structure of protein complexes is extremely difficult. Here, the authors apply AlphaFold2 with optimized multiple sequence alignments to model complexes of interacting proteins, enabling prediction of both whether and how proteins interact with state-of-the-art accuracy.
- Patrick Bryant, Gabriele Pozzati & Arne Elofsson
-
Article | Open Access
SMAP is a pipeline for sample matching in proteogenomics
Sample mix-up is a potential problem in large-scale omic studies due to the complexity of sample processing. Here, the authors present a pipeline for sample matching in proteogenomics to verify sample identity and ensure data integrity.
- Ling Li, Mingming Niu & Xusheng Wang
-
Article | Open Access
An atlas of protein turnover rates in mouse tissues
Protein turnover underpins biology but is challenging to measure in vivo across the entire proteome. Here, the authors provide a comprehensive resource of protein turnover in mouse tissues and develop a visualization platform to analyze these data.
- Zach Rolfs, Brian L. Frey & Nathan V. Welham
-
Article | Open Access
DeepPhospho accelerates DIA phosphoproteome profiling through in silico library generation
The coverage and throughput of data-independent acquisition (DIA)-based phosphoproteomics is limited by its dependence on experimental spectral libraries. Here the authors develop a DIA workflow based on in silico spectral libraries generated by a novel deep neural network to expand phosphoproteome coverage.
- Ronghui Lou, Weizhen Liu & Wenqing Shui
-
Article | Open Access
An integrative proteomics method identifies a regulator of translation during stem cell maintenance and differentiation
To characterize molecular changes during cell type transitions, the authors develop a method to simultaneously measure protein expression and thermal stability changes. They apply this approach to study differences between human pluripotent stem cells, their progenies, parental and allogeneic cells.
- Pierre Sabatier, Christian M. Beusch & Roman A. Zubarev
-
Article | Open Access
GproDIA enables data-independent acquisition glycoproteomics with comprehensive statistical control
Data independent acquisition (DIA) proteomics provides deep coverage and high quantitative accuracy, but is not yet well established in glycoproteomics. Here, the authors develop a DIA-based glycoproteomics workflow with stringent statistical controls to enable accurate glycopeptide identification.
- Yi Yang, Guoquan Yan & Liang Qiao
-
Perspective | Open Access
A proteomics sample metadata representation for multiomics integration and big data analysis
The number of publicly available proteomics datasets is growing rapidly, but a standardized approach for describing the associated metadata is lacking. Here, the authors propose a format and a software pipeline to present and validate metadata, and integrate them into ProteomeXchange repositories.
- Chengxin Dai, Anja Füllgrabe & Yasset Perez-Riverol
-
Article | Open Access
Spatiotemporal proteomic profiling of the pro-inflammatory response to lipopolysaccharide in the THP-1 human leukaemia cell line
Protein relocalisation plays a major role in the innate immune response but remains incompletely characterised. Here, the authors combine temporal proteomics with LOPIT, a spatial proteomic workflow, in a fully Bayesian framework to elucidate spatiotemporal proteomic changes during the LPS-induced immune response in THP-1 cells.
- Claire M. Mulvey, Lisa M. Breckels & Kathryn S. Lilley
-
Article | Open Access
IceR improves proteome coverage and data completeness in global and single-cell proteomics
Label-free quantitative proteomics by data dependent acquisition offers high protein identification rates but is often limited by missing values. Here, the authors develop a quantification workflow that substantially reduces missing values while maintaining high identification rates and quantification accuracy.
- Mathias Kalxdorf, Torsten Müller & Jeroen Krijgsveld
-
Article | Open Access
Systematic detection of functional proteoform groups from bottom-up proteomic datasets
Many proteins exist in various proteoforms but detecting these variants by bottom-up proteomics remains difficult. Here, the authors present a computational approach based on peptide correlation analysis to identify and characterize proteoforms from bottom-up proteomics data.
- Isabell Bludau, Max Frank & Ruedi Aebersold
-
Article | Open Access
Reliable identification of protein-protein interactions by crosslinking mass spectrometry
Cross-linking mass spectrometry (MS) can identify protein-protein interaction (PPI) networks but assessing the reliability of these data remains challenging. To address this issue, the authors develop and validate a method to determine the false-discovery rate of PPIs identified by cross-linking MS.
- Swantje Lenz, Ludwig R. Sinn & Juri Rappsilber
-
Article | Open Access
Quantitative single-cell proteomics as a tool to characterize cellular hierarchies
Single-cell proteomics can provide insights into the molecular basis for cellular heterogeneity. Here, the authors develop a multiplexed single-cell proteomics and computational workflow, and show that their strategy captures the cellular hierarchies in an Acute Myeloid Leukemia culture model.
- Erwin M. Schoof, Benjamin Furtwängler & Bo T. Porse
-
Article | Open Access
Automated annotation and visualisation of high-resolution spatial proteomic mass spectrometry imaging data using HIT-MAP
MALDI-mass spectrometry imaging (MSI) can reveal the distribution of proteins in tissues but tools for protein identification and annotation are sparse. Here, the authors develop an open-source bioinformatic workflow for false discovery rate-controlled protein annotation and spatial mapping from MALDI-MSI data.
- G. Guo, M. Papanicolaou & A. C. Grey
-
Article | Open Access
Genoppi is an open-source software for robust and standardized integration of proteomic and genetic data
Genetic variation can impact protein complexes and interaction networks, but reconciling genetic and proteomic information remains challenging. To address this need, the authors develop Genoppi, a computational tool for integrating genetics and cell-type-specific proteomics data.
- Greta Pintacuda, Frederik H. Lassen & Kasper Lage
-
Article | Open Access
A computational method for detection of ligand-binding proteins from dose range thermal proteome profiles
2D-thermal proteome profiling (2D-TPP) is a powerful assay for probing interactions of proteins with small molecules in their native context. Here the authors provide a statistical method for false discovery rate controlled analysis for 2D-TPP applications.
- Nils Kurzawa, Isabelle Becher & Mikhail M. Savitski
-
Article | Open Access
DIALib-QC an assessment tool for spectral libraries in data-independent acquisition proteomics
Most data-independent acquisition (DIA) methods depend on mass spectral libraries for peptide identification but tools to assess library quality are lacking. Here, the authors develop DIALib-QC for the systematic evaluation and correction of spectral libraries.
- Mukul K. Midha, David S. Campbell & Robert L. Moritz
-
Article | Open Access
Integration of absolute multi-omics reveals dynamic protein-to-RNA ratios and metabolic interplay within mixed-domain microbiomes
Here, the authors perform a temporal multi-omic analysis of a minimalistic cellulose-degrading and methane-producing consortium at the strain level and estimate protein-to-RNA ratios and RNA-protein dynamics of the community simultaneously over time.
- F. Delogu, B. J. Kunath & P. B. Pope
-
Article | Open Access
Identification of modified peptides using localization-aware open search
Mass spectrometry-based proteomics is the method of choice for the global mapping of post-translational modifications, but matching and scoring peaks with unknown masses remains challenging. Here, the authors present a refined open search strategy to score all peaks with higher sensitivity and accuracy.
- Fengchao Yu, Guo Ci Teo & Alexey I. Nesvizhskii
-
Article | Open Access
Strategies to enable large-scale proteomics for reproducible research
Clinical proteomics critically depends on the ability to acquire highly reproducible data over an extended period of time. Here, the authors assess reproducibility over four months across different mass spectrometers and develop a computational approach to mitigate variation among instruments over time.
- Rebecca C. Poulos, Peter G. Hains & Qing Zhong
-
Article | Open Access
Proteome activity landscapes of tumor cell lines determine drug responses
Proteome activity has a major role in cancer progression and response to drugs. Here, the authors use comprehensive proteomic and phosphoproteomic data, in conjunction with drug-sensitivity screens, to generate a community resource consisting of landscapes of pathway and kinase activity across different cell lines.
- Martin Frejno, Chen Meng & Bernhard Kuster
-
Article | Open Access
Focus on the spectra that matter by clustering of quantification data in shotgun proteomics
Matching mass spectra to peptide sequences is the usual first step in proteomics data analysis, often followed by peptide quantification. Here, the authors show that clustering and quantifying mass spectral features prior to peptide identification can increase the sensitivity of label-free quantitative proteomics.
- Matthew The & Lukas Käll
-
Article | Open Access
The Archaeal Proteome Project advances knowledge about archaeal cell biology through comprehensive proteomics
While archaeal proteomics has advanced rapidly, a comprehensive proteome database for archaea is lacking. The authors therefore launch the Archaeal Proteome Project, a community effort providing insights into archaeal cell biology via the combined reanalysis of Haloferax volcanii proteomics data.
- Stefan Schulze, Zachary Adams & Mechthild Pohlschroder
-
Article | Open Access
Cancer neoantigen prioritization through sensitive and reliable proteogenomics analysis
Identifying mutation-derived neoantigens by proteogenomics requires robust strategies for quality control. Here, the authors propose peptide retention time as an evaluation metric for proteogenomics quality control methods, and develop a deep learning algorithm for accurate retention time prediction.
- Bo Wen, Kai Li & Bing Zhang
-
Article | Open Access
Quantitative proteomic landscape of metaplastic breast carcinoma pathological subtypes and their relationship to triple-negative tumors
Metaplastic breast carcinoma (MBC) is among the most aggressive subtypes of triple-negative breast cancer (TNBC) but the underlying proteome profiles are unknown. Here, the authors characterize the protein signatures of human MBC tissue samples and their relationship to TNBC and normal breast tissue.
- Sabra I. Djomehri, Maria E. Gonzalez & Celina G. Kleer
-
Article | Open Access
Generating high quality libraries for DIA MS with empirically corrected peptide predictions
Data-independent acquisition-mass spectrometry (MS) typically requires many preparatory MS runs to produce experiment-specific spectral libraries. Here, the authors show that empirical correction of in silico predicted spectral libraries enables efficient generation of high-quality experiment-specific libraries.
- Brian C. Searle, Kristian E. Swearingen & Mathias Wilhelm
-
Article | Open Access
Rapid and site-specific deep phosphoproteome profiling by data-independent acquisition without the need for spectral libraries
Localizing phosphorylation sites by data-independent acquisition (DIA)-based proteomics is still challenging. Here, the authors develop algorithms for phosphosite localization and stoichiometry determination, and incorporate them into single-shot DIA-phosphoproteomics workflows.
- Dorte B. Bekker-Jensen, Oliver M. Bernhardt & Jesper V. Olsen
-
Article | Open Access
The epichaperome is a mediator of toxic hippocampal stress and leads to protein connectivity-based dysfunction
The biology of Alzheimer’s disease (AD) remains unknown. We propose AD is a protein connectivity-based dysfunction disorder whereby a switch of the chaperome into epichaperomes rewires proteome-wide connectivity, leading to brain circuitry malfunction that can be corrected by novel therapeutics.
- Maria Carmen Inda, Suhasini Joshi & Gabriela Chiosis
-
Article | Open Access
In silico spectral libraries by deep learning facilitate data-independent acquisition proteomics
Data-independent acquisition (DIA) is an emerging technology in proteomics but it typically relies on spectral libraries built by data-dependent acquisition (DDA). Here, the authors use deep learning to generate in silico spectral libraries directly from protein sequences that enable more comprehensive DIA experiments than DDA-based libraries.
- Yi Yang, Xiaohui Liu & Liang Qiao
-
Article | Open Access
ProTargetMiner as a proteome signature library of anticancer molecules for functional discovery
Anticancer drugs often have widespread effects on the cellular proteome. Here, the authors generate a proteome signature library of drug-treated cancer cell lines and develop a software tool to deconvolute drug targets and gain insights into their mechanisms of action.
- Amir Ata Saei, Christian Michel Beusch & Roman A. Zubarev
-
Article | Open Access
Complex I is bypassed during high intensity exercise
During high-intensity exercise, muscles convert glucose to lactate, in a process that is energetically less efficient than respiration. Here the authors develop a computational model based on muscle proteomic data showing that bypassing mitochondrial complex I increases ATP production rates, and validate these model predictions in an exercise test on 5 subjects.
- Avlant Nilsson, Elias Björnson & Jens Nielsen
-
Article | Open Access
The distinction of CPR bacteria from other bacteria based on protein family content
Recent studies have identified a large, phylogenetically distinct clade of bacteria, the candidate phyla radiation (CPR). Here, Méheust and colleagues analyze almost 3600 genomes to characterize the protein family content of CPR versus other bacteria and archaea.
- Raphaël Méheust, David Burstein & Jillian F. Banfield
-
Article | Open Access
Deep learning extends de novo protein modelling coverage of genomes using iteratively predicted structural constraints
Prediction of protein structures on the scale of genomes remains a challenge. Here the authors introduce a protein structure prediction method that uses deep learning to predict inter-atomic distances, torsion angles and hydrogen bonds, and apply it to predict the structures of 1475 Pfam domains.
- Joe G. Greener, Shaun M. Kandathil & David T. Jones
-
Article | Open Access
Comparative analysis of mRNA and protein degradation in prostate tissues indicates high stability of proteins
Protein degradation in clinical samples is largely unexplored. Here, the authors analyze the transcriptome and proteome of clinical tissue samples and develop an algorithm to assess protein degradation, showing that protein degradation is negligible in most tissue samples and does not correlate with transcript degradation.
- Wenguang Shao, Tiannan Guo & Ruedi Aebersold
-
Article | Open Access
An additive Gaussian process regression model for interpretable non-parametric analysis of longitudinal data
Longitudinal data are common in biomedical research, but their analysis is often challenging. Here, the authors present an additive Gaussian process regression model specifically designed for statistical analysis of longitudinal experimental data.
- Lu Cheng, Siddharth Ramchandran & Harri Lähdesmäki
-
Article | Open Access
Chromatogram libraries improve peptide detection and quantification by data independent acquisition mass spectrometry
Data-independent acquisition (DIA)-based proteomics often relies on mass spectrum libraries from data-dependent acquisition experiments. Here, the authors present a method to generate DIA-based chromatogram libraries, enabling DIA-only workflows and detecting more peptides than with spectrum libraries alone.
- Brian C. Searle, Lindsay K. Pino & Michael J. MacCoss
-
Article | Open Access
Integrative proteomics in prostate cancer uncovers robustness against genomic and transcriptomic aberrations during disease progression
Understanding of molecular events in cancer requires proteome-level characterisation. Here, proteome profiling of patient samples representing primary and progressed prostate cancer enables the authors to identify pathway alterations that are not reflected at the genomic and transcriptomic levels.
- Leena Latonen, Ebrahim Afyounian & Tapio Visakorpi
-
Article | Open Access
Discovery of coding regions in the human genome by integrated proteogenomics analysis workflow
Proteogenomics enables the discovery of protein coding regions and disease-relevant mutations but their verification remains challenging. Here, the authors combine peptide discovery, curation and validation in an integrated proteogenomics workflow, robustly identifying unknown coding regions and mutations.
- Yafeng Zhu, Lukas M. Orre & Janne Lehtiö
-
Article | Open Access
Noumeavirus replication relies on a transient remote control of the host nucleus
Large dsDNA viruses either replicate in or disrupt the nucleus to gain access to host RNA polymerases, or they rely on virus-encoded, packaged RNA polymerases. Here, the authors show that Noumeavirus replicates in the cytoplasm and relies on a transient recruitment of nuclear proteins to initiate replication.
- Elisabeth Fabre, Sandra Jeudy & Chantal Abergel
-
Article | Open Access
Improving GENCODE reference gene annotation using a high-stringency proteogenomics workflow
Identifying and annotating functional elements in the human genome remains a challenging but important task. Here the authors propose a priority annotation score to rank identifications and suggest how proteogenomics evidence can be interpreted and what additional information substantiates protein-coding potential for annotation.
- James C. Wright, Jonathan Mudge & Jennifer Harrow
-
Article | Open Access
Quantitative maps of protein phosphorylation sites across 14 different rat organs and tissues
The function of proteins is often regulated by their phosphorylation at specific amino-acid residues. The authors of this article have catalogued phosphoproteins and their phosphorylation sites in 14 rat organs and tissues, and provide these data as a resource for researchers.
- Alicia Lundby, Anna Secher & Jesper V. Olsen