Originally designed for measuring isotope abundances and elemental masses, mass spectrometry is becoming a mainstay across life sciences. As electrospray ionization of biomolecules turns 30 and the Orbitrap mass analyzer 20, we take this opportunity to highlight the role of both inventions in steering mass spectrometry from physics into biology and discuss the advances and challenges that may impact the future applications of biomolecular mass spectrometry.
Biological Mass Spectrometry
Mass spectrometry was long considered a specialist technology for physicists and chemists, but is now used across biological research. Two major driving forces of this development are Electrospray Ionization and the Orbitrap mass analyzer. On the occasion of their 30th and 20th anniversaries in 2019, we assembled this collection of Nature Communications articles. As the advances of mass spectrometry are closely connected to the emergence of proteomics, we updated this collection in 2020 to celebrate 10 years of the Human Proteome Project.
The Opinion section includes an Editorial as well as Commentaries from Matthias Mann and Alexander Makarov, giving personal accounts of the invention and evolution of Electrospray Ionization and Orbitrap, respectively. The sections Protein Mass Spectrometry and Beyond Proteins feature various applications of Electrospray Ionization- and Orbitrap-based mass spectrometry. Recent advances in characterizing the human proteome in health and disease are presented in The Human Proteome section.
John Fenn’s electrospray mass spectrometry (ESMS) was awarded the chemistry Nobel Prize in 2002 and is now the basis of the entire field of MS-based proteomics. Technological progress continues unabated, enabling single cell sensitivity and clinical applications.
The establishment of the Orbitrap analyzer as a major player in mass spectrometry-based proteomics is traced back to the first public presentation of this technology 20 years ago, when a proof-of-principle application led the way to further advancements and biological applications.
The Human Proteome
The Human Proteome Project (HPP) was launched in 2010 to enhance accurate annotation of the genome-encoded proteome. Ten years later, the HPP releases its first blueprint of the human proteome, annotating 90% of all known proteins at high stringency and discussing the implications of proteomics for precision medicine.
Standardization and harmonization of distributed multi-center proteotype analysis supporting precision medicine studies
Distributed multi-omic digitization of clinical specimens across multiple sites is a prerequisite for turning molecular precision medicine into reality. Here, the authors show that coordinated proteotype data acquisition is feasible using standardized MS data acquisition and analysis strategies.
Clinical proteomics critically depends on the ability to acquire highly reproducible data over an extended period of time. Here, the authors assess reproducibility over four months across different mass spectrometers and develop a computational approach to mitigate variation among instruments over time.
Mass spectrometry-based proteomics typically relies on highly sensitive nano-flow liquid chromatography (LC) but this can reduce robustness and reproducibility. Here, the authors show that micro-flow LC enables robust and reproducible high-throughput proteomics experiments at a very moderate loss of sensitivity.
The human skin is a highly complex organ comprising multiple tissue layers and diverse cell types. Here, the authors present a spatially-resolved quantitative proteomic atlas of the healthy human skin, characterizing the protein profiles of four skin layers and nine cell types.
Top-down proteomics can provide unique insights into the biological variations of protein biomarkers but detecting low-abundance proteins in body fluids remains challenging. Here, the authors develop a nanoparticle-based top-down proteomics approach enabling enrichment and detailed analysis of cardiac troponin I in human serum.
Large-scale, unbiased proteomics studies of biological samples like plasma are constrained by the complexity of the proteome. Herein, the authors develop a highly parallel protein quantitation platform leveraging multi-nanoparticle protein coronas for deep proteome sampling and biomarker discovery.
Heart failure is a major health issue worldwide. Here, Egerstedt et al. perform proteomic profiling of human plasma at different stages of heart failure, providing a comprehensive analysis of changes in the plasma proteome during disease progression.
An important aspect of precision medicine is to probe the stability in molecular profiles among healthy individuals over time. Here, the authors sample a longitudinal wellness cohort and analyse blood molecular profiles as well as gut microbiota composition.
Integrated proteogenomic deep sequencing and analytics accurately identify non-canonical peptides in tumor immunopeptidomes
Non-canonical HLA-bound peptides from presumed non-coding regions are potential targets for cancer immunotherapy, but their discovery remains challenging. Here, the authors integrate exome sequencing, transcriptomics, ribosome profiling, and immunopeptidomics to identify tumor-specific non-canonical HLA-bound peptides.
Connecting genomics and proteomics allows the development of more efficient and specific treatments for cancer. Here, the authors develop proteogenomic methods for defining cancer signaling in vivo, starting from core needle biopsies and with application to a HER2 breast cancer-focused clinical trial.
Quantitative proteomic landscape of metaplastic breast carcinoma pathological subtypes and their relationship to triple-negative tumors
Metaplastic breast carcinoma (MBC) is among the most aggressive subtypes of triple-negative breast cancer (TNBC) but the underlying proteome profiles are unknown. Here, the authors characterize the protein signatures of human MBC tissue samples and their relationship to TNBC and normal breast tissue.
Protein Mass Spectrometry
Automated mass spectrometry imaging of over 2000 proteins from tissue sections at 100-μm spatial resolution
Imaging mass spectrometry is a powerful emerging tool for mapping the spatial distribution of biomolecules across tissue surfaces. Here the authors showcase an automated technology for deep proteome imaging that utilizes ultrasensitive microfluidics and a mass spectrometry workflow to analyze tissue voxels, generating quantitative cell-type-specific images.
A machine learning-based chemoproteomic approach to identify drug targets and binding sites in complex proteomes
Proteomics is often used to map protein-drug interactions but identifying a drug’s protein targets along with the binding interfaces has not been achieved yet. Here, the authors integrate limited proteolysis and machine learning for the proteome-wide mapping of drug protein targets and binding sites.
The Archaeal Proteome Project advances knowledge about archaeal cell biology through comprehensive proteomics
While archaeal proteomics has advanced rapidly, a comprehensive proteome database for archaea is lacking. Therefore, the authors here launch the Archaeal Proteome Project, a community effort providing insights into archaeal cell biology via the combined reanalysis of Haloferax volcanii proteomics data.
A synthetic peptide library for benchmarking crosslinking-mass spectrometry search engines for proteins and protein complexes
Validating crosslinking-mass spectrometry workflows is hampered by the lack of a ground truth to assess the robustness of the crosslink identifications. Here, the authors present a synthetic library of crosslinked peptides, enabling unambiguous discrimination of correct and incorrect crosslink identifications.
Temporal dynamics of protein complex formation and dissociation during human cytomegalovirus infection
Here, Hashimoto et al. apply mass spectrometry-based thermal proximity coaggregation to characterize the temporal dynamics of virus-host protein-protein interactions during human cytomegalovirus (HCMV) infection, uncovering proviral functions including the internalization of the HCMV receptor integrin beta 1 with CD63.
The development of software tools to analyse large mass spectrometry data sets lags behind the increase in diversity of the data. Here the authors develop MS-GF+, a database search tool that outperforms other popular tools in identifying peptides from a variety of data sets.
Direct identification of clinically relevant neoepitopes presented on native human melanoma tissue by mass spectrometry
Neoantigens determine anti-cancer immunoreactivity and are important functional targets for immunotherapy. Here, the authors use deep mass spectrometry to characterize neoepitopes from human melanoma tissue and show the presence of tumour-reactive T cells with specificity for selected neoantigens.
Multi-laboratory assessment of reproducibility, qualitative and quantitative performance of SWATH-mass spectrometry
SWATH-mass spectrometry consists of a data-independent acquisition and a targeted data analysis strategy that aims to maintain the favorable quantitative characteristics on the scale of thousands of proteins. Here, using data generated by eleven groups worldwide, the authors show that SWATH-MS is capable of generating highly reproducible data across different laboratories.
Quantitative proteomics identifies redox switches for global translation modulation by mitochondrially produced reactive oxygen species
The role of reactive oxygen species (ROS) in signalling and specific targets is not fully understood. Here the authors perform a global proteomic analysis to delineate the yeast redoxome and show that increased levels of intracellular ROS caused by dysfunctional mitochondria decrease global protein synthesis.
Nanodroplet processing platform for deep and quantitative proteome profiling of 10–100 mammalian cells
There is a great need to develop highly sensitive mass spectrometry-based proteomics analysis for small cell populations. Here, the authors establish a robotically controlled chip-based nanodroplet processing platform and demonstrate its ability to profile the proteome from 10–100 mammalian cells.
Diffuse-type gastric cancer (DGC) accounts for 30% of gastric cancers and has few treatment options. Here the authors present a mutation and proteome dataset for 84 patients, identifying three major classes of DGC and indicating potential targets for therapy.
Metaproteomics reveals associations between microbiome and intestinal extracellular vesicle proteins in pediatric inflammatory bowel disease
Gut microbial dysbiosis has been implicated in the pathogenesis of inflammatory bowel disease. Here, the authors examine host-microbiota protein interactions that occur in inflammatory bowel disease; they show an upregulation in proteins related to antimicrobial activities, and alterations in intestinal extracellular vesicles that are associated with aberrant microbiota interactions.
Cell-specific proteome analyses of human bone marrow reveal molecular features of age-dependent functional decline
Ageing causes an inability to replace damaged tissue. Here, the authors perform proteomics analyses of human haematopoietic stem cells and other cells in the bone marrow niche at different ages and show changes in central carbon metabolism, reduced bone marrow niche function, and enhanced myeloid differentiation.
Proteogenomics and Hi-C reveal transcriptional dysregulation in high hyperdiploid childhood acute lymphoblastic leukemia
High hyperdiploidy is a common feature in childhood B-cell precursor acute lymphoblastic leukemia. Here, the authors perform proteogenomic and Hi-C analyses of this leukemia and the ETV6/RUNX1 subtype and show that CTCF and cohesin expression is low in hyperdiploid cases, with transcriptional dysregulation in relation to topologically associating domain borders in some of these cases.
Comparative cross-linking and mass spectrometry of an intact F-type ATPase suggest a role for phosphorylation
Rotary ATPases are membrane-embedded motors that produce or consume ATP and control pH within cells. Schmidt et al. use mass spectrometry to characterize the intact chloroplast ATPase from spinach and, using comparative cross-linking, show that phosphorylation affects stability and nucleotide occupancy.
RNA-binding proteins (RBPs) are implicated in many biological functions. Here the authors expand the human and yeast RNA interactome, identifying new and conserved RBPs, several of which have no prior function assigned to RNA biology or structural motifs known to mediate RNA binding, and suggesting new roles of RNA as modulators of protein function.
Protein N-myristoylation is a ubiquitous modification implicated in the regulation of multiple cellular processes. Here, Thinon et al. report the development of a general method to identify N-myristoylated proteins in human cells and identify over 100 endogenous post- and co-translational substrates of N-myristoyltransferase.
The spatial location of proteins within a cell is a key element of protein function. Here the authors describe hyperLOPIT—a proteomics workflow that allows the simultaneous assignment of thousands of proteins to subcellular niches with high resolution—and apply it to mouse pluripotent stem cells.
Hybrid mass spectrometry approaches in glycoprotein analysis and their usage in scoring biosimilarity
Many biopharmaceuticals exhibit mixed heterogeneity in their post-translational modifications (PTMs) that are essential for their function. Here the authors use a combination of mass spectrometry techniques to analyse human erythropoietin (EPO) and properdin to discover new PTMs on properdin and derive a biosimilarity score for various sources of EPO.
ADP-ribosylation is a reversible post-translational protein modification involved in many cellular processes. Here the authors describe a sensitive approach for the analysis of ADP-ribosylation sites under physiologic conditions and identify lysine residues as in vivo targets of ADP-ribosylation.
Quantitative phosphoproteomics has become a standard method in molecular and cell biology. Here, the authors compare performance and parameters of phosphoproteome quantification by LFQ, SILAC, and MS2-/MS3-based TMT and introduce a TMT-adapted algorithm for calculating phosphorylation site stoichiometry.
Secondary transporters catalyse substrate translocation across the cell membrane but the role of lipids during the transport cycle remains unclear. Here, the authors use hydrogen-deuterium exchange mass spectrometry and molecular dynamics simulations to understand how lipids regulate the conformational dynamics of secondary transporters.
ATP can function as a biological hydrotrope, but its global effects on protein solubility have not yet been characterized. Here, the authors quantify the effect of ATP on the thermal stability and solubility of the cellular proteome, providing insights into protein solubility regulation by ATP.
A computational platform for high-throughput analysis of RNA sequences and modifications by mass spectrometry
Mass spectrometry (MS) enables identification of modified RNA residues, but high-throughput processing is currently a bottleneck. Here, the authors present a free and open-source database search engine for RNA MS data to facilitate reliable identification of modified RNA sequences.
Bacterial tRNA is modified by thiolation of nucleosides. Here the authors identify 2-methylthiocytidine in bacterial tRNA using nucleic acid isotope labeling coupled mass spectrometry. Exposure to methylating agents converts 2-thiocytidine to 2-methylthiocytidine, which is repaired by demethylase AlkB in vivo.
A nonenzymatic method for cleaving polysaccharides to yield oligosaccharides for structural analysis
While mass spectrometry-based proteomics largely relies on digesting proteins into peptides, there is no equivalent strategy for polysaccharide analysis. Here, the authors develop a chemical approach to break down poly- into oligosaccharides and present a workflow to identify polysaccharides by oligosaccharide fingerprinting.
Heparan sulfates (HS) contain functionally relevant structural motifs, but determining their monosaccharide sequence remains challenging. Here, the authors develop an ion mobility mass spectrometry-based method that allows unambiguous characterization of HS sequences and structure-activity relationships.
Targeted mass spectrometry enables reproducible and accurate lipid quantification but dedicated software tools to develop targeted lipidomics assays are lacking. Here, the authors develop a targeted lipidomics workbench and lipid knowledgebase for the streamlined generation of targeted assays.
Coupling photochemical derivatization with tandem mass spectrometry enables C=C-isomer resolved lipidomics. Here, the authors further develop this approach into a shotgun lipidomics workflow that allows simultaneous characterization of lipid C=C locations and sn-positions in complex biological samples.
The use of machine learning for identifying small molecules through retention time prediction has been challenging so far. Here the authors combine a large database of liquid chromatography retention times with a deep learning approach to enable accurate metabolite identification.
Ion mobility collision cross-section atlas for known and unknown metabolite annotation in untargeted metabolomics
Collision cross section (CCS) information can aid the annotation of unknown metabolites. Here, the authors optimize the machine-learning based prediction of metabolite CCS values and curate a 1.6 million compound CCS atlas, improving annotation accuracy and coverage for known and unknown metabolites.
Fast and sensitive flow-injection mass spectrometry metabolomics by analyzing sample-specific ion distributions
Flow-injection mass spectrometry (FI-MS) enables high-throughput metabolomic profiling, but ion overload typically limits its sensitivity. Here, the authors show rapid and highly sensitive FI-MS overcoming an overload of the Orbitrap by analyzing sample-specific ion distributions.
Mycobacteria can adapt to the stress of human infection by entering a dormant state. Here the authors show that hypoxia-induced dormancy in M. bovis BCG involves the reprogramming of tRNA wobble modifications and copy numbers, coupled with biased use of synonymous codons in survival genes.
Mass spectrometry sequencing of long digital polymers facilitated by programmed inter-byte fragmentation
Digital information can be stored in monomer sequences of non-natural macromolecules, but only short chains can be read. Here the authors show long multi-byte digital polymers sequenced in a moderate resolution mass spectrometer. Full sequence coverage can be attained without pre-analysis digestion or the help of sequence databases.
Anomeric memory of the glycosidic bond upon fragmentation and its consequences for carbohydrate sequencing
Establishing generic carbohydrate sequencing methods is both a major scientific challenge and a strategic priority. Here the authors show a hybrid analytical approach integrating molecular spectroscopy and mass spectrometry to resolve carbohydrate isomerism, anomeric configuration, regiochemistry and stereochemistry.
Correlating chemical diversity with taxonomic distance for discovery of natural products in myxobacteria
It is thought that the chances for discovery of novel natural products increase by screening rare organisms. Here the authors analyse metabolites produced by over 2300 myxobacterial strains and, indeed, find a correlation between taxonomic distance and production of distinct secondary metabolite families.
The biological functions of lipids critically depend on their highly diverse molecular structures. Here, the authors determine the mass-resolved collision cross sections of 456 sphingolipid and glycerophospholipid species, providing a reference for future structural lipidomics studies.
Spatial-fluxomics provides a subcellular-compartmentalized view of reductive glutamine metabolism in cancer cells
Measuring metabolic fluxes in cellular compartments is a challenge. Here, the authors introduce an approach to infer fluxes in mitochondria and cytosol, and find that IDH1 is the major producer of cytosolic citrate in HeLa cells and that in SDH-deficient cells citrate synthase functions in reverse.
Untargeted metabolomics detects large numbers of metabolites but their annotation remains challenging. Here, the authors develop a metabolic reaction network-based recursive algorithm that expands metabolite annotation by taking advantage of the mass spectral similarity of reaction-paired neighbor metabolites.
Lack of best practice guidelines currently limits the application of metabolomics in the regulatory sciences. Here, the MEtabolomics standaRds Initiative in Toxicology (MERIT) proposes methods and reporting standards for several important applications of metabolomics in regulatory toxicology.
Glycomics is gaining momentum in basic, translational and clinical research. Here, the authors review current reporting standards and analysis tools for mass-spectrometry-based glycomics, and propose an e-infrastructure for standardized reporting and online deposition of glycomics data.