Mass spectrometric based detection of protein nucleotidylation in the RNA polymerase of SARS-CoV-2

Coronaviruses, such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), encode a nucleotidyl transferase in the N-terminal (NiRAN) domain of nonstructural protein 12 (nsp12) within the RNA-dependent RNA polymerase. Here we show the detection of guanosine monophosphate (GMP)- and uridine monophosphate (UMP)-modified amino acids in nidovirus proteins using heavy isotope-assisted mass spectrometry (MS) and MS/MS peptide sequencing. We identified lysine-143 in the equine arteritis virus (EAV) protein, nsp7, as a primary site of in vitro GMP attachment via a phosphoramide bond. In SARS-CoV-2 replicase proteins, we demonstrate nsp12-mediated nucleotidylation of nsp7 lysine-2. Our results demonstrate new strategies for detecting GMP-peptide linkages that can be adapted for higher-throughput screening using mass spectrometric technologies. These data are expected to be important for a rapid and timely characterization of a new enzymatic activity in SARS-CoV-2 that may be an attractive drug target aimed at limiting viral replication in infected patients.


nature research | reporting summary | April 2020

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences
Behavioural & social sciences
Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.
Sample size
Data exclusions
Replication
Randomization
Blinding
No sample-size calculation was performed. Initial results demonstrated strictly unambiguous outcomes, i.e. band radiolabeling was readily distinguishable from background and/or non-specific signals. Because we did not need quantitation in our study to determine positive results, we deemed sample sizes greater than n=1 unnecessary within a single experiment, and instead reasoned that replicating the experiment multiple times on different days suited the experimental purposes sufficiently.
No data was excluded from analyses.
Gel-based nucleotidylation reactions that included, at minimum, WT EAV nsp9 and nsp7 or SARS-CoV-2 nsp12, nsp7 and nsp8 were repeated at least three times, though not with a set-up identical to that displayed in Figure 1. All attempts at replication were successful. The mass spectrometry assay directly supports the gel-based assay, so additional replication was deemed unnecessary.
For mass spectrometry analysis, a single experiment provided the initial, reported nucleotidylation data for EAV and SARS-CoV-2. Technical replicates were performed either simultaneously (EAV) or on a different day (SARS-CoV-2) and yielded the same results (i.e. presence of the appropriate m/z in the appropriate samples). Biological replicates (separate experiments) for modification of WT proteins were performed in the more advanced experimental set-up examining the mutant proteins described in the text. Results confirmed the initial dataset but are not directly reported, although the data are available in the repository described above.
LC-MS/MS examination of GMP-labeling of mutant proteins SARS-CoV-2 nsp7 S2A, SARS-CoV-2 nsp7 K3A, EAV nsp9 K380A and EAV nsp7 K156A was performed in a single experiment. All other mutants discussed were examined in two separate experiments. No data were omitted. Though these experiments looked for the disappearance of GMP-labeling on the mutated residue (i.e. negative data), success of the experiment was gauged by verifying GMP-labeling of a WT control in every new experiment and verifying GMP-labeling of other known sites within the mutant protein samples. Radiolabeling of EAV mutant proteins was performed once and verified with LC-MS/MS, whereas the SARS-CoV-2 mutant radiolabeling was performed twice in separate experiments and verified with LC-MS/MS in additional experiments.
Further biological replicates were not performed because 1) the gel-based assay supported the mass spectrometry data, 2) the experimental design incorporated controls (i.e. unlabeled, GTP-labeled, 15N-GTP-labeled and 13C-labeled samples) that generated mathematically predictable and very specific results without ambiguity, 3) LC-MS/MS itself (even without the internal controls) is highly precise and accurate, such that errors are highly unlikely given the stringent matching criteria (precursor mass errors < 1.5 ppm, MS/MS fragment ion detection in the Orbitrap with 0.04 Da error tolerance, Sequest cross-correlation (XCorr) scores > 2.5 for top peptide-spectrum matches, and false-discovery rate less than 1%), and 4) the analyses generated unique but redundant results for many peptides (e.g. the SARS-CoV-2 GMP modification was observed in four different peptides in MS/MS).
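The matching criteria listed above can be sketched as a simple filter over peptide-spectrum matches. This is only an illustration: the `PSM` record, its field names, and the use of a per-match q-value to stand in for the 1% false-discovery rate are assumptions, not the search software's actual output format. The 0.04 Da fragment tolerance is a search setting and is not modeled here.

```python
from dataclasses import dataclass

# Thresholds quoted in the text.
PPM_TOLERANCE = 1.5   # precursor mass error limit (ppm)
MIN_XCORR = 2.5       # Sequest XCorr threshold
MAX_FDR = 0.01        # false-discovery rate limit (1%)


@dataclass
class PSM:
    """Hypothetical peptide-spectrum match record (field names are assumptions)."""
    peptide: str
    observed_mz: float
    theoretical_mz: float
    xcorr: float
    q_value: float  # assumed per-match stand-in for the dataset-level FDR


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed precursor mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def passes_criteria(psm: PSM) -> bool:
    """True if the match satisfies all three thresholds above."""
    return (
        abs(ppm_error(psm.observed_mz, psm.theoretical_mz)) < PPM_TOLERANCE
        and psm.xcorr > MIN_XCORR
        and psm.q_value < MAX_FDR
    )


# Illustrative values only, not data from the study.
good = PSM("PEPTIDEK", 500.0005, 500.0, xcorr=3.0, q_value=0.005)
bad = PSM("PEPTIDEK", 500.0010, 500.0, xcorr=3.0, q_value=0.005)
```

In this sketch `good` sits at about 1 ppm and passes, while `bad` sits at about 2 ppm and is rejected on precursor mass error alone.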
Estimation of percent protein labeling with radionucleotide was performed once, to provide additional information requested by reviewers. Because these data were not central to the conclusions of the study, the estimate was not repeated.
Competition of radiolabeling with cold nucleotide was performed in three separate experiments with consistent results. Again, because these data were not central to the conclusions of the study, no further replication was performed.
Radiolabeling with UMP was performed in a single experiment with multiple technical replicates. Because these data were followed up by more informative experiments via LC-MS/MS, we did not perform additional UMP radiolabeling experiments.
LC-MS/MS data of UMP-peptide adducts were collected from a single biological experiment. Two replicates were examined. Further replication was not performed for the same reasons provided above for LC-MS/MS data and because our study ended with this data that was central to the main conclusions and results of this paper.
The experimental samples were varied by a single factor (no nucleotide, GTP, 15N-GTP, 13C-GTP, etc.) in a very quick 30-minute assay such that covariates were not considered. Randomization was not performed. Injection order onto the LC-MS/MS generally started with negative controls (i.e. "no nucleotide" reactions) to minimize carryover. Between batches of analyses, the LC columns were washed extensively and monitored for unacceptable levels of contamination. Initial analyses also assessed carryover from prior samples; we determined that the carryover was very low and did not obscure data interpretation.
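The single-factor design above (no nucleotide, GTP, 15N-GTP, 13C-GTP) implies a predictable mass shift for each condition. The sketch below computes these shifts under two assumptions that the text does not state: the labeled GTP is uniformly labeled (all 5 nitrogens or all 10 carbons of the GMP moiety), and the adduct added to the peptide is GMP minus water (C10H12N5O7P). The function names are hypothetical.

```python
# Monoisotopic masses in Da.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "P": 30.9737620}
MASS_13C = 13.0033548
MASS_15N = 15.0001089

# Assumed elemental composition added to the peptide: GMP minus H2O.
GMP_ADDUCT = {"C": 10, "H": 12, "N": 5, "O": 7, "P": 1}


def adduct_mass(label=None):
    """Monoisotopic mass of the GMP adduct for an unlabeled, '15N' or '13C' sample.

    Assumes uniform labeling of every N or C atom in the GMP moiety.
    """
    mass = sum(MASS[element] * count for element, count in GMP_ADDUCT.items())
    if label == "15N":
        mass += GMP_ADDUCT["N"] * (MASS_15N - MASS["N"])
    elif label == "13C":
        mass += GMP_ADDUCT["C"] * (MASS_13C - MASS["C"])
    elif label is not None:
        raise ValueError(f"unknown label: {label!r}")
    return mass


def mz_shift(label, charge):
    """Expected m/z increase of a modified peptide ion at the given charge state."""
    return adduct_mass(label) / charge


for label in (None, "15N", "13C"):
    print(label or "unlabeled", round(adduct_mass(label), 4), round(mz_shift(label, 2), 4))
```

Under these assumptions the heavy-label samples differ from the unlabeled GMP adduct by roughly 4.985 Da (15N) and 10.034 Da (13C), which is the kind of fixed offset that lets a modified peptide be told apart from a coincidental mass match.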
Blinding was not applied for the following reasons. The gels were generally loaded in a specific order that facilitated interpretation of the experimental results. In addition, successful nucleotidylation radiolabeling was readily distinguished from non-specific background and therefore did not require a scoring regime. For the mass spectrometry data, the raw data needed to be subjected to a complicated analysis after acquisition