Enhanced access to the human phosphoproteome with genetically encoded phosphothreonine

Protein phosphorylation is a ubiquitous post-translational modification used to regulate cellular processes and proteome architecture by modulating protein-protein interactions. The identification of phosphorylation events through proteomic surveillance has dramatically outpaced our capacity for functional assignment using traditional strategies, which often require a priori knowledge of the upstream kinase. The development of phospho-amino-acid-specific orthogonal translation systems (evolutionarily divergent aminoacyl-tRNA synthetase and tRNA pairs that enable co-translational insertion of phospho-amino acids) has rapidly improved our ability to assess the physiological function of phosphorylation by providing kinase-independent methods of phosphoprotein production. Despite this utility, broad deployment has been hindered by technical limitations and an inability to reconstruct complex phospho-regulatory networks. Here, we address these challenges by optimizing genetically encoded phosphothreonine translation to characterize phospho-dependent kinase activation mechanisms and, subsequently, develop a multi-level protein interaction platform to directly assess the overlap of kinase and phospho-binding protein substrate networks with phosphosite-level resolution.




Data availability
Mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via PRIDE. Data will be published when there is an associated DOI or PubMed ID for the work. Accessions: Figure S1, PXD026425; Figure S4, PXD026421; Figure S10, PXD026432; Figure S11, PXD026420. FASTA files associated with the MS searches can be found in the supplementary data. Peptide library FASTA files were generated in house; the E. coli proteome was taken from UniProt reviewed IDs.
NGS files associated with Hi-P and Hi-P+ have been deposited to NCBI (Accession: PRJNA732384, ID: 732384).

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.

Sample size
For MS studies, our standard sample size was N of 3. These experiments are time and resource intensive, but this N provides enough information for statistical analysis. A lower N (N of 2) was used for one of the expression control experiments, Thr (not pThr) library expression (Figure 2), and for Figure S1A. The smaller N was chosen for Thr library expression because this experiment was an OTS-independent qualitative assessment of hard-coded Thr library coverage, with no associated statistical analysis or claims of significance. Similarly, the N for Figure S1A was chosen as a pre-filtering measure, based on qualitative comparison of performance across a large parameter set. Follow-up experiments were performed with N of 3 on a subset of samples identified through the screening process, to allow statistical comparison. Hi-P and Hi-P+ experiments were performed in duplicate because of practical limitations and the qualitative application of the data. We directly acknowledge the divergent N numbers for these data and provide justification in the main text.

Data exclusions
No data were excluded.

Replication
All experiments were replicated an additional 1-2 times (as dictated by N numbers), with experiments carried out on different days with different starting materials to ensure reproducibility.

Blinding
Blinding was not necessary for our project, as we were not comparing any treatments or working with patients or animals that could be influenced by our observations.

Antibody validation
Per the manufacturer's website, Jackson ImmunoResearch has verified by immunoelectrophoresis and/or ELISA that the antibody reacts with whole-molecule rabbit IgG. No antibody was detected against non-immunoglobulin serum proteins. The antibody has been tested by ELISA and/or solid-phase adsorbed to ensure minimal cross-reaction with bovine, chicken, goat, guinea pig, Syrian hamster, horse, human, mouse, rat and sheep serum proteins, but it may cross-react with immunoglobulins from other species.

Methodology
Sample preparation
20 ml of cells containing either no OTS (for Thr libraries), pThrOTSZeus, or pThrOTSHercules were grown to an OD600 of 0.4 and electroporated using the method stated above with either the Thr library or the pThr library cloned in the split mCherry vector (see supplemental plasmid files). The cells were then resuspended in 1.2 ml of LB and incubated for 1 h at 37°C and 230 RPM in a 15 ml culture tube. Recovered cells were directly inoculated in 20 ml of LB with 100 ng/µl ampicillin and 50 ng/µl kanamycin and grown overnight at 37°C and 230 RPM. Cells were plated at 10⁻⁴ and 10⁻⁵ serial dilutions on LB plates with antibiotics and grown at 37°C overnight. Experiments that proceeded forward required at least 20 colony-forming units per 10⁻⁵ dilution. The following morning, cultures were diluted to an OD600 of 0.15 in 5 ml of LB containing either 100 ng/µl ampicillin or 100 ng/µl ampicillin and 50 ng/µl kanamycin, and grown at 37°C and 230 RPM. The cells were grown to an OD600 of 0.6-0.8 and set on ice. Protein expression was induced using 1 mM IPTG, 0.2% arabinose, and 100 ng/µl anhydrotetracycline; cells were then grown at 20°C and 230 RPM for 20 hours. 30 µl of cells were resuspended in 3 ml of ice-cold PBS in a 5 ml polystyrene tube (Falcon) prior to analysis.
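The dilution and plating arithmetic in the sample preparation above can be illustrated with a minimal Python sketch. It is not part of the authors' protocol: the function names are hypothetical, and both the overnight OD600 and the plated volume are assumed inputs, since neither value is stated in the text. Only the target OD600 of 0.15, the 5 ml culture volume, and the threshold of 20 colony-forming units at the 10⁻⁵ dilution come from the description.

```python
def inoculum_volume_ml(overnight_od600, target_od600=0.15, final_volume_ml=5.0):
    """Volume of overnight culture needed to reach the target OD600.

    Uses the simple dilution relation C1 * V1 = C2 * V2 and assumes OD600
    scales linearly with cell density over this range.
    """
    if overnight_od600 <= target_od600:
        raise ValueError("overnight culture must be denser than the target OD600")
    return target_od600 * final_volume_ml / overnight_od600


def cfu_per_ml(colonies, dilution, plated_volume_ml):
    """Estimate CFU/ml of the undiluted culture from a single plate count.

    plated_volume_ml is an assumed input; the text does not state it.
    """
    return colonies / (dilution * plated_volume_ml)


def passes_colony_threshold(colonies_at_1e5_dilution, minimum=20):
    """True if the 10^-5 dilution plate meets the stated minimum of 20 CFU."""
    return colonies_at_1e5_dilution >= minimum


if __name__ == "__main__":
    # Hypothetical overnight culture at OD600 = 3.0
    print(f"inoculum: {inoculum_volume_ml(3.0):.2f} ml into 5 ml LB")
    # Hypothetical plate: 20 colonies, 10^-5 dilution, 0.1 ml plated
    print(f"titer: {cfu_per_ml(20, 1e-5, 0.1):.2e} CFU/ml")
    print("proceed:", passes_colony_threshold(20))
```

The colony threshold is kept as a separate check because the text states it per plate at the 10⁻⁵ dilution, independent of any titer estimate.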