Tripartite degrons confer diversity and specificity on regulated protein degradation in the ubiquitin-proteasome system

Guharoy, Mainak; Bhowmick, Pallab; Sallam, Mohamed; Tompa, Peter

doi:10.1038/ncomms10239


Article
Open access
Published: 06 January 2016


Nature Communications volume 7, Article number: 10239 (2016)


Abstract

Specific signals (degrons) regulate protein turnover mediated by the ubiquitin-proteasome system. Here we systematically analyse known degrons and propose a tripartite model comprising the following: (1) a primary degron (peptide motif) that specifies substrate recognition by cognate E3 ubiquitin ligases, (2) secondary site(s) comprising a single or multiple neighbouring ubiquitinated lysine(s) and (3) a structurally disordered segment that initiates substrate unfolding at the 26S proteasome. Primary degron sequences are conserved among orthologues and occur in structurally disordered regions that undergo E3-induced folding-on-binding. Posttranslational modifications can switch primary degrons into E3-binding-competent states, thereby integrating degradation with signalling pathways. Degradation-linked lysines tend to be located within disordered segments that also initiate substrate degradation by effective proteasomal engagement. Many characterized mutations and alternative isoforms with abrogated degron components are implicated in disease. These effects result from increased protein stability and interactome rewiring. The distributed nature of degrons ensures regulation, specificity and combinatorial control of degradation.


Introduction

Regulated degradation (Deg) of proteins via the ubiquitin-proteasome system (UPS) is critical for diverse cellular processes such as cell cycle progression, transcription, immune response, signalling, differentiation and growth¹. Deg is spatio-temporally controlled and needs to be harmonized with protein synthesis and functionality to maintain proteostatic balance and to achieve precise proteome remodelling in response to environmental and intracellular cues. This necessitates an intricate monitoring system capable of recognizing specific signals that mark proteins for Deg (degrons).

A degron has been defined as a protein element that confers metabolic instability². Although seemingly straightforward, its molecular correlates are often difficult to define and the term degron has been used inconsistently in the literature. For example, the degron is often defined as the substrate site that is recognized by E3 ubiquitin ligases and a variety of such degrons (short peptide motifs and specific structural elements) have been characterized³. The eukaryotic linear motif resource⁴ also classifies several short, linear motifs (SLiMs) as degrons. In contrast, other studies have indicated the site of polyubiquitination or the polyubiquitin chain itself as the degron⁵. Lys48-ubiquitin linkages form the canonical signal for Deg by the 26S proteasome, although other linkage types may also be recognized⁶. Recently, Matouschek and colleagues⁷ suggested that successful Deg requires another additional element, an intrinsically disordered Deg initiation site on the substrate that facilitates substrate unfolding and entry into the proteasome catalytic core, which is accessible only through a narrow channel⁸. They subsequently suggested that degrons are bipartite, composed of the substrate-bound polyubiquitin tag and an appropriately spaced disordered Deg initiation site⁹.

In this study we focus on signals that activate proteolysis when protein function is no longer required, that is, regulated Deg. To understand substrate commitment and entry into the Deg pathway, here we have systematically analysed known degrons based on multiple data sets and hypothesize that the minimal region necessary and sufficient for UPS-mediated regulated Deg is composed of three substrate elements. The primary degron is a short, linear (peptide) motif located mostly within structurally disordered regions (less often within surface-exposed segments of structured domains) and contains a specific sequence pattern that is recognizable by cognate E3 ligases. The secondary degron is one (or multiple neighbouring) substrate lysine(s) present on a defined surface region of the substrate (ubiquitination zone¹⁰). These lysines possess certain contextual preferences that favour (poly)ubiquitin conjugation, such as a moderate-to-high degree of local structural flexibility and a biased amino acid composition in their neighbourhood. Finally, the tertiary degron is a disordered/locally flexible Deg initiation site located proximal to (or overlapping with) the secondary degron. We demonstrate that known degron components show a significant correlation with intrinsically disordered regions (IDRs) and highly flexible substrate segments. We had previously analysed the manifold regulatory advantages of structural disorder in enzymatic components involved in ubiquitination¹¹ and here we suggest that the multi-layered substrate degron architecture reflects the complexity of proteostatic regulation. Thus, this study lays a solid foundation for a model where the blueprint for protein Deg is encoded in a distributed (combinatorial) architecture, determining the diversity and specificity of Deg, and thereby enabling a complex and spatiotemporal rewiring of the interactome.

Results

The underappreciated complexity of the degrome

Proteostasis entails a balance between protein synthesis, functional regulation and Deg. The regulatory complexity is clearly apparent at the synthesis and functional levels, but much less data are available detailing the regulatory elements at the level of Deg (Supplementary Fig. 1). The control of protein production at the transcriptional stage involves a complex interplay between transcription factors (TFs) and DNA regulatory elements (for example, promoters, enhancers and silencers). The transcription of ∼20,000 human genes is regulated by ∼1,800 TFs and ∼636,000 genomic binding sites have been mapped for 119 TFs. Overall, the number of DNA regulatory elements in the genome may reach into millions¹². Following synthesis, a large variety (∼300 types) of posttranslational modifications (PTMs) regulate protein localization, activity and interactions, in a synergistic and combinatorial manner¹³. The numbers of enzymes for certain modifications can be in the hundreds (for example, several hundred human kinases modifying ∼100,000 phosphosites¹³). Further, a plethora of peptide (sequence) motifs are known to direct a diverse range of protein functions (binding, modifications, localization, proteolytic cleavage and so on) and their total number has been estimated to be around one million in the human proteome¹³.

E3 ubiquitin ligases confer specificity in selecting substrates for UPS-mediated Deg, and consistent with a balance between the regulatory complexity of protein synthesis and functions vis-à-vis Deg the estimated number of human E3s is ∼600 (ref. 11). This is commensurable with the numbers of TFs and modification enzymes, for example, kinases. However, the balance breaks down when we observe the paucity of characterized E3-specific primary degrons. The number of SLiM-type degrons identified thus far is only 28, of which 25 types are found in human proteins (Table 1 and Supplementary Table 1). Experimental validation is available only for a limited number of corresponding substrates (93 human substrates; Supplementary Table 2) and even with predicted substrates (based on permissive criteria) no more than 30% of the human proteome can be putatively covered (Supplementary Fig. 1). Further, some of these motifs are highly degenerate (for example, the DBOX and SPOP sequence patterns); thus, even after filtering, the number of false predictions is likely to be high and significant exploratory research will be required to validate these predicted candidate proteins as true substrates.

Table 1 Primary degrons collected from the literature and the ELM database⁴.


In principle, one might expect many more E3 ligases to function by the recognition of a specific SLiM as a primary degron; therefore, the number of unique degrons should be much larger than the current number (that is, 28). Thus, a large part of the ‘degrome’ (that is, the full complement of degrons) remains to be explored. We propose that (in addition to as-yet uncharacterized primary degron types and uncharacterized substrates carrying known primary degrons), the regulatory complexity in Deg arises from a tripartite (distributed) degron architecture (Fig. 1), which enables a combinatorial use of degron components making the degrome commensurable in complexity with transcriptional and posttranslational regulatory elements in the proteome.

**Figure 1: The tripartite degron architecture.**

However, it should be clarified that although many more peptide (SLiM type) primary degrons may be anticipated, not all E3 ligases necessarily bind to peptide degrons. For example, the E3 ligase listerin/Ltn1 forms part of the large ribosomal subunit-associated quality control complex that facilitates translational surveillance in eukaryotes by ubiquitin-tagging defective polypeptides from stalled ribosomes¹⁴. In another example, a distributed structural degron dispersed across 523 residues of the amino-terminal transmembrane domain of yeast 3-hydroxy-3-methylglutaryl-coenzyme A reductase isozyme Hmg2p is required for its Hrd1-dependent regulated Deg¹⁵. It is not clear at the moment what fraction of E3 ligases use peptide degrons for substrate recognition.

Primary degrons specify substrate recognition by E3 ligases

E3 ligases target specific substrates for Deg by recognizing their primary degron (Fig. 1a). We catalogued 28 primary degron motifs encompassing a broad functional range, from 171 experimentally validated instances in 157 diverse substrates (Table 1 and Supplementary Tables 1 and 2). Analysis of their properties revealed that primary degrons resemble typical SLiMs. SLiMs are short (typically 5–10 residues), evolutionarily conserved functional peptide segments present within IDRs that mediate interactions with partner proteins¹⁶. Functional regions (such as binding motifs) behave as islands of sequence conservation within fast diverging IDRs. As E3 ligases bind substrates via their primary degrons, as part of a serious decision on the protein’s fate, the functional importance of degrons across orthologues is clearly reflected in their highly significant sequence conservation (Fig. 2a).

**Figure 2: Characteristic features of primary degrons.**

In terms of their structural preferences, primary degrons tend to be located within segments that are predicted to be intrinsically disordered (IUPred¹⁷, Fig. 2b), with high local backbone flexibility (DynaMine¹⁸, Supplementary Fig. 2), as compared with the overall substrate sequences in which they occur (P<2.2E−16). Again, this mirrors the general tendency towards local structural disorder for functional SLiMs. Consistently, we found that a large majority is located outside domain boundaries: 79% of primary degron instances had 0% overlap with annotated Pfam domains¹⁹ and the remainder localized to surface loops of structured domains (Fig. 2c). Secondary structure propensities calculated using PSIPRED²⁰ also indicated a predominance of coil conformations (∼85%) in degrons.
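The domain-overlap statistic quoted above (the share of primary degron instances with 0% overlap with annotated Pfam domains) can be illustrated with a minimal sketch. This is not the authors' code: the function name, the 1-based inclusive coordinate convention and the example coordinates are all assumptions made for illustration.

```python
def domain_overlap_fraction(motif, domains):
    """Fraction of motif residues covered by at least one annotated domain.

    `motif` is a (start, end) pair and `domains` a list of such pairs,
    all in 1-based inclusive residue coordinates (an assumed convention).
    """
    start, end = motif
    covered = set()
    for d_start, d_end in domains:
        lo, hi = max(start, d_start), min(end, d_end)
        if lo <= hi:  # the two intervals intersect
            covered.update(range(lo, hi + 1))
    return len(covered) / (end - start + 1)


# Hypothetical coordinates, not taken from the data set described in the text.
print(domain_overlap_fraction((30, 37), [(100, 250)]))   # 0.0
print(domain_overlap_fraction((95, 104), [(100, 250)]))  # 0.5
```

A degron instance would count towards the "outside domain boundaries" group when this fraction is exactly zero.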

We also searched the Protein Data Bank²¹ for unbound structures of all 157 substrates, to analyse the primary degron before E3 recognition. Although we found 505 unbound structures (corresponding to 65 substrates), in the majority of cases the degron-containing segment was not part of the experimental constructs used, most probably because its location outside domains, in a potentially disordered region, hampers crystallization. In only 17 structures (8 substrates) was the primary degron sequence part of the experimental construct used (Supplementary Table 3), and for only one protein was the degron region actually visible in the structure (Fig. 2c). In the remaining structures the degron regions lack electron density (one such example is shown in Fig. 2d) and fall into disordered regions. In contrast to the unbound structures, many primary degrons undergo disorder-to-order transitions when they bind to cognate E3 ligases (Supplementary Table 4), indicating conformational stabilization on binding (Supplementary Fig. 3); such behaviour is typical of motif-mediated interactions serving to impart recognition specificity¹⁶.

Surface accessibility of primary degrons infers regulation

It is clear from the analysis presented in the previous section that the majority of degrons are present in locally disordered regions. However, to investigate the possibility that subtle, local modulations in surface accessibility may regulate degron recognition by E3s, we investigated the surface accessibility of primary degrons relative to their flanking residues. Because of missing structure coordinates (both of degrons and degron-flanking sequences), we predicted accessible surface areas (ASAs) from sequence using SPINE-X²² (see Methods for details). Per-residue absolute ASA values were converted into Z-scores (using amino acid type-specific ASA distributions, shown in Supplementary Fig. 4, to enable comparisons between regions comprising different residue types) and then the average Z-score was computed for both the degron and degron-flanking segments in each of the 157 proteins. We observed a statistically significant trend for primary degrons to have a lower average Z-ASA as compared with their flanking regions (⟨Z-ASA⟩_degron < ⟨Z-ASA⟩_flanking, P=8.3E−12; Fig. 2e). This leads us to speculate on the existence of conformations where the degron may be locally buried/shielded by its neighbourhood (for example, by steric occlusion) and thereby held in conformations that are unsuitable/inaccessible for recognition by E3s.
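The Z-ASA comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from the Methods: the reference means and standard deviations are made-up numbers, the flank width of 20 residues is an assumed default, and the per-residue ASA values would in practice come from a predictor such as SPINE-X.

```python
from statistics import mean


def z_asa(sequence, asa, reference):
    """Convert per-residue ASA values into residue-type-specific Z-scores.

    `reference` maps each amino acid to an assumed (mean, sd) of its ASA
    distribution; real values would be derived from a reference data set.
    """
    return [(a - reference[aa][0]) / reference[aa][1]
            for aa, a in zip(sequence, asa)]


def degron_vs_flank(z, start, end, flank=20):
    """Mean Z-ASA of the degron z[start:end] and of its flanking residues.

    Indices are 0-based with an exclusive end; `flank` (residues taken on
    each side) is an assumption for illustration.
    """
    degron = z[start:end]
    flanks = z[max(0, start - flank):start] + z[end:end + flank]
    return mean(degron), mean(flanks)


# Toy input: hypothetical ASA values and reference statistics.
reference = {"A": (60, 20), "D": (100, 25), "S": (80, 20), "G": (50, 10)}
z = z_asa("ADSGA", [80, 75, 60, 40, 80], reference)
print(degron_vs_flank(z, 1, 4))  # (-1.0, 1.0)
```

In this toy case the degron residues are less exposed than expected for their residue types while the flanks are more exposed, which is the direction of the trend reported in the text.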

Dissected by degron type (Supplementary Fig. 5), only the N-terminal (N-end rule) degrons are more exposed than their flanking residues; for the majority of the remaining categories the primary degron has a lower average Z-ASA compared with its flanking regions (significant differences were observed in the case of APC/C (DBOX), APCC_TPR_1, CBL (MET), COP1, CRL4_CDT2_1, KLHL3, MDM2_SWIB, SCF_COI1_1, SCF_TIR1_1 degrons and the SCF_SKP2-CKS1_1 and SCF_TRCP1 phospho-degrons). This observation potentially indicates regulatory control that couples signalling events with degron accessibility (via local conformational changes, as we discuss in the following section).

PTMs regulate degron accessibility and conformational state

By keeping the primary degron relatively occluded compared with its flanking regions and thus inaccessible to recognition by E3 ligases, Deg can be made conditional and subject to regulation. Given the extensive cross-talk between phosphorylation and ubiquitination²³, a putative mechanism may involve the initiation of Deg via an appropriate (priming) PTM in the degron neighbourhood resulting in changes in degron accessibility by shifting the local conformational equilibrium towards more exposed states of the degron. Indeed, degron-flanking phosphorylation events that induce a conformational change resulting in degron exposure have been proposed in the Deg of Chk1 (ref. 24), Cdc25A and Cdc25B phosphatases²⁵,²⁶. In a recent molecular dynamics study, it was also shown that phosphorylation of a disordered segment shifts the conformational equilibrium towards binding competent sub-states²⁷.

In several substrates, multiple (sequential) priming modifications are essential for Deg to occur. For example, coordinated phosphorylation by multiple kinases occurs in the inhibition of the transcriptional coactivator Yes-associated protein, a human oncoprotein²⁸. Phosphorylation of the ³⁷⁶HXRXXpS³⁸¹ motif by the Lats tumour suppressor provides the priming signal for CK1δ/ɛ to phosphorylate the neighbouring ³⁸³DpSGXpS³⁸⁷ phosphodegron in Yes-associated protein and enables recognition by β-TRCP, the substrate recognition subunit of SCF^β-TRCP. In the case of β-catenin, casein kinase Iα primes Ser45, causing relay phosphorylation of Thr41 by glycogen synthase kinase 3, followed by Ser37 and Ser33, creating a canonical DpSGXXpS phosphodegron²⁹. A priming phosphorylation is also required for the proteolytic turnover of interferon receptor (IFNAR1)³⁰. In the light of our ASA calculations, these events can be hypothesized to occur via local conformational shifts that enable formation of binding-competent sub-states that are accessible/compatible for E3 binding. Certain IDR amino acid compositional types (polar tracts and polyelectrolytes³¹) possess a tendency to form nonspecific and unstructured, but locally compact, conformations (which can maintain the degrons in a buried/sterically occluded state); this process can be reversed by increasing the net charge per residue by the addition of PTMs such as phosphorylation. The fact that most known degrons are present within locally disordered regions would make PTM-mediated structural shifts highly sensitive and well suited to fine-tuning the regulation of Deg.
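Sequence patterns such as the DpSGXXpS phosphodegron mentioned above are commonly searched for with regular expressions. The following is a minimal sketch, assuming the pattern is written on the unmodified sequence (so phosphoserine is matched as plain Ser); the peptide is a made-up toy sequence. Because such motifs are short and degenerate, a match is only a candidate and says nothing about phosphorylation state or E3 binding.

```python
import re

# Simplified, assumed rendering of DpSGXXpS on the unmodified sequence:
# Asp, Ser, Gly, any two residues, Ser. The lookahead allows overlapping hits.
PHOSPHODEGRON = re.compile(r"(?=(DSG..S))")


def find_motif(sequence, pattern=PHOSPHODEGRON):
    """Return (start, end, match) for every hit, in 1-based inclusive coordinates."""
    return [(m.start(1) + 1, m.end(1), m.group(1))
            for m in pattern.finditer(sequence)]


toy = "MKTLDSGAHSPQR"  # made-up peptide, not a real protein sequence
print(find_motif(toy))  # [(5, 10, 'DSGAHS')]
```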

Perhaps reflecting the generality of this mechanism, we found 121 experimentally annotated PTM sites corresponding to 48 primary degrons (from 45 proteins) in our data set. Eighty-four of these PTM sites were located in flanking regions (considering up to 20 residues upstream and downstream of the degron in the primary sequence; Supplementary Data 1). Furthermore, mutations in many of these flanking residues that are known to be modified by PTMs lead to reduced Deg rates and loss of E3 binding (Supplementary Data 2), which clearly demonstrates the mechanistic importance of the nearby PTM event in priming primary degron recognition by E3 ligases. The potential for PTM (for example, phosphorylation)-based regulation of degron accessibility (and conformational state) is also indicated by the high incidence of Ser, Thr and Tyr residues (∼19%) in the degron-flanking regions in these 157 proteins. These residues, many of which show a high sequence conservation among related proteins (Supplementary Fig. 6), can potentially be modified by phosphorylation.
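The split of PTM sites into those inside the degron and those in its flanks (up to 20 residues on either side) can be sketched as below. The function name and coordinate convention are assumptions; the example positions reuse the β-catenin residues named earlier (Ser33, Ser37, Thr41, Ser45), with the degron span 32–37 inferred from the DpSGXXpS pattern rather than taken from the supplementary data.

```python
def classify_ptm_sites(ptm_positions, degron_start, degron_end, flank=20):
    """Split PTM positions into (inside_degron, flanking) lists.

    Coordinates are 1-based inclusive. Sites further than `flank` residues
    from either degron boundary are ignored.
    """
    inside, flanking = [], []
    for pos in ptm_positions:
        if degron_start <= pos <= degron_end:
            inside.append(pos)
        elif degron_start - flank <= pos <= degron_end + flank:
            flanking.append(pos)
    return inside, flanking


# Illustrative input: positions 33/37/41/45 from the text plus a distant,
# hypothetical site at 120 that falls outside the flank window.
print(classify_ptm_sites([33, 37, 41, 45, 120], 32, 37))
# ([33, 37], [41, 45])
```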

In addition to phosphorylation, there is evidence for a dynamic regulatory interplay between O-GlcNAcylation at Ser/Thr sites and protein stability: for example, O-GlcNAcylation of p53 at Ser-149 prevents the phosphorylation of adjacent Thr-155, which blocks ubiquitin-dependent proteolysis and stabilizes p53 (ref. 32). Thus, there is speculation that O-GlcNAc could act as a protective signal against proteasomal Deg, perhaps by counteracting the effect of phosphorylation³³.

Secondary degrons are Deg-linked substrate Ubsites

The selection of primary degrons is followed by the ubiquitination (poly-, mono- or multiple monoubiquitination) of acceptor Lys residue(s) on the substrate by the E3–E2 machinery (Fig. 1). The complexities of ubiquitin chain linkages and their functional readout is a field of intense study⁶. We define the secondary degron as the substrate lysine(s) that are ubiquitinated and linked to Deg. Although a Lys48-linked tetra-ubiquitin chain was widely believed to be the canonical/minimal determinant for proteasomal Deg, there is increasing evidence for other chain lengths and also (multiple) mono-Ub moieties to serve as efficient proteasomal targeting signals³⁴. Here we have analysed two data sets (details in Methods) as follows: (i) 108 Deg-linked lysines on 42 proteins (‘Deg’) and, as control, their remaining lysines (‘Others’); and (ii) 9,323 ubiquitinated lysines in 3,756 proteins (ubiquitination sites (‘Ubsites’)), based on two proteomics studies³⁵,³⁶, although without annotation as to the outcome of ubiquitination. The control set for the latter was an equivalent number of non-ubiquitinated lysines from the same set of proteins (‘Non-Ubsites’; Supplementary Data 3).

Secondary degrons are characterized by structural flexibility

In an early paper, Varshavsky³⁷ suggested a ‘stochastic capture’ model for the ubiquitination of N-end rule substrates, suggesting that the probability of a Lys to serve as the Ubsite was proportional to its local flexibility. The rationale was that too rigid a conformation would reduce ubiquitination efficiency and render it incompatible with processive polyubiquitination. Recent bioinformatics analyses have led to somewhat opposing results claiming that Ubsites do³⁸ or do not³⁹ fall into locally disordered regions, although Deg-linked Ubsites were clearly more disordered³⁹. We addressed these features by analysing both available structural data and disorder predictions of the lysine data sets. For only 8 of the 42 Deg substrates could we find Protein Data Bank structures (9 non-redundant structures, in which only 18 of the 108 Deg Lys were visible; see Supplementary Table 5). Two structures (possessing 2 and 1 characterized Deg lysines) are shown in Fig. 3a,b. In both cases, even though structured in the crystal environment, the surface patch (ubiquitination zone¹⁰) containing the Deg lysine(s) was predicted to be the most disordered/flexible. From this limited structural data for the 18 Deg lysines, we also plotted their observed secondary structural elements (SSE) distribution (Fig. 3c). Non-regular (coil) conformations are highly preferred (∼80%), mirroring previous analysis of in vivo Ubsites that showed a majority of lysines to be present on surface loops, followed by α-helices³⁹,⁴⁰.

**Figure 3: Features of secondary degrons and their sequence neighbourhood.**

Based on predictions of structural disorder (IUPred) for the complete Deg data set, we observed that about half of the Deg lysines are in IDRs (median IUPred disorder score ∼0.5); interestingly, Deg regions containing multiple Deg-linked lysines tend to have higher disorder scores than those regions with single Deg lysines (P<0.05; Fig. 3d). This suggests a greater role of structural plasticity for ubiquitination of redundant neighbouring lysines and this is also demonstrated by the two previous examples: the ubiquitination surface containing multiple Deg lysines (Fig. 3a) is predicted as being more disordered than the surface with a single Deg lysine (Fig. 3b).

Using disorder predictions (IUPred) and predictions of local backbone flexibility (DynaMine), we also analysed the neighbourhood of the lysines (using 21-residue sequence windows centred on each lysine). Deg lysine neighbourhoods are significantly more disordered than the other lysine categories (Deg versus Others, P=2.8E−7; Deg versus Ubsites, P<2.2E−16; and Deg versus Non-Ubsites, P<2.2E−16; Fig. 3e, left). Similarly, Deg neighbourhoods are characterized by significantly higher backbone flexibility (lower DynaMine S2 scores; Deg versus Others, P=5.1E−8; Deg versus Ubsites, P<2.2E−16; and Deg versus Non-Ubsites, P=1.3E−10; Fig. 3e, right). These observations correlate with predicted secondary structural propensities: the majority (>75%) of Deg lysines occupy coil conformations and this preference is significantly higher than all the other categories (Fig. 3f). Thus, an environment characterized by significantly increased local disorder/flexibility appears to specify Deg-linked Ubsites. We surmise that stable anchoring of the substrate to the E3 ligase via the primary degron combined with structural adaptability around the ubiquitination surface enables processivity in ubiquitination and the choice of multiple lysines to modify⁴¹.
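The neighbourhood analysis above averages a per-residue score over a 21-residue window centred on each lysine. A minimal sketch follows, assuming that windows are simply truncated at the sequence termini (the text does not say how termini were handled) and that `scores` holds one predicted value per residue, for example from IUPred or DynaMine.

```python
def lysine_window_means(sequence, scores, half_window=10):
    """Mean score in a window centred on each lysine.

    Returns {1-based lysine position: mean score}. half_window=10 gives the
    21-residue window used in the text; truncation at the termini is an
    assumption made for this sketch.
    """
    result = {}
    for i, aa in enumerate(sequence):
        if aa == "K":
            window = scores[max(0, i - half_window):i + half_window + 1]
            result[i + 1] = sum(window) / len(window)
    return result


# Toy input with a 3-residue window so the arithmetic is easy to follow.
print(lysine_window_means("AAKAA", [0.0, 0.25, 0.5, 0.75, 1.0], half_window=1))
# {3: 0.5}
```

Distributions of these window means for the Deg, Others, Ubsites and Non-Ubsites sets could then be compared with a suitable statistical test.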

Sequence features neighbouring the secondary degron

Unlike for primary degrons, no general sequence motif(s) encompassing the secondary degron have been established. Ubiquitin transfer occurs in a special microenvironment created by E2 active site residues, the acceptor Lys on the substrate and its neighbouring residues. Therefore, the lack of globally identifiable motif preferences could be due to E2-specific catalytic mechanisms⁴²,⁴³ that necessitate a requirement for compatibility between the amino acid environment of the acceptor lysine and key residues within the E2 catalytic core, as demonstrated for yeast Cdc34 (ref. 44).

We analysed amino acid frequencies within a 21-residue window centred on Ubsite lysines (enrichment calculated relative to the Non-Ubsites set, Fig. 3g). Aromatic and hydrophobic residues are significantly over-represented (in particular within ±6 residues of the lysine), whereas Cys, Met and charged residues are depleted (Fig. 3g, left). Whereas Glu is disfavoured throughout, the positively charged residues (Arg and Lys) are depleted in the immediate vicinity but enriched further away (>6 aa). Although the general trend is towards the depletion of strongly helix-favouring residues (Arg, Lys, Glu and Met), Leu and Gln appear exceptional, as they are enriched at certain positions. Asn is also enriched; however, it does not possess striking propensities for forming any regular secondary structure elements. Further, Gly and Pro are also over-represented at certain positions in the vicinity of Ubsites compared with Non-Ubsites. In one of the two proteomics data sets included in Ubsites, putative Deg-linked sites were ascertained using a proteasomal inhibitor and SILAC (stable isotope labelling by amino acids in cell culture) strategy³⁵. We compared the residue usage in this subset with substantially increased ubiquitination after proteasome inhibition (SILAC ratio>1.2), putatively linked to Deg. However, residue frequencies were qualitatively very similar (Fig. 3g, right). Finally, we also checked the residue composition for Deg versus Others. Although the number of Deg lysines is considerably smaller than the number in Ubsites (making a strong statistical trend less probable), the enrichment of aromatics is still evident (Supplementary Fig. 7). Interestingly, Thr and Tyr are also enriched around Deg sites, suggesting a possible role for PTMs in secondary degron selection. A previous study also observed significant enrichment of phosphorylatable residues (Ser, Thr and Tyr) flanking in vivo Ubsites⁴⁰.
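One simple way to express residue enrichment around modified lysines relative to a control set is a log2 ratio of window compositions. The sketch below is an assumption about how such a comparison could be set up, not the calculation behind Fig. 3g: it pools all window positions instead of treating each offset separately, and it uses a pseudocount (an arbitrary choice) to avoid division by zero.

```python
from collections import Counter
from math import log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def window_residues(sequence, site, half_window=10):
    """Residues flanking a 1-based site, excluding the site itself."""
    i = site - 1
    return sequence[max(0, i - half_window):i] + sequence[i + 1:i + 1 + half_window]


def log2_enrichment(foreground, background, pseudocount=1.0):
    """log2(frequency in foreground / frequency in background) per residue.

    `foreground` and `background` are strings of pooled window residues.
    Positive values indicate enrichment around the foreground sites.
    """
    fg, bg = Counter(foreground), Counter(background)
    fg_total = sum(fg[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    bg_total = sum(bg[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: log2(((fg[a] + pseudocount) / fg_total) /
                    ((bg[a] + pseudocount) / bg_total))
            for a in AMINO_ACIDS}


# Toy, made-up windows: aromatic-rich foreground versus Glu-rich background.
scores = log2_enrichment("FFWW", "EEEE")
print(scores["F"] > 0, scores["E"] < 0)  # True True
```

A position-specific version would apply the same ratio separately to the residues found at each offset from the lysine.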

Although overall residue usage preferences were evident (Fig. 3g), we were unable to determine any specific (enriched) sequence motifs. Nevertheless, the neighbourhood of ubiquitinated lysines is clearly important, forming an additional layer of regulatory control, as several studies on specific E2s have demonstrated the dependence of lysine selection on specific local amino acid preferences. For example, efficient Ub chain initiation by the APC/C-specific E2 Ube2C is dependent on charged residues close to the preferred Lys, which determine the timing and rate at which substrates are degraded by the APC/C⁴⁵. Substrate lysines selected by the yeast E2 Ubc4 contain neighbouring acidic residues that are complementary to a highly conserved Lys (K91) adjacent to the catalytic Cys86 of Ubc4 (ref. 46). Ubc4 is a specialized ‘initiator’ E2: yeast APC/C uses Ubc4 for chain initiation and Ubc1 for K48-linked chain elongation⁴⁷. K48 selectivity of Ubc1 arises from a cluster of polar residues proximal to the Ubc1 active site⁴². Cdc34 is another E2 that displays specificity for Ub-K48 and this behaviour is promoted by the presence of an acidic loop near the E2 active site that optimally positions ubiquitin K48 for nucleophilic attack⁴⁸. These results suggest that different E2s use distinct strategies to determine acceptor sites and detailed studies attempting to determine any corresponding specificity determinants (ubiquitination motifs) should consider E2 specificity.

Relationship between primary and secondary degrons

Once the active E3–E2/substrate assembly has formed, spatial and geometric constraints such as distance and orientation relative to the E3-bound primary degron⁴⁹ limit the ubiquitination surface and the selection of Deg-linked lysine(s). Thus, the relative separation between primary and secondary degrons should be an important and conserved feature. Indeed, altering the distance between the Dbox primary degron and Ub-initiation motifs in the APC/C substrate geminin stabilizes the protein against Deg⁴⁵.

Although we do not have sufficient data to draw general conclusions, for 11 proteins present in both the primary degron and Deg data sets, we could investigate the positions of Deg lysines relative to the primary degron (Supplementary Fig. 8). p53 has 17 characterized Deg lysines and, relative to its N-terminal MDM2-binding site, the lysines fall into multiple distance bins (Supplementary Fig. 8A). For most of the remaining proteins, however, we observed a clear distinction between Deg and Other lysines: the former (Deg) tend to be located very close in sequence to the primary degron (often within 20 residues). Pavletich and colleagues⁴⁹ have commented on the fact that β-catenin (Supplementary Fig. 8C) and IκBα (Supplementary Fig. 8D) orthologues and paralogues all contain lysines located 9–14 residues upstream of their primary degrons. Further, they showed that altering the relative spacing between the primary and secondary degron sites in β-catenin strongly influenced ubiquitination efficiency.

This apparent proximity of the primary (E3-binding motif) and secondary degron (site(s) of ubiquitination) suggests another possible role for the Ser/Thr residues flanking the primary degron (described earlier). Although ubiquitination occurs mostly on Lys residues, evidence exists that Ser/Thr can also be modified by ubiquitin or Ub-like proteins, thus replacing the Lys as a secondary degron site in proteins such as BH3 interacting-domain death agonist, neurogenin and the heavy chain of major histocompatibility complex I³⁴.

Physical separation thus uncouples E3-binding and ubiquitination but also should enable a degree of allosteric control between these functionalities. Thus, it is possible that E3 binding to the primary degron increases reactivity of Lys at the secondary site. Each lysine will possess a distinct probability of being the secondary degron, with proximity to the primary degron being a strong determinant in most cases (Supplementary Fig. 8). Further factors such as local structural flexibility and sequence neighbourhood (Fig. 3) also strongly contribute to defining a unique kinetic code of modification. For example, ubiquitination of yeast Sic1 (after binding to SCF^Cdc4) is restricted to six lysines in the N-terminal domain and each modification seems to have a different readout in terms of Deg efficiency and downstream signalling⁵⁰.

The tertiary degron initiates substrate unfolding

Recent evidence suggests that (poly)ubiquitination may not be sufficient for efficient proteasomal Deg, which additionally requires a disordered (or partially unfolded) region on the substrate⁷,⁹. The proteasome initially engages with this flexible segment (tertiary degron) and the substrate is then unfolded in a cooperative, ATP-dependent manner (Fig. 1c,d). Based on recent structural details of the proteasome⁸, the ubiquitin receptors Rpn10 and Rpn13 on the 19S regulatory particle are located ∼70–80 Å away from the ATPase unfolding channel. To facilitate access to these buried sites and thereby entry into the catalytic core of the 20S particle, the tertiary degron apparently requires a minimal length of 20–30 residues and needs to be located adjacent to the polyubiquitin tag. Deg efficiency drops sharply when the two sites are gradually separated⁹.

We therefore investigated the presence of long disordered regions/segments (LDRs; defined as at least 20 consecutive disordered residues, see Methods) in the vicinity of known Ubsites in physiological substrates. IUPred calculations show a strong distinction in this feature between Deg and all the other lysine categories: nearly 60% of Deg sites are located in the proximity (within 0–10 residues) of an LDR as compared with only 20–30% in the other Lys categories (Fig. 4a). DynaMine predictions also show that a significantly higher fraction (∼75%) of Deg lysines possess a flexible segment less than ten amino acids away (Supplementary Fig. 9). The fraction of sites located within LDRs (Fig. 4a inset and Supplementary Fig. 9 inset) is also significantly higher for Deg lysines than for all the other categories. These results not only confirm the earlier observation that Deg sites are the most disordered/flexible of all lysine categories (Fig. 3e,f) but also indicate a strong correlation between substrate Deg and the requirement for a proximal disordered Deg initiation site.
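The proximity measure used above can be illustrated with a minimal sketch. It assumes that LDRs are already available as 1-based, inclusive (start, end) intervals and that the distance is 0 for a lysine inside an LDR and otherwise the sequence separation to the nearest LDR boundary; the function names and this distance convention are illustrative assumptions, not the authors' code.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end); assumed convention


def distance_to_nearest_ldr(pos: int, ldrs: List[Interval]) -> Optional[int]:
    """Sequence distance from residue `pos` to the nearest LDR.

    Returns 0 if the residue lies inside an LDR, the separation to the
    closest LDR boundary otherwise, and None if the protein has no LDR.
    """
    best = None
    for start, end in ldrs:
        if start <= pos <= end:
            return 0
        d = start - pos if pos < start else pos - end
        if best is None or d < best:
            best = d
    return best


def fraction_within(distances: List[Optional[int]], max_dist: int = 10) -> float:
    """Fraction of sites whose nearest LDR is at most `max_dist` residues away."""
    if not distances:
        return 0.0
    hits = sum(1 for d in distances if d is not None and d <= max_dist)
    return hits / len(distances)
```

Applied to every lysine in a category, `fraction_within` would give the kind of per-category fraction compared in Fig. 4a, under the stated distance convention.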

**Figure 4: Sequence distance between secondary and tertiary degrons.**

Thus, a disordered region of adequate length for proteasomal entry appears to be an essential component of the signal for Deg. Furthermore, transient local unfolding enabled by intrinsic flexibility or promoted allosterically by external factors (for example, binding of the AAA-ATPase p97, which has ‘unfoldase’ activity⁵¹) or posttranslational modifications (for example, phosphorylation of the CDK inhibitor p19^INK4d (ref. 52)) can also provide such a site. Further, polyubiquitination itself can promote unfolding and exposure of the tertiary site in a ‘regulated unfolding’ mechanism^53,54.

Local disorder and outcome of ubiquitination

Relative local disorder/flexibility can putatively distinguish ubiquitinated lysine sites that are connected with either Deg or regulation-linked events. The fraction of Ubsites located within LDRs (10%) is significantly lower than the equivalent fraction not only of Deg sites (50%) but remarkably also of non-ubiquitinated control sites (that is, Non-Ubsites and Others, with 24% and 21%, respectively; Fig. 4a inset), which suggests that many of the identified Ubsites may not be linked to protein Deg. This is consistent with the observation that ∼40% of Ubsites did not show any significant increase in site-specific ubiquitination on proteasomal inhibition³⁵. Thus, the presence or absence of local structural disorder might be useful to predict the functional outcome of protein ubiquitination and it may also have evolutionary implications. The comparison between Ubsites specifically linked to Deg and all Ubsites relative to the two non-ubiquitinated control sets (Fig. 4) suggests that the distance between the Ubsite and the nearest LDR is likely to be under strong evolutionary pressure, such that Deg-linked sites show positive selection in favour of overlapping disordered regions, whereas sites involved in signalling exhibit negative selection to remove proximal LDRs to prevent Deg of the protein, or at least reduce its efficiency (Fig. 4b). The importance of the tertiary degron is especially striking, as variations in this component appear to significantly impact protein half-life and many paralogues affected by such changes (those with significantly shorter flexible segments) were observed to possess signalling functions⁵⁵.

Degron components and disease links

To further support the validity and the functional importance of the tripartite model, we next examined how impairing degron elements is linked to disease. We reasoned that a corrupted primary degron should abrogate substrate targeting by E3 ligases, a mutated secondary degron should block substrate ubiquitination (unless the lysine selection for a particular substrate is less stringent, as in the case of multiple neighbouring redundant lysines) and, finally, removal of the tertiary degron or mutations (for example, the removal of phosphorylatable residues) should alter substrate Deg kinetics (Fig. 5a). For the proteins in our primary degron and Deg data sets, UniProt annotations were scanned for experimental evidence of modifications (for example, isoforms resulting from alternative splicing, natural sequence variants including polymorphisms and disease-associated mutations) that interfere with the known degron components (see Methods). The results are summarized in Fig. 5b: in many cases, degron abrogation is directly linked with diseases such as cancer and growth defects (Supplementary Data 4–6). As mutations affecting degron elements are likely to critically influence cellular physiology, our model suggests that many more disease links are yet to be discovered.

**Figure 5: Tripartite degron components and functional impacts on their abrogation.**

Interfering with degrons also rewires the local interactome

As modifications affecting many of the known degron components have profound effects on cell physiology (as manifested in causing disease; Fig. 5b), we inferred that many of these proteins with deleted degron component(s) may take central positions in the local interactome, that is, they are highly connected. In fact, based on data from experimentally validated interactomes (Methods), many of these proteins are hubs (57% and 68% of proteins from the primary degron and Deg data sets, respectively, have >20 experimentally validated binding partners). After binning the proteins according to their number of known interaction partners (Fig. 5c), we observed that several proteins for which alternative splicing causes the removal of known degron component(s) are present in high interaction density bins. Similar observations are likely to grow as experimental evidence about protein isoforms and data on degron components become increasingly available. Here we discuss protein isoforms (often resulting from alternative splicing), because this may result in local interactome rewiring and affect protein function in complicated ways: first, the increased protein availability (degron removal often increases protein half-life by orders of magnitude) would affect equilibrium concentrations of different protein complexes; second, alternative splicing removes significant portions of the protein sequence (Fig. 5d) and affects considerable parts of the exposed surface (Fig. 5e), thus altering the interaction landscape by removing interaction patches/sites for certain partners.

The possible complex outcome is illustrated using the example of NIMA-related kinase 2 (Nek2; Fig. 6). Nek2 is a serine/threonine protein kinase that regulates centrosome separation in mitotic cells and controls chromatin condensation in meiotic cells. Two (primary) degrons have been characterized in the non-catalytic, carboxy-terminal domain of Nek2A (a KEN-box and an exposed C-terminal MR dipeptide tail)⁵⁶. Alternative splicing of this protein removes these two primary degrons and the resulting isoform (Nek2B) is significantly more stable (Fig. 6a)^56,57. In addition, this splicing event also removes the interaction regions of several binding partners (Fig. 6a), thus initiating a complex remodelling of the local interactome and realizing at least three distinct cellular states (Fig. 6b) involving Nek2A (following synthesis), Nek2A (mostly degraded after 30 min) and Nek2B (stable but having a reduced/altered interaction capacity). Protein availability and functional lifetime are crucial parameters; for example, elevated Nek2 levels have been detected in a number of human cancer types and cancer-derived cell lines⁵⁸. In addition to stability changes due to degron removal (or mutations), the subset of disrupted interactions will also be critical. For instance, the C terminus of Nek2A, which is missing in Nek2B, contains a binding site for the catalytic subunit of protein phosphatase 1 (Fig. 6b), which acts as a physiological inhibitor of Nek2 (ref. 59). Thus, interactome rewiring through changes in protein stability (due to degron removal) and alterations in the interaction landscape is likely to be a common biological phenomenon, and understanding it should help to clarify disease pathways.

**Figure 6: Removal of primary degrons influences protein stability and interactions.**

Discussion

Regulated Deg is a fundamental mechanism used to exert control over cellular processes and pathways by enabling precise alterations in protein levels. The distributed nature of, and synergistic relationships between, the E3 recognition/docking site on the substrate (primary degron), Ubsite(s) (secondary degron) and proteasomal Deg initiation site (tertiary degron) (Fig. 1) will help explain the diversity and specificity of the ‘degrome’. However, it also poses new challenges for identifying degron components and delineating their regulatory relations. What is clearly apparent, however, is that the multipartite nature of degrons also maximizes the regulatory complexity in decision-making before removing a protein from the cell. Another challenge for the structural characterization of degrons will arise from their presence within intrinsically disordered protein regions (Figs 2, 3, 4). The importance of conformational dynamics in degron functionality and regulation via PTMs will also need to be carefully studied further. Degron elements can also fundamentally control biological functioning, as altered half-life due to the removal/mutation of degron elements will have a significant impact on the duration of activity of a protein. In addition, alternative isoforms with missing degrons will have not only altered stability but also altered interaction capacity, thus enabling a complex and temporal rewiring of the interactome (Figs 5 and 6).

Methods

Prediction of structural disorder and backbone flexibility

IUPred¹⁷ and DynaMine¹⁸ software were used to predict structural disorder and protein backbone dynamics, respectively, using amino acid sequences as input. IUPred outputs scores (ranging between 0 and 1) for each residue; scores >0.5 indicate disordered residues (the ‘short’ mode of IUPred was used). DynaMine, trained on NMR data, estimates protein backbone dynamics and outputs per-residue S² order parameter values, indicating residue flexibility. Values ≤0.69 indicate highly flexible residues; [0.7–0.8] shows context-dependent flexibility and >0.8 indicates ordered residues. Using the IUPred scores, LDRs were defined as consecutive stretches of at least 20 disordered residues (breaks of up to three consecutive ordered residues within an LDR were permitted).
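The LDR definition above can be sketched as a single pass over per-residue IUPred scores. This is a minimal illustration rather than the authors' implementation: the function name is hypothetical, and it assumes that the 20-residue minimum applies to the whole segment including any tolerated breaks, and that a segment must start and end on a disordered residue. The text does not specify either point.

```python
from typing import List, Tuple


def find_ldrs(scores: List[float], threshold: float = 0.5,
              min_len: int = 20, max_gap: int = 3) -> List[Tuple[int, int]]:
    """Return LDRs as 0-based, inclusive (start, end) index pairs.

    A residue is disordered if its score is > threshold. Runs of up to
    `max_gap` ordered residues are tolerated inside a segment (assumption:
    they count towards the segment length).
    """
    ldrs = []
    start = None     # first disordered residue of the current segment
    last_dis = None  # most recent disordered residue in the segment
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i
            last_dis = i
        elif start is not None and i - last_dis > max_gap:
            # The ordered break is too long: close the current segment.
            if last_dis - start + 1 >= min_len:
                ldrs.append((start, last_dis))
            start = None
    # Close a segment that runs to the end of the sequence.
    if start is not None and last_dis - start + 1 >= min_len:
        ldrs.append((start, last_dis))
    return ldrs
```

With this convention, two 10-residue disordered stretches separated by three ordered residues merge into one 23-residue LDR, whereas a four-residue break splits them into two stretches that are each too short to qualify.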

Prediction of secondary structure propensities

Secondary structure propensities were calculated from primary sequence using the PSIPRED software²⁰ with default settings. The output provides a classification for each residue: C (coil), H (helix) or E (strand).

Prediction of residue surface accessibility

SPINE-X achieves accuracy >80% in predictions of ASAs based on amino acid sequences as input²². SPINE-X outputs ASA values for every residue in the input sequence. Absolute ASAs were converted into Z-scores, to facilitate comparison of relative solvent accessibility between regions consisting of different residue types. The protocol used was as follows: using SPINE-X predictions for all 157 proteins in our primary degron data set, we built ASA distributions for each of the 20 amino acid types (Supplementary Fig. 4). Next, for a specific motif instance, the absolute ASA of each motif residue was converted into a Z-score (using the ASA distribution corresponding to its amino acid type) and the average Z-score calculated for that motif. The same protocol was followed when estimating the average Z-score for motif-flanking regions.
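The Z-score conversion described above can be sketched as follows. The sketch assumes predicted ASA values are supplied as (amino acid, ASA) pairs and uses the population standard deviation; the text does not state which estimator was used, and the function names are illustrative.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, List, Tuple

Residue = Tuple[str, float]  # (one-letter amino acid, predicted absolute ASA)


def build_asa_stats(residues: Iterable[Residue]) -> Dict[str, Tuple[float, float]]:
    """Mean and standard deviation of ASA for each amino acid type."""
    by_type = defaultdict(list)
    for aa, asa in residues:
        by_type[aa].append(asa)
    # Assumption: population standard deviation over the background set.
    return {aa: (mean(vals), pstdev(vals)) for aa, vals in by_type.items()}


def mean_z_score(region: List[Residue],
                 stats: Dict[str, Tuple[float, float]]) -> float:
    """Average per-residue ASA Z-score over a motif or flanking region."""
    z_scores = []
    for aa, asa in region:
        mu, sd = stats[aa]
        # Guard against zero variance for amino acid types seen only once.
        z_scores.append((asa - mu) / sd if sd > 0 else 0.0)
    return mean(z_scores)
```

Normalizing each residue against its own amino acid type is what allows, for example, a lysine-rich motif to be compared with an alanine-rich flank on the same scale.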

Orthologue alignments and calculation of sequence conservation

Pre-computed multiple sequence alignments of orthologues were obtained from Discovery@Bioware (http://bioware.ucd.ie/~compass/biowareweb/) and used to calculate Shannon entropy scores for each aligned position ‘i’ using the equation: S(i) = −Σ_k p(k) ln p(k), where p(k) is the probability of the ith position in the sequence alignment being occupied by a residue of class ‘k’. The classifications of residues used were as follows: [(Ala, Val, Leu, Ile, Met, Cys), (Gly, Ser, Thr), (Asp, Glu), (Asn, Gln), (Arg, Lys), (Pro, Phe, Tyr, Trp), (His)]⁶⁰. Substitutions within a group were considered conservative. The lower the sequence entropy at a given alignment position, the higher its evolutionary conservation. For a given region (for example, primary degron sequence), the sequence entropy values for each motif position, S(i), were calculated and then averaged (<S>_motif = Σ_i S(i)/n_motif, where n_motif is the number of motif residues).
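The entropy calculation can be sketched as below, using the seven residue classes listed above. The alignment is assumed to be a list of equal-length strings in one-letter code, and gap or unknown characters are simply skipped; that handling is an assumption, as the text does not describe how gaps were treated.

```python
import math
from collections import Counter
from typing import List, Sequence

# The seven residue classes from the text, in one-letter code.
RESIDUE_CLASSES = ["AVLIMC", "GST", "DE", "NQ", "RK", "PFYW", "H"]
CLASS_OF = {aa: idx for idx, group in enumerate(RESIDUE_CLASSES) for aa in group}


def column_entropy(column: Sequence[str]) -> float:
    """S(i) = -sum_k p(k) ln p(k) over residue classes in one alignment column."""
    classes = [CLASS_OF[aa] for aa in column if aa in CLASS_OF]  # skip gaps (assumption)
    if not classes:
        return 0.0
    n = len(classes)
    return -sum((c / n) * math.log(c / n) for c in Counter(classes).values())


def mean_motif_entropy(alignment: List[str], start: int, end: int) -> float:
    """Average S(i) over alignment columns start..end-1 (0-based, end exclusive)."""
    columns = zip(*(seq[start:end] for seq in alignment))
    values = [column_entropy(col) for col in columns]
    return sum(values) / len(values)
```

A column containing only Lys and Arg scores 0 (fully conserved at the class level), whereas a column split evenly between two classes scores ln 2 ≈ 0.69.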

Data sets of ubiquitinated lysines

We used the following data sets of ubiquitinated lysines for analysis (Supplementary Data 3):

1. A set of 42 mammalian proteins where the ubiquitination of 108 lysines had been linked to their Deg (this set is referred to as Deg). This set had been compiled (and used in an earlier publication) based on the following criteria³⁹: (a) all the Ubsites had been studied in vivo; (b) existence of literature and database (UniProt⁶¹, UbiProt⁶², Phosphosite⁶³) evidence of ubiquitination, detected either by high-throughput mass spectrometry or by point mutations of specific lysines that abolish ubiquitination; (c) proteins for which the data quality precluded complete detection of Ubsites or proteins with ambiguous sites were excluded; and, importantly, (d) the 42 proteins have experimental evidence of undergoing UPS-mediated Deg (or processing) after ubiquitination. Deg lysines were compared with a control data set comprising the remaining 1,024 lysines (Others) from the same 42-protein set.
2. A large-scale set of experimentally validated Ubsites in human proteins had previously been collected and used to train a human-specific ubiquitin site predictor (hCKSAAP_UbSite)⁶⁴. The outcome of ubiquitination was unknown for this set of proteins. This data had been compiled from two recent proteomics-scale assays (in which the Ubsites were assigned based on enrichment of endogenous ubiquitinated peptides using affinity purification followed by high-resolution mass spectrometry^35,36) and from literature-derived UniProt annotations; the final set was prepared by filtering the proteins for sequence redundancy (using a 30% identity cutoff). This list was matched against the current UniProt release and contained 9,323 Ubsites (from 3,756 human proteins) that we refer to as Ubsites. For comparison, we used a set of 9,318 non-ubiquitinated lysines assembled from the same set of proteins (Non-Ubsites).

Sequence windows of 21 residues centred on the lysines were created for analyses of their features. In cases where the Lys was located near the termini of the protein chain, truncated sequence windows were used.
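The window extraction can be sketched in a few lines. The function name and the choice to key windows by 1-based lysine position are illustrative assumptions.

```python
from typing import Dict


def lysine_windows(sequence: str, flank: int = 10) -> Dict[int, str]:
    """Map each lysine's 1-based position to its surrounding sequence window.

    Windows span `flank` residues on either side (21 residues for flank=10)
    and are truncated where the lysine lies near a chain terminus.
    """
    windows = {}
    for i, aa in enumerate(sequence):
        if aa == "K":
            windows[i + 1] = sequence[max(0, i - flank): i + flank + 1]
    return windows
```

For a lysine at position 3 of a long chain, the returned window has only 13 residues (2 upstream, the lysine itself and 10 downstream).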

Protein–protein interaction data

iRefWeb⁶⁵ was used to retrieve data for known protein–protein interactions. The following filters were applied: (i) only physical interactions based on experimental validation; (ii) the interaction had been described in at least one publication; (iii) single-organism interactions only (that is, both proteins were from the same organism); and (iv) MI (MINT-inspired) score of 0.4 or more. This score is a measure of confidence in the observed interaction.
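The four filters, and the partner counts used to identify hubs, can be sketched as follows. The record layout and field names are hypothetical and do not reflect the actual iRefWeb export format; excluding self-interactions from the partner count is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class Interaction:
    """Hypothetical interaction record; field names are illustrative only."""
    protein_a: str
    protein_b: str
    taxon_a: int
    taxon_b: int
    is_physical: bool     # experimentally validated physical interaction
    n_publications: int
    mi_score: float       # MI (MINT-inspired) confidence score


def passes_filters(x: Interaction, min_mi: float = 0.4) -> bool:
    """Apply filters (i)-(iv) from the text to a single record."""
    return (x.is_physical
            and x.n_publications >= 1
            and x.taxon_a == x.taxon_b
            and x.mi_score >= min_mi)


def partner_counts(interactions: Iterable[Interaction],
                   min_mi: float = 0.4) -> Dict[str, int]:
    """Number of distinct binding partners per protein after filtering."""
    partners: Dict[str, Set[str]] = {}
    for x in interactions:
        # Assumption: self-interactions do not count as binding partners.
        if passes_filters(x, min_mi) and x.protein_a != x.protein_b:
            partners.setdefault(x.protein_a, set()).add(x.protein_b)
            partners.setdefault(x.protein_b, set()).add(x.protein_a)
    return {p: len(s) for p, s in partners.items()}
```

Proteins whose count exceeds 20 would correspond to the hubs discussed in the Results.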

UniProt annotations of sequence variants

Two data sets were used for cataloguing instances where degron elements were adversely affected: the primary degron data set of 157 proteins (Supplementary Table 2) and the Deg data set of 42 proteins (Supplementary Data 3). The Deg data set contains secondary degron instances (that is, Deg-linked lysines) and tertiary degrons were also defined using the same data set (that is, LDRs nearest to each Deg lysine). We searched UniProt annotations (feature tables, denoted ‘FT’) for each of these proteins for cases where any of the degron components were missing/altered:

a. Sequence variants (alternative sequence or isoforms, denoted ‘VAR_SEQ’ in UniProt annotation) resulting from alternative splicing, alternative promoter usage, alternative initiation and ribosomal frameshifting; alternative splicing was the most abundant.
b. Mutations (denoted ‘MUTAGEN’) corresponding to site(s) that have been experimentally altered by mutagenesis and their effects studied.
c. Sequence variations (position specific, denoted ‘VARIANT’) as reported by authors; validated human polymorphisms are linked to entries in the Single Nucleotide Polymorphism database⁶⁶. Entries in this category also include disease-associated mutations.
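The search described above amounts to an interval-overlap test between annotated features and degron components. The sketch below assumes that feature-table entries have already been parsed into (key, start, end) tuples with 1-based, inclusive coordinates; parsing of the UniProt flat file itself is not shown, and the function names and coordinates in the usage example are purely illustrative.

```python
from typing import Dict, List, Tuple

Feature = Tuple[str, int, int]  # (feature key, start, end), 1-based inclusive

# Feature keys scanned in the text.
FEATURE_KEYS = {"VAR_SEQ", "MUTAGEN", "VARIANT"}


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two inclusive intervals share at least one residue."""
    return a_start <= b_end and b_start <= a_end


def affected_degrons(features: List[Feature],
                     degrons: Dict[str, Tuple[int, int]]) -> List[Tuple[str, str]]:
    """List (feature key, degron name) pairs where a feature hits a degron."""
    hits = []
    for key, f_start, f_end in features:
        if key not in FEATURE_KEYS:
            continue  # ignore unrelated annotations such as domains
        for name, (d_start, d_end) in degrons.items():
            if overlaps(f_start, f_end, d_start, d_end):
                hits.append((key, name))
    return hits


# Illustrative usage with made-up coordinates:
example_hits = affected_degrons(
    [("VAR_SEQ", 380, 445), ("MUTAGEN", 37, 37), ("DOMAIN", 1, 300)],
    {"primary": (390, 393), "secondary": (37, 37), "tertiary": (20, 45)},
)
```

An overlap only flags a candidate; whether the change actually abrogates the degron still depends on the experimental annotation attached to the feature.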

Statistical tests

All statistical tests for calculating P-values were carried out using the Mann–Whitney U-test, unless otherwise specified.

Additional information

How to cite this article: Guharoy, M. et al. Tripartite degrons confer diversity and specificity on regulated protein degradation in the ubiquitin-proteasome system. Nat. Commun. 7:10239 doi: 10.1038/ncomms10239 (2016).

References

1. Ciechanover, A. Intracellular protein degradation: from a vague idea thru the lysosome and the ubiquitin-proteasome system and onto human diseases and drug targeting. Biochim. Biophys. Acta 1824, 3–13 (2012).
2. Varshavsky, A. Naming a targeting signal. Cell 64, 13–15 (1991).
3. Ravid, T. & Hochstrasser, M. Diversity of degradation signals in the ubiquitin-proteasome system. Nat. Rev. Mol. Cell. Biol. 9, 679–690 (2008).
4. Dinkel, H. et al. The eukaryotic linear motif resource ELM: 10 years and counting. Nucleic Acids Res. 42, D259–D266 (2014).
5. Bachmair, A. & Varshavsky, A. The degradation signal in a short-lived protein. Cell 56, 1019–1032 (1989).
6. Komander, D. & Rape, M. The ubiquitin code. Annu. Rev. Biochem. 81, 203–229 (2012).
7. Prakash, S., Tian, L., Ratliff, K. S., Lehotzky, R. E. & Matouschek, A. An unstructured initiation site is required for efficient proteasome-mediated degradation. Nat. Struct. Mol. Biol. 11, 830–837 (2004).
8. Tomko, R. J. Jr. & Hochstrasser, M. Molecular architecture and assembly of the eukaryotic proteasome. Annu. Rev. Biochem. 82, 415–445 (2013).
9. Inobe, T., Fishbain, S., Prakash, S. & Matouschek, A. Defining the geometry of the two-component proteasome degron. Nat. Chem. Biol. 7, 161–167 (2011).
10. Mattiroli, F. & Sixma, T. K. Lysine-targeting specificity in ubiquitin and ubiquitin-like modification pathways. Nat. Struct. Mol. Biol. 21, 308–316 (2014).
11. Bhowmick, P., Pancsa, R., Guharoy, M. & Tompa, P. Functional diversity and structural disorder in the human ubiquitination pathway. PLoS ONE 8, e65443 (2013).
12. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
13. Tompa, P., Davey, N. E., Gibson, T. J. & Babu, M. M. A million peptide motifs for the molecular biologist. Mol. Cell 55, 161–169 (2014).
14. Lyumkis, D. et al. Structural basis for translational surveillance by the large ribosomal subunit-associated protein quality control complex. Proc. Natl Acad. Sci. USA 111, 15981–15986 (2014).
15. Gardner, R. G. & Hampton, R. Y. A ‘distributed degron’ allows regulated entry into the ER degradation pathway. EMBO J. 18, 5994–6004 (1999).
16. Van Roey, K. et al. Short linear motifs: ubiquitous and functionally diverse protein interaction modules directing cell regulation. Chem. Rev. 114, 6733–6778 (2014).
17. Dosztanyi, Z., Csizmok, V., Tompa, P. & Simon, I. IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics 21, 3433–3434 (2005).
18. Cilia, E., Pancsa, R., Tompa, P., Lenaerts, T. & Vranken, W. F. The DynaMine webserver: predicting protein dynamics from sequence. Nucleic Acids Res. 42, W264–W270 (2014).
19. Finn, R. D. et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014).
20. Buchan, D. W., Minneci, F., Nugent, T. C., Bryson, K. & Jones, D. T. Scalable web services for the PSIPRED protein analysis workbench. Nucleic Acids Res. 41, W349–W357 (2013).
21. Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res. 28, 235–242 (2000).
22. Faraggi, E., Zhang, T., Yang, Y., Kurgan, L. & Zhou, Y. SPINE X: improving protein secondary structure prediction by multistep learning coupled with prediction of solvent accessible surface area and backbone torsion angles. J. Comput. Chem. 33, 259–267 (2012).
23. Hunter, T. The age of crosstalk: phosphorylation, ubiquitination, and beyond. Mol. Cell 28, 730–738 (2007).
24. Zhang, Y. W. et al. The F box protein Fbx6 regulates Chk1 stability and cellular sensitivity to replication stress. Mol. Cell 35, 442–453 (2009).
25. Kanemori, Y., Uto, K. & Sagata, N. Beta-TrCP recognizes a previously undescribed nonphosphorylated destruction motif in Cdc25A and Cdc25B phosphatases. Proc. Natl Acad. Sci. USA 102, 6279–6284 (2005).
26. Kieffer, I., Lorenzo, C., Dozier, C., Schmitt, E. & Ducommun, B. Differential mitotic degradation of the CDC25B phosphatase variants. Oncogene 26, 7847–7858 (2007).
27. Bui, J. M. & Gsponer, J. Phosphorylation of an intrinsically disordered segment in Ets1 shifts conformational sampling toward binding-competent substates. Structure 22, 1196–1203 (2014).
28. Zhao, B., Li, L., Tumaneng, K., Wang, C. Y. & Guan, K. L. A coordinated phosphorylation by Lats and CK1 regulates YAP stability through SCF(beta-TRCP). Genes Dev. 24, 72–85 (2010).
29. Wu, G. & He, X. Threonine 41 in beta-catenin serves as a key phosphorylation relay residue in beta-catenin degradation. Biochemistry 45, 5319–5323 (2006).
30. Qian, J. et al. Pathogen recognition receptor signaling accelerates phosphorylation-dependent degradation of IFNAR1. PLoS Pathog. 7, e1002065 (2011).
31. Mao, A. H., Lyle, N. & Pappu, R. V. Describing sequence-ensemble relationships for intrinsically disordered proteins. Biochem. J. 449, 307–318 (2013).
32. Yang, W. H. et al. Modification of p53 with O-linked N-acetylglucosamine regulates p53 activity and stability. Nat. Cell Biol. 8, 1074–1083 (2006).
33. Guinez, C. et al. Protein ubiquitination is modulated by O-GlcNAc glycosylation. FASEB J. 22, 2901–2911 (2008).
34. Kravtsova-Ivantsiv, Y. & Ciechanover, A. Non-canonical ubiquitin-based signals for proteasomal degradation. J. Cell Sci. 125, 539–548 (2012).
35. Wagner, S. A. et al. A proteome-wide, quantitative survey of in vivo ubiquitylation sites reveals widespread regulatory roles. Mol. Cell. Proteomics 10, M111.013284 (2011).
36. Danielsen, J. M. et al. Mass spectrometric analysis of lysine ubiquitylation reveals promiscuity at site level. Mol. Cell. Proteomics 10, M110.003590 (2011).
37. Varshavsky, A. The N-end rule: functions, mysteries, uses. Proc. Natl Acad. Sci. USA 93, 12142–12149 (1996).
38. Radivojac, P. et al. Identification, analysis, and prediction of protein ubiquitination sites. Proteins 78, 365–380 (2010).
39. Hagai, T., Azia, A., Toth-Petroczy, A. & Levy, Y. Intrinsic disorder in ubiquitination substrates. J. Mol. Biol. 412, 319–324 (2011).
40. Catic, A., Collins, C., Church, G. M. & Ploegh, H. L. Preferred in vivo ubiquitination sites. Bioinformatics 20, 3302–3307 (2004).
41. Hochstrasser, M. Lingering mysteries of ubiquitin-chain assembly. Cell 124, 27–34 (2006).
42. Rodrigo-Brenni, M. C., Foster, S. A. & Morgan, D. O. Catalysis of lysine 48-specific ubiquitin chain assembly by residues in E2 and ubiquitin. Mol. Cell 39, 548–559 (2010).
43. Ye, Y. & Rape, M. Building ubiquitin chains: E2 enzymes at work. Nat. Rev. Mol. Cell. Biol. 10, 755–764 (2009).
44. Sadowski, M., Suryadinata, R., Lai, X., Heierhorst, J. & Sarcevic, B. Molecular basis for lysine specificity in the yeast ubiquitin-conjugating enzyme Cdc34. Mol. Cell. Biol. 30, 2316–2329 (2010).
45. Williamson, A. et al. Regulation of ubiquitin chain initiation to control the timing of substrate degradation. Mol. Cell 42, 744–757 (2011).
46. Cook, W. J., Jeffrey, L. C., Xu, Y. & Chau, V. Tertiary structures of class I ubiquitin-conjugating enzymes are highly conserved: crystal structure of yeast Ubc4. Biochemistry 32, 13809–13817 (1993).
47. Rodrigo-Brenni, M. C. & Morgan, D. O. Sequential E2s drive polyubiquitin chain assembly on APC targets. Cell 130, 127–139 (2007).
48. Petroski, M. D. & Deshaies, R. J. Mechanism of lysine 48-linked ubiquitin-chain synthesis by the cullin-RING ubiquitin-ligase complex SCF-Cdc34. Cell 123, 1107–1120 (2005).
49. Wu, G. et al. Structure of a beta-TrCP1-Skp1-beta-catenin complex: destruction motif binding and lysine specificity of the SCF(beta-TrCP1) ubiquitin ligase. Mol. Cell 11, 1445–1456 (2003).
50. Petroski, M. D. & Deshaies, R. J. Context of multiubiquitin chain attachment influences the rate of Sic1 degradation. Mol. Cell 11, 1435–1444 (2003).
51. Beskow, A. et al. A conserved unfoldase activity for the p97 AAA-ATPase in proteasomal degradation. J. Mol. Biol. 394, 732–746 (2009).
52. Barrick, D. Biological regulation via ankyrin repeat folding. ACS Chem. Biol. 4, 19–22 (2009).
53. Jakob, U., Kriwacki, R. & Uversky, V. N. Conditionally and transiently disordered proteins: awakening cryptic disorder to regulate protein function. Chem. Rev. 114, 6779–6805 (2014).
54. Hagai, T. & Levy, Y. Ubiquitin not only serves as a tag but also assists degradation by inducing protein unfolding. Proc. Natl Acad. Sci. USA 107, 2001–2006 (2010).
55. van der Lee, R. et al. Intrinsically disordered segments affect protein half-life in the cell and during evolution. Cell Rep. 8, 1832–1844 (2014).
56. Hayes, M. J. et al. Early mitotic degradation of Nek2A depends on Cdc20-independent interaction with the APC/C. Nat. Cell Biol. 8, 607–614 (2006).
57. Hames, R. S., Wattam, S. L., Yamano, H., Bacchieri, R. & Fry, A. M. APC/C-mediated destruction of the centrosomal kinase Nek2A occurs in early mitosis and depends upon a cyclin A-type D-box. EMBO J. 20, 7117–7127 (2001).
58. Hayward, D. G. & Fry, A. M. Nek2 kinase in chromosome instability and cancer. Cancer Lett. 237, 155–166 (2006).
59. Helps, N. R., Luo, X., Barker, H. M. & Cohen, P. T. NIMA-related kinase 2 (Nek2), a cell-cycle-regulated protein kinase localized to centrosomes, is complexed to protein phosphatase 1. Biochem. J. 349, 509–518 (2000).
60. Guharoy, M. & Chakrabarti, P. Conservation and relative importance of residues across protein-protein interfaces. Proc. Natl Acad. Sci. USA 102, 15447–15452 (2005).
61. UniProt Consortium. Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res. 42, D191–D198 (2014).
62. Chernorudskiy, A. L. et al. UbiProt: a database of ubiquitylated proteins. BMC Bioinformatics 8, 126 (2007).
63. Hornbeck, P. V. et al. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 43, D512–D520 (2015).
64. Chen, Z., Zhou, Y., Song, J. & Zhang, Z. hCKSAAP_UbSite: improved prediction of human ubiquitination sites by exploiting amino acid pattern and properties. Biochim. Biophys. Acta 1834, 1461–1467 (2013).
65. Turinsky, A. L., Razick, S., Turner, B., Donaldson, I. M. & Wodak, S. J. Navigating the global protein-protein interaction landscape using iRefWeb. Methods Mol. Biol. 1091, 315–331 (2014).
66. Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001).
67. Nickell, S. et al. Insights into the molecular architecture of the 26S proteasome. Proc. Natl Acad. Sci. USA 106, 11943–11947 (2009).
68. Colaert, N., Helsens, K., Martens, L., Vandekerckhove, J. & Gevaert, K. Improved visualization of protein consensus sequences by iceLogo. Nat. Methods 6, 786–787 (2009).
69. Ng, C. et al. Structural basis for a novel intrapeptidyl H-bond and reverse binding of c-Cbl-TKB domain substrates. EMBO J. 27, 804–816 (2008).
70. Varshavsky, A. The N-end rule pathway and regulation by proteolysis. Protein Sci. 20, 1298–1345 (2011).


Acknowledgements

We thank Professors Shoshana Wodak, Joël Janin and Madan Babu for advice and comments on the work, and Dr Rita Pancsa for help with DynaMine calculations. This work was supported by the Odysseus grant G.0029.12 from Research Foundation Flanders (FWO) to P.T. and a fellowship from the Marie Curie Initial Training Network project 264257 (IDPbyNMR) from the European Commission to P.B. M.G. is the recipient of a VIB/Marie Curie COFUND Postdoctoral (omics@VIB) fellowship.

Author information

Mainak Guharoy and Pallab Bhowmick: These authors contributed equally to this work

Authors and Affiliations

VIB Structural Biology Research Center (SBRC), Vrije Universiteit Brussel (VUB), Building E, Pleinlaan 2, Brussels, 1050, Belgium
Mainak Guharoy, Pallab Bhowmick, Mohamed Sallam & Peter Tompa
Institute of Enzymology, Research Center for Natural Sciences, Hungarian Academy of Sciences, Budapest, 1117, Hungary
Peter Tompa

Authors

Mainak Guharoy
Pallab Bhowmick
Mohamed Sallam
Peter Tompa

Contributions

M.G. and P.T. conceived the study and wrote the paper. M.G. and P.B. performed the research with assistance from M.S. All authors analysed the results.

Corresponding authors

Correspondence to Mainak Guharoy or Peter Tompa.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

Supplementary Figures 1-9, Supplementary Tables 1-5 and Supplementary References (PDF 2751 kb)

Supplementary Data 1

List of experimentally observed PTMs within primary degrons and degron flanking residues annotated in UniProt. (XLSX 44 kb)

Supplementary Data 2

List of residues flanking primary degrons that undergo post-translational modifications with experimental annotation about the effect of mutations. (XLSX 13 kb)

Supplementary Data 3

Lists of lysine residues for the four datasets used (Deg, Others, Ubsites and Non-Ubsites). (XLSX 188 kb)

Supplementary Data 4

Lists of proteins with characterized isoforms, variants and mutants that affect primary degrons. (XLSX 75 kb)

Supplementary Data 5

Lists of proteins with characterized isoforms, variants and mutants that affect secondary degrons. (XLSX 77 kb)

Supplementary Data 6

Lists of proteins with characterized isoforms, variants and mutants that affect tertiary degrons. (XLSX 62 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Guharoy, M., Bhowmick, P., Sallam, M. et al. Tripartite degrons confer diversity and specificity on regulated protein degradation in the ubiquitin-proteasome system. Nat Commun 7, 10239 (2016). https://doi.org/10.1038/ncomms10239

Received: 27 April 2015
Accepted: 17 November 2015
Published: 06 January 2016
DOI: https://doi.org/10.1038/ncomms10239
