Introduction

Eukaryotic regulation of cellular processes is achieved through modular post-translational modification (PTM) cascades, whereby ‘writers’, which catalyse PTMs, and ‘readers’, which recognize them, collaboratively transduce input signals into specific cellular outputs1. Much of our understanding of the principles underlying modular signalling derives from studies of kinase or acetyltransferase catalytic domains, and the motifs that read their phosphorylation or acetylation marks, respectively. Such signalling has been likened to a ‘code’, determined by the arrangements of writer enzymatic domains, their targets, reader motifs and regulatory sequences within multidomain proteins and higher-order assemblies that execute PTM pathways. Some of the underlying modules are sufficiently well understood that PTM signalling pathways can be designed de novo, through the generation of artificial proteins built from mix-and-match regulatory, writer and reader elements. Although modification with ubiquitin and ubiquitin-like proteins (UBLs), such as interferon-stimulated gene 15 (ISG15), small ubiquitin-like modifier (SUMO) and neural precursor cell-expressed developmentally downregulated protein 8 (NEDD8), differs from other PTMs in that the modifiers are proteins, the ubiquitin and UBL code is often interpreted using modular signalling principles.

Ubiquitin and UBL modifications are written by combinations of E2 conjugating enzymes and E3 ligases with hallmark catalytic domains, for example the well-studied really interesting new gene (RING) domain, homologous to the E6-AP carboxy terminus (HECT) domain or RING-between-RING (RBR) domain, with unique enzymatic mechanisms (reviewed in refs.2,3,4). E2 and E3 enzymes catalyse modifications ranging from a single ubiquitin or UBL site-specifically linked to a particular target to polyubiquitin ‘chains’ wherein multiple ubiquitins are linked to each other. Ubiquitin and UBL modifications are ultimately read by downstream machineries that selectively bind and alter the fates of modified proteins (Fig. 1). Many polyubiquitin chain readers, including the 26S proteasome, display tandem ubiquitin-binding domains that each bind weakly to ubiquitin but that are arranged within a reader to synergistically recognize multiple ubiquitins in a linkage-specific manner. These properties are sufficiently robust to have enabled the design of linkage-selective ubiquitin chain sensors5,6.

Fig. 1: Elements of the ubiquitin code.

E3 enzymes (in combination with E2 and E1 enzymes) function as writers of the ubiquitin code. E3 enzymes are endowed with substrate specificity and — in a multistep mechanism that also engages E1 and E2 enzymes — attach ubiquitin to one or more residues of the substrate. Ubiquitin was primarily thought to modify Lys residues until recent studies discovered modification of many moieties on proteins and other macromolecules. This reaction can be repeated using a Lys of ubiquitin for attachment of the next ubiquitin molecule, giving rise to a ubiquitin chain. Depending on the E2–E3 pair that catalyses the last step of the ubiquitylation reaction, different Lys residues of ubiquitin are used, resulting in chains with different linkage types. Besides the seven Lys residues of ubiquitin, linkage can also occur with the amino-terminal Met. While ubiquitylation can be reversed by deubiquitylating enzymes (DUBs) that function as erasers of the ubiquitin code, it can be expanded by post-translational modifications of ubiquitin itself (Box 1). Distinct ubiquitin chains (and distinctly modified ubiquitin chains) have been shown to encode different cellular functions. They are decoded by readers equipped with ubiquitin-binding domains that are able to distinguish ubiquitin modifications and link the ubiquitylated substrate to downstream events, such as protein degradation, relocation, formation of multiprotein complexes and activation of enzymatic pathways. Ub, ubiquitin.
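The linkage vocabulary summarized in this legend can be written down as a small data structure. The following Python sketch is purely illustrative: the residue positions are the seven Lys residues and Met1 of ubiquitin, whereas the names `LINKAGE_SITES` and `describe_chain` are hypothetical, and the sketch assumes an unbranched chain and ignores modifications of ubiquitin itself.

```python
# Illustrative sketch: the eight sites on ubiquitin that can accept the
# next ubiquitin in a chain (amino-terminal Met1 plus seven Lys residues).
LINKAGE_SITES = ("Met1", "Lys6", "Lys11", "Lys27", "Lys29", "Lys33", "Lys48", "Lys63")


def describe_chain(linkages):
    """Summarize an unbranched chain from the linkage used at each junction.

    `linkages` holds one site name per ubiquitin-ubiquitin junction, so a
    chain of n ubiquitins has n - 1 entries. Branched chains are not modelled.
    """
    unknown = [site for site in linkages if site not in LINKAGE_SITES]
    if unknown:
        raise ValueError(f"unknown linkage site(s): {unknown}")
    if not linkages:
        return {"length": 1, "type": "monoubiquitin"}
    kinds = set(linkages)
    chain_type = "homotypic" if len(kinds) == 1 else "mixed"
    return {"length": len(linkages) + 1, "type": chain_type, "linkages": sorted(kinds)}
```

For example, `describe_chain(["Lys48", "Lys48", "Lys48"])` describes a homotypic chain of four ubiquitins, whereas `describe_chain(["Lys48", "Lys63"])` is reported as mixed.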

This Review summarizes the current knowledge of the ubiquitin code, including unprecedented ubiquitin modifications, new targets (such as sugars and lipids) and unique approaches by which bacteria and viruses reconfigure the ubiquitin code to promote infection. We also describe a fascinating set of novel structural, molecular and functional discoveries that revealed how the ubiquitin code depends on multivalent protein interactions to regulate cellular functions, as well as how its dysregulation promotes pathogenesis.

Ubiquitylation beyond Lys

Most studies of ubiquitylation have focused on linkage to amino groups, initially on Lys residues7. It seems likely that the successful identification of more than 100,000 modified Lys residues in human cells relied on the inherent stability of isopeptide bonds. More recent studies appreciated that protein amino termini are also amino groups that become linked to ubiquitin8,9,10. Indeed, from a chemical perspective, thioester bonds, which typically link the carboxy terminus of ubiquitin and the active site Cys in E2 and E3 enzymes, are highly reactive with properly placed amino groups11,12. Nonetheless, over the past four decades, there have been sporadic reports that the activities of viral E3 ligases13,14, endoplasmic reticulum-associated degradation15,16, peroxisomal protein translocation17,18,19, transcriptional regulation of cell fate20 and other processes involve ubiquitin modification of Cys, Ser and/or Thr side chains on targeted proteins (Box 1). However, such conclusions were largely based on indirect methods, for example ubiquitin modifications succumbing to reductive or hydrolytic chemical methods that destroy thioester or ester bonds, or loss of ubiquitylation upon mutation of Cys, Ser and/or Thr side chains. It was also not clear how specificity could be established for non-Lys side chains.

In the past four years, our understanding of endogenous ubiquitylation of non-Lys residues has greatly improved thanks to fortuitous discoveries facilitated by chemical biology and gene editing. Nuclear magnetic resonance spectroscopy and technical advances in proteomics allow direct detection of ester linkages between ubiquitin and a specific molecule, while deep probing of E3 ligases has established mechanisms specifying ubiquitin linkage to non-Lys moieties.

Writers forging ester bonds between ubiquitin and Ser or Thr hydroxyls

Thr was discovered as the preferred site of modification by a novel ‘RING–Cys–relay’ (RCR) catalytic mechanism used by the human E3 ligase MYCBP2 through a remarkable series of experiments initiated for unrelated purposes. MYCBP2 was found to react with a chemical probe resembling an E2–ubiquitin intermediate21. The probe was designed to stably capture E3 catalytic Cys through reaction with an electrophile positioned between the E2 conjugating enzyme and ubiquitin21,22,23. Proteomics studies revealed that when the probe was applied to cell lysates, it reacted with almost all E3s with a catalytic Cys known to receive ubiquitin from an E2 (that is, HECT-family and RBR-family E3s). However, MYCBP2 stood out as having a RING domain, but lacking a recognizable E3 domain with a catalytic Cys. That changed with the identification of the Cys that reacted with the probe. Elegant biochemical and structural studies showed that this Cys and another one nearby in the sequence are both catalytic in the newly discovered RCR catalytic domain, where the RING element binds an E2–ubiquitin intermediate, ubiquitin is transferred from the E2 Cys to one and then the other MYCBP2 catalytic Cys, and ubiquitin is ultimately ligated to Thr21,23 (Fig. 2a). Thr, not Lys or Ser, preferentially discharged ubiquitin from the RCR domain of MYCBP2. Crystal structures and mutational analyses revealed how the RCR domain catalyses various steps in this process. Most remarkably, an active site pocket, adjacent to the loop harbouring the second Cys, binds and positions the Thr for ubiquitylation. Knock-in mice harbouring a mutation that abolishes the RCR E3 mechanism and non-Lys ubiquitylation have a striking neurodevelopmental phenotype23. Furthermore, axons are protected in models of injury, suggesting that non-Lys ubiquitylation activity has important neuronal functions and that its inhibition might be of therapeutic value24.

Fig. 2: Unconventional ubiquitylation.

a | Thr ubiquitylation by the E3 MYCBP2. The RING–Cys–relay (RCR) mechanism depends on the RING domain, which binds E2–ubiquitin, and on the tandem Cys (TC) domain, which contains two catalytic Cys residues. Upon E2–E3 transthiolation and E2 dissociation, ubiquitin is relayed from Cys4520 to Cys4572 within the TC domain by intramolecular transthiolation. The TC domain also positions the substrate to allow Thr esterification. b | Linear ubiquitin assembly complex (LUBAC)-mediated ubiquitylation of unbranched glucosaccharides. The LUBAC components haem-oxidized IRP2 ubiquitin ligase 1 (HOIL1)-interacting protein (HOIP) and SHANK-associated RH domain-interacting protein (SHARPIN) bind glucosaccharides, whereas HOIL1 ubiquitylates the C6 hydroxyl moiety of glucose in one catalytic step. The reaction is accelerated by non-covalent binding of unconjugated Met1-linked or Lys63-linked ubiquitin oligomers to the RING-between-RING (RBR) domain of HOIL1. c | Ubiquitylation of ADP-ribose (ADPR) by DELTEX3L. E2–ubiquitin recruitment by the RING domain of DELTEX3L promotes a conformational change that positions the NAD+ bound to the carboxy-terminal domain (DTC) in close proximity to E2–ubiquitin, which facilitates ubiquitylation of ADPR at Gly76. Ub, ubiquitin.
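The relay in part a can be read as an ordered hand-off of ubiquitin between carriers. The Python sketch below is only a schematic of that order, using the residue numbers given in the legend; the names `RCR_RELAY` and `relay_steps` are hypothetical and no kinetics or structural detail is implied.

```python
# Illustrative sketch of the RING-Cys-relay (RCR) described for MYCBP2.
# Each tuple is (carrier of ubiquitin, bond type holding the ubiquitin C terminus).
RCR_RELAY = [
    ("E2 catalytic Cys", "thioester"),
    ("MYCBP2 Cys4520", "thioester"),  # reached by E2-E3 transthiolation
    ("MYCBP2 Cys4572", "thioester"),  # reached by intramolecular transthiolation
    ("substrate Thr", "ester"),       # esterification of the Thr hydroxyl
]


def relay_steps(relay=RCR_RELAY):
    """Return each transfer as a (donor, acceptor, resulting bond) triple."""
    return [
        (donor, acceptor, bond)
        for (donor, _), (acceptor, bond) in zip(relay, relay[1:])
    ]
```

Calling `relay_steps()` yields three transfers, of which only the last changes the bond type from thioester to ester.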

Regulation of ester linkage of ubiquitin to amino acid side chains gained a solid footing when haem-oxidized IRP2 ubiquitin ligase 1 (HOIL1; also known as RBCK1) was identified as another E3 catalysing such modifications25. HOIL1, along with the linear ubiquitin chain forming E3 HOIL1-interacting protein (HOIP) and regulator SHANK-associated RH domain-interacting protein (SHARPIN), forms a multifunctional trimeric linear ubiquitin assembly complex (LUBAC). HOIL1 is an RBR-family E3 ligase, but its direct substrates were unclear until knock-in mice harbouring a HOIL1 active site mutation were generated to understand the physiological role of HOIL1 in immune signalling. Biochemical studies of lysates from bone marrow-derived macrophages showed a curious defect caused by the active site mutation: the lysates lacked a slowly migrating form of HOIL1 that disappeared upon chemical hydrolysis of ester linkages. Proteomics revealed that wild-type HOIL1 is ubiquitylated on a specific Ser. HOIL1 substrates, including subunits of the Myddosome inflammatory regulatory complex, are also apparently modified by ubiquitin linkages to hydroxy group-containing amino acids, as they are sensitive to chemical treatments that hydrolyse ester bonds.

Endogenous eukaryotic ubiquitylation of non-protein molecules

HOIL1 modifies not only hydroxyls of amino acid side chains, but also primary hydroxyls of specific sugars26 (Fig. 2b). The connection to carbohydrates was indicated by a mysterious feature of HOIL1 mutant disease in humans and mouse models: accumulation of aberrant starch-like polysaccharide polyglucosan bodies, which may ultimately cause death. Elegant in vitro biochemical reconstitutions showed that HOIL1 ubiquitylates glycogen and maltoheptaose, but not some other sugars. Nuclear magnetic resonance spectroscopy identified the hydroxyl functionality on the C6 carbon as the attachment site. Moreover, the binding properties of LUBAC subunits suggested how this ubiquitylation of sugars may be regulated: both HOIP and SHARPIN were found to bind to amylose resin. This result indicates that these subunits can directly localize the LUBAC E3 to ubiquitylate sugars. Also, sugar ubiquitylation was substantially increased upon HOIL1 binding to polyubiquitin chains, which may portend a feedforward mechanism whereby some polyubiquitylated moiety fuels HOIL1-catalysed ubiquitylation of nearby carbohydrate hydroxyls.

A different E3 was discovered to link the C terminus of ubiquitin to ADP-ribose (ADPR). The RING E3 ligase DELTEX3L was found to bind the ADP-ribosyltransferase PARP9, suggesting a connection between these two PTMs. Indeed, in the presence of E1, E2 and the DELTEX3L–PARP9 complex, the C terminus of ubiquitin was ADP-ribosylated27. This reaction depends on NAD+. Subsequent biochemical studies showed that DELTEX3L alone encompasses E3 ligase activity towards ADPR. Structural studies28 showed that DELTEX3L has an ADPR-binding domain, and this is positioned relative to the RING domain to promote ubiquitin transfer from a RING-bound E2 enzyme to ADPR (Fig. 2c). Notably, ADP-ribosylation blocks the C terminus of ubiquitin, and thus inhibits its conventional E1–E2–E3-dependent linkage to other proteins or macromolecules, although the modification can be reversed by deubiquitylating enzymes (DUBs)28. Given the sufficiency of DELTEX3L for these reactions, the precise role of the DELTEX3L–PARP9 complex will require further study. The interaction of DELTEX3L with PARP9 may be required for localizing DELTEX3L to substrates. Indeed, proteins that are ADP-ribosylated by PARP are recruited to the ADPR-binding site in DELTEX-family E3s for ubiquitylation28,29 both in vitro and in cells.

We anticipate that future studies will show that many E3s modify non-proteinaceous molecules. A particularly tantalizing candidate for such an E3 is the aforementioned MYCBP2, which was found to discharge ubiquitin to glycerol as well as to Thr21.

Erasing ester and thioester ubiquitin linkages

PTM ‘codes’ depend not only on writers but also on ‘eraser’ DUBs that remove modifications30. Previously, it was not readily possible to systematically assay DUB activity towards different bond types because the technology was not robust enough to generate comparable probes with the C terminus of ubiquitin linked to various amino acids31. This challenge was overcome by taking advantage of the ability of the RCR domain of MYCBP2 to forge ester bonds32. By use of a chemo-enzymatic approach to prepare a suite of reagents with ubiquitin linked to a Thr hydroxyl, Ser hydroxyl or Cys thiol, DUB activity was detected by mass spectrometry, or by a change in fluorescence polarization from the starting ubiquitin-bound 5-carboxytetramethylrhodamine substrates upon liberation of fluorescent 5-carboxytetramethylrhodamine (much like the classic DUB substrate ubiquitin–7-amido-4-methylcoumarin)33, or by SDS–PAGE. With these reagents, 53 recombinant DUBs were tested for removing ubiquitin from the various amino acid side chains. The initial screen showed that DUBs in the ubiquitin-specific protease (USP) and ubiquitin C-terminal hydrolase (UCH) families are active towards both ubiquitin–Lys and ubiquitin–Thr substrates; many in the ovarian tumour (OTU) family are specific for ubiquitin–Lys substrates, with the notable exceptions of a viral DUB and TRAF-binding domain-containing protein (TRABID). These latter two DUBs, along with those in the so-called Machado–Joseph domain-containing protease (MJD) family, were superior at removing the ubiquitin from a Thr than from a Lys32. Follow-up studies showed that a subset of DUBs with esterase activity could also remove ubiquitin from a Cys in the context of an unlabelled glutathione molecule. When tested towards peptide substrates, the MJD DUBs showed maximal activity in removing ubiquitin from the primary hydroxyl in a Ser side chain32. 
These intrinsic specificity differences raise the possibility that DUBs may preferentially remove ubiquitin from hydroxyls or thiols in different contexts, and possibly not only from proteins, but also from lipids or sugars. This may be particularly true for the MJD DUBs, as this entire family showed strong de-esterification activity that presumably is directed towards specific substrates in vivo.
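The outcome of this screen can be condensed into a qualitative lookup table. The Python sketch below records only the preferences stated in the text; the labels are not measured rates, and the names `DUB_SCREEN_SUMMARY` and `acts_on_thr_esters` are hypothetical.

```python
# Illustrative summary of the DUB screen described above.
# Values are qualitative labels taken from the text, not kinetic data.
DUB_SCREEN_SUMMARY = {
    "USP family": "Lys and Thr",
    "UCH family": "Lys and Thr",
    "OTU family (many members)": "Lys-specific",
    "TRABID": "Thr preferred over Lys",
    "viral DUB": "Thr preferred over Lys",
    "MJD family": "Thr preferred over Lys",
}


def acts_on_thr_esters(summary=DUB_SCREEN_SUMMARY):
    """Return the entries reported to remove ubiquitin from a Thr hydroxyl."""
    return sorted(name for name, preference in summary.items() if "Thr" in preference)
```

In this encoding, `acts_on_thr_esters()` returns every entry except the Lys-specific OTU members.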

Pathogen-induced ubiquitylation

Pathogens have developed molecular strategies to subvert and/or co-opt ubiquitin signalling for their own purposes (that is, to create a niche permissive for their intracellular replication). For this, pathogens secrete virulence factors, also known as pathogenic effectors, that function as DUBs, E3 ubiquitin ligases (such as the novel E3 ligase (NEL) family or the IpaH family) or enzymes that modify ubiquitin and UBLs (such as ISG15, SUMO and NEDD8) (reviewed in refs.34,35). These pathogen-mediated modifications either block normal ubiquitin functions in cells or create unique ubiquitin conjugations to various substrates. Both scenarios alter and expand the ubiquitin code. Bacteria and viruses apply similar principles to affect host ubiquitin and UBL function, but the spectrum of chemical entities used by specific bacteria or viruses for this purpose is surprisingly diverse and not always found in the mammalian proteome.

Phosphoribose-linked Ser ubiquitylation

The effector arsenal of Legionella pneumophila, which causes Legionnaires’ disease, masters a unique form of ubiquitylation that ignores several principles of endogenous, canonical ubiquitylation. First, ubiquitin is conjugated not via the highly conserved C-terminal Gly76 but via Arg42, which is unprecedented; second, conjugation is via linkage to Ser residues in the substrate instead of Lys residues; third, conjugation proceeds without ATP and E1, E2 and E3 enzymes, and instead uses only one bacterial enzyme that catalyses the entire ubiquitylation reaction36. This chemically and mechanistically unique type of ubiquitylation involves a phosphoribosyl bridge between the substrate and ubiquitin and is mediated by the members of the SidE family (SdeA, SdeB, SdeC and SidE), which act as all-in-one enzymes replacing E1, E2 and E3 activities, thus being entirely independent of the host ubiquitylation system36. Instead, SidE proteins use NAD+ as a cofactor via their mono-ADP-ribosyltransferase (mART) domain and a phosphodiesterase (PDE) domain37 for an atypical two-step hydrolysis reaction38,39 (Fig. 3a). Neither phosphoribosyl-ubiquitylated proteins nor the reaction intermediate ADPR–ubiquitin can be regulated by the host’s DUBs, nor can they be used by the host’s ubiquitylation machinery. Phosphoribosylated ubiquitinome analyses show that SidE family enzymes target more than 180 structurally and functionally diverse substrates40, including Golgi apparatus proteins41, mitochondrial proteins, cytoskeletal proteins, components of the autophagy machinery and endoplasmic reticulum-associated36,40,42 and lysosomal proteins43. Thus, L. pneumophila-mediated phosphoribosyl-ubiquitylation has wide-ranging and pleiotropic effects on host cells.
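The two-step SidE reaction can be laid out as a short pathway description. The Python sketch below is schematic only: it lists the domain, reactants and product of each step as described in the text and in Fig. 3a, and the names `Step`, `SIDE_PATHWAY` and `canonical_factors_used` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    domain: str       # SidE domain that catalyses the step
    reactants: tuple  # inputs consumed in the step
    product: str      # ubiquitin-containing product


# Illustrative sketch of phosphoribosyl-linked Ser ubiquitylation by SidE enzymes.
SIDE_PATHWAY = (
    Step("mART", ("NAD+", "ubiquitin (Arg42)"), "ADPR-ubiquitin"),
    Step("PDE", ("ADPR-ubiquitin", "substrate Ser"), "phosphoribosyl-ubiquitylated substrate"),
)


def canonical_factors_used(pathway=SIDE_PATHWAY):
    """Return any canonical ubiquitylation inputs (ATP, E1, E2, E3) among the reactants."""
    canonical = {"ATP", "E1", "E2", "E3"}
    used = {reactant for step in pathway for reactant in step.reactants}
    return sorted(canonical & used)
```

Here `canonical_factors_used()` returns an empty list, which mirrors the statement that the reaction proceeds without ATP or host E1, E2 and E3 enzymes.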

Fig. 3: Pathogen-induced ubiquitin modifications.

a | Ser ubiquitylation catalysed by the Legionella pneumophila effector SdeA. Using its mono-ADP-ribosyltransferase (mART) activity, SdeA first generates ADP-ribose (ADPR)–ubiquitin by transferring ADPR from NAD+ to Arg42 of ubiquitin. Subsequently, the phosphodiesterase (PDE) domain conjugates ADPR–ubiquitin to a Ser residue on substrates, thereby generating a ubiquitin-phosphoribosylated (PR) protein. The substrates modified in this way lead to pleiotropic changes in the host cell, including enhanced endoplasmic reticulum (ER) fragmentation and recruitment of ER membrane to L. pneumophila-containing vesicles. b | The carboxy-terminal domain (CTD) of cullin is modified by neural precursor cell-expressed developmentally downregulated protein 8 (NEDD8; N8), which leads to conformational changes that allow interactions between N8 and ubiquitin-conjugating enzyme E2 D (UBE2D) family members, the RING domain and ubiquitin. Deamidation of N8 at Gln40 (resulting in conversion to Glu40) by the enteropathogenic Escherichia coli (EPEC) virulence factor cycle-inhibiting factor (Cif) causes disruption of these mutual allosteric regulations of cullin–RING ligases (CRLs) and N8 that are required for CRL enzymatic activity. CRL substrates accumulate and cause cell cycle arrest. c | In normal conditions, ubiquitin is transferred to the catalytic Cys87 of UBE2N through a thioester bond involving Gly76 of ubiquitin. In this state, E2 is active and can participate in ubiquitylation reactions that trigger NF-κB signalling. The L. pneumophila effector MavC acts as a transglutaminase that links Ub via Gln40 to Lys92 or Lys94 of UBE2N, thereby inactivating E2 and preventing it from stimulating NF-κB signalling. The highly homologous effector MvcA reverses the reaction by hydrolysing the isopeptide bond created by MavC. d | Ubiquitylation of lipids by E3 ubiquitin ligase RING finger protein 213 (RNF213). 
Following escape from a damaged Salmonella enterica-containing vacuole (SCV), RNF213 is the first E3 ligase to attack cytosolic S. enterica, targeting a lipopolysaccharide composed of lipid A, core sugars and O antigen, in the outer bacterial membrane. Using an atypical zinc-finger domain, RNF213 attaches ubiquitin to a hydroxyl group of lipid A very close to the outer bacterial membrane (1). This first ubiquitin molecule then serves as the docking site for linear ubiquitin assembly complex (LUBAC), which builds linear ubiquitin chains (linked through Met1) on pre-existing ubiquitin molecules (2). The assembled ubiquitin coat is recognized by autophagy receptors such as p62, NDP52 and the optineurin effector protein NF-κB essential modulator (NEMO), which activate xenophagy (p62 and NDP52) and NF-κB-dependent immune signalling (optineurin), respectively (3). NTD, amino-terminal domain; Ub, ubiquitin. In part c, blue Ub, Ub linked via Gly76; red Ub, Ub linked via Gln40. In part d, blue Ub, ubiquitin molecule attached by LUBAC; red Ub, ubiquitin molecule attached by RNF213.

Whereas phosphoribosyl-ubiquitylation cannot be erased by host DUBs, L. pneumophila secretes factors that regulate the levels of phosphoribosyl-ubiquitylation: deubiquitylase for phosphoribosyl-ubiquitylation A (DupA) and DupB (also known as LaiE and LaiF, respectively) function as phosphoribosyl-ubiquitin-specific DUBs40,44. Surprisingly, the deubiquitylating activity of DupA and DupB is mediated by a PDE domain that structurally corresponds to the PDE domain of SidE enzymes but catalyses the opposite ligation reaction. This is possible due to the different substrates offered to DupA and DupB and their kinetic parameters. Whereas the PDE domains of SidE enzymes do not bind to phosphoribosyl-ubiquitylated substrates and have moderate binding affinity for ubiquitin, DupA and DupB show strong and selective affinity for ubiquitin and phosphoribosyl-ubiquitylated peptides. Indeed, point mutations weakening the affinities of DupA and DupB PDE domains for phosphoribosyl-ubiquitylated peptides are sufficient to convert them into SidE-type ubiquitin ligases40. This indicates that through subtle changes, pathogens can tweak their arsenal of effectors significantly.

DupA and DupB seem to spring into action at later stages of infection, possibly because L. pneumophila might actively seek to regulate the extent of phosphoribosyl-ubiquitylation to avoid the toxic effects of sustained high phosphoribosyl-ubiquitylation levels, which have been shown in both yeast and mammalian cells45. This hypothesis is also supported by the function of the L. pneumophila effector SidJ, a glutamylase that inactivates the catalytic site of the mART domains of SidE enzymes and that reduces the intracellular replication of L. pneumophila when deleted45,46,47. Interestingly, SidJ activity depends on its interaction with calmodulin, a eukaryote-specific protein. In this way, SidJ is activated only in the host cells, not within bacteria, and its binding to calmodulin is additionally regulated by intracellular Ca2+ levels in host cells47.

Deamidation of ubiquitin and NEDD8

Another way pathogens target the surface of ubiquitin, or the UBL and closest homologue of ubiquitin NEDD8, is by glutamine deamidation48. This virulence strategy is successfully used by Burkholderia pseudomallei, enteropathogenic Escherichia coli and L. pneumophila, which secrete effectors that possess a papain-like hydrolytic fold able to deamidate the conserved Gln40, resulting in conversion to Glu40. Cycle-inhibiting factor (Cif), the effector secreted from enteropathogenic E. coli, specifically targets NEDD8 (ref.49). As discussed later, NEDD8 modification of cullin–RING ligases (CRLs) activates their E3 activities. However, Cif-mediated deamidation inhibits this NEDD8-dependent activation of CRL ubiquitylation49,50,51. Mechanistically, NEDD8 deamidation does not impair the conjugation of NEDD8 onto cullin proteins, but precludes the non-covalent interactions between NEDD8 and cullins that trigger conformational changes and binding to ubiquitin-carrying enzymes52,53,54,55 (Fig. 3b). As a result, ubiquitylation and degradation of multiple CRL substrates, such as the cyclin-dependent kinase inhibitors p21 and p27, are abolished, disrupting the host’s cell cycle and arresting cell growth either at the G2–M transition or at the G1–S transition48. Similarly to Cif, Cif homologue in B. pseudomallei (CHBP) blocks cell cycle progression of infected cells by deamidation of NEDD8, but it also targets Gln40 of ubiquitin with similar efficiency. Deamidated ubiquitin cannot be transferred from E2 to the acceptor Lys49, and thus substrates such as NF-κB inhibitor-α (IκBα) escape proteasomal degradation.

An interesting example of a ubiquitin deamidation cycle is used by L. pneumophila, which secretes the effectors MavC and MvcA, which robustly deamidate Gln40 of ubiquitin but not of NEDD8 (ref.56). MavC and MvcA are structurally similar to Cif and CHBP but have an additional binding domain that targets host proteins. MavC interacts with the E2 conjugating enzyme UBE2N (also known as UBC13), which is involved in the formation of non-degradative Lys63 polyubiquitin chains. Strikingly, MavC was found not only to deamidate Gln40 but also to catalyse a transglutamination reaction that covalently links Gln40 of ubiquitin to Lys92 or Lys94 of UBE2N in the absence of E1 enzyme or ATP57,58,59,60. The resulting atypical UBE2N–ubiquitin complex does not possess E2 activity and accumulates in infected cells61 (Fig. 3c). Intriguingly, MvcA, which is 50% identical to MavC, was found to act as a DUB that specifically hydrolyses the isopeptide bond between ubiquitin and UBE2N62. In doing so at later stages of infection, MvcA restores UBE2N activity. As described earlier, highly homologous catalytic domains that mediate chemically opposite reactions appear to be a recurring concept in bacterial effectors.

Ubiquitylation of lipopolysaccharides

Work on antipathogenic strategies of mammalian cells has led to the discovery of a novel type of writer in the host proteome. It has long been assumed that so far unidentified bacterial membrane proteins serve as initial targets for ubiquitin conjugation, which then activates antibacterial responses. However, a recent study63 showed that the lipid A moiety of bacterial lipopolysaccharide (LPS) serves as the first site of ubiquitylation. LPS is the first example of a lipid modified by ubiquitin. This very unconventional modification is catalysed by the host E3 ubiquitin ligase RING finger protein 213 (RNF213) and is obligatory for the recruitment of the LUBAC E3 ligase, which adds Met1-linked ubiquitin chains onto the pre-existing ubiquitin moieties. These, in turn, serve as docking sites for autophagy receptors that function as readers of the attached ubiquitin modification and trigger the host’s autophagic response, as shown in Salmonella enterica-infected cells64 (Fig. 3d).

RNF213 is a susceptibility factor for Moyamoya disease, a severe cerebrovascular disorder65,66,67, and seems to be involved in lipid droplet formation, lipotoxicity, hypoxia and NF-κB signalling65. Very recently, RNF213 was found to counteract infections by various microorganisms, including Listeria monocytogenes, herpes simplex virus 1, human respiratory syncytial virus and coxsackievirus B3. This function of RNF213 is induced by type I interferons and involves its isgylation and oligomerization on lipid droplets, where it acts as a specific sensor for isgylated proteins68. Interestingly, RNF213 contains a classic RING domain sequence that typically cooperates with an E2 conjugating enzyme to transfer ubiquitin to Lys residues of substrate proteins. However, ubiquitylation of LPS diverges from this well-established mechanism and does not require the RNF213 RING domain. Instead, it depends on other domains, including a dynein-like AAA+ domain69, which could be involved in accessing hidden substrates or may promote the recruitment of RNF213 to bacteria. Also, although the precise mechanism of ubiquitin attachment remains unclear, initial studies indicate that a non-canonical zinc-finger domain (containing the strictly conserved Cys4462) might function as the E3 active site for LPS ubiquitylation70. Moreover, E3 activity seems to be allosterically regulated by ATP70. How RNF213 recognizes LPS or which functional groups of lipid A (potentially its hydroxyl groups, its phosphate groups or both) are modified is currently unknown. It is very likely that in addition to RNF213 and bacterial LPS, other examples of lipid ubiquitylation will be discovered in the future.

Viral impact on the ubiquitin code

In addition to ubiquitylation and neddylation, viruses specifically manipulate the UBL ISG15, which is encoded by one of the most highly and rapidly host-induced genes in response to viral infection71,72. ISG15 becomes attached to both host and viral proteins, with the net effect of inhibiting viral stability, transport and assembly. Moreover, isgylation stabilizes key host antiviral proteins, thereby enhancing immune responses, whereas at later stages of viral infection, isgylation appears to attenuate the cellular immune response by inhibiting or destabilizing certain host proteins73,74. Also, unconjugated ISG15, which is not attached to a target protein, can affect viral replication and host responses through non-covalent protein interactions75 and its function as a cytokine76,77, respectively (Box 2).

To escape the ubiquitin-mediated and ISG15-mediated antiviral responses, viruses have acquired different strategies, including the reduction of ISG15 transcription by sequestering signal transducer and activator of transcription 2 (STAT2) by human cytomegalovirus, the inhibition of ISG15 conjugation by influenza B virus78,79, the sequestration of isgylated host proteins that would otherwise impede viral RNA synthesis also by influenza B virus80 or the secretion of enzymes that deconjugate ISG15 from target proteins81,82,83,84,85. The last of these approaches is used by coronaviruses, porcine reproductive and respiratory syndrome virus and equine arteritis virus, which secrete OTU-containing proteases. The leader protease (Lb(pro)) from foot-and-mouth disease virus is even able to irreversibly inactivate ISG15 by cleaving the peptide bond before the C-terminal diGly motif, leaving the Gly–Gly dipeptide attached to the substrate85.

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which is the cause of the ongoing COVID-19 pandemic, and its close relatives Middle East respiratory syndrome coronavirus and SARS-CoV are able to deconjugate as many as three different UBLs: ubiquitin, NEDD8 and ISG15. Intensive research during the COVID-19 pandemic revealed that the papain-like proteases (PL(pro)) of SARS-CoV and SARS-CoV-2 have remarkably divergent specificities in cleaving ubiquitin, NEDD8 and ISG15 (refs.83,86,87,88,89,90). Although the enzymes are almost identical, each pathogen has evolved subtle changes that, quite unpredictably, affect their specificity, thus altering the UBL code and disabling the host’s immune system in different ways. SARS-CoV PL(pro) preferentially cleaves Lys48-ubiquitylated substrates, whereas SARS-CoV-2 PL(pro) prefers ISG15-conjugated substrates over ubiquitylated ones. These differences may contribute to the cellular response to viral attack and might be relevant for the development of treatment strategies and anti-COVID-19 drug design. Several SARS-CoV-2 variants of concern that were circulating in 2021 carried a mutation that affected a region of the enzyme that senses ubiquitin and ISG15 and enhances Lys48–ubiquitin chain cleavage compared with the original variant91. Further, de-isgylation by PL(pro) generates free ISG15, enhancing the secretion and extracellular signalling function of ISG15. This in turn promotes pro-inflammatory cytokine production from cells of the immune system that contributes to the devastating cytokine storm reported in some patients with severe COVID-19 (refs.77,92,93). Together these findings underline how flexibly pathogens can transform the host’s ubiquitin code and the need to understand the cellular functions encoded by pathogen-rewired ubiquitin codes.

Advances in understanding readers

The ubiquitin code is ultimately deciphered by ‘readers’, which bind ubiquitin-modified proteins to trigger a downstream output. Outputs include degrading the ubiquitylated protein, modifying the ubiquitylated protein with another PTM and stimulating the enzyme activity of the reader. Early studies focused on how isolated ubiquitin-binding domains recognize either monoubiquitin or particular polyubiquitin chains94,95,96,97. However, it soon became clear that there is great diversity in how readers recognize modified targets98,99,100,101. Some readers primarily bind ubiquitin or a ubiquitin chain. Such readers include the degradative machinery — the 26S proteasome — and Cdc48 in yeast (also known as p97 in humans) that act on thousands of unrelated proteins whose common feature is modification by ubiquitin. Other readers recognize only a specific protein ubiquitylated on a particular Lys100,102,103,104,105,106. In yet other cases, recognition depends on the modified protein undergoing ubiquitin (or UBL)-dependent conformational changes107,108,109. All these types of interaction with a ubiquitin-modified protein have been visualized by recent structural studies.

Readers that regulate numerous different ubiquitylated proteins

Cryogenic electron microscopy (cryo-EM) structures and biophysical studies have now shown how the 26S proteasome and Cdc48 recognize, unfold and, in the case of the proteasome, also degrade ubiquitylated proteins110,111,112,113. The 26S proteasome consists of three subcomplexes, arranged in layers (reviewed in refs.114,115,116). The outermost layer is the 19S regulatory particle, which is a multiprotein complex that contains many regulatory proteins and recruits even more regulators, and it also contains an AAA-ATPase motor91,112,113,117. The 19S regulatory particle is responsible for recruiting ubiquitylated proteins via at least three distinct intrinsic ubiquitin receptor subunits, each of which displays one or more distinct ubiquitin-binding motifs114,115,116. These proteasome-intrinsic subunits can bind directly to ubiquitylated substrates. Alternatively, ubiquitylated substrates can be delivered to the proteasome by so-called shuttle factors (or extrinsic receptors), which contain a ubiquitin-associated domain that binds ubiquitylated substrates, and a flexibly tethered ubiquitin-like domain that binds to the proteasome-intrinsic receptors. The 19S regulatory particle also contains three DUBs with distinct activities. Meanwhile, AAA ATPases are hexameric doughnut-shaped assemblies that transduce hydrolysis of ATP in the different subunits into mechanical motion. In the case of the 26S proteasome, this involves grasping the substrate, pulling it through the AAA ATPase pore to unfold it and ultimately feeding the unfolded substrate into the third subcomplex, the barrel-shaped 20S proteolytic chamber.

A major challenge when one is trying to visualize the 26S proteasome in action is that ubiquitylated substrates are processed too rapidly to be observed by existing structural methods. Several research groups have overcome this challenge by designing optimal ubiquitylated substrates and inhibiting specific steps in the multifaceted activities of the proteasome in a variety of ways110,113. One study generated proteasome complexes with substrate polypeptide threaded into the proteolytic chamber by the motor until the substrate could not be further pulled because it was stuck by its linked ubiquitin firmly bound to the inactivated DUB Rpn11 (ref.110). Subjecting these substrate-engaged 26S proteasome complexes to cryo-EM showed the subcomplexes concentrically aligned, and the substrate traversing a direct path from its linkage to the Rpn11-bound ubiquitin, through the centre of the AAA ATPase and into the proteolytic chamber that catalyses degradation. Structures were obtained for multiple conformations, which provided a model for how the receptors in the 19S regulatory particle bind ubiquitin and how individual subunits in the heterohexameric AAA ATPase grab onto substrates and move in a coordinated manner to pull them through the centre of the motor and push them into the proteolytic chamber.

But what is the role of ubiquitin? Although the substrates used for structural studies had long polyubiquitin chains, only the proximal ubiquitin directly linked to the substrate could be visualized. Thus, it seems likely that ubiquitins within the chain dynamically interact with multiple receptors in the regulatory particle. Furthermore, the structures, together with findings from biochemical and single-molecule fluorescence resonance energy transfer studies118, revealed further intricate regulation by ubiquitin that includes impacting the rates of switching between functionally distinct 26S proteasome conformations. Ubiquitin mainly seems to affect substrate engagement and capture, in part by allosteric modulation that ultimately facilitates the ubiquitylated substrate entering and being grasped by the AAA-ATPase motor118 (Fig. 4a).

Fig. 4: Different readers of the ubiquitin code — three examples.

a | Recognition of multiple diverse ubiquitylated substrates by one reader, the 26S proteasome. Lys48-linked polyubiquitin chain-tagged proteins are recognized by ubiquitin receptors (RPN1, RPN10 and RPN13) that are part of the base subcomplex of the proteasome. Interactions between ubiquitin and the ubiquitin-binding domains of the ubiquitin receptors regulate the proteasome conformations that activate unfolding, deubiquitylation and degradation of the substrate. b | Multiple downstream machineries recognize the ubiquitylated substrate, histone H2B. Ubiquitin attached to Lys120 of H2B can flexibly adopt different positions to bind various chromatin-modifying methyltransferases. The positioning of ubiquitin in combination with additional binding sites of the methyltransferase on the nucleosome activates methylation of specific Lys residues. DOT1L and SET2 target two different Lys residues of histone H3, Lys 79 (H3K79) and H3K36, respectively, in cis, whereas MLL1-including complex of proteins associated with Set1 (COMPASS) methylates H3K4 in trans. c | Different modes through which various ubiquitin-carrying enzymes recognize neddylated cullin–RING ligases (CRLs). Neural precursor cell-expressed developmentally downregulated protein 8 (NEDD8; N8) can allosterically regulate the conformation of the cullin it modifies, and vice versa, the cullin affects the positioning of N8 within the complex so that different readers, even homologous readers (ARIH1 and ARIH2) are accommodated distinctly on different CRLs. CP, core particle; CTD, carboxy-terminal domain; H2BK120ub, Lys120-ubiquitinylated histone H2B; H4, histone H4; NTD, amino-terminal domain; Ub, ubiquitin; UBE2D, ubiquitin-conjugating enzyme E2 D.

Before degradation, many ubiquitylated proteins (for example, those embedded in membranes or complexes) must first be extracted by the unfoldase Cdc48 (p97 in humans). Cdc48 is a homohexameric AAA ATPase that associates with several interchangeable multidomain cofactors, which in turn bind substrates and regulators104,119,120,121,122,123. Cdc48 unfolds cofactor-recruited substrates by ratcheting them through its central AAA-ATPase pore, much like the unfolding process mediated by the distinct 26S proteasomal AAA ATPases124,125,126.

The Npl4–Ufd1 cofactor complex is evolutionarily conserved. Recent structures of complexes of yeast proteins showed how Npl4–Ufd1 recruits substrates marked with Lys48-linked polyubiquitin chains127,128. Npl4–Ufd1 binds three ubiquitins in an arrangement attainable only by Lys48 linkages. An additional ubiquitin-binding domain is poised to bind yet another distally linked ubiquitin, although this last ubiquitin was not visible by cryo-EM. One possibility is that this fourth ubiquitin-binding site might interchangeably capture various ubiquitins at the distal end of a chain, resulting in conformational heterogeneity that would preclude observation by cryo-EM. The ubiquitin chain linked to the substrate was not observed by cryo-EM of human p97–NPL4–UFD1, suggesting this is also a conformationally dynamic complex126.

A landmark cryo-EM study visualized budding yeast Cdc48 in the act of unfolding a ubiquitylated substrate. The structure was made possible by trapping the otherwise dynamic process through use of a model substrate and inhibiting the AAA-ATPase motor by mutation or use of various nucleotide analogues128. Remarkably, a relatively proximal ubiquitin within the Lys48 chain is itself unfolded124,128. This is extraordinary considering that ubiquitin is often thought of as an exceptionally stable protein — it can be purified from a cell lysate owing to its ability to refold properly even after boiling the lysate129 — but a groove in Npl4 binds unfolded ubiquitin. The structures also showed part of the unfolded ubiquitin extended and threaded into the central pore of the Cdc48 barrel.

Together, the structures of these two molecular machines provide the first glimpses into how they distinctly recognize ubiquitin. Ubiquitin alone mediates interactions, which explains how ubiquitylation is sufficient to direct unfolding and degradation of an extraordinary array of diverse proteins. Although a substrate-linked monoubiquitin is sufficient to effectively anchor the substrate to the 26S proteasome for its translocation into the catalytic chamber for degradation130, both the 26S proteasome and Cdc48 contain multiple ubiquitin-binding sites that contribute to regulation. A recent study showed human p97–NPL4–UFD1 can associate with other p97-binding cofactors that also bind ubiquitin to reduce the threshold number of ubiquitin molecules required for substrate unfolding131. Moreover, different types of ubiquitin modification — especially branched ubiquitin chains such as those linked through both Lys11 and Lys48 — are particularly efficient at promoting degradation132. We anticipate future structures will visualize how different types of ubiquitin modification are read by the proteasome to influence the degradation process.

Ubiquitin-modified histone H2B is recognized by multiple readers

Whereas the 26S proteasome and Npl4–Ufd1-bound Cdc48 (and p97) recognize numerous ubiquitylated targets, in other cases ubiquitin directs some site-specifically modified targets to many distinct partners. One such example is the ubiquitylation of human histone H2B Lys120 (Lys123 in yeast)133. One function of ubiquitin linkage to H2B Lys120 is to trigger further histone modifications. Recent cryo-EM structures showed how the human H2B Lys120–ubiquitin-modified nucleosome binds the MLL1 complex, which methylates histone H3 Lys4 (ref.134) (and the related yeast H2B Lys123–ubiquitin-modified nucleosome bound to complex of proteins associated with Set1 (COMPASS)135,136), and Set2 (from Chaetomium thermophilum), which methylates H3 Lys36 (ref.137). Structural studies also showed how the human H2B Lys120–ubiquitin-modified nucleosome binds and activates the enzyme DOT1L, which methylates histone H3 Lys79 (refs.138,139,140,141). Interestingly, the different H2B Lys120-linked ubiquitin-bound methyltransferases target different molecules of histone H3 within the histone octamer (each histone octamer contains two molecules of histone H2B and two molecules of histone H3). The active site of DOT1L engages Lys79 on the histone H3 located in cis relative to H2B’s ubiquitylated Lys120, whereas the active sites of MLL1 and COMPASS face the opposite direction, to methylate Lys4 from the so-called trans molecule of histone H3 (Fig. 4b).

Ubiquitin linked to Lys120 of histone H2B is flexibly tethered to the nucleosome. As such, the globular domain of ubiquitin adopts different relative positions to bind distinct partners, which also uniquely contact nucleosomal DNA and the histone octamer. The distinct structure of each of these methyltransferases (or methyltransferase complexes) — and the collection of interactions with ubiquitin and the nucleosome — uniquely positions their active sites. It seems that multivalent interactions, including with ubiquitin, generally lower the energy barrier for attaining their active conformations. However, H2B Lys120 ubiquitylation impacts the function of each distinct partner in a different way. For MLL1, ubiquitin binding favours the active orientation and may promote formation of the complex134. Meanwhile, ubiquitin binding stabilizes the conformation of COMPASS that is competent for nucleosome binding135,136, and restricts the conformational search of DOT1L, facilitating access to H3 Lys79 (refs.138,139,140,141).

Allosteric regulation of ubiquitin-modified proteins

Ubiquitin and UBLs can modulate their target proteins by stimulating conformational changes. Early studies emphasized the inhibitory functions of such interactions. Many monoubiquitylated proteins — or proteins modified by a single UBL (for example, SUMO) — also contain ubiquitin-binding or UBL-binding domains. Intramolecular interactions between a protein’s ubiquitin-binding domain and its linked ubiquitin (or SUMO-binding domain and a linked SUMO) allosterically blocked association with other binding partners142,143,144.

Could intramolecular interactions between ubiquitin or a UBL and its modified protein also stimulate binding to new partners? Such allosteric interactions contribute to the functions of the UBL NEDD8. NEDD8 is approximately 60% identical to ubiquitin but has its own targets and readers. The best understood function of NEDD8 is to modify a specific Lys in the C-terminal domains of cullin proteins, which are subunits of the large family of CRL E3s94,145,146. As mentioned earlier, neddylation promotes CRL binding to ubiquitin-carrying enzymes, which are the direct mediators of ubiquitin transfer to CRL-bound substrates. Neddylated CRLs partner with different types of ubiquitin-carrying enzyme, including some E2 enzymes and members of the Ariadne RBR E3 ubiquitin protein ligase (ARIH) subfamily of RBR E3 enzymes147,148. In addition to being ubiquitin writers, such E2s and ARIH-family E3s are also readers of NEDD8 linked to a cullin-family protein.

Three core principles were revealed by structures of NEDD8-modified cullin 1 (CUL1)-containing CRLs bound to ubiquitin-conjugating enzyme E2 D (UBE2D) family members or E3 ARIH1 (refs.53,54) (Fig. 4c). First, NEDD8 engages in specific non-covalent interactions with the CUL1 domain to which it is linked. Notably, it is these interactions that are impaired by Cif-mediated deamidation of Gln40 described earlier. Second, these non-covalent interactions allosterically stabilize particular conformations of NEDD8, which, like ubiquitin, on its own normally interconverts between alternative conformations149. But when bound to CUL1, NEDD8 adopts only a so-called loop-out conformation, with its Ile44 hydrophobic patch exposed. Notably, UBE2D family members and ARIH1 specifically recognize the loop-out conformation of NEDD8. Third, in the context of the reader-bound complexes, NEDD8 also facilitates conformational changes of the CRL. The connection between the neddylated domain and the rest of CUL1 is a flexible tether, which allows distinct positioning relative to the rest of the CRL when bound to different readers. Taken together, these findings indicate that NEDD8 allosterically modulates the CRL, and that the cullin in turn allosterically modulates NEDD8 to promote its binding directly to UBE2D or ARIH1.

Unexpected cullin-specific allosteric regulation by NEDD8 was revealed by comparison of the structures of NEDD8-modified CUL1-containing CRLs bound to ARIH1 with the structure of a NEDD8-modified CUL5-containing CRL bound to the reader ARIH2 (ref.55). Surprisingly, ARIH2 neither contacts nor approaches NEDD8 linked to CUL5. Instead, NEDD8 interacts with multiple CUL5 domains to elicit conformational changes, which generate new surfaces that bind ARIH2. These data demonstrate that homologously modified proteins can bind homologous readers in different ways. Moreover, this latter structure shows that a UBL can allosterically modulate the conformation of a target protein to indirectly promote binding to a reader.

Concluding remarks and future challenges

Upon discovery of ubiquitin-directed protein degradation in the 1980s, it was impossible to envision the breadth of ubiquitin modifications or regulation depending on ubiquitin. Forty years later, we know that beyond protein degradation, ubiquitin directly or indirectly affects virtually every cellular process with surprising specificity. This is possible because the ubiquitylation machinery can generate a versatile code from the small universal ubiquitin molecule that acquires unique abilities through differently linked and branched ubiquitin chains, multiple acceptors (also non-Lys targets) and a variety of PTMs. Reading and interpreting this expanded ubiquitin code is readily accomplished by a growing body of highly specialized ubiquitin-binding and UBL-binding receptors in cells. Decoding the biological importance of such elaborate networks could help researchers exploit the ubiquitin code for tools and therapeutic treatments (Box 3).

Many of the recently discovered types of ubiquitylation involve relatively labile linkages between the C terminus of ubiquitin — or even a ubiquitin side chain — and hydroxyl or thiol moieties on the substrate. Unlike in the case of the ubiquitin modification of Lys residues, tools are lacking for identifying such modifications in a high-throughput manner. Rather, identification of ester-linked, thioester-linked and ADPR-linked ubiquitin has largely relied on studies of particular pathways, raising the question of how many other types of ubiquitin modification remain elusive. A future challenge to be overcome will be the development of tools and methods for detecting and modulating these, and potentially other, ubiquitin modifications. Going forward, we anticipate that artificial intelligence and machine learning will gain traction in assigning E3 ligases to specific ubiquitin modifications and vice versa, and in predicting structural mechanisms of ubiquitin and UBL recognition by their reader machineries.

Importantly, the accumulated knowledge described in this Review provides a new basis for the development of innovative ubiquitin-based therapeutics, including proteolysis-targeting chimeras and molecular glues. Their efficacy and specificity will depend on tissue selectivity of action of ubiquitin-modifying enzymes and regulation of conjugation–deconjugation dynamics in cells. The expanding lexicon of ubiquitin provides vast opportunities for scientists from different fields to join and make unexpected and therapeutically important discoveries in the future.