Rumi O-glucosylates the EGF repeats of a growing list of proteins essential in metazoan development, including Notch. Rumi is essential for Notch signaling, and Rumi dysregulation is linked to several human diseases. Despite Rumi's critical roles, it is unknown how Rumi glucosylates a serine of many but not all EGF repeats. Here we report crystal structures of Drosophila Rumi as binary and ternary complexes with a folded EGF repeat and/or donor substrates. These structures provide insights into the catalytic mechanism and show that Rumi recognizes structural signatures of the EGF motif, the U-shaped consensus sequence, C-X-S-X-(P/A)-C and a conserved hydrophobic region. We found that five Rumi mutations identified in cancers and Dowling–Degos disease are clustered around the enzyme active site and adversely affect its activity. Our study suggests that loss of Rumi activity may underlie these diseases, and the mechanistic insights may facilitate the development of modulators of Notch signaling.
Notch signaling has essential roles in the development of all metazoans, and dysregulation in the Notch pathway leads to a variety of diseases in humans, including cancers and developmental disorders1,2. Differential O-linked glycosylation of Notch extracellular domain (NECD) can regulate Notch signaling. Conserved O-glucose modification of a subset of NECD EGF repeats by Rumi is essential for Notch signaling3,4,5, whereas elongation of O-linked glucose by sequential action of GXYLT1/2 and XXYLT1 negatively regulates Notch activation6. Rumi is required for embryonic development, in organisms from flies to mammals, in a Notch-dependent manner: Rumi aberration in flies leads to a temperature-sensitive loss of Notch signaling3, and mouse embryos lacking Rumi show early embryonic lethality and severe cardiovascular defects4. In addition to the NECD, the best-characterized substrate of Rumi, several new substrates have recently been identified, including Eys in photoreceptor development7, CRUMBS2 in mammalian gastrulation8 and JAG1 in the Notch pathway9, all of which feature tandem EGF repeats. Moreover, Rumi dysregulation has been linked to several diseases in humans, including Dowling–Degos disease10 (DDD) and Alagille syndrome9.
Rumi is a member of the GT90 family of glycosyltransferases11 (GTs) and adds an O-linked glucose to the serine in the consensus C-X-S-X-(P/A)-C sequence found in EGF repeats. No structural information is available for any member of the GT90 family. In general, there is a paucity of structures of GTs that target protein substrates, although a few studies have reported GTs in complex with short peptides of acceptor protein substrates12,13,14,15,16. More recently, structures of GTs that glycosylate small cysteine-rich protein domains such as EGF repeats and thrombospondin type 1 repeats (TSRs) have been reported. For example, POFUT1 (a member of the GT65 family) and POFUT2 (a member of the GT68 family) add O-fucose to EGF repeats and TSRs, respectively, which contain the appropriate consensus sequences17. Proper folding is required for the EGF repeats and TSRs to become efficient glycosylation substrates17. POFUT1 and POFUT2 have classic GT-B folds18,19,20. A recent crystal structure of Caenorhabditis elegans POFUT2 complexed with a folded TSR and GDP revealed the importance of not only the 3D fold of the TSR acceptor but also the interfacial water molecules in the enzyme substrate recognition20. How POFUT1 recognizes the folded EGF acceptor has not been determined experimentally19. We recently solved the crystal structure of the endoplasmic-reticulum-anchored XXYLT1, a GT-A–fold enzyme that adds a xylose to the existing O-glucose disaccharide on an EGF repeat. We found limited contacts between XXYLT1 and the EGF in the enzyme–donor–acceptor ternary complex structure21. Therefore, the molecular basis of how a folded EGF repeat is recognized and glycosylated—either fucosylated by POFUT1 or glucosylated by Rumi—remains unknown.
Here we describe the structures of Drosophila Rumi (dRumi) complexed with its substrates, a folded EGF module and a donor ligand (UDP or UDP-glucose). These structures show that Rumi recognizes conserved signatures of the EGF module, including the O-glucosylation consensus sequence, C-X-S-X-(P/A)-C, and a previously unknown conserved hydrophobic region (P55 and Y69). Two different ternary complex structures and biochemical data revealed the catalytic mechanism of Rumi, with dRumi D151 functioning as the catalytic base. We structurally and biochemically characterized Rumi alterations that were identified previously in DDD and demonstrated that they are either activity-abolishing missense mutations or likely nonfunctional nonsense/frameshift mutations. Moreover, we assessed the functional consequences of Rumi mutations identified in human cancers and found four missense mutations with severely reduced O-glucosylation activity, suggesting the deleterious impact of Rumi dysfunction in these cancers. Our studies contribute to the body of knowledge on Notch O-glucosylation and provide a framework for understanding Rumi aberrations in diseases and for the development of modulators of Notch signaling.
Overall structure of Rumi complexed with an EGF repeat
dRumi shares high sequence identity with its human (∼52%) and mouse (∼51%) counterparts (Supplementary Results, Supplementary Fig. 1a). We produced an N-terminally truncated dRumi protein (residues 21–407) in HEK293T cells, purified the protein to homogeneity and removed the recombinant purification tag (Supplementary Fig. 1b, Online Methods). Rumi O-glucosylates various EGF repeats of its regulated substrates, all of which share a similar fold of the cysteine knot that is constrained by three conserved disulfide bonds, despite their wide sequence variation22. Therefore, we used the EGF repeat (residues 46–84) of human factor IX (hFA9) as a surrogate in our structural study, as it was previously shown to be an authentic Rumi substrate in vitro23. We determined the co-crystal structure of dRumi in complex with the hFA9 EGF repeat at 1.9-Å resolution (Supplementary Table 1), using the anomalous signals of native sulfurs24. In this complex, dRumi residues 42–406 and hFA9 EGF repeat residues 48–84 were resolved. Furthermore, we soaked donor ligand UDP or UDP-glucose (UDP-Glc) into the binary complex crystals and solved the structures of the ternary complexes of dRumi–EGF–UDP at 2.15 Å and dRumi–(Glc–EGF)–UDP at 2.5 Å. We also crystallized and solved the structure of dRumi complexed with UDP at 3.2-Å resolution.
The overall structure of the dRumi–EGF binary complex is shown in Figure 1a,b. dRumi was found to contain two domains, an A-domain and a smaller B-domain, each with a Rossmann fold typical of members of the GT-B superfamily (Fig. 1c, Supplementary Fig. 1c). The two-domain architecture of dRumi was stabilized by two tandem helices; one helix was in the N-terminal region (residues 42–88), and the other helix was kinked and was located in the C-terminal region (residues 352–390). The two helices were stabilized by two disulfide bonds (C64–C75 and C73–C378). A DALI search25 identified the DNA α-glucosyltransferase (PDB ID 1YA6) as the closest structural homolog of dRumi, with a Z-score of 12.1 and an r.m.s. deviation of 4.5 Å over 216 aligned residues. Despite the similar fold, these two enzymes have very different functions26.
In the dRumi–EGF binary complex structure, the EGF bound to a cleft between the A- and B-domains of Rumi, with an interface area of 827 Å2. Therefore, approximately 28% of the 2,922-Å2 surface of the EGF was enclosed by Rumi. Surrounding the cleft were two loops that made additional contacts with the EGF: a 'thumb' region (residues 256–273) of the A-domain and an intervening region (residues 181–202) that bridged the A- and B-domains (Fig. 1a).
Rumi recognizes structural features of EGF repeats
Rumi O-glucosylates folded EGF repeats with the C-X-S-X-(P/A)-C consensus sequence1, but elucidating the underlying molecular basis for this required solution of the binary complex. We found remarkable surface complementarity between dRumi and the EGF repeat in the crystal structure (Supplementary Fig. 2). This was imposed mainly by the O-glucosylation consensus sequence C-E-S-N-P-C of hFA9 EGF, which is located in the N-terminal region (residues 51–56) of the motif and forms a U-shaped loop structure. The structure of this loop is not affected by the crystal packing (Supplementary Fig. 3). The loop inserts into the substrate-binding cleft of Rumi, and the backbone atoms of EGF (C51, N54, C56 and N58) interact extensively with Rumi residues Q259 and G260 in the thumb region and with residues A192 and P197 in the intervening region (Fig. 2a,b). Additionally, the side chain of EGF N54 at subsite +1 is involved in two hydrogen bonds in a narrow space, consistent with the known preference of Rumi for Asn or Ser over Arg at this subsite23 (Supplementary Fig. 4a–c). Following this region, the backbones of several EGF residues stacked above the thumb region with the O-fucosylation site S61 facing upward, away from Rumi (Fig. 2a), which explains why Rumi is able to tolerate a fucosylated EGF27. On the other side of the cleft, dRumi F122 stacked against Y69 and P55 of the hFA9 EGF repeat (Fig. 3a). These two hydrophobic interactions are probably important because the residue at position 69 is conserved to either a Y or an F, whereas P55 is absolutely conserved among all EGF repeats of Drosophila Notch with O-glucose sites (Fig. 3b and Supplementary Fig. 4a,d). Around this region, the side chain of EGF Y69 forms a hydrogen bond with the dRumi M121 main chain, whereas the EGF N81 side chain hydrogen bonds with dRumi backbone atoms P123 and A124 (Fig. 2a).
The interface provided detailed insights into how Rumi recognizes its folded EGF repeat substrates. We observed that dRumi F122 and Q259 define the narrowest point of the cleft (8.6 Å), which is accessible only to a loop and excludes any other secondary structure elements (Fig. 3a). EGF P55 at subsite +2 is located right in the middle of the cleft, only 3.7 Å away from dRumi F122. Therefore, residues bulkier than Pro at the +2 subsite are not tolerated, as shown previously23. More important, P55 and the two disulfide bonds at subsites −2 and +3 fashion the C-X-S-X-(P/A)-C motif into the U-like configuration. This configuration is essential for insertion of the glucose-accepting residue Ser deep into the Rumi active site (Fig. 3c). This observation rationalizes the requirement of the consensus sequence, and perhaps explains why Rumi does not modify EGFs with serine residues either shifted by even one position or with a consensus sequence with even one extra residue (Supplementary Fig. 4b). As for the identified hydrophobic region (hFA9 EGF P55 and Y69; Fig. 3a), we previously showed that the P55A mutant of hFA9 EGF is a very poor Rumi substrate23. We prepared the properly folded Y69A mutant (Supplementary Fig. 4e,f) and found that it was also a poor substrate (Fig. 3d). Taken together, these findings indicate that Rumi recognizes the U-shaped loop structure of the consensus motif as well as the conserved hydrophobic region formed by P55 and Y69 of the acceptor EGF (Figs. 2 and 3a–d). The Rumi recognition surface is largely defined by the conserved F122, A124, A192 and Q259. Indeed, F122A and Q259A mutations markedly reduced the enzymatic activity, and A124F and A192F mutations disrupted surface complementarity, leading to a reduction of Rumi activity by 20% and 80%, respectively. The P197A mutation in the intervening region also markedly reduced the enzymatic activity (Fig. 3e).
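The sequence part of this requirement can be illustrated with a short pattern search. The sketch below is only a minimal illustration in Python: it encodes the C-X-S-X-(P/A)-C consensus as a regular expression and reports the position of the candidate serine. The flanking residues in the example peptides are hypothetical, and a sequence match alone says nothing about proper folding or about the hydrophobic region, both of which the structure indicates are also needed.

```python
import re

# C-X-S-X-(P/A)-C consensus from the text; the lookahead lets matches
# that share a cysteine be reported as well.
CONSENSUS = re.compile(r"(?=(C.S.[PA]C))")

def find_candidate_sites(seq):
    """Return 0-based indices of the Ser in each consensus match."""
    return [m.start() + 2 for m in CONSENSUS.finditer(seq)]

# Hypothetical peptides built around the hFA9 motif C-E-S-N-P-C.
print(find_candidate_sites("GDQCESNPCLNG"))   # consensus present
print(find_candidate_sites("GDQCEESNPCLNG"))  # one extra residue before Ser
print(find_candidate_sites("CSENPC"))         # Ser shifted by one position
```

Only the first peptide yields a candidate site; the insertion and the shifted serine both break the match, mirroring the strict spacing discussed above.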
Because the binary Rumi–EGF structure showed exquisite interface complementarity and a recognition interface largely made up of loop–loop interactions (Figs. 1a and 2a, Supplementary Fig. 2a), we wondered whether conformational changes were involved in this recognition process. To address this question, we solved the structure of dRumi complexed with UDP but in the absence of the EGF (Supplementary Table 1). We found that the dRumi structure in the absence of the EGF repeat was essentially the same as the structure in complex with EGF, with an r.m.s. deviation of 0.24 Å. Thus, the EGF-contacting surface of Rumi apparently did not change upon binding to EGF (Supplementary Fig. 2b,c). The hFA9 EGF repeat structure in the presence or absence of Rumi had a slightly higher r.m.s. deviation of 0.81 Å, but changes were limited to regions distant from the interacting interface (Supplementary Fig. 2d,e). Therefore, neither dRumi nor the EGF repeat undergoes conformational change in the binary complex structure. This exemplifies the classic 'lock-and-key' recognition mechanism, with minimal binding-induced conformational changes.
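For readers unfamiliar with the metric, the r.m.s. deviation quoted above is the root of the mean squared distance between equivalent atoms. The following Python sketch shows the arithmetic only; it assumes the two structures have already been superimposed and that the atom lists are in matching order. The coordinates are made up, and the superposition program and atom selection used for the reported values are not specified here.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input) between two
    equal-length lists of (x, y, z) tuples that are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical two-atom example (coordinates in Å).
free = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
bound = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
print(round(rmsd(free, bound), 3))
```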
Our binary complex structure contained only a single EGF repeat, yet all currently known Rumi-regulated targets, such as Notch, Eys, JAG1, factor IX and CRUMBS2, contain multiple EGF repeats3,7,8,9. We therefore considered how Rumi might recognize these more complicated substrates, of which Notch is the best characterized. As noted earlier, because all EGF repeats have a similar fold, we were able to superimpose the crystal structure of human NOTCH1 EGF11–13 with the hFA9 EGF in our binary structure (Fig. 3f, Supplementary Fig. 5). We found that, in addition to the primary interface discussed above, a 'pinkie' region (residues 152–166) from the B-domain of dRumi that is on the opposite side of the thumb region may make contact with a second EGF that is upstream of the primary EGF, and the secondary EGF binding may be dependent on the linking angle between the two adjacent EGFs (Supplementary Fig. 5b–e). Notch NECD is known for its flexible or rigid linkage between neighboring EGF repeats28. However, the EGF repeats in these different molecular contexts can still be modified with O-glucose at high stoichiometry6,29. In the crystal structure, the B-factor in this pinkie region is substantially higher than in the rest of the structure (Supplementary Fig. 5f), implying a level of flexibility in this region. We speculate that the flexible pinkie region may accommodate a certain level of linking-angle variation between the tandem EGF repeats of Rumi substrates.
Catalytic mechanism revealed by two ternary complexes
Rumi was previously identified as an inverting GT27. To determine the transfer mechanism used by Rumi, we soaked donor ligand UDP-Glc or UDP into dRumi–EGF complex crystals and solved the structures of two ternary complexes: dRumi–(Glc–EGF)–UDP, a product complex in which glucose transfer reaction had occurred in the crystals, and dRumi–EGF–UDP (Supplementary Fig. 6). In the product complex, both the transferred glucose (Supplementary Fig. 6f) and the cleaved UDP (Fig. 4a) were held in place by a network of hydrogen bonds. The UDP diphosphate was sandwiched between and formed salt bridges with Rumi R237 and R298, and it was further stabilized by side chains of S231, T233 and S296.
Between UDP and acceptor EGF S53 in the Rumi–EGF–UDP ternary complex, we found electron densities of two water molecules and one solvent glycerol. The spatial arrangement of the five hydroxyl groups (two from the two water molecules and three from glycerol) closely resembled the arrangement of oxygen atoms in a glucose molecule in the transitional boat configuration. Therefore, we modeled UDP-Glc into the experimental densities in the Rumi active site and obtained a putative Michaelis complex structure of dRumi–EGF–(UDP-Glc) (Fig. 4b, Supplementary Fig. 6b,c). In the model structure, R125 seemed to be crucial, as it formed a salt bridge with the active site residue D151, bringing it close to the acceptor S53 (2.4 Å). Consequently, D151 oriented the EGF S53 side chain into a less favored rotamer conformation with the oxygen pointed directly toward the anomeric carbon of the donor UDP-Glc. In this arrangement, the acceptor oxygen of EGF S53 is almost linear with the anomeric C–O bond (170°) of the donor, consistent with an SN2 inverting mechanism30,31. Therefore, we propose that dRumi D151 functions as the general base to activate the nucleophile (EGF S53 OH), and R237 and R298 weaken the phosphoester bond by forming salt bridges with the donor β-phosphate (Supplementary Fig. 6h). Similar mechanisms have been reported in structural studies of other members of the GT-B family such as POFUT2 (refs. 18,20), human OGT12,16 and T4 bacteriophage β-glucosyltransferase (BGT)32, where aspartate, glutamate, histidine or donor α-phosphate has been identified as a catalytic base and positively charged residues such as arginine and lysine facilitate the departure of the leaving group. Indeed, substituting dRumi active site residues individually with alanine markedly reduced enzyme activity (Fig. 4c). In particular, the four catalytic residue mutants R125A, D151A, R237A and R298A all completely abrogated enzymatic activity (Supplementary Fig. 7).
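The in-line geometry invoked here can be checked from atomic coordinates as the angle at the anomeric carbon between the attacking oxygen and the leaving-group oxygen; values near 180° indicate in-line attack. The Python sketch below shows that calculation. The coordinates are hypothetical, chosen only to reproduce a 170° angle, and it is an assumption that the reported value was measured between these three atoms.

```python
import math

def angle_deg(a, b, c):
    """Angle in degrees at vertex b formed by the points a-b-c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_theta))

# Hypothetical coordinates (Å), not taken from the deposited model.
anomeric_c1 = (0.0, 0.0, 0.0)
leaving_o = (1.4, 0.0, 0.0)            # phosphate oxygen bonded to C1
theta = math.radians(170.0)
ser_og = (3.0 * math.cos(theta), 3.0 * math.sin(theta), 0.0)  # nucleophile

attack_angle = angle_deg(ser_og, anomeric_c1, leaving_o)
print(round(attack_angle, 1))
```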
Disease-associated mutations in Rumi
Notch pathway components are frequently implicated in cancer21,33,34,35. Given the essential roles of Rumi in Notch activation3,4,5, we hypothesized that Rumi mutations might also be related to cancer. We therefore searched a cancer genomics database36, and we found 26 Rumi mutations (Supplementary Table 2; referred to in dRumi sequence). We subsequently analyzed these mutations in the context of the dRumi–EGF–(UDP-Glc) ternary complex model. We found that the Rumi truncations, resulting from either frameshift or nonsense mutations, all lacked key structural components, making the enzyme almost certainly inactive (Supplementary Fig. 8a,b). The missense mutations mostly clustered in four regions in the A-domain but were rarely present in the B-domain (Fig. 5a–c). Among the identified point mutations, we chose a few representatives from these regions for further characterization. Rumi S231 was located at the donor-binding interface (Fig. 4a). The S231A mutant, which is milder than the S231L mutant found in cancer, showed markedly reduced in vitro activity (Fig. 4c). G189R and G199V were located within the intervening region, which is flexible, and probably affect substrate binding (Fig. 5b). G189E was previously reported to abolish enzyme activity3. Consistently, we found in this study that G199V retained only a trace amount of Rumi activity (Fig. 5d, Supplementary Fig. 7e). R245 and T267 stabilized the flexible thumb region that anchors the EGF (Fig. 5c). Indeed, destabilizing mutations R245L and T267I markedly reduced the enzyme activity (Fig. 5d). The deleterious Rumi mutations characterized here have been identified in cancers in which Notch functions as a tumor suppressor34,37,38,39—for example, S231L, R237* and R386* in endometrial cancer; R245L and S307* in bladder urothelial cancer; G189R in lung squamous cell cancer; and G199V in head and neck squamous cell cancer (Supplementary Table 2). 
As Rumi is essential for Notch signaling3,4,5, it is likely that these deleterious mutations promote tumorigenesis by compromising Notch function. We therefore suggest that Rumi might have a tumor-suppressor role.
DDD is a rare autosomal dominant disease characterized by reticulate skin hyperpigmentation. Nine causative mutations of Rumi have been reported in DDD10,40,41 (Supplementary Table 2). It is unknown how these mutations affect Rumi's enzymatic activity. Through homology modeling and mapping of the human mutations onto the dRumi crystal structure, we found that the eight truncated Rumi variants resulting from either frameshift or nonsense mutations all lacked key portions of the structure, almost certainly rendering them enzymatically inactive (Supplementary Fig. 8c). The only missense mutation replaced the catalytic residue arginine (R298) with a tryptophan (W) (Fig. 4a). As expected, we found that R298W completely abolished Rumi activity in vitro (Fig. 5d, Supplementary Fig. 7e). Therefore, all reported Rumi mutations in DDD are deleterious, supporting the proposed role of Rumi loss of function in DDD pathogenesis.
Here we have described a set of dRumi structures, in the presence or absence of a folded EGF acceptor as well as the donor substrates. To our knowledge, this work presents the first structure of the GT90 family, which has more than 500 members, and is the first demonstration of how a properly folded EGF repeat is recognized and glycosylated by a glycosyltransferase. We have elucidated the structural basis for the requirement of the consensus sequence C-X-S-X-(P/A)-C for EGF O-glucosylation. We discovered a conserved hydrophobic region of EGF repeats (corresponding to hFA9 EGF repeat P55 and Y69) as an additional structural feature recognized by Rumi. Furthermore, by solving the crystal structures of two enzyme–acceptor–donor ternary complexes, we have identified D151 as the catalytic residue and elucidated the SN2 inverting reaction mechanism for Rumi.
In the SN2 mechanism, a properly positioned nucleophile is paramount in order for the reaction to proceed. We previously observed that Rumi strongly prefers serine over threonine as the nucleophile in the acceptor EGF29. To understand the structural basis of the preference, we computationally substituted S53 with a threonine in the EGF, and we found that T53 was too close to the donor glucose ring (Supplementary Fig. 6g). The close contact may prevent the glucose ring from assuming an optimal binding pose at the enzyme active site. This observation might explain why Rumi adds glucose to a serine rather than to a threonine.
Our structural and biochemical studies provide a framework for understanding the functional consequences of Rumi alterations in diseases. Our analyses have shown that the Rumi mutations reported in DDD abolish the enzyme activity. This observation strongly supports a pathogenic role of these Rumi mutations. More important, we found that cancer-associated mutations in Rumi alter key structural components and inactivate the enzyme. This suggests that Rumi may function as a tumor suppressor in contexts where Notch activity inhibits tumor growth. We note that the contribution of these Rumi mutations to tumorigenic activity might be more complex than it appears, as the Rumi mutations found in DDD apparently do not cause tumors. Furthermore, the cancer genomics database cBioPortal (http://www.cbioportal.org/) does not specify the homo- or heterozygosity of the mutations. We could assume that at least some of the mutations are from heterozygotes. Because cancer is a complicated disease, and most cancers, unlike DDD, have multiple mutations that in combination lead to a cancer phenotype, there are several possibilities that may reconcile these apparent discrepancies. These include loss of heterozygosity, which is very common in cancers, or other mutations in the Notch pathway that decrease Notch activity—in combination with mutations in Rumi or POGLUT1—to the point where they have an effect. In other cases where mutations of several Notch pathway genes (NCSTN, APH1A, MAML1 and NOTCH2) are known to be from heterozygotes, these mutations have been associated with a tumor-suppressor role35. Although our work points to a tumor-suppressive role of Rumi in cancer, further studies are required.
The atomic structure of an important enzyme such as Rumi is invaluable for therapeutic development. Inhibitors of γ-secretase, a Notch-processing enzyme, have garnered considerable clinical interest as cancer therapeutics and have also been extensively used in functional studies of Notch42. However, it is widely recognized that downregulating Notch by inhibiting γ-secretase is problematic, because γ-secretase has multiple substrates, which can lead to severe off-target effects42,43. In this regard, Rumi as an essential Notch regulator may provide a novel molecular target for the development of small-molecule inhibitors that downregulate Notch signaling. However, Rumi does not exclusively target Notch either, so it remains to be investigated whether Rumi will become a druggable target. Our Rumi structure and mechanistic insights will facilitate future research in this direction.
Preparation of human factor IX EGF.
The procedure was performed as previously described23. Briefly, hFA9 EGF repeat was expressed in BL21 (DE3) Escherichia coli and purified by Ni-NTA agarose (Qiagen) affinity chromatography and subsequent reverse-phase HPLC. The final product was lyophilized using a vacuum centrifuge. Product mass was analyzed by LC–MS/MS. For introduction of the Y69A mutation, site-directed mutagenesis was carried out via a conventional PCR-based method with the pET20b vector encoding wild-type hFA9 EGF repeat as a template and primers (forward, 5′-CATTAATTCCGCTGAATGTTGGTGTCCCTTTGGATTTGAAGGA-3′; reverse, 5′-CCAACATTCAAAGGAATTAATGTCATCCTTGCAACTGCC-3′). The introduced mutation was confirmed by DNA sequencing.
Cloning, expression and purification of Drosophila Rumi.
Cloning of the dRumi cDNA was described previously3. The cDNA encoding Drosophila melanogaster Rumi without its signal peptide and C-terminal KDEL endoplasmic reticulum (ER)-retention motif was cloned into pSecTag2c vector (Invitrogen) with a C-terminal thrombin-cleavable Myc/His6-tag. The whole primary sequence of the expressed dRumi is shown in Supplementary Figure 1a,b.
Protein expression and purification of dRumi were done as previously described23. Because of the strict quality control system in the ER of mammalian cells, only properly folded and stable proteins are secreted into the medium. By taking advantage of this robust ER quality control system, we successfully solved the structure of XXYLT1 and generated, purified and tested multiple XXYLT1 mutants21. Here we took advantage of this same quality control method by purifying recombinant dRumi protein from the culture medium of transiently transfected HEK293T cells (ATCC; DAPI staining verified that the cell line was negative for mycoplasma). The expression plasmid (2 μg) was transfected in adherent HEK293T cells using polyethylenimine transfection reagent, and the transfected cells were cultured in a 10-cm plate with 6 ml of DMEM (Invitrogen) containing 10% bovine calf serum (HyClone, GE Healthcare) overnight. The medium was changed to 6 ml of OPTI-MEM I (Invitrogen), and the cells were cultured for another 3 d. The secreted dRumi protein was purified from the culture media using Ni-NTA agarose (Qiagen) affinity chromatography with gravity flow. The culture media (approximately 240 ml) from forty 10-cm plates were supplemented with 0.5 M NaCl and 10 mM imidazole and applied to Ni-NTA agarose (column volume, 300 μl). After the column had been washed with 20 ml of Tris-buffered saline, pH 7.4 (TBS), containing 0.5 M NaCl and 10 mM imidazole, the dRumi protein was eluted with 2 ml of TBS containing 250 mM imidazole, dialyzed against TBS containing 20% glycerol at 4 °C overnight, and stored at −80 °C until use. The yield of dRumi from one standard transfection was approximately 0.4 mg/l of culture. Protein expression was confirmed by western blotting analysis with anti-Myc (clone 9E10, Stony Brook University Cell Culture/Hybridoma Facility), and protein purity and concentration were estimated by 10% SDS–PAGE followed by Coomassie staining with BSA as a standard.
For crystallization, the C-terminal tag was removed through thrombin cleavage and then the tag-free Rumi was further purified by size-exclusion chromatography (Superdex 200, GE Healthcare) in 20 mM HEPES, pH 7.5, 150 mM NaCl. The sample was concentrated to 5 mg/ml and stored at −80 °C until use.
Crystallization, ligand soaking and heavy atom soaking.
For Rumi–UDP binary complex crystallization, Rumi at a concentration of 5 mg/ml was mixed with 3 mM UDP and incubated at 4 °C for 2 h. The hanging-drop vapor-diffusion method was used to produce initial microcrystals in mother liquor containing 20 mM HEPES, pH 7.4, sodium citrate tribasic and glycerol at 20 °C. Crystals were rare, reached maximum dimensions of ∼30 μm × 30 μm × 200 μm and were obtained only after seeding optimization. Well solution with the glycerol concentration increased to 30% was used as the cryoprotectant for crystal flash-freezing in liquid nitrogen.
For crystallization of the binary complex of Rumi and hFA9 EGF acceptor ligand (Rumi–EGF), the purified Rumi at a concentration of 5 mg/ml was mixed with a twofold molar excess of hFA9 EGF and incubated at 4 °C for 2 h; then the mixture was used for crystal screening or crystal reproduction. We obtained crystals of the Rumi–EGF binary complex several days after setting up the hanging-drop vapor-diffusion plates at 20 °C using a reservoir solution containing (NH4)2SO4. For ligand soaking in the Rumi–EGF crystals, UDP or UDP-Glc at a final concentration of 20 mM was added to crystal-containing drops for 1 h before crystals were picked up and flash-frozen in liquid nitrogen. The Rumi–EGF binary complex or ligand-soaked crystals were frozen in cryoprotectant consisting of the well solution supplemented with 25% glycerol.
For heavy atom soaking of the Rumi–EGF binary complex crystals, we screened several heavy atom compounds and found that the following conditions, with the original well solution used as the soaking solution, gave derivatized crystals that diffracted to >4-Å resolution with useful anomalous signals: (1) 1 min soaking in 1 M KI; (2) 1 h soaking in 20 mM ErCl3.
X-ray diffraction data collection, structural determination and refinement.
The data sets were collected at NSLS beamlines X25 and X29 at Brookhaven National Laboratory at 1.1000-Å wavelength, and at the APS LRL-CAT at 0.9793-Å wavelength, except for the heavy atom derivative data sets, which were collected at wavelengths of 1.7000 Å and 1.4832 Å for KI and ErCl3 derivatized crystals, respectively. The native sulfur SAD data sets were collected at NSLS X4A at 2.0703 Å. Diffraction images were processed and scaled in XDS44, Mosflm45 or HKL2000 (ref. 46). The Rumi–EGF binary complex crystals belonged to space group H32 with one complex in the asymmetric unit (ASU). The Rumi–UDP binary complex crystals belonged to space group P31 with six molecules in the ASU.
Substructure determination with KI or ErCl3 derivative data sets failed, likely owing to the low occupancy of heavy atoms. To solve the phase problem, we collected three data sets of Rumi–EGF binary complex crystals at a wavelength of 2.0703 Å to measure the sulfur anomalous signal. These data sets were processed using XDS44, combined with Pointless and merged with Scala of the CCP4 suite47 as previously described24. Ten sulfur atoms were found by SHELXD48. The initial SAD phases were obtained using PHENIX49 AutoSol, and a crude model was built automatically with PHENIX AutoBuild. The crude model was then used to generate phases for the 1.9-Å Rumi–EGF binary complex data set by molecular replacement with the program MOLREP50. With an improved electron density map, the Rumi–EGF binary complex model was built by PHENIX AutoBuild and further corrected and completed in several iterations of manual building in COOT51 followed by refinement with REFMAC in the CCP4 suite52. For all ligand-soaked data sets, structures were determined via a similar strategy: molecular replacement by MOLREP, using as the search model the Rumi–EGF binary complex structure from which the side chain hydroxyl group of EGF S53 and the carboxyl group of Rumi D151 had been deleted, followed by one round of automatic refinement in REFMAC without building the soaked ligand. The Fo–Fc difference maps and the 2Fo–Fc electron density maps were then carefully analyzed before the ligand(s) were built into the density map. Donor UDP was fit into the map first, and the side chain hydroxyl group of EGF S53 and the carboxyl group of Rumi D151 were fit next, followed by the building of glycerol (UDP soaking) or transferred glucose (UDP-Glc soaking). After ligand building, several additional rounds of refinement were carried out in REFMAC. Water molecules were added last.
The Rumi–UDP binary complex structure was solved to 3.2 Å by molecular replacement with MOLREP using the Rumi moiety of the Rumi–EGF binary complex structure as the search model. There were six molecules in the AU, and the diffraction data set was highly twinned (twin fractions of ∼0.5). Using REFMAC, we carried out one round of rigid-body refinement and one round of restrained refinement (with the twin-refinement option enabled) with local NCS restraints. Clear density for intact UDP was identified at this stage, on the basis of which six UDP molecules in the AU were modeled in COOT and refined by one round of restrained refinement with REFMAC.
The crystallographic statistics for data collection and refinement are presented in Supplementary Table 1.
Mutagenesis, enzyme activity measurement and mass spectrometry.
Site-directed mutagenesis was performed via a conventional PCR-based method with the pSecTag vector encoding wild-type dRumi used as a template. Primers are listed in Supplementary Table 3. Introduced mutations were confirmed by direct DNA sequencing.
The enzymatic assay with radiolabeled UDP-[6-3H]glucose (Glc) (American Radiolabeled Chemicals, >97%) was performed as previously described23. Briefly, the standard 10-μl reaction mixtures contained 50 mM HEPES, pH 6.8, 10 mM MnCl2, 10 μM hFA9 EGF repeat, 10 μM UDP-[6-3H]Glc (7.14 GBq/mmol), 10 ng dRumi enzyme, and 0.1% Nonidet P-40. The reaction was carried out at 37 °C for 20 min and was stopped by the addition of 900 μl of 100 mM EDTA, pH 8.0. The sample was loaded onto a C18 cartridge (100 mg, Agilent Technologies). After the cartridge had been washed with 5 ml of H2O, the EGF repeat was eluted with 1 ml of 80% methanol. Incorporation of [6-3H]Glc into the EGF repeats was determined by scintillation counting of the eluate. Reactions without enzymes were used as background control. Data were collected from three independent assays; in figures, values are presented as mean and s.e.m.
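The conversion from scintillation counts to incorporated glucose, and the mean and s.e.m. over three assays, can be sketched as follows. This Python example is a minimal illustration and not the analysis script used in this study. It assumes that counts are efficiency-corrected disintegrations per minute (dpm), which is not specified above, and the readings in the usage section are hypothetical values chosen only to show the arithmetic.

```python
import math
import statistics

# Specific activity of UDP-[6-3H]Glc as stated in the text.
SPECIFIC_ACTIVITY_GBQ_PER_MMOL = 7.14

# 1 Bq = 60 disintegrations per minute; 1 mmol = 1e9 pmol.
DPM_PER_PMOL = SPECIFIC_ACTIVITY_GBQ_PER_MMOL * 1e9 * 60 / 1e9  # about 428.4


def dpm_to_pmol(sample_dpm: float, background_dpm: float) -> float:
    """Background-subtracted [3H]Glc incorporation in pmol.

    ASSUMPTION: both readings are efficiency-corrected dpm; the
    background comes from a reaction without enzyme.
    """
    return (sample_dpm - background_dpm) / DPM_PER_PMOL


def mean_and_sem(values):
    """Return (mean, s.e.m.), where s.e.m. = sample s.d. / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are required")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


if __name__ == "__main__":
    # HYPOTHETICAL readings, not data from this study.
    background = 150.0
    replicates_dpm = [8700.0, 9100.0, 8900.0]
    incorporated = [dpm_to_pmol(x, background) for x in replicates_dpm]
    mean, sem = mean_and_sem(incorporated)
    print(f"mean = {mean:.2f} pmol, s.e.m. = {sem:.2f} pmol (n = {len(incorporated)})")
```

For scale, a 10-μl reaction containing 10 μM EGF repeat holds 100 pmol of acceptor, which sets the upper bound on incorporation at one glucose per repeat.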
For overnight reactions with dRumi or its mutants, hFA9 EGF repeat (10 μM) was incubated in the presence of 100 ng of dRumi or its mutants and UDP-Glc at a concentration of 200 μM. The reaction was carried out in 30 μl of 50 mM HEPES, pH 6.8, 10 mM MnCl2 at 37 °C overnight. An aliquot of the products was analyzed by LC–MS using an Agilent 6340 ion-trap mass spectrometer with a nano-HPLC CHIP-Cube interface. Extracted ion chromatograms for the most abundant charge state of the unmodified form or O-glucosylated form of EGF repeats were generated.
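The extracted ion chromatograms of the unmodified and O-glucosylated forms differ by the mass of one hexose residue divided by the charge state. The Python sketch below illustrates this relationship and is not the instrument software's method. It uses the monoisotopic hexose residue mass (C6H10O5, 162.0528 Da), and the peptide mass in the usage section is a hypothetical placeholder, not the measured mass of the hFA9 EGF repeat.

```python
PROTON_MASS_DA = 1.007276        # mass of a proton
HEXOSE_RESIDUE_MASS_DA = 162.0528  # monoisotopic C6H10O5 added per O-glucose


def mz(neutral_mass_da: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion for a given neutral mass and charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def glucosylation_mz_shift(charge: int) -> float:
    """Expected m/z difference between glucosylated and unmodified ions."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return HEXOSE_RESIDUE_MASS_DA / charge


if __name__ == "__main__":
    # HYPOTHETICAL neutral mass, for illustration only.
    example_mass_da = 5000.0
    for z in (3, 4, 5):
        unmodified = mz(example_mass_da, z)
        modified = mz(example_mass_da + HEXOSE_RESIDUE_MASS_DA, z)
        print(f"z = {z}: unmodified {unmodified:.3f}, O-glucosylated {modified:.3f}")
```

Because the shift scales as 1/z, the two forms must be compared at the same charge state, as is done above with the most abundant charge state.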
Any supplementary information, chemical compound information and source data are available in the online version of the paper. Reprints and permissions information is available online at http://www.nature.com/reprints/index.html. Correspondence and requests for materials should be addressed to R.S.H. or H.L.
Protein Data Bank
We thank members of the Li and Haltiwanger labs for critical comments on this work, as well as S. Singh Johar for technical assistance. The work was supported by the NIH (grants GM061126 (to R.S.H.) and AG029979 (to H.L.)) and SBU–BNL (seed grant to R.S.H. and H.L.). We acknowledge access to beamlines X25, X29 and X4A at NSLS, Brookhaven National Laboratory and LRL-CAT at APS, Argonne National Laboratory, and we thank the staff at these beamlines. NSLS and APS were supported by the US Department of Energy, Office of Science, Office of Basic Energy Sciences, under contract nos. DE-AC02-98CH10886 and DE-AC02-06CH11357, respectively. Use of the Lilly Research Laboratories Collaborative Access Team (LRL-CAT) beamline at Sector 31 of the Advanced Photon Source was provided by Eli Lilly Company, which operates the facility. The results published here are in part based on data generated by the TCGA Research Network (http://cancergenome.nih.gov/). H.L. dedicates this work to the loving memory of his son Paul J. Li.
Supplementary Results, Supplementary Figures 1–8 and Supplementary Tables 1–3.