Dual roles of the Sterol Recognition Region in Hedgehog protein modification

Nature provides a number of mechanisms to encode dynamic information in biomolecules. In metazoans, there exist rare chemical modifications that occur through entirely unique mechanistic regimes. One such example occurs in the Hedgehog (Hh) morphogens, proteins singular across all domains of life for the nature of their covalent ligation to cholesterol. The isoform- and context-specific efficiency of the ligation reaction has a profound impact on the activity of Hh morphogens and represents an unexplored aspect of Hh ligand-dependent cancers. To elucidate the chemical mechanism of this modification, we have defined roles of the uncharacterized sterol recognition region (SRR) in Hh proteins. We use a combination of sequence conservation, directed mutagenesis, and biochemical assays to specify residues of the SRR that are responsible for cellular and biochemical processes in Hh cholesterolysis. Our investigations offer the first functional template of this region, providing opportunities to identify parallel reactivity in nature and revealing new mechanisms that can be exploited as tools in chemical biology.


Introduction
The Hedgehog (Hh) morphogens undergo an auto-catalytic modification that cleaves the translated protein approximately in half and covalently appends cholesterol to the last residue of the N-terminal fragment. 1 To date, this intramolecular small molecule transfer activity has been observed only in Hh proteins and in no other domains of life. 2 Cholesterol modification is fundamental to the activity of the Hh morphogen; 3,4 mutation of the catalytic residues is lethal, and mutations that reduce the efficiency of the reaction give rise to profound congenital disorders (Fig. S1A). 5,6 This unique modification appears to have co-evolved with cholesterol-sensing mechanisms that coordinate Hh signal transduction, chemically linking embryogenesis to cholesterol homeostasis. 7,8,9 The initially translated, full-length Hh proteins (typically 350-475 residues in length) consist of an N-terminal signaling domain, an internal catalytic domain, and a C-terminal region known as the Sterol Recognition Region (SRR, Fig. 1A). 10 During post-translational modification, the full-length protein is cleaved at the end of the N-terminal signaling domain; concomitantly, the last residue of the N-terminus is released as a cholesterol ester. Independently, the N-terminal residue of the Hh morphogen is palmitoylated by a dedicated acyltransferase enzyme in the ER. 11,12 A crystal structure of the first 145 residues of D. melanogaster Hh-C provided by Beachy and coworkers immediately revealed the basis for proteolytic activity of the protein: an intein splicing domain remarkably similar in tertiary structure to those found in bacteria and fungi. 13,14 Inteins catalyze intramolecular attack of a cysteine, serine, or threonine side chain on the backbone carbonyl of the preceding residue, effecting N-to-S/O acyl transfer and creating a labile (thio)ester bond in the protein backbone.
In conventional inteins, a nucleophilic residue in the C-terminus of the protein reacts with this (thio)ester in a transesterification reaction, ultimately ligating the two exteins. 15,16 However, Hh proteins lack catalytic intein residues required for protein re-ligation. Instead, the thioester formed during the splicing reaction is intercepted by cholesterol, co-opting this mechanism for small molecule adduction.
As in many other intein-containing proteins, the N-terminus of Hh does not participate in the cleavage reaction. 10 Thioester formation and cholesteroylation activities inhere solely in the intein and SRR regions, respectively. Accordingly, cholesterol ligation can occur in vitro and in cells when the Hh N-terminus is replaced by fluorescent proteins or simply by a His-tag. 17,18,19,20 However, the molecular details of the SRR that facilitate cholesterol transfer remain unknown. The SRR, which spans 50-100 residues in different species, bears no significant homology to known proteins in any domain of life. While nematodes possess Hh intein-domain (Hint) homologues that likewise lack a C-terminal nucleophilic residue for transesterification, these domains do not couple small molecule transfer to splicing activity. 2 The structure of the SRR has never been established crystallographically, and to date presents an unsolved challenge for high-resolution analysis. Moreover, no static conformation can fully capture the dynamic activity of these fascinating proteins.
To define structural requirements for cholesterol transfer by the Hh SRR, we used a combination of tools at the interface of chemistry and biology. We demonstrate that the process of Hh cholesterolysis encompasses two main activities: 1) engagement of cholesterol as a nucleophile, and 2) cellular localization to cholesterol-containing membranes. We use sequence conservation and structure prediction to reveal a helix-loop-helix arrangement that is broadly represented among Hh SRRs. Guided by this structural analysis, we used site-directed mutagenesis of these motifs within human Sonic hedgehog (hSHH) expressed in mammalian cells to identify loss of function mutations that abolish Hh autoprocessing. Biochemical analysis of full-length protein purified from mammalian cells enabled us to distinguish conserved hydrophobic residues that are essential for cholesterolysis in cells but not in vitro. Cellular localization of a fluorescent SRR fusion protein revealed a novel Golgi targeting motif that may bias interaction of full-length hSHH with cellular membranes for cholesterol recruitment. By defining the structural requirements and elementary steps in this unconventional modification, this work expands our understanding of Nature's chemical biology toolkit and facilitates rational design of tools for small molecule ligation in cells.

A helix-loop-helix motif is conserved across SRRs
To identify the residues within the SRR that participate in cholesterol transfer, we considered the SRR as V366-S462 of the hSHH protein, which begins after the last corresponding residue in the crystal structure of the D. melanogaster intein domain. 10 Cleavage of the D. melanogaster protein at this residue renders the protein soluble in the absence of detergent and eliminates cholesterol ligation during proteolysis. As protein- and nucleotide-based alignment programs show no significant homology between the hSHH SRR and other non-Hh proteins, we aligned 710 manually curated, diverse Hh proteins to identify common features across species and isoforms (Fig. 1B and S2). 21 This alignment revealed two sections of conserved residues before and after a long intervening sequence that is unique to a subset of Hh proteins (e.g. hSHH). Secondary structure analysis revealed that the conserved segments are predicted to form α-helices, 22 while the glycine-rich intervening segment is predicted to be disordered.
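As an illustration of how per-position conservation can be read out of a multiple sequence alignment such as the one described above, the following minimal sketch scores each alignment column by the frequency of its most common residue. The toy alignment and the function name `column_conservation` are hypothetical; they are not the sequences or the software used in this study.

```python
from collections import Counter


def column_conservation(alignment, gap="-"):
    """Return, for each alignment column, the fraction of non-gap
    residues that match the most common residue (0.0 if all gaps)."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")

    scores = []
    for i in range(length):
        # Collect the residues observed in this column, ignoring gaps.
        residues = [seq[i] for seq in alignment if seq[i] != gap]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(residues))
    return scores


# Hypothetical toy alignment (NOT real Hh sequences).
toy_alignment = ["HWYA-L", "HWYS-L", "HFYAGL", "HWYT-I"]
scores = column_conservation(toy_alignment)
```

In this sketch a column occupied by a single non-gap residue scores as fully conserved, so gap-rich columns (such as a loop present in only a subset of sequences) would need to be flagged separately, for example by also reporting the gap fraction per column.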
To guide our hypotheses regarding the residue-level contributions of the SRR to cholesterolysis, we used ab initio fragment assembly to generate a 3D structure of the last 100 residues of hSHH, which encompasses the hSHH SRR (Fig. 1C). 23 In line with our secondary structure predictions, a lowest-energy hSHH SRR structure possesses helices spanning residues W372-L390 (1st helix) and I432-L447 (2nd helix), which are linked by a disordered loop comprising residues A391-G431. The structure contains a "kink" introduced by proline residue P379 in the 1st helix, which separates the 1st helix into two angled segments from W372-A378 and P379-L390. 24 Helical wheel analysis of the two regions revealed that each segment of the helices possesses a face enriched in hydrophobic residues (Fig. 1D). This arrangement is characteristic of amphipathic peripheral membrane helices; 25 further, none of our analyses predict these sequences to span the bilayer.
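The geometric idea behind a helical wheel can be sketched in a few lines. The example below assumes an idealized α-helix (3.6 residues per turn, i.e. 100° per residue) and ignores the distortion that the P379 kink would introduce, so it is only a rough guide rather than a reproduction of the analysis in this work. The residue numbers are those of 1st helix leucines examined by mutagenesis later in the text; the function names are hypothetical.

```python
import math

DEGREES_PER_RESIDUE = 100.0  # idealized alpha-helix: 3.6 residues per turn


def wheel_angle(position, reference):
    """Angle (degrees, 0-360) of a residue on a helical wheel,
    measured relative to a reference residue number."""
    return ((position - reference) * DEGREES_PER_RESIDUE) % 360.0


def circular_mean(angles_deg):
    """Mean direction (degrees, 0-360) of a set of wheel angles."""
    x = sum(math.cos(math.radians(a)) for a in angles_deg)
    y = sum(math.sin(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(y, x)) % 360.0


def separation(a, b):
    """Smallest angle (degrees, 0-180) between two wheel positions."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


# Leucines proposed in the text to share one face of the 1st helix.
face = [wheel_angle(p, 382) for p in (382, 386, 390)]  # 0, 40, 80 degrees
face_center = circular_mean(face)

# L387, described in the text as lying off the leucine surface.
l387 = wheel_angle(387, 382)  # 140 degrees
offset = separation(l387, face_center)
```

Under these idealized assumptions, residues spaced four apart fall within an 80° arc of the wheel, whereas L387 sits roughly 100° away from the center of that arc, which is in line with its description as roughly perpendicular to the leucine surface.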
To gain empirical support for our alignment-based model, we synthesized peptides comprising the 1st and 2nd helices and examined their secondary structures by circular dichroism (CD) spectroscopy (Fig. 1E). As anticipated, these hydrophobic peptides are insoluble in the absence of liposomes. CD spectra of each helix in multidisperse POPC liposomes clearly displayed helical character in both the presence and absence of cholesterol. Grounded by conservation analysis and secondary structure analysis, we used this model as a scaffold for experimental studies to define residue-level structural elements of the SRR that link intein proteolysis to cholesterol transfer.

Conserved SRR residues are required for hSHH cholesterolysis in cells
As a relatively hindered, hydrophobic nucleophile with a pKa of ~18, 26 the cholesterol hydroxyl group imposes strict requirements on the residues involved in nucleophilic activation and trajectory for attack. Whereas an intein-SRR unit catalyzes cholesterolysis both in cells and in vitro, Hh mutants lacking the SRR undergo proteolysis without attachment of cholesterol. 10 These observations demonstrate that the SRR is necessary and sufficient for biochemical Hh cholesterolysis. In cells, however, the marked insolubility of cholesterol in aqueous solution (<10 nM) 27 and cellular mechanisms of cholesterol sequestration also require mobilization of cholesterol from cellular membranes. 28 Accordingly, we hypothesize that conserved helices within the SRR fulfill two functions: 1) facilitate biochemical attack by cholesterol, and 2) recruit the full-length protein to cholesterol-containing membranes in the cell.
To assess the propensity of mutant hSHH proteins to undergo cholesterolysis in mammalian cells, we overexpressed hSHH and mutants in HEK293T cells. In this cell type, which lacks accessory proteins involved in native secretion of cholesteroylated hSHH, differently processed and/or lipidated forms of the hSHH protein partition into cell lysates or secreted media. The cholesteroylated and dually lipidated (cholesteroylated and palmitoylated) N-terminal proteins remain membrane-associated and are retained in cell lysates. By contrast, hSHH bearing the palmitate modification alone is secreted into cell media. 11,29,30,31,32 Accordingly, we used production of cell-associated hSHH-N in lysates (hSHH-NL) as a measure of cholesteroylation for a given SRR mutant (Fig. S3).
We first generated deletion mutants that lacked components of the helix-loop-helix motif identified in our protein alignment and biochemical studies (Fig. 2A). Consistent with previous observations, a construct lacking the SRR failed to produce any cell-associated hSHH-N. 10 Specific deletion of the 1st helix, 2nd helix, and loop region demonstrated that hSHH-N was likewise absent in lysates of mutants lacking the 1st or 2nd helix. Strikingly, the ∆loop construct that lacked a non-conserved 32-residue connector functioned identically to the wild-type protein. This result defines the 1st and 2nd helices as a minimal required unit, consistent with the absence of this loop region in Hh proteins known to undergo Hh cholesterolysis. 33
To more explicitly probe the residues of the SRR that participate in cholesterolysis in cells, we performed alanine scanning of both the 1st and 2nd helices (Fig. 2B and 2C). Mutation of highly conserved 1st helix residues H374, F377, and R381 drastically reduced production of cell-associated hSHH-N. Interestingly, mutation of the absolutely conserved P379 residue did not reduce the proportion of cell-associated hSHH-N relative to WT but lowered the total amount of protein in cell lysate, likely due to an effect on biochemical and/or cellular protein stability. Based on our helical wheel analysis (Fig. 1D), we envisioned that the three conserved leucine residues on the same face of the 1st helix, L382, L386, and L390, might interact with the lipid tails of the membrane. Consistent with this hypothesis, Monte Carlo simulations of the interaction between the 1st helix peptide and a phospholipid bilayer predict that these three leucines will reside at the membrane interior (Fig. 2D). 34 While single L382A, L386A, and L390A mutants enabled the production of cell-associated hSHH-N, an L382A/L386A/L390A triple mutant drastically reduced cell-associated hSHH-N, indicating that a stretch of alanine residues is insufficient to support the production of cholesteroylated hSHH-N in cells (Fig. 2E). Consistent with a model involving membrane association, replacement of either of the individual L382 or L386 residues with a negatively charged residue (glutamate) abolished production of cell-associated hSHH-N. An L390E mutation showed a significant but attenuated reduction of cell-associated hSHH-N, whereas glutamate mutation of L387, which is perpendicular to the hydrophobic leucine surface, showed only a slight reduction in cell-associated hSHH-N versus wild-type protein. This model also suggests an electrostatic interaction between the basic residues H374 and R381 and phospholipid head groups, which may contribute to membrane association. 25
The 2nd helix of the hSHH SRR contains an HWY motif that is largely conserved among species and Hh isoforms. While this motif has been demonstrated to regulate post-cleavage trafficking of hSHH-C, 35 a contribution to cholesterolysis has not been tested. 36 Alanine mutation of each residue in this motif drastically reduced production of cell-associated hSHH-N (Fig. 2H). In particular, hSHH-N was undetectable in the lysates of cells expressing a Y435A mutant. We envisioned that this conserved tyrosine residue might participate in a hydrogen bond interaction with the C3-OH of cholesterol. Surprisingly, mutation of this residue to an alternative hydrogen bond donor (aspartate) failed to rescue production of cell-associated hSHH-N. Interestingly, mutation to phenylalanine reduced total protein in lysate but restored the fraction of cell-associated hSHH-N relative to wild-type.
The exclusive recovery of cell-associated hSHH-N by an aromatic residue suggests a potential mechanistic contribution of the Y435 π-system to cholesterolysis. 37 By contrast, the lower overall expression of Y435F suggests that the hydroxyl group is essential to biochemical and/or cellular protein stability.
The segment of the 2nd helix following the HWY motif consists of a number of polar residues interspersed with four conserved leucine residues (L438, L439, L446, and L447). Surprisingly, while alanine replacement of the conserved L447 residue significantly decreased cell-associated hSHH-N, mutation of the less conserved residues L438, I442, and L446 to alanine had an attenuated effect (Fig. 2C). Alanine scanning of the remaining positions in the 2nd helix revealed that substitution of the polar residues S436 and T444 and the hydrogen bond donating residue Y440 did not significantly reduce production of hSHH-N in lysates. Likewise, mutation of glutamine residues Q437 and Q441 did not affect levels of cell-associated hSHH-N. Taken together, these studies provide a roadmap of the SRR residues involved in cholesterol transfer by Hh proteins in human cells.

A subset of SRR residues are required for hSHH cholesterolysis in vitro
To distinguish between biochemical and cellular factors that contribute to cholesterolysis, we investigated the reactivity of SRR mutants toward cholesterol in vitro. To do so, we appended a Myc-DDK tag to the C-terminus of our hSHH construct (Fig. S4). Overexpression of tagged protein followed by mild lysis and affinity purification with agarose conjugated to an anti-FLAG antibody enabled us to isolate active unprocessed hSHH-FL free from N-terminal protein generated in cells. Consistent with previous observations, exposure of this protein to 0.5 mM cholesterol and 1 mM DTT resulted in production of cholesteroylated hSHH-N (hSHH-NC), which could be distinguished from un-cholesteroylated hSHH-N by apparent molecular weight (Fig. 3A, Fig. S5). 28,30,31,38,39 For all mutants, exposure of the purified protein to 50 mM DTT effected production of hSHH-N, confirming formation of an active thioester. While a mutant lacking the catalytic cysteine residue (C198A) was unreactive, a construct lacking the entire SRR (E368*, ∆SRR) was susceptible to in vitro DTT cleavage but not cholesterolysis (Fig. 3B). Consistent with our cellular assays, neither the ∆1st helix nor the ∆2nd helix mutant was capable of undergoing cholesterolysis in vitro. Interestingly, a ∆2nd helix mutant underwent non-cholesteroylative cleavage in the presence of DTT and cholesterol, whereas a ∆1st helix mutant was unreactive under these conditions. This result suggests that an interaction between the 1st helix and cholesterol is sufficient to promote cleavage but not cholesterolysis. As anticipated, a ∆loop mutant functioned equivalently to wild-type hSHH.
We next sought to query the in vitro reactivity of 1st helix SRR mutants that failed to produce cell-associated hSHH-N. All purified mutant proteins were capable of undergoing robust cleavage at high (50 mM) but not low (1 mM) concentrations of DTT (Fig. S6). Consistent with our cellular assays, expression levels of the H374A and R381A mutants were lower than wild-type hSHH, and protein that was produced did not undergo cholesterolysis. While an F377A mutant expressed high quantities of protein, trace hSHH-N protein that was produced in the presence of 1 mM DTT and cholesterol likewise did not exhibit a shift characteristic of cholesteroylated hSHH-N. Strikingly, while neither the L382A/L386A/L390A nor the L386E mutant was capable of producing cell-associated hSHH-N, both exhibited a migration shift indicative of cholesterolysis in vitro. This observation suggests that the L382/L386/L390 residues act primarily to facilitate cellular aspects of cholesterolysis, potentially through an interaction involving this hydrophobic face within the 1st helix.
We then sought to characterize the biochemical contributions of the 2nd helix HWY motif to cholesterolysis in vitro. Purified H433A and Y435A mutant proteins failed to produce cholesteroylated hSHH-N, consistent with their observed lack of this reactivity in cells. As in cells, mutation of the conserved Y435 residue to an alternative aromatic (Y435F) but not an aliphatic (Y435A) residue preserved biochemical cholesterolysis. By contrast, a W434A mutant, which was also incapable of producing cell-associated hSHH-N, showed restored reactivity toward cholesterolysis in vitro. The unique restoration of cholesterolysis by a W434A mutant indicates that this residue contributes predominantly to cellular aspects of hSHH cholesterolysis.
The observation that specific SRR mutants can undergo cholesteroylation in vitro but not in cells implies that hSHH-N cholesteroylation entails separable biochemical and cellular processes. Within our structural model, this analysis points to a cluster of conserved aromatic residues that is required for biochemical cholesterol adduction opposite to a hydrophobic face that is required for cellular cholesterolysis (Fig. 3C). Thereby, this functional portrait provides a launching point for identification of related motifs in nature and a blueprint for redesign.

The SRR contains a Golgi localization motif
Because the solubility of cholesterol in aqueous media is vanishingly low (nanomolar), essentially all cellular cholesterol is sequestered in membranes. Moreover, specific lipid and protein compositions within a given membrane control the fraction of active cholesterol that is available to interact with other biomolecules. 40,41,42,43 While hSHH-N release has been investigated both functionally and biochemically in native hSHH-N secreting cells, 44,45 only a handful of studies have tracked the intra- and extracellular trajectories of the hSHH-C fragment. 28,46 Whether the SRR has inherent cellular targeting activity is unknown.
Because the Hh intein domain itself is soluble, we hypothesized that the SRR might localize the full-length hSHH protein to microdomains within the ER and/or other membranes for cholesterolysis. To ascertain preferences for subcellular localization inherent to the SRR, we created a fusion protein comprising N-terminal EGFP fused to SRR-encompassing residues A365-S462 of hSHH (Fig. 3D). To survey whether the SRR could bias the localization of the intein domain, we also created EGFP-intein(C198A) and EGFP-intein(C198A)-SRR fusions. When EGFP alone was expressed from a pCMV6-EGFP construct, it assumed a diffuse cellular distribution that included the nucleus. By contrast, when expressed as a fusion with the hSHH SRR, EGFP fluorescence was observed as discrete puncta clustered at the periphery of the nucleus, and nuclear fluorescence was eliminated. Time course studies revealed that this distribution was established immediately upon expression and at various levels of overexpression. Likewise, overexpression of an EGFP-intein-SRR but not an EGFP-intein construct was sufficient to establish punctate, extranuclear distribution. Finally, deletion of the 1st and 2nd helices in an EGFP-SRR construct preserved extranuclear fluorescence; however, deletion of the 1st helix dramatically reduced puncta formation and enhanced cytoplasmic fluorescence, whereas deletion of the 2nd helix attenuated, but did not abolish, wild-type EGFP-SRR distribution. This observation is consistent with a significant role for the 1st helix in subcellular trafficking, and reinforces a hypothesis that the 1st helix engages in functional membrane interactions during cellular cholesterolysis.
To determine which cellular compartments colocalize with EGFP-SRR, we co-expressed the EGFP-SRR construct with red fluorescent protein-labeled organelle markers (Fig. 3E). These studies revealed that SRR constructs colocalized extensively with the Golgi, as demonstrated using mCherry markers bearing either Golgin or β-galactosyltransferase Golgi-targeting sequences (Fig. S6). Importantly, the EGFP-SRR construct showed no lysosomal localization that would indicate misfolded or aberrantly processed protein. Partial colocalization was also observed with ER and caveolin markers, whereas no significant colocalization was observed with the nucleus, peroxisomes, mitochondria, actin, or the plasma membrane (Fig. S7). Together, these studies reveal that the SRR imparts a distinct Golgi localization to fusion proteins.

Discussion
To elucidate the molecular features of the SRR that ligate cholesterol to Hh proteins, we have defined residues that are required for cellular and biochemical Hh cholesterolysis. We used sequence conservation to identify common regions of secondary structure, which revealed a helix-loop-helix motif shared by all Hh proteins. Site-directed mutagenesis provided a blueprint of residues essential for Hh cholesterolysis in human cells and established minimal criteria for biochemical Hh cholesteroylation. This cellular and biochemical analysis was coupled to our identification of a new motif that induces Golgi localization in living cells.
Our data suggest possible mechanisms by which the SRR achieves this cholesterol-protein ligation in cells (Fig. 4). In one scenario, the SRR may tether the full-length Hh protein to cholesterol-rich membranes, enabling direct recruitment of a cholesterol nucleophile to the intein active site. Alternatively, the SRR might itself extract cholesterol from the membrane and shuttle it to the reactive site within the intein. Finally, the SRR could form a hydrophobic conduit with the intein domain, and together this unit could orient cholesterol for nucleophilic attack.
These studies can now provide a chemical basis for a number of SRR mutations that lead to developmental disease and can shed light on dysregulated Hh ligand activity in cancer (Fig. S1). We are currently building upon this basis set of functional residues to re-engineer the SRR ligase activity for alternative substrates and proteins of interest. Future studies will investigate the contribution of intein-SRR interactions to reaction efficiency to inform construct design. We anticipate that this work will lead to a general means to repurpose Nature's small molecule ligation technology as an invaluable tool in the chemical biology of living systems.

Constructs
Myc-DDK-tagged human Sonic Hedgehog (hSHH) in a pCMV6 vector was obtained from Origene (RC222175). For analysis of proteins from cell lysates and media, a stop codon was introduced after the last residue of hSHH (S462) to obtain untagged hSHH. Untagged and Myc-DDK tagged hSHH constructs were used for site-directed mutagenesis to create the corresponding SRR mutants. Briefly, mutagenesis reactions were performed using overlapping or non-overlapping primers as required. Truncation mutants were generated by inserting stop codons via site-directed mutagenesis. SRR deletion mutants were generated by PCR extension of the parent hSHH sequence using phosphorylated reverse primers to exclude specified residues, then ligating the resulting PCR product using T4 ligase (NEB, M0318). To generate EGFP and EGFP-hSHH constructs used for microscopy experiments, an EGFP insert (Clontech, Addgene vdb2487) followed by the linker sequence (GGGS)2 was cloned into the pCMV6 backbone before the corresponding hSHH fragment using Gibson assembly. All PCR reactions were performed using Phusion High Fidelity polymerase (NEB, M0530) in the presence of 10% DMSO. Parent construct was digested with DpnI (NEB, R0176). The full hSHH insert sequence for each mutant was verified by Sanger sequencing.

Liposome preparation and CD spectroscopy
Peptides for residues encompassing the 1st helix (368-391) and 2nd helix (431-449) of the hSHH SRR were obtained from Genscript Inc. at >90% purity. Crude soybean phospholipids containing L-α-phosphatidylcholine (PC, Sigma-Aldrich, P5638) were used to prepare liposomes for CD analysis. Briefly, phospholipids in the presence or absence of cholesterol were dissolved in chloroform and dried to a thin layer on the sides of a glass vial using a rotary evaporator.
Dried lipids were resuspended in CD buffer (specified below) and mixed to homogeneity using a vortex to yield a 10 mM PC ± 2 mM cholesterol stock solution. Unilamellar vesicles were prepared by disruption on ice using a tip sonicator, and liposomes were centrifuged at 4400 × g for 10 minutes to remove undissolved lipids. Supernatant containing unilamellar vesicles was diluted to 0.5 mg/mL and analyzed by dynamic light scattering (Wyatt DynaPro PlateReader-II) to assess size distribution.

Gel electrophoresis and Western blot analysis
Samples were loaded on a 4-15% SDS-PAGE gel (Bio-Rad, 5678084) in Tris/Glycine/SDS running buffer (Bio-Rad, 1610772), and proteins were resolved at a constant voltage of 150 V for 1 h at room temperature. After electrophoresis, proteins were transferred to a 0.

Analysis of hSHH protein reactivity in vitro
Isolated hSHH-FL-MycDDK protein was thawed on ice and divided into 30 µL aliquots in a PCR strip. For each experiment, 30 µL of the protein sample was boiled immediately after thawing to serve as an input control. For thiolysis, 1 M DTT was added to a final concentration of 50 mM.

Figure legend (partial). E. Relative hSHH-NL produced by alanine and glutamate mutants of the three predicted membrane-embedded (L382, L386, L390) residues and one surface (L387) residue in the 1st helix of the SRR. F. Relative hSHH-NL produced by alanine, aspartate, and phenylalanine mutants of Y435 in the 2nd helix of the SRR. For A, B, C, E, and F: The ratio of pixel intensity of hSHH-NL to hSHH-FL for each mutant was compared to the same ratio for wild-type protein and expressed as %WT. A biological replicate for wild-type protein was analyzed in each blot. Symbols represent the mean of n > 3 biological replicates for each mutant ± s.d. Mutants that produced < 50% hSHH-NL protein relative to wild-type protein are indicated in red.
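The normalization described in the legend above can be sketched as follows. This is a minimal illustration, not the analysis script used in this work: the band intensities are hypothetical, the function names are invented for the example, and the use of the sample standard deviation and of the replicate mean for the 50% cutoff are assumptions.

```python
from statistics import mean, stdev


def percent_wt(mutant_nl, mutant_fl, wt_nl, wt_fl):
    """hSHH-NL/hSHH-FL ratio of a mutant, expressed as a percentage of
    the wild-type ratio measured on the same blot."""
    if mutant_fl <= 0 or wt_fl <= 0 or wt_nl <= 0:
        raise ValueError("FL intensities and wild-type NL must be positive")
    return 100.0 * (mutant_nl / mutant_fl) / (wt_nl / wt_fl)


def summarize(replicates, threshold=50.0):
    """Return (mean %WT, s.d., flagged) for a list of replicates, each given
    as (mutant_nl, mutant_fl, wt_nl, wt_fl) pixel intensities."""
    values = [percent_wt(*r) for r in replicates]
    avg = mean(values)
    sd = stdev(values)  # sample s.d. (assumption); needs >= 2 replicates
    return avg, sd, avg < threshold  # flagged if mean falls below cutoff


# Hypothetical pixel intensities for one mutant across four blots,
# each paired with the wild-type lane from the same blot.
replicates = [
    (200, 1000, 800, 1000),
    (150, 1000, 600, 1000),
    (300, 1000, 1000, 1000),
    (250, 1000, 1000, 1000),
]
avg, sd, flagged = summarize(replicates)
```

Normalizing each mutant to the wild-type lane on the same blot, as the legend describes, keeps blot-to-blot differences in transfer and exposure from propagating into the %WT values.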