Crystal structure of steroid reductase SRD5A reveals conserved steroid reduction mechanism

Steroid hormones are essential in stress response, immune system regulation, and reproduction in mammals. Steroids with 3-oxo-Δ4 structure, such as testosterone or progesterone, are reduced by steroid 5α-reductases (SRD5As) to generate their corresponding 3-oxo-5α steroids, which are essential for multiple physiological and pathological processes. SRD5A2 is already a target of clinically relevant drugs. However, the detailed mechanism of SRD5A-mediated reduction remains elusive. Here we report the crystal structure of PbSRD5A from Proteobacteria bacterium, a homolog of both SRD5A1 and SRD5A2, in complex with the cofactor NADPH at 2.0 Å resolution. PbSRD5A exists as a monomer composed of seven transmembrane segments (TMs). TM1-4 enclose a hydrophobic substrate binding cavity, whereas TM5-7 coordinate the cofactor NADPH through an extensive hydrogen bond network. Homology-based structural models of HsSRD5A1 and -2, together with biochemical characterization, define the substrate binding pocket of SRD5As, explain the properties of disease-related mutants and provide an important framework for further understanding of the mechanism of NADPH-mediated steroid 3-oxo-Δ4 reduction. Based on these analyses, the design of therapeutic molecules targeting SRD5As with improved specificity and therapeutic efficacy would be possible.


9.
, figure caption: Remove the track changes in "Red stars (*) indicated the variants in the patients". Also, this sentence should be "Red stars (*) indicate the variants in patients". Please make the red stars in Fig. 3 and Fig. 4 more noticeable by using bold font.
Reviewer #2 (Remarks to the Author): This manuscript describes a structure of a steroid 5a-reductase from Proteobacteria bacterium obtained at 2.0 angstrom resolution with NADPH bound. Based on sequence homology with human steroid 5a-reductase (SRD5A) a homology model is built. Validation of the presumptive roles of critical residues involved in cofactor binding and steroid 5a-reduction was conducted using site-directed mutagenesis and transfection in HEK293T cells. Evidence is provided for a conserved catalytic mechanism with steroid 5b-reductase; and the role of disease related mutants observed in the human enzyme are elaborated. The manuscript is of potential importance since there is a deficit of structural information on the human enzyme which is targeted for the treatment of BPH with finasteride and dutasteride. In addition steroid 5a-reductase (SRD5A2) deficiency in humans is associated with the autosomal recessive disorder of sex development associated with pseudo-hermaphroditism, lack of male pattern baldness, and an atrophied prostate gland. There is a lot to like about this manuscript but some important features require attention.
1. The authors restrict their discussion of the role of steroid 5a-reductase to the metabolism of androgens and progestins but do not mention the importance of the enzyme in the metabolism of glucocorticoids. Please rectify. In the Main paragraph there are some incomplete thoughts. For example, the authors suggest that progesterone binds to the GABA receptor when the neurosteroid in question is the product of 5a-dihydroprogesterone metabolism, allopregnanolone. Please correct. Similarly the authors state that androstenedione is the precursor of androgens and estrogens but do not complete the thought; are they trying to state that by conducting 5a-reduction of these steroids the amount of androstenedione that can be converted to testosterone and estrogens will be diminished?
regulation of the ACE2 receptor, the main target of SARS-CoV-2 and related viruses for cell invasion. The structural work looks sound, with clear presence of NADPH in the electron density, and the conclusions made are reasonable. Biochemical investigations using point mutations shed further light on the structural implications for the reductase enzymatic activity, enhancing our understanding of the mechanism(s) involved. The latter point allowed significant categorisation of known loss-of-function mutations in the human SRD5A2.
Questions and issues for the Authors to address
Q1: Are there any drugs on the market that target SRD5A? Would you envisage this molecule to ever be a target in the future?
Q2: A large-scale BLAST and investigation was performed to select a good candidate for studies of the HsSRD5A; however, there is no mention of what criteria were used to rank or eliminate candidates and why PbSRD5A was selected. This should be explored, even if briefly. It is also not clear how many targets were selected together with PbSRD5A and why 4 particular bacterial homologs were selected.
Q3: Sequence similarity between PbSRD5A and HsSRD5A is mentioned as between 52% and 51% overall, but there is no mention of the sequence similarity for the SBD and NDPBD domains.
Q4: Proteobacteria bacterium is a gram-negative bacterium of the same type as Escherichia and Salmonella. What was the reason to use an insect cell expression system for a protein from this organism?
Q5: Can you provide a comment on occupancy and/or B factor values of the NADPH and monoolein molecules in comparison with the surrounding atoms and residues, particularly those at hydrogen-bond or Van der Waals distances?
Q6: Can you provide any further indication (biochemical or structural) that the pocket where monoolein was modelled is the location where steroid substrates would bind? Is there any mutation work in the literature related to this?
Q7: Have you attempted or considered obtaining a complex with an inhibitor or substrate?
Q8: Please consider depositing your homology models for HsSRD5A1 and -2 under one of the servers that accept protein structural models (e.g. https://modelarchive.org/ or similar) and reference their unique ID in the publication. This will allow further scrutiny and reuse of those models by other researchers in the future. If you decide against this suggestion, please justify.
Q9: Were there any attempts to solve the structure with sequence homolog domains before the model generation using Rosetta was attempted?
Q10: Please explain what you mean by "All diffraction data analyses have been reproduced at least three times" as stated under the Reporting Summary.
Q11: Please comment on why Extended Table 1 describes a resolution range for the data from 30 to 2 Å while the validation report shows 19.36 to 2.0. Was there any manual truncation of the low-resolution data and, if so, why?
Q12: Reporting Summary vs Extended Data table 1 vs validation report
Q13: The reporting summary states "For structure refinement, 10% data were selected randomly for cross validation." But the Reporting Data Table 1 states "Rfree was calculated with 5% of the reflections selected" (page 12, line 6-7), while the validation report states 2000 reflections (8.71%). Please explain the discrepancies and correct if necessary.
Q14: Please reconsider the interpretation of water molecules in light of close contacts: i) A:402:HOH close contact to ARG 237; ii) A:401:HOH close contact to ASP 240.
Q15: Ligand structure: There are quite a few outliers in bond length as well as large angle outliers (e.g. 133 degrees where 120 degrees is expected for an sp2 angle is a large outlier), particularly for NADPH in the validation report. It looks as if no dictionary was used or the weight was too low. Can you comment on whether a restraint dictionary has been used for NADPH and MO? If so, what weight was used?
Other minor points
Hs in HsSRD5A is never described as "Homo sapiens". It should be defined the first time it is mentioned in the main text. Pb in PbSRD5A is loosely defined as from Proteobacteria bacterium in the Abstract. It should be defined the first time it is mentioned in the Intro.
Page 3, second paragraph: A space is required between EC and 1.3.1.22.
Page 5: Native diffraction data are collected at a certain energy, certain temperature, etc., but not collected at 2.0 Å resolution, unless the data were collected at a particular distance that only recorded diffraction up to 2 Å resolution. In that context, please comment on what criteria were used for the high-resolution limit of the data.
Please consider depositing your models for PbSRD5A used to perform MR on one of the servers that accept protein structural models (e.g. https://modelarchive.org/ or similar) and reference their unique ID in the publication. This will allow further scrutiny and reuse of those models by other researchers in the future.
There is no
Page 13: Was the protein-lipid reconstitution really made (v/v) or was it (w/w) or (w/v)? Please state what protocol was used for lipidic cubic phase reconstitution, with a reference. Crystal size: "and reach full size within one week at 20 °C": mention the size that "full size" refers to.
State that PEG 400 % is (v/v).
Page 14: MSA is mentioned for the first time without a prior full description of the abbreviation.
Page 15: Were these data collected at one or more beamlines? SSRL beamlines are mentioned but then only one beamline, BL18U1, is referenced. Is there a beamline paper describing BL18U1? If so, please reference it. If not, please mention in the paper or extended data the type of goniometer, beam size, an estimation of dose (or flux and exposure time) and detector used to collect the data.
Page 16: Docking was done with AutoDock Vina. What was used for the molecular dynamics?
Page 17: Reference the "web server H++"? Reference the charmm36m force field?
Legend PDB IDs but particularly for MaSR1 should be re-stated in the legend (as is done correctly for AKR2D1 in Extended data figure 7b). The software and method used to generate SSM should be briefly mentioned and referenced. There are colours in the figure not mentioned in the legend. Please complete.
Extended Data Table 1
Please provide for the low resolution bin: Range, Rmerge, Rpim, CC1/2, I/Sigma. Please provide how many reflections were used for R and Rfree overall but also for low and high resolution bins. Please provide ISa for the dataset. Please make the number of decimal places consistent (e.g. B factors for protein and ligand, etc.).
Reviewer #4 (Remarks to the Author): The authors report the crystal structure of a bacterial orthologue of the human steroid 5α-reductase. The data show that the bacterial enzyme crystallizes as a seven-transmembrane structure with three transmembranes forming a putative steroid substrate binding domain and the other four composing an NADPH-cofactor binding domain. Biochemical experiments suggest the bacterial enzyme functions as a monomer to reduce the Δ4,5 double bond of some steroids such as progesterone but not others, such as testosterone. These substrate preferences share similarities and differences with the human steroid 5α-reductases. Similarly, the bacterial and human enzymes differ in their sensitivities to two 4-azasteroid inhibitors that are used in the clinic. A number of mutations identified in subjects with the human genetic disease steroid 5α-reductase 2 deficiency are re-created in the bacterial enzyme and shown to have similar detrimental effects. Altogether, the current findings provide new insight into the structure and function of steroid 5α-reductase, an important gene, enzyme, and pharmacological target.
Specific Comments
1. The Abstract indicates that the determined structure "unveiled the substrate recognition of SRD5A". As no steroid substrates were visualized in the structure, it might be more accurate to indicate that the structure predicted the location of a steroid substrate binding domain.
2. Similarly, the Abstract indicates in the last sentence that the deduced SRD5A structure will aid the design of therapeutic molecules. Given that hundreds if not thousands of inhibitors of SRD5A have already been identified and at least two of these are used in the clinic, it is unlikely that the atomic structure will have much impact on drug design.
3. On page 3, the first paragraph indicates that androstenedione is the precursor of androgens and estrogens, which is correct. The next sentence indicates that "These steroids could be converted to their corresponding 3-oxo-5α steroids…". This is not a true statement, as only androgens can be converted to 3-oxo-5α steroids; estrogens cannot be.
4. It is interesting that the bacterial enzyme is active against progesterone but not testosterone. There is some literature to suggest that, over evolutionary time, progestins preceded androgens as active steroid hormones; thus the bacterial enzyme substrate preference supports this literature.
5. The authors use the term "pseudovaginal perineoscrotal hypospadias" to describe the human deficiency of steroid 5α-reductase. This term has now been replaced with "steroid 5α-reductase 2 deficiency", as the symptoms of the disease range from simple hypospadias to the most severe form presenting with pseudovaginal perineoscrotal hypospadias.

Reviewer #1:
Steroid 5α-reductases (SRD5αs) are enzymes involved in catalyzing steroids with 3-oxo-Δ4 structure, such as testosterone, androstenedione and progesterone, to generate corresponding 3-oxo-5α steroids, which are essential for multiple physiological and pathological processes. Abnormal activities of SRD5αs have been linked to several diseases, such as benign prostatic hyperplasia, prostatic cancer and male infertility. Recently, SRD5αs have also been indicated as potential targets for the treatment of COVID-19. Although human steroid 5α-reductase isozymes, HsSRD5A1 and HsSRD5A2, have been extensively investigated for decades, the structural information and molecular reaction mechanism remain poorly understood. In this paper, the authors reported the crystal structure of PbSRD5A in complex with the cofactor NADPH at high resolution (2.0 Å). PbSRD5A is a bacterial homolog of SRD5A from Proteobacteria, which shares 60.6% and 51.5% sequence similarities with HsSRD5A1 and HsSRD5A2, respectively. Homology-based structural models of HsSRD5A1 and HsSRD5A2 were built, and substrates were docked to PbSRD5A, HsSRD5A1 and HsSRD5A2, respectively. Extensive biochemical characterizations were also carried out, using in vitro reduction assay for PbSRD5A and cell-based enzymatic assay for HsSRD5A2. These structural, computational and biochemical studies were combined to understand the mechanism of NADPH mediated steroids 3-oxo-Δ4 reduction. The presentation of the manuscript should be significantly improved to make the manuscript easier to follow. There are also many typos.
Major concerns:
1. On Page 4, in the section on "Functional characterizations of PbSRD5A", the authors stated that "To unveil the molecular mechanisms underlying the substrate recognition and catalytic reaction of SRD5A1 and -2, BLAST searches, using HsSRD5A1 and -2 as queries, against the sequenced bacterial genomes were performed to identify a steroid 5α-reductase suitable for structural investigation." It seems that the authors aimed to unveil the mechanisms of substrate recognition and catalytic reaction of HsSRD5A1 and -2. However, they selected PbSRD5A, a bacterial homolog, as the target of investigation. Please explain in the main text why HsSRD5A1 or
Reply: We thank the reviewer for the constructive suggestion to better explain the logic for homolog selection.
The main reason for homolog screening is that, when we initiated this project, we had difficulties purifying human SRD5A1 and -2 owing to their instability and low expression levels. We therefore took a different approach and searched for a homolog with a similar function from a lower species to obtain enough protein for structural studies. We have also stated this directly in the revised version (Page 4, line 8-9).
2. On Page 8, in the section on "Molecular mechanism of 3-oxo-Δ4 reduction", the authors stated that "Considering the sequence similarities of bacterial SRD5A, HsSRD5A1 and -2, we built the homology models of HsSRD5A1 and -2 …" and "To fully explore the substrate recognition and reduction reaction mechanism, progesterone was docked into PbSRD5A, as well as testosterone into HsSRD5A1 and -2, respectively." The substrate selectivity suggests that different SRD5As may adopt different conformations in the substrate binding sites. Indeed, sequence alignment analysis of SRD5αs across different species indicated that while the residues involved in the binding of NADP are highly conserved, the residues in the progesterone-binding pocket are much less conserved. Therefore, the accuracy of the homology models of the progesterone-binding pocket is unclear, and the rigor of the docking studies based on the homology models is uncertain.
Moreover, on Page 16, in the 'Substrate docking and MD simulation for PbSRD5A, HsSRD5A1 and -2' section, the authors mentioned that substrates were docked into 500 conformers generated from MD simulations, making the docking results even more uncertain. Please justify the credibility of this whole approach, besides the observation that E57 and Y91 were shown to be important for binding from both computational and experimental studies.
Reply: We thank the reviewer for the critical question about the homology modeling and docking results.
Firstly, we carefully re-checked the conservation of the putative substrate binding pockets, picked all residues that compose them, and highlighted the conserved residues in Table R1 below.

[Table R1 column headers: Transmembrane helix No.; Residue in human SRD5A1; Residue in human SRD5A2; Residue in PbSRD5A; Conservation]
Table R1: Conservation analysis of human SRD5A1, -2 and PbSRD5A substrate binding pockets
We then used PbSRD5A as the template to map the residues onto the structure (Figure R1). The mapping clearly showed that the residues deeply buried inside the pockets are highly conserved, whereas the variable residues cluster on the periplasmic side of TM1 and TM4. Considering the orientation of NADPH and the substrates, the conserved residues could recognize the 3-oxo and Δ4 signature structures of the substrates, and the variable residues probably interact with the "tail" part of the substrates, which also differs considerably among testosterone, androstenedione and progesterone. We therefore believe that our structural information, combined with the docking results, provides a good template for studying not only the reaction mechanism but also the specificity of substrate recognition.
Indeed, we have made extensive attempts to obtain structures of all three proteins (HsSRD5A1, HsSRD5A2 and PbSRD5A) with different substrates and, in parallel, carried out MD simulations to identify residues on TM1 and TM4 involved in substrate recognition. However, these parts are challenging, time consuming and need massive computing resources. In this work we performed two different kinds of docking: (1) using the crystal structure for the PbSRD5A-progesterone complex; (2) using models for human SRD5A1 and -2. In both cases, the flexibility of the ligands and residues was considered.
The former follows the standard AutoDock 4 and AutoDock Vina docking procedures. Briefly, each ligand was docked with at least 250,000 poses and the calculations were repeated 100 times. Rather than simply looking at the docking scores, the better-scoring poses were further clustered according to their RMSDs. This gives self-consistent poses for all ligands. Note that AutoDock 4 and AutoDock Vina yield consistent results under this analysis.
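The clustering step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the greedy scheme, the 2.0 Å cutoff and the function names `pose_rmsd` and `cluster_poses` are assumptions chosen for illustration, and scores are taken to be Vina-style energies in kcal/mol (more negative is better).

```python
import numpy as np

def pose_rmsd(a, b):
    """Heavy-atom RMSD between two poses given as (n_atoms, 3) arrays.
    No superposition is done: docked poses share the receptor frame.
    Assumes identical atom ordering and ignores ligand symmetry."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def cluster_poses(coords, scores, cutoff=2.0):
    """Greedy clustering (illustrative assumption, hypothetical helper).
    Poses are visited from best (lowest) score to worst; each pose joins the
    first cluster whose seed pose lies within `cutoff` angstroms, otherwise
    it seeds a new cluster. Returns lists of pose indices, seed first."""
    order = np.argsort(scores)
    clusters = []
    for i in order:
        for members in clusters:
            if pose_rmsd(coords[i], coords[members[0]]) <= cutoff:
                members.append(int(i))
                break
        else:
            clusters.append([int(i)])
    return clusters
```

With such a grouping, a pose counts as self-consistent when a well-populated cluster recurs across repeated docking runs, rather than when a single pose happens to score well once.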
The latter procedure is a more complicated ensemble docking, based on a conformational ensemble that accounts for thermodynamic effects and is obtained from all-atom MD simulations under isothermal-isobaric conditions; this approach has been shown to be effective in sampling unseen, druggable pockets for multiple targets (see Nat. Commun., 4 (2013), p. 1407 and PLoS Negl Trop Dis. 2010 Aug 24;4(8):e803). This is a critical step in assessing the credibility of our model structures and of the binding poses predicted from the crystal structure snapshot. In brief, all the MD simulation parameters were set up to mimic the physiological conditions of SRD5As (ER membrane, pH 7.4, and 0.15 M NaCl), and the state-of-the-art force field for membrane proteins, CHARMM36m, was employed to describe the systems. To avoid artificial/unphysical deformation of the conserved binding pose of NADPH inside the proteins during the simulations, a harmonic potential (500 kJ/mol) was applied to restrain the movement of the heavy atoms of the NADPH molecule. Moreover, given that our biochemical results have shown the catalytic activity of the NADPH/HsSRD5As towards testosterone, a harmonic potential (50 kJ/mol) was applied to restrain the distance between the centers of mass of the nicotinamide group of NADPH and the C=O group of testosterone to the value predicted by the initial docking pose, ensuring efficient MD sampling of the reaction-active state of the system. After 10000 steps of energy minimization, a 200 ns MD simulation was performed to fully relax the protein, membrane, and solvent step by step (Figure R2).

Figure R2: RMSD of HsSRD5A1/NADPH and HsSRD5A2/NADPH in complex with testosterone along the MD simulations.
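For readers unfamiliar with how an RMSD trace such as the one in Figure R2 is computed, the following Python sketch shows one common approach: optimal superposition of each frame onto a reference with the Kabsch algorithm, followed by the RMSD of the superposed coordinates. It is an illustration under stated assumptions, not the analysis code used here; `kabsch_rmsd` and `rmsd_trajectory` are hypothetical names, and frames are assumed to be plain (n_atoms, 3) coordinate arrays with matching atom order.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between coordinate sets P and Q after optimal rigid superposition
    of P onto Q (Kabsch algorithm). Both are (n_atoms, 3) arrays."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    Pc = P - P.mean(axis=0)          # remove translation
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc                    # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_rot = Pc @ R.T                 # apply rotation to row vectors
    return float(np.sqrt(((P_rot - Qc) ** 2).sum(axis=1).mean()))

def rmsd_trajectory(frames, reference):
    """RMSD of every frame relative to a reference structure."""
    return [kabsch_rmsd(frame, reference) for frame in frames]
```

A trace that rises and then fluctuates around a plateau is the usual qualitative indication that the system has relaxed; the plateau value itself depends on which atoms are included.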
After these careful treatments, a production conformational ensemble was generated from another 50 ns of MD simulation in the isothermal-isobaric ensemble, which allows us to consider thermodynamic effects on the conformation of the proteins while avoiding large (high-energy) conformational changes (see Nat. Commun., 4 (2013), p. 1407 and PLoS Negl Trop Dis. 2010 Aug 24; 4(8):e803). Then, flexible docking was carried out for each structure of the conformational ensemble using AutoDock Vina. Note that exhaustive docking (repeated 80 times) was carried out for each conformation, and only the best (strongest binding affinity) docking result was recorded for later data analysis. Finally, we reported two representative binding poses for each system: a) Conf1, the one with the highest binding affinity predicted by the Vina scoring function from the whole conformational ensemble (500 structures); b) Conf2, the representative of the cluster of conformations with the highest binding affinity on average. The relationship between the representative binding poses and other (high-energy) binding poses was characterized by a correlation analysis of their binding affinity and the RMSD of the ligand relative to the representative conformation Conf1 (Figure R3).
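The selection of the two representatives can be sketched in a few lines of Python. This is an illustration only: the function name `pick_representatives` is hypothetical, cluster labels are assumed to come from a separate RMSD clustering step, and taking the best-scoring member as the representative of the best cluster is an assumption, since the rule for choosing that member is not specified above.

```python
from collections import defaultdict

def pick_representatives(affinities, labels):
    """affinities[i]: best Vina score (kcal/mol, more negative = stronger)
    recorded for ensemble snapshot i; labels[i]: cluster id of its pose.
    Returns (conf1, conf2) as snapshot indices."""
    # Conf1: strongest predicted binding over the whole ensemble.
    conf1 = min(range(len(affinities)), key=lambda i: affinities[i])

    # Group snapshot indices by cluster label.
    by_cluster = defaultdict(list)
    for i, label in enumerate(labels):
        by_cluster[label].append(i)

    # Conf2: cluster with the strongest mean affinity; its best-scoring
    # member is used as the representative (assumption).
    best_cluster = min(
        by_cluster.values(),
        key=lambda idx: sum(affinities[i] for i in idx) / len(idx),
    )
    conf2 = min(best_cluster, key=lambda i: affinities[i])
    return conf1, conf2
```

Reporting both guards against the case where the single best-scoring pose is an outlier: agreement between Conf1 and Conf2 suggests that the top pose also belongs to a consistently favourable family of poses.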

Figure R3 (panels: HsSRD5A1, HsSRD5A2): A correlation analysis of the binding affinity and the RMSD of the ligand relative to the one in the representative conformation Conf1. Here, each data point represents the average values (RMSD and/or binding affinity) of a cluster; the standard deviations of RMSD and binding affinity within each cluster are shown as error bars.
Our results clearly show that the two representative binding poses of testosterone in the HsSRD5A2/NADPH complex are very similar in 3D space (upper panel of Figure R4) and that their binding affinities scored by Vina are very close (-12 kcal/mol and -11.3 kcal/mol). The same holds for the simulation of HsSRD5A1/NADPH (lower panel of Figure R4). Note that in both cases the ligand is stabilized by a "Q-E-Y" motif of the protein.
These binding poses are also very similar to the one starting from experimental conformations.
Figure R4 panels: Conf1 (HsSRD5A2/NADPH, -12 kcal/mol); Conf2 (HsSRD5A2/NADPH, -11.3 kcal/mol); Conf1 (HsSRD5A1/NADPH, -11.6 kcal/mol); Conf2 (HsSRD5A1/NADPH, -11 kcal/mol)
Moreover, beyond the two representatives, we found that the top 10 conformations ranked by Vina binding affinity also have similar binding poses (Figure R5), which again confirms that this interaction pattern is reliable even when the protein is relaxed. Overall, we conclude that upon binding NADPH the pocket is conditioned for substrate binding and the subsequent reaction.
3. On Page 14, in the section on "Initial model building of PbSRD5A", the authors did not explain why they needed initial model construction to solve the structure of PbSRD5A from their diffraction data.
Reply: We appreciate the reviewer's suggestion to better explain the structure determination process. When X-ray crystallography is used to determine a protein structure, we typically obtain 2D images of diffraction spots, which sample the reciprocal lattice; each spot represents a wave with an amplitude and a relative phase.
Unfortunately, we can measure the intensity of each spot, but the information about the relative phases of the different diffracted waves is lost. This is called the "phase problem" in crystallography. Methods such as molecular replacement (MR), single-wavelength anomalous dispersion (SAD) or multi-wavelength anomalous dispersion (MAD) of heavy atoms in crystals are commonly used to obtain the phase information. Because we used the lipidic cubic phase crystallization method to obtain protein crystals, the crystals grow in a gel-like lipidic environment, which makes soaking with heavy atoms quite difficult. We also failed to obtain the phases using the seleno-methionine substitution method. Additionally, no structural template with reasonable sequence similarity was available for molecular replacement. We therefore built the initial model from scratch for molecular replacement.

The authors used only one sentence to describe their homology modeling of HsSRD5A1 and HsSRD5A2:
Page 8, "we built the homology models of HsSRD5A1 and -2 using the server for initial model building of PbSRD5A (Fig. 3a,b and Extended Data Fig. 6)." Please provide the details since homology modeling is an important part of this study. Particularly, the superimposition of these three structures showed a big movement of TM1, which is unusual for homology modeling. Speculate the origin of the movement from the modeling perspective.
Please also provide the name of the server for initial model building and relevant reference (e.g., Refs. 51 and 52).
Reply: We thank the reviewer for the critical questions about the homology modeling and the movement of TM1.
For the initial PbSRD5A model building, similar to trRosetta [1], we folded the 3D model using a de novo approach from a predicted distance/orientation matrix, which is derived from a variety of multiple sequence alignments (MSAs) by a multi-branch fusion ResNet [2] (Figure R6).
Figure R6: A de novo approach from a predicted distance/orientation matrix, which is derived from a variety of multiple sequence alignments (MSAs) by a multi-branch fusion ResNet.
Following the trRosetta methodology, we generated 300 models from the predicted distances and orientations using constrained minimization, which is an embedded module of PyRosetta. Finally, the top-ranked model with the lowest Rosetta energy was selected as the input model for molecular replacement (MR) to determine the protein structure.
We applied the same strategy for HsSRD5A1 and -2 model building and performed MD simulations. We noticed the large movement of TM1 between the PbSRD5A structure and the other two structural models (Extended Data Figure 7a), so we carefully checked the residue contacts among all transmembrane helices. Unlike other helices, the interactions among TM1, -4 and -7 are relatively weak, especially for the N-terminal half of TM1. This weak interaction might be the origin of the TM1 flexibility. As we discussed in Point 1, TM1 probably does NOT interact with NADPH or the ring structure of the steroid molecule but may confer substrate specificity by interacting with the tail part of steroid molecules. We speculate that TM1 may act as a gate that controls substrate entry and determines substrate specificity. Since we are still trying to obtain structures of all three proteins with different substrates, it is quite difficult to interpret the conformations of TM1 in different states (apo, substrate or inhibitor bound) precisely. We also used these structures and models in extensive MD simulation studies to identify residues on TM1 involved in ligand recognition. As mentioned above, these parts are time consuming and need massive computing resources.
Reference:
[1] Yang, J., Anishchenko, I., Park, H., Peng, Z., Ovchinnikov, S. and Baker, D., 2020. Improved protein structure prediction using predicted interresidue orientations. Proceedings of the National Academy of Sciences, 117(3), pp.1496-1503.

NADPH locates in the central cavity. Discuss how NADPH enters the cavity. Is it because of the movement of TM1 or large-scale motions of the cytosolic loops?
Reply: We appreciate the reviewer's insightful question. To our understanding, the TM1 movement opens the entrance towards the hydrophobic lipid bilayer, which allows only hydrophobic substrates to enter.
NADPH is well solubilized in the cytosol and may enter the cavity through the motion of the cytosolic loops. Another piece of indirect evidence is that several disease-related mutations are located on cytosolic loops, especially the C-terminal loop (Figure 4). Although these loops do not interact with NADPH directly, their conformations are almost identical in PbSRD5A and MaSR1 (Extended Data Figure 2), indicating their important role in maintaining the enzyme activity. However, we did not see a conformational change of these loops during the simulations, so we did not discuss NADPH entry in the manuscript.
4. Please provide the RMSD trajectory of the MD simulation for HsSRD5A2 to demonstrate that the system has reached equilibrium.

On
Reply: Thanks for raising this point. We have provided the RMSD trajectory of the MD simulation below.
[15] scores from the best of our servers on those targets are 59.1 (from 2020-05-08 to 2020-08-01), which greatly surpassed Robetta (lDDT value is 47.7). In the near future, we shall release our approach as a web server and make it available to the public.
[15] Mariani, V., Biasini, M., Barbato, A. and Schwede, T., 2013. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics, 29 (21)
Reply: We thank the reviewer for helping to make our work more precise. We have made the related amendments in our manuscript. Steroids have multiple physiological and pathological functions. Here we mainly focused on the prostate disease field. Testosterone, androstenedione and progesterone are the main steroids we are interested in.
Cortisol, an important steroid, is also an important substrate of SRD5As, and we have added a related description to the main text.
2. The crystal structure reported was obtained from concentrated 5a-reductase containing NADPH. However, the final structure likely contains NADP+ due to oxidation of the cofactor that will occur. This is not a trivial point since NADPH is non co-planar while NADP+ is aromatic and this difference will likely affect the trajectory of hydride transfer.
Reply: We appreciate the reviewer's insightful question. Owing to the resolution limit, hydrogen atoms are invisible in the X-ray structure, so we cannot differentiate NADP+ from NADPH in the electron density map. For structure determination of PbSRD5A in complex with NADPH, we added 2 mM freshly made NADPH and 5 mM DTT to maintain a reducing environment during the whole purification process, and added an extra 10 mM NADPH before setting up the crystallization trials. The crystals grew for one week to reach full size (15 × 15 × 120 µm) in the droplets. The molar ratio of NADPH to PbSRD5A is about 10:1 in the lipidic cubic phase droplets, so the complex is more likely to contain NADPH than NADP+. We are unable to fish crystals and extract enough NADPH to determine the redox state. Meanwhile, we tried to obtain a substrate-bound structure with NADP+ to see whether there is any difference, but have not obtained the structure yet. This question therefore remains inconclusive at this stage, until a structure at sufficiently high resolution is obtained.
Figure 1b). The purification table was added in the extended materials (Extended Data Table 1) and the activities were calculated as:
Calculated activity = detected activity / (sample volume in an assay system × grey value / volume loaded on western blot),
normalized by sample #6.
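To make the normalization explicit, the formula above can be written as a short Python sketch. The function and argument names are hypothetical, units are left to the caller, and the numbers in the usage note are made up for illustration; they are not measured values.

```python
def calculated_activity(detected, assay_volume, grey_value, loaded_volume):
    """Activity corrected for the relative amount of enzyme in the assay.
    The enzyme amount is estimated from the western blot as
    grey_value / loaded_volume (signal per unit volume loaded), scaled by
    the sample volume used in the assay."""
    enzyme_amount = assay_volume * grey_value / loaded_volume
    return detected / enzyme_amount

def normalize(activities, reference_index):
    """Express every activity relative to a chosen reference sample."""
    reference = activities[reference_index]
    return [a / reference for a in activities]
```

For example, `normalize([2.0, 4.0], reference_index=1)` returns `[0.5, 1.0]`, i.e. the first sample retains half of the reference activity. Because the enzyme amount rests on western blot grey values, the result is only as precise as that quantification.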

In the supplemental material the authors show the purity of their
(Extended Data Figure 1b).
Your suggestion is well taken. The purification table (Extended Data Table 1) and diagram (Extended Data Table 1b) are updated in the revised manuscript.
As the results show, PbSRD5A retained about 20-30% of its apparent activity after affinity purification. There are several possible reasons. Firstly, the lipid environment is replaced by detergent micelles, and some functionally important lipid molecules may be lost during purification. A recent paper also provided evidence for the importance of lipids in SRD5A activity (Endocrinology, August 2020, 161(8):1-11). However, we reconstituted the protein into lipidic cubic phase (LCP) for crystallization. Currently we are unable to measure the enzyme activity in situ because the LCP droplet is gel-like. Secondly, the endogenous enzymes that may non-specifically catalyze the conversion of progesterone were removed by affinity purification. Additionally, protein quantification by western blot is not precise enough.

The authors show that their enzyme is only weakly inhibited by finasteride, whereas the human enzyme displays nM affinity for this drug. This difference deserves further comment since it raises issues that the bacterial enzyme may not be such a good model of the human enzyme after all. It is also known that finasteride is turned over by SRD5A1 to produce a high affinity enzyme generated bisubstrate analog which does not seem to be the case in their study [see J. Amer. Chem. Soc., 1996, 118, 2359-2365].
Reply: We thank the reviewer for the insightful question. We noticed that finasteride is turned over by human SRD5A2 (rather than SRD5A1, according to most articles and reviews) to produce a high-affinity bisubstrate analog, and a recent non-peer-reviewed paper posted on Research Square also confirmed this result.
The accuracy of a structural model is judged mainly by sequence homology. PbSRD5A shares 60.6% and 51.5% sequence similarity with human SRD5A1 and -2, respectively. We carefully checked the conservation of the substrate binding pockets, picked all residues that compose the pockets, and highlighted the conserved residues in Table R1 below.

Table R1 (columns: transmembrane helix No.; residue in human SRD5A1; residue in human SRD5A2; residue in PbSRD5A; conservation).
We then used PbSRD5A as the template to map these residues onto the structure (Figure R1). This clearly showed that the residues deeply buried inside the pocket are highly conserved, whereas the variable residues are gathered on the periplasmic side of TM1 and TM4. Considering the orientation of NADPH and the substrates, the conserved residues could recognize the 3-oxo and Δ4 signature structures of the substrates, and the variable residues probably interact with the "tail" part of the substrates, which is the major difference among testosterone, androstanedione, progesterone and the two inhibitors.
We speculate that finasteride may not be a good inhibitor of PbSRD5A because its tail part cannot be well recognized. Previous studies showed that human SRD5A1 and -2 also differ in inhibitor recognition.
Human SRD5A2 can be inhibited by both finasteride and dutasteride, but human SRD5A1 can only be strongly inhibited by dutasteride. There is NO inhibitor specific for SRD5A1 only. Without structural information, it is impossible to work out the recognition specificity of these SRD5As. This inhibition specificity offers an opportunity to develop inhibitors targeting a given SRD5A protein. This is why we believe that our structural information, combined with the docking results and biochemical analysis, will help in studying not only the SRD5A reaction mechanism but also the inhibitor recognition specificity of SRD5As. Currently, we are still trying to obtain structures of all three proteins with a variety of inhibitors. We have also applied these structures and models in extensive MD simulation studies to identify residues on TM1 and TM4 involved in inhibitor recognition. However, these parts are challenging and time-consuming, and need massive computing resources.
5. If the catalytic mechanisms of 5a-reductase and AKR1D1 are conserved, did the authors overlay the key residues from each structure to show the rmsd differences in the residues?
Reply: We thank the reviewer for the constructive suggestion. The PbSRD5A-progesterone complex model and the AKR1D1-progesterone structure were used as templates to show the similarity (Figure R2). We superimposed the NADPH molecules of the two structures and showed the progesterone molecules and the key residues (Tyr and Glu) in panels a and b, respectively. Because AKR1D1 and PbSRD5A transfer the hydride ion to the beta or alpha side of progesterone, respectively, to produce the corresponding 5β- or 5α-DHP, the NADPH, progesterone and key residues in PbSRD5A are organized almost as the mirror image of those in AKR1D1. The detailed distances between the substrate and the key residues are labeled in Figure 3 and Extended Data Figure 7.
6. In docking the substrate testosterone into their structure to generate a ternary complex, how many binding poses were observed and how did they compare in energy?
Reply: We thank the reviewer for the insightful question. In this work we performed two different kinds of docking: (1) using the crystal structure, for the PbSRD5A-progesterone complex; (2) using the models, for human SRD5A1 and -2. In both cases, the flexibility of the ligands and residues was considered.
The former follows the standard AutoDock 4 and AutoDock Vina docking procedures. Briefly, each ligand was docked with at least 250,000 poses and the calculations were repeated 100 times. Rather than simply looking at the docking scores, the poses with the best docking scores were further clustered according to their RMSDs. This gives self-consistent poses for all ligands. Note that AutoDock 4 and AutoDock Vina yield consistent results under this analysis.
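The RMSD-based clustering of top-scoring poses mentioned above can be illustrated with a short sketch. The reply does not state which clustering algorithm or cutoff was used, so the greedy leader clustering and the 2.0 Å cutoff below are assumptions, and the function names and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Heavy-atom RMSD between two poses of the same ligand, without superposition.

    Docked poses share the receptor frame, so no alignment is applied here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def cluster_poses(poses, cutoff=2.0):
    """Greedy clustering of (score, coords) poses; a lower score means a better pose.

    Poses are visited from best to worst score. A pose joins the first cluster
    whose leader (its best-scoring member) lies within `cutoff` angstroms,
    otherwise it starts a new cluster.
    """
    clusters = []
    for score, coords in sorted(poses, key=lambda pose: pose[0]):
        for cluster in clusters:
            leader_coords = cluster[0][1]
            if rmsd(coords, leader_coords) <= cutoff:
                cluster.append((score, coords))
                break
        else:
            clusters.append([(score, coords)])
    return clusters


# Toy two-atom "ligand" poses, for illustration only.
toy_poses = [
    (-8.5, [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)]),
    (-9.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-7.0, [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)]),
]
toy_clusters = cluster_poses(toy_poses)
```

A large, well-scored cluster indicates a pose that is reproduced across independent runs, which is what makes it more trustworthy than a single best score.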
The latter procedure is a more complicated ensemble docking, based on a conformational ensemble that accounts for thermodynamic effects and is obtained from all-atom MD simulations under isothermal-isobaric conditions. This approach has been shown to be effective in sampling unseen, druggable pockets for multiple targets (see Nat. Commun., 4 (2013), p. 1407 and PLoS Negl Trop Dis. 2010, 4(8):e803). This is a critical step in assessing the credibility of our modeled structures and of the binding poses predicted from a crystal structure snapshot. In brief, all the MD simulation parameters were set up to mimic the physiological conditions of SRD5As (ER membrane, pH 7.4, and 0.15 M NaCl), and a state-of-the-art force field for membrane proteins, CHARMM36m, was employed to describe the systems. To avoid artificial/unphysical deformation of the conserved binding pose of NADPH inside the proteins during the simulations, a harmonic potential (500 kJ/mol) was applied to restrain the movement of the heavy atoms of the NADPH molecule. Moreover, given that our biochemical results have shown the catalytic activity of NADPH/HsSRD5As towards testosterone, a harmonic potential (50 kJ/mol) was applied to restrain the center-of-mass distance between the nicotinamide group of NADPH and the C=O group of testosterone to that predicted by the initial docking pose, ensuring efficient MD sampling of the reaction-active state of the system. After 10,000 steps of energy minimization, a 200 ns MD simulation was performed to fully relax the protein, membrane, and solvent step by step (Figure R2).

Figure R2: RMSD of HsSRD5A1/NADPH and HsSRD5A2/NADPH in complex with testosterone along the MD simulations.
After these careful treatments, a production conformational ensemble was generated from another 50 ns of MD simulation in the isothermal-isobaric ensemble, which allows us to consider thermodynamic effects on the conformation of the proteins while avoiding large (high-energy) conformational changes (see Nat. Commun., 4 (2013), p. 1407 and PLoS Negl Trop Dis. 2010, 4(8):e803). Then, flexible docking was carried out for each structure of the conformational ensemble using AutoDock Vina. Note that exhaustive docking (repeated 80 times) was carried out for each conformation, and only the best (strongest binding affinity) docking result was recorded for later data analysis. Finally, we report two representative binding poses for each system: a) Conf1, the one with the highest binding affinity predicted by the Vina scoring function from the whole conformational ensemble (500 structures); b) Conf2, the representative of the cluster of conformations that has the highest binding affinity on average. The relationship between the representative binding poses and the other (high-energy) binding poses was characterized by a correlation analysis of their binding affinity and the RMSD of the ligand relative to the representative conformation Conf1 (Figure R3).
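The selection of the two representative poses can be sketched as follows. This is a hedged illustration only: the record layout (`frame`, `cluster`, `affinity`) is hypothetical, the numbers are made up, and the reply does not say how the representative of a cluster was chosen, so picking the member closest to the cluster mean affinity is an assumption.

```python
from statistics import mean


def pick_representatives(results):
    """Return (conf1, conf2) from per-snapshot best docking results.

    Each result is a dict with 'frame', 'cluster' and 'affinity' (kcal/mol,
    more negative = stronger binding, as in the Vina scoring convention).
    conf1: single strongest-binding snapshot over the whole ensemble.
    conf2: representative of the cluster with the strongest mean affinity.
    """
    conf1 = min(results, key=lambda r: r["affinity"])

    # Group snapshots by cluster label.
    by_cluster = {}
    for r in results:
        by_cluster.setdefault(r["cluster"], []).append(r)

    # Cluster whose members bind most strongly on average.
    best_label = min(
        by_cluster, key=lambda c: mean(r["affinity"] for r in by_cluster[c])
    )
    members = by_cluster[best_label]

    # Assumption: the representative is the member closest to the cluster mean.
    centre = mean(r["affinity"] for r in members)
    conf2 = min(members, key=lambda r: abs(r["affinity"] - centre))
    return conf1, conf2


# Made-up ensemble of five snapshots in two clusters.
ensemble = [
    {"frame": 0, "cluster": "a", "affinity": -12.0},
    {"frame": 1, "cluster": "b", "affinity": -11.3},
    {"frame": 2, "cluster": "b", "affinity": -11.1},
    {"frame": 3, "cluster": "a", "affinity": -9.0},
    {"frame": 4, "cluster": "b", "affinity": -11.5},
]
conf1, conf2 = pick_representatives(ensemble)
```

Reporting both poses separates the single best score, which can be an outlier, from the pose that is favourable across a whole family of relaxed conformations.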

Figure R3 (panels: HsSRD5A1, HsSRD5A2): A correlation analysis of the binding affinity and the RMSD of the ligand relative to the one in the representative conformation Conf1. Here, each data point represents the average values (RMSD and/or binding affinity) of a cluster; the standard deviations of RMSD and binding affinity within each cluster are shown as error bars.
Our results show clearly that the two representative binding poses of testosterone in the HsSRD5A2/NADPH complex are very similar in 3D space (upper panel of Figure R4) and that their binding affinities scored by Vina are very close (-12 kcal/mol and -11.3 kcal/mol). The same holds for the simulation of HsSRD5A1/NADPH (lower panel of Figure R4). Note that in both cases, the ligand is stabilized by a "Q-E-Y" motif of the protein.
Moreover, beyond the two representatives, we found that the top 10 conformations ranked by Vina binding affinity also have similar binding poses (Figure R5), which again confirms that this interaction pattern is reliable even when the protein is relaxed. Overall, we conclude that upon binding NADPH the pocket is conditioned for substrate binding and the subsequent reaction.
So there has to be some method used to compensate for differences in protein expression. This could have been achieved with epitope tagging with FLAG-tags.
Reply: We thank the reviewer for the constructive suggestion. We measured the expression level by western blot using an anti-human SRD5A2 antibody and updated Extended Data Figure 8. (Page 9, Line 6)
8. The author should move Fig. 9 from the supplemental material to the body of the manuscript. This figure describes the catalytic mechanism for 5a-reductase which is a main point of the manuscript. In describing the mechanism in the discussion, the properties of the E57Q mutant seem poorly described. Surely, E57 is present in the protonated state to facilitate hydride transfer to the C3 carbonyl and this property would be lost in Q57 where an amide is present instead. Please clarify.
Reply: We thank the reviewer for the constructive suggestion and have made changes in the revised version. (Figure 4; Page 11, line 18-21)
9. In the opinion of this reviewer the link to the COVID-19 pandemic is overstated. The paper has many attributes and this could just get a passing mention. TMPRSS2 expression is usually up regulated in late stage prostate cancer often following ADT; and expression differences may exist across ethnic groups.
Reply: We thank the reviewer for the constructive suggestion and have removed the COVID-19 part in the revised version.

The structure has not been deposited in the PDB which is a criterion for publication.
Reply: We thank the reviewer for the constructive suggestion and have deposited the structure in the PDB (PDB code: 7C83).
Minor:
1. NADPH dependent oxidoreductases do not belong to the family EC1.3.1.22. This is just the enzyme commission number where instead a family suggests some evolutionary relationship.
Reply: Thanks for raising this point. We have deleted this description in the revised version. (Page 3, line 13-14)

NADPH is not an electron and proton donor; it donates a hydride ion.
Reply: Thanks for raising this point. We have corrected this in the revised version. (Page 3, line 17)
3. 5a-androstenedione is misspelled; it should be 5a-androstanedione.
Reply: Thanks for raising this point. We have corrected all instances of this misspelled word in the revised version.

Please define AKR1D1 at first mention.
Reply: Thanks for raising this point. We have defined AKR1D1 as steroid 5β-reductase, Aldo-keto reductase family 1 member D1 in the revised version. (Page 8, line 19)

Reviewer #3 (Remarks to the Author):
The work is original and highly relevant for the field.
Reply: We thank the reviewer for the constructive suggestion. As far as we know, there are at least 2 drugs on the market targeting SRD5As, finasteride and dutasteride. The indications of finasteride cover benign prostatic hyperplasia and alopecia, and the indication of dutasteride is benign prostatic hyperplasia. That is why we used these two benchmark compounds to inhibit SRD5As in our work. It is worth mentioning that dutasteride targets both SRD5A1 and -2, whereas finasteride specifically targets SRD5A2. We mainly focus on the molecular mechanism of SRD5A-induced prostate cancer occurrence and progression. The activity of SRD5A1, but not -2, is dramatically increased in prostate cancer patients for unknown reasons. However, there is NO specific SRD5A1 inhibitor available on the market. Additionally, the current SRD5A inhibitors can be classified into steroid-like and non-steroid-like molecules. Both finasteride and dutasteride are steroidal medicines. Long-term treatment with finasteride or dutasteride reduces the overall incidence of prostate cancer but results in more aggressive prostate cancer. The molecular mechanism behind this outcome is still unclear, so the development of non-steroidal inhibitors may provide an alternative choice. We believe that our structural information, combined with computational studies and biochemical analysis, will shed light on rational inhibitor design. Our future work will therefore focus on inhibitor-SRD5A complex structure determination, biochemical analysis, and novel inhibitor design.
crystal, #2 cannot catalyze 3-oxo-Δ4-steroids, and #3 is partly aggregated after concentration. Additionally, these 3 candidates exhibited totally different inhibition properties towards finasteride and dutasteride.
Although we did not report the biochemical analysis in the manuscript, these 3 candidates may serve as good candidates in follow-up studies to explore substrate and inhibitor specificities.
Q3: Sequence similarity between PbSRD5A and HsSRD5A is mentioned as between 52% and 51% overall but no mention of the sequence between SBD and NDPBD domains?
Reply: We thank the reviewer for the constructive suggestion. The overall sequence similarity between PbSRD5A and HsSRD5A1 is 60.6%. The SBDs (TM1-4) and NDPBDs (TM5-7) of PbSRD5A and HsSRD5A1 share 50.3% and 72.3% similarity, respectively. We also took the suggestion from another reviewer not to define these two domains, because we did not obtain a substrate-protein complex and cannot rule out a contribution of TM5-7 to substrate binding.
Q4: Proteobacteria bacterium is a gram-negative bacterium from the same type as Escherichia, Salmonella. What was the reason to use an insect cell expression system on a protein from this organism?
Reply: We thank the reviewer for the interesting question. We tried the E. coli expression system first and obtained a small amount of protein. However, the size-exclusion profile of the protein was poor and the N-terminal tag could not be removed. Considering the high sequence similarity with human SRD5As, we tested the insect cell system and obtained functional protein. Besides, a recent paper also provided evidence for the importance of lipids in SRD5A activity (Endocrinology, August 2020, 161 (8)
Reply: We thank the reviewer for the insightful question. Indeed, we tried very hard to obtain a substrate- or inhibitor-bound structure but did not get any promising results. One possible reason is that the monoolein concentration is much higher than the solubility of the substrates and inhibitors. We added inhibitors and substrates to the purification buffers, monoolein and crystallization buffers, but could not replace monoolein in the structures.
A recent non-peer-reviewed paper posted on Research Square supports the view that the pocket is the real binding site for substrates and inhibitors (see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7373137/). Because human SRD5A2 forms a bisubstrate complex with finasteride, which is not the case for PbSRD5A, we also tried to cocrystallize dutasteride and other inhibitors with those SRD5As. Moreover, the disease mutations we mentioned in the manuscript can be classified into 3 categories. Q53, E54, and Y89 in PbSRD5A are located in the monoolein binding site but do not interact with NADPH. The good size-exclusion profiles of these 3 mutants (Q56A, E54L and Y89F) indicate that the overall structures are maintained. This is also indirect evidence confirming the substrate binding site.

Q7: Have you attempted or considered obtaining a complex with an inhibitor or substrate?
Reply: We thank the reviewer for the constructive suggestion. We tried very hard and are still attempting to obtain complex structures with inhibitors and substrates using HsSRD5As and other homologs.
Reply: We thank the reviewer for the insightful question. We tried MR using MaSR1 TM5-10 but failed.
We also tried the Se-Met method but could not obtain enough protein for crystallization. Soaking with heavy metals is quite challenging because within a few seconds the LCP droplets change to the lamellar phase, which gives strong diffraction spots under X-rays.

Q10: Please explain what you mean by "All diffraction data analyses have been reproduced at least three times" as stated under the Reporting Summary?
Reply: Thanks for raising this question. The diffraction data we used were processed independently by two authors (Xiao, Q. and Ren, R.). Before submission to the PDB, Xiao re-processed the data again to confirm the accuracy.
Reply: Thanks for raising this question. We corrected the number in the revised manuscript from 5% to 10%. We re-examined the parameters of the refinement process. By default, 10% of the data are selected to calculate Rfree in Phenix. In practice, however, Rfree was calculated with 8.71% of the reflections. The following figure lists R work and R free during the refinement process.

Reply:
We thank the reviewer for the insightful discussion. First of all, we briefly describe the criteria for homolog selection. Candidates were ranked by sequence similarity, host species, expression level, size-exclusion chromatography profile and crystal quality. We blasted over 100 species, cloned the 10 candidates with the highest sequence similarity to human SRD5A2, selected the 4 with the highest expression level, and set up one round of crystallization trials. We also conducted biochemical analysis for all 4 candidates. Here we simply name the other 3 candidates #1, #2 and #3. #1 can efficiently catalyze progesterone, testosterone and androstanedione but did not yield crystals. #2 cannot catalyze 3-oxo-Δ4-steroids at all, although it shares over 50% sequence similarity. #3 can catalyze progesterone, testosterone and androstanedione, but the protein partly aggregated after concentration. The inhibition patterns are different as well.
We have mentioned in the manuscript that the variable residues on TM1 and TM4 may be the main determinants of enzyme specificity. In follow-up work, high-throughput MD simulations will be performed, using 6 protein structural templates, 4 substrates and 2 inhibitors, to identify key residues involved in substrate recognition. Extensive biochemical analysis and mutagenesis studies are needed, although this is a time- and resource-consuming process.
Additionally, some reports suggest that, over evolutionary time, progestins preceded androgens as active steroid hormones; the substrate preference of the bacterial enzyme is consistent with this.
In general, we believe that the bacterial structure provides information of great use for characterizing human SRD5As.
3. In the abstract they indicate that the work would lead to more specific inhibitors of steroid 5a-reductase. This issue has been solved with finasteride and dutasteride. If the authors want a specific inhibitor only for SRD5A1 they should make the case as to why this is needed. The opinion of this reviewer is that the strength of the article is in the structural information on an important steroidogenic enzyme that has been lacking and that this information can be used to assign the properties of disease-related mutants. It is this point that should be stressed.

Reply:
We thank the reviewer for this suggestion, and we now strengthen the point that the results can be used to assign the properties of disease-related mutants (Page 2, Line 16-18). Regarding the statement on inhibitor development, there are 2 drugs on the market targeting SRD5As, finasteride and dutasteride. The indications of finasteride cover benign prostatic hyperplasia and male androgenetic alopecia, and the indication of dutasteride is benign prostatic hyperplasia. It is worth mentioning that dutasteride targets both SRD5A1 and -2, whereas finasteride specifically targets SRD5A2. However, there is NO specific SRD5A1 inhibitor available on the market. We mainly focus on the molecular mechanism of SRD5A-induced prostate disease. The activity of SRD5A1, but not -2, is dramatically increased in prostate cancer patients for unknown reasons. Based on our structural, biochemical and MD simulation information, the tail part of steroid-like inhibitors is essential for improving recognition specificity and potency. The variable residues of human SRD5A1 and -2, gathered in TM1 and TM4, are responsible for interacting with the tail of the inhibitors. We believe that our structural information, combined with computational studies and biochemical analysis, will shed light on the rational design of inhibitors targeting specific isoforms. Additionally, both finasteride and dutasteride are potent SRD5A inhibitors used in the clinic. However, long-term treatment with finasteride or dutasteride may increase the incidence of aggressive prostate cancer [1,2]. Novel (non-steroidal) inhibitors might be helpful for prostate cancer management. Our future work will focus on inhibitor-SRD5A complex structure determination, biochemical analysis, and novel inhibitor design.

Why do the authors claim that 5a-androstanedione regulates prostate function? (p3. Line 7)
Reply: We thank the reviewer for this comment. An alternative pathway has been reported in which DHT synthesis bypasses testosterone [3]. Androstenedione, rather than testosterone, is converted to 5α-androstanedione for DHT synthesis. This pathway might protect androgens from degradation by UGT2B15/17 and thereby facilitate androgen accumulation. The pathway was first found in castration-resistant prostate cancer cells. Later, in our lab, biopsy samples from benign patients (no prostate cancer cells at all) were also found to generate DHT through 5α-androstanedione, indicating that the function of 5α-androstanedione is not limited to prostate cancer. Since these results have not been published, we revised the manuscript and emphasized the function of 5α-androstanedione in prostate cancer only.
The authors have responded to my critique well except as it relates to catalytic mechanism. They still propose that E57 is the principal proton donor. However, their own mutagenesis data show that both Y91F and Y91D eliminate enzyme activity while residual activity remains in the E57Q mutant which is more consistent with Y91 acting as the general acid. Furthermore, examination of the crystal structure of AKR1D1 shows a dual function for the corresponding glutamic acid residue (it also allowed steroid substrates to penetrate the active site more deeply so that hydride transfer could occur to C5). How can the authors be so confident that Y91 is not the general acid and that E57 facilitates enolization of the carbonyl and allows penetration of the steroid into the PbSRD5A active site? Unless I am missing something the authors do not have compelling data to distinguish between these mechanisms.

Reply:
We thank the reviewer for the insightful discussion. We had failed to emphasize the importance of Y91 in human SRD5A2 in facilitating enolization of the carbonyl, and we have made a significant change in the new manuscript.
Here we would like to discuss the proton donor that completes the reduction reaction, based on a structural comparison of AKR1D1 and SRD5A2. Figure R1 shows the proposed reaction mechanism of SRD5A2 (Herbert G. Bull et al., JACS, 1996).
Figure R1. Proposed mechanism of human SRD5A2.
In general, the proton (represented by B-H in Figure R1) may come from Tyr, Glu or H2O in the pocket. Based on structural analysis and MD simulation, we rule out the possibility that H2O provides the proton in the presence of substrate. As shown in Figure R2, NADPH transfers the hydride ion to the beta-face of testosterone in AKR1D1 and to the alpha-face in SRD5As to achieve stereo-specificity. In addition to the hydride, a proton must be added at C4 of testosterone. The Glu and Tyr residues point to the alpha-face of testosterone in AKR1D1 and to the beta-face in SRD5As. We measured the distances from Glu and Tyr to C4 of testosterone in AKR1D1 and SRD5A2 and list them in Table R1. In SRD5A2, the hydrogen atom of the carboxyl group in the E57 side chain (3.5 Å) is closer to C4 of testosterone than the hydroxyl group of Y91 (4.9 Å). In contrast, in AKR1D1 the hydrogen atom of the hydroxyl group of Y58 (3.5 Å) is closer to C4 of testosterone than the carboxyl group of E120 (4.6 Å). We therefore propose that the proton may be transferred from E57 in SRD5A2 to C4 of testosterone.
Table R1. Distances (Å) of oxygen in glutamate and tyrosine side chains to C4 of testosterone.