Mechanistic insights into the loss-of-function mechanisms of rare human D-amino acid oxidase variants implicated in amyotrophic lateral sclerosis

Impaired enzymatic activity in D-amino acid oxidase (DAAO) caused by missense mutations has been shown to trigger amyotrophic lateral sclerosis (ALS) through an abnormal accumulation of D-serine in the spinal cord. While loss of enzymatic functions of certain ALS-causing DAAO variants have been studied before, a detailed understanding of structure-dynamics-function relationship of the rare DAAO variants has not been investigated hitherto. To address this, we carried out a comprehensive study of all the reported rare DAAO variants. By employing a spectrum of bioinformatics analyses along with extensive structural dynamics simulations, we show that certain rare variants disrupted key interactions with the active site and decreased the conformational flexibility of active site loop comprising residues 216–228, which is essential for substrate binding and product release. Moreover, these variants lost crucial interactions with the cofactor flavin-adenine-dinucleotide, resulting in weaker binding affinity. A detailed inspection revealed that these variants exhibited such characteristics due to the abrogation of specific salt bridges. Taken together, our study provides a gateway into the structural-dynamic features of the rare DAAO variants and highlights the importance of informatics-based integrated analyses in the screening and prioritization of variants a priori to the clinical-functional characterization.


Scientific Reports
| (2020) 10:17146 | https://doi.org/10.1038/s41598-020-74048-2 www.nature.com/scientificreports/ In this milieu, it is extremely important to understand the functional consequences and disease predisposition capabilities of the rare variants in order to define the genetic paradigm of ALS. As rare variants are known to cause an increased disease risk across diverse populations, not only their discovery but also understanding their structure-function relationship is warranted in order to comprehend the origin and progression of ALS. In light of this observation, we focused our study on the rare variants catalogued in the Project MinE consortium, whose goal is to sequence the whole genome of > 15,000 ALS patients and ~ 7500 control individuals to provide a comprehensive genetic architecture of ALS 12 .
To achieve this, we mined the Project MinE Variant Browser for DAAO variants owing to its increased association with ALS. This led to the identification of 20 rare variants from different populations, among which three were previously studied [13][14][15] . Human DAAO, a flavin-adenine-dinucleotide (FAD)-dependent oxidase, governs its neuroprotective functions via its active site residues, in addition to an active site loop comprising residues 216 to 228 that acts as an "active site lid", transitioning between a "closed" state observed in crystal structures and an "open" state required for the binding of substrate and release of product (Fig. 1A) 16 . These structural features facilitate in the catalysis of D-alanine, D-serine, and D-proline to their corresponding keto-acids in human DAAO through the process of oxidative deamination 17 . Several interesting studies using biochemical and functional characterization have shown that (1) mutations in Arg120 in human DAAO favor FAD-binding and nuclear mistargeting, (2) the G183R-DAAO variant loses its ability to degrade D-serine, leading to the formation of protein aggregates and is partially targeted to peroxisomes at cellular level, (3) FAD affinity is significantly increased by the presence of a ligand in the DAAO active sites, and (4) some variants are potentially susceptible to an increased risk of schizophrenia [18][19][20][21] . Moreover, the R199W and R199Q DAAO variants identified from ALS patients caused an altered protein conformation, resulting in loss of enzymatic activity and abnormal D-serine levels 22 . The loss of enzymatic activity, and excessive accumulation of D-serine in the spinal cord was further observed in sALS patients and in SOD1 G93A -ALS mouse model 14 . Because compromised enzymatic activity of DAAO happens to play a vital role in manifestation of ALS, as a result of excessive accumulation of D-serine, we recently demonstrated that certain structural and dynamic changes disrupts the enzymatic activity in ALS associated DAAO variants, leading to the disease causation 23 . This encouraged us to investigate if any of the Project MinE derived rare DAAO variants could also exhibit loss of enzymatic functions and if so, how and by what molecular mechanisms. To achieve this, we carried out several bioinformatics analyses in conjunction with extensive all-atom molecular dynamics (MD) simulations for wild-type (WT) and fifteen rare DAAO variants. All the DAAO rare variants with their corresponding ID and allele frequencies are presented in Table 1 and shown in Fig. 1B. Our integrated study including MD simulations and in silico approaches reveal that certain structural and dynamic attributes corresponding to the active site residues, an active site loop comprising residues 216-228 and binding with cofactor FAD, contribute to the plausible loss of enzymatic activity, thus likely to be predisposing the individuals carrying these variations to ALS. This report and our earlier study 23 , employing MD simulations and other informatics-based analyses clearly demonstrate that certain Project MinE based rare DAAO variants are also loss-of-function type and could cause ALS through an unusual deposition of D-serine in the spinal cord.

Results and discussion
In this study, we carried out a comprehensive analysis of all the Project MinE catalogued DAAO rare variants to examine the structural and dynamic changes related to the enzymatic activity and subsequently ALS association.
Fourth, the role of mutations on local structure stability of the rare variants were analyzed by computing the secondary structural elements during simulations. It was found that substitutions in some of the rare variants, such as, (1) in D46N, caused helix to turn during the simulations, (2) the R115W substitution while retained the sheet for maximum duration, also induced turn at certain instances, and (3) the P119L induced sheet conformation. These results indicated that the variants did not significantly impact the secondary structure during the simulations and the overall stability of the variants, further correlating with the RMSD profiles ( Figure S4).
Dynamics and plasticity of the active site loop. As only the WT and H78Y, R279Q and S340F variants exhibited an increased fluctuation and conformational flexibility at around the active site loop consisting of residues 216-228, we wanted to predict the importance of such characteristics in terms of DAAO's enzymatic activity. It is already established that active site loop consisting of residues 216-228 act as a lid in DAAO, and this transitions between a "closed" and "open" state for executing its enzymatic activity. In other words, the transition between "closed" to "open" conformation facilitates the enzymatic reaction when the substrate comes in and the product releases. Upon structural superposition of one of the open-conformation active site loop of WT-DAAO crystal structure in the presence of an inhibitor [3-(7-hydroxy-2-oxo-4-phenyl-2H-chromen-6-yl) propanoic acid] (PDB ID: 4QFD), onto the simulated structures of WT, it was observed that they exhibit the open state active site loop residues on top of the profusely found closed loop structures. Interestingly, it was also observed that certain rare DAAO variants including V5A, D46N, F90V, P103L, R115W, P119L, L215F, P268S, R283Q, R286C, L329F and G331E predominantly exhibited the active site loop in its closed state during the simulations (Fig. 3B). On the other hand, the H78Y, R279Q and S340F variants mainly displayed the open state active site loop conformation (Fig. 3C), similar to that of WT-DAAO (Fig. 3A). Furthermore, RMSD profiles of the active site loop residues from 216 to 228 show that the WT exhibits higher deviation during the 100 ns simulations as compared to certain other rare variants. Average RMSDs calculated from the respective trajectories further show that H78Y, R279Q and S340F variants exhibit relatively higher deviation than V5A, D46N, F90V, P103L, R115W, P119L, L215F, P268S, R283Q, R286C, L329F and G331E variants (Fig. 3D, Figure S5). Furthermore, the dynamics of the active site loop was characterized by measuring the distance of the key active site loop residues, Thr216-Arg221, Arg221-Tyr228 and Thr216-Tyr228. The calculated distance between Thr216-Arg221 and Arg221-Tyr228 pairs of residues for WT and the rare variants indicated that V5A, D46N, F90V, P103L, R115W, P119L, L215F, P268S, R283Q, R286C, L329F and G331E variants exhibited a considerable reduced distance as compared to WT and H78Y, R279Q and S340F variants (Fig. 4A,B). The WT and rare variants did not exhibit a major change in distance between Thr216-Tyr228 residue pairs (Fig. 4C). This gave us an impression that perhaps due to the "closed" active site loop conformation, V5A, D46N, F90V, P103L, R115W, P119L, L215F, P268S, R283Q, R286C, L329F and G331E rare variants cannot accomplish their enzymatic activity in catalyzing the oxidative deamination of D-amino acids (primarily D-serine) and hence trigger its excessive accumulation in ALS spinal cord. This finding was interesting as early experimental reports have shown that loss of enzymatic activity in ALS causing DAAO variant R199W is an important pathogenic factor for the build-up of D-serine, a co-agonist of the NMDA glutamate receptor inducing glutamate transmission 14 . Our present results in addition to earlier reports 23 , thus indicates that this loss of enzymatic activity in DAAO variants could be originated due to "closed" conformation of active site loop, owing to which they fail to catalyze the D-serine resulting in its abnormal build-up.
Role of crucial salt bridges in active site loop dynamics. As certain rare DAAO variants were proposed to exhibit restricted active site loop flexibility with predominantly experiencing closed conformation during the course of simulations, we were curious to know what are the driving forces of such behaviour. During our investigation of the structural features, it was found that variants V5A, D46N, F90V, P103L, R115W, P119L, L215F, P268S, R283Q, R286C, L329F and G331E established fewer salt bridges during the simulations as compared to WT and H78Y, R279Q and S340F variants, thus restricting their active site loop flexibility and movement. Computation of salt bridges revealed that WT and H78Y, R279Q and S340F variants had overall higher number of salt bridge interactions than V5A, D46N, F90V, P103L, R115W, P119L, L215F, P268S, R283Q, R286C, L329F and G331E variants during the MD simulations. Our focused analysis on only the salt bridges that are formed by or between the active site loop residues further revealed that WT established ten salt bridges whereas the rare variants D46N and L215F exhibiting closed active site loop dynamics established only three and four interactions respectively (Table 2). Interestingly, R279Q and S340F variants formed relatively higher i.e. eight salt bridges, of which some of the unique interactions were absent in D46N and L215F variants ( Table 2). The time profile salt bridge formation/dissolution, and percentage occupancy analysis for the rare DAAO variants further demonstrates this and highlight that salt bridges indeed play a major role in DAAO dynamics and eventually in the reduced conformational plasticity of active site loop in DAAO variants as compared to WT and H78Y, R279Q and S340F variants ( Table 2). For clarity of representation, salt bridge interactions with respect to time for other variants have been presented in Figure S6. Overall, a thorough inspection of the MD simulation trajectories emphasized  www.nature.com/scientificreports/ differently and exhibited reduced flexibility during the simulations. It was also observed that the majority of the dynamics occurring in these molecular systems was contributed by a small number of eigenvectors representing the overall collective motions. Next, the Gibbs free energy landscape (FEL) plot was generated from the PC1 and PC2 coordinates, where the blue, green, and cyan color represented metastable conformations with a lowenergy state, while red color signified high-energy protein conformation. In the FEL plots, the ∆G value of the rare DAAO variants ranged from 13.6 to 16.3 kJ/mol (Fig. 5). This further demonstrated that V5A, D46N, F90V, P103L, R115W, P119L, L215F, P268S, R283Q, R286C, L329F and G331E variants undergone different dynamics as compared to H78Y, R279Q and S340F (Fig. 5).

RIN of rare DAAO variants. RIN analysis is valuable to identify key residues of interactions as well as
perturbed interactions for an individual protein and among a group of proteins during a dynamic run. In this study, we first determined residue-residue interactions between the mutated residues with active site residues and secondly, between the active site loop (residues 216-228) with the rest of the proteins for WT and all the DAAO variants. Our analysis on residue-residue interactions between variant residues and the active site residues during 100 ns simulations revealed that WT-DAAO possessed several interactions among the active site residues ( Figure S7), whereas in D46N variant had fewer interactions in total as a result of the substitution from Asp46 to Asn46 ( Figure S7). On the other hand, S340F variant showed relatively higher interactions than the D46N variant ( Figure S7). The RINs between the variant and active site residues for the remaining variants have been presented in Figure S8. The differences in residue-residue interactions for the rare variants in relation to the active site residues are summarized in Table S1.
Next, our analysis on the RINs between active site loop residues 216-228 revealed that WT-DAAO has a greater number of interactions than rare variants V5A, D46N, F90V, P103L, R115W, P119L, L215F, P268S, R283Q, R286C, L329F and G331E ( Figure S9). It was observed that while WT-DAAO possessed several key interactions such as hydrogen bond between the Arg221-Asp255, Arg221-Asp127, Glu220-His99, and Asp218-His99 residues, the D46N variant failed to establish these interactions ( Figure S9). However, the S340F formed a few unique interactions than the D46N variant showing its ability to exhibit a flexible active site loop ( Figure S9). The RINs between the active site loop with the rest of the proteins for the remaining variants are presented in Figure S10.

Binding free energy calculations and interactions with cofactor FAD.
To understand the affinity between rare DAAO variants with cofactor FAD, binding free energies were obtained from respective trajectories of DAAO variants using MM-PBSA method. As shown in Table 3, while the binding free energy of WT was found to be the lowest, followed by H78Y, S340F and R279Q variants, the binding free energies with cofactor FAD for V5A, D46N, F90V, P103L, R115W, P119L, L215F, P268S, R283Q, R286C, L329F and G331E variants were found to be negatively influenced by such variations. A detailed inspection of energy components revealed that the van der Waal and electrostatic energies typically determine the stronger versus the weaker binding affinities towards cofactor FAD. Ligand interaction diagrams of rare variants with cofactor FAD from the average structures further show that L215F and D46N variants established fewer molecular interactions as compared to S340F and R279Q ( Figure S11). It was observed that while FAD established ~ 16 types of molecular interactions with L215F and D46N, in S340F and R279Q variants, the FAD established ~ 20 types of molecular interactions with the surrounding residues ( Figure S11). This confirms that due to the formation of certain key interactions, variants such as H78Y, R279Q and S340F have higher affinity and favourable binding towards cofactor FAD than the others. Moreover, studies have previously shown that the DAAO apoprotein exists in two conformations at equilibrium in solution, where the state having higher affinity with FAD is significantly enhanced in the presence of ligands, such as benzoate in the active site 21 .
Comparison with experimentally characterized ALS associated DAAO variants. In one of our recent studies, we have effectively shown that certain ALS causing DAAO variants, such as R199W and R199Q, Glu100-His217 (5%) Absent Absent Glu100-His217 (0.2%) Glu100-His217 (2%) www.nature.com/scientificreports/ lose their enzymatic activity due to reduced active site loop plasticity and decreased affinity towards FAD 23 . These results highlighted the molecular mechanisms by which the ALS associated DAAO variant, R199W, loses its enzymatic activity and leads to the build-up of D-serine in the ALS spinal cord 14 . It was further observed that R199W and R199Q variants fail to form certain vital salt bridges, such as Glu220-Arg265, Glu100-His217, Glu261-Arg221, Asp255-Arg221 and Asp127-Arg221, which caused the reduced active site loop flexibility, possibly leading to the loss of enzymatic activity. These correlation in structural and dynamic attributes encouraged us to further investigate the Project MinE documented uncharacterized variants of DAAO. Our comprehensive analyses of all the rare DAAO variants showed that V5A, D46N, F90V, P103L, R115W, P119L, L215F, P268S, R283Q, R286C, L329F and G331E rare DAAO variants would lose their enzymatic activity due to an assimilated interplay between restricted active site loop flexibility, distorted interactions with active site residues and eventually due to weak binding towards FAD.
Limitations of the study. We acknowledge certain limitations in the current study. Our work provides mechanistic insights into the loss-of-function mechanisms for existing mutations using a spectrum of in silico analyses, and hence, comparison with functional assays would be conclusive to support and validate the computational predictions. Moreover, owing to the large number of DAAO variants being discovered frequently, their detailed structure-function analyses through extensive sampling or replica-exchange or longer time-scale simulations would provide a conclusive role of functionally important regions of human DAAO in relation to human diseases.

Conclusion
Disease causing potential of rare genetic variations cannot be ruled out because of their increased prevalence and susceptibility among ethnic groups, especially in the pathophysiology of complex disorders like ALS. Even if certain rare variants could be tolerated, they may be highly lethal when combined with other variants by exerting a synergistic negative effect on disease development. One such emerging protein associated with ALS is DAAO, where the number of rare variants is enormous, though their structure-function relationship and possible role in loss-of-functions have never been established. We have earlier shown that missense mutations in DAAO cause ALS through loss-of-function mechanisms and during the course of this research, we realized there are several ALS associated rare variants of DAAO for which the mechanisms of disease association are not understood. Rare variants act as risk factors and disease modifiers to increase disease burden and thus investigating their role is of great importance to define the genetic architecture of ALS. Further, predicting whether a novel variant is deleterious or benign, is one of essential tasks with respect to disease pathophysiology. We were curious to comprehend if certain structural and dynamic attributes can be identified from our comprehensive in silico analyses, which can be used to distinguish a deleterious/benign variant and provide insights into their functional loss mechanisms and subsequently ALS causation. Although a multidisciplinary approach is the ultimate solution to this complexity, extensive in-silico analysesbased approaches will be of immense use in shortlisting and prioritizing deleterious variants for further clinical and biological studies. To fill this gap, we mined all the rare DAAO variants documented in ProjectMinE. Through extensive analyses, we found that some of the variants should have a collective consequence on the active site loop, and cofactor binding of DAAO, all of which play a key role in negatively affecting the enzymatic function. This report along with previous works by us and other researchers demand that whole exome sequencing of ALS www.nature.com/scientificreports/ patients across diverse geographical regions is required to identify novel DAAO variants in order to develop effective therapeutic strategies. This work gives us a set of structural and dynamic attributes, which we can quickly compute and evaluate for any novel DAAO variant and predict their loss-of-function mechanisms. Our computational methodology can be expanded to other ALS relevant protein families and neurodegenerative disorders to bridge the gap between variant identification, characterization and disease manifestation through a better insight into their structure-dynamics-function relationships.

Methods
Retrieval of rare variants of DAAO. Rare DAAO variants were retrieved from the Project MinE data browser and then filtered for our analysis as follows. First, only the missense variants with rare variant burden of 0.5% were extracted and variants for which the functional mechanisms are not known were selected for our study. Out of a total of 20 variants (V5A, R38H, D46N, H78Y, F90V, P103L, R115W, P119L, R199W, R199Q, L215F, P268S, R279Q, R283Q, R286C, L329F, G331E, S340F, S345F and S345C), the R38H, R199W and R199Q variants were not included in this study because they were previously studied by us and others 13,14,23 . Further, variants S345F and S345C were excluded because the crystal structure of WT-DAAO (PDB ID: 2E49) contains residues until Ser340 only. The allele frequency data retrieved for these variants indicated that they qualify as rare variants (MAF < 0.05%). Next, the human DAAO (Accession number: P14920) protein sequence was retrieved from UniProtKB to locate the position of source residues, which were then altered to the mutated residues.
Modeling of rare DAAO variants and all-atom MD simulations. The co-crystal structure of WT-DAAO bound to FAD and imino-serine (PDB ID: 2E49) was used as the starting structure for all-atom MD simulations 24,25 . As the crystal structures of the rare variants of DAAO were not available, the variant structures were modelled using the swapaa tool of UCSF Chimera and the models were evaluated using PROCHECK for stereochemical quality assessment 26,27 . In total, 15 rare variants of DAAO including V5A, D46N, H78Y, F90V, P103L, R115W, P119L, L215F, P268S, R279Q, R283Q, R286C, L329F, G331E and S340F were modelled and considered in this study. First, the topologies for FAD and imino-serine were assigned using ATB Repository and the resulting complexes were added with hydrogen atoms 28,29 . Each molecular system was then solvated in a dodecahedron box of simple point charge (SPC) water in the center at a minimum distance of 1.0 nm between the protein surface and box boundary 30 . Following solvation, each system was electrostatically neutralized and simulations were carried out with gromos96 54A7 force field using GROMACS 5.1.2 [31][32][33] . Each simulation was performed by employing 50,000 steps of energy minimization followed by a gradual heating from 0 to 300 K in 200 ps and constant temperature equilibration for 1000 ps at 300 K. During this, the Parrinello-Rahman barostat pressure coupling was used to avoid the impact of velocity 34 . Upon the completion of these steps, each system was inspected for equilibration at the desired temperature and pressure, and subsequently, 100 ns production runs were carried out for each rare variant with a periodic boundary condition in the NPT ensemble with modified Berendsen temperature coupling at 300 K and a constant pressure of 1 atm 35 . The leap-frog integrator and the Verlet cut-off schemes were employed. The LINCS algorithm was used to constrain the bond lengths and the Particle-mesh Ewald method was employed to calculate the electrostatic forces 36,37 .
Analysis of MD simulations. The MD simulated trajectories of rare DAAO variants were analyzed using gmx rms, gmx rmsf and gmx gyrate GROMACS utilities to obtain the root mean square deviation (RMSD), root mean square fluctuation (RMSF), and radius of gyration (Rg) respectively. The intermolecular hydrogen bonds and salt bridges were computed using visual molecular dynamics (VMD)'s hydrogen bonds and salt bridge modules respectively. The formation or dissolution of salt bridges with respect to time was processed using the "timeline" plugin of VMD 38 . The secondary structure content for the rare DAAO variants during the course of the simulations were measured using the DSSP (Dictionary of Secondary Structure of Proteins) tool of the GROMACS ' do_dssp' utility.
Essential dynamics and free energy landscape of DAAO rare variants. To understand the prevailing and cooperative modes of the rare DAAO variants from the complete dynamics of the MD trajectories, principal component analysis (PCA) was performed, where a variance/covariance matrix was constructed by calculating the eigenvectors and eigenvalues and their projection along the first two principal components 39 . Briefly, PCA is based on the diagonalization of the covariance matrix C, with the elements defined as follows: where r i is the Cartesian coordinate of the ith C⍺ atom, N is the number of C⍺ atoms considered, and < r i > represents the time average over all the configurations obtained in the simulation. By projecting the MD trajectories of the variants onto the main essential direction, which corresponds to the largest eigenvector, one can visualize the extreme structures and the major fluctuations of the correlated motions. Following this, the movements of the variants in the essential subspace were identified by the projection of Cartesian coordinates across the trajectories. The percentage variability of the rare DAAO variants was calculated by the eigenvalues associated with each eigenvector. The eigenvalues and eigenvectors were obtained by diagonalizing the covariance matrix. Further, the conformational changes of the rare variants free energy landscape (FEL) were computed using gmx sham package 40,41 .

Residue interaction network (RIN) of DAAO rare variants.
To understand the effect of variations on overall interacting residues of the rare DAAO variants, RINs were constructed using an integrated computational C ij ≤ (r i − < r i >) * (r j − < r j > ) (i, j = 1, 2, 3, . . . 3N) Scientific Reports | (2020) 10:17146 | https://doi.org/10.1038/s41598-020-74048-2 www.nature.com/scientificreports/ framework comprising UCSF Chimera, structureViz and RINalyzer plugins of Cytoscape [42][43][44][45] . For this, UCSF Chimera was used to load the MD trajectories of the rare variants and its RIN wizard was used to compute the residue interactions and then visualized using structureViz. In the network diagrams, the amino acids were represented by nodes and non-covalent interactions by edges. Furthermore, contacts, hydrogen bonds, backbone connectivity and distances were retrieved from each rare variant. After the RINs were generated, they were visualized by the "yFiles Tree layout" in Cytoscape. For each DAAO variant, two interaction networks were generated.
Binding free energy between rare DAAO variants and FAD. For calculating the binding free energies between rare DAAO variants and FAD, molecular mechanics/Poisson Boltzmann surface area (MM-PBSA) methodology embedded in the g_mmpbsa tool of GROMACS was used 46,47 . In MM-PBSA, the binding free energy of the protein and ligand is typically defined as where ΔG complex , ΔG protein , and ΔG ligand represent the total free energies of the protein-ligand complex, the protein and the ligand separately in the solvent respectively 48,49 . Moreover, the free energy of the individual entity is symbolized as where E MM, G solvation and TS represent the average molecular mechanics potential energy in the vacuum, free energy of solvation, and entropic contribution to the free energy in vacuum respectively. The T and S denotes the temperature and entropy, respectively. Further, the E MM consists of bonded and nonbonded terms including bond angle, torsion, and electrostatic (E elec ) and the VDW (E vdw ) interactions respectively. Lastly, the solvation free energy, G solvation considers both electrostatic and non-electrostatic (G polar and G nonpolar ) components. The binding free energies of each system with FAD were calculated for 400 snapshots taken from the final 30 ns of the simulation.

Figures and rendering.
Relevant figures were rendered and generated using PyMOL and VMD. Cytoscape was used for the generation of RINs, and ligand interaction diagrams were generated using Molecular Operating Environment (MOE) software version 2018.01.