MELD-accelerated molecular dynamics help determine amyloid fibril structures

It is challenging to determine the structures of protein fibrils such as amyloids. In principle, Molecular Dynamics (MD) modeling can aid experiments, but normal MD has been impractical for these large multi-molecules. Here, we show that MELD accelerated MD (MELD x MD) can give amyloid structures from limited data. Five long-chain fibril structures are accurately predicted from NMR and Solid State NMR (SSNMR) data. Ten short-chain fibril structures are accurately predicted from more limited restraints information derived from the knowledge of strand directions. Although the present study only tests against structure predictions – which are the most detailed form of validation currently available – the main promise of this physical approach is ultimately in going beyond structures to also give mechanical properties, conformational ensembles, and relative stabilities.


Selection of fibrils for MELD x MD simulation
From the database of 109 amyloid fibril structures of Stanković et al., we selected 12 short-peptide fibrils that represent different classes of steric zipper. The structures of these fibrils were solved by X-ray diffraction. Classes 1-4 have a parallel alignment of β-strands and classes 5-8 have an antiparallel arrangement of β-strands within each β-sheet. The long fibrils were selected to span a range of monomer sequence lengths, numbers of chains per fibril, and total fibril lengths. We treat fibrils whose peptide sequence is longer than 10 residues as long fibrils. The monomer sequence length of these fibrils varies from 11 (PDB 2m5n) to 79 (PDB 2kj3) residues. The oligomeric states also differ (from a trimer for PDB 2kj3 to a 16-mer for PDB 2m5n). The total fibril sequence lengths of PDB 2beg, 2e8d and 2m5n are below 200 residues, whereas those of PDB 2lnq, 2kj3 and 2mxu are longer. The structure of PDB 2beg was solved by solution NMR, and all other structures by solid-state NMR.

Restraint protocol for short fibrils
Inter-strand distance restraints of 4.8 Å are imposed between the Cα atoms of all corresponding residues. Backbone dihedral angle restraints with approximate values for parallel (phi = -119°, psi = +113°) and antiparallel (phi = -139°, psi = +135°) β-strands are also imposed on all residues, with a standard deviation of ±5°. Inter-sheet distance restraints of 10 Å are imposed between the Cα atoms of the three central residues of each strand. Because these restraints are ambiguous and uncertain, we chose, somewhat arbitrarily, to enforce all of them at 80% accuracy.
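The short-fibril restraint protocol can be sketched as a small generator of restraint lists. This is an illustrative sketch only, not the input used in the study: the function names, the `(strand, residue)` indexing, the in-register pairing rule for antiparallel strands (residue i with residue n-1-i), and the choice to pair strands at the same stacking position across sheets are all assumptions made for the example.

```python
from itertools import combinations

INTER_STRAND_A = 4.8   # target Cα-Cα distance within a sheet (Å), from the text
INTER_SHEET_A = 10.0   # target Cα-Cα distance between sheets (Å), from the text
BACKBONE_DIHEDRALS = {  # approximate (phi, psi) in degrees, from the text
    "parallel": (-119.0, 113.0),
    "antiparallel": (-139.0, 135.0),
}


def central_residues(n_res, k=3):
    """Indices of the k central residues of a strand with n_res residues."""
    start = (n_res - k) // 2
    return list(range(start, start + k))


def short_fibril_restraints(sheets, n_res, arrangement):
    """Build restraint lists for a short fibril.

    sheets: list of sheets, each a list of strand ids in stacking order.
    Returns (inter_strand, inter_sheet, dihedral) lists.
    """
    phi, psi = BACKBONE_DIHEDRALS[arrangement]

    # Inter-strand restraints between neighbouring strands of the same sheet.
    inter_strand = []
    for sheet in sheets:
        for a, b in zip(sheet, sheet[1:]):
            for i in range(n_res):
                # Assumed in-register pairing; antiparallel strands run in reverse.
                j = i if arrangement == "parallel" else n_res - 1 - i
                inter_strand.append(((a, i), (b, j), INTER_STRAND_A))

    # Inter-sheet restraints among the three central residues of facing strands.
    core = central_residues(n_res)
    inter_sheet = []
    for sheet_a, sheet_b in combinations(sheets, 2):
        for a, b in zip(sheet_a, sheet_b):
            for i in core:
                for j in core:
                    inter_sheet.append(((a, i), (b, j), INTER_SHEET_A))

    # One (phi, psi) target per residue of every strand.
    dihedral = [(s, i, phi, psi)
                for sheet in sheets for s in sheet for i in range(n_res)]
    return inter_strand, inter_sheet, dihedral
```

For two sheets of two six-residue strands each, `short_fibril_restraints([[0, 1], [2, 3]], 6, "parallel")` returns the three lists that a simulation setup script could then translate into its own restraint format.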

Restraints for long fibrils
The restraint data (distance restraints and dihedral angle restraints) for long fibrils are obtained from the 'NMR Restraints' file of each PDB entry. We enforce atom-atom NMR distance restraints at 80%. The backbone dihedral angle restraints are also enforced at 80%. The pairwise contacts of the distance restraints used for the different fibrils are shown here. The x-axes are scaled to show the fibril size.
The experimental structures are shown on the right, with the distance restraints used in the study shown in red.
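One way to read "enforced at 80%" is that, at any instant, only the best-satisfied 80% of the restraints in a set contribute energy, so that up to 20% of possibly wrong restraints can be ignored. The sketch below illustrates that reading. The flat-bottom harmonic form, the force constant, the rounding-down rule and the function names are assumptions made for the example and are not taken from the text.

```python
import math


def flat_bottom_energy(distance, target, tolerance=0.0, k=1.0):
    """Zero inside target ± tolerance, harmonic outside (assumed form)."""
    excess = max(0.0, abs(distance - target) - tolerance)
    return 0.5 * k * excess ** 2


def enforce_fraction(energies, fraction=0.8):
    """Keep only the lowest-energy fraction of restraints active.

    Returns (active_indices, total_energy_of_active_restraints).
    """
    # Number of active restraints, rounded down (assumption).
    n_active = math.floor(fraction * len(energies) + 1e-9)
    ranked = sorted(range(len(energies)), key=lambda i: energies[i])
    active = sorted(ranked[:n_active])
    return active, sum(energies[i] for i in active)


# Hypothetical Cα-Cα distances (Å) for five restraints with a 4.8 Å target.
distances = [4.8, 5.0, 12.0, 4.7, 5.2]
energies = [flat_bottom_energy(d, 4.8) for d in distances]
active, total = enforce_fraction(energies, fraction=0.8)
```

In this toy set the badly violated third restraint (12.0 Å) is the one left inactive, which is the behaviour that makes partial enforcement tolerant of ambiguous data.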

MELD x MD simulation for the conversion of extended monomers to trimer
The protocol for generating fibril structures of longer-chain fibrils is to first generate a trimeric structure starting from extended monomer chains. We then build up the actual fibril from the trimeric structure (except for PDB 2kj3). For PDB 2kj3, the monomer structure is first generated from the extended chain, and the fibril structure is then built up. The RMSD vs. time plots and RMSD histograms for the conversion of monomers to trimers are shown below. The error bar is shown at 5.0 Å, the cutoff for folding to the native fibril. The RMSD data are for the five lowest-temperature replicas, the same replicas that were clustered for analysis.
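The 5.0 Å criterion can be illustrated with a minimal RMSD calculation and a count of frames below the cutoff. This sketch assumes that each frame has already been superposed on the reference structure; a real analysis would first perform an optimal superposition. The function names are hypothetical.

```python
import math

CUTOFF_A = 5.0  # RMSD cutoff (Å) for counting a frame as native-like, from the text


def rmsd(coords, reference):
    """RMSD between two equally sized lists of (x, y, z) tuples.

    Assumes the two structures are already superposed.
    """
    if len(coords) != len(reference):
        raise ValueError("structures must have the same number of atoms")
    squared = sum((a - b) ** 2
                  for p, q in zip(coords, reference)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(reference))


def fraction_within_cutoff(rmsd_values, cutoff=CUTOFF_A):
    """Fraction of frames whose RMSD is at or below the cutoff."""
    return sum(r <= cutoff for r in rmsd_values) / len(rmsd_values)
```

Applied to the RMSD series of the lowest-temperature replicas, `fraction_within_cutoff` gives the population that lies on the native side of the 5.0 Å line in the histograms.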

MELD x MD simulation with limited information
For both short and long fibrils, we ran a set of simulations with limited restraints data.
(a) The restraint protocols used for short fibrils with limited information are as follows:
1. MELD without dihedral angle restraints (S1): Here we removed the dihedral angle restraints but kept the distance restraints intact. These simulations gave fibril structures similar to those of the general protocol, and removing the dihedral angle restraints did not change the population of the top cluster in most cases. However, the peptide backbones are more flexible than in the general protocol.
2. MELD without dihedral angle restraints and with limited distance restraints (S2): Here we limited the number of distance restraints but kept the information on strand arrangement (parallel/antiparallel) intact. For 7 out of 12 cases, the most populated cluster is within 5.0 Å RMSD of the reference PDB.
3. Unguided MELD simulations (S3): Finally, we asked whether the physical computations alone, without specific information, could predict the arrangement of strands in short fibrils. A set of simulations was run without any informative distance restraints or dihedral angle information. However, to keep the peptide monomers from straying too far from each other, we used one distance restraint of 4.8 Å between the Cα atoms of the central residues of each peptide strand. We call these simulations 'unguided' because they use no directive restraints. The top MELD clusters of these unguided simulations are far from the reference PDB.
Table 4 shows the amount of restraints used in these different systems. The RMSD vs. time plots and RMSD histograms for systems S2 and S3 are shown in Supplementary Figures 7-8. System S1 is omitted, as its population distributions are similar to those of the general protocol. The results of all the different systems are summarised in Supplementary Table 2.
(b) To assess the quality of MELD x MD structure prediction with limited restraints information for long fibrils, we simulated a set of systems for four structures, namely 2beg, 2e8d, 2mxu and 2m5n.
We skipped PDB 2lnq and 2kj3 in the limited-restraints analysis. PDB 2kj3, with a monomer chain length of 79 amino acids, is the longest in our selection of fibrils, and it forms cross-β-sheets between residues within the individual chains. Because of its large size, we assumed that simulations without informative restraints would not be successful. The MELD prediction for 2lnq was unsuccessful even with SSNMR restraints, because 2lnq has the lowest number of restraints per residue.
The restraint protocols used for these simulations are as follows. System L1: Here we simulated the structures with NMR distance restraints but without any dihedral angle restraints. The most populated clusters in all cases are within 5.0 Å RMSD of the reference PDB.
System L2: Next, we simulated these fibril structures using a restraint protocol similar to that of our MELD simulations of short fibrils. PDB 2m5n has two straight cross-β sheets, like the short fibrils, so its restraint protocol is the same. The fibrils of PDB 2beg and 2e8d have turn or loop regions between strands and are characterised by a U-shaped, strand-loop-strand (β-loop-β) arrangement. Treating this knowledge as external information, we incorporated intra-sheet distance restraints of 4.8 Å between residues of the monomers. We used dihedral angle restraints for parallel β-sheets (phi = -119°, psi = +113°) for all residues except those in the turn regions. We also applied intra-monomer distance restraints of 10 Å (analogous to the inter-sheet distance restraints for short fibrils) between Cα atoms of residues at one end of each fibril monomer and those at the other end. PDB 2mxu is an S-shaped fibril. However, we used the same restraint protocol as for the U-shaped structures, because a general restraint protocol for S-shaped fibrils is difficult to define without suitable experimental data. Even with this limited information, the centroids of the most populated clusters are within 4.0 Å RMSD of the reference fibril structure, except for PDB 2mxu (Supplementary Figures 9-12). The failure for 2mxu is due to enforcing a large number of inaccurate restraints at 80% precision, as described in our protocol, so the MELD prediction is far from the native fibril structure. These results suggest that, although extensive experimental data may not be necessary, some limited but qualitatively informative data are required to predict fibril structure.
System L3: We also performed another set of MELD x MD simulations in which the only information was the parallel strand arrangement. The restraint input here is inter-monomer distance restraints of 4.8 Å between the Cα atoms of corresponding residues of parallel strands. We did not apply any intramolecular restraints or dihedral angle restraints. In the absence of any intra-monomer restraints, the RMSD values in system L3 are much higher in all cases.
Supplementary Table 5 shows the amount of restraints used in the different systems for long fibrils. The results of all the predicted structures of these systems are shown in Supplementary Figures 9-12.
The unguided MELD simulations of system S3 generate oligomeric structures in all cases, and for 8 of the 12 cases the most populated MELD cluster has the same strand arrangement (parallel/antiparallel) as the reference PDB structure. For the other 4 fibrils, the oligomeric structures are a random mix of parallel and antiparallel strands. However, the RMSDs are higher in all cases.
Figure 12: The pairwise contacts, RMSD vs. time and RMSD histograms for PDB 2mxu in the different simulations. The most populated cluster for system L1 is within 5.0 Å RMSD of the reference PDB. For system L2, the MELD prediction is far from the native fibril structure: PDB 2mxu is an S-shaped fibril, whereas our general restraint protocol is designed for U-shaped structures, so a large number of inaccurate restraints were enforced at 80% precision. In system L3, in the absence of any intra-monomer restraints, the RMSD deviation is much higher.
Table 5: Limited restraints information used in the different systems for long fibrils. System L0 is the reference, in which all NMR restraints are imposed. In system L1, dihedral angle restraints are removed. In system L2, the restraint protocol is similar to that for short fibrils. In system L3, only inter-monomer distance restraints of 4.8 Å for parallel strands are imposed. Dihedral angle restraints are also counted when calculating restraints per residue.
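The restraints-per-residue figure used to compare systems can be written as a one-line calculation. The sketch below assumes that the denominator is the total number of residues in the fibril (monomer length times number of chains). The text does not state the normalisation explicitly, and the counts in the example are hypothetical placeholders, not values from Table 5.

```python
def restraints_per_residue(n_distance, n_dihedral, monomer_length, n_chains):
    """Distance plus dihedral restraints divided by total residues.

    Normalising by monomer_length * n_chains is an assumption.
    """
    total_residues = monomer_length * n_chains
    if total_residues <= 0:
        raise ValueError("fibril must contain at least one residue")
    return (n_distance + n_dihedral) / total_residues


# Hypothetical counts for a 3-chain fibril of 30-residue monomers.
with_dihedrals = restraints_per_residue(120, 60, monomer_length=30, n_chains=3)
without_dihedrals = restraints_per_residue(120, 0, monomer_length=30, n_chains=3)
```

Comparing the two values shows how removing dihedral restraints, as in system L1, lowers the information density even when the distance restraints are unchanged.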

The failure modes, and recovering from them
To recover the failed MELD structures, we carried out another set of simulations. Our general MELD protocol for short fibrils failed to predict the structures of two short fibrils, 2omq and 2ona. To improve the structure prediction, we added distance restraints with accurate inter-sheet distances derived from the reference PDB. The restraints are applied between the Cα atoms of all corresponding residues of the two sheets to ensure a steric-zipper interface pattern similar to that in the PDB. A total of 24 accurate inter-sheet distance restraints are added in both cases. The failure for PDB 2lnq, on the other hand, is mostly associated with the intrinsic flexibility of the peptide in the simulations, as the number of inter-monomer β-sheet alignment restraints was not sufficient. We therefore added an extra 60 inter-strand distance restraints between residues according to the reference PDB structure. These inter-strand distance restraints of 4.8 Å are added between the Cα atoms of all corresponding residues.
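Deriving "accurate" inter-sheet restraints from a reference structure amounts to measuring Cα-Cα distances between corresponding residues of the two sheets. A minimal sketch follows, assuming the Cα coordinates of the two sheets have already been extracted from the reference PDB as equally ordered lists. The function name and the input layout are hypothetical, and the coordinates in the example are made up.

```python
import math


def reference_distance_restraints(sheet_a_ca, sheet_b_ca):
    """Measure Cα-Cα distances between corresponding residues of two sheets.

    sheet_a_ca, sheet_b_ca: lists of (x, y, z) tuples in matching order.
    Returns a list of (pair_index, distance_in_angstrom).
    """
    if len(sheet_a_ca) != len(sheet_b_ca):
        raise ValueError("sheets must list the same number of Cα atoms")
    return [(i, math.dist(a, b))
            for i, (a, b) in enumerate(zip(sheet_a_ca, sheet_b_ca))]


# Made-up coordinates (Å) for two residues in each sheet.
sheet_a = [(0.0, 0.0, 0.0), (0.0, 0.0, 4.8)]
sheet_b = [(10.0, 0.0, 0.0), (9.0, 0.0, 4.8)]
restraints = reference_distance_restraints(sheet_a, sheet_b)
```

Each measured distance replaces the generic 10 Å inter-sheet target with a value specific to that residue pair.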

The replica exchange condition of different MELD x MD simulations
High-temperature replicas (larger replica indices) mixed well with the low-temperature ones. Exchanges are attempted every 50 ps, and the acceptance probability of replica exchanges is typically about 30-50%. In some cases we observed poor replica exchange in simulations of long fibrils that started from extended monomers. Poor replica exchange may reduce sampling and limit the accuracy of structure prediction. Increasing the number of replicas improved the exchanges. The exchanges among replicas and the acceptance probability of replica exchanges over simulation time for PDB 2mxu are shown.
Figure 15: The exchanges among different replicas (above) and the acceptance probability (below) for PDB 2mxu in different MELD x MD simulations. (a) shows poor exchange with 28 replicas when starting from extended monomers to form the fibril; (b) shows a simulation with 60 replicas, in which the high-temperature replicas (larger replica indices) mixed well with the low-temperature ones; (c) shows that, even with 28 replicas, replica exchange becomes much better when the fibril structure is formed starting from trimers.
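The acceptance probabilities discussed here follow from a Metropolis test on pairs of replicas. The sketch below shows the standard criterion for temperature replica exchange only. It is a simplification for illustration: it does not model any change of restraint strength along the replica ladder, and the function names and example energies are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def exchange_probability(e_i, t_i, e_j, t_j):
    """Metropolis acceptance probability for swapping two replicas.

    e_i, e_j: potential energies (kcal/mol); t_i, t_j: temperatures (K).
    Uses p = min(1, exp[(beta_i - beta_j) * (e_i - e_j)]).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    if delta >= 0.0:
        return 1.0  # also avoids overflow in exp for large delta
    return math.exp(delta)


def acceptance_ratio(accepted_flags):
    """Fraction of attempted exchanges that were accepted."""
    return sum(accepted_flags) / len(accepted_flags)
```

Because the exponent scales with the energy gap between neighbouring replicas, adding replicas narrows the temperature spacing and raises the acceptance probability, consistent with the improvement seen on going from 28 to 60 replicas.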

Convergence of simulations
To check the convergence of our replica exchange simulations, we compared the RMSD histograms of all 30 replicas, computed relative to the last frame of the simulation. Converged simulations should give overlapping histograms. In some cases, the trajectories at higher replica indices are not converged.
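The degree to which two RMSD histograms overlap can be quantified in several ways. The sketch below uses a simple overlap coefficient (the sum of bin-wise minima of two normalised histograms), which is 1 for identical distributions and 0 for disjoint ones. The choice of this metric, the binning range and the function names are assumptions made for the example. The text itself only states that converged simulations give overlapping histograms.

```python
def normalised_histogram(values, lo, hi, n_bins):
    """Histogram of values on [lo, hi] with n_bins bins, normalised to sum to 1."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi falls into the last bin.
            index = min(int((v - lo) / width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def histogram_overlap(p, q):
    """Overlap coefficient of two normalised histograms (0 = disjoint, 1 = identical)."""
    return sum(min(a, b) for a, b in zip(p, q))


def replica_overlap(rmsd_a, rmsd_b, lo=0.0, hi=10.0, n_bins=10):
    """Overlap between the RMSD distributions of two replicas."""
    return histogram_overlap(normalised_histogram(rmsd_a, lo, hi, n_bins),
                             normalised_histogram(rmsd_b, lo, hi, n_bins))
```

Computing `replica_overlap` for each replica against a chosen low-index replica gives a single number per replica, and values that fall off at high replica indices flag the unconverged trajectories.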