Conformational Transitions of the Pituitary Adenylate Cyclase-Activating Polypeptide Receptor, a Human Class B GPCR

The G protein-coupled pituitary adenylate cyclase-activating polypeptide receptor (PAC1R) is a potential therapeutic target for endocrine, metabolic and stress-related disorders. However, many questions regarding the structure and dynamics of PAC1R remain unanswered. Using microsecond-long simulations, we examined the open and closed PAC1R conformations, which are interconnected within an ensemble of transitional states. The open-to-closed transition can be initiated by “unzipping” the extracellular domain and the transmembrane domain, mediated by a unique segment within the β3-β4 loop. Transitions between different conformational states take microseconds to milliseconds, implicating allosteric effects that propagate from the extracellular face of the receptor to the intracellular G protein-binding site. Such allosteric dynamics provide structural and mechanistic insights into the activation and modulation of PAC1R and related class B receptors.

). Extended simulations were carried out only from the starting models of G1, G2, G3, and G4. We did not pursue simulations starting from the models g1', g3', and g4', since they converted to intermediate conformations similar to those of G1, G3, and G4 after 36, 5, and 2 ns, respectively.
The Wootten numbering. Similar to the Ballesteros-Weinstein numbering scheme 5 for class A GPCRs, the Wootten numbering scheme 6 identifies the conserved residues in the transmembrane (TM) helices of class B GPCRs and facilitates comparisons among members of the same class. The most conserved residue in each TM helix is designated X.50b, where X is the TM helix number and the "b" distinguishes class B from class A; all other residues in that helix are numbered relative to this conserved position (highlighted in bold in Supplementary Table 1).
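The relative numbering described above amounts to a simple offset from the X.50b position. The following is a minimal sketch, not taken from the original work; the function name `wootten_label` and the reference residue number are hypothetical and chosen only for illustration.

```python
def wootten_label(helix: int, resnum: int, ref_resnum: int) -> str:
    """Return the Wootten label of a residue in a class B TM helix.

    helix      -- TM helix number X (1-7)
    resnum     -- sequence number of the residue to label
    ref_resnum -- sequence number of the most conserved residue (X.50b)
    """
    if not 1 <= helix <= 7:
        raise ValueError("TM helix number must be between 1 and 7")
    # Residues are counted relative to the conserved position, which is 50.
    return f"{helix}.{50 + (resnum - ref_resnum)}b"


# Hypothetical reference position, for illustration only (not PAC1R data).
REF = 200
print(wootten_label(2, REF, REF))      # 2.50b
print(wootten_label(2, REF + 3, REF))  # 2.53b
print(wootten_label(2, REF - 4, REF))  # 2.46b
```

A residue three positions after the conserved one in TM2 is therefore 2.53b, and one four positions before it is 2.46b.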
The gridcount method. Gridcount 7 is an analysis tool that creates 3D number densities from molecular dynamics (MD) trajectories. It is used to examine the density of water or ions near proteins or in channels and pores. First, it counts occurrences of molecules on a 3D grid over an MD trajectory and, after proper normalization, creates a 3D density map. Next, cylindrical and slice averages are calculated from the map, and the result is converted into a density map readable by VMD. 8
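The counting and normalization steps described above can be illustrated with a short NumPy sketch. This is not Gridcount's own implementation or interface; the function names, the random toy coordinates, and the grid are assumptions made for illustration, and only the slice average (not the cylindrical average) is shown.

```python
import numpy as np


def number_density(frames, edges):
    """Time-averaged 3D number density from a trajectory.

    frames -- array of shape (n_frames, n_atoms, 3) with atom positions
    edges  -- list of three 1D arrays of bin edges (x, y, z)
    Returns counts per frame per unit volume on the grid.
    """
    frames = np.asarray(frames, dtype=float)
    n_frames = frames.shape[0]
    # Count occurrences of atoms in every voxel over the whole trajectory.
    counts, _ = np.histogramdd(frames.reshape(-1, 3), bins=edges)
    # Normalize by the number of frames and by the voxel volume.
    wx, wy, wz = (np.diff(e) for e in edges)
    voxel_volume = wx[:, None, None] * wy[None, :, None] * wz[None, None, :]
    return counts / (n_frames * voxel_volume)


def slice_average(density, axis=2):
    """Average the density over the two axes perpendicular to `axis`."""
    other = tuple(a for a in range(3) if a != axis)
    return density.mean(axis=other)


# Toy data: 100 hypothetical particles moving uniformly in a 10 x 10 x 10 box.
rng = np.random.default_rng(0)
frames = rng.uniform(0.0, 10.0, size=(50, 100, 3))
edges = [np.linspace(0.0, 10.0, 6)] * 3   # 5 bins of width 2 per axis
density = number_density(frames, edges)
profile = slice_average(density, axis=2)  # density profile along z
print(density.shape, profile.shape)
```

Integrating the density over the box recovers the number of particles per frame, which is a quick check that the normalization is consistent.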

Supplementary Information B: The construction, validation, and analyses of our Markov state model (MSM)
We used the MSMBuilder 9,10 program to construct the MSM transition matrix between the open and closed conformations and calculated the transition timescales based on transition-path theory. [11][12][13][14] We first prepared the trajectory dataset by saving the Cα coordinates of residues 30-419, selected by atom index.
Cα atoms are often used to represent the overall protein structure, and the highly dynamic terminal residues should be excluded. Based on the RMSD measurements from the microsecond MD refinements (Supplementary Fig. 3), each PAC1R model reached a relatively stable state after 200~500 ns and was then relaxed for another 1.5~2.5 µs to confirm its stability. The percentages of the open and closed states in the total conformations we sampled were 23.2%, 17.5%, 19.9%, and 25.1% for G4, G1, G2, and G3, respectively. We therefore collected six shorter MD simulations of 20~50 ns each and the first 200~550 ns of the long simulations, totaling 6,324 configurations.
Next, we grouped the conformation dataset into a set of clusters, called microstates, based on structural similarity. We used the k-centers algorithm 9 to group the dataset into 55 clusters with the RMSD metric, giving a mean distance of ~0.34 Å and a maximum distance of ~0.56 Å, which is within the range of the RMSD standard deviations over the last ~1.5 µs in Supplementary Fig. 3. The small number of clusters, in contrast to the hundreds to thousands of intrinsic conformations in previous protein folding/unfolding studies, 12,15 results from the limited conformational changes, which are due to linker rotation and partial melting.
Supplementary Fig. 3 Time evolution of PAC1R RMSDs. RMSDs were computed after backbone alignment to the initial structures, with standard deviations of 0.34~0.56 Å in the last ~1.5 µs. Each PAC1R model reached a relatively stable state after 200~500 ns and was then relaxed for another 1.5~2.5 µs to confirm its stability. The conformational states between which we calculated the shortest pathways are circled.
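The k-centers clustering mentioned above can be sketched as a greedy farthest-point procedure. The code below is an illustration under stated assumptions and is not the MSMBuilder implementation: it uses the Euclidean distance between feature vectors as a stand-in for the aligned RMSD, and the function name and toy data are hypothetical.

```python
import numpy as np


def k_centers(points, k, seed_index=0):
    """Greedy k-centers clustering.

    Starting from one seed point, repeatedly promote the point that is
    farthest from all existing centers to a new center, then assign every
    point to its nearest center.
    Returns (center indices, cluster assignment, distance to own center).
    """
    points = np.asarray(points, dtype=float)
    centers = [seed_index]
    dist = np.linalg.norm(points - points[seed_index], axis=1)
    assign = np.zeros(len(points), dtype=int)
    for c in range(1, k):
        new_center = int(np.argmax(dist))       # farthest point so far
        centers.append(new_center)
        d_new = np.linalg.norm(points - points[new_center], axis=1)
        closer = d_new < dist                   # points that switch cluster
        assign[closer] = c
        dist = np.minimum(dist, d_new)
    return np.array(centers), assign, dist


# Toy data: three well-separated 2D blobs of 20 points each (not PAC1R data).
rng = np.random.default_rng(1)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
points = np.vstack([c + 0.1 * rng.standard_normal((20, 2)) for c in blob_centers])
centers, assign, dist = k_centers(points, k=3)
print("mean distance to center:", dist.mean())
print("max distance to center:", dist.max())
```

The mean and maximum distances printed at the end correspond to the kind of cluster-quality figures quoted in the text (mean and maximum distance to the cluster center).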

With the microstate discretization in place, a series of microstate transition matrices was constructed at increasing observation intervals (lag times) τ, 2τ, …, nτ. The implied timescales and the Chapman-Kolmogorov test 16 were then used to examine whether a transition matrix with the given microstate discretization and a chosen lag time is Markovian. 12,16,17 We used maximum likelihood estimation to build the transition matrices at the series of lag times. Increasing the lag time allows states to become larger and more coarse-grained, because the longer the lag time, the fewer states are kinetically relevant (states that reach each other on timescales faster than the lag time are no longer kinetically distinct). 18 The implied timescales as a function of the lag time and the eigenvalues of the transition matrix are shown in Supplementary Fig. 4. The number of macrostates, four, was determined from the number of major gaps in the implied timescales as well as the number of eigenvalues of the transition matrix that are close to 1. 19 The division into four macrostates was calculated from the eigenfunction structure using the Perron Cluster Cluster Analysis (PCCA) method. 19 Consistently, the conformations of G1, G2, G3, and G4 were lumped into the four macrostates A, B, C, and D, respectively.
The Chapman-Kolmogorov test can be applied to an individual state or to a set of states. 12,16 In general, we compared the transition probability matrix T(nτ) estimated from the transition counts (the observed trajectory) with [T(τ)]^n from the MSM for a given set of states A defined in the starting microstate discretization. 12,16 The initial stationary distribution at time τ restricted to a set A is given by

$$w_i^A = \begin{cases} \dfrac{\pi_i}{\sum_{j \in A} \pi_j}, & i \in A \\ 0, & i \notin A \end{cases}$$

where π is the stationary distribution of the m × m transition matrix T(τ). The trajectory-based time dependence of the probability to be in A after time nτ with starting distribution $w^A$ is given by

$$p_{MD}(A, A; n\tau) = \sum_{i \in A} w_i^A \, p_{MD}(i, A; n\tau)$$

where $p_{MD}(i, A; n\tau)$ is the trajectory-based estimate of the stochastic transition function, given by

$$p_{MD}(i, A; n\tau) = \frac{\sum_{j \in A} c_{ij}^{obs}(n\tau)}{\sum_{j=1}^{m} c_{ij}^{obs}(n\tau)}$$

where $c_{ij}^{obs}(n\tau)$ is the number of transition counts between states i and j at time nτ.
Likewise, the probability to be in A according to the Markov model is given by

$$p_{MSM}(A, A; n\tau) = \sum_{i \in A} \left[ (w^A)^{T} \, T^{n}(\tau) \right]_i$$

We tested how well the equality $p_{MD}(A, A; n\tau) = p_{MSM}(A, A; n\tau)$ holds, that is, whether the solid line lies within the error bars of the dashed line in Supplementary Fig. 5. The uncertainties of the transition probabilities estimated from the MD trajectories are computed as

$$\epsilon_{MD}(A, A; n\tau) = \sqrt{ n \, \frac{p_{MD}(A, A; n\tau) - \left[ p_{MD}(A, A; n\tau) \right]^2}{\sum_{i \in A} \sum_{j=1}^{m} c_{ij}^{obs}(n\tau)} }$$

About 27 microstates constitute the shortest and second-shortest transition pathways between the closed and open states; they were grouped into four subsets based on the macrostate division.
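The Chapman-Kolmogorov comparison described above can be made concrete with a small NumPy sketch. It is an illustration only and does not use or reproduce the MSMBuilder API: the function names are hypothetical, the transition matrix is the simple row-normalized (non-reversible) maximum likelihood estimate, and the three-state toy trajectory is synthetic rather than PAC1R data.

```python
import numpy as np


def count_matrix(dtraj, n_states, lag):
    """Sliding-window transition counts c_ij at a lag given in frames."""
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    return counts


def transition_matrix(counts):
    """Row-normalized counts (non-reversible maximum likelihood estimate)."""
    rows = counts.sum(axis=1, keepdims=True)
    return counts / np.where(rows > 0, rows, 1.0)


def stationary_distribution(T):
    """Left eigenvector of T with the largest eigenvalue, normalized to sum 1."""
    vals, vecs = np.linalg.eig(T.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    return pi / pi.sum()


def implied_timescales(T, lag):
    """t_i = -lag / ln|lambda_i| for all eigenvalues except the largest."""
    vals = np.sort(np.abs(np.linalg.eigvals(T)))[::-1][1:]
    return -lag / np.log(vals)


def ck_test(dtraj, n_states, lag, subset, n_max):
    """Return (n, p_MD, p_MSM, error) for n = 1..n_max and a state set A."""
    T = transition_matrix(count_matrix(dtraj, n_states, lag))
    pi = stationary_distribution(T)
    w = np.zeros(n_states)
    w[subset] = pi[subset] / pi[subset].sum()   # stationary weights on A
    results = []
    for n in range(1, n_max + 1):
        c = count_matrix(dtraj, n_states, n * lag)
        row_tot = c.sum(axis=1)
        p_i = np.divide(c[:, subset].sum(axis=1), row_tot,
                        out=np.zeros(n_states), where=row_tot > 0)
        p_md = float(w @ p_i)                   # trajectory-based estimate
        p_msm = float((w @ np.linalg.matrix_power(T, n))[subset].sum())
        err = float(np.sqrt(n * (p_md - p_md ** 2) / row_tot[subset].sum()))
        results.append((n, p_md, p_msm, err))
    return results


# Synthetic three-state Markov chain used as a toy discrete trajectory.
T_true = np.array([[0.90, 0.10, 0.00],
                   [0.05, 0.90, 0.05],
                   [0.00, 0.10, 0.90]])
rng = np.random.default_rng(2)
dtraj = np.zeros(20000, dtype=int)
for t in range(1, len(dtraj)):
    dtraj[t] = rng.choice(3, p=T_true[dtraj[t - 1]])

results = ck_test(dtraj, n_states=3, lag=1, subset=[0], n_max=5)
for n, p_md, p_msm, err in results:
    print(f"n={n}  p_MD={p_md:.3f}  p_MSM={p_msm:.3f}  error={err:.3f}")
print("implied timescales (frames):", implied_timescales(T_true, lag=1))
```

For n = 1 the two estimates agree by construction, since the model is built from the same counts; the informative comparison is at n > 1, where a Markovian discretization keeps the model prediction within the error of the trajectory estimate.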
The Chapman-Kolmogorov test of the four subsets is shown in Supplementary Fig. 5.
Our open-state model (G4) clearly shows that the 21-aa sequence does not impact the peptide-binding groove, 20 which explains why the 21-aa deletion left the binding properties unaltered in previous studies. 21