Determining subunit-subunit interaction from statistics of cryo-EM images: observation of nearest-neighbor coupling in a circadian clock protein complex

Biological processes are typically actuated by dynamic multi-subunit molecular complexes. However, interactions between subunits, which govern the functions of these complexes, are hard to measure directly. Here, we develop a general approach combining cryo-EM imaging technology and statistical modeling and apply it to study the hexameric clock protein KaiC in Cyanobacteria. By clustering millions of KaiC monomer images, we identify two major conformational states of KaiC monomers. We then classify the conformational states of (>160,000) KaiC hexamers by the thirteen distinct spatial arrangements of these two subunit states in the hexamer ring. We find that distributions of the thirteen hexamer conformational patterns for two KaiC phosphorylation mutants can be fitted quantitatively by an Ising model, which reveals a significant cooperativity between neighboring subunits with phosphorylation shifting the probability of subunit conformation. Our results show that a KaiC hexamer can respond in a switch-like manner to changes in its phosphorylation level.


S2 Statistics of pattern distribution in pure hexamers
In this note, we briefly explain some critical points and introduce additional fitting methods and results mentioned in the main text, regarding the statistics of hexamer pattern distribution in pure hexamers.
The conformational patterns of hexamer. As mentioned in the main text, each monomer can be in either the exposed (Ex) state or the buried (Bu) state, so there are $2^6 = 64$ possible hexamer configurations in total. Because all the monomers are regarded as identical, configurations that differ only by a rotation are grouped into the same hexamer conformational pattern, as shown in Supplementary Fig. 11, and the number of configurations in a group is that pattern's degeneracy. Furthermore, for the two patterns that differ only in chirality (pattern-(7) and pattern-(8) in Supplementary Fig. 11a), we found that their numbers are quite close: in KaiC-AA, pattern-(7) has 1,292 hexameric particles and pattern-(8) has 1,399; in KaiC-EE, pattern-(7) has 6,306 hexameric particles and pattern-(8) has 6,248. Hence, we ignore this chiral effect and regard the two patterns as the same. There are therefore 13 patterns in total, whose degeneracies are listed in Supplementary Fig. 11b.
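The pattern counting described above can be checked by brute force. The following is a minimal Python sketch, not the authors' code: the function names are illustrative, and the encoding of the two monomer states as 0 and 1 is an assumption made only for this example. It enumerates all 64 configurations, groups them by rotation, and optionally merges mirror images.

```python
from itertools import product

N = 6  # number of subunits in the hexamer ring


def rotations(cfg):
    """All cyclic rotations of a ring configuration (a tuple of 0/1)."""
    return [cfg[k:] + cfg[:k] for k in range(N)]


def canonical(cfg, merge_chirality):
    """Smallest equivalent configuration, used as the pattern label."""
    images = rotations(cfg)
    if merge_chirality:
        # A mirror image is the reversed ring, again taken up to rotation.
        images += rotations(cfg[::-1])
    return min(images)


def count_patterns(merge_chirality):
    """Map each pattern label to its degeneracy (number of configurations)."""
    degeneracy = {}
    for cfg in product((0, 1), repeat=N):
        key = canonical(cfg, merge_chirality)
        degeneracy[key] = degeneracy.get(key, 0) + 1
    return degeneracy


rotation_only = count_patterns(merge_chirality=False)
with_mirror = count_patterns(merge_chirality=True)
print(len(rotation_only), len(with_mirror))  # 14 13
```

Grouping by rotation alone gives 14 patterns; merging the single chiral pair gives 13, and the degeneracies sum to 64 in both cases.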
Statistics of the previously unused "top view" hexameric particles. To avoid bias in particle selection, we also carried out a statistical analysis that included a large number of top-view particles not used in the analysis reported in the main text: 1,375,275 KaiC-AA top-view particles and 459,815 KaiC-EE top-view particles (see Supplementary Figs. 4 and 6 for an illustration of this additional analysis). These "previously unused" top-view particles were randomly divided into sub-groups with equal numbers of particles (22 groups for KaiC-AA and 6 groups for KaiC-EE). Then, each sub-group of top-view particles was combined with all the non-top-view ("side view" and "tilt view") particles for KaiC-AA and KaiC-EE, respectively. These combined groups of particles are numbered from "KaiC-AA combined group 1" to "KaiC-AA combined group 22" (or from "KaiC-EE combined group 1" to "KaiC-EE combined group 6"). Finally, we performed the statistical analysis for each combined group with RELION, following the same procedures described in the main text (Supplementary Figs. 4 and 6). We show the statistical results of two such combined groups each for KaiC-AA and KaiC-EE in Supplementary Fig. 12, which indicate that including these previously unused hexameric particles did not change the statistics of the hexamer patterns significantly.
volumes classified by RELION with the two masks that are used to characterize the buried state (Bu) for KaiC-AA combined group 1 (c), KaiC-AA combined group 2 (e), KaiC-EE combined group 1 (g), KaiC-EE combined group 2 (i). All 3D volumes (within the region corresponding to mask2) are shown at a high density threshold (4σ, red mesh, left) and a low density threshold (2σ, black mesh, right) for KaiC-AA combined group 1 (d), KaiC-AA combined group 2 (f), KaiC-EE combined group 1 (h), KaiC-EE combined group 2 (j).

S3 Statistics of pattern distribution in mixed hexamers
In this note, we describe some details about conformational statistics for mixed hexamers.
Conformational pattern distribution and monomer arrangement for mixed hexamers. In principle, the conformational pattern distribution for mixed hexamers should lie between the two pure cases. This is because the mixed hexamers can have 14 distinct monomer arrangements (see Supplementary Table 2).
The fitting for subunit arrangement. Generally, the arrangement distribution could be obtained from fitting. We fit the theoretically predicted pattern distribution (according to Eqs. 5 and 6 in the main text) to the experimental observation using a least-squares method with an additional term weighted by a constant λ; the dependence of the fit on λ is shown in Supplementary Figs. 14 and 15b. We choose λ = 0.1 in the analysis in the main text to enforce that the difference between the KaiC-AA and KaiC-EE percentages is less than 10%.
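As a hedged sketch only, one objective consistent with this description (a data-mismatch term, a λ-weighted term that enforces equal overall percentages of the two monomer types, and a probability constraint) can be written as follows. The symbols $q_a$ (probability of arrangement $a$), $f_a$ (fraction of KaiC-EE monomers in arrangement $a$), $P_k^{\mathrm{th}}$ and $P_k^{\mathrm{exp}}$ (predicted and observed probabilities of pattern $k$) are assumed names, and the quadratic form of the second term is an assumption.

```latex
\min_{\{q_a\}} \;
  \sum_{k=1}^{13} \left( P_k^{\mathrm{th}}(\{q_a\}) - P_k^{\mathrm{exp}} \right)^2
  + \lambda \left( \sum_{a=1}^{14} q_a f_a - \tfrac{1}{2} \right)^2 ,
\qquad \text{subject to } \sum_{a=1}^{14} q_a = 1 , \quad 0 \le q_a \le 1 .
```

In this form, a larger λ penalizes more strongly any departure from equal overall percentages of KaiC-EE and KaiC-AA monomers.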
From the fitting result for the arrangement distribution shown in Supplementary Fig. 15a, in the absence of coupling the resulting distribution is almost the same as in the no-mixing scenario, which is inconsistent with our previous results and certainly not the case. Hence, strong coupling is necessary.
The extended Ising model for mixed hexamers. We extended the original Ising model by allowing the coupling constant J to depend on whether the two neighboring monomers are of the same type or not.
Specifically, we modified our original Ising model by introducing the composition-dependent coupling constant Δ into Eq. (4) of the main text. In the modified Hamiltonian, Δ is the difference in coupling constant between monomers of the same type (EE-EE or AA-AA) and monomers of different types (AA-EE), and J is the coupling constant for pure hexamers.
The monomer arrangements are assumed to be random so each monomer in the ring has an equal probability of being AA or EE, i.e., corresponding to the fully-mixed scenario mentioned in the main text.
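The fully mixed extended model can be illustrated with a short Python sketch. This is not the authors' implementation, and Eq. (4) is not reproduced here, so the Hamiltonian form is an assumption: spins are +1 (Ex) or -1 (Bu), energies are in units of the thermal energy, the coupling is J between like neighbors and J - Δ between unlike neighbors, and each monomer type has its own field h. The values of J and h are hypothetical; only Δ = 0.09 is taken from the text.

```python
from itertools import product
import math

N = 6  # subunits in the ring


def pattern_key(spins):
    """Canonical label of a spin configuration under rotation and mirror."""
    images = []
    for c in (spins, spins[::-1]):
        images += [c[k:] + c[:k] for k in range(N)]
    return min(images)


def energy(spins, types, J, delta, h):
    """Assumed energy: nearest-neighbor coupling plus a type-dependent field."""
    e = 0.0
    for i in range(N):
        j = (i + 1) % N  # periodic ring
        coupling = J if types[i] == types[j] else J - delta
        e -= coupling * spins[i] * spins[j]
        e -= h[types[i]] * spins[i]
    return e


def pattern_distribution(J, delta, h):
    """Pattern probabilities averaged over all equally likely compositions."""
    dist = {}
    compositions = list(product(("AA", "EE"), repeat=N))
    for types in compositions:
        weights = {s: math.exp(-energy(s, types, J, delta, h))
                   for s in product((1, -1), repeat=N)}
        Z = sum(weights.values())  # partition function for this composition
        for s, w in weights.items():
            key = pattern_key(s)
            dist[key] = dist.get(key, 0.0) + w / Z / len(compositions)
    return dist


# Hypothetical J and h; delta = 0.09 as quoted in the text.
dist = pattern_distribution(J=0.3, delta=0.09, h={"AA": 0.2, "EE": -0.2})
uniform = pattern_distribution(J=0.0, delta=0.0, h={"AA": 0.0, "EE": 0.0})
```

Setting all parameters to zero gives the uncoupled, unbiased limit, in which every configuration has probability 1/64 and each pattern probability equals its degeneracy divided by 64.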
Supplementary Figure 17 shows how the fitting performance depends on Δ. The best fit corresponds to Δ = 0.09 > 0, which is consistent with the observation that the overall (averaged) coupling constant in the mixed hexamer is smaller than that for the pure hexamers. However, the fitting performance at the optimum is ≈ 0.83, which indicates that the improvement is quite limited. Therefore, introducing a different coupling strength between EE and AA is not enough to explain the data for mixed hexamers.

Supplementary Figure 1:
Cryo-EM structure determination of KaiC-AA and KaiC-EE. On the left is a typical cryo-EM micrograph of KaiC-AA (a) and of KaiC-EE (e), taken with a FEI Titan Krios G2 microscope equipped with the post-column Gatan BioQuantum energy filter connected to a Gatan K2 Summit direct electron detector. Scale bar, 100 nm. On the right is the power spectrum evaluation corresponding to the left micrograph. (b) and (f) give galleries of unsupervised 2D class averages of KaiC-AA (b) and of KaiC-EE (f). (c) and (g) provide the local resolution estimation calculated by ResMap for KaiC-AA (c) and for KaiC-EE (g). (d) and (h) show the Fourier shell correlation (FSC) for two independently refined halves of KaiC-AA (d) and of KaiC-EE (h). During the refinement we imposed C6 symmetry.

Supplementary Figure 4:
Data processing flow chart of the KaiC-AA data set. In the 3D classification results for the high-quality, well-defined structure (consisting of 140,475 hexameric particles), 47.6% of subunits belong to the exposed state, 26.7% belong to the buried state, and the remaining 25.7% are unclassified. With the x-y shift and angular information, we can put these exposed/buried subunits back into hexamers and then carry out the statistical analysis of the hexamer conformational patterns shown in Fig. 3a in the main text.

Supplementary Figure 5:
Criteria for distinguishing the exposed (Ex) state and buried (Bu) state in KaiC-AA. (a) The overlap intensities (integral density values) of 3D volumes classified by RELION (Supplementary Fig. 4) with the two masks that are used to characterize the buried state (Bu). (b) All 3D volumes (within the region corresponding to mask2) are shown at a high density threshold (4σ, red mesh, left) and a low density threshold (2σ, black mesh, right). The first row shows the structures that are classified as the buried (Bu) state. All structures in the first row agree with the atomic model in all parts of the A-loop area at both the high and the low thresholds. The second row shows the undefined (Un) states, with density profiles that agree with the atomic model of the Bu state at the low threshold but not at the high threshold. The third row shows the exposed (Ex) states, with density profiles that disagree with the atomic model of the Bu state in the A-loop area at both the high and the low thresholds. (c) Comparison of hexamer conformation pattern distributions when 3D volume "1-8" is classified as the "Un" (black dots) or the "Bu" state (red dots).

Supplementary Figure 6:
Data processing flow chart of the KaiC-EE data set. In the 3D classification results for the high-quality, well-defined structure (consisting of 371,557 hexameric particles), 28.9% of subunits belong to the exposed state, 52.9% belong to the buried state, and the remaining 18.2% are unclassified. With the x-y shift and angular information, we can put these exposed/buried subunits back into hexamers and then carry out the statistical analysis of the hexamer conformational patterns shown in Fig. 3b in the main text.

Supplementary Figure 7:
Criteria for distinguishing the exposed (Ex) state and buried (Bu) state in KaiC-EE. (a) The overlap intensities (integral density values) of 3D volumes classified by RELION (Supplementary Fig. 6) with the two masks that are used to characterize the buried state (Bu). (b) All 3D volumes (within the region corresponding to mask2) are shown at a high density threshold (4σ, red mesh, left) and a low density threshold (2σ, black mesh, right). The first row shows the buried (Bu) states, the second row shows the undefined (Un) states, and the third row shows the exposed (Ex) states. Note that 3D volume 2-5 has poor resolution and weak density at the bottom at the 4σ density threshold, and 3D volume 2-11 has density only on the left side at the 4σ density threshold. Both 2-5 and 2-11, which are at the boundary between the Un states and the Bu states, are therefore classified as undefined (Un). However, even including 2-5 as a buried state does not affect the final results. (c) Comparison of hexamer conformation pattern distributions when 3D volume "2-5" is classified as the "Un" (black dots) or the "Bu" state (red dots).

Supplementary Figure 8:
Comparisons of the statistics of the thirteen hexamer conformational patterns for the non-top-view particles only (light blue triangles), the top-view particles only (red triangles), and all particles (black triangles) for (a) KaiC-AA particles; (b) KaiC-EE particles.

Supplementary Figure 11:
Considering the structural degeneracy, the 64 configurations can be combined into 13 conformational patterns. (a) The 64 configurations are grouped into conformational patterns, with configurations that differ only by a rotation assigned to the same pattern. Note that patterns (7) and (8) differ only in chirality, and our experiment shows very little difference between their numbers, so these two patterns are combined into one. (b) The degeneracy Ω and the number of exposed-state subunits are listed for each hexamer pattern.
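In the least-squares fit of the arrangement distribution, each arrangement probability is constrained to be at most 1. The second term of the objective is introduced to enforce an overall equal percentage of KaiC-EE and KaiC-AA monomers; it involves the percentage of KaiC-EE monomers in the hexamer for each arrangement, and λ is a weight constant. The fitted arrangement distribution and the fitting performance are in general insensitive to λ (Supplementary Figs. 14 and 15b).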

The 14 distinct monomer arrangements of mixed hexamers are enumerated in Supplementary Table 2, though the arrangement distribution is unknown. Two extreme scenarios for inferring this distribution assume that the monomers are either 1) fully mixed, i.e., AA and EE monomers appear in the ring with no preference (the fully-mixed scenario), or 2) not mixed at all (the no-mixing scenario). In the fully-mixed scenario, the arrangement distribution can be calculated analytically and is listed in Supplementary Table 2. In the no-mixing scenario, only the two pure arrangements have non-zero probability. With these two assumptions for the arrangement distribution, we can calculate the theoretical distribution of hexamer patterns in the respective scenarios. These two theoretical distributions are shown in Supplementary Fig. 13, with the three experimental distributions (the two pure cases and the mixing case) also presented for comparison. It appears that neither of the two extreme scenarios can explain the experimental distribution of mixed hexamers well.
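The analytical calculation for the fully-mixed scenario can be sketched in Python. This is a minimal illustration, assuming that arrangements are distinguished up to rotation only (which gives 14 arrangements) and that each ring position is AA or EE with probability 1/2 independently; the single-letter labels "A" and "E" and the variable names are illustrative. The values are not checked here against Supplementary Table 2.

```python
from itertools import product
from fractions import Fraction

N = 6  # monomers in the ring


def canonical_rotation(arrangement):
    """Smallest cyclic rotation, used as the label of an arrangement."""
    return min(arrangement[k:] + arrangement[:k] for k in range(N))


# Count how many of the 2^6 labelled rings map to each arrangement.
counts = {}
for ring in product("AE", repeat=N):  # "A" = KaiC-AA, "E" = KaiC-EE
    key = "".join(canonical_rotation(ring))
    counts[key] = counts.get(key, 0) + 1

# Fully-mixed scenario: probability is the count divided by 64.
fully_mixed = {arr: Fraction(c, 2 ** N) for arr, c in counts.items()}

# No-mixing scenario: only the two pure arrangements are allowed.
no_mixing_support = sorted(arr for arr in fully_mixed if len(set(arr)) == 1)

for arr, p in sorted(fully_mixed.items()):
    print(arr, p)
```

Under these assumptions the two pure arrangements each have probability 1/64 in the fully-mixed scenario, whereas in the no-mixing scenario they carry all of the probability.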

Supplementary Table 2:
Enumeration of subunit arrangements.The arrangements for the two extreme scenarios are also listed.