Supplementary Figures
Supplementary Figure 1. Overview of refolding and misfolding for I27I27 and I27I28 tandem repeats observed in previous single-molecule FRET experiments 1. a. Transfer efficiency histogram of never-unfolded I27I27, with only the native state present at E ≈ 0.38, shown alongside a representative structure of the protein, with the residues labelled with fluorescent dyes indicated as orange spheres. b. Histogram of I27I27 after refolding by dilution from high GdmCl concentration, showing the formation of a long-lived misfolded population at E ≈ 0.9, displayed with a representative structure of this species based on Gō-like simulations. c. Histogram of I27I28 after refolding: the low sequence identity between the two naturally neighbouring domains prevents formation of the stable misfolded states seen in b. The structure shown is based on I27-I27. The area shaded in gray indicates the population of molecules lacking an active acceptor fluorophore ("donor-only"). Data taken from ref.

FRET efficiency histograms recorded during refolding of doubly-labelled I27I27 after GdmCl dilution from 4.56 to 0.23 M (see Methods; the time from mixing and its uncertainty are given in the panels), ranging from 1.2 ± 0.1 ms to 54 ± 4 s. Fits of Gaussian peak functions corresponding to individual populations are color-coded as in Fig. 3; the respective sums of all Gaussians are shown as black lines. The gray lines in the panels below each histogram show the residuals calculated per bin, according to Eq. 9 (Methods). Every histogram shown is the average of two or more independent measurements.

Table 1). Rate coefficients resulting from fits yielding a χ² within 50% (main histograms, light grey) or 5% (insets, dark grey) of the best χ² are shown, together with the value of the distribution maximum for the 5% threshold; histograms with bars lined in red indicate rate coefficients fixed and randomized together with all peak positions and widths prior to the procedure. b. Histograms of positions and widths for every population yielding the rate coefficient distribution in a (light grey). The values of the distribution maxima for χ² within 50% of the best χ² are provided, alongside analogous values for a χ² threshold of 5% when these are different (in brackets).

Table 1). e. Kinetic scheme for the 5-state fit in d.

Figure 8. Fig. 3 and Supplementary Fig. 4 and 8a). Fits of Gaussian peak functions corresponding to individual populations are color-coded as in Fig. 3, and the sums of all curves are shown as black lines; the area shaded in gray indicates the population of molecules lacking an active acceptor fluorophore ("donor-only" population). Although all misfolded populations are still present in (a), they are greatly reduced in (b).

Figure 9. Propensity for parallel β-sheet formation in I27-I27 and I27-I28. Energies for i,i+93 interactions in two-domain constructs, taken from the PASTA potential for parallel β-sheets (see Methods). The black curve gives energies for I27I27, the red curve for I27I28. Vertical magenta lines indicate the region previously identified as most amyloid-prone 2. Correspondingly, the regions with the lowest energies are most frequently involved in forming misfolded structures in the transiently formed M2 population (Fig. 4, Supplementary Fig. 2d).

Supplementary Table 1.
Parameters for the global fit of 2-domain tandem repeats. Positions ⟨E⟩ and widths for each FRET efficiency population used for the kinetic global fit of the data in Supplementary Figs 4 and 8a are given alongside the resulting rate coefficients (in s⁻¹). 'Best-fit parameters' were obtained by manual variation within the uncertainty interval, aimed at minimizing χ². 'Starting point parameters' are the values obtained from independent measurements (except for M2) and randomized in our computational procedure to assess fit robustness; distributions of parameter values from which we obtained the "distribution maxima" are reported in Supplementary Fig. 7.
a. Uncertainty of the transfer efficiency values obtained from the extrapolation procedure described in main text "Results" is given by the 90% confidence interval at 0.23 M GdmCl (Fig. 4).
b. Uncertainty of widths is the difference between the widths of FRET efficiency peaks constructed implementing a minimum threshold of 35 and 50 photons per burst (see Methods).
c. Uncertainty of all the known transfer efficiency values is the expected variability of this parameter when measured on different instruments.
d. Uncertainty is one standard deviation of the corresponding parameter distribution.
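The selection of fits by χ² threshold and the extraction of a "distribution maximum" can be illustrated with a minimal sketch. It is not the procedure used for the table: it assumes that "within 50% (or 5%) of the best χ²" means χ² ≤ (1 + tolerance) × best χ², and that the distribution maximum is the centre of the most populated histogram bin. The function names and the example fits are hypothetical.

```python
def select_within_threshold(fits, tolerance):
    """Keep fits whose chi-square is within a fractional tolerance of the best one.

    `fits` is a list of (chi2, params) tuples. The cutoff (1 + tolerance) * best
    is an assumed reading of "within 50% / 5% of the best chi-square".
    """
    best = min(chi2 for chi2, _ in fits)
    cutoff = best * (1.0 + tolerance)
    return [(chi2, params) for chi2, params in fits if chi2 <= cutoff]


def distribution_maximum(values, n_bins=10):
    """Return the centre of the most populated bin of a simple histogram."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The largest value falls exactly on the upper edge; clamp it to the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    top = counts.index(max(counts))
    return lo + (top + 0.5) * width


# Hypothetical fit results: (chi2, {"k": rate coefficient in s^-1}).
fits = [
    (100.0, {"k": 1.0}),
    (104.0, {"k": 1.1}),
    (140.0, {"k": 1.2}),
    (160.0, {"k": 5.0}),
]

loose = select_within_threshold(fits, 0.50)  # light-grey style selection
tight = select_within_threshold(fits, 0.05)  # dark-grey style selection
k_loose = [params["k"] for _, params in loose]
print(len(loose), len(tight), distribution_maximum(k_loose, n_bins=4))
```

A tighter tolerance keeps fewer fits, so the corresponding parameter distribution is narrower; comparing the two thresholds gives a rough sense of how well a parameter is constrained.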

Supplementary Discussion
Rationalizing ensemble folding kinetics.
The fast phase in the ensemble folding kinetics arises in the kinetic model through parallel formation of the misfolded states M1 and M2, as well as direct folding to the native state (without first misfolding). Although each M1 misfold is likely to form more slowly than the native state, there are at least five different strand-swapped topologies with native-like structure that contribute to the depletion of the unfolded state fluorescence in parallel pathways, which increases the observed rate. The slow phase arises mainly from correct folding to FF, after initial trapping in one of the misfolded states.
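Why parallel pathways speed up depletion of the unfolded state can be seen in a simplified sketch. It assumes irreversible, first-order steps out of the unfolded state, which is a simplification and not the full kinetic model; the rate values and channel names are hypothetical. Under these assumptions the observed depletion rate is the sum of the individual rate coefficients, so several slow misfolding channels can together outpace native folding.

```python
import math


def total_depletion_rate(rates):
    """Observed first-order rate for loss of U: the sum over all parallel channels."""
    return sum(rates.values())


def unfolded_fraction(rates, t):
    """Fraction of molecules still unfolded at time t (irreversible parallel steps)."""
    return math.exp(-total_depletion_rate(rates) * t)


def partitioning(rates):
    """Fraction of molecules leaving U through each channel."""
    k_total = total_depletion_rate(rates)
    return {name: k / k_total for name, k in rates.items()}


# Hypothetical rate coefficients (s^-1): each M1 topology forms more slowly
# than the native state, but five of them act in parallel.
rates = {
    "native": 2.0,
    "M1_a": 1.0,
    "M1_b": 1.0,
    "M1_c": 1.0,
    "M1_d": 1.0,
    "M1_e": 1.0,
}

k_total = total_depletion_rate(rates)
fractions = partitioning(rates)
print(k_total, fractions["native"], unfolded_fraction(rates, 0.1))
```

In this toy case the unfolded state is depleted at more than three times the native folding rate, while only a minority of molecules fold directly to the native state.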

Ensemble unfolding kinetics reveals increasing domain-swapped misfolding for multiple titin Ig-like repeats.
Considering that the stable misfolded state previously observed in single-molecule FRET experiments 1 (Fig. 1b and Supplementary Fig. 1b) is formed via the reciprocal swapping of β-strands between the two domains of the covalent tandem construct, the probability of forming such a state should increase with the number of domains in the construct. This prediction, which was proven correct in single-molecule FRET experiments on a 3-domain tandem, can be tested in ensemble experiments by measuring denaturant-dependent unfolding kinetics of constructs with 2, 3 or 8 tandem repeats of I27.
Unfolding of newly expressed and purified tandem proteins ("never-unfolded" samples) resulted in a single-exponential fluorescence decay with rates identical, within experimental uncertainty, to those for the unfolding of an isolated I27 domain (Fig. 2a): we term this the "native unfolding phase". In contrast, proteins which had been unfolded in 5 M guanidinium chloride (GdmCl) for ≥ 2 hours, then refolded by dilution to a final concentration of 0.5 M GdmCl for 2 minutes, displayed unfolding kinetics better described by a double-exponential decay (Supplementary Fig. 2b-c). The rate coefficients for the major unfolding phase were the same for all tandem proteins and for their never-unfolded counterparts (Fig. 2a), indicating that this phase originates from unfolding of natively folded domains.
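How the amplitudes of a double-exponential decay can be separated is sketched below. The sketch assumes that both rate coefficients are already known and held fixed, that there is no baseline offset, and that the trace is noise-free; with those assumptions the two amplitudes follow from linear least squares. This is an illustration only, not the fitting procedure used for the data, and all numerical values are hypothetical.

```python
import math


def double_exponential(t, a_major, k_major, a_minor, k_minor):
    """Signal decaying as the sum of two exponential phases (no baseline)."""
    return a_major * math.exp(-k_major * t) + a_minor * math.exp(-k_minor * t)


def fit_amplitudes(times, signal, k_major, k_minor):
    """Least-squares amplitudes for two exponentials with fixed rate coefficients.

    Solves the 2x2 normal equations for the basis functions
    exp(-k_major * t) and exp(-k_minor * t).
    """
    b1 = [math.exp(-k_major * t) for t in times]
    b2 = [math.exp(-k_minor * t) for t in times]
    s11 = sum(x * x for x in b1)
    s22 = sum(y * y for y in b2)
    s12 = sum(x * y for x, y in zip(b1, b2))
    r1 = sum(x * s for x, s in zip(b1, signal))
    r2 = sum(y * s for y, s in zip(b2, signal))
    det = s11 * s22 - s12 * s12
    a_major = (r1 * s22 - r2 * s12) / det
    a_minor = (r2 * s11 - r1 * s12) / det
    return a_major, a_minor


# Hypothetical noise-free trace: 80% major phase, 20% faster minor phase.
times = [0.5 * i for i in range(200)]
signal = [double_exponential(t, 0.8, 0.05, 0.2, 0.5) for t in times]

a_major, a_minor = fit_amplitudes(times, signal, k_major=0.05, k_minor=0.5)
minor_fraction = a_minor / (a_major + a_minor)
print(round(a_major, 6), round(a_minor, 6), round(minor_fraction, 6))
```

The relative amplitude of the second phase is the quantity that would be compared between constructs with different numbers of repeats.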
The rate coefficients of the second, minor unfolding phase, however, were higher, but also invariant for all tandem proteins. Notably, the amplitude of this phase increased with the number of repeats in the tandem protein (Fig. 2c), suggesting that it originates from the unfolding of misfolded conformations. As

Hypothesis for the formation of native-like, strand-swap central domain misfolds in I27-I28.
Studies of structurally related proteins have shown that folding rate coefficients and mechanism are strongly influenced by native state topology 3,4, and that both the transition state (TS) structure and folding mechanism tend to be conserved between members of the same family 5,6. Several authors have highlighted that the structure of the swapped domains is also mainly determined by native topology 7-10 and that most monomeric proteins in the same family or superfamily share a common swapped structure, which very often closely resembles the native monomer 10. This suggests that topological and sequence determinants governing folding are likewise important for misfolding via domain swapping.
All Ig-like domains appear to fold via a nucleation-condensation mechanism, where the obligatory folding nucleus comprises a ring of highly conserved hydrophobic residues from each of the B, C, E and F-strands, interacting within the protein core and surrounded by a second order of conserved residues stabilizing this interaction network 11,12. Early packing of these residues during folding establishes the correct Greek key topology of the native conformation, stabilizes the folding transition state, and ensures rapid and efficient folding 13. Although I27 and I28 display a global sequence identity of 24%, the identity score rises to 40% (with an additional 33% of highly similar residues) if the comparison is limited to residues with Φ-value ≥ 0.5, that is, those important for the stabilization of the folding TS.
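The comparison of global identity with identity restricted to high-Φ positions can be sketched as follows. The aligned sequences and Φ-values below are toy data invented for illustration; they are not the I27 and I28 sequences and do not reproduce the 24% and 40% figures. The function name is likewise hypothetical.

```python
def percent_identity(seq_a, seq_b, positions=None):
    """Percent identity between two aligned sequences.

    If `positions` is given, only those alignment columns are compared.
    Gap characters ("-") never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if positions is None:
        positions = range(len(seq_a))
    positions = list(positions)
    if not positions:
        raise ValueError("no positions to compare")
    matches = sum(
        1 for i in positions if seq_a[i] == seq_b[i] and seq_a[i] != "-"
    )
    return 100.0 * matches / len(positions)


# Toy aligned sequences and toy phi-values (one per alignment column).
seq_a = "ACDEFGHIKL"
seq_b = "AWDQYGRMNL"
phi_values = [0.7, 0.1, 0.6, 0.2, 0.3, 0.9, 0.0, 0.5, 0.4, 0.1]

# Columns whose phi-value is at least 0.5, i.e. residues structured in the TS.
nucleus = [i for i, phi in enumerate(phi_values) if phi >= 0.5]

global_identity = percent_identity(seq_a, seq_b)
nucleus_identity = percent_identity(seq_a, seq_b, nucleus)
print(global_identity, nucleus_identity)
```

In the toy data, as in the comparison described above, identity is higher over the high-Φ columns than over the full alignment.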
This can explain why extensive formation of misfolded intermediates during I27-I28 refolding does not lead to a stable misfolded state at equilibrium. The the folding nucleus in order to form 11-14. Such residues in I28 are probably similar enough to those of I27 to enable the formation of these native-like misfolded structures. However, the thermodynamic stability of such native-like misfolded states will still be determined by the size and strength of the whole interaction network, which depends on the overall sequence identity of the swapping partners, and that is probably too low for these species to be long-lived.