Nat. Struct. Mol. Biol. 10.1038/nsmb.2938


Repeat proteins are composed of tandem arrays of smaller protein modules, and many of them have a concave surface that is used to bind other proteins. For this reason, repeat proteins have been engineered to bind non-native protein targets, for example by changing amino acid residues at their curved binding surfaces or by altering the number or sequence of the repeated modules. However, a general approach to systematically alter the shape and curvature of the designed protein has not been reported. Park et al. identified four naturally occurring leucine-rich repeat (LRR) modules that had well-defined shapes, suggesting that they could be used as 'building blocks' to construct larger, non-natural proteins with predictable three-dimensional structures. The authors optimized the sequences of the LRR modules using a previously published Rosetta repeat-protein idealization method to ensure that the modules would be stable and behave predictably in vitro. They then synthesized proteins that had five to seven copies of each of the idealized building block sequences and solved the X-ray crystal structures of these polyproteins: two of the building blocks formed solenoid-like structures, and the other two were highly curved. The authors also designed five 'junction modules', which could be used to connect the building blocks in various orientations, and a 'wedge' module that would further alter the curvature of the designed protein. Using these modules, the authors constructed four larger proteins, each composed of 10–19 components, and demonstrated that they had well-defined circular dichroism (CD) spectra, were highly thermostable and unfolded in a cooperative manner. X-ray crystal structures of two of these proteins closely matched the models, indicating that these modules could reliably generate larger proteins with predictable structures and different overall shapes and curvatures.
The authors estimated that using just 12 components, one could generate nearly 19,000 different macromolecules that could be further engineered using computational protein design or directed evolution to selectively bind specific target proteins.