By manually searching the Protein Data Bank (PDB), I have curated a set of protein crystal structures corresponding to the capital letters of the Roman alphabet (Fig. 1). In choosing structures, I aimed to include a range of different structural motifs and to exclude nucleic acids and proteins solved while bound to nucleic acids. Sometimes these letter shapes seem to be incidental, and sometimes the shape is key to the protein's biological function. For example, the specific shape is likely to be important for L (from elongation factor P), which mimics the shape of tRNA; for the sinuous W (from the DNA-binding domain of BurrH), which tracks DNA's major groove for modular sequence recognition; and for proteins with holes that enclose DNA (A, from DNA gyrase) or puncture the membrane (O, from the toxin cytolysin A). PDB accession codes and descriptions of function for all proteins are provided in Supplementary Table 1. This set may be useful for outreach and teaching, by drawing attention to the diversity of protein structures attained by natural selection. It is conceivable that the set may also have value in bionanotechnology and synthetic biology, where molecular assembly sometimes requires a specific shape more than a specific function.
Funding was provided by Worcester College Oxford. I thank E. Lowe (University of Oxford) for the diffraction image.
The author declares no competing financial interests.
Supplementary Table 1
PDB code, general function and distinctive features of alphabetical protein structures
Howarth, M. Say it with proteins: an alphabet of crystal structures. Nat Struct Mol Biol 22, 349 (2015). https://doi.org/10.1038/nsmb.3011