Forty years have passed since the publication of the first atomic structures of virus particles and viral glycoproteins1,2,3,4,5,6. The influenza hemagglutinin structure has since become a key reference for structural studies of viral glycoproteins, including the human immunodeficiency virus 1 (HIV-1) envelope glycoprotein and coronavirus spike proteins. It enabled immunologists to visualize antigenicity of a human pathogen in three dimensions3, and epitope mapping was the first major application of the atomic model4. Within a year of the structure's determination, the outlines of a membrane fusion mechanism had also begun to take shape7. The influenza neuraminidase structure5 also facilitated development of drugs still in use today. Nonetheless, structural biology has generally lagged behind other approaches (genetics, cell biology, molecular biology and biochemistry) for understanding viruses and viral infectivity, because experimentally defining the three-dimensional (3D) coordinates of thousands or even hundreds of thousands of atoms in a purified sample of a biological assembly has been a time-consuming and challenging task. Rapid advances in our structural understanding of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) during the ongoing COVID-19 pandemic show that the balance has changed.

Structure determination of biological macromolecules at atomic resolution relies mainly on three methods: X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy and cryogenic electron microscopy (cryo-EM). The need for well-ordered crystals in X-ray crystallography and the upper limits of molecular mass in NMR exclude many macromolecules and their complexes, which are critical for mechanistic understanding, from application of these two methods. Since 2013, developments in cryo-EM single-particle analysis have enabled electron microscopy to reach high resolution, dramatically altering the structural biology landscape8,9. Owing to new technologies and software, cryo-EM, which requires far less sample (~0.1 mg) than crystallography or NMR (several milligrams or more), has become a powerful tool for determining structures of biological macromolecules at or near atomic resolution. First, introduction of direct electron detectors has significantly increased the detective quantum efficiency (and thus enhanced contrast) of recorded images over the levels afforded by conventional charge-coupled device (CCD) detectors and photographic film10, effectively preserving high-resolution information11,12,13. Second, data collection in a movie mode, made possible by the new detectors, has enabled corrections for beam-induced sample motion, allowing computational de-blurring of images14,15. Third, maximum-likelihood algorithms have enabled robust 3D classification that separates a compositionally or conformationally heterogeneous sample into structurally homogenous subsets, which can yield density maps at a much higher resolution than previously possible16,17.
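The idea behind movie-mode motion correction can be illustrated with a toy example. The sketch below is not the algorithm of any particular cryo-EM package; it assumes a one-dimensional "image", whole-frame drift by integer numbers of pixels, and alignment by circular cross-correlation against the first frame. All names, the drift values and the noise level are hypothetical choices made for illustration.

```python
import random


def circular_shift(signal, shift):
    """Return a copy of `signal` shifted right by `shift` samples, wrapping around."""
    n = len(signal)
    return [signal[(i - shift) % n] for i in range(n)]


def estimate_shift(reference, frame):
    """Return the integer shift that maximizes the circular cross-correlation."""
    n = len(reference)
    best_shift, best_score = 0, float("-inf")
    for shift in range(n):
        score = sum(reference[i] * frame[(i + shift) % n] for i in range(n))
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


rng = random.Random(0)  # fixed seed so the toy example is reproducible

# Hypothetical 1D "specimen": a few sharp features on an empty background.
n_pixels = 64
signal = [0.0] * n_pixels
for position in (3, 10, 12, 25, 41, 50):
    signal[position] = 1.0

# Assumed drift: the specimen moves one pixel further in each movie frame.
true_drift = [0, 1, 2, 3, 4]
frames = [
    [value + rng.gauss(0.0, 0.1) for value in circular_shift(signal, drift)]
    for drift in true_drift
]

# Estimate each frame's shift relative to the first frame, then undo it.
estimated_drift = [estimate_shift(frames[0], frame) for frame in frames]
aligned_frames = [
    circular_shift(frame, -shift) for frame, shift in zip(frames, estimated_drift)
]

# Summing without alignment smears each feature; summing after alignment does not.
naive_sum = [sum(column) for column in zip(*frames)]
aligned_sum = [sum(column) for column in zip(*aligned_frames)]

print("estimated drift:", estimated_drift)
print("peak of unaligned sum:", round(max(naive_sum), 2))
print("peak of aligned sum:", round(max(aligned_sum), 2))
```

In this toy case the aligned sum recovers sharp features while the unaligned sum spreads each feature over several pixels, which is the "de-blurring" described above. Real data involve two-dimensional frames, sub-pixel and locally varying motion, and dose weighting, none of which is modeled here.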

Removing a biological macromolecule from its cellular context, however, can preclude full understanding of its function(s) in its native environment. Imaging a structure in situ at a sufficient resolution by cryogenic electron tomography (cryo-ET) promises to bridge the gap between a high-resolution structure derived from an isolated sample and the same molecules in a complex and heterogenous assembly, such as a SARS-CoV-2 particle, or even in an intact cell. Cryo-ET reconstructs molecular 3D images of a target of interest together with its surrounding context at a resolution in the nanometer range by recording a tilt series (for example, ±60°) of projection images and computationally reconstructing them into a 3D tomogram18. Electron absorption limits the sample thickness to ~500 nm — larger than the dimensions of most virus particles but thinner than almost all parts of a eukaryotic cell. Focused-ion-beam (FIB) milling19, combined with correlative light–electron microscopy (CLEM)20, helps solve this problem. Milling a thin (for example, 150–200 nm) ‘lamella’ in a cell provides a window into the part of the cell included in that thin layer. Reconstructed tomograms can be used directly for structural interpretation, but the complex cellular background and intrinsically low signal-to-noise ratio in tomograms make it hard to extract high-resolution information from individual intracellular particles. If many copies of the particle are present in a cell, averaging the images of those particles extracted from the tomogram (subtomogram averaging) can greatly extend resolution, even to near-atomic detail in favorable cases21,22.
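Why averaging many copies extends resolution can be seen in a minimal numerical sketch. It assumes that every extracted copy is already perfectly aligned and that the noise is independent and Gaussian, so it ignores alignment errors, the missing wedge and structural heterogeneity. The signal, the noise level and the function names are hypothetical. Under these assumptions the error of the average falls roughly as one over the square root of the number of copies.

```python
import math
import random


def rms_error(estimate, truth):
    """Root-mean-square difference between an estimate and the true signal."""
    squared = sum((e - t) ** 2 for e, t in zip(estimate, truth))
    return math.sqrt(squared / len(truth))


def average_noisy_copies(truth, n_copies, sigma, rng):
    """Average `n_copies` of `truth`, each corrupted by independent Gaussian noise."""
    total = [0.0] * len(truth)
    for _ in range(n_copies):
        for i, value in enumerate(truth):
            total[i] += value + rng.gauss(0.0, sigma)
    return [t / n_copies for t in total]


rng = random.Random(1)  # fixed seed for reproducibility

# Hypothetical "particle": a smooth 1D density profile.
truth = [math.sin(2 * math.pi * i / 50) for i in range(200)]
sigma = 5.0  # assumed noise level, much larger than the signal amplitude

errors = {}
for n_copies in (4, 100, 400):
    averaged = average_noisy_copies(truth, n_copies, sigma, rng)
    errors[n_copies] = rms_error(averaged, truth)
    expected = sigma / math.sqrt(n_copies)
    print(f"{n_copies:4d} copies: RMS error {errors[n_copies]:.3f} "
          f"(sigma/sqrt(N) = {expected:.3f})")
```

A single copy is dominated by noise here, yet a few hundred averaged copies bring the error well below the signal amplitude, which is why abundant particles in a tomogram are so valuable.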

Since its initial outbreak in 2019, SARS-CoV-2 has captured the attention of the scientific community. After the release of the first viral genomic sequence23, structural studies of various viral components, as well as the whole virus, have advanced at an astonishing pace. The crystal structure of the main protease (Mpro), a key enzyme in viral replication, was deposited in the Protein Data Bank (PDB) just two weeks later, followed shortly by publication of several structures of spike (S)-protein complexes. These included cryo-EM structures of the ectodomain stabilized in the prefusion conformation24,25,26 and X-ray crystal structures of the receptor-binding domain (RBD) in complex with the receptor angiotensin-converting enzyme 2 (ACE2)27,28,29,30. The S protein mediates the fusion of viral and target cell membranes to allow the virus to enter a host cell. It is also a major surface antigen that induces neutralizing antibody responses and is thus the key component of the first-generation vaccines31. The strategy of stabilizing the prefusion conformation of S with two proline substitutions, which prevent formation of a long helix found in the postfusion structure, was built on previous studies in both the coronavirus and HIV fields32,33 and has been used in several successful COVID-19 vaccines31. Cryo-EM structures of the purified full-length S protein of the original Wuhan-Hu-1 strain, in both its prefusion and postfusion conformations, were also reported not long after the stabilized ectodomain structures34, as was the structure of the full-length S protein from an early variant carrying a single-residue substitution: D614G (Fig. 1a; ref. 35). These structures, consistent with the results of studies on chemically inactivated SARS-CoV-2 virions with combined cryo-ET subtomogram averaging and cryo-EM single-particle analysis36,37,38, suggested that greater stability might make it preferable to use an immunogen with the D614G substitution instead of the unmodified Wuhan-Hu-1 S sequence39.
Numerous structures of monoclonal neutralizing antibodies in complex with the S protein have also been determined40,41. These structures have not only clarified the mechanisms of antibody neutralization, but can also guide development and optimization of therapeutic antibodies by rational strategies42.

Fig. 1: Key structures of SARS-CoV-2.

a, Cryo-EM structure of the full-length S protein derived from the early variant, carrying a single-residue substitution (D614G) in the closed prefusion conformation with all three RBDs in the down conformation (EMD-23010; PDB ID: 7KRQ). Three protomers in the S trimer are colored in red, green and blue, respectively. b, Cryo-ET reconstruction of the intact SARS-CoV-2 virion (EMD-30430). The spikes and RNP assemblies are indicated. c, Cryo-EM structure of the SARS-CoV-2 RdRp, including nsp12 (red), nsp8 (green) and nsp7 (blue), as well as two turns of RNA template–product duplex (yellow and cyan) (EMD-11007; PDB ID: 6YYT). d, Crystal structure of the SARS-CoV-2 Mpro dimer in complex with an inhibitor (N3) with the two protomers shown in a ribbon diagram (green and cyan, respectively) and the inhibitor in a sphere model (PDB ID: 7BQY). e, Crystal structure of the SARS-CoV-2 PLpro in complex with a peptide inhibitor (VIR251). The enzyme is shown in a ribbon diagram (green) and the inhibitor in a sphere model (PDB ID: 6WX4). f, Structure of the SARS-CoV-2 E protein transmembrane domain in a pentameric assembly determined by solid-state NMR (PDB ID: 7K3G). Five protomers are colored in red, magenta, yellow, blue and green.

The overall molecular architecture of SARS-CoV-2 was subsequently analyzed by cryo-ET and subtomogram averaging, showing the structures of the S protein in both prefusion and postfusion states at medium resolution at the virion surface, as well as the assemblies of the ribonucleoproteins (RNPs) inside the viral membrane (Fig. 1b; ref. 43). The key component of the viral replication machinery is an RNA-dependent RNA polymerase (RdRp), which carries out both replication and transcription. A cryo-EM structure of the SARS-CoV-2 RdRp, including non-structural protein 12 (nsp12), nsp8 and nsp7, and two turns of RNA template–product duplex (Fig. 1c; ref. 44), showed how the enzyme interacts with the double-stranded RNA and how antiviral compounds target the enzyme. Crystal structures of Mpro (Fig. 1d) and the papain-like cysteine protease (PLpro, nsp3; Fig. 1e) of SARS-CoV-2 in complex with various inhibitors have illustrated the inhibition mechanisms and guided efforts to repurpose approved drugs and develop novel therapeutics45,46. NMR spectroscopy has also contributed significantly to our structural understanding of certain viral components, in particular those buried in the membrane, which are difficult to access by other approaches. For example, the envelope (E) protein forms a homopentameric cation channel that is critical for viral pathogenicity. A structure of the E transmembrane domain, determined by solid-state NMR spectroscopy (Fig. 1f; ref. 47), may facilitate development of E inhibitors as antiviral drugs. Similarly, because it is either removed or invisible in the S structures determined by cryo-EM or cryo-ET, the transmembrane domain of the S protein has been isolated and reconstituted into bicelles that mimic a lipid bilayer for structure determination by solution NMR. The structure shows a novel trimeric leucine–isoleucine zipper with tetrad repeats48.
As of February 2022, 1,832 structures containing SARS-CoV-2 components have been deposited in the PDB, with 1,195 determined by X-ray crystallography, 628 by electron microscopy, 8 by solution NMR, 3 by neutron diffraction and 1 by solid-state NMR (http://www.wwpdb.org). Additionally, powerful molecular simulation tools have been applied to extrapolate dynamic mechanisms of the steps of SARS-CoV-2 infection, such as the large conformational changes of S protein required for membrane fusion, well beyond what the static atomic structures can offer49,50.
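The breakdown by method can be tallied in a few lines. The sketch below uses only the figures quoted above; the variable names are hypothetical. Note that the per-method counts add up to 1,835, three more than the stated total of 1,832. The text does not explain the difference; one possible reading, which is an assumption here, is that a few entries are counted under more than one method.

```python
# Counts quoted in the text (PDB entries containing SARS-CoV-2 components,
# as of February 2022).
counts_by_method = {
    "X-ray crystallography": 1195,
    "electron microscopy": 628,
    "solution NMR": 8,
    "neutron diffraction": 3,
    "solid-state NMR": 1,
}
reported_total = 1832  # total stated in the text

method_sum = sum(counts_by_method.values())
difference = method_sum - reported_total

# Shares are computed against the sum of the per-method counts.
shares = {
    method: 100.0 * count / method_sum
    for method, count in counts_by_method.items()
}

print(f"sum of per-method counts: {method_sum} (reported total: {reported_total})")
print(f"difference: {difference}")
for method, share in shares.items():
    print(f"{method:>22}: {share:5.1f}%")
```

On these numbers, X-ray crystallography accounts for roughly two-thirds of the entries and electron microscopy for roughly one-third, with the remaining methods together contributing well under one percent.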

We strongly believe that structural biology will continue to make critical contributions — not only in efforts to control the COVID-19 pandemic, but also to fight against any future outbreaks of human pathogens — by revealing atomic details of infection, pathogenesis and host immune responses, as well as by facilitating rapid development of vaccine and therapeutic strategies. In particular, the advances in cryo-EM and cryo-ET technology in the past decade have greatly broadened the variety of biological samples accessible by high-resolution structural biology methods, as it is no longer necessary to truncate a full-length protein, remove physiologically relevant posttranslational modifications (for example, glycosylation) or select one homogenous conformation to produce well-diffracting crystals or interpretable NMR spectra. Several years ago, it would have been unthinkable that an atomic structure of a full-length and heavily glycosylated protein with more than a thousand amino-acid residues, such as the SARS-CoV-2 spike, could be determined in less than three months34.

Other technological advances have also made major impacts on the structural biology field. For example, live-cell optical imaging, such as lattice light-sheet microscopy combined with deep-learning-based data analyses, has already achieved unprecedented spatial and temporal resolutions at the single-molecule level in the context of living cells51. Protein structures predicted by the deep-learning-based programs AlphaFold and RoseTTAFold have now become remarkably accurate52,53, representing a further important advance in structural biology as they can rapidly generate atomic models (including some SARS-CoV-2 proteins54,55) on the proteomic scale that not only can guide functional studies, but can also serve as initial models for structure determination by experimental methods. These developments are revolutionizing the way in which we study molecular events of biological processes such as viral infection, marking a new era of structural biology that directly connects static atomic models with dynamic events in a crowded cellular environment.

What are the next methodological challenges in structural biology? In cryo-EM, most of the initially selected particles from a cryo grid are rejected as ‘junk’, even from a biochemically ‘clean’ preparation, and are thus removed from subsequent data processing, leaving a few ‘good’ classes that are homogenous enough to align well and reach high resolution. It is unclear whether those ‘bad’ particles are indeed misfolded or damaged molecules (for example, due to well-documented denaturation at the air–water interface56), or simply physiologically meaningful species in low abundance or with intrinsic conformational dynamics that are difficult to align. It is possible that certain important structural information is discarded along with the genuinely damaged particles. In cryo-ET, which represents a fast-evolving frontier in electron microscopy, the achievable resolution still leaves much to be desired. Moreover, small regions embedded in membranes are invisible in many cryo-ET reconstructions even though, at least in theory, there should be enough contrast to reveal them if enough tomograms are collected. Finally, as powerful as artificial-intelligence-based programs, such as AlphaFold, have already been for structure prediction, high levels of accuracy are still heavily biased toward proteins that had a homologous, experimentally determined structure in the training set57. Predicting the structure of a large and complex protein with ligands or posttranslational modifications continues to be a formidable challenge.

Structural biology has become an indispensable tool for understanding biology and medicine, and continues to evolve rapidly. We should not be surprised if an atomic image of a complete virus particle is one of the first results released at the onset of any future pandemic comparable to the one caused by SARS-CoV-2.