Main

Decades ago, when structural biology was still in its infancy, structures were rare and structural biologists often dedicated years of their life to studying just one structure at atomic detail. The first tools used for visualizing macromolecular structures were tools for specialists.

Today's situation is very different: the rate at which structures are solved has greatly increased, with over 60,000 high-resolution protein structures now available in the consolidated Worldwide Protein Data Bank (wwPDB)1. These data provide a wealth of detailed information that can yield significant insight into macromolecular function. To use this information most effectively, visualization tools were developed and are increasingly becoming everyday tools for biologists. For example, many biochemists regularly view protein structures to gain insight into protein function (Fig. 1). Chemists look at ligand-binding sites as part of drug design. Molecular biologists view RNA structures and complexes with proteins to gain insight into RNA signal and message processing. Some aspects of structure visualization remain mostly the domain of the specialist, such as molecular motion and large-scale molecular assemblies. Even in these intrinsically more complex fields, however, resources are beginning to enable bench biologists to visualize and use this information.

Figure 1: Visualizing a tyrosine kinase structure (PDB 1QCF)97.

(a–d,f) A simple way to gain insight into function is to use ribbon representation colored by sequence features: for example, domains (a), SNPs (b), exons (c), protein binding sites (d) and sequence conservation (f). (e) An effective way to show overall shape is with nonphotorealistic rendering using flat colors and outlines. (g,h) Solvent-accessible surfaces are often used for displaying electrostatic (g) and hydrophobic potentials (h; hydrophilic in saturated colors and hydrophobic in white). (i) Superposition is commonly used to compare two or more related structures: for example, two distinct states of the same protein or, as shown here, two separate proteins with similar structure (PDB 1QCF and 1FMK)98. (j,k) A growing number of tools have an integrated, interactive sequence viewer, which helps users understand the relationship between sequence and three-dimensional structure. Images were made using SRS 3D7 (a–d,f,j,k), PMV25 (e,g,h) and RCSB PDB5 (i).

However, although structural information is now viewed and used by a large and diverse group of scientists, most of them are not prepared to spend months learning complex user interfaces or scripting languages. Even today, complex user interfaces in visualization tools are often a stumbling block, preventing many scientists from benefiting from structural data. Even structural experts have come to expect ease of use from molecular graphics tools, in addition to improved speed, features and capabilities.

In the past, molecular graphics tools were invariably stand-alone, designed to view one molecular system at a time. Today's tools are increasingly internet aware, often integrated tightly with structure databases (Table 1), as well as with databases containing sequences and other features (for example, domains, single-nucleotide polymorphisms (SNPs), interactions).

Table 1 Selected resources for finding and visualizing macromolecules

Today, we are spoiled for choice when it comes to molecular graphics tools for viewing proteins and other macromolecular structures. Indeed, the sheer range of available tools can be overwhelming. Many molecular graphics tools have been developed to address diverse requirements, as documented in recent reviews2,3,4 and in several web resources maintaining lists of such tools (see footnote to Table 1). Most of these tools have a large set of features in common, including standard representations (ribbon, space-filling, ball-and-stick and so on) and coloring schemes (element-based coloring of atoms, coloring by secondary structure and so on). It is beyond the scope of this review to comprehensively compare all of these tools; instead, we focus on key biological questions for which visualizing structures can provide insight, and we highlight practical methods and tools with outstanding features that are particularly suited to addressing these questions.

Protein structures

Finding three-dimensional structures. For a biochemist looking to use three-dimensional structures to gain insight into the functions of a particular protein, the typical first step is a search for relevant structures. This task is considerably simplified by the remarkable degree to which all experimentally determined protein three-dimensional structures are consolidated into a single data repository, the Worldwide Protein Data Bank (wwPDB)1. Three primary distribution sites (RCSB PDB5, PDB Europe and PDB Japan; Table 1) provide access to the same underlying data bank, each with a wide range of integrated visualization and analysis tools. In addition, the PDB is mirrored at many other sites, some of which provide innovative visualization tools tailored to make specific questions easier to answer (Table 1). Most of these sites offer, embedded directly in their web pages, one or more molecular graphics tools (for example, Jmol, PyMOL, KiNG and Mage6). Increasingly, the process of finding and visualizing structures is becoming one seamless step for most users.

Finding structures from sequence. Several websites (for example, RCSB PDB5) allow the user to find structures using a sequence identifier or BLAST search (Table 1). Entrez Structure and SRS 3D7 allow the sequence to be aligned to any related three-dimensional structure (Fig. 1f). So far, experimental three-dimensional structures have been determined for less than 1% of all known proteins (based on direct links from PDB to protein sequences in UniProt8). However, for around 42–48% of all proteins, at least part of their sequence is considered significantly similar to a PDB entry, so that some structural information can be inferred9,10. Several websites (for example, Swiss-Model11) provide comparative models for such cases12,13. Each service uses slightly varying cut-off criteria for defining 'significant sequence similarity' (for example, in some cases depending on the length of aligned regions), but generally >40% sequence identity to a PDB structure is considered sufficiently good to create a high-quality comparative model structure10. These comparative models can be accessed at a single consolidated website, the Protein Model Portal (PMP)10. The original PDB templates also include information on experimental conditions, ligands and cofactors, which can be relevant in deciding to use or discard a comparative model.
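As a rough illustration of how a sequence-identity cut-off might be applied, the sketch below computes percent identity from two sequences that have already been aligned (for example, by BLAST). The function names, the choice of denominator (aligned, non-gap columns) and the optional minimum aligned length are assumptions made for this example; as noted above, each service defines 'significant sequence similarity' somewhat differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> tuple[float, int]:
    """Return (percent identity, number of aligned columns) for two
    pre-aligned sequences of equal length, using '-' as the gap symbol.

    Assumption: identity is measured over columns where neither sequence
    has a gap. Other conventions (e.g. dividing by the shorter sequence
    length) give different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned_cols = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        aligned_cols += 1
        if a == b:
            identical += 1
    if aligned_cols == 0:
        return 0.0, 0
    return 100.0 * identical / aligned_cols, aligned_cols


def is_good_template(aligned_a: str, aligned_b: str,
                     min_identity: float = 40.0,
                     min_aligned: int = 0) -> bool:
    """Hypothetical filter: identity above the cut-off over enough columns."""
    identity, n_cols = percent_identity(aligned_a, aligned_b)
    return identity > min_identity and n_cols >= min_aligned
```

In practice the minimum aligned length matters as much as the identity itself, because a short stretch can exceed 40% identity by chance.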

For sequences where no template PDB structure can be found by the above resources, it may be possible to calculate a structure using so-called ab initio methods14. However, in spite of progress15, ab initio methods still require much improvement14 and we recommend they be used with caution.

Getting a first impression. To gain an initial overview of a protein structure, it is often useful to choose a representation that hides side chain atoms; ribbon-like representations do that well and also convey information about secondary structure (Fig. 1a–d). Ligand molecules are best displayed in space-filling or ball-and-stick atom representations. Many of the websites in Table 1 provide such a view (for example, FirstGlance, among others), some by default. Typically, each protein chain is colored differently, thus giving a quick insight into the number of molecules present in the PDB entry. To highlight overall shape and form, nonphotorealistic rendering can be very effective (Fig. 1e), especially with images for presentation and publication.

Some molecular graphics tools (for example, Chimera16, Cn3D17, OpenAstexViewer18, SRS 3D7, STRAP19 and Swiss-PdbViewer20) offer an integrated view of both the amino acid sequence and the three-dimensional structure, and further enable interaction between these two views (Fig. 1j). For example, clicking on a residue in the sequence view causes the corresponding residue to be highlighted and selected in the three-dimensional view, and vice versa. This feature can significantly help a scientist in understanding and using three-dimensional structures. For example, by viewing the location of key residues or sequence motifs, a scientist can assess whether they are likely to be accessible for posttranslational modification, such as phosphorylation21. Some viewers (for example, STRAP19) go one step further, showing structure integrated with a multiple sequence alignment viewer—a feature we anticipate will continue to become available for other viewers22.

For publication and presentations, some viewers can create impressive, ray-traced images (for example, Amira, Chimera16, ICM-Browser, Molscript23 plus Raster3D24, PMV25, PyMOL, VMD26).

The majority of PDB structures are derived from X-ray crystallography (Box 1, Fig. 2), about 13% from NMR spectroscopy (Box 2, Fig. 3) and less than 1% from electron microscopy (Box 3). These three experimental methods often require specific considerations and visualization methods (discussed in each display box).

Figure 2: Caution for beginners: symmetry in crystal structures.

PDB entries often do not have explicit three-dimensional coordinates for all parts of symmetric oligomers. (a,b) For example, in PDB 2C2A107, coordinates are given for only one monomer (a), although the biologically active state is a homodimer (b). (a–e) Usually this information is given in 'REMARK 350'; however, we recommend using PISA33, which automatically constructs a range of assemblies that occur in the crystal and predicts which of these is most biologically relevant. In this case, PISA gives the asymmetric unit (a), three dimer forms (b,c,d) and the unit cell (e). Increasingly, sites such as RCSB PDB5 provide the biologically relevant assembly precalculated with PISA. Image of PISA output made using VMD26.

Figure 3: Visualization of an NMR ensemble for SH3 (ref. 108).

(a,b) NMR structures are typically deposited in the PDB as an ensemble of superimposed structures (a), with the spread of the ensemble giving an indication of precision, but not of accuracy. The 'sausage' representation (b) gives an informative summary of an ensemble by adjusting the width of the tube to match to the width of the ensemble. Images made using MOLMOL35 (a) and VMD26 (b).
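The idea behind a 'sausage' view can be sketched as follows: for each residue, measure how far its position scatters across the models of an ensemble, then map that spread to a tube radius. This is a minimal sketch under stated assumptions, not the formula used by MOLMOL or VMD: the models are assumed to be already superimposed, one representative atom (such as the α-carbon) is used per residue, and the linear mapping in `tube_radii` is hypothetical.

```python
import math


def per_residue_spread(ensemble):
    """Root-mean-square deviation of each residue from its mean position.

    ensemble: list of models; each model is a list of (x, y, z) tuples,
    one per residue, in the same residue order and already superimposed.
    """
    n_models = len(ensemble)
    if n_models == 0:
        raise ValueError("ensemble is empty")
    n_res = len(ensemble[0])
    if any(len(model) != n_res for model in ensemble):
        raise ValueError("all models must contain the same residues")
    spreads = []
    for r in range(n_res):
        points = [model[r] for model in ensemble]
        mean = [sum(p[k] for p in points) / n_models for k in range(3)]
        msd = sum(
            sum((p[k] - mean[k]) ** 2 for k in range(3)) for p in points
        ) / n_models
        spreads.append(math.sqrt(msd))
    return spreads


def tube_radii(spreads, base=0.2, scale=1.0):
    """Hypothetical linear mapping from spread to tube radius."""
    return [base + scale * s for s in spreads]
```

A wide tube therefore marks residues that are poorly defined by the ensemble (low precision); as the caption notes, a narrow tube does not by itself imply accuracy.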

Viewing sequence features on three-dimensional structures. A very straightforward way to use three-dimensional structures to gain insight into function is by coloring based on features such as domains, SNPs, exon boundaries, secondary structure and so forth (Fig. 1a–d,f). The ability to easily see where sequence features are located in the three-dimensional structure can be of substantial practical value to bench biochemists and molecular biologists. For example, the spatial location of residues within the structure and the proximity to solvent can help in designing primers and mutation experiments. The ability to show such views for a wide range of features is a particular strength of SRS 3D7 and SPICE27 and is also facilitated by JenaLib28, PDBsum29 and Entrez Structure. Viewers such as STRAP30 that provide easy access to multiple sequence alignment information mapped onto three-dimensional structures can help locate key conserved residues. ProSAT2 (ref. 31) can display SNPs and also predict their effects, allowing a scientist to gauge the potential impact of a SNP on the protein structure.

Protein-protein binding sites. Typically, as part of its biological role, a protein will bind to several other proteins through comparatively large but flat binding surfaces. In fact, a large percentage of PDB entries contain not just a single protein chain but several. In some cases, this means identical subunits assembled together; in other cases, it means a complex of several different protein chains. The arrangement of subunits, and of the interface residues that form the subunit-subunit contacts, is often of biological significance. Several websites specialize in finding and visualizing subunit-subunit interface residues32. In PDBsum29 the interacting residues, and the types of their interaction across the interface, are shown schematically. MolSurfer (Table 1) provides a range of methods that help users explore macromolecular interfaces.

For symmetric assemblies (dimers, trimers and so on), the PDB entry of an X-ray crystal structure will often have explicit three-dimensional coordinates for only one monomer. To construct the coordinates for all subunits in the biologically relevant assembly, we recommend PISA33 (see Box 1, Fig. 2).

Comparing related structures. It is often informative to visualize two related structures superimposed—for example, two states of the same molecule, or two proteins with homologous sequences, or two structural homologs found by structural comparison tools34. Many molecular graphics tools offer automatic superposition as a standard feature (for example, MOLMOL35, MOE, PyMOL or VMD26). These tools allow the researcher to specify a portion of the molecule to be superimposed. The results are highly dependent on the regions chosen for the superposition. Typically, the researcher identifies a more-or-less rigid core of the molecule and superimposes this region using a subset of the atoms (typically the α-carbons or the backbone atoms). But many other combinations are possible for addressing specific questions (Figs. 1i and 4d–f). For difficult cases—for example, low sequence similarity or large regions that cannot be aligned in sequence—it is best to use more robust, dedicated superimposition tools (for example, STAMP36, STRAP19 or THESEUS37).
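The core-based superposition described above can be sketched with the Kabsch algorithm, a standard least-squares fit that we use here purely for illustration; the tools cited may use other methods. The sketch assumes NumPy is available and that the two core atom sets are already paired one to one (for example, equivalent α-carbons). The function names are our own.

```python
import numpy as np


def kabsch_fit(mobile_core, target_core):
    """Return rotation R and translation t that minimize the RMSD between
    R @ x + t (x in mobile_core) and the paired atoms in target_core."""
    P = np.asarray(mobile_core, dtype=float)
    Q = np.asarray(target_core, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("expected two (N, 3) arrays of paired atoms")
    p_center = P.mean(axis=0)
    q_center = Q.mean(axis=0)
    # Covariance of the centered coordinate sets.
    H = (P - p_center).T @ (Q - q_center)
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_center - R @ p_center
    return R, t


def transform(coords, R, t):
    """Apply the fitted transform to any set of atoms, not just the core."""
    return np.asarray(coords, dtype=float) @ R.T + t


def rmsd(a, b):
    """Root-mean-square deviation between two paired coordinate sets."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum() / len(diff)))
```

Fitting on the rigid core and then applying `transform` to all atoms is what makes the moving regions stand out; fitting on a different subset yields a different, equally valid picture.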

Figure 4: Visualizing ligand-binding sites.

(a) A useful initial view is to show ligands and binding site residues in ball-and-stick and wire-frame representations, respectively. Here, an inhibitor is shown bound to HIV protease (PDB 1HVR99). (b) Visualizing the same binding site using a molecular surface colored by atom type reveals the catalytic oxygen atoms (center, red). (c) Here, AutoLigand44 has been used to find regions that might bind a ligand-sized molecule. (d) Two structures of the same protein (estrogen receptor) superimposed using Relibase58,59, one with estrogen (blue, PDB 1QKU)100, a second with an antagonist (red, PDB 1ERR)101, give insight into the antagonist mechanism. (e) All 74 structures of human estrogen receptor compared using PDBsum, showing estrogen (red) and cofactors (green). (f) Comparing binding sites of related structures can give insight into drug specificity. Image shows estrogen receptor (green), progesterone receptor (gray) and androgen receptor (orange). (g,h) Simplified two-dimensional schematics can be useful for visualizing binding site interactions, such as hydrogen bonds (dashed lines), unbonded contacts ('eyelashes', g) and hydrophobic interactions (green curves, h). (i) To study drug specificity, interaction networks can be used to show all proteins known to interact with a drug. Images made using SRS 3D7 (a), PMV25 (b,c), OpenAstexViewer18 (d), Jmol (e), MOE (f), LIGPLOT65 (g), PoseView66 (h) and STITCH63 (i).

Molecular surfaces and electrostatic potentials. Many tools can generate molecular surfaces, most commonly the so-called Connolly surface38, which is derived by rolling a sphere the radius of a water molecule around the atomic van der Waals surface of the molecule. This surface, also known as the solvent-excluded surface, can be used as a canvas to map a wide variety of properties such as residue conservation scores, hydrophobicity (Fig. 1h), depth-cue information (Fig. 1e), mean-force potentials39 and electrostatics (Fig. 1g). Such colored surfaces (sometimes called texture mappings) can give insight into molecular interactions and conformational changes, for example, by highlighting surface regions with complementary shape and charge. The molecular surface can also be used to estimate the energetics of molecular interactions, including the entropic cost of desolvation, by calculating the area buried from solvent upon binding of other molecules40.

Although many programs can generate a surface, the program MSMS41 is widely used as it provides a good estimate of molecular surface area and volume, and the most relevant molecular geometry when analyzing molecular interactions and interfaces.
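To make the notion of 'area buried from solvent upon binding' concrete, the sketch below estimates it numerically. Note the substitution: this uses a Shrake-Rupley style point-sampling estimate of the solvent-accessible surface, not the analytical solvent-excluded surface computed by MSMS, and it is far slower than dedicated programs. The 1.4 Å probe radius, the number of sample points and all function names are assumptions for this example.

```python
import math


def sphere_points(n):
    """Roughly uniform points on a unit sphere (golden-spiral layout)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = i * golden
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def accessible_area(atoms, probe=1.4, n_points=200):
    """Approximate solvent-accessible area of a set of atoms.

    atoms: list of (x, y, z, radius) tuples, in angstroms.
    """
    unit = sphere_points(n_points)
    total = 0.0
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe  # probe-expanded radius of atom i
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            # A test point is buried if it lies inside any other expanded atom.
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                rr = rj + probe
                if (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < rr * rr:
                    buried = True
                    break
            if not buried:
                exposed += 1
        total += 4.0 * math.pi * big_r * big_r * exposed / n_points
    return total


def buried_area(atoms_a, atoms_b, probe=1.4, n_points=200):
    """Area lost to solvent when A and B form a complex."""
    separate = (accessible_area(atoms_a, probe, n_points)
                + accessible_area(atoms_b, probe, n_points))
    complexed = accessible_area(list(atoms_a) + list(atoms_b), probe, n_points)
    return separate - complexed
```

The triple loop makes this practical only for small examples; production tools use neighbor lists or analytical surfaces.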

Ligand binding sites

Interactions between macromolecules and small molecules often occur in buried active sites; these may be catalytic active sites, allosteric sites, or sites that may either disrupt or stabilize protein-protein interactions. The PDB at present contains over 37,000 binding sites involving about 10,000 different types of ligand molecules. A range of methods are available to characterize and visualize these sites, depending on the questions asked by the end user.

Annotation and highlighting. For gaining an initial insight into the atomic interactions in the binding site, a useful representation is to display ligands using a ball-and-stick representation and to display only backbone atoms of the protein or nucleic acid, except for those residues in direct contact with ligands (Fig. 4a). Many molecular graphics tools have been developed to support working with small molecules (for example, DS Visualizer, MOE, PMV25, PyMOL, STRAP19, Swiss-PdbViewer20, SYBYL, VMD26, WHAT IF42, Yasara; Table 1). Almost any of these can produce such views, and those with scripting capabilities can often be programmed to recreate this view on demand.
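The selection underlying such a view, namely the residues in direct contact with a ligand, reduces to a distance test. The sketch below is tool-independent and illustrative only: the input layout, the function name and the 4.0 Å cut-off are assumptions, and each graphics program has its own selection syntax for the same operation.

```python
import math


def contact_residues(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return residues with at least one atom within `cutoff` angstroms
    of any ligand atom.

    protein_atoms: iterable of (chain_id, residue_number, residue_name, (x, y, z))
    ligand_atoms:  list of (x, y, z) coordinates
    """
    ligand_atoms = list(ligand_atoms)
    contacts = set()
    for chain_id, res_num, res_name, xyz in protein_atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            contacts.add((chain_id, res_num, res_name))
    return sorted(contacts)
```

The returned residues would be drawn with side chains shown, and everything else reduced to backbone.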

In addition, many PDB entries or related files (for example, UniProt) have annotations indicating which residues form the binding site. It can be instructive to display these annotations directly on three-dimensional structures, and many molecular graphics tools enable such displays (for example, JenaLib28, PDBsum29, ProSAT2 (ref. 31), SRS 3D7 and Ligand Explorer in the RCSB PDB).

Surface-based approaches. Structural details of binding sites are widely used in rational drug design, usually to generate ideas for classes of compounds for screening43. A common question is to ask what kinds of small molecules may bind to a given binding site. Many molecular graphics viewers allow the surface to be colored by local properties, such as hydrogen bonding ability, hydrophobicity or electrostatics, to allow exploration of chemical complementarity (Fig. 1g,h). The local curvature of the surface may also be used to evaluate steric complementarity.

Volume-based approaches. An alternative approach is to analyze the space around the target molecule, highlighting regions that may form strong interactions with small molecules. Some tools (for example, AutoLigand44) allow probe atoms, such as carbon atoms or oxygen atoms, to be scanned through the entire space and the interaction energies of the probes with the molecule to be evaluated. The resultant three-dimensional data sets are then rendered to show the areas of most favorable interaction45. More recently, atomic probes have been used to create maps of the atomic affinity. These may be rendered using isocontours, texture-mapped clipping planes or volume rendering (Fig. 4b,c). Many researchers are now analyzing these volume data sets to identify and visualize ligand-sized regions of maximal affinity44.

Sequence-profile approaches. Another approach to identify ligand binding sites uses multiple sequence alignments mapped onto three-dimensional structures46. This approach is based on the observation that binding site residues tend to be more conserved than other positions, so it can be particularly useful when little is known about a protein. Even for well studied proteins, however, these methods sometimes find binding sites not previously noticed. Some examples of such services are TraceSuite47, ETV48 and others48,49,50.
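The observation that binding-site residues tend to be conserved can be illustrated with a simple per-column score. The sketch below uses normalized Shannon entropy, which is a deliberately basic stand-in and not the evolutionary trace method implemented by the services cited above. It assumes a protein alignment given as equal-length strings with '-' for gaps; the function name and the normalization by log2(20) are our own choices.

```python
import math
from collections import Counter


def column_conservation(alignment):
    """Return one score per alignment column in [0, 1].

    1.0 means a single residue type in that column; lower values mean
    more variability. Gaps are ignored; an all-gap column scores 0.0.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for col in range(length):
        residues = [seq[col].upper() for seq in alignment if seq[col] != "-"]
        if not residues:
            scores.append(0.0)
            continue
        counts = Counter(residues)
        n = len(residues)
        entropy = 0.0
        for c in counts.values():
            p = c / n
            entropy -= p * math.log2(p)
        scores.append(1.0 - entropy / max_entropy)
    return scores
```

The resulting scores can be written into a per-residue field and used to color a structure, so that clusters of conserved surface residues become visible.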

Multiple ligands. A three-dimensional structure gives a snapshot of a single state; however, in some cases, several different structures of the same protein exist with different ligands. We can use this information to help explore the range of conformations available to the system. For example, such comparisons can highlight interactions common to all known binding partners, which may help to guide the search for further possible binding partners51,52,53. For such comparisons, it can be useful to try different sets of atoms for superposition—for example, the ligand alone, or all atoms involved in the binding site. Each of these superimpositions can highlight different aspects of the conformational differences.

Often, it is of interest to compare structures with multiple ligands obtained by means of docking tools (for example, FlexX54, AutoDock55). To preselect promising compounds, computational chemists can scan large libraries of drug-like molecules and dock 'hits' into the binding site of the protein target56. Subsequently, the docked structures can be inspected visually to find ways of enhancing the predicted strength of binding57. Some docking tools now provide graphical interfaces (for example, FlexV and AutoDockTools) for the preparation of the input structures and the analysis of the results. These tools allow the comparison of interaction geometries of different ligands with the same protein.

Two useful resources for comparing multiple ligand structures are Relibase58,59 and Superligands60, which both contain information about all ligands in the PDB and take special care to ensure the assignment of chemically correct atom and bond types. Both resources allow searching by identifiers as well as chemical substructure searches and similarity searches; Relibase also offers keyword searches and sequence similarity searches. The structures can be displayed in two or in three dimensions in embedded viewers. When exploring a specific protein, it is especially useful to search for similar complexes; Relibase lists similar proteins with their respective ligands, which can subsequently be superimposed and displayed in the embedded OpenAstexViewer18 (Fig. 4d and Supplementary Fig. 1). The extended functionalities of Relibase+ (which requires a paid license) give an analysis of the differences in the superimposed structures (protein movements and ligand overlap).

PDBsum can also help visualize multiple ligands binding to the same protein by superimposing the protein's different structural models in the PDB and identifying any 'ligand clusters'; that is, sites where the ligands from the different structures overlap (Fig. 4e).

Multiple proteins and ligands. Finding features that are specific to a given target adds another level of complexity when studying protein-ligand interactions. To identify features determining selectivity, it is useful to compare the target binding site with binding sites of similar proteins. The “similar binding site” as well as the “similar ligand” search of Relibase can help to identify and compare similar protein complexes. Here, again, the Relibase+ comparison table is especially useful for detecting differences in the protein binding sites—mutations, insertions and residue movements. MOE provides a similar facility to help compare multiple proteins bound to multiple ligands (Fig. 4f).

Structural visualization can be useful for predicting side effects and 'off-label' uses of known drugs by comparing the target binding site to other known protein structures61,62. Some graphic tools support this: for example, Relibase+ offers a search for “similar cavities,” where the protein comparison is based on physico-chemical properties rather than residues, hence finding remote similarities not evident from sequence similarity.

Structural visualization can also help in developing more selective drugs. Although promising, such approaches remain speculative, and their success will be fundamentally limited, as the PDB contains only a small fraction of all binding site geometries. A complementary approach is to use the much larger set of known protein-drug interactions where no three-dimensional structure is available. For example, STITCH63 can be used to show a network featuring all proteins known to interact with a given drug, based on a wide range of experimental databases, including the PDB (Fig. 4i). In the future, we anticipate that such approaches will be improved, and that PDB data will be increasingly incorporated into network visualization methods64.

Schematic illustrations. For presentations and printouts, it can be useful to highlight key interactions in the binding site using simplified schematic illustrations produced by tools such as LIGPLOT65, PoseView66 and Ligand:Protein Interaction Diagrams67 (part of MOE). These illustrations show the ligand and interacting protein side chains 'flattened' in a plane, and indicating relevant hydrogen bonds, covalent bonds, unbonded contacts and water-mediated hydrogen bonds (Fig. 4g,h). For comparing different complexes, LIGPLOT65 and MOE allow the user to generate a series of plots for related proteins binding the same or different ligands. Equivalent components of each plot are plotted in the same relative location, thus highlighting residues and interactions present in some of the structures but missing in others.

RNA structures

Over 4,000 nucleic acid three-dimensional structures are on deposit in the Nucleic Acid Databank (NDB68), mostly RNA structures, either determined experimentally or by ab initio prediction. NDB is also synchronized with the PDB1, and RNA structures account at present for nearly 8% of PDB entries. Many standard aspects of visualizing three-dimensional structures of RNA can be performed completely adequately by molecular graphics tools designed for proteins, such as PyMOL and Swiss-PdbViewer20 (Table 1).

Knowing the secondary structure of an RNA molecule often gives significant insight into its function, much more so than for protein secondary structure. RNA secondary structure can be derived either from multiple sequence alignments or from thermodynamic predictions, although the process requires specialized features and capabilities not available in most tools for visualizing protein alignments or structures. Multiple sequence alignment is particularly important in RNA research; alignments can be used to find covariations between nucleotide positions, which are then taken as evidence for a contact between the two nucleotide positions, and these contacts in turn define secondary structure (Fig. 5).
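One common way to quantify covariation between two alignment columns is mutual information, sketched below. This is an illustrative measure chosen for the example; the tools discussed here may use other statistics or add corrections for phylogeny and sampling. The alignment is assumed to be a list of equal-length RNA strings with '-' for gaps, and the function name is our own.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Sequences with a gap in either column are skipped. High values mean
    that knowing the nucleotide at i helps predict the one at j, as
    expected for positions that form a base pair.
    """
    pairs = [(s[i], s[j]) for s in alignment if s[i] != "-" and s[j] != "-"]
    n = len(pairs)
    if n == 0:
        return 0.0
    freq_i = Counter(a for a, _ in pairs)
    freq_j = Counter(b for _, b in pairs)
    freq_ij = Counter(pairs)
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi
```

In the toy alignment `["GAAC", "CAAG", "AAAU", "UAAA"]`, columns 0 and 3 always form a Watson-Crick pair and score 2 bits, whereas column 1 is invariant and scores 0 against any other column; an invariant column carries no covariation signal even if it is paired.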

Figure 5: Visualization of RNA structure in one, two and three dimensions.

Viewing multiple sequence alignment simultaneously with two-and three-dimensional representations greatly helps in assigning two-dimensional structure and understanding function. This process is aided by synchronizing colors in all three views. The RNA structure shown is from SARS virus102, and the image was made using S2S Assemble69 with PyMOL.

Because of these special-purpose requirements, the RNA community has developed its own specialized visualization tools (Supplementary Table 1) for viewing RNA secondary structure. Some of these RNA tools (for example, S2S Assemble69) provide an integrated environment for interactively visualizing multiple sequence alignments, intramolecular contacts and RNA three-dimensional structures (Fig. 5). The most useful tools provide the option to manually edit the two-dimensional contacts, allowing not only reorientations of elements but also deletion and addition of nucleotides or a whole element, such as a helix.

At present, two of the main challenges in RNA visualization are as follows: first, RNA often adopts multiple structures depending on experimental conditions, and none of the available tools can deal with this properly. Second, RNA in vivo usually occurs in complex with proteins; however, the RNA-specific tools cannot yet manage such complexes. RNA researchers can use standard molecular graphics tools to view such complexes, but of course this means losing RNA-specific features and capabilities.

Molecular motion

Biomacromolecules are dynamic entities, and motion is usually essential to function70. Visualizing dynamic molecular processes is often key to understanding these processes. Recently, several visualization tools have become available that allow quick and easy exploration of dynamic transitions between two known states of a molecule. For example, the Yale Morph Server71 (http://molmovdb.org/) provides morphed animations of potential plausible pathways between two structures; Moviemaker72 (http://tinyurl.com/moviemaker-v1/) is a web server that permits the user to generate simple animations of a variety of types of protein motion. These tools provide very approximate, often simply schematic, descriptions of the molecular motions.
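The simplest possible morph between two known states is a straight-line interpolation of atomic coordinates, sketched below. This is a schematic illustration only and is not the procedure used by the servers named above; it assumes the two structures are already superimposed and list the same atoms in the same order. Intermediate frames from pure linear interpolation can contain distorted bond lengths, which is one reason such animations should be read as schematic.

```python
def linear_morph(start, end, n_frames=10):
    """Return n_frames coordinate sets going linearly from start to end.

    start, end: lists of (x, y, z) tuples with matching atom order.
    The first frame equals `start` and the last equals `end`.
    """
    if len(start) != len(end):
        raise ValueError("both states must contain the same atoms")
    if n_frames < 2:
        raise ValueError("need at least two frames")
    frames = []
    for k in range(n_frames):
        t = k / (n_frames - 1)  # interpolation parameter from 0 to 1
        frame = [
            tuple(a + t * (b - a) for a, b in zip(p, q))
            for p, q in zip(start, end)
        ]
        frames.append(frame)
    return frames
```

Each frame can be written out as a separate model and played back as a movie in any viewer that supports multi-model files.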

To explore large-amplitude, low-frequency motions, such as protein domain flexing, methods based on normal mode analysis and elastic network models provide a computationally efficient approach73. There are now several websites, for example, NOMAD-ref74 and ANM75, where even a novice user can enter a PDB file, compute normal modes, and visualize and analyze the results.
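An elastic network calculation is compact enough to sketch in a few lines. The example below builds the Hessian of an anisotropic network model in which every pair of nodes closer than a cut-off is joined by an identical spring, then diagonalizes it. It assumes NumPy, one node per residue (typically the α-carbon), a 15 Å cut-off and a uniform spring constant; these are illustrative choices of ours, and the websites cited above may use different parameters.

```python
import numpy as np


def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """Build the 3N x 3N Hessian of an anisotropic elastic network."""
    xyz = np.asarray(coords, dtype=float)
    n = len(xyz)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = xyz[j] - xyz[i]
            dist2 = float(d @ d)
            if dist2 > cutoff ** 2:
                continue  # no spring between distant nodes
            block = -gamma * np.outer(d, d) / dist2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian


def anm_modes(coords, cutoff=15.0, gamma=1.0):
    """Return (eigenvalues, modes) sorted from slowest to fastest.

    The first six eigenvectors, which correspond to rigid-body
    translation and rotation of a connected network, are discarded.
    Column k of `modes` is the displacement pattern of mode k.
    """
    evals, evecs = np.linalg.eigh(anm_hessian(coords, cutoff, gamma))
    return evals[6:], evecs[:, 6:]
```

The lowest-frequency modes returned first are the ones usually inspected for domain motions; reshaping a mode to (N, 3) gives one displacement vector per residue, which can be drawn as arrows or used to animate the structure.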

At a slightly higher level of complexity, several programs allow users to generate conformational ensembles and trajectories using constraint-based methods. Such programs include tCONCOORD76 and FIRST/FRODA77. One application of these methods is to identify segmental flexibility in proteins. The researcher identifies rigid domains in the protein connected by flexible tethers, then defines the geometry of the hinge or shear motions that occur as the proteins change conformation78. The Database of Macromolecular Movements71 provides a service for analyzing hinge motion in proteins. Other websites enable molecular motions to be analyzed by means of hierarchical, multiresolution flexibility trees79.

More realistic and detailed studies of motion require molecular dynamics simulations, which typically simulate 10–100 ns of motion in 1-fs time-steps. Unfortunately, such calculations are generally too CPU-intensive to be provided as a free service; hence, users usually need to calculate their own trajectories. For a first look at molecular dynamics simulations, DSMM80 (http://tinyurl.com/dsmm-eml/) is a site that collects movies showing molecular dynamics simulations. Generally, molecular dynamics simulations are recorded as trajectory files that can be played back in a range of molecular graphics tools that support molecular dynamics (Table 1). There is as yet no unified resource to deposit or access trajectory files, although there are several initiatives in this direction—for example, the MoDEL Molecular Dynamics Extended Library (http://mmb.pcb.ub.es/model/). A related project, called Dynameomics81 (http://www.dynameomics.org/), provides online interactive views of simulations of 30 proteins and plans to extend this to all known protein folds. Such services are still very new, and we can expect significant advances in the next few years.

Of the molecular graphics tools with molecular dynamics support, VMD26 is probably the most widely used. It can display 'movies', analyze properties such as atomic fluctuations and integrate flexibly with other computational tools and with the user's own scripts. Although VMD is popular, many other molecular graphics tools support molecular dynamics trajectories, and each tool often has unique features that may be useful for particular projects (Table 1).

In general, visualization of molecular dynamics trajectories remains challenging owing to intrinsic complexity, such as the large number of atoms involved and the many orders of magnitude in time relevant for biological processes. The most straightforward visualization is to superimpose several molecular dynamics snapshots (Fig. 6a). While often useful, this method has obvious limits. Overall motion can be viewed using 'sausage-like' representations (Fig. 3b); however, often dimension-reduction methods are needed58. An increasing number of such methods are being developed for visualization of specialized cases—for example, transient cavities (Fig. 6b) and molecular diffusion (Fig. 6c–e).

Figure 6: Visualizations of molecular motion.

(a) Four superimposed snapshots from a molecular dynamics simulation (darker protein coloring indicates later snapshots). A ligand is shown moving from its initial position buried in an active site (right) to the protein exterior (left). (b) The same four snapshots using a simplified representation that highlights residues undergoing conformational changes as the ligand escapes. The contoured surface (generated with CAVER103) shows changes to the transient tunnel used by the ligand. (c–e) Visualization of protein-protein diffusion simulations made using SDA (http://tinyurl.com/SDA-EML/). (c) Representative trajectory of a protein (blue) diffusing around a second, target protein (orange). (d) Isocontours (blue) show the region most occupied by the diffusing protein during thousands of trajectories. Target protein, orange. (e) Two-dimensional map of occupancy versus protein-protein center-to-center distance; blue, the most occupied region104.

Large macromolecular assemblies

X-ray crystallography is being used to solve the structures of larger and more complex systems, and there is now considerable overlap in the size range of structures from X-ray crystallography and from electron microscopy (Box 3). It is common to see electron microscopy isosurfaces into which atomic-detail X-ray structures have been fitted. Meanwhile, electron microscopy continues to produce higher-resolution density maps of large assemblies and of single particles, such as viruses or other isolated complexes (Box 3), in addition to tomograms of higher-order, unique structures such as cell sections or isolated organelles82.

Data sets on large-scale assemblies, which integrate results from X-ray crystallography (Box 1), NMR spectroscopy (Box 2), electron microscopy (Box 3) and even light microscopy82,83,84, pose many new challenges for visualization. Many of these data are not at atomic detail, so other representations must be used. In addition, the systems can be very large, and there are often issues with computational and graphics performance. There is a need for high-performance, interactive visualization of such large assemblies across very different distance scales; some tools, such as Amira (Visage Imaging) and PMV25, were designed with such challenges in mind.

At present, researchers typically use a hierarchical approach to visualizing large macromolecular assemblies. For portions for which atomic information is available, atomic representations may be used, and then abstracted to simpler, surface-based representations. These surfaces may then be integrated with density sections or volumes from the lower-resolution methods (for example, electron microscopy tomography). This approach scales nicely from the level of atoms to the level of cells, allowing the use of simpler, more abstracted representations of the individual components as one moves to large systems, such as intracellular components (Fig. 7a) or even whole-cell visualization (Fig. 7b), and to multiscale movies85.
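The abstraction step from atoms to a smoothed surface can be sketched as follows. One simple approach, assumed here for illustration and not attributed to any tool named in this review, is to place a Gaussian on every atom, sum the Gaussians on a coarse grid and treat grid points above a chosen contour level as lying inside the surface. The function names, the smoothing width `sigma` (in the same units as the coordinates, for example ångströms) and the contour `level` are all hypothetical choices.

```python
import math

def make_grid(lo, hi, spacing):
    """Regular cubic grid covering [lo, hi] on every axis."""
    n = int(round((hi - lo) / spacing)) + 1
    axis = [lo + i * spacing for i in range(n)]
    return [(x, y, z) for x in axis for y in axis for z in axis]

def gaussian_density(atoms, grid_points, sigma=3.0):
    """Sum of isotropic Gaussians centered on the atoms, at each grid point."""
    two_sigma_sq = 2.0 * sigma * sigma
    density = []
    for gx, gy, gz in grid_points:
        total = 0.0
        for ax, ay, az in atoms:
            d2 = (gx - ax) ** 2 + (gy - ay) ** 2 + (gz - az) ** 2
            total += math.exp(-d2 / two_sigma_sq)
        density.append(total)
    return density

def inside_surface(atoms, grid_points, sigma=3.0, level=0.5):
    """Grid points whose density is at or above the contour level."""
    density = gaussian_density(atoms, grid_points, sigma)
    return [p for p, d in zip(grid_points, density) if d >= level]
```

A larger `sigma` gives a smoother, more abstract shape suited to crowded scenes, whereas a smaller value retains more atomic detail. Extracting a renderable mesh from the grid would require a separate contouring step, which is not shown here.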

Figure 7: Two examples of multiscale, hierarchical visualization.

(a) An atomic structure of an antibody (bottom) was used to create a smoothed surface as part of a more complex scene of blood serum (top). Images made with AVS (http://www.avs.com/) and PMV25. (b) Top, a 2.4-nm electron tomogram slice of a human skin section showing part of the nuclear envelope (blue), cytoplasm (black background) and a desmosome (orange) at the boundary of the two cells. Using sub-tomogram averaging, the interaction of cadherin proteins can be resolved105, and they were used to calculate isosurfaces (below) into which the atomic-detail structure of C-cadherins106 has been fitted. Images created using MATLAB and Amira. Scale bars, 10 nm.

Visualization hardware

Most of this review has focused exclusively on software developments, tacitly assuming that computer and display hardware are adequate for all visualization tasks we require. In the early days of molecular graphics tools, hardware limitations were a key issue; display systems were often very expensive, and they relied on nonstandard hardware. Significant effort in software development was directed toward ameliorating hardware limitations. Today, although most molecular graphics tools run comfortably on standard desktop computers, many hardware issues remain, particularly for the more complex visualization tasks, such as the study of molecular motion and of large assemblies.

Stereo capabilities can greatly enhance molecular graphics and, although available for many years on expensive and specialized systems, stereo is only just now becoming available for desktop LCD screens.

For particularly large assemblies, computational speed is often still an issue—here, it is important to use a top-of-the-line graphics card, and also to use molecular graphics tools that can take advantage of hardware acceleration. Fortunately, most tools can, the principal exceptions being RasMol and Chime.

Immersive virtual reality. Visualizing large, complex and multi-scale macromolecular assemblies, especially combined with molecular motion, is not only challenging computationally, but ultimately may require display systems significantly better than current computer monitors can provide. Immersive virtual reality is very promising, enabling the user to virtually enter a microscopic world, flying through and interactively manipulating macromolecules. Experimental immersive environments—for example, CAVE86—have been in development for over 20 years, and concepts from this research have been used to enhance the user experience of several molecular graphics tools (for example, Yasara; Table 1). But such techniques have yet to find widespread use for molecular visualization—partially because of the still high cost and cumbersome nature of such systems, but perhaps also because the sense of immersion is not critical for interaction with the molecular world.

Today, however, some of the hardware components for virtual reality are becoming affordable and practical, such as head-mounted displays with head tracking, and a variety of haptic devices (mechanical input devices that are touch sensitive), such as the Wii controller, as well as devices such as wired gloves that can provide force feedback. These improvements are largely driven by the gaming market and are expected to continue at a rapid pace. For most molecular graphics tools, minimal modifications should be required to allow them to work with such hardware, and some tools have been built with such support already in mind (for example, VMD26 and SRS 3D7). However, fully exploiting the promise of virtual reality will require substantial further software development, particularly in the user interface layer.

Physical models. Today, molecular visualization relies almost exclusively on computer-generated images. Although physical wooden and wire molecular models played a critical early role in structural chemistry and biology, the advent of three-dimensional interactive computer graphics in the 1970s provided new and much improved utility in macromolecular structure determination and analysis. However, a more recent technology, computer autofabrication, or 'solid printing', initially developed for industrial rapid prototyping, is now being used to produce physical molecular models. Such models bring back the properties of real-object perception and manipulation that were lost when the model resided only in the computer. Over the past decade, the variety of such printers has increased steadily as the entry price has dropped to below $10,000, and printing services have sprouted to fill this new niche. Because accurate and complex tangible models can be produced automatically as computer 'printouts' of molecular geometrical representations, the barrier to custom production has disappeared. Physical models with functional parts have been autofabricated with analog physical constraints, affinities and/or structural behavior of the molecular system87 (Fig. 8).

Figure 8: Tangible models in research.

Tangible models were used to explore the modes of self-assembly of viral capsids88. (a) The electrostatic and charge complementarity is displayed using isosurfaces for the protein and electrostatic potential. (b) Affordances for placement of magnets were designed into the protein surface using constructive solid geometry methods. (c) Physical models were built and fitted with magnets. Twelve pentameric subunits then self-assemble when shaken for several minutes in a tube. Images created with PMV25. (d) An augmented-reality interface used to study molecular interactions of the enzyme superoxide dismutase. An inexpensive video camera (not in the picture) views the models, and embedded markers on the surface (small black squares) are used to determine the orientation of the model from the video image. Volume-rendered electrostatic potentials and small animated arrows for the electrostatic field vectors are then overlaid onto the video image and follow the model as the user manipulates it.

Such models have begun to be used for structural research. As persistent objects they are convenient, accessible and naturally manipulable. They can be used as springboards to ideas and hypotheses88. Such characteristics also make physical models useful in multidisciplinary collaborations, helping structural experts communicate better with other colleagues. In addition, physical models lend themselves to teaching89. We are in the early stages of learning how to best use physical models in structural biology education and research, perhaps comparable to where computer graphics was in the 1970s. This is an ongoing area of research90,91,92.

Future perspectives

Methods for visualizing molecular structures are very mature. In the near future, we can expect more effective computational approaches for representing, analyzing and synthesizing ever-more-complex molecular systems. Increased collaboration with the graphic design community will also lead to the development of more effective and intelligible rendering approaches. However, we expect that most of the advances in molecular visualization will come in the areas of computer interfaces, user interaction and new ways to represent and visualize nonspatial information. These changes will help structures reach an even broader audience. Navigating a synthesis of structural data with image data82 and genomic22,93 and biological network information64 will require new methods that combine spatial and dynamic representations with statistical and high-dimensional abstract relationships. We also anticipate that collaborative community editing of structure-related data sources (for example, Proteopedia94) will change how scientists relate to structural data, and to each other. The fields of information visualization and visual analytics have developed over the past decade to address problems in making such complex data intelligible and navigable95,96.

Some of the drawbacks of immersive virtual reality may be overcome by the emerging technology of augmented reality (Fig. 8d), which provides inexpensive and accessible ways to interact intuitively and in perceptually rich ways with our computational models. Whatever direction new technologies take us, the roles of macromolecular visualization in understanding, gaining insight and developing ideas will remain the same.

Note: Supplementary information is available on the Nature Methods website.