Aligning single particles images without the assumption of symmetry is a multi-factorial process. Through a series of iterative steps it is possible to identify the asymmetric features within an image, orient about them accordingly, and then refine the model. The following steps (also see the online methods and supplementary figure 8 of the related article) outline the process of asymmetric orientation determination and have the prerequisite that accurate icosahedral alignment parameters (i.e. orientations and centers). Additionally, this methodology relies on the production of a variety of masks and reference projection images, each of which is used at specific points in the refinement to hone in on the true asymmetric orientations.
Step 1 (preparation)
Determine the icosahedral orientation of the 36,000 particles using the MPSA icosahedral orientation determination procedure8, and store the alignment parameters in a separate list file (EMAN9 format: image number, image name, Euler angles and center). For the icosahedral reconstruction of the P-SSP7 phage, use a box size of 720 pixels (1.17 Å/pixel), which may result in some single-particle images where the phage tail is cut off. For the asymmetric reconstruction it is necessary to include the tail (protein apparatus outside the capsid shell, including nozzle, adaptor and fiber) at one of the 12 5-fold vertices of the phage, corresponding to a box size of 960 pixels (1.17 Å/pixel) in the raw data. Since the target resolution of the final asymmetric reconstruction is well below that which the data was sampled, you must bin the single particle images by a factor of 2, to a size of 480 pixels (2.34Å/pixel) to reduce the computational burden of Fourier synthesis. As binning the data does not alter the computed icosahedral orientations, it is only necessary to modify the particle centering parameters of the list file accordingly. The new center are derived from the following two steps: 1) adjust center 120 pixels in both X and Y directions due to the offset from enlarging the box size from 720 pixels to 960 pixels, 120 pixles = (960-720)/2; 2) shrink the adjusted center by 2 because the raw image, with size of 960 pixels, was shrunk by 2. For example, an original center (358.148, 359.562) needs to be adjusted to (239.074, 239.781), where 239.074 = (358.148 + 120)/2, 239.781 = (359.562 + 120)/2. Using the modified list file described above, and the binned particle images, icosahedrally reconstruct an initial 3D model (480×480×480 pixels, 2.34 Å/pixel) (Fig. 1a) using the EMAN make3d program9.
Step 2 (initial 3D tail mask and model generation)
During the icosahedral reconstruction of P-SSP7, the tail density located at a special 5-fold vertex is weakened by a factor of at least 12, as there are 12 possible positions for the tail in the icosahedral map. Accordingly, you must lower the display threshold of the initial map (Fig. 1a) from 2.54 to 0.12 in order to visualize the “tail” (Fig. 1b). At this threshold it is possible to see ghost densities of the tail at each of the 12 5-fold vertices. Using the Amira software package produce a tail mask (Fig. 1c) from the ghost density of the tail along the +Z-axis of the initial icosahedral map. This mask should loosely fit the ghost density (i.e. be slightly bigger than the actual size of the ghost density). To make the negative density of tail ghosts visible in Chimera10, add a positive constant of 0.5 (sufficient to display all densities) to the 3D icosahedral density (Fig. 1b) before applying the mask. Extract the initial tail model (Fig. 1d) from the initial icosahedral model using the previously generated tail mask and then low-pass filter the map in EMAN using the proc3d program. Fig. 1e-f show how the ghost tails and capsid look without and with low-pass filtering (20Å), respectively.
Step 3 (initial 2D tail mask generation)
The tail of P-SSP7 occupies a very small amount of volume in the virion. Using a dynamic masking strategy you can make it stand out from the highly symmetric capsid and DNA, by maximally excluding non-target objects in the 2D raw particle images. Assuming the tail is a constant feature in P-SSP7, different particle orientations cause the tail projection in the raw 2D images to occur in different locations and have different shapes (Fig. 2a-b). To optimize computational time, create a database of all possible masks, indexed by projection orientation. As Azimuthal rotation (AZ defined in EMAN) does not appreciably change the shape of a mask projection, do not consider it as a projection parameter. As such, project the masks for the database along the initial 3D tail mask for only two of the three Euler angles (altitude (Alt) and in-plane (Phi)), as defined in the EMAN coordinate system. First, make projections of the 3D tail mask for different altitude orientations (Fig. 2c) (0° to 90° at an interval of 3°). In the current implementation of our program, the total computational time is dictated by the time it takes to load all of these masks into memory, and is relatively independent of the process of orientation determination. As there is a trade off between number of masks and computational time, instead of creating a prolific database of tight masks, merge several neighboring masks together to create a set of looser, but more computationally efficient, masks. Accordingly, with the exception of the first projection at 0°, merge every three consecutive projections to the 3rd one (Fig. 2c). Under this scheme you produce 11 projection masks from 0° to 90° (Alt) at an interval of 9°. The next step generates in-plane (Phi) rotation masks from 0° to 360°. Rotate each of the masks (again with the exception for the 0° (Alt) projection) along the Z-axis with the step size 0.5°. Just as before, to minimize computational restraints, merge every 20 rotations and index them to the angle corresponding to the middle (10th) mask (Fig. 2d). Under this scheme 36 in-plane rotation masks are produced for each mask at a given altitude angle. From these two consecutive steps 361 (1+10×36) 2D tail masks, covering all possible asymmetric orientations of the particle, are generated. During the process of dynamic masking, when choosing a mask from the database for a given particle, round the particle’s orientation to the nearest index orientation (ignoring the azimuthal orientation) when picking the 2D tail mask.
Step 4 (initial 2D tail reference projections)
The masked density of a raw image is comparable to the references projected from the tail model. We empirically use 400 projections as cross common line11 references during asymmetric orientation alignment. Compared to the number of references typically used during icosahedral reconstruction, this is sufficient to obtain a subnanometer resolution reconstruction. To save time, distribute this process to 10 CPUs, using EMAN’s runpar program. After each node simultaneously generates 40 random projections, merge these 10 projection sets together into one (this process takes only a matter of minutes to complete).
Step 5 (masking potential asymmetric features)
Starting with a single icosahedral orientations (from step 1), MPSA internally generates 60 equivalent icosahedral orientations for each particle image. As each of these orientations is a candidate for the true asymmetric orientation of a particle, round each of these icosahedral orientations to the nearest index orientation in the 2D tail mask database (from step 3), extract the corresponding mask and apply it to the particle image.
Step 6 (determining asymmetric orientations)
In MPSA, icosahedral orientations8 are determined through the use of cross common line11 correlation between a raw particle image and reference projections with 60-fold symmetry enforced (60 common lines per comparison between the particle image and a reference projection). Here the asymmetric orientations are determined using the same convention, but with no symmetry enforcement (one common line per comparison). Compute the cross common line correlations between each of the masked regions generated in step 5 and the 400 2D tail model reference projections. Compute the residual (a value inversely related to the correlation) using the formula Σi(0.5-cos(φi)) where i runs all selected points along the common line, and φi is the phase difference between the two Fourier transforms. The distribution of this function changes from iteration to iteration and depends on the data range used for calculating the common line correlations. For the first iteration, focus on a subset of frequency space between 1/200-1/60Å-1 for calculating the residuals of the cross common line correlations. After calculating the residuals for the 60 candidate orientations, choose the orientation that had the smallest residual as the particle’s asymmetric orientation. For the first iteration (1/200-1/60Å-1) the best residuals should range from 0.42 to 0.47, which will improve to 0.32 to 0.42 by the fifth iteration (1/200-1/18Å-1).
Step 7 (reconstructing a 3D density map without symmetry imposed)
Following the schematic outlined in steps 5 and 6, process each of the 36,000 particles to determine their initial asymmetric orientations. Once the approximate asymmetric orientations of all the particles are determined, reconstruct a 3D density map, free of any symmetry, using the EMAN make3d program9.
Step 8 (refining the asymmetric 3D density map)
After the first iteration of the asymmetric reconstruction algorithm, the tail’s relative intensity becomes much stronger as the capsid shell is no longer over-represented in the reconstruction. Alternatively the portal density is not well differentiated from the high density DNA that fills the capsid shell. For the 2nd iteration continue to use just the tail to generate masks and reference projections, however starting in the 3rditeration, and for subsequent iterations, use both the tail and portal densities (Fig. 2b) (which reside within the capsid shell) as the asymmetric target, which makes the orientation determination easier due to the larger volume of the model used. To refine the asymmetric reconstruction, repeat steps 2-7 for data with different resolution cut-offs (45Å, 35Å, 25Å, and 18Å for the 2nd, 3rd, 4th and 5th iterations respectively). For the final couple of iterations, extend the resolution cut-off to 12Å and compare the orientations determined at this level to those from the 5th iteration, keeping only those particles that were consistent from the 5th to final iteration. From the initial set of 36,000 particles, approximately 15,000 of the particles should be retained in the final reconstruction, yielding a final asymmetric map at 9Å.