Asymmetric cryo-EM reconstruction of phage MS2 reveals genome structure in situ

In single-stranded ribonucleic acid (RNA) viruses, capsid assembly and genome packaging are intertwined processes. Using cryo-electron microscopy and single particle analysis, we determined the asymmetric virion structure of bacteriophage MS2, which includes 178 copies of the coat protein, a single copy of the A-protein and the RNA genome. This reveals that in situ, the viral RNA genome can adopt a defined conformation. The RNA forms a branched network of stem-loops, almost all of which are located near the capsid inner surface, while predominantly binding to coat protein dimers that are located in one half of the capsid. This suggests that genomic RNA is highly involved in genome packaging and virion assembly.

Bacteriophage MS2 (ref. 1) is a species of the Levivirus genus in the Leviviridae family of small, positive-sense, single-stranded ribonucleic acid (RNA) bacteriophages that infect their host via adsorption to bacterial pili. The MS2 virion consists of an RNA genome, encapsidated by a T = 3 shell containing coat protein (CP) and a single copy of the maturation protein, or so-called A-protein (AP), which attaches to the viral RNA and binds the host receptor. The MS2 genome is one of the smallest known, comprising just 3,569 nucleotides, and was the first genome of any life form to be completely sequenced 2 . The genome encodes four proteins: maturation, coat, lysis and replicase, which are translated at different levels and time points during the bacteriophage life cycle. The RNA adopts a specific secondary structure with many double-stranded regions, which play essential roles in translational regulation and replication 3 . Extensive research has shown that access to ribosomal binding sites for translation is regulated by short- and long-range base pairing 4 and by binding of RNA stem-loops (SLs) to CP dimers. More specifically, maturation gene translation can only take place during the first stages of synthesis of a new plus strand 5 . The start of the lysis gene is suppressed by a local hairpin and accessed occasionally by back scanning and frame shifting of ribosomes that have finished translation of the CP gene 6 . Expression of the replicase gene, and therefore replication, is controlled by base pairing with a coding region of the CP 7,8 , and downregulated by binding of an SL containing the start codon, called the 'translational operator' (TR), to a CP dimer (CP 2 ) 9 . In addition, the RNA secondary structure is involved in replicase and AP binding and in resistance to host RNases 3 .
Although MS2 capsids spontaneously assemble in vitro from CP alone at a high enough concentration 10 , the RNA plays an important role in virion formation in vivo. First, encapsidation of its own genome depends on the presence of the AP 11 , which specifically binds two RNA sequences, one in the 5′ maturation region (nucleotides 388-414) and one in the 3′-untranslated region (nucleotides 3,510-3,527) 12 . Second, capsid formation is strongly promoted by binding of the TR to CP 2 (ref. 9). Third, specific SL-CP 2 interactions form packaging signals (PSs) that promote genome packaging 13,14 .
Atomic models of the MS2 capsid 15 , capsid protein dimer 16 and the translation-suppressing 19-nucleotide RNA TR hairpin loop bound to a CP 2 (ref. 17) were determined using X-ray crystallography, revealing sequence-specific RNA-protein interactions essential for binding. The structures of aptamers showed that multiple RNA sequences can bind to MS2 CP 2 (refs. 17-22). Icosahedral cryo-electron microscopy (cryo-EM) reconstructions of the MS2 virion, including genome and AP, showed that the operator-CP 2 interaction is not unique, and that the whole genome is highly connected to the capsid via SLs 23,24 . Asymmetric single particle 24 and tomography 25 cryo-EM reconstructions of the MS2 virion suggested that the genome within MS2 might adopt a specific tertiary structural conformation, although the resolution of these maps was too low (~4-6 nm) to resolve any recognizable tertiary RNA structures.
Following these investigations, we here used high-resolution single particle cryo-EM to determine the asymmetric structure of bacteriophage MS2 to a resolution of 8.7 Å. The map outlines the AP, which replaces one CP dimer. Moreover, it shows an ordered genome that is shaped as a branched network of connected RNA stem-loops, the majority of which interact with the inside of the capsid. The RNA-CP interactions are primarily located on one side of the capsid, which might have consequences for genome packaging and virion assembly.

Results
Asymmetric structure. The outside of our asymmetric EM map of the MS2 virion (Fig. 1a) shows the characteristic openings in the capsid at the five- and three-fold symmetry axes and the small protrusions on the capsid surface, formed by amino acid loops connecting beta-sheet strands A and B (dark blue in Fig. 1a,b). More importantly, in the asymmetric virion structure the single copy of the AP is clearly resolved in the capsid (yellow, Fig. 1a-d, Supplementary Movie 1) and replaces one CC conformer-type CP 2 in the capsid, which thus contains 178 copies of the CP. The AP forms a 9 nm long 'handle' (~1.8 nm in diameter) that extends outwards at an angle of ~30° to the surface from a two-fold symmetry axis position in the capsid, extending over a neighbouring hole at a three-fold axis.
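The subunit count quoted above follows from simple bookkeeping, which can be sketched as below. This is a minimal illustration and not part of the original analysis: it assumes the standard relation of 60 × T subunit positions for an icosahedral shell, and the function name and the `dimers_replaced` parameter are hypothetical.

```python
def capsid_composition(t_number: int, dimers_replaced: int = 0) -> dict:
    """Count coat protein (CP) copies and dimers in an icosahedral shell.

    Assumes 60 * T subunit positions for triangulation number T.
    `dimers_replaced` (hypothetical parameter) is the number of dimer
    positions occupied by another protein, here the A-protein (AP).
    """
    positions = 60 * t_number            # subunit positions in the shell
    dimer_sites = positions // 2         # CP is organized as dimers (CP2)
    cp_dimers = dimer_sites - dimers_replaced
    return {
        "positions": positions,
        "dimer_sites": dimer_sites,
        "cp_dimers": cp_dimers,
        "cp_copies": 2 * cp_dimers,
    }


# MS2: T = 3 shell in which the AP replaces one CP dimer.
ms2 = capsid_composition(t_number=3, dimers_replaced=1)
print(ms2)
```

With T = 3 and one dimer replaced, the sketch gives 89 CP dimers and 178 CP copies, in line with the numbers stated in the text.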
The RNA. The inside of our asymmetric MS2 map reveals that the genome within the MS2 capsid has a defined tertiary structure, forming an intricate three-dimensional (3D) branched network of interconnected SLs (Fig. 1c-e, Supplementary Movie 2). The RNA density accounts for 95% of the calculated mass of the full genome. A less ordered region in the map can explain the missing 5%. In total, 59 SL structures were distinguished in the genome, all connected to each other, of which the majority (53, 90%) ended near the capsid (Supplementary Fig. 3), while 6 ended centrally. To quantify and assess the PSs, that is, the interactions of the RNA SLs with CP 2 , the TR-CP 2 X-ray model (PDB entry: 1ZDH) was fitted into the EM density map. In total, 44 SLs (83% of the SLs near the capsid and 75% of all identified SLs) ended at a CP 2 RNA binding site, 2 SLs interacted with the AP (Fig. 1d) and 7 were not located near a CP 2 interface. Close examination of these 44 binding sites showed that at 23 sites the RNA from the X-ray model fit the EM RNA density (Supplementary Fig. 4), with some of them showing in detail the interactions between nucleotides A−4 and A−10 in the 19-nucleotide RNA hairpin loop (1ZDH), which both make contact with Thr45 and Ser47 (not shown) located in the beta-sheets of different CP molecules (Fig. 1f). The observed numbers of SLs (59) and PSs (between 23 and 44) match earlier estimates of 53 SLs and 35 PSs, which were based on the predicted RNA secondary structure, obtained by phylogenetic analysis and experimental probing of RNA in solution, and on analysis of potential binding SLs 14 . Our results show that the majority of the many SL RNA structures that are present in the genome of MS2 bind the CP 2 in the capsid. 
Since these SLs all have different predicted sequences, this suggests that a wide variety of different RNA sequences actually bind to CP 2 in situ in an overall similar manner to the TR, as was observed for several aptamer-CP 2 interactions 18-20 . At several sites, RNA hairpin EM densities were also observed near the capsid but in deviating conformations or positions, often rotated (7 cases) or shifted (7 cases) compared with the TR structure (data not shown). In addition, several stem structures were associated sideways with CP 2 sites. Therefore, alternative, non-sequence-specific binding modes of the RNA to the CP 2 likely exist, including binding to other amino acids in the capsid.
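The stem-loop tallies reported in this section can be cross-checked with a few lines of arithmetic. The sketch below only restates the counts given in the text; the variable names and the `pct` helper are illustrative and are not taken from the original analysis.

```python
# Counts quoted in the text; names are illustrative only.
total_sl = 59            # stem-loops (SLs) distinguished in the map
near_capsid = 53         # SLs ending near the capsid
central = 6              # SLs ending centrally
at_cp2_site = 44         # SLs ending at a CP2 RNA binding site
at_ap = 2                # SLs interacting with the A-protein
not_at_cp2 = 7           # capsid-proximal SLs not near a CP2 interface
matching_tr_model = 23   # sites where the 1ZDH RNA fits the EM density


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Internal consistency of the tallies.
assert near_capsid + central == total_sl
assert at_cp2_site + at_ap + not_at_cp2 == near_capsid

summary = {
    "near capsid / all SLs": pct(near_capsid, total_sl),
    "at CP2 site / near capsid": pct(at_cp2_site, near_capsid),
    "at CP2 site / all SLs": pct(at_cp2_site, total_sl),
    "TR-like fit / CP2 sites": pct(matching_tr_model, at_cp2_site),
}
for label, value in summary.items():
    print(f"{label}: {value}%")
```

The first three ratios reproduce the 90%, 83% and 75% quoted above; the last one (23 of 44) is not stated in the text and is included only for orientation.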
The obtained resolution of the EM map is sufficient to visualize the double-stranded (ds) RNA SLs, including their helical nature; however, single RNA strands are not clearly resolved, and it was therefore not possible to trace predicted secondary structures into the 3D map or to assign predicted SLs in the MS2 genome to the density 14,26 . Relating all individual SL-CP 2 interactions in the genome to the observed SLs in the EM map would require a higher-resolution structure. Nevertheless, the two regions known to connect to the AP 12 could putatively be located in the EM map. The 3′ RNA end, including nucleotides 3,510-3,527, forms a very specific repeat of short SLs that bind adjacently along the AP, while near the 5′ end of the RNA the SL formed by nucleotides 388-414 binds the AP (Fig. 2a, Supplementary Movie 3).
RNA-protein interactions. While the AP ensures specific packaging of its coding RNA and TR-CP 2 interactions promote capsid formation, the question is how the SL-CP 2 interactions that we observed here influence virion assembly. To explore a potential role of the RNA genome structure in this process, we investigated the distribution of PSs, that is, interactions of SLs with CP 2 , over the capsid. Of the 89 CP 2 , 44 (49%) have a connecting SL, 33 (37%) have crossing (ds) RNA density, while 9 (10%) do not have any adjacent density. Notably, these 44 PSs are distributed unevenly over the capsid, being localized predominantly on one side of the capsid (Supplementary Fig. 3). This uneven distribution was even more pronounced for the 23 RNA SLs whose density matched the TR X-ray model (Supplementary Fig. 4). Of these, 19 (83%) were bound to dimers located on one half of the capsid in only three CP 2 pentamers (Fig. 2b, Supplementary Movie 4). These pentamers were adjacent to the AP and its two binding SLs.
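The distribution of RNA contacts over the CP 2 dimers can be tabulated in the same spirit. The following is a minimal sketch that uses only the counts quoted above; the dictionary keys and variable names are illustrative. Because the three quoted categories sum to 86 of the 89 dimers, the sketch reports the remainder explicitly rather than assuming where it belongs.

```python
cp2_total = 89  # CP2 dimers in the capsid (one of 90 sites holds the AP)

# Counts per category as quoted in the text; labels are illustrative.
categories = {
    "connecting SL": 44,
    "crossing (ds)RNA density": 33,
    "no adjacent density": 9,
}

# Share of all CP2 dimers in each category, as integer percentages.
shares = {name: round(100 * n / cp2_total) for name, n in categories.items()}

# Dimers not covered by the three quoted categories.
unclassified = cp2_total - sum(categories.values())

# SLs whose density matched the TR X-ray model, and those on one capsid half.
tr_like_sites = 23
tr_like_on_one_half = 19
one_half_share = round(100 * tr_like_on_one_half / tr_like_sites, 1)

print(shares)
print("unclassified dimers:", unclassified)
print("TR-like sites on one half (%):", one_half_share)
```

Keeping the remainder as a separate value makes it easy to see whether a set of reported categories is exhaustive.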
This uneven distribution of SL-CP 2 interactions supports a two-step encapsidation model 13,27 in which RNA condensation precedes full capsid formation. The role of the asymmetric distribution of RNA-protein interactions could be twofold. Multiple SLs could increase the efficiency of capsid formation by recruiting CP 2 from the surroundings to form the first CP 2 pentamers, adding both efficiency and localization to the CP 2 -CP 2 interactions that drive coat formation. Alternatively, CP 2 could induce SL formation and condensation of the MS2 RNA, thereby inducing genome compaction during encapsidation. These two

Discussion
We reconstructed the MS2 virion, revealing the AP and the single conformation of the RNA genome in situ. The genome is intimately and asymmetrically linked to the capsid. The presence of a single AP that breaks the symmetry of the capsid is exceptional among viruses and might play an important role in the uniquely structured genome, which might not appear in other (small) viruses. Even so, structures of several other viruses have shown hints of (partly) ordered (ds)RNA 28,29 . It remains to be seen whether asymmetric cryo-EM single particle reconstructions of other viruses would reveal similar genome ordering inside virions. Asymmetric cryo-EM structure determination of complete virions, including their genomes, could provide unprecedented detail on viral RNA structures and RNA-protein interactions, as well as insight into viral assembly, which is valuable knowledge for drug design aimed at disrupting viral genome folding and packaging.

Methods
Purification of phages. An overnight culture of E. coli strain XL1-Blue was grown at 37 °C in LB medium to an OD 600 of 0.5. Calcium chloride was added to a final concentration of 2 mM and the cells were infected with phage MS2 at a multiplicity of 10 and incubated for another three hours. Lysates with phage titres of approximately 1 × 10^12 were used for purification. Cellular debris was removed by centrifugation and the phage was concentrated by ultrafiltration using 15 ml 100 kDa cutoff Amicon concentrators. The phage was further purified by gel filtration on a Sephacryl S500 column. Fractions were inspected for purity by Coomassie-stained SDS-PAGE and negative staining EM.
Specimen preparation. Aliquots of purified MS2 were applied to holey carbon film supported by copper grids (Quantifoil R2/2) that had been glow discharged with negative polarity for 1 min at 30 mA using a K950X carbon coater (Emitech). Samples were blotted for 1-2 s using filter paper and plunged from room temperature using a Leica EM GP. Grids were vitrified by plunging into a liquid propane/ethane mixture (2:1 v/v), which was cooled by liquid nitrogen. After vitrification, the grids were stored in liquid nitrogen until use.
Data collection. Data acquisition was performed on a Titan Krios transmission electron microscope (FEI) operated with Cs correction at 300 keV using EPU automated single particle acquisition software (FEI). Seven frames per image were recorded on a back-thinned Falcon II detector at a nominal magnification of 59,000× with a sampling size of 1.14 Å per pixel.
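To put the sampling size in context, the Nyquist limit implied by the pixel size can be compared with the reported map resolution. This is a minimal sketch that assumes the usual relation (Nyquist limit equals twice the pixel size); the variable and function names are illustrative.

```python
pixel_size_angstrom = 1.14           # sampling size from the text (Å per pixel)
reported_resolution_angstrom = 8.7   # resolution of the asymmetric map (Å)


def nyquist_limit(pixel_size: float) -> float:
    """Finest resolvable spacing for a given pixel size (2 pixels per period)."""
    return 2.0 * pixel_size


nyquist = nyquist_limit(pixel_size_angstrom)

# Number of pixels spanning one resolution element of the final map.
pixels_per_resolution_element = reported_resolution_angstrom / pixel_size_angstrom

print(f"Nyquist limit: {nyquist:.2f} Å")
print(f"Pixels per resolution element: {pixels_per_resolution_element:.1f}")
```

Under this assumption the sampling (Nyquist limit of 2.28 Å) is considerably finer than the 8.7 Å resolution of the reconstruction, so the pixel size itself is not the limiting factor.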
Image processing. Image processing was performed using the Scipion platform (http://scipion.cnb.csic.es), an integrative image processing framework that currently mainly uses Xmipp (http://xmipp.cnb.csic.es/) 1 5 . All movies were aligned using the Optical Flow approach 6 , while contrast transfer functions (CTFs) were estimated using CTFFIND3 (ref. 7) and were used to select the best-quality micrographs. A total of 22,441 particles were picked automatically using Xmipp 8 , which were further screened 9 , extracted and normalized. First, an icosahedrally symmetrized reconstruction was calculated. Particles were classified using the reference-free 2D classification approach of Relion, and a subset of 18,977 particles was selected from the best classes. These particles were then used for Relion 3D refinement with standard parameters for viral particles, such as icosahedral symmetry and the gold-standard approach. A second round of refinement was performed, now without applying any symmetry, using Xmipp projection matching. The final icosahedrally symmetrized map from the preceding Relion refinement, filtered to 25 Å, was used as the initial model. The first four iterations were performed as global refinements with a decreasing angular sampling interval. In addition, the whole particle set was randomly split into two halves, and each subset was refined under the same conditions as the whole set.
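The random split of the particle set into two independently refined halves can be illustrated with a short sketch. It is not the implementation used by Xmipp or Relion; the function name, the fixed seed and the use of integer particle indices are assumptions made for illustration.

```python
import random


def split_into_halves(n_particles: int, seed: int = 0):
    """Randomly assign particle indices to two disjoint half-sets.

    The fixed `seed` (an assumption for reproducibility) makes the
    split repeatable; each half would then be refined independently.
    """
    rng = random.Random(seed)
    indices = list(range(n_particles))
    rng.shuffle(indices)
    mid = n_particles // 2
    return sorted(indices[:mid]), sorted(indices[mid:])


picked = 22441     # automatically picked particles (from the text)
selected = 18977   # particles retained after 2D classification (from the text)

half_a, half_b = split_into_halves(selected)
retained_pct = round(100 * selected / picked, 1)

print(len(half_a), len(half_b))
print(f"Retained after 2D classification: {retained_pct}%")
```

Because the two halves share no particles, agreement between their reconstructions is not inflated by refining against the same noise, which is the rationale for comparing independently refined half-sets.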
Data analysis and visualization. Fitting and visualization were performed using UCSF Chimera 5 . Structures of the MS2 capsid, both full and missing one dimer, were created from the X-ray structures of the MS2 capsid protein (PDB: 2MS2 (ref. 10)) and of the MS2 capsid protein including the operator hairpin loop (PDB: 1ZDH 11 ). Capsids were manually aligned with the EM density and fine aligned using the fitmap command. Normalized density maps at the map resolution were created using the molmap command and were then subtracted from the EM density using the vop command to create a map of the RNA and A-protein only (EMD-3404).
Data availability. Density maps of the icosahedral reconstruction, the asymmetric reconstruction and the RNA + AP map (a difference map between the asymmetric reconstruction and the dimer-depleted capsid structure) are available from the Electron Microscopy Data Bank with accession codes EMD-3402, EMD-3403 and EMD-3404, respectively. The authors declare that all data supporting the findings of this study are available from the corresponding author upon request.