AI-empowered integrative structural characterization of m6A methyltransferase complex

Dear Editor, N⁶-methyladenosine (m⁶A) is the most abundant and prevalent internal modification in mRNA.¹ In mammals, m⁶A exerts pivotal roles in posttranscriptional regulation and its dysregulation is implicated in various diseases including cancer.² m⁶A is installed by a multicomponent methyltransferase complex (MTC, also known as the m⁶A writer complex).³,⁴ The mammalian MTC is composed of the core m⁶A methyltransferase METTL3–METTL14 complex (MTC core) and several regulatory proteins including WTAP, the adaptor responsible for METTL3–METTL14 localization and proper substrate recruitment,⁵ and VIRMA (KIAA1429), the specificity mediator that mediates preferential m⁶A modification at the 3′ untranslated regions (UTRs, Fig. 1a).⁶ Dysregulation of MTC components results in the disruption of m⁶A.² Compared to the other currently identified regulators (HAKAI, ZC3H13 and RBM15), WTAP and VIRMA are reported to have greater impacts on total mRNA m⁶A levels upon knockdown.⁵,⁶ Despite the advances in understanding the roles of individual MTC components and the structural determination of MTC core,⁷⁻⁹ the overall molecular architecture of the m⁶A writer holocomplex is missing. Here, we report the cryogenic electron microscopy (cryo-EM) structure of human WTAP–VIRMA (3.1 Å) in the METTL3–METTL14–WTAP–VIRMA (M–M–W–V) complex and model a structure of the quaternary M–M–W–V complex based on AlphaFold2 predictions and structural restraints from intermolecular chemical crosslinking mass spectrometry (CXMS

method. All site-directed mutagenesis of WTAP and VIRMA was carried out using the fusion PCR method.
For the cryo-EM sample preparation and the methyl transfer assay, the DNAs of human METTL3 and METTL14, WTAP (WTAP₁₋₂₇₃ for cryo-EM sample preparation; full-length WTAP for the methyl transfer assay), and VIRMA₃₈₁₋₁₄₈₆ were subcloned into pFastBac Dual. Both METTL14 and WTAP were expressed with an N-terminal His-tag. For the co-expression coupled purification assay, the WTAP and VIRMA genes were subcloned into a modified pFastBac1 vector, fused with a His-tag at the N-terminus. METTL3 and METTL14 were subcloned into pFastBac Dual with an N-terminal Strep-tag and His-tag, respectively. The baculoviruses were generated in Sf9 cells with the Bac-to-Bac system (Invitrogen). The proteins were co-expressed in Sf9 cells at 27 °C for 60 h before harvesting.

For the crystallization trials, WTAP₁₃₀₋₂₄₁ was subcloned into pET21b (Novagen) and fused with a His-tag at the C-terminus. The plasmid was transformed into BL21 (DE3). One liter of lysogeny broth medium supplemented with 100 mg/mL ampicillin was inoculated with a transformed bacterial pre-culture and shaken at 37 °C until the optical density at 600 nm reached 1.0. The culture was then induced with 0.2 mM isopropyl-β-D-thiogalactoside (IPTG) and grown at 16 °C for 14 h before harvesting.

For the structure-guided mutagenesis analysis, Flag-tagged WTAP₁₋₂₇₃ (plasmid 1) and His-tagged VIRMA₃₈₁₋₁₄₈₆ (plasmid 2) were each subcloned into the pMlink vector with a C-terminal tag. The Expi293F™ (Invitrogen) cells were cultured in SMM 293TI medium (Sino Biological Inc.) at 37 °C under 5% CO₂ in a ZCZY-CS8 shaker (Shanghai Zhichu Instrument Co., Ltd.) and diluted to 2.0 × 10⁶ cells/mL for transfection. For 30 mL of cell culture, 30 µg of plasmid 1 and 30 µg of plasmid 2 were pre-incubated with 180 µg of linear polyethylenimines (PEIs) (Polysciences) in 2 mL of fresh medium for 20 min. The transfection was initiated by adding the mixture to the diluted cell culture. Transfected cells were cultured for 48 h before harvesting.
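The transfection recipe above is stated for a 30 mL culture. The following is a minimal sketch of how those amounts could be scaled to another volume, assuming simple linear scaling (an assumption, not something the protocol states). The function and key names are hypothetical and chosen only for illustration.

```python
# Reference recipe from the text: per 30 mL of culture, 30 µg of each plasmid
# and 180 µg of linear PEI, pre-incubated in 2 mL of fresh medium.
REF_CULTURE_ML = 30.0
REF_RECIPE_UG = {"plasmid_1": 30.0, "plasmid_2": 30.0, "pei": 180.0}
REF_PREMIX_ML = 2.0


def scale_transfection(culture_ml: float) -> dict:
    """Scale the 30 mL recipe linearly to another culture volume.

    Linear scaling is an assumption made for this sketch.
    """
    if culture_ml <= 0:
        raise ValueError("culture volume must be positive")
    factor = culture_ml / REF_CULTURE_ML
    scaled = {name: ug * factor for name, ug in REF_RECIPE_UG.items()}
    scaled["premix_medium_ml"] = REF_PREMIX_ML * factor
    # Mass ratio of PEI to total plasmid DNA implied by the recipe.
    total_dna_ug = scaled["plasmid_1"] + scaled["plasmid_2"]
    scaled["pei_to_dna_ratio"] = scaled["pei"] / total_dna_ug
    return scaled


if __name__ == "__main__":
    for name, value in scale_transfection(60.0).items():
        print(f"{name}: {value:g}")
```

The stated amounts correspond to a PEI:DNA mass ratio of 3:1 (180 µg PEI to 60 µg total plasmid), which stays constant under this scaling.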

To acquire the METTL3-METTL14-WTAP-VIRMA complex, METTL3-METTL14 protein and WTAP-VIRMA protein were mixed at a molar ratio of 1:0.9 at 4 °C for 30 min. The mixture was then applied to a Superose 6 Increase 10/300 column equilibrated with SEC buffer. The METTL3-METTL14-WTAP-VIRMA₃₈₁₋₁₄₈₆ and the METTL3-METTL14-WTAP-VIRMA complexes were used for the methyl transfer assay. To prepare the sample for cryo-EM, the peak fractions of METTL3-METTL14-WTAP₁₋₂₇₃-VIRMA₃₈₁₋₁₄₈₆ were further treated by gradient fixation (GraFix¹). In detail, low buffer (SEC buffer containing 10% v/v glycerol) was layered on top of an equal volume of freshly prepared high buffer (SEC buffer containing 25% v/v glycerol and 0.05% glutaraldehyde) in a 12.5 mL tube before cooling on ice. Centrifugation was performed at 33,000 rpm in a Beckman SW40Ti swinging bucket rotor for

Elution buffer. The eluted protein was applied to a Source Q10/100 column. Target proteins were then subjected to a Superose 6 10/300 GL column (GE Healthcare) equilibrated with a buffer containing 25 mM Tris-HCl, pH 8.0, 150 mM NaCl, and 5 mM DTT.

For the co-expression coupled purification assay and structure-guided mutagenesis analysis, the cells were resuspended in 1 mL lysis buffer containing 1 mM PMSF and lysed by repeated freeze-thaw cycles using liquid nitrogen. After ultracentrifugation, the supernatant was loaded onto a Strep-affinity or Flag-affinity column, washed with lysis buffer, and eluted with lysis buffer containing 3 mM desthiobiotin or 0.25 mg/mL Flag peptide (GenScript), respectively. The expression of WTAP and VIRMA was verified by western blot.

The dataset was collected at the SSRF beamline BL17U1 and processed with HKL3000 (ref. 3). Further processing was performed with the CCP4 suite⁴. Data collection and structural refinement statistics are summarized in Table S1. The structure of WTAP₁₃₀₋₂₄₁ was solved by molecular replacement (MR) with the program PHASER⁶, using the structure predicted by AlphaFold2 (ref. 5) as the search model. The structure was manually and iteratively refined with PHENIX⁷ and COOT⁸. All figures representing structures were prepared with PyMOL.

Sample preparation and cryo-EM data collection
For cryo-EM data acquisition of the METTL3-METTL14-WTAP₁₋₂₇₃-VIRMA₃₈₁₋₁₄₈₆ complex, β-octyl glucoside was added to a final concentration of 0.05% right before grid preparation. A 3.5 µL sample was deposited onto a freshly glow-discharged (Thermo Fisher, 20 mA, 120 s) holey carbon grid (Quantifoil R1.2/1.3, Au 300 mesh), blotted for 3.5 s with blot force 0 using Whatman 597 filter paper at 4 °C and 100% humidity, and plunged into liquid ethane using an FEI Vitrobot Mark IV. Each

The schematic of the data processing pipeline is shown in Fig. S4c. In total, 915,354 particles from 2,040 micrographs were automatically picked using the cryoSPARC blob picker⁹. After two-dimensional classification, a total of 890,453 good particles were selected and subjected to several cycles of three-dimensional classification in cryoSPARC⁹. The 197,685 particles belonging to the best class were selected, followed by non-uniform refinement and local refinement. The METTL3-METTL14-WTAP-VIRMA complex yielded a cryo-EM density with an estimated resolution of 3.10 Å based on the gold-standard Fourier shell correlation¹⁰.

(Table S2). The cross-linking data were analyzed by pLink2 (ref. 17). The following search parameters were used: MS1 accuracy = ±20 ppm; MS2 accuracy = ±20 ppm; enzyme = trypsin (with full tryptic specificity but allowing up to three missed cleavages); crosslinker = BS3 (with an assumed reaction specificity for lysine and protein N-termini); fixed modifications = carbamidomethylation on cysteine; variable modifications = oxidation on methionine, hydrolyzed/aminolyzed BS3 from reaction with ammonia or water on a free cross-linker end. The identified candidates were filtered using the following criteria: false discovery rate (FDR) < 5%, support vector machine (SVM) score < 10⁻², and abundance, measured as peptide-spectrum matches (PSMs), ≥ 3. The experimental cross-links were illustrated with Crosslink-viewer¹⁸. Only cross-links that were observed in at least two biological repeat experiments were used for structural modeling of the quaternary complex.
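The filtering criteria above (FDR < 5%, SVM score < 10⁻², PSMs ≥ 3, observed in at least two biological repeats) can be illustrated with a small sketch. This is not the pLink2 workflow itself: the `CrossLink` record, its field names, and the residue labels in the example are hypothetical, and the sketch assumes each threshold is applied per identification within a replicate.

```python
from collections import defaultdict
from typing import NamedTuple


class CrossLink(NamedTuple):
    """One cross-link identification in one replicate (hypothetical record)."""
    site_a: str       # e.g. "WTAP:K10" (illustrative label, not real data)
    site_b: str
    fdr: float
    svm_score: float
    psms: int
    replicate: int


def passes_filters(xl, max_fdr=0.05, max_svm=1e-2, min_psms=3):
    """Apply the three thresholds quoted in the text to a single record."""
    return xl.fdr < max_fdr and xl.svm_score < max_svm and xl.psms >= min_psms


def reproducible_links(links, min_replicates=2):
    """Return residue pairs that pass the filters in >= min_replicates repeats."""
    seen = defaultdict(set)
    for xl in links:
        if passes_filters(xl):
            # Sort the two sites so A-B and B-A count as the same cross-link.
            key = tuple(sorted((xl.site_a, xl.site_b)))
            seen[key].add(xl.replicate)
    return sorted(k for k, reps in seen.items() if len(reps) >= min_replicates)


if __name__ == "__main__":
    example = [
        CrossLink("WTAP:K10", "VIRMA:K500", 0.01, 1e-3, 4, 1),
        CrossLink("VIRMA:K500", "WTAP:K10", 0.02, 5e-3, 3, 2),
        CrossLink("METTL3:K20", "WTAP:K30", 0.01, 1e-3, 5, 1),
        CrossLink("METTL3:K20", "WTAP:K30", 0.01, 0.5, 5, 2),  # fails SVM cutoff
    ]
    print(reproducible_links(example))
```

In this made-up example only the first pair survives, because the second pair passes the score thresholds in a single replicate.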

AI-based structure prediction
Each experimental cross-link was converted to a distance restraint. The restraints were applied to the Cα atoms of the cross-linked residues with an upper distance bound of 26 Å (a 2-Å padding was added to account for local flexibility¹⁹). However, due to the limited reactivity of BS3, multiple binding modes between proteins have to be invoked to account for the intermolecular cross-linking data²⁰,²¹. On the other hand, due to the sparsity of the restraints, each binding mode cannot be effectively validated.
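The Cα–Cα upper bound described above can be expressed as a simple check on a candidate model. The sketch below only illustrates the restraint as a hard upper bound. It is not the modeling protocol used in the study, and the function names and the coordinate dictionary layout are assumptions made for the example.

```python
import math

# Upper distance bound on Cα–Cα separation quoted in the text (Å).
UPPER_BOUND_A = 26.0


def restraint_violation(ca_a, ca_b, upper=UPPER_BOUND_A):
    """Return by how many Å a Cα pair exceeds the bound (0.0 if satisfied)."""
    return max(0.0, math.dist(ca_a, ca_b) - upper)


def satisfied_fraction(ca_coords, crosslinks, upper=UPPER_BOUND_A):
    """Fraction of cross-links whose Cα–Cα distance is within the bound.

    ca_coords maps a residue label to its (x, y, z) Cα position in Å;
    crosslinks is a list of (label_a, label_b) pairs. Both are hypothetical
    input formats chosen for this sketch.
    """
    if not crosslinks:
        raise ValueError("no cross-links supplied")
    satisfied = sum(
        1
        for a, b in crosslinks
        if restraint_violation(ca_coords[a], ca_coords[b], upper) == 0.0
    )
    return satisfied / len(crosslinks)


if __name__ == "__main__":
    coords = {"A": (0.0, 0.0, 0.0), "B": (0.0, 0.0, 20.0), "C": (30.0, 0.0, 0.0)}
    print(satisfied_fraction(coords, [("A", "B"), ("A", "C")]))
```

A single model that satisfies only part of the cross-link set is consistent with the observation above that several binding modes may be needed to account for all intermolecular cross-links.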