The coiled-coil (CC) domain is an important structural unit of proteins that plays critical roles in various biological functions. The major oligomeric state of CCs is a dimer, which can be either parallel or antiparallel. The orientation of each α-helix in a CC domain is critical for the molecular function of CC-containing proteins, but cannot be determined easily by sequence-based prediction. We developed a biochemical method for distinguishing parallel from antiparallel CC homodimers and named it ACCORD (Assessment tool for homodimeric Coiled-Coil ORientation Decision). To validate this technique, we applied it to 15 different CC proteins with known structures; ACCORD assigned their orientations well, especially for long CCs. Furthermore, ACCORD accurately determined the orientation of a CC domain of unknown directionality, which was subsequently confirmed by X-ray crystallography and small angle X-ray scattering. Thus, ACCORD can be used as a tool to determine CC directionality to supplement the results of in silico prediction.
A coiled-coil domain (CCD) consists of two or more α-helices that are twisted around each other to form a superhelical structure1. Coiled-coils (CCs) have a heptad repeat pattern (a-b-c-d-e-f-g)n, in which the a and d positions are usually occupied by hydrophobic residues. Based on this sequence feature, CCs are easily detected by prediction programs or servers such as COILS2, Marcoil3, Paircoil2 (ref. 4), and SOCKET5. However, CCs are versatile structural domains that can adopt many different structures1. They can form different oligomeric complexes, and the helix orientation can be either parallel or antiparallel. Because of these characteristics, CCs participate in diverse protein-protein interactions that are involved in many biological functions and components such as fibrous filament formation6, gene regulation7,8, development of cartilage and bone9, membrane channels10, protein degradation11,12, protein trafficking13,14,15, and molecular chaperones16. To understand the detailed molecular functions of CC-containing proteins, it is necessary to determine the relative orientation of each coil, because the functional domains flanking a parallel CC dimer are held in close proximity, whereas those flanking an antiparallel CC dimer are located farther apart. Many programs and servers have been developed to determine the oligomeric state of CCDs; these include SCORER 2.0 (ref. 17), PrOCoil18, Multicoil2 (ref. 19), and LOGICOIL20. However, none of these programs adequately discriminates between the parallel and antiparallel orientations.
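The heptad repeat described above can be illustrated with a minimal sketch that scores how well each of the seven possible registers places hydrophobic residues at the a and d positions. This is not the algorithm used by COILS, Marcoil, Paircoil2, or SOCKET; the function names, the hydrophobic residue set, and the idealized test sequence are assumptions introduced only for illustration.

```python
# Illustrative sketch only: a naive heptad-register score, not a published predictor.
HYDROPHOBIC = set("AILMFVWY")  # assumed residue set, chosen for illustration
HEPTAD = "abcdefg"


def core_hydrophobic_fraction(seq, offset):
    """Fraction of a/d positions that are hydrophobic when the first
    residue of seq is assigned heptad position HEPTAD[offset]."""
    core = [res for i, res in enumerate(seq.upper())
            if HEPTAD[(i + offset) % 7] in "ad"]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


def best_register(seq):
    """Return (offset, fraction) for the register whose a/d core is most
    hydrophobic; ties are broken in favour of the smallest offset."""
    scores = [(core_hydrophobic_fraction(seq, k), -k) for k in range(7)]
    frac, neg_offset = max(scores)
    return -neg_offset, frac


if __name__ == "__main__":
    # Synthetic, idealized sequence (hypothetical): Leu at a and d in every heptad.
    ideal = "LKELEDK" * 4
    print(best_register(ideal))
```

A score like this only reflects the a/d hydrophobicity rule stated in the text. As the paragraph notes, such sequence features detect CCs but do not by themselves discriminate parallel from antiparallel orientation.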
As an alternative to in silico prediction of CC orientation, the introduction of a cysteine residue at an appropriate site in a helix can be used to determine the CC orientation by analysing the resulting intermolecular disulphide bridge21,22. However, although strategies to rapidly evaluate sequence-stability relationships in the parallel coiled-coil motif have been reported23,24, this approach has the following limitations: first, a cysteine mutation in a CCD often cannot form a proper disulphide bond because of incorrect positioning, overall structural perturbation, and/or oxidation of the sulphur atom25,26,27; and second, the CC molecule itself is sometimes not properly assembled in a heterologous expression system28 and shows low solubility (Supplementary Fig. 1).
To overcome the above limitations, we developed a new biochemical technique using a fusion tag, which is an appropriately spaced and oriented molecule that enhances the solubility of CC proteins, to assess the orientation of the CC dimer.
Stringent starvation protein B (SspB) is an appropriate fusion tag
As a fusion tag, we used SspB, an adaptor protein that delivers ssrA-tagged proteins to the ClpXP degradation machine in Escherichia coli29,30. SspB is a dimeric protein with two long and flexible C-terminal tails, both of which extend in the same direction (Fig. 1a)31,32,33. For purification, a hexa-histidine tag was added to the N-terminus, and the target CCD was attached to the C-terminus of full-length SspB (residues 1–165, including a 55-residue C-terminal tail) (Fig. 1b). The C-terminal tail has enough degrees of freedom to accommodate each coil in the CCD. The idea is simple: if the CC dimer is parallel, the SspB-CC fusion protein behaves as a dimer in solution (Fig. 1c); if the CC dimer is antiparallel, the two coils attached to one SspB dimer presumably cannot pair with each other because both tails point in the same direction, so the SspB-CC fusion protein behaves as a soluble tetramer (Fig. 1d) or forms higher-order oligomers and/or aggregates (Fig. 1e).
Size-exclusion chromatography with multi-angle light scattering (SEC-MALS) analyses of parallel or antiparallel CC proteins
To validate this approach, we selected several parallel and antiparallel CC proteins whose structures have already been solved. The length of the CCDs varied from relatively short (49.7 Å) to long (225.7 Å; Figs 2a–h and 3a–f). In total, 14 parallel and antiparallel CC proteins were cloned into expression vectors, and the expressed fusion proteins were analysed by SEC-MALS. Molecular weight (MW) determination by MALS is critical, since dimeric CC proteins usually elute earlier than expected in SEC because of their elongated shapes. The results differed markedly between parallel and antiparallel CCDs. Parallel CC proteins (GCN4, APC, Atg16, SCOC, LRRFIP1, Ndel1, TPM1, and ROCK1; see the legend in Table 1 for the full names of these proteins) were clearly dimeric in solution (Fig. 4a–h). When fused with SspB, the CCD of ROCK1 kinase, which is very long (225.7 Å)34, behaved as a 95-kDa dimer in solution (Fig. 4h,i). Notably, the distance between the two N-termini of parallel CCDs was in the range of 5.8–23.6 Å, which can likely be spanned by the flexible C-terminal tails of SspB. In rare cases, a different oligomeric state was also detected, but the major peak always corresponded to a dimer for parallel CCDs (Fig. 4f,g). This result compares favourably with those of existing prediction methods. LOGICOIL is very powerful for analysing the oligomeric state and orientation of CCDs20; however, its prediction of the orientation of dimeric CCs was not sufficiently accurate (Table 1). The first predictions were incorrect for two out of eight parallel CCDs. A bigger problem was that the prediction for antiparallel CCDs was poor: only one out of six predictions was accurate, and the others were incorrect oligomeric structure predictions (Table 1).
The antiparallel CC proteins (MDV1, Mfn1, LMNA, BECN1, TRIM25, and TRIM5; see the legend in Table 1 for the full names of these proteins) usually formed aggregates or tetramers in solution (Fig. 5a–f). The soluble tetramer is likely to be arranged as shown in Fig. 5g, but it is difficult to produce a single prediction model for higher-order oligomers (Fig. 1e). For some CCDs fused with SspB, it was difficult to obtain SEC-MALS data because the CCDs were prone to aggregation, and this phenomenon was most apparent with antiparallel CCDs. The short antiparallel CCD, MDV1, showed a substantial dimeric fraction, although the main fraction corresponded to a tetramer (Fig. 5b). Thus, the orientation of a CCD can be assessed by analysing the oligomeric state of SspB-fused CCDs. We have termed this method ACCORD.
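The decision logic described above (a dimeric SspB-CC fusion suggests a parallel CC; a tetramer or larger species suggests an antiparallel CC) can be sketched as a simple rule applied to the MALS-derived molecular weight. The function names, the numeric thresholds, and the example masses below are hypothetical and are not taken from the study; in practice the assignment rests on inspection of the SEC-MALS profile, including the major peak and any aggregation.

```python
# Illustrative sketch only: thresholds and example masses are assumptions.

def oligomer_number(measured_mw_kda, monomer_mw_kda):
    """Apparent number of fusion-protein chains per particle."""
    if measured_mw_kda <= 0 or monomer_mw_kda <= 0:
        raise ValueError("molecular weights must be positive")
    return measured_mw_kda / monomer_mw_kda


def accord_call(measured_mw_kda, monomer_mw_kda,
                dimer_window=0.5, tetramer_min=3.5):
    """Map the major-peak MW to an orientation call.

    dimer_window and tetramer_min are arbitrary illustrative cut-offs:
    a value within dimer_window of 2 is treated as a dimer, and a value
    of tetramer_min or more as a tetramer or higher-order oligomer.
    """
    n = oligomer_number(measured_mw_kda, monomer_mw_kda)
    if abs(n - 2.0) <= dimer_window:
        return "parallel"
    if n >= tetramer_min:
        return "antiparallel"
    return "ambiguous"


if __name__ == "__main__":
    monomer = 30.0  # hypothetical theoretical mass of one fusion chain, kDa
    for measured in (61.0, 118.0, 90.0):  # hypothetical MALS masses, kDa
        print(measured, accord_call(measured, monomer))
```

A "parallel" call from such a rule should be read with caution for short CCs, since a short antiparallel CC may still appear dimeric when fused with SspB.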
Application of ACCORD to unknown CC orientations
Next, we applied the ACCORD method to CCDs with unknown orientations. NDP52, a dimeric autophagy receptor, consists of multiple domains, including a central CCD35,36. The CCD of NDP52 fused with SspB was subjected to SEC-MALS and was observed as a clear dimer in solution (Fig. 6a). Therefore, NDP52 was judged to be a parallel dimer based on our ACCORD method. To confirm the directionality of the CCs of NDP52, we attempted to determine the structure of the protein, but it did not crystallize successfully. Therefore, we used small angle X-ray scattering (SAXS) instead. If the fusion partner is a bigger monomeric protein, the molecular envelope is either Y-shaped (in the case of a parallel CC; Fig. 6b) or dumbbell-shaped (in the case of an antiparallel CC; Fig. 6c). The maltose binding protein (MBP) fused with the CCD of NDP52 shows a clear Y-shaped molecular envelope by SAXS (Fig. 6b), verifying that the CCD of NDP52 is parallel. In order to test an antiparallel CCD, we performed X-ray crystallography (Supplementary Fig. 2 and Supplementary Table 1) and SAXS of the MBP-MDV1 fusion protein (Supplementary Figs 3 and 4 and Supplementary Table 2). The results clearly showed that it has a dumbbell-shaped, rather than a Y-shaped, molecular envelope (Fig. 6c). These results were all consistent with those from our ACCORD method.
Given the central role of CCs in protein structure, the ability to easily determine the orientation of the helices in a CCD is important. A method to determine the orientation of CCD helices inside a cell would be particularly useful. Several approaches exist to determine the global topology of E. coli membrane proteins using fusions to the reporter proteins alkaline phosphatase and green fluorescent protein37, and to establish the networks of basic region leucine zipper protein-protein interactions using a fluorescence resonance energy transfer-based assay38. These techniques resemble ACCORD in that they use fusion proteins; however, their purposes are completely different from that of the ACCORD technique. Furthermore, proteome-wide approaches have limitations such as false positives; thus, in vitro experimentation is necessary to determine the orientation of CCs in a particular protein.
CCs are very structurally diverse and can exist in many different oligomeric states, including dimers, trimers, tetramers, and others. The majority of CCs form dimers, and the ACCORD technique was applied to homodimeric CCs because the fusion tag, SspB, is a homodimer. The use of SspB with a long C-terminal tail has pros and cons. The long and flexible tail acts as a linker that provides structural plasticity to accommodate structurally diverse CCs. However, our preliminary analysis suggests that it might be difficult to use the ACCORD technique to assess antiparallel CCs that are relatively short. For example, the ACCORD result for human MYO10 CC (51 amino acid residues and 47.1 Å; expected as a tetramer) showed that it was a dimer in solution (Supplementary Fig. 5a,b), and we speculate that it forms a dimer with the orientation shown in Supplementary Fig. 5c. Although the current version of the SspB fusion tag has some limitations, we are in the process of improving it for greater assessment accuracy.
Therefore, the ACCORD method can be used to determine homodimeric CC directionality and to supplement in silico predictions, which are inaccurate because the energy difference between the correct and incorrect directionalities of CCs is too small to discriminate between them. For a definitive answer, three-dimensional structure determination by X-ray crystallography or nuclear magnetic resonance spectroscopy is best, but these methods are labour intensive and are sometimes technically infeasible because of the lack of diffracting crystals or the high molecular weight of a CC sample. The ACCORD approach is straightforward and is a good complement to in silico prediction methods for determining the CC orientation in a protein. By combining the ACCORD method with currently available prediction methods, the molecular function of target CC proteins can be interpreted with greater confidence.
Full-length sspB was amplified from E. coli genomic DNA by PCR and cloned into the NcoI and BamHI restriction sites of a modified pET vector (containing tobacco etch virus and thrombin cleavage sites) to construct an N-terminal hexa-histidine-tagged protein (hereafter, His6-SspB). The gene fragments GCN4 (249–281) and MDV1 (231–300) from the genomic DNA of Saccharomyces cerevisiae and APC (2–55), SCOC (78–159), LRRFIP1 (162–249), Beclin-1 (174–266), ROCK1 (535–709), and NDP52 (141–334) from the cDNA of Homo sapiens were amplified using primers containing restriction enzyme sites (BamHI at the 5′-end and EcoRI or XhoI at the 3′-end; NEB). The gBlock gene fragments Ndel1 (8–99), MYO10 (883–933), TPM1 (1–98), Mfn1 (600–735), LMNA (328–398), TRIM25 (194–356), and TRIM5 (133–241) were synthesized using primers containing the above-mentioned restriction enzyme sites (Integrated DNA Technologies). Sequences of the CCD proteins used in this study are provided in Supplementary Table 3. Genes for the CCDs were ligated into the His6-SspB vector. The gene fragments MDV1 (231–300) and NDP52 (197–270) were also ligated into a modified pMAL vector with an N-terminal hexa-histidine sequence and several point mutations to facilitate crystallization39. The resulting plasmids were transformed into E. coli BL21(DE3) cells.
Protein expression and purification
Expression of the His-tagged fusion proteins was induced by the addition of 0.5 mM isopropyl-β-D-thiogalactoside at 291 K for 20 h. Cells were harvested by centrifugation and resuspended in buffer A (50 mM Tris, pH 8.0, and 0.5 mM tris(2-carboxyethyl)phosphine) containing 300 mM NaCl. After sonication, the cell lysate was loaded onto a HisTrap column (GE Healthcare), and bound proteins were eluted with a gradient of buffer A containing 100 mM NaCl and 500 mM imidazole. Eluted proteins were concentrated by ultrafiltration (Amicon Ultra 30 K NMWL, Millipore) and loaded onto a Superdex 200 10/300 GL column (GE Healthcare) equilibrated with buffer A containing 200 mM NaCl.
Cells expressing the MBP-tagged fusion proteins were harvested by centrifugation and resuspended in buffer A containing 100 mM NaCl and 1 mM EDTA. After sonication, the cell lysate was applied to a column containing amylose resin (NEB) and collected by gravity flow. The beads were washed with 10 column volumes of buffer A, and then the protein was eluted with buffer A supplemented with 10 mM maltose. The eluted sample was further purified using a Q-FF column (GE Healthcare). Finally, the fusion proteins were loaded onto a HiLoad 16/600 Superdex 200 column (GE Healthcare) pre-equilibrated with buffer A containing 100 mM NaCl.
SEC-MALS experiments were performed using a fast protein liquid chromatography system (GE Healthcare) connected to a Wyatt MiniDAWN TREOS MALS instrument and Wyatt Optilab rEX differential refractometer. The column and buffer used were the same as those used in the final purification step. Ovalbumin was used as the isotropic scatterer for detector normalization. Light scattering from each sample (3–5 mg/ml, 0.5 ml) was measured and analysed using ASTRA V software (Wyatt).
Crystallization and structure determination
Purified MBP-MDV1 fusion protein was crystallized at 295 K by the hanging drop vapour diffusion method, mixing equal volumes of the protein and a reservoir solution containing 100 mM sodium acetate, pH 4.5, 25% (w/v) polyethylene glycol 3350, 100 mM CaCl2, and 1.3–1.7 M sodium formate. Crystals were flash-frozen with the reservoir solution containing 20% (v/v) glycerol in a nitrogen stream at 100 K. Native MBP-MDV1 data were collected at beamline 5C of the Pohang Accelerator Laboratory (PAL), Korea. Diffraction data were indexed, integrated, and scaled using HKL2000 software40. The structure was determined by molecular replacement using the MBP mutant structure as a search model39. Statistics for data collection and refinement are provided in Supplementary Table 1.
Solutions of the MBP-NDP52 and MBP-MDV1 fusion proteins were prepared in buffer A containing 100 mM NaCl. The concentration of MBP-NDP52 was 2.6 mg/ml, and that of MBP-MDV1 was 9.8 mg/ml. Scattering data of MBP-NDP52 and MBP-MDV1 were collected at beamline BL-10C of the Photon Factory, Japan, and at beamline 4C of PAL, respectively. Details of experimental parameters are shown in Supplementary Table 2. Briefly, scattering images from the proteins at various concentrations were reduced to two-dimensional (2D) data by circular integration. A preliminary analysis of these 2D data using PRIMUS (ATSAS program) provided the radius of gyration (Rg), Porod volume, and experimental molecular weight41. Ab initio modelling and averaging of these models were performed using DAMMIF and DAMAVER, respectively. Rigid body fitting of the crystallographic structure to the dummy atom models was performed using the Situs program package42.
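For readers unfamiliar with how Rg is obtained from a scattering curve, the standard Guinier approximation, ln I(q) = ln I(0) − (Rg²/3)·q², valid at low angles (roughly q·Rg < 1.3), can be fitted as a straight line in q². The sketch below is a plain least-squares illustration of that approximation on synthetic data; it is not the PRIMUS/ATSAS implementation, and the function name and the data values are assumptions.

```python
# Illustrative sketch only: Guinier fit by ordinary least squares, synthetic data.
import math


def guinier_fit(q, intensity):
    """Fit ln I(q) = ln I0 - (Rg^2 / 3) * q^2 and return (Rg, I0).

    q and intensity are equal-length sequences restricted to the
    low-angle (Guinier) region; Rg is returned in the inverse units of q.
    """
    if len(q) != len(intensity) or len(q) < 2:
        raise ValueError("need at least two matching (q, I) points")
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("non-negative slope: data are not Guinier-like")
    intercept = mean_y - slope * mean_x
    return math.sqrt(-3.0 * slope), math.exp(intercept)


if __name__ == "__main__":
    # Synthetic, noise-free curve (hypothetical particle: Rg = 30 A, I0 = 100).
    true_rg, true_i0 = 30.0, 100.0
    q = [0.005 + 0.005 * k for k in range(8)]  # A^-1, so q*Rg <= 1.2
    intensity = [true_i0 * math.exp(-(true_rg * qi) ** 2 / 3.0) for qi in q]
    rg, i0 = guinier_fit(q, intensity)
    print(round(rg, 3), round(i0, 3))
```

With real data, the fit range must be chosen so that the q·Rg limit holds and the residuals show no systematic curvature, which can indicate aggregation or interparticle effects.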
Protein structure modeling
All protein models were generated using the crystal structures of SspB (PDB accession code: 1OX8)32 and the CCDs of MDV1 (PDB accession code: 2XU6)43, Beclin-1 (PDB accession code: 3Q8T)44, Atg16 (PDB accession code: 3A7O), and ROCK1 (PDB accession code: 3O0Z)34. All figures for structures were generated using PyMOL (http://www.pymol.org).
Accession code: Atomic coordinates and structure factor files for MBP-MDV1 have been deposited in the Protein Data Bank with the accession code 5JST.
How to cite this article: Kim, B.-W. et al. ACCORD: an assessment tool to determine the orientation of homodimeric coiled-coils. Sci. Rep. 7, 43318; doi: 10.1038/srep43318 (2017).
We thank MJ Eck (Dana-Farber Cancer Institute) for providing the gene encoding the CCD of ROCK1. We also thank the staff at beamline 5C of PAL for help with X-ray data collection and at beamline 4C of PAL and beamline BL-10C of the Photon Factory for help with SAXS data collection. This work was supported by National Research Foundation of Korea (NRF) grants from the Korean government (MEST) (NRF-2016R1E1A1A01942623 and BRL grant: No. 2015041919) and by a grant from the Institute for Basic Science (IBS-R023-D1). This research was also supported by a Korea University Grant and by the Basic Science Research Program of the NRF funded by the Ministry of Education (NRF-2013R1A6A3A04063675 to YOJ).