Characterizations of distinct parallel and antiparallel G-quadruplexes formed by two-repeat ALS and FTD related GGGGCC sequence

The large expansion of GGGGCC (G4C2) repeats of the C9orf72 gene have been found to lead to the pathogenesis of devastating neurological diseases, amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD). The structural polymorphisms of C9orf72 HRE DNA and RNA may cause aberrant transcription and contribute to the development of ALS and FTD. Here we showed that the two-repeat G4C2 DNA, d(G4C2)2, simultaneously formed parallel and antiparallel G-quadruplex conformations in the potassium solution. We separated different folds of d(G4C2)2 by anion exchange chromatography, followed with characterizations by circular dichroism and nuclear magnetic resonance spectroscopy. The parallel d(G4C2)2 G-quadruplex folded as a symmetric tetramer, while the antiparallel d(G4C2)2 adopted the topology of an asymmetric dimer. These folds are distinct from the antiparallel chair-type conformation we previously identified for the d(G4C2)4 G-quadruplex. Our findings have demonstrated the conformational heterogeneity of the C9orf72 HRE DNA, and provided new insights into the d(G4C2)n folding. Meanwhile, the purified d(G4C2)2 G-quadruplex samples are suitable for further three-dimensional structure characterizations, which are required for the structure-based design of small molecules targeting ALS and FTD related C9orf72 HRE.

loops 17,18 . In recent years, accumulating evidence point to the existence of G-quadruplex structures under physiological conditions, which may have critical functional roles in many cellular processes. G-quadruplex and other peptides, like β-amyloid, are highly attractive targets for therapeutic interventions for neurological disorders 15,[19][20][21] . G-quadruplex-based assays and probes are also used to detect exonuclease activity or gene deletion 22,23 .
Previous findings have suggested that the parallel topology is adopted by most C9orf72 (G4C2) n DNAs 9,24,25 . However, to date, no parallel topology of the C9orf72 HRE DNA has been determined, largely due to the difficulty imposed by the heterogeneous conformations of G-rich oligonucleotides for structural determination. Commonly, the adjustment of length of oligonucleotide repeats, or mutation and modification of specific nucleotides could be employed to stabilize a particular structural fold of G-quadruplexes. The experimental conditions, including pH, temperature and type of cations in the buffers, could also profoundly impact the G-quadruplex folding.
To obtain a C9orf72 HRE DNA sample with the parallel topology and of sufficient quality for structural determination, we carefully examined our previous data, including CD and NMR spectra of G4C2 repeats of different lengths 25 . The 1D 1 H NMR spectra suggested that d(G4C2) 2 was composed of less heterogeneous G-quadruplex conformations compared to other tested d(G4C2) n DNAs. Therefore, in this study, we employed d(G4C2) 2 for further purifications, and obtained several distinct fractions by anion exchange chromatography. The native polyacrylamide gel electrophoresis (PAGE) and CD spectroscopy were performed to investigate all the purified fractions, which demonstrated that homogeneous samples each composed of a distinct parallel or antiparallel G-quadruplex fold were obtained. Using NMR approaches, we further characterized the parallel and antiparallel G-quadruplex conformations. The parallel d(G4C2) 2 G-quadruplex was found to fold as a symmetric tetramer, while the antiparallel d(G4C2) 2 G-quadruplex adopted the topology of an asymmetric dimer.

d(G4C2) 2 formed mixed parallel and antiparallel topologies.
During the preliminary screening of d(G4C2) n samples for structural studies, the CD spectra of d(G4C2) 2 , d(G4C2) 3 and d(G4C2) 5 showed a negative peak at 240 nm and positive peaks at 260 and 290 nm 25 . As previous reports presented, characteristic peaks for parallel quadruplexes include a positive maximum at ∼260 nm and a negative minimum at ∼240 nm. Meanwhile, the positive maximum for an antiparallel quadruplex is typically at around 290 nm 28 . Thus, the characteristic peaks of both parallel and antiparallel G-quadruplexes existed in the above CD spectra, indicating mixed topologies in these samples. The NMR spectroscopy was also employed to investigate their folds. The 1D 1 H spectra of d(G4C2) n except d(G4C2) 2 showed broaden peaks in the imino region, suggesting the formation of mixtures with multiple G-quadruplex conformations 25 . In contrast, the 1D 1 H spectrum of d(G4C2) 2 showed sharp peaks with narrow line widths at 10-12 ppm (Fig. 1a), indicating that its conformation was less heterogeneous. Therefore, we focused on d(G4C2) 2 in further efforts to obtain a G4C2 DNA G-quadruplex sample with the parallel topology.
To analyze the molecular size of d(G4C2) 2 , the native PAGE was performed and DNA oligonucleotides dT12, dT24 and dT48 were employed as size indicators ( Fig. 1b; Supplementary Fig. S1). The d(G4C2) 2 DNA migrated as multiple bands, in consistency with a mixture of topologies observed in CD spectroscopy.

Purification of d(G4C2) 2 by anion exchange chromatography.
To separate different conformations of d(G4C2) 2 in the original sample, anion exchange chromatography was performed using a high resolution Mono-Q column. We generated a potassium chloride gradient to elute DNA fractions from the column using buffer A (20 mM Tris, 50 mM KCl, pH7) and buffer B (20 mM Tris, 1 M KCl, pH7). Twenty-five column volumes of potassium chloride concentration from 30% B to 90% B produced six distinct peaks during the elution process (Fig. 1c). Native PAGE demonstrated that each fraction contained particular components of d(G4C2) 2 G-quadruplex folds (Fig. 1b). The d(G4C2) 2 component with lower stoichiometry was eluted at lower salt concentrations. The elution peaks of fractions 1 and 2 (F1 and F2) were closely adjacent and each contained a single band that migrated faster than dT12. Since the monomeric d(G4C2) 2 could not form a G-quadruplex, the result suggested that F1 and F2 were consisted of homogeneous dimeric G-quadruplexes of similar conformations. Several bands were observed in F3 or F4, which migrated between dT12 and dT24, indicating the composition of different low-order G-quadruplexes. F5 migrated as a single band near dT24, which was likely a homogeneous tetrameric G-quadruplex. In F6, several bands migrated slower than dT24, indicating the composition of high-order structural forms. To further validate the oligomeric status, we compared the electrophoretic mobility of F1 and F5 with monomeric 23-nt human telomere h-telo d[TAGGG(TTAGGG) 3 ] 29 and dimeric 16-nt 93del d[GGGGTGGGAGGAGGGT] 30 (Supplementary Fig. S2). F1 migrated similarly as h-telo, supporting the dimeric form of d(G4C2) 2 . F5 migrated slower than 93del, which agreed with the tetrameric composition of d(G4C2) 2 .

CD spectroscopy characterizations of purified d(G4C2)2 fractions.
To investigate the conformation of d(G4C2) 2 in different fractions, CD spectroscopy was employed to analyze the topology of each fraction (Fig. 2). In addition to characteristic peaks for parallel and antiparallel quadruplexes as described above, the 290-nm peak indicates that two adjacent G-tetrads have opposite hydrogen-bond directionalities, whereas the 260-nm peak is characteristic of a G-quadruplex in which all the G-tetrads have the same polarity 31  Therefore, based on their CD spectra, F1 and F2 adopted a mixed parallel/antiparallel topology with G-tetrad neighbors of mixed relative polarities.
In contrast, F3 displayed a single positive absorption peak and a negative peak at 290 nm and 260 nm, respectively, suggesting the formation of a predominantly antiparallel G-quadruplex fold. The spectrum of F4 exhibited two positive peaks at 260 and 290 nm, which was probably resulted from its mixed conformations as observed in the native PAGE. The CD spectra of F5 and F6 displayed a positive peak at 260 nm and a negative peak at 240 nm, respectively, indicating parallel G-quadruplex folds. These results also demonstrated that the lower stoichiometry of d(G4C2) 2 was prone to adopt antiparallel topology and the higher stoichiometry of d(G4C2) 2 favored parallel topologies.

Stability of d(G4C2) 2 fractions investigated by CD melting experiments.
The thermal stability of the purified d(G4C2) 2 fractions were examined by CD melting experiments, which were performed with a temperature range of 25 °C to 95 °C at 1 °C/min. The CD absorbance was measured at a single wavelength (290 nm for antiparallel G-quadruplexes F1, F2 and F3, and 260 nm for parallel G-quadruplexes F4, F5 and F6). The normalized CD absorbance was fitted by the Boltzmann sigmoid equation (Fig. 3). The melting temperatures of F1 and F2 were 78.79 °C and 75.96 °C, respectively. Both their melting curves exhibited apparent transitions, suggesting  the stable G-quadruplex structure formation by these DNA samples. In contrast, the melting profiles of F3, F4, F5 and F6 showed no obvious transitions, indicating that these DNA samples formed less stable G-quadruplex structures or adopted multiple conformations. As such, the d(G4C2) 2 G-quadruplexes in F1 and F2 adopted the most stable conformations among the purified samples.

Detailed G-quadruplex topologies revealed by NMR spectroscopy.
To obtain more structural insights into d(G4C2) 2 G-quadruplex conformations, 1D 1 H NMR spectra were examined for the unpurified d(G4C2) 2 and the purified d(G4C2) 2 fractions (Fig. 4). The imino region of 1 H NMR spectrum of unpurified d(G4C2) 2 exhibited signals between 10 and 12 ppm, corresponding to imino protons of guanines involved in G-tetrads. Interestingly, the imino region of the antiparallel-dominant G-quadruplex fractions (F1 and F2) showed peaks between 11 and 12 ppm, while that of the parallel-dominant G-quadruplex fractions (F5 and F6) lay between 10.5 and 11 ppm. Thus, the location of the imino peaks may be indicative of the topology of d(G4C2) n G-quadruplexes. The 1 H NMR spectra of F3 and F4 contained peaks in both antiparallel and parallel regions, indicating the formation of heterogeneous mixtures. The spectrum of F6 showed broaden peaks, which was consistent with the high-order structures determined by the native PAGE experiment.
In 1D 1 H NMR spectra of F1 and F2, at least 12 imino proton resonances were observed at 10-12 ppm with narrow line widths, indicating the formation of an asymmetric dimeric G-quadruplex fold with three or four G-tetrads. The hydrogen-deuterium exchange (HDX) experiment showed that many imino protons were exposed and subjected to exchange with solvent after 30 min in D 2 O (Supplementary Fig. S3a,b), in consistency with a  dimeric G-quadruplex conformation. Meanwhile, the 1D 1 H NMR spectrum of F5 exhibited 8 well-resolved peaks at 10-12 ppm, indicating a symmetric tetrameric conformation with eight G-tetrads. Around six peaks were well-protected from the exchange with solvent in the HDX experiment, which was consistent with the G-quadruplex fold containing eight tetrad planes (Supplementary Fig. S3c).
No obvious signals were detected in the Watson-Crick base pairing region, implicating that the cytosines were located in the loop region of the d(G4C2) 2 conformation. The detailed guanine compositions and loop orientations of the asymmetric dimeric and symmetric tetrameric G-quadruplexes remain unclear and need further experiments to resolve.

Purified d(G4C2) 2 G-quadruplex fractions are suitable for investigations by crystallography.
To evaluate the sample quality of these anion exchange fractions for further structure determination, we employed the X-ray crystallography approach. We screened various crystallization conditions for fractions F1 and F5. In initial trials we obtained crystals of fraction F5 (Supplementary Fig. S4), and a diffraction data set was collected to 1.8 Å resolution. This suggests that the purified samples are suitable for high resolution structural investigations. Since this set of diffraction data is still not sufficient to determine the F5 structure by the molecular replacement method, we will optimize crystal conditions or use bases with heavy-atom substitutions to solve the phase problem in the future study.

Discussion
The G-quadruplexes formed by the C9orf72 (G4C2) n DNA or RNA have been found to lead to the pathogenesis of devastating neurological diseases ALS/FTD. The detailed structure information of these G-quadruplexes is critical for understanding the related disease etiology. Previous studies have demonstrated that the C9orf72 (G4C2) n DNA formed parallel and/or antiparallel topologies 9,24,25 . The d(G4C2) 4 formed 100% antiparallel topology, while all the other tested d(G4C2)n contained the parallel G-quadruplex topology. Despite topologies resolved for the antiparallel G-quadruplex folds of d(G4C2) 4 25,26 , the knowledge of the parallel G-quadruplex conformation adopted by the (G4C2) n HRE DNA was limited to indications from CD spectroscopy 9,24,25 .
The structure polymorphisms of G-rich oligonucleotides create the obstacle for obtaining homogenous parallel G-quadruplex folds for the 3-dimensional structure determination. Some modification tools, such as the 8-bromo guanine substitution favoring syn conformation, have been employed to improve the homogeneity of the samples. However, using these approaches, many mutational combinations need to be tested to find the optimal one, which is labor-intensive, tedious and time consuming. In some cases, no guanine with syn conformation in the G-quadruplex is available for chemical modifications. Therefore, other approaches need to be considered for purification of homogeneous G-quadruplex samples for structural determination.
Although anion exchange chromatography has not been frequently used in the structure study of G-quadruplexes, it represents one of the most common methods for the purification of oligonucleotide samples. It is widely used to purify DNA or RNA before crystallization trials 32 . Here we demonstrated that using the method of anion exchange chromatography, we could isolate homogeneous d(G4C2) 2 samples with either the parallel or antiparallel G-quadruplex conformation. Therefore, the approach of anion exchange chromatography could be routinely adopted to purify distinct conformations of G-quadruplexes for structural characterizations.
In summary, we presented the first report of the purification and characterization of the parallel topology adopted by the G4C2 repeat DNA. Using the method of anion exchange chromatography, we isolated homogeneous d(G4C2) 2 samples with either parallel or antiparallel G-quadruplex conformation. Further CD and NMR spectroscopy characterizations suggested that the parallel d(G4C2) 2 G-quadruplex formed a symmetric tetramer while the antiparallel G-quadruplex folded into an asymmetric dimer. Moreover, the purified d(G4C2) 2 G-quadruplex fractions provided suitable samples for further 3-dimensional structure determinations, which may build the necessary structural basis for designing small molecules targeting ALS and FTD related C9orf72 HRE.

Methods
Sample preparation. DNA oligonucleotides were chemically synthesized by IDT. The synthesized DNA was solved at 100 μM concentration in 20 mM potassium phosphate (pH 7.0) and 70 mM KCl. The solved DNA was heated to 95 °C for 15 min, and slowly cooled to room temperature. The NMR sample was concentrated with the use of a Centricon 3kD ultrafiltration column (Millipore, MA). Circular dichroism spectroscopy. Circular dichroism spectra were recorded with 400 μl SNA samples at room temperature on an Applied Photophysics Chirascan CD spectrometer using a 1 mm path length quartz cell. DNA concentration was 20 μM. The DNA samples were prepared in the buffer containing 20 mM potassium phosphate (pH 7.0) and 70 mM KCl. CD melting. The CD melting experiments were performed with a temperature range from 25 °C to 95 °C at 1 °C/min. The CD absorbance were measured at a single wavelength and then normalized by using the equation (Abst − min)/(max − min), in which Abst is the absorbance at a given temperature, max is the maximum absorbance at 260 nm (parallel G-quadruplexes) and at 290 nm (antiparallel G-quadruplexes), and min is the minimum value. Data were fit by the Boltzmann sigmoid equation (Prism).
Polyacrylamide gel electrophoresis (PAGE). The 18% TBE polyacrylamide gel (acrylamide:bis-acrylamide 29:1) was used to run non-denaturing PAGE. 20 mM KCl was supplemented in both gel SCIenTIfIC RepoRTS | (2018) 8:2366 | DOI:10.1038/s41598-018-20852-w and running buffer (0.5× TBE). The annealed samples were prepared at a strand concentration of 100 μM. For the anion exchange chromatography fractions, 5 μl of each fraction was combined and loaded in the gel. Red-safe staining was employed to reveal gel bands. The images were acquired by the Molecular Imager Gel Doc XR system (Bio-Rad). NMR spectroscopy. The 1D 1 H NMR experiments were performed on 750 MHz and 800 MHz Varian spectrometers. Watergate or Jump-and-Return water suppression techniques were employed for samples in water solution. NMRPipe was used to process NMR spetra. For the HDX exchange experiments, the oligonucleotides samples were annealed and lyophilized. The lyophilized oligonucleotides were resuspended in 99% D 2 O before NMR measurements. DNA crystallization. DNA crystals of fraction F1 and F5 were grown by vapor diffusion from hanging drops at 289 K using Natrix HT (Hampton research). The diffraction data were collected on beamline BL-17U of the Shanghai Synchrotron Radiation Facility (SSRF) 33 .