Introduction

DNA consists of deoxyribose sugars, phosphoric acids and the nitrogenous bases adenine (A), thymine (T), guanine (G) and cytosine (C). The length of telomeric DNA, which is located at the terminal ends of linear eukaryotic chromosomes,1 is a factor in determining the lifetime of a normal cell.2 Telomeric DNA is shortened after each round of cell division because the repeated telomeric DNA sequence 5’-TTAGGG-3’, which does not encode genetic information, is not replicated completely. Telomeres are stabilized by specialized T-loop and D-loop structures.3 T-loops and D-loops comprise single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) and specific proteins that maintain the telomeric structure. Telomeric repeat factor (TRF) 1 and TRF2 are telomeric dsDNA-binding proteins,4, 5 whereas protection of telomere 1 (POT1) is a telomeric ssDNA-binding protein.3, 6

Of the DNA bases, guanine most readily reacts with OH radicals, which are oxidizing agents. OH radicals are often formed when water is exposed to ionizing or ultraviolet radiation or during metabolic activity within the cell.7 Mutations can arise within a DNA sequence as the result of guanine oxidation to 7,8-dihydro-8-oxoguanine (8-oxoG). Thus, gaining mechanistic insight into the genomic maintenance of telomeric DNA is important for studying oncogenesis8, 9 and the biological effects of ionizing radiation.10, 11

POT1 protects single-stranded telomeric DNA and controls telomere length. When bound to telomeric ssDNA, POT1 prevents replication protein A from binding.12 Replication protein A consists of three subunits and is involved in the DNA damage checkpoint pathway. POT1 is also a key protein that prevents the activation of ataxia telangiectasia and Rad3-related kinase, which is involved in DNA damage repair.13

Previously, we performed molecular dynamics (MD) simulations of telomeric dsDNA and TRF1 to further explore the telomeric protein-binding system.14 We found structural differences in the telomeric dsDNA depending on the binding of TRF1. In the present study, we investigated the relationship between telomeric ssDNA and POT1 by performing MD simulations, which revealed a novel role for POT1 in maintaining telomeric ssDNA.

POT1 and telomeric ssDNA-binding systems are important for studying telomere maintenance and cell longevity, and several reports have explored this relationship. Ramos et al.15 simulated and analyzed POT1 and a telomeric ssDNA fragment in the bound state. Their initial structure was obtained from an X-ray diffraction experiment (PDBID: 1QZH16). Chatterjee et al.17 performed MD simulations of an ssDNA fragment without protein for 400 ps and calculated the root-mean-square deviation (RMSD) of the ssDNA in the single state. Their ssDNA sequence was a fragment of p53-coding DNA (130–140 codon sequence). The RMSD values of this ssDNA under constant particle number, volume and temperature (NVT) conditions and under constant particle number, pressure and temperature (NPT) conditions were also calculated for 4 and 5 ns. These RMSD values increased during the simulations. Jaiswal et al.18 performed MD simulations of the binding systems of POT1 and peptides for 3 ns and calculated the RMSD, root-mean-square fluctuation (RMSF) and number of hydrogen bonds (H-bonds) in the binding systems.

Luscombe et al. reported a three-dimensional analysis of the protein–DNA interaction at an atomic level.19 Specifically, they studied the structures of protein–DNA binding systems based on a crystal analysis. Lei et al.20 studied the H-bonds of the crystal structures between POT1 and telomeric ssDNA. In this work, we studied the H-bonds between POT1 and telomeric ssDNA using MD simulation and compared the results of this simulation with the X-ray crystal structure.

Materials and methods

POT1 protein and telomeric ssDNA

The telomeric ssDNA fragment sequence in 3KJP21 was defined as 5′-GGTTAGGGTTAG-3′, which is a long telomeric ssDNA fragment. The bases in the telomeric ssDNA are numbered from the 5′ terminus, and the initial guanine base in 3KJP is G2. The POT1 and telomeric ssDNA structures were individually obtained from the X-ray crystal structure (PDBID: 3KJP). The single state is defined as the unbound state, and the binding state is defined as the bound state. POT1 and telomeric ssDNA in the single and binding states were used to analyze the structural dynamics of POT1 and telomeric ssDNA.

MD simulation

The GROMACS version 4.5.5 package22 was used to perform all-atom MD simulations. Conformation figures were drawn using VMD software,23 and the AMBER99SB-ILDN force field24, 25 was used to describe atomic interactions. The TIP3P water model,26, 27, 28 which is a three-point electrostatic interaction model, was used as the solvent. First, to neutralize the total charge of each simulated system, 4 Cl− ions were inserted into the POT1 single system, 10 Na+ ions were inserted into the telomeric ssDNA single system and 6 Na+ ions were inserted into the POT1 and telomeric ssDNA-binding system. Next, the numbers of Na+ and Cl− ions were adjusted in each system to reach a NaCl concentration of 0.16 mol l−1. In total, 75 Na+ ions, 69 Cl− ions and 22 127 water molecules were inserted into the binding system, and 65 Na+ ions, 69 Cl− ions and 20 805 water molecules were inserted into the POT1 single system. Finally, 28 Na+ ions, 18 Cl− ions and 5960 water molecules were inserted into the telomeric ssDNA single system.
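The ion counts quoted above can be cross-checked with simple arithmetic: in each system, the net charge of the final Na+/Cl− complement should equal the net charge of the neutralizing counter-ions alone. The short sketch below does only this bookkeeping; the dictionary and function names are illustrative and are not part of the original workflow.

```python
# Counter-ions stated in the text as needed to neutralize each system.
counter_ions = {
    "POT1 single": {"Na": 0, "Cl": 4},
    "ssDNA single": {"Na": 10, "Cl": 0},
    "binding": {"Na": 6, "Cl": 0},
}

# Final ion counts stated in the text after adjusting the NaCl concentration.
final_ions = {
    "POT1 single": {"Na": 65, "Cl": 69},
    "ssDNA single": {"Na": 28, "Cl": 18},
    "binding": {"Na": 75, "Cl": 69},
}


def net_ion_charge(ions):
    """Net charge (in units of e) carried by the monovalent ions."""
    return ions["Na"] - ions["Cl"]


def is_consistent(system):
    """True if the final ions carry the same net charge as the counter-ions,
    i.e. the added salt is neutral and the solute charge is still balanced."""
    return net_ion_charge(final_ions[system]) == net_ion_charge(counter_ions[system])


for name in final_ions:
    print(name, net_ion_charge(final_ions[name]), is_consistent(name))
```

The three systems give net ion charges of −4, +10 and +6, matching the 4 Cl−, 10 Na+ and 6 Na+ counter-ions described above.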

Energy minimization was performed before the MD simulations using the steepest descent method followed by the conjugate gradient method to remove any large forces arising from close contacts. Periodic boundary conditions were applied, and long-range Coulomb interactions were calculated using the Particle Mesh Ewald method.29, 30 For non-bonded interactions, the cutoff was 10 Å. Constraints were applied to all bonds, including bonds between hydrogen and heavy atoms, using the linear constraint solver algorithm.31 The equations of motion were integrated using the leapfrog method with a time step of 2.0 fs. The temperature was controlled at 300 K during 100 ps NVT-MD, with protein and DNA molecules restrained. The velocity scaling method32 was applied to control the temperature with a coupling time constant of 0.1 ps. In addition, the pressure was maintained at 1 bar during 100 ps NPT-MD, with protein and DNA molecules restrained. The isotropic Parrinello-Rahman method33 was applied to control the pressure with a coupling time constant of 2 ps. The compressibility of water was set to 4.5 × 10−5 bar−1. The NVT-MD and NPT-MD simulations were performed to bring the solvent to thermal equilibrium. Here, NVT-MD and NPT-MD are together defined as pre-MD. After pre-MD, NPT-MD was performed without protein or DNA restraints for 100 ns, which is defined as the production-MD. In our study, pre-MD and production-MD were performed for the POT1 and telomeric ssDNA-binding state, the POT1 single state and the ssDNA single state. These simulations were independently performed three times with different initial configurations and velocity distributions of water molecules.

Results

ssDNA and POT1 conformation in single and binding states

To examine POT1 and telomeric ssDNA structures in the single and binding states in water, 100 ns production-MD was used to illustrate the conformation of molecules under both conditions (Figure 1). The structure of telomeric ssDNAs differed between the single and binding states. Based on an analysis of the crystal structure, telomeric ssDNA in the single state changed from the binding state structure into a C-shape (Figure 1b). However, the structure of telomeric ssDNA in the binding state did not change into a C-shape during the 100 ns production-MD.

Figure 1

Structural analysis following 100 ns production-molecular dynamics. (a) Protection of telomere 1 (POT1) in the single state, (b) telomeric single-stranded DNA (ssDNA) in the single state and (c) POT1 (blue) and telomeric ssDNA (red) in the binding state. A full color version of this figure is available at the Polymer Journal journal online.

Assessing POT1 and telomeric ssDNA-binding stability

We evaluated the distance between POT1 and telomeric ssDNA in the binding state to determine the stability of the binding system. The access distance (da) was defined as the distance between the closest Cα (POT1) and O5′ (telomeric ssDNA) pair (Figure 2a). To investigate the binding state of telomeric ssDNA and POT1 during the 100 ns production-MD, da values were calculated in each sample at 20-ps intervals. The ensemble average of da was calculated over the samples in water under the three different initial conditions.
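A minimal sketch of the da calculation for a single snapshot is given below, assuming the Cα and O5′ coordinates have already been extracted as arrays in nm and that the complex is whole (no periodic-image correction is applied). The function names `access_distance` and `ensemble_average` are illustrative only.

```python
import numpy as np


def access_distance(ca_xyz, o5_xyz):
    """Smallest C-alpha to O5' distance in one snapshot (units of the input)."""
    ca = np.asarray(ca_xyz, dtype=float)   # shape (n_ca, 3)
    o5 = np.asarray(o5_xyz, dtype=float)   # shape (n_o5, 3)
    # All pairwise displacement vectors, shape (n_ca, n_o5, 3).
    diff = ca[:, None, :] - o5[None, :, :]
    return float(np.sqrt((diff ** 2).sum(axis=-1)).min())


def ensemble_average(da_per_run):
    """Average d_a over independent runs at each saved frame.

    da_per_run has shape (n_runs, n_frames)."""
    return np.mean(np.asarray(da_per_run, dtype=float), axis=0)
```

Applying `access_distance` to every saved frame of each of the three runs and then calling `ensemble_average` gives one averaged time series of da.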

Figure 2

(a) The access distance (da) between Cα (lower yellow sphere) in protection of telomere 1 and O5′ (upper yellow sphere) in telomeric single-stranded DNA (ssDNA). (b) End-to-end distance (de) between the telomeric ssDNA ends.

As shown in Figure 3, the da value averaged over the three initial conditions remained between 0.35 and 0.48 nm. The da value averaged over time was 0.40 nm, and the deviation was less than 0.10 nm. These small da values indicated a constant binding state between POT1 and telomeric ssDNA. The da values of each sample for the three different initial conditions are also plotted at 1-ns intervals.

Figure 3

The ensemble average of the access distance (da) between protection of telomere 1 and the telomeric single-stranded DNA (line). The da values for the three different initial water conditions (circle, square, triangle) are plotted at 1.0-ns intervals.

Calculating the distance between single-state telomeric ssDNA ends

The telomeric ssDNA structure in the single state changed from the structure in the binding state into a C-shape (Figure 1b). This C-shape change was further analyzed by measuring the end-to-end distance (de), which is the distance between the centers of mass (COM) of G2 and G12 (Figure 2b). The de values in the single and binding states were calculated at 20-ps intervals. The ensemble average of de was also calculated. When the telomeric ssDNA structure in the single state changed from the structure in the binding state into a C-shape, the de value decreased.
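The de definition can be sketched as follows, assuming the atomic coordinates and masses of the two terminal bases (G2 and G12) are available as arrays for one snapshot. The helper names are hypothetical and chosen for illustration.

```python
import numpy as np


def center_of_mass(xyz, masses):
    """Mass-weighted mean position of a group of atoms."""
    xyz = np.asarray(xyz, dtype=float)      # shape (n_atoms, 3)
    m = np.asarray(masses, dtype=float)     # shape (n_atoms,)
    return (m[:, None] * xyz).sum(axis=0) / m.sum()


def end_to_end_distance(xyz_first, m_first, xyz_last, m_last):
    """Distance between the COMs of the first and last base (d_e)."""
    com_first = center_of_mass(xyz_first, m_first)
    com_last = center_of_mass(xyz_last, m_last)
    return float(np.linalg.norm(com_first - com_last))
```

A decrease of this quantity over the trajectory corresponds to the two ends approaching each other, as expected when the strand folds into a C-shape.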

The de values in the single and binding states are shown in Figure 4. The de values in the binding state were between 3.0 and 3.8 nm, with an average value of 3.5 nm for the 100-ns production-MD. The de value in the binding state was 3.6 nm for the initial crystal structure; to the best of our knowledge, the crystal structure in the single state has not yet been determined. The de values in the binding state for the 100-ns production-MD remained near the de value for the crystal structure. In contrast, the de values in the single state ranged from 0.80 to 3.2 nm and decreased between 50 and 60 ns. The de values in the single state were smaller than all de values in the binding state for the 100-ns production-MD and the initial crystal structure. The temporal fluctuations of de in the single state were larger than those in the binding state.

Figure 4

The distance between telomeric single-stranded DNA G2 and G12 centers of mass (de) in the single (black) and binding (red) states and de for the initial crystal structure (arrow).

Root-mean-square deviation

The previous section showed that the telomeric ssDNA structures differ between the single and binding states, whereas the POT1 structures do not differ between states (Figure 1). To evaluate the structural differences in these molecules between the single and binding states quantitatively, we analyzed the MD results for each molecule as a time series using the RMSD of the POT1 backbone and of all telomeric ssDNA atoms in the single and binding states at 20-ps intervals. The backbone is the main chain of the protein, that is, -NCαCNCαC-, in this calculation. As shown in equation (1), the RMSD of certain atoms in a molecule with respect to the atoms in the reference structure was calculated. The initial binding crystal structure was used as the reference structure. The COM of the reference and MD-snapshot structures were first superimposed. Next, the MD-snapshot structures were rotated to minimize the RMSD by least-squares fitting.

$$\mathrm{RMSD}(t) = \sqrt{\frac{1}{M}\sum_{i=1}^{N} m_i \left\lVert \mathbf{r}_i(t) - \mathbf{r}_i(0) \right\rVert^{2}} \qquad (1)$$

Here, the POT1 and telomeric ssDNA reference structures were taken from the initial crystal structure. The coordinate ri(0) represents the position of the i-th atom in the initial crystal structure (PDBID: 3KJP). The coordinate ri(t) represents the position of the i-th atom at time t. N is the number of either backbone atoms (POT1) or all atoms (the telomeric ssDNA). The mass of the i-th atom is represented by mi, and the mass M is the sum of the masses of either the backbone atoms (POT1) or all atoms (the telomeric ssDNA).
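The fitting procedure described above (COM superposition, least-squares rotation, then a mass-weighted RMSD using the quantities ri(0), ri(t), mi and M) can be sketched with NumPy. This is an illustrative implementation using the Kabsch algorithm with mass weights, not the GROMACS code used in the study; `fitted_rmsd` is a hypothetical name.

```python
import numpy as np


def fitted_rmsd(snapshot, reference, masses):
    """Mass-weighted RMSD of `snapshot` from `reference` after optimal fitting."""
    x = np.asarray(snapshot, dtype=float)    # r_i(t), shape (N, 3)
    y = np.asarray(reference, dtype=float)   # r_i(0), shape (N, 3)
    m = np.asarray(masses, dtype=float)      # m_i, shape (N,)
    total_mass = m.sum()                     # M

    # Step 1: superimpose the centers of mass by moving both to the origin.
    x = x - (m[:, None] * x).sum(axis=0) / total_mass
    y = y - (m[:, None] * y).sum(axis=0) / total_mass

    # Step 2: least-squares rotation (Kabsch algorithm, mass-weighted).
    covariance = (m[:, None] * x).T @ y      # 3 x 3 matrix
    u, _, vt = np.linalg.svd(covariance)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt
    x_fit = x @ rotation

    # Step 3: mass-weighted RMSD between fitted snapshot and reference.
    sq_dev = ((x_fit - y) ** 2).sum(axis=1)
    return float(np.sqrt((m * sq_dev).sum() / total_mass))
```

Because the fit removes rigid-body translation and rotation, a snapshot that differs from the reference only by such a motion gives an RMSD of zero to within rounding error.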

The RMSD values of POT1 are shown in Figure 5a. The average RMSD values calculated after 40 ns in the single and binding states were 0.25 and 0.20 nm, and the deviations were 0.03 nm in each case. The difference between the RMSD values in the single and binding states was less than 0.10 nm. These results show that the single and binding structures of POT1 are similar.

Figure 5

The root-mean-square deviation (RMSD) in the single (black) and binding (red) states for (a) the protection of telomere 1 backbone and (b) the telomeric single-stranded DNA.

In addition, we quantitatively evaluated the difference between the single and binding structures of POT1. We computed the average single and average binding structures of POT1 after 40 ns in the 100-ns production-MD. The samples were individually analyzed under the three different initial conditions, and we calculated the RMSD values for the nine pairs of average single and average binding structures after least-squares fitting. The maximum RMSD between an average single and an average binding structure of POT1 was 0.27 nm, which is small relative to the size of POT1. Overall, the results indicated that the single and binding structures of POT1 were essentially the same.

The RMSD values of telomeric ssDNA are shown in Figure 5b. The average RMSD values calculated after 65 ns in the single and binding states were 0.92 and 0.24 nm, respectively, and the deviations were 0.11 and 0.06 nm, respectively. The average RMSD values and deviations were larger in the single state than in the binding state. The telomeric ssDNA RMSD values in the binding state gradually increased, but the increase was small. The results show that the telomeric ssDNA structures in the single and binding states were different.

Radius of gyration

To study molecular size changes over time, the radius of gyration (Rg) of the POT1 backbone or of all telomeric ssDNA atoms in each single and binding state was calculated at 20-ps intervals. The Rg value was calculated as the mass-weighted root-mean-square distance of the atoms from the COM of the molecule, as shown in equation (2).

$$R_g(t) = \sqrt{\frac{1}{M}\sum_{i=1}^{N} m_i \left\lVert \mathbf{r}_i(t) - \mathbf{r}_{\mathrm{COM}}(t) \right\rVert^{2}} \qquad (2)$$

Here, the coordinate rCOM(t) is the molecule’s COM position at time t. The average Rg values were calculated for each sample in water under different initial conditions. The Rg values of POT1 in the single and binding states are shown in Figure 6a. The difference between the binding and single state values was less than 0.03 nm. The Rg values of telomeric ssDNA in the single and binding states are shown in Figure 6b. The single state values changed between 0.90 and 1.4 nm; however, the binding state values were stable between 1.3 and 1.4 nm during the 100-ns production-MD simulation. The deviations in the single and binding states were 0.15 and 0.06 nm, respectively. In the single state, these values dramatically changed between 50 and 60 ns, which was accompanied by a large RMSD change.
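For reference, a radius of gyration of this kind can be computed for one snapshot as sketched below, assuming mass weighting with the same mi and M used for the RMSD. The function name is illustrative.

```python
import numpy as np


def radius_of_gyration(xyz, masses):
    """Mass-weighted root-mean-square distance of atoms from their COM."""
    xyz = np.asarray(xyz, dtype=float)      # shape (N, 3)
    m = np.asarray(masses, dtype=float)     # shape (N,)
    com = (m[:, None] * xyz).sum(axis=0) / m.sum()
    sq_dist = ((xyz - com) ** 2).sum(axis=1)
    return float(np.sqrt((m * sq_dist).sum() / m.sum()))
```

A more compact conformation, such as the C-shape adopted by the single-state ssDNA, places atoms closer to the COM and therefore lowers this value.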

Figure 6

The Rg values for the single (black) and binding (red) states for the (a) protection of telomere 1 backbone and (b) telomeric single-stranded DNA.

Root-mean-square fluctuation

To determine which parts of one molecule within the binding system are influenced by the other, we compared the fluctuation of either POT1 residues or telomeric ssDNA bases in the single state with those in the binding state. In our study, the RMSF was calculated from the COM positions of POT1 residues or telomeric ssDNA bases, as shown in equation (3).

Here, the coordinate r(t) is the COM position of the POT1 residue or the telomeric ssDNA base. Ttotal is the total time from the initial time t0 to the final time tf: Ttotal = tf − t0. During the 100-ns production-MD, the initial time t0 of the RMSF calculation was 40 ns for POT1 and 65 ns for the telomeric ssDNA because the RMSD and Rg values did not change substantially after those times. The average COM position of a residue or a base in each sample is given by equation (4).
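As an illustration of this calculation, the sketch below computes the RMSF of one residue or base from its COM trajectory, using only frames at or after t0. It assumes uniformly spaced frames, so that the time average reduces to a plain mean over frames, and it assumes the usual form of the RMSF (root of the time-averaged squared deviation from the average position). The function name is hypothetical.

```python
import numpy as np


def rmsf_from_com(com_traj, times, t0):
    """RMSF of a COM trajectory over frames with time >= t0.

    com_traj: COM positions, shape (n_frames, 3)
    times:    frame times in the same unit as t0, shape (n_frames,)
    """
    com = np.asarray(com_traj, dtype=float)
    t = np.asarray(times, dtype=float)
    selected = com[t >= t0]
    # Average COM position over the analysis window.
    mean_pos = selected.mean(axis=0)
    # Root of the mean squared deviation from that average.
    sq_dev = ((selected - mean_pos) ** 2).sum(axis=1)
    return float(np.sqrt(sq_dev.mean()))
```

With t0 set to 40 ns for POT1 residues and 65 ns for ssDNA bases, frames from the earlier relaxation period are excluded from both the average position and the fluctuation.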

The RMSF values of POT1 residues and telomeric ssDNA bases were calculated for each sample in the single and binding states. The average values and RMSF variance were calculated. The variances of each sample in water were calculated for the different initial conditions.

We evaluated the difference in RMSF values between POT1 and telomeric ssDNA in both the single and binding states. The binding value is defined as αRMSF and was used to compare the values of RMSF in the single and binding states, as shown in equation (5).

The RMSF and sr values of POT1 residues in the single and binding states are shown in Figure 7a, and the αRMSF values are shown in Figure 7b. The maximum sr values were 0.10 nm at Glu254 in the single state and 0.16 nm at Ala6 in the binding state. In POT1, only the αRMSF value of Gln94 was more than 6.0.

Figure 7

(a) The root-mean-square fluctuation (RMSF) of the protection of telomere 1 (POT1) residues in the single (solid black line) and binding (red dotted line) states and the variance (error bar) after 40 ns. (b) αRMSF of POT1 residues.

The average RMSF and sr values of telomeric ssDNA in the single and binding states are shown in Figure 8a. The dynamics of atoms in this short ssDNA may differ from those in longer ssDNAs, especially because water molecules influence both ends of telomeric ssDNA. The αRMSF values are shown in Figure 8b. The RMSF values of telomeric ssDNA in the single state are larger than those in the binding state. The maximum sr values were 0.14 nm at G2 in the single state and 0.07 nm at G8 in the binding state. In telomeric ssDNA, only the αRMSF value of G6 was more than 3.0.

Figure 8

(a) The root-mean-square fluctuation (RMSF) of telomeric single-stranded DNA (ssDNA) bases in single (solid black line) and binding (red dotted line) states and the variance (error bar) after 65 ns. (b) αRMSF of telomeric ssDNA bases.

Upon evaluation of αRMSF, the difference in the RMSF of Gln94 and G6 was found to be the largest between the single and binding states of POT1 and telomeric ssDNA.

Hydrogen bonds between POT1 and telomeric ssDNA

Among non-covalent interactions, which also include van der Waals forces and hydrophobic interactions, hydrogen bonds (H-bonds) are comparatively strong interactions between molecules. To determine the parts of the binding system that strongly interacted with other molecules, the H-bonds between POT1 and telomeric ssDNA were investigated. To this end, three-dimensional geometry-based H-bond criteria34 were used. These criteria are based on the angle and distance between an acceptor (A) and donor (D) pair. In our study, the criteria were a hydrogen-donor-acceptor (H-D-A) angle of less than 30° and a distance between a D-A pair of less than 0.32 nm. To study the interactions between a POT1 residue and a telomeric ssDNA base, the number of H-bonds between telomeric ssDNA and POT1 was calculated at 20-ps intervals for the 100 ns production-MD. The number of H-bonds in the initial crystal structure (PDBID: 3KJP) was also calculated. The average number of H-bonds (NH) over time was calculated for POT1 residues and telomeric ssDNA bases for the three different initial water conditions.
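A geometric test of this kind can be sketched for a single donor, hydrogen and acceptor triplet as shown below. The sketch assumes that the angle is measured at the donor, between the donor-to-hydrogen and donor-to-acceptor vectors, and that coordinates are in nm; `is_hbond` and its parameter names are illustrative, with the two thresholds taken from the text.

```python
import numpy as np


def is_hbond(donor, hydrogen, acceptor, max_dist_nm=0.32, max_angle_deg=30.0):
    """Return True if the triplet satisfies the distance and angle criteria."""
    d = np.asarray(donor, dtype=float)
    h = np.asarray(hydrogen, dtype=float)
    a = np.asarray(acceptor, dtype=float)

    donor_to_acceptor = a - d
    donor_to_hydrogen = h - d
    dist = np.linalg.norm(donor_to_acceptor)

    # Angle at the donor between D->H and D->A (assumption stated above).
    cos_angle = np.dot(donor_to_acceptor, donor_to_hydrogen) / (
        dist * np.linalg.norm(donor_to_hydrogen)
    )
    angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

    return bool(dist < max_dist_nm and angle < max_angle_deg)
```

Counting the triplets that pass this test in every saved frame, and averaging the counts over time, gives a per-residue or per-base value analogous to NH.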

The NH values of POT1 for the 100-ns production-MD and the values in the initial crystal structures are shown in Figure 9. The NH values were larger than 1.5 in Lys33, Asp42, Gln94 and Asp224 in the 100-ns production-MD and less than 1.1 in Lys33, Asp42, and Asp224 in the initial crystal structure. There was a large difference between NH for the 100-ns production-MD and NH of the initial crystal structure.

Figure 9

The NH of protection of telomere 1 for the 100-ns production-molecular dynamics (closed circle) and NH of the initial crystal structure (open square).

The NH values of telomeric ssDNA for 100-ns production-MD and initial crystal structure values are shown in Figure 10. The NH values were greater than or equal to 2.0 in T4, G6 and G12 for the 100-ns production-MD and in T4, G6, T9 and G12 for the initial crystal structure. The number of H-bonds was higher in G6 than in T9 for the 100-ns production-MD but lower in G6 than in T9 for the crystal structure.

Figure 10

The NH of the telomeric single-stranded DNA for the 100-ns production-molecular dynamics (closed circle) and NH of the initial crystal structure (open square).

The NH value between Gln94 and G6 was calculated to be 1.74 with a deviation of 0.49. In addition, the NH value between Gln94 and all telomeric ssDNA bases except for G6 was zero, and the NH value between G6 and all POT1 residues except for Gln94 was 0.99. Thus, the H-bonds of Gln94 only pair with G6 in telomeric ssDNA, but G6 forms H-bonds with other POT1 residues. The geometric positions of Gln94 and G6 (Figure 11) indicate that Gln94 and G6 are close to each other in this binding system.

Figure 11

Snapshots of protection of telomere 1 (blue) and telomeric single-stranded DNA (red) after 100-ns production-molecular dynamics, where G6 and Gln94 are in yellow.

Discussion

Our initial structure (PDBID: 3KJP21) includes more residues of POT1 and a longer sequence of telomeric ssDNA than the initial structure studied by Ramos et al. (PDBID: 1QZH16). Our MD simulations are for the single and binding systems of POT1 and telomeric ssDNA and were conducted for a longer time. Furthermore, we calculated other physical quantities, not only for the binding system but also for the single systems. Our simulations under NPT conditions, which were longer than those conducted by Chatterjee et al.,17 showed that the RMSD decreased because the structure of the telomeric ssDNA changed. Our analysis of de, RMSD and Rg also revealed a large difference between the single structure of telomeric ssDNA and the structure of telomeric ssDNA attached to POT1.

Jaiswal et al.18 studied the binding systems of POT1 and peptides, but our study focused on the binding systems of POT1 and telomeric ssDNA and compared the single and binding systems of POT1 and telomeric ssDNA.

The time series of the telomeric ssDNA de and Rg values in the single state also indicate how the telomeric ssDNA structure changes in the absence of POT1. We found that the structure of telomeric ssDNA in the single state changed into a C-shape from the structure in the binding state and that the telomeric ssDNA shape is sustained by POT1.

We calculated and compared the number of H-bonds in the binding system of the initial crystal structure and during the 100-ns production-MD between POT1 and telomeric ssDNA. Our calculations show that the number of H-bonds between POT1 and telomeric ssDNA is very different between the 100-ns production-MD and the crystal structure.

Overall, our study suggests that Gln94 and G6 are important parts of the binding systems of POT1 and telomeric ssDNA.

Conclusions

In this study, we investigated the features of the POT1 and telomeric ssDNA single and binding systems. Our study shows that the telomeric ssDNA structure is sustained by POT1; however, the converse is not true: telomeric ssDNA binding does not alter the POT1 structure. In addition, our study shows that Gln94 frequently forms H-bonds exclusively with G6 of telomeric ssDNA, whereas G6 also frequently forms H-bonds with other POT1 residues. Overall, we found that G6 and Gln94 are important components of the POT1 and telomeric ssDNA-binding system.

In the future, MD simulations of longer telomeric ssDNA sequences should be performed. Of the DNA bases, guanine is the most chemically sensitive to OH radicals, which often oxidize guanine to form 8-oxoG. Therefore, MD simulations using telomeric ssDNA in which G6 is replaced by other bases, such as 8-oxoG, should also be performed.