Mechanical deformations of DNA such as bending are ubiquitous and have been implicated in diverse cellular functions1. However, the lack of high-throughput tools to measure the mechanical properties of DNA has limited our understanding of how DNA mechanics influence chromatin transactions across the genome. Here we develop ‘loop-seq’—a high-throughput assay to measure the propensity for DNA looping—and determine the intrinsic cyclizabilities of 270,806 50-base-pair DNA fragments that span Saccharomyces cerevisiae chromosome V, other genomic regions, and random sequences. We found sequence-encoded regions of unusually low bendability within nucleosome-depleted regions upstream of transcription start sites (TSSs). Low bendability of linker DNA inhibits nucleosome sliding into the linker by the chromatin remodeller INO80, which explains how INO80 can define nucleosome-depleted regions in the absence of other factors2. Chromosome-wide, nucleosomes were characterized by high DNA bendability near dyads and low bendability near linkers. This contrast increases for deeper gene-body nucleosomes but disappears after random substitution of synonymous codons, which suggests that the evolution of codon choice has been influenced by DNA mechanics around gene-body nucleosomes. Furthermore, we show that local DNA mechanics affect transcription through TSS-proximal nucleosomes. Overall, this genome-scale map of DNA mechanics indicates a ‘mechanical code’ with broad functional implications.
This is a preview of subscription content, access via your institution
Open Access articles citing this article.
Nature Communications Open Access 19 November 2022
Nature Communications Open Access 28 May 2021
Identification of cis-regulatory motifs in first introns and the prediction of intron-mediated enhancement of gene expression in Arabidopsis thaliana
BMC Genomics Open Access 26 May 2021
Access Nature and 54 other Nature Portfolio journals
Get Nature+, our best-value online-access subscription
$29.99 / 30 days
cancel any time
Subscribe to this journal
Receive 51 print issues and online access
$199.00 per year
only $3.90 per issue
Rent or buy this article
Prices vary by article type
Prices may be subject to local taxes which are calculated during checkout
All sequencing data obtained as part of this study are deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under accession number PRJNA667271. Nucleosome positions and NCP scores along the genome of S. cerevisiae as previously reported have been accessed from NCBI Gene Expression Omnibus (GEO) under accession number GSE36063. Nucleosome occupancy data in S. cerevisiae as previously reported have been accessed from NCBI GEO under accession number GSE97290. There are no restrictions on data availability. Source data are provided with this paper.
No sequencing analysis in this study depends on the use of specialized code. Simple custom scripts were written in MATLAB (MathWorks) versions 9.0, 9.2, 9.4, 9.6 for analysis of sequencing data, and will be made available upon request. smFRET data acquisition was carried out by custom scripts that can be obtained from http://ha.med.jhmi.edu/resources/ or upon request.
Garcia, H. G. et al. Biological consequences of tightly bent DNA: the other life of a macromolecular celebrity. Biopolymers 85, 115–130 (2007).
Krietenstein, N. et al. Genomic nucleosome organization reconstituted with pure proteins. Cell 167, 709–721 (2016).
Shore, D., Langowski, J. & Baldwin, R. L. DNA flexibility studied by covalent closure of short fragments into circles. Proc. Natl Acad. Sci. USA 78, 4833–4837 (1981).
Cloutier, T. E. & Widom, J. Spontaneous sharp bending of double-stranded DNA. Mol. Cell 14, 355–362 (2004).
Ha, T. et al. Probing the interaction between two single molecules: fluorescence resonance energy transfer between a single donor and a single acceptor. Proc. Natl Acad. Sci. USA 93, 6264–6268 (1996).
Vafabakhsh, R. & Ha, T. Extreme bendability of DNA less than 100 base pairs long revealed by single-molecule cyclization. Science 337, 1097–1101 (2012).
Geggier, S. & Vologodskii, A. Sequence dependence of DNA bending rigidity. Proc. Natl Acad. Sci. USA 107, 15421–15426 (2010).
Beutel, B. A. & Gold, L. In vitro evolution of intrinsically bent DNA. J. Mol. Biol. 228, 803–812 (1992).
Rosanio, G., Widom, J. & Uhlenbeck, O. C. In vitro selection of DNAs with an increased propensity to form small circles. Biopolymers 103, 303–320 (2015).
Jeong, J. & Kim, H. D. Base-pair mismatch can destabilize small DNA loops through cooperative kinking. Phys. Rev. Lett. 122, 218101 (2019).
Jeong, J. & Kim, H. D. Determinants of cyclization-decyclization kinetics of short DNA with sticky ends. Nucleic Acids Res. 48, 5147–5156 (2020).
Lee, W. et al. A high-resolution atlas of nucleosome occupancy in yeast. Nat. Genet. 39, 1235–1244 (2007).
Bai, L., Ondracka, A. & Cross, F. R. Multiple sequence-specific factors generate the nucleosome-depleted region on CLN2 promoter. Mol. Cell 42, 465–476 (2011).
Zhang, Z. et al. A packing mechanism for nucleosome organization reconstituted across a eukaryotic genome. Science 332, 977–980 (2011).
Segal, E. & Widom, J. Poly(dA:dT) tracts: major determinants of nucleosome organization. Curr. Opin. Struct. Biol. 19, 65–71 (2009).
Segal, E. et al. A genomic code for nucleosome positioning. Nature 442, 772–778 (2006).
Chereji, R. V., Ramachandran, S., Bryson, T. D. & Henikoff, S. Precise genome-wide mapping of single nucleosomes and linkers in vivo. Genome Biol. 19, 19 (2018).
Gilchrist, D. A. et al. Pausing of RNA polymerase II disrupts DNA-specified nucleosome organization to enable precise gene regulation. Cell 143, 540–551 (2010).
Zhou, C. Y. et al. The yeast INO80 complex operates as a tunable DNA length-sensitive switch to regulate nucleosome sliding. Mol. Cell 69, 677–688.e9 (2018).
Brahma, S. et al. INO80 exchanges H2A.Z for H2A by translocating on DNA proximal to histone dimers. Nat. Commun. 8, 15616 (2017).
Eustermann, S. et al. Structural basis for ATP-dependent chromatin remodelling by the INO80 complex. Nature 556, 386–390 (2018).
Brahma, S., Ngubo, M., Paul, S., Udugama, M. & Bartholomew, B. The Arp8 and Arp4 module acts as a DNA sensor controlling INO80 chromatin remodeling. Nat. Commun. 9, 3309 (2018).
Knoll, K. R. et al. The nuclear actin-containing Arp8 module is a linker DNA sensor driving INO80 chromatin remodeling. Nat. Struct. Mol. Biol. 25, 823–832 (2018).
Oberbeckmann, E. et al. Genome information processing by the INO80 chromatin remodeler positions nucleosomes. Preprint at https://doi.org/10.1101/2020.11.03.366690 (2020).
Kubik, S. et al. Nucleosome stability distinguishes two different promoter types at all protein-coding genes in yeast. Mol. Cell 60, 422–434 (2015).
Widom, J. Role of DNA sequence in nucleosome stability and dynamics. Q. Rev. Biophys. 34, 269–324 (2001).
Drew, H. R. & Travers, A. A. DNA bending and its relation to nucleosome positioning. J. Mol. Biol. 186, 773–790 (1985).
Hayes, J. J., Tullius, T. D. & Wolffe, A. P. The structure of DNA in a nucleosome. Proc. Natl Acad. Sci. USA 87, 7405–7409 (1990).
Widlund, H. R. et al. Nucleosome structural features and intrinsic properties of the TATAAACGCC repeat sequence. J. Biol. Chem. 274, 31847–31852 (1999).
Shrader, T. E. & Crothers, D. M. Artificial nucleosome positioning sequences. Proc. Natl Acad. Sci. USA 86, 7418–7422 (1989).
Lowary, P. T. & Widom, J. New DNA sequence rules for high affinity binding to histone octamer and sequence-directed nucleosome positioning. J. Mol. Biol. 276, 19–42 (1998).
Jin, H., Rube, H. T. & Song, J. S. Categorical spectral analysis of periodicity in nucleosomal DNA. Nucleic Acids Res. 44, 2047–2057 (2016).
Brogaard, K., Xi, L., Wang, J.-P. & Widom, J. A map of nucleosome positions in yeast at base-pair resolution. Nature 486, 496–501 (2012).
Ngo, T. T. M., Zhang, Q., Zhou, R., Yodh, J. G. & Ha, T. Asymmetric unwrapping of nucleosomes under tension directed by DNA local flexibility. Cell 160, 1135–1144 (2015).
Bondarenko, V. A. et al. Nucleosomes can form a polar barrier to transcript elongation by RNA polymerase II. Mol. Cell 24, 469–479 (2006).
Nakano, M. et al. Single-molecule PCR using water-in-oil emulsion. J. Biotechnol. 102, 117–124 (2003).
Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2019).
Mizuguchi, G., Wu, W.-H., Alami, S. & Luk, E. in Methods in Enzymology (eds. Wu, C. & Allis, C. D.) vol. 512, 275–291 (Academic Press, 2012).
A.B. and T.H. thank C. Wu, X. Feng and M. F. Poyton for insights and help related to INO80 biochemistry, Q. Zhang for help with initial assay development efforts, and A. Biswas for providing passivated glass and quartz slides for smFRET experiments. K.P.H and S.E. thank M. Moldt for purification of recombinant INO80. This work was supported by the National Science Foundation grants PHY-1430124 and EFMA 1933303 (to T.H.), by the National Institutes of Health grants GM122569 (to T.H.), R01CA163336 (to J.S.S.) and GM130393 (to C.W.), by the European Research Council (Advanced Grant INO3D to K.P.H), and by the Deutsche Forschungsgemeinschaft (CRC1064 and Gottfried Wilhelm Leibniz-Prize to K.P.H). A.B. was a Simons Foundation Fellow of the Life Sciences Research Foundation. T.H. is an Investigator with the Howard Hughes Medical Institute.
The authors declare no competing interests.
Peer review information Nature thanks Johan Elf, Wilma Olson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
a, Looping kinetic curves of two individual sequences that were part of the cerevisiae nucleosomal library (Supplementary Note 4), obtained by performing two individual smFRET experiments (Fig. 1a–c) as well as timecourse loop-seq (Supplementary Note 3) on the library. b, Looping times of 10 sequences that were part of the cerevisiae nucleosomal library obtained from 10 individual smFRET experiments (Fig. 1c) versus looping times obtained by performing time-course loop-seq on the library. Pearson’s r = 0.84; 95% CI = 0.44, 0.96; P = 0.002, two-sided t-test. c, Looping barriers (natural logarithm of the looping times; see Supplementary Note 2) of all 19,907 sequences in the cerevisiae nucleosomal library obtained by performing time-course loop-seq versus the corresponding cyclizability values obtained by performing regular loop-seq involving 1 min of DNA looping before RecBCD digestion. Pearson’s r = 0.833; 95% CI = 0.829, 0.837; P < 0.00001, two-sided t-test.
Before annealing of the ends, the DNA rapidly samples various configurations where the ends are far apart or closer together, described here for simplicity as a rapid equilibrium between two representative states. As previously described6, annealing captures the state in which the ends are close together. Thus, the rate of looping as measured in the FRET based assay reports on the equilibrium population fraction of the state where the ends are close but not annealed, irrespective of the exact shape and geometry of the subsequent annealed state. It thus addresses the biological question of how quickly regions of DNA can approach, which can then be stabilized by protein binding6. However, the formation of nucleoprotein complexes may require not just the ends to approach, but the intervening DNA to also adopt a certain shape, and the readout of looping rate does not distinguish between these possible shapes. Various shapes have been proposed for the subsequent annealed state, such as a teardrop configuration where the nicks are open and base-pair stacking across nicks is disrupted, and a smooth state in which base-pair stacking is preserved across the nicks11. Other non-canonical geometries may also be possible, such as a U-turn geometry, where the sticky overhangs interact via reverse Watson–Crick base-pairing in a parallel stranded configuration. Conventional Watson–Crick base pairing between the overhangs, but in a geometry similar to the U-turn configuration, has been achieved in the past (hairpin loop10). Presence of mismatches or transient defects in the duplex region may influence whether such a geometry is preferred over the teardrop or smooth configuration10. Because all members of various libraries in our study used the same overhang sequences, relative differences between them in their looping kinetics are unlikely to be affected by a potential conformational heterogeneity of the looped state.
a, Seven of the ten sequences whose looping times were measured using smFRET (sequences 1–7 in Fig. 1c, and listed in Supplementary Note 1) were combined to form a small library. Plotted are the cyclizability values obtained by performing loop-seq on this library versus the looping times obtained from the individual smFRET experiments. Pearson’s r = −0.977; 95% CI = −0.848, −0.997; P = 0.0002, two-sided t-test. This plot is similar to that in Fig. 1f, except that loop-seq was performed on a much smaller library comprising only these seven sequences. This control serves to revalidate the anti-correlative relation between looping time and cyclizability and confirm reproducibility of loop-seq measurements. b, Regular PCR of the entire library containing multiple templates can generate incorrectly annealed products that are annealed via the 25-bp identical adapters at the ends but have 50-bp bubbles in the middle. Such constructs would probably be extremely flexible and be protected from digestion owing to rapid cyclization. Emulsion PCR separates the templates in individual droplets, thereby preventing incorrect annealing between different templates. We performed a control experiment to verify that emulsion PCR of the library does not affect the measured value of cyclizability. In one case, seven template sequences (sequences 1–7 as listed in Supplementary Note 1 and Fig. 1c) were mixed and then a single round of ePCR was performed to form a small library. In another case, seven separate regular PCR amplifications were carried out for the seven template molecules. The amplified products were then mixed in equimolar proportions to form the library. Loop-seq was performed on these two 7-member libraries and two sets of cyclizabilities of the seven sequences were measured. As indicated, these values are highly correlated. Pearson’s r = 0.992; 95% CI = 0.951, 0.999; P = 8 × 10−6, two-sided t-test. c, Technical replicates of loop-seq performed on the cerevisiae nucleosomal library (Supplementary Note 4). Pearson’s r = 0.869; 95% CI = 0.865, 0.872; P < 0.00001, calculated using two-sided t-test. P value was obtained from a two-sided t-test. The same original library as provided by the manufacturer (Genscript) was used, but all subsequent steps were performed independently. d, This control was performed to verify the expectation that DNA sequences which are more bendable and thus loop quickly under high salt conditions are slow to unloop under low salt conditions. In red is the looping kinetics of sequence 6, and in green is that of sequence 7 (Fig. 1c) measured using smFRET. For the unlooping measurements, the slide containing nicked DNA was incubated for 2 h with high-salt imaging buffer containing 1 M NaCl. After that, low-salt imaging buffer containing no added NaCl was flowed in and the percentage of molecules in high FRET as a function of time was measured. e, This control was performed to verify that while performing loop-seq, molecules do not significantly unloop during the 20 min of digestion with RecBCD (Methods). If they do unloop, they would be immediately digested, which would affect the measurement of cyclizability. Sequence 5 (Supplementary Note 1) was used in this experiment. We find that digestion buffer (without the RecBCD enzyme) is in itself capable of looping molecules, and that too at a slightly faster rate than in the presence of looping buffer (which has 1 M NaCl, see Methods). This is owing to the presence of Mg2+ ions in the digestion buffer, which we know to also effect looping by allowing stable hybridization of the ends6. Thus, molecules looped in the presence of 1 M NaCl in 1 min are expected to stay looped during the subsequent 20 min of digestion with RecBCD. f, This control was performed to understand the effect of the orientation of the central 50 bp sequence on the measured value of cyclizability. First the random library was constructed (where the sequence of DNA in the 50 bp central variable region are randomly selected; Supplementary Note 5). Then a library called ‘mixed reverse complement of the random library and random library’ was constructed by mixing the random library with another library, in which every sequences in the random library was present, but had its central 50-bp variable region flipped (Supplementary Note 6). Loop-seq was performed on this new library. We found that the cyclizability of a sequence represented in the Random Library half of this new library was correlated with that of the corresponding sequence in the other half, where the central 50-bp region was flipped. Pearson’s r = 0.73; 95% CI = 0.72, 0.74; P < 0.00001, calculated using a two-sided t-test. g, This plot serves to confirm the expectations that RecBCD does not digest looped molecules, and that over sufficient time, most molecules, even rigid ones, will loop. During time-course loop-seq (Supplementary Note 3), the original sample was split into eight identical fractions. Looping for various amounts of time and subsequent digestion was carried out for seven of the eight fractions, and one fraction was not subject to any digestion. All fractions were then PCR amplified (16 cycles) under identical conditions (see methods). Plotted are the concentrations of DNA obtained after PCR versus the corresponding times the samples were subject to the looping condition. These data points were fit to an exponential curve (solid line). The DNA concentration was obtained when no digestion was performed is represented as the dashed horizontal line. The fact that the fitted exponential approaches the dashed line indicates that for very long looping times, almost all molecules, even very rigid ones, have had sufficient time to loop and hence are protected from digestion. Thus, in this case, the concentration of DNA obtained after PCR of all surviving molecules approaches that of the fraction where no digestion was performed at all. Whether RecBCD would digest molecules sealed via non-canonical parallel base-pairing or other geometries (Extended Data Fig. 1) is not known. However, this control suggests that either it does not, or such unconventional base-pairings are rare. h, These plots serve to further demonstrate that if the library is permitted to loop for a very long time, most molecules, even very rigid ones, will loop, and that looped molecules are protected from subsequent digestion with RecBCD. In this case, the relative populations of various sequences measured after digestion should be similar to the case where no digestion was performed at all. We find this to indeed be the case: there is good correlation between the relative population of a sequence in sample 1 of time-course loop-seq (no digestion) and sample 8 (2 h of looping followed by digestion). However, correlation between the relative population of a sequence in sample 1 and that of a sequence in sample 2 (1 min of looping before digestion) is much weaker. This is because in sample 2, only those molecules whose sequences render them bendable enough to loop under 1 min are protected from subsequent RecBCD digestion. i, Technical replicates of the looping kinetic curve of sequence number 3 (Supplementary Note 1) measured using smFRET.
a, Loop-seq was performed on the random library (Supplementary Note 5). Plotted is the mean A/T content as a function of position along the central variable 50 bp region, where the mean is calculated by averaging over the 1,000 most cyclizable (red) or least cyclizable (blue) sequences. The value of n (n = 26) is the distance in nucleotides of the biotin tether from the end of each molecule in the library (Fig. 1d). b, In an untethered geometry, sequence features such as the phase of the oscillations in A/T content may result in the looped configuration having a preference for a certain poloidal angle (rotation along the long axis of DNA). Preference for a certain poloidal angle translates to a preference for a certain orientation of the biotin-streptavidin tether. Shown above are two extreme cases—in one case, the poloidal angle preference of the sequence results in a preferential orientation of the tether on the outside, while in the other case, the tether points to the inside at the point of contact with the DNA. As the biotin-streptavidin-complex is quite large, the outside orientation may be more favoured for looping owing to steric considerations. The outside orientation can be converted to the inside orientation by moving the biotin tether point to a base that is half the DNA helical repeat away. This may explain why the phase of the oscillation of A/T content among the most or least cyclizable sequences shifts by half the helical repeat of DNA when the tether point is also shifted by half the helical repeat of DNA (about 5 bases) (compare a and c). c, The random library was re-prepared, placing the biotin 31 nucleotides away from the ends (n = 31) and loop-seq was performed. Plotted are the same quantities as in a, except the 1,000 most and least cyclizable sequences of the library were identified based on the newly obtained cyclizability values under the n = 31 condition. d, e, See context in which these panels are referred to in Supplementary Note 7. In e, Pearson’s r = 0.987; 95% CI = 0.986, 0.987; P < 0.00001, two-sided t-test. f, Mean A/T content as a function of position along the variable region of the random library, where the averaging is done over the 1,000 sequences that have the highest (red) or lowest (blue) values of intrinsic cyclizability. The scale of the axes is the same as in a and c. g, Amplitude spectra obtained from the fast Fourier transforms of the plots in f (solid lines) and a (dashed lines). h, 2D histogram of the scatter plot of measured cyclizability of sequences in the random library prepared with the biotin at the very end of the molecule (n = 0 condition) vs its predicted value based on the oscillatory model (equation (1) in Supplementary Note 7). Pearson’s r = 0.787; 95% CI = 0.78, 0.793; P < 0.00001, two-sided t-test. i–k, 2D histogram of scatter plot of measured cyclizabilities of sequences in the random library prepared at n = 0 vs prepared at n = 26, 29, 31 nucleotides. In i, Pearson’s r = 0.38; 95% CI = 0.37, 0.40; P < 0.00001, two-sided t-test. In j, Pearson’s r = 0.70; 95% CI = 0.69, 0.71; P < 0.00001, two-sided t-test. In k, Pearson’s r = 0.77; 95% CI = 0.76, 0.78; P < 0.00001, two-sided t-test. l, The use of long 10-nucleotide overhangs has been shown to eliminate the need for ligase and to reduce the dependence of looping on rotational phasing between the ends6. Shown here is a 2D histogram of the scatter plot of intrinsic cyclizability of a sequence in the random library (which had 50 bp of DNA along the central variable region) versus the corresponding sequence in library L (Supplementary Note 8), in which 5 bases were added to the variable region. A correlation coefficient only slightly poorer than the correlation between cyclizability values of the random library and the reverse complement of the random library (Extended Data Fig. 3f) suggests that rotational phasing of the ends does not significantly influence intrinsic cyclizability. Pearson’s r = 0.63; 95% CI = 0.61, 0.65; P < 0.00001, two-sided t-test. See Supplementary Note 8.
Extended Data Fig. 5 Intrinsic cyclizability and nucleosome occupancy vs position from the dyads of the +1 nucleosomes of various individual genes in S. cerevisiae.
Plots are as shown for the two individual genes in Fig. 2c. The dashed line marks the ordinate value where intrinsic cyclizability is lowest.
a, Intrinsic cyclizability as a function of position from the +1 nucleosomal dyad, along the 334th, 75th and 301st genes in the list of 576 genes along which intrinsic cyclizability was measured (Supplementary Note 9). The 80-bp linker regions of both constructs (with rigid and flexible linkers extending from the 601 sequence) in pairs 1, 2 and 3 along which INO80 sliding extent was measured, were selected from the 334th, 75th, and 301st genes respectively. Genes are oriented in the upstream to downstream direction. Red and green denote the selected 80-bp less rigid and more rigid linker regions, respectively. See Supplementary Note 11. b, c, The remodelling reactions shown in Fig. 2f were all performed for 1 min of remodelling, under various enzyme concentrations. Here we show, instead, remodelling reaction time courses at saturating concentrations of INO80 for all three pairs. Conditions are identical to Fig. 2f, except that saturating (30 nM) INO80 is used and remodelling is permitted to progress for various amounts of time after addition of ATP (Methods). In all cases, the two constructs in a pair are present simultaneously, and distinguished by imaging the gel once for Cy3 and once for Cy5 fluorescence. Thus, although the sliding extent can be very sensitive to sliding time (especially for short sliding times), robust comparisons of sliding extents can be made between the two constructs in a pair. Quantification is as described in Supplementary Note 11. Sliding on the constructs in pair 1 was repeated in a separate experiment in presence of 30 nM enzyme, and also side by side in presence of 60 nM enzyme. Near identical extents of sliding indicate saturation has been reached at 30 nM INO80. d, Ratio of the fractional sliding extent in the construct formed on the more flexible linker to that formed on the more rigid linker, at various time points since addition of ATP, and in presence of 30 nM INO80. The dashed line indicates a ratio of 1. The ratio is computed from the data in b. The extent of sliding under saturating enzyme conditions is consistently higher for the construct involving more flexible linker (except, as expected, when the sliding extent approaches 100%). Solid lines connect observations that were made from the same initial reaction volume by sampling its fractions at various time points.
a, To assess our confidence in the result that INO80 mediated sliding is greater in the construct with the less rigid linker, we performed nucleosome-sliding experiments similar to those reported in Fig. 2f five times for each pair in the presence of 9 nM INO80 and for 2.5 min of sliding. These constitute technical replicates. The products of sliding were analysed on a 6% TBE gel as done in Fig. 2f. Each gel was imaged separately for Cy3 and for Cy5 fluorescence and quantified to calculate the fold difference in sliding extent between the flexible and the rigid construct in each pair. The measurements of fold-differences for each pair are displayed in the box plots, along with the actual data points. The central mark in each box (red) represents the median and the bottom and edges represent the 25th and 75th percentile respectively. The whiskers extent to the most extreme datapoints. Also indicated are the mean and s.d., and upper limit of the P value (defined here as the probability of obtaining a fold-difference of 1 if the distribution of fold differences has the same mean and s.d. as that of these 5 measurements) as obtained by the application of Chebyshev’s inequality. Dashed line represents a fold-difference of 1. b, In the experiment described in Fig. 2f, the more rigid construct in all pairs was labelled with Cy3, while the less rigid construct was labelled with Cy5. This control verifies that the result that sliding extent in greater in the less rigid construct is not influenced by different dye properties. We swapped the dyes between the two constructs in pair 1. We then performed nucleosome-sliding experiments on this modified pair 1 constructs for the three INO80 concentrations that yielded detectible sliding in Fig. 2f (6, 9, 13 nM), and for 1 min of sliding as in Fig. 2f. The products of sliding were analysed on a 6% TBE gel, and the sliding extents quantified as in Fig. 2f. We indeed find that even when the dyes are swapped, sliding extent is greater for nucleosomes formed on the less rigid construct. c, To obtain better statistics of sliding along the dye-swapped pair 1 constructs, we repeated one of the conditions in b (13 nM INO80, 1 min of sliding) five times. These constitute technical replicates. The measured fold-difference values are displayed in the box plot. The central mark in each box (red) represents the median and the bottom and edges represent the 25th and 75th percentile respectively. The whiskers extent to the most extreme datapoints. Also indicated are the mean, s.d. and upper limit of the P value (defined as in a) as obtained by the application of Chebyshev’s inequality. Dashed line represents a fold difference of 1.
Data were obtained by performing loop-seq on the chromosome V library (Supplementary Note 12).
a, Distribution of NCP scores33 of all 3,192 S. cerevisiae chromosome V nucleosomes. Quartile and octile boundaries of the distribution are shown as dashed lines and numbered (1 to 4 for quartiles and 1 to 8 for octiles). b, Mean intrinsic cyclizability of DNA as a function of position from the dyads of nucleosomes along yeast chromosome V, averaged over nucleosomes in each octile indicated in a. Error extents (shaded background) are s.e.m. c, Mean intrinsic cyclizability as a function of position, averaged over all 3,192 S. cerevisiae chromosome V nucleosomes (solid line). Height of the shaded region is the standard deviation of measurements. d, Scatter plot of the NCP scores of the 227 +1 nucleosomes of the 227 genes identified along chromosome V vs the mean intrinsic cyclizabilities of DNA along the 147 bp that span these nucleosomes. Intrinsic cyclizability values were obtained by performing loop-seq on the chromosome V library. Pearson’s r = 0.28. 95% CI = 0.15, 0.39. P = 2.6 × 10−5, t-test, two-sided. e, Plot of intrinsic cyclizability as a function of position along all the 576 genes in the tiling library (red), and among 34% of these genes that had the highest (black) or lowest (green) NCP score value of the gene’s +1 nucleosome. Plots were obtained in a manner identical to that in Fig. 2b. Intrinsic cyclizability on either side of the dyad of +1 nucleosomes of genes that have high +1 nucleosome NCP score (black) is asymmetric, being higher on the TSS proximal (that is, ‘left’) side of the dyad.
Extended Data Fig. 10 Loop-seq measurements compared to expectations based on earlier measurements and models.
a, Two sets of 1,000 plots each of A/T content as a function of position along the central 50 bp variable region of 1,000 sequences in the random library with the highest and lowest values of intrinsic cyclizability were generated. Fast Fourier transforms of these two sets of 1,000 plots were taken individually and used to calculate a total of 2,000 amplitude spectra. Plotted is the mean of the 1,000 amplitude spectra for the 1,000 sequences that have the highest (red) or lowest (blue) intrinsic cyclizability values. The plot indicates that sequences that have very high or low intrinsic cyclizabilities also tend to be characterized by enhanced or suppressed periodic modulations in AT content respectively at the DNA helical repeat. b, We consider a poly(dA:dT) stretch which has at least l consecutive A or T nucleotides to be a poly(dA:dT) tract. For various values of l, plotted are intrinsic cyclizability (top), poly(dA:dT) tract content (middle), and nucleosome occupancy (bottom) vs position from the dyad of the +1 nucleosome, averaged over all 227 identified genes in S. cerevisiae chromosome V (blue) or over a selected subset of genes that show no peak in poly(dA:dT) content at the NDR (red; 30%, 62%, and 86% of genes for l = 4 bp, 7 bp, 10 bp respectively). See Supplementary Note 15 for plotting details, including how poly(dA:dT) content is defined. c, 2D histogram of the scatter plot between the number of poly(dA:dT) tracts in the 50 bp variable region and intrinsic cyclizability of sequences in the chromosome V library. Any stretch of l or more consecutive As or Ts (here l = 4) is considered a poly(dA:dT) tract. Thus a sequence with one stretch of 5 As, and no other As or Ts, in the 50 bp variable region has 2 poly(dA:dT) tracts if l is considered to be 4. The scatter plot indicates that the overall correlation between intrinsic cyclizability and poly(dA:dT) content is very poor. Only non-overlapping sequences in the chromosome V library were considered. Pearson’s r = −0.07. 95% CI = −0.09, −0.05. P < 0.00001, t-test, two-sided. d, Binned histogram of the data in c (which represents the l = 4 bp case), as well as for more restrictive definitions of poly(dA:dT) stretches (l = 5 to 10 bp). The y-axis values are the mean intrinsic cyclizabilities of those sequences in the chromosome V library that contain the number of poly(dA:dT) tracts in the central 50 bp variable region as specified along the x axis. Error bars are s.e.m. For l = 4, there were n = 5,081, 2,801, 1705, 948, 521, 268, 170 non-overlapping sequences in the chromosome V library which had a poly(dA:dT) content of 0, 1, 2, 3, 4, 5, 6 respectively. For l = 5, the corresponding N values were 8,771, 1,594, 695, 323, 150, 89, 36. For l = 6, the corresponding N values were 10,523, 655, 295, 117, 62, 25, 32. For l = 7, the corresponding N values were 11,203, 290, 109, 56, 27, 30, 16. For l = 8, the corresponding N values were 11,497, 109, 58, 23, 30, 17, 11. For l = 9, the corresponding N values were 11,608, 57, 23, 29, 19, 13, 7. Because the count of poly(dA:dT) = 6 is less than 10, this point was left out from the plot. For l = 10, the corresponding N values were 11,665, 23, 30, 21, 11, 6, 1. Again, the intrinsic cyclizability for poly(dA:dT) content = 5, 6 were left out from the plot because the n value for these were less than 10. e, 2D histogram of the scatter plot of mean GC content along the central 50 bp variable region of sequences in the chromosome V library vs their intrinsic cyclizabilities. Pearson’s r = 0.026. 95% CI = 0.019, 0.033. P < 0.00001, t-test, two-sided. f, A plot of mean poly(dA:dT) content (l = 4) as a function of position around the dyads of gene-body nucleosomes along chromosome V in S. cerevisiae. The points along the horizontal axis where the nucleosome categories (+1, +2, and so on) are marked represent the dyads of the nucleosomes. Light shaded region represents s.e.m. Poly(dA:dT) content was calculated as described in Supplementary Note 15. g, Mean poly(dA:dT) content (l = 4) and AA/TT/TA vs position along the native and codon-randomized nucleosomal DNA sequences of the same 500 +7 nucleosomes along which intrinsic cyclizability profiles are reported in Fig. 3g. h, Predicted plectoneme density (PPD) (see Supplementary Note 16) (top), intrinsic cyclizability (middle) and nucleosome occupancy (bottom) vs position from the +1 nucleosomal dyad, averaged over all 576 genes in the tiling library (Supplementary Note 9). PPD along each gene was first smoothened using a rolling window of 51 bp. The smoothened PPDs were then averaged at each position across all 576 genes, normalized by the mean, and plotted as the solid line in the top panel. The shaded background represents s.e.m. Intrinsic cyclizability and nucleosome occupancy were plotted as in Fig. 2b. For the plot of intrinsic cyclizability, the solid blue and black lines represent data without and with a 7-fragment smoothening respectively (as in Fig. 2b). The shaded blue background represents s.e.m. of intrinsic cyclizability values in the unsmoothed data.
Supplementary Notes 1 – 18 and Supplementary Fig. 1 (all uncropped gel images).
Dataset 1: List of all sequences in the Cerevisiae Nucleosomal Library, their cyclizabilities for various values of n, their intrinsic cyclizabilities, amplitudes and phases. This dataset was used to generate Fig. 1f, Fig. 4c,d.
Dataset 2: Information regarding where the 19,907 sequences were chosen from the genome of S. cerevisiae to construct the Cerevisiae Nucleosomal Library (Dataset 1). Reported for each sequence is the location of the dyad of the nucleosome it is associated with, the chromosome the nucleosome is on, and whether the sequence represents DNA 50 bp to the left or to the right of the dyad. Genomic coordinates are according to the SacCer2 version of the S. cerevisiae genome assembly.
Dataset 3: List of all sequences in the Random Library, their cyclizabilities for various values of n, their intrinsic cyclizabilities, amplitudes and phases.
Dataset 4: List of the 576 S. cerevisiae genes selected to construct the Tiling Library.
Dataset 5: List of all sequences in the Tiling Library, their cyclizabilities for various values of n, their intrinsic cyclizabilities, amplitudes and phases. This dataset was used to generate Fig. 2b.
Dataset 6: List of all sequences in the ChrV Library, their cyclizabilities for various values of n, their intrinsic cyclizabilities, amplitudes and phases. This dataset was used to generate Fig. 3b-e.
Dataset 7: List of 500 genes from which the 500 +7 nucleosomes were selected for the construction of Library L.
Dataset 8: List of all sequences in the Library L, their cyclizabilities for various values of n, and their intrinsic cyclizabilities. This dataset was used to generate Fig. 3g, Fig. 4a.
About this article
Cite this article
Basu, A., Bobrovnikov, D.G., Qureshi, Z. et al. Measuring DNA mechanics on the genome scale. Nature 589, 462–467 (2021). https://doi.org/10.1038/s41586-020-03052-3
This article is cited by
Nature Structural & Molecular Biology (2022)
Nature Communications (2022)
Identification of cis-regulatory motifs in first introns and the prediction of intron-mediated enhancement of gene expression in Arabidopsis thaliana
BMC Genomics (2021)
Nature Methods (2021)
Nature Communications (2021)