Histone H1 binding to nucleosome arrays depends on linker DNA length and trajectory

Throughout the genome, nucleosomes often form regular arrays that differ in nucleosome repeat length (NRL), occupancy of linker histone H1 and transcriptional activity. Here, we report cryo-EM structures of human H1-containing tetranucleosome arrays with four physiologically relevant NRLs. The structures show a zig-zag arrangement of nucleosomes, with nucleosomes 1 and 3 forming a stack. H1 binding to stacked nucleosomes depends on the NRL, whereas H1 always binds to the non-stacked nucleosomes 2 and 4. Short NRLs lead to altered trajectories of linker DNA, and these altered trajectories sterically impair H1 binding to the stacked nucleosomes in our structures. As the NRL increases, linker DNA trajectories relax, enabling H1 contacts and binding. Our results provide an explanation for why arrays with short NRLs are depleted of H1 and suited for transcription, whereas arrays with long NRLs show full H1 occupancy and can form transcriptionally silent heterochromatin regions.

Understanding the organization of the genome requires insights into chromatin structure beyond the level of individual nucleosomes 1,2 . Nucleosomes can be arranged along the DNA into locally structured arrays in the nuclei of eukaryotic cells 1,3 . The relative position of neighboring nucleosomes in such arrays is defined by the nucleosome repeat length (NRL). The NRL comprises the 147 bp of DNA within the nucleosome core particle 4 and the length of linker DNA that connects the nucleosome with a neighboring nucleosome. The NRL is related to the transcriptional state of genomic regions 5,6 . Active genes contain nucleosome arrays with shorter NRLs, whereas longer NRLs are observed in heterochromatin regions that are transcriptionally silent 7,8 .
The NRL in nucleosome arrays is also associated with differences in the amount of associated linker histone H1, which is one of the most abundant proteins in chromatin 9 . Nucleosome arrays with longer NRLs are associated with higher H1 content, as revealed by studies investigating the effects of changes in H1 levels 8,10-12 and by studies of H1 stoichiometry across several cell types 13 . H1 is not present in genes that are actively transcribed 14,15 . There are 11 H1 variants in mammalian cells that share a central winged-helix domain, which consists of helices α1-α3, loops L1-L3 and a short, two-stranded β-sheet 16-19 . H1 contacts DNA with L1, the amino-terminal part of α2 together with L3, and α3 to stabilize the nucleosome and contribute to chromatin compaction 17-19 .
There is only limited information on the structure of regular nucleosome arrays 20,21 . Early studies of compacted arrays provided evidence for a two-start helix with nucleosomes stacked along the helix axis 22 . Crystal structures of tetranucleosomes with short NRLs of 157 bp or 167 bp (referred to as 4×157 and 4×167, respectively) showed compact zig-zag arrangements of nucleosomes and lacked H1 (refs. 23,24 ). Later cryogenic electron microscopy (cryo-EM) structures of H1-containing arrays containing 12 nucleosomes and NRLs of 177 bp or 187 bp (referred to as 12×177 and 12×187, respectively) adopted a fiber-like arrangement of stacked tetranucleosome units 25 . However, a crystal structure of an H1-containing array with 6 nucleosomes and an NRL of 187 bp (referred to here as 6×187) showed a less compact ladder-like arrangement 26 . In contrast to these in vitro results, electron tomography found no evidence of regular higher-order arrangements of nucleosomes in vivo 27,28 . Moreover, fluorescence imaging revealed that nucleosomes assemble into small clusters, rather than long fibers, in vivo 29,30 .
In summary, despite considerable efforts by the community, the structure of short nucleosome arrays remains poorly understood. It is also unclear how changes in NRL alter the structure of such arrays and how this influences H1 binding. Here we reconstitute tetranucleosome arrays with four physiologically relevant NRLs in the presence of H1 and analyze the resulting structures by cryo-EM. Our data reveal how the length of linker DNA modulates the local three-dimensional structure of these nucleosome arrays and how this influences H1 binding to particular nucleosomes of the arrays. These results have implications for understanding the compaction and transcriptional activity of chromatin.

Results
Structural analysis of tetranucleosome arrays. We reconstituted tetranucleosome arrays with four NRLs that occur in human cells in vivo 7 . These NRLs are found near active promoter regions (177 bp), in gene bodies (187 bp, 197 bp) or in heterochromatin (207 bp) (Fig. 1a and Methods). For the reconstitution, we used human histone octamers and saturating amounts of the full-length human linker histone H1 variant H1.4 (Fig. 1b) under previously established conditions 25 . We used restriction enzyme digestion and electrophoretic mobility shift assays (EMSAs) to confirm the integrity of the resulting tetranucleosome arrays that we refer to as 4×177, 4×187, 4×197 and 4×207 (Fig. 1b and Supplementary Fig. 1).
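The array names follow from the NRL definition given in the introduction: each repeat comprises the 147 bp of the nucleosome core particle plus one linker. As a minimal sketch of that arithmetic (the helper name `linker_length` is hypothetical and used only for illustration), the linker length implied by each NRL can be computed as follows.

```python
CORE_PARTICLE_BP = 147  # DNA within the nucleosome core particle, as stated in the text


def linker_length(nrl_bp: int) -> int:
    """Linker DNA length (bp) implied by a nucleosome repeat length (hypothetical helper)."""
    if nrl_bp < CORE_PARTICLE_BP:
        raise ValueError("NRL cannot be shorter than the core particle DNA")
    return nrl_bp - CORE_PARTICLE_BP


# The four NRLs used for the arrays in this study.
for nrl in (177, 187, 197, 207):
    print(f"4x{nrl}: {linker_length(nrl)}-bp linker")
```

For the four arrays this gives linkers of 30, 40, 50 and 60 bp, that is, multiples of 10 bp.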
We then used cryo-EM and single-particle analysis to obtain structures of the four tetranucleosome arrays in the absence of NaCl and without crosslinking (Methods). We obtained cryo-EM density maps at 4- to 8-Å resolution and could visualize secondary structure elements in the histone proteins. To obtain structural models, we used maps obtained by focused refinement and built individual nucleosome core particles based on an H1.4-bound mononucleosome structure (PDB 7K5Y (ref. 19 )). The individual nucleosomes adopt a canonical conformation in all our structures 4,31 . Then, we used the overall EM maps to build the linker DNA connecting individual nucleosomes. This resulted in high-quality structures of the four arrays (Tables 1 and 2 and Supplementary Figs. 2-9). We could refine all four nucleosomes in the 4×177 array (Fig. 1c).
Overall structure of tetranucleosome arrays. All four structures show a zig-zag arrangement of nucleosomes (Fig. 2a), similar to what was observed in the 4×167 array crystal structure without H1 (ref. 23 ) and in designed nucleosome fibers 25,26 . The overall architecture of all tetranucleosome arrays reported here is similar. In all structures, nucleosomes 1 and 3 form a canonical stack 23 , whereas nucleosome 2 is located in a DNA loop between the two stacking nucleosomes and is rotated relative to the nucleosome stack (Fig. 2a). The distance between nucleosome 2 and the nucleosome stack increases with increasing NRL, which leads to increased mobility of nucleosome 2 (Supplementary Figs. 2, 4, 6 and 8). Nucleosome 4 is not stacked with nucleosome 2 and is increasingly mobile as the NRL increases. We were nevertheless able to refine the structure of nucleosome 4 as part of a tetranucleosome in the 4×177 array and also in isolation within the 4×187 array. The linker DNA connecting nucleosomes 3 and 4 was always visible and always showed the same trajectory as in the 4×177 structure.
56 ) nucleosome positioning sequences and variable linker DNA: 4×177 with 30-bp linker, 4×187 with 40-bp linker, 4×197 with 50-bp linker, and 4×207 with 60-bp linker. b, EMSA confirms that tetranucleosome arrays were reconstituted with saturating amounts of linker histone H1.4. Stoichiometry of H1 to nucleosome is denoted by H1:nuc. c, Structure of the 4×177 tetranucleosome array shows a zig-zag arrangement of nucleosomes, with nucleosomes 1 and 3 forming a stack and nucleosomes 2 and 4 extending from the stack. DNA is shown in gray and white, core histones in wheat, and H1 in purple.
Nucleosome stacking in solution. Previous work has revealed two main types of stacking interactions in nucleosome arrays 25,26 . Type I interactions are closely packed stacks with contacts between H2A-H2B dimers, and have been observed in the crystal structure of the 4×167 array without H1 (ref. 23 ) and within the tetranucleosome units of the 12×177 and 12×187 cryo-EM structures 25 (Fig. 2b). Type II interactions are more open, with slightly offset nucleosomes and the H4 N-terminal tail in close proximity to the acidic patch of the adjacent nucleosome, and have been observed in the 6×187 crystal structure with H1 (ref. 26 ) and between tetranucleosome units of the 12×177 and 12×187 cryo-EM structures 25 (Fig. 2b). Other stacking interactions have been observed for mononucleosomes in crystals 4 and by cryo-EM in solution 32 .
In our structures, we observe a compact stacking of nucleosomes 1 and 3 that is similar to type I interactions with a contact formed between H2A-H2B dimers (Fig. 2b)
Fig. 2 | Structure of trinucleosome cores of tetranucleosome arrays. a, The trinucleosome cores of the 4×177, 4×187, 4×197 and 4×207 structures. Nucleosome 2 is rotated relative to the stack in all structures and is located at a greater distance from the stack as the length of linker DNA increases. Color code used throughout. b, Nucleosome stacking in nucleosome arrays. Nucleosome stacking in tetranucleosome arrays is similar to the stacking observed in the crystal structure of the 4×167 array without H1 (ref. 23 ) and the cryo-EM reconstruction of the 12×177 and 12×187 arrays with H1 (ref. 25 ). Left, nucleosome stack from the 4×187 array represents stacks from all structures reported in this study. Middle, nucleosome stack from the 4×167 crystal structure (PDB 1ZBB (ref. 23 )) represents the type I interaction observed in the 4×167 crystal structure and within tetranucleosomal units of the 12×177 and 12×187 cryo-EM structures 25 . Right, nucleosome stack from the 6×187 crystal structure (PDB 6HKT (ref. 26 )) represents the type II interaction observed between tetranucleosome units of the 12×177 and 12×187 cryo-EM structures. Top, dyad axes drawn in green run almost parallel to the stacking observed in the cryo-EM reconstructions determined for 4×177, 4×187, 4×197 and 4×207, whereas dyad axes in the stack observed in both type I and type II interactions are slightly tilted toward each other. Bottom, the interface between stacking nucleosomes in the 4×177, 4×187, 4×197 and 4×207 structures reported here and in type I interactions consists of apposed H2A-H2B dimers (H2A in yellow, H2B in red), while in type II interactions the nucleosome stack is slightly offset and places the N-terminal part of H4 (green) near the H2A-H2B dimer.
Fig. 3 | NRL determines H1 binding to arrays. a, H1 binds to nucleosomes of the array near the nucleosome dyad. The N-terminal part of the α2-helix (Nα2) and the L3 loop contact the DNA around the dyad, whereas the α3-helix and the L1 loop interact with linker DNAs. H1 is rainbow-colored from the N (blue) to C (red) terminus, DNA is shown in white, and the histone octamer is shown in wheat. b, Focused-refined cryo-EM densities for nucleosomes 1, 2, 3 and 4, colored by NRL (4×177 blue, 4×187 green, 4×197 yellow, 4×207 red). H1 density is in purple. Nucleosomes are all viewed the same way. Entry and exit DNA are marked by a blue and a red dot, respectively. Focused-refined maps of nucleosome 4 could not be obtained for the 4×197 and 4×207 arrays owing to higher mobility. c, H1 N-terminal regions extend from the nucleosome stack in opposite directions. Residues regulating H1 mobility (K34 (ref. 34 ) and S35 (ref. 35 )) and heterochromatin formation (K26 and S27) 36 protrude from the nucleosome stack on both sides and are accessible for protein-protein interactions. The first ordered residue of H1 is S35; disordered residues are shown as a dashed line. DNA is shown in gray, histone octamer in wheat and H1 in purple.
nucleosome with the acidic patch of a stacked nucleosome and thus leaves the H4 tail free to engage in other interactions 33 . Whereas the inter-nucleosome interactions appear to be very similar, we note a slight relative tilting of the stacking nucleosomes that positions their dyad axes almost parallel, in contrast to type I interactions in which the dyads are slightly tilted toward each other (Fig. 2b). This difference might be due to the absence of H1 in the case of the 4×167 crystal structure 23 and the different binding mode of H1 to the nucleosome in the case of the 12-mer array with H1 (ref. 25 ).

H1 orientation and DNA interactions. Our structures show that H1 is always bound near the nucleosome dyad (Fig. 3 and Supplementary Figs. 3, 5, 7 and 9). In all ten focused-refined maps, H1 shows three DNA contacts, similar to what has been described 17-19 . The H1 loop L3 and the N-terminal part of helix α2 contact nucleosomal DNA near the dyad, helix α3 binds one linker DNA and loop L1 contacts the other linker DNA (Fig. 3a and Supplementary Figs. 3, 5, 7 and 9). This mode of H1 binding is referred to as on-dyad 17,19 , although H1 is located slightly off the dyad and is lopsided 18 . H1 that is bound to nucleosome 1 always contacts entering linker DNA via its helix α3 (Fig. 3b, Supplementary Figs. 5, 7 and 9 and Supplementary Videos 2-4), whereas H1 on nucleosome 3 uses α3 to contact exiting linker DNA (Fig. 3b, Supplementary Fig. 9 and Supplementary Video 4). In nucleosomes 2 and 4, the entering linker DNA is in contact with α3 (Fig. 3b). Thus, H1 can be oriented to contact either entering or exiting linker DNA, depending on local DNA geometry. The orientation of H1 influences the direction in which the unstructured N-terminal region of H1 exposes residues to post-translational modifications, such as K34 acetylation, S35 phosphorylation, K26 methylation and S27 phosphorylation 9 (Fig. 3c). This places N-terminal H1 residues that have been shown to be important for either H1 mobility 34,35 or heterochromatin formation 36 at the surface of the nucleosome stack, where they are accessible to modifiers and binding partners even in the presence of a nucleosome stack.
H1 binding relates to nucleosome repeat length. The major difference between our four structures relates to the binding of the H1 histone to the different nucleosomes of the arrays (Fig. 3b). The H1 histone is present on nucleosome 2 in all four structures, and is also observed on nucleosome 4 in all cases where this nucleosome is structurally resolved. In contrast, the presence of H1 on the stacked nucleosomes 1 and 3 differs between the four arrays. H1 is absent from the stacked nucleosomes of the 4×177 array, but is present on nucleosome 1 in the 4×187 and 4×197 arrays, and is present on both stacked nucleosomes in the 4×207 array. Thus, histone H1 is bound to non-stacked nucleosomes in all structures, whereas H1 binding to stacked nucleosomes is enabled only as the NRL increases.
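The occupancy pattern described above can be summarized as a small lookup table. The following is only an illustrative transcription of the text, not data taken from the deposited maps: the names `H1_OCCUPANCY`, `h1_on_stacked` and `h1_observed_total` are hypothetical, nucleosome 4 is marked as unresolved for the 4×197 and 4×207 arrays, and H1 is taken to be absent from nucleosome 3 in the 4×187 and 4×197 arrays because the text reports it only on nucleosome 1 there.

```python
# H1 occupancy per nucleosome, keyed by NRL (bp), transcribed from the text.
# True = H1 observed, False = H1 not observed,
# None = nucleosome not resolved by focused refinement.
H1_OCCUPANCY = {
    177: {1: False, 2: True, 3: False, 4: True},
    187: {1: True, 2: True, 3: False, 4: True},
    197: {1: True, 2: True, 3: False, 4: None},
    207: {1: True, 2: True, 3: True, 4: None},
}

STACKED = (1, 3)  # nucleosomes 1 and 3 form the stack in all four structures


def h1_on_stacked(nrl: int) -> int:
    """Number of stacked nucleosomes with H1 observed for a given NRL."""
    return sum(H1_OCCUPANCY[nrl][n] is True for n in STACKED)


def h1_observed_total() -> int:
    """Number of nucleosomes with H1 observed across all four arrays."""
    return sum(
        state is True
        for array in H1_OCCUPANCY.values()
        for state in array.values()
    )


for nrl in sorted(H1_OCCUPANCY):
    print(f"4x{nrl}: H1 on {h1_on_stacked(nrl)} of 2 stacked nucleosomes")
```

Under this reading, H1 occupies zero, one, one and two stacked nucleosomes as the NRL increases from 177 to 207 bp, and the total of ten H1-bound nucleosomes would be consistent with the ten focused-refined maps mentioned earlier.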
To confirm that our observations are not a result of low salt concentrations, we solved the trinucleosome core structure of the H1-bound 4×177 array at 60 mM NaCl and confirmed the presence of nucleosome stacks and the absence of H1 on stacking nucleosomes (Supplementary Fig. 11). We also probed H1 binding to reconstituted 4×177, 4×187, 4×197 and 4×207 arrays biochemically at 150 mM NaCl and observed that the extent of H1 binding increased with increasing linker length, in line with our structural observations (Supplementary Fig. 12). In conclusion, an increase in NRL is related to stable binding of more H1 copies.
H1 binding depends on linker DNA trajectory. These observations suggested that linker DNA trajectory determines whether H1 can bind to nucleosomes within an array. We therefore analyzed the linker DNA trajectory at the entry and exit sites of the stacked nucleosomes in all structures. This analysis revealed a progressive change in the trajectory of linker DNA as the NRL increased (Fig. 4). To quantify this, we measured the angles α and β that define linker DNA geometry as described 18 (Methods and Fig. 4b). Of particular importance here was angle β, formed between the nucleosome dyad and the linker DNA duplex axis. We also calculated the differences in angles, Δα and Δβ, which are the deviations of the angles α and β, respectively, observed in our structures from those in an isolated H1-bound nucleosome (PDB 7K5Y (ref. 19 )).
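To make the Δβ definition concrete, the following is a minimal sketch and not the procedure used in this study, which follows ref. 18 and the Methods. It assumes that the dyad axis and the linker DNA duplex axis are already available as 3D direction vectors, and it takes Δβ as the angle in the array minus the reference angle; the function names, the sign convention, the example vectors and the reference value are all hypothetical.

```python
import math


def angle_deg(u, v):
    """Angle in degrees between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


def delta_beta(dyad_axis, linker_axis, reference_beta_deg):
    """Deviation of beta (dyad vs linker duplex axis) from a reference value.

    Assumed convention: beta in the array minus beta in the reference structure.
    """
    return angle_deg(dyad_axis, linker_axis) - reference_beta_deg


# Made-up example: linker axis at 45 degrees to the dyad, reference beta of 40 degrees.
print(round(delta_beta((1, 0, 0), (1, 1, 0), 40.0), 3))
```

In practice the two axes would have to be fitted to the atomic model first, for example from base-pair centers along the linker duplex; that fitting step is outside the scope of this sketch.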
Our analysis showed that Δβ is a good predictor for histone H1 binding on stacked nucleosomes (Fig. 5). When Δβ was close to zero for both linker DNAs emerging from a nucleosome, H1 binding was observed (Fig. 5a). We found low Δβ values at nucleosome 2 and Δβ values of less than 6° at nucleosome 4, where H1 was always observed (Supplementary Table 1). However, when Δβ was higher, H1 was not bound, likely because a stabilizing contact between loop L1 and linker DNA could not be formed. Particularly high Δβ values are found for entry DNA at nucleosome 3, except for the 4×207 array, which is the only array where H1 is observed on nucleosome 3 (Fig. 5b). Furthermore, exit DNA of nucleosome 1 shows the highest Δβ value for the 4×177 array, which is the only array in which H1 is lacking on this nucleosome (Fig. 5c). In summary, as the NRL increases, nucleosome 2 moves farther away from the stacked nucleosomes and the trajectories of linker DNA at nucleosomes 1 and 3 progressively approach canonical values (Δβ = ~0) (Fig. 5a). As a consequence, H1 can contact linker DNA, explaining H1 binding to stacked nucleosomes in arrays with longer NRLs (Fig. 6).

Discussion
We present cryo-EM structures of tetranucleosome arrays with different NRLs in the presence of the human linker histone H1 variant H1.4. The structures reveal a typical zig-zag arrangement of nucleosomes 23-26 , with a trinucleosome core consisting of two stacked nucleosomes 1 and 3 and a more flexible connecting nucleosome 2, suggesting that a trinucleosome may be a fundamental unit in chromatin 37 . The zig-zag arrangement is observed also in our 4×207 structure, in line with observations from in-cell mapping of DNA contacts 38 . Stacked nucleosomes have also been observed by structural studies of tetranucleosomes, trinucleosomes and free mononucleosomes in solution 32,39-42 . Stacking of nucleosomes 1 and 3 is apparently stabilized by H1 binding to nucleosome 2, because a published structure of a 3×177 trinucleosome array lacking H1 adopts a non-stacked, extended conformation 41 . Our observation of a single nucleosome stack is consistent with small angle X-ray scattering (SAXS) analysis of tetranucleosomes 42 and hexanucleosome arrays that showed limited compaction 26 . Similar to previous structures of nucleosome arrays 23,25,26 , the structures presented here use NRLs that correspond to those found in vivo 7 and that differ by integer repeats of the approximate helical repeat of DNA (10n bp linkers with n being a natural number). However, alternative structures of trinucleosomes and tetranucleosomes certainly exist in vivo, and it will be important to study arrays with other linker lengths in the future 43 . Our major finding here is how the NRL of a nucleosome array relates to H1 binding to the array. It has long been known that there is a correlation between the NRL and the amount of associated H1 (refs. 12,13 ). Additionally, in vitro experiments showed that chromatin with closely spaced nucleosomes does not incorporate H1, whereas chromatin with more widely spaced nucleosomes does 44 , but the reasons for this remained elusive.
We now report structures that show that short NRLs impair H1 binding (Supplementary Fig. 12) to stacked nucleosomes and suggest this is due to altered linker DNA trajectories. Altered linker DNA trajectories, as observed in our 4×177 array, sterically preclude H1-linker DNA contacts that are required for stable H1 binding 17-19 . A similar observation was made in the structure of a nucleosome containing the H3 variant CENP-A, where an altered linker DNA trajectory has been observed 45 that precludes H1 binding 41,45,46 . We show that, with increasing NRL, the linker DNA emerging from the stacked nucleosomes is more relaxed and permits stable H1 binding. Therefore, whereas H1 may transiently bind all nucleosomes of the four arrays (Fig. 1b), binding to nucleosomes might be destabilized in short NRL arrays and easily disrupted during cryo-EM sample preparation. We observe canonical on-dyad H1 binding as described 17-19 , in contrast to the off-dyad position of H1 found in tetranucleosome units of 12-mer arrays 25 that is possibly a result of chemical crosslinking 42 .
For each nucleosome, Δα and Δβ describe the difference in α and β, respectively, between isolated H1-bound mononucleosomal linker DNA (PDB 7K5Y (ref. 19 )) and the linker DNA of the nucleosomes in the tetranucleosome array (Supplementary Fig. 1). a, A plot of a nucleosome's average Δα against its average Δβ reveals that nucleosomes not bound by H1 (ocher) separate well from the population of nucleosomes bound by H1 (purple). For nucleosome 3, they move closer to this population with increasing NRL. b, Δβ for nucleosome 3 entry DNA reveals a decrease with increasing NRL. c, Δβ for nucleosome 1 exit DNA reveals a decrease with increasing NRL. For the depicted nucleosomes, an overlay of the 4×177 nucleosome (blue) and the isolated H1-bound nucleosome (gray) is shown and Δβ for the different NRL arrays is listed, with bound H1 indicated by purple asterisks.
Our results have important implications for understanding the relationship between the NRL of a genomic region and its transcriptional activity. In particular, the short NRLs that are characteristic of active promoter regions and transcriptionally active gene bodies 7,8,15 may preclude H1 from binding to stacked nucleosomes. This could explain the observed depletion of H1 from active promoters 15,47,48 that likely facilitates assembly of the RNA polymerase II (Pol II) transcription machinery and passage of Pol II through chromatin 14 . The NRL of nucleosome arrays can be defined by chromatin remodeling enzymes 15,49 , and thus remodelers may indirectly deplete H1 by setting short NRLs, thereby complementing other mechanisms of H1 depletion 9,14 and rendering chromatin permissive to transcription.
Finally, long NRLs are found in heterochromatin regions 7,8,15 , which seems counterintuitive because long NRLs should expose more DNA to the transcription machinery but heterochromatin is transcriptionally silent. Our findings settle this apparent contradiction. We find that longer NRLs are required to enable H1 binding to all nucleosomes of an array, thereby stabilizing nucleosomes and inhibiting chromatin remodeler activity 19,50 . Binding of H1 in turn widens the nucleosomal footprint against which remodelers move neighboring nucleosomes 51-53 and thus would increase the NRL. Other H1-dependent mechanisms contribute to heterochromatin formation and transcriptional silencing 9,54,55 . For example, recruitment of DNA methyltransferases can downregulate transcription 54 , and heterochromatin protein 1 (HP1) binds to methylated H1 residue K26 (ref. 36 ) and may bridge H1-bound nucleosome stacks to facilitate heterochromatin formation and explain transcription repression.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41594-022-00768-w.