Probing the target search of DNA-binding proteins in mammalian cells using TetR as model searcher

Many cellular functions rely on DNA-binding proteins finding and associating to specific sites in the genome. Yet the mechanisms underlying the target search remain poorly understood, especially in the case of the highly organized mammalian cell nucleus. Using as a model Tet repressors (TetRs) searching for a multi-array locus, we quantitatively analyse the search process in human cells with single-molecule tracking and single-cell protein–DNA association measurements. We find that TetRs explore the nucleus and reach their target by 3D diffusion interspersed with transient interactions with non-cognate sites, consistent with the facilitated diffusion model. Remarkably, nonspecific binding times are broadly distributed, underlining a lack of clear delimitation between specific and nonspecific interactions. However, the search kinetics is not determined by diffusive transport but by the low association rate to nonspecific sites. Altogether, our results provide a comprehensive view of the recruitment dynamics of proteins at specific loci in mammalian cells.

Key cellular functions, such as transcription, replication and repair, are governed by the association of DNA-binding proteins (DBPs) to specific DNA sequences in the genome. Understanding the mechanisms by which DBPs find their target sites (~10 bp long) within genomes made of millions to billions of base pairs, and quantifying the associated search kinetics, is pivotal to analyse biochemical reactions and their regulation in living cells 1,2. The search process can be schematically divided into two steps: transport through the nucleus followed by biochemical association to the target. Historically, much attention has been focused on the kinetics of the transport mechanisms of DBPs, trying to explain the 'faster-than-diffusion' association rate reported for LacI 3. The predominant target search (TS) model 4, called facilitated diffusion (FD), postulates that the search results from one-dimensional (1D) sliding events (with duration t_1D), during which proteins diffuse along nonspecific DNA sequences, interspersed with phases (with duration t_3D) of three-dimensional (3D) diffusion 5,6. The TS mechanisms have been the subject of controversies 7, but single-molecule (SM) experiments have recently provided supporting evidence for the FD model in vitro 8-11 and in E. coli cells 12,13. In eukaryotic cells, where DNA is packaged into chromatin fibres and the nucleus is highly compartmentalized and organized 14,15, DBPs have to identify their targets among a much larger number of non-cognate sites than in prokaryotes 16. While recent experiments have pointed to the role of nonspecific interactions in the dynamics of nuclear factors in mammalian nuclei 17,18, the role of FD remains poorly understood 19. Moreover, the binding efficiency once the DBPs reach the specific site is often not considered, although it is essential to determine which step, diffusive transport or binding, limits the association kinetics in the nucleus of mammalian cells.
Here we quantitatively investigate the TS of DBPs in human cells. We use Tet repressors (TetRs) as searchers in the nucleus of human cells in which a gene array serves as target locus. With this model system, we can unequivocally distinguish the role of specific and nonspecific interactions. Thereby, we identify the role of FD in the TS and measure transport and binding parameters underlying the search kinetics. We find that nonspecific binding times are broadly distributed, with no clear delimitation between specific and nonspecific interactions. In addition, we measure at the single-cell level the association kinetics to the target locus. Importantly, the association rate is not limited by transport but, instead, by the binding to nonspecific DNA sequences. Altogether, our results provide a quantitative description of the TS dynamics and shed a new light on the factors controlling the search kinetics.

Results
A single-molecule TS assay in human cells. Since the initial observations of the rapid mobility of nuclear factors 20 , the dynamics of proteins in the nucleus has been addressed using a variety of experimental techniques, either at the population or, more recently, at the single-molecule level (see ref. 21 for a review). Here we chose to probe the search dynamics of DBPs in human cells by means of a single-molecule assay 17,18,22 , because SM experiments are able to quantify the stochasticity and heterogeneity of molecular interactions with high spatial and temporal resolution.
Most endogenous DBPs have multiple specific binding sites, scattered all over the genome. Besides particular cases, such as transcriptionally productive domains (ref. 23), imaging the specific sites and locating them exhaustively remain challenging. Therefore, we used an engineered cellular system in which sites of specific and nonspecific interactions could be unambiguously distinguished. As model searcher, we chose the bacterial TetR protein 24, a DBP that recognizes a 19 bp long specific DNA sequence (tetO) with high affinity (the TetR-tetO binding constant measured at physiologically relevant ionic strength, 160 mM NaCl, is 2 × 10^11 M^-1, ref. 25), and that is widely used for transcriptional control in eukaryotic systems 26. Importantly, TetR has no specific binding sites in the human genome and its affinity for DNA can be tightly controlled with an allosteric inducer, doxycycline (Dox), which decreases TetR affinity for tetO by nine orders of magnitude 27. We probed TetR TS dynamics in the nucleus of human osteosarcoma cells (U2OS) carrying, at a single locus in the genome (target), repeated insertions of tetO binding sites (Supplementary Note 1). The use of repeated insertions was necessary for detecting TetRs bound at specific sites and discriminating them from molecules unbound or associated to non-cognate sites. In practice, we used U2OS 2-6-3 cells 28 with 200 inserts of a gene cassette, each also containing 256 lacO and 96 tetO repeats, and U2OS 4A cells with 30 insertions of 7 tetO sites (Fig. 1a and Supplementary Fig. 1). TetR proteins were purified and site-specifically labelled with Atto647N, a bright organic dye (Supplementary Note 2). The labelled proteins maintained their functionality in vitro (Fig. 1b) and, once microinjected in the nucleoplasm of U2OS 2-6-3 cells (Supplementary Movie 1), they were able to specifically bind to the target locus (identifiable with fluorescent LacI proteins 29, Fig. 1c).
Moreover, on addition of Dox (Fig. 1d) [...]
TetR mobility analysis by single-particle tracking. To investigate TetR-Atto647N dynamics, we injected them at low concentration (10-500 molecules per nucleus) and tracked single proteins at 197 frames per second with ~25 nm localization accuracy (Fig. 2a) [...] The diffusive properties of quasi-immobile molecules (characterized by the diffusion coefficient D_3) closely matched those of chromatin itself, as observed by tracking chromatin-bound markers such as histones H2B 22,31. Also, the sites of immobilization did not show a preferential localization in the nucleus (blue points in bottom right image of Fig. 2a), ruling out that they correspond to a particular locus. The quasi-immobile molecules, as well as proteins bound at the target locus (yellow points in bottom right image of Fig. 2a), could be fully eliminated on incubation of the cells with Dox (at 2.5 μg ml^-1) or when saturating the TetR DNA binding domain (DBD), before injection, with short double-stranded DNA fragments containing the specific tetO sequence (Fig. 2b, second and third panel). Structural and electrostatic considerations support the view that specific and nonspecific DNA interactions occur via the DNA-binding helix-turn-helix motif of the TetR protein, which is the only positively charged domain in this protein. Thus, both types of interactions are expected to be modulated by Dox or by loading the DBD with tetO-containing oligos. Therefore, in the following, we considered that the population of quasi-immobile molecules corresponded to proteins nonspecifically bound to chromatin. In contrast, when TetR-Atto647Ns were co-injected with a 1,000-fold excess of unlabelled TetR, only the association to the specific locus was reduced (Fig. 2b, bottom panel), while the diffusion properties of the fluorescent proteins were unchanged and the quasi-immobile population preserved.
This indicates that the nonspecific binding sites, apart from being scattered all over the nucleus, could not be saturated. This observation is consistent with the expected high abundance of nonspecific sites (the concentration c_DNA of base pairs of genomic DNA, accessible or not, in a 500 μm^3 nucleus of human cells is ~10^-2 M). It should be noted that on Dox treatment or co-injection with tetO oligos, the diffusion coefficients of the two mobile populations (red and green, in Fig. 2b,c) were mostly unchanged. In the case of Dox treatment, we observed an increase of the relative abundance of the intermediate population (green coloured, characterized by D_2). This might be due to some residual very rapid TetR-DNA interactions via the protein DBD, which are instead completely abolished by steric hindrance in the case of tetO co-injection. Nevertheless, the fact that in this latter case the intermediate population is still present indicates that the subdiffusive as well as the confined motion observed for the intermediate population is not only mediated by DNA interactions via the DBD of TetR but instead might depend on other factors such as protein-protein interactions (including crowding effects 33) or the local nuclear architecture 34,35. Indeed, recent reports have pointed out that the nuclear milieu has a role per se in controlling the diffusivity even of inert tracers 36,37. DBPs have been shown to diffuse slower in heterochromatin regions 38 and fast diffusing proteins tend to be excluded from H2B histone-enriched nuclear regions 39.
Overall, we concluded that the molecules partitioned between those diffusing in the nucleoplasm (fast and intermediate populations) and those interacting nonspecifically with DNA (population of quasi-immobile proteins). The total fraction (f_1 + f_2) ~75% of diffusing proteins (consistent with sptPALM and FRAP measurements, Supplementary Notes 5 and 6) is equal to t_3D/(t_1D + t_3D). This leads to t_3D ~ 3 t_1D and places a constraint on the relative values of the dissociation rate (1/t_1D) and of the association rate (1/t_3D). Note that with current single-molecule tracking techniques, imaging cannot provide direct evidence of local sliding movement (or absence thereof) of the protein during nonspecific binding events.

Kinetics of TetR interactions with non-cognate DNA.
To closely examine the kinetics of nonspecific interactions and estimate t_1D, we analysed in individual trajectories (longer than 0.5 s) the time course of the instantaneous diffusion coefficient computed over an 80 ms running window (Supplementary Note 7). In ~50% of these trajectories, we could identify events in which the protein switched from fast or intermediate diffusion dynamics to a much slower motion (Fig. 3a, upper panel, and Supplementary Movie 7). These events, which were not observed in the presence of Dox or for co-injection with tetO oligos (Supplementary Fig. 3), likely correspond to molecules transitioning between 3D diffusion and nonspecific interactions with DNA. The distribution P of their duration was approximately monoexponential with a decay rate of 6.7 s^-1 (Fig. 3b).
After correction for the photobleaching rate (~0.34 s^-1 in our imaging conditions, Fig. 3b), this decay corresponds to a short nonspecific residence time t_RS of ~158 ms. Besides the rapid nonspecific dissociation processes, we also noted the occurrence of less frequent but longer nonspecific binding events with duration often exceeding several seconds (Fig. 3a, bottom panel). Yet with continuous imaging, estimating their duration was delicate due to the difficulty of discriminating between dissociation and photobleaching. We thus recorded time-lapse movies in which 5 ms exposure time images were interspersed with dark periods of duration t_TL equal to 0.1, 0.5 or 1 s. In time-lapse movies, TetR proteins were considered bound when they did not move by more than 1 pixel (160 nm) over at least two consecutive images. For each data set, we computed the survival probability (SP): SP(t) = ∫_t^∞ P(t') dt', that is, the probability to stay bound for a time longer than t (namely the complementary cumulative probability). The full SP, which includes the long binding events along with the short ones, was obtained by renormalizing the SPs measured in time-lapse movies with the value of the SP obtained for continuous imaging at 1 s (Supplementary Note 7). Beyond ~0.1 s, the SP curve markedly deviated from the exponential behaviour (Fig. 3c). Up to ~100 s, a time comparable to the specific binding time (t_SPE ~60 s) of TetR to tetO as measured in vitro (Supplementary Fig. 4), it decreased as a power law t^g with g ~ -0.7.
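The two quantities used in this paragraph can be illustrated with a minimal Python sketch. It assumes that the photobleaching correction is a simple subtraction of rates (observed decay rate minus bleaching rate), which is consistent with the ~158 ms residence time quoted in the text but is not confirmed as the authors' exact procedure; the function names are hypothetical.

```python
import numpy as np

def survival_probability(dwell_times, t_grid):
    """Empirical SP(t): fraction of binding events lasting longer than t."""
    dwell = np.sort(np.asarray(dwell_times, dtype=float))
    # With side="right", searchsorted counts events of duration <= t.
    n_le = np.searchsorted(dwell, t_grid, side="right")
    return 1.0 - n_le / dwell.size

def corrected_residence_time(k_observed, k_bleach):
    """Mean residence time (s), assuming k_observed = k_off + k_bleach."""
    return 1.0 / (k_observed - k_bleach)

# Rates in s^-1 taken from the text (6.7 observed, ~0.34 photobleaching).
t_rs = corrected_residence_time(6.7, 0.34)
print(f"residence time ~ {t_rs * 1e3:.0f} ms")
```

On real data, `dwell_times` would hold the measured durations of the slow-motion events, and `t_grid` the times at which the SP curve is evaluated.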
We first wondered whether the observed power law behaviour in the dissociation times could arise from microscopic hopping events where proteins unbind, with a single off rate, from nonspecific sites but quickly reattach to the same or to a closely located site, and which are misinterpreted as long events due to the limited spatial (25 nm) and temporal (10 ms) resolution of our imaging system. Yet, while hopping events possibly occur, they cannot account for the broad distribution of nonspecific binding times (Supplementary Note 7). On the contrary, our results strongly suggest a broad heterogeneity in nonspecific TetR-DNA interactions. Indeed, a well identified source of heterogeneity in the cell nucleus is the variability of DNA sequences-and of the corresponding binding energies 6 -and a power law distribution of binding times is expected in the case of exponentially distributed binding energies 40,41 .
Whereas most nonspecific binding events are short, denoting low affinity interactions with non-cognate sites, we used the Basic Local Alignment Search Tool (BLAST) algorithm 42 to align the tetO sequence to the human genome and assess the possible existence and abundance of quasi-consensus sites for TetR. While the full length recognition sequence is not present in the human genome, we found thousands of sites that only partially differ from the tetO sequence (Supplementary Data 1). For instance, we found two sites (in chromosomes 4 and X) differing by two mismatches from the canonical 19 bp long tetO sequence and, respectively, 4 and 23 sites with 18 and 17 bp similarity to tetO and a single mismatch. Overall, the number of scored sites increased exponentially for shorter alignment lengths (Fig. 3d). Even though the degree of similarity between a sequence and tetO does not directly relate to the TetR binding affinity, these quasi-consensus sequences are definitely potential candidates to act as stable off-target (decoy) sites. Furthermore, not only could quasi-consensus sequences behave as decoy sites, but certain repeat symmetries (outside of the specific binding sites) can also control DNA-binding preferences, as recently shown by high throughput protein-DNA binding analysis 43. We thus propose that the power law kinetics results from a broad heterogeneity of nonspecific binding sites (and binding energies) encountered by TetR in the human genome, ranging from very short residence time on completely random sequences (t_RS ~158 ms) to stable binding on sites that only partly differ from the 19 bp long tetO specific sequence (Fig. 3e).
To evaluate the mean nonspecific binding time t_1D, we numerically integrated the survival probability SP distribution and found t_1D to be ~2 s, consistent with FRAP observations (Supplementary Note 6). Given that t_3D ~ 3 t_1D, the mean 3D diffusion time is ~6 s, meaning that the nonspecific sites are visited at a rate 1/(t_1D + t_3D) ~0.12 s^-1. Importantly, t_3D is much longer than 1/(D a c_DNA) ~10^-5 s, the time estimated for a diffusion-limited nonspecific association in the nucleus (with a = 1 nm and the diffusion coefficient D on the order of D_1). The high value of t_3D (or, equivalently, the low association rate 1/t_3D) is possibly due to the inaccessibility of a large fraction of genomic DNA, to the inefficient binding of TetR to non-cognate sequences or to a combination thereof.
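The arithmetic in this paragraph can be checked with a short Python sketch. It is only an illustration: the value D = 8 μm^2 s^-1 is an assumption (back-calculated from D_n = 6 μm^2 s^-1 and a 75% mobile fraction), the trapezoidal rule stands in for whatever integration scheme was actually used, and the function names are hypothetical.

```python
import numpy as np

N_A = 6.022e23  # Avogadro constant, mol^-1

def mean_time_from_survival(t, sp):
    """Mean bound time = integral of SP(t) dt, by the trapezoidal rule."""
    t = np.asarray(t, dtype=float)
    sp = np.asarray(sp, dtype=float)
    return float(np.sum(0.5 * (sp[1:] + sp[:-1]) * np.diff(t)))

def diffusion_limited_time(d_um2_s, a_um, c_molar):
    """1/(D a c): time scale of a diffusion-limited encounter, in seconds."""
    sites_per_um3 = c_molar * N_A / 1e15  # 1 litre = 1e15 um^3
    return 1.0 / (d_um2_s * a_um * sites_per_um3)

t_1d = 2.0               # s, mean nonspecific binding time (from the text)
mobile_fraction = 0.75   # f_1 + f_2 (from the text)
t_3d = t_1d * mobile_fraction / (1.0 - mobile_fraction)  # s
visit_rate = 1.0 / (t_1d + t_3d)                         # s^-1

# Assumed D = 8 um^2/s; a = 1 nm and c_DNA = 1e-2 M come from the text.
t_diff = diffusion_limited_time(8.0, 1e-3, 1e-2)
print(t_3d, visit_rate, t_diff)
```

The comparison of `t_3d` (seconds) with `t_diff` (tens of microseconds) is what supports the statement that nonspecific association is far from diffusion limited.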
Association kinetics at the target locus. Having obtained a detailed view of the mechanisms governing protein mobility away from the specific sites, we next aimed to directly measure the search rate or, equivalently, the association rate constant of DBPs to the target locus in living cells. To this end, we used the Reverse TetR (RevTetR), a TetR point mutant for which DNA binding is inversely regulated by Dox 44,45 and which has diffusive properties similar to those of TetR (in reversed Dox conditions, Supplementary Note 5). Importantly, RevTetR permits a more accurate temporal control on the association process than TetR (Supplementary Note 8). After Dox addition (at 2.5 μg ml^-1) to U2OS 2-6-3 cells stably expressing RevTetR-GFP, we observed an accumulation of RevTetR-GFPs at the target locus (Fig. 4a and Supplementary Movie 8). The onset and increase of the fluorescence signal were independent of Dox concentration (over the range 2.5 to 125 μg ml^-1, Supplementary Fig. 5), ruling out any possible influence of membrane permeability to Dox or RevTetR-GFP affinity for Dox on the association kinetics. The fluorescence time course at the locus was well fitted by a single exponential with an observed rate k_Obs (Fig. 4b). In each measured nucleus, we correlated k_Obs with the concentration c of RevTetR-GFP, measured with two-photon fluorescence correlation spectroscopy (FCS) 46 (Fig. 4b inset and Supplementary Note 8) before RevTetR-GFP induction by Dox (when ~90% of proteins are mobile in the nucleoplasm, Supplementary Notes 5 and 8). Over the 1-600 nM concentration range, k_Obs increased linearly with c (Fig. 4c, red circles) with a slope corresponding to the association rate constant k_a equal to (9.2 ± 0.2) × 10^4 M^-1 s^-1. Thus, the average time for a single TetR protein to bind to the array in a nuclear volume of ~500 μm^3 is on the order of 3 × 10^6 s (that is, ~35 days), during which it visits ~10^5 nonspecific sites.
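The conversion from the measured rate constant to a single-molecule search time can be sketched as follows. This is a back-of-the-envelope illustration, assuming that one protein in the nuclear volume corresponds to a molar concentration 1/(N_A V) and that the visit rate of ~0.12 s^-1 applies throughout the search; the function name is hypothetical.

```python
N_A = 6.022e23  # Avogadro constant, mol^-1

def single_molecule_search_time(k_a, volume_um3):
    """Mean time (s) for one molecule to bind, given k_a in M^-1 s^-1."""
    volume_litre = volume_um3 * 1e-15          # 1 um^3 = 1e-15 litre
    concentration = 1.0 / (N_A * volume_litre)  # molar, one molecule
    return 1.0 / (k_a * concentration)

t_search = single_molecule_search_time(9.2e4, 500.0)  # values from the text
days = t_search / 86400.0
nonspecific_visits = t_search * 0.12  # visit rate in s^-1, from the text
print(f"{t_search:.2e} s, {days:.0f} days, {nonspecific_visits:.1e} visits")
```

The result falls in the same range as the figures quoted in the text (a few million seconds, a few tens of days, of order 10^5 visits).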
Modelling the search rate k_a. Given the unexpectedly low value measured for the association rate constant, we next focused on modelling the search kinetics in more detail. In our experiments, the target is composed of N binding sites that appear clustered within an approximately spherical locus with radius r_t ~350 nm, as determined by super-resolution microscopy imaging (Fig. 4d, Supplementary Movie 9 and Supplementary Note 9). To model the association rate constant k_a, we first hypothesized that the searchers perform an effective 3D diffusive motion, with a coefficient D_n = D_1 · t_3D/(t_1D + t_3D) = 6 μm^2 s^-1 that takes into account the slowdown due to intermittent nonspecific binding away from the locus. For simplicity, we assumed that the binding sites (each of radius a = 1 nm) are uniformly distributed within the sphere. The association rate constant can then be written as a function of k, the probability per unit time to bind to a target sequence when the searcher is within the locus (see Supplementary Note 10). In the limit N ≫ 1, k can be evaluated similarly to k_a by computing the inverse mean search time to find a target sequence of size a in the elementary volume v = 4πr_t^3/(3N) (that is, the mean volume per target within the spherical locus), yielding an expression that depends on a parameter l. Here l, which accounts for the binding efficiency, is the probability per unit time that binding occurs if a protein is within a distance a of a target sequence. For perfect association (that is, binding occurs at the first encounter of the protein with its target sequence), l is very large (≫ D_n/a^2) such that k is equal to 4πD_n a/v. Since r_t √(k/D_n) = √(3aN/r_t) ≫ 1, the rate k_a reduces to 4πD_n r_t, the value for an absorbing sphere of radius r_t. Given the measured values of D_n and r_t, k_a should be ~2 × 10^9 M^-1 s^-1, more than four orders of magnitude larger than experimentally observed. This suggests that the binding is very inefficient at the target sites (consistent with our measurement of the nonspecific association rate 1/t_3D) and that the proteins need many encounters before stably associating to DNA. Importantly, in the low binding efficiency regime (l ≪ D_n/a^2), k ≈ lNa^3/r_t^3 and the association rate constant k_a ≈ (4π/3)lNa^3 is no longer determined by the transport properties.
Figure caption (model of the TS process, cf. Fig. 4f). DBPs explore the nucleus by alternating between 3D diffusion and association to off-target sites (which in some cases behave as decoy sites) until they associate in the vicinity of the target and finally slide along the DNA (dotted arrow) and bind to the specific binding site. During the TS process, the rate limiting step is the nonspecific association to DNA (black arrows), while once engaged on the DNA they can effectively associate to the specific site (red arrow).
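The two display equations of this paragraph are not reproduced in the present text. As a hedged illustration only, the expressions below are one standard form (a uniformly reactive sphere, and a partially reactive target in an elementary volume) that reproduces every limit quoted above; they are an assumption made here, not the authors' equations, which are given in their Supplementary Note 10. The symbols follow the text (k_a, k, l, D_n, r_t, a, N, v).

```latex
% Assumed forms, chosen only because they match the limits stated in the text.
\[
k_a \;=\; 4\pi D_n r_t \left[\, 1 - \frac{\tanh\!\left(r_t\sqrt{k/D_n}\right)}{r_t\sqrt{k/D_n}} \,\right],
\qquad
\frac{1}{k} \;=\; v \left[\, \frac{1}{4\pi D_n a} + \frac{3}{4\pi a^{3}\, l} \,\right],
\qquad
v = \frac{4\pi r_t^{3}}{3N}.
\]
% Limits:
%   l \gg D_n/a^2 :  k \to 4\pi D_n a / v,  so  r_t\sqrt{k/D_n} = \sqrt{3aN/r_t} \gg 1
%                    and  k_a \to 4\pi D_n r_t   (absorbing sphere).
%   l \ll D_n/a^2 :  k \approx l N a^3 / r_t^3,  and, using 1 - \tanh(x)/x \approx x^2/3,
%                    k_a \approx (4\pi/3)\, l N a^3   (independent of D_n).
```

With these forms, the transport-limited and reaction-limited regimes discussed in the text appear as the two limits of a single expression, which makes explicit why k_a stops depending on D_n when l is small.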
To further verify that the association kinetics is reaction limited, we expressed the LacI fused to the transactivator domain VP16 (ref. 47) in U2OS 2-6-3 cells. This leads to a decondensation of the binding locus and the expression of the reporter gene CFP-SKL (Supplementary Fig. 6). The extension of the target locus increased to about 3 μm (Fig. 4d, Supplementary Movie 9 and Supplementary Note 9). Yet, this massive decompaction, and the associated modification in the large-scale organization of chromatin, had only a minimal effect on the association kinetics, which remained unchanged in most cells (Fig. 4c, brown circles). In other words, the search kinetics is largely independent of the 3D organization of the binding sites even for highly compacted conformations, which rules out a TS kinetics limited by diffusive transport in the nucleus.
The observation of nonspecific binding events in individual trajectories raised the possibility that the searchers did not reach the target site solely via 3D motion but also using a local 1D exploration in the vicinity of the tetO sites. To probe the potential role of 1D motion, we measured the search kinetics in the case of a different organization of the target locus using U2OS 4A cells. In this case, the target locus appeared as a ~200 nm diameter spot (Fig. 4d, Supplementary Movie 9 and Supplementary Note 9) and we measured an association rate constant equal to (2.2 ± 0.1) × 10^4 M^-1 s^-1 (Fig. 4c, blue circles), only a factor ~4.2 lower than in U2OS 2-6-3. However, the locus occupancy at equilibrium (that is, long after Dox induction) diminished by a factor ~90, scaling with the total number of tetO sites (Fig. 4e and Supplementary Note 8). In other words, the search kinetics approximately scaled with N_i, the number of inserts (200 versus 30), rather than with N, the number of sites (19,200 versus 210), as would be expected for a reaction limited by the specific binding efficiency to the target. This observation means that the search process cannot be purely 3D and suggests that prior to association to specific sites, TetRs slide along the DNA flanking regions before reaching their target. As reported for LacI proteins in E. coli 13, multiple binding sites behave as a single target when they are located within a genomic distance smaller than the sliding length. Our experiments indicate that the TetR sliding length exceeds the distance between tetO sites in each insert (23 bp both in U2OS 2-6-3 and U2OS 4A cell lines) while remaining well below the distance between inserts (~16.5 kbp in U2OS 2-6-3 cells and ~4 kbp in U2OS 4A cells).
To evaluate the plausibility of this scenario, we estimated the sliding length l_SL = 2√(D_SL · t_RS) by considering a nonspecific interaction time t_RS = 158 ms and assuming a 1D diffusion coefficient D_SL ~10^5-10^6 bp^2 s^-1, similar to the values reported for several different DBPs diffusing along naked B-DNA 9,48 and on a chromatin lattice 49. This leads to a value of the sliding length in the range 250-750 bp, thus compatible with the experimental boundaries determined above.
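The sliding-length estimate is a one-line formula, sketched below in Python with the values quoted in the text. The function name is hypothetical, and the bounds used for comparison (23 bp between tetO sites, ~4 kbp between inserts in U2OS 4A cells) are those given in the preceding paragraph.

```python
import math

def sliding_length_bp(d_sl_bp2_per_s, t_rs_s):
    """Sliding length l_SL = 2 * sqrt(D_SL * t_RS), in base pairs."""
    return 2.0 * math.sqrt(d_sl_bp2_per_s * t_rs_s)

t_rs = 0.158  # s, nonspecific interaction time from the text
low = sliding_length_bp(1e5, t_rs)   # lower bound of assumed D_SL range
high = sliding_length_bp(1e6, t_rs)  # upper bound of assumed D_SL range
print(f"sliding length between {low:.0f} and {high:.0f} bp")

# Compatibility check against the experimental boundaries quoted above.
spacing_between_sites_bp = 23
spacing_between_inserts_bp = 4000
print(spacing_between_sites_bp < low and high < spacing_between_inserts_bp)
```

Both ends of the range come out at a few hundred base pairs, that is, longer than the spacing between tetO sites and much shorter than the spacing between inserts.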
To take sliding into account, we adapted the standard FD models 5,6. Following our above analysis, we assumed that the limiting step of the reaction is the nonspecific binding to DNA, which is here accounted for by t_3D. Once nonspecifically attached to DNA, we hypothesized that the protein can locally scan neighbouring sequences, and eventually bind to a specific site with probability p if the sliding excursion overlaps with an insert. Indeed, as recently shown, proteins do not necessarily associate perfectly as they slide over their specific sites 13. The association rate constant k_a can then be expressed following ref. 6 (see Supplementary Note 10). Making use of the full concentration of genomic DNA, whose limited accessibility is implicitly accounted for in t_3D, the observed value of k_a is recovered provided that p is close to unity (Supplementary Note 10). This suggests that, once nonspecifically attached to DNA, the protein can efficiently bind to a specific site.

Discussion
Our experiments provide direct and quantitative insights into the search dynamics of DBPs in mammalian cells. The emerging picture for TetR is that the association to the target site proceeds in two steps (Fig. 4f). First, proteins explore the nucleus by alternating between 3D diffusion and transient association to nonspecific DNA. Yet, the nonspecific association is very inefficient, possibly due to competition with nucleosomes or with other DBPs specifically or nonspecifically attached to DNA. During the transport phase, proteins spend more time diffusing in solution than engaged in nonspecific DNA interactions. This behaviour appears common for many DBPs in eukaryotic cells 17,18,22,50 and contrasts with what has been reported for bacteria 12. To further explore the generality of such a behaviour, we have investigated the behaviour of LacI proteins in mammalian cells given that, in bacteria, the LacI is predominantly bound to non-cognate DNA (87% of bound proteins) and with very short residence times 12. FRAP experiments on NLS-LacI-GFP in U2OS 2-6-3 cells showed a fast full recovery, well described by a reaction-diffusion model 51 [...] The fact that most DBPs are not bound to the DNA might be a consequence of the fact that in mammalian cells DNA is decorated by a myriad of different proteins specifically or nonspecifically bound, which drastically decreases its accessibility and consequently the protein binding efficiency. Such a scenario would also explain why many transcription factors (TFs) bind to significantly fewer sites in the genome than predicted from the presence of their respective consensus motifs 52,53. At the target site, TetR proteins need to revisit the locus many times before associating to a nonspecific site in proximity of an insert. Then, in a second step, the TetR searcher locally slides until it associates to a tetO site.
Contrary to what happens for the LacI in bacteria, where the TS is diffusion limited 12, for TetR in human cells the limiting step in the search process is the association to nonspecific sites and not the diffusive transport or the sliding phase 27.
An important-and still open-question is to determine the exact nature of the sliding movement in the local exploration phase. Because of the complex conformation of nuclear DNA and the many obstacles that could be encountered, it is possible that, during the sliding phase, the protein does not keep constant contact with DNA. Instead, its motion could involve stepping or hopping along the chromatin fibre 49 . We emphasize that movement along a low dimensional structure, even if not purely 1D, has been shown to preserve most of the kinetic properties of FD 19 . In particular, it can induce important geometric effects 54 , so that for example neighbouring target sites effectively behave as a single target, as we observe.
A significant finding is the lack of clear delimitation between specific and nonspecific binding and, instead, the observation of a continuum of association times possibly due to the variability of DNA sequences encountered in the human genome. The existence of many stable, non-functional off-targets, which might be even more pronounced for endogenous eukaryotic TFs usually having a shorter recognition sequence than TetR 16, places constraints on the minimal number of DBPs required for overcoming the sequestration by decoy genomic sites and achieving a timely association to the target sites. Unfortunately, despite the need for a comprehensive census of the human TFs 55 to quantitatively study and model regulatory networks 56, there are still very few reports on the absolute abundance of TFs in human cells. Nevertheless, multiplexed targeted proteomics has recently pointed to a huge variability in TF copy number (ranging from a few hundred copies of Pias3 and ARID3a to more than 300,000 copies for NFIB) and an up to fivefold variation during cellular differentiation 57. Indeed, such a high variability in TF copy number might be fundamental to control the temporal response of different TFs and it will be of interest to examine in future studies to what extent the TF copy number is a proxy of the importance of a rapid response for certain genes.
Finally, the association rate to a particular sequence can be controlled by the local arrangement of chromatin, the conformation of the protein or the association to molecular partners. A low binding efficiency, seemingly far from optimal, might become beneficial for regulatory processes. In this regard, sliding is surely advantageous compared with direct 3D association, since it transiently maintains the DBP in the vicinity of its target site, thus increasing the time available for the formation of stable regulatory complexes 18 .

Methods
Cellular system and culture. Experiments were conducted on human osteosarcoma (U2OS)-derived cells containing at a single locus in the genome different inserts of specific target sites. In particular, we employed U2OS 2-6-3 cells 28 , which contain 200 cassettes of 256 lacO and 96 tetO binding sites upstream of a minimal CMV promoter controlling a reporter gene (CFP-SKL) and 24 MS2 stem loops. We also used U2OS 4A cells 58 , containing 30 insertions of 7 tetO binding sites ( Supplementary Fig. 1). In both cell lines, the 19-mers tetO binding sites are spaced by 23 bp, while the genomic distance between contiguous insertions is of the order of several kbp. Cells were cultured at 37°C in the presence of 5% CO 2 in 1 gl À 1 glucose phenol red-free DMEM medium (11054, GIBCO, Life Technologies, USA) with 10% (v/v) fetal bovine serum (10270, GIBCO, Life Technologies, USA), 1% Pen/Strep (15140, GIBCO, Life Technologies, USA) and 1% GlutaMAX (35050, GIBCO, Life Technologies, USA). When required, cells were transiently transfected by lipid vesicles fusion (using FuGENE6, 11814443001, Roche, Swiss) 12-24 h before experiments (Supplementary Note 1, all the plasmids used in the study are available on addgene). Nonspecific interactions analysis. We quantified the transient nonspecific interactions using a running window analysis of individual traces obtained with continuous imaging. To estimate the long nonspecific binding events, we performed time-lapse experiments acquiring snapshots (with 5 ms exposure time) with different inter-frame times (t TL ¼ 0.1, 0.5 and 1 s) to extend the observation window to longer times while preventing photobleaching to mask the long events. We localized particle positions as for continuous imaging and we considered as immobile the proteins that remained within 1 pixel for at least two frames. Next we calculated the survival probability SP(t), that is, the probability of staying bound for time longer than t. 
Numerical integration of the SP data yielded an estimate of t_1D ~2 s (Supplementary Note 7).
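As an illustration of the two steps just described (building SP(t) from binding durations, then integrating it), here is a minimal Python sketch. It is not the authors' analysis code: the function names are hypothetical, the dwell times are synthetic (drawn from an exponential distribution with a 2 s mean purely for demonstration), and no correction for photobleaching or for the merging of different time-lapse conditions is attempted. It relies only on the general fact that, for non-negative durations, the integral of the survival probability equals the mean duration.

```python
import bisect
import random


def survival_probability(dwell_times, t_grid):
    """Empirical SP(t): fraction of events lasting strictly longer than t."""
    ordered = sorted(dwell_times)
    n = len(ordered)
    # bisect_right counts the events with duration <= t
    return [1.0 - bisect.bisect_right(ordered, t) / n for t in t_grid]


def integrate_trapezoid(t_grid, values):
    """Trapezoidal rule; applied to SP(t) it estimates the mean bound time."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (t_grid[i + 1] - t_grid[i])
        for i in range(len(t_grid) - 1)
    )


# Synthetic, purely illustrative dwell times (assumption: exponential, mean 2 s).
random.seed(0)
dwell = [random.expovariate(1 / 2.0) for _ in range(20000)]

t_grid = [0.01 * i for i in range(4001)]  # 0 to 40 s in 10 ms steps
sp = survival_probability(dwell, t_grid)
mean_time = integrate_trapezoid(t_grid, sp)
print(f"estimated mean bound time: {mean_time:.2f} s")
```

With real data the time grid would be set by the inter-frame times, and the upper integration limit by the longest event that can be observed before photobleaching.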

Bioinformatics analysis. The BLAST algorithm 42 was run on the GRCh37 assembly of the 24 human chromosomes using the canonical 19-bp tetO sequence (TCC CTA TCA GTG ATA GAG A) as query. To exhaustively score all sites, we increased the Expect value (E-value) until the total number of hits found by the algorithm plateaued. With an E-value of 50,000, we obtained 14,482 sites with 11 bp or longer similarity to the tetO sequence (Supplementary Data 1). The density of the scored alignments was approximately uniform across chromosomes, with a mean value of almost five sites per Mbp, although longer chromosomes showed a higher density of scored sites (Supplementary Fig. 7).
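To make the idea of scoring tetO-like sites concrete, the sketch below scans a sequence for exact 11-bp words shared with the tetO query on either strand, and converts a site count into a density per Mbp. This is a deliberately simplified stand-in, not BLAST: it performs no alignment extension, scoring or E-value computation, so it would not reproduce the hit list reported above. The function names, the toy sequence and the genome length of 3.1 × 10^9 bp (an assumed round figure) are all illustrative.

```python
TETO = "TCCCTATCAGTGATAGAGA"  # canonical 19-bp tetO sequence from the text
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def seed_words(query, k=11):
    """All k-mers of the query and of its reverse complement."""
    words = set()
    for strand in (query, reverse_complement(query)):
        for i in range(len(strand) - k + 1):
            words.add(strand[i:i + k])
    return words


def find_seed_hits(sequence, query=TETO, k=11):
    """0-based start positions where `sequence` shares an exact k-mer with the query."""
    words = seed_words(query, k)
    sequence = sequence.upper()
    return [i for i in range(len(sequence) - k + 1) if sequence[i:i + k] in words]


def sites_per_mbp(n_sites, genome_length_bp):
    """Mean site density in sites per megabase pair."""
    return n_sites / (genome_length_bp / 1e6)


# Toy sequence (hypothetical): 12 bases of tetO embedded in unrelated flanks.
toy = "GGGG" + TETO[4:16] + "CCCCAAAA"
print(find_seed_hits(toy))

# Density for the reported 14,482 sites, assuming a ~3.1 Gbp genome.
print(round(sites_per_mbp(14482, 3.1e9), 2))
```

Under the assumed genome length, the density comes out slightly below five sites per Mbp, in line with the mean value quoted in the text.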
Measurement of the association rate constant in situ. The concentration c of RevTetR-GFP in each cell was measured with a two-photon FCS microscope 61,62 . In brief, we used an inverted Olympus IX81 microscope and created an observation volume of ~0.5 fl within the cell nucleus by focusing a tunable, mode-locked Ti:Sa laser (Chameleon Ultra II, Coherent, USA), operated at 940 nm, with a ×60, N.A. 1.2 NIR water immersion objective (UplanSApo 60XW, Olympus, Japan). The fluorescence signal was detected with a fibre-coupled (100 µm core, multi-mode fibre, AFS105/125Y, Thorlabs, USA) avalanche photodiode (SPCM-AQRH-14-FC, Perkin-Elmer, Canada) and fed to an external digital correlator (Flex03LQ-01, Correlator.com, USA). The signal autocorrelation G(t) was fitted with a purely diffusive model to determine the mean number of molecules in the excitation volume, given by ⟨N_Mol⟩ = 1/G(t→0) (ref. 46). Before inducing protein association, three FCS measurements (each consisting of three 30-s acquisitions) were performed at three different nuclear locations, avoiding the binding site locus and the nucleoli, to estimate the intranuclear heterogeneity of protein concentration (x error bars in Fig. 4c). Association of RevTetR-GFP after Dox induction was monitored by acquiring 3D stacks (usually five planes with 1 µm separation) every 5 or 10 s with an intensified CCD (HQ2 CoolSNAP, Roper Scientific, Germany) with 100 ms exposure time under synchronized blue LED (M470L2, Thorlabs, USA) illumination. Binding site intensities were quantified using ImageJ for U2OS 2-6-3 cells, and a MATLAB routine for 3D localization and intensity measurement (FISHquant, ref. 63) in the case of U2OS 4A cells (Supplementary Note 8).
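The conversion from the FCS amplitude to a molar concentration can be sketched as follows. The relation ⟨N_Mol⟩ = 1/G(t→0) and the ~0.5 fl observation volume are taken from the text; the function names and the example amplitude G(0) = 0.05 are hypothetical, and any shape factor for the two-photon excitation volume is ignored here.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_from_g0(g0):
    """Mean number of molecules in the observation volume, <N> = 1/G(t->0)."""
    if g0 <= 0:
        raise ValueError("G(0) must be positive")
    return 1.0 / g0


def concentration_nM(n_molecules, volume_fl=0.5):
    """Molar concentration (in nM) of n_molecules in a volume given in femtolitres."""
    volume_litres = volume_fl * 1e-15
    return n_molecules / (AVOGADRO * volume_litres) * 1e9


# Hypothetical autocorrelation amplitude, for illustration only.
g0 = 0.05
n_mol = molecules_from_g0(g0)
c = concentration_nM(n_mol)
print(f"<N> = {n_mol:.1f} molecules, c = {c:.1f} nM")
```

A useful rule of thumb that follows from the same arithmetic: one molecule in 0.5 fl corresponds to roughly 3.3 nM.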
Super-resolution imaging of the target site. Super-resolution experiments were conducted on an inverted microscope equipped with a perfect focus system (Ti Eclipse, Nikon, France) and an adaptive optical system (MicAO™, Imagine Optics, France) placed in the detection pathway of the microscope 64 . We used the deformable mirror (MirAO 52-e, Imagine Optics, France) to introduce an astigmatic deformation of the point spread function and thereby obtain information on the 3D position of molecules within the focal depth of the microscope (~600 nm, refs 64,65). 3D PALM images were recorded using a 561 nm imaging laser (Genesis MX 561-2000 MTM, Coherent, USA) and a 405 nm activation laser (Cube 405-100C, Coherent, USA) focused in the back focal plane of a ×100, N.A. 1.49 oil immersion objective (CFI Apo TIRF 100X, Nikon, France), and a 512 × 512 EM CCD (Ixon3 DU897, Andor, Ireland). Fluorescent beads (200 nm diameter, TetraSpeck™ T7280, Molecular Probes, Invitrogen, USA), added on top of the cells, were used as fiducial markers to correct for drift during imaging. 3D super-resolution images were generated from 80,000 to 200,000 frames acquired with an exposure time of 25 ms under continuous illumination, with a power density of ~4.5 kW cm⁻² for the 561 nm imaging laser and between 0 and ~10² kW cm⁻² for the 405 nm activation laser (Supplementary Note 9). Super-resolution images were reconstructed with dedicated MATLAB routines (available on request), while image rendering and quantification (Fig. 4d and Supplementary Movie 9) were done using the software ViSP 66 .
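The text states that fiducial beads were used to correct for drift but does not detail the procedure. One common and simple approach, shown here only as an assumed illustration, is to average the displacement of the beads relative to the first frame and subtract it from every localization recorded in the same frame. All names and coordinates below are hypothetical.

```python
def drift_per_frame(bead_tracks):
    """Mean (dx, dy, dz) of the beads relative to frame 0, for each frame.

    bead_tracks: list of tracks, one per bead; each track is a list of
    (x, y, z) positions, one per frame, all tracks having the same length.
    """
    n_frames = len(bead_tracks[0])
    n_beads = len(bead_tracks)
    drift = []
    for f in range(n_frames):
        drift.append(tuple(
            sum(track[f][d] - track[0][d] for track in bead_tracks) / n_beads
            for d in range(3)
        ))
    return drift


def correct_localizations(localizations, drift):
    """Subtract the frame-matched drift from (frame, x, y, z) localizations."""
    return [
        (f, x - drift[f][0], y - drift[f][1], z - drift[f][2])
        for f, x, y, z in localizations
    ]


# Two hypothetical beads drifting by (1, 2, 0) nm per frame over three frames.
beads = [
    [(0.0 + f, 0.0 + 2 * f, 0.0) for f in range(3)],
    [(50.0 + f, 30.0 + 2 * f, 100.0) for f in range(3)],
]
drift = drift_per_frame(beads)

# A fixed emitter at (10, 20, 5) nm, observed in frames 0 and 2 with drift added.
locs = [(0, 10.0, 20.0, 5.0), (2, 12.0, 24.0, 5.0)]
corrected = correct_localizations(locs, drift)
print(corrected)
```

In practice the bead trajectories are usually smoothed before subtraction so that the localization noise of the beads is not transferred to the corrected positions.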