Introduction

The number N(t) of distinct sites visited by a random walker up to time t is a key property in random walk (RW) theory1,2,3,4,5,6,7 which appears in many physical8,9,10,11,12,13,14,15,16,17,18,19,20,21, chemical22,23, and ecological24 phenomena. This observable quantifies the efficiency of various stochastic exploration processes, such as animal foraging24 or the trapping of diffusing molecules1,23. While the average and, for some examples, the distribution of the number of distinct sites visited, have been determined analytically5,25,26,27, this information is far from a complete description. In this work, we show that the waiting time τn, defined as the elapsed time between the visit to the nth and the (n+1)st distinct, or new, sites characterizes the exploration dynamics in a more fundamental and comprehensive way (Fig. 1).

Fig. 1: Definition of the random variable τn.

a A visited domain (black sites) and its boundary (green line) for a RW on the square lattice. The nth and (n+1)st new sites visited are blue and red squares. The red links indicate the intervening RW trajectory. b The time intervals τn between increments in N(t), the number of new sites visited at time t.
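The definition of τn can be made concrete with a minimal simulation sketch. This is an illustration written for this text, not the simulation code used for the figures; the function name `inter_visit_times`, the seed handling, and the convention that the starting site counts as the first distinct site are assumptions.

```python
import random

def inter_visit_times(num_steps, seed=0):
    """Nearest-neighbor RW on the square lattice.

    Returns (taus, n_distinct), where taus[n-1] is the waiting time tau_n
    between the visits to the n-th and (n+1)-st distinct sites.
    Assumption: the starting site counts as the first distinct site.
    """
    rng = random.Random(seed)
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    x, y = 0, 0
    visited = {(x, y)}
    taus = []
    last_new = 0  # time of the most recent visit to a new site
    for t in range(1, num_steps + 1):
        dx, dy = rng.choice(moves)
        x, y = x + dx, y + dy
        if (x, y) not in visited:
            visited.add((x, y))
            taus.append(t - last_new)  # elapsed time since the previous new site
            last_new = t
    return taus, len(visited)

taus, n_distinct = inter_visit_times(10_000)
```

Histogramming `taus[n-1]` over many independent seeds at fixed n gives an empirical estimate of the distribution of τn discussed below.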

In addition to their basic role in characterizing site visitation, the τn are central to phenomena that are controlled by the time between visits to new sites. One class of such models is self-interacting RWs, where a random walker deposits a signal at each visited site that alters the future dynamics of the walker on its next visit to these sites. This self-attracting RW28,29,30 has recently been shown to account for real trajectories of living cells31. In this model, the probability that the RW jumps to a neighboring site i is proportional to \(\exp (-u{n}_{i})\), where u is a positive constant, ni = 0 if the site i has never been visited up to time t and ni = 1 otherwise. The analysis of this strongly non-Markovian walk is a difficult problem, with few results available in dimensions higher than 1. However, we note that its evolution between visits to new sites is described by a regular RW whose properties are well known. This makes the determination of the statistics of the τn an important first step in the analysis and understanding of these non-Markovian RWs.

The variables τn also underlie starving RWs32,33,34,35,36, which describe depletion-controlled starvation of a RW forager. In these models, the RW survives only if the time elapsed until a new food-containing site is visited is less than an intrinsic metabolic time \({{{{{{{\mathcal{S}}}}}}}}\). If the forager collects a unit of resource each time a new site is visited, then in one trajectory, the forager might find resources at an almost regular rate while in another trajectory, the forager might find most of its resources near the end of its wandering. This discrepancy in histories has dramatic effects: the forager survives on the first trajectory but not the latter. To understand this disparity requires knowledge of the random variables τn.

Despite their utility and fundamentality, the statistical properties of the τn appear to be mostly unexplored, except for the one-dimensional (1d) nearest-neighbor RW. In this special case, the distribution of τn coincides with the classic first-exit probability of a RW from an interval of length n, Fn(τ)37. We drop the subscript n on τ henceforth, because the value of n will be evident from context. In the limit \(n\to \infty\) with \(\tau /{n}^{2}\) fixed, Fn has the following basic properties: (i) aging38; in general, Fn depends explicitly on n, or equivalently, on the time elapsed until the visit to the nth new site; (ii) an n-independent algebraic decay, \({\tau }^{-3/2}\) for \(1\ll \tau \ll {n}^{2}\), where \({n}^{2}\) is the typical time to diffuse across the interval; (iii) an exponential decay for \(\tau \gg {n}^{2}\); (iv) Fn admits the scaling form \({F}_{n}(\tau )={n}^{-3}\psi \left(\tau /{n}^{2}\right)\) (see Sec. S1 in the Supplementary Information (SI) for details).
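As a consistency check between the algebraic decay and the scaling form in this 1d case (a short worked step added here, not taken from the SI): the algebraic regime can only be independent of n if the scaling function behaves as \(\psi (x)\propto {x}^{-3/2}\) for \(x\ll 1\), since then

$${F}_{n}(\tau )={n}^{-3}\psi \left(\tau /{n}^{2}\right)\propto {n}^{-3}{\left(\tau /{n}^{2}\right)}^{-3/2}={\tau }^{-3/2}.$$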

Results

In this work, we extend these visitation properties to the physically relevant and general situations of higher dimensions and general classes of RWs, including anomalous diffusion. We investigate symmetric Markovian RWs that move in a medium of fractal dimension df, and whose mean-square displacement is assumed to be given by \(\langle {{{{{{{{\bf{r}}}}}}}}}^{2}(t)\rangle \propto {t}^{2/{d}_{{{{{{{{\rm{w}}}}}}}}}}\), where dw is the dimension of the walk39 and t the number of RW steps. We assume in particular the existence of a renewal equation between the propagator and the first-passage time density of the RW40. We focus on discrete time and space RWs, for which the number of sites visited at a certain time is clearly defined (see SI S5.C.1 for the extension of our results to Continuous Time Random Walks (CTRWs)). The ratio μ = df/dw determines whether the RW is recurrent (μ < 1), marginal (μ = 1), or transient (μ > 1). For recurrent and marginal RWs (μ ≤ 1), the probability to eventually visit any site is one, while for transient RWs (μ > 1), the probability to visit any site is strictly less than one41,42. Despite the geometrical complexity of the territory explored after n steps (which typically contains holes, islands43 and is not spherical44,45, see Figs. 1 and 2), the distribution of the times τ between visits to new sites obeys universal statistics that are characterized only by μ, as summarized in Table 1 (up to constant prefactors that are independent of τ, and neglecting algebraic corrections for the two latter regimes).

Fig. 2: The three temporal regimes of the exploration dynamics, as illustrated by a RW on a square lattice.

Each panel shows the corresponding different controlling configurations when n = 500 distinct sites have been visited. The nth and (n+1)st visited site are shown in red and blue, respectively (a and b). a Early time: the visited domain (black squares within the green boundary) is effectively infinite (at the scale of the trajectory of the RW during the time τn). b Intermediate time: the exit time probability from the visited domain is governed by atypically large trap-free regions of radius \({r}^{*}(\tau ) \sim {\rho }_{n}{\left(\tau /{t}_{n}\right)}^{1/({d}_{{{{{{{{\rm{f}}}}}}}}}+{d}_{{{{{{{{\rm{w}}}}}}}}})}\). c Long time: the exit time probability is determined by atypically large trap-free regions of radius \({r}^{*}({T}_{n}) \sim {n}^{1/{d}_{{{{{{{{\rm{f}}}}}}}}}}\).

Table 1 Summary of the time dependence of Fn(τ) for the three classes of RWs—recurrent, marginal, and transient

Fundamental consequences of our results include the following: (i) Finding new sites takes progressively more time for recurrent and marginal RWs; this agrees with simple intuition. This property is quantified by the n dependence of the moments of τn. From the entries in Table 1 we find \(\langle {\tau }_{n}^{k}\rangle \propto {n}^{k/\mu -1}\) for recurrent RWs, while \(\langle {\tau }_{n}\rangle \propto \ln n\) and \(\langle {\tau }_{n}^{k}\rangle \propto {n}^{(k-1)/2}\) for k > 1 for marginal RWs. Conversely, transient  RWs rarely return to previously visited sites, so that \(\langle {\tau }_{n}^{k}\rangle \propto {{{{{{{\rm{const}}}}}}}}\) (see the SI Sec. S3.D for the derivation and numerical check). (ii) The statistics of the τn exhibit universal and giant fluctuations for recurrent and marginal RWs, with \({{{{{{{\rm{Var}}}}}}}}({\tau }_{n})/{\langle {\tau }_{n}\rangle }^{2}\propto n\) for recurrent walks and \({{{{{{{\rm{Var}}}}}}}}({\tau }_{n})/{\langle {\tau }_{n}\rangle }^{2}\propto \sqrt{n}/{(\ln n)}^{2}\) for marginal walks. In the context of the foraging process mentioned above, this leads to very different life histories of individual foragers. In contrast, τn remains bounded for large n for transient RWs, so that fluctuations remain small. (iii) The early-time regime is independent of n. The feature of aging, which originates from the finite size n of the domain visited, arises after a time tn, for recurrent and marginal RWs, and Tn, for transient RWs (see Table 1 and below for the definition of these two fundamental time scales). (iv) As shown below, each regime of the exploration dynamics is controlled by specific configurations that are illustrated in Fig. 2. These provide the physical mechanisms that underlie the entries in Table 1. 
(v) The algebraic decay of Fn(τ) in the recurrent case should be compared with the simpler problem of a recurrent RW in unbounded space, where the first-passage time distribution to a given target behaves at large times like \({F}_{{\rm{target}}}(\tau )\propto 1/{\tau }^{1+\theta }\), with θ the so-called persistence exponent46. Because θ = 1 − μ for processes with stationary increments38, and in particular for Markovian processes, the algebraic decay of Fn in Table 1 can be rewritten as \({F}_{n}(\tau )\propto {\tau }^{-(2-\theta )}\), in sharp contrast with the decay of Ftarget(τ). While the two exponents coincide for a simple RW in 1d (for which θ = 1/2), the problem here involves the first-exit time statistics from a domain whose complex shape is generated by the RW itself.
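The moment scalings quoted in (i) and (ii) can be recovered by a heuristic estimate. The sketch below assumes that the moments are dominated by the algebraic regime \({\tau }^{-1-\mu }\), cut off at the crossover time tn introduced later in the text (\({t}_{n}={n}^{1/\mu }\) for recurrent RWs and \({t}_{n}=\sqrt{n}\) for marginal RWs); the full derivation is the one in SI Sec. S3.D. For recurrent RWs (μ < 1, k ≥ 1),

$$\langle {\tau }_{n}^{k}\rangle \sim \int\nolimits_{1}^{{t}_{n}}{\tau }^{k}\,{\tau }^{-1-\mu }\,d\tau \sim {t}_{n}^{k-\mu }={n}^{k/\mu -1},$$

while for marginal RWs (μ = 1),

$$\langle {\tau }_{n}\rangle \sim \int\nolimits_{1}^{\sqrt{n}}\frac{d\tau }{\tau }\propto \ln n,\qquad \langle {\tau }_{n}^{k}\rangle \sim {t}_{n}^{k-1}={n}^{(k-1)/2}\quad (k \, > \, 1).$$

The relative fluctuations then follow from the k = 1 and k = 2 moments: \({n}^{2/\mu -1}/{n}^{2/\mu -2}=n\) for recurrent RWs and \(\sqrt{n}/{(\ln n)}^{2}\) for marginal RWs.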

We now sketch how to derive these results (see Secs. S2 and S3 of the SI for detailed calculations). As an essential step, we first map the visitation problem to an equivalent trapping problem. In our visitation problem, we view unvisited sites as traps for the RW, so that a RW is trapped whenever it leaves the domain of already visited sites. Here, the term trapped does not mean that the RW disappears; rather, the RW continues its motion, but now with the visited domain expanded by the site just visited and the inter-visit time τ reset to zero. By this equivalence to trapping, the time τ between visits to the nth and (n+1)st new sites is the same as the time for the RW to first exit the domain that is comprised of the n already visited sites, or equivalently the domain free of traps. A crucial feature of this equivalence to trapping is that the spatial distribution of traps is continuously updated by the RW trajectory itself. In contrast to the classical trapping problem47,48, where permanent traps are randomly distributed, here the spatial distribution of traps ages because it depends on n. Moreover, successive traps are spatially correlated, with correlations generated by the RW trajectory.

These two key points are accounted for by the distribution Qn(r) of the radius of the largest spherical region that is free of traps after n sites have been visited. We show in Sec. S2.D of the SI that this distribution assumes the scaling form \({Q}_{n}(r)\simeq {\rho }_{n}^{-1}\,\exp [-a{\left(r/{\rho }_{n}\right)}^{{d}_{{{{{{{{\rm{f}}}}}}}}}}]\), where a is independent of n and r and the characteristic length ρn provides the typical scale of this radius r. Furthermore, the n dependence of ρn, which quantifies both aging and correlations between traps, is determined by whether the exploration is recurrent or transient. Specifically, we find \({\rho }_{n}={n}^{1/{d}_{{{{{{{{\rm{f}}}}}}}}}}\) for μ < 1, \({\rho }_{n}={n}^{1/2{d}_{{{{{{{{\rm{f}}}}}}}}}}\) for μ = 1 and ρn of the order of one, up to logarithmic corrections for μ > 1 (see Sec. S2 in the SI). A striking feature of these behaviors is that the exponent changes discontinuously when μ passes through 1.

The corresponding time scales \({t}_{n}={\rho }_{n}^{{d}_{{\rm{w}}}}\) and Tn delineate the three regimes of scaling behaviors summarized in Table 1 and Fig. 2: (i) a short-time algebraic regime (\(1\ll \tau \ll {t}_{n}\)), (ii) an intermediate-time stretched exponential regime (\({t}_{n}\ll \tau \ll {T}_{n}\)), and (iii) a long-time exponential regime (\({T}_{n}\ll \tau\)). Here Tn is defined as the time at which the radius of the trap-free region r*(τ) that controls the dynamics takes its maximal possible value of \({r}_{\max }={n}^{1/{d}_{{\rm{f}}}}\) (see Fig. 2c and the discussion below Eq. (4)). We do not characterize the early time regime τ = O(1), which depends on details of the model: we are only interested in universal features.

Algebraic regime

Here, the distribution of τ has a universal algebraic decay whose origin stems from two essential features: (i) The RW just visited a new site so that the RW starts from the interface between traps and visited sites when the clock for the next τ begins. (ii) The region already visited by the RW is sufficiently large so that we can treat the region as effectively infinite (Fig. 2a) and thereby approximate Fn(τ) by \({F}_{\infty }(\tau )\).

The first-return time distribution to this set of traps on the interface is determined by the renewal equation40,49,50 that links the probability Ptrap(t) to be at a trap at time t and the distribution \({F}_{\infty }(\tau )\) of first arrival times τ at a trap,

$${P}_{{{{{{{{\rm{trap}}}}}}}}}(t)=\delta (t)+\int\nolimits_{0}^{t}{F}_{\infty }(\tau ){P}_{{{{{{{{\rm{trap}}}}}}}}}(t-\tau )\,d\tau .$$
(1)

This equation expresses the partitioning of the total RW path to the interface into a first-passage path to the interface over a time τ and a return path to the interface over the remaining time t − τ; here we use a continuous-time formulation for simplicity. In this mean-field type equation (detailed in SI Sec. S3.A.1 and 2, and supported by numerical simulations given below and an alternative derivation for the exponent of the algebraic decay given in Sec. S3.A.3 in the SI), we treat the set of traps collectively, which amounts to neglecting correlations between the return time and the location of the traps on the interface.
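To illustrate how a renewal relation of this type links occupation and first-passage statistics, here is a minimal discrete-time sketch. It treats the simplest analog, the return of a simple symmetric 1d RW to a single site, rather than the return to the interface considered in Eq. (1); the function name `first_passage_from_propagator` is hypothetical.

```python
from math import comb

def first_passage_from_propagator(p):
    """Invert the discrete renewal relation
        p[t] = delta_{t,0} + sum_{tau=1..t} f[tau] * p[t - tau]
    for the first-passage probabilities f, given p[0..T] with p[0] == 1.
    """
    f = [0.0] * len(p)
    for t in range(1, len(p)):
        # The tau = t term is f[t] * p[0] = f[t]; all other terms are known.
        f[t] = p[t] - sum(f[tau] * p[t - tau] for tau in range(1, t))
    return f

# Return probabilities of a simple symmetric RW on the 1d lattice:
# p[t] = C(t, t/2) / 2^t for even t, and 0 for odd t.
T = 12
p = [comb(t, t // 2) / 2**t if t % 2 == 0 else 0.0 for t in range(T + 1)]
f = first_passage_from_propagator(p)
```

For this example the recursion reproduces the classic first-return probabilities 1/2, 1/8 and 1/16 at times 2, 4 and 6.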

Next, we estimate Ptrap(t) by using the fact that the RW is almost uniformly distributed in a sphere of radius \(r(t)\propto {t}^{1/{d}_{{{{{{{{\rm{w}}}}}}}}}}\) at time t. The number of traps within this sphere is given by \(r{(t)}^{{d}_{{{{{{{{\rm{T}}}}}}}}}}\). Here dT is the fractal dimension of the interface between visited and non-visited sites; as shown in the SI Sec. S3.A.2, dT = 2df − dw. Finally, we obtain the fraction of traps within this sphere and thereby Ptrap(t):

$${P}_{{{{{{{{\rm{trap}}}}}}}}}(t)\propto \frac{\,{{\mbox{Number of traps}}}}{{{\mbox{Number of sites}}}\,}\propto \frac{r{(t)}^{{d}_{{{{{{{{\rm{T}}}}}}}}}}}{r{(t)}^{{d}_{{{{{{{{\rm{f}}}}}}}}}}}\propto {t}^{\mu -1}.$$
(2)
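The exponent in Eq. (2) follows directly from the relation dT = 2df − dw and the scaling \(r(t)\propto {t}^{1/{d}_{{\rm{w}}}}\) stated above; written out step by step,

$${P}_{{\rm{trap}}}(t)\propto r{(t)}^{{d}_{{\rm{T}}}-{d}_{{\rm{f}}}}=r{(t)}^{{d}_{{\rm{f}}}-{d}_{{\rm{w}}}}\propto {t}^{({d}_{{\rm{f}}}-{d}_{{\rm{w}}})/{d}_{{\rm{w}}}}={t}^{\mu -1}.$$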

Based on Eq. (2), we solve Eq. (1) in the Laplace domain and invert this solution to obtain the algebraic decay \({F}_{\infty }(\tau ) \sim A{\tau }^{-1-\mu }\) in Table 1 in the early-time regime for recurrent and marginal RWs (this derivation is given in Sec. S3.A in the SI, including exact and approximate expressions for the amplitude A for marginal and recurrent RWs, respectively).
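The Laplace-domain step can be sketched as follows for the recurrent case 0 < μ < 1. This is a heuristic outline under the assumption \({P}_{{\rm{trap}}}(t)\simeq c\,{t}^{\mu -1}\) at large t, with c an unspecified constant introduced here for illustration; the amplitude obtained this way is specific to the sketch and is not the expression for A derived in the SI. Denoting Laplace transforms by a hat, the convolution in Eq. (1) gives

$${\hat{P}}_{{\rm{trap}}}(s)=\frac{1}{1-{\hat{F}}_{\infty }(s)},\qquad {\hat{P}}_{{\rm{trap}}}(s)\simeq c\,\Gamma (\mu )\,{s}^{-\mu }\ \Rightarrow \ 1-{\hat{F}}_{\infty }(s)\simeq \frac{{s}^{\mu }}{c\,\Gamma (\mu )}\quad (s\to 0).$$

A small-s behavior \(1-{\hat{F}}_{\infty }(s)\propto {s}^{\mu }\) with 0 < μ < 1 corresponds, by a Tauberian argument, to a survival probability decaying as \({\tau }^{-\mu }\), and hence to

$${F}_{\infty }(\tau )\simeq \frac{\mu }{c\,\Gamma (\mu )\Gamma (1-\mu )}\,{\tau }^{-1-\mu }.$$

The marginal case μ = 1 involves logarithmic factors and is not covered by this sketch.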

In the transient case, the RW is always close to a non-visited site by the very nature of transience. Consequently, the time scale tn is of order one and the algebraic regime does not exist.

Intermediate- and long-time regimes

If the RW survives beyond the early-time regime, it can now be considered to start from within the interior of the domain of visited sites. In analogy with the classical trapping problem, a lower bound for the survival probability of the RW, Sn(τ), is just the probability for the RW to remain within this domain. This lower bound is controlled by the rare configurations of large spherical trap-free regions in which the RW starts at the center of this sphere, whose radius distribution Qn(r) was given above.

We develop a large-deviation approach, in which this lower bound is given by the probability qn for the RW to first survive up to the first crossover time tn, multiplied by the probability for the RW to remain inside a spherical trap-free domain over a time τ. The quantity qn is given by \(\int\nolimits_{{t}_{n}}^{\infty }{F}_{\infty }(\tau ){{{{{{{\rm{d}}}}}}}}\tau\), which scales as \(1/{t}_{n}^{\mu }\) if μ ≤ 1, and is of order one if μ > 1. The probability for the RW to remain inside a spherical domain of radius r over a time τ asymptotically scales as \(\exp (-b\,\tau /{r}^{{d}_{{{{{{{{\rm{w}}}}}}}}}})\), where b is a constant39. As stated above, the probability to find a spherical trap-free region of radius r is given by \({Q}_{n}(r)\simeq {\rho }_{n}^{-1}\exp [-a{(r/{\rho }_{n})}^{{d}_{{{{{{{{\rm{f}}}}}}}}}}]\). Summing over all radii up to the largest possible value \({r}_{\max }={n}^{1/{d}_{{{{{{{{\rm{f}}}}}}}}}}\), we obtain the lower bound

$${S}_{n}(\tau )\ge \frac{{q}_{n}}{{\rho }_{n}}\int\nolimits_{0}^{{n}^{1/{d}_{{{{{{{{\rm{f}}}}}}}}}}}\exp \left[-b\tau /{r}^{{d}_{{{{{{{{\rm{w}}}}}}}}}}-a{(r/{\rho }_{n})}^{{d}_{{{{{{{{\rm{f}}}}}}}}}}\right]{{{{{{{\rm{d}}}}}}}}r,$$
(3)

where a and b are constants. Using Laplace’s method by making the change of variable \(r=\rho {\tau }^{1/({d}_{{{{{{{{\rm{f}}}}}}}}}+{d}_{{{{{{{{\rm{w}}}}}}}}})}\), we obtain (ignoring algebraic prefactors in n and τ),

$$\begin{array}{c}{S}_{n}(\tau )\ge \int\nolimits_{0}^{{n}^{1/{d}_{{{{{{{{\rm{f}}}}}}}}}}}\exp \left[-{\tau }^{\mu /(1+\mu )}\left(b/{\rho }^{{d}_{{{{{{{{\rm{w}}}}}}}}}}+a{(\rho /{\rho }_{n})}^{{d}_{{{{{{{{\rm{f}}}}}}}}}}\right)\right]{{{{{{{\rm{d}}}}}}}}\rho \\ \gtrsim \exp \left[-{\tau }^{\mu /(1+\mu )}\left(b/{\rho^{*{d}_{{{{{{{{\rm{w}}}}}}}}}} }+a{({\rho }^{*}/{\rho }_{n})}^{{d}_{{{{{{{{\rm{f}}}}}}}}}}\right)\right],\end{array}$$
(4)

where the function \(b/{\rho }^{{d}_{{\rm{w}}}}+a{(\rho /{\rho }_{n})}^{{d}_{{\rm{f}}}}\) reaches its minimum at ρ = ρ*. The lower bound (4) for \(\tau \gg 1\) is controlled by trap-free regions of radius \({r}^{*}(\tau )={\rho }^{*}{\tau }^{1/({d}_{{\rm{w}}}+{d}_{{\rm{f}}})} \sim {\rho }_{n}^{{d}_{{\rm{f}}}/({d}_{{\rm{w}}}+{d}_{{\rm{f}}})}{\tau }^{1/({d}_{{\rm{w}}}+{d}_{{\rm{f}}})}\) (see SI Sec. S3.B for details). Using \({t}_{n}={\rho }_{n}^{{d}_{{\rm{w}}}}\), this optimal radius is then \({r}^{*}(\tau ) \sim {\rho }_{n}{(\tau /{t}_{n})}^{1/({d}_{{\rm{w}}}+{d}_{{\rm{f}}})}\). For \(\tau \gg {t}_{n}\), we have \({r}^{*}(\tau )\gg {\rho }_{n}\). Since ρn determines the typical radius of the largest spherical region free of traps, the configurations that control the long-time dynamics (as illustrated in Fig. 2b, c) are atypically large, and become more so as τ increases. Thus the survival probability in this long-time regime is determined by a compromise between the scarceness of large trap-free domains and the long exit times from such domains. Finally, we obtain \({F}_{n}(\tau )=-{\rm{d}}{S}_{n}(\tau )/{\rm{d}}\tau \sim \exp [-{\rm{const}}\,{\left(\tau /{t}_{n}\right)}^{\mu /(1+\mu )}]\). As in the classic trapping problem1,3,40, we expect that this lower bound for the survival probability will have the same time dependence as the survival probability itself.
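The compromise between rare large domains and long exit times can be checked numerically. The sketch below minimizes the exponent of the integrand in Eq. (3) in closed form; the function names and the parameter values (a = b = 1, ρn = 5, df = dw = 2) are illustrative assumptions, not values from the paper.

```python
import math

def exponent(r, tau, rho_n, a, b, d_f, d_w):
    """Minus the argument of the exponential in Eq. (3)."""
    return b * tau / r**d_w + a * (r / rho_n) ** d_f

def optimal_radius(tau, rho_n, a, b, d_f, d_w):
    """Radius r* minimizing `exponent`, from setting its r-derivative to zero:
    r*^(d_f + d_w) = d_w * b * tau * rho_n^d_f / (a * d_f)."""
    return (d_w * b * tau * rho_n**d_f / (a * d_f)) ** (1.0 / (d_f + d_w))

# Assumed illustrative parameters (simple RW in 2d has d_f = d_w = 2).
params = dict(rho_n=5.0, a=1.0, b=1.0, d_f=2.0, d_w=2.0)
r_star = optimal_radius(1000.0, **params)
```

With these parameters, multiplying τ by 16 doubles `r_star`, in line with the growth \({r}^{*}(\tau )\propto {\tau }^{1/({d}_{{\rm{w}}}+{d}_{{\rm{f}}})}\).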

This stretched exponential decay holds as long as the optimal radius is smaller than the maximal value \({r}_{\max }\). The point at which this inequality no longer holds defines a second crossover time Tn by \({r}^{*}({T}_{n})={n}^{1/{d}_{{\rm{f}}}}\). Beyond this time, the evaluation of the integral in Eq. (4) now leads to an exponential decay of Fn (Table 1).
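The n dependence of Tn follows by inserting the scalings of ρn quoted earlier into this definition. The following is a short worked step under those scalings, ignoring constant prefactors and logarithmic corrections. Solving \({\rho }_{n}{({T}_{n}/{t}_{n})}^{1/({d}_{{\rm{w}}}+{d}_{{\rm{f}}})}={n}^{1/{d}_{{\rm{f}}}}\) gives

$${T}_{n}={t}_{n}{\left(\frac{{n}^{1/{d}_{{\rm{f}}}}}{{\rho }_{n}}\right)}^{{d}_{{\rm{w}}}+{d}_{{\rm{f}}}}.$$

For recurrent RWs, \({\rho }_{n}={n}^{1/{d}_{{\rm{f}}}}\), so that \({T}_{n}={t}_{n}={n}^{1/\mu }\). For marginal RWs, \({\rho }_{n}={n}^{1/2{d}_{{\rm{f}}}}\), \({t}_{n}=\sqrt{n}\) and dw = df, so that \({T}_{n}=\sqrt{n}\cdot n={n}^{3/2}\). For transient RWs, ρn and tn are of order one, so that \({T}_{n} \sim {n}^{({d}_{{\rm{w}}}+{d}_{{\rm{f}}})/{d}_{{\rm{f}}}}={n}^{1+1/\mu }\).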

Finally, note that the full time dependence of Fn(τ) has a particularly simple form for recurrent RWs. In this case, the intermediate stretched exponential regime does not exist because tn and Tn both have the same n dependence. In fact, the short- and long-time limits of Fn(τ) can be synthesized into the scaling form (as explained in Sec. S3.C of the SI)

$${F}_{n}(\tau )=\frac{1}{{n}^{1+1/\mu }}\psi \left(\frac{\tau }{{n}^{1/\mu }}\right),$$
(5)

with ψ a scaling function.

We confirm the validity of our analytical results by comparing them to numerical simulations of paradigmatic examples of RWs that embody the different cases in Table 1. The recurrent case (μ < 1) is illustrated in Fig. 3a–c for diverse processes: superdiffusive Lévy flights in 1d, in which the distribution of jump lengths is fat-tailed, \(p(\ell )\propto {\ell }^{-1-\alpha }\), with \(\alpha \in ]1,2[\); subdiffusive RWs on deterministic fractals with and without loops, respectively represented by the Sierpinski gasket and the T-tree (see Sec. S4.A of the SI for the definition of the T-tree and the simulation results); subdiffusive RWs on disordered systems, as represented by a critical percolation cluster on a square lattice. Our simulations confirm the scaling form of Fn(τ) given in Eq. (5), as well as its algebraic (X ≡ τ/tn < 1) and exponential (X > 1) decays at short and long times, respectively.

Fig. 3: Universal distribution of the time between visits to new sites for RWs.

Recurrent RWs (μ < 1). Shown is the scaled distribution \(Y\equiv {\theta }_{n}^{1+\mu }{F}_{n}(\tau )\) versus X ≡ τ/θn for n = 100, 500, and 1000. Here \({\theta }_{n} \sim {n}^{1/\mu }\) is the characteristic decay time of the exponential in \({F}_{n}(\tau ) \sim \exp \left(-\tau /{\theta }_{n}\right)\). The red dashed lines indicate the algebraic decay \(A(\mu ){\tau }^{-1-\mu }\) (A(μ) defined in SI Sec. S3.A.4). a 1d Lévy flights with index \(\alpha=1/\mu=\ln 6/\ln 3\). b Subdiffusion on a Sierpinski gasket (\(\mu=\ln 3/\ln 5\), scaling of θn with n shown in SI Sec. S4.A). c Subdiffusion on a 2d critical percolation cluster (μ ≈ 0.659). Marginally recurrent RWs (μ = 1). d, e Marginal RWs at early times. Shown is the scaled distribution Y ≡ nFn(τ) versus \(X\equiv \tau /\sqrt{n}\) for d 1d Lévy flights of index α = 1 for n = 800, 1600, and 3200, e persistent RWs in 2d where the probability to continue in the same direction is p = 0.3 for n = 800, 1600 and 3200. The red dashed lines represent the algebraic decay \(A{\tau }^{-2}\) (A given in SI Sec. S3.A.4). f Marginal RWs at intermediate and long times. Shown is the scaled distribution \(Y\equiv \left(-\ln n{F}_{n}(\tau )\right)/\sqrt{\tau /\sqrt{n}}\) versus \(X\equiv \tau /{n}^{3/2}\) for simple RWs on a 2d square lattice for n = 200, 800 and 3200. The green and blue dashed lines represent the stretched exponential and the exponential regimes, respectively. Transient RWs (μ > 1). Shown is the scaled distribution \(Y\equiv \left(-\ln {F}_{n}(\tau )\right)/{\tau }^{\mu /(1+\mu )}\) for g Lévy flights of parameter α = 1 in 2d, for n = 400, 800, 1600 and X ≡ τ, h persistent RWs in 3d where the probability to continue in the same direction is p = 0.25 for n = 200, 800, 3200 and X ≡ τ, i simple RWs on the cubic lattice, for n = 200, 400, 500 and \(X\equiv \tau /{n}^{1+1/\mu }\). The green and blue dashed lines represent the stretched exponential and the exponential regimes, respectively. For all panels, blue stars, orange circles and green squares correspond to increasing values of n.
The insets indicate the jump processes. Red squares are the initial and arriving positions of the walker. The green squares represent the prior position of the walker.

The marginal case (μ = 1) is illustrated by 1d Lévy flights of parameter α = 1, and by persistent and simple RWs on the 2d square lattice (Fig. 3d–f, respectively). The data collapse when plotted versus the scaling variable \(\tau /\sqrt{n}\); this confirms that the crossover time tn scales as \({t}_{n}\propto \sqrt{n}\). Figure 3d and e clearly show the expected algebraic decay \({\tau }^{-2}\) at short times (dashed line). Figure 3f validates the stretched exponential form of Fn(τ) at intermediate times, as well as the exponential decrease at long times and the scaling \({T}_{n}={n}^{3/2}\).

The transient case (μ > 1) is illustrated by RWs on hypercubic lattices (see Fig. 3g for the 2d Lévy flights of parameter α = 1, Fig. 3h for a persistent RW and Fig. 3i for a nearest-neighbor RW in 3d, as well as Sec. S4.C.5 in the SI for higher dimensions and Sec. S4.C.6 for transient Lévy flights). Figure 3i confirms the stretched exponential temporal decay for intermediate times, the scaling of the crossover time \({T}_{n}={n}^{1/\mu +1}\), and the long-time exponential decay of Fn(τ) for transient RWs. The numerically challenging task of observing the stretched exponential decay followed by the exponential decay that originates from rare, trap-free regions was achieved by relying on Monte Carlo simulations coupled with an exact enumeration technique (see Sec. S4.C of the SI for details). We note that in Fig. 3g and h, the distribution is independent of n for the values of X ≡ τ represented, and \(Y=-\left(\ln {F}_{n}(\tau )\right)/{\tau }^{\mu /(1+\mu )}\) reaches a plateau. This further confirms the stretched exponential regime and the absence of the algebraic regime (tn = 1).

Overall, we find excellent agreement between our analytical predictions and numerical simulations. The diverse nature of these examples also demonstrates the wide range of applicability of our theoretical approach.

We can extend our approach to treat the dynamics of other basic observables that characterize the support of RWs. Following refs. 51,52, two classes of observables can be defined: boundary and bulk. Boundary observables involve both visited and unvisited sites, such as the perimeter P(t) of the visited domain or the number of islands I(t) enclosed in the support of the RW trajectory; note that these variables can both increase and decrease with time. We show, for example, in Sec. S5.A of the SI, that the corresponding distribution of the times between successive increases in a boundary observable Σ again has an early-time algebraic decay, \({F}_{\Sigma }(\tau )\propto {\tau }^{-2\mu }\) for μ < 1, and \({F}_{\Sigma }(\tau )\propto \ln \tau /{\tau }^{2}\) for μ = 1. These behaviors are illustrated in Fig. 4a–c. Bulk observables involve only visited sites, such as the number of dimers51, k-mers, and k × k squares in 2d. We show in Sec. S5.A of the SI that the dynamics of bulk variables is the same as that for the number of distinct sites visited.

Fig. 4: Extensions and applications of the time between visits to new sites for RWs.

Boundary observables for recurrent and marginal RWs: The perimeter of the visited domain and the number of islands enclosed in the support. a The elapsed time τP for successive increments of the time dependence of the perimeter P(t) of the visited domain. b Distribution FP(τ) of the time elapsed τP between the first observations of a domain perimeter of length P and P + 2 for simple RWs on the square lattice. c Distribution FI(τ) of the elapsed time τI between the first occurrence of I and I + 1 islands for Lévy flights of index α = 1.2. Plotted in b and c are the scaled distributions \(Y\equiv {F}_{P}(\tau )/\left(\ln 8\tau \right)\) and Y ≡ FI(τ) versus X = τ. The red dashed lines have slope − 2μ. The data are for P, I = 50, 100, and 200 (respectively blue stars, orange circles and green squares). Multiple-time covariances and starving RWs. d \(Y={\rm{Cov}}\left[N({t}_{1}),N({t}_{2})\right]/\left(\left\langle N({t}_{1})\right\rangle \left\langle N({t}_{2})\right\rangle \right)\) for Lévy flights of parameter α = 1.5, and we compare Y to \(\frac{{t}_{1}}{{t}_{2}}\) (dashed line). The stars, circles, and squares indicate data for t1 = 10, t1 = 100, and t1 = 1000. e \(Y=\frac{{t}_{3}}{{t}_{1}}\left\langle (N({t}_{1})-\left\langle N({t}_{1})\right\rangle )(N({t}_{2})-\left\langle N({t}_{2})\right\rangle )(N({t}_{3})-\left\langle N({t}_{3})\right\rangle )(N({t}_{4})-\left\langle N({t}_{4})\right\rangle )\right\rangle /(\left\langle N({t}_{1})\right\rangle \left\langle N({t}_{2})\right\rangle \left\langle N({t}_{3})\right\rangle \left\langle N({t}_{4})\right\rangle )\), for Lévy flights of parameter α = 1.3. We compare Y to the dashed line proportional to \(\frac{{t}_{3}}{{t}_{4}}\). Data in red and green indicate t1 = 10 and t1 = 100. Stars indicate t2 = 2t1 and circles indicate t2 = 4t1. We take t3 = 4t2. f Lifetime at starvation. Blue circles show the mean lifetime versus the metabolic time \({\mathcal{S}}\).
The dashed line is proportional to \({{{{{{{{\mathcal{S}}}}}}}}}^{2}\). Non-Markovian examples. Rescaled distribution Y = Fn(τ)n1+1/μ versus X = τ/n1/μ for g Fractional Brownian motion with parameter 1/H = 1/0.4 = μ = 1 − θ (n = 20, 40 and 80) h Fractional Brownian motion with parameter 1/H = 1/0.75 = μ = 1 − θ (n = 20, 40 and 80), i True Self Avoiding Walks μ = 2/3 = 1 − θ (n = 200, 400 and 800). For the last three panels, increasing values of n are represented successively by blue stars, orange circles and green squares, and the dashed line is proportional to X−(2−θ).

Discussion

In addition to providing asymptotic expressions for the τn distribution and their extension to basic observables characterizing the support of RWs, our results open new avenues in several directions. First, they allow us to revisit the old question of the number N(t) of distinct sites visited at time t. Indeed, our theoretical approach for the set of inter-visit times τ represents a start towards determining multiple-time visitation correlations for general RWs, quantities that have remained inaccessible thus far. These multiple-time correlations are crucial to fully characterize the stochastic process {N(t)}, the number of sites visited at every single time. However, they have been studied only for the special case of 1d nearest-neighbor RWs27,53. Using our formalism we can further compute temporal correlations of {N(t)} for compact Lévy flights in 1d with 1/μ = α > 1 (which do leave holes in their trajectories). We compute the scaling with time of the two-time covariance of the number of distinct sites visited,

$${{{{{{{\rm{Cov}}}}}}}}[N({t}_{1}),N({t}_{2})]\equiv \langle N({t}_{1})N({t}_{2})\rangle -\langle N({t}_{1})\rangle \langle N({t}_{2})\rangle .$$

We obtain in the limit \(1\ll {t}_{1}\ll {t}_{2}\) (see Sec. S5.B of the SI for the derivation of Eq. (6) and its numerical confirmation, which can also be seen in Fig. 4d),

$${{{{{{{\rm{Cov}}}}}}}}[N({t}_{1}),N({t}_{2})]\propto {t}_{1}^{\mu }{t}_{2}^{\mu }\frac{{t}_{1}}{{t}_{2}}.$$
(6)
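Eq. (6) can be related to the normalized quantity plotted in Fig. 4d by a short worked step, using the standard scaling \(\langle N(t)\rangle \propto {t}^{\mu }\) of the mean number of distinct sites visited by a recurrent RW:

$$\frac{{\rm{Cov}}[N({t}_{1}),N({t}_{2})]}{\langle N({t}_{1})\rangle \langle N({t}_{2})\rangle }\propto \frac{{t}_{1}^{\mu }{t}_{2}^{\mu }\,{t}_{1}/{t}_{2}}{{t}_{1}^{\mu }{t}_{2}^{\mu }}=\frac{{t}_{1}}{{t}_{2}}.$$

The normalized covariance therefore depends on the two times only through their ratio and decays as 1/t2 at fixed t1.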

This result can be further extended to k-time correlation functions (see the numerical confirmation for k = 4 in Fig. 4e),

$$\begin{array}{l}\langle (N({t}_{1})-\langle N({t}_{1})\rangle )\ldots (N({t}_{k})-\langle N({t}_{k})\rangle )\rangle \propto {t}_{1}^{\mu }\ldots {t}_{k}^{\mu }\frac{{t}_{1}}{{t}_{k}}.\end{array}$$
(7)

To obtain these results, we rely on the assumption that, for any numbers of distinct sites visited n1 and n2, the following relation holds,

$${{{{{{{\rm{Cov}}}}}}}}\left[\mathop{\sum }\limits_{k=0}^{{n}_{1}-1}{\tau }_{k},\mathop{\sum }\limits_{k={n}_{1}}^{{n}_{2}-1}{\tau }_{k}\right]=O\left({n}_{1}^{2/\mu }\right),$$
(8)

which is indeed verified for 1d Lévy flights (see SI Sec. S5.B). Beyond the case of 1d Lévy flights, where Eq. (8) is satisfied, Eqs. (6) and (7) in fact provide lower bounds on the correlation functions for recurrent RWs (see SI Sec. S5.B for numerical checks),

$$\begin{array}{l}\langle (N({t}_{1})-\langle N({t}_{1})\rangle )\ldots (N({t}_{k})-\langle N({t}_{k})\rangle )\rangle \ge {t}_{1}^{\mu }\ldots {t}_{k}^{\mu }\frac{{t}_{1}}{{t}_{k}}.\end{array}$$
(9)

This lower bound is algebraically decreasing in tk. The salient feature of these results is that temporal correlations in multiple-time distributions of recurrent RWs, such as those in Eq. (6), have a long memory.

Second, the distribution of τn allows us to provide a quantitative answer to the question raised in the introduction regarding the disparity in life histories of foragers that starve if they do not eat after \({\mathcal{S}}\) steps. While in 1d, the mean starvation time is known to increase linearly with \({\mathcal{S}}\) (at large \({\mathcal{S}}\)), the corresponding question in 2d, which is relevant to most applications of foraging, is open. We now show, by relying on the results introduced in this paper, that the mean number of sites visited and consequently the starvation time in 2d increase quadratically with \({\mathcal{S}}\) (up to logarithmic corrections). We start with the observation that, knowing that n sites have been visited, the probability to starve is given by the probability that the time τn to visit a new site is larger than the metabolic time \({\mathcal{S}}\), \({\mathbb{P}}({\tau }_{n} \, > \, {\mathcal{S}})={\sum }_{\tau \, > \, {\mathcal{S}}}{F}_{n}(\tau )\). Using Table 1, we have that for \({t}_{n}=\sqrt{n} \, < \, {\mathcal{S}}\), the probability to starve is stretched exponentially small (up to algebraic prefactors), \({\mathbb{P}}({\tau }_{n} \, > \, {\mathcal{S}})\approx \exp \left[-\sqrt{{\mathcal{S}}/{t}_{n}}\right]\). The desert (domain without food) formed by the set of visited sites is too small to prevent the RW from finding new sites: the RW visits \({{\mathcal{S}}}^{2}\) sites in total in this first regime. However, for \({t}_{n}=\sqrt{n} \, > \, {\mathcal{S}}\), the probability for the RW to starve before finding a new site is large, as it is given by the tail of an algebraic distribution \({\mathbb{P}}({\tau }_{n} \, > \, {\mathcal{S}})\propto 1/{\mathcal{S}}\). Consequently, the number of sites visited in this regime is negligible compared to the first one.
Thus, the number of sites visited at starvation is given, up to log corrections, by \(n={{{{{{{{\mathcal{S}}}}}}}}}^{2}\) and the lifetime by \(\mathop{\sum }\nolimits_{k=1}^{{{{{{{{{\mathcal{S}}}}}}}}}^{2}}\left\langle {\tau }_{k}\right\rangle \sim {{{{{{{{\mathcal{S}}}}}}}}}^{2}\). This result is confirmed numerically in Fig. 4f. This resolves the open question of the lifetime of 2d starving RWs32,33,34,35,36.
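The starvation argument above can be explored with a minimal Monte Carlo sketch. This is an illustration under stated assumptions, not the simulation code behind Fig. 4f: the function names `starving_walk_2d` and `mean_lifetime` are hypothetical, the walker is a nearest-neighbour RW on the square lattice, and it is assumed to starve once it has taken \(\mathcal{S}\) consecutive steps without reaching a new site.

```python
import random

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))

def starving_walk_2d(S, rng):
    """Simulate one forager on the square lattice (hypothetical helper).

    Assumption: the walker starves after S consecutive steps that do not
    land on a previously unvisited site. Returns (lifetime, sites_visited).
    """
    x = y = 0
    visited = {(0, 0)}
    t = 0        # total number of steps taken
    hungry = 0   # steps since the last new site was found
    while hungry < S:
        dx, dy = rng.choice(MOVES)
        x, y = x + dx, y + dy
        t += 1
        if (x, y) in visited:
            hungry += 1
        else:
            visited.add((x, y))
            hungry = 0
    return t, len(visited)

def mean_lifetime(S, runs=200, seed=0):
    """Average lifetime over independent foragers with metabolic time S."""
    rng = random.Random(seed)
    return sum(starving_walk_2d(S, rng)[0] for _ in range(runs)) / runs

if __name__ == "__main__":
    # Compare the growth of the mean lifetime with S**2 (log corrections expected).
    for S in (5, 10, 20, 40):
        print(S, mean_lifetime(S))
```

Plotting the mean lifetime against \(\mathcal{S}\) on logarithmic axes is one way to compare with the predicted quadratic growth; the logarithmic corrections mentioned above mean that small values of \(\mathcal{S}\) are not expected to follow a clean power law.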

Finally, the generality of our results opens the question of extending them to the challenging situation of non-Markovian processes, which is a priori not covered by our approach. However, we argue in SI Sec S5.C that our results concerning the recurrent case can be extended to non-Markovian processes. The agreement with numerical simulations of highly non-Markovian processes such as the Fractional Brownian Motion54 (in the sub- and super-diffusive cases) and the True Self Avoiding Walk55 (see SI Sec S5.C.4 for definition) is displayed in Fig. 4 (g–i respectively). We point out again that this behavior \(F_n(\tau)\propto \tau^{-(2-\theta)}\) is in sharp contrast to the usual decay of the first-passage probability to a target, \(F_{\rm target}(\tau)\propto \tau^{-(1+\theta)}\). This difference originates both from the complex geometry of the support of the RW and from potential memory effects, which, remarkably, are universally accounted for by our results.

We have shown that the times between successive visits to new sites are a fundamental and useful characterization of the territory explored by a RW. We identified three temporal regimes for the behavior of these inter-visit time distributions, as well as the physical mechanisms that underlie these different regimes. In addition to their fundamental nature, these inter-visit times satisfy strikingly universal statistics, in spite of the geometrical complexity of the support of the underlying RW processes. The elucidation of these inter-visit times represents a promising research avenue to discover many more aspects of the intriguing exploration dynamics of RWs, as shown by the first applications provided here in the case of non-Markovian processes.

Methods

Analytical results are verified using simulations of different RW models:

Numerical simulations of recurrent and marginal RWs

  • Lévy flights in 1d with \(\alpha \in [1,2[\), where the jump length is drawn from \(p(s)=1/[2\zeta(1+\alpha)s^{1+\alpha}]\). Intermediary sites between initial and final positions of the jump are not visited.

  • Nearest-neighbour RWs on the Sierpinski gasket. The gasket is unbounded, and each neighbouring site is chosen with equal probability. Each RW starts at the central site (red square on Fig. 3b).

  • Nearest-neighbour RWs on the T tree. The T tree is generated up to generation 9, and then we perform a RW starting at the central site. Each neighbouring site is chosen with equal probability.

  • Nearest-neighbour RWs on critical percolation clusters. The clusters are constructed from a 1000 × 1000 periodic square lattice, from which half of the bonds were randomly removed and then the largest cluster was selected. We start from a site chosen uniformly on the cluster. Each neighbouring site is chosen with equal probability.

  • Nearest-neighbour RWs on the 2d lattice, persistent and non-persistent. For the persistent RW, the probability of repeating the previous step is larger than 1/4, while the remaining probability is split uniformly among the 3 other directions.
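As an illustration of the first model in the list above, the following minimal sketch samples a 1d Lévy flight and records the inter-visit times τn directly from the trajectory. It is a sketch under assumptions rather than the procedure used in this work: the function name `levy_flight_tau` is hypothetical, and the jump lengths are drawn with NumPy's `Generator.zipf`, whose law \(k^{-a}/\zeta(a)\) with \(a=1+\alpha\), combined with a random sign, reproduces the distribution p(s) quoted above.

```python
import numpy as np

def levy_flight_tau(alpha, n_sites, seed=0, chunk=4096):
    """Inter-visit times tau_1, ..., tau_{n_sites-1} of a 1d Levy flight.

    Assumptions (illustrative): jump lengths follow the zeta law with
    exponent 1 + alpha, signs are symmetric, and only the arrival site of
    each jump counts as visited (intermediate sites are skipped).
    """
    rng = np.random.default_rng(seed)
    pos = 0
    visited = {0}          # the starting site is the first visited site
    taus = []
    t = 0                  # current time (number of jumps)
    t_last = 0             # time at which the last new site was found
    while len(visited) < n_sites:
        lengths = rng.zipf(1.0 + alpha, size=chunk)
        signs = rng.choice((-1, 1), size=chunk)
        for s in (lengths * signs).tolist():
            pos += s
            t += 1
            if pos not in visited:
                visited.add(pos)
                taus.append(t - t_last)
                t_last = t
                if len(visited) == n_sites:
                    break
    return taus

if __name__ == "__main__":
    taus = levy_flight_tau(alpha=1.5, n_sites=500)
    print(len(taus), max(taus))
```

Direct sampling of this kind gives one realization of each τn per trajectory; histograms at fixed n require many independent runs, which is one motivation for the exact enumeration approach described in the text.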

We perform the RWs to get the domain rn of n distinct visited sites. To obtain the time τn to visit a new site, we use the exact enumeration method based on the adjacency matrix M(rn) of the visited domain. The θn used to rescale the data in Fig. 3 are obtained by measuring the slope of the exponential decrease at large times of the statistics of τn.
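The exact enumeration step can be sketched as follows for the square lattice. This is a minimal illustration, assuming that the walker starts from a given site of the visited domain (for instance the most recently discovered one) and that each of the 4 neighbours is chosen with probability 1/4; the function name `exit_time_distribution` is hypothetical. The occupation probabilities are propagated with the adjacency matrix restricted to the visited domain, and the probability lost at each step is the probability that a new site is reached at that step.

```python
import numpy as np

def exit_time_distribution(domain, start, t_max):
    """Distribution F[t] of the first time the walker leaves `domain`.

    domain : set of (x, y) visited sites on the square lattice
    start  : site of `domain` where the walker currently sits (assumption)
    t_max  : largest time to enumerate
    Returns an array F with F[t] = P(tau = t) for t = 1..t_max (F[0] = 0).
    """
    sites = sorted(domain)
    index = {s: i for i, s in enumerate(sites)}
    n = len(sites)
    # Adjacency matrix M of the visited domain.
    M = np.zeros((n, n))
    for (x, y), i in index.items():
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            j = index.get((x + dx, y + dy))
            if j is not None:
                M[i, j] = 1.0
    Q = M / 4.0                      # one-step transition matrix inside the domain
    p = np.zeros(n)
    p[index[start]] = 1.0            # walker starts at `start` with certainty
    F = np.zeros(t_max + 1)
    survival = 1.0
    for t in range(1, t_max + 1):
        p = p @ Q                    # probability of still being inside, per site
        new_survival = p.sum()
        F[t] = survival - new_survival
        survival = new_survival
    return F

if __name__ == "__main__":
    F = exit_time_distribution({(0, 0), (1, 0)}, (0, 0), t_max=5)
    print(F)
```

For a two-site domain the walker stays inside with probability 1/4 at each step, so the enumerated distribution is geometric, which gives a simple consistency check. For large domains a sparse matrix would be the natural replacement for the dense array used here.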

Numerical simulations of transient RWs

In addition to the exact enumeration used to obtain the exit time statistics from the visited domain rn, we rely on a Markov chain Monte Carlo generation of rn on hypercubic lattices in dimensions d = 3, 4, 5 and 6 (for the persistent RW in 3d and transient Lévy flights, we generate the visited domains in the same way as for recurrent RWs). Using the observation that the average exit time is proportional to the surface of the visited domain, we bias the generation of these domains towards states of small surface. The bias is generated by a Wang-Landau procedure, in order to obtain a uniform probability on the surface of the visited domain, resulting in an increased probability of the small-surface states.

Numerical simulations of non-Markovian RWs

For the True Self Avoiding Random Walk on the 1d line, we record the number of visits Ci of site i. For a walker at site i, the probability to jump to the site on the right is given by \(\exp (-{C}_{i+1})/(\exp (-{C}_{i+1})+\exp (-{C}_{i-1}))\); otherwise the RW jumps to the left. For the fractional Brownian motion (fBM), we use the Python module fbm56, based on Hosking’s method57. We discretize the line into intervals of size one, and consider that an interval has been visited when the RW enters it for the first time. τn is the time elapsed between the first visit to the nth interval and the first visit to the (n+1)st interval.
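The jump rule of the True Self Avoiding Walk described above can be sketched in a few lines. This is a minimal illustration, not the simulation code of this work: the function name `tsaw_inter_visit_times` is hypothetical, and it is assumed that the starting site counts as visited once at time 0. The jump probability is written in the equivalent form \(1/(1+\exp ({C}_{i+1}-{C}_{i-1}))\), which avoids dividing two very small exponentials.

```python
import math
import random

def tsaw_inter_visit_times(n_sites, seed=0):
    """Inter-visit times tau_1, ..., tau_{n_sites-1} of the 1d TSAW.

    Assumption (illustrative): the starting site has visit count 1 at t = 0.
    """
    rng = random.Random(seed)
    counts = {0: 1}            # C_i, number of visits to site i
    pos, t, t_last = 0, 0, 0
    taus = []
    while len(taus) < n_sites - 1:
        # Difference of visit counts between right and left neighbours.
        d = counts.get(pos + 1, 0) - counts.get(pos - 1, 0)
        # exp(-C_right) / (exp(-C_right) + exp(-C_left)), written stably;
        # the clamp only guards against overflow in math.exp.
        p_right = 1.0 / (1.0 + math.exp(min(d, 700)))
        pos += 1 if rng.random() < p_right else -1
        t += 1
        if pos not in counts:  # first visit to this site
            taus.append(t - t_last)
            t_last = t
        counts[pos] = counts.get(pos, 0) + 1
    return taus

if __name__ == "__main__":
    taus = tsaw_inter_visit_times(1000)
    print(len(taus), sum(taus))
```

The sum of the returned τn equals the time at which the last new site is found, so the same run also gives one realization of the first time at which N(t) reaches `n_sites`.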